Caching in tfx and kfp

See original GitHub issue

I’m confused about caching when running a TFX pipeline on KFP.

  • I have KFP 1.0.4 deployed via GCP AI Platform, and I’ve tried this with both TFX 0.25 and 0.26.
  • I created my pipeline with enable_cache=False.
  • I ran it once; each component ran as expected and produced the expected artifacts (I only checked the artifact buckets, not the SQL database).
  • I ran it again with no change in inputs or parameters, and it used cached results even though I had disabled the cache. The following images show the logs for the second run, where I would have expected the components to run again.
  • I repeated this on an entirely new deployment of KFP for both TFX versions.

[Image: second-run logs with the TFX 0.25 image]

[Image: second-run logs with the TFX 0.26 image]

Questions:

  • Why is enable_cache=False not respected?
  • The KFP documentation mentions that its caching mechanism should not be used for TFX pipelines. Why am I seeing a message about a cached step from KFP in the TFX 0.26 case rather than from the TFX component drivers?
  • Can you enable/disable caching on a per-component basis?
  • Is it possible to get logs explaining why a cached result was or was not used?
  • Is there any way to get TFX to also consider whether the Docker image has changed when deciding if a cached result is invalidated?
    • I’m guessing not, since that is determined by the driver, which would run inside the (updated) container?
    • Would a custom container-based component handle this differently?
  • Is there a better place to ask this kind of question?
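
To make the Docker image question concrete, here is a minimal, purely illustrative sketch of an input-based cache key. It is not the actual TFX or KFP cache logic; `cache_fingerprint`, its parameters, and the example URIs and image tags are all hypothetical. It only shows that a key built from inputs and parameters alone cannot notice an image change, while a key that also includes the image can.

```python
import hashlib
import json


def cache_fingerprint(component_id, input_uris, exec_properties, image=None):
    """Hypothetical cache key: a hash over what a step was run with.

    This is an illustration only, not how TFX or KFP compute cache keys.
    """
    payload = {
        "component_id": component_id,
        "inputs": sorted(input_uris),
        "exec_properties": exec_properties,
    }
    # Only when the image is part of the payload can an image change
    # produce a different key (and therefore a cache miss).
    if image is not None:
        payload["image"] = image
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Same component, inputs, and parameters on both runs (made-up values).
same_run = dict(
    component_id="Trainer",
    input_uris=["gs://bucket/examples/1"],
    exec_properties={"train_steps": 1000},
)

# Without the image in the key, two runs are indistinguishable.
without_image_a = cache_fingerprint(**same_run)
without_image_b = cache_fingerprint(**same_run)

# With the image in the key, a new tag yields a new key.
with_image_v1 = cache_fingerprint(image="my-tfx:v1", **same_run)
with_image_v2 = cache_fingerprint(image="my-tfx:v2", **same_run)

print(without_image_a == without_image_b)
print(with_image_v1 == with_image_v2)
```

Whether either system can be configured to include the image in its key is exactly what the question above asks; the sketch does not answer that.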

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 16 (6 by maintainers)

Top GitHub Comments

ssoudan commented, May 27, 2021 (1 reaction)

Something like this works for me:

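  from tfx.orchestration.kubeflow import kubeflow_dag_runner

  def dont_cache_components():
    """Returns an operator function that turns off KFP step caching for a pod."""
    def _dont_cache(pod_op):
      # 'P0D' is a duration of zero days: any cached result counts as too
      # stale to reuse, so the step always runs again.
      pod_op.add_pod_annotation('pipelines.kubeflow.org/max_cache_staleness', 'P0D')
      return pod_op

    return _dont_cache

  # ...
  # Start from the default operator functions and append the cache-disabling one.
  pipeline_op_funcs = kubeflow_dag_runner.get_default_pipeline_operator_funcs()
  pipeline_op_funcs.append(dont_cache_components())
  # ...

  # metadata_config and tfx_image are assumed to be defined elsewhere.
  runner_config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    pipeline_operator_funcs=pipeline_op_funcs,
    kubeflow_metadata_config=metadata_config,
    tfx_image=tfx_image)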
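
For readers who want to see what the operator function does without a KFP installation, the following is a small stand-alone sketch. `FakePodOp` and `apply_operator_funcs` are hypothetical stand-ins written for this illustration (they are not part of kfp or TFX); only the annotation key and the 'P0D' value come from the comment above.

```python
class FakePodOp:
    """Hypothetical stand-in for a pipeline step; it only records pod annotations."""

    def __init__(self, name):
        self.name = name
        self.pod_annotations = {}

    def add_pod_annotation(self, name, value):
        self.pod_annotations[name] = value
        return self


def dont_cache_components():
    """Same shape as the function in the comment above."""
    def _dont_cache(pod_op):
        pod_op.add_pod_annotation(
            "pipelines.kubeflow.org/max_cache_staleness", "P0D")
        return pod_op

    return _dont_cache


def apply_operator_funcs(pod_op, funcs):
    """Hypothetical helper: applies each operator function to a step, in order."""
    for func in funcs:
        pod_op = func(pod_op)
    return pod_op


# Made-up step names, one fake op per pipeline component.
ops = [FakePodOp("ExampleGen"), FakePodOp("Trainer")]
funcs = [dont_cache_components()]
patched = [apply_operator_funcs(op, funcs) for op in ops]

for op in patched:
    print(op.name, op.pod_annotations)
```

Each step ends up carrying the same annotation, which is why this approach disables caching for the whole pipeline rather than per component.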

easadler commented, Feb 5, 2021 (1 reaction)

@johnPertoft this is also a blocker for me. We will have to pivot towards vanilla Kubeflow, which really isn’t that bad to use. I will miss the interactive runner, but using kfp directly is pretty easy. It would be nice to have Kubeflow components for each of the TFX steps; that will probably happen eventually.
