Caching in tfx and kfp

I’m confused about caching when running a TFX pipeline on KFP.

I have KFP 1.0.4 deployed via GCP AI Platform, and I’ve tried this with both TFX 0.25 and 0.26.
I created my pipeline with enable_cache=False.
I ran it once; each component ran as expected and produced the expected artifacts (I only checked the artifact buckets, not the SQL database).
I ran it again with no change in inputs or parameters, and it now uses cached results even though I disabled the cache. The following images show the logs for the second run, where I would have expected the components to run again.
I repeated this on an entirely new deployment of KFP for both TFX versions.

With TFX 0.25

With TFX 0.26

Questions:

1. Why is enable_cache=False not respected?
2. The KFP documentation mentions that its caching mechanism should not be used for TFX pipelines. Why am I seeing a message about a cached step from KFP in the TFX 0.26 case rather than from the TFX component drivers?
3. Can you enable or disable caching on a per-component basis?
4. Is it possible to get logs explaining why a cached result was or was not used?
5. Is there any way to get TFX to also consider whether the Docker image has changed when deciding if a cached result is invalidated?
   - I’m guessing not, since that is determined by the driver, which would run inside the (updated) container?
   - Would a custom container-based component handle this differently?
6. Is there a better place to ask these kinds of questions?

Issue Analytics

State:
Created 3 years ago
Reactions: 2
Comments: 16 (6 by maintainers)

Top GitHub Comments

ssoudan commented, May 27, 2021

Something like this works for me:

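  from tfx.orchestration.kubeflow import kubeflow_dag_runner

  def dont_cache_components():
    """Returns an operator function that turns off KFP-level caching for a step."""
    def _dont_cache(pod_op):
      # A max staleness of 'P0D' (zero days) means cached outputs are never reused.
      pod_op.add_pod_annotation('pipelines.kubeflow.org/max_cache_staleness', 'P0D')
      return pod_op

    return _dont_cache

  # ...
  # Start from the default operator functions and append the cache override.
  pipeline_op_funcs = kubeflow_dag_runner.get_default_pipeline_operator_funcs()
  pipeline_op_funcs.append(dont_cache_components())
  # ...

  # metadata_config and tfx_image are assumed to be defined elsewhere.
  runner_config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    pipeline_operator_funcs=pipeline_op_funcs,
    kubeflow_metadata_config=metadata_config,
    tfx_image=tfx_image)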
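To see what the operator function does without a cluster, here is a minimal, self-contained sketch. `FakePodOp` is a hypothetical stand-in for the op object the runner passes in; it is not part of kfp or tfx. The `only` parameter is also an assumption added for illustration: it hints at how the annotation might be limited to selected components, assuming the real op exposes a `name` that matches the component you want to target, which you would need to verify against your own pipeline.

```python
from typing import Callable, Dict, Iterable, Optional

CACHE_STALENESS_KEY = 'pipelines.kubeflow.org/max_cache_staleness'


class FakePodOp:
    """Hypothetical stand-in for the op object; not a kfp or tfx class."""

    def __init__(self, name: str):
        self.name = name
        self.pod_annotations: Dict[str, str] = {}

    def add_pod_annotation(self, name: str, value: str) -> 'FakePodOp':
        self.pod_annotations[name] = value
        return self


def dont_cache_components(
    only: Optional[Iterable[str]] = None,
) -> Callable[[FakePodOp], FakePodOp]:
    """Returns an operator function that disables KFP step caching.

    `only` is an illustrative assumption: when given, the annotation is
    applied only to ops whose name is in that collection.
    """
    selected = None if only is None else set(only)

    def _dont_cache(pod_op):
        if selected is None or pod_op.name in selected:
            # 'P0D' is an ISO 8601 duration of zero days.
            pod_op.add_pod_annotation(CACHE_STALENESS_KEY, 'P0D')
        return pod_op

    return _dont_cache


# Apply the operator function to each op, as a runner would.
ops = [FakePodOp('CsvExampleGen'), FakePodOp('Trainer')]
op_func = dont_cache_components(only=['Trainer'])
for op in ops:
    op_func(op)

annotations = {op.name: op.pod_annotations for op in ops}
print(annotations)
```

Calling `dont_cache_components()` with no argument annotates every op, which matches the behaviour of the snippet above.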

easadler commented, Feb 5, 2021

@johnPertoft this is also a blocker for me. We will have to pivot towards vanilla Kubeflow, which really isn’t that bad to use. I will miss the interactive runner, but using kfp directly is pretty easy. It would be nice to have Kubeflow components for each of the TFX steps. That will probably happen eventually.
