
TFX CLI with Kubeflow / AI Platform Pipelines runtime context missing when output is taken from cache

See original GitHub issue

System information

  • Have I specified the code to reproduce the issue (Yes/No): No (the Taxicab example works well)
  • Environment in which the code is executed (e.g., Local (Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc): AI Platform Pipelines
  • TensorFlow version (you are using): /
  • TFX Version: 0.28.0
  • Python version: /

Describe the current behavior

The pipeline deployed with the TFX CLI runs into the following error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/launcher/base_component_launcher.py", line 198, in launch
    self._exec_properties)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/launcher/base_component_launcher.py", line 167, in _run_driver
    component_info=self._component_info)
  File "/opt/conda/lib/python3.7/site-packages/tfx/dsl/components/base/base_driver.py", line 270, in pre_execution
    driver_args, pipeline_info)
  File "/opt/conda/lib/python3.7/site-packages/tfx/dsl/components/base/base_driver.py", line 158, in resolve_input_artifacts
    producer_component_id=input_channel.producer_component_id)
  File "/opt/conda/lib/python3.7/site-packages/tfx/orchestration/metadata.py", line 948, in search_artifacts
    pipeline_info)
RuntimeError: Pipeline run context for PipelineInfo(pipeline_name: sentiment4, pipeline_root: gs://sascha-playground-doit-kubeflowpipelines-default/sentiment4, run_id: sentiment4-qnknl) does not exist

First run: [screenshot, 2021-03-15 07:40:55]

Second run with additional component: [screenshot, 2021-03-15 07:51:28]

Steps to reproduce

  1. deploy pipeline with one component
  2. run pipeline with one component (👍 works)
  3. add another component
  4. run the pipeline (this time the output is taken from cache) (👎 fails)

Presumably the second component cannot find the cached data because it did not exist in the first run.
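The suspected interaction can be sketched in plain Python (a toy simulation, not actual TFX internals; the class and method names below are illustrative, loosely modeled on the `search_artifacts` call in the traceback): when a component's output is served from cache, the entrypoint that would register the new run's context in ML Metadata is skipped, so a downstream component's lookup for that context fails with the same kind of RuntimeError.

```python
# Toy simulation of the cache / run-context interaction (hypothetical names,
# not real TFX code).

class MetadataStore:
    def __init__(self):
        self.run_contexts = set()  # run_ids that registered a pipeline run context
        self.cache = {}            # component_id -> cached output

    def register_run_context(self, run_id):
        self.run_contexts.add(run_id)

    def search_artifacts(self, run_id, component_id):
        # Mirrors the failing call in the traceback: resolving an input
        # requires the run's context to exist, even when the producing
        # component's output came from cache.
        if run_id not in self.run_contexts:
            raise RuntimeError(
                f"Pipeline run context for run_id: {run_id} does not exist")
        return self.cache.get(component_id)

store = MetadataStore()

# First run: one component executes normally, registering its run context
# and producing a cacheable output.
store.register_run_context("sentiment4-run1")
store.cache["component_a"] = "examples"

# Second run: component_a is served from cache, so its container entrypoint
# never runs and no context is registered for the new run; the newly added
# component_b then fails to resolve its input.
try:
    store.search_artifacts("sentiment4-run2", "component_a")
except RuntimeError as e:
    print(e)  # Pipeline run context for run_id: sentiment4-run2 does not exist
```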

Describe the expected behavior

The second component gets executed.

Standalone code to reproduce the issue

The Taxicab sample works fine as a test case.

Name of your Organization (Optional): /

Other info / logs

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 17 (8 by maintainers)

Top GitHub Comments

1 reaction
Bobgy commented, Jun 10, 2021

See https://github.com/kubeflow/pipelines/issues/5303#issuecomment-851904651 — the bug is fixed in KFP 1.6.0. The root cause is that the KFP cache server used a hacky way to detect TFX pods, which changed in newer versions.
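Until a fixed KFP version is installed, one documented way to keep the KFP cache server away from specific pods is the standard caching opt-out label (this workaround is not from the thread itself; how you attach the label depends on your runner's pod-customization hooks):

```yaml
# Pod metadata label that tells the KFP cache server to skip this pod.
# Apply via your orchestrator's pod customization mechanism.
metadata:
  labels:
    pipelines.kubeflow.org/cache_enabled: "false"
```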

I am working on releasing KFP 1.6.0+ to mkp

0 reactions
google-ml-butler[bot] commented, Jan 27, 2022

Are you satisfied with the resolution of your issue?

