Kubeflow 1.0RC4 metadata config fails

Kubernetes: 1.15, Kubeflow: 1.0RC4, TFX: 0.21.0

While testing taxi_pipeline_kubeflow_local.py, I got:

Traceback (most recent call last):
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 371, in <module>
    main()
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 345, in main
    _get_metadata_connection_config(kubeflow_metadata_config))
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 93, in _get_metadata_connection_config
    kubeflow_metadata_config.grpc_config)
  File "/tfx-src/tfx/orchestration/kubeflow/container_entrypoint.py", line 110, in _get_grpc_metadata_connection_config
    kubeflow_metadata_config.grpc_service_host)
TypeError: None has type NoneType, but expected one of: bytes, unicode

In earlier TFX versions I ran into the issues described in https://github.com/tensorflow/tfx/issues/1002. Now TFX gets its metadata config via gRPC, but it does not receive the config values it expects (the new Kubeflow version may also be involved).
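The TypeError above is what protobuf raises when None is assigned to a string field. One plausible way for that to happen is a host name that is looked up from an environment variable which is not set in the pod, since os.getenv then returns None. The following is a minimal sketch of a fail-fast lookup under that assumption. The helper resolve_config_value and the variable names in the usage note are hypothetical and are not part of TFX.

```python
import os
from typing import Mapping, Optional


def resolve_config_value(
    literal: Optional[str],
    env_var: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Hypothetical helper: return a literal value or an env var value, never None."""
    # A literal value always wins over an environment variable.
    if literal is not None:
        return literal
    if env_var is None:
        raise ValueError("neither a literal value nor an environment variable name was given")
    value = environ.get(env_var)
    if value is None:
        # Fail here with a readable message instead of passing None downstream.
        raise ValueError(f"environment variable {env_var!r} is not set")
    return value
```

For example, resolve_config_value(None, "SOME_SERVICE_HOST") raises a ValueError that names the missing variable, which is easier to diagnose than the TypeError in the traceback.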

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

lipinski commented, Mar 8, 2020 (3 reactions)

To solve the issue, you should change the configuration:

from tfx.orchestration.kubeflow import kubeflow_dag_runner

# Start from the default config, then set every value explicitly.
metadata_config = kubeflow_dag_runner.get_default_kubeflow_metadata_config()

# MySQL backend of the metadata store.
metadata_config.mysql_db_service_host.value = 'mysql.kubeflow'
metadata_config.mysql_db_service_port.value = '3306'
metadata_config.mysql_db_name.value = 'metadb'
metadata_config.mysql_db_user.value = 'root'
metadata_config.mysql_db_password.value = ''

# gRPC endpoint of the metadata service.
metadata_config.grpc_config.grpc_service_host.value = 'metadata-grpc-service'
metadata_config.grpc_config.grpc_service_port.value = '8080'

runner_config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    kubeflow_metadata_config=metadata_config
)
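Before rerunning the pipeline, it may help to check that the host and port from the config above are reachable. The following is a minimal sketch using only the standard library. It assumes the check runs from a pod in the same namespace as the metadata service, and the helper can_connect is hypothetical, not a TFX or Kubeflow API. A successful TCP connection only shows that something is listening, not that it speaks gRPC.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Hypothetical helper: True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host name and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_connect('metadata-grpc-service', 8080) returning False would point to a wrong service name, namespace, or port rather than to a TFX problem.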
numerology commented, Feb 14, 2020 (2 reactions)

Unfortunately that’s a known issue. The full-fledged Kubeflow deployment does not have the right MLMD config to use gRPC the way TFX 0.21.0 does. There are two solutions to this issue:

  1. Can you try a standalone KFP deployment with version >= 0.2.1? It is the only thing you need to run a TFX pipeline if you do not use Kubeflow notebooks, Katib, and so on. You can find the deployment instructions here

  2. We can work out a kubeflow_metadata_config that works with the full-fledged Kubeflow deployment; that might take 1 or 2 days.
