RuntimeParam support for Transform custom_config.


System information

  • TFX Version (you are using):
  • Environment in which you plan to use the feature (e.g., Local (Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc…):
  • Are you willing to contribute it (Yes/No):

Describe the feature and the current behavior/state. Today, Transform does not support RuntimeParams for custom_config; see: https://github.com/tensorflow/tfx/blob/5b3b3fc2c903b76f8c75ba04ac030405428b6160/tfx/components/transform/component.py#L104

Will this change the current API? How? Yes, it will allow RuntimeParams to be used for Transform's custom_config.

Who will benefit from this feature? Users who want their pipelines to be dynamic.

Do you have a workaround or are completely blocked by this? We saw the change made in https://github.com/tensorflow/tfx/pull/4077/files#diff-1f27fc087329a6364e8a9f610772be8bceb7cf469f25eb56c85ce404c93c5476R199, but we are not sure how to support the same thing for Transform.

Name of your Organization (Optional)

Any other info. Reproducible code:

```python
import json
from typing import Text
from tfx.components import Transform
from tfx.dsl.components.common.importer import Importer
from tfx.extensions.google_cloud_big_query.example_gen import component as big_query_example_gen_component
from tfx.orchestration import pipeline
from tfx.orchestration.data_types import RuntimeParameter
from tfx.orchestration.kubeflow.v2.kubeflow_v2_dag_runner import KubeflowV2DagRunner, KubeflowV2DagRunnerConfig
from kfp.v2.google import client
from tfx.proto import example_gen_pb2
from tfx.types import standard_artifacts

from pipelines.constants import GCS_BUCKET_NAME, GCP_REGION, GOOGLE_CLOUD_PROJECT


def create_full_training_pipeline(pipeline_root: str, _beam_args: list) -> pipeline.Pipeline:

    config_transform = RuntimeParameter(
        name="config_transform",
        ptype=Text
    )

    example_gen = big_query_example_gen_component.BigQueryExampleGen(query='select 1') \
        .with_id('ExampleGenR')
    fake_import = Importer('pipo', artifact_type=standard_artifacts.Schema)
    transform = Transform(example_gen.outputs['examples'], fake_import.outputs['result'],
                          module_file='/tmp/test.py', custom_config=config_transform)
    return pipeline.Pipeline(
        beam_pipeline_args=_beam_args,
        pipeline_name="full-training",
        pipeline_root=pipeline_root,
        components=[
            example_gen,
            fake_import,
            transform
        ],
    )

_temp_location = 'gs://{}/pipeline_tmp/{}'.format(GCS_BUCKET_NAME, 'test')
_beam_pipeline_args = [
    '--runner=DirectRunner',
    '--direct_running_mode=in_memory',
    '--direct_num_workers=0',
    '--temp_location=' + _temp_location,
    '--project=' + GOOGLE_CLOUD_PROJECT,
    '--region=' + GCP_REGION
]
training_pipeline = create_full_training_pipeline(
    pipeline_root=_temp_location,
    _beam_args=_beam_pipeline_args
)
PIPELINE_DEFINITION_FILE = '/tmp/test'
runner = KubeflowV2DagRunner(
    config=KubeflowV2DagRunnerConfig(),
    output_filename=PIPELINE_DEFINITION_FILE)
_ = runner.run(training_pipeline)

pipelines_client = client.AIPlatformClient(
    project_id=GOOGLE_CLOUD_PROJECT,
    region="europe-west4",
)

pipelines_client.create_run_from_job_spec(PIPELINE_DEFINITION_FILE,
                                          parameter_values={"config_transform": json.dumps({'test': 'coucou'})}
                                          )

```
Raises:
> The pipeline parameter config_transform is not found in the pipeline job input definitions.
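
The error message suggests that `config_transform` never made it into the compiled pipeline definition. One way to confirm this is to search the compiled file for the parameter name. The sketch below assumes only that the definition file is JSON; it makes no assumption about the file's layout, and `find_key_paths` and `parameter_is_declared` are hypothetical helper names, not part of TFX or KFP.

```python
import json
from typing import Any, Iterator, List, Optional


def find_key_paths(node: Any, key: str,
                   path: Optional[List[str]] = None) -> Iterator[str]:
    """Yield a slash-separated path for every dict key equal to `key`."""
    path = path or []
    if isinstance(node, dict):
        for k, value in node.items():
            child = path + [str(k)]
            if k == key:
                yield "/".join(child)
            # Keep descending: the key may also appear deeper in the tree.
            yield from find_key_paths(value, key, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key_paths(item, key, path + [str(index)])


def parameter_is_declared(job_spec_path: str, name: str) -> bool:
    """Return True if `name` appears as a key anywhere in the JSON file.

    Assumption: the compiled pipeline definition is a JSON document.
    """
    with open(job_spec_path) as f:
        return any(find_key_paths(json.load(f), name))
```

Calling `parameter_is_declared(PIPELINE_DEFINITION_FILE, "config_transform")` after `runner.run(...)` would then show whether the parameter was emitted at all, before a run is submitted.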

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

arghyaganguly commented, Aug 12, 2021 (2 reactions)

@tanguycdls, I rechecked and you’re correct that this was for Trainer. The feature is still not there. Let us get back to you. Thanks.

1025KB commented, Aug 12, 2021 (1 reaction)

Yep, it’s not currently supported. The type hint needs to be changed to accept RuntimeParam so that you can pass custom_config in as a JSON string at runtime. We will add that in a later version.
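
The idea in the comment above can be sketched in plain Python. This is only an illustration under stated assumptions, not TFX's implementation: `RuntimeParameterStub`, `serialize_custom_config`, and `resolve_custom_config` are hypothetical names. It shows an argument whose type hint accepts either a dict (serialized to JSON when the pipeline is built) or a placeholder whose JSON string value is supplied when the run is created.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass(frozen=True)
class RuntimeParameterStub:
    """Hypothetical stand-in for a runtime placeholder (not the TFX class)."""
    name: str
    ptype: type = str


# The widened type hint: a plain dict, a runtime placeholder, or nothing.
CustomConfig = Optional[Union[Dict[str, Any], RuntimeParameterStub]]


def serialize_custom_config(
        custom_config: CustomConfig) -> Union[str, RuntimeParameterStub]:
    """Build-time step: dicts become JSON strings, placeholders pass through."""
    if isinstance(custom_config, RuntimeParameterStub):
        return custom_config
    return json.dumps(custom_config, sort_keys=True)


def resolve_custom_config(
        serialized: Union[str, RuntimeParameterStub],
        parameter_values: Dict[str, str]) -> Optional[Dict[str, Any]]:
    """Run-time step: substitute the placeholder, then decode the JSON."""
    if isinstance(serialized, RuntimeParameterStub):
        if serialized.name not in parameter_values:
            raise KeyError(f"no value supplied for parameter {serialized.name!r}")
        serialized = parameter_values[serialized.name]
    return json.loads(serialized)
```

With this shape, the value passed in the reproduction script, `json.dumps({'test': 'coucou'})`, would be decoded back into a dict only at run time, while a static dict would keep working as before.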
