
Transform fails when setting force_tf_compat_v1=False

See original GitHub issue

System information

  • Have I specified the code to reproduce the issue (Yes/No): Yes
  • Environment in which the code is executed (e.g., Local (Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc): KubeFlow
  • TensorFlow version (you are using): 2.4.0
  • TFX Version: 0.27.0
  • Python version: 3.7

Describe the current behavior
Using tf.strings.substr operations in the preprocessing_fn passed to Transform fails with force_tf_compat_v1=False when running in KubeFlow. Setting force_tf_compat_v1=True works. Note: in interactive mode, force_tf_compat_v1=True causes Python to crash, so the interactive path can’t be tested properly.

Describe the expected behavior
Using native TF2 behaviour should work.

Standalone code to reproduce the issue
Providing a bare-minimum test case or steps to reproduce the problem will greatly help us debug the issue. If possible, please share a link to a Colab/Jupyter/any notebook. Use the attached file (transform.py.zip) as the preprocessing module in KubeFlow, with data generated by running:

import os
import random

import pandas as pd

data_root = "."  # adjust to your pipeline's data directory
data_path = "data.csv"
df = pd.DataFrame(
    data=[[random.randint(0, 100), '2021-01-12T11:34:08'] for i in range(0, 100)],
    columns=["random_int", "datetime"],
)
df.to_csv(os.path.join(data_root, data_path), index=False)
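If pandas isn’t available, the same reproduction data can be generated with only the standard library — a sketch assuming the same two-column schema and file name as the snippet above:

```python
import csv
import random

# Generate the same reproduction data as the pandas snippet:
# 100 rows of a random integer plus a fixed ISO-8601 datetime string.
rows = [[random.randint(0, 100), '2021-01-12T11:34:08'] for _ in range(100)]

with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['random_int', 'datetime'])  # header row, as to_csv writes
    writer.writerows(rows)
```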

Other info / logs
This is the operation that I’m using in my preprocessing_fn:

dt_str = tf.constant('2021-01-12T11:34:08')

year_str = tf.strings.substr(dt_str, pos=0, len=4, unit='UTF8_CHAR')
month_str = tf.strings.substr(dt_str, pos=5, len=2, unit='UTF8_CHAR')
day_str = tf.strings.substr(dt_str, pos=8, len=2, unit='UTF8_CHAR')
hour_str = tf.strings.substr(dt_str, pos=11, len=2, unit='UTF8_CHAR')
minute_str = tf.strings.substr(dt_str, pos=14, len=2, unit='UTF8_CHAR')
second_str = tf.strings.substr(dt_str, pos=17, len=2, unit='UTF8_CHAR')
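The fixed (pos, len) offsets above assume strict ISO-8601 'YYYY-MM-DDTHH:MM:SS' input. As a TF-free sanity check, the equivalent plain-Python slices extract the same fields (a sketch; the offsets dict is an illustration, not from the issue):

```python
# Plain-Python equivalent of the tf.strings.substr calls above,
# verifying that the hard-coded offsets match the ISO-8601 layout.
dt_str = '2021-01-12T11:34:08'

offsets = {          # field -> (pos, len), mirroring the substr arguments
    'year':   (0, 4),
    'month':  (5, 2),
    'day':    (8, 2),
    'hour':   (11, 2),
    'minute': (14, 2),
    'second': (17, 2),
}

parts = {name: dt_str[pos:pos + n] for name, (pos, n) in offsets.items()}
print(parts)
```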

Running locally, this works in both eager and graph mode (toggled via tf.config.run_functions_eagerly(True/False)). However, when running it through the Transform component with force_tf_compat_v1=False, it fails with the following message:

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 360, in <module>
    main()
  File "/usr/local/lib/python3.7/dist-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 353, in main
    execution_info = launcher.launch()
  File "/usr/local/lib/python3.7/dist-packages/tfx/orchestration/launcher/base_component_launcher.py", line 209, in launch
    copy.deepcopy(execution_decision.exec_properties))
  File "/usr/local/lib/python3.7/dist-packages/tfx/orchestration/launcher/in_process_component_launcher.py", line 72, in _run_executor
    copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties))
  File "/pipeline/kubeflow/custom_components/transform_master/executor.py", line 493, in Do
    self.Transform(label_inputs, label_outputs, status_file)
  File "/pipeline/kubeflow/custom_components/transform_master/executor.py", line 1074, in Transform
    len(analyze_data_paths))
  File "/pipeline/kubeflow/custom_components/transform_master/executor.py", line 1209, in _RunBeamImpl
    preprocessing_fn, pipeline=pipeline))
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/transforms/ptransform.py", line 1058, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/transforms/ptransform.py", line 573, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/pipeline.py", line 646, in apply
    return self.apply(transform, pvalueish)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/pipeline.py", line 689, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/runners/runner.py", line 188, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/runners/runner.py", line 218, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_transform/beam/impl.py", line 1140, in expand
    self).expand(self._make_parent_dataset(dataset))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_transform/beam/impl.py", line 1087, in expand
    evaluate_schema_overrides=False)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_transform/schema_inference.py", line 196, in infer_feature_schema_v2
    metadata = collections.defaultdict(list, concrete_metadata_fn(inputs))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1669, in __call__
    return self._call_impl(args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1679, in _call_impl
    cancellation_manager)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1762, in _call_with_structured_signature
    cancellation_manager=cancellation_manager)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 560, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  pos 5 out of range for string at index 0
	 [[node Substr_1 (defined at pipeline/kubeflow/model/preprocessing/transform.py:96) ]] [Op:__inference_metadata_fn_1820]
Errors may have originated from an input operation.
Input Source operations connected to node Substr_1:
 inputs_copy (defined at usr/local/lib/python3.7/dist-packages/tensorflow_transform/tf_utils.py:81)
Function call stack:
metadata_fn
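The message "pos 5 out of range for string at index 0" suggests that during the schema-inference trace (infer_feature_schema_v2 in the traceback) the substr ops were evaluated on a batch whose first string was shorter than expected — likely an empty placeholder. Unlike Python slicing, tf.strings.substr raises InvalidArgumentError when pos is past the end of a string. A defensive sketch in plain Python for illustration (safe_substr is a hypothetical helper, not part of TF or TFX):

```python
def safe_substr(s: str, pos: int, n: int) -> str:
    """Return s[pos:pos + n], or '' when s is too short.

    Mimics guarding tf.strings.substr, which raises InvalidArgumentError
    (e.g. "pos 5 out of range for string at index 0") when pos falls past
    the end of a string in the batch -- as can happen if schema inference
    traces the preprocessing_fn with empty placeholder strings.
    """
    return s[pos:pos + n] if len(s) >= pos + n else ''
```

In a real preprocessing_fn an equivalent guard could be built from tf.strings.length plus tf.where, or the fixed-offset substr calls replaced with tf.strings.split; both are untested suggestions, not the fix the maintainers applied.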

When setting force_tf_compat_v1=True it works as expected.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 40 (25 by maintainers)

Top GitHub Comments

ConverJens commented, Apr 11, 2022 (1 reaction)

@varshaan This issue now seems to be resolved. Thank you very much for your help!

axeltidemann commented, Nov 3, 2021 (1 reaction)

@ConverJens I don’t know about 1.3.0; I am on TFX 1.2.0, which does not have this issue.

Read more comments on GitHub >

