Dataflow workers not able to install tfx from requirements file due to `no-binary` option from beam stager
When no Beam packaging arguments are provided by the user, TFX generates a requirements file with the tfx package inside.
This ends up failing on Dataflow, because the Beam stager uses pip’s `--no-binary` flag: https://github.com/apache/beam/blob/v2.15.0/sdks/python/apache_beam/runners/portability/stager.py#L483.
Indeed, in a fresh virtualenv (Python 3.6.3):
pip download tfx==0.14.0 --no-binary :all:
Collecting tfx==0.14.0
ERROR: Could not find a version that satisfies the requirement tfx==0.14.0 (from versions: none)
ERROR: No matching distribution found for tfx==0.14.0
Whereas if I remove the `--no-binary` flag, it works just fine.
I’m not all that knowledgeable about Python packaging, but is this because TFX is built as a wheel? Is there some Beam option I can pass to make this work?
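For anyone who wants to reproduce the comparison above from a script, here is a minimal sketch. The helper names (`build_pip_download_cmd`, `try_download`) are hypothetical, and `--no-deps` is my own addition so that only the tfx package itself is checked rather than its whole dependency tree. It only mirrors the `pip download` invocation shown above; it is not the Beam stager's code.

```python
import subprocess
import sys


def build_pip_download_cmd(requirement, dest, no_binary=True):
    """Build a `pip download` command, optionally restricted to source distributions."""
    cmd = [
        sys.executable, "-m", "pip", "download",
        "--dest", dest,
        "--no-deps",  # assumption: keep the check limited to the named package
        requirement,
    ]
    if no_binary:
        # Same flag pair as in the failing command above.
        cmd += ["--no-binary", ":all:"]
    return cmd


def try_download(requirement, dest, no_binary):
    """Run the command (needs network access); True if pip exits with status 0."""
    result = subprocess.run(build_pip_download_cmd(requirement, dest, no_binary))
    return result.returncode == 0
```

Calling `try_download("tfx==0.14.0", some_dir, True)` and then again with `False` should show the same difference as the manual commands, assuming the package index still looks the way it did when this was reported.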
Hi @tejaslodaya - glad you found a workaround, but it is just that - a workaround. That said, I’m going to keep this open.
Hi @andrewsmartin and @charlesccychen
I managed to solve this issue by doing these steps:
Open the `_populate_requirements_cache` function and remove these two lines: `'--no-binary'`, `':all:'`.
In my case, I had created a conda environment and changed this file:
~/miniconda3/envs/tfx_test/lib/python3.7/site-packages/apache_beam/runners/portability/stager.py
where my environment name is `tfx_test`.
This solves the issue.
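To illustrate what removing those two lines amounts to, here is a rough sketch that strips the `--no-binary` flag and its value from a pip argument list. The function name `drop_no_binary` and the example argument list are hypothetical and are not copied from Beam's source; the real edit is made by hand in `stager.py` as described above.

```python
def drop_no_binary(pip_args):
    """Return a copy of pip_args without the `--no-binary` flag and its value."""
    cleaned = []
    skip_next = False
    for arg in pip_args:
        if skip_next:
            # This is the value that followed `--no-binary` (e.g. ":all:").
            skip_next = False
            continue
        if arg == "--no-binary":
            skip_next = True
            continue
        if arg.startswith("--no-binary="):
            # Single-token form, e.g. `--no-binary=:all:`.
            continue
        cleaned.append(arg)
    return cleaned


# Illustrative argument list only (an assumption, not Beam's actual list).
example_args = [
    "pip", "download", "--dest", "/tmp/cache",
    "-r", "requirements.txt",
    "--no-binary", ":all:",
]
print(drop_no_binary(example_args))
```

With the pair removed, pip is free to download wheels again, which is the behaviour the manual edit restores. The edit lives in the installed package, so it would presumably need to be reapplied whenever apache_beam is reinstalled or upgraded in that environment.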