emr_pyspark_step_launcher fails with packaged airflow dags
When EmrPysparkStepLauncher(deploy_local_pipeline_package=True) is used in conjunction with dagster-airflow where the DAG is a packaged DAG (i.e. in a zip file), emr_pyspark_step_launcher post_artifacts will zip up the already zipped DAG.
This results in an EMR artifact code.zip containing {packaged_dag}.zip, which cannot be imported in pyspark EMR-land.
The step_run_ref that gets pickled and executed from EMR also references an import of {packaged_dag}.zip.
The only resolution I see at the moment is to unzip the packaged DAG prior to post_artifacts.
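A minimal sketch of that workaround, using only the Python standard library. The helper name unpack_packaged_dag and its arguments are hypothetical and are not part of dagster or dagster-airflow; the sketch only covers extracting the packaged DAG into a plain directory, so that whatever gets zipped afterwards contains ordinary importable files instead of a nested zip. How the unpacked directory is then handed to the step launcher is not shown here.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_packaged_dag(packaged_dag_path, target_dir=None):
    """Extract a zipped DAG package into a plain directory and return its path.

    Hypothetical helper for illustration; not a dagster or dagster-airflow API.
    """
    packaged_dag_path = Path(packaged_dag_path)
    if not zipfile.is_zipfile(packaged_dag_path):
        raise ValueError(f"{packaged_dag_path} is not a zip archive")

    # Default to a fresh temporary directory if the caller gives no target.
    if target_dir is None:
        target_dir = tempfile.mkdtemp(prefix="unpacked_dag_")
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Write the archive contents out as regular files and packages.
    with zipfile.ZipFile(packaged_dag_path) as archive:
        archive.extractall(target_dir)
    return target_dir
```

The returned directory would then be the thing that gets packaged into code.zip, assuming the launcher can be pointed at it. Note that this does not by itself change the import path recorded in step_run_ref.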
Issue Analytics
- State:
- Created 3 years ago
- Comments: 5 (4 by maintainers)
Top GitHub Comments
Hey! Thanks @Benyuel for bringing this to our attention. We don’t have the bandwidth to investigate a fix right now, but as we head into 0.12.0 planning in a month or so, will be happy to touch base and revisit if you are still running into this.
bump @dpeng817?