kubernetes executor: python3: argument list too long for complex dependency graphs

Summary

I have a re-execution of a job with a complex dynamic DAG (> 1K ops, of which 600 ran during the first try). I’m using the kubernetes executor to launch the steps. The kubernetes executor adds a “known_state” to the list of Python arguments used to launch the pod.

This “known_state” object seems to be huge (3K lines), and the pod fails with:

exec /usr/bin/python3: argument list too long
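
For context, this error is what the kernel returns (E2BIG) when an exec call is handed arguments that are too large. On Linux, each individual argument string is capped at 32 pages, which is 131072 bytes with 4 KiB pages. The sketch below is only an illustration of that limit, not Dagster code: the `fake_known_state` payload, its field names, and `some_module` are made-up stand-ins, and the 4 KiB page size is an assumption.

```python
import json

PAGE_SIZE = 4096                 # assumption: 4 KiB pages
MAX_ARG_STRLEN = 32 * PAGE_SIZE  # Linux per-string limit: 131072 bytes


def oversized_args(argv):
    """Return (index, size_in_bytes) for each argument above the per-string limit."""
    hits = []
    for i, arg in enumerate(argv):
        size = len(arg.encode("utf-8")) + 1  # +1 for the terminating NUL byte
        if size > MAX_ARG_STRLEN:
            hits.append((i, size))
    return hits


# Hypothetical stand-in for a large known_state; the field names are invented.
fake_known_state = {
    "steps": {
        f"op_{i}": {"status": "success", "outputs": ["result"] * 4}
        for i in range(2000)
    }
}

# Hypothetical command line with the serialized state as one argument.
argv = ["python3", "-m", "some_module", json.dumps(fake_known_state)]

for index, size in oversized_args(argv):
    print(f"argv[{index}] is {size} bytes, limit is {MAX_ARG_STRLEN}")
```

A check like this could be run before submitting the pod spec to see whether a single serialized argument is the likely culprit.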

Reproduction

Any job with a large enough DAG will trigger this (at least in its re-execution).

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 6
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

sinnfashen commented, Oct 21, 2022 (1 reaction)

Hi @johannkm - we have upgraded dagster to v1.0.13 and attempted to run a job which includes ~600 dynamically generated ops. However, we are able to consistently replicate the error described in this issue: exec /opt/conda/envs/user/bin/dagster: argument list too long, where all ops fail at around the same time, both in a standalone run and via job re-execution.

We have confirmed that the known_state field is passed as part of the DAGSTER_EXECUTE_STEP_ARGS env var rather than as a CLI arg, in accordance with the merged fix. Is it possible that the issue is not fully resolved yet, or is there still a limit to how large a single workflow can be?
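
One likely explanation for the behavior reported here: on Linux, the per-string limit of 32 pages (131072 bytes with 4 KiB pages) applies to each environment entry as well as to each argument, so moving a large payload from a CLI arg into an env var does not by itself avoid the error. The sketch below measures a `NAME=value` entry and shows one general way to shrink such a payload (zlib plus base64). It is not a description of what the Dagster fix does; the `pack`/`unpack` helpers and the payload shape are hypothetical, and the 4 KiB page size is an assumption.

```python
import base64
import json
import zlib

MAX_ARG_STRLEN = 32 * 4096  # Linux per-string limit, assuming 4 KiB pages


def env_entry_size(name, value):
    """Size in bytes of the 'NAME=value' string, including the trailing NUL."""
    return len(f"{name}={value}".encode("utf-8")) + 1


def pack(obj):
    """Serialize to compact JSON, compress, and encode as ASCII text."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw, 9)).decode("ascii")


def unpack(text):
    """Inverse of pack()."""
    return json.loads(zlib.decompress(base64.b64decode(text)))


# Hypothetical payload; the real known_state structure is not shown in the issue.
payload = {
    "steps": [
        {"key": f"op_{i}", "status": "success", "mapping_key": str(i)}
        for i in range(3000)
    ]
}

plain = json.dumps(payload, separators=(",", ":"))
packed = pack(payload)

print("plain entry bytes: ", env_entry_size("DAGSTER_EXECUTE_STEP_ARGS", plain))
print("packed entry bytes:", env_entry_size("DAGSTER_EXECUTE_STEP_ARGS", packed))
```

Compression only raises the ceiling. A sufficiently large graph would still exceed the limit, so passing the state through a file or a storage reference is the more robust direction.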


johannkm commented, Nov 1, 2022 (0 reactions)

The above fix will go out in the 1.0.16 release this Thursday
