Celery & Docker: `docker.errors.DockerException: Command 'dagster api execute_step {...} returned non-zero exit status 1`
Summary
I’ve tried to run a Dagster pipeline with the following configuration:
- Separate gRPC server for pipelines
- Docker run launcher
- Celery-Docker executor
And got an error without any useful logs. It just says that the process failed with a non-zero exit code, and I can't find any additional information anywhere. So I can't even tell whether my configuration is wrong or it's a bug inside Dagster:
celery-worker_1 | Traceback (most recent call last):
celery-worker_1 | File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 450, in trace_task
celery-worker_1 | R = retval = fun(*args, **kwargs)
celery-worker_1 | File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 731, in __protected_call__
celery-worker_1 | return self.run(*args, **kwargs)
celery-worker_1 | File "/usr/local/lib/python3.8/site-packages/dagster_celery_docker/executor.py", line 282, in _execute_step_docker
celery-worker_1 | docker_response = client.containers.run(
celery-worker_1 | File "/usr/local/lib/python3.8/site-packages/docker/models/containers.py", line 848, in run
celery-worker_1 | raise ContainerError(
celery-worker_1 | docker.errors.DockerException: Command 'dagster api execute_step "{\"__class__\": \"ExecuteStepArgs\",
\"instance_ref\": {\"__class__\": \"InstanceRef\", \"compute_logs_data\": {\"__class__\": \"ConfigurableClassData\",
\"class_name\": \"LocalComputeLogManager\", \"config_yaml\": \"base_dir: /opt/dagster/dagster_home/storage\\n\",
\"module_name\": \"dagster.core.storage.local_compute_log_manager\"}, \"custom_instance_class_data\": null,
\"event_storage_data\": {\"__class__\": \"ConfigurableClassData\", \"class_name\": \"PostgresEventLogStorage\",
\"config_yaml\": \"postgres_db:\\n db_name:\\n env: DAGSTER_POSTGRES_DB\\n hostname: postgresql\\n password:\\n
env: DAGSTER_POSTGRES_PASSWORD\\n port: 5432\\n username:\\n env: DAGSTER_POSTGRES_USER\\n\",
\"module_name\": \"dagster_postgres.event_log\"}, \"local_artifact_storage_data\": {\"__class__\": \"ConfigurableClassData\",
\"class_name\": \"LocalArtifactStorage\", \"config_yaml\": \"base_dir: /opt/dagster/dagster_home/\\n\", \"module_name\":
\"dagster.core.storage.root\"}, \"run_coordinator_data\": {\"__class__\": \"ConfigurableClassData\", \"class_name\":
\"QueuedRunCoordinator\", \"config_yaml\": \"{}\\n\", \"module_name\": \"dagster.core.run_coordinator\"},
\"run_launcher_data\": {\"__class__\": \"ConfigurableClassData\", \"class_name\": \"DockerRunLauncher\", \"config_yaml\":
\"env_vars:\\n- DAGSTER_POSTGRES_USER\\n- DAGSTER_POSTGRES_PASSWORD\\n-
DAGSTER_POSTGRES_DB\\nnetworks:\\n- dagster-celery-docker-bug_default\\n\", \"module_name\": \"dagster_docker\"},
\"run_storage_data\": {\"__class__\": \"ConfigurableClassData\", \"class_name\": \"PostgresRunStorage\", \"config_yaml\":
\"postgres_db:\\n db_name:\\n env: DAGSTER_POSTGRES_DB\\n hostname: postgresql\\n password:\\n env:
DAGSTER_POSTGRES_PASSWORD\\n port: 5432\\n username:\\n env: DAGSTER_POSTGRES_USER\\n\",
\"module_name\": \"dagster_postgres.run_storage\"}, \"schedule_storage_data\": {\"__class__\": \"ConfigurableClassData\",
\"class_name\": \"PostgresScheduleStorage\", \"config_yaml\": \"postgres_db:\\n db_name:\\n env:
DAGSTER_POSTGRES_DB\\n hostname: postgresql\\n password:\\n env: DAGSTER_POSTGRES_PASSWORD\\n port:
5432\\n username:\\n env: DAGSTER_POSTGRES_USER\\n\", \"module_name\": \"dagster_postgres.schedule_storage\"},
\"scheduler_data\": {\"__class__\": \"ConfigurableClassData\", \"class_name\": \"DagsterDaemonScheduler\", \"config_yaml\":
\"{}\\n\", \"module_name\": \"dagster.core.scheduler\"}, \"settings\": {}}, \"known_state\": {\"__class__\":
\"KnownExecutionState\", \"dynamic_mappings\": {}, \"previous_retry_attempts\": {}}, \"pipeline_origin\": {\"__class__\":
\"PipelinePythonOrigin\", \"pipeline_name\": \"demo_pipeline\", \"repository_origin\": {\"__class__\":
\"RepositoryPythonOrigin\", \"code_pointer\": {\"__class__\": \"FileCodePointer\", \"fn_name\": \"demo_pipeline\",
\"python_file\": \"/pipelines/demo.py\", \"working_directory\": \"/opt/dagster/dagster_home\"}, \"container_image\":
\"pipelines-image\", \"executable_path\": \"/usr/local/bin/python\"}}, \"pipeline_run_id\": \"be551732-9fb6-4faf-b0cb-
3d4c44b3ce72\", \"retry_mode\": {\"__enum__\": \"RetryMode.DEFERRED\"}, \"should_verify_step\": false,
\"step_keys_to_execute\": [\"hello\"]}"' in image 'pipelines-image' returned non-zero exit status 1
Reproduction
I’ve prepared a full environment for reproduction: https://github.com/VladX09/dagster-celery-docker-bug/tree/non-zero-exit-example. Just follow the README.md and you will see this error in the executor logs and in the Dagit UI.
Additional Info about Your Environment
macOS 11.4, Docker 3.0.1 (engine 20.10.0)
Also reproduces in our production environment.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We factor engagement into prioritization.
OK, I’ve understood: the variables for the solid's container are forwarded from the Celery worker node. Still, it would be nice to have an option to disable `auto_remove`. A sketch of what that forwarding means for the compose file is below.
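Concretely, the Celery worker service itself needs to carry the environment variables that the step containers will use. A rough sketch in docker-compose terms (the service name is taken from the logs above; the image tag, values, and socket mount are placeholders/assumptions, not the exact repro repo):

```yaml
# Sketch: the celery worker must itself carry the env vars that get forwarded
# to the step containers it launches. Variable names come from this issue;
# the image and values are placeholders.
services:
  celery-worker:
    image: my-celery-worker-image              # placeholder
    environment:
      DAGSTER_POSTGRES_USER: "example_user"        # illustrative values
      DAGSTER_POSTGRES_PASSWORD: "example_password"
      DAGSTER_POSTGRES_DB: "example_db"
    volumes:
      # Typically needed so the worker can talk to the Docker daemon and launch step containers.
      - /var/run/docker.sock:/var/run/docker.sock
    # The compose project's default network ends up named
    # dagster-celery-docker-bug_default, matching the run launcher config above.
```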
Closing since it looks like this is resolved (plenty of docs work to do, though). Please reopen if that’s not the case.