EcsRunLauncher tasks fail to start with CLI error
Dagster version
dagster, version 1.0.6
What’s the issue?
When I attempt to launch a run using the EcsRunLauncher class, the ECS task fails with an error triggered by the command Dagster injects. According to the ECS console, Dagster is sending the command:
["dagster","api","execute_run","<large JSON string>"]
In the logs I see the Dagster CLI complaining about the input command:
2022-08-31 08:56:18 Usage: dagster [OPTIONS] COMMAND [ARGS]...
2022-08-31 08:56:18 CLI tools for working with Dagster.
2022-08-31 08:56:18 Options:
2022-08-31 08:56:18 -v, --version Show the version and exit.
2022-08-31 08:56:18 -h, --help Show this message and exit.
2022-08-31 08:56:18 Commands:
2022-08-31 08:56:18 asset Commands for working with Dagster assets.
2022-08-31 08:56:18 debug Commands for debugging Dagster job runs.
2022-08-31 08:56:18 instance Commands for working with the current Dagster instance.
...
If I copy and paste the large JSON string and run the command via the container locally with docker run <image> dagster api execute_run <large JSON string>, it can at least start the task.
What did you expect to happen?
When the ECS task starts, I would expect the Dagster run to at least start rather than fail while the CLI parses the command.
How to reproduce?
As part of my dagster.yaml I have the EcsRunLauncher defined:
run_launcher:
  module: "dagster_aws.ecs"
  class: "EcsRunLauncher"
  config:
    task_definition: <task_definition_arn>
    container_name: <task_container_name>
    include_sidecars: true
I have a simple job to keep a task busy for ~60s
import time

from dagster import graph, op, repository


@op
def my_op():
    # Busy-loop for roughly 60 seconds, printing progress every 1000 iterations.
    start = time.time()
    i = 0
    while time.time() - start < 60:
        i = i + 1
        if i % 1000 == 0:
            print(i)
    return True


@graph
def my_graph():
    my_op()


my_job = my_graph.to_job()


@repository
def busy():
    return [my_job]
I can start the job from the UI and it runs fine using the default run launcher, but when switching to the EcsRunLauncher I am getting errors starting the job.
I have two other containers running dagit and the dagster-daemon in a separate ECS task.
Deployment type
Other
Deployment details
This repo, https://github.com/datarootsio/terraform-aws-ecs-dagster, currently serves as the basis for our configuration, which we are setting up to compare Dagster against other job tools.
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
Entrypoint in the task definition seems to have been the culprit! I’ve run into the issue before as well in a Docker-specific context, which is why we tried running with both sh -c and /bin/bash -c, but I never thought to remove it 🤦
I think a note within this section about the entrypoint would be beneficial, as our standard Terraform setup for ECS uses sh -c for all task definitions and a command syntax of /bin/bash -c \"${var.command}\"
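One way to check for this ahead of time is to look for an entryPoint on each container in the task definition. The following is only a sketch: the helper name containers_with_entrypoint and the sample dictionary are hypothetical, and the sample is shaped like the "taskDefinition" value that boto3's describe_task_definition call returns.

```python
def containers_with_entrypoint(task_definition: dict) -> dict:
    """Map container name -> entryPoint for every container that defines one."""
    return {
        container["name"]: container["entryPoint"]
        for container in task_definition.get("containerDefinitions", [])
        if container.get("entryPoint")
    }


# Hypothetical sample; a real one would come from the ECS API.
sample = {
    "containerDefinitions": [
        {"name": "dagster", "image": "<image>", "entryPoint": ["sh", "-c"]},
        {"name": "sidecar", "image": "<image>"},
    ]
}

print(containers_with_entrypoint(sample))  # {'dagster': ['sh', '-c']}
```

Assuming boto3 and AWS credentials are available, the real input could be fetched with boto3.client("ecs").describe_task_definition(taskDefinition="<task_definition_arn>")["taskDefinition"]. Any container reported here will have its entrypoint prepended to the command the run launcher sends.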
Do you have a CMD or ENTRYPOINT defined in your task definition? You might be running into: https://aws.amazon.com/blogs/opensource/demystifying-entrypoint-cmd-docker/
So Dagster is indeed sending the command:
["dagster","api","execute_run","<large JSON string>"]
but ECS might then be array-concatenating it with whatever is in CMD or ENTRYPOINT in your task definition, so the container ends up running a different command than the one Dagster sent. At least when we’ve seen similar issues in the past, that has been the culprit.
I’m a little hesitant to change the default command behavior for the launcher without knowing if this is specific to your custom task definition, but I’m curious to know what your task definition looks like so we can either provide better docs/error messages or make the launcher compatible with this kind of customization.
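The concatenation described above can be reproduced locally without ECS. This is a minimal sketch under stated assumptions: the entrypoint is the sh -c mentioned elsewhere in this thread, and echo stands in for the dagster executable so the example runs on any machine with a POSIX shell.

```python
import subprocess

# Assumed entrypoint (the `sh -c` from this thread); `echo` is a stand-in for `dagster`.
entrypoint = ["sh", "-c"]
command = ["echo", "api", "execute_run", "<large JSON string>"]

# Docker, and therefore ECS, runs ENTRYPOINT followed by the command as one argv list.
argv = entrypoint + command

result = subprocess.run(argv, capture_output=True, text=True, check=True)

# `sh -c` executes only the first argument after -c ("echo") as the script.
# The remaining items become $0, $1, $2 of that script and never reach echo,
# so the output is just an empty line.
print(repr(result.stdout))  # '\n'
```

With dagster in place of echo, the shell would invoke dagster with no subcommand at all, which would be consistent with the top-level usage text in the logs. Removing the entrypoint from the task definition lets the full command list run as sent.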