Workdir of docker container is overwritten
I'm trying to get my workflow up and running with Digdag and Docker. After the container starts, the workdir defined in the container's Dockerfile gets overwritten. I can't seem to find a way to set the workdir properly.
Example Workflow:

```yaml
timezone: UTC

_export:
  MINIO_HOST: "127.0.0.1:9001"
  MINIO_USERNAME: "minio"
  MINIO_PASSWORD: "minio"

+my_task_1:
  _workdir: "/usr/src/app"
  _export:
    docker:
      image: python:3
  for_each>:
    crawler: [abc, def]
  _do:
    sh>: pwd
```
The output is something like this: `/tmp/digdag-tempdir3664891302268477936/workspace/3_report_32_180_1070820252556486184`
My use case is that I have a workflow consisting of multiple steps; each step is somewhat complex and has its own Docker container with its own entrypoint and workdir.
Another problem is that I'm not sure how to pass intermediate results / data between steps, or how to scale this application onto multiple servers (e.g. run multiple workers). Is there any documentation on this that I missed? The documentation only says:

> Tasks can run on local machine, distributed servers, or in a Docker container.
Thanks! I really like digdag.
Issue Analytics

- Created 5 years ago
- Comments: 5 (2 by maintainers)
Hello, @David-Development

MicroAd (a Digdag user company in Japan) uses Digdag with Docker: https://developers.microad.co.jp/entry/2018/05/24/131136

They use the `py>` operator to execute queries in Hadoop. They don't process data inside the Docker container; instead, the data is stored in Hadoop, so Digdag stays scalable. They also use S3 and the `s3_wait>` operator to wait for task completion.

Hi, @David-Development

As far as I know, there isn't a way to avoid overwriting the workdir if Digdag runs Docker for you. You may use the
`sh>` operator to run `docker` yourself instead.

[FYI] You can also use the `for_range>` operator instead of `for_each>` to run subtasks multiple times: https://docs.digdag.io/operators/for_range.html#for-range-repeat-tasks-for-a-range
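A minimal sketch of that `sh>` workaround, assuming Docker is available on the host running the agent; the image name and the `/usr/src/app` path are taken from the question, and the `docker run` flags should be adjusted to your setup:

```yaml
+my_task_1:
  for_each>:
    crawler: [abc, def]
  _do:
    # Launch the container yourself instead of using the docker: option,
    # so the image's WORKDIR (or an explicit -w flag) is respected.
    sh>: docker run --rm -w /usr/src/app python:3 pwd
```

Because the container is started by `docker run` rather than by Digdag, `pwd` runs in `/usr/src/app` instead of a Digdag temp directory, at the cost of managing volumes and parameter passing (e.g. `${crawler}`) yourself.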
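For reference, a `for_range>` loop looks roughly like this (the parameter values are illustrative; see the linked docs for the variables the operator exports):

```yaml
+repeat:
  for_range>:
    from: 0
    to: 2
    step: 1
  _do:
    # Each iteration sees the current slice via ${range.from} / ${range.to}.
    sh>: echo processing from ${range.from} to ${range.to}
```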