Add support for staging INPUTS/OUTPUTS from/to AWS S3
Functionality is needed to stage data files to/from S3 on a per-task level.
Consider the following example WDL file:

task copy_file {
  String output_file
  File input_file

  command {
    cp ${input_file} ${output_file}
  }

  # Declare the copied file as a task output so it can be staged back to S3
  output {
    File copied_file = "${output_file}"
  }

  runtime {
    docker: "ubuntu:latest"
  }
}

workflow wf_copy_file {
  call copy_file
}
and the corresponding inputs.json:

{
  "wf_copy_file.copy_file.input_file": "s3://myBucket/hello.txt",
  "wf_copy_file.copy_file.output_file": "greetings.txt"
}
The workflow execution should copy the input file from S3 into the task's working directory, and copy the output file "greetings.txt" to the S3 bucket configured for logs and outputs. An example of the files written to the output S3 bucket:
# $WF_ID is the workflow identifier (e.g. "E6D5143C-89BC-4823-AED7-2A6AE00A1C2B")
s3://cromwell-output-bucket/$WF_ID/copy_file/outputs/greetings.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-rc.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-stdout.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-stderr.txt
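For illustration, here is a minimal sketch of what the requested per-task staging could look like if done with the AWS CLI inside the task container. The bucket names, the WF_ID variable, and the wrapper itself are placeholders taken from the example above, not part of any existing Cromwell backend; the real implementation would presumably generate something equivalent automatically.

#!/usr/bin/env bash
# Hypothetical staging wrapper for the copy_file task.
# Assumes the AWS CLI and credentials are available in the container,
# and that WF_ID has been set to the workflow identifier.

WORK_DIR=$(mktemp -d)
cd "$WORK_DIR"

# 1. Localize the S3 input declared in inputs.json
aws s3 cp s3://myBucket/hello.txt ./hello.txt

# 2. Run the task command from the WDL above, capturing logs and return code
cp hello.txt greetings.txt > stdout.txt 2> stderr.txt
echo $? > rc.txt

# 3. Delocalize the declared output and the execution logs
OUT_PREFIX="s3://cromwell-output-bucket/${WF_ID}/copy_file"
aws s3 cp greetings.txt "${OUT_PREFIX}/outputs/greetings.txt"
aws s3 cp rc.txt     "${OUT_PREFIX}/wf_copy_file-rc.txt"
aws s3 cp stdout.txt "${OUT_PREFIX}/wf_copy_file-stdout.txt"
aws s3 cp stderr.txt "${OUT_PREFIX}/wf_copy_file-stderr.txt"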
Issue Analytics
- Created: 5 years ago
- Comments: 12 (3 by maintainers)
Top GitHub Comments
@delagoya As of not too long ago, you can override the default bash usage in favor of the shell of your choice.
@elerch Careful with that always-on restart policy from Docker. In my experience, it did not re-read env-files (in my case those env vars are sitting on the host's /etc/defaults/ecs). I expected SIGHUP-like behavior when changing ecs-agent attributes like ECS_CLUSTER, i.e.:
https://github.com/umccr/umccrise/blob/master/deploy/roles/brainstorm.umccrise-docker/files/bootstrap_instance.sh#L39
Instead, I had to resort to a systemd service that re-runs the ecs-agent docker container on boot:
https://github.com/umccr/umccrise/blob/master/deploy/roles/brainstorm.ecs-agent/tasks/main.yml#L75
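For context, a sketch of the kind of command such a boot-time service would re-run to pick up a new ECS_CLUSTER value. The flags follow the pattern in the AWS ECS agent documentation, but the cluster name and mounted paths are placeholders and the exact set used in the linked Ansible role may differ.

# Hypothetical re-launch of the ECS agent container with an updated cluster name
docker rm -f ecs-agent 2>/dev/null || true
docker run --name ecs-agent \
  --detach \
  --restart=on-failure:10 \
  --net=host \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --volume=/var/log/ecs:/log \
  --volume=/var/lib/ecs/data:/data \
  --env=ECS_LOGFILE=/log/ecs-agent.log \
  --env=ECS_DATADIR=/data \
  --env=ECS_CLUSTER=my-cluster \
  amazon/ecs-agent:latest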