
Add support for staging INPUTS/OUTPUTS from/to AWS S3

See original GitHub issue

Functionality is needed to stage data files to/from S3 on a per-task level.

For the example WDL file:

task copy_file {
  String output_file
  File input_file

  command {
    cp ${input_file} ${output_file}
  }
  runtime {
    docker: "ubuntu:latest"
  }
  # declare the copied file as an output so the engine knows to delocalize it
  output {
    File out = output_file
  }
}

workflow wf_copy_file {
  call copy_file
}

and the corresponding inputs.json:

{
  "wf_cop_file.copy_file.input_file": "s3://myBucket/hello.txt",
  "wf_cop_file.copy_file.output_file": "greetings.txt"
}

The workflow execution should copy the input file from S3 into the task's working directory, and copy the output file “greetings.txt” to the S3 bucket configured for logs and outputs. An example of the files written to the output bucket:

# $WF_ID is the workflow identifier (e.g. "E6D5143C-89BC-4823-AED7-2A6AE00A1C2B")
s3://cromwell-output-bucket/$WF_ID/copy_file/outputs/greetings.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-rc.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-stdout.txt
s3://cromwell-output-bucket/$WF_ID/copy_file/wf_copy_file-stderr.txt
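
This is not Cromwell's actual implementation, but as a rough sketch the per-task staging could look like the following shell wrapper, assuming the AWS CLI is available in the task environment and reusing the bucket names and $WF_ID from the example above (the rc/stdout/stderr file names are illustrative):

# Illustrative sketch only, not Cromwell's generated script
WF_ID="E6D5143C-89BC-4823-AED7-2A6AE00A1C2B"
OUT="s3://cromwell-output-bucket/${WF_ID}/copy_file"

# Localize: stage the S3 input into the task working directory
aws s3 cp "s3://myBucket/hello.txt" hello.txt

# Run the task's command block, capturing logs and the return code
( cp hello.txt greetings.txt ) > stdout.txt 2> stderr.txt
echo $? > rc.txt

# Delocalize: push the declared output, return code, and logs back to S3
aws s3 cp greetings.txt "${OUT}/outputs/greetings.txt"
aws s3 cp rc.txt        "${OUT}/wf_copy_file-rc.txt"
aws s3 cp stdout.txt    "${OUT}/wf_copy_file-stdout.txt"
aws s3 cp stderr.txt    "${OUT}/wf_copy_file-stderr.txt"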

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments:12 (3 by maintainers)

Top GitHub Comments

1 reaction
geoffjentry commented, Jun 29, 2018

@delagoya as of not too long ago you can override the default bash usage in favor of the shell of your choice
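
For reference, the knob being described here appears to be Cromwell's system.job-shell configuration setting; a minimal sketch of overriding it at launch, with the workflow and inputs file names from this issue used as stand-in arguments:

# Assumes the system.job-shell setting documented for recent Cromwell releases
java -Dsystem.job-shell=/bin/sh -jar cromwell.jar run wf_copy_file.wdl --inputs inputs.json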

0 reactions
brainstorm commented, Aug 15, 2018

@elerch Careful with that always-on restart policy from Docker. In my experience, it did not re-read env-files (in my case those env vars are sitting in the host's /etc/defaults/ecs). I expected SIGHUP-like behavior when changing ecs-agent attributes like ECS_CLUSTER, i.e.:

https://github.com/umccr/umccrise/blob/master/deploy/roles/brainstorm.umccrise-docker/files/bootstrap_instance.sh#L39

Instead, I had to resort to a systemd service that re-runs the ecs-agent docker container on boot:

https://github.com/umccr/umccrise/blob/master/deploy/roles/brainstorm.ecs-agent/tasks/main.yml#L75
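
A minimal sketch of that workaround, under the assumption that removing the container and starting a fresh one is what forces --env-file to be re-read (the linked playbook's real invocation passes additional volumes and flags):

# A plain restart reuses the old environment; a fresh `docker run` picks up changes
docker rm -f ecs-agent 2>/dev/null || true
docker run --detach --name ecs-agent \
  --env-file=/etc/defaults/ecs \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  amazon/amazon-ecs-agent:latest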

Read more comments on GitHub >

Top Results From Across the Web

Input and output artifacts - AWS CodePipeline
Stages use input and output artifacts that are stored in the Amazon S3 artifact ... as an input artifact to the Deploy stage,...

Create a pipeline that uses Amazon S3 as a deployment ...
The pipeline then uses Amazon S3 to deploy the files to your bucket. ... In Step 2: Add source stage, in Source provider,...

Deploy artifacts to Amazon S3 in different accounts using ...
1. Open the Amazon S3 console in the development account. 2. In the Bucket name list, choose your development input S3 bucket. For...

Tutorial: Create a simple pipeline (S3 bucket)
Follow the steps in this CodePipeline tutorial to create a simple two-stage pipeline using an S3 bucket as a code repository.

Staging Data and Tables with Pipeline Activities
AWS Data Pipeline can stage input and output data in your pipelines to make it easier to use certain activities, such as ShellCommandActivity...
