Publish to Dockstore via zip
Feature Request
Desired behaviour
It would be helpful to have the option to publish to Dockstore by uploading a tarball, or something similar.
The rationale for this was discussed on 8/22 in a call with Dockstore developers and Human Cell Atlas Data Coordination Platform developers and is described further below.
Decoupling imports from Git repo layout
Essentially, our import statements do not match the file layout in our Git repo, and this causes things to break when we publish to Dockstore.
Simplifying a bit, the layout in our repo looks like:
- workflows
  - my_workflow.wdl
- tasks
  - task1.wdl
  - task2.wdl
… and in my_workflow.wdl, the imports look like:
import "task1.wdl"
import "task2.wdl"
However, as Dockstore exists today, we can’t publish directly from this repo: the imported files cannot be found relative to that layout, so Dockstore is unable to package the dependencies and submit our workflow to an execution engine.
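The mismatch described above can be checked with a small script. This is only a sketch, under the assumption that local imports are resolved relative to the directory containing the workflow file; the function name `unresolved_imports` is hypothetical and is not part of Dockstore or any WDL tool.

```python
import re
from pathlib import Path

# Matches lines such as: import "task1.wdl"  (the quotes are optional here so
# that simplified, unquoted examples are picked up as well).
IMPORT_RE = re.compile(r'^\s*import\s+"?([^"\s]+)"?', re.MULTILINE)


def unresolved_imports(workflow_path):
    """Return local imports that do not exist next to the workflow file.

    Assumption: imports are resolved relative to the workflow's own directory.
    """
    workflow_path = Path(workflow_path)
    missing = []
    for target in IMPORT_RE.findall(workflow_path.read_text()):
        if target.startswith(("http://", "https://")):
            continue  # HTTP imports are out of scope for this check
        if not (workflow_path.parent / target).is_file():
            missing.append(target)
    return missing
```

Run against workflows/my_workflow.wdl in the layout above, this would report both task files as missing, since they live under tasks rather than next to the workflow.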
Publishing via tarball
We could solve this if Dockstore provided a way to publish via a tarball.
We would make a flat tarball that looks like:
- my_workflow.wdl
- task1.wdl
- task2.wdl
Relative to that layout, the imports would make sense, so Dockstore could then package up the dependencies and submit to an execution engine.
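For illustration, a flat archive like the one above could be produced from the repo with a few lines of Python. This is a rough sketch rather than a proposal for how Dockstore should work: the function name `build_flat_zip` is hypothetical, it uses a zip file (the standard `tarfile` module would work along the same lines for a tarball), and it assumes that no two WDL files in the repo share a file name.

```python
import zipfile
from pathlib import Path


def build_flat_zip(repo_root, zip_path):
    """Write every .wdl file under repo_root into zip_path with no directories."""
    by_name = {}
    for wdl in sorted(Path(repo_root).rglob("*.wdl")):
        if wdl.name in by_name:
            # Flattening only works if file names are unique across the repo.
            raise ValueError(
                f"duplicate file name {wdl.name}: {by_name[wdl.name]} and {wdl}"
            )
        by_name[wdl.name] = wdl

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, path in by_name.items():
            # arcname drops the directory part, so the archive is flat.
            archive.write(path, arcname=name)
    return list(by_name)
```

Raising on duplicate names is a deliberate choice in this sketch: silently overwriting one task file with another would be harder to debug than a failed packaging step.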
Workarounds have drawbacks
Symlinks
It’s possible to add symlinks to our repo as a workaround, but this adds clutter and a possible source of confusion to the repo.
HTTP imports
Using HTTP imports would require us to bake version info into the import statements in our WDL files, e.g.:
import "https://raw.githubusercontent.com/HumanCellAtlas/skylab/v1.2.3/tasks/task1.wdl"
which we would rather not have to do.
Submitting dependencies in a zip file rather than via HTTP imports also allows the system submitting WDL files to cache them, rather than having the execution engine download them anew for each workflow run. Although execution engines might choose to implement their own caching, doing the caching ourselves gives us more control.
Why our imports don’t match our repo layout
Assuming a flat structure in the import statements improves portability in some scenarios. For example, we can easily download the main workflow file to the root directory of a VM, expand an archive of dependencies next to it, and then run that workflow locally. Complications arise if we have imports like …/task1.wdl; we could imagine workarounds, but we’d rather not have to handle that problem.
Also, the layout of our Git repo changes sometimes, and we don’t want to have to rewrite all the import statements when that happens.
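The local-run scenario described above amounts to very little code when the imports are flat. The following is a minimal sketch, assuming the dependencies arrive as a zip archive; `stage_for_local_run` and its parameters are hypothetical names used only for illustration.

```python
import shutil
import zipfile
from pathlib import Path


def stage_for_local_run(workflow_file, dependencies_zip, work_dir):
    """Copy the main workflow into work_dir and unpack its dependencies beside it."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Place the main workflow file at the top of the working directory.
    staged = work_dir / Path(workflow_file).name
    shutil.copyfile(workflow_file, staged)

    # A flat archive expands directly next to the workflow, so imports such as
    # "task1.wdl" resolve without any path rewriting.
    with zipfile.ZipFile(dependencies_zip) as archive:
        archive.extractall(work_dir)
    return staged
```

The returned path can then be handed to whichever execution engine is used to run the workflow locally.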
Conclusion
Having the ability to publish via tarball or something similar would make it easier for us to integrate with Dockstore.
┆Issue is synchronized with this Jira Story ┆Issue Type: Story ┆Fix Versions: Dockstore 1.6 ┆Sprint: Seabright Sprint 4 Dory ┆Issue Number: DOCK-154 ┆Epic: Publish Workflows via Zip
Top GitHub Comments
Hi @coverbeck - Thanks for following up! I’m going to link this to our engineers that are at Broad, to see if they have answers to your questions.
Hi @samanehsan, thanks for the information!