Change the COS directory layout for pipeline artifacts
Is your feature request related to a problem? Please describe.
Spawned from https://github.com/elyra-ai/elyra/pull/1732
Describe the solution you’d like
Currently Elyra uses the following directory layout for input and output artifacts that are stored in a cloud storage bucket:
<pipeline-name-with-timestamp>/
<pipeline-name-with-timestamp>/<node-1-archive>.tgz # input artifacts for node 1
<pipeline-name-with-timestamp>/<node-2-archive>.tgz # input artifacts for node 2
<pipeline-name-with-timestamp>/<output-artifact-1>
<pipeline-name-with-timestamp>/<output-artifact-2>
<pipeline-name-with-timestamp>/<output-artifact-3>
...
There are two issues with this layout:
- input and output artifacts are stored in the same location
- output artifacts can be overwritten if the pipeline is run multiple times outside of Elyra (e.g. on Kubeflow Pipelines or Apache Airflow)
To resolve these issues we could:
- separate input and output artifacts
- associate output artifacts with a “run id”
For example:
<pipeline-name-with-timestamp>/
<pipeline-name-with-timestamp>/<node-1-archive>.tgz
<pipeline-name-with-timestamp>/<node-2-archive>.tgz
<pipeline-name-with-timestamp>/<run-1-id>/<output-artifact-1>
<pipeline-name-with-timestamp>/<run-1-id>/<output-artifact-2>
<pipeline-name-with-timestamp>/<run-1-id>/<output-artifact-3>
<pipeline-name-with-timestamp>/<run-2-id>/<output-artifact-1>
<pipeline-name-with-timestamp>/<run-2-id>/<output-artifact-2>
<pipeline-name-with-timestamp>/<run-2-id>/<output-artifact-3>
...
Ideally this unique “run id” is something that’s meaningful to the user.
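The proposed layout can be sketched as a pair of key-building helpers. This is only an illustration under stated assumptions, not Elyra's implementation: the function names, the example pipeline prefix, and the run id format (a user-supplied label plus a UTC timestamp) are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def input_archive_key(pipeline_prefix: str, node_archive: str) -> str:
    """Key for a node's input archive; shared by every run of the pipeline."""
    return f"{pipeline_prefix}/{node_archive}"


def output_artifact_key(pipeline_prefix: str, run_id: str, artifact: str) -> str:
    """Key for an output artifact, scoped to one run so that runs cannot collide."""
    return f"{pipeline_prefix}/{run_id}/{artifact}"


def make_run_id(label: str, now: Optional[datetime] = None) -> str:
    """Hypothetical run id: a user-meaningful label plus a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    return f"{label}-{now.strftime('%Y%m%d%H%M%S')}"


# Assumed example values, for illustration only.
prefix = "my-pipeline-0315120000"
run_1 = make_run_id("nightly", datetime(2021, 5, 1, 8, 0, 0, tzinfo=timezone.utc))
run_2 = make_run_id("nightly", datetime(2021, 5, 2, 8, 0, 0, tzinfo=timezone.utc))

archive = input_archive_key(prefix, "node-1.tgz")
out_1 = output_artifact_key(prefix, run_1, "model.pkl")
out_2 = output_artifact_key(prefix, run_2, "model.pkl")
```

With the current layout both runs would write `model.pkl` to the same key. Here the two output keys differ by their run id segment, while the input archive keeps a single location that both runs read from.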
Top GitHub Comments
I agree. We can always improve this incrementally and start by separating input and output artifacts. While such a change would impact the user workflow a bit, it doesn’t have any impact on any of the Elyra tooling/UI. A new release would be good timing to make such a change.
Right. Perhaps it might be better to treat it purely as a classifier; “scope” also comes to mind. However, I don’t think we want to entertain having to open the pipeline files to determine their “scope”. If we need that reverse lookup (which is natural), this might require more thought.