Support no-command meta stages
I would find it very useful to be able to define a DVC stage that has inputs but no command or outputs. The use case is to create meta-stages that serve only to depend on other stages, either to provide convenient ways to run part of the pipeline, or in the top-level `dvc.yaml` to make a bare `dvc repro` reproduce outputs defined in subdirectories.
I see two options to enable this that shouldn’t cause conflict with existing pipelines:
- Support a stage without a `cmd` entry; such a stage checks its inputs and updates its `md5` but does not run any command or produce outputs.
- If implicit behavior without `cmd` is undesirable, require `cmd`, but allow it to be YAML `null`: `cmd: null`. This will make it explicit in the `dvc.yaml` file that this stage does not have a command.
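As a rough sketch, the two options might look like this in a `dvc.yaml`. This is the proposed syntax only, not existing DVC behavior; the stage names (`dataprep`, `dataprep_explicit`) and the dependency paths are invented for illustration.

```yaml
# Hypothetical syntax: illustrates the proposal in this issue only.
# Stage names and paths are made up for the example.
stages:
  # Option 1: the stage has no `cmd` entry at all
  dataprep:
    deps:
      - data/clean/ratings.csv
      - data/clean/items.csv

  # Option 2: `cmd` stays required, but may be an explicit YAML null
  dataprep_explicit:
    cmd: null
    deps:
      - data/clean/ratings.csv
      - data/clean/items.csv
```

In either form the stage would only track its `deps`, so reproducing it would bring the stages that produce those files up to date without running anything itself.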
I currently have two workarounds for this lack:
- If my pipeline will only be used on *nix systems, use `cmd: 'true'`.
- Use another no-op command, like `cmd: python -V`; it works, but it’s a bit weird and doesn’t clearly communicate the intent of the stage.
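For concreteness, here is a minimal sketch of both workarounds as stage definitions. The stage names and dependency paths are hypothetical; only the `cmd` values come from the workarounds above.

```yaml
# Sketch of the two workarounds; stage names and paths are invented.
stages:
  # Workaround 1: *nix only, relies on the `true` command being available
  dataprep:
    cmd: 'true'        # quoted so YAML reads it as a string, not a boolean
    deps:
      - data/clean/ratings.csv
      - data/clean/items.csv

  # Workaround 2: any harmless command, here printing the Python version
  dataprep_portable:
    cmd: python -V     # assumes `python` is on PATH
    deps:
      - data/clean/ratings.csv
      - data/clean/items.csv
```

The quotes around `'true'` matter: an unquoted `true` is parsed by YAML as a boolean rather than as the name of a command.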
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (2 by maintainers)

That’s fair. And the most self-documenting would be to create some kind of ‘finalize’ script that reports something meaningful but brief, and use that as a command.
Yes, it can (provided that all stages are wanted, and not just some of them). Somehow I missed that as a prospective solution.
If you want to close this, it’s fine. I can make things work with these solutions, and it’s only affecting the potential ergonomics of my pipelines, not my ability to specify the pipelines I need. I just thought I’d suggest it to raise discussion about the use cases, and in case others might find it useful as well 😃.
There are two use cases I have:
- … `repro` can help with this, but are less convenient, particularly if the stages to group are split across multiple subdirectories (each with its own `dvc.yaml`), and/or may change from time to time. Defined meta-stages allow you to document things like “run `dvc repro dataprep` to ensure data preparation is up to date”, and just keep the meta-stage `dataprep` definition up to date to ensure the instructions keep working.
- … `dvc repro` without arguments in a project root, it reproduces all stages in the root `dvc.yaml`; stages defined in `dvc.yaml` files in subdirectories don’t get reproduced unless something in the root `dvc.yaml` depends on one of their outputs. Putting a meta-stage in the top-level `dvc.yaml` would provide an easy way to include those outputs in a no-argument `dvc repro` run.

The second of these is currently my more common use case, but I have felt the need for the first from time to time.
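To make the two use cases concrete, here is a sketch of a root-level `dvc.yaml` that covers both, written with the no-op workaround because a command-less stage is only a proposal. The directory layout, the file names and the `all` stage name are assumptions for illustration; `dataprep` is the name used in the example above.

```yaml
# Root dvc.yaml (hypothetical project layout; all paths are invented)
stages:
  # Use case 1: a named group, so `dvc repro dataprep` refreshes data preparation
  dataprep:
    cmd: python -V     # no-op placeholder standing in for "no command"
    deps:
      - ingest/output/raw.csv
      - cleaning/output/clean.csv

  # Use case 2: pull subdirectory outputs into a bare `dvc repro` at the root
  all:
    cmd: python -V     # no-op placeholder standing in for "no command"
    deps:
      - cleaning/output/clean.csv
      - training/output/model.pkl
```

Because each meta-stage lists outputs produced by stages in subdirectory `dvc.yaml` files as its `deps`, those stages become part of what the root pipeline depends on. Keeping the `deps` lists current is the only maintenance the grouping needs.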