New command for updating dependency hashes.
Idea
The main idea is to introduce a new command that refreshes dependency hashes in the dvc.lock file without running the pipeline again. This command would be useful when one or more dependencies are modified in ways that don’t affect the results, for example adding a comment or a new function to a utils module.
Problem
Currently, updating dependencies is possible only by running the pipeline again. This solution has some drawbacks:
- Pipeline execution can be time-consuming
- Reproducing the pipeline requires downloading the input data. (This can be problematic when we want to make changes from a new machine.)
- Using a module shared by multiple pipelines compounds the previous problems. (A modification in this module makes it necessary to update many lock files.)
Possible solution
Introduce a new command, dvc refresh, that recomputes hashes.
Interface
usage: dvc refresh [-h] [-q | -v] [-f] [-d <stage> <filename>] target
positional arguments:
target - Limit command scope to specific pipeline.
Options
-d <stage> <filename> - recompute the hash of the file. If the file is tracked by DVC, ask for confirmation when the file is modified. (Can be used multiple times to specify more targets.)
-f, --force - overwrite existing hashes in the dvc.lock file without asking for confirmation.
-q, --quiet - do not write anything to standard output. Exit with 0 if no problems arise.
-h, --help - print the usage/help message and exit.
-v, --verbose - display detailed tracing information.
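The interface above could be expressed with Python's argparse roughly as follows. This is only a sketch of the proposal, not DVC's actual parser: the command name `dvc refresh`, the `deps` destination, and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the proposed (hypothetical) interface."""
    parser = argparse.ArgumentParser(
        prog="dvc refresh",  # hypothetical command name taken from the proposal
        description="Recompute dependency hashes in dvc.lock without running the pipeline.",
    )
    # -q and -v exclude each other, matching "[-q | -v]" in the usage line.
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-q", "--quiet", action="store_true",
                           help="do not write anything to standard output")
    verbosity.add_argument("-v", "--verbose", action="store_true",
                           help="display detailed tracing information")
    parser.add_argument("-f", "--force", action="store_true",
                        help="overwrite existing hashes without confirmation")
    # Each -d takes a stage name and a file name and may be repeated.
    parser.add_argument("-d", dest="deps", nargs=2, action="append", default=[],
                        metavar=("<stage>", "<filename>"),
                        help="recompute the hash of one dependency")
    parser.add_argument("target", help="pipeline to which the command is limited")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, repeating `-d` collects one `[stage, filename]` pair per occurrence, and leaving it out yields an empty list, which the command would interpret as "all dependencies".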
Behavior
- Because the command can corrupt state, it can be used only with a specified target.
- If the command is executed without any -d option, it applies to all dependencies in the pipeline.
- If the -d option occurs at least once, it applies only to those dependencies.
- If a file is tracked by DVC, ask before updating the hash:
  - yes - update the file hash, and raise an error if the file doesn’t exist.
  - no - don’t modify the hash.
- This command doesn’t commit changes to the cache.
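The rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not DVC code: the lock file is modelled as an in-memory mapping shaped like stages -> deps -> path/md5, hashes are plain MD5 of the file bytes (DVC's real hashing may differ), only single-file dependencies are handled, and `refresh_hashes`, `is_tracked`, and `confirm` are hypothetical names.

```python
from __future__ import annotations

import hashlib
from pathlib import Path
from typing import Callable, Iterable, Optional


def file_md5(path: Path) -> str:
    """Plain MD5 of the file bytes (assumption: real DVC hashing may differ)."""
    digest = hashlib.md5()
    with path.open("rb") as fobj:
        for chunk in iter(lambda: fobj.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_hashes(
    lock: dict,
    selected: Optional[Iterable[tuple[str, str]]] = None,
    is_tracked: Callable[[str], bool] = lambda path: False,
    confirm: Callable[[str], bool] = lambda path: False,
    force: bool = False,
    root: Path = Path("."),
) -> list[str]:
    """Update dependency hashes in the lock mapping; return the updated paths."""
    wanted = set(selected) if selected else None  # None means all dependencies
    updated = []
    for stage_name, stage in lock.get("stages", {}).items():
        for dep in stage.get("deps", []):
            path = dep["path"]
            if wanted is not None and (stage_name, path) not in wanted:
                continue  # -d was given and this dependency was not listed
            file_path = root / path
            new_hash = file_md5(file_path) if file_path.is_file() else None
            if new_hash is not None and new_hash == dep.get("md5"):
                continue  # unmodified, nothing to refresh
            if is_tracked(path) and not force and not confirm(path):
                continue  # answer "no": leave the hash untouched
            if new_hash is None:
                raise FileNotFoundError(f"dependency does not exist: {path}")
            dep["md5"] = new_hash
            updated.append(path)
    return updated
```

The function only mutates the mapping it is given; writing the result back to dvc.lock and leaving the cache untouched, as the last rule requires, would be the caller's responsibility.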
Benefits
- Pipeline dependencies can be updated without running the pipeline again
- Downloading massive data is no longer needed to update pipeline dependencies
- Extracting code into shared modules will be easier
- Hashes can be updated with surgical precision
Drawbacks
- Command can corrupt tracking state
Final Notes
I would greatly appreciate your feedback on what you think about this idea.
Issue Analytics
- State:
- Created 2 years ago
- Comments:5
Thanks for linking these issues. I didn’t see them when I was opening this one.
The problem presented in this issue will be solved if the functionality provided as a result of #4657 allows updating dependencies without downloading any data and works when only dependencies are modified.
If I’m understanding this correctly, this is already possible using dvc commit. If you have modifications to dependencies or outputs in your local workspace, dvc commit will commit the current state of those files from your workspace into dvc.lock, without the need to run dvc repro.