Allow `dvc fetch` to target checksums directly
In my understanding, dvc files are good targets because they store the checksums of data files. It's ergonomic: they work like refs in git.
But I think we should also be able to fetch files by checksum directly, no matter the current state of the workspace. In git, you can fetch a ref, but you can also fetch a specific commit by hash.
My use case is that I have my own diff tooling, used for reference testing, that compares the current version of the dataset with the previous version by reading it directly from the cache. If the file is not there, I currently have to check out that commit in git, run `dvc fetch`, and then check out my original branch again, which is often impossible because I have pending changes.
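To make the use case concrete, here is a minimal sketch of how such tooling might look up a file in the local cache by checksum. It is not a DVC API: `cache_path_for` and `find_in_cache` are hypothetical helper names, and the layout (a directory named after the first two hex characters of the MD5, containing a file named after the remaining thirty) is an assumption that depends on the DVC version, so check it against your own `.dvc/cache` first. Only plain file checksums are handled, not directory entries.

```python
from pathlib import Path
from typing import Optional


def cache_path_for(checksum: str, cache_dir: Path) -> Path:
    """Return the path where a file with this MD5 checksum would be cached.

    ASSUMPTION: the cache is laid out as <cache_dir>/<first 2 hex chars>/<rest>.
    This varies between DVC versions; verify it against your own cache.
    """
    checksum = checksum.lower()
    if len(checksum) != 32 or any(c not in "0123456789abcdef" for c in checksum):
        raise ValueError(f"not an MD5 hex digest: {checksum!r}")
    return cache_dir / checksum[:2] / checksum[2:]


def find_in_cache(checksum: str, cache_dir: Path = Path(".dvc/cache")) -> Optional[Path]:
    """Return the cached file for this checksum, or None if it is not present."""
    path = cache_path_for(checksum, cache_dir)
    return path if path.is_file() else None
```

When `find_in_cache` returns `None`, the tooling has no way to ask for that checksum alone, which is the gap this issue describes: the only route is to fetch through a target that references it.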
Issue Analytics
- State:
- Created 4 years ago
- Comments: 6 (6 by maintainers)
The issue itself was not implemented, though it’s not needed anymore. IMO we can close it 😉
@villasv In that case, should we close this issue for now? 😃