Allow `dvc fetch` to target checksums directly

In my understanding, DVC-files are good targets because they store the checksums of data files. They are ergonomic and work like refs in Git.

But I think we should also be able to fetch files directly by checksum, regardless of the current state of the workspace. In Git, you can fetch a ref, but you can also fetch a specific commit by hash.

My use case: I have my own diff tooling, used for reference testing, that compares the current version of a dataset with the previous version by reading it directly from the cache. If the file is not in the cache, I currently have to check out that commit in Git, run `dvc fetch`, and then check out my original commit again, which is often impossible because I have pending changes.
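To make the use case concrete, here is a minimal Python sketch of how such tooling might decide whether a previous version is already in the local cache. It is not DVC's API. It assumes the cache layout DVC used around the time of this issue (`<cache dir>/<first two hex characters>/<remaining characters>`) and a DVC-file whose outputs carry an `md5:` field. The function names `checksums_in_dvc_file`, `cache_path`, and `missing_from_cache` are hypothetical helpers introduced only for illustration.

```python
import re
from pathlib import Path
from typing import List

# Assumption: output checksums appear as "md5: <32 hex chars>" on a line that
# is indented or starts a list item ("- md5: ..."). A top-level "md5:" line
# (the checksum of the DVC-file itself) is deliberately not matched.
_OUT_MD5 = re.compile(
    r"^(?:[ \t]*-[ \t]+|[ \t]+)md5:[ \t]*([0-9a-f]{32}(?:\.dir)?)[ \t]*$",
    re.MULTILINE,
)


def checksums_in_dvc_file(text: str) -> List[str]:
    """Return the output checksums found in the text of a DVC-file."""
    return _OUT_MD5.findall(text)


def cache_path(cache_dir: Path, checksum: str) -> Path:
    """Map a checksum to its assumed location inside the cache directory."""
    return cache_dir / checksum[:2] / checksum[2:]


def missing_from_cache(cache_dir: Path, checksums: List[str]) -> List[str]:
    """Return the checksums that have no file at their assumed cache path."""
    return [c for c in checksums if not cache_path(cache_dir, c).is_file()]
```

The text of a DVC-file from an earlier commit can be read without touching the workspace, for example with `git show <rev>:<path>`, and passed to `checksums_in_dvc_file`. The sketch only detects that an object is missing; downloading it by checksum is exactly what this issue asks `dvc fetch` to support.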

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
villasv commented, Sep 26, 2019

The issue itself was not implemented, though it’s not needed anymore. IMO we can close it 😉

0 reactions
efiop commented, Sep 26, 2019

@villasv In that case, should we close this issue for now? 😃

Top Results From Across the Web

fetch | Data Version Control - DVC
Any targets given to this command limit what to fetch. ... and --all-commits options enable fetching files/dirs referenced in multiple Git commits. The...
Data Version Control With Python and DVC - Real Python
To save space, DVC allows you to set up a shared cache. When you initialize a DVC repository with dvc init , DVC...
Creating reproducible data science workflows with DVC
This allows to later recall the performance of the model: Moreover, DVC can fetch specific metrics directly: ...
Managing versioned machine learning datasets in DVC, and ...
dvc directory, and contains a mirror of the files in the workspace using MD5 checksums to track the files. If a file changes...
fsspec Documentation - Read the Docs
This allows for concurrent calls within bulk operations such as cat (fetch the contents of many files at once) even from normal code,...
