checkout: partial directory cache

Currently, if you have lost some of the cache files for part of a directory, dvc throws an assertion error:

ERROR: failed to pull data from the cloud
------------------------------------------------------------
Traceback (most recent call last):
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/command/data_sync.py", line 46, in do_run
    recursive=self.args.recursive,
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/repo/pull.py", line 27, in pull
    target=target, with_deps=with_deps, force=force, recursive=recursive
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/repo/checkout.py", line 54, in checkout
    stage.checkout(force=force, progress_callback=progress_callback)
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/stage.py", line 822, in checkout
    force=force, tag=self.tag, progress_callback=progress_callback
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/output/base.py", line 228, in checkout
    progress_callback=progress_callback,
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/remote/base.py", line 370, in checkout
    progress_callback=progress_callback,
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/remote/local.py", line 353, in do_checkout
    self.link(c, p)
  File "/var/akonshin/.virtualenvs/tr/lib/python3.5/site-packages/dvc/remote/local.py", line 155, in link
    assert os.path.isfile(cache)
AssertionError
------------------------------------------------------------

This happens because, unlike for regular data files, we don’t check that a cache file exists before linking it. For standalone data files we print a warning that “cache file doesn’t exist and file is not going to be created”, so we need to do something similar for files inside a directory.
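
A minimal sketch of the kind of check described above, assuming a directory checkout boils down to a list of (cache path, workspace path) pairs. The helper name `checkout_dir_entries` and its `link` parameter are hypothetical and are not dvc's actual API; the point is only to show a warning and a skip where the traceback shows an assert.

```python
import logging
import os
import shutil

logger = logging.getLogger(__name__)


def checkout_dir_entries(entries, link=shutil.copyfile):
    """Link each (cache_path, workspace_path) pair, skipping missing cache files.

    Hypothetical helper for illustration only, not part of dvc.
    Returns the workspace paths that were skipped because their
    cache file does not exist.
    """
    missing = []
    for cache_path, workspace_path in entries:
        if not os.path.isfile(cache_path):
            # Warn and move on instead of failing with an AssertionError.
            logger.warning(
                "cache file '%s' doesn't exist, '%s' is not going to be created",
                cache_path,
                workspace_path,
            )
            missing.append(workspace_path)
            continue
        parent = os.path.dirname(workspace_path)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # `link` stands in for whatever link type is configured; a plain copy here.
        link(cache_path, workspace_path)
    return missing
```

Returning the skipped paths lets the caller decide whether a partial checkout should be reported as a failure or only as a warning.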

Some backstory: a user on ODS was getting this error, and it turned out that he had run dvc gc -c for some time and then interrupted it, so parts of the data were removed. When he later tried to dvc pull, he got this error.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

pared commented, May 24, 2019 (1 reaction)

The cache optimization mentioned by @efiop was introduced in #1526.

