pull: fails in V2 if original import was in V1 and specified a git tag
Bug Report
Description
DVC 2.x does not seem to be backwards compatible with 1.x in cases where the dvc import specified a tag.
Attempting to dvc pull after a DVC upgrade results in:
ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'
Of course, re-importing the same files using a new DVC version is a “solution”, but the migration would be quite painful for us: we have a large number of files in multiple places.
BTW, once a file has been re-imported using the new DVC, the resulting .dvc file differs in 2 ways:
- the rev_lock SHA
  - the new one has the SHA of the commit the tag refers to
  - the old one has the SHA of the tag object itself
- the md5 in the first line of the file (not surprising)
Reproduce
- Import a file using DVC 1.X
- Remove the data file, keeping the .dvc
- Upgrade DVC to 2.X
- dvc pull fails with ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'
(the example below includes a path to an existing repo created for debugging and an existing tag within that repo)
~/dev ❯ mkdir test; cd test; mkvirtualenv test
~/dev/test ❯ pip install --quiet "dvc<2"
~/dev/test ❯ deactivate; workon test
~/dev/test ❯ dvc --version
1.11.16
~/dev/test ❯ git init; dvc init
~/dev/test master +9 !8 ❯ dvc import git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git subdir/images/owl_sticker.png --rev some_tag
Importing 'subdir/images/owl_sticker.png (git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git)' -> 'owl_sticker.png'
To track the changes with git, run:
git add .gitignore owl_sticker.png.dvc
~/dev/test master +9 !8 ?2 ❯ rm owl_sticker.png
~/dev/test master +9 !8 ?2 ❯ pip install --upgrade --quiet dvc; deactivate; workon test;
~/dev/test master +9 !8 ?2 ❯ dvc --version
2.4.3
~/dev/test master +9 !8 ?2 ❯ dvc pull
ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'
Expected
The imported file should get pulled again
Environment information
This is an issue of compatibility between versions; the output below shows the version after the upgrade.
Output of dvc doctor:
After the V2 upgrade:
$ dvc doctor
DVC version: 2.4.3 (pip)
---------------------------------
Platform: Python 3.7.8 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s6
Caches: local
Remotes: None
Workspace directory: apfs on /dev/disk1s6
Repo: dvc, git
Additional Information (if any):
Verbose log:
~/dev/test master ?1 ❯ dvc pull --verbose
2021-06-30 13:08:17,592 DEBUG: Check for update is enabled.
2021-06-30 13:08:17,644 DEBUG: Creating external repo git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git@c9c612b36d7a803ea3e0856ec28e7900fa40b6a3
2021-06-30 13:08:17,644 DEBUG: erepo: git clone 'git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git' to a temporary dir
2021-06-30 13:08:18,977 ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'
------------------------------------------------------------
Traceback (most recent call last):
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/main.py", line 55, in main
ret = cmd.do_run()
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/command/base.py", line 50, in do_run
return self.run()
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/command/data_sync.py", line 41, in run
glob=self.args.glob,
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 51, in wrapper
return f(repo, *args, **kwargs)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/pull.py", line 38, in pull
run_cache=run_cache,
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 51, in wrapper
return f(repo, *args, **kwargs)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/fetch.py", line 54, in fetch
revs=revs,
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 408, in used_objs
filter_info=filter_info,
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/stage/__init__.py", line 633, in get_used_objs
for odb, objs in out.get_used_objs(*args, **kwargs).items():
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/output.py", line 855, in get_used_objs
return self.get_used_external(**kwargs)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/output.py", line 899, in get_used_external
return dep.get_used_objs()
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/dependency/repo.py", line 107, in get_used_objs
locked=locked, cache_dir=local_odb.cache_dir
File "/Users/tomaszpietruszka/.pyenv/versions/3.7.8/lib/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/external_repo.py", line 70, in external_repo
repo = Repo(**repo_kwargs)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 157, in __init__
root_dir=root_dir, scm=scm, rev=rev, uninitialized=uninitialized
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 104, in _get_repo_dirs
fs = scm.get_fs(rev) if isinstance(scm, Git) and rev else None
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/scm/git/__init__.py", line 354, in get_fs
tree_obj = self.pygit2.get_tree_obj(rev=resolved)
File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/scm/git/backend/pygit2.py", line 206, in get_tree_obj
tree = self.repo[rev].tree
AttributeError: '_pygit2.Tag' object has no attribute 'tree'
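The last frame fails because the object looked up by SHA is an annotated tag rather than a commit, and tag objects carry no tree attribute. A minimal sketch of one way to guard against that is to peel the tag before reading its tree. This is not DVC's actual fix. It assumes pygit2's documented Object.peel(None) behaviour (a tag is peeled until the result is no longer a tag), and FakeCommit and FakeTag are hypothetical stand-ins so the example runs without pygit2 installed.

```python
class FakeCommit:
    """Hypothetical stand-in for pygit2.Commit: it exposes .tree."""

    def __init__(self, tree):
        self.tree = tree


class FakeTag:
    """Hypothetical stand-in for pygit2.Tag: it has no .tree attribute."""

    def __init__(self, target):
        self._target = target

    def peel(self, target_type=None):
        # Mimics peeling a tag: follow targets until the object is not a tag.
        obj = self._target
        while isinstance(obj, FakeTag):
            obj = obj._target
        return obj


def get_tree_obj(repo, rev):
    """Return the tree for rev, peeling annotated tags down to a commit first."""
    obj = repo[rev]
    if not hasattr(obj, "tree"):
        # Assumption: with a real pygit2 repository, peel(None) on a Tag
        # returns the first non-tag object it points at.
        obj = obj.peel(None)
    return obj.tree


# A dict plays the role of the repository's object lookup (repo[rev]).
repo = {"commit-sha": FakeCommit("tree-of-commit")}
repo["tag-sha"] = FakeTag(repo["commit-sha"])

print(get_tree_obj(repo, "commit-sha"))
print(get_tree_obj(repo, "tag-sha"))
```

With this guard, a rev_lock holding either the commit SHA or the tag object's SHA resolves to the same tree.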
fwiw, I’ve opened an issue in pygit2: https://github.com/libgit2/pygit2/issues/1082

@pmrowla I believe I’m using the same versions:
The key thing is: the problem does not manifest when resolving the name of the annotated tag; it manifests when resolving the SHA of the annotated tag.
It’s probably a very rare use case, because why would someone use the SHAs of tags… except that’s what DVC 1.x stores as rev_lock if the name of an annotated tag was given in --rev.
Another way of reproducing, purely with pygit2 (please note that the 3rd command includes the output of the second one, the tag SHA, so it requires a manual edit when reproducing):