pull: fails in V2 if original import was in V1 and specified a git tag


Bug Report

Description

DVC 2.x does not seem to be backwards compatible with 1.x in cases where the dvc import specified a tag.

Attempting to dvc pull after a dvc upgrade results in:

ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'

Of course re-importing the same files using a new DVC version is a “solution”, but the migration would be quite painful for us - large number of files in multiple places.

BTW, once a file has been re-imported using new DVC, the resulting .dvc file differs in 2 ways:

  • the rev_lock SHA
    • the new one has the SHA of the commit the tag refers to
    • the old one has the SHA of the tag object itself
  • the md5 in the first line of the file (not surprising)
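
The rev_lock difference above is the difference between an annotated tag object and the commit it points to. As a rough sketch (not part of DVC; the helper names object_type and peel_to_commit are hypothetical, and it assumes the git CLI is on PATH and a local clone of the source repo is available), one could check which kind of SHA a .dvc file holds and find the matching commit SHA like this:

```python
import subprocess


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def object_type(repo_dir: str, rev: str) -> str:
    """Return the git object type of rev, e.g. 'tag' or 'commit'."""
    return _git(repo_dir, "cat-file", "-t", rev)


def peel_to_commit(repo_dir: str, rev: str) -> str:
    """Return the SHA of the commit that rev ultimately points to.

    For an annotated tag SHA (what DVC 1.x stored as rev_lock, per this
    report) this yields the commit SHA; a commit SHA is returned as-is.
    """
    return _git(repo_dir, "rev-parse", "--verify", f"{rev}^{{commit}}")
```

If object_type reports "tag" for a rev_lock value, that .dvc file was written with the old behaviour described above, and peel_to_commit gives the commit SHA that a fresh import would record instead.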

Reproduce

  1. Import a file using DVC 1.x, passing a tag via --rev
  2. Remove the data file, keeping the .dvc file
  3. Upgrade DVC to 2.x
  4. Run dvc pull; it fails with ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'

(the example below includes a path to an existing repo created for debugging and an existing tag within that repo)

~/dev ❯ mkdir test; cd test; mkvirtualenv test
~/dev/test ❯ pip install --quiet "dvc<2"
~/dev/test ❯ deactivate; workon test
~/dev/test ❯ dvc --version 
1.11.16
~/dev/test ❯ git init; dvc init
~/dev/test master +9 !8 ❯ dvc import git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git subdir/images/owl_sticker.png --rev some_tag
Importing 'subdir/images/owl_sticker.png (git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git)' -> 'owl_sticker.png'

To track the changes with git, run:

        git add .gitignore owl_sticker.png.dvc
~/dev/test master +9 !8 ?2 ❯ rm owl_sticker.png
~/dev/test master +9 !8 ?2 ❯ pip install --upgrade --quiet dvc; deactivate; workon test;
~/dev/test master +9 !8 ?2 ❯ dvc --version
2.4.3
~/dev/test master +9 !8 ?2 ❯ dvc pull
ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'

Expected

The imported file should get pulled again

Environment information

This is a compatibility issue between versions; the output below shows the version after the upgrade.

Output of dvc doctor after the V2 upgrade:

$ dvc doctor
DVC version: 2.4.3 (pip)
---------------------------------
Platform: Python 3.7.8 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s6
Caches: local
Remotes: None
Workspace directory: apfs on /dev/disk1s6
Repo: dvc, git

Additional Information

Verbose log:

~/dev/test master ?1 ❯ dvc pull --verbose
2021-06-30 13:08:17,592 DEBUG: Check for update is enabled.
2021-06-30 13:08:17,644 DEBUG: Creating external repo git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git@c9c612b36d7a803ea3e0856ec28e7900fa40b6a3
2021-06-30 13:08:17,644 DEBUG: erepo: git clone 'git@github.com:tomaszpietruszka-globality/toy-dvc-registry.git' to a temporary dir
2021-06-30 13:08:18,977 ERROR: unexpected error - '_pygit2.Tag' object has no attribute 'tree'
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/main.py", line 55, in main
    ret = cmd.do_run()
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/command/base.py", line 50, in do_run
    return self.run()
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/command/data_sync.py", line 41, in run
    glob=self.args.glob,
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 51, in wrapper
    return f(repo, *args, **kwargs)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/pull.py", line 38, in pull
    run_cache=run_cache,
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 51, in wrapper
    return f(repo, *args, **kwargs)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/fetch.py", line 54, in fetch
    revs=revs,
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 408, in used_objs
    filter_info=filter_info,
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/stage/__init__.py", line 633, in get_used_objs
    for odb, objs in out.get_used_objs(*args, **kwargs).items():
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/output.py", line 855, in get_used_objs
    return self.get_used_external(**kwargs)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/output.py", line 899, in get_used_external
    return dep.get_used_objs()
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/dependency/repo.py", line 107, in get_used_objs
    locked=locked, cache_dir=local_odb.cache_dir
  File "/Users/tomaszpietruszka/.pyenv/versions/3.7.8/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/external_repo.py", line 70, in external_repo
    repo = Repo(**repo_kwargs)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 157, in __init__
    root_dir=root_dir, scm=scm, rev=rev, uninitialized=uninitialized
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/repo/__init__.py", line 104, in _get_repo_dirs
    fs = scm.get_fs(rev) if isinstance(scm, Git) and rev else None
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/scm/git/__init__.py", line 354, in get_fs
    tree_obj = self.pygit2.get_tree_obj(rev=resolved)
  File "/Users/tomaszpietruszka/.virtualenvs/test/lib/python3.7/site-packages/dvc/scm/git/backend/pygit2.py", line 206, in get_tree_obj
    tree = self.repo[rev].tree
AttributeError: '_pygit2.Tag' object has no attribute 'tree'

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

tomaszpietruszka-globality commented, Jun 30, 2021 (2 reactions)

fwiw, I’ve opened an issue in pygit2: https://github.com/libgit2/pygit2/issues/1082

tomaszpietruszka-globality commented, Jul 2, 2021 (1 reaction)

@pmrowla I believe I’m using the same versions:

~/dev/dvc-test-import master ?1 ❯ pip freeze | grep pygit
pygit2==1.6.1
~/dev/dvc-test-import master ?1 ❯ python -c "import pygit2; print(pygit2.LIBGIT2_VER)" 
(1, 1, 0)

The key thing is: the problem does not manifest when resolving the name of the annotated tag, it manifests when resolving the SHA of the annotated tag.

It’s probably a very rare use case, because why would someone use the SHAs of tags… except that’s what DVC 1.x stores as rev_lock, if a name of an annotated tag was given in --rev.

Another way to reproduce, purely with pygit2 (note that the third command includes the output of the second one, the tag SHA, so it requires a manual edit when reproducing):

~ ❯ mkdir repo; cd repo; git init; touch file; git add file; git commit -m "first commit"; git tag -a "new_tag" -m "tag message"
Initialized empty Git repository in /Users/tomaszpietruszka/repo/.git/
[master (root-commit) f5c40fd] first commit
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 file
~/repo master ❯ git show-ref --tags
b3f85c779d0a9fae876284efd166cd2af48df819 refs/tags/new_tag
~/repo master ❯ python -c "from pygit2 import Repository; r = Repository('.'); print(r.resolve_refish('new_tag')); print(r.resolve_refish('b3f85c779d0a9fae876284efd166cd2af48df819'))"
(<pygit2.Object{commit:f5c40fd1f5806fc9a2646697497bf7b3e7bf9072}>, <_pygit2.Reference object at 0x10452bcb0>)
(<pygit2.Object{tag:b3f85c779d0a9fae876284efd166cd2af48df819}>, None)