dvc get: fails with gdrive remote
Bug Report
dvc get . $relative_path: fails with private organisational github repo and gdrive
Description
Thanks for the amazing package - loving it.
I’m trying to download an individual file from a DVC Google Drive remote based on a private organisational GitHub repository. Although dvc list works, dvc get . $path and dvc import . $path are unsuccessful. I’m attaching error logs.
Reproduce
In the root of the directory corresponding to the private organizational github repository:
# works
dvc list . $relative_path
# doesn't work -> error.log.1
dvc get -v . $relative_path 2>&1 | tee error.log.1.txt
# doesn't work -> error.log.2
dvc get -v $github_url $relative_path 2>&1 | tee error.log.2.txt
Error logs: error.log.1.txt error.log.2.txt
Downgrading to dvc==2.10.2 solves the issue.
Expected
I expect a particular data file to be downloaded from google drive.
Output of dvc doctor:
DVC version: 2.11.0 (pip)
---------------------------------
Platform: Python 3.8.13 on Linux-5.13.0-52-generic-x86_64-with-glibc2.17
Supports:
gdrive (pydrive2 = 1.10.1),
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.8),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.8)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: gdrive
Workspace directory: ext4 on /dev/sda2
Repo: dvc, git
Issue Analytics
- State:
- Created a year ago
- Reactions:1
- Comments:8 (7 by maintainers)
Top GitHub Comments
@ursueugen Next dvc release should have this fixed (feel free to try installing from upstream to try it out for now). Thank you for reporting the issue! 🙏
Two things here, and it looks like both can be fixed more or less quickly. Happy to help someone with this since I’m not sure I’ll have enough focus to get there soon 😦.
1. Bug. Should be a really quick fix and can unblock users. The assert is not needed (or at least not always needed), since in some cases below we don’t use tmp_dir at all. The assert prevents a possible workaround (one that you want to use anyway if you use dvc get with gdrive), namely overriding the gdrive_user_credentials_file globally.
2. Improvement to the way we write (cache) the credentials files in different scenarios. A bit more involved, but it seems to be reasonable in scope.
- tmp_fname(repo.tmp_dir) - in all those places in that function we can and should use a global tmp, since we don’t persist these files between sessions. We need to be careful and check that we have that location at all. This can potentially be improved in PyDrive itself: we can let clients pass these values as is, right from the env variables, or let pydrive read those env vars (this can be a bit involved).
- appdir instead of the repo.tmp_dir here. There will be some implications: projects will now see each other, and we need to think about backward compatibility, but overall it should be fine. Question: does appdir always exist, and should users still be able to write to repo/.dvc if they need it?