dvc pull not fetching all data (cache file not found)
Please provide information about your setup
dvc --version 0.23.2
uname -a Linux arachne-postgres 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
On server-1, the data pushed to the remote is present both in the local cache and in S3:
$ ls -l .dvc/cache/9c/
total 63360
-rw-rw-r-- 1 ubuntu ubuntu 783548 Jan 4 20:53 01b58cb0faab4ee28a9228552ffd8d
-rw-rw-r-- 1 ubuntu ubuntu 3719779 Jan 4 20:53 2861308e6110dc7f4850cbe331e63a
-rw-rw-r-- 2 ubuntu ubuntu 14722 Jan 4 20:51 2fa17d3b0c9486c5af435329f62151
-rw-rw-r-- 1 ubuntu ubuntu 849013 Jan 4 20:52 416b598605a8fcd6fc04c2edab4edc
-rw-rw-r-- 2 ubuntu ubuntu 22852 Jan 4 20:52 55bb368600627e7e14ad7648d8f26b
-rw-rw-r-- 2 ubuntu ubuntu 39899 Jan 4 20:52 5ffd6a38f12b14c6a5e6aafacf133c
-rw-rw-r-- 1 ubuntu ubuntu 614053 Jan 4 20:51 711b4315305a21dabfa44e74740ff7
-rw-rw-r-- 2 ubuntu ubuntu 23825 Jan 4 20:52 765b898179e0c88af48a638dfe6586
-rw-rw-r-- 1 ubuntu ubuntu 148555 Jan 7 07:36 7b5eca364544bf04f63c390dde7f6e.dir
-rw-rw-r-- 1 ubuntu ubuntu 865287 Jan 4 20:53 7f3444015a68ee039d84986d5f9a98
-rw-rw-r-- 2 ubuntu ubuntu 18086 Jan 4 20:52 8bc9b783a8e0568e71d350e4a1fc37
-rw-rw-r-- 2 ubuntu ubuntu 9823 Jan 4 20:52 95e0f6ac3455b66b69974c3936eba7
-rw-rw-r-- 1 ubuntu ubuntu 673028 Jan 4 20:52 997353cce6f0997dc97e311893f3fb
-rw-rw-r-- 1 ubuntu ubuntu 54562699 Jan 4 22:43 9d7b83b536edc6666c76d16e9bfc6b
-rw-rw-r-- 1 ubuntu ubuntu 436909 Jan 4 20:52 a485af08514391eaf58f7a607a1aaa
-rw-rw-r-- 1 ubuntu ubuntu 365795 Jan 4 20:51 a76a77aa1343aba087211e42f6d2b7
-rw-rw-r-- 1 ubuntu ubuntu 1232517 Jan 4 20:53 ae610c9a0246025b573888e84d766e
-rw-rw-r-- 1 ubuntu ubuntu 364246 Jan 4 20:51 bec0f4fc35ecf1377e62c3859362c4
-rw-rw-r-- 1 ubuntu ubuntu 59057 Nov 9 02:47 cf83393276d56191a88c3d54ef6a5d
-rw-rw-r-- 2 ubuntu ubuntu 33169 Jan 4 20:52 e00d88888f810720aee5b46c3f0772
aws --endpoint=https://ceph.acc.ohsu.edu s3 ls s3://bmeg/dvc/9c/
2019-01-08 04:21:07 783548 01b58cb0faab4ee28a9228552ffd8d
2019-01-08 04:20:50 3719779 2861308e6110dc7f4850cbe331e63a
2019-01-08 04:01:19 14722 2fa17d3b0c9486c5af435329f62151
2019-01-08 04:20:47 849013 416b598605a8fcd6fc04c2edab4edc
2019-01-08 04:02:18 22852 55bb368600627e7e14ad7648d8f26b
2018-12-18 21:41:50 62535 5b6377d120103a5a8e841a7b94ff4c
2019-01-08 04:02:14 39899 5ffd6a38f12b14c6a5e6aafacf133c
2019-01-08 04:19:52 614053 711b4315305a21dabfa44e74740ff7
2019-01-08 04:01:46 23825 765b898179e0c88af48a638dfe6586
2019-01-08 04:01:17 148555 7b5eca364544bf04f63c390dde7f6e.dir
2019-01-08 04:20:18 865287 7f3444015a68ee039d84986d5f9a98
2019-01-08 04:01:38 18086 8bc9b783a8e0568e71d350e4a1fc37
2019-01-08 04:01:44 9823 95e0f6ac3455b66b69974c3936eba7
2019-01-08 04:19:13 673028 997353cce6f0997dc97e311893f3fb
2019-01-04 22:41:27 54562699 9d7b83b536edc6666c76d16e9bfc6b
2019-01-08 04:20:23 436909 a485af08514391eaf58f7a607a1aaa
2019-01-08 04:19:44 365795 a76a77aa1343aba087211e42f6d2b7
2019-01-08 04:20:40 1232517 ae610c9a0246025b573888e84d766e
2018-12-18 21:41:50 799324 b818767f237d4c9647c1208ce8c28b
2019-01-08 04:20:53 364246 bec0f4fc35ecf1377e62c3859362c4
2019-01-07 07:07:32 59057 cf83393276d56191a88c3d54ef6a5d
2019-01-08 04:02:17 33169 e00d88888f810720aee5b46c3f0772
On server-2, dvc never fetches all files. The remote listing shows the same objects, but the local cache holds only three of them:
aws --endpoint=https://ceph.acc.ohsu.edu s3 ls s3://bmeg/dvc/9c/
2019-01-07 20:21:07 783548 01b58cb0faab4ee28a9228552ffd8d
2019-01-07 20:20:50 3719779 2861308e6110dc7f4850cbe331e63a
2019-01-07 20:01:19 14722 2fa17d3b0c9486c5af435329f62151
2019-01-07 20:20:47 849013 416b598605a8fcd6fc04c2edab4edc
2019-01-07 20:02:18 22852 55bb368600627e7e14ad7648d8f26b
2018-12-18 13:41:50 62535 5b6377d120103a5a8e841a7b94ff4c
2019-01-07 20:02:14 39899 5ffd6a38f12b14c6a5e6aafacf133c
2019-01-07 20:19:52 614053 711b4315305a21dabfa44e74740ff7
2019-01-07 20:01:46 23825 765b898179e0c88af48a638dfe6586
2019-01-07 20:01:17 148555 7b5eca364544bf04f63c390dde7f6e.dir
2019-01-07 20:20:18 865287 7f3444015a68ee039d84986d5f9a98
2019-01-07 20:01:38 18086 8bc9b783a8e0568e71d350e4a1fc37
2019-01-07 20:01:44 9823 95e0f6ac3455b66b69974c3936eba7
2019-01-07 20:19:13 673028 997353cce6f0997dc97e311893f3fb
2019-01-04 14:41:27 54562699 9d7b83b536edc6666c76d16e9bfc6b
2019-01-07 20:20:23 436909 a485af08514391eaf58f7a607a1aaa
2019-01-07 20:19:44 365795 a76a77aa1343aba087211e42f6d2b7
2019-01-07 20:20:40 1232517 ae610c9a0246025b573888e84d766e
2018-12-18 13:41:50 799324 b818767f237d4c9647c1208ce8c28b
2019-01-07 20:20:53 364246 bec0f4fc35ecf1377e62c3859362c4
2019-01-06 23:07:32 59057 cf83393276d56191a88c3d54ef6a5d
2019-01-07 20:02:17 33169 e00d88888f810720aee5b46c3f0772
ls -l .dvc/cache/9c/
total 908
-rw-rw-r-- 1 ubuntu ubuntu 62535 Nov 9 16:09 5b6377d120103a5a8e841a7b94ff4c
-rw-rw-r-- 1 ubuntu ubuntu 799324 Nov 11 16:54 b818767f237d4c9647c1208ce8c28b
-rw-rw-r-- 1 ubuntu ubuntu 59057 Nov 9 16:22 cf83393276d56191a88c3d54ef6a5d
dvc fetch runs without errors (although it constantly recalculates md5). dvc pull always returns the following:
Warning: Cache '9c7b5eca364544bf04f63c390dde7f6e.dir' not found. File '{'path': '/mnt/bmeg/bmeg-etl/source/ccle/vcfs', 'scheme': 'local'}' won't be created.
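The warning can be checked against the listings above. A minimal sketch, assuming (as the listings suggest) that the first two characters of the checksum name the cache subdirectory and the rest is the file name. `cache_relpath` is a hypothetical helper for illustration, not part of DVC, and the name sets are copied by hand from the output above (only a subset of the remote is included).

```python
from pathlib import PurePosixPath


def cache_relpath(checksum: str) -> PurePosixPath:
    """Split a checksum into its 2-character directory and the remaining file name."""
    return PurePosixPath(checksum[:2]) / checksum[2:]


# Subset of the remote listing for s3://bmeg/dvc/9c/ (copied from above).
remote_9c = {
    "5b6377d120103a5a8e841a7b94ff4c",
    "7b5eca364544bf04f63c390dde7f6e.dir",
    "b818767f237d4c9647c1208ce8c28b",
    "cf83393276d56191a88c3d54ef6a5d",
}

# Full contents of .dvc/cache/9c/ on server-2 (copied from above).
server2_cache_9c = {
    "5b6377d120103a5a8e841a7b94ff4c",
    "b818767f237d4c9647c1208ce8c28b",
    "cf83393276d56191a88c3d54ef6a5d",
}

wanted = cache_relpath("9c7b5eca364544bf04f63c390dde7f6e.dir")
in_remote = wanted.name in remote_9c
in_local = wanted.name in server2_cache_9c
missing_locally = sorted(remote_9c - server2_cache_9c)

print(wanted, "in remote:", in_remote, "in server-2 cache:", in_local)
```

Under that assumption, the object named in the warning exists on the remote but never arrives in the server-2 cache.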
both servers are at the same git branch / commit
Wrote a quick test to confirm the hypothesis that only the first page of the listing is being fetched:
https://github.com/iterative/dvc/blob/9528ad6a1dbe205644431bfb0e02b1e2ae8449bb/dvc/remote/s3.py#L221
Confirmed
Could we use list_objects instead?
Thank you, we will try it out next week.
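To illustrate the first-page hypothesis, here is a minimal sketch of a listing loop that follows continuation tokens. It assumes a client exposing `list_objects_v2` with the response fields boto3 returns (`Contents`, `IsTruncated`, `NextContinuationToken`). `iter_keys` and `FakeS3Client` are hypothetical names for this example, not DVC code, and the fake client serves two keys per page only so the effect is visible without a real bucket.

```python
from typing import Any, Dict, Iterator


def iter_keys(client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every key under prefix, requesting further pages until none remain."""
    kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


class FakeS3Client:
    """Hypothetical stand-in for an S3 client that returns a few keys per page."""

    def __init__(self, keys, page_size=2):
        self.keys = sorted(keys)
        self.page_size = page_size

    def list_objects_v2(self, Bucket, Prefix, ContinuationToken=None):
        matching = [k for k in self.keys if k.startswith(Prefix)]
        start = int(ContinuationToken or 0)
        chunk = matching[start:start + self.page_size]
        end = start + len(chunk)
        truncated = end < len(matching)
        page = {"Contents": [{"Key": k} for k in chunk], "IsTruncated": truncated}
        if truncated:
            page["NextContinuationToken"] = str(end)
        return page


# A few key names taken from the remote listing in the report.
keys = [
    "dvc/9c/01b58cb0faab4ee28a9228552ffd8d",
    "dvc/9c/2861308e6110dc7f4850cbe331e63a",
    "dvc/9c/5b6377d120103a5a8e841a7b94ff4c",
    "dvc/9c/7b5eca364544bf04f63c390dde7f6e.dir",
    "dvc/9c/b818767f237d4c9647c1208ce8c28b",
]
client = FakeS3Client(keys)

# Reading a single response sees only the first page.
first_page_only = [
    obj["Key"]
    for obj in client.list_objects_v2(Bucket="bmeg", Prefix="dvc/9c/")["Contents"]
]

# Following the continuation token sees everything.
all_keys = list(iter_keys(client, "bmeg", "dvc/9c/"))

print(len(first_page_only), "of", len(all_keys), "keys seen from the first page alone")
```

With a real boto3 client, `client.get_paginator("list_objects_v2")` performs the same loop. Whichever listing call is used, S3 caps a single response (1000 keys by default), so code that reads only one response can miss objects in a large remote.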