Timeout Error while pushing to S3 remote storage
Bug Report
dvc push and dvc status --cloud throw FSTimeoutError
Description
We created a storage in S3, pushed a couple of files there, and everything was fine. A few days later we set up the pipeline, accumulated some files in the cache, and tried to push them, but at the "Estimating size of cache in 's3://myremote'" stage dvc throws an unexpected error. In verbose mode it shows fsspec.exceptions.FSTimeoutError. As a result, neither dvc push nor dvc status --cloud works; both crash at "Estimating size of cache".
There is access to s3 from that machine: s3cmd works, and I can get files via boto3.
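Since the traceback in this report fails while listing remote objects under a prefix (walk_files(path_info, prefix=prefix) calling fs.find), it may be worth checking prefix listing specifically, not only downloads. Below is a minimal sketch of such a check with a boto3 client, assuming the store implements ListObjectsV2. The function name, the endpoint URL, and the "00/" prefix are illustrative assumptions, not values taken from this setup.

```python
from typing import Any, List


def list_keys_with_prefix(client: Any, bucket: str, prefix: str, limit: int = 10) -> List[str]:
    """Return up to `limit` object keys under `prefix` via ListObjectsV2.

    `client` is expected to behave like a boto3 S3 client.
    """
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        PaginationConfig={"MaxItems": limit},  # stop early, this is only a probe
    )
    keys: List[str] = []
    for page in pages:
        # "Contents" is absent when a page has no matching objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


# Live usage (hypothetical endpoint and prefix, replace with the real ones):
#
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="http://s3.example.internal")
#   print(list_keys_with_prefix(s3, "myremote", "00/"))
```

If this call also hangs or is very slow against the same endpoint, the problem is more likely in how the storage handles prefix listing than in dvc itself.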
Reproduce
I don't really know how you can reproduce it, because this storage is on our own servers. I will provide any additional information you need.
Expected
Push files from cache to the remote storage
Environment information
Output of dvc doctor:
$ dvc doctor
2021-08-18 12:54:36,426 DEBUG: Version info for developers:
DVC version: 2.5.4 (pip)
---------------------------------
Platform: Python 3.8.5 on Linux-4.4.0-17763-Microsoft-x86_64-with-glibc2.29
Supports:
hdfs (pyarrow = 5.0.0),
http (requests = 2.26.0),
https (requests = 2.26.0),
s3 (s3fs = 2021.6.1, boto3 = 1.17.49),
ssh (paramiko = 2.7.2)
Cache types: hardlink, symlink
Cache directory: lxfs on rootfs
Caches: local
Remotes: s3
Workspace directory: lxfs on rootfs
Repo: dvc, git
Additional Information (if any):
(my_env) sirily@mow-ws-16:~/work/project_dir $ dvc status -v -r myremote
2021-08-18 12:53:08,099 DEBUG: Check for update is enabled.
2021-08-18 12:53:08,334 DEBUG: Preparing to collect status from s3://myremote
2021-08-18 12:53:08,334 DEBUG: Collecting information from local cache...
2021-08-18 12:53:08,336 DEBUG: Collecting information from remote cache...
2021-08-18 12:53:08,336 DEBUG: Querying 1 hashes via object_exists
2021-08-18 12:53:08,399 DEBUG: Matched '0' indexed hashes
2021-08-18 12:54:36,153 ERROR: unexpected error
------------------------------------------------------------
Traceback (most recent call last):
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/main.py", line 55, in main
ret = cmd.do_run()
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/command/base.py", line 50, in do_run
return self.run()
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/command/status.py", line 54, in run
st = self.repo.status(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/__init__.py", line 51, in wrapper
return f(repo, *args, **kwargs)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/status.py", line 135, in status
return _cloud_status(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/status.py", line 98, in _cloud_status
status_info = self.cloud.status(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/data_cloud.py", line 134, in status
return remote_obj.status(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 57, in wrapper
return f(obj, *args, **kwargs)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 126, in status
dir_status, file_status, _ = self._status(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 195, in _status
self.hashes_exist(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 145, in hashes_exist
return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 442, in hashes_exist
remote_size, remote_hashes = self._estimate_remote_size(hashes, name)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 249, in _estimate_remote_size
remote_hashes = set(hashes)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 203, in _hashes_with_limit
for hash_ in self.list_hashes(prefix, progress_callback):
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 193, in list_hashes
for path in self._list_paths(prefix, progress_callback):
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 173, in _list_paths
for file_info in self.fs.walk_files(path_info, prefix=prefix):
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/fs/fsspec_wrapper.py", line 109, in walk_files
for file in self.find(path_info, **kwargs):
File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/fs/fsspec_wrapper.py", line 178, in find
files = self.fs.find(
File "/home/sirily/work/my_env/lib/python3.8/site-packages/fsspec/asyn.py", line 87, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/home/sirily/work/my_env/lib/python3.8/site-packages/fsspec/asyn.py", line 66, in sync
raise FSTimeoutError
fsspec.exceptions.FSTimeoutError
------------------------------------------------------------
2021-08-18 12:54:36,426 DEBUG: Version info for developers:
DVC version: 2.5.4 (pip)
---------------------------------
Platform: Python 3.8.5 on Linux-4.4.0-17763-Microsoft-x86_64-with-glibc2.29
Supports:
hdfs (pyarrow = 5.0.0),
http (requests = 2.26.0),
https (requests = 2.26.0),
s3 (s3fs = 2021.6.1, boto3 = 1.17.49),
ssh (paramiko = 2.7.2)
Cache types: hardlink, symlink
Cache directory: lxfs on rootfs
Caches: local
Remotes: s3
Workspace directory: lxfs on rootfs
Repo: dvc, git
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-08-18 12:54:36,429 DEBUG: Analytics is enabled.
2021-08-18 12:54:36,538 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpxujreews']'
2021-08-18 12:54:36,551 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpxujreews']'
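In the log above, roughly 88 seconds pass between "Matched '0' indexed hashes" (12:53:08,399) and the error (12:54:36,153), and the traceback ends in fsspec's sync helper raising FSTimeoutError while waiting for the async find call. The following is a stdlib-only sketch of that general pattern, a blocking caller waiting on a coroutine that runs on a background event loop. It is not fsspec's actual code; ListingTimeout, start_loop, run_sync, and fake_listing are hypothetical names used only for illustration.

```python
import asyncio
import concurrent.futures
import threading
from typing import Any, Coroutine, List


class ListingTimeout(Exception):
    """Hypothetical stand-in for fsspec.exceptions.FSTimeoutError."""


def start_loop() -> asyncio.AbstractEventLoop:
    """Start an event loop in a daemon thread and return it."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


def run_sync(loop: asyncio.AbstractEventLoop, coro: Coroutine[Any, Any, Any], timeout: float) -> Any:
    """Block until `coro` finishes on `loop`, or raise ListingTimeout after `timeout` seconds."""
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout)
    except concurrent.futures.TimeoutError:
        future.cancel()  # ask the loop to cancel the still-running task
        raise ListingTimeout(f"no result after {timeout} s") from None


async def fake_listing(delay: float) -> List[str]:
    """Pretend to list a remote prefix, taking `delay` seconds to answer."""
    await asyncio.sleep(delay)
    return ["00/aa"]
```

With this sketch, run_sync(loop, fake_listing(0.01), timeout=1.0) returns the listing, while a listing slower than the timeout raises ListingTimeout. The caller only learns that the operation did not finish in time, not why, which matches how little the traceback above says about the underlying cause.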
Both work
Hi @efiop, we use SeaweedFS.