Timeout Error while pushing to S3 remote storage

Bug Report

dvc push and dvc status --cloud throw FSTimeoutError

Description

We created a remote storage in S3, pushed a couple of files to it, and everything worked. A few days later we set up the pipeline, accumulated some files in the cache, and tried to push them, but at the "Estimating size of cache in 's3://myremote'" stage DVC throws an unexpected error. In verbose mode it shows fsspec.exceptions.FSTimeoutError. As a result, neither dvc push nor dvc status --cloud works: both crash at "Estimating size of cache". The machine does have access to S3: s3cmd works, and I can get files via boto3.

Reproduce

I don't really know how you can reproduce it, because this storage is on our own servers. I will provide any additional information you need.
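One way to narrow this down outside DVC is to time the same kind of call that fails in the traceback, a recursive listing through s3fs with a prefix. Fetching a single file with boto3 exercises a different request than listing many keys, so the two can behave differently. The following is a minimal sketch, not DVC's own code: the helper names open_remote and time_listing, the endpoint URL, the bucket path, and the prefix are all assumptions for illustration.

```python
import time


def open_remote(endpoint_url):
    """Open an S3-compatible store with s3fs (imported lazily).

    endpoint_url is an assumption: the address of your own S3-compatible
    server. Credentials come from the usual AWS config or environment.
    """
    import s3fs  # third-party; the report lists s3fs 2021.6.1

    return s3fs.S3FileSystem(client_kwargs={"endpoint_url": endpoint_url})


def time_listing(fs, path, prefix=""):
    """Run one recursive listing and report how long it took.

    fs is any object with an s3fs-style find(path, prefix=...) method.
    Returns a dict with the object count (None on failure), the elapsed
    seconds, and the exception that was raised, if any.
    """
    start = time.monotonic()
    try:
        count = len(fs.find(path, prefix=prefix))
        error = None
    except Exception as exc:  # report the failure instead of crashing
        count, error = None, exc
    return {"count": count, "seconds": time.monotonic() - start, "error": error}


# Example (hypothetical endpoint, bucket, and prefix):
#   fs = open_remote("http://storage.example.internal:8333")
#   print(time_listing(fs, "myremote", prefix="00"))
```

If the listing is slow or fails here as well, the problem probably lies between s3fs and the storage server rather than in DVC itself.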

Expected

Push files from cache to the remote storage

Environment information

Output of dvc doctor:

$ dvc doctor
2021-08-18 12:54:36,426 DEBUG: Version info for developers:
DVC version: 2.5.4 (pip)
---------------------------------
Platform: Python 3.8.5 on Linux-4.4.0-17763-Microsoft-x86_64-with-glibc2.29
Supports:
        hdfs (pyarrow = 5.0.0),
        http (requests = 2.26.0),
        https (requests = 2.26.0),
        s3 (s3fs = 2021.6.1, boto3 = 1.17.49),
        ssh (paramiko = 2.7.2)
Cache types: hardlink, symlink
Cache directory: lxfs on rootfs
Caches: local
Remotes: s3
Workspace directory: lxfs on rootfs
Repo: dvc, git

Additional Information (if any):

(my_env) sirily@mow-ws-16:~/work/project_dir $ dvc status -v -r myremote
2021-08-18 12:53:08,099 DEBUG: Check for update is enabled.
2021-08-18 12:53:08,334 DEBUG: Preparing to collect status from s3://myremote
2021-08-18 12:53:08,334 DEBUG: Collecting information from local cache...
2021-08-18 12:53:08,336 DEBUG: Collecting information from remote cache...
2021-08-18 12:53:08,336 DEBUG: Querying 1 hashes via object_exists
2021-08-18 12:53:08,399 DEBUG: Matched '0' indexed hashes
2021-08-18 12:54:36,153 ERROR: unexpected error
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/main.py", line 55, in main
    ret = cmd.do_run()
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/command/base.py", line 50, in do_run
    return self.run()
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/command/status.py", line 54, in run
    st = self.repo.status(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/__init__.py", line 51, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/status.py", line 135, in status
    return _cloud_status(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/repo/status.py", line 98, in _cloud_status
    status_info = self.cloud.status(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/data_cloud.py", line 134, in status
    return remote_obj.status(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 57, in wrapper
    return f(obj, *args, **kwargs)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 126, in status
    dir_status, file_status, _ = self._status(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 195, in _status
    self.hashes_exist(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/remote/base.py", line 145, in hashes_exist
    return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 442, in hashes_exist
    remote_size, remote_hashes = self._estimate_remote_size(hashes, name)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 249, in _estimate_remote_size
    remote_hashes = set(hashes)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 203, in _hashes_with_limit
    for hash_ in self.list_hashes(prefix, progress_callback):
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 193, in list_hashes
    for path in self._list_paths(prefix, progress_callback):
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/objects/db/base.py", line 173, in _list_paths
    for file_info in self.fs.walk_files(path_info, prefix=prefix):
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/fs/fsspec_wrapper.py", line 109, in walk_files
    for file in self.find(path_info, **kwargs):
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/dvc/fs/fsspec_wrapper.py", line 178, in find
    files = self.fs.find(
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/fsspec/asyn.py", line 87, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/sirily/work/my_env/lib/python3.8/site-packages/fsspec/asyn.py", line 66, in sync
    raise FSTimeoutError
fsspec.exceptions.FSTimeoutError
------------------------------------------------------------
2021-08-18 12:54:36,426 DEBUG: Version info for developers:
DVC version: 2.5.4 (pip)
---------------------------------
Platform: Python 3.8.5 on Linux-4.4.0-17763-Microsoft-x86_64-with-glibc2.29
Supports:
        hdfs (pyarrow = 5.0.0),
        http (requests = 2.26.0),
        https (requests = 2.26.0),
        s3 (s3fs = 2021.6.1, boto3 = 1.17.49),
        ssh (paramiko = 2.7.2)
Cache types: hardlink, symlink
Cache directory: lxfs on rootfs
Caches: local
Remotes: s3
Workspace directory: lxfs on rootfs
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-08-18 12:54:36,429 DEBUG: Analytics is enabled.
2021-08-18 12:54:36,538 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpxujreews']'
2021-08-18 12:54:36,551 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpxujreews']'
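The last frames of the traceback show a synchronous wrapper (sync in fsspec/asyn.py) waiting on an asynchronous listing and raising FSTimeoutError, about 88 seconds after the query started according to the timestamps. The sketch below illustrates that general pattern using only the standard library. It is not fsspec's implementation: ListingTimeout, start_background_loop, and run_sync are hypothetical names, and the timeout handling is simplified.

```python
import asyncio
import threading


class ListingTimeout(Exception):
    """Hypothetical stand-in for fsspec.exceptions.FSTimeoutError."""


def start_background_loop():
    """Start an event loop in a daemon thread and return the loop."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


def run_sync(loop, coro_func, *args, timeout=None):
    """Run coro_func(*args) on the background loop and block for the result.

    If the coroutine does not finish within `timeout` seconds, the
    asyncio timeout is translated into ListingTimeout for the caller.
    """

    async def bounded():
        return await asyncio.wait_for(coro_func(*args), timeout)

    future = asyncio.run_coroutine_threadsafe(bounded(), loop)
    try:
        return future.result()
    except asyncio.TimeoutError as exc:
        raise ListingTimeout("listing did not finish in time") from exc


async def slow_listing(prefix):
    """Simulate a storage server that answers a listing very slowly."""
    await asyncio.sleep(1)
    return [prefix + "/object"]
```

With a loop from start_background_loop(), calling run_sync(loop, slow_listing, "00", timeout=0.05) raises ListingTimeout, while a generous timeout returns the listing. The synchronous caller only ever sees the timeout exception, which matches how the error surfaces in the log above.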

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

sirily commented, Aug 19, 2021

Both work

sirily commented, Aug 19, 2021

Hi @efiop We use SeaweedFS
