push: fails for S3 remote on DVC 2.3.0

Bug Report

Resolved: The bug was caused by a CPython issue on Python 3.7. Fixed by switching to Python 3.8.

Description

With DVC version 2.3.0, trying to push to an S3 remote results in an error. This error does NOT happen with version 2.10.18, which is what I was using before upgrading. The verbose output is below:

$ dvc push -v a.txt.dvc
2021-06-15 15:13:29,227 DEBUG: Check for update is enabled.
2021-06-15 15:13:29,249 DEBUG: Checking if stage 'a.txt' is in 'dvc.yaml'
2021-06-15 15:13:30,084 DEBUG: Preparing to upload data to 's3://dvc-bucket/'
2021-06-15 15:13:30,084 DEBUG: Preparing to collect status from s3://dvc-bucket/
2021-06-15 15:13:30,084 DEBUG: Collecting information from local cache...
2021-06-15 15:13:30,085 DEBUG: Collecting information from remote cache...
2021-06-15 15:13:30,085 DEBUG: Matched '0' indexed hashes
2021-06-15 15:13:30,085 DEBUG: Querying 1 hashes via object_exists
2021-06-15 15:13:30,165 ERROR: unexpected error - Cannot add child handler, the child watcher does not have a loop attached
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/main.py", line 55, in main
    ret = cmd.do_run()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/command/base.py", line 50, in do_run
    return self.run()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/command/data_sync.py", line 66, in run
    glob=self.args.glob,
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/repo/__init__.py", line 50, in wrapper
    return f(repo, *args, **kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/repo/push.py", line 41, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/data_cloud.py", line 68, in push
    show_checksums=show_checksums,
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 56, in wrapper
    return f(obj, *args, **kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 469, in push
    download=False,
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 326, in _process
    download=download,
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 174, in _status
    md5s, jobs=jobs, name=str(self.fs.path_info)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 130, in hashes_exist
    return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 411, in hashes_exist
    remote_hashes = self.list_hashes_exists(hashes, jobs, name)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 362, in list_hashes_exists
    ret = list(itertools.compress(hashes, in_remote))
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 353, in exists_with_progress
    ret = self.fs.exists(path_info)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/fs/fsspec_wrapper.py", line 92, in exists
    return self.fs.exists(self._with_bucket(path_info))
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 72, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 53, in sync
    raise result[0]
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 20, in _runner
    result[0] = await coro
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 746, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 1003, in _info
    out = await self._simple_info(path)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 923, in _simple_info
    **self.req_kw,
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 225, in _call_s3
    await self.set_session()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 369, in set_session
    self._s3 = await s3creator.__aenter__()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 20, in __aenter__
    self._client = await self._coro
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 96, in _create_client
    credentials = await self.get_credentials()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 121, in get_credentials
    'credential_provider').load_credentials())
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 813, in load_credentials
    creds = await provider.load()
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 435, in load
    creds_dict = await self._retrieve_credentials_using(credential_process)
  File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 456, in _retrieve_credentials_using
    stderr=subprocess.PIPE)
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/subprocess.py", line 217, in create_subprocess_exec
    stderr=stderr, **kwds)
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/base_events.py", line 1529, in subprocess_exec
    bufsize, **kwargs)
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/unix_events.py", line 193, in _make_subprocess_transport
    self._child_watcher_callback, transp)
  File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/unix_events.py", line 941, in add_child_handler
    "Cannot add child handler, "
RuntimeError: Cannot add child handler, the child watcher does not have a loop attached
------------------------------------------------------------
2021-06-15 15:13:30,264 DEBUG: Version info for developers:
DVC version: 2.3.0 (pip)
---------------------------------
Platform: Python 3.7.5 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https, s3
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-06-15 15:13:30,265 DEBUG: Analytics is enabled.
2021-06-15 15:13:30,459 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/5n/qnysg58114b_lggngtlgnpmh0000gn/T/tmp6hqq08el']'
2021-06-15 15:13:30,461 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/5n/qnysg58114b_lggngtlgnpmh0000gn/T/tmp6hqq08el']'

Reproduce

  1. Set up S3 bucket for dvc remote with a profile in ~/.aws/config
  2. Add the remote and specify the profile
  3. Create a test file, run ‘dvc add’ and ‘dvc push’

Expected

For the file to be pushed to the remote

Environment information

Output of dvc doctor:

$ dvc doctor

DVC version: 2.3.0 (pip)
---------------------------------
Platform: Python 3.7.10 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https, s3
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: s3, local
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git

Additional Information (if any):

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

alhuang10 commented, Jun 17, 2021 (6 reactions)

@isidentical Switching to 3.8 fixed the issue, good find! Greatly appreciate the help!

isidentical commented, Jun 17, 2021 (2 reactions)

@alhuang10 Could you try with Python 3.8+? This looks like a CPython issue that is fixed in 3.8+ but not in 3.7; see https://bugs.python.org/issue35621 for details. A more general fix for some of the problems we get from async execution on non-main threads would be to authenticate outside of the thread pool, but that is not really straightforward, since we don't tend to create the filesystem instances until we actually need them. There might be other options, but everything that comes to mind at first glance is kind of hacky.
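For anyone who wants to check whether their interpreter is affected without involving DVC or S3 at all, here is a minimal sketch of the pattern the traceback ends in: an asyncio subprocess started from an event loop that runs on a non-main thread. The function names (run_child, worker, spawn_from_thread) are made up for this illustration and are not part of DVC, s3fs, or aiobotocore. The child process is just the current Python interpreter printing "ok", standing in for the credential_process helper seen in the traceback.

```python
import asyncio
import sys
import threading


async def run_child():
    # Stand-in for the credential_process call in the traceback:
    # start a child process from inside a running event loop.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('ok')",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return out.decode().strip()


def worker(result):
    # The event loop is created and driven on a non-main thread,
    # similar to a thread-pool worker in the traceback above.
    loop = asyncio.new_event_loop()
    try:
        result["value"] = loop.run_until_complete(run_child())
    except (RuntimeError, OSError) as exc:
        # Record the failure instead of letting the thread die silently.
        result["error"] = str(exc)
    finally:
        loop.close()


def spawn_from_thread():
    # Run the worker on a separate thread and collect its outcome.
    result = {}
    thread = threading.Thread(target=worker, args=(result,))
    thread.start()
    thread.join()
    return result


if __name__ == "__main__":
    print(sys.version_info[:3], spawn_from_thread())
```

Going by the traceback and the linked CPython issue, the expectation is that on Python 3.7 (macOS/Linux) the result carries an "error" entry with the "Cannot add child handler" message, while on 3.8+ it carries a "value" entry. Treat that as an assumption to verify on your own setup rather than a guarantee.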
