push: fails for S3 remote on DVC 2.3.0
Bug Report
Resolved: Bug caused by an issue with CPython on Python 3.7. Fixed by using Python 3.8.
Description
Having an issue with DVC version 2.3.0 where trying to push to an S3 remote results in an error. This error does NOT happen with version 2.10.18, which is what I was using before upgrading. The verbose output is below:
$ dvc push -v a.txt.dvc
2021-06-15 15:13:29,227 DEBUG: Check for update is enabled.
2021-06-15 15:13:29,249 DEBUG: Checking if stage 'a.txt' is in 'dvc.yaml'
2021-06-15 15:13:30,084 DEBUG: Preparing to upload data to 's3://dvc-bucket/'
2021-06-15 15:13:30,084 DEBUG: Preparing to collect status from s3://dvc-bucket/
2021-06-15 15:13:30,084 DEBUG: Collecting information from local cache...
2021-06-15 15:13:30,085 DEBUG: Collecting information from remote cache...
2021-06-15 15:13:30,085 DEBUG: Matched '0' indexed hashes
2021-06-15 15:13:30,085 DEBUG: Querying 1 hashes via object_exists
2021-06-15 15:13:30,165 ERROR: unexpected error - Cannot add child handler, the child watcher does not have a loop attached
------------------------------------------------------------
Traceback (most recent call last):
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/main.py", line 55, in main
ret = cmd.do_run()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/command/base.py", line 50, in do_run
return self.run()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/command/data_sync.py", line 66, in run
glob=self.args.glob,
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/repo/__init__.py", line 50, in wrapper
return f(repo, *args, **kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/repo/push.py", line 41, in push
return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/data_cloud.py", line 68, in push
show_checksums=show_checksums,
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 56, in wrapper
return f(obj, *args, **kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 469, in push
download=False,
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 326, in _process
download=download,
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 174, in _status
md5s, jobs=jobs, name=str(self.fs.path_info)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/remote/base.py", line 130, in hashes_exist
return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 411, in hashes_exist
remote_hashes = self.list_hashes_exists(hashes, jobs, name)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 362, in list_hashes_exists
ret = list(itertools.compress(hashes, in_remote))
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
yield fs.pop().result()
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 435, in result
return self.__get_result()
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/objects/db/base.py", line 353, in exists_with_progress
ret = self.fs.exists(path_info)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/dvc/fs/fsspec_wrapper.py", line 92, in exists
return self.fs.exists(self._with_bucket(path_info))
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 72, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 53, in sync
raise result[0]
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/fsspec/asyn.py", line 20, in _runner
result[0] = await coro
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 746, in _exists
await self._info(path, bucket, key, version_id=version_id)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 1003, in _info
out = await self._simple_info(path)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 923, in _simple_info
**self.req_kw,
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 225, in _call_s3
await self.set_session()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/s3fs/core.py", line 369, in set_session
self._s3 = await s3creator.__aenter__()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 20, in __aenter__
self._client = await self._coro
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 96, in _create_client
credentials = await self.get_credentials()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/session.py", line 121, in get_credentials
'credential_provider').load_credentials())
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 813, in load_credentials
creds = await provider.load()
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 435, in load
creds_dict = await self._retrieve_credentials_using(credential_process)
File "/Users/alexhua/.pyenv/versions/3.7.5/envs/speaker/lib/python3.7/site-packages/aiobotocore/credentials.py", line 456, in _retrieve_credentials_using
stderr=subprocess.PIPE)
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/subprocess.py", line 217, in create_subprocess_exec
stderr=stderr, **kwds)
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/base_events.py", line 1529, in subprocess_exec
bufsize, **kwargs)
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/unix_events.py", line 193, in _make_subprocess_transport
self._child_watcher_callback, transp)
File "/Users/alexhua/.pyenv/versions/3.7.5/lib/python3.7/asyncio/unix_events.py", line 941, in add_child_handler
"Cannot add child handler, "
RuntimeError: Cannot add child handler, the child watcher does not have a loop attached
------------------------------------------------------------
2021-06-15 15:13:30,264 DEBUG: Version info for developers:
DVC version: 2.3.0 (pip)
---------------------------------
Platform: Python 3.7.5 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https, s3
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-06-15 15:13:30,265 DEBUG: Analytics is enabled.
2021-06-15 15:13:30,459 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/5n/qnysg58114b_lggngtlgnpmh0000gn/T/tmp6hqq08el']'
2021-06-15 15:13:30,461 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/5n/qnysg58114b_lggngtlgnpmh0000gn/T/tmp6hqq08el']'
Reproduce
- Set up S3 bucket for dvc remote with a profile in ~/.aws/config
- Add the remote and specify the profile
- Create a test file, run ‘dvc add’ and ‘dvc push’
Expected
For the file to be pushed to the remote
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.3.0 (pip)
---------------------------------
Platform: Python 3.7.10 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https, s3
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: s3, local
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git
Additional Information (if any):
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (3 by maintainers)

@alhuang10 Could you try with Python 3.8+? This looks like a CPython issue that is fixed in 3.8+ but not in 3.7; see https://bugs.python.org/issue35621 for details. One fairly general fix for some of the problems we get from async execution in non-main threads would be to authenticate outside of the thread pool, but that is not really straightforward, since we don't tend to create the filesystem instances until we actually need them. There might be other options, but everything that comes to mind at first glance is kind of hacky.
@isidentical Switching to 3.8 fixed the issue, good find! Greatly appreciate the help!
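For anyone trying to confirm whether their interpreter is affected, here is a minimal sketch of the pattern visible in the traceback: an event loop running in a non-main thread spawns a child process with `asyncio.create_subprocess_exec`, much as aiobotocore does when it runs a `credential_process` helper. It uses only the standard library; `run_child` and `run_in_worker_thread` are hypothetical names for illustration and are not part of DVC, fsspec, s3fs, or aiobotocore.

```python
import asyncio
import sys
import threading


async def run_child():
    # Spawn a trivial child process. This stands in for the external
    # credential helper; the command itself is an assumption.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('ok')",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return out.decode().strip()


def run_in_worker_thread():
    """Run run_child() on a fresh event loop in a non-main thread.

    Returns {"value": ...} on success or {"error": ...} if asyncio
    raised a RuntimeError while spawning the child process.
    """
    result = {}

    def worker():
        try:
            # asyncio.run creates and closes a new loop for this thread.
            result["value"] = asyncio.run(run_child())
        except RuntimeError as exc:
            result["error"] = str(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result


if __name__ == "__main__":
    print(sys.version_info[:3], run_in_worker_thread())
```

On Python 3.7 on a Unix-like system, this should report the same "Cannot add child handler" error as the log above, because the default child watcher is only attached to an event loop from the main thread. On Python 3.8 and later it is expected to succeed, which matches the resolution in this thread.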