dvc pull: unexpected error - [Errno 22] Bad Request
Bug Report
dvc pull: unexpected error
Description
I have several DVC resources imported into a project. The tracking files (.dvc) are committed to the repository that uses these resources. When attempting to pull the associated tracked resources with dvc pull, I get an error:
An error occurred (400) when calling the HeadObject operation: Bad Request (relevant log from dvc pull -v below).
Reproduce
Example:
- dvc import a resource into the project
- later, or from a fresh checkout of the above git repo, attempt a dvc pull
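A minimal sketch of the repro steps; the repository URLs and file path below are hypothetical stand-ins, not the actual repos from this report:

```shell
# Import a resource tracked in another DVC-enabled git repository
# (org/artifacts-repo and data/model.pkl are hypothetical names).
dvc import https://github.com/org/artifacts-repo data/model.pkl
git add model.pkl.dvc && git commit -m "import model"

# Later, or from a fresh clone of the consuming repo:
git clone https://github.com/org/consumer-repo && cd consumer-repo
dvc pull
```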
Expected
Expecting the tracked resource to be retrieved by DVC. I am able to perform a dvc update <resource>, which will pull the DVC resource into the folder structure; of course this causes the .dvc file to show as modified (even though it points to the same location/revision).
How should I proceed here? Is something corrupted?
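For what it's worth, a 400 Bad Request on HeadObject against an S3-compatible store such as MinIO can indicate that the client is resolving to the wrong endpoint, region, or signature version rather than a corrupted cache. This is only a guess; the remote name and endpoint URL below are hypothetical (the bucket path is taken from the log). With dvc import, such settings would live in the source repository's .dvc/config, along the lines of:

```ini
['remote "storage"']
    url = s3://dvc-inspection-ai/bit_tool_artifacts
    endpointurl = https://minio.example.com:9000
```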
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.8.3 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
webhdfs (fsspec = 2021.11.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2021.11.0, boto3 = 1.17.106)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Repo: dvc, git
$ dvc pull -vvv
CUT......
2021-11-23 09:03:02,390 TRACE: Assuming '/projects/shared_dvc_cache/d6/9a712149e586998fb73c9566bd7e9f' is unchanged since it is read-only
2021-11-23 09:03:02,394 DEBUG: Preparing to transfer data from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA' to '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,394 DEBUG: Preparing to collect status from '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,394 DEBUG: Collecting status from '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,395 DEBUG: Preparing to collect status from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA'
2021-11-23 09:03:02,396 DEBUG: Collecting status from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA'
2021-11-23 09:03:02,396 DEBUG: Querying 1 hashes via object_exists
2021-11-23 09:03:02,510 ERROR: unexpected error - [Errno 22] Bad Request: An error occurred (400) when calling the HeadObject operation: Bad Request
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 250, in _call_s3
out = await method(**additional_kwargs)
File "/usr/local/lib/python3.8/dist-packages/aiobotocore/client.py", line 155, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (400) when calling the HeadObject operation: Bad Request
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/dvc/main.py", line 55, in main
ret = cmd.do_run()
File "/usr/local/lib/python3.8/dist-packages/dvc/command/base.py", line 45, in do_run
return self.run()
File "/usr/local/lib/python3.8/dist-packages/dvc/command/data_sync.py", line 30, in run
stats = self.repo.pull(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 50, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/pull.py", line 29, in pull
processed_files_count = self.fetch(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 50, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 67, in fetch
d, f = _fetch(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 87, in _fetch
downloaded += repo.cloud.pull(obj_ids, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 114, in pull
return transfer(
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/transfer.py", line 153, in transfer
status = compare_status(src, dest, obj_ids, check_deleted=False, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/status.py", line 166, in compare_status
src_exists, src_missing = status(
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/status.py", line 132, in status
odb.hashes_exist(hashes, name=str(odb.path_info), **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 468, in hashes_exist
remote_hashes = self.list_hashes_exists(hashes, jobs, name)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 419, in list_hashes_exists
ret = list(itertools.compress(hashes, in_remote))
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
yield fs.pop().result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 410, in exists_with_progress
ret = self.fs.exists(path_info)
File "/usr/local/lib/python3.8/dist-packages/dvc/fs/fsspec_wrapper.py", line 136, in exists
return self.fs.exists(self._with_bucket(path_info))
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 91, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 71, in sync
raise return_result
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 822, in _exists
await self._info(path, bucket, key, version_id=version_id)
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 1016, in _info
out = await self._call_s3(
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 270, in _call_s3
raise err
OSError: [Errno 22] Bad Request
------------------------------------------------------------
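For context, the read-only cache path checked near the top of the trace follows DVC's content-addressed layout: an object's MD5 is sharded into a two-character directory plus the remainder, both in the local cache and on the S3 remote, which is what the single object_exists query above is probing. A minimal sketch:

```python
def object_path(md5: str) -> str:
    """Shard a content hash into DVC's two-level object layout
    (the first two hex chars become the directory name)."""
    return f"{md5[:2]}/{md5[2:]}"

# The cache entry seen in the trace above:
print(object_path("d69a712149e586998fb73c9566bd7e9f"))
# d6/9a712149e586998fb73c9566bd7e9f
```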
2021-11-23 09:03:02,825 DEBUG: Version info for developers:
DVC version: 2.8.3 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
webhdfs (fsspec = 2021.11.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2021.11.0, boto3 = 1.17.106)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Repo: dvc, git
Issue Analytics
- Created: 2 years ago
- Comments: 17 (8 by maintainers)

Yes, that is correct. In this particular case, all of the individually imported DVC artifacts originate in the same git artifacts repository.
Yes, that works just fine. I just performed
Sorry, I had meant to say:
dvc pull --jobs 1
produced the same error that I originally posted.
I am in a Linux VM with only 5 CPUs, the MinIO endpoint is a reasonable size, and I am the only one currently interacting with it (I doubt it's a capacity issue).
I get the same error on a Windows client, a 12-core laptop. A colleague of mine also received the same error attempting to access the same repo.
The 'None' is, I am guessing, because the git project doesn't have its own DVC remote configured; rather, it only contains DVC-imported resources. Is that a clue?
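That could well be relevant: for dvc import, the remote configuration is read from the source repository, not the consuming one, so Remotes: None in the consumer is expected. An import stub (.dvc file) roughly looks like the following; every path, URL, and hash here is a hypothetical placeholder:

```yaml
md5: 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d
frozen: true
deps:
- path: data/model.pkl
  repo:
    url: https://github.com/org/artifacts-repo
    rev_lock: 0123456789abcdef0123456789abcdef01234567
outs:
- md5: d69a712149e586998fb73c9566bd7e9f
  path: model.pkl
```

The repo.url and rev_lock fields are what dvc pull follows to locate the source repo and its remote, so a stale or inaccessible remote config on that side would surface in the consumer exactly as above.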