What happened: I get a KeyError when trying to copy files from a bucket to a local path:

>       _, _, parts_suffix = info["ETag"].strip('"').partition("-")
E       KeyError: 'ETag'

What you expected to happen: files get copied

Minimal Complete Verifiable Example:

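from s3fs import S3FileSystem

# Directory-like prefix; value taken from the s3.info() output later in the thread
input_path = 'source_bucket/processed/scoring_set/partitioning_date=2021-04-18'
s3 = S3FileSystem()
s3.copy(input_path, '/tmp/foobar', recursive=True)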

Anything else we need to know?: in the _info() function, when the ls was cached, the response never includes the ‘ETag’ key, so I think it’s a bug to rely on it always being there.

Environment:

  • s3fs version: 2021.4.0
  • Python version: 3.6.10
  • Operating System: Mac
  • Install method (conda, pip, source): pip

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
martindurant commented, May 11, 2021

It does seem reasonable not to depend on the existence of ETag, if some implementations do not have it. I imagine that for moto, this is version-dependent, since we are not seeing the problem in CI.

From the description, I couldn’t tell if the problem was with attempting to copy a folder (which is not an S3 thing) or one of the constituent files.

1 reaction
machielg commented, May 11, 2021

in the _info() function, when the ls was cached, the response never includes the ‘ETag’ key, so I think it’s a bug to rely on it always being there.

I think this is the actual issue. @machielg can you share the fs.info(input_path) too? I guess as an end result we could just do info.get("ETag", "") though I wonder why the etag is missing in the first place.

s3.info(input_path)
{'Key': 'source_bucket/processed/scoring_set/partitioning_date=2021-04-18', 'name': 'source_bucket/processed/scoring_set/partitioning_date=2021-04-18', 'type': 'directory', 'Size': 0, 'size': 0, 'StorageClass': 'DIRECTORY'}
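The info.get("ETag", "") idea suggested above can be sketched as follows. This is only an illustration of the defensive lookup, not the actual s3fs patch: the helper name parts_suffix_from_info is hypothetical, and the ETag value in the second example is made up. The directory entry is the one reported in this thread.

```python
def parts_suffix_from_info(info):
    """Return the text after '-' in an ETag, or "" when there is none.

    Hypothetical helper: it mirrors the failing line from the traceback,
    but uses dict.get so a missing 'ETag' key no longer raises KeyError.
    """
    etag = info.get("ETag", "")  # fall back to "" for entries without an ETag
    _, _, parts_suffix = etag.strip('"').partition("-")
    return parts_suffix


# Directory entry as reported in this thread: it has no 'ETag' key.
cached_dir_info = {
    "Key": "source_bucket/processed/scoring_set/partitioning_date=2021-04-18",
    "name": "source_bucket/processed/scoring_set/partitioning_date=2021-04-18",
    "type": "directory",
    "Size": 0,
    "size": 0,
    "StorageClass": "DIRECTORY",
}

# Made-up file entry whose ETag carries a "-3" suffix (assumption for illustration).
multipart_file_info = {"ETag": '"9b2cf535f27731c974343645a3985328-3"'}

dir_suffix = parts_suffix_from_info(cached_dir_info)
file_suffix = parts_suffix_from_info(multipart_file_info)
```

With the original indexing (info["ETag"]) the first call would raise the KeyError shown in the report. With the default in place, both a directory entry and an ETag without a suffix yield an empty string. Whether an empty suffix is the right signal for the calling code is a separate question the thread leaves open.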