metadata is case sensitive due to dict conversion
Overview
I believe boto3 should force-lowercase all metadata key names. Metadata keys should be case insensitive due to how they are stored as headers, but once converted to dict keys in boto, they become case sensitive.
Details
Per the HTTP RFCs, header names are case insensitive, and their case is not guaranteed to be preserved in transit.
AWS S3 currently uses headers to carry object metadata.
Thus, when boto3 retrieves the headers for an S3 object, the capitalization of each metadata key depends on how the endpoint serves it and cannot be predicted.
Assume we have a bucket called bucket and we've assigned metadata to a key prefix with the key name SHA256.
For example, with this response:
$ curl -i https://s3.amazonaws.com/bucket/prefix/meta
HTTP/1.1 200 OK
x-amz-id-2: PDrTPr1jqa915ct8WteuGMyA5Sf/7tsab0ZSIZaIFM7JWhj3fju1cRzqgxpY0QR5DC9km5p6N0M=
x-amz-request-id: A2CD149C4C89B730
Date: Sun, 16 Sep 2018 21:26:08 GMT
Last-Modified: Sun, 16 Sep 2018 21:25:40 GMT
ETag: "38103f9a76bb14e2abbc051b65722cff"
x-amz-meta-sha256: 8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3
x-amz-version-id: fk9ln8tfFKd55Ex9.mKY_NdJY13XqGfb
Accept-Ranges: bytes
Content-Type: binary/octet-stream
Content-Length: 26
Server: AmazonS3
one would end up with a metadata key name of sha256.
However, with this response (note the capitalization of the headers):
$ curl -i https://play.minio.io:9000/bucket/prefix/meta
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 26
Content-Security-Policy: block-all-mixed-content
Etag: "38103f9a76bb14e2abbc051b65722cff"
Last-Modified: Sun, 16 Sep 2018 21:27:16 GMT
Server: Minio/DEVELOPMENT.2018-09-13T21-51-05Z (linux; amd64)
Vary: Origin
X-Amz-Meta-Sha256: 8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3,8ba8085ad1bb937158eb6db5de263219ff69a7ca87637425e85f3fcf683cd1a3
X-Amz-Request-Id: 1554FE8F17FD9015
X-Xss-Protection: 1; mode=block
Date: Sun, 16 Sep 2018 21:27:43 GMT
Content-Type: text/plain; charset=utf-8
one would end up with a metadata key name of Sha256.
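To make the difference concrete, here is a minimal sketch of how a case-preserving conversion from headers to dict keys produces the two results above. The helper `extract_metadata` and the placeholder header value are hypothetical and only for illustration; this is not botocore's actual parsing code.

```python
# Hypothetical helper: NOT botocore's real implementation, only an illustration
# of how stripping the metadata prefix while preserving case yields dict keys.
META_PREFIX = "x-amz-meta-"


def extract_metadata(headers):
    """Collect user metadata from a mapping of raw header names to values."""
    metadata = {}
    for name, value in headers.items():
        # Match the prefix case-insensitively, but keep the remainder as sent.
        if name.lower().startswith(META_PREFIX):
            metadata[name[len(META_PREFIX):]] = value
    return metadata


# Header casing as seen in the two responses above (value is a placeholder).
s3_style = extract_metadata({"x-amz-meta-sha256": "digest"})
minio_style = extract_metadata({"X-Amz-Meta-Sha256": "digest"})

print(sorted(s3_style))          # ['sha256']
print(sorted(minio_style))       # ['Sha256']
print("sha256" in minio_style)   # False: dict lookups are case sensitive
```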
Thus, depending on endpoint or transit behavior, one cannot know the expected capitalization of the metadata keys, even if they started from the same format on input (e.g. SHA256).
If the keys were always lowercased, then whether my input is SHA256, Sha256, sha256, sHa256, etc., I could always access that key in boto by looking for sha256.
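As a rough sketch of the behavior being requested, the normalization could look like the following. The function name `normalize_metadata` is hypothetical, and its input is assumed to be a plain dict such as the `Metadata` field of a `head_object` response; this is not an existing boto3 feature.

```python
def normalize_metadata(metadata):
    """Return a copy of `metadata` with every key lowercased.

    Raises ValueError if two keys differ only by case, since silently
    dropping one of them would hide data.
    """
    normalized = {}
    for key, value in metadata.items():
        lowered = key.lower()
        if lowered in normalized:
            raise ValueError(
                f"duplicate metadata key after lowercasing: {lowered!r}"
            )
        normalized[lowered] = value
    return normalized


# Whatever casing the endpoint returned, the lookup key is always lowercase.
for served_key in ("SHA256", "Sha256", "sha256", "sHa256"):
    assert normalize_metadata({served_key: "digest"})["sha256"] == "digest"
```

Raising on a collision is a design choice for this sketch; since header names are case insensitive, two keys differing only by case should not normally arrive together.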
Python 3.6.6
boto3==1.9.2
botocore==1.12.2
Top GitHub Comments
Doing workarounds rather than the supported interface (especially service resources) is not a solution for me. Hopefully this can be properly addressed in a future major version.
Thanks for the response.
We also just hit this issue in https://github.com/singularityhub/sregistry-cli/pull/169#issuecomment-451903100. Specifically, the key “sizemb” is added to the metadata (note all lowercase) but comes back the next time as “Sizemb.” The only solution is a workaround that checks for both, which, +1, is not a good long-term fix.
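For anyone stuck with the workaround in the meantime, a slightly more general form than checking for two spellings is a case-insensitive lookup. This is only a sketch; `get_metadata_value` is a hypothetical helper, not part of boto3, and it leaves the returned metadata dict untouched.

```python
def get_metadata_value(metadata, key, default=None):
    """Look up `key` in `metadata`, ignoring the case of the stored keys."""
    wanted = key.lower()
    for name, value in metadata.items():
        if name.lower() == wanted:
            return value
    return default


# The same call works whether the endpoint returned "sizemb" or "Sizemb".
print(get_metadata_value({"Sizemb": "12"}, "sizemb"))   # 12
print(get_metadata_value({"sizemb": "12"}, "sizemb"))   # 12
```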