GET Request fails when using Alluxio S3 API, same request succeeds when AWS S3 API is used directly
Alluxio Version: enterprise-2.8.0-2.0
Describe the bug
An S3 GET request fails with a 404 response when using the Alluxio S3 API endpoint. The same request succeeds when the S3 API is used directly.
OK Response from S3 API
- http-outgoing-1 >> GET /REPLACE_BUCKET_NAME/?list-type=2&delimiter=%2F&max-keys=2&prefix=REPLACE_WITH_PREFIX%2F&fetch-owner=false HTTP/1.1
- http-outgoing-1 >> Host: s3.us-west-2.amazonaws.com
- http-outgoing-1 >> amz-sdk-invocation-id: ...
- http-outgoing-1 >> amz-sdk-request: ...
- http-outgoing-1 >> amz-sdk-retry: 0/0/500
- http-outgoing-1 >> Authorization: ...
- http-outgoing-1 >> Content-Type: application/octet-stream
- http-outgoing-1 >> Content-Length: 0
- http-outgoing-1 >> Connection: Keep-Alive
- http-outgoing-1 >> [\r][\n]
- http-outgoing-1 << HTTP/1.1 200 OK
- http-outgoing-1 << x-amz-id-2: ...
- http-outgoing-1 << x-amz-request-id: ..
- http-outgoing-1 << Date: Thu, 07 Jul 2022 05:46:22 GMT
- http-outgoing-1 << x-amz-bucket-region: ...
- http-outgoing-1 << Content-Type: application/xml
- http-outgoing-1 << Transfer-Encoding: chunked
- http-outgoing-1 << Server: AmazonS3
- http-outgoing-1 << [\r][\n]
- http-outgoing-1 << *HTTP/1.1 200 OK*
404 Response from Alluxio S3 API For Same Request
- http-outgoing-0 >> GET /api/v1/s3/REPLACE_BUCKET_NAME/?list-type=2&delimiter=%2F&max-keys=2&prefix=REPLACE_WITH_PREFIX%2F&fetch-owner=false HTTP/1.1[\r][\n]
- http-outgoing-0 >> Host: api.g.....com:39999[\r][\n]
- http-outgoing-0 >> amz-sdk-invocation-id: ...
- http-outgoing-0 >> amz-sdk-request: ...
- http-outgoing-0 >> amz-sdk-retry: 0/0/500
- http-outgoing-0 >> Authorization: ...
- http-outgoing-0 >> Content-Type: application/octet-stream
- http-outgoing-0 >> Content-Length: 0
- http-outgoing-0 >> Connection: Keep-Alive
- http-outgoing-0 >> [\r][\n]
- http-outgoing-0 << HTTP/1.1 404 Not Found
- http-outgoing-0 << Date: Thu, 07 Jul 2022 05:48:00 GMT[\r][\n]
- http-outgoing-0 << Content-Type: application/xml[\r][\n]
- http-outgoing-0 << Content-Length: 196[\r][\n]
- http-outgoing-0 << Server: Jetty(9.4.43.v20210629)[\r][\n]
- http-outgoing-0 << [\r][\n]
- http-outgoing-0 << <Error><RequestId></RequestId><Code>NoSuchBucket</Code><Message>Path /REPLACE_BUCKET_NAME/REPLACE_WITH_PREFIX does not exist.</Message><Resource>REPLACE_BUCKET_NAME</Resource></Error>
- http-outgoing-0 << *HTTP/1.1 404 Not Found*
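To make the mismatch in the 404 body easier to see, here is a minimal sketch that parses the error XML from the response above using only the Python standard library. The helper name `parse_s3_error` is hypothetical and only for illustration; the XML string is copied from the log.

```python
import xml.etree.ElementTree as ET

# Error body copied verbatim from the 404 response in the log above.
ERROR_BODY = (
    "<Error><RequestId></RequestId><Code>NoSuchBucket</Code>"
    "<Message>Path /REPLACE_BUCKET_NAME/REPLACE_WITH_PREFIX does not exist.</Message>"
    "<Resource>REPLACE_BUCKET_NAME</Resource></Error>"
)


def parse_s3_error(body: str) -> dict:
    """Hypothetical helper: flatten an S3-style <Error> document into a dict."""
    root = ET.fromstring(body)
    # Empty elements such as <RequestId></RequestId> have text None; map to "".
    return {child.tag: (child.text or "") for child in root}


error = parse_s3_error(ERROR_BODY)
print(error["Code"], "-", error["Message"])
```

Note that the code is `NoSuchBucket`, while the message names the path made of the bucket plus the prefix, not the bucket alone.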
To Reproduce
- Create a 1:1 mapped S3 bucket mount, such that s3://some_bucket on Alluxio === s3://some_bucket on S3
- Try to
Expected behavior Expect a similar response, 200 OK, as returned from the S3 API
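For reference, the two request lines in the logs above can be reconstructed with a short standard-library sketch. It only builds the path and query string; it does not sign or send anything. The function name `list_objects_v2_path` and the `base_path` parameter are hypothetical; the bucket, prefix, and query parameters are the placeholders and values shown in the logs.

```python
from urllib.parse import quote, urlencode


def list_objects_v2_path(bucket: str, prefix: str, base_path: str = "") -> str:
    """Hypothetical helper: build the path-style list request seen in the logs."""
    query = urlencode(
        {
            "list-type": "2",
            "delimiter": "/",
            "max-keys": "2",
            "prefix": prefix,
            "fetch-owner": "false",
        },
        quote_via=quote,  # percent-encode instead of using '+' for spaces
        safe="",          # encode '/' as %2F, as in the logged requests
    )
    return f"{base_path}/{bucket}/?{query}"


# Request line sent directly to S3 (Host: s3.us-west-2.amazonaws.com).
s3_path = list_objects_v2_path("REPLACE_BUCKET_NAME", "REPLACE_WITH_PREFIX/")

# Same request through the Alluxio S3 API, which is served under /api/v1/s3.
alluxio_path = list_objects_v2_path(
    "REPLACE_BUCKET_NAME", "REPLACE_WITH_PREFIX/", base_path="/api/v1/s3"
)

print(s3_path)
print(alluxio_path)
```

The two paths differ only by the `/api/v1/s3` prefix, so the differing responses are not explained by the query string itself.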
Urgency Critical, blocks creation of files from Spark
Are you planning to fix it TBD
Additional context Trying to use Alluxio to read/write data from Spark. Writing data from Spark works fine when the s3a endpoint is not used, but fails when the Alluxio S3 API endpoint is used, due to the 404 response above.
Issue Analytics
- State:
- Created a year ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
@abmo-x thanks for reporting. @ZhuTopher can you take a look?
This hasn’t been resolved by the PR above; instead, here is a PR which should fix this: https://github.com/Alluxio/alluxio/pull/16074