Support automatic abortion of an S3 MultipartUpload after a timeout

If a MultipartUpload request fails and the upload is not aborted, the directory xxx_s3_multipart_tmp is left behind, and later MultipartUpload requests for xxx will always fail.

Describe the solution you’d like

  1. When a MultipartUpload request is received, create a callable instance that is used to abort that MultipartUpload. The callable instance contains:

    1. upload file name (xxx)
    2. uploadId
    3. last_update_time (the modification time of xxx_s3_multipart_tmp or of xxx)
  2. Create a ScheduledExecutorService instance to schedule the auto-abort callable instance.

    1. run time = last_update_time + timeout (the scheduling delay is that time minus the current time)
  3. When the auto-abort function executes:

    1. check whether xxx_s3_multipart_tmp exists
    2. check whether the ID of the xxx_s3_multipart_tmp directory equals uploadId
    3. check whether the modification time of xxx_s3_multipart_tmp has changed
    4. check whether xxx exists and whether its modification time has changed
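
The steps above can be sketched in Java roughly as follows. This is only an illustration of the proposal, not the code that was eventually merged: `AutoAbortSketch`, `TmpDirState`, `lookup`, and `abort` are hypothetical names, and the lookup and abort callbacks stand in for whatever file system calls the S3 proxy would really make. The check on the target file xxx (step 3.4) would follow the same pattern and is left out to keep the sketch short.

```java
import java.util.Optional;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of the proposed auto-abort flow; these are not Alluxio APIs. */
final class AutoAbortSketch {

  /** State of the temporary multipart directory as observed at one moment. */
  static final class TmpDirState {
    final String uploadId;
    final long lastModifiedMs;

    TmpDirState(String uploadId, long lastModifiedMs) {
      this.uploadId = uploadId;
      this.lastModifiedMs = lastModifiedMs;
    }
  }

  /** Relative delay until the check should run: (lastUpdate + timeout) - now, never negative. */
  static long delayMs(long lastUpdateMs, long timeoutMs, long nowMs) {
    return Math.max(0L, lastUpdateMs + timeoutMs - nowMs);
  }

  /**
   * Abort only if the temporary directory still exists, still belongs to the
   * same upload, and has not been modified since the task was scheduled.
   */
  static boolean shouldAbort(TmpDirState scheduled, Optional<TmpDirState> current) {
    return current.isPresent()
        && current.get().uploadId.equals(scheduled.uploadId)
        && current.get().lastModifiedMs == scheduled.lastModifiedMs;
  }

  /** Schedules one check; 'lookup' and 'abort' are placeholders for real file system calls. */
  static ScheduledFuture<Boolean> schedule(
      ScheduledExecutorService executor,
      TmpDirState scheduled,
      long timeoutMs,
      Supplier<Optional<TmpDirState>> lookup,
      Runnable abort) {
    long delay = delayMs(scheduled.lastModifiedMs, timeoutMs, System.currentTimeMillis());
    return executor.schedule(() -> {
      boolean stale = shouldAbort(scheduled, lookup.get());
      if (stale) {
        abort.run();
      }
      return stale; // true if the upload was aborted
    }, delay, TimeUnit.MILLISECONDS);
  }
}
```

If the check finds that the directory was modified in the meantime, the upload is still active; one option under this design would be to schedule a fresh check from the new modification time rather than aborting.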

Potential problems

  • Requests to upload an object part or to complete a MultipartUpload may be distributed to different proxies. How should the proxies be coordinated? Should a leader handle this?

Urgency

Normal

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

2 reactions
ZacBlanco commented, Oct 21, 2021

Yes, we do need to make code changes. In fact, there is a TODO in the code for it, left by the original author of the S3 handler.

1 reaction
jffree commented, Oct 21, 2021

Hey @jffree, does this answer your question?

Regarding

Requests to upload an object part or to complete a MultipartUpload may be distributed to different proxies. How should the proxies be coordinated? Should a leader handle this?

I don’t see a clear way to get around this if someone is using multiple proxies. Alluxio just does not allow multiple clients to upload and complete the same file. I think we will just have to document a restriction that the same proxy must be used throughout a multipart upload for the request to complete properly. If a load balancer in front of the proxies is desired, it should deterministically redirect each request to the correct proxy.

I agree that we should explain in the documentation how this is implemented, and that consistency should be guaranteed by the client.
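
One way a load balancer could redirect deterministically, as suggested above, is to pick the proxy from a hash of the bucket and object key, since every request in an S3 multipart upload (initiate, upload part, complete, abort) addresses the same bucket and key. The following is a minimal sketch under that assumption; `ProxyRouterSketch` and the proxy addresses are hypothetical and not part of Alluxio or any load balancer.

```java
import java.util.List;

/** Hypothetical sketch: maps every request for the same object to the same proxy. */
final class ProxyRouterSketch {
  private final List<String> proxies;

  ProxyRouterSketch(List<String> proxies) {
    if (proxies.isEmpty()) {
      throw new IllegalArgumentException("at least one proxy is required");
    }
    this.proxies = List.copyOf(proxies);
  }

  /** Returns the proxy that should serve all multipart requests for this object. */
  String proxyFor(String bucket, String objectKey) {
    // String.hashCode is specified by the Java API, so the result is stable across JVMs.
    int hash = (bucket + "/" + objectKey).hashCode();
    // floorMod keeps the index non-negative even when the hash is negative.
    return proxies.get(Math.floorMod(hash, proxies.size()));
  }
}
```

Note that this simple modulo mapping changes when the proxy list changes, so an upload that is in flight while proxies are added or removed could still be routed to a different proxy.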

Top Results From Across the Web

Does AWS S3 automatically abort multipart uploads after a ...
No. From the doc page you linked, "Once you initiate a multipart upload, Amazon S3 retains all the parts until you either complete...
abort-multipart-upload — AWS CLI 1.27.28 Command ...
This action aborts a multipart upload. After a multipart upload is aborted, no additional parts can be uploaded using that upload ID.
abort-multipart-upload — AWS CLI 2.9.5 Command Reference
This action aborts a multipart upload. After a multipart upload is aborted, no additional parts can be uploaded using that upload ID.
MultipartUploadCleaner (Alluxio Parent 2.9.1-SNAPSHOT API)
A lazy method (not scan the whole fileSystem to find tmp directory) to support automatic abortion s3 multipartUpload after a timeout.
S3 multipart upload timeout policy · Issue #1704 - GitHub
I've received 3 different error messages during several tries, the following being one of them: Error { RequestTimeout: Your socket connection ...
