Add "smart" upload function
The AWS S3 SDK for JavaScript has an `upload` function that does not correspond to any particular API request. You can give it a buffer or a stream, and it will automatically perform either a single PutObject call or a multi-part upload.
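For comparison, this is roughly how that S3 helper is used (the bucket, key, and file names below are placeholders):

```js
// aws-sdk v2 "smart" upload helper, shown only for comparison.
const AWS = require('aws-sdk');
const fs = require('fs');

const s3 = new AWS.S3();

s3.upload(
  {
    Bucket: 'example-bucket',
    Key: 'backups/archive.tar.gz',
    Body: fs.createReadStream('./archive.tar.gz'), // a stream or a Buffer
  },
  (err, data) => {
    if (err) throw err;
    console.log('Uploaded to', data.Location);
  }
);
```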
It would be a great benefit to this library to provide something similar. Right now, large file uploads are unnecessarily cumbersome, especially when the input is a stream. Authorization token management is a giant pain.
I am working on such a function right now for our own internal use. I'm writing it as a module that exposes a single function that can be attached to the prototype of the `B2` class provided by this library (`B2.prototype.uploadAny = require('backblaze-b2-upload-any');`).
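Assuming that attachment, usage from application code might look roughly like this; the `uploadAny` option names shown are illustrative, not the final API:

```js
// Hypothetical usage sketch; the uploadAny options are illustrative only.
const fs = require('fs');
const B2 = require('backblaze-b2');
B2.prototype.uploadAny = require('backblaze-b2-upload-any');

async function main() {
  const b2 = new B2({ applicationKeyId: '...', applicationKey: '...' });
  await b2.authorize();

  // data could be a Buffer, a readable stream, or a local file path.
  await b2.uploadAny({
    bucketId: '...',
    fileName: 'videos/demo.mp4',
    data: fs.createReadStream('./demo.mp4'),
  });
}

main().catch(console.error);
```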
This issue is intended to convey my intent to integrate this function into this library and submit a PR. Therefore, I would very much appreciate any feedback on my proposal so that I can accommodate any necessary design changes as early as possible.
The current planned features of this function (many of which are already done) are:
- Performs the upload using a single `upload_file` call or switches to a large-file upload as appropriate.
- In large-file mode, uploads multiple parts with configurable concurrency.
- Automatic management of upload tokens. An upload token (URL + authorization token) can be reused by future part uploads. Expired tokens (where the server returns 503 or 400) are discarded.
- Automatic re-authorization if the server returns 401 in the middle of an upload.
- Retry with exponential backoff (a sketch combining this with the upload-token handling above appears after this list).
- Support for uploading:
  - Buffers
  - Streams
  - Local files (specified as a string path)
- If the operation is aborted for whatever reason, any outstanding large-file upload is canceled with `cancel_large_file`.
- The caller need not (and cannot) supply a hash. When uploading in large-file mode, a hash of the entire content is not provided to B2; instead, a hash is provided for each part. A caller-supplied hash of the whole content would therefore be useless in large-file mode anyway.
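As referenced in the retry item above, here is a rough sketch of how a single part upload might combine token reuse, re-authorization, and exponential backoff. Names like `tokenPool`, `getPartData`, and `maxAttempts` are hypothetical and are not the module's actual internals; `getUploadPartUrl`, `uploadPart`, and `authorize` are the existing methods of this library.

```js
// Rough sketch only; helper names are hypothetical and error handling is simplified.
async function uploadPartWithRetry(b2, fileId, partNumber, getPartData, tokenPool, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Reuse a pooled upload token (URL + authorization token) if one is available,
    // otherwise ask B2 for a fresh one.
    const token = tokenPool.pop() || (await b2.getUploadPartUrl({ fileId })).data;

    try {
      await b2.uploadPart({
        uploadUrl: token.uploadUrl,
        uploadAuthToken: token.authorizationToken,
        partNumber,
        data: getPartData(), // re-creates this part's bytes on every attempt
      });
      tokenPool.push(token); // token still works; keep it for future parts
      return;
    } catch (err) {
      const status = err.response ? err.response.status : null;
      if (status === 401) {
        // Account authorization expired mid-upload: re-authorize, then retry.
        await b2.authorize();
      } else if (status !== 503 && status !== 400 && status !== null) {
        throw err; // not a failure the retry policy above covers
      }
      // The upload token is presumed dead and is not returned to the pool.
      // Exponential backoff before the next attempt.
      await new Promise(resolve => setTimeout(resolve, 2 ** attempt * 1000));
    }
  }
  throw new Error(`Part ${partNumber} failed after ${maxAttempts} attempts`);
}
```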
There is a difference between the local file and stream cases. When uploading a local file, no content is buffered in memory. Rather, multiple read streams are created (and re-created as necessary if a part upload must be retried).
Stream support necessarily requires some buffering in memory to facilitate retries, since Node streams cannot be seeked (and not all stream types would be seekable anyway).
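For the local-file case, re-creating a part's input is cheap because a byte range of the file can be streamed on demand. A minimal sketch (the part size here is a placeholder):

```js
const fs = require('fs');

// Illustrative only: produce a fresh readable stream for part N of a local file,
// so a failed part upload can simply re-create its input.
const PART_SIZE = 100 * 1024 * 1024; // 100 MB, placeholder value

function createPartStream(filePath, partNumber) {
  const start = (partNumber - 1) * PART_SIZE;
  const end = start + PART_SIZE - 1; // 'end' is inclusive; reads stop early at EOF
  return fs.createReadStream(filePath, { start, end });
}

// For non-file streams there is no way to rewind, so each part's bytes must be
// held in memory (e.g. as a Buffer) until that part's upload has succeeded.
```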
Note that I currently introduce two new dependencies:
Top GitHub Comments
@cdhowie Sorry for the delay; the API looks good. Hopefully I can take a look at the code over the weekend.
The module is published. Feel free to leave code review comments here, or as issues on the module’s repository.
https://www.npmjs.com/package/@gideo-llc/backblaze-b2-upload-any