[BUG] Random Blob Upload Errors for Moderate Workloads
Describe the bug
Uploads randomly fail when sequentially uploading 30 files. The largest file is 100 MB; the rest range from under 1 MB to 100 MB. The error is always "Request body emitted ${n+1} bytes, more than the expected ${n} bytes." Uploading all 30 files occasionally succeeds, but typically does not.
Exception or Stack Trace
2019-12-05 20:22:58 ERROR Managed to upload 17 files
2019-12-05 20:22:58 ERROR com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
com.nielsen.redacted.InputUploadException: com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
at com.nielsen.Redacted.method(Redacted.java:496)
To Reproduce
Steps to reproduce the behavior:
- SDK Versions: both 12.0.0 and 12.0.0-preview-4
- Java 11, including containerized openjdk:11-jre-slim
- Set up multiple InputStreams (in particular, from a remote connection rather than the file system).
- Synchronously and sequentially invoke BlockBlobClient.upload(inputStream, size).
Code Snippet
if (contentStream.getSize() < TWO_HUNDRED_FORTY_MB) { // API limit is 256 MB
    BlockBlobItem blobItem = blockBlobClient.upload(stream, contentStream.getSize());
}
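The error indicates the stream produced one more byte than the length passed to upload(). If that mismatch is the cause, a possible workaround is to drain the remote stream into a byte array first so the declared length always matches what the SDK reads. This is a minimal sketch, assuming Java 11 and that buffering each file in memory is acceptable; the uploadFullyBuffered helper name is hypothetical and this is not a maintainer-confirmed fix:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

import com.azure.storage.blob.models.BlockBlobItem;
import com.azure.storage.blob.specialized.BlockBlobClient;

// Buffers the whole stream in memory so the length passed to upload() is
// exactly the number of bytes the SDK will read. Only reasonable for blobs
// that fit comfortably in heap (here, at most ~100 MB per file).
static BlockBlobItem uploadFullyBuffered(BlockBlobClient client, InputStream source)
        throws IOException {
    byte[] data = source.readAllBytes(); // Java 9+; drains the remote stream up front
    return client.upload(new ByteArrayInputStream(data), data.length);
}

This trades memory for an exact length; it does not explain why the remote stream and the reported size disagree in the first place.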
Expected behavior
A higher success rate is expected; 95% would be acceptable. The observed success rate is under 50%.
Screenshots N/A
Setup (please complete the following information):
- OS: [Linux]
- IDE: [IntelliJ]
- Version of the Library used: 12.0.0
Additional context
Things are more reliable when using blockBlobClient.getBlobOutputStream(), but that is much slower, unless there's a trick to using it with arbitrary input streams.
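For what it's worth, one way to use getBlobOutputStream() with an arbitrary InputStream is to pipe the stream into the blob output stream with transferTo. A minimal sketch, assuming Java 9+ (the uploadViaOutputStream method name is illustrative):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import com.azure.storage.blob.specialized.BlockBlobClient;

// No up-front length is required here; the SDK stages blocks as bytes arrive.
static void uploadViaOutputStream(BlockBlobClient client, InputStream source)
        throws IOException {
    try (OutputStream target = client.getBlobOutputStream()) {
        source.transferTo(target); // copies in 8 KB chunks by default
    }
}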
Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.
- Bug Description Added
- Repro Steps Added
- Setup information Added
Thanks, and likewise. I’ll see if I can catch it.
I did try another experiment where I wrapped the input stream with something that copies bytes to the filesystem as well as to the blob, and noticed strange behavior, such as the files having a lot of empty data written at the end, even up to a few GB. That was probably an implementation problem on my end, though.
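A tee wrapper along those lines can be built with commons-io's TeeInputStream; a rough sketch (the teeToFile helper and the use of commons-io are assumptions, not the actual code from the experiment above):

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.commons.io.input.TeeInputStream;

// Every byte the SDK reads from the returned stream is also written to the
// given file, so the local copy shows exactly what upload() consumed.
static InputStream teeToFile(InputStream source, Path localCopy) throws IOException {
    return new TeeInputStream(source, Files.newOutputStream(localCopy), true); // closeBranch = true
}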
Similar to your suggestion, I'll also see if I can break at the call to available() when time permits.

@arti-shinde What is the size of your data as reported by S3?
I'd also recommend upgrading to the latest version (12.14).