[BUG] Random Blob Upload Errors for Moderate Workloads


Describe the bug

Uploading files randomly fails when sequentially uploading 30 files. The largest file is 100 MB, some are 1-100 MB, and some are less than 1 MB. The error is always “Request body emitted ${n+1} bytes, more than the expected ${n} bytes.” Uploading all 30 files occasionally works, but typically does not.

Exception or Stack Trace

2019-12-05 20:22:58 ERROR Managed to upload 17 files 
2019-12-05 20:22:58 ERROR com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
com.nielsen.redacted.InputUploadException: com.azure.core.exception.UnexpectedLengthException: Request body emitted 7962674 bytes, more than the expected 7962673 bytes.
	at com.nielsen.Redacted.method(Redacted.java:496)

To Reproduce

Steps to reproduce the behavior:

  1. SDK Versions: both 12.0.0 and 12.0.0-preview-4
  2. Java 11, including containerized openjdk:11-jre-slim
  3. Set up multiple InputStreams (in particular, from a remote connection instead of the file system).
  4. Synchronously and sequentially invoke BlockBlobClient.upload(inputStream, size)
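
Since the error reports one byte more than the declared size, one way to rule out a mismatch between the reported size and the actual stream contents is to spool each remote stream to a temporary file first and measure what was really read. The following is only a sketch under that assumption; `SpooledContent` and `spool` are hypothetical names, not part of the Azure SDK, and the extra disk I/O may be unacceptable for large files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of the Azure SDK): holds a local copy of a
// stream together with the number of bytes that were actually read from it.
final class SpooledContent {
    final Path file;
    final long length;

    private SpooledContent(Path file, long length) {
        this.file = file;
        this.length = length;
    }

    // Drains the source stream into a temp file. Files.copy returns the
    // number of bytes read, independent of any size the remote end reported.
    static SpooledContent spool(InputStream source) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".bin");
        long copied = Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        return new SpooledContent(tmp, copied);
    }
}
```

The measured `length` can then be compared with `contentStream.getSize()`, and the upload can read from `Files.newInputStream(spooled.file)` with `spooled.length` as the size argument. The caller is responsible for deleting the temp file afterwards.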

Code Snippet

if (contentStream.getSize() < TWO_HUNDRED_FORTY_MB) { // API limit is 256 MB
    BlockBlobItem blobItem = blockBlobClient.upload(stream, contentStream.getSize());
}

Expected behavior

Expected a higher success rate; 95% would be fine. The observed rate is under 50%.

Screenshots

N/A

Setup (please complete the following information):

  • OS: [Linux]
  • IDE : [IntelliJ]
  • Version of the Library used: 12.0.0

Additional context

Things are more reliable when using blockBlobClient.getBlobOutputStream(), but that is much slower, unless there’s a trick to using it with arbitrary input streams.
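
For reference, feeding an arbitrary input stream into an output stream usually comes down to a buffered copy loop like the one below. This is a generic sketch using only `java.io`; `StreamCopy` is a hypothetical name, and whether a larger buffer actually speeds up the blob output stream path is an assumption that has not been verified here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: generic buffered copy between two streams.
final class StreamCopy {
    // Copies everything from in to out through a buffer of the given size
    // and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Write only the n bytes that were actually read, never the whole buffer.
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Assuming the object returned by blockBlobClient.getBlobOutputStream() behaves like a standard `java.io.OutputStream`, it could be passed as `out` inside a try-with-resources block, with a buffer size of a few megabytes as a starting point to experiment with.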

Information Checklist

Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 23 (13 by maintainers)

Top GitHub Comments

1 reaction
rbrako-nlsn commented, Dec 10, 2019

Thanks, and likewise. I’ll see if I can catch it.

I did try another experiment where I wrapped the input stream with something that copies bytes to the filesystem as well as blob, and noticed strange behavior such as the files having a lot of empty data written at the end, even up to a few GB. That was probably an implementation problem on my end though.
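
For anyone repeating that experiment, a wrapper along these lines is one way to do it. It is a sketch only, with `TeeInputStream` as a hypothetical name. A common source of trailing empty data in this kind of wrapper is writing the whole buffer instead of only the bytes that `read` returned, so the sketch is careful about that; whether that was the cause here is not known.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical wrapper: every byte read through it is also written to a
// second stream (for example a local file) for later comparison.
final class TeeInputStream extends FilterInputStream {
    private final OutputStream copy;

    TeeInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            // Copy only the n bytes actually read, starting at off.
            copy.write(buf, off, n);
        }
        return n;
    }

    // Disable mark/reset so replayed bytes are never copied twice.
    // Bytes passed over with skip() are not copied.
    @Override
    public boolean markSupported() {
        return false;
    }
}
```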

Similar to your suggestion, I’ll also see if I can break at the call to available() when time permits.
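
As a lighter-weight diagnostic than a debugger breakpoint, the stream can be wrapped in a byte counter and the total compared with the declared size after the upload attempt. The sketch below uses a hypothetical `CountingInputStream`. Note that, per the Java documentation, `InputStream.available()` returns only an estimate of the bytes readable without blocking, not the total remaining length, so it is not used for the count.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: counts the bytes that consumers actually pull
// from the underlying stream.
final class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    // Total bytes read or skipped so far.
    long getCount() {
        return count;
    }
}
```

Logging `getCount()` next to `contentStream.getSize()` for each file would show whether the stream itself delivers more bytes than the size that was declared.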

0 reactions
rickle-msft commented, Sep 21, 2021

@arti-shinde What is the size of your data as reported by s3?

I’d also recommend upgrading to the latest version (12.14).
