container_client.download_blob("some/path").readall() sometimes results in partial downloads (with HTTP 206 being logged in the httplib)

See original GitHub issue
  • Package Name: azure-storage-blob
  • Package Version: 12.4.0
  • Operating System: Based on docker image python:3.7-slim-buster
  • Python Version: CPython 3.7.9

Describe the bug
When downloading a blob, the download is sometimes only partial, yet the rest of the code is allowed to continue, even though the docs clearly state that the readall method blocks until all data is downloaded (see https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.storagestreamdownloader?view=azure-python#readall--).

To Reproduce
Steps to reproduce the behavior:

  1. Create a container client to
  2. container_client.download_blob("some/path").readall()

This fails only very sporadically, but it still causes issues in our system because we download many files during a run. We've seen errors raised while parsing the blob contents, and when we retry automatically (in the error handling code), the download usually comes through. Only since we integrated Sentry for error reporting have we noticed that the underlying httplib logs a status code of 206, which is likely to be the issue.

Expected behavior
The code blocks until all data is downloaded, as the documentation states it should.

Screenshots
Some snapshots of the logged error in Sentry (in which you can see the breadcrumb where httplib reports that a status code of 206 was received): [image]
Snapshot of the code triggering the download using a helper method: [image]
Snapshot of the helper method that uses the Azure SDK to actually download the blob into a stream: [image]

Additional context
We are accessing our storage account from an AKS cluster in the same region using the DefaultAzureCredential combined with aad-pod-identity.
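The automatic retry described in the report can be made explicit by checking the byte count before parsing. The following is only a minimal sketch, not the SDK's own fix and not the reporter's helper (which is visible only in the screenshots). It assumes that the `size` attribute of the object returned by `download_blob` reflects the full blob length when no range is requested. The names `download_blob_verified` and `IncompleteDownloadError` and the parameters `max_attempts` and `delay_seconds` are hypothetical.

```python
import time


class IncompleteDownloadError(Exception):
    """Hypothetical error: fewer bytes arrived than the downloader reported."""


def download_blob_verified(container_client, blob_name,
                           max_attempts=3, delay_seconds=1.0):
    """Download a blob and retry when the byte count does not match.

    `container_client` is expected to behave like
    azure.storage.blob.ContainerClient: download_blob() returns an object
    that exposes readall() and a `size` attribute.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        downloader = container_client.download_blob(blob_name)
        data = downloader.readall()
        # Assumption: `size` is the total blob size for a non-ranged download.
        expected = downloader.size
        if len(data) == expected:
            return data
        last_error = IncompleteDownloadError(
            f"{blob_name}: got {len(data)} of {expected} bytes "
            f"(attempt {attempt} of {max_attempts})"
        )
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise last_error
```

Calling `download_blob_verified(container_client, "some/path")` in place of the direct `readall()` call would surface a short read as a distinct error rather than as a parsing failure further down. A 206 status on its own may not indicate a fault, since it is the normal response to a ranged GET, so comparing lengths is a more direct check than watching the status code.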

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

rblock-aw commented, Oct 2, 2020

We’re still working on it. As I said, it occurs sporadically and we need time to collect the data. I will update this as soon as we have the results.

rblock-aw commented, Sep 24, 2020

We’ll give it a go, but it’ll take some time since the error only pops up very sporadically. I’ll come back to you when I have collected the necessary data points.

