container_client.download_blob("some/path").readall() sometimes results in partial downloads (with HTTP 206 being logged in the httplib)
- Package Name: azure-storage-blob
- Package Version: 12.4.0
- Operating System: Based on docker image python:3.7-slim-buster
- Python Version: CPython 3.7.9
Describe the bug
When downloading a blob, the download is sometimes only partial, yet the rest of the code is allowed to continue, even though the docs clearly state that the readall method blocks until all data is downloaded (see https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.storagestreamdownloader?view=azure-python#readall--).
To Reproduce
Steps to reproduce the behavior:
- Create a container client to
- container_client.download_blob("some/path").readall()
This only fails very sporadically, but it still causes issues in our system because we download many files during a run. We've seen errors raised while parsing the blob contents, and when we retry automatically (in the error-handling code), the download usually comes through. Only since we integrated Sentry for error reporting have we noticed that the underlying httplib logs a status code of 206, which is likely to be the issue.
Expected behavior
That the code blocks until all data is downloaded, as the documentation states it should.
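Until the root cause is known, one possible workaround is to compare the number of bytes returned by readall() with the size the download stream reports, and to retry on a mismatch. The sketch below is only an illustration under assumptions: download_blob_verified is a hypothetical helper (not part of azure-storage-blob), and it assumes that the size attribute of the object returned by download_blob reflects the total number of bytes the stream should deliver. Note also that HTTP 206 is the standard response to a request carrying a Range header, so the status code alone may not prove that data was truncated.

```python
import time


def download_blob_verified(container_client, blob_path, max_attempts=3, delay_seconds=1.0):
    """Download a blob and verify that the byte count matches the reported size.

    Hypothetical helper for illustration; not part of azure-storage-blob.
    Assumes the downloader's `size` attribute is the total number of bytes
    the stream is expected to deliver.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        downloader = container_client.download_blob(blob_path)
        data = downloader.readall()
        expected = downloader.size
        if len(data) == expected:
            return data
        # Remember the mismatch so it can be raised if every attempt fails.
        last_error = IOError(
            f"Partial download of {blob_path!r}: got {len(data)} of {expected} bytes "
            f"(attempt {attempt}/{max_attempts})"
        )
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise last_error
```

A call such as download_blob_verified(container_client, "some/path") would then either return the complete contents or raise an IOError that records how many bytes were missing, which may also help when collecting data points for a sporadic failure like this one.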
Screenshots
Some snapshots of the logged error in Sentry (in which you can see the breadcrumb in which httplib reports the status code of 206 being received):
Snapshot of the code triggering the download using a helper method:
Snapshot of the helper method that uses the Azure SDK to actually download the blob into a stream:
Additional context
We are accessing our storage account from an AKS cluster in the same region using the DefaultAzureCredential combined with aad-pod-identity.
We’re still working on it. As said: it occurs sporadically and we need time to collect the data… I will update this as soon as we have the results in.
We’ll give it a go, but it’ll take some time since the error only pops up very sporadically. I’ll come back to you when I have collected the necessary data points.