Refreshing Storage Explorer while writing to AppendBlob leads to exception: "The blob has been modified while being read"
See original GitHub issue
- Package Name: azure-storage-blob
- Package Version: 12.13.1
- Operating System: Ubuntu 20.04 on WSL, Windows 10
- Python Version: 3.8.9
- Storage Explorer Version: 1.25.1
Describe the bug
Refreshing Storage Explorer while uploading data to an AppendBlob in a container in an ADLS Gen2 storage account leads to the following error: “azure.core.exceptions.ResourceExistsError: The blob has been modified while being read.”
To Reproduce
Steps to reproduce the behavior:
- Run this code:
import os
import asyncio
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob import BlobType
from azure.storage.blob.aio import BlobServiceClient

async def example():
    account_name = os.environ['AZURE_STORAGE_ACCOUNT']
    credential = DefaultAzureCredential(
        exclude_environment_credential=False,
        exclude_managed_identity_credential=False,
        exclude_visual_studio_code_credential=True,
        exclude_shared_token_cache_credential=True,
        exclude_cli_credential=True,
    )
    blob_service_client = BlobServiceClient(f"https://{account_name}.blob.core.windows.net/", credential)
    container_client = blob_service_client.get_container_client("censor")
    file_name = f"example.jsonl"
    for _ in range(1000):
        data = str([x for x in range(1000000)]) + "\n"
        await container_client.upload_blob(file_name, data, blob_type=BlobType.APPENDBLOB)
    container_client.close()

asyncio.run(example())
- While browsing the container in Azure Storage Explorer, click refresh during the upload.
Expected behavior
The upload shouldn’t crash; refreshing Storage Explorer shouldn’t interfere with writes.
Traceback:
Traceback (most recent call last):
  File "crash.py", line 31, in <module>
    asyncio.run(example())
  File "/home/username/.pyenv/versions/3.8.9/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/username/.pyenv/versions/3.8.9/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "crash.py", line 27, in example
    await container_client.upload_blob(file_name, data, blob_type=BlobType.APPENDBLOB)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_container_client_async.py", line 847, in upload_blob
    await blob.upload_blob(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 406, in upload_blob
    return await upload_append_blob(**options)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 326, in upload_append_blob
    process_storage_error(error)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
    exec("raise error from None") # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 313, in upload_append_blob
    return await upload_data_chunks(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 76, in upload_data_chunks
    range_ids.append(await uploader.process_chunk(chunk))
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 194, in process_chunk
    return await self._upload_chunk_with_progress(chunk_offset, chunk_bytes)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 210, in _upload_chunk_with_progress
    range_id = await self._upload_chunk(chunk_offset, chunk_data)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 335, in _upload_chunk
    self.response_headers = await self.service.append_block(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_generated/aio/operations/_append_blob_operations.py", line 370, in append_block
    map_error(status_code=response.status_code, response=response, error_map=error_map)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/exceptions.py", line 107, in map_error
    raise error
azure.core.exceptions.ResourceExistsError: The blob has been modified while being read.
RequestId:CENSORED
Time:CENSORED
ErrorCode:BlobModifiedWhileReading
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>BlobModifiedWhileReading</Code><Message>The blob has been modified while being read.
RequestId:CENSORED
Time:CENSORED</Message></Error>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f3bf08291f0>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f3bf07e7340>, 26403.3207315)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f3bf0829c70>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f3bf0829b80>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f3bf0818f40>, 26400.7723782)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f3bf0829b20>
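Aside: the "Unclosed client session" / "Unclosed connector" warnings above are separate from the 409 itself. They appear because the repro never awaits close() on the async clients and never closes the credential (close() on the aio objects is a coroutine). A minimal sketch of explicit cleanup using async context managers, reusing the account and container names from the repro above, would look like this:

import os
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob import BlobType
from azure.storage.blob.aio import BlobServiceClient

async def example():
    account_name = os.environ["AZURE_STORAGE_ACCOUNT"]
    # "async with" closes the aiohttp sessions owned by the credential and the
    # client on exit, which is what the warnings above are complaining about.
    async with DefaultAzureCredential() as credential:
        async with BlobServiceClient(
            f"https://{account_name}.blob.core.windows.net/", credential
        ) as blob_service_client:
            container_client = blob_service_client.get_container_client("censor")
            await container_client.upload_blob(
                "example.jsonl", "data\n", blob_type=BlobType.APPENDBLOB
            )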
Issue Analytics
- Created a year ago
- Comments: 5 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @vholmer Viktor, I talked with the service team, and this is an unfortunate side-effect / limitation of how Append Blobs on ADLS Gen 2 accounts work in the backend. Without going into too much detail, performing a List Blobs call (the operation that happens when you refresh the container view in Storage Explorer) when writing to a blob will currently cause a conflict in the backend and return this error.
The current recommendation for this is to retry the write request as it should succeed the next time. The SDK has automatic retry logic for most HTTP 5xx errors, but the service is currently returning this as a 409 and therefore the automatic retry is not taking place. The service team has mentioned they are planning to fix this and start returning a 5xx error so that clients can retry (no ETA currently). In the meantime, you can try catching the azure.core.exceptions.ResourceExistsError and retrying the request. The service team has additionally mentioned they are actively working on improving the logic to reduce / eliminate the chance for this conflict in the backend, but I do not have an ETA for that work either.
I know this isn’t a great answer but there is not much we can do about this from the client side currently. Thanks.
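For illustration, here is a minimal sketch of that retry workaround against the repro code above. The append_with_retry helper, the attempt count, the backoff, and the check on the error message are illustrative choices, not part of the SDK:

import asyncio
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobType

async def append_with_retry(container_client, blob_name, data, attempts=5):
    # Retry the append when the service reports the transient conflict described
    # above; anything else (for example a genuine name clash) is re-raised.
    for attempt in range(attempts):
        try:
            return await container_client.upload_blob(
                blob_name, data, blob_type=BlobType.APPENDBLOB
            )
        except ResourceExistsError as error:
            # The error text carries "ErrorCode:BlobModifiedWhileReading", as in
            # the output shown earlier in this issue.
            if "BlobModifiedWhileReading" not in str(error) or attempt == attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # simple exponential backoff

The loop in the repro would then call await append_with_retry(container_client, file_name, data) instead of calling upload_blob directly.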
@jalauzon-msft thanks for the reply and assistance! I’ll go with a regular storage account for this. 😄
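Finally, since the maintainer notes that the Storage Explorer refresh is just a List Blobs call, the same conflict can in principle be provoked without Storage Explorer by listing the container while the append loop runs. This is only an illustrative sketch built on that statement; whether the 409 actually appears depends on timing on the service side:

import asyncio
from azure.storage.blob import BlobType

async def writer(container_client, blob_name="example.jsonl"):
    # Same append loop as in the repro above, shortened.
    for _ in range(100):
        data = str([x for x in range(1000000)]) + "\n"
        await container_client.upload_blob(blob_name, data, blob_type=BlobType.APPENDBLOB)

async def lister(container_client):
    # Stands in for the Storage Explorer refresh: repeated List Blobs calls.
    for _ in range(100):
        async for _ in container_client.list_blobs():
            pass
        await asyncio.sleep(0.5)

async def provoke_conflict(container_client):
    # Run the append loop and the listing loop concurrently against the same container.
    await asyncio.gather(writer(container_client), lister(container_client))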