Refreshing Storage Explorer while writing to AppendBlob leads to exception: "The blob has been modified while being read"

  • Package Name: azure-storage-blob
  • Package Version: 12.13.1
  • Operating System: Ubuntu 20.04 on WSL, Windows 10
  • Python Version: 3.8.9
  • Storage Explorer Version: 1.25.1

Describe the bug
Refreshing Storage Explorer while uploading data to an AppendBlob in a container in an ADLS Gen2 storage account leads to the following error: “azure.core.exceptions.ResourceExistsError: The blob has been modified while being read.”

To Reproduce
Steps to reproduce the behavior:

  1. Run this code:
import os
import asyncio
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob import BlobType
from azure.storage.blob.aio import BlobServiceClient

async def example():
    account_name = os.environ['AZURE_STORAGE_ACCOUNT']

    credential = DefaultAzureCredential(
        exclude_environment_credential=False,
        exclude_managed_identity_credential=False,
        exclude_visual_studio_code_credential=True,
        exclude_shared_token_cache_credential=True,
        exclude_cli_credential=True,
    )

    blob_service_client = BlobServiceClient(f"https://{account_name}.blob.core.windows.net/", credential)

    container_client = blob_service_client.get_container_client("censor")

    file_name = "example.jsonl"

    for _ in range(1000):
        data = str([x for x in range(1000000)]) + "\n"

        await container_client.upload_blob(file_name, data, blob_type=BlobType.APPENDBLOB)

    await container_client.close()

asyncio.run(example())
  2. While browsing the container in Azure Storage Explorer, click refresh during the upload.

Expected behavior
The upload shouldn’t crash; refreshing Storage Explorer shouldn’t interfere with writes.

Traceback:

Traceback (most recent call last):
  File "crash.py", line 31, in <module>
    asyncio.run(example())
  File "/home/username/.pyenv/versions/3.8.9/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/username/.pyenv/versions/3.8.9/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "crash.py", line 27, in example
    await container_client.upload_blob(file_name, data, blob_type=BlobType.APPENDBLOB)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_container_client_async.py", line 847, in upload_blob
    await blob.upload_blob(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 406, in upload_blob
    return await upload_append_blob(**options)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 326, in upload_append_blob
    process_storage_error(error)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 313, in upload_append_blob
    return await upload_data_chunks(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 76, in upload_data_chunks
    range_ids.append(await uploader.process_chunk(chunk))
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 194, in process_chunk
    return await self._upload_chunk_with_progress(chunk_offset, chunk_bytes)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 210, in _upload_chunk_with_progress
    range_id = await self._upload_chunk(chunk_offset, chunk_data)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_shared/uploads_async.py", line 335, in _upload_chunk
    self.response_headers = await self.service.append_block(
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/tracing/decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/storage/blob/_generated/aio/operations/_append_blob_operations.py", line 370, in append_block
    map_error(status_code=response.status_code, response=response, error_map=error_map)
  File "/home/username/git/repository/.venv/all/lib/python3.8/site-packages/azure/core/exceptions.py", line 107, in map_error
    raise error
azure.core.exceptions.ResourceExistsError: The blob has been modified while being read.
RequestId:CENSORED
Time:CENSORED
ErrorCode:BlobModifiedWhileReading
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>BlobModifiedWhileReading</Code><Message>The blob has been modified while being read.
RequestId:CENSORED
Time:CENSORED</Message></Error>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f3bf08291f0>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f3bf07e7340>, 26403.3207315)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f3bf0829c70>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f3bf0829b80>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f3bf0818f40>, 26400.7723782)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f3bf0829b20>

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

jalauzon-msft commented, Sep 13, 2022 (2 reactions)

Hi @vholmer Viktor, I talked with the service team, and this is an unfortunate side-effect / limitation of how Append Blobs on ADLS Gen 2 accounts work in the backend. Without going into too much detail, performing a List Blobs call (the operation that happens when you refresh the container view in Storage Explorer) when writing to a blob will currently cause a conflict in the backend and return this error.

The current recommendation for this is to retry the write request as it should succeed the next time. The SDK has automatic retry logic for most HTTP 5xx errors, but the service is currently returning this as a 409 and therefore the automatic retry is not taking place. The service team has mentioned they are planning to fix this and start returning a 5xx error so that clients can retry (no ETA currently). In the meantime, you can try catching the azure.core.exceptions.ResourceExistsError and retrying the request.

The service team has additionally mentioned they are actively working on improving the logic to reduce / eliminate the chance for this conflict in the backend, but I do not have an ETA for that work either.

I know this isn’t a great answer but there is not much we can do about this from the client side currently. Thanks.
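
A minimal sketch of the retry workaround suggested above; this is not code from the original thread. The helper name `retry_async` and its parameters (`retry_on`, `max_attempts`, `base_delay`) are hypothetical, and the backoff values are arbitrary starting points rather than recommendations from the service team.

```python
import asyncio
import random


async def retry_async(operation, retry_on, max_attempts=5, base_delay=0.5):
    """Await operation(); on an exception listed in retry_on, back off and try again.

    retry_async and its parameters are hypothetical names, not part of the Azure SDK.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff with a little jitter before the next attempt.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


# Possible use inside the reproduction loop (requires azure-core to be installed):
#
#   from azure.core.exceptions import ResourceExistsError
#
#   await retry_async(
#       lambda: container_client.upload_blob(
#           file_name, data, blob_type=BlobType.APPENDBLOB
#       ),
#       retry_on=(ResourceExistsError,),
#   )
```

Two caveats with this approach. `ResourceExistsError` covers other 409 conflicts as well, so it may be worth checking that the error message contains `BlobModifiedWhileReading` before retrying. Also, the traceback shows the upload being sent in chunks, so retrying the whole `upload_blob` call could plausibly re-append chunks that were already written; treat that as an assumption to verify for your payload sizes.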

vholmer commented, Sep 15, 2022 (1 reaction)

@jalauzon-msft thanks for the reply and assistance! I’ll go with a regular storage account for this. 😄
