Docs: batch limit of `azure.storage.blob.ContainerClient.delete_blobs()` is poorly documented

I wanted to move an entire “folder” within a storage account and for unrelated reasons I had to use this python sdk for that. I did this in basically a 3-step action:

  1. List all blobs that match this “folder” in the pseudo-hierarchy
  2. Copy those blobs individually to the new destination
  3. Delete the source blobs, for which I was happy to see that there already was a batched method: azure.storage.blob.ContainerClient.delete_blobs() so I didn’t have to individually delete each blob one by one.
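Steps 1 and 2 above might look roughly like the following sketch. The helper name `copy_prefix` and its parameters are hypothetical; the two container arguments are assumed to be `azure.storage.blob.ContainerClient` instances (or anything exposing the same `list_blobs`, `get_blob_client`, `url` and `start_copy_from_url` members). Note that `start_copy_from_url` only starts a server-side copy, so it is probably wise to confirm the copies have finished before deleting the sources, and the source URL may need a SAS token depending on how the account is accessed.

```python
def copy_prefix(source_container, dest_container, old_prefix, new_prefix):
    """Copy every blob under old_prefix to the same relative path under new_prefix.

    Hypothetical helper: returns the source blob names so the caller can
    delete them afterwards (step 3).
    """
    copied = []
    # Step 1: list the blobs whose names start with the "folder" prefix.
    for blob in source_container.list_blobs(name_starts_with=old_prefix):
        source_blob = source_container.get_blob_client(blob.name)
        new_name = new_prefix + blob.name[len(old_prefix):]
        # Step 2: start a server-side copy to the new name.
        # Assumption: the service can read source_blob.url as-is.
        dest_container.get_blob_client(new_name).start_copy_from_url(source_blob.url)
        copied.append(blob.name)
    return copied
```

The returned list of names is what would then be handed to the delete step.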

Or so I thought … the batch failed with a PartialBatchErrorException and when I analyzed the parts, I noticed that a request failed with error code ExceedsMaxBatchRequestCount. The thing is: this “max batch request count” was nowhere to be found - neither in the code documentation, nor anywhere on the azure limits documentation page. The only thing I found was this test case from the .NET SDK:

https://github.com/Azure/azure-sdk-for-net/blob/402b7b71c310bbe0cb1c49862ba33c19a026f97d/sdk/storage/Azure.Storage.Blobs.Batch/tests/BlobBatchClientTests.cs#L64-L70

The 257 there seemed a bit suspicious so I experimented a bit with chunking and indeed 256 seems to be the maximum number of blobs that can be passed to delete_blobs(). However, 256 as a number is nowhere to be found in the storage section of the python sdk. Why did it have to be so difficult to find anything about this limit?
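The chunking workaround can be sketched as below. This is a minimal illustration, not the SDK's own behavior: `chunked` and `delete_blobs_in_chunks` are hypothetical helper names, `container_client` is assumed to be a `ContainerClient`, and the value 256 is the limit observed in this thread.

```python
MAX_BATCH_SIZE = 256  # limit observed in this thread, not read from the SDK


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_blobs_in_chunks(container_client, blob_names, size=MAX_BATCH_SIZE):
    """Call delete_blobs() once per chunk and return the number of batches sent."""
    names = list(blob_names)
    batches = 0
    for chunk in chunked(names, size):
        # delete_blobs() takes the blobs as positional arguments.
        container_client.delete_blobs(*chunk)
        batches += 1
    return batches
```

With 600 blob names, for example, this would send three batches of 256, 256 and 88 blobs.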

Once you decide if and where this should be documented, I’d of course offer my help in contributing this documentation.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
Gerrit-K commented, Jan 29, 2022

Oof … yeah I apparently didn’t check the API docs 😓 And also sorry if I sounded a bit grumpy there. Thanks for picking this up and responding so quickly, much appreciated! I agree that automatically looping through the batches would be the ideal solution, but the PR you’ve submitted is already really helpful!

1 reaction
jalauzon-msft commented, Jan 28, 2022

Hi @Gerrit-K, thanks for bringing this up and for your investigation!

You are correct. The batch size for delete_blobs(), as well as the other batch APIs we support, is limited to 256 by the service. https://docs.microsoft.com/en-us/rest/api/storageservices/blob-batch#request-body

I will create a PR to add this limit to the code documentation for all of our batch APIs. Once updated and released, the documentation will also be pushed to our online docs.


