STORAGE: Bucket.delete_blobs() should use Batch
This will also allow Bucket.delete(force=True) to use Batch().
Issue Analytics
- State:
- Created 9 years ago
- Comments:17 (10 by maintainers)

Came here while trying to open a new issue concerning this. My use case is deleting or moving millions of blobs. Currently I use one main thread that lists all blobs in a bucket and puts the ones to be deleted on a queue. A few dozen worker threads then delete each blob individually. I’m averaging about 60 deletes/sec, and adding more threads doesn’t help. That’s a bit slow for millions of blobs (it takes days to run).
I’m trying to use the Batch class for this, but it is not entirely clear to me how to use it correctly, or whether it even supports deletes: I get a header parsing error when I try to batch deletes.
It would be great if the bucket.delete_blobs() method (and other related methods, for that matter) used a batch by default. The current code is:
The batch version might look something like (just a draft, not tested):
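The commenter's own snippets are not preserved here, so what follows is not their draft. It is a minimal sketch of batched deletion under a few assumptions: `client` is a google.cloud.storage.Client (whose batch() method returns a context manager that defers the calls made inside it), every blob belongs to that same client, and `delete_blobs_batched` and `chunk_size` are hypothetical names introduced for illustration. The default of 100 follows the per-batch call limit mentioned in the Cloud Storage batch documentation; treat it as an assumption and adjust as needed.

```python
from itertools import islice


def delete_blobs_batched(client, blobs, chunk_size=100):
    """Delete blobs in groups, issuing one batch request per group.

    Assumptions (not confirmed by the original issue):
    - client.batch() returns a context manager that queues the API calls
      made inside it and sends them together on exit.
    - each item in `blobs` exposes a delete() method and is bound to `client`.
    Returns the number of delete calls that were queued.
    """
    iterator = iter(blobs)
    queued = 0
    while True:
        # Pull the next group lazily so millions of blobs are never
        # held in memory at once.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        # Calls made inside the context are deferred until it exits.
        with client.batch():
            for blob in chunk:
                blob.delete()
        queued += len(chunk)
    return queued
```

With a real client this could be called as `delete_blobs_batched(client, client.list_blobs("my-bucket"))`, where "my-bucket" is a placeholder. A failure on any single delete (for example a blob that no longer exists) may surface as an exception when the batch exits, so a production version would likely need error handling around the `with` block.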
Hello,
One of the challenges of maintaining a large open source project is that sometimes, you can bite off more than you can chew. As the lead maintainer of google-cloud-python, I can definitely say that I have let the issues here pile up.
As part of trying to get things under control (as well as to empower us to provide better customer service in the future), I am declaring a “bankruptcy” of sorts on many of the old issues, especially those likely to have been addressed or made obsolete by more recent updates.
My goal is to close stale issues whose relevance or solution is no longer immediately evident, and which appear to be of lower importance. I believe in good faith that this is one of those issues, but I am scanning quickly and may occasionally be wrong. If this is an issue of high importance, please comment here and we will reconsider. If this is an issue whose solution is trivial, please consider providing a pull request.
Thank you!