azure-storage-blob: BlobClient creates an unwanted subdirectory inside the container
- Package Name: azure-storage-blob
- Package Version: 12.12.0
- Operating System: Windows
- Python Version: 3.10.4
Describe the bug
When creating a blob inside a container with BlobClient, the blob is created, but inside a subfolder that has the same name as the container. Note: I use BlobClient directly, and the “account_url” parameter is not the URL of the whole storage account but the URL of the container (with a SAS token). In fact, I do not really understand why the “container_name” parameter is mandatory if the connection_string already points to the container address (see the additional context at the end of this post).
–EDIT– To emphasize my last sentence about the “container_name” argument of the BlobClient constructor: it behaves strangely. If I put a random dummy value there, it creates a subfolder with that name in the right container (because the right container is specified in the connection_string). See the additional context at the end of this post.
To Reproduce
Steps to reproduce the behavior:
from pathlib import Path

from azure.storage.blob import BlobClient


def az_blob_storage(connection_string, az_container):
    # account_url receives the container-level SAS URL, not the account URL
    blob_client = BlobClient(account_url=connection_string, container_name=az_container, blob_name="retest.test")
    # Upload the created file
    with open(Path("test.test"), "rb") as data:
        try:
            blob_client.upload_blob(data=data, overwrite=True)
        except Exception as e:
            print(e)


az_blob_url_asa_token_connection_string = "https://xxxxxx.blob.core.windows.net/testcont?sp=racwdl&st=2022-06-09T14:05:02Z&se=2022-06-09T22:05:02Z&sip=xxxxxxx&spr=https&sv=2021-06-08&sr=c&sig=xxxxxxxx"
az_blob_storage(az_blob_url_asa_token_connection_string, "testcont")
Expected behavior
A file “retest.test” is created inside the container, at the root of the container.
What I got
The file retest.test is indeed created, but inside a folder that has the same name as the container:
- container: “testcont”
  - subfolder ???: “testcont”
    - the file: “retest.test”
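The nesting above is consistent with the three constructor arguments simply being joined into one URL. The sketch below is only an illustration of that reading, written with the standard library; `effective_blob_url` is a hypothetical helper, not the SDK's actual code, and the shortened SAS query is a placeholder.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def effective_blob_url(account_url: str, container_name: str, blob_name: str) -> str:
    """Approximate the request URL built from the three constructor arguments.

    Illustration only: this mirrors the behaviour observed in the report,
    it is not the SDK's implementation.
    """
    parts = urlsplit(account_url)
    # Any path already present in account_url is kept, and the container
    # name and blob name are appended after it.
    base_path = parts.path.rstrip("/")
    path = f"{base_path}/{quote(container_name)}/{quote(blob_name)}"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


url = effective_blob_url(
    "https://xxxxxx.blob.core.windows.net/testcont?sp=racwdl&sr=c&sig=xxxxxxxx",
    "testcont",
    "retest.test",
)
print(url)
# prints https://xxxxxx.blob.core.windows.net/testcont/testcont/retest.test?sp=racwdl&sr=c&sig=xxxxxxxx
```

Under that assumption the service sees container “testcont” and a blob named “testcont/retest.test”, which storage browsers display as a virtual folder.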
Additional context
Most of the examples show the connection_string as the connection string of the whole storage account.
I would like to narrow it to the container for security reasons.
This is why I use a connection_string that points to this exact container.
In fact, BlobClient should not need the container name if it is already provided in the connection_string.
For example, if I write this:
blob_client = BlobClient(account_url=connection_string, container_name="blabla", blob_name="retest.test")
Then a subfolder named “blabla” is created… But still in the right container (because the container name is specified in the connection_string).
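One possible workaround, sketched under the assumption that the URL always has the form `https://<account>.blob.core.windows.net/<container>?<sas>`, is to split the container SAS URL into its parts before building the client. `split_container_sas_url` and `make_blob_client` are hypothetical helper names; the SDK import is done lazily so the URL helper runs without the package installed.

```python
from urllib.parse import urlsplit


def split_container_sas_url(container_sas_url: str):
    """Split a container-level SAS URL into (account_url, container_name, sas_token)."""
    parts = urlsplit(container_sas_url)
    container_name = parts.path.strip("/")
    if not container_name or "/" in container_name:
        raise ValueError("expected https://<account>.blob.core.windows.net/<container>?<sas>")
    account_url = f"{parts.scheme}://{parts.netloc}"
    return account_url, container_name, parts.query


def make_blob_client(container_sas_url: str, blob_name: str):
    # Imported here so the helper above can be used without the SDK.
    from azure.storage.blob import BlobClient

    account_url, container_name, sas_token = split_container_sas_url(container_sas_url)
    # The SAS token is passed as the credential instead of staying in the URL.
    return BlobClient(
        account_url=account_url,
        container_name=container_name,
        blob_name=blob_name,
        credential=sas_token,
    )
```

The SDK also offers `ContainerClient.from_container_url(...)`, whose `get_blob_client("retest.test")` should give the same result without manual parsing; which of the two fits better depends on how the SAS URL is stored.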
Issue Analytics
- State:
- Created a year ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
Actually, I ended up trying this with create_append_blob() - seems to be the actual way of doing it. Works by chunks. Pretty fast and no objects growing in memory.
Hi @vincenttran-msft and @jalauzon-msft. Thanks a lot for your precious support.
Yes, this is very interesting. I have a first proof of concept that works well: the point is to manage backup and restore of a CosmosDB database with the MongoDB API.
Here is a little code snippet:
As you can see, my use of the BytesIO object is really sub-optimal because, in the end, it holds the whole database in memory. The database is small for now, but it’s growing!
What I’d like to achieve is to hold only one MongoDB doc at a time in the BytesIO object and upload it in “append mode” to the blob object. Well, that is probably more optimal for memory use, maybe not for Blob Storage, but I’ll find out…
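The append idea could look roughly like the sketch below. It assumes the documents can be written as one JSON line each (with `default=str` as a stand-in for proper BSON handling) and that 4 MiB per appended block is a safe size; both are assumptions to check against the service limits. `batch_lines` and `backup_to_append_blob` are hypothetical names, while `create_append_blob()` and `append_block()` are the BlobClient methods.

```python
import json
from typing import Iterable, Iterator

MAX_APPEND_BYTES = 4 * 1024 * 1024  # assumed per-block size; verify against service limits


def batch_lines(docs: Iterable[dict], max_bytes: int = MAX_APPEND_BYTES) -> Iterator[bytes]:
    """Serialize documents as JSON lines and group them into chunks of at most max_bytes."""
    buffer = bytearray()
    for doc in docs:
        # default=str is a simplification for types such as ObjectId or datetime.
        line = json.dumps(doc, default=str).encode("utf-8") + b"\n"
        if len(line) > max_bytes:
            raise ValueError("a single document exceeds max_bytes")
        if buffer and len(buffer) + len(line) > max_bytes:
            yield bytes(buffer)
            buffer.clear()
        buffer += line
    if buffer:
        yield bytes(buffer)


def backup_to_append_blob(blob_client, docs: Iterable[dict], max_bytes: int = MAX_APPEND_BYTES) -> int:
    """Create an append blob and upload the documents chunk by chunk.

    blob_client is expected to be an azure.storage.blob.BlobClient.
    Returns the number of appended blocks.
    """
    blob_client.create_append_blob()
    blocks = 0
    for chunk in batch_lines(docs, max_bytes):
        blob_client.append_block(chunk)
        blocks += 1
    return blocks
```

Passing a cursor or generator as `docs` keeps only one chunk in memory at a time; grouping several documents per block avoids one network round trip per document.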
I also tried “stage_block()” and “commit_block_list()”, from this example.
Definitely very slow! And I really don’t see where I would save memory, as the list will contain the whole data at the end…
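On the memory question: the list passed to commit_block_list() holds block IDs, not the data, so each chunk can be dropped once it has been staged. A minimal sketch, assuming the data already arrives as an iterable of byte chunks; `upload_in_blocks` and its `make_block` parameter are hypothetical and exist only so the function can be exercised without the SDK.

```python
from typing import Callable, Iterable, Optional


def upload_in_blocks(blob_client, chunks: Iterable[bytes], make_block: Optional[Callable] = None) -> int:
    """Stage each chunk as a block, then commit the list of block IDs.

    blob_client is expected to be an azure.storage.blob.BlobClient.
    Returns the number of committed blocks.
    """
    if make_block is None:
        # Imported lazily so the function can be tested without the SDK.
        from azure.storage.blob import BlobBlock

        def make_block(block_id):
            return BlobBlock(block_id=block_id)

    block_list = []
    for index, chunk in enumerate(chunks):
        # Fixed-width IDs keep every block ID the same length within the blob.
        block_id = f"{index:08d}"
        blob_client.stage_block(block_id=block_id, data=chunk)
        # Only the small ID object is kept; the chunk itself can be freed.
        block_list.append(make_block(block_id))
    blob_client.commit_block_list(block_list)
    return len(block_list)
```

Each staged block costs one request, so staging one small document per block may be what made this approach slow; that is a guess, and larger chunks would be the first thing to try.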
I found this “AppendBlobService” object in the Python Azure SDK. Is this the right lib to use for this use case? I’ll give it a try.
---- EDIT ---- “AppendBlobService” is deprecated and no longer part of the azure-storage-blob package…