Failure to upload large file (>200GB)
Hi guys,
I have been having issues uploading a large file (>200GB) with the library. According to the documentation, with REST API version 2016-05-31 and later, the size limit can be up to about 4.75TB (100MB per block x 50000 blocks).
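For reference, the block arithmetic behind that limit can be sketched as below. This is only an illustration: the helper names `max_blob_size` and `min_block_size` are made up here and are not part of the library, and it assumes "MB" means MiB (1024 x 1024 bytes) and that the 263GB file mentioned later in this thread is measured in GiB.

```python
# Illustrative arithmetic only; these helpers are hypothetical, not library APIs.
MIB = 1024 * 1024
MAX_BLOCKS = 50000  # block count limit quoted above


def max_blob_size(block_size_bytes):
    """Largest blob that fits in MAX_BLOCKS blocks of the given size."""
    return block_size_bytes * MAX_BLOCKS


def min_block_size(file_size_bytes):
    """Smallest block size (in bytes) that keeps the file within MAX_BLOCKS blocks."""
    return -(-file_size_bytes // MAX_BLOCKS)  # integer ceiling division


if __name__ == "__main__":
    # With the default 4 MiB block size the ceiling is a little under 200 GiB.
    print(max_blob_size(4 * MIB) / 1024 ** 3, "GiB with 4 MiB blocks")
    # A 263 GiB file therefore needs blocks somewhat larger than 5 MiB.
    print(min_block_size(263 * 1024 ** 3) / MIB, "MiB minimum block size")
```

Under these assumptions the default 4MB block size cannot cover a file much beyond 200GB, which is why the block size has to be raised at all.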
I have been experimenting with the block size (the default is 4MB). When I changed MAX_BLOCK_SIZE to be larger than 10MB, I saw a ReadTimeout:
AzureException: ReadTimeout: HTTPSConnectionPool(host='host.blob.core.windows.net', port=443): Read timed out. (read timeout=20)
When the block size was 10MB and validate_content=True, I got an MD5 mismatch:
AzureHttpError: The MD5 value specified in the request did not match with the MD5 value calculated by the server.
When I kept the block size the same (10MB) and disabled the MD5 check, the file (263GB) uploaded successfully. I downloaded the file and manually ran md5sum on the whole file; the MD5 matched that of the file on the local drive.
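The whole-file check described above can also be done from Python without loading the file into memory. The following is a minimal sketch using only the standard library; the function name `file_md5` and the 10MB read size are arbitrary choices for illustration, and the result should match what md5sum prints for the same file.

```python
import hashlib


def file_md5(path, chunk_size=10 * 1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running it on both the local file and the downloaded copy and comparing the two strings gives the same end-to-end check as md5sum.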
I updated the library version to 0.36.0, but it didn’t make any difference.
Could you please check whether this is a bug or a service limit, or whether I am doing something wrong?
Thanks
Issue Analytics
- Created 6 years ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
Hi @thanhnhut90, thank you for bringing this to our attention!
Concerning the ReadTimeout, I would suggest increasing the socket_timeout (in BlockBlobService’s constructor) to more than 20 seconds and seeing whether that helps in your situation.
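A rough sketch of that suggestion, assuming the legacy azure-storage SDK discussed in this thread (the one that provides BlockBlobService). The wrapper name `make_blob_service`, the 120-second timeout, and the account, container, and path values are placeholders chosen for illustration, not recommended settings.

```python
def make_blob_service(account_name, account_key,
                      timeout_seconds=120,
                      block_size=10 * 1024 * 1024):
    """Build a BlockBlobService with a longer socket timeout and larger blocks.

    Hypothetical helper: the defaults here are illustrative assumptions.
    """
    # Imported lazily so this file can be loaded where the legacy SDK is absent.
    from azure.storage.blob import BlockBlobService

    service = BlockBlobService(
        account_name=account_name,
        account_key=account_key,
        socket_timeout=timeout_seconds,  # the read timeout in the error above was 20
    )
    # MAX_BLOCK_SIZE is the attribute the original post changed (default 4MB).
    service.MAX_BLOCK_SIZE = block_size
    return service


# Example usage (placeholder names, not run here):
# service = make_blob_service("myaccount", "<account key>")
# service.create_blob_from_path("mycontainer", "big.bin", "/path/to/big.bin",
#                               validate_content=False)
```

Whether 120 seconds is enough depends on the block size and the network, so the value may need tuning.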
As for the MD5 mismatch, I will try to reproduce your issue and post an update here. 😃
@thanhnhut90 This was fixed in azure-storage-blob v0.37.1. Please let me know if you encounter any other problem. 😃