TransferManager can't download the same blob to two MemoryStreams concurrently
Which service (blob, file) does this issue concern?
Blob
Which version of the SDK was used?
Microsoft.Azure.Storage.DataMovement version 0.8.0
Which platform were you using? (.NET Framework version or .NET Core version, and OS version)
.NET Framework v4.7.1
How can the problem be reproduced? It would be better if the code that caused the problem could be shared.
It’s a concurrency problem; it reproduces when the same blob is being downloaded to two separate MemoryStream instances at the same time.
What problem was encountered?
Microsoft.WindowsAzure.Storage.DataMovement.TransferException: A transfer operation with the same source and destination already exists.
at Microsoft.WindowsAzure.Storage.DataMovement.TransferManager.<DoTransfer>d__73.MoveNext() in C:\Local\Jenkins\jobs\DMLib_0.7.1\workspace\lib\TransferManager.cs:line 1249
Have you found a mitigation/solution?
No
The problem appears to be here:
https://github.com/Azure/azure-storage-net-data-movement/blob/7c8258a8ffbef96b10ebdabd14f5d129048ce1de/lib/TransferManager.cs#L1277
TransferManager maintains state across concurrent transfers, using some kind of key to distinguish them.
https://github.com/Azure/azure-storage-net-data-movement/blob/7c8258a8ffbef96b10ebdabd14f5d129048ce1de/lib/TransferJobs/TransferLocation.cs#L62
The comparison uses TransferLocation.ToString to tell whether two sources or destinations are the same.
https://github.com/Azure/azure-storage-net-data-movement/blob/7c8258a8ffbef96b10ebdabd14f5d129048ce1de/lib/TransferJobs/TransferLocation.cs#L62
For streams, it uses Stream.ToString (!) to determine whether two streams are the same.
https://github.com/Azure/azure-storage-net-data-movement/blob/c133c98dc7cb211abb171f99d153d93141f907d0/lib/TransferJobs/StreamLocation.cs#L108
This just calls object.ToString, which returns the type name (“System.IO.MemoryStream”) and is insufficient to determine instance identity.
One option worth considering is to use object.ReferenceEquals rather than .ToString to tell whether two stream instances are the same.
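To make the difference concrete, here is a minimal sketch that keys two distinct MemoryStream instances first by ToString and then by reference. It does not use DMLib at all; StreamReferenceComparer and StreamIdentityDemo are hypothetical names invented for this illustration, and the library's real bookkeeping may be structured differently.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;

// Hypothetical comparer (not a DMLib type): two streams are equal
// only when they are the very same instance.
public sealed class StreamReferenceComparer : IEqualityComparer<Stream>
{
    public bool Equals(Stream x, Stream y) => ReferenceEquals(x, y);

    // RuntimeHelpers.GetHashCode ignores any GetHashCode override,
    // so the hash is consistent with reference equality.
    public int GetHashCode(Stream obj) => RuntimeHelpers.GetHashCode(obj);
}

public static class StreamIdentityDemo
{
    // Counts distinct keys when each stream is keyed by ToString(),
    // which for MemoryStream is just the type name.
    public static int CountDistinctByToString(IEnumerable<Stream> streams)
    {
        var keys = new HashSet<string>();
        foreach (var stream in streams)
        {
            keys.Add(stream.ToString());
        }
        return keys.Count;
    }

    // Counts distinct streams when each one is keyed by instance identity.
    public static int CountDistinctByReference(IEnumerable<Stream> streams)
    {
        return new HashSet<Stream>(streams, new StreamReferenceComparer()).Count;
    }
}
```

Given two freshly created MemoryStream instances, CountDistinctByToString returns 1 (both keys are "System.IO.MemoryStream"), while CountDistinctByReference returns 2.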
Top GitHub Comments
Any update on this bug? We’re having to write an expensive workaround decorating the entire stream with an override of ToString just to avoid hitting this error condition in production.
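For readers stuck on an affected version, a workaround of the kind described above might look roughly like the sketch below. It assumes, as the issue analysis suggests, that the library distinguishes stream destinations by their ToString value. UniquelyNamedMemoryStream is a hypothetical helper, not part of DMLib, and it subclasses MemoryStream rather than decorating an arbitrary Stream, which is simpler but only covers the in-memory case.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of DMLib): a MemoryStream whose ToString()
// is unique per instance, so ToString-based comparisons no longer collide.
public sealed class UniquelyNamedMemoryStream : MemoryStream
{
    // Generated once per instance so the value stays stable for its lifetime.
    private readonly string name = "MemoryStream-" + Guid.NewGuid().ToString("N");

    public override string ToString() => name;
}
```

Each concurrent download would then get its own UniquelyNamedMemoryStream as the destination instead of a plain MemoryStream.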
@evgenyvinnik @davidmatson,
We just released a new version (0.9.1) that allows transferring from the same source to multiple Stream instances. Please note that when downloading one Blob/Azure File to multiple Stream instances, DMLib downloads the content of the Blob/Azure File once per Stream instance, rather than downloading it once and writing the same content to every Stream instance.