Hang on calling database from Lazy<T> initialization by multiple threads
Looks closely related to #1043.
Describe the bug
I’ve migrated two projects from SDK v2 to v3. After the migration I noticed that the tests are dramatically slower and hang randomly. They all pass only when parallel execution is disabled.
To Reproduce
I’ve submitted a repro at https://github.com/lukasz-pyrzyk/CosmosDbHangWithParallelTests. It also has a branch with SDK v2, which works correctly.
Worth mentioning: CheckIfDatabaseExists looks like the root cause of the issue. It makes an async database call and is called from a Lazy<T>. We usually use this pattern when we want to initialize a single instance across the app and run some provisioning, such as upserting stored procedures.
private async Task<Database> CheckIfDatabaseExists(Settings settings)
{
    var database = _cosmosClient.GetDatabase(settings.DatabaseId);
    var read = await database.ReadStreamAsync().ConfigureAwait(false);
    if (read.StatusCode == HttpStatusCode.NotFound)
    {
        throw new Exception($"CosmosDB database {settings.DatabaseId} does not exist!");
    }
    return database;
}
During the investigation, I replaced the async database call inside CheckIfDatabaseExists with await Task.Delay(100).ConfigureAwait(false);
private async Task<Database> CheckIfDatabaseExists(Settings settings)
{
    var database = _cosmosClient.GetDatabase(settings.DatabaseId);
    await Task.Delay(100).ConfigureAwait(false);
    return database;
}
With this change, the tests finish successfully, even in parallel.
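For readers hitting the same symptom, here is a minimal sketch of one commonly suggested way to keep a lazy initializer from blocking threads: wrap the asynchronous work in Lazy<Task<T>> so that callers await the shared task instead of waiting on it synchronously. This is not the code from the repro and not the fix that resolved this issue. FakeDatabase, DatabaseProvider, GetDatabaseAsync and InitCount are hypothetical names, and Task.Delay stands in for the real database call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the database handle; not part of the Cosmos DB SDK.
public sealed class FakeDatabase
{
    public string Id { get; }
    public FakeDatabase(string id) => Id = id;
}

// Hypothetical provider that initializes a single shared instance asynchronously.
public sealed class DatabaseProvider
{
    private int _initCount;
    private readonly Lazy<Task<FakeDatabase>> _database;

    public DatabaseProvider(string databaseId)
    {
        // The factory only starts the task and returns it.
        // Callers await the task rather than blocking a thread on its result.
        _database = new Lazy<Task<FakeDatabase>>(
            () => InitializeAsync(databaseId),
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // Number of times the initializer has run; exposed only for illustration.
    public int InitCount => Volatile.Read(ref _initCount);

    public Task<FakeDatabase> GetDatabaseAsync() => _database.Value;

    private async Task<FakeDatabase> InitializeAsync(string databaseId)
    {
        Interlocked.Increment(ref _initCount);
        // Simulated I/O in place of the real existence check (assumption).
        await Task.Delay(100).ConfigureAwait(false);
        return new FakeDatabase(databaseId);
    }
}
```

Each caller would use `var database = await provider.GetDatabaseAsync();`. One trade-off of this sketch is that a failed initialization is cached as well, so every later caller observes the same faulted task.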
Environment summary
SDK Version: 3.4.1
OS Version: Windows 10
Top GitHub Comments
OK, I found the issue.
I noticed that ConfigureAwait was missing in the method calling ReadStreamAsync. Because of that, the continuation wasn’t able to start inside Lazy<T> when no more threads were available. I have double-checked, and I’m happy to say that the issue occurred on SDK 3.4.1 but is fixed in 3.5.1.
Thanks for the help.
This seems very similar to #1043. @ealsur is working on a fix that will hopefully fix this issue too.