Help needed: deadlock occurring in functional tests
I hope the EF Core team can help out with this issue; it may actually affect non-Npgsql usage as well.
At some point during 2.1.0 development, Npgsql’s functional test suite started hanging when executing on Appveyor (it runs fine locally). Unfortunately I don’t know exactly when this started (the build was broken for a while regardless).
Using Process Explorer, I was able to get stack traces of the two blocked threads during the test run (Appveyor has two virtual CPUs, so xunit automatically uses that many parallel threads). I’ve cut out some irrelevant native parts but otherwise here they are:
Thread 1:
mscorlib.dll!System.Threading.ManualResetEventSlim.Wait+0x2e3
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait+0xb6
mscorlib.dll!System.Threading.Tasks.Task.InternalWait+0x1a1
mscorlib.dll!System.Threading.Tasks.Task`1.GetResultCore+0x26
System.Interactive.Async.dll!<ToEnumerable_>d__442`1.MoveNext+0xd7
System.Core.dll!System.Linq.Buffer`1..ctor+0xa2
System.Core.dll!<GetEnumerator>d__1.MoveNext+0xd8
mscorlib.dll!System.Collections.Generic.List`1..ctor+0x1b4
System.Core.dll!System.Linq.Enumerable.ToList+0x46
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c__DisplayClass63_0`1.<CollectionAsserter>b__0+0x31b
System.Core.dll!System.Dynamic.UpdateDelegates.UpdateAndExecuteVoid3+0x2c4
[Unmanaged to Managed Transition]
clr.dll+0x1f3c
[Managed to Unmanaged Transition]
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c.<Correlated_collections_on_select_many>b__60_1+0x1143
Microsoft.EntityFrameworkCore.Specification.Tests.dll!Microsoft.EntityFrameworkCore.TestUtilities.TestHelpers.AssertResults+0x318
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<AssertQuery>d__20`2.MoveNext+0x4f6
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!MoveNextRunner.Run+0x6c
xunit.execution.desktop.dll!<>c__DisplayClass7_0.<Post>b__1+0x29
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext+0x2b
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
xunit.execution.desktop.dll!Xunit.Sdk.ExecutionContextHelper.Run+0x7f
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc+0xe7
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart+0x5c
Thread 2:
mscorlib.dll!System.Threading.ManualResetEventSlim.Wait+0x2e3
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait+0xb6
mscorlib.dll!System.Threading.Tasks.Task.InternalWait+0x1a1
mscorlib.dll!System.Threading.Tasks.Task`1.GetResultCore+0x26
System.Interactive.Async.dll!<ToEnumerable_>d__442`1.MoveNext+0xd7
System.Core.dll!System.Linq.Enumerable.Count+0xd5
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c__61`1.<CollectionSorter>b__61_0+0x199
System.Core.dll!System.Linq.EnumerableSorter`2.ComputeKeys+0x85
System.Core.dll!System.Linq.EnumerableSorter`1.Sort+0x1d
System.Core.dll!<GetEnumerator>d__1.MoveNext+0x118
mscorlib.dll!System.Collections.Generic.List`1..ctor+0x1b4
System.Core.dll!System.Linq.Enumerable.ToList+0x46
Microsoft.EntityFrameworkCore.Specification.Tests.dll!Microsoft.EntityFrameworkCore.TestUtilities.TestHelpers.AssertResults+0x2ae
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<AssertQuery>d__20`2.MoveNext+0x4f6
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!MoveNextRunner.Run+0x6c
xunit.execution.desktop.dll!<>c__DisplayClass7_0.<Post>b__1+0x29
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext+0x2b
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
xunit.execution.desktop.dll!Xunit.Sdk.ExecutionContextHelper.Run+0x7f
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc+0xe7
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart+0x5c
As you can see, both threads are synchronously blocking on a task, somewhere within System.Interactive.Async.
Disabling parallelization entirely makes the deadlock go away, and manually setting MaxParallelThreads to 4 instead of 2 seems to do the same. At this point it feels to me like there’s some sort of sync-over-async deadlock, where the 2 threads are synchronously blocked, possibly waiting for asynchronous operations to complete, but no thread is available to execute those async callbacks. When parallelization is active, xunit sets up a SynchronizationContext which enforces this (I’ve already complained once in https://github.com/xunit/xunit/issues/864). I don’t really have any proof that this is what’s happening, but since I’m not very familiar with System.Interactive.Async I thought I’d ask for help on this.
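To make the suspected mechanism concrete, here is a minimal sketch of a sync-over-async deadlock under a thread-constrained SynchronizationContext. It is not xunit's or EF Core's actual code: FixedThreadSyncContext is a hypothetical stand-in for something like MaxConcurrencySyncContext, and ReadAsync is a hypothetical stand-in for any async I/O call. A timeout is used on the blocking wait so the program terminates instead of hanging.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a constrained context: every posted callback
// runs on one of a fixed number of dedicated worker threads.
sealed class FixedThreadSyncContext : SynchronizationContext
{
    private readonly BlockingCollection<(SendOrPostCallback Callback, object State)> _queue = new();

    public FixedThreadSyncContext(int threadCount)
    {
        for (var i = 0; i < threadCount; i++)
        {
            new Thread(WorkerLoop) { IsBackground = true }.Start();
        }
    }

    public override void Post(SendOrPostCallback d, object state) => _queue.Add((d, state));

    private void WorkerLoop()
    {
        // Awaits started on this thread capture this context.
        SetSynchronizationContext(this);
        foreach (var (callback, state) in _queue.GetConsumingEnumerable())
        {
            callback(state);
        }
    }
}

static class Program
{
    // Hypothetical async operation; the code after the await is posted
    // back to the captured context and needs a free worker to run.
    private static async Task<int> ReadAsync()
    {
        await Task.Delay(100);
        return 42;
    }

    static void Main()
    {
        const int contextThreads = 2; // try 4 to mirror the MaxParallelThreads change
        const int blockedCallers = 2;

        var context = new FixedThreadSyncContext(contextThreads);
        var barrier = new Barrier(blockedCallers + 1); // callers plus the main thread
        var timedOut = 0;

        for (var i = 0; i < blockedCallers; i++)
        {
            context.Post(_ =>
            {
                // Sync-over-async: block a worker until the task completes.
                // The timeout only exists so this demo can report and exit.
                if (!ReadAsync().Wait(TimeSpan.FromSeconds(2)))
                {
                    Interlocked.Increment(ref timedOut);
                }

                // Hold the worker until every caller has finished waiting.
                barrier.SignalAndWait();
            }, null);
        }

        barrier.SignalAndWait();
        Console.WriteLine($"{timedOut} of {blockedCallers} blocked waits timed out");
    }
}
```

With two context threads and two blocked callers, both continuations sit in the queue with no worker left to run them, so both waits should time out. With four context threads the continuations find a free worker and the waits complete, which would be consistent with the behaviour described above.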
If something like this is indeed happening, it could be affecting EF Core in general - probably only the test suites - although I’m not sure why you guys haven’t bumped into this yourselves.
Any help would be greatly appreciated. In the meantime I’ll just crank up the number of threads or disable parallelization altogether.
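For reference, the workaround can be expressed in xunit's runner configuration. This is a sketch assuming xunit 2.x with an xunit.runner.json file copied to the test output directory; the value 4 is simply the one mentioned above, not a recommendation.

```json
{
  "maxParallelThreads": 4
}
```

Setting "parallelizeTestCollections" to false in the same file would instead turn off parallel execution of test collections within the assembly.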
Issue Analytics
- Created 5 years ago
- Comments:10 (10 by maintainers)
Relieved to know that it’s not a problem somewhere in Npgsql; I was really scared at some point that the lock-free connection pool was somehow responsible.
For sure, if you guys are calling ToEnumerable() (which blocks synchronously) over any sort of async I/O, that’ll cause a deadlock in a constrained-thread SynchronizationContext (like xunit’s).
This in the plan == deadlock:
_ToEnumerable