Help needed: deadlock occurring in functional tests

I hope the EF Core team can help out with this issue; it may actually affect non-Npgsql usage as well.

At some point during 2.1.0 development, Npgsql’s functional test suite started hanging when run on Appveyor (it runs fine locally). Unfortunately I don’t know exactly when this started, because the build was broken anyway for a while.

Using Process Explorer, I was able to get stack traces of the two blocked threads during the test run (Appveyor has two virtual CPUs, so xunit automatically uses that many parallel threads). I’ve cut out some irrelevant native parts, but otherwise here they are:

Thread 1:

mscorlib.dll!System.Threading.ManualResetEventSlim.Wait+0x2e3
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait+0xb6
mscorlib.dll!System.Threading.Tasks.Task.InternalWait+0x1a1
mscorlib.dll!System.Threading.Tasks.Task`1.GetResultCore+0x26
System.Interactive.Async.dll!<ToEnumerable_>d__442`1.MoveNext+0xd7
System.Core.dll!System.Linq.Buffer`1..ctor+0xa2
System.Core.dll!<GetEnumerator>d__1.MoveNext+0xd8
mscorlib.dll!System.Collections.Generic.List`1..ctor+0x1b4
System.Core.dll!System.Linq.Enumerable.ToList+0x46
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c__DisplayClass63_0`1.<CollectionAsserter>b__0+0x31b
System.Core.dll!System.Dynamic.UpdateDelegates.UpdateAndExecuteVoid3+0x2c4
[Unmanaged to Managed Transition]
clr.dll+0x1f3c
[Managed to Unmanaged Transition]
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c.<Correlated_collections_on_select_many>b__60_1+0x1143
Microsoft.EntityFrameworkCore.Specification.Tests.dll!Microsoft.EntityFrameworkCore.TestUtilities.TestHelpers.AssertResults+0x318
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<AssertQuery>d__20`2.MoveNext+0x4f6
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!MoveNextRunner.Run+0x6c
xunit.execution.desktop.dll!<>c__DisplayClass7_0.<Post>b__1+0x29
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext+0x2b
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
xunit.execution.desktop.dll!Xunit.Sdk.ExecutionContextHelper.Run+0x7f
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc+0xe7
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart+0x5c

Thread 2:

mscorlib.dll!System.Threading.ManualResetEventSlim.Wait+0x2e3
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait+0xb6
mscorlib.dll!System.Threading.Tasks.Task.InternalWait+0x1a1
mscorlib.dll!System.Threading.Tasks.Task`1.GetResultCore+0x26
System.Interactive.Async.dll!<ToEnumerable_>d__442`1.MoveNext+0xd7
System.Core.dll!System.Linq.Enumerable.Count+0xd5
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<>c__61`1.<CollectionSorter>b__61_0+0x199
System.Core.dll!System.Linq.EnumerableSorter`2.ComputeKeys+0x85
System.Core.dll!System.Linq.EnumerableSorter`1.Sort+0x1d
System.Core.dll!<GetEnumerator>d__1.MoveNext+0x118
mscorlib.dll!System.Collections.Generic.List`1..ctor+0x1b4
System.Core.dll!System.Linq.Enumerable.ToList+0x46
Microsoft.EntityFrameworkCore.Specification.Tests.dll!Microsoft.EntityFrameworkCore.TestUtilities.TestHelpers.AssertResults+0x2ae
Microsoft.EntityFrameworkCore.Specification.Tests.dll!<AssertQuery>d__20`2.MoveNext+0x4f6
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!MoveNextRunner.Run+0x6c
xunit.execution.desktop.dll!<>c__DisplayClass7_0.<Post>b__1+0x29
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext+0x2b
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
xunit.execution.desktop.dll!Xunit.Sdk.ExecutionContextHelper.Run+0x7f
xunit.execution.desktop.dll!Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc+0xe7
mscorlib.dll!System.Threading.ExecutionContext.RunInternal+0x160
mscorlib.dll!System.Threading.ExecutionContext.Run+0x14
mscorlib.dll!System.Threading.ExecutionContext.Run+0x52
mscorlib.dll!System.Threading.ThreadHelper.ThreadStart+0x5c

As you can see, both threads are synchronously blocking on a task, somewhere within System.Interactive.Async.

Disabling parallelization entirely makes the deadlock go away, and manually setting MaxParallelThreads to 4 instead of 2 seems to do the same. At this point it feels to me like a sync-over-async deadlock: the 2 threads are synchronously blocked, possibly waiting for asynchronous operations to complete, but no thread is available to execute the callbacks that would complete them. When parallelization is active, xunit sets up a SynchronizationContext which enforces the thread limit (I’ve already complained about this once in https://github.com/xunit/xunit/issues/864). I don’t really have any proof that this is what’s happening, but since I’m not very familiar with System.Interactive.Async I thought I’d ask for help.
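To make the suspected mechanism concrete, here is a minimal sketch of a sync-over-async deadlock on a thread-limited SynchronizationContext. It is an illustration only: `BoundedSyncContext` and `FetchAsync` are hypothetical names invented for this example, not xunit's or EF Core's code, and the sketch assumes a recent .NET SDK (C# 9 or later). A timeout is used so the program reports the stall instead of hanging.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a thread-limited context: posted callbacks
// run only on a fixed number of dedicated worker threads.
sealed class BoundedSyncContext : SynchronizationContext
{
    private readonly BlockingCollection<(SendOrPostCallback, object?)> _queue = new();

    public BoundedSyncContext(int workers)
    {
        for (var i = 0; i < workers; i++)
        {
            new Thread(() =>
            {
                // Awaits started on this thread capture this context.
                SetSynchronizationContext(this);
                foreach (var (callback, state) in _queue.GetConsumingEnumerable())
                    callback(state);
            }) { IsBackground = true }.Start();
        }
    }

    public override void Post(SendOrPostCallback d, object? state) => _queue.Add((d, state));
}

static class Program
{
    // Hypothetical async operation. Its continuation after the await is
    // posted back to the captured context, so it needs a free worker.
    static async Task<int> FetchAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    static void Main()
    {
        var context = new BoundedSyncContext(workers: 2);
        using var done = new CountdownEvent(2);

        for (var i = 0; i < 2; i++)
        {
            context.Post(_ =>
            {
                // Sync-over-async: this worker blocks until the task completes,
                // but the task can only complete on one of the blocked workers.
                var finished = FetchAsync().Wait(TimeSpan.FromSeconds(2));
                Console.WriteLine(finished ? "completed" : "timed out: no worker free for the continuation");
                done.Signal();
            }, null);
        }

        done.Wait();
    }
}
```

With two workers and two blocking callbacks, both waits should run into the timeout. Giving the context more workers than there are blocked callers lets the queued continuations run, which would be consistent with the observation that 4 threads avoid the hang.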

If something like this is indeed happening, it could be affecting EF Core in general - probably only the test suites - although I’m not sure why you guys haven’t bumped into this yourselves.

Any help would be greatly appreciated. In the meantime I’ll just crank up the number of threads or disable parallelization altogether.
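For reference, one way to apply either workaround in an xunit v2 test project is the assembly-level `CollectionBehavior` attribute. This is a sketch of the general mechanism, not necessarily how the Npgsql test suite configured it; the value 4 is the one mentioned above.

```csharp
using Xunit;

// Option A: raise the cap on parallel test threads above the CPU count.
[assembly: CollectionBehavior(MaxParallelThreads = 4)]

// Option B (use instead of A): turn off test parallelization for the assembly.
// [assembly: CollectionBehavior(DisableTestParallelization = true)]
```

Both options only hide the problem: the blocking call is still there, and a higher thread cap merely makes it less likely that every worker is blocked at once.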

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

roji commented, May 1, 2018 (1 reaction)

Relieved to know that it’s not a problem somewhere in Npgsql; I was really scared at some point that the lock-free connection pool was somehow responsible.

For sure, if you guys are calling ToEnumerable() (which blocks synchronously) over any sort of async I/O, that’ll cause a deadlock in a constrained-thread SynchronizationContext (like xunit’s).
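As a sketch of the non-blocking alternative, the async sequence can be consumed with `await` so that no thread sits in a blocking wait. Note the assumptions: this uses the `IAsyncEnumerable<T>` interface and `await foreach` from C# 8, which postdate this thread (the System.Interactive.Async of the time had its own interface), and `ReadRowsAsync` and `CollectAsync` are hypothetical names standing in for a query that does async I/O and a test helper that materializes it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class Program
{
    // Hypothetical async source standing in for a query doing async I/O.
    static async IAsyncEnumerable<int> ReadRowsAsync()
    {
        for (var i = 0; i < 3; i++)
        {
            await Task.Delay(10);
            yield return i;
        }
    }

    // Materializes the sequence asynchronously: while an element is pending,
    // the calling thread is released instead of blocking on the task.
    static async Task<List<int>> CollectAsync(IAsyncEnumerable<int> source)
    {
        var result = new List<int>();
        await foreach (var item in source)
            result.Add(item);
        return result;
    }

    static async Task Main()
    {
        var rows = await CollectAsync(ReadRowsAsync());
        Console.WriteLine(string.Join(",", rows)); // prints "0,1,2"
    }
}
```

Because each wait is an `await`, a constrained context gets its worker back while the I/O is in flight and can use it to run the continuation.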

anpete commented, Apr 30, 2018 (1 reaction)

This in the plan == deadlock: _ToEnumerable
