ANCM intermittently notifies the wrong HttpContext of disconnect
Is there an existing issue for this?
- I have searched the existing issues
Describe the bug
We are observing an intermittent issue in a local development environment (using IIS Express) whereby ANCM appears to be notifying the wrong HttpContext of a disconnect.
The behaviour we observe is as follows:
For a “normal” (successful) request (made by a Blazor application running under Edge), we see the following sequence in the logs for a CORS pre-flight OPTIONS request followed by a POST:
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request starting HTTP/2 OPTIONS https://localhost:44302/...
Microsoft.AspNetCore.Routing.Matching.DfaMatcher: Debug: 1 candidate(s) found for the request path '...'
[...]
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request finished HTTP/2 OPTIONS https://localhost:44302 ...
Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer: Debug: Connection ID "5908722738490511504" disconnecting.
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request starting HTTP/2 POST https://localhost:44302/...
Microsoft.AspNetCore.Routing.Matching.DfaMatcher: Debug: 1 candidate(s) found for the request path '...'
[...]
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request finished HTTP/2 POST https://localhost:44302/...
Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer: Debug: Connection ID "1585267091919867554" disconnecting.
For a request that fails, we see the following sequence, with the “Connection ID … disconnecting” message appearing after the second (POST) request starts:
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request starting HTTP/2 OPTIONS https://localhost:44302/...
Microsoft.AspNetCore.Routing.Matching.DfaMatcher: Debug: 1 candidate(s) found for the request path '...'
[...]
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request finished HTTP/2 OPTIONS https://localhost:44302/...
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request starting HTTP/2 POST https://localhost:44302/...
Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer: Debug: Connection ID "10520408744032996894" disconnecting.
Microsoft.AspNetCore.Routing.Matching.DfaMatcher: Debug: 1 candidate(s) found for the request path '/DistanceService.svc'
[...]
Microsoft.AspNetCore.Hosting.Diagnostics: Information: Request finished HTTP/2 POST https://localhost:44302/
Additionally, for a request that fails we observe the following behaviour:
- Adding a custom middleware to log HttpContext.RequestAborted.IsCancellationRequested shows that the POST request appears to be disconnected from the very start of request processing.
- For terminal middleware that does not check RequestAborted, the browser receives a response that has the correct headers but no body (presumably since IISHttpContext.AbortIO has called .Complete() on the body output pipe). This can be observed both in the IIS FRT logs and in the browser dev tools.
- For terminal middleware that does check RequestAborted, the browser receives a response with whatever response was generated up until that point (e.g. typically a 200 but with no body and only the default response headers set). Again, observed in both IIS FRT logs and browser dev tools.
- IIS Failed Request Tracing logs show that the ANCM_INPROC_REQUEST_DISCONNECT event for both requests only happened after normal processing (i.e. immediately after the GENERAL_FLUSH_RESPONSE_END event), i.e. there is no evidence in the FRT log that the browser disconnected early, or that the second (POST) request should be showing as aborted when checking RequestAborted.IsCancellationRequested.
Our entire middleware pipeline is very simple, and is currently just returning mock data and as such does not even use any asynchronous APIs other than for HTTP IO (i.e. reading the request body and writing the response body). We are not using any multithreading outside of what is provided by ASP.NET, i.e. we are not using Task.Run() or any other API that could be introducing concurrency issues, nor are we doing anything unusual which could explain this issue (such as allowing a HttpContext or other request scoped object to outlive its normal scope). Additionally, the logs (as above) show that the “Connection ID … disconnecting” message appears before routing has even occurred, i.e. before the request has reached any custom middleware.
Note that the issue is very intermittent, but we have observed it on at least four different development machines. For one developer it sometimes happens several times per day during the course of normal development/testing/debugging, but for others it very rarely happens (less than once per week under the same conditions).
Expected Behavior
HttpContext.RequestAborted.IsCancellationRequested should not return true if the browser is able to receive a response.
Steps To Reproduce
We are unable to consistently reproduce this issue. However, we can provide detailed logs (e.g. IIS Failed Request Trace logs, VS debug output, etc.) that clearly demonstrate the issue occurring.
Exceptions (if any)
No exceptions are logged.
.NET Version
6.0.200
Anything else?
- ASP.NET Core 6.0.2
- VS 2022 17.1.0
- Windows 10 Pro 10.0.19043
- aspnetcorev2.dll 16.0.21322.1
- iisexpress.exe 10.0.22489.1000
- Microsoft Edge 98.0.1108.62
- dotnet --info: https://github.com/dotnet/aspnetcore/files/8168912/dotnetinfo.txt
We can provide detailed IIS Failed Request Trace logs and debugging output from VS if required.
Top GitHub Comments
I’ll also add that IN_PROCESS_HANDLER::NotifyDisconnect does check to see if the managed request has completed before calling m_pDisconnectHandler (i.e. IISHttpServer.OnDisconnect). IN_PROCESS_HANDLER::IndicateManagedRequestComplete is called before disposing IISHttpContext (by PostCompletion in IISHttpContext.HandleRequest).
However, there is still a race condition here because the call to m_pDisconnectHandler happens outside the lock, so IN_PROCESS_HANDLER::NotifyDisconnect can read m_fManagedRequestComplete and m_pManagedHttpContext before IN_PROCESS_HANDLER::IndicateManagedRequestComplete has set them (to TRUE and nullptr respectively), but then m_pDisconnectHandler can still be called after IndicateManagedRequestComplete has completed:
Thread 1:
(m_srwDisconnectLock now exited, m_pDisconnectHandler not yet called on the local pManagedHttpContext)
Thread 2:
Thread 1: (continuing IN_PROCESS_HANDLER::NotifyDisconnect(), using the now stale pManagedHttpContext)

@zalmane I added the following middleware delegate to the start of the pipeline:
This should delay the disposal of the IISHttpContext for long enough to greatly reduce the chance of hitting the race condition. Obviously if this is for a high traffic web server you probably want to use this with caution!
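To make the race described in the first comment easier to follow, here is a minimal single-threaded model that replays the problematic ordering step by step. It is only a sketch and not the ANCM source: InProcessHandlerModel, ManagedContext, SnapshotUnderLock, CallDisconnectHandler and the two Replay functions are hypothetical names, and a std::mutex stands in for m_srwDisconnectLock. Only the member names m_fManagedRequestComplete and m_pManagedHttpContext are taken from the comment.

```cpp
#include <functional>
#include <mutex>

// Hypothetical stand-in for the managed IISHttpContext (not the real type).
struct ManagedContext {
    int id;
    bool disposed = false;
};

// Simplified model of the state described in the comment above.
class InProcessHandlerModel {
public:
    explicit InProcessHandlerModel(ManagedContext* ctx)
        : m_pManagedHttpContext(ctx) {}

    // First half of NotifyDisconnect: read the flag and pointer under the lock.
    // Returns nullptr if the managed request has already completed.
    ManagedContext* SnapshotUnderLock() {
        std::lock_guard<std::mutex> guard(m_disconnectLock);
        return m_fManagedRequestComplete ? nullptr : m_pManagedHttpContext;
    }

    // Second half of NotifyDisconnect: the handler runs after the lock has
    // been released, using whatever pointer the snapshot captured.
    void CallDisconnectHandler(
        ManagedContext* snapshot,
        const std::function<void(ManagedContext*)>& handler) {
        if (snapshot != nullptr) {
            handler(snapshot);
        }
    }

    // Models IndicateManagedRequestComplete: sets TRUE and nullptr under the lock.
    void IndicateManagedRequestComplete() {
        std::lock_guard<std::mutex> guard(m_disconnectLock);
        m_fManagedRequestComplete = true;
        m_pManagedHttpContext = nullptr;
    }

private:
    std::mutex m_disconnectLock;  // stand-in for m_srwDisconnectLock
    bool m_fManagedRequestComplete = false;
    ManagedContext* m_pManagedHttpContext;
};

// Replays the racy ordering. Returns true if the disconnect handler ran
// against a context that had already been disposed.
bool ReplayRacyInterleaving() {
    ManagedContext ctx{1};
    InProcessHandlerModel model(&ctx);
    bool notifiedStale = false;

    // Thread 1: reads state under the lock, then leaves the lock.
    ManagedContext* snapshot = model.SnapshotUnderLock();
    // Thread 2: the request completes and the context is disposed.
    model.IndicateManagedRequestComplete();
    ctx.disposed = true;
    // Thread 1: continues and calls the handler with the stale pointer.
    model.CallDisconnectHandler(
        snapshot, [&](ManagedContext* c) { notifiedStale = c->disposed; });
    return notifiedStale;
}

// Replays the benign ordering, where completion happens before the snapshot.
// Returns true only if the handler was (incorrectly) invoked.
bool ReplayBenignInterleaving() {
    ManagedContext ctx{2};
    InProcessHandlerModel model(&ctx);
    bool handlerCalled = false;

    model.IndicateManagedRequestComplete();
    ManagedContext* snapshot = model.SnapshotUnderLock();
    model.CallDisconnectHandler(
        snapshot, [&](ManagedContext*) { handlerCalled = true; });
    return handlerCalled;
}
```

In this model ReplayRacyInterleaving() reports that the handler ran on an already-disposed context, while ReplayBenignInterleaving() shows the completion check working when the snapshot is taken after completion. Whether the stale context in the real server ends up associated with a later request, as the log output in the report suggests, is not something this sketch can confirm.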