App initialization times out when an Azure App Service scale out occurs - 500.37 ANCM Failed to Start Within Startup Time Limit
Describe the bug
In our production environment, whenever App Service initiates a scale out and a new instance is brought online, the application initialization inevitably times out and the new instance just returns 500.37 ANCM Failed to Start Within Startup Time Limit.
Based on this comment on a similar bug report, we added logging to see how long initialization was taking:
As a general rule of thumb, as long as you get from the start of Program.Main to the end of Startup.Configure within that 120 seconds window you should be OK (obviously there is some additional work going on after Configure but it’s pretty minimal).
Startup had been taking close to a minute, so we removed non-essential tasks like migrations. After these modifications, from entry into Main(string[] args) until just prior to reaching host.Run(), our application initialization takes ~20 seconds:
2020-12-24 05:14:18.875 [Information] [FooCorp.Extensions.IHostExtensions] Starting AutoMapper configuration check...
2020-12-24 05:14:19.030 [Information] [FooCorp.Extensions.IHostExtensions] Done. Skipped AutoMapper configuration check because we're in Production. Took 00:00:00.0000498. Startup elapsed: 00:00:05.0453677.
2020-12-24 05:14:23.675 [Information] [FooCorp.Extensions.IHostExtensions] ApplyCommonMigrationsAndSeed took 00:00:04.6388031. (Common Migrations commented out). Startup elapsed: 00:00:09.6898124.
2020-12-24 05:14:28.353 [Information] [FooCorp.Extensions.IHostExtensions] SeedAllAccountPartitions took 00:00:04.6678194. Startup elapsed: 00:00:14.3677977.
2020-12-24 05:14:28.355 [Information] [] [Main] Starting redis configuration...
2020-12-24 05:14:28.926 [Information] [] [Main] Done. redis configuration took 00:00:00.5650512. Startup elapsed: 00:00:14.9405856.
2020-12-24 05:14:28.927 [Information] [] [Main] Starting Azure Blob/Queue/Table configuration...
2020-12-24 05:14:30.445 [Information] [] [Main] Done. Azure Blob/Queue/Table configuration took 00:00:01.5169586. Startup elapsed: 00:00:16.4598798.
2020-12-24 05:14:30.447 [Information] [] [Main] Host startup/building done. About to Run(). Startup elapsed: 00:00:16.4615245.
Even with the reduced startup time, we’re still getting the 500.37 timeout when a new node scales out.
We looked at the Event Log in the SCM site, and saw entries similar to this:
<Event>
<System>
<Provider Name="IIS AspNetCore Module V2" />
<EventID>1007</EventID>
<Level>1</Level>
<Task>0</Task>
<Keywords>Keywords</Keywords>
<TimeCreated SystemTime="2020-12-24T05:09:07Z" />
<EventRecordID>73324453</EventRecordID>
<Channel>Application</Channel>
<Computer>REDACTED</Computer>
<Security />
</System>
<EventData>
<Data>Application '/LM/W3SVC/669372563/ROOT' with physical root 'D:\home\site\wwwroot\' failed to load coreclr. Exception message: Managed server didn't initialize after 120000 ms.</Data>
<Data>Process Id: 1380.</Data>
<Data>File Version: 13.1.20267.9. Description: IIS ASP.NET Core Module V2 Request Handler. Commit: d12868dd7c10ff0433c16b06d3b59d03c40d987a</Data>
</EventData>
</Event>
I don’t see anything applicable to startup timeouts in the IIS stdout logs.
For the time being, we have increased startupTimeLimit to 240 seconds in hopes of band-aiding the problem, but I’d really like to figure out what the root cause is. The only way we can get the instance to come online is to kill w3wp.exe on the affected machine; then it comes online almost instantly.
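For readers unfamiliar with the setting, startupTimeLimit is an attribute of the aspNetCore element in web.config, expressed in seconds with a default of 120. The fragment below is only a sketch of what raising it to 240 might look like. The executable name FooCorp.Web.exe and the in-process hosting model are assumptions for illustration, not taken from our actual configuration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <location path="." inheritInChildApplications="false">
    <system.webServer>
      <handlers>
        <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
      </handlers>
      <!-- processPath is a hypothetical name for a self-contained executable. -->
      <!-- startupTimeLimit is in seconds; 240 doubles the 120-second default. -->
      <aspNetCore processPath=".\FooCorp.Web.exe"
                  hostingModel="inprocess"
                  startupTimeLimit="240" />
    </system.webServer>
  </location>
</configuration>
```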
Any idea on what might be causing this or further steps we can take to find the issue?
Thanks!
To Reproduce
This happens whenever Azure App Service scales out and adds a new instance.
Further technical details
- ASP.NET Core 3.1
- Visual Studio 2019 16.8.3
- .NET Core SDK 3.1.403
- Self-contained application, published to App Service as a .zip file from Azure DevOps
- App Service is a Windows S2 plan, currently with 5 instances manually enabled; we have disabled auto scale
- We’re using ANCMv2 and IIS
dotnet --info (note: the application is self-contained, built with .NET Core SDK 3.1.403):
D:\home>dotnet --info
.NET Core SDK (reflecting any global.json):
Version: 3.1.108
Commit: c423b556b5
Runtime Environment:
OS Name: Windows
OS Version: 10.0.14393
OS Platform: Windows
RID: win10-x86
Base Path: D:\Program Files (x86)\dotnet\sdk\3.1.108\
Host (useful for support):
Version: 3.1.8
Commit: 9c1330dedd
.NET Core SDKs installed:
1.1.14 [D:\Program Files (x86)\dotnet\sdk]
2.1.518 [D:\Program Files (x86)\dotnet\sdk]
2.2.109 [D:\Program Files (x86)\dotnet\sdk]
3.1.108 [D:\Program Files (x86)\dotnet\sdk]
.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.All 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.0.3 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.1.8 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 1.0.16 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 1.1.13 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.9 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.0.3 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.1.8 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download
D:\home>
Correct, you might need to ask Azure App Service about that.
As for your timing code, you should split host.Run into host.StartAsync() and host.WaitForShutdownAsync(), with the timing after StartAsync().
Thank you for your time.
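The suggestion to split host.Run can be sketched roughly as follows. This is a minimal illustration, assuming a generic-host ASP.NET Core 3.1 app. The empty Startup class, the "Main" logger name, and the log message wording are placeholders, not the code from the issue.

```csharp
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public static class Program
{
    public static async Task Main(string[] args)
    {
        // Start timing as early as possible in Main.
        var startupTimer = Stopwatch.StartNew();

        // Dispose the host on exit, as host.Run() would.
        using IHost host = Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder => webBuilder.UseStartup<Startup>())
            .Build();

        ILogger logger = host.Services
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("Main");

        logger.LogInformation("Host built. Startup elapsed: {Elapsed}", startupTimer.Elapsed);

        // StartAsync completes once the server and hosted services have started,
        // so time logged after it includes work that happens inside Run().
        await host.StartAsync();
        logger.LogInformation("Host started. Startup elapsed: {Elapsed}", startupTimer.Elapsed);

        // Block until shutdown is requested, as Run() would.
        await host.WaitForShutdownAsync();
    }
}

// Placeholder Startup so the sketch is self-contained.
public class Startup
{
    public void ConfigureServices(IServiceCollection services) { }

    public void Configure(IApplicationBuilder app) { }
}
```

If the second log entry shows a much larger elapsed time than the first, the delay is happening during host start rather than in the code that runs before Run().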