App initialization times out when an Azure App Service scale out occurs - 500.37 ANCM Failed to Start Within Startup Time Limit

Describe the bug

In our production environment, whenever App Service initiates a scale out and a new instance is brought online, the application initialization inevitably times out and the new instance just returns 500.37 ANCM Failed to Start Within Startup Time Limit.

Based on this comment on a similar bug report, we added logging to see how long initialization was taking:

As a general rule of thumb, as long as you get from the start of Program.Main to the end of Startup.Configure within that 120 seconds window you should be OK (obviously there is some additional work going on after Configure but it’s pretty minimal).

Startup had been taking close to a minute, so we removed non-essential tasks like migrations. After these modifications, from entry into Main(string[] args) until just prior to reaching host.Run(), our application initialization takes ~20 seconds:

2020-12-24 05:14:18.875 [Information] [FooCorp.Extensions.IHostExtensions] Starting AutoMapper configuration check...
2020-12-24 05:14:19.030 [Information] [FooCorp.Extensions.IHostExtensions] Done. Skipped AutoMapper configuration check because we're in Production. Took 00:00:00.0000498. Startup elapsed: 00:00:05.0453677.
2020-12-24 05:14:23.675 [Information] [FooCorp.Extensions.IHostExtensions] ApplyCommonMigrationsAndSeed took 00:00:04.6388031. (Common Migrations commented out). Startup elapsed: 00:00:09.6898124.
2020-12-24 05:14:28.353 [Information] [FooCorp.Extensions.IHostExtensions] SeedAllAccountPartitions took 00:00:04.6678194. Startup elapsed: 00:00:14.3677977.
2020-12-24 05:14:28.355 [Information] [] [Main] Starting redis configuration...
2020-12-24 05:14:28.926 [Information] [] [Main] Done. redis configuration took 00:00:00.5650512. Startup elapsed: 00:00:14.9405856.
2020-12-24 05:14:28.927 [Information] [] [Main] Starting Azure Blob/Queue/Table configuration...
2020-12-24 05:14:30.445 [Information] [] [Main] Done. Azure Blob/Queue/Table configuration took 00:00:01.5169586. Startup elapsed: 00:00:16.4598798.
2020-12-24 05:14:30.447 [Information] [] [Main] Host startup/building done. About to Run(). Startup elapsed: 00:00:16.4615245.

Even with the reduced startup time, we’re still getting the 500.37 timeout when a new node scales out.

We looked at the Event Log in the SCM site, and saw entries similar to this:

<Event>
  <System>
    <Provider Name="IIS AspNetCore Module V2" />
    <EventID>1007</EventID>
    <Level>1</Level>
    <Task>0</Task>
    <Keywords>Keywords</Keywords>
    <TimeCreated SystemTime="2020-12-24T05:09:07Z" />
    <EventRecordID>73324453</EventRecordID>
    <Channel>Application</Channel>
    <Computer>REDACTED</Computer>
    <Security />
  </System>
  <EventData>
    <Data>Application '/LM/W3SVC/669372563/ROOT' with physical root 'D:\home\site\wwwroot\' failed to load coreclr. Exception message: Managed server didn't initialize after 120000 ms.</Data>
    <Data>Process Id: 1380.</Data>
    <Data>File Version: 13.1.20267.9. Description: IIS ASP.NET Core Module V2 Request Handler. Commit: d12868dd7c10ff0433c16b06d3b59d03c40d987a</Data>
  </EventData>
</Event>

I don’t see anything applicable to startup timeouts in the IIS stdout logs.

For the time being, we have increased startupTimeLimit to 240 seconds in the hope of band-aiding the problem, but I’d really like to find the root cause. The only way we can get the instance to come online is to kill w3wp.exe on the affected machine; it then comes online almost instantly.
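
For reference, a minimal sketch of where startupTimeLimit lives in web.config. This is not the configuration from the original report: the executable name FooCorp.Web.exe is hypothetical, and the in-process hosting model is an assumption based on the 500.37 error. The value is in seconds, and 120 is the default, which matches the 120000 ms in the event log entry above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <location path="." inheritInChildApplications="false">
    <system.webServer>
      <handlers>
        <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
      </handlers>
      <!-- processPath is a hypothetical name for a self-contained app's executable. -->
      <!-- hostingModel="inprocess" is assumed, not confirmed by the report. -->
      <!-- startupTimeLimit is in seconds; the default is 120. -->
      <aspNetCore processPath=".\FooCorp.Web.exe"
                  stdoutLogEnabled="false"
                  stdoutLogFile=".\logs\stdout"
                  hostingModel="inprocess"
                  startupTimeLimit="240" />
    </system.webServer>
  </location>
</configuration>
```

Raising the limit only gives startup more time to finish; it does not explain why startup is slow on a freshly scaled-out instance.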

Any idea on what might be causing this or further steps we can take to find the issue?

Thanks!

To Reproduce

This happens whenever Azure App Service scales out and adds a new instance.

Further technical details

  • ASP.NET Core 3.1
  • Visual Studio 2019 16.8.3
  • .NET Core SDK 3.1.403
  • Self-contained application, published to App Service as a .zip file from Azure DevOps
  • App Service is a Windows S2 plan, currently with 5 instances manually enabled; we have disabled auto scale
  • We’re using ANCMv2 and IIS
  • dotnet --info (note: the application is self-contained, built with .NET Core SDK 3.1.403):
D:\home>dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   3.1.108
 Commit:    c423b556b5

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.14393
 OS Platform: Windows
 RID:         win10-x86
 Base Path:   D:\Program Files (x86)\dotnet\sdk\3.1.108\

Host (useful for support):
  Version: 3.1.8
  Commit:  9c1330dedd

.NET Core SDKs installed:
  1.1.14 [D:\Program Files (x86)\dotnet\sdk]
  2.1.518 [D:\Program Files (x86)\dotnet\sdk]
  2.2.109 [D:\Program Files (x86)\dotnet\sdk]
  3.1.108 [D:\Program Files (x86)\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.0.3 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.8 [D:\Program Files (x86)\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 1.0.16 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.13 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.9 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.22 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.14 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.0.3 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.8 [D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

D:\home> 

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

1 reaction
BrennanConroy commented, Dec 30, 2020

Am I correct in thinking that ZipFS has nothing to do with ASP.NET Core?

Correct, you might need to ask Azure App Service about that.

As for your timing code, you should split host.Run into host.StartAsync() and host.WaitForShutdownAsync() with the timing after StartAsync().
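
A minimal sketch of that suggestion, written for this page rather than taken from the thread. It assumes an ASP.NET Core 3.1 generic host; the inline Configure call stands in for the application's own Startup class, and the "Main" logger category simply mirrors the log lines in the report. Because StartAsync() returns only once the server has started, the elapsed time logged after it should include Startup.Configure, which a measurement taken just before host.Run() misses.

```csharp
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public static class Program
{
    public static async Task Main(string[] args)
    {
        // Start timing as early as possible in Main.
        var startupTimer = Stopwatch.StartNew();

        using IHost host = CreateHostBuilder(args).Build();

        // Any pre-start work (seeding, cache or storage configuration)
        // would run here, as in the original report.

        // Equivalent to the first half of host.Run(): builds the request
        // pipeline and starts the server.
        await host.StartAsync();

        var logger = host.Services
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("Main");
        logger.LogInformation(
            "Host started. Startup elapsed: {Elapsed}.", startupTimer.Elapsed);

        // Equivalent to the second half of host.Run(): block until shutdown.
        await host.WaitForShutdownAsync();
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder =>
            {
                // Placeholder pipeline so the sketch is self-contained;
                // a real app would call webBuilder.UseStartup<Startup>().
                webBuilder.Configure(app =>
                    app.Run(context => context.Response.WriteAsync("ok")));
            });
}
```

If the time logged here is still well under the startup limit on an instance that fails with 500.37, the delay is likely occurring before Main is entered, which would point away from application code.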

0 reactions
jonsagara commented, Jan 8, 2021

Thank you for your time.
