Regression in CPU utilisation with ASP.NET Core 2.2.0 web application

Describe the bug

We have an ASP.NET Core web application that uses MVC to render Razor views. It was recently updated from 2.1.6 to 2.2.0 and is now showing semi-regular spikes in CPU utilisation, where usage can increase by up to 4x for a short period before returning to baseline.

These spikes appear to occur semi-regularly, but inconsistently in time and across instances in a load-balanced fleet.
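
As a minimal sketch of how such spikes could be quantified from exported CPU samples: the function below flags any sample that exceeds a multiple of the median of the preceding window. The name `find_spikes`, the window size, the 3x factor, and the sample data are all assumptions for illustration and are not taken from the application or its dashboards.

```python
from statistics import median


def find_spikes(samples, window=12, factor=3.0):
    """Return indices of samples that exceed `factor` times the median
    of the preceding `window` samples (used as a rolling baseline)."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = median(samples[i - window:i])
        # Skip a zero baseline to avoid flagging every non-zero sample.
        if baseline > 0 and samples[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical CPU percentages sampled at a fixed interval.
cpu = [10, 11, 9, 10, 12, 10, 11, 10, 9, 10, 11, 10, 42, 40, 11, 10]
print(find_spikes(cpu))  # [12, 13]
```

A median baseline is used here so that one spike does not inflate the baseline for the samples that follow it.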

A series of screenshots that illustrate these CPU behaviours is included below.

The code and infrastructure changes between the two releases were to update all relevant NuGet package versions to 2.2.0, to use CompatibilityVersion.Version_2_2, and to install the 2.2.0 runtime and Windows hosting pack.

The only other code change made was to use IHttpMessageHandlerFactory in a code path that is not in use for the environment configuration where we are observing this issue.

The application uses IIS out-of-process hosting (rather than the new default in-process mode) out of an abundance of caution pending a released fix for #4398, as that issue caused compatibility problems between the API this web application consumes and a separate application running ASP.NET 4.6.1 (see #4437).

In tandem with the above changes, the API that the web application consumes was also updated to ASP.NET Core 2.2.0 in the same manner. That API is not showing the same CPU utilisation changes, even though its load mirrors that of the web application, which depends on it.

This naively leads me to the conclusion that there is a regression in ASP.NET Core somewhere in code paths related to Razor views (compared to say, APIs, controllers, model-binding, routing, runtime etc.), hence posting this issue here rather than in coreclr or corefx.

To Reproduce

Render Razor views with ASP.NET Core 2.2.0 on Windows using IIS out-of-process hosting for an extended period of time (more than ~1 hour to get a good chance of observing the CPU spike).

Expected behavior

Steady-state CPU utilisation of the application using 2.2.x should be comparable with 2.1.x.
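
One way to make "comparable" concrete, as a sketch only: compare the median CPU of a window before the upgrade with a window after it, and accept a small relative difference. The function names, the 10% tolerance, and the sample values are assumptions for illustration, not figures from this report.

```python
from statistics import median


def relative_change(before, after):
    """Relative change in median CPU between two sample windows."""
    base = median(before)
    if base == 0:
        raise ValueError("baseline median is zero")
    return (median(after) - base) / base


def is_comparable(before, after, tolerance=0.10):
    """True if the median CPU moved by no more than `tolerance`."""
    return abs(relative_change(before, after)) <= tolerance


# Hypothetical CPU percentages for the same time window on each version.
cpu_2_1 = [20, 22, 21, 19, 20]
cpu_2_2 = [21, 22, 60, 20, 21]  # contains one short spike
print(is_comparable(cpu_2_1, cpu_2_2))  # True
```

The median is largely insensitive to short spikes, so this check approximates the steady state only; the spikes themselves need to be examined separately.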

Screenshots

Below are various Grafana chart screenshots that illustrate the issue in our production environment for the web application with commentary.

CPU spiking for a single EC2 instance

This graph shows the CPU usage of a specific EC2 instance of the web application during a peak period of traffic.

image

Zoom on a specific spike of the single EC2 instance

This graph zooms in on a specific spike from the chart above to show it at higher fidelity.

image

CPU spiking across EC2 fleet during the same time window

This graph shows the CPU usage of all the EC2 instances of the web application during the same peak period of traffic as the chart above.

image

Average CPU across the EC2 fleet during the same window (green line) with a -7 day comparison (yellow line)

This graph shows the average CPU usage of all the EC2 instances of the web application during the same peak period of traffic as the charts above.

image

Average CPU across the fleet for the last 7 days

The green line in this graph shows the average CPU usage of all the EC2 instances of the web application during the last seven days. The yellow line shows the same metric for the seven days before that.

The vertical red lines indicate code deployments. The second red line from the left indicates where the 2.2.0 version of the application was deployed. The lines after it are business-as-usual code deployments made after the upgrade.

Note: The largest spikes come from CPU utilisation when new EC2 instances come into service from CPU-based auto-scaling and the application is installed onto the fresh instances. These spikes are expected and are separate from the issue being described here.

image

CPU utilisation per EC2 instance for the last 7 days

This graph is the same as the above, except that it shows every EC2 instance individually rather than the overall average of the fleet.

image

Average CPU across the fleet for the last 7 days for the underlying API

The green line in this graph shows the average CPU usage of all the EC2 instances of the API dependency of the web application during the last seven days. The yellow line shows the same metric for the seven days before that.

The vertical red lines indicate code deployments. The second red line from the left indicates where the 2.2.0 version of the application was deployed. The lines after it are business-as-usual code deployments made after the upgrade.

This graph serves as a comparison baseline: it covers a different application that is just an HTTP API and is also running 2.2.0, but is not affected by the same CPU spiking.

image

CPU utilisation per EC2 instance for the last 7 days for the underlying API

This graph is the same as the above, except that it shows every EC2 instance of the API individually rather than the overall average of the fleet.

image

Additional context

The application runs on AWS EC2 c5.large instances using Windows Server 2012 R2.

More overall context for the application can be found in this blog post from back when we updated it from 2.0.x to 2.1.0.

As this is an internal line-of-business application that only appears to reproduce the issue under load, I cannot provide a repro for the issue. However, we’re happy to privately provide any telemetry, traces, dumps etc. that may help you find the root cause.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 5
  • Comments: 45 (39 by maintainers)

Top GitHub Comments

3 reactions
pakrym commented, Feb 26, 2019

@pakrym @jkotalik If the fix for this underlying issue was indeed added in 3.0.0 preview-2, would it be a candidate for porting to the 2.2.x branch for inclusion with 2.2.3 (or 2.2.4)?

Yes.

2 reactions
martincostello commented, Apr 4, 2019

@pakrym @jkotalik If the fix for this underlying issue was indeed added in 3.0.0 preview-2, would it be a candidate for porting to the 2.2.x branch for inclusion with 2.2.3 (or 2.2.4)?

Yes.

Has the underlying fix for this been ported to 2.2 for servicing yet?
