Memory leak when using MongoDb integration
Describe the bug
When the MongoDb integrations are present in the dotnet-integrations.json, our memory usage slowly increases at an approximate average rate of 5 MB per hour. This continues until enough memory has accumulated that our tasks in ECS/Fargate run out of memory and our containers are killed and restarted.
To Reproduce Steps to reproduce the behavior:
- Create a task in AWS ECS/Fargate with a .NET 5 service auto-instrumented with dd-trace-dotnet and standard dotnet-integrations.json
- Hook up the periodic health check to run the Ping command on the Mongo server
- Let the service sit idle for many days
- See the memory usage of the task increase over time from within CloudWatch (or other monitoring tool)
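To make the second repro step concrete, a periodic health check that issues the Ping command might look like the following sketch. This is an assumption about the reporter's setup, not code from the issue; the `MongoPingHealthCheck` name and the `admin` database choice are illustrative, and it requires the MongoDB.Driver NuGet package.

```csharp
// Hypothetical health check that pings the Mongo server, as in the repro.
// Requires MongoDB.Driver and Microsoft.Extensions.Diagnostics.HealthChecks.
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using MongoDB.Bson;
using MongoDB.Driver;

public class MongoPingHealthCheck : IHealthCheck
{
    private readonly IMongoClient _client;

    public MongoPingHealthCheck(IMongoClient client) => _client = client;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        // { ping: 1 } is the standard MongoDB liveness command.
        await _client.GetDatabase("admin")
            .RunCommandAsync<BsonDocument>(new BsonDocument("ping", 1),
                cancellationToken: cancellationToken);
        return HealthCheckResult.Healthy();
    }
}
```

Each health-check invocation exercises the instrumented MongoDB driver even while the service is otherwise idle, which is why the memory growth shows up on idle tasks.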
Screenshots
- Start to A = services sitting idle with version 1.26.1 and the full contents of datadog-integration.json
- A = ECS/Fargate killed and restarted the tasks due to out of memory
- A to B = no change, services eating memory and sitting idle after the restart
- B to C = irrelevant tests
- C = manual restart with a completely empty datadog-integration.json (its only content is an empty JSON array [])
- C to D = sitting idle for a day with very little increase in memory
- D = manual restart with all contents of datadog-integrations.json restored, but removed the MongoDb sections
- D’ = enabled a Hangfire job that runs periodically to execute MongoDb queries
- D to E = mostly sitting idle with very little increase in memory
- E = manual restart with all contents of datadog-integrations.json restored (including MongoDb sections)
- E to F = sitting idle with memory increase
- F = manual restart after deploy with upgrade to 1.28.0 and updated datadog-integrations.json at that tag (includes MongoDb sections)
- F to now = sitting idle and looks like memory increase is continuing with latest
Runtime environment (please complete the following information):
- Instrumentation mode: Automatic with Debian APM installed in container
- Tracer version: 1.26.1 and 1.28.0
- OS: container based on the mcr.microsoft.com/dotnet/aspnet:5.0 image, a Debian Buster-Slim distro
- CLR: .NET 5
Issue Analytics
- Created 2 years ago
- Comments: 16 (7 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks @hiiru for providing all the extra details, this is really useful.
The source of the problem
We think we’ve tracked down the source of the issue; unfortunately, there appear to be multiple facets to it.

MongoClient (incorrectly) flows the execution context from the current thread when it is being created. That means if there’s an active trace at the time the MongoClient is created, the background thread that checks the status of the cluster inherits that active trace. Even if the scope is closed on another thread, the background thread will keep adding spans to the trace.

Because MongoClient is typically registered as a singleton service in ASP.NET Core’s DI container, it is “lazily” created when it is first needed. Exactly when that happens is very application-dependent, but it’s likely to be inside a request trace or something similar, which means it’s likely to capture a trace.

Solutions
Depending on the exact behaviour you’re seeing, and the nature of the issue, there are various ways to mitigate or fix it.
1. Stop flowing execution context in MongoClient

This is the real solution to the problem, which would involve an update to the Mongo C# driver to prevent it from flowing the execution context. I’d suggest filing a support case with Mongo - as a customer, hopefully this will be prioritised!
2. Create the MongoClient ahead of time

Depending on your application, you may find that you can solve the problem by registering your singleton MongoClient instance in DI directly - constructing it eagerly while Startup runs, rather than from a lazy factory.
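The original comment's before/after code was lost in scraping; as a sketch of the registration change (assuming a standard Startup.ConfigureServices and a placeholder connection string, neither of which is from the issue), it might look like:

```csharp
// Requires the MongoDB.Driver and Microsoft.Extensions.DependencyInjection
// packages; the connection string is a placeholder.
using Microsoft.Extensions.DependencyInjection;
using MongoDB.Driver;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Before: the factory runs lazily on first resolution, which is
        // often inside a request trace whose execution context the
        // MongoClient then captures.
        // services.AddSingleton<IMongoClient>(sp =>
        //     new MongoClient("mongodb://localhost:27017"));

        // After: construct the client while Startup runs, before any
        // request trace exists, and register the instance directly.
        var mongoClient = new MongoClient("mongodb://localhost:27017");
        services.AddSingleton<IMongoClient>(mongoClient);
    }
}
```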
This ensures that the context is captured when building Startup, instead of when the client is first used. As long as you aren’t creating long-lived scopes that encompass this part of the app lifecycle (which we would not recommend anyway), this may solve your issue.

More generally, if you’re not using this DI pattern, ensure that you’re not creating the MongoClient inside a “long-lived” trace/scope. If you do need to create the client inside an existing scope, you can ensure you don’t capture the context by wrapping the construction in a call to ExecutionContext.SuppressFlow().
3. Enable partial flush
We generally advise against creating long-lived traces. By design, traces remain in memory until the last span is closed, so if you have a long-lived trace with many child spans, these will use an increasing amount of memory as the app lifetime increases.
In some cases, you can mitigate this issue by enabling partial flush for the .NET Tracer, as described in the documentation. If the above solution isn’t possible or doesn’t resolve the issue, it may be worth trying DD_TRACE_PARTIAL_FLUSH_ENABLED=true.
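A minimal sketch of enabling this: the variable goes in the container's environment (in this setup, the ECS task definition's environment block would be the natural place; the exact mechanism depends on your deployment).

```shell
# Enable partial flushing so long-lived traces are sent incrementally
# instead of accumulating in memory until the root span closes.
export DD_TRACE_PARTIAL_FLUSH_ENABLED=true
```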
Other possible enhancements
There seems to be little utility in seeing these infrastructural isMaster spans in DataDog, as they mostly seem to clutter up traces (completely aside from the memory issue described). Would you be interested in an option in the tracer that excludes these spans (and other “infrastructural” commands) from automatic instrumentation by default, allowing you to enable them with an environment variable?

After 2 days with the SuppressFlow fix, the memory still looks good. This definitely fixed the issue. 👍

@andrewlock Thank you for the solution and your MongoDb PR to fix it.