Memory Leak on DbContext.SaveChanges() with Invalid Data
See original GitHub issueI’ve recently updated a number of APIs from .NET Core 2.2 to .NET 5.0 and am now experiencing some issues when saving changes back to the database context. I’ve found consistently that if there is something invalid in the data I am trying to insert (e.g. a string too long for the VARCHAR it is going into, a duplicate PK being inserted, etc.), rather than some error being percolated up to actually make this apparent, when calling SaveChanges on the DbContext, the RAM usage starts to spike. I’ve only been able to figure out which values are invalid by pulling down all the debugging symbols and inspecting down into the MS code from my own.
This is a stack trace down into EF Core:
Microsoft.EntityFrameworkCore.DbUpdateException
HResult=0x80131500
Message=An error occurred while updating the entries. See the inner exception for details.
Source=Microsoft.EntityFrameworkCore.Relational
StackTrace:
at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.Execute(IRelationalConnection connection) in /_/src/EFCore.Relational/Update/ReaderModificationCommandBatch.cs:line 257
This exception was originally thrown at this call stack:
Microsoft.Data.SqlClient.SqlConnection.OnError(Microsoft.Data.SqlClient.SqlException, bool, System.Action<System.Action>)
Microsoft.Data.SqlClient.SqlInternalConnection.OnError(Microsoft.Data.SqlClient.SqlException, bool, System.Action<System.Action>)
Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(Microsoft.Data.SqlClient.TdsParserStateObject, bool, bool)
Microsoft.Data.SqlClient.TdsParser.TryRun(Microsoft.Data.SqlClient.RunBehavior, Microsoft.Data.SqlClient.SqlCommand, Microsoft.Data.SqlClient.SqlDataReader, Microsoft.Data.SqlClient.BulkCopySimpleResultSet, Microsoft.Data.SqlClient.TdsParserStateObject, out bool)
Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
Microsoft.Data.SqlClient.SqlDataReader.MetaData.get()
Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(Microsoft.Data.SqlClient.SqlDataReader, Microsoft.Data.SqlClient.RunBehavior, string, bool, bool, bool)
Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, bool, int, out System.Threading.Tasks.Task, bool, bool, Microsoft.Data.SqlClient.SqlDataReader, bool)
Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, System.Threading.Tasks.TaskCompletionSource<object>, int, out System.Threading.Tasks.Task, out bool, bool, bool, string)
Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, Microsoft.Data.SqlClient.RunBehavior, bool, string)
...
[Call Stack Truncated]
Inner Exception 1:
SqlException: Violation of PRIMARY KEY constraint 'PK_FieldA'. Cannot insert duplicate key in object 'dbo.ObjectA'. The duplicate key value is (00000000-0000-0000-0000-000000000000).
The statement has been terminated.
If a deeper stack trace would help, I can try and run that down.
Below is the gist of what we’re doing. This should be relatively equivalent to what we are actually doing on these objects, but extracting a test case would be a bit difficult at this point. If this isn’t enough, I can try and provide more details, but I don’t know if I can create a 1:1.
public partial class ObjectA {
Guid? FieldA { get; set; }
}
public interface IObjectARepository {
void Add(ObjectA entity);
}
public class ObjectARepository : IObjectARepository {
private DbContext _context;
private IConfiguration _config;
public ObjectARepository(DbContext context, IConfiguration config)
{
_context = context ?? throw new ArgumentNullException("context");
_config = config ?? throw new ArgumentNullException("config");
}
public void Add(ObjectA entity)
{
_context.Set<ObjectA>().Add(entity);
_context.SaveChanges(); // This is where RAM starts maxing out
}
}
public class ObjectAModel
{
Guid? FieldA { get; set; }
}
public class ObjectAService
{
private readonly IMapper _mapper;
private readonly IObjectARepository _objectARepository;
public ObjectAService(IMapper mapper, IObjectARepository objectARepository)
{
_mapper = mapper ?? throw new ArgumentNullException("mapper");
_objectARepository = objectARepository ?? throw new ArgumentNullException("objectARepository");
}
public void Add(ObjectAModel model)
{
// Use AutoMapper to map ObjectAModel to an ObjectA
// ObjectA entity = _mapper.Map(model);
// Oversimplifying here, but just to indicate how we get our entity
// At this point, entity.FieldA is either null or an empty guid.
_objectARepository.Add(entity);
}
}
We are working data-first, so no component model annotations on the entities. This is the gist of the relevant data
CREATE TABLE [dbo].[ObjectA](
[FieldA] [uniqueidentifier] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
ALTER TABLE [dbo].[ObjectA] ADD CONSTRAINT [PK_ObjectA] PRIMARY KEY NONCLUSTERED
(
[FieldA] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
ALTER TABLE [dbo].[ObjectA] ADD DEFAULT (newid()) FOR [FieldA]
If I work past the exceptions that happen inside of EF Core, then the system just runs away consuming RAM until whatever system the process is running on decides to just kill it.
So just in review, the behavior that I’ve seen is I retrieve or create an instance of an Entity, provide data to one of the fields that is not compliant with the constraints in the schema, and then calls to DbContext.SaveChanges() hang with massive RAM consumption before crashing whatever the application is.
Debugging without “Just My Code” allows me to inspect down into EF Core and see that there is indeed either a PK violation, overflow of a field, or whatever else the actual database error is, but somewhere in the midst of things this error is not caught and percolated up to the user, but instead disappears off into some ether somewhere and instead EF Core opts to just start gobbling up RAM on the host machine.
Also, for this specific case (as opposed to the same issues we’ve seen with inserting strings beyond the VARCHAR size), I’ve noticed that the specific error is that we are trying to insert a duplicate key, the empty GUID. Our DB schema up there has a default of NEWID() for FieldA. I’m not entirely sure why we aren’t just letting the database default the field here. This may be completely unrelated, but shouldn’t EF be supplying Guid.NewGuid() in accordance with this constraint? Or does the responsibility fall to us in code to explicitly provide an ID despite the fact the database already knows one is needed? Not a huge deal, and I could split this out to a separate issue, just in case anyone has some input or it may actually be directly related.
Finally, just some details that may not be apparent from the above, we’re using dependency injection and a repository pattern for our data layer, so we inject in the repository into the service, and actually that service would implement an interface that is then injected at the next stage of the process, but I left the rest of the stack above that out, not sure how relevant it is after we get out of the service call. For the DbContext injection, the only option we’re setting is EnableRetryOnFailure. I tried turning this off, thinking perhaps this was what was omitting the failures and just retrying over and over again, but it didn’t seem to make a difference. I’m going to try to get a memory snapshot to determine what most of the RAM consumption is in terms of, but last I did that I’m pretty sure it was some sort of cache object down in EF Core.
Pardon anything missing, I typed up all of the code pieces from memory. Hopefully there’s something simple here, but attempts to find other guidance on this issue or to even find other reports of something similar have been relatively fruitless. I did see one article on a separate page about expression trees and lambdas, but it didn’t seem relevant to how we’re handling our objects in any way.
In short, this would be totally workable if when I call SaveChanges up there, EF percolates up the exception that the primary key constraint on dbo.ObjectA is not met because FieldA is a zero GUID, but instead I only get the memory leak/RAM spike and then a crash unless I inspect down into EF Core. If there is any way to make EF Core stop and report DB errors instead of sequestering them and consuming RAM until a crash, that would be ideal. Thanks for taking the time to read, I know it’s a bit much.
EF Core version: 5.0.5 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: NET 5.0 Operating system: Windows 10+IIS Express 10 (dev), Windows Server 2012 R2 + IIS 8.5 (prod) IDE: Visual Studio 2019 16.9.3
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (2 by maintainers)
@gilmorem560 If you are using Serilog, then this could be a duplicate of #15214. If this is the case, then you’ll need to use Serilog.Exceptions.EntityFrameworkCore.
If it helps anyone, this is the same problem with NLog aswell. I had some json serializing of the log messages by default and it traversed through all the DBContext properties. Fix was to change the Nlog Configuration to not do json serializing on the error messages. Notice the
details
attribute is not there in thelogconsoleErr