Performance issue when using multiple includes and QueryFilters

I’m having a performance issue when using multiple includes with SplitQuery and QueryFilters.

When I do:

dbContext.Principal
    .Include(p => p.Entity1)
    .Include(p => p.Entity2)
    // ...
    .Include(p => p.EntityN)
    .Include(p => p.RelatedList0)
    .Include(p => p.RelatedList1)
    // ...

All tables use soft deletion, so each entity has a .HasQueryFilter(...) equivalent to IsDeleted = 0.
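
In EF Core, HasQueryFilter takes a lambda expression rather than a SQL string, so a soft-delete filter like the one described above would be configured roughly as follows. This is a minimal sketch: SoftDeleteContext and the shape of Entity1 are hypothetical names used for illustration, not the original model, and provider configuration is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity; only IsDeleted matters for the filter.
public class Entity1
{
    public int Id { get; set; }
    public bool IsDeleted { get; set; }
}

// Hypothetical context, for illustration only.
public class SoftDeleteContext : DbContext
{
    public DbSet<Entity1> Entities => Set<Entity1>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Applies the equivalent of "IsDeleted = 0" to every query that
        // touches Entity1, including when it is reached through Include.
        modelBuilder.Entity<Entity1>().HasQueryFilter(e => !e.IsDeleted);
    }
}
```

A single query can opt out of the filters by calling IgnoreQueryFilters() on it.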

The queries generated for the RelatedLists add a LEFT JOIN to every one of the ‘n’ tables in order to apply the IsDeleted filter, and then use those tables’ IDs in the ORDER BY clause, like this:

SELECT [tM].Columns, [p].[ID], [t0].[RelatedID0], [t1].[RelatedID1]... [tn].[RelatedIDN]
FROM [PrincipalTable] AS [p]
LEFT JOIN (
    SELECT [x].[RelatedID0]
    FROM [RelatedTable0] AS [x] 
    WHERE [x].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [p].[RelatedID0] = [t0].[RelatedID0]
LEFT JOIN (
    SELECT [x].[RelatedID1]
    FROM [RelatedTable1] AS [x] 
    WHERE [x].[IsDeleted] = CAST(0 AS bit)
) AS [t1] ON [p].[RelatedID1] = [t1].[RelatedID1]
LEFT JOIN ...
INNER JOIN (
    -- the table whose data is actually being loaded
) AS [tM] ON [p].[PrincipalID] = [tM].[PrincipalID]
ORDER BY [p].[PrincipalID], [t0].[RelatedID0], [t1].[RelatedID1]... [tn].[RelatedIDN]

This makes the query too complex for the database, and it gets worse if you use:

.Include(p => p.RelatedList1).ThenInclude(p => p.AnotherList1)
or
.Include(p => p.Entity1).ThenInclude(p => p.RelatedList1)

Each of these adds more complexity, and the number of ‘reads’ in SQL Server can get really high.

I believe that EF Core could improve performance by avoiding some of the LEFT JOINs. I’m not sure how it’s implemented, but I think it could at least remove an Entity’s ID from the ORDER BY and SELECT clauses when that Entity has no ‘.ThenInclude’, in which case the LEFT JOIN wouldn’t be needed either.

When there is a ThenInclude, EF Core could check whether the LEFT JOIN is really needed or whether it’s OK to use the ID that’s already on the ‘Principal’ table. The difference is that this column could hold the ID of a deleted row instead of NULL. But if the row was already filtered in the main query, I don’t think it matters whether it shows up as an ID that gets ignored or as a NULL that gets ignored anyway. Again, I haven’t looked at the EF Core code, so I can’t say whether this information is useful.

EF Core version: 6.0.0-preview.7.21378.4
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 6.0 preview 7
Operating system: Windows 10
IDE: VS 2022 Preview 17.0.0 Preview 3.1

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
roji commented, Sep 6, 2021

@sergiorpvn sorry for not replying earlier; I’ve taken a look at your comments again, and I’m not sure I understand…

The query above with the multiple LEFT JOINs is what I’d expect to be generated in single-query mode: I’m assuming that RelatedTable0 and RelatedTable1 correspond to the two collection navigations on Principal, in which case they wouldn’t be loaded via a join when using split query. I’ve done a quick repro based on your code (see below for the full code), and I see the following two queries when using split query:

-- EF Core loads the principals first
SELECT [p].[Id]
FROM [Principal] AS [p]
ORDER BY [p].[Id]

-- Since EF found at least one principal, it then loads the related entities:
SELECT [t].[RelatedID0], [t].[IsDeleted], [t].[PrincipalId], [p].[Id]
FROM [Principal] AS [p]
INNER JOIN (
    SELECT [r].[RelatedID0], [r].[IsDeleted], [r].[PrincipalId]
    FROM [Related0] AS [r]
    WHERE [r].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [p].[Id] = [t].[PrincipalId]
ORDER BY [p].[Id]

Note that there are no LEFT JOINs.

Can you please post an actual, runnable C# code sample, along with the SQL EF generates from it, and the SQL you’d like to see instead? This way we’re sure we’re talking about the same thing.

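Repro code
using System.ComponentModel.DataAnnotations;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

await using (var ctx = new BlogContext())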
{
    await ctx.Database.EnsureDeletedAsync();
    await ctx.Database.EnsureCreatedAsync();

    ctx.Principal.Add(new());
    await ctx.SaveChangesAsync();
}

await using (var ctx = new BlogContext())
{
    _ = await ctx.Principal
        .AsSplitQuery()
        .Include(p => p.RelatedList0)
        .Include(p => p.RelatedList1)
        .ToListAsync();
}

public class BlogContext : DbContext
{
    public DbSet<Principal> Principal { get; set; }

    static ILoggerFactory ContextLoggerFactory
        => LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0")
            .EnableSensitiveDataLogging()
            .UseLoggerFactory(ContextLoggerFactory);

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Related0>().HasQueryFilter(r => !r.IsDeleted);
        modelBuilder.Entity<Related1>().HasQueryFilter(r => !r.IsDeleted);
    }
}

public class Principal
{
    public int Id { get; set; }

    public List<Related0> RelatedList0 { get; set; }
    public List<Related1> RelatedList1 { get; set; }
}

public class Related0
{
    [Key]
    public int RelatedID0 { get; set; }
    public bool IsDeleted { get; set; }

    public Principal Principal { get; set; }
}

public class Related1
{
    [Key]
    public int RelatedID1 { get; set; }
    public bool IsDeleted { get; set; }

    public Principal Principal { get; set; }
}
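
The repro above routes the generated SQL to the console through a logger factory. A lighter-weight alternative, which should be available on EF Core 5.0 and later, is LogTo. The sketch below is not part of the original repro: LoggingContext, the Item entity and the connection string are placeholders, and it assumes the Microsoft.EntityFrameworkCore.SqlServer package is referenced.

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

// Placeholder entity, for illustration only.
public class Item
{
    public int Id { get; set; }
}

// Placeholder context that writes executed SQL commands to the console.
public class LoggingContext : DbContext
{
    public DbSet<Item> Items => Set<Item>();

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            // Placeholder connection string; adjust for your environment.
            .UseSqlServer(@"Server=localhost;Database=test;Integrated Security=True")
            // Restrict output to the database command category so that
            // only the SQL sent to the server is printed.
            .LogTo(Console.WriteLine,
                   new[] { DbLoggerCategory.Database.Command.Name },
                   LogLevel.Information);
}
```

With split queries, each SQL statement is logged as a separate command, which makes it easy to copy the exact SQL into an issue report.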
0 reactions
sergiorpvn commented, Oct 6, 2021

Hi @roji,

Sorry for the (too) late reply; I was busy with other work, vacation, etc., but now I can get back to this. Thanks for sending this code; it works as you described in your reply. However, after reviewing my own code, I think I left out some details. My problem is a bit different, but I was able to modify your code slightly to reproduce my issue.

Imagine a class DataRequest that represents a single data request, which you can send via fax or email (or phone, WhatsApp, pigeon, etc.), and each fax/email can carry multiple DataRequests. If I have some request IDs and want to get all the other requests that were sent together with them, I’d have the following code:

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using System.ComponentModel.DataAnnotations;

await using (var ctx = new BlogContext())
{
    await ctx.Database.EnsureDeletedAsync();
    await ctx.Database.EnsureCreatedAsync();
    var related0 = new EmailSent() { DatasRequested = new List<DataRequest>() { new() { PersonName = "X" }, new() { PersonName = "Z" } } };
    var related1 = new FaxSent() { DatasRequested = new List<DataRequest>() { new() { PersonName = "Y" } } };

    ctx.EmailsSent.Add(related0);
    ctx.FaxesSent.Add(related1);
    await ctx.SaveChangesAsync();
}

await using (var ctx = new BlogContext())
{
    _ = await ctx.DataRequests
        .AsSplitQuery()
        .Include(p => p.EmailSent).ThenInclude(p => p.DatasRequested)
        .Include(p => p.FaxSent).ThenInclude(p => p.DatasRequested)
        .Where(p => p.Id == 1)
        .ToListAsync();
}

public class BlogContext : DbContext
{
    public DbSet<DataRequest> DataRequests { get; set; }
    public DbSet<EmailSent> EmailsSent { get; set; }
    public DbSet<FaxSent> FaxesSent { get; set; }

    static ILoggerFactory ContextLoggerFactory
        => LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;Integrated Security=True;Trusted_Connection=True;Connect Timeout=60;ConnectRetryCount=0")
            .EnableSensitiveDataLogging()
            .UseLoggerFactory(ContextLoggerFactory);

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<DataRequest>().HasQueryFilter(r => !r.IsDeleted);
        modelBuilder.Entity<EmailSent>().HasQueryFilter(r => !r.IsDeleted);
        modelBuilder.Entity<FaxSent>().HasQueryFilter(r => !r.IsDeleted);
    }
}

public class DataRequest
{
    public int Id { get; set; }
    public bool IsDeleted { get; set; }

    public string PersonName { get; set; }

    public int? EmailSentId { get; set; }
    public int? FaxSentId { get; set; }

    public EmailSent EmailSent { get; set; }
    public FaxSent FaxSent { get; set; }
}

public class EmailSent
{
    [Key]
    public int EmailSentId { get; set; }
    public bool IsDeleted { get; set; }

    public List<DataRequest> DatasRequested { get; set; }
}

public class FaxSent
{
    [Key]
    public int FaxSentId { get; set; }
    public bool IsDeleted { get; set; }

    public List<DataRequest> DatasRequested { get; set; }
}

With version 6.0.0-rc.1.21452.10, this generates the following SELECT statements:

SELECT [d].[Id], [d].[EmailSentId], [d].[FaxSentId], [d].[IsDeleted], [d].[PersonName], [t].[EmailSentId], [t].[IsDeleted], [t0].[FaxSentId], [t0].[IsDeleted]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId], [e].[IsDeleted]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId], [f].[IsDeleted]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

SELECT [t1].[Id], [t1].[EmailSentId], [t1].[FaxSentId], [t1].[IsDeleted], [t1].[PersonName], [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
INNER JOIN (
    SELECT [d0].[Id], [d0].[EmailSentId], [d0].[FaxSentId], [d0].[IsDeleted], [d0].[PersonName]
    FROM [DataRequests] AS [d0]
    WHERE [d0].[IsDeleted] = CAST(0 AS bit)
) AS [t1] ON [t].[EmailSentId] = [t1].[EmailSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

SELECT [t1].[Id], [t1].[EmailSentId], [t1].[FaxSentId], [t1].[IsDeleted], [t1].[PersonName], [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
INNER JOIN (
    SELECT [d0].[Id], [d0].[EmailSentId], [d0].[FaxSentId], [d0].[IsDeleted], [d0].[PersonName]
    FROM [DataRequests] AS [d0]
    WHERE [d0].[IsDeleted] = CAST(0 AS bit)
) AS [t1] ON [t0].[FaxSentId] = [t1].[FaxSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

I could group Email/Fax/etc. into another class, but the issue would remain if another entity were linked to DataRequest in the same way the Email/Fax classes are.

I hope this code helps you.

Thanks!
