Eager Loading Many to Many Relationship Doesn't Use the Index

File a bug

Okay, so I’m using SQLite. That might matter, because DBMSs are pointlessly different just for the sake of it.

I have a many-to-many relationship declared with a List<Entity> navigation in each class, with no explicit join entity. That generally works, but I know the feature is newer.

For the eager-loading query below, EF Core generates SQL for which SQLite scans the join table instead of searching its index, which is slow.

Include your code

    public class Image
    {
        public int ImageId { get; set; }
        public int Width { get; set; }
        public int Height { get; set; }

        public virtual List<ImageTag> Tags { get; set; }
    }

    public class ImageTag
    {
        public int ImageTagId { get; set; }
        
        public string Name { get; set; }
        public string Description { get; set; }
        public string Type { get; set; }

        [IgnoreDataMember]
        public virtual List<Image> Images { get; set; }
    }

    // Inside the DbContext subclass:
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // nullability
        modelBuilder.Entity<Image>().Property(a => a.Width).IsRequired();
        modelBuilder.Entity<Image>().Property(a => a.Height).IsRequired();
        modelBuilder.Entity<ImageTag>().Property(a => a.Name).IsRequired();

        // indexes
        modelBuilder.Entity<Image>().HasIndex(a => a.Width);
        modelBuilder.Entity<Image>().HasIndex(a => a.Height);
        modelBuilder.Entity<ImageTag>().HasIndex(a => a.Name).IsUnique();
        modelBuilder.Entity<ImageTag>().HasIndex(a => a.Type);
    }

Query

var namesToLookup = new List<string> { "a", "b", "c" };
var existingTags = await Set<ImageTag>().Include(a => a.Images)
                .Where(a => namesToLookup.Contains(a.Name)).ToListAsync(token);

So that generates a query like so:

SELECT "i"."ImageTagId", "i"."Description", "i"."Name", "i"."Type", "t"."ImagesImageId", "t"."TagsImageTagId", "t"."ImageId", "t"."Height", "t"."Width"
FROM "ImageTags" AS "i"
LEFT JOIN (
    SELECT "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId", "i1"."Height", "i1"."Width"
    FROM "ImageImageTag" AS "i0"
    INNER JOIN "Images" AS "i1" ON "i0"."ImagesImageId" = "i1"."ImageId"
) AS "t" ON "i"."ImageTagId" = "t"."TagsImageTagId"
WHERE "i"."Name" IN ('a', 'b', 'c')
ORDER BY "i"."ImageTagId", "t"."ImagesImageId", "t"."TagsImageTagId", "t"."ImageId"

EXPLAIN QUERY PLAN reports the following for it:

MATERIALIZE 1
SCAN TABLE ImageImageTag AS i0
SEARCH TABLE Images AS i1 USING INTEGER PRIMARY KEY (rowid=?)
SEARCH TABLE ImageTags AS i USING INDEX IX_ImageTags_Name (Name=?)
SEARCH SUBQUERY 1 AS t USING AUTOMATIC COVERING INDEX (TagsImageTagId=?)
USE TEMP B-TREE FOR ORDER BY

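Because of the SCAN TABLE on the join table, the query is really slow: it takes minutes. That would be okay if there wasn’t a really easy fix, which is to join the tables directly instead of going through a subquery: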

SELECT "i"."ImageTagId", "i"."Description", "i"."Name", "i"."Type", "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId", "i1"."Height", "i1"."Width"
FROM "ImageTags" AS "i"
INNER JOIN "ImageImageTag" AS "i0" ON "i0"."TagsImageTagId" = "i"."ImageTagId"
INNER JOIN "Images" AS "i1" ON "i0"."ImagesImageId" = "i1"."ImageId"
WHERE "i"."Name" IN ('a', 'b', 'c')
ORDER BY "i"."ImageTagId", "i0"."ImagesImageId", "i0"."TagsImageTagId", "i1"."ImageId"

That query explains to:

SEARCH TABLE ImageTags AS i USING INDEX IX_ImageTags_Name (Name=?)
SEARCH TABLE ImageImageTag AS i0 USING INDEX IX_ImageImageTag_TagsImageTagId (TagsImageTagId=?)
SEARCH TABLE Images AS i1 USING INTEGER PRIMARY KEY (rowid=?)
USE TEMP B-TREE FOR ORDER BY

That’s fast. SQLite Studio just says [00:28:36] Query finished in 0.000 second(s).
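For anyone who wants to compare the two plans outside SQLite Studio, here is a minimal sketch. It assumes the Microsoft.Data.Sqlite package, a hand-written approximation of the schema EF Core creates for this model, and shortened column lists; `QueryPlanSketch` and `Explain` are hypothetical names. The wording of the plan, and whether the scan appears at all, depends on the SQLite version and on the data, so the sketch prints the plans rather than promising a particular result. Pointing the connection string at the real database file gives the meaningful answer.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Data.Sqlite;

public static class QueryPlanSketch
{
    // Hand-written approximation of the schema EF Core creates for this model (assumption).
    public const string Schema = @"
        CREATE TABLE Images (ImageId INTEGER PRIMARY KEY, Width INTEGER NOT NULL, Height INTEGER NOT NULL);
        CREATE TABLE ImageTags (ImageTagId INTEGER PRIMARY KEY, Name TEXT NOT NULL, Description TEXT, Type TEXT);
        CREATE UNIQUE INDEX IX_ImageTags_Name ON ImageTags (Name);
        CREATE TABLE ImageImageTag (
            ImagesImageId INTEGER NOT NULL,
            TagsImageTagId INTEGER NOT NULL,
            PRIMARY KEY (ImagesImageId, TagsImageTagId));
        CREATE INDEX IX_ImageImageTag_TagsImageTagId ON ImageImageTag (TagsImageTagId);";

    // Same shape as the SQL EF Core generates: LEFT JOIN onto a subquery.
    public const string SubqueryJoin = @"
        SELECT i.ImageTagId, t.ImageId
        FROM ImageTags AS i
        LEFT JOIN (
            SELECT i0.ImagesImageId, i0.TagsImageTagId, i1.ImageId
            FROM ImageImageTag AS i0
            INNER JOIN Images AS i1 ON i0.ImagesImageId = i1.ImageId
        ) AS t ON i.ImageTagId = t.TagsImageTagId
        WHERE i.Name IN ('a', 'b', 'c')";

    // Same shape as the hand-written alternative: flat joins.
    public const string FlatJoin = @"
        SELECT i.ImageTagId, i1.ImageId
        FROM ImageTags AS i
        INNER JOIN ImageImageTag AS i0 ON i0.TagsImageTagId = i.ImageTagId
        INNER JOIN Images AS i1 ON i0.ImagesImageId = i1.ImageId
        WHERE i.Name IN ('a', 'b', 'c')";

    // Returns the "detail" column of EXPLAIN QUERY PLAN, one entry per plan step.
    public static List<string> Explain(SqliteConnection connection, string sql)
    {
        using var command = connection.CreateCommand();
        command.CommandText = "EXPLAIN QUERY PLAN " + sql;
        var steps = new List<string>();
        using var reader = command.ExecuteReader();
        while (reader.Read())
            steps.Add(reader.GetString(3)); // columns: id, parent, notused, detail
        return steps;
    }

    public static void Main()
    {
        // An in-memory database lives as long as this one connection stays open.
        using var connection = new SqliteConnection("Data Source=:memory:");
        connection.Open();
        using (var create = connection.CreateCommand())
        {
            create.CommandText = Schema;
            create.ExecuteNonQuery();
        }

        foreach (var (label, sql) in new[] { ("subquery join", SubqueryJoin), ("flat join", FlatJoin) })
        {
            Console.WriteLine(label + ":");
            foreach (var step in Explain(connection, sql))
                Console.WriteLine("  " + step);
        }
    }
}
```

Note that the flat version uses INNER JOIN, so unlike the generated LEFT JOIN it drops tags that have no images.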

So this issue is twofold.

  • Is it like that for a (good) reason?
  • Is there a workaround that doesn’t require me to manually define my intermediate tables or execute raw sql?
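On the second question, one candidate that needs neither an explicit join entity nor raw SQL is split-query mode. The sketch below is an assumption, not something that was tried in this issue: `AsSplitQuery()` (available since EF Core 5.0) makes EF Core load the tags and their images in separate statements instead of one LEFT JOIN onto a subquery. Whether SQLite then uses the index still has to be confirmed with EXPLAIN QUERY PLAN against the real database. `GalleryContext`, `sketch.db`, and `LoadTags` are hypothetical names, and the entity classes are trimmed copies of the ones above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

public class Image
{
    public int ImageId { get; set; }
    public int Width { get; set; }
    public int Height { get; set; }
    public List<ImageTag> Tags { get; set; } = new();
}

public class ImageTag
{
    public int ImageTagId { get; set; }
    public string Name { get; set; } = "";
    public List<Image> Images { get; set; } = new();
}

// Hypothetical context; the real one also configures indexes in OnModelCreating.
public class GalleryContext : DbContext
{
    public DbSet<Image> Images { get; set; } = null!;
    public DbSet<ImageTag> ImageTags { get; set; } = null!;

    protected override void OnConfiguring(DbContextOptionsBuilder options) =>
        options.UseSqlite("Data Source=sketch.db")
               // Log the generated SQL so it can be pasted into EXPLAIN QUERY PLAN.
               .LogTo(Console.WriteLine, LogLevel.Information);
}

public static class SplitQuerySketch
{
    // Same LINQ as the original query, plus AsSplitQuery().
    public static List<ImageTag> LoadTags(GalleryContext db, List<string> names) =>
        db.ImageTags
          .Include(t => t.Images)
          .Where(t => names.Contains(t.Name))
          .AsSplitQuery()
          .ToList();

    public static void Main()
    {
        using var db = new GalleryContext();
        db.Database.EnsureDeleted();   // throwaway database for the sketch only
        db.Database.EnsureCreated();

        var tag = new ImageTag { Name = "a" };
        tag.Images.Add(new Image { Width = 1, Height = 1 });
        db.Add(tag);
        db.SaveChanges();

        foreach (var loaded in LoadTags(db, new List<string> { "a", "b", "c" }))
            Console.WriteLine($"{loaded.Name}: {loaded.Images.Count} image(s)");
    }
}
```

The logged SQL shows what each of the separate statements looks like, which is the part worth checking before relying on this.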

I have a several hundred gig database already built, so I’m not running a migration to define the tables myself to then try to manually build a better query in LINQ. I’m not running raw sql because that defeats the purpose. If I was willing to hardcode sql, I wouldn’t be using an ORM with so many known quirks. No offense, it’s still by far the best option.

Sorry for not following the template exactly, but I’ve seen this reproduced in other issues. They just didn’t complain about it, which makes me think it may be an SQLite <-> EFCore quirk. It’s not a specific bug, but more of a logic issue.

Include provider and version information

  • EF Core version: 6.0.0-preview4… (System.Data.Sqlite.Core is the same version)
  • Database provider: SQLite
  • Target framework: net5.0

I would be willing to use net6.0, but this app runs on linux, and installing the net6.0 preview tools for linux is still a pita.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 19 (8 by maintainers)

Top GitHub Comments

2 reactions
AndriySvyryd commented, Jul 27, 2021

This is something out-of-scope for EF to do automatically as determining whether this optimization would be appropriate for a given query and model is not trivial and might depend on the data and user’s intent.

0 reactions
da3dsoul commented, Jul 28, 2021

For future peeps that may come across this, I used the following code:

            var namesToLookup = new List<string> { "a", "b", "c" };
            var imageImageTags = Set<Dictionary<string, object>>("ImageImageTag");
            var tempTags = await (from imageTag in Set<ImageTag>()
                join imageImageTag in imageImageTags
                    on imageTag.ImageTagId equals EF.Property<int>(imageImageTag, "TagsImageTagId") into grouping
                from imageImageTag in grouping.DefaultIfEmpty()
                join image in Set<Image>()
                    on EF.Property<int>(imageImageTag, "ImagesImageId") equals image.ImageId into grouping2
                from image in grouping2.DefaultIfEmpty()
                where namesToLookup.Contains(imageTag.Name)
                orderby imageTag.ImageTagId
                select new { imageTag, image }).ToListAsync(token);
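            // Rebuild each tag's Images list from the flat (tag, image) rows.
            var existingTags = tempTags.GroupBy(a => a.imageTag, a => a.image).Select(a =>
            {
                var tag = a.Key;
                // A tag without images yields one row whose image is null (LEFT JOIN); skip it.
                tag.Images = a.Where(image => image != null).ToList();
                return tag;
            }).ToList();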
            AttachRange(existingTags);

This generates the following SQL:

SELECT "i"."ImageTagId", "i"."Description", "i"."Name", "i"."Type", "i1"."ImageId", "i1"."Height", "i1"."Width"
FROM "ImageTags" AS "i"
LEFT JOIN "ImageImageTag" AS "i0" ON "i"."ImageTagId" = "i0"."TagsImageTagId"
LEFT JOIN "Images" AS "i1" ON "i0"."ImagesImageId" = "i1"."ImageId"
WHERE "i"."Name" IN ('a', 'b', 'c')
ORDER BY "i"."ImageTagId"

The grouping, the relationship mapping, and the attaching to the context need to be done as post-processing at the end, but in my case it’s still much faster than running the subquery. EDIT: After letting it run against real data for a few hours, the average went from about 4:12 (rounded) to under 1s. The mapping table in question has several hundred thousand records.
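The post-processing step can be sketched without a database. This is a minimal illustration, assuming flat (tag, image) rows shaped like the LEFT JOIN result, including a null image for a tag that has no images. `Tag`, `Picture`, and `Regroup` are hypothetical stand-ins for the real entities and code. It groups by the key value rather than by object reference, so it does not rely on the change tracker handing back the same instance for every row.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public class Picture
{
    public int ImageId { get; set; }
}

public class Tag
{
    public int ImageTagId { get; set; }
    public string Name { get; set; } = "";
    public List<Picture> Images { get; set; } = new();
}

public static class RegroupSketch
{
    // Collapses flat LEFT JOIN rows into one Tag per key with its Images filled in.
    public static List<Tag> Regroup(IEnumerable<(Tag Tag, Picture? Image)> rows) =>
        rows.GroupBy(r => r.Tag.ImageTagId)
            .Select(g =>
            {
                var tag = g.First().Tag;
                // Rows with a null image come from tags that have no images.
                tag.Images = g.Where(r => r.Image != null)
                              .Select(r => r.Image!)
                              .ToList();
                return tag;
            })
            .ToList();

    public static void Main()
    {
        var a = new Tag { ImageTagId = 1, Name = "a" };
        var b = new Tag { ImageTagId = 2, Name = "b" };
        var rows = new List<(Tag, Picture?)>
        {
            (a, new Picture { ImageId = 10 }),
            (a, new Picture { ImageId = 11 }),
            (b, null), // tag without images
        };

        foreach (var tag in Regroup(rows))
            Console.WriteLine($"{tag.Name}: {tag.Images.Count} image(s)");
        // prints "a: 2 image(s)" then "b: 0 image(s)"
    }
}
```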

Because the intermediate table is inaccessible to users, I think this could be built into EF with some optimization and cleanup. I can’t think of any situation where the first left join would match without the second one also matching, since only EF creates rows in that table, and only when a user maps an Image to an ImageTag.
