LoadWith: potential SQL codegen quality improvement

Nothing wrong here, just noticed a potential SQL codegen improvement.

Suppose we have a model with an association, like so:

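public class Parent
{
  [PrimaryKey] public int Id { get; set; }
  public int ChildId { get; set; }
  // Parent.ChildId references Child.Id
  [Association(ThisKey = nameof(ChildId), OtherKey = nameof(Child.Id))]
  public Child Child { get; set; }
}

public class Child
{
  [PrimaryKey] public int Id { get; set; }
  public string Name { get; set; }
}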

If I do the following projection:

from p in db.Parent
select new Parent 
{
  Id = p.Id,
  Child = new Child  
  {
    Id = p.Child.Id, 
    Name = p.Child.Name,
  }
}

I get this SQL, which is perfect:

select
  p.Id,
  c.Id as ChildId,
  c.Name as ChildName
from Parent p
left join Child c on p.childId = c.Id

Now if I do a similar query using LoadWith:

db.Parent
  .LoadWith(p => p.Child,
            c => c.Select(x => new Child { Id = x.Id, Name = x.Name }))

The SQL becomes:

select
  p.Id,
  c.Id as ChildId,
  c.Name as ChildName,
  c.isEmpty as ChildIsEmpty
from Parent p
left join (
  select 
    Id, 
    Name, 
    1 as isEmpty
  from Child
) c on p.childId = c.Id

I suppose the key difference between the two is the isEmpty column and what happens when the left join finds no matching Child row. In the first case I’ll get a Child instance with Id = 0 and Name = null, while in the second case I’ll have Child = null.
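The difference can be reproduced outside linq2db. What follows is a minimal sketch, assuming an in-memory SQLite database driven from Python's sqlite3 module (neither is part of the original report) and a table layout mirroring the model above. It runs both query shapes against a parent whose ChildId matches no row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    insert into Child  values (10, 'Alice');
    insert into Parent values (1, 10);  -- has a matching child
    insert into Parent values (2, 99);  -- no matching child
""")

# Shape produced by the manual projection: a plain left join.
plain = db.execute("""
    select p.Id, c.Id as ChildId, c.Name as ChildName
    from Parent p
    left join Child c on p.ChildId = c.Id
    order by p.Id
""").fetchall()

# Shape produced by LoadWith: a sub-select carrying a constant marker column.
marked = db.execute("""
    select p.Id, c.Id as ChildId, c.Name as ChildName, c.isEmpty as ChildIsEmpty
    from Parent p
    left join (select Id, Name, 1 as isEmpty from Child) c on p.ChildId = c.Id
    order by p.Id
""").fetchall()

print(plain)   # [(1, 10, 'Alice'), (2, None, None)]
print(marked)  # [(1, 10, 'Alice', 1), (2, None, None, None)]
```

For parent 2 the marker column comes back NULL. That is presumably the signal the materializer uses to set Child = null rather than build a Child from default values.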

The sub-select introduced for 1 as isEmpty could be dropped when Child has a (non-nullable) primary key, which can double as the marker column, or when the join is an inner join:

select
  p.Id,
  c.Id as ChildIsEmpty,
  c.Id as ChildId,
  c.Name as ChildName
from Parent p
left join Child c on p.childId = c.Id
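To check that the primary key can stand in for the constant marker, here is a small sketch under the same assumptions as before (in-memory SQLite via Python's sqlite3, tables mirroring the model; none of this comes from linq2db itself). It compares the "has a child" signal derived from isEmpty with the one derived from c.Id.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    insert into Child  values (10, 'Alice'), (11, null);  -- 11 has a NULL Name
    insert into Parent values (1, 10), (2, 11), (3, 99);   -- 99 matches nothing
""")

# Marker taken from the constant column of the sub-select.
with_subselect = db.execute("""
    select p.Id, c.isEmpty is not null as HasChild
    from Parent p
    left join (select Id, Name, 1 as isEmpty from Child) c on p.ChildId = c.Id
    order by p.Id
""").fetchall()

# Marker taken from the non-nullable primary key, no sub-select needed.
with_primary_key = db.execute("""
    select p.Id, c.Id is not null as HasChild
    from Parent p
    left join Child c on p.ChildId = c.Id
    order by p.Id
""").fetchall()

print(with_subselect)    # [(1, 1), (2, 1), (3, 0)]
print(with_primary_key)  # [(1, 1), (2, 1), (3, 0)]
```

Parent 2 shows why the marker has to be a non-nullable column: its child exists but has a NULL Name, so Name could not be used to detect a missing row.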

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
jods4 commented, Jul 30, 2021

> which eventually will require JOINing with derived table. So, Linq2Db adds this JOIN in advance.

  1. The fact that it needs sub-selects for more complex queries doesn’t preclude it from optimizing simpler cases. In fact, linq2db performs a ton of optimizations and simplifications on your queries, and that’s really one of its strengths.

  2. Sorting can’t be done by adding ORDER BY in a JOIN subquery like that anyway. Filtering can, but it can be done in the JOIN condition as well. As far as I can see, the subquery is required for the constant isEmpty column, which detects if there is a matching row or not, as I pointed out in my OP.
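The filtering half of point 2 can be illustrated with a short sketch, again assuming an in-memory SQLite database driven from Python's sqlite3 and a hypothetical filter on Name (the original comment names no specific filter). It compares a filter placed inside the joined sub-select with the same filter placed in the JOIN condition.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    insert into Child  values (10, 'Alice'), (11, 'Bob');
    insert into Parent values (1, 10), (2, 11), (3, 99);
""")

# Filter applied inside a derived table.
in_subquery = db.execute("""
    select p.Id, c.Name
    from Parent p
    left join (select Id, Name from Child where Name = 'Alice') c
      on p.ChildId = c.Id
    order by p.Id
""").fetchall()

# Same filter moved into the JOIN condition, no derived table.
in_join_condition = db.execute("""
    select p.Id, c.Name
    from Parent p
    left join Child c on p.ChildId = c.Id and c.Name = 'Alice'
    order by p.Id
""").fetchall()

print(in_subquery)        # [(1, 'Alice'), (2, None), (3, None)]
print(in_join_condition)  # [(1, 'Alice'), (2, None), (3, None)]
```

Both forms keep every parent and only attach the children that pass the filter, so for this kind of filter the derived table is not needed.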

> I believe there is no RDBMS out there which suffers from an additional SELECT in JOIN.

The question is: if I filter on one of those sub-select columns in the outer projection, will the RDBMS engine be able to use indexes on the underlying table or is the subselect hiding the index?

I think all RDBMS should handle this properly but I have only tested Oracle 12 (index is used ok), which is the target for my query. If we want to investigate the perf benefits, the “weakest” DBs should be tested to be sure (thinking of MySQL, SQLite, MS Access for example).
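For SQLite, one way to look into this is to compare query plans. The sketch below assumes Python's sqlite3 module and a hypothetical index named IX_Child_Name; it prints the EXPLAIN QUERY PLAN output for the direct join and for the sub-select form. No expected output is given because the plan text depends on the SQLite version in use.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    create index IX_Child_Name on Child (Name);  -- hypothetical index
""")

direct = """
    select p.Id, c.Name
    from Parent p
    left join Child c on p.ChildId = c.Id
    where c.Name = 'Alice'
"""
wrapped = """
    select p.Id, c.Name
    from Parent p
    left join (select Id, Name, 1 as isEmpty from Child) c on p.ChildId = c.Id
    where c.Name = 'Alice'
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail text.
    return [row[-1] for row in db.execute("explain query plan " + sql)]

for label, sql in (("direct", direct), ("wrapped", wrapped)):
    print(label)
    for step in plan(sql):
        print("  ", step)
```

If both plans mention IX_Child_Name, the sub-select is not hiding the index on that SQLite build. Real conclusions would need realistic data volumes and the other engines mentioned above.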

> So there is no reason to change this.

It sure works, so it doesn’t have to be changed.

There are still “nice to have” reasons, such as generating shorter and more readable queries. I often look at the SQL executed on my DB, and my queries with multiple projected LoadWith calls and many columns get quite long. Take a look at the most basic example I gave in my OP: it’s 6 straight lines with no nesting against 13. Now imagine a much larger query with several LoadWith calls.

Very clean SQL generation is also one of linq2db’s many qualities. I love looking at its output and seeing more or less the SQL I’d have hand-written, especially after looking at Entity Framework queries, which are horrible as hell for the most basic things.

As I wrote on the 1st line of this issue: there is nothing wrong here. It works and perf is good. It’s a potential small improvement that I think DBAs and people who look at their SQL queries would appreciate. I leave it to the linq2db team to decide if it’s simple enough to do or not worth the effort.

0 reactions
sdanyliv commented, Jul 30, 2021

Actually, I know what happened here. I will check; we can probably improve this with little effort.
