LoadWith: potential SQL codegen quality improvement
Nothing wrong here, just noticed a potential SQL codegen improvement.
Suppose we have a model with an association, like so:
class Parent
{
    [PrimaryKey] int Id;
    int ChildId;
    [Association] Child Child;
}
class Child
{
    [PrimaryKey] int Id;
    string Name;
}
If I do the following projection:
from p in db.Parent
select new Parent
{
    Id = p.Id,
    Child = new Child
    {
        Id = p.Child.Id,
        Name = p.Child.Name,
    }
}
I get this SQL, which is perfect:
select
    p.Id,
    c.Id as ChildId,
    c.Name as ChildName
from Parent p
left join Child c on p.childId = c.Id
Now if I do a similar query using LoadWith:
db.Parent
    .LoadWith(p => p.Child,
        c => c.Select(x => new Child { Id = x.Id, Name = x.Name }))
The SQL becomes:
select
    p.Id,
    c.Id as ChildId,
    c.Name as ChildName,
    c.isEmpty as ChildIsEmpty
from Parent p
left join (
    select
        Id,
        Name,
        1 as isEmpty
    from Child
) c on p.childId = c.Id
I suppose the key difference between the two is the isEmpty column and what happens if the join fails.
In the first case I’ll get a Child instance with Id = 0 and Name = null, while in the 2nd case I’ll have Child = null.
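The difference described above can be modelled with a small sketch. It is written in Python only so that it runs on its own; the `Parent`/`Child` dataclasses and the two `read_*` helpers are hypothetical stand-ins, not linq2db's actual materializer. It assumes a failed left join delivers NULL in every child column, including the marker.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Child:
    id: int = 0                  # mirrors C# default(int)
    name: Optional[str] = None   # mirrors C# default(string)


@dataclass
class Parent:
    id: int
    child: Optional[Child] = None


def read_projection(row):
    """Explicit projection: a Child is always built, NULL columns become defaults."""
    parent_id, child_id, child_name = row
    return Parent(parent_id, Child(child_id if child_id is not None else 0, child_name))


def read_load_with(row):
    """Marker-based read: a NULL marker means the left join found no row."""
    parent_id, child_id, child_name, is_empty = row
    if is_empty is None:
        return Parent(parent_id, None)
    return Parent(parent_id, Child(child_id, child_name))


# A parent whose left join found no child: every child column is NULL.
print(read_projection((2, None, None)))       # Parent(id=2, child=Child(id=0, name=None))
print(read_load_with((2, None, None, None)))  # Parent(id=2, child=None)
```

The marker is what lets the second reader tell "no child row" apart from "a child row whose values happen to be defaults".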
The sub-selects introduced for 1 as isEmpty could be dropped when Child has a primary key (not-null) or when it’s an inner join:
select
    p.Id,
    c.Id as ChildIsEmpty,
    c.Id as ChildId,
    c.Name as ChildName
from Parent p
left join Child c on p.childId = c.Id
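To check that a non-null primary key can stand in for the constant marker, here is a minimal sketch that runs both query shapes against an in-memory SQLite database. SQLite, the sample rows, and the `child_is_missing` helper are assumptions made for illustration; the marker column is placed last in both queries so that one helper can read it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    insert into Child  values (10, 'Alice');
    insert into Parent values (1, 10);   -- has a matching child
    insert into Parent values (2, 99);   -- no matching child
""")

WITH_SUBSELECT = """
    select p.Id, c.Id as ChildId, c.Name as ChildName, c.isEmpty as ChildIsEmpty
    from Parent p
    left join (select Id, Name, 1 as isEmpty from Child) c on p.ChildId = c.Id
    order by p.Id
"""

WITHOUT_SUBSELECT = """
    select p.Id, c.Id as ChildId, c.Name as ChildName, c.Id as ChildIsEmpty
    from Parent p
    left join Child c on p.ChildId = c.Id
    order by p.Id
"""


def child_is_missing(sql):
    # The marker is the last column; NULL means the join found no row.
    return [(row[0], row[-1] is None) for row in conn.execute(sql)]


print(child_is_missing(WITH_SUBSELECT))     # [(1, False), (2, True)]
print(child_is_missing(WITHOUT_SUBSELECT))  # [(1, False), (2, True)]
```

Both forms flag the same parent as having no child, because a primary key column can only be NULL in the result when the left join produced no row.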
The fact that it needs sub-selects for more complex queries doesn’t preclude it from optimizing simpler cases. In fact, Linq2db performs a ton of optimizations and simplifications in your queries, and that’s really one of its strengths.
Sorting can’t be done by adding ORDER BY in a JOIN subquery like that anyway. Filtering can, but it can be done in the JOIN condition as well. As far as I can see, the subquery is required for the constant isEmpty column, which detects if there is a matching row or not, as I pointed out in my OP.
The question is: if I filter on one of those sub-select columns in the outer projection, will the RDBMS engine be able to use indexes on the underlying table or is the subselect hiding the index?
I think all RDBMS should handle this properly but I have only tested Oracle 12 (index is used ok), which is the target for my query. If we want to investigate the perf benefits, the “weakest” DBs should be tested to be sure (thinking of MySQL, SQLite, MS Access for example).
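One way to check this on SQLite, one of the databases named above, is to compare query plans. The sketch below is only a starting point: the `IX_Child_Name` index, the `Name` filter, and the `query_plan` helper are hypothetical, and the plan text depends on the SQLite version, so no particular output is claimed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Child  (Id integer primary key, Name text);
    create table Parent (Id integer primary key, ChildId integer);
    create index IX_Child_Name on Child (Name);  -- hypothetical index to look for
""")

QUERIES = {
    "sub-select": """
        select p.Id, c.Id, c.Name, c.isEmpty
        from Parent p
        left join (select Id, Name, 1 as isEmpty from Child) c on p.ChildId = c.Id
        where c.Name = 'Alice'
    """,
    "plain join": """
        select p.Id, c.Id, c.Name
        from Parent p
        left join Child c on p.ChildId = c.Id
        where c.Name = 'Alice'
    """,
}


def query_plan(sql):
    # Each plan row has four columns; the last one is the readable detail text.
    return [row[3] for row in conn.execute("explain query plan " + sql)]


for label, sql in QUERIES.items():
    print(label)
    for step in query_plan(sql):
        print("   ", step)
```

If the sub-select plan scans Child where the plain join searches it through an index, the sub-select is hiding the index on that engine.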
It sure works, so it doesn’t have to be changed.
There are still “nice to have” reasons, such as generating shorter and more readable queries. I often look at the SQL executed on my DB, and I have queries with multiple projected LoadWith and many columns; they’re quite long. Take a look at the most basic example I gave in my OP: it’s 6 straight lines with no nesting against 13. Now imagine a much larger query with several LoadWith.
The very clean SQL query generation is also one of Linq2DB’s many qualities. I love looking at its output and seeing more or less the SQL I’d have hand-written. Especially after looking at Entity Framework queries that are horrible as hell for the most basic things.
As I wrote on the 1st line of this issue: there is nothing wrong here. It works and perf is good. It’s a potential small improvement that I think DBAs and people who look at their SQL queries would appreciate. I leave it to the linq2db team to decide if it’s simple enough to do or not worth the effort.
Actually, I know what happened here. Will check; we can probably improve this with little effort.