Bug: invalid removal of unused aggregation
We have found a case where linq2db simplifies a query too aggressively and produces an incorrect result.
Our high-level use case is producing a paginated list. To do this we construct a complex query, which is then executed twice:
- .Count() to get the total number of rows
- .Skip().Take().ToList() to fetch one page of data
During the .Count() execution, no field is selected, which lets linq2db simplify the query massively.
From a performance perspective this is very nice; unfortunately, we found one simplification that produces incorrect results.
The main query performs CROSS APPLY with an aggregation as a sub-query.
The C# code looks something like this (much simplified; this example could be a simple inner join/group by, but in reality we have several cross applies):
from p in db.GetTable<Parent>()
from c in (from t in db.GetTable<Child>()
           where t.ParentId == p.Id
           group t by 1 into g
           select new { Count = g.Count() })
select new { p.Id, c.Count }
This generates the following SQL (correct):
select p.id, c.count
from Parent p
cross apply (
select count(*)
from Child c
where c.parentId = p.id
) c
But when this query goes through .Count(), c.Count is no longer selected and linq2db simplifies everything down to:
select count(*)
from Parent p
inner join (
select c.parentId
from Child c
group by c.parentId
) c on c.parentId = p.id
But removing count(*), even though it was unused, is incorrect.
In SQL, a query that has no group by and returns only aggregates always returns exactly one row. If no row matches, a single row with count(*) equal to 0 is returned.
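The rule above can be checked with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in database (an assumption made for illustration; the report itself concerns Oracle), and the single-table schema is a hypothetical reduction of the Child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Child (id INTEGER PRIMARY KEY, parentId INTEGER)")
# Child is deliberately left empty, so nothing matches parentId = 1.

# Aggregate without GROUP BY: exactly one row, even with no matching input.
no_group_by = conn.execute(
    "SELECT count(*) FROM Child WHERE parentId = 1"
).fetchall()

# Same filter with GROUP BY: no groups are formed, so no rows come back.
with_group_by = conn.execute(
    "SELECT parentId FROM Child WHERE parentId = 1 GROUP BY parentId"
).fetchall()

print(no_group_by)    # [(0,)]
print(with_group_by)  # []
```

The two queries filter the same rows; only the presence of group by changes whether an empty input still yields a row.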
Linq2db removed the unused count(*) and put a group by in its place, but the behaviour is different.
If there is no matching row, the sub-select returns nothing, so the inner join finds no match and discards the parent row.
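A minimal reproduction of the lost row, again sketched with Python's sqlite3 as a stand-in (an assumption; the original report is against Oracle). The two-parent data set is hypothetical, and the inner alias is renamed to ch only to keep the two scopes visually distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (id INTEGER PRIMARY KEY);
    CREATE TABLE Child  (id INTEGER PRIMARY KEY, parentId INTEGER);
    INSERT INTO Parent (id) VALUES (1), (2);
    -- Parent 1 has two children, parent 2 has none.
    INSERT INTO Child (id, parentId) VALUES (10, 1), (11, 1);
""")

# Shape of the simplified query from the report: the aggregate is gone and
# a GROUP BY sub-query is inner-joined to Parent instead.
simplified_join = """
    FROM Parent p
    INNER JOIN (
        SELECT ch.parentId
        FROM Child ch
        GROUP BY ch.parentId
    ) c ON c.parentId = p.id
"""

surviving_ids = conn.execute(
    "SELECT p.id " + simplified_join + " ORDER BY p.id"
).fetchall()
simplified_count = conn.execute(
    "SELECT count(*) " + simplified_join
).fetchone()[0]

print(surviving_ids)     # [(1,)]  parent 2 has been dropped
print(simplified_count)  # 1, although Parent holds 2 rows
```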
-> Linq2db needs to recognize sub-queries that return aggregates without group by, and to know that such queries always return exactly one row. This actually means that a correct and more efficient optimization is to remove such a sub-query completely when its results are not used.
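The proposed optimization can be sanity-checked in the same way. SQLite has no CROSS APPLY, so this sketch substitutes a correlated scalar sub-query, which likewise contributes exactly one value per parent row; that substitution, the sqlite3 stand-in, and the sample data are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (id INTEGER PRIMARY KEY);
    CREATE TABLE Child  (id INTEGER PRIMARY KEY, parentId INTEGER);
    INSERT INTO Parent (id) VALUES (1), (2);
    INSERT INTO Child (id, parentId) VALUES (10, 1), (11, 1);
""")

# Full query: one aggregate value per parent, including parents with no child.
full_query = """
    SELECT p.id,
           (SELECT count(*) FROM Child c WHERE c.parentId = p.id) AS cnt
    FROM Parent p
"""

rows = conn.execute(full_query + " ORDER BY p.id").fetchall()
count_with_subquery = conn.execute(
    "SELECT count(*) FROM (" + full_query + ")"
).fetchone()[0]

# Candidate optimization: drop the unused single-row aggregate entirely.
count_without_subquery = conn.execute(
    "SELECT count(*) FROM Parent"
).fetchone()[0]

print(rows)                    # [(1, 2), (2, 0)]
print(count_with_subquery)     # 2
print(count_without_subquery)  # 2
```

Because the aggregate sub-query yields exactly one row per parent, removing it leaves the row count unchanged, which is what .Count() needs.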
Environment details
- linq2db version: 3.1.2
- Database Server: Oracle 12c
- Database Provider: Oracle Managed Provider
- Operating system: Windows
- .NET Framework: .net 4.7.2
Issue Analytics
- Created 3 years ago
- Comments:14 (14 by maintainers)
Oh, it needs more investigation from my side. If you help me create a set of tests, I'll probably figure out the corner cases and remove this join. But please create a new issue to track that.
Anyway, in case you missed it: https://github.com/linq2db/linq2db/issues/1402#issuecomment-438961009. You can do pagination and Count in one roundtrip.
Check these tests; it should be Sql.Ext.Count().ToValue(): https://github.com/linq2db/linq2db/blob/master/Tests/Linq/Linq/AnalyticTests.cs#L32-L60
There are a lot of overloads, along with window functions.
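For readers unfamiliar with the idea, here is a rough sketch of what "pagination and Count in one roundtrip" can look like at the SQL level, using a window count. It runs on Python's sqlite3 (window functions need SQLite 3.25 or newer) and is only an assumed illustration of the technique; it is not the SQL that linq2db actually emits, and the five-row table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parent (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Parent (id) VALUES (?)", [(i,) for i in range(1, 6)])

page_size, offset = 2, 2

# Two roundtrips: one query for the total, one for the page.
total = conn.execute("SELECT count(*) FROM Parent").fetchone()[0]
page = conn.execute(
    "SELECT id FROM Parent ORDER BY id LIMIT ? OFFSET ?", (page_size, offset)
).fetchall()

# One roundtrip: count(*) OVER () is evaluated over the whole result set
# before LIMIT/OFFSET is applied, so every page row carries the total.
one_trip = conn.execute(
    "SELECT id, count(*) OVER () AS total FROM Parent "
    "ORDER BY id LIMIT ? OFFSET ?",
    (page_size, offset),
).fetchall()

print(total, page)  # 5 [(3,), (4,)]
print(one_trip)     # [(3, 5), (4, 5)]
```

One caveat of the single-query form: if the requested page is empty, no row comes back and the total is not available either.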