JOIN instead of CROSS APPLY in generated query in SQL Server
EF Core Preview 5 would generate CROSS APPLY from a LINQ query like this:
from navObject in Context.NavObjects
join vessel in Context.Vessels on navObject.VesselId equals vessel.VesselId
from passage in Context.Passages
.Where(x => x.VesselId == navObject.VesselId && x.ActualDepartureTime.Value <= fromTime)
.OrderByDescending(x => x.ActualDepartureTime)
.Take(1)
.DefaultIfEmpty()
The generated query would be:
SELECT ... FROM [NavObject] AS [no]
INNER JOIN [Vessel] AS [vessel] ON [no].[ObjectId] = [vessel].[ObjectId]
CROSS APPLY (
SELECT TOP(1) [x].*
FROM [Passage] AS [x]
WHERE ([x].[ObjectId] = [no].[ObjectId]) AND ([x].[ActualDepartureTime] <= @__fromTime_1)
ORDER BY [x].[ActualDepartureTime] DESC
) AS [t]
In RC1 the query instead contains JOINs over nested SELECTs, which cause bad performance and timeouts:
SELECT ... FROM [NavObject] AS [n]
INNER JOIN [Vessel] AS [v] ON [n].[ObjectId] = [v].[ObjectId]
INNER JOIN (
SELECT [t].....
FROM (
SELECT [p]...., ROW_NUMBER() OVER(PARTITION BY [p].[ObjectId] ORDER BY [p].[ActualDepartureTime] DESC) AS [row]
FROM [Passage] AS [p]
WHERE ([p].[ActualDepartureTime] <= @__fromTime_1)
) AS [t]
WHERE [t].[row] <= 1
) AS [t0] ON [n].[ObjectId] = [t0].[ObjectId]
As you can clearly see, the Preview 5 generated query is clear and efficient, while the RC1 generated query numbers the matching [Passage] rows for every ObjectId before joining. Please fix this query generation pattern.
Further technical details
EF Core version: 3.0 RC1 (versus 3.0 Preview 5)
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.0
Operating system: Windows 10
IDE: Visual Studio 2019 16.2.5
Issue Analytics
- State:
- Created 4 years ago
- Reactions:23
- Comments:69 (31 by maintainers)
Ran into an instance of this in production. After a table hit a certain number of records the query plan changed and it went from a 300 ms query to a 3 minute query.
Over the last couple of years of observing this query, my team has had to make multiple changes that deviate from the implementation that should work without issue.
Standard query I expect to work performantly.
Modified query that was performing well until the issue in production yesterday.
Current query that solved the performance issues. This way forces the filter to be in the same block instead of using the windowed function.
Here is a graph of the performance impact, showing when the query went sour and when the last change was put into place.
Here is the SQL query given for the EF query without the “Take” statement.
Here is the SQL query given for the EF query with the “Take” statement.
Using Take(1) before FirstOrDefault() seems to force an OUTER APPLY, with the filtering done inside the same block as the select, as opposed to an OUTER JOIN with the filter done outside of that block.
Ok, some more input on this problem. Here is the linq:
The intent is to get the latest point for each vessel id. In 3.0 Preview 5 this generated the following SQL:
The subquery to retrieve data from Position is effectively filtered.
Now, since Preview 5 and up to the 3.1 release, the generated query is:
And this is the problem: the inner subquery retrieves all rows from the Position table, which in our case holds 16+ million rows and may hold many more for some other customers. On top of that, the subquery is executed for each row in the master query. So it appears that the use of partitioned queries for MS SQL was based on wrong assumptions, as this pattern generates queries that perform poorly even on small data sets, while on large data sets they simply kill the reader.
I cannot say how this pattern behaves on other servers, such as PostgreSQL and Oracle, but for MS SQL it is not suitable. I would strongly recommend changing the query generation pattern for such LINQ expressions back to what it was up until 3.0 Preview 5.
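To make the two query shapes discussed in this thread concrete, here is a minimal T-SQL sketch of "latest position per vessel" written both ways. It is not the SQL that EF Core emitted; the table variables @Vessel and @Position and their columns are hypothetical and chosen only for illustration.

```sql
-- Hypothetical schema for illustration only; not taken from the issue.
DECLARE @Vessel TABLE (VesselId int PRIMARY KEY);
DECLARE @Position TABLE (
    PositionId int IDENTITY PRIMARY KEY,
    VesselId   int NOT NULL,
    RecordedAt datetime2 NOT NULL
);

INSERT INTO @Vessel (VesselId) VALUES (1), (2), (3);
INSERT INTO @Position (VesselId, RecordedAt) VALUES
    (1, '2019-09-01T10:00:00'),
    (1, '2019-09-02T10:00:00'),
    (2, '2019-09-01T12:00:00');

-- Shape 1 (APPLY): a correlated TOP (1) subquery evaluated per vessel.
-- The vessel filter sits inside the subquery, next to the ORDER BY.
SELECT v.VesselId, p.PositionId, p.RecordedAt
FROM @Vessel AS v
OUTER APPLY (
    SELECT TOP (1) x.PositionId, x.RecordedAt
    FROM @Position AS x
    WHERE x.VesselId = v.VesselId
    ORDER BY x.RecordedAt DESC
) AS p;

-- Shape 2 (window function): number the rows of the whole table per vessel,
-- then join and keep only the first row of each partition.
SELECT v.VesselId, t.PositionId, t.RecordedAt
FROM @Vessel AS v
LEFT JOIN (
    SELECT x.PositionId, x.VesselId, x.RecordedAt,
           ROW_NUMBER() OVER (PARTITION BY x.VesselId
                              ORDER BY x.RecordedAt DESC) AS rn
    FROM @Position AS x
) AS t ON t.VesselId = v.VesselId AND t.rn <= 1;
```

Both statements return one row per vessel, with NULLs for a vessel that has no positions. On three rows in table variables there is no measurable difference; the plan differences reported above would only show up on large, indexed tables, so comparing the actual execution plans on real data is the way to check which shape the optimizer handles better.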