How to use a correlated subquery?
I’m trying to write a query with a subquery that references rows from the parent query. For example, say I want SQL for getting the latest weather value from a table of weather data that looks roughly like
SELECT * FROM weather w
WHERE
    w.time >= '2023-02-01' AND w.time <= '2023-02-10'
    AND w.timestamp = (
        SELECT MAX(i.timestamp) FROM weather i
        WHERE i.location = w.location AND i.instrument = w.instrument AND i.time = w.time
    )
First I tried
let lBound,uBound = (DateTime(2023,02,01), DateTime(2023,02,10))
select {
    for w in table<weather> do
    where (
        w.time >= lBound && w.time <= uBound && w.timestamp = subqueryOne (
            select {
                for inner in table<weather> do
                where (inner.location = w.location && inner.instrument = w.instrument && inner.time = w.time)
                select (maxBy inner.timestamp)
            }
        )
    )
}
but that doesn’t let me use a select to start a subquery inside another one. So then I tried
let lBound,uBound = (DateTime(2023,02,01), DateTime(2023,02,10))
let selectInner = select
select {
    for w in table<weather> do
    where (
        w.time >= lBound && w.time <= uBound && w.timestamp = subqueryOne (
            selectInner {
                for inner in table<weather> do
                where (inner.location = w.location && inner.instrument = w.instrument && inner.time = w.time)
                select (maxBy inner.timestamp)
            }
        )
    )
}
which compiles, but gave me
System.NotImplementedException : The method or operation is not implemented.
Stack Trace:
   at SqlHydra.Query.LinqExpressionVisitors.visit@212(FSharpFunc`2 qualifyColumn, Expression exp, Query query)
   at SqlHydra.Query.LinqExpressionVisitors.visit@212(FSharpFunc`2 qualifyColumn, Expression exp, Query query)
   at SqlHydra.Query.SelectBuilders.SelectBuilder`2.Where[T](QuerySource`2 state, Expression`1 whereExpression)
I thought I’d try
let lBound,uBound = (DateTime(2023,02,01), DateTime(2023,02,10))
let sub = select {
    for inner in table<weather> do
    groupBy (inner.location, inner.instrument, inner.time)
    select (inner.location, inner.instrument, inner.time, maxBy inner.timestamp)
}
select {
    for w in table<weather> do
    where (
        w.time >= lBound && w.time <= uBound &&
        (w.location, w.instrument, w.time, w.timestamp) = subqueryOne sub
    )
}
which also gives me a NotImplementedException (not surprising, and I think there’s another open issue about using tuples in a where clause) and
let lBound,uBound = (DateTime(2023,02,01), DateTime(2023,02,10))
let sub = select {
    for inner in table<weather> do
    groupBy (inner.location, inner.instrument, inner.time)
    select (inner.location, inner.instrument, inner.time, maxBy inner.timestamp)
}
select {
    for w in table<weather> do
    for (location, instrument, time, timestamp) in subqueryMany sub do
    where (
        w.time >= lBound && w.time <= uBound &&
        w.location = location && w.instrument = instrument && w.time = time && w.timestamp = timestamp
    )
}
but that fails to compile.
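For reference, the last two attempts are aiming at the join form of the same query: group the table by (location, instrument, time), take the maximum timestamp per group, and join that back to the parent rows. As a sanity check that this form returns the same rows as the correlated subquery, here is a small sketch that runs both as plain SQL against an in-memory SQLite database using Python's built-in sqlite3 module. This is outside SqlHydra entirely; the sample rows, the `value` column, the `i` and `latest` aliases, and the `max_timestamp` column name are all made up for illustration.

```python
import sqlite3

# Hypothetical sample data; column names follow the SQL in the question,
# plus an assumed `value` column to have something to read back.
rows = [
    # location, instrument, time, timestamp, value
    ("north", "thermo", "2023-02-01", "2023-02-01T06:00:00", 1.0),
    ("north", "thermo", "2023-02-01", "2023-02-01T12:00:00", 2.0),  # latest for this key
    ("north", "thermo", "2023-02-05", "2023-02-05T06:00:00", 3.0),
    ("south", "baro", "2023-02-01", "2023-02-01T09:00:00", 4.0),
    ("south", "baro", "2023-03-01", "2023-03-01T09:00:00", 5.0),  # outside the date range
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weather "
    "(location TEXT, instrument TEXT, time TEXT, timestamp TEXT, value REAL)"
)
conn.executemany("INSERT INTO weather VALUES (?, ?, ?, ?, ?)", rows)

# Correlated form: the inner query references the outer row `w`.
CORRELATED = """
SELECT w.location, w.instrument, w.time, w.timestamp, w.value
FROM weather w
WHERE w.time >= '2023-02-01' AND w.time <= '2023-02-10'
  AND w.timestamp = (
    SELECT MAX(i.timestamp) FROM weather i
    WHERE i.location = w.location AND i.instrument = w.instrument AND i.time = w.time
  )
ORDER BY w.location, w.instrument, w.time
"""

# Join form: an uncorrelated grouped subquery joined back to the parent table.
JOINED = """
SELECT w.location, w.instrument, w.time, w.timestamp, w.value
FROM weather w
JOIN (
    SELECT location, instrument, time, MAX(timestamp) AS max_timestamp
    FROM weather
    GROUP BY location, instrument, time
) latest
  ON latest.location = w.location
 AND latest.instrument = w.instrument
 AND latest.time = w.time
 AND latest.max_timestamp = w.timestamp
WHERE w.time >= '2023-02-01' AND w.time <= '2023-02-10'
ORDER BY w.location, w.instrument, w.time
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
```

Both queries keep one row per (location, instrument, time) inside the date range, so the join form is a reasonable fallback when the query builder cannot express the correlated reference.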
Any suggestions how to get this sort of query to work?
Here are a few options for getting correlated subqueries working:
Option 1
The subquery is still in its own function, and the parent table is passed into the subquery. (The name of the passed-in table, od, would matter since it is used to define the table alias, so it would need to match the name used in the parent query.)
The problem I ran into in my experiment branch is that the LinqExpressionVisitor would need to be able to actually evaluate that function to get the resulting SqlKata.Query, and I’m not sure that is possible. So that might be a dead end… I’m not sure.
Option 2
One way to bypass this could be to create a new function, similar to table<>, that could be used in a separate subquery function to declaratively designate a parent table source without actually passing one in. Something like this:
Note that the correlatedTable function would be similar to the table function, but its definition would need to return an instance of the table 'T itself instead of a QuerySource<'T>.
Option 3
The third option would be to allow nesting the query within the parent query, at which point it should be able to access the parent od table. I have seen this done before in the Pulumi.FSharp.Extensions project, and I think I asked its author in the issues forum how he did it, but I don’t remember the answer:
In this case, subqueryOne and subqueryMany could be turned into nested CE builders of their own:
However, as I look at the buckets example from the Pulumi library, I don’t think it would work for us since we need the subquery to be within the where clause.
TBH, it seems to me that Option 2 is our best bet. It’s reasonably easy to understand, and should be easy to implement. What do you think?
Thanks for the help on this! Having the workaround for a hand-written query is definitely helpful.