Left Join becomes Inner Join for inequality conditions
Code:
import pandas as pd
import numpy as np
from dask.distributed import Client
from dask_sql import Context
client = Client()
cont = Context()
df1 = pd.DataFrame({
'dated': pd.date_range(pd.Timestamp('2021-01-01'), pd.Timestamp('2021-01-10')),
'var1': np.ones(10)})
df2 = pd.DataFrame({
'startdate': [pd.Timestamp('2020-12-30'), pd.Timestamp('2021-01-09')],
'enddate': [pd.Timestamp('2021-01-03'), pd.Timestamp('2021-01-20')],
'var2': np.array([2.0, 3.0])})
cont.create_table('df1', df1)
cont.create_table('df2', df2)
df3 = cont.sql(
"""select a.*, b.var2
from df1 a left join df2 b
on b.startdate<=a.dated and a.dated<=b.enddate""").compute()
Results:
df1:
dated var1
0 2021-01-01 1.0
1 2021-01-02 1.0
2 2021-01-03 1.0
3 2021-01-04 1.0
4 2021-01-05 1.0
5 2021-01-06 1.0
6 2021-01-07 1.0
7 2021-01-08 1.0
8 2021-01-09 1.0
9 2021-01-10 1.0
df2:
startdate enddate var2
0 2020-12-30 2021-01-03 2.0
1 2021-01-09 2021-01-20 3.0
df3:
dated var1 var2
0 2021-01-01 1.0 2.0
2 2021-01-02 1.0 2.0
4 2021-01-03 1.0 2.0
17 2021-01-09 1.0 3.0
19 2021-01-10 1.0 3.0
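This is the result of an inner join, not a left join: the rows of df1 for 2021-01-04 through 2021-01-08, which match no row of df2, are missing.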
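One quick way to confirm that left-side rows were dropped is to compare the join keys before and after the join. This is a minimal sketch: `missing_left_rows` is a hypothetical helper (not part of dask_sql), and the dask-sql result reported above is rebuilt by hand so the snippet runs without a cluster.

```python
import numpy as np
import pandas as pd

def missing_left_rows(left, joined, key):
    """Return the rows of `left` whose `key` value never appears in `joined`."""
    return left[~left[key].isin(joined[key])]

df1 = pd.DataFrame({
    'dated': pd.date_range(pd.Timestamp('2021-01-01'), pd.Timestamp('2021-01-10')),
    'var1': np.ones(10)})

# The dask-sql output reported above, rebuilt by hand (assumption: values copied from the issue).
reported = pd.DataFrame({
    'dated': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-03',
                             '2021-01-09', '2021-01-10']),
    'var1': np.ones(5),
    'var2': [2.0, 2.0, 2.0, 3.0, 3.0]})

# A true left join keeps every row of df1, so this should be empty.
dropped = missing_left_rows(df1, reported, 'dated')
print(dropped)
```

For a correct left join, `dropped` would be empty. Here it contains the five dates with no matching range in df2.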
The correct output should be as follows, using sqlite3:
import sqlite3
# Connect database
conn = sqlite3.connect(':memory:')
df1.to_sql('df1', conn, index=False)
df2.to_sql('df2', conn, index=False)
df3 = pd.read_sql_query(
"""select a.*, b.var2
from df1 a left join df2 b
on b.startdate<=a.dated and a.dated<=b.enddate""",
conn,
parse_dates=['dated'])
where df3 is:
dated var1 var2
0 2021-01-01 1.0 2.0
1 2021-01-02 1.0 2.0
2 2021-01-03 1.0 2.0
3 2021-01-04 1.0 NaN
4 2021-01-05 1.0 NaN
5 2021-01-06 1.0 NaN
6 2021-01-07 1.0 NaN
7 2021-01-08 1.0 NaN
8 2021-01-09 1.0 3.0
9 2021-01-10 1.0 3.0
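The same expected result can be reproduced with pandas alone, which may help as a reference when writing unit tests. This is a sketch, not how dask_sql implements joins: it cross-joins the two frames, keeps the pairs that satisfy the ON condition, and then left-merges those matches back onto df1 so that unmatched rows survive with NaN. It assumes pandas 1.2 or later for `how='cross'`. The `_row` column is a temporary helper name introduced here.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'dated': pd.date_range(pd.Timestamp('2021-01-01'), pd.Timestamp('2021-01-10')),
    'var1': np.ones(10)})
df2 = pd.DataFrame({
    'startdate': [pd.Timestamp('2020-12-30'), pd.Timestamp('2021-01-09')],
    'enddate': [pd.Timestamp('2021-01-03'), pd.Timestamp('2021-01-20')],
    'var2': np.array([2.0, 3.0])})

# Tag each left row so matches can be merged back later.
left = df1.reset_index().rename(columns={'index': '_row'})

# Every (left, right) pair, filtered by the inequality ON condition.
pairs = left.merge(df2, how='cross')
matched = pairs[(pairs['startdate'] <= pairs['dated'])
                & (pairs['dated'] <= pairs['enddate'])]

# Left-merge the matches back: unmatched left rows keep NaN in var2.
expected = (left.merge(matched[['_row', 'var2']], on='_row', how='left')
                .drop(columns='_row'))
print(expected)
```

The cross join is quadratic in the table sizes, so this is only suitable as a small-data reference check.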
Issue Analytics
- Created 2 years ago
- Comments: 9
Top GitHub Comments
Hi, @nils-braun. Yeah, I've finished editing the join.py file to make it work, but I have not fully tested it yet. I plan to add more unit tests in the next week.
Hi @flcong! Did you have time to look into the issue with the joins further? Is there anything I can help you with?