Improve SQL syntax for MERGE
The existing SQL syntax for MERGE (see here and here) could be improved by adding an alternative to ON <merge_condition>.
Main assumption
In the common case, the target and source tables use the same column names as keys in <merge_condition>.
For example: ON target.id = source.id
or ON target.name = source.name AND target.surname = source.surname
It would be more convenient to write ON COLUMNS (id)
and ON COLUMNS (name, surname)
or something similar instead.
JOIN already takes the same approach: its join_criteria syntax is
ON boolean_expression | USING ( column_name [ , ... ] )
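The JOIN analogy can be checked with a small sketch. It uses Python's built-in sqlite3 module only as a convenient stand-in engine (the issue itself is about Spark), and the table contents are made up for illustration. It shows that, for same-named key columns, the USING form returns the same rows as the explicit ON condition.

```python
import sqlite3

# In-memory database with two tables that share key column names.
# Table names and rows are illustrative assumptions, not from the issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (name TEXT, surname TEXT, city TEXT);
    CREATE TABLE source (name TEXT, surname TEXT, city TEXT);
    INSERT INTO target VALUES ('Ann', 'Lee', 'Oslo'), ('Bob', 'Ray', 'Rome');
    INSERT INTO source VALUES ('Ann', 'Lee', 'Bern'), ('Cid', 'Fox', 'Kyiv');
""")

# Explicit form: every key column is spelled out twice.
on_rows = conn.execute("""
    SELECT target.name, target.surname, source.city
    FROM target JOIN source
      ON target.name = source.name AND target.surname = source.surname
""").fetchall()

# Shorthand form: the key columns are listed once.
using_rows = conn.execute("""
    SELECT target.name, target.surname, source.city
    FROM target JOIN source USING (name, surname)
""").fetchall()

print(on_rows == using_rows)
```

The proposed ON COLUMNS clause would play the same role for MERGE that USING ( column_name [ , ... ] ) plays for JOIN.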
Improvement proposal
Syntax
MERGE INTO target_table_identifier [AS target_alias]
USING source_table_identifier [<time_travel_version>] [AS source_alias]
ON { <merge_condition> | COLUMNS ( column_name [ , ... ] ) }
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]
Example
MERGE INTO target
USING source
ON COLUMNS (name, surname)
WHEN MATCHED THEN
UPDATE SET *
WHEN NOT MATCHED THEN
INSERT *
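Until something like ON COLUMNS exists, the same effect could be approximated by generating the explicit condition. The sketch below is a minimal illustration of how the proposed clause might expand; expand_on_columns is a hypothetical helper, not part of Spark or any library, and the default aliases target and source are assumptions taken from the example above.

```python
def expand_on_columns(columns, target="target", source="source"):
    """Build the explicit merge condition that ON COLUMNS (...) would stand for.

    Hypothetical helper: each listed column is compared for equality
    between the target and source tables, and the comparisons are ANDed.
    """
    if not columns:
        raise ValueError("at least one column is required")
    return " AND ".join(f"{target}.{col} = {source}.{col}" for col in columns)


# Expand the key list from the example into today's explicit syntax.
condition = expand_on_columns(["name", "surname"])

merge_sql = f"""MERGE INTO target
USING source
ON {condition}
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *"""

print(merge_sql)
```

The helper does no quoting or validation of identifiers, so it is only suitable for trusted column names.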
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:5 (4 by maintainers)
I doubt this syntax improvement will be supported by Spark. MERGE has a clear SQL standard, and I am not sure whether something like USING is supported in it. We were able to convince Spark to support UPDATE SET * and INSERT * (completely outside the SQL standard) only because of the major usability improvement they give. For something smaller like this, I doubt we will be able to convince the Spark community to adopt a non-standard approach.
Closing in favor of the ticket created in Spark JIRA https://issues.apache.org/jira/browse/SPARK-36472