Snowflake - Error parsing MERGE statement
Hello,
I am obtaining an error parsing a specific merge statement. Here is my code and error:
Code:
bad_text = """
merge into target using source
on target.id = source.id
when matched then update set
target.c1 = source.c1,
target.c2 = source.c2
when not matched then insert (
c1,
c2
) values (
c1,
c2
);
"""
from sqllineage.runner import LineageRunner
t = LineageRunner(bad_text)
print(t)
Error:
SQLLineageException Traceback (most recent call last)
/tmp/ipykernel_9799/2058247866.py in <module>
----> 1 print(t)
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in wrapper(*args, **kwargs)
18 self = args[0]
19 if not self._evaluated:
---> 20 self._eval()
21 return func(*args, **kwargs)
22
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in _eval(self)
145 if s.token_first(skip_cm=True)
146 ]
--> 147 self._lineage_results = [LineageAnalyzer().analyze(stmt) for stmt in self._stmt]
148 self._combined_lineage_result = combine(*self._lineage_results)
149 self._evaluated = True
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in <listcomp>(.0)
145 if s.token_first(skip_cm=True)
146 ]
--> 147 self._lineage_results = [LineageAnalyzer().analyze(stmt) for stmt in self._stmt]
148 self._combined_lineage_result = combine(*self._lineage_results)
149 self._evaluated = True
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in analyze(self, stmt)
94 else:
95 # DML parsing logic also applies to CREATE DDL
---> 96 self._extract_from_dml(stmt)
97 return self._lineage_result
98
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in _extract_from_dml(self, token)
155 source_table_token_flag = False
156 elif target_table_token_flag:
--> 157 self._handle_target_table_token(sub_token)
158 target_table_token_flag = False
159 elif temp_table_token_flag:
~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in _handle_target_table_token(self, sub_token)
216 else:
217 if not isinstance(sub_token, Identifier):
--> 218 raise SQLLineageException(
219 "An Identifier is expected, got %s[value: %s] instead."
220 % (type(sub_token).__name__, sub_token)
SQLLineageException: An Identifier is expected, got IdentifierList[value: target.c1 = source.c1,
target.c2 = source.c2] instead.
This may help identify the problem: removing one of the assignments from the UPDATE clause makes the statement parsable, but the output is incorrect:
# this works!
merge into target using source
on target.id = source.id
when matched then update set
target.c1 = source.c1
when not matched then insert (
c1,
c2
) values (
c1,
c2
);
Output:
Statements(#): 1
Source Tables:
source.c1
Target Tables:
<default>.target
target.c1
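For comparison, the table-level result one would expect here is `source` as the only source table and `target` as the only target table; `source.c1` and `target.c1` are columns, not tables. As a stopgap, a minimal standard-library sketch can recover those two names from the head of a simple MERGE. The pattern `MERGE_HEAD` and the helper `merge_tables` are hypothetical names, not part of sqllineage, and the sketch assumes plain table names (no subquery after USING, no quoted identifiers).

```python
import re

# Hypothetical fallback, not part of sqllineage: read the target and source
# table names from the head of a simple MERGE statement.
MERGE_HEAD = re.compile(
    r"^\s*merge\s+into\s+(?P<target>[\w.]+)"  # MERGE INTO <target>
    r"(?:\s+(?:as\s+)?(?!using\b)\w+)?"       # optional target alias
    r"\s+using\s+(?P<source>[\w.]+)",         # USING <source>
    re.IGNORECASE,
)


def merge_tables(sql):
    """Return (target, source) for a simple MERGE, or None if it does not match."""
    match = MERGE_HEAD.match(sql)
    if match is None:
        return None
    return match.group("target"), match.group("source")


sql = """
merge into target using source
on target.id = source.id
when matched then update set
target.c1 = source.c1,
target.c2 = source.c2
"""
print(merge_tables(sql))  # ('target', 'source')
```

This only covers table-level lineage and returns None for anything it does not recognize, so it is a workaround to use alongside the parser, not a replacement for it.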
Issue Analytics
- State:
- Created 2 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
Your understanding is correct. The rule this “MERGE” statement interferes with is “CREATE TABLE LIKE”. I already used a hack to handle “CREATE TABLE LIKE” (see https://github.com/reata/sqllineage/issues/117 for reference), so if possible, I’d rather not patch this with another hack.
I cannot give a precise ETA until I have time to look at this issue in detail next weekend.
As for the “design philosophy”, we have a contributing guide, although it mainly focuses on the toolchain and dev process side of the story. Personally, I only have a very basic understanding of parsers; after all, all you need is to learn how to use a parser rather than build one. Most of my knowledge comes from using sqlparse and reading its source code. I’d encourage you to try playing with this library using your own SQL and then come back to see how sqllineage leverages it. Contributions are always welcome.
The long-term vision is to replace the sqlparse library with an in-house SQL parser, possibly built with ANTLR, so that we can create a parser that strictly follows each SQL dialect’s specification, but that’s really in the distant future.
Hey @sudhanshu456, the column-level lineage code is now merged to master. Although it’s not ready for release yet, I guess the code structure is stable enough if you’d like to submit a PR for this issue based on the current code.