Snowflake - Error parsing MERGE statement


Hello,

I am getting an error when parsing a specific MERGE statement. Here are my code and the error:

Code:

bad_text = """
    merge into target using source
      on target.id = source.id
      when matched then update set
        target.c1 = source.c1,
        target.c2 = source.c2
      when not matched then insert (
            c1,
            c2
        ) values (
            c1,
            c2
        );
"""

from sqllineage.runner import LineageRunner
t = LineageRunner(bad_text)
print(t)

Error:

SQLLineageException                       Traceback (most recent call last)
/tmp/ipykernel_9799/2058247866.py in <module>
----> 1 print(t)

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in wrapper(*args, **kwargs)
     18         self = args[0]
     19         if not self._evaluated:
---> 20             self._eval()
     21         return func(*args, **kwargs)
     22 

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in _eval(self)
    145             if s.token_first(skip_cm=True)
    146         ]
--> 147         self._lineage_results = [LineageAnalyzer().analyze(stmt) for stmt in self._stmt]
    148         self._combined_lineage_result = combine(*self._lineage_results)
    149         self._evaluated = True

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/runner.py in <listcomp>(.0)
    145             if s.token_first(skip_cm=True)
    146         ]
--> 147         self._lineage_results = [LineageAnalyzer().analyze(stmt) for stmt in self._stmt]
    148         self._combined_lineage_result = combine(*self._lineage_results)
    149         self._evaluated = True

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in analyze(self, stmt)
     94         else:
     95             # DML parsing logic also applies to CREATE DDL
---> 96             self._extract_from_dml(stmt)
     97         return self._lineage_result
     98 

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in _extract_from_dml(self, token)
    155                 source_table_token_flag = False
    156             elif target_table_token_flag:
--> 157                 self._handle_target_table_token(sub_token)
    158                 target_table_token_flag = False
    159             elif temp_table_token_flag:

~/.virtualenvs/my-proj/lib/python3.9/site-packages/sqllineage/core.py in _handle_target_table_token(self, sub_token)
    216         else:
    217             if not isinstance(sub_token, Identifier):
--> 218                 raise SQLLineageException(
    219                     "An Identifier is expected, got %s[value: %s] instead."
    220                     % (type(sub_token).__name__, sub_token)

SQLLineageException: An Identifier is expected, got IdentifierList[value: target.c1 = source.c1,
    target.c2 = source.c2] instead.
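
The traceback also shows why the exception appears at print(t) rather than at LineageRunner(bad_text): the wrapper frame checks self._evaluated and calls self._eval() on first use. Below is a minimal sketch of that lazy-evaluation pattern. ToyRunner, lazy_method and the failure condition are hypothetical stand-ins for illustration, not sqllineage code.

```python
import functools


def lazy_method(func):
    """Run self._eval() once, on first use, before calling the wrapped method."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        self = args[0]
        if not self._evaluated:
            self._eval()
        return func(*args, **kwargs)

    return wrapper


class ToyRunner:
    """Hypothetical stand-in for a lazily evaluated runner (not sqllineage code)."""

    def __init__(self, sql):
        self._sql = sql
        self._evaluated = False  # construction does no analysis

    def _eval(self):
        # Assumed failure condition, for illustration only.
        if self._sql.lstrip().lower().startswith("merge"):
            raise ValueError("analysis failed")
        self._evaluated = True

    @lazy_method
    def __str__(self):
        return "ToyRunner(evaluated=%s)" % self._evaluated


runner = ToyRunner("merge into target using source on target.id = source.id")
# No exception so far: the analysis has not run yet.
try:
    print(runner)  # first use triggers _eval(), which raises here
except ValueError as exc:
    print("raised on first use:", exc)
```

In practice this means a try/except around the constructor alone would likely not catch the parsing error; the guarded block has to include the first access to the result.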

I’m not sure whether this helps identify the problem, but removing one of the assignments from the UPDATE clause makes the statement parsable. The output, however, is incorrect:

# this works!
merge into target using source
on target.id = source.id
when matched then update set
      target.c1 = source.c1
when not matched then insert (
      c1,
      c2
) values (
      c1,
      c2
);

Output:

Statements(#): 1
Source Tables:
    source.c1
Target Tables:
    <default>.target
    target.c1
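
One plausible reading of the two results, consistent with the traceback, is that the token following UPDATE is dispatched on its type: a single assignment arrives as a comparison and is treated as a target/source pair, while two assignments arrive as a list and hit the final error branch. The toy model below sketches that idea. All classes here are hypothetical stand-ins rather than the real sqlparse or sqllineage types, and the comparison branch is an assumption inferred from the output above.

```python
class Identifier:
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value


class Comparison:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __str__(self):
        return "%s = %s" % (self.left, self.right)


class IdentifierList:
    def __init__(self, items):
        self.items = items

    def __str__(self):
        return ", ".join(str(item) for item in self.items)


class ToyLineageError(Exception):
    pass


def handle_target_token(token, sources, targets):
    if isinstance(token, Comparison):
        # Assumed branch: left side recorded as target, right side as source.
        targets.add(str(token.left))
        sources.add(str(token.right))
    elif isinstance(token, Identifier):
        targets.add(str(token))
    else:
        # Mirrors the message format shown in the traceback.
        raise ToyLineageError(
            "An Identifier is expected, got %s[value: %s] instead."
            % (type(token).__name__, token)
        )


c1 = Comparison(Identifier("target.c1"), Identifier("source.c1"))
c2 = Comparison(Identifier("target.c2"), Identifier("source.c2"))

sources, targets = set(), set()
handle_target_token(c1, sources, targets)
print(sources, targets)  # column names end up in the table sets

try:
    handle_target_token(IdentifierList([c1, c2]), sources, targets)
except ToyLineageError as exc:
    print(exc)
```

Under this reading, the single-assignment case does not really "work": it avoids the exception only because the assignment is mistaken for a table relationship, which would explain why target.c1 and source.c1 are reported as tables.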

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

2 reactions
reata commented, Sep 8, 2021

Your understanding is correct. The rule this “MERGE” statement interferes with is “CREATE TABLE LIKE”. I already use a hack to handle “CREATE TABLE LIKE” (see https://github.com/reata/sqllineage/issues/117 for reference), so if possible, I’d rather not patch this with another hack.

I cannot give a precise ETA until I have time to look at this issue in detail next weekend.

As for the “design philosophy”, we have a contributing guide, although it mainly focuses on the toolchain and dev process side of the story. Personally, I only have a very basic understanding of parsers; after all, you only need to learn how to use a parser, not how to build one. Most of my knowledge comes from using and reading the source code of sqlparse. I’d encourage you to try playing with this library using your own SQL and then come back to see how sqllineage leverages it. Contributions are always welcome.

The long-term vision is to replace the sqlparse library with an in-house SQL parser, possibly built with ANTLR, so that we can create a parser that strictly follows each SQL dialect’s specification, but that’s really in the distant future.

1 reaction
reata commented, Oct 8, 2021

@reata glad to hear you’re working on column-level lineage. Let me know if you need any help! This will open so many other options going forward.

FYI - I will take a look at the source code first.

Hey @sudhanshu456, the column-level lineage code is now merged to master. Although it’s not ready for release yet, I think the code structure is stable enough if you’d like to submit a PR for this issue based on the current code.
