Sampling rows using TABLESAMPLE raises a ParseException
The SQL:2003 standard defines a TABLESAMPLE clause that executes a query on only a (random) subset of rows (supported, e.g., in SQL Server and PostgreSQL). However, parsing a query that contains such a clause either produces an incorrect JSON parse tree (if no table alias is given) or raises a ParseException (if an alias is given).
Consider the following example query: SELECT * FROM foo TABLESAMPLE bernoulli (20) WHERE a < 42
Parsing it via parse("SELECT * FROM foo TABLESAMPLE bernoulli (20) WHERE a < 42") mistakes the TABLESAMPLE keyword for an alias:
{'select': '*',
'from': {'value': 'foo', 'name': {'TABLESAMPLE': 'bernoulli'}},
'where': {'lt': ['a', 42]}}
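A silent mis-parse like this is easy to miss, so it can help to check in a test that the alias slot holds a plain string. The sketch below assumes aliases normally appear as strings under the 'name' key, as the output above suggests. suspicious_aliases is a hypothetical helper written for illustration, not part of the parser.

```python
def suspicious_aliases(tree):
    """Return 'from' entries whose 'name' (alias) is not a plain string."""
    sources = tree.get("from", [])
    # A single table may be given directly instead of inside a list.
    if not isinstance(sources, list):
        sources = [sources]
    return [
        src
        for src in sources
        if isinstance(src, dict)
        and "name" in src
        and not isinstance(src["name"], str)
    ]


# Parse tree copied from the output above.
tree = {
    "select": "*",
    "from": {"value": "foo", "name": {"TABLESAMPLE": "bernoulli"}},
    "where": {"lt": ["a", 42]},
}
flagged = suspicious_aliases(tree)
print(flagged)
```

An empty result means every alias found was a string. A non-empty result points at the table entries worth inspecting.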
If the query is modified to use an alias:
parse("SELECT * FROM foo f TABLESAMPLE bernoulli (20) WHERE f.a < 42")
parsing it raises:
ParseException: Expecting {union} | {intersect} | {except} | {minus} | {order by} | {offset} | {fetch} | {limit} | {union} | {intersect} | {except} | {minus} | {StringEnd}, found "TABLESAMPL" (at char 20), (line:1, col:21)
EDIT [22-06-09]: I corrected a copy/paste error in the example query. If the corrected version (without parentheses around bernoulli and with an added sampling percentage) is parsed, both versions (with and without table alias) raise a ParseException.
Top GitHub Comments
This is done. Feel free to open another issue if you find another problem, or even if you have a question.
Thank you for your help on this issue.
It seems like I made a copy/paste error in the initial example query. I edited the issue description to use the corrected version. I am terribly sorry for the confusion.
The progress is looking really good! I was already able to parse my test queries successfully!
Also, if I can help with anything regarding this issue, just let me know. Although I am not really familiar with the tech stack, maybe there is still something left that I could do.