Fixing query options

This issue was created as a follow-up to #8880.

Background

Currently, the Pinot SQL OPTION keyword is detected with a regex match that is allowed anywhere in the SQL string. This causes SQL injection (SQLi) issues.
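
To illustrate the problem, here is a minimal Python sketch. The pattern `OPTION_ANYWHERE` and the helper `extract_options` are hypothetical and are not Pinot's actual implementation; they only mimic a regex that accepts `option(...)` at any position in the query text. The table, column, and option names are illustrative as well.

```python
import re

# Hypothetical pattern: matches option(key=value, ...) anywhere in the text.
OPTION_ANYWHERE = re.compile(r"option\s*\(([^)]*)\)", re.IGNORECASE)

def extract_options(sql: str) -> dict:
    """Collect key=value pairs from every option(...) match in the raw SQL text."""
    options = {}
    for match in OPTION_ANYWHERE.finditer(sql):
        for pair in match.group(1).split(","):
            key, sep, value = pair.partition("=")
            if sep:
                options[key.strip()] = value.strip()
    return options

# A user-supplied value that lands inside a string literal still sets an option,
# because the regex does not know it is looking at quoted data.
user_input = "x option(timeoutMs=1)"
sql = f"SELECT * FROM myTable WHERE col = '{user_input}'"
injected = extract_options(sql)
```

Because the match ignores SQL structure, text that should be plain data is treated as a query option.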

Backward-Compatibility Issue

#8880 proposed an alternative OPTION syntax modeled on the SQL setStatement (see: https://calcite.apache.org/docs/reference.html).

However, this is a backward-incompatible change to the Pinot SQL syntax.

We therefore propose to hold off on the syntax change until we reach a consensus on which syntax to use.

SQLi Resolution

  • #8905 proposed a temporary solution: only allow the OPTION keyword at the end of the SQL string.
  • It also marked the REGEX syntax as deprecated (although we have no easy way to inform the end user of this in the query result).
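
The end-of-string restriction can be sketched as follows. This is an assumption-laden illustration, not the code from #8905: `OPTION_AT_END` and `split_trailing_options` are hypothetical names, and the pattern is still a textual heuristic rather than a real SQL parser.

```python
import re

# Hypothetical pattern: option(...) is accepted only as the last element of the
# query, optionally followed by a semicolon and whitespace.
OPTION_AT_END = re.compile(r"\s+option\s*\(([^)]*)\)\s*;?\s*$", re.IGNORECASE)

def split_trailing_options(sql: str):
    """Return (query_without_options, options) for a trailing option(...) only."""
    match = OPTION_AT_END.search(sql)
    if match is None:
        return sql, {}
    options = {}
    for pair in match.group(1).split(","):
        key, sep, value = pair.partition("=")
        if sep:
            options[key.strip()] = value.strip()
    return sql[:match.start()], options

# option(...) inside a string literal is no longer picked up, because the
# closing quote follows it and the pattern must reach the end of the string.
quoted = "SELECT * FROM myTable WHERE col = 'x option(timeoutMs=1)'"
trailing = "SELECT * FROM myTable option(timeoutMs=1000)"
```

Here `split_trailing_options(quoted)` leaves the query untouched, while `split_trailing_options(trailing)` strips the clause and returns the option. This narrows where an injected OPTION can take effect but does not replace proper parsing.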

Alternative Syntax Change

Note: in the syntax segments below, < and > are used to quote reserved keywords or reserved operators.

As STATEMENT

A Pinot query currently allows only a single statement, so there is no difference between setting options as a clause attached to that statement and associating the options with the entire SQL statement list context.

Potentially acceptable syntax listed below:

  1. Standard SET statement
setStatement:
    <SET> identifier <=> literal <;>

identifier:
    simpleIdentifier | <">quoted.complex.identifier<">

literal:
    stringLiteral | integerLiteral | doubleLiteral | booleanLiteral

PRO: the standard SET statement is supported by default in Calcite and is also supported by major SQL engines.
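
As a rough illustration of how the setStatement grammar above could be consumed, the Python sketch below peels leading `SET identifier = literal;` statements off a query string. It is not Calcite's parser or Pinot's implementation; `SET_STATEMENT`, `parse_literal`, and `split_set_statements` are hypothetical helpers, and the option names are made up for the example.

```python
import re

# Hypothetical matcher for one leading statement: SET identifier = literal;
# The identifier is a simple name or a double-quoted dotted name; the literal
# is a single-quoted string or a bare token (integer, double, boolean).
SET_STATEMENT = re.compile(
    r"""\s*SET\s+(?P<key>"[^"]+"|[A-Za-z_]\w*)\s*=\s*(?P<value>'[^']*'|[^;\s]+)\s*;""",
    re.IGNORECASE,
)

def parse_literal(text: str):
    """Convert a literal token to a Python str, bool, int, or float."""
    if text.startswith("'"):
        return text[1:-1]
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return int(text)
    except ValueError:
        return float(text)  # raises ValueError for anything that is not a number

def split_set_statements(sql: str):
    """Consume leading SET statements; return (options, remaining_query)."""
    options = {}
    pos = 0
    while (match := SET_STATEMENT.match(sql, pos)) is not None:
        options[match.group("key").strip('"')] = parse_literal(match.group("value"))
        pos = match.end()
    return options, sql[pos:].strip()

sql = """SET timeoutMs = 1000; SET "my.flag" = true; SELECT * FROM myTable WHERE col = 'SET x = 1;'"""
options, query = split_set_statements(sql)
```

Only statements before the query are read as options, so the `SET x = 1;` text inside the string literal stays part of the query.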

  2. Special OPTION keyword
optionStatement:
  <OPTION> <(> optionKVPair [<,> optionKVPair ]* <)> <;>

optionKVPair:
  identifier <=> literal

identifier:
    simpleIdentifier | <">quoted.complex.identifier<">

literal:
    stringLiteral | integerLiteral | doubleLiteral | booleanLiteral

PRO: usage is similar to the current OPTION clause.

CON: I couldn't find a commonly used SQL system that supports this statement type, so the syntax is not easy for users to understand.

As CLAUSE

We can also extend Calcite's syntax to support an OPTION clause in the SELECT statement. The syntax would be similar to:

select:
      <SELECT> [ <ALL> | <DISTINCT> ]
          { * | projectItem [<,> projectItem ]* }
      <FROM> tableExpression
      [ <WHERE> booleanExpression ]
      [ <GROUP BY> { groupItem [<,> groupItem ]* } ]
      [ <HAVING> booleanExpression ]
      [ <OPTIONS> <(> optionKVPair [<,> optionKVPair ]* <)> ]
      <;>

optionKVPair:
  identifier <=> literal

identifier:
    simpleIdentifier | <">quoted.complex.identifier<">

literal:
    stringLiteral | integerLiteral | doubleLiteral | booleanLiteral

PRO: this is almost identical to the current Pinot OPTION syntax.

CON: we need to extend every statement type with the OPTION syntax (e.g. INSERT, CREATE, DELETE, and any other DML/DQL we add in the future); I also think it requires us to alter Calcite's core syntax parsing extension template.

Closing Thoughts

  • Please comment or reply on this issue to share your understanding, and to point out anything in my descriptions that is incorrect or imprecise.
  • If there is an alternative syntax not listed above, please share it in the comments as well. I will add it to the list of alternative syntaxes.

Issue Analytics

  • State:closed
  • Created a year ago
  • Reactions:7
  • Comments:7 (7 by maintainers)

Top GitHub Comments

1 reaction
walterddr commented, Jul 5, 2022

Looks like <SET> key = value is the way to go. Implementing it soon.

1 reaction
jackjlli commented, Jun 22, 2022

I'm also inclined toward SET, as SET is more commonly adopted by other data platforms, which means less of a learning curve for users.

And I think what Jackie mentioned for SET in Presto is that Presto uses SET to set runtime parameters, which is similar to what Postgres does.
