Add support for case sensitive identifiers

<delimited identifier> ::=
  <double quote> <delimited identifier body> <double quote>

<delimited identifier body> ::=  <delimited identifier part>...
<delimited identifier part> ::=
    <nondoublequote character>
  | <doublequote symbol>

<Unicode delimited identifier> ::=
  U <ampersand> <double quote> <Unicode delimiter body> <double quote>
      <Unicode escape specifier>
<Unicode escape specifier> ::=
  [ UESCAPE <quote> <Unicode escape character> <quote> ]
<Unicode delimiter body> ::=
  <Unicode identifier part>...
<Unicode identifier part> ::=
    <delimited identifier part>
  | <Unicode escape value>
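
As a rough illustration of the <delimited identifier> production above, the sketch below strips the surrounding double quotes and collapses each doubled double quote in the body into a single one. The class and method names are hypothetical and not taken from Presto, and <Unicode delimited identifier> handling (U&"..." with UESCAPE) is deliberately left out.

```java
// Hypothetical helper, not part of Presto: a minimal sketch of reading a
// plain <delimited identifier> token such as "Order Items" or "a""b".
final class DelimitedIdentifiers {
    private DelimitedIdentifiers() {}

    /** Returns the identifier body with each doubled quote ("") collapsed to one ("). */
    static String unquote(String token) {
        // The grammar requires at least one <delimited identifier part>,
        // so the shortest valid token has three characters, e.g. "a".
        if (token.length() < 3
                || token.charAt(0) != '"'
                || token.charAt(token.length() - 1) != '"') {
            throw new IllegalArgumentException("Not a delimited identifier: " + token);
        }
        StringBuilder body = new StringBuilder();
        int end = token.length() - 1; // index of the closing double quote
        int i = 1;
        while (i < end) {
            char c = token.charAt(i);
            if (c == '"') {
                // Inside the body a double quote may only appear doubled.
                if (i + 1 >= end || token.charAt(i + 1) != '"') {
                    throw new IllegalArgumentException("Unescaped double quote in: " + token);
                }
                body.append('"');
                i += 2;
            } else {
                body.append(c);
                i++;
            }
        }
        return body.toString();
    }
}
```

For example, `DelimitedIdentifiers.unquote("\"a\"\"b\"")` yields the three-character body `a"b`, with case preserved exactly as written.
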
24) For every <identifier body> IB there is exactly one corresponding case-normal form CNF. CNF is an <identifier body> derived from IB as follows:
Let n be the number of characters in IB. For i ranging from 1 (one) to n, the i-th character Mi of IB is transliterated into the corresponding character or characters of CNF as follows:
Case:
   a) If Mi is a lower case character or a title case character for which an equivalent upper case sequence U is defined by Unicode, then let j be the number of characters in U; the next j characters of CNF are U.
   b) Otherwise, the next character of CNF is Mi.
25) The case-normal form of the <identifier body> of a <regular identifier> is used for purposes such as and including determination of identifier equivalence, representation in the Definition and Information Schemas, and representation in diagnostics areas.
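
Rule 24 can be sketched in Java roughly as follows. The class name is hypothetical, and the sketch assumes that `String.toUpperCase(Locale.ROOT)` is an acceptable stand-in for "the equivalent upper case sequence defined by Unicode"; it is an illustration of the rule, not how any engine implements it.

```java
import java.util.Locale;

// Hypothetical helper: derives the case-normal form (CNF) of an identifier body
// character by character, as described in rule 24.
final class CaseNormalForm {
    private CaseNormalForm() {}

    static String of(String identifierBody) {
        StringBuilder cnf = new StringBuilder();
        identifierBody.codePoints().forEach(cp -> {
            if (Character.isLowerCase(cp) || Character.isTitleCase(cp)) {
                // Case a): append the upper case sequence U, which may be
                // longer than one character (for example, "ß" becomes "SS").
                String upper = new String(Character.toChars(cp)).toUpperCase(Locale.ROOT);
                cnf.append(upper);
            } else {
                // Case b): the character is copied unchanged.
                cnf.appendCodePoint(cp);
            }
        });
        return cnf.toString();
    }
}
```

With this sketch, `CaseNormalForm.of("orders")` and `CaseNormalForm.of("Orders")` both give `ORDERS`, which is why the two spellings name the same object when written as regular identifiers.
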

...

27) Two <regular identifier>s are equivalent if the case-normal forms of their <identifier body>s, considered as the repetition of a <character string literal> 
that specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation IDC that is sensitive to case, compare equally 
according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

28) A <regular identifier> and a <delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> and the 
<delimited identifier body> of the <delimited identifier> (with all occurrences of <quote> replaced by <quote symbol> and all occurrences of 
<doublequote symbol> replaced by <double quote>), considered as the repetition of a <character string literal> that specifies a <character set specification>
 of SQL_IDENTIFIER and IDC, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.


29) Two <delimited identifier>s are equivalent if their <delimited identifier body>s, considered as the repetition of a <character string literal> that specifies
 a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the
 comparison rules in Subclause 8.2, “<comparison predicate>”.

30) Two <Unicode delimited identifier>s are equivalent if their <Unicode delimiter body>s, considered as the repetition of a <character string literal> that
 specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according
 to the comparison rules in Subclause 8.2, “<comparison predicate>”.

31) A <Unicode delimited identifier> and a <delimited identifier> are equivalent if their <Unicode delimiter body> and <delimited identifier body>, 
respectively, each considered as the repetition of a <character string literal> that specifies a <character set specification> of SQL_IDENTIFIER and 
an implementation-defined collation that is sensitive to case, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

32) A <regular identifier> and a <Unicode delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> 
and the <Unicode delimiter body> of the <Unicode delimited identifier> considered as the repetition of a <character string literal>, each specifying a
 <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the 
comparison rules in Subclause 8.2, “<comparison predicate>”.
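
Taken together, rules 27 to 32 say that a regular identifier is compared through its case-normal form, while a delimited identifier is compared through its body as written. A minimal sketch under simplifying assumptions: the class and method names are hypothetical, the implementation-defined case-sensitive collation is approximated by plain `String.equals`, whole-string `toUpperCase(Locale.ROOT)` approximates the case-normal form, and delimited bodies are assumed to be already unquoted with any Unicode escapes resolved.

```java
import java.util.Locale;

// Hypothetical helper illustrating identifier equivalence (rules 27-32).
final class IdentifierEquivalence {
    private IdentifierEquivalence() {}

    /** The form used for comparison: CNF for regular identifiers, the body as written for delimited ones. */
    static String comparisonForm(String body, boolean delimited) {
        return delimited ? body : body.toUpperCase(Locale.ROOT);
    }

    /** Case-sensitive comparison of the two comparison forms. */
    static boolean equivalent(String bodyA, boolean delimitedA, String bodyB, boolean delimitedB) {
        return comparisonForm(bodyA, delimitedA).equals(comparisonForm(bodyB, delimitedB));
    }
}
```

Under these assumptions, the regular identifier `orders` is equivalent to the delimited identifier `"ORDERS"` but not to `"orders"`, and `"Orders"` and `"ORDERS"` are distinct.
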

The approach and design are being captured here: https://github.com/prestosql/presto/wiki/Delimited-Identifiers

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 2
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
findepi commented, Apr 23, 2019

@martint thanks for looking into this.

When connectors return a name to the engine, do they first need to normalize it according to SQL rules?

For existing objects, names should be returned as-is. If the remote storage is case insensitive (like Hive is, right?), we could add some normalization here. But for JDBC connectors we generally shouldn’t. Table names are case-sensitive in, e.g., Postgres, MySQL, SQL Server, or Oracle. It is only “less convenient” (it requires "-delimiting) to create tables whose case differs from the default.

NameSelector may not be an appropriate concept for all usages. When creating a table, how do we convey the table and column names to the connector?

Good point. When creating a table, we want to pass the “identifier name from a query” to the connector. We can pass the value normalized to lower (or upper) case unless it was "-delimited. (Even if we normalize to upper case when talking to the Postgres connector, Postgres will still normalize to lower case because the name is not delimited.) Plus, we need to retain the information about whether it was delimited.

My envisioned NameSelector(String name, boolean delimited) fits here perfectly… except for its name.

So:

  • let’s call this concept SqlIdentifier(String name, boolean delimited)
  • let’s use it when looking for a table, or creating a new table
  • let’s use case-sensitive String (or Name, or ObjectName) for names of existing objects
  • let’s use it when listing objects (e.g. listing schemas, listing tables)
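
To make the proposal above concrete, here is a minimal sketch of what such a SqlIdentifier value could look like. Only the constructor shape SqlIdentifier(String name, boolean delimited) comes from the comment; the methods `canonicalName` and `matches` and the `foldToUpper` parameter are hypothetical and are not an actual Presto API.

```java
import java.util.Locale;
import java.util.Objects;

// Sketch only: carries the name as written in the query plus whether it was "-delimited.
final class SqlIdentifier {
    private final String name;
    private final boolean delimited;

    SqlIdentifier(String name, boolean delimited) {
        this.name = Objects.requireNonNull(name, "name is null");
        this.delimited = delimited;
    }

    String getName() {
        return name;
    }

    boolean isDelimited() {
        return delimited;
    }

    /**
     * Hypothetical: the name a connector would use when creating an object.
     * Delimited names are kept as written; others are folded to the
     * remote system's default case.
     */
    String canonicalName(boolean foldToUpper) {
        if (delimited) {
            return name;
        }
        return foldToUpper ? name.toUpperCase(Locale.ROOT) : name.toLowerCase(Locale.ROOT);
    }

    /** Hypothetical: whether this identifier selects an existing, case-sensitive object name. */
    boolean matches(String existingName, boolean foldToUpper) {
        return canonicalName(foldToUpper).equals(existingName);
    }
}
```

In this sketch, an undelimited `Orders` against a lower-case-folding system such as Postgres resolves to `orders`, while a delimited `"Orders"` keeps its casing and only matches an existing object named exactly `Orders`.
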
0 reactions
kokosing commented, Feb 21, 2022

Another thing to remember is to make sure that identifiers in event listeners are propagated in proper casing.
