Add support for case sensitive identifiers

<delimited identifier> ::=
  <double quote> <delimited identifier body> <double quote>

<delimited identifier body> ::=  <delimited identifier part>...
<delimited identifier part> ::=
    <nondoublequote character>
  | <doublequote symbol>

<Unicode delimited identifier> ::=
  U <ampersand> <double quote> <Unicode delimiter body> <double quote>
      <Unicode escape specifier>
<Unicode escape specifier> ::=
  [ UESCAPE <quote> <Unicode escape character> <quote> ]
<Unicode delimiter body> ::=
  <Unicode identifier part>...
<Unicode identifier part> ::=
    <delimited identifier part>
  | <Unicode escape value>
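
As a rough illustration of the `<Unicode delimited identifier>` production above, the following sketch decodes a `U&"..."` token into the identifier it names. It assumes the escape forms `\XXXX` (four hex digits), `\+XXXXXX` (six hex digits), and a doubled escape character for a literal escape character; the function name `decode_unicode_delimited` and its `escape` parameter are hypothetical, and the `UESCAPE` clause is assumed to have been parsed separately by the caller.

```python
import re


def decode_unicode_delimited(token: str, escape: str = "\\") -> str:
    """Decode a U&"..." token; `escape` is the UESCAPE character (default backslash)."""
    if not (token[:3].upper() == 'U&"' and token.endswith('"') and len(token) >= 4):
        raise ValueError("not a Unicode delimited identifier")
    # A doubled double quote inside the body stands for one double quote.
    body = token[3:-1].replace('""', '"')
    e = re.escape(escape)
    pattern = re.compile(
        e + r"(?:(" + e + r")|\+([0-9A-Fa-f]{6})|([0-9A-Fa-f]{4}))"
    )

    def replace(match: re.Match) -> str:
        if match.group(1):
            return escape  # doubled escape character
        return chr(int(match.group(2) or match.group(3), 16))

    # Assumption: malformed escapes are left untouched here; a real parser
    # would reject them.
    return pattern.sub(replace, body)
```

With the default escape character, `decode_unicode_delimited('U&"d\\0061t\\+000061"')` yields `data`; the same identifier written with `!` as the escape character decodes identically when `escape="!"` is passed.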

24) For every <identifier body> IB there is exactly one corresponding case-normal form CNF. CNF is an <identifier body> derived from IB as follows:
Let n be the number of characters in IB. For i ranging from 1 (one) to n, the i-th character Mi of IB is transliterated into the corresponding character 
or characters of CNF as follows:
Case:
   a) If Mi is a lower case character or a title case character for which an equivalent upper case sequence U is defined by Unicode, then let j be the number of characters in U; the next j characters of CNF are U.
   b) Otherwise, the next character of CNF is Mi.
25) The case-normal form of the <identifier body> of a <regular identifier> is used for purposes such as and including determination of identifier 
      equivalence, representation in the Definition and Information Schemas, and representation in diagnostics areas.
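
Rule 24 can be sketched in a few lines. This is an approximation, not a conforming implementation: it assumes that "lower case or title case character" can be approximated by the Unicode general categories `Ll` and `Lt`, and it relies on Python's `str.upper()` for the "equivalent upper case sequence", which may be longer than one character. The function name `case_normal_form` is hypothetical.

```python
import unicodedata


def case_normal_form(identifier_body: str) -> str:
    """Build CNF character by character, as in rule 24."""
    parts = []
    for ch in identifier_body:
        if unicodedata.category(ch) in ("Ll", "Lt"):
            # Case a): append the upper case sequence U (j may exceed 1).
            # Characters without an upper case mapping come back unchanged.
            parts.append(ch.upper())
        else:
            # Case b): the character is copied as-is.
            parts.append(ch)
    return "".join(parts)
```

For example, `case_normal_form("myTable")` gives `MYTABLE`, and `case_normal_form("straße")` gives `STRASSE`, which shows why the rule speaks of "the next j characters" rather than a single character.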

...

27) Two <regular identifier>s are equivalent if the case-normal forms of their <identifier body>s, considered as the repetition of a <character string literal> 
that specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation IDC that is sensitive to case, compare equally 
according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

28) A <regular identifier> and a <delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> and the 
<delimited identifier body> of the <delimited identifier> (with all occurrences of <quote> replaced by <quote symbol> and all occurrences of 
<doublequote symbol> replaced by <double quote>), considered as the repetition of a <character string literal> that specifies a <character set specification>
 of SQL_IDENTIFIER and IDC, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.


29) Two <delimited identifier>s are equivalent if their <delimited identifier body>s, considered as the repetition of a <character string literal> that specifies
 a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the
 comparison rules in Subclause 8.2, “<comparison predicate>”.

30) Two <Unicode delimited identifier>s are equivalent if their <Unicode delimiter body>s, considered as the repetition of a <character string literal> that
 specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according
 to the comparison rules in Subclause 8.2, “<comparison predicate>”.

31) A <Unicode delimited identifier> and a <delimited identifier> are equivalent if their <Unicode delimiter body> and <delimited identifier body>, 
respectively, each considered as the repetition of a <character string literal> that specifies a <character set specification> of SQL_IDENTIFIER and 
an implementation-defined collation that is sensitive to case, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.

32) A <regular identifier> and a <Unicode delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> 
and the <Unicode delimiter body> of the <Unicode delimited identifier> considered as the repetition of a <character string literal>, each specifying a
 <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the 
comparison rules in Subclause 8.2, “<comparison predicate>”.
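
The equivalence rules for regular and delimited identifiers (27 to 29) boil down to comparing one key per identifier. The sketch below assumes the case-sensitive collation can be modelled by plain string equality, approximates the case-normal form with Unicode categories `Ll`/`Lt` plus `str.upper()`, and leaves out `<Unicode delimited identifier>`s entirely. The names `comparison_key` and `equivalent` are hypothetical.

```python
import unicodedata


def case_normal_form(body: str) -> str:
    # Approximation of rule 24: upper-case lower case and title case characters.
    return "".join(
        ch.upper() if unicodedata.category(ch) in ("Ll", "Lt") else ch
        for ch in body
    )


def comparison_key(identifier: str) -> str:
    """Return the string that takes part in the equivalence comparison."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Delimited: use the body verbatim, with "" collapsed to ".
        return identifier[1:-1].replace('""', '"')
    # Regular: use the case-normal form of the body.
    return case_normal_form(identifier)


def equivalent(left: str, right: str) -> bool:
    # Assumption: the case-sensitive collation is plain string equality.
    return comparison_key(left) == comparison_key(right)
```

Under these assumptions `foo`, `FOO` and `"FOO"` are all equivalent, while `foo` and `"foo"` are not, because the regular identifier is upper-cased before comparison and the delimited one is not.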

The approach and design is being captured here: https://github.com/prestosql/presto/wiki/Delimited-Identifiers

Issue Analytics

State:
Created 5 years ago
Reactions: 2
Comments: 9 (9 by maintainers)

Top GitHub Comments

findepi commented, Apr 23, 2019

@martint thanks for looking into this.

When connectors return a name to the engine, do they first need to normalize it according to SQL rules?

For existing objects, names should be returned as-is. If the remote storage is case insensitive (like Hive is, right?), we could add some normalization here. But for JDBC connectors we generally shouldn’t. Table names are case-sensitive in e.g. Postgres, MySQL, SQL Server or Oracle. It is only “less convenient” (it requires "-delimiting) to create tables whose case differs from the default.

NameSelector may not be an appropriate concept for all usages. When creating a table, how do we convey the table and column names to the connector?

Good point. When creating a table, we want to pass the “identifier name from a query” to the connector. We can pass the value normalized to lower (or upper) case unless it was "-delimited. (Even if we normalize to upper case when talking to the Postgres connector, Postgres will still normalize it to lower case because the name is not delimited.) Plus, we need to retain the information about whether it was delimited.

My envisioned NameSelector(String name, boolean delimited) fits here perfectly… except for its name.

So:

- let’s call this concept SqlIdentifier(String name, boolean delimited)
  - let’s use it when looking for a table, or creating a new table
- let’s use case-sensitive String (or Name, or ObjectName) for names of existing objects
  - let’s use it when listing objects (e.g. listing schemas, listing tables)
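
The proposal can be sketched as a small value type. This is only an illustration of the idea in the comment, not Presto's actual API: the `SqlIdentifier` fields follow the comment, while the `resolve` method and its `remote_folds_to_upper` parameter are hypothetical names introduced here to show how a connector might turn a query identifier into the name it sends to the remote system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlIdentifier:
    """An identifier as written in a query, plus whether it was "-delimited."""

    name: str
    delimited: bool

    def resolve(self, remote_folds_to_upper: bool) -> str:
        """Name to use against the remote system (hypothetical helper)."""
        if self.delimited:
            # Delimited names keep their exact case.
            return self.name
        # Non-delimited names follow the remote system's default folding.
        return self.name.upper() if remote_folds_to_upper else self.name.lower()


# Names of existing objects stay plain, case-sensitive strings.
existing_tables = ["orders", "Orders"]
wanted = SqlIdentifier("Orders", delimited=False).resolve(remote_folds_to_upper=False)
found = [t for t in existing_tables if t == wanted]
```

Here `found` contains only `orders`; the table named `Orders` is reachable only through a delimited identifier, which matches the "less convenient" behavior described above for systems such as Postgres.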

kokosing commented, Feb 21, 2022

Another thing to remember is to make sure that identifiers in event listeners are propagated in proper casing.
