Superset ignores case-sensitivity of time column names with PostgreSQL
Make sure these boxes are checked before submitting your issue - thank you!
- I have checked the superset logs for python stacktraces and included it here as text if any
- I have reproduced the issue with at least the latest released version of superset
- I have checked the issue tracker for the same issue and I haven’t found one similar
- A closed issue seems to be related: #2588, but it was closed because of inactivity.
Superset version
Current master: 85692612d6f1438d8b19f51aefae92a240bc3cf3
Expected results
A table with a column name where case is important (for example `invoiceDate`) should be usable without any problem.
Actual results
When creating a chart with such a table, I get the following error:
column "invoicedate" does not exist
LINE 1: SELECT DATE_TRUNC('day', invoiceDate) AT TIME ZONE 'UTC' AS ...
                                 ^
HINT: Perhaps you meant to reference the column "InvoiceLine.invoiceDate".
It seems to happen because of the Time Column case sensitivity in combination with a Time Grain (here `day`).
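The error message is consistent with PostgreSQL's identifier rules: an unquoted identifier is folded to lowercase, so `invoiceDate` is looked up as `invoicedate`, while a double-quoted identifier keeps its case. A minimal Python sketch of that rule follows; `fold_identifier` is a hypothetical helper written for illustration and is not part of Superset or PostgreSQL.

```python
def fold_identifier(identifier: str) -> str:
    """Approximate how PostgreSQL resolves a simple identifier.

    Hypothetical helper for illustration only: unquoted identifiers are
    folded to lowercase, double-quoted identifiers keep their case.
    """
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Quoted: strip the outer quotes, un-double embedded quotes, keep case.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: PostgreSQL folds the name to lowercase.
    return identifier.lower()


print(fold_identifier('invoiceDate'))    # invoicedate
print(fold_identifier('"invoiceDate"'))  # invoiceDate
```

The first call mirrors what happens to the unquoted column name inside the generated `DATE_TRUNC` expression; the second shows why quoting the name would let the lookup succeed.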
Steps to reproduce
Import a table with case-sensitive names.
Issue Analytics
- State:
- Created 5 years ago
- Comments:15 (15 by maintainers)
Top GitHub Comments
TL;DR: I was talking more about how the query is generated in Superset, not what is valid SQL syntax. I believe that in this instance putting quotes around `{col}` for Postgres is probably the simplest solution, but I would not apply the same to all other engine types.
Long answer: I was referring to how Superset generates SQL queries. Currently most other parts of the query are generated using SQLAlchemy constructs that automatically detect the need for quotes. For example, selecting a column `ColumnName` with an alias `col_alias` is in fact roughly defined as `column('ColumnName').label('col_alias')`, which is then compiled into `SELECT "ColumnName" AS col_alias`. If either requires quoting, `ColumnName` in this example, SQLAlchemy automatically puts quotes around it.
The problem here is that time grain expressions don’t contain a reference to a SQLAlchemy column object, but are static strings inside a `literal_column` object, which makes it impossible for SQLAlchemy to apply conditional quoting. See below where this happens: https://github.com/apache/incubator-superset/blob/master/superset/connectors/sqla/models.py#L139-L143
I believe that Postgres is pickier than other engines in this case, i.e. most other engines probably wouldn’t require quotes around the column name in the time grain expression. Therefore I would not force quotes into the time grain expressions of other engines at this time.
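The idea of putting quotes around `{col}` for Postgres can be sketched roughly as below. This is an illustration under stated assumptions, not Superset's actual code: `quote_pg_identifier`, `PG_DAY_GRAIN` and `time_grain_expression` are hypothetical names, and the template is modelled on the expression shown in the error message.

```python
def quote_pg_identifier(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes.

    Hypothetical helper; it assumes `name` is a plain column name,
    not an arbitrary SQL expression.
    """
    return '"' + name.replace('"', '""') + '"'


# Hypothetical time grain template, modelled on the failing query.
PG_DAY_GRAIN = "DATE_TRUNC('day', {col}) AT TIME ZONE 'UTC'"


def time_grain_expression(template: str, column_name: str) -> str:
    """Substitute the quoted column name into the static template."""
    return template.format(col=quote_pg_identifier(column_name))


expr = time_grain_expression(PG_DAY_GRAIN, "invoiceDate")
print(expr)  # DATE_TRUNC('day', "invoiceDate") AT TIME ZONE 'UTC'
```

Note that this sketch always quotes the name. That is only safe when the time column is a real column; if it were itself a SQL expression, unconditional quoting would turn it into an invalid identifier.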
For the record, a second workaround was provided here: https://github.com/apache/incubator-superset/pull/5962#issuecomment-425623999.
If it is possible, I would like to go further than workarounds so that this can be fixed properly in Superset (for the sake of correctness and for future users 😉). I can contribute if needed, with proper guidance 😃