[for each dataset] fails on table names containing whitespaces

Problem

Applies to SQL Server: when for each dataset is used on a table name containing whitespace (example: Contact Type), the following error is thrown:

Soda Core 3.0.10
Query execution error in nav.Contact Type.aggregation[0]: ('42S02', "[42S02] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Invalid object name 'Contact'. (208) (SQLExecDirectW)")
SELECT 
  COUNT(*) 
FROM dbo.Contact Type

Suspected cause

I think this happens because the SQL statements that Soda's sqlserver dialect generates here omit identifier delimiters. MSSQL expects dbo.Contact Type to be enclosed in delimiters, preferably brackets, like [dbo].[Contact Type]. I have seen this issue before in Soda <-> MSSQL (https://github.com/sodadata/soda-sql/issues/171, https://github.com/sodadata/soda-sql/issues/181).
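
The bracket quoting described above can be sketched in Python as follows. This is only an illustration of the technique, not Soda Core's actual code: the function names quote_identifier and qualified_name are hypothetical. The one SQL Server rule it relies on is that a closing bracket inside a delimited identifier is escaped by doubling it.

```python
def quote_identifier(name: str) -> str:
    """Wrap one SQL Server identifier in brackets (hypothetical helper).

    A closing bracket inside the name is escaped by doubling it,
    which is the escaping rule for bracket-delimited identifiers.
    """
    return "[" + name.replace("]", "]]") + "]"


def qualified_name(schema: str, table: str) -> str:
    """Build a schema-qualified, bracket-quoted table reference."""
    return f"{quote_identifier(schema)}.{quote_identifier(table)}"


def count_query(schema: str, table: str) -> str:
    """Row-count query of the kind shown in the error above, with quoting."""
    return f"SELECT\n  COUNT(*)\nFROM {qualified_name(schema, table)}"


if __name__ == "__main__":
    print(count_query("dbo", "Contact Type"))
```

With quoting applied, the FROM clause reads [dbo].[Contact Type], so SQL Server no longer stops parsing the object name at the space.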

Steps to reproduce

  1. Have a Microsoft SQL Server instance with a table that has a whitespace character in its name.
  2. Configure checks.yml with at least:
for each dataset T:
  datasets:
    - "Contact Type"
    - Country
  checks:
...
  3. Run soda scan

System

Docker version: 20.10.17
Soda Core version: 3.0.10
Data source type: Microsoft SQL Server 2019 (15.0.2080.9)

Let me know if I can test anything for you on sqlserver! ✋🏼

Issue Analytics

  • State: open
  • Created 10 months ago
  • Comments:6 (4 by maintainers)

Top GitHub Comments

1 reaction
vijaykiran commented, Dec 19, 2022

@geertvanzoest Ah, indeed, we have to separate “quoting” in metadata queries from quoting in normal queries. I’ll see how to fix it in 3.0.17.

0 reactions
geertvanzoest commented, Dec 16, 2022

This is the result:

2022-12-16T11:02:46.4366094Z Pull docker image of Soda Core with tag 'v3.0.16' if not locally available.
2022-12-16T11:02:47.7652281Z v3.0.16: Pulling from sodadata/soda-core
2022-12-16T11:02:47.7684000Z Digest: sha256:ae631c9a28531ddda8594b49363bb1e97ecf17ef64fdcbcd31387f46f73f15b2
2022-12-16T11:02:47.7685102Z Status: Image is up to date for sodadata/soda-core:v3.0.16
2022-12-16T11:02:47.7724314Z docker.io/sodadata/soda-core:v3.0.16
2022-12-16T11:02:47.7999642Z Run Soda Core container and execute checks as configured in YAML-files.
2022-12-16T11:02:48.7365536Z [11:02:48] Soda Core 3.0.16
2022-12-16T11:02:49.3850683Z Instantiating for each for ['Country']
2022-12-16T11:02:49.6810203Z [11:02:49] Scan summary:
2022-12-16T11:02:49.6810595Z [11:02:49] 2/2 checks PASSED: 
2022-12-16T11:02:49.6810943Z [11:02:49]     Country in nav
2022-12-16T11:02:49.6811149Z [11:02:49]       Minimum row count check [PASSED]
2022-12-16T11:02:49.6811349Z [11:02:49]       Schema changes [PASSED]
2022-12-16T11:02:49.6811550Z [11:02:49] All is good. No failures. No warnings. No errors.
2022-12-16T11:02:49.6811756Z [11:02:49] Sending results to Soda Cloud
2022-12-16T11:02:50.0187176Z [11:02:50] Soda Cloud Trace: 8242778775593017275
2022-12-16T11:02:50.8500909Z ##[debug]$LASTEXITCODE: 0

Soda doesn’t evaluate the table when using for each "[Table Name]" because it tries to fetch the object name from information_schema using this query:

SELECT table_name 
FROM information_schema.tables
WHERE (table_name like 'Country' OR table_name like '[Contact Type]')
      AND lower(table_schema) = 'dbo'

As you can see, the brackets are treated as part of the object name instead of as identifier delimiters. Therefore, no result is returned from information_schema and the table isn’t checked by Soda.
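
One way to keep the two kinds of quoting apart is to strip the brackets before building the metadata filter, since information_schema.tables stores the bare name. The sketch below is an assumption about how that could look, not Soda Core's implementation: strip_brackets, sql_literal and build_metadata_query are hypothetical names. It keeps the like comparison from the query above. Note also that in a T-SQL LIKE pattern, square brackets denote a character set, so a bracketed name would not match the table literally either way.

```python
def strip_brackets(name: str) -> str:
    """Return the bare object name for a possibly bracket-quoted identifier."""
    if len(name) >= 2 and name.startswith("[") and name.endswith("]"):
        # Undo the doubled closing bracket used inside delimited identifiers.
        return name[1:-1].replace("]]", "]")
    return name


def sql_literal(value: str) -> str:
    """Render a string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_metadata_query(dataset_names: list[str], schema: str) -> str:
    """Build the information_schema lookup using unquoted object names."""
    conditions = " OR ".join(
        f"table_name like {sql_literal(strip_brackets(name))}"
        for name in dataset_names
    )
    return (
        "SELECT table_name\n"
        "FROM information_schema.tables\n"
        f"WHERE ({conditions})\n"
        f"      AND lower(table_schema) = {sql_literal(schema.lower())}"
    )


if __name__ == "__main__":
    print(build_metadata_query(["Country", "[Contact Type]"], "dbo"))
```

The names returned by this lookup would then be bracket-quoted only when they are placed in the FROM clause of the check queries.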

So just for funzies, when I use this checks.yml:

for each dataset T:
  datasets:
    - "[Contact Type]"
    - "[Country]"

This is the result 😃

2022-12-16T10:58:41.1006674Z Pull docker image of Soda Core with tag 'v3.0.16' if not locally available.
2022-12-16T10:58:42.2888831Z v3.0.16: Pulling from sodadata/soda-core
2022-12-16T10:58:42.2922884Z Digest: sha256:ae631c9a28531ddda8594b49363bb1e97ecf17ef64fdcbcd31387f46f73f15b2
2022-12-16T10:58:42.2925430Z Status: Image is up to date for sodadata/soda-core:v3.0.16
2022-12-16T10:58:42.2961387Z docker.io/sodadata/soda-core:v3.0.16
2022-12-16T10:58:42.3217617Z Run Soda Core container and execute checks as configured in YAML-files.
2022-12-16T10:58:43.3045956Z [10:58:43] Soda Core 3.0.16
2022-12-16T10:58:43.9951143Z Instantiating for each for []
2022-12-16T10:58:43.9951974Z [10:58:43] Scan summary:
2022-12-16T10:58:43.9952597Z [10:58:43] No checks found, 0 checks evaluated.
2022-12-16T10:58:43.9953129Z [10:58:43] Sending results to Soda Cloud
2022-12-16T10:58:44.2559101Z [10:58:44] Soda Cloud Trace: 1121090628266021427
2022-12-16T10:58:45.0607743Z ##[debug]$LASTEXITCODE: 0

Let me know if I can do anything else to help.
