dbt docs generate fails when a column name is " "
Describe the bug
dbt docs generate fails with an unhelpful validation error when a model or source table has a column named " ".
Steps to reproduce
- Create a model with a column named " " (I know):
$ cat models/customers.sql
select 1 as " "
- Run dbt docs generate
(Note this also applies to source tables with this column name, but easiest to reproduce by creating a model)
Expected behavior
An error, but with a good message
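As a rough illustration of what "a good message" could look like, here is a minimal Python sketch of a pre-validation check. It is not dbt's actual code: check_column_name, InvalidColumnNameError and NULL_LIKE_NAMES are hypothetical names, and the set of null-like names is an assumption taken from the workaround query further down in this issue.

```python
from typing import Optional

# Assumption: names that end up treated as a null value. This list mirrors
# the workaround query in this issue and is not taken from dbt's source.
NULL_LIKE_NAMES = {"", "null", "none"}


class InvalidColumnNameError(ValueError):
    """Hypothetical error type; not part of dbt."""


def check_column_name(relation: str, name: Optional[str]) -> str:
    """Return the name unchanged, or raise an error that names the relation."""
    if name is None or name.strip().lower() in NULL_LIKE_NAMES:
        raise InvalidColumnNameError(
            f"Column {name!r} in relation '{relation}' has a blank or "
            "null-like name. Rename the column and run the command again."
        )
    return name
```

Calling check_column_name("customers", " ") raises an error that points at the relation and the offending column, instead of a schema validation failure on None.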
Actual behaviour
Encountered an error:
None is not of type 'string'
Failed validating 'type' in schema['properties']['name']:
{'type': 'string'}
On instance['name']:
None
Screenshots and log output
2020-06-17 19:22:49.606297 (MainThread): Encountered an error:
2020-06-17 19:22:49.607069 (MainThread): None is not of type 'string'
Failed validating 'type' in schema['properties']['name']:
{'type': 'string'}
On instance['name']:
None
2020-06-17 19:22:49.611463 (MainThread): jsonschema.exceptions.ValidationError: None is not of type 'string'
Failed validating 'type' in schema['properties']['name']:
{'type': 'string'}
On instance['name']:
None
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 81, in main
results, succeeded = handle_and_check(args)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 159, in handle_and_check
task, res = run_from_args(parsed)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 212, in run_from_args
results = task.run()
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 249, in run
catalog = Catalog(catalog_data)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 62, in __init__
self.add_column(col)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 97, in add_column
column = ColumnMetadata.from_dict(column_data)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 594, in from_dict
cls.validate(data)
File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 937, in validate
raise ValidationError.create_from(error) from error
hologram.ValidationError: None is not of type 'string'
Failed validating 'type' in schema['properties']['name']:
{'type': 'string'}
On instance['name']:
None
System information
Which database are you using dbt with?
- postgres
- redshift
- bigquery
- snowflake
- other (specify: ____________)
The output of dbt --version:
dbt --version
installed version: 0.18.0-b1
latest version: 0.17.0
Your version of dbt is ahead of the latest release!
Plugins:
- bigquery: 0.18.0b1
- snowflake: 0.18.0b1
- redshift: 0.18.0b1
- postgres: 0.18.0b1
Additional context
- Might also happen if you have a column named 'null'.
- Here is the line of code
Workaround
Rename the column!
This often crops up if an existing relation (rather than a dbt-generated relation) has a strange column name. To find the column, use the information schema. For example, on Snowflake this would look like:
select * from <YOUR_DATABASE>.information_schema.columns
where (
trim(column_name) = ''
or column_name is null
or lower(column_name) = 'null'
or lower(column_name) = 'none'
)
Make sure you check all databases that your dbt project references!
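If your project references several databases, one option is to generate the check for all of them at once. The sketch below builds a single union all query from the Snowflake query above. build_column_check_query is a hypothetical helper, and the database names are interpolated directly into the SQL, so it assumes they are trusted identifiers.

```python
from typing import Iterable

# Same filter as the workaround query above.
WHERE_CLAUSE = """where (
    trim(column_name) = ''
    or column_name is null
    or lower(column_name) = 'null'
    or lower(column_name) = 'none'
)"""


def build_column_check_query(databases: Iterable[str]) -> str:
    """Build one query that checks information_schema.columns in each database."""
    parts = [
        f"select * from {db}.information_schema.columns\n{WHERE_CLAUSE}"
        for db in databases
    ]
    if not parts:
        raise ValueError("at least one database name is required")
    return "\nunion all\n".join(parts)


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    print(build_column_check_query(["analytics", "raw"]))
```

Paste the printed query into your warehouse console and run it there.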
Issue Analytics
- State:
- Created 3 years ago
- Comments:13 (7 by maintainers)
As the original bug report states, this can also happen when an existing table in your data warehouse has a poorly-named column! (The above instructions were to replicate the bug in the easiest way possible)
I’d suggest using the information schema to find any columns with a name that would get coerced to a null value. If you’re using Snowflake, the query in the Workaround section above does this. Make sure you run it against every Snowflake database that you use.
Resolved by #3499 (tested with the develop version of dbt)