dbt docs generate fails when a column name is " "

See original GitHub issue (#2564)

Describe the bug

Running dbt docs generate fails with an unhelpful validation error (None is not of type 'string') when a relation has a column named " ".

Steps to reproduce

  1. Create a model with a column name " " (I know)
$ cat models/customers.sql
select 1 as " "
  2. Run dbt docs generate

(Note: this also applies to source tables with this column name, but it is easiest to reproduce by creating a model.)

Expected behavior

An error, but with a good message

Actual behaviour

Encountered an error:
None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

Screenshots and log output

2020-06-17 19:22:49.606297 (MainThread): Encountered an error:
2020-06-17 19:22:49.607069 (MainThread): None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None
2020-06-17 19:22:49.611463 (MainThread): jsonschema.exceptions.ValidationError: None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 81, in main
    results, succeeded = handle_and_check(args)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 159, in handle_and_check
    task, res = run_from_args(parsed)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/main.py", line 212, in run_from_args
    results = task.run()
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 249, in run
    catalog = Catalog(catalog_data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 62, in __init__
    self.add_column(col)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/dbt/task/generate.py", line 97, in add_column
    column = ColumnMetadata.from_dict(column_data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 594, in from_dict
    cls.validate(data)
  File "/Users/claire/.pyenv/versions/3.7.5/envs/dbt-dev-v2/lib/python3.7/site-packages/hologram/__init__.py", line 937, in validate
    raise ValidationError.create_from(error) from error
hologram.ValidationError: None is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    None
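
The traceback shows the column name arriving at schema validation as None rather than as the string " ". One plausible reading, stated here as an assumption rather than confirmed dbt behaviour, is that blank or null-like names are coerced to a missing value before validation. A minimal Python sketch of that sequence, using hypothetical helper names (coerce_name, validate_name) that are not part of dbt or hologram:

```python
from typing import Optional

# Assumption: strings that a hypothetical coercion step treats as missing.
NULL_LIKE = {"", "null", "none"}


def coerce_name(raw: Optional[str]) -> Optional[str]:
    """Return None for blank or null-like names, else the name unchanged."""
    if raw is None or raw.strip().lower() in NULL_LIKE:
        return None
    return raw


def validate_name(name: Optional[str]) -> None:
    """Stand-in for the schema rule {'type': 'string'} on the 'name' field."""
    if not isinstance(name, str):
        raise TypeError(f"{name!r} is not of type 'string'")


if __name__ == "__main__":
    for raw in [" ", "null", "customer_id"]:
        try:
            validate_name(coerce_name(raw))
            print(f"{raw!r}: ok")
        except TypeError as exc:
            print(f"{raw!r}: {exc}")
```

Under this assumption, the names " " and "null" both fail the string check, while an ordinary name such as customer_id passes.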

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

dbt --version
installed version: 0.18.0-b1
   latest version: 0.17.0

Your version of dbt is ahead of the latest release!

Plugins:
  - bigquery: 0.18.0b1
  - snowflake: 0.18.0b1
  - redshift: 0.18.0b1
  - postgres: 0.18.0b1

Additional context

  • Might also happen if you have a column named 'null'.
  • Here is the line of code

Workaround

Rename the column!

This often crops up if an existing relation (rather than a dbt-generated relation) has a strange column name. To find the column, use the information schema. For example, on Snowflake this would look like:

select * from <YOUR_DATABASE>.information_schema.columns
where (
    trim(column_name) = ''
    or column_name is null
    or lower(column_name) = 'null'
    or lower(column_name) = 'none'
) 

Make sure you check all databases that your dbt project references!
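
If the column list has already been exported from the information schema, the same predicate can be applied in Python. A minimal sketch, assuming rows arrive as (table_name, column_name) pairs; is_problem_column and find_problem_columns are hypothetical helper names, not part of dbt:

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[str, Optional[str]]


def is_problem_column(column_name: Optional[str]) -> bool:
    """Mirror the SQL predicate: null, blank, 'null' or 'none' (any case)."""
    if column_name is None:
        return True
    # str.strip() removes all surrounding whitespace, which is slightly
    # broader than trimming spaces only.
    if column_name.strip() == "":
        return True
    return column_name.lower() in ("null", "none")


def find_problem_columns(rows: Iterable[Row]) -> List[Row]:
    """Keep only the (table, column) pairs whose column name is suspect."""
    return [(table, col) for table, col in rows if is_problem_column(col)]


if __name__ == "__main__":
    sample = [("customers", " "), ("customers", "id"), ("orders", "NULL")]
    for table, col in find_problem_columns(sample):
        print(f"{table}: {col!r}")
```

Each pair it returns is a candidate for renaming before running dbt docs generate again.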

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

2 reactions
clrcrl commented, Sep 3, 2020

As the original bug report states, this can also happen when an existing table in your data warehouse has a poorly named column! (The instructions above simply reproduce the bug in the easiest way possible.)

I’d suggest using the information schema to find any columns with a name that would get coerced to a null value.

If you’re using Snowflake, this might look like:

select * from <YOUR_DATABASE>.information_schema.columns
where (
    trim(column_name) = ''
    or column_name is null
    or lower(column_name) = 'null'
    or lower(column_name) = 'none'
) 

Make sure you check it for all Snowflake databases that you use.

1 reaction
clrcrl commented, Jul 8, 2021

Resolved by #3499 (tested with the develop version of dbt)
