Custom schemas: table already exists

Issues with re-running workflows when using custom schemas.

When I create a model with a custom schema configured:

-- models/clean/clean_accounts.sql
{{ config(alias='accounts', schema='clean', materialized='table') }}
select * from {{ source('incoming', 'accounts') }}

I am able to run the workflow successfully once:

> dbt run
...
Completed successfully

However, if I run the same workflow again I get an error:

> dbt run
...
Runtime Error in model clean_accounts (models/clean/clean_accounts.sql)
  Database Error
    org.apache.spark.sql.AnalysisException: `dev_clean`.`accounts` already exists.;

Instead, the table should be dropped and recreated. If we repeat the same exercise without the schema='clean' configuration, everything works as expected.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
drewbanin commented, Dec 17, 2019

hey @eamontaaffe - thanks for your thoughtful writeup here! I appreciate your patience - it was hard to get back in the swing of the dbt-spark plugin, but I’m excited to get this (and the other open PRs in this repo) merged!

I think the change you’ve proposed here is uncontroversial - let me pick this up with you in the open PR.

1 reaction
jtcohen6 commented, Feb 4, 2020

While figuring out what was actually going wrong with adapter.get_relation, I discovered the cause: in Spark, unlike in other dbt adapters, database and schema are one and the same. When a custom schema is declared in a model config, however, only the schema property of the materialization is updated. So when dbt checks the cache for a table matching both the database and schema of the model, it supplies the custom schema for schema but the default (target.database) for database, and the lookup misses the existing table.

I think we should fix get_relation rather than rely on the workaround in #42. We could redefine all get_relation calls to look like

{%- set old_relation = adapter.get_relation(database=schema, schema=schema, identifier=identifier) -%}

Or we could re-implement cache.get_relations for the Spark adapter to only check for a matching schema. I’m leaning toward the latter, what do you think @drewbanin?
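
To make the mismatch concrete, here is a minimal Python sketch of the two options. It is a toy model, not dbt's actual cache code: the Relation tuple, the StrictCache and SchemaOnlyCache classes, and the "dev" default database are all hypothetical names chosen for illustration, and it assumes the existing Spark table is recorded with database and schema both set to the custom schema.

```python
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    """Hypothetical stand-in for a cached relation; not dbt's real class."""
    database: Optional[str]
    schema: str
    identifier: str


class StrictCache:
    """Toy cache that requires database, schema and identifier to all match."""

    def __init__(self, relations):
        self._relations = list(relations)

    def get_relation(self, database, schema, identifier):
        for rel in self._relations:
            if (rel.database, rel.schema, rel.identifier) == (database, schema, identifier):
                return rel
        return None


class SchemaOnlyCache(StrictCache):
    """Toy cache for the second option: ignore database, match on schema only."""

    def get_relation(self, database, schema, identifier):
        for rel in self._relations:
            if (rel.schema, rel.identifier) == (schema, identifier):
                return rel
        return None


# Assumption: in Spark the existing table is recorded with database == schema.
existing = [Relation("dev_clean", "dev_clean", "accounts")]
strict = StrictCache(existing)

# The reported behavior: custom schema, but the default database ("dev" is hypothetical).
missed = strict.get_relation(database="dev", schema="dev_clean", identifier="accounts")

# Option 1: callers pass the schema for both arguments.
option_one = strict.get_relation(database="dev_clean", schema="dev_clean", identifier="accounts")

# Option 2: the cache itself only compares schema and identifier.
option_two = SchemaOnlyCache(existing).get_relation(
    database="dev", schema="dev_clean", identifier="accounts"
)
```

In this sketch the first lookup returns None, so the table looks new and gets created again, while both proposed options find the existing relation. The second option keeps every call site unchanged, which is one reason it may be preferable.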
