On SQL warehouse, incremental models always replace all the data
Describe the bug
I'm running a simple model:
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    file_format = 'delta',
    unique_key = 'h3_10',
) }}
select "h3rferfre5" as h3_10
I found that when using a SQL warehouse with this profile:
sql-serverless:
  outputs:
    dev:
      host: **
      http_path: **
      schema: hive_metastore.tube_silver_prod
      threads: 1
      token: **
      type: databricks
  target: dev
The model always executes:
create or replace table hive_metastore.tube_silver_prod.my_fi...
As a result, all the data is deleted and replaced.
When running the same model with a profile that uses an interactive cluster:
cluster:
  outputs:
    dev:
      host: ***
      http_path: ***
      schema: tube_silver_prod
      threads: 1
      token: ***
      type: databricks
  target: dev
the resulting query is as expected:
merge into tube_silver_prod.my_first_dbt_model as DBT_INTERNAL_DEST...
and the data is merged/appended as expected.
Plugins:
- databricks: 1.2.1 - Update available!
- spark: 1.2.0 - Update available!
Python 3.8.13
Thanks
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Works perfectly, thanks! @ueshin I didn't know that keyword.
Could you try ... ?
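The snippet suggested in the comment above is truncated here, so the exact keyword is not preserved. Purely as a hedged sketch (an assumption on my part, not confirmed by this thread): dbt-databricks profiles also accept a separate catalog field, which would keep the catalog name out of schema instead of embedding it as hive_metastore.tube_silver_prod:

# Sketch only: the serverless profile with the catalog set via the dedicated
# catalog field. Whether this is the keyword referenced above is not confirmed.
sql-serverless:
  outputs:
    dev:
      host: **
      http_path: **
      catalog: hive_metastore
      schema: tube_silver_prod
      threads: 1
      token: **
      type: databricks
  target: dev

Splitting the catalog out this way avoids a dotted schema value, which is one plausible (unconfirmed) reason the adapter could fail to match the existing relation and fall back to create or replace.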