Looks like duckdb only works on single-threaded operations
dbt-core version: dbt-core 1.0.8
adapter version: dbt-duckdb 1.1.1
# my config on a single thread that works
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: 'jaffle_shop.duckdb'
$ dbt build
02:25:56 Running with dbt=1.0.8
02:25:56 Unable to do partial parsing because profile has changed
02:25:57 Found 5 models, 20 tests, 0 snapshots, 0 analyses, 165 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics
02:25:57
02:25:57 Concurrency: 1 threads (target='dev')
02:25:57
02:25:57 1 of 28 START seed file main.raw_customers...................................... [RUN]
02:25:57 1 of 28 OK loaded seed file main.raw_customers.................................. [INSERT 100 in 0.13s]
02:25:57 2 of 28 START seed file main.raw_orders......................................... [RUN]
02:25:57 2 of 28 OK loaded seed file main.raw_orders..................................... [INSERT 99 in 0.08s]
02:25:57 3 of 28 START seed file main.raw_payments....................................... [RUN]
02:25:57 3 of 28 OK loaded seed file main.raw_payments................................... [INSERT 113 in 0.08s]
02:25:57 4 of 28 START view model main.stg_customers..................................... [RUN]
02:25:57 4 of 28 OK created view model main.stg_customers................................ [OK in 0.14s]
02:25:57 5 of 28 START view model main.stg_orders........................................ [RUN]
02:25:57 5 of 28 OK created view model main.stg_orders................................... [OK in 0.07s]
02:25:57 6 of 28 START view model main.stg_payments...................................... [RUN]
02:25:58 6 of 28 OK created view model main.stg_payments................................. [OK in 0.07s]
02:25:58 7 of 28 START test not_null_stg_customers_customer_id........................... [RUN]
02:25:58 7 of 28 PASS not_null_stg_customers_customer_id................................. [PASS in 0.09s]
02:25:58 8 of 28 START test unique_stg_customers_customer_id............................. [RUN]
02:25:58 8 of 28 PASS unique_stg_customers_customer_id................................... [PASS in 0.07s]
02:25:58 9 of 28 START test accepted_values_stg_orders_status__placed__shipped__completed__return_pending__returned [RUN]
02:25:58 9 of 28 PASS accepted_values_stg_orders_status__placed__shipped__completed__return_pending__returned [PASS in 0.07s]
02:25:58 10 of 28 START test not_null_stg_orders_order_id................................ [RUN]
02:25:58 10 of 28 PASS not_null_stg_orders_order_id...................................... [PASS in 0.07s]
02:25:58 11 of 28 START test unique_stg_orders_order_id.................................. [RUN]
02:25:58 11 of 28 PASS unique_stg_orders_order_id........................................ [PASS in 0.07s]
02:25:58 12 of 28 START test accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card [RUN]
02:25:58 12 of 28 PASS accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card [PASS in 0.07s]
02:25:58 13 of 28 START test not_null_stg_payments_payment_id............................ [RUN]
02:25:58 13 of 28 PASS not_null_stg_payments_payment_id.................................. [PASS in 0.07s]
02:25:58 14 of 28 START test unique_stg_payments_payment_id.............................. [RUN]
02:25:58 14 of 28 PASS unique_stg_payments_payment_id.................................... [PASS in 0.07s]
02:25:58 15 of 28 START table model main.customers....................................... [RUN]
02:25:58 15 of 28 OK created table model main.customers.................................. [OK in 0.09s]
02:25:58 16 of 28 START table model main.orders.......................................... [RUN]
02:25:58 16 of 28 OK created table model main.orders..................................... [OK in 0.08s]
02:25:58 17 of 28 START test not_null_customers_customer_id.............................. [RUN]
02:25:58 17 of 28 PASS not_null_customers_customer_id.................................... [PASS in 0.07s]
02:25:58 18 of 28 START test unique_customers_customer_id................................ [RUN]
02:25:58 18 of 28 PASS unique_customers_customer_id...................................... [PASS in 0.07s]
02:25:58 19 of 28 START test accepted_values_orders_status__placed__shipped__completed__return_pending__returned [RUN]
02:25:59 19 of 28 PASS accepted_values_orders_status__placed__shipped__completed__return_pending__returned [PASS in 0.07s]
02:25:59 20 of 28 START test not_null_orders_amount...................................... [RUN]
02:25:59 20 of 28 PASS not_null_orders_amount............................................ [PASS in 0.07s]
02:25:59 21 of 28 START test not_null_orders_bank_transfer_amount........................ [RUN]
02:25:59 21 of 28 PASS not_null_orders_bank_transfer_amount.............................. [PASS in 0.07s]
02:25:59 22 of 28 START test not_null_orders_coupon_amount............................... [RUN]
02:25:59 22 of 28 PASS not_null_orders_coupon_amount..................................... [PASS in 0.07s]
02:25:59 23 of 28 START test not_null_orders_credit_card_amount.......................... [RUN]
02:25:59 23 of 28 PASS not_null_orders_credit_card_amount................................ [PASS in 0.07s]
02:25:59 24 of 28 START test not_null_orders_customer_id................................. [RUN]
02:25:59 24 of 28 PASS not_null_orders_customer_id....................................... [PASS in 0.07s]
02:25:59 25 of 28 START test not_null_orders_gift_card_amount............................ [RUN]
02:25:59 25 of 28 PASS not_null_orders_gift_card_amount.................................. [PASS in 0.07s]
02:25:59 26 of 28 START test not_null_orders_order_id.................................... [RUN]
02:25:59 26 of 28 PASS not_null_orders_order_id.......................................... [PASS in 0.07s]
02:25:59 27 of 28 START test relationships_orders_customer_id__customer_id__ref_customers_ [RUN]
02:25:59 27 of 28 PASS relationships_orders_customer_id__customer_id__ref_customers_..... [PASS in 0.07s]
02:25:59 28 of 28 START test unique_orders_order_id...................................... [RUN]
02:25:59 28 of 28 PASS unique_orders_order_id............................................ [PASS in 0.07s]
02:25:59
02:25:59 Finished running 3 seeds, 3 view models, 20 tests, 2 table models in 2.49s.
02:25:59
02:25:59 Completed successfully
02:25:59
02:25:59 Done. PASS=28 WARN=0 ERROR=0 SKIP=0 TOTAL=28
My gut tells me there are race conditions across the threads. I haven’t dug in deep yet, but wanted to point this out: with multiple threads, dbt build sometimes succeeds if I’m lucky, and otherwise fails as shown in the run with 16 threads.
# my config on 16 threads that does NOT work
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: 'jaffle_shop.duckdb'
      threads: 16
$ dbt build --full-refresh
02:30:25 Running with dbt=1.0.8
02:30:25 Found 5 models, 20 tests, 0 snapshots, 0 analyses, 165 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics
02:30:25
02:30:26 Concurrency: 16 threads (target='dev')
02:30:26
02:30:26 1 of 28 START seed file main.raw_customers...................................... [RUN]
02:30:26 2 of 28 START seed file main.raw_orders......................................... [RUN]
02:30:26 3 of 28 START seed file main.raw_payments....................................... [RUN]
02:30:26 3 of 28 OK loaded seed file main.raw_payments................................... [CREATE 113 in 0.20s]
02:30:26 1 of 28 OK loaded seed file main.raw_customers.................................. [CREATE 100 in 0.25s]
02:30:26 2 of 28 OK loaded seed file main.raw_orders..................................... [CREATE 99 in 0.30s]
02:30:26 4 of 28 START view model main.stg_payments...................................... [RUN]
02:30:26 5 of 28 START view model main.stg_customers..................................... [RUN]
02:30:26 6 of 28 START view model main.stg_orders........................................ [RUN]
02:30:26 5 of 28 ERROR creating view model main.stg_customers............................ [ERROR in 0.17s]
02:30:26 6 of 28 ERROR creating view model main.stg_orders............................... [ERROR in 0.17s]
02:30:26 7 of 28 SKIP test not_null_stg_customers_customer_id............................ [SKIP]
02:30:26 8 of 28 SKIP test unique_stg_customers_customer_id.............................. [SKIP]
02:30:26 9 of 28 SKIP test accepted_values_stg_orders_status__placed__shipped__completed__return_pending__returned [SKIP]
02:30:26 10 of 28 SKIP test not_null_stg_orders_order_id................................. [SKIP]
02:30:26 11 of 28 SKIP test unique_stg_orders_order_id................................... [SKIP]
02:30:26 4 of 28 OK created view model main.stg_payments................................. [OK in 0.22s]
02:30:26 12 of 28 START test accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card [RUN]
02:30:26 13 of 28 START test not_null_stg_payments_payment_id............................ [RUN]
02:30:26 14 of 28 START test unique_stg_payments_payment_id.............................. [RUN]
02:30:26 13 of 28 PASS not_null_stg_payments_payment_id.................................. [PASS in 0.16s]
02:30:26 12 of 28 PASS accepted_values_stg_payments_payment_method__credit_card__coupon__bank_transfer__gift_card [PASS in 0.22s]
02:30:26 14 of 28 PASS unique_stg_payments_payment_id.................................... [PASS in 0.27s]
02:30:26 15 of 28 SKIP relation main.customers........................................... [SKIP]
02:30:26 16 of 28 SKIP relation main.orders.............................................. [SKIP]
02:30:26 17 of 28 SKIP test not_null_customers_customer_id............................... [SKIP]
02:30:26 18 of 28 SKIP test unique_customers_customer_id................................. [SKIP]
02:30:26 22 of 28 SKIP test not_null_orders_coupon_amount................................ [SKIP]
02:30:26 19 of 28 SKIP test accepted_values_orders_status__placed__shipped__completed__return_pending__returned [SKIP]
02:30:26 20 of 28 SKIP test not_null_orders_amount....................................... [SKIP]
02:30:26 21 of 28 SKIP test not_null_orders_bank_transfer_amount......................... [SKIP]
02:30:26 23 of 28 SKIP test not_null_orders_credit_card_amount........................... [SKIP]
02:30:26 24 of 28 SKIP test not_null_orders_customer_id.................................. [SKIP]
02:30:26 26 of 28 SKIP test not_null_orders_order_id..................................... [SKIP]
02:30:26 27 of 28 SKIP test relationships_orders_customer_id__customer_id__ref_customers_ [SKIP]
02:30:26 28 of 28 SKIP test unique_orders_order_id....................................... [SKIP]
02:30:26 25 of 28 SKIP test not_null_orders_gift_card_amount............................. [SKIP]
02:30:27
02:30:27 Finished running 3 seeds, 3 view models, 20 tests, 2 table models in 1.08s.
02:30:27
02:30:27 Completed with 2 errors and 0 warnings:
02:30:27
02:30:27 Runtime Error in model stg_customers (models/staging/stg_customers.sql)
02:30:27 Catalog Error: Table with name raw_customers does not exist!
02:30:27 Did you mean "stg_customers"?
02:30:27
02:30:27 Runtime Error in model stg_orders (models/staging/stg_orders.sql)
02:30:27 Catalog Error: Table with name raw_orders does not exist!
02:30:27 Did you mean "stg_orders"?
02:30:27
02:30:27 Done. PASS=7 WARN=0 ERROR=2 SKIP=19 TOTAL=28
Top GitHub Comments
@dataders it’s a straight-up bug in the Python driver; the CLI version of DuckDB does not have this issue.
I just pushed dbt-duckdb version 1.1.4, which will throw a runtime exception if you’re trying to use multiple threads with the adapter, and it sounds like the DuckDB folks are working on a patch that will fix the underlying issue. Once a version with that patch is released, I will update the adapter to be version-aware and only require the single-threadedness if you are using a version of DuckDB <= 0.4.0.
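For anyone wondering what a version-aware guard like that might look like, here is a rough sketch. It is not the adapter's actual code: the names check_threads, parse_version and LAST_AFFECTED_VERSION are made up for illustration, and it assumes plain numeric version strings such as "0.4.0" (dev or pre-release suffixes would need a real version parser).

```python
from typing import Tuple

# Assumption taken from the comment above: DuckDB <= 0.4.0 needs a single thread.
LAST_AFFECTED_VERSION = (0, 4, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def check_threads(duckdb_version: str, threads: int) -> None:
    """Raise if multiple threads are requested on an affected DuckDB version."""
    if threads > 1 and parse_version(duckdb_version) <= LAST_AFFECTED_VERSION:
        raise RuntimeError(
            f"DuckDB {duckdb_version} must be run with threads: 1 "
            f"(got threads: {threads})"
        )
```

With this sketch, check_threads("0.4.0", 16) raises a RuntimeError, while check_threads("0.4.0", 1) and check_threads("0.5.0", 16) return without complaint. Whether any release after 0.4.0 actually contains the fix is something the guard takes on faith from the comment above.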