Duplicate Rows detected during snapshot

We have been battling a dbt bug for several months now that we were hopeful was solved in the release of 0.17.0.

Consistently, the snapshot of a table we have breaks due to the following error:

Database Error in snapshot user_campaign_audit (snapshots/user_campaign_audit.sql) 100090 (42P18): Duplicate row detected during DML action

Checking our snapshot table, there are indeed multiple rows with identical dbt_scd_ids. The table being snapshotted changes its schema with relatively high frequency. It’s a core table that feeds a lot of downstream tables, so new columns are added fairly often. We also run a production dbt run every time we merge a branch into our master branch (we run dbt in a GitLab CI/CD flow), so the snapshot can run multiple times a day.
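One way to confirm this symptom is to group the snapshot table by dbt_scd_id and keep only the ids that appear more than once. The sketch below runs that query from Python against an in-memory SQLite table standing in for the Snowflake snapshot table. The sample rows, the user_id column, and the helper name find_duplicate_scd_ids are hypothetical; only dbt_scd_id and the snapshot name come from the issue.

```python
import sqlite3

# Query that lists every dbt_scd_id occurring more than once.
DUPLICATE_SCD_IDS_SQL = """
    select dbt_scd_id, count(*) as n_rows
    from {table}
    group by dbt_scd_id
    having count(*) > 1
    order by dbt_scd_id
"""


def find_duplicate_scd_ids(conn, table):
    """Return (dbt_scd_id, n_rows) pairs for ids that are not unique.

    The table name is interpolated into the SQL text, so only pass
    trusted identifiers (hypothetical helper, not part of dbt).
    """
    return conn.execute(DUPLICATE_SCD_IDS_SQL.format(table=table)).fetchall()


# In-memory stand-in for the snapshot table (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("create table user_campaign_audit (dbt_scd_id text, user_id integer)")
conn.executemany(
    "insert into user_campaign_audit values (?, ?)",
    [("a1", 1), ("a1", 1), ("b2", 2)],
)

duplicates = find_duplicate_scd_ids(conn, "user_campaign_audit")
print(duplicates)
```

The same select statement should work unchanged in Snowflake; an empty result would mean every dbt_scd_id is unique.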

Our current approach to fix this is to create a copy of the snapshot table, reduce it to every distinct record, and then use that as the production version of the table. Something like:

create table broken_audit_table as (select distinct * from audit_table);
alter table broken_audit_table swap with audit_table;
grant ownership on audit_table to role dbt;
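It may be worth checking the result of a rebuild like this, because select distinct * only collapses rows that are identical in every column. Rows that share a dbt_scd_id but differ in some other column survive it. The sketch below illustrates that with an in-memory SQLite table standing in for Snowflake; the status column, the sample rows, and the table name deduped_audit_table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table audit_table (dbt_scd_id text, status text)")
conn.executemany(
    "insert into audit_table values (?, ?)",
    [
        ("a1", "active"), ("a1", "active"),  # exact duplicates
        ("b2", "active"), ("b2", "paused"),  # same id, different payload
    ],
)

# Same idea as the workaround above: rebuild the table from distinct rows.
conn.execute(
    "create table deduped_audit_table as select distinct * from audit_table"
)

# Ids that are still duplicated after the rebuild.
remaining = conn.execute(
    """
    select dbt_scd_id, count(*)
    from deduped_audit_table
    group by dbt_scd_id
    having count(*) > 1
    """
).fetchall()
print(remaining)
```

If a check like this returns any rows, the duplicates are not exact copies, and the rebuild alone would not be enough to make the ids unique.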

Let me know if there is any more detail I can provide. Our full stack is Fivetran/Snowflake/dbt.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (4 by maintainers)

Top GitHub Comments

1 reaction
anitsche-bt commented, Apr 13, 2022

I have the same issue on Exasol, with the error message “Unable to get a stable set of rows in the source tables”. There are duplicate lines in the temp table before the merge, even though the sources are clean. I figured out that a single quote within a varchar column caused the problem: after excluding all rows with single quotes in the string, the duplicates were gone.

1 reaction
muscovitebob commented, Feb 14, 2022

@muscovitebob @urwa we’re looking into this again. One hypothesis is that using multiple threads could cause this. Are you also using more than 1?

I think in my case the issue may have been caused by running two instances of dbt concurrently. We have been migrating Airflow instances and had a dbt DAG running on both instances at one point. I suspect that the snapshot command accidentally ran on both at the same time, and that this is the root cause in my case.

