Defer refers to outdated seed


Describe the bug

I’m running the following on CI:

aws s3 cp $(MANIFEST_REMOTE_PATH) target/prod/manifest.json
dbt seed --select "state:modified+" --full-refresh --state target/prod/ --target dev-master
dbt run --models "state:modified+" --defer --state target/prod/ --target dev-master
...

Say we have a seed countries.csv:

name
Portugal
Ireland
Switzerland

And a model dim_countries.sql:

select to_upper(name) as name
from {{ ref("countries") }}

Now in one PR we change the seed to

id,name
1,Portugal
2,Ireland
3,Switzerland

And we change the model to:

select
  id, 
  to_upper(name) as name
from {{ ref("countries") }}

dbt correctly identifies that both countries.csv and dim_countries.sql changed. However, when the model runs, it fails with “column id does not exist”: the model reads the seed from the “main run” (the seed is deferred) instead of recognizing that the seed was just rebuilt in the current target.

Note that I tried copying the manifest produced by dbt seed into target/prod/, but then dbt no longer identifies the model as “modified”.

dbt version

0.18.1

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

jtcohen6 commented, Dec 3, 2020 (1 reaction)

One approach that’s occurred to me, though I haven’t thought through all the implications:

Today, we defer all models that are not included in the node selection criteria. That means we defer:

  • seeds (always)
  • modified models that are included, then excluded, from shifting criteria, e.g. dbt run -m state:modified && dbt run -m state:modified,config.materialized:incremental (see slack thread)
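
As a rough illustration of why the seed in this issue ends up deferred, here is a minimal Python sketch of selection-based deferral. It is not dbt's actual implementation: the function `resolve_ref` is hypothetical, and the schema names `ci_schema` and `prod_schema` are placeholders for the current target and the comparison manifest's target.

```python
# Hypothetical model of selection-based deferral; this is NOT dbt's real code.
def resolve_ref(name, selected, ci_schema="ci_schema", prod_schema="prod_schema"):
    """Return the relation a ref() would point at when --defer is enabled."""
    # Any node outside the current selection is deferred to the
    # comparison manifest's schema, whether or not it was just rebuilt.
    schema = ci_schema if name in selected else prod_schema
    return f"{schema}.{name}"


# `dbt run` only selects models, so the seed is never part of the selection,
# even though `dbt seed` rebuilt it in the CI schema moments earlier.
run_selection = {"dim_countries"}

seed_relation = resolve_ref("countries", run_selection)
model_relation = resolve_ref("dim_countries", run_selection)
```

Under these assumptions the model itself is built in the CI schema, but its reference to the seed resolves to the older production copy, which has no id column.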

Perhaps we shouldn’t use selection criteria as the basis for deferral. Instead, during compilation, we could check to see if a referent’s “new” representation (ci_schema.identifier) exists in the database (via cache lookup). If it doesn’t exist, we “fall back” to the comparison manifest’s representation (prod_schema.identifier) of a node with the same unique_id.
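
The existence-based fallback described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: manifests are reduced to plain dicts mapping a unique_id to a relation name, the set `existing_relations` stands in for the adapter's relation cache, and the project name `proj` and the model `dim_cities` are invented.

```python
# Hypothetical sketch of existence-based deferral; this is NOT dbt's real code.
def resolve_ref_by_existence(unique_id, current_manifest, state_manifest,
                             existing_relations):
    """Prefer the node's new relation if it exists, else fall back to state."""
    candidate = current_manifest[unique_id]
    if candidate in existing_relations:
        # The relation was already built in the current target, so use it.
        return candidate
    # Otherwise use the comparison manifest's relation for the same unique_id.
    # If that manifest has no such node, there is nothing to fall back to.
    return state_manifest.get(unique_id, candidate)


current_manifest = {
    "seed.proj.countries": "ci_schema.countries",
    "model.proj.dim_cities": "ci_schema.dim_cities",
}
state_manifest = {
    "seed.proj.countries": "prod_schema.countries",
    "model.proj.dim_cities": "prod_schema.dim_cities",
}
# Stand-in for the cache: only the seed has been rebuilt in the CI schema.
existing_relations = {"ci_schema.countries"}

seed_relation = resolve_ref_by_existence(
    "seed.proj.countries", current_manifest, state_manifest, existing_relations)
untouched_relation = resolve_ref_by_existence(
    "model.proj.dim_cities", current_manifest, state_manifest, existing_relations)
```

In this sketch the freshly rebuilt seed resolves to its CI relation, while the unbuilt model falls back to production, without consulting the selection criteria at all.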

If we took this more-naive approach to deferral:

  • Would it also help us avoid tricky issues around manifest construction when previous source or ephemeral model parents have gone missing (#2875)?
  • Could we even revisit the question of how deferral might work for tests (#2701)?

jtcohen6 commented, Nov 24, 2020 (1 reaction)

@dmateusp That’s a good thought, and if it proves much more straightforward, that may be the move. In either case, it will require tweaking some of the logic of deferral—which isn’t something we can squeeze in for the next minor version (v0.19), but possibly for the one after.

In the meantime, I’ll plan to document this as a known caveat to state comparison + deferral.


Top Results From Across the Web

Defer refers to outdated seed · Issue #2909 · dbt-labs ...
The issue here is that --defer simply switches the reference of any resource not included in the selection criteria. As part of node...

Caveats to state comparison
If the contents of these seeds is modified, the seed will be included in state:modified. If a seed file is >1 MB...