feat: refactor dbt integration as an external Python plugin

This is a specific implementation issue for https://github.com/meltano/meltano/issues/6130 and is related to https://github.com/meltano/meltano/issues/6397 & meltano/meltano#6398

Problem to Solve

Many of the problems documented in meltano/meltano#6397 and meltano/meltano#6398 also apply to dbt. Files and functionality related to dbt are spread out across multiple repos.

This is cumbersome and not contributor-friendly. It is also difficult to iterate on, because it’s not clear where or how users should extend functionality, or even what they should change. Do they add a command? Do they add something to Meltano? Should it be in the file bundle(s)?

There’s also the challenge of hooking into the capabilities of Meltano. As we expand what users can do with Meltano across multiple plugins, we dramatically increase the number of places we need to make an update when we update a capability or add additional functionality.

We need to make it easy to add more plugins to Meltano and to iterate on their capabilities without users having to be aware of the internals of Meltano specifically. We’ll start with Airflow & dbt.

Prior Art / Things to Keep in Mind

Our dbt integration was arguably strong when it was part of meltano elt, in that it automatically did some things for users that they may still want to configure.

Definition of Done

For this issue, there are a few items we’d need to see out of a first iteration.

  • All of the dbt-specific code needed for meltano run is in its own package. We’ll worry about the monorepo question separately.
    • It’s not clear to me how much of it is tied in with meltano elt.
    • As part of scoping / building we may need to figure out how we want to handle the adapter-specific versions - does that pattern still make sense?
  • This is built with the skeleton of a true Meltano SDK in mind - bonus if the skeleton is started. This should also be built with Airflow (#6398) and other generic plugins in mind.
  • User functionality with dbt is basically the same but there will likely be new generic capabilities either used or at least proposed - particularly since there are multiple YAML files that would reference the same core package.
  • Docs are updated in Meltano and the Hub to describe how the dbt integration is used and what has changed
  • Hub updated to use new integration
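
To make the first item above more concrete, here is a minimal sketch of what a standalone wrapper package around the dbt CLI could look like. It is an illustration only, not the design adopted for this issue: the class name `DbtExtension` and its fields are hypothetical, and it does not use any Meltano API. The only external facts it relies on are the dbt CLI flags `--project-dir` and `--profiles-dir`.

```python
import os
import subprocess
from dataclasses import dataclass, field


@dataclass
class DbtExtension:
    """Hypothetical wrapper around the dbt CLI (not a Meltano or dbt class)."""

    project_dir: str
    profiles_dir: str
    dbt_bin: str = "dbt"  # assumed to be on PATH or given as a full path
    extra_env: dict = field(default_factory=dict)

    def build_command(self, command, *args):
        """Return the argv list for a dbt subcommand such as 'run' or 'test'."""
        return [
            self.dbt_bin,
            command,
            "--project-dir",
            self.project_dir,
            "--profiles-dir",
            self.profiles_dir,
            *args,
        ]

    def build_env(self):
        """Merge plugin-specific settings over the current environment."""
        env = dict(os.environ)
        env.update(self.extra_env)
        return env

    def invoke(self, command, *args):
        """Run dbt as a subprocess and return its exit code."""
        result = subprocess.run(
            self.build_command(command, *args),
            env=self.build_env(),
            check=False,
        )
        return result.returncode
```

Keeping command and environment construction separate from `invoke` means the package can be tested without dbt installed, and the host tool only needs to know how to call an executable, not how dbt works internally.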

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

tayloramurphy commented, Sep 7, 2022

@pandemicsyn my gut is that yes I want file bundles to go away, but more that I just want them folded into the repo. I think the main concern is that we support meltano upgrade on file bundles and I’m not sure how that would work with an extension - maybe it’s just a new version and we support the update extra for everything? We can chat through it Monday (remind me to record!)

tayloramurphy commented, Nov 1, 2022

Closing as complete per Florian’s comment.
