feat: refactor dbt integration as an external Python plugin
This is a specific implementation issue for https://github.com/meltano/meltano/issues/6130 and is related to https://github.com/meltano/meltano/issues/6397 & meltano/meltano#6398
Problem to Solve
Many of the problems documented in meltano/meltano#6397 and meltano/meltano#6398 also apply to dbt. Files and functionality related to dbt are spread out across multiple repos:
- Meltano https://github.com/meltano/meltano/tree/main/src/meltano/core/plugin/dbt
- File bundles https://github.com/meltano?q=dbt&type=all&language=&sort= (one for each adapter)
- Metadata definition on MeltanoHub https://hub.meltano.com/transformers/
This is cumbersome and not contributor-friendly. It’s also difficult to iterate on, because it’s not clear where or how users should extend functionality - do they add a command? Do they add something to Meltano? Should it be in the file bundle(s)?
There’s also the challenge of hooking into the capabilities of Meltano. As we expand what users can do with Meltano across multiple plugins, we dramatically increase the number of places we need to make an update when we update a capability or add additional functionality.
We need to make it easy to add more plugins to Meltano and to iterate on their capabilities without users having to be aware of the internals of Meltano specifically. We’ll start with Airflow & dbt.
Prior Art / Things to Keep in Mind
Our dbt integration was arguably strong when part of meltano elt in that it did some things automatically for you that users may want to configure.
- dbt automatically runs clean and deps for users https://github.com/meltano/meltano/blob/main/src/meltano/core/runner/dbt.py#L61-L62 - they may want to configure that
- Issues like https://github.com/meltano/meltano/issues/3032 could be a better out of the box experience but should be configurable
- Other capabilities could be added that may not be best to live in dbt core https://github.com/meltano/dbt-ext/issues/10
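One way to picture the "automatic but configurable" behavior described above is a small planner that decides which dbt commands run before the main one. This is only a sketch: `DbtRunOptions`, `run_clean`, `run_deps`, and `plan_dbt_commands` are hypothetical names made up for illustration and are not part of Meltano or dbt-ext. The only assumption about dbt itself is that `dbt clean`, `dbt deps`, and `dbt run` are valid CLI commands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DbtRunOptions:
    """Hypothetical settings; these names are not Meltano or dbt-ext settings."""

    run_clean: bool = True  # mirror the current default of always cleaning
    run_deps: bool = True   # mirror the current default of always installing deps


def plan_dbt_commands(options: DbtRunOptions) -> List[List[str]]:
    """Return the dbt invocations to execute, in order."""
    commands: List[List[str]] = []
    if options.run_clean:
        commands.append(["dbt", "clean"])
    if options.run_deps:
        commands.append(["dbt", "deps"])
    # The main command always runs last.
    commands.append(["dbt", "run"])
    return commands


if __name__ == "__main__":
    # A user who manages dependencies themselves could opt out of both steps.
    for cmd in plan_dbt_commands(DbtRunOptions(run_clean=False, run_deps=False)):
        print(" ".join(cmd))
```

Keeping the defaults at `True` would preserve today's out-of-the-box behavior while letting users switch either step off.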
Definition of Done
For this issue, there are a few items we’d need to see out of a first iteration.
- All of the dbt-specific code needed for meltano run is in its own package. We’ll worry about the monorepo question separately
  - It’s not clear to me how much of it is tied-in with meltano elt
  - as part of scoping / building we may need to figure out how we want to handle the adapter-specific versions - does that pattern still make sense?
- This is built with the skeleton of a true Meltano SDK in mind - bonus if the skeleton is started. This should also be built with Airflow (#6398) and other generic plugins in mind.
- User functionality with dbt is basically the same but there will likely be new generic capabilities either used or at least proposed - particularly since there are multiple YAML files that would reference the same core package.
- Docs are updated in Meltano and the Hub to describe how the dbt integration is used and what has changed
- Hub updated to use new integration
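To make the "skeleton built with dbt, Airflow, and other generic plugins in mind" item more concrete, here is a rough sketch of what a shared base class could look like. Everything in it (`Extension`, `pre_invoke_steps`, `plan`, `DbtExtension`, `AirflowExtension`) is a hypothetical name chosen for illustration; it is not the API of any existing Meltano SDK, and a real design would need to cover settings, file management, and error handling.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class Extension(ABC):
    """Hypothetical base class for a wrapper around an external CLI tool."""

    name: str
    executable: str

    @abstractmethod
    def pre_invoke_steps(self) -> List[List[str]]:
        """Commands to run before the user's requested command."""

    def plan(self, args: Sequence[str]) -> List[List[str]]:
        """Return every command to execute, ending with the requested one."""
        return [*self.pre_invoke_steps(), [self.executable, *args]]


class DbtExtension(Extension):
    name = "dbt"
    executable = "dbt"

    def pre_invoke_steps(self) -> List[List[str]]:
        # Assumption for this sketch: install dbt packages before each run.
        return [["dbt", "deps"]]


class AirflowExtension(Extension):
    name = "airflow"
    executable = "airflow"

    def pre_invoke_steps(self) -> List[List[str]]:
        # No extra steps in this sketch.
        return []


if __name__ == "__main__":
    for ext, args in ((DbtExtension(), ["run"]), (AirflowExtension(), ["scheduler"])):
        print(ext.name, ext.plan(args))
```

The point of the shared `plan` method is that tool-specific behavior lives in the extension package, so Meltano itself would only need to know how to call the common interface.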
Issue Analytics
- State:
- Created a year ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
@pandemicsyn my gut is that yes I want file bundles to go away, but more that I just want them folded into the repo. I think the main concern is that we support meltano upgrade on file bundles and I’m not sure how that would work with an extension - maybe it’s just a new version and we support the update extra for everything? We can chat through it Monday (remind me to record!)
Closing as complete per Florian’s comment.