Support enable upstream sensor for dbt assets

What’s the use case?

Currently, dagster_dbt has no built-in option to load dbt assets that listen to their upstream assets (inside or outside the dbt graph). Such an option would unify the Dagster SDA + dbt experience, because the data flow would be based entirely on the asset-sensor pattern.

For example, there could be an additional flag: load_assets_from_dbt_project(..., update_with_upstream=True)

More info: https://dagster.slack.com/archives/C01U5LFUZJS/p1659067614814969

Ideas of implementation

The general idea is to traverse the dbt graph and configure a sensor for each node.
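
The traversal step can be sketched in plain Python. This is a minimal illustration, not how dagster_dbt works: it assumes a dbt `manifest.json`-style dictionary in which each entry under `nodes` lists its direct parents in `depends_on.nodes`. The function name `upstream_by_node` and the sample node IDs are hypothetical.

```python
from typing import Dict, List, Set


def upstream_by_node(manifest: dict) -> Dict[str, Set[str]]:
    """Map each dbt node ID to the set of node IDs it directly depends on."""
    result: Dict[str, Set[str]] = {}
    for node_id, node in manifest.get("nodes", {}).items():
        parents: List[str] = node.get("depends_on", {}).get("nodes", [])
        result[node_id] = set(parents)
    return result


# Hypothetical project: a source feeds a staging model, which feeds a mart model.
sample_manifest = {
    "nodes": {
        "model.demo.stg_orders": {"depends_on": {"nodes": ["source.demo.raw.orders"]}},
        "model.demo.orders": {"depends_on": {"nodes": ["model.demo.stg_orders"]}},
    }
}

upstreams = upstream_by_node(sample_manifest)
# Each entry lists what a per-node sensor would need to watch.
for node_id, parents in sorted(upstreams.items()):
    print(node_id, "<-", sorted(parents))
```

The resulting mapping covers parents both inside the dbt graph (other models) and outside it (sources), which is the distinction the use case above draws.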

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
clairelin135 commented, Oct 12, 2022

Hi @nvinhphuc! You could build a simplified version of this using the multi asset sensor.

For example, given a list of asset keys to materialize when upstream partitions are available, the sensor below will update a partition when any of its upstream partitions have new materializations.

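from dagster import AssetKey, AssetSelection, multi_asset_sensor

# `assets_job` is assumed to be a partitioned asset job defined elsewhere in the repository.
assets_to_trigger = [AssetKey("w"), AssetKey("x"), AssetKey("y"), AssetKey("z")]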


@multi_asset_sensor(
    asset_selection=AssetSelection.keys(*assets_to_trigger).upstream(depth=1)
    - AssetSelection.keys(*assets_to_trigger).sinks(),
    job=assets_job,
)
def my_multi_asset_sensor(context):
    upstream_keys_by_asset = {
        asset_key: (
            AssetSelection.keys(asset_key).upstream(depth=1) - AssetSelection.keys(asset_key)
        ).resolve(list(context._repository_def._assets_defs_by_key.values()))
        for asset_key in assets_to_trigger
    }

    run_requests = []
    for (
        partition,
        materializations_by_asset,
    ) in context.latest_materialization_records_by_partition_and_asset().items():

        for asset_key in assets_to_trigger:
            upstream_keys = upstream_keys_by_asset.get(asset_key, {})
            updated_upstreams = set(upstream_keys) & set(materializations_by_asset.keys())
            if updated_upstreams:
                if all(
                    [
                        context.all_partitions_materialized(upstream_key, [partition])
                        for upstream_key in upstream_keys
                    ]
                ):
                    run_requests.append(
                        assets_job.run_request_for_partition(partition, asset_selection=[asset_key])
                    )
                    for updated_upstream in updated_upstreams:
                        context.advance_cursor(
                            {updated_upstream: materializations_by_asset[updated_upstream]}
                        )
    return run_requests

This feature is experimental, so we welcome any feedback!
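
The triggering rule in the sensor above can be isolated as a plain-Python sketch with no Dagster dependency. In the code, a partition of an asset is requested only when at least one direct upstream has a new materialization for that partition and every direct upstream has that partition materialized. The function name `partitions_to_request`, its arguments, and the sample data are hypothetical; only the asset names are borrowed from the example.

```python
from typing import Dict, List, Set, Tuple


def partitions_to_request(
    upstream_keys_by_asset: Dict[str, Set[str]],
    new_materializations: Dict[str, Set[str]],
    materialized: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (partition, asset) pairs that should be re-materialized.

    upstream_keys_by_asset: downstream asset -> its direct upstream assets.
    new_materializations: partition -> upstream assets newly materialized in it.
    materialized: every (asset, partition) pair that has been materialized.
    """
    requests = []
    for partition, updated in sorted(new_materializations.items()):
        for asset, upstream_keys in sorted(upstream_keys_by_asset.items()):
            # Condition 1: at least one direct upstream has a new materialization.
            if not (upstream_keys & updated):
                continue
            # Condition 2: every direct upstream has this partition materialized.
            if all((key, partition) in materialized for key in upstream_keys):
                requests.append((partition, asset))
    return requests


upstream_keys_by_asset = {"y": {"w", "x"}, "z": {"y"}}
new_materializations = {"2022-10-01": {"w"}, "2022-10-02": {"w"}}
materialized = {("w", "2022-10-01"), ("x", "2022-10-01"), ("w", "2022-10-02")}

result = partitions_to_request(upstream_keys_by_asset, new_materializations, materialized)
print(result)
```

In this sample only `y` for partition `2022-10-01` qualifies: `x` has no materialization for `2022-10-02`, and `z` has no updated upstream.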

1 reaction
sryza commented, Oct 12, 2022

@nvinhphuc I filed an issue to track this: https://github.com/dagster-io/dagster/issues/9988. We’re hoping to get it done in the next 6 weeks.
