BigQuery: bytes processed should be logged for all dbt commands that perform queries
Describe the feature
With the release of dbt 0.18.0, launching a dbt run command against a BigQuery database will output the bytes processed when a model is successfully deployed. However, to get a full picture of the number of bytes dbt processes on BigQuery (and hence an idea of the cost of running a specific dbt command), this should also be logged when launching the dbt test, dbt source snapshot-freshness, and dbt run-operation commands.
Additional context
For dbt run commands, this feature was discussed in #2526. It would be nice to extend it so that this number is also logged for the other commands mentioned above.
Who will this benefit?
This is a BigQuery-specific issue.
Are you interested in contributing this feature?
Sure, happy to see how we could add this.
Issue Analytics
- Created 3 years ago
- Comments: 9 (9 by maintainers)
#2961 adds an adapter-specific dict to run_results.json. Let’s consider that a partway resolution to this issue; meanwhile, I’ll open a new issue to address the desire for recording bytes_processed for queries from other invocation types (namely dbt test), which we can revisit for v0.20.0.
IIRC, after discussing this live, we decided we’d opt for:
The trickiness will be in surfacing that data to stdout. While this isn’t a blocker for #2493, it’s highly related. I’m removing “good first issue” and “bigquery” (this gets into our core plumbing) and pulling it into v0.19.
Edit: Eventually, we may want this to be a list of dictionaries (query_stats). One model/materialization may include multiple queries, and we’d want to collect stats for all of them. This isn’t something we do a good job of today in run results more generally.
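To make the "list of dictionaries" idea concrete, here is a minimal Python sketch of how per-query stats might be collected and summed from a parsed run_results.json. It is an illustration under assumptions, not dbt's implementation: the keys adapter_response and bytes_processed are assumed to match the adapter-specific dict mentioned above, the query_stats list layout is hypothetical, and collect_query_stats, total_bytes_processed, and the sample data are made up for this example.

```python
from typing import Any, Dict, List


def collect_query_stats(run_results: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Map each node's unique_id to a list of per-query stats dicts."""
    stats: Dict[str, List[Dict[str, Any]]] = {}
    for result in run_results.get("results", []):
        # Assumed key: the adapter-specific dict attached to each result.
        response = result.get("adapter_response") or {}
        # Hypothetical layout: a "query_stats" list with one dict per query.
        per_query = response.get("query_stats")
        if per_query is None:
            # Fall back to treating the single dict as one query's stats.
            per_query = [response] if response else []
        stats[result["unique_id"]] = per_query
    return stats


def total_bytes_processed(stats: Dict[str, List[Dict[str, Any]]]) -> int:
    """Sum bytes_processed over every query of every node (missing counts as 0)."""
    return sum(
        query.get("bytes_processed") or 0
        for queries in stats.values()
        for query in queries
    )


# Made-up sample data shaped like a parsed run_results.json.
sample = {
    "results": [
        {"unique_id": "model.demo.orders",
         "adapter_response": {"bytes_processed": 1024}},
        {"unique_id": "model.demo.customers",
         "adapter_response": {"query_stats": [{"bytes_processed": 2048},
                                              {"bytes_processed": 512}]}},
        {"unique_id": "test.demo.not_null_orders_id",
         "adapter_response": {}},
    ]
}

stats = collect_query_stats(sample)
print(total_bytes_processed(stats))  # 3584
```

Keeping a list per node means a materialization that issues several queries reports each one separately, while a single-dict response still fits the same shape.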