Storing test results in the database
Describe the feature
dbt testing is powerful; being notified when a test fails can be life-saving; debugging those failed tests could be easier!
There are two long-lived issues related to this subject (#517 and #903), and dozens of valuable comments on both. After discussing with several colleagues and a few community members, we’ve arrived at an approach I feel good about.
The goal here isn’t to build a dimensional model of historical data surrounding dbt invocations. I think we can and should build out capacity for those longitudinal analyses, but it should be powered by dbt artifacts (manifest and run results) rather than adding records to database tables (a la logging package).
Rather, the goal is to add tooling that can be of immediate benefit to dbt developers, for whom debugging tests is a slog that’s harder than it needs to be. We want to provide the post-test investigator with a high-level overview of dbt test statuses, and the errant records from any failing tests, within a query or two.
A lot of this work is going to build on top of v2 schema tests (#1173) and data tests (#2274), both of which are on the 1.0.0 docket. So while this feature is not coming in the next month or two, it is firmly on the roadmap.
Spec: v2 schema tests
To get where we want to go, we need to talk a little about our planned changes to schema tests.
Let’s imagine a modified version of the current not_null schema test. Several small things are different, the most significant of which is that the test returns a set of records instead of just count(*), similar to how data tests work today:
{% test not_null(model, column_name) %}

    {% set name = 'not_null__' ~ model.name ~ '__' ~ column_name %}
    {% set description %} Assert that {{ column_name }} is never null in {{ model }} {% endset %}
    {% set fail_msg %} Found {{ result }} null values of {{ model }}.{{ column_name }} {% endset %}

    {{ config(name=name, description=description, fail_msg=fail_msg) }}

    select * from {{ model }}
    where {{ column_name }} is null

{% endtest %}
Let’s say we have two not_null tests defined in our project, in a file called resources.yml:
version: 2

models:
  - name: customers
    columns:
      - name: id
        tests:
          - not_null
  - name: orders
    columns:
      - name: order_status
        tests:
          - not_null
The standard way of executing these tests looks like compiling and running some queries, and storing the compiled version of those queries in the target directory:
-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__customers__id.sql
select count(*) from (
select * from dev_jcohen.customers
where id is null
) validation_errors
;
-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql
select count(*) from (
select * from dev_jcohen.orders
where order_status is null
) validation_errors
;
After running, dbt returns any failures as log output to the CLI:
12:00:01 | 1 of 2 START test not_null__customers__id.........................[RUN]
12:00:02 | 1 of 2 PASS not_null__customers__id.............................. [PASS in 1.00s]
12:00:03 | 2 of 2 START test not_null__orders__order_status................. [RUN]
12:00:04 | 2 of 2 FAIL not_null__orders__order_status....................... [FAIL 7 in 1.00s]
Failure in test not_null__orders__order_status (models/resources.yml)
Found 7 null values of orders.order_status
compiled SQL at target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql
There are many features we imagine unlocking with v2 testing:
- In addition to warn severity, more configurable warn_after and error_after thresholds based on a scalar count or %
- Tagging tests to customize failure notifications
For now, let’s focus on storing the results of these tests in the database for easier debugging.
Spec: storing results
Now let’s imagine a new test flag:
$ dbt test --store-results
It’s unlikely that dbt developers would want to include this flag when iterating on net-new models, but it’s quite likely that production or CI test runs would benefit from including it.
When the --store-results flag is included, dbt will instead execute tests like so:
drop schema if exists dev_jcohen__dbt_test_results;
create schema dev_jcohen__dbt_test_results;
create table dev_jcohen__dbt_test_results.not_null__customers__id as (
-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__customers__id.sql
select * from dev_jcohen.customers
where id is null
);
create table dev_jcohen__dbt_test_results.not_null__orders__order_status as (
-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql
select * from dev_jcohen.orders
where order_status is null
);
Based on the row count in the query status returned by the database, or by running a quick count(*) on each table after creation, dbt will determine whether the test passed or failed and create a summary table of all tests in the invocation. (We’ll use a mechanism similar to dbt seed to batch inserts and avoid any limits on inserts or query length.)
create table dev_jcohen__dbt_test_results.summary (
invocation_id string,
test_name string,
status string,
execution_time timestamp
);
insert into dev_jcohen__dbt_test_results.summary values
    ('xxxxxxxx-xxxx-0000-0000-xxxxxxxxxxxx', 'not_null__customers__id', 'PASS', '2020-06-25 12:00:01'),
    ('xxxxxxxx-xxxx-0000-0000-xxxxxxxxxxxx', 'not_null__orders__order_status', 'FAIL 7', '2020-06-25 12:00:03');
And finally log to the CLI:
12:00:01 | 1 of 2 START test not_null__customers__id.........................[RUN]
12:00:02 | 1 of 2 PASS not_null__customers__id.............................. [PASS in 1.00s]
12:00:03 | 2 of 2 START test not_null__orders__order_status................. [RUN]
12:00:04 | 2 of 2 FAIL not_null__orders__order_status....................... [FAIL 7 in 1.00s]
Failure in test not_null__orders__order_status (models/resources.yml)
Found 7 null values of orders.order_status
-------------------------------------------------------------------------
select * from dev_jcohen__dbt_test_results.not_null__orders__order_status
-------------------------------------------------------------------------
compiled SQL at target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql
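From there, the post-test investigation is a query or two against the audit schema. A sketch, reusing the example schema and table names above:

-- which tests failed in this invocation?
select test_name, status, execution_time
from dev_jcohen__dbt_test_results.summary
where status != 'PASS';

-- pull the errant records for a specific failure
select *
from dev_jcohen__dbt_test_results.not_null__orders__order_status;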
Questions
Test definition
dbt will depend entirely on the test block to circumscribe its failures. Each test block query will be responsible for defining:
- Which columns to include in the table of test failures. E.g. *, an explicit column list, dbt_utils.star, or * except (on BigQuery).
- How many rows to include in the table of test failures, e.g. by adding limit 500 (see the sketch below).
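For illustration, here is a sketch of a test block that makes both choices explicit, in the style of the v2 spec above (the order_id column is a hypothetical identifier, not part of the spec):

{% test not_null(model, column_name) %}

    {{ config(name='not_null__' ~ model.name ~ '__' ~ column_name) }}

    -- explicit column list instead of *, so only debugging-relevant columns are stored
    select
        order_id,
        {{ column_name }}
    from {{ model }}
    where {{ column_name }} is null
    -- cap how many failing rows end up in the audit table
    limit 500

{% endtest %}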
Audit schema
dbt will store all results in one audit schema per dbt test --store-results invocation. It will name the schema {{ target.schema }}__dbt_test_results.
Can I name the schema differently? What about tests on models with custom schemas?
We’re going to keep this simple for the first version by storing all tests in one schema with a set name. We envision adding configuration for this later, especially if there is user demand.
Why does dbt drop and recreate the dbt test audit schema every time?
dbt test is a stateless, idempotent operation; dbt test --store-results will be as well. If you run it several times, you will end up with the same artifacts.
How can I preserve history of past test failures?
The same way you preserve historical data from sources, models, or anything else: snapshots on top of tables in the audit schema, or your own custom run-operation macros to copy or unload that data to external storage.
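As a sketch of the snapshot route, assuming the summary table layout proposed above (the target schema name and composite key here are illustrative, not part of the spec):

{% snapshot dbt_test_results_history %}

    {{ config(
        target_schema='dbt_test_history',
        unique_key="invocation_id || '-' || test_name",
        strategy='timestamp',
        updated_at='execution_time'
    ) }}

    select * from {{ target.schema }}__dbt_test_results.summary

{% endsnapshot %}

Because every invocation writes a fresh invocation_id, each dbt snapshot run simply appends that invocation’s rows to the history table.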
Granular control
The --store-results flag flips dbt test from running a set of queries that perform only introspection to a set of queries that create assets in the database.
Will it be possible to store results for some tests, and not for others?
In the first version of this, the --store-results flag will change the behavior of all tests run. Eventually, we envision adding a test config flag (store_results: false) that takes precedence over the CLI flag, similar to changes to --full-refresh in 0.18.0.
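In test-block terms, that envisioned config could look something like the following; this is purely a sketch of the proposed store_results flag, not final syntax:

{% test not_null(model, column_name) %}

    -- hypothetical: opt this test out of result storage,
    -- even when `dbt test --store-results` is passed on the CLI
    {{ config(store_results=false) }}

    select * from {{ model }}
    where {{ column_name }} is null

{% endtest %}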
Until then, you can manage this with multiple commands and node selection:
dbt test --store-results --exclude my_massive_model
dbt test --models my_massive_model
Comments
Heya, just giving my 2 cents here.
If you want to write stuff to the database, why not make it a model?
A few weeks ago people asked me this exact same thing about storing test failure results in the db, in order for them to be able to fully deprecate much more expensive tooling they had in place. The solution I came up with was to have tests as models that output exactly 1 row, with at least the following 3 columns: test_name, test_result and test_details. In the test_details column I aggregate whatever information is relevant to the test, such that when posted on Slack (by reading the target compiled file), it has line breaks etc. After the model runs, there’s also a custom schema test that checks the test_result column for the value ‘FAILED’, pretty straightforward. Finally, I used a post-hook to, if the test failed, insert the results into a ‘test_history’ table shared between all tests following this pattern.

I must say it has worked pretty well for now, but of course, I don’t think this would solve every use case.
Here’s one of the examples, the macro I used, and the sample Slack alert I got from it (shared as screenshots in the original comment).
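Since those screenshots aren’t reproduced here, a minimal sketch of the pattern described above might look like this (not the commenter’s actual macro; the model name, ref, and test_history location are assumptions):

-- models/tests_as_models/test_orders_have_status.sql
-- one-row "test as model": test_name, test_result, test_details
{{ config(
    materialized='table',
    post_hook="insert into analytics.test_history select *, current_timestamp as recorded_at from {{ this }} where test_result = 'FAILED'"
) }}

with failures as (

    select * from {{ ref('orders') }}
    where order_status is null

)

select
    'orders_have_status'                                      as test_name,
    case when count(*) = 0 then 'PASSED' else 'FAILED' end    as test_result,
    'null order_status count: ' || cast(count(*) as varchar)  as test_details
from failures

A custom schema test on test_result (asserting it never equals 'FAILED') then surfaces the failure in dbt test, and the post-hook appends the row to test_history only when the check fails.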
We’re also able to surface that test_history table in our viz tool to try and detect patterns in test failures etc.
Maybe this could be translated into a custom materialization, no?
Anyway, just thought it was worth sharing…
I want to store results of dbt source snapshot-freshness too, in addition to results of tests.