dbt Tests #
Tests are assertions about models and resources in the project.
Does the assertion hold? The test passes. If not, it fails, and it highlights potential issues in the design.
Basically unit tests, great for ensuring data quality.
Design #
In the models/<data store>/schema.yaml
file:
models:
- name: <some name>
description: ...
columns:
...
- name:
description: ...
columns:
- name: <column name>
description:
tests:
- unique
- not_null
unique
- check that it’s uniquenot_null
- check that the column is not null
Implementation #
Under the hood, tests are essentially macros.
Tests are stored under the macros/
path.
The convention is to name them as : test_<descriptor>.sql
. Within that file is just the sql script to be run as a test of some form.
A simple macro to test if a field is even or odd might be located at macro/test_is_even.sql
(code snippet can be found at the official documentation).
A generic form of a test, named test_is_<condition>.sql
might look like:
{% test is_<condition>(model, column_name) %}
with validation as (
select
{{ column_name }} as <field name>
from {{ model }}
),
validation_errors as (
select
<field name>
from validation
where <some sort of condition>
)
select * -- or perhaps count(*)
from validation_errors
{% endtest %}
To then use the test, modify the YAML located alongside the model to include:
version: 2
models:
- name: <the {{ model }} referenced in the test>
description:
...
columns:
- name: <this is what the {{column_name}} in the test file refers to>
description: ...
tests:
- is_<condition>
Where ^ is_<condition>
is derived from the test file test_is_<condition>.sql
Testing #
To run the tests: dbt test
It’ll then go through the tests, and highlight where the failures happened.
This test basically compiles all the tests, and can be examined at ./target/compiled/<data store>/...
for test cases. Basically a bunch of .sql
files.
Resolving errors #
If errors are found, it’s a good practice to copy the SQL query and run it directly to see what gets returned.