dbt tests: False negatives when testing between project layers

We have a small project with four layers.  When we added relationship tests between project layers, we started getting false test failures.

The raw layer is a bit like dbt’s stage layer, we do renaming and some very simple changes here and have validation for duplicates and nulls. The validation layer we build dims and facts, check data, count relationships and other tests. In reporting we build objects to help the BI tool visualize the data and business_app are simply the exposed, secure version of the reporting models.

Currently, we build the project in layers so if there’s a failure in one layer nothing in the next layer builds.  Pseudo code of our cmd list would be:

dbt build -s layer:raw
dbt build -s layer:validation
dbt build -s layer:reporting
dbt build -s layer:business_app

(yes, yes.. layer isn’t a real selector, but you get it)

 

The problem was, we added relationship tests between the layers, so our validation.yml has things like :

- name: validation_layer_model_1
  tests:
    - dbt_utils.equal_rowcount:
        compare_model: ref('raw_layer_model_A')

 

Because of the ref() to the raw layer, this relationship test is run as part of

dbt build -s layer:raw

 

So we build raw_layer_model_A with more data, but haven’t yet run validation_layer_model_1 and now this rowcount fails.

We thought about different ways to address this:

  1. Change the selections we use to build the project. 👎 We’ve been otherwise happy with this command structure
  2. Write custom tests. 👎 more code to maintain
  3. Write where clauses to the relationship test to be sure the fact build time is after the associated raw build time. 👎 not all our models have build timestamp, so would need modify a bunch of them.
  4. Add some 🏷️s. 👍
    • ✅ simple
    • ✅ no new sql code to write or maintain
    • 🏁

 

We added tags to all cross-layer relationship tests.

- name: validation_layer_model_1
  tests:
    - dbt_utils.equal_rowcount:
        compare_model: ref('raw_layer_model_A')
        tags: val_raw_relationship

 

We updated the project build job as follows:

dbt build -s layer:raw --exclude tag:val_raw_relationship
dbt build -s layer:validation --exclude tag:rep_val_relationship
dbt build -s layer:reporting
dbt build -s layer:business_app

 

We’ve found that the excluded tests were correctly run when we built the next layer in the next build statement. But you could also add explicitly call them with a test cmd:

dbt build -s layer:raw --exclude tag:val_raw_relationship
dbt build -s layer:validation --exclude tag:rep_val_relationship
dbt test -s tag:val_raw_relationship
dbt build -s layer:reporting --exclude tag:val_rep_relationship
dbt test -s tag:rep_val_relationship
dbt build -s layer:business_app

 

 

 

Is your organization adapting to remain competitive? In The Architect Room, we design a roadmap for the future of your digital organization, while capitalizing on current technology investments. Our knowledge spans a variety of offerings across all of the major public cloud providers. 

 

13 Responses to dbt tests: False negatives when testing between project layers

Leave a Reply

Your email address will not be published. Required fields are marked *