From a practical perspective, there are six main use cases for the dbt integration. These are outlined below:
Debugging
Database comparisons
Data engineering prototyping
Design & develop new data models
Test a model in different environments
Compare live outputs of models
Debugging
Bring your existing models into Count to clearly understand the lineage and see the results at every stage, allowing you to quickly spot the error that needs resolving.
By doing this in a canvas, you're also able to make debugging collaborative.
Database comparisons
Compare different databases directly in one place, and even merge the results (see the sketch below).
This is great for:
debugging
understanding (and explaining) how your data flows through your different tools
migrations
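For example, a local DuckDB cell could stack the results of two cells that each query a different connection. This is only a sketch: the cell names (postgres_orders, snowflake_orders) and columns are hypothetical, and it assumes cell results can be queried by name, as in the DuckDB example later on this page.
-- Hypothetical local cell combining the results of two cells,
-- each pointing at a different database connection
select 'postgres' as source, order_id, amount from postgres_orders
union all
select 'snowflake' as source, order_id, amount from snowflake_orders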
Data engineering prototyping
Before the data engineers have built out the full models in your production database, you can quickly start prototyping in a canvas using sample data from:
CSV files
APIs, using Python cells
Staging databases
You can do this in any language, without waiting for the full end-to-end data modelling process, which gives you a faster path to productionising later.
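As a minimal sketch, assuming a sample extract has been loaded into the canvas as a cell or table called orders_sample (a placeholder name), an early prototype of a staging model could be as simple as:
-- Hypothetical prototype transformation over a sample dataset;
-- orders_sample is a placeholder for however the CSV appears in your canvas
select
    order_id,
    customer_id,
    cast(created_at as date) as order_date,
    amount
from orders_sample
where status != 'cancelled'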
Design & develop new data models
Once your dbt project is connected, you can develop new models entirely within Count:
Import the upstream models you need
Create new models as cells, referencing any upstream model or macro that you require (see the sketch after this list)
Commit the new model files to GitHub from Count
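For instance, a new model cell reads like any other dbt model file. The model and macro names below are illustrative only:
-- Hypothetical model cell; stg_orders and cents_to_dollars are placeholder names
select
    customer_id,
    count(*) as order_count,
    sum({{ cents_to_dollars('amount') }}) as lifetime_value
from {{ ref('stg_orders') }}
group by customer_id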
Test a model in different environments
While building a model, at some point you'll probably want to preview how it will look in your production environment.
The canvas is a great place to make these comparisons. To set one up:
Add connections to your production and staging environments
Add dbt integrations to your connections
In a new canvas, write your new model in a cell connected to your staging connection
Duplicate the cell, and set the source to your production connection
You now have a live preview of your model running in both environments, where you can, for example:
Compare returned rows
Compute sense-check derived metrics like counts, totals and averages in other cells (see the sketch after this list)
Visualise column distributions
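For example, assuming the two duplicated cells are named staging_preview and production_preview (placeholder names), a quick sense-check cell might look like:
-- Hypothetical comparison cell over the two duplicated model cells
select
    (select count(*) from staging_preview)    as staging_rows,
    (select count(*) from production_preview) as production_rows,
    (select count(*) from staging_preview)
      - (select count(*) from production_preview) as row_difference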
Compare live outputs of models
You may already be using tools like DataDiff to compare the outputs of your models during your continuous integration jobs. This is good practice, and it's even more helpful to extend this diffing process to give live results while developing your models.
One way of achieving this in the canvas is to add a local cell that continually checks your in-progress model results against a known-good result set.
Continuing with the example above, suppose you're trying to remove a hard-coded value in a filter. Below, a DuckDB EXCEPT operator is used to find any rows in one table that aren't present in the other. As soon as a difference appears, a big and obvious warning is displayed.
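A sketch of what such a check cell might contain, where new_model_results and known_good_results are placeholders for your in-progress model and the reference result set:
-- Hypothetical check cell; table names are placeholders
with new_rows as (
    select * from new_model_results
    except
    select * from known_good_results
)
select if(count(*) > 0,
          '❌ New rows found that are not in the known-good set',
          '✅ Matches the known-good set') as status
from new_rows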
As the canvas is such a flexible space, you're free to hide this logic in a frame, add any additional formatting etc.
The following macro is an example of how you might want to generalise this approach:
{% macro except_diff(table_1, table_2, error, success) %}
with new_rows as (select * from {{table_1}} except select * from {{table_2}})
select if(count(*) > 0, '❌ {{error}}', '✅ {{success}}') as status
from new_rows
{% endmacro %}
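You could then call the macro from a cell along these lines, where the table names and messages are placeholders:
{{ except_diff('new_model_results', 'known_good_results',
               'New rows found that are not in the known-good set',
               'Matches the known-good set') }}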
For more technical information about how the dbt integration works, see our docs.