Examples
For consistency and clarity of the following examples, we're going to use a simplified dbt project. In practice, the model governance and cross-project lineage features described here are most beneficial for large dbt projects that are struggling to scale.
We will give a basic example for each command, but to see the full list of additional flags you can add to a given command, check out the commands page.
Note
One helpful flag that you can add to all of the commands is --read-catalog, which will skip the dbt docs generate step and instead read the local catalog.json file. This will speed up the dbt-meshify commands, but it relies on your local catalog.json file being up to date. Alternatively, you can configure this via the DBT_MESHIFY_READ_CATALOG environment variable.
Let's imagine a dbt project with the following models:
You can check out the source code for this example here.
Component commands
Create a new group
Let's say you want to create a new group for your sales analytics models.
You can run the following command:
dbt-meshify operation create-group sales_analytics --owner-name Ralphie --select +int_sales__unioned +int_returns__unioned transactions
This will create a new group named "sales_analytics" with the owner "Ralphie" and add all selected models to that group with the appropriate access configuration:
- create a new group definition in a _groups.yml file
- add all selected models to that group with the appropriate access config
  - all models that are only referenced by other models in their same group will have access: private
  - all other models (those that are referenced by models outside their group or are "leaf" models) will have access: protected
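As a rough illustration, the resulting YAML might look something like the sketch below. This is not output copied from dbt-meshify: the file that holds the model entries is an assumption, and which models end up private or protected depends on your project's lineage, so the assignments shown are only illustrative.

```yaml
# _groups.yml: the new group definition
groups:
  - name: sales_analytics
    owner:
      name: Ralphie

# Model entries in a properties file (location is an assumption)
models:
  - name: int_sales__unioned
    group: sales_analytics
    access: private      # illustrative: only referenced inside the group
  - name: transactions
    group: sales_analytics
    access: protected    # illustrative: a "leaf" model or referenced from outside the group
```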
Add/increment model versions
Let's say you want to add a new version to the customers model, which is currently un-versioned. Versions can provide a smoother upgrade pathway when introducing breaking changes to models that have downstream dependencies.
You can run the following command:
This will assign a version to the current customers model and add a new version for the breaking change you wish to implement:
- the customers.sql file will be renamed to customers_v1.sql
- a new customers_v2.sql file will be created based on customers_v1.sql
- the necessary version configurations will be created (or added to a pre-existing yml file)
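For orientation, a versioned model in dbt is declared with a versions list in the model's YAML. A minimal sketch of what the customers entry could look like after the steps above is shown here. The optional latest_version key is left out because this page does not show which value is written; by default dbt treats the highest version as the latest.

```yaml
models:
  - name: customers
    versions:
      - v: 1   # corresponds to customers_v1.sql (the renamed original)
      - v: 2   # corresponds to customers_v2.sql (copy to edit with the breaking change)
```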
Add contract(s)
Let's say you want to add a new contract to the stores model, which is currently un-contracted.
You can run the following command:
This will add an enforced contract to the stores model:
- add a contract config and set enforced: true
- add every column's name and data_type if not already defined
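A contracted model in dbt YAML looks roughly like the sketch below. The column names and data types are hypothetical placeholders, since the actual columns of the stores model are not listed on this page.

```yaml
models:
  - name: stores
    config:
      contract:
        enforced: true
    columns:
      - name: store_id       # hypothetical column
        data_type: integer   # hypothetical type
      - name: store_name     # hypothetical column
        data_type: varchar   # hypothetical type
```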
Global commands
Group together a subset of models
Let's say you want to group together your sales analytics models, that is, create a new group and add contracts to the appropriate models in a single step.
You can run the following command:
dbt-meshify group sales_analytics --owner-name Ralphie --select +int_sales__unioned +int_returns__unioned transactions
This will create a new group named "sales_analytics" with the owner "Ralphie", add all selected models to that group with the appropriate access configuration, and add contracts to the models at the boundary between this group and the rest of the project:
- create a new group definition in a _groups.yml file
- add all selected models to that group with the appropriate access config
  - all models that are only referenced by other models in their same group will have access: private
  - all other models (those that are referenced by models outside their group or are "leaf" models) will have access: protected
- for all protected models:
  - add a contract config and set enforced: true
  - add every column's name and data_type if not already defined
Split out a new subproject
Let's say you want to split out your sales analytics models into a new subproject.
You can run the following command:
This will create a new subproject that contains the selected sales analytics models, configure the "edge" models to be public and contracted, and replace all dependencies in the downstream project on the upstream's models with cross-project refs:
- create a new subproject that contains the selected sales analytics models
- add a dependencies.yml to the downstream project (in our case, our new subproject is downstream of the original project because the transactions model depends on some of the models that remain in the original project - stores and customers)
- add access: public to all "leaf" models (models with no downstream dependencies) and models in the upstream project that are referenced by models in the downstream project
- for all public models:
  - add a contract config and set enforced: true
  - add every column's name and data_type if not already defined
- replace any dependencies in the downstream project on the upstream's models with a cross-project ref
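The dependencies.yml mentioned above declares the upstream project by name. A minimal sketch follows; original_project is a hypothetical placeholder for whatever name the upstream project uses in its dbt_project.yml.

```yaml
# dependencies.yml in the downstream project (here, the new subproject)
projects:
  - name: original_project   # hypothetical upstream project name
```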
By default, the new subproject will be created in the current directory; however, you can use the --create-path flag to create it in any directory you like.
Connect multiple dbt projects
Let's look at a slightly modified version of the example we've been working with. Instead of a single dbt project, let's imagine you're starting with two separate dbt projects connected via the "source hack":
- project A contains the following models
- project B contains the following models
We call this type of multi-project configuration the "source hack" because there are models generated by project A (stores and customers) that are defined as sources in project B.
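To make the "source hack" concrete, project B would typically contain a source definition along these lines. This is only a sketch: the source name and schema are hypothetical, and only the table names (stores and customers) come from the example.

```yaml
# In project B: project A's models declared as sources
sources:
  - name: project_a        # hypothetical source name
    schema: analytics      # hypothetical schema where project A builds its models
    tables:
      - name: stores
      - name: customers
```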
Let's say we want to connect these two projects using model governance best practices and cross-project refs.
You can run the following command:
This will make the upstream project a dependency for the downstream project, configure the "edge" models to be public and contracted, and replace all dependencies in the downstream project on the upstream's models with cross-project refs:
- add a dependencies.yml to the downstream project (in our case, project B is downstream of project A because the transactions model depends on some of the models that are generated by project A - stores and customers)
- add access: public to all models in the upstream project that are referenced by models in the downstream project
- for all public models:
  - add a contract config and set enforced: true
  - add every column's name and data_type if not already defined
- replace any dependencies in the downstream project on the upstream's models with a cross-project ref
- remove unnecessary sources