Configs, properties, what are they?
Resources in your project (models, snapshots, seeds, tests, and the rest) can have a number of declared properties. Resources can also define configurations, which are a special kind of property that brings extra abilities. What's the distinction?
- Properties are declared for resources one-by-one in properties.yml files. Configs can be defined there, nested under a config property. They can also be set one-by-one via a config() macro (right within .sql files), and for many resources at once in dbt_project.yml.
- Because configs can be set in multiple places, they are also applied hierarchically. An individual resource might inherit or override configs set elsewhere.
- You can select resources based on their config values using the config: selection method, but not the values of non-config properties.
- There are slightly different naming conventions for properties and configs depending on the file type. Refer to naming convention for more details.
A rule of thumb: properties declare things about your project resources; configs go the extra step of telling dbt how to build those resources in your warehouse. This is generally true, but not always, so it's always good to check!
For example, you can use resource properties to:
- Describe models, snapshots, seed files, and their columns
- Assert "truths" about a model, in the form of data tests, e.g. "this id column is unique"
- Define pointers to existing tables that contain raw data, in the form of sources, and assert the expected "freshness" of this raw data
- Define official downstream uses of your data models, in the form of exposures
Whereas you can use configurations to:
- Change how a model will be materialized (table, view, incremental, etc)
- Declare where a seed will be created in the database (<database>.<schema>.<alias>)
- Declare whether a resource should persist its descriptions as comments in the database
- Apply tags and "meta" properties
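As a rough sketch of the configurations listed above, a pair of properties files might nest them under a config property as follows. The file paths, the seed name, the schema and alias values, the tag, and the meta key are all hypothetical and chosen only for illustration.

```yaml
# --- models/properties.yml (hypothetical path) ---
version: 2

models:
  - name: stg_jaffle_shop__orders
    config:
      materialized: table        # how the model is built in the warehouse
      persist_docs:              # persist descriptions as database comments
        relation: true
        columns: true
      tags: ['nightly']          # hypothetical tag
      meta:
        owner: analytics_team    # hypothetical meta key

---
# --- seeds/properties.yml (hypothetical path) ---
version: 2

seeds:
  - name: country_codes          # hypothetical seed
    config:
      schema: reference          # feeds the <schema> part of the relation name
      alias: iso_country_codes   # feeds the <alias> part of the relation name
```

The two documents are separated by `---` only to show both files in one listing; in a project they would live in separate files.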
Where can I define configs?
Depending on the resource type, configurations can be defined in the dbt project and also in an installed package by:
- Using a config property in a .yml file in the models/, snapshots/, seeds/, analyses/, or tests/ directory
- From the dbt_project.yml file, under the corresponding resource key (models:, snapshots:, tests:, etc)
Config inheritance
The most specific config always takes precedence. This generally follows the order above: an in-file config() block --> properties defined in a .yml file --> config defined in the project file.
Note - Generic data tests work a little differently when it comes to specificity. See test configs.
Within the project file, configurations are also applied hierarchically, and the most specific config again takes precedence. For example, configurations applied to a marketing subdirectory will take precedence over configurations applied to the entire jaffle_shop project. To apply a configuration to a model or directory of models, define the resource path as nested dictionary keys.
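A minimal sketch of that nesting in dbt_project.yml, assuming the project is named jaffle_shop and has a models/marketing/ subdirectory. The materialization values are illustrative choices, not recommendations.

```yaml
# dbt_project.yml (fragment)
name: jaffle_shop

models:
  jaffle_shop:
    +materialized: view      # applies to every model in the project
    marketing:
      +materialized: table   # more specific: wins for models under models/marketing/
```

The `+` prefix marks a key as a config rather than a further level of the resource path.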
Configurations in your root dbt project have higher precedence than configurations in installed packages. This enables you to override the configurations of installed packages, providing more control over your dbt runs.
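For instance, a root project could override a package's default materialization along these lines. The package name some_package is a placeholder, not a real package.

```yaml
# dbt_project.yml in the root project (fragment)
models:
  some_package:              # placeholder for an installed package's project name
    +materialized: view      # takes precedence over the package's own dbt_project.yml
```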
Combining configs
Most configurations are "clobbered" when applied hierarchically. Whenever a more specific value is available, it will completely replace the less specific value. Note that a few configs have different merge behavior:
- tags are additive. If a model has some tags configured in dbt_project.yml, and more tags applied in its .sql file, the final set of tags will include all of them.
- meta dictionaries are merged (a more specific key-value pair replaces a less specific value with the same key)
- pre-hook and post-hook are also additive.
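The sketch below illustrates these merge rules with a project-level config and a more specific config in a properties file. The tag names and meta keys are made up for the example, and the "effective" values in the trailing comments are what the rules above imply rather than captured dbt output.

```yaml
# --- dbt_project.yml (fragment) ---
models:
  jaffle_shop:
    +materialized: view
    +tags: ['daily']
    +meta:
      owner: data_team
      contains_pii: false

---
# --- models/properties.yml (hypothetical path) ---
version: 2

models:
  - name: stg_jaffle_shop__customers
    config:
      materialized: table
      tags: ['pii']
      meta:
        contains_pii: true

# Effective configs for stg_jaffle_shop__customers under the rules above:
#   materialized: table                  (clobbered: the more specific value replaces "view")
#   tags: daily and pii                  (additive)
#   meta: owner=data_team, contains_pii=true   (merged; the more specific key wins)
```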
Where can I define properties?
In dbt, you can use properties.yml files to define properties for resources. You can declare properties in .yml files, in the same directory as your resources. You can name these files whatever_you_want.yml and nest them arbitrarily in sub-folders within each directory.
We highly recommend that you define properties in dedicated paths alongside the resources they're describing.
schema.yml files
Previous versions of the docs referred to these as schema.yml files. We've moved away from that terminology since the word schema is used to mean other things when talking about databases, and people often thought that you had to name these files schema.yml.
Instead, we now refer to these files as properties.yml files. (Of course, you're still free to name your files schema.yml.)
Which properties are not also configs?
In dbt, you can define node configs in properties.yml files, in addition to config() blocks and dbt_project.yml. However, some special properties can only be defined in the .yml file; you cannot set them using config() blocks or the dbt_project.yml file.
Certain properties are special, because:
- They have a unique Jinja rendering context
- They create new project resources
- They don't make sense as hierarchical configuration
- They're older properties that haven't yet been redefined as configs
These properties are:
- description
- tests
- docs
- columns
- quote
- source properties (e.g. loaded_at_field, freshness)
- exposure properties (e.g. type, maturity)
- macro properties (e.g. arguments)
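To make a few of these YAML-only properties concrete, here is a sketch of a source with freshness expectations and an exposure. The _loaded_at column, the freshness thresholds, and the exposure name and owner are assumptions made for illustration.

```yaml
version: 2

sources:
  - name: raw_jaffle_shop
    description: A replica of the postgres database used to power the jaffle_shop app.
    loaded_at_field: _loaded_at            # hypothetical timestamp column
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders

exposures:
  - name: weekly_orders_dashboard          # hypothetical exposure
    type: dashboard
    maturity: high
    owner:
      name: Analytics team                 # hypothetical owner
      email: analytics@example.com
    depends_on:
      - source('raw_jaffle_shop', 'orders')
```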
Example
Here's an example that defines both sources and models for a project:
version: 2

sources:
  - name: raw_jaffle_shop
    description: A replica of the postgres database used to power the jaffle_shop app.
    tables:
      - name: customers
        columns:
          - name: id
            description: Primary key of the table
            tests:
              - unique
              - not_null

      - name: orders
        columns:
          - name: id
            description: Primary key of the table
            tests:
              - unique
              - not_null

          - name: user_id
            description: Foreign key to customers

          - name: status
            tests:
              - accepted_values:
                  values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']

models:
  - name: stg_jaffle_shop__customers
    config:
      tags: ['pii']
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null

  - name: stg_jaffle_shop__orders
    config:
      materialized: view
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
              config:
                severity: warn
Related documentation
You can find an exhaustive list of each supported property and config, broken down by resource type:
- Model properties and configs
- Source properties and configs
- Seed properties and configs
- Snapshot properties
- Analysis properties
- Macro properties
- Exposure properties
Troubleshooting common errors
Invalid test config given in [model name]
This error occurs when your .yml file does not conform to the structure expected by dbt. A full error message might look like:
* Invalid test config given in models/schema.yml near {'namee': 'event', ...}
Invalid arguments passed to "UnparsedNodeUpdate" instance: 'name' is a required property, Additional properties are not allowed ('namee' was unexpected)
While verbose, an error like this should help you track down the issue. Here, the name field was provided as namee by accident. To fix this error, ensure that your .yml conforms to the expected structure described in this guide.
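A corrected entry for the error above might look like the following sketch. The error message does not show the surrounding file, so the models: key is an assumption here.

```yaml
# models/schema.yml
version: 2

models:            # assumed resource key; not visible in the error message
  - name: event    # was misspelled as "namee: event"
```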
Invalid syntax in your schema.yml file
If your .yml file is not valid YAML, then dbt will show you an error like this:
Runtime Error
Syntax error near line 6
------------------------------
5 | - name: events
6 | description; "A table containing clickstream events from the marketing website"
7 |
Raw Error:
------------------------------
while scanning a simple key
in "<unicode string>", line 6, column 5:
description; "A table containing clickstream events from the marketing website"
^
This error occurred because a semicolon (;) was accidentally used instead of a colon (:) after the description field. To resolve issues like this, find the .yml file referenced in the error message and fix any syntax errors present in the file. There are online YAML validators that can be helpful here, but please be mindful of submitting sensitive information to third-party applications!
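For the error shown above, the fix is a one-character change on line 6. The enclosing models: key in this sketch is assumed, since the error output only shows lines 5 to 7 of the file.

```yaml
models:                       # assumed parent key
  - name: events
    # colon, not semicolon, after the key
    description: "A table containing clickstream events from the marketing website"
```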