
Configs, properties, what are they?

Resources in your project—models, snapshots, seeds, tests, and the rest—can have a number of declared properties. Resources can also define configurations, which are a special kind of property that bring extra abilities. What's the distinction?

  • Properties are declared for resources one-by-one in properties.yml files. Configs can be defined there, nested under a config property. They can also be set one-by-one via a config() macro (right within .sql files), and for many resources at once in dbt_project.yml.
  • Because configs can be set in multiple places, they are also applied hierarchically. An individual resource might inherit or override configs set elsewhere.
  • You can select resources based on their config values using the config: selection method, but not the values of non-config properties.
  • There are slightly different naming conventions for properties and configs depending on the file type. Refer to naming convention for more details.
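
As a rough sketch of the config: selection method mentioned above, a YAML selector can pick resources by a config value. The selector name incremental_models is hypothetical, and the example assumes a selectors.yml file at the root of the project.

```yaml
# selectors.yml (the selector name is hypothetical)
selectors:
  - name: incremental_models
    description: Resources whose materialized config is set to incremental
    definition:
      method: config.materialized   # select on a config value
      value: incremental
```

A selector like this would be used with dbt ls --selector incremental_models. The command-line equivalent is --select config.materialized:incremental.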

A rule of thumb: properties declare things about your project resources; configs go the extra step of telling dbt how to build those resources in your warehouse. This is generally true, but not always, so it's always good to check!

For example, you can use resource properties to:

  • Describe models, snapshots, seed files, and their columns
  • Assert "truths" about a model, in the form of data tests, e.g. "this id column is unique"
  • Define pointers to existing tables that contain raw data, in the form of sources, and assert the expected "freshness" of this raw data
  • Define official downstream uses of your data models, in the form of exposures

Whereas you can use configurations to:

  • Change how a model will be materialized (table, view, incremental, etc.)
  • Declare where a seed will be created in the database (<database>.<schema>.<alias>)
  • Declare whether a resource should persist its descriptions as comments in the database
  • Apply tags and "meta" properties
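
The configurations listed above might look like this when set for many resources at once. This is a minimal sketch of a dbt_project.yml excerpt; the project name jaffle_shop is borrowed from later in this page, and every value (schema name, tag, owner) is illustrative.

```yaml
# dbt_project.yml (excerpt; all values are illustrative)
models:
  jaffle_shop:
    +materialized: table      # how models are built in the warehouse
    +persist_docs:
      relation: true          # persist model descriptions as comments
      columns: true           # persist column descriptions as comments
    +tags: ["nightly"]
    +meta:
      owner: "analytics-team"

seeds:
  jaffle_shop:
    +schema: seed_data        # custom schema for seeds
```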

Where can I define configs?

Depending on the resource type, you can define configurations in your dbt project, and also in an installed package, by:

  1. Using a config() Jinja macro within a model, snapshot, or test SQL file
  2. Using a config property in a .yml file in the models/, snapshots/, seeds/, analyses/, or tests/ directory
  3. Setting them in the dbt_project.yml file, under the corresponding resource key (models:, snapshots:, tests:, etc.)

Config inheritance

The most specific config always takes precedence. This generally follows the order above: an in-file config() block, then a config property defined in a .yml file, then a config defined in the project file.

Note - Generic data tests work a little differently when it comes to specificity. See test configs.

Within the project file, configurations are also applied hierarchically, and the most specific config again takes precedence. For example, configurations applied to a marketing subdirectory take precedence over configurations applied to the entire jaffle_shop project. To apply a configuration to a model or directory of models, define the resource path as nested dictionary keys.
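
A minimal sketch of the nested-key form, assuming the marketing subdirectory lives at models/marketing/ and that the materializations shown are only illustrative:

```yaml
# dbt_project.yml (excerpt)
name: jaffle_shop

models:
  jaffle_shop:
    +materialized: view      # applies to every model in the project
    marketing:
      +materialized: table   # more specific: models under models/marketing/
```

Under the rule above, models in the marketing subdirectory would be built as tables, while the rest of the project would keep the view materialization.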

Configurations in your root dbt project have higher precedence than configurations in installed packages. This enables you to override the configurations of installed packages, providing more control over your dbt runs.
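
For instance, a root project could override a config for an installed package's models like this. The package name some_package and the schema value are hypothetical.

```yaml
# dbt_project.yml of the root project (excerpt)
models:
  jaffle_shop:
    +materialized: view
  some_package:              # hypothetical installed package name
    +schema: package_models  # takes precedence over the package's own setting
```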

Combining configs

Most configurations are "clobbered" when applied hierarchically. Whenever a more specific value is available, it will completely replace the less specific value. Note that a few configs have different merge behavior:

  • tags are additive. If a model has some tags configured in dbt_project.yml, and more tags applied in its .sql file, the final set of tags will include all of them.
  • meta dictionaries are merged (a more specific key-value pair replaces a less specific value with the same key)
  • pre-hook and post-hook are also additive.
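
The merge rules above can be sketched with two files, shown here as two YAML documents. The model name, file names, and all values are hypothetical, and the comments describe the outcome implied by the rules above.

```yaml
# dbt_project.yml (excerpt)
models:
  jaffle_shop:
    +materialized: view
    +tags: ["nightly"]
    +meta:
      owner: "analytics-team"
      tier: "bronze"
---
# models/customers.yml (hypothetical properties file)
version: 2

models:
  - name: customers
    config:
      materialized: table   # clobbered: the final value is table
      tags: ["pii"]         # additive: the final tags are nightly and pii
      meta:
        tier: "gold"        # merged: owner is kept, tier becomes gold
```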

Where can I define properties?

You declare properties in .yml files (referred to here as properties.yml files) that live in the same directory as your resources. You can name these files whatever_you_want.yml and nest them arbitrarily in sub-folders within each directory.

We highly recommend that you define properties in dedicated paths alongside the resources they're describing.

info

schema.yml files

Previous versions of the docs referred to these as schema.yml files. We've moved away from that terminology because the word schema means other things when talking about databases, and people often thought that these files had to be named schema.yml.

Instead, we now refer to these files as properties.yml files. (Of course, you're still free to name your files schema.yml)

Which properties are not also configs?

In dbt, you can define node configs in properties.yml files, in addition to config() blocks and dbt_project.yml. However, some special properties can only be defined in the .yml file; you cannot set them using config() blocks or the dbt_project.yml file.

These properties are special because:

  • They have a unique Jinja rendering context
  • They create new project resources
  • They don't make sense as hierarchical configuration
  • They're older properties that haven't yet been redefined as configs

These properties are:

Example

Here's an example that defines both sources and models for a project:

models/jaffle_shop.yml
version: 2

sources:
  - name: raw_jaffle_shop
    description: A replica of the postgres database used to power the jaffle_shop app.
    tables:
      - name: customers
        columns:
          - name: id
            description: Primary key of the table
            tests:
              - unique
              - not_null

      - name: orders
        columns:
          - name: id
            description: Primary key of the table
            tests:
              - unique
              - not_null

          - name: user_id
            description: Foreign key to customers

          - name: status
            tests:
              - accepted_values:
                  values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']


models:
  - name: stg_jaffle_shop__customers
    config:
      tags: ['pii']
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null

  - name: stg_jaffle_shop__orders
    config:
      materialized: view
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
              config:
                severity: warn


You can find an exhaustive list of each supported property and config, broken down by resource type:

FAQs

Does my `.yml` file containing tests and descriptions need to be named `schema.yml`?
If I can name these files whatever I'd like, what should I name them?
Should I use separate files to declare resource properties, or one large file?
Can I add tests and descriptions in a config block?
Why do model and source yml files always start with `version: 2`?
Can I use a YAML file extension?

Troubleshooting common errors

Invalid test config given in [model name]

This error occurs when your .yml file does not conform to the structure expected by dbt. A full error message might look like:

* Invalid test config given in models/schema.yml near {'namee': 'event', ...}
Invalid arguments passed to "UnparsedNodeUpdate" instance: 'name' is a required property, Additional properties are not allowed ('namee' was unexpected)

While verbose, an error like this should help you track down the issue. Here, the name field was provided as namee by accident. To fix this error, ensure that your .yml conforms to the expected structure described in this guide.
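
A corrected fragment might look like the following. This is only a sketch: the error message does not show the surrounding file, so the models: parent key is an assumption.

```yaml
# models/schema.yml (the models: parent key is assumed)
version: 2

models:
  - name: event   # was: namee: event
```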

Invalid syntax in your schema.yml file

If your .yml file is not valid YAML, then dbt will show you an error like this:

Runtime Error
  Syntax error near line 6
  ------------------------------
  5 |   - name: events
  6 |     description; "A table containing clickstream events from the marketing website"
  7 |

  Raw Error:
  ------------------------------
  while scanning a simple key
    in "<unicode string>", line 6, column 5:
      description; "A table containing clickstream events from the marketing website"
      ^

This error occurred because a semicolon (;) was accidentally used instead of a colon (:) after the description field. To resolve issues like this, find the .yml file referenced in the error message and fix any syntax errors present in the file. There are online YAML validators that can be helpful here, but please be mindful of submitting sensitive information to third-party applications!
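
For reference, the corrected lines would read as follows. The enclosing models: key is an assumption, since the error output shows only the list item.

```yaml
# Corrected fragment (the models: parent key is assumed)
models:
  - name: events
    description: "A table containing clickstream events from the marketing website"  # colon, not semicolon
```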
