Creating dbt Models Without the Default Schema Prefix

In dbt (Data Build Tool), the generate_schema_name macro is used to define the schema where your models will be built. By default, dbt uses a schema defined in profiles.yml. If object has its own schema in dbt_project.yml then the target schema will be prefixed with the one defined in profiles.yml. However, there are scenarios where this default behavior may not align with your data architecture or business requirements.

For example, you might want all your models to be created in a specific schema without any prefixes. To achieve this, you can override the generate_schema_name macro using Jinja.

Jinja

Default Behavior of generate_schema_name

Say you have the following configuration in the profiles.yml

# profiles.yml
my_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: my_account
      user: my_user
      password: my_password
      role: my_role
      database: my_database
      warehouse: my_warehouse
      schema: analytics

You want to override schema in dbt_project.yml like this:

# dbt_project.yml
name: my_project

# This setting configures which "profile" dbt uses for this project.
profile: my_profile

require-dbt-version: 1.4.0

models:
  my_project:
    my_model:
      +schema: finance
      +tags:
        - finance

After running dbt, you realize that my_model was created in analytics_finance schema in the database whereas you wanted it to be created in finance schema

Overriding the Macro

To override the default behavior and specify your custom schema logic, follow these steps:

Step 1: Create a Custom generate_schema_name Macro

In your dbt project, navigate to the macros folder (create one if it doesn’t exist). Then, create a new file, e.g., generate_schema_name.sql, and define your custom macro:

{%- macro generate_schema_name(custom_schema_name, node) -%}
    {{ custom_schema_name or target.schema }}
{%- endmacro -%}

Step 2: Save the Macro

Save this file in the macros directory of your dbt project. dbt automatically picks up the overridden macro.

Step 3: Test the New Behavior

Run dbt run or dbt compile and inspect the generated SQL files to confirm that your models are being created in the desired schema without any prefixes.

Advanced Customizations

You can further enhance the macro logic to handle specific requirements:

1. Environment-Specific Schemas

Assign different schemas based on the target environment (e.g., dev, staging, prod):

{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if target.name == 'dev' -%}
        {{ 'dev_schema' }}
    {%- if target.name == 'staging' -%}
        {{ 'staging_schema' }}
    {%- else -%}
        {{ custom_schema_name or target.schema }}
    {%- endif -%}
{%- endmacro %}

2. Dynamic Schemas for Different Models

Use node attributes (e.g., tags) to route models to specific schemas:

{% macro generate_schema_name(custom_schema_name, node) %}
    {% if 'finance' in node.tags %}
        {{ 'finance_schema' }}
    {% elif 'marketing' in node.tags %}
        {{ 'marketing_schema' }}
    {% else %}
        {{ custom_schema_name or target.schema }}
    {% endif %}
{% endmacro %}

Overriding the generate_schema_name macro in dbt is a simple yet powerful way to gain control over your schema naming conventions. By customizing this macro, you can align dbt’s behavior with your organizational standards and simplify data governance.

More From Author

Leave a Reply

Recent Comments

No comments to show.