Add a generate_database_name() macro

Describe the feature

In the current version (v0.14.0), there is a {{ generate_schema_name_for_env }} macro that works very well in dev mode: a production run writes to the specified schema, while a dev run writes all tables and views to a dev schema. In the same way, we need a {{ generate_database_name_for_env }} macro for when a database is configured in the dbt_project.yml file.

Currently, if my dbt_project.yml file has a models section that reads:

models:
   product:
      database: mart_db
      materialized: view
      schema: mart_schema

and in my profiles.yml file I have:

my_dbt:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: *******

      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"

      role: "ur_{{ env_var('DBT_USER') }}"
      database: dbt_dev

Then my models in dev mode will be written to mart_db instead of dbt_dev.
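
The difference between the current and the requested behaviour can be sketched in plain Python. This is only a model of the name resolution, not dbt code: the function names, the dict-based `target`, and the `analytics` prod database are hypothetical, and the sketch assumes the production target is named `prod`.

```python
from typing import Optional


def current_database_name(custom_database: Optional[str], target: dict) -> str:
    """Model of the behaviour described above: a configured database always wins."""
    if custom_database is None:
        return target["database"]
    return custom_database.strip()


def database_name_for_env(custom_database: Optional[str], target: dict) -> str:
    """Model of the requested behaviour: honour the configured database only in prod."""
    # Assumption: the production target is named "prod".
    if target["name"] == "prod" and custom_database is not None:
        return custom_database.strip()
    return target["database"]


dev = {"name": "dev", "database": "dbt_dev"}
prod = {"name": "prod", "database": "analytics"}  # hypothetical prod profile

print(current_database_name("mart_db", dev))   # mart_db
print(database_name_for_env("mart_db", dev))   # dbt_dev
print(database_name_for_env("mart_db", prod))  # mart_db
```

With the requested behaviour, the `database: mart_db` setting would only take effect for a prod run, and every dev run would fall back to the database from profiles.yml.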

Describe alternatives you’ve considered

Right now, to solve this, I've created an alternative ref macro called xref to override this behavior, but it feels a bit clunky, and I will have to tell our dbt devs to all use {{ xref('some table') }} instead of the built-in ref function.

Additional context

Not database specific, it’s a dbt issue.

Who will this benefit?

Anyone who wants to specify a set of production databases in their dbt_project.yml file, in the same way that they might already do for their schemas using the existing {{ generate_schema_name_for_env }} macro, but who also wants dbt to write all tables and views into a single database when in dev mode.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
whisperstream commented, Aug 22, 2019

Let me clean up my xref macro a bit and add some comments. In the meantime, I'll try to explain a bit better why I have to override the normal ref() function and return the final production table name instead.

To preface this: what I describe below (and the condition in xref()) only happens when target.name != prod AND env_var('DEVELOPER_TYPE') == tier2; otherwise dbt runs as normal.

Normally in dbt you create a hierarchy of models, i.e.

[actual raw table] -> model_raw -> model_stage -> model_bizready -> model_mart

This assumes you have access to [actual raw table], but in my case tier2 developers only have access to [production bizready table] (created by a previous production run of dbt).

So when a tier2 developer executes:

dbt run -m +model_mart

dbt will try and execute the whole DAG, which will fail because tier2 can’t access the [actual raw table]. To work around this, my xref() macro figures out that xref(model_bizready) should not keep traversing the DAG as normal, but instead should rewrite the DAG to reference the production table that the tier2 dev does have access to. So for xref(model_bizready) the DAG becomes:

[production bizready table] -> model_mart

That way the tier2 developer doesn't need access to bizready or anything preceding it in the DAG (they don't have access rights to those anyway), but they're still able to contribute to and execute dbt for any mart models they wish to work on.

Conversely, a tier1 developer or a production process executing the same dbt run -m +model_mart will execute the whole DAG normally, i.e. [actual raw table] -> model_raw -> model_stage -> model_bizready -> model_mart
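
The effect of the rewrite on the DAG can be illustrated with a small plain-Python sketch. It is a model of the traversal only, not how dbt builds its graph: the `PARENTS` table and the `nodes_to_run` helper are hypothetical, and the `cut_at` set stands in for the references that xref() points at production.

```python
from typing import Dict, List, Set

# Parent edges for the example lineage; "actual_raw_table" is the source table.
PARENTS: Dict[str, List[str]] = {
    "model_mart": ["model_bizready"],
    "model_bizready": ["model_stage"],
    "model_stage": ["model_raw"],
    "model_raw": ["actual_raw_table"],
    "actual_raw_table": [],
}


def nodes_to_run(selected: str, cut_at: Set[str]) -> List[str]:
    """Return the selected node plus its ancestors, never walking past nodes in cut_at."""
    visited: List[str] = []
    stack = [selected]
    while stack:
        node = stack.pop()
        if node in cut_at:
            # Read the existing production table instead of rebuilding this node.
            continue
        if node not in visited:
            visited.append(node)
            stack.extend(PARENTS[node])
    return visited


# tier1 or production: nothing is cut, so the whole lineage is needed.
print(nodes_to_run("model_mart", set()))
# tier2: the bizready reference is cut, so only the mart model is run.
print(nodes_to_run("model_mart", {"model_bizready"}))
```

In the tier2 case nothing upstream of model_bizready is visited, so the raw table is never touched.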

0 reactions
whisperstream commented, Sep 9, 2019

@drewbanin

OK, here's what my xref macro is looking like.

All bizready tables are prefixed with br_, so a sample table might be br_s1__some_dataset.

In prod mode, or when DEVELOPER_TYPE is RAW_DEV, xref just acts like ref. But if the developer mode is BIZREADY_DEV and a br_ table is being referenced, it rewrites the table reference to point to production. If target != prod, I always do the mapping, because I need xref to fail if anyone adds a new bizready schema. Anyway, hopefully the use case makes sense. The database name override isn't the only piece I'd need to do this more elegantly; I'd also need to somehow get the custom schema name from the config.

Originally I was getting the custom schema name only if the developer mode was BIZREADY_DEV. However, dbt creates the schemas up front rather than only when something is written to them. This meant that if two devs were running dbt at the same time, one would error out because dbt wouldn't be able to create the custom schemas again (the first dev would already have created them). That's why I have the clumsy lookup table.

Not sure if you have some better ways I might solve this use case, but being able to override the ref function is probably the first step, and then maybe having more ways to look up information about a reference name, i.e. to see if it has a custom schema, a custom database configured, etc.?

{%- macro xref(package_name, table_name=None) -%}

    {%- if table_name is none -%}
        {%- set table_name = package_name -%}
        {%- set package_name = None -%}
    {%-  endif -%}

    {# -- only do this for bizready tables #}
    {%- if (package_name is none and target.name == 'prod') or not table_name.startswith('br_') -%}
        {{- return( ref(package_name, table_name) ) -}}
    {%- endif -%}

    {%- set orig_ref = ref(table_name) -%}
    {%- set final_table = orig_ref -%}

    {%- set prefix_mapping = {
            's1':  'schema1',
            's2':  'schema2',
            's3':  'schema3'
        }
    -%}

    {# -- lookup the mapping and get the schema name -- #}
    {%- set db_schema = prefix_mapping[table_name.split('__')[0][3:]] if (table_name|string).startswith('br_') else None -%}

    {%- if var('DEVELOPER_TYPE') == 'BIZREADY_DEV' -%}

        {%- set db = var('BIZREADY_PROD_DB') -%}

        {%- set final_table = db + '.' + db_schema + '.' + table_name -%}

        {{- log('[' ~ var('DEVELOPER_TYPE') ~ ' MODE] - Converted original table ref: ' ~ table_name
            ~ ' from expected conversion: ' ~ orig_ref ~ ' to: ' ~ final_table) -}}

    {%- else -%}

        {{- log('No changes made for table: ' ~ table_name ~ ' and ref: ' ~ orig_ref) -}}

    {%- endif -%}

    {{- log('Returning final table:' ~ final_table) -}}

    {{- final_table -}}

{%- endmacro -%}
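
To make the schema lookup in the macro above concrete, here is a plain-Python sketch of the same string handling. The `production_relation` helper and the `PROD_DB` value (standing in for `var('BIZREADY_PROD_DB')`) are hypothetical; only the prefix mapping and the split/slice expression come from the macro.

```python
PREFIX_MAPPING = {"s1": "schema1", "s2": "schema2", "s3": "schema3"}


def production_relation(table_name: str, prod_db: str) -> str:
    """Mirror the macro's lookup: 'br_<prefix>__<rest>' -> '<db>.<schema>.<table>'."""
    if not table_name.startswith("br_"):
        raise ValueError(f"not a bizready table: {table_name}")
    # 'br_s1__some_dataset' -> 'br_s1' -> 's1'
    prefix = table_name.split("__")[0][3:]
    # In this Python model an unmapped prefix raises KeyError.
    schema = PREFIX_MAPPING[prefix]
    return f"{prod_db}.{schema}.{table_name}"


print(production_relation("br_s1__some_dataset", "PROD_DB"))
# PROD_DB.schema1.br_s1__some_dataset
```

The KeyError for an unknown prefix corresponds to the stated intent that xref should fail when someone adds a new bizready schema without updating the lookup table.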