Preserve column order in union macro

Describe the feature

I’d like to suggest a new feature for the union_relations macro: would it be possible to preserve the order of the columns when one of the models lacks a column, instead of sending that column to the end of the table? Currently, if one of the tables used as a source in the union macro does not have a column that appears in the other table, the column is sent to the right of the table instead of appearing where it does in the source table.

Ex:

table 1

id name email

table 2

id email

unioned table (current display)

id name email

instead of (ideal display)

id email name

Describe alternatives you’ve considered

N/A: any alternative would defeat the purpose of using this macro

Who will this benefit?

Analysts building dbt models, including myself: the order of columns often matters (columns on similar topics are often grouped together)

Are you interested in contributing this feature?

Yes, happy to!

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
chloe-lubin commented, Apr 8, 2022

Hi @joellabes, thanks so much for sharing! I just tried it with the 2 models I’m looking to union and can confirm that the columns appear in order after reversing the model references. Would it be possible to make a note in the documentation?

1 reaction
joellabes commented, Apr 8, 2022

Thanks for this @chloe-lubin! I just did some experimentation, and want to share a bit of context for how it currently works.

The macro iterates over each ref that’s passed into the macro, in order. I made two basic models:

--model1.sql
select '1' as col1, '2' as col2, '3' as col3, '4' as col4, '100' as col100
--model2.sql
select 'd' as col4, 'a' as col1, 'b' as col2, 'z' as col20

When passing them into the macro, model1 then model2, I get this result: [image: query result]

When doing model2 then model1, I get this: [image: query result]

Note that in the second example, col4 comes first, and the columns that don’t appear in model2 (col3 and col100) appear last.
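
The ordering described above can be sketched in a few lines of Python. This is only a simulation of the first-seen column ordering, not the macro's actual source: the function name `union_column_order` is hypothetical, each relation is reduced to a plain list of column names, and any extra column the real macro may add (such as a source-relation column) is ignored.

```python
def union_column_order(relations):
    """Return column names in first-seen order across relations.

    `relations` is a list of column-name lists, one per relation, given in
    the same order the refs would be passed to the macro. (Hypothetical
    helper for illustration; not part of dbt-utils.)
    """
    ordered = []
    seen = set()
    for columns in relations:
        for name in columns:
            # A column keeps the position of its first appearance;
            # later relations can only append columns not seen yet.
            if name not in seen:
                seen.add(name)
                ordered.append(name)
    return ordered


# Column lists taken from the two demo models above.
model1 = ["col1", "col2", "col3", "col4", "col100"]
model2 = ["col4", "col1", "col2", "col20"]

model1_first = union_column_order([model1, model2])
model2_first = union_column_order([model2, model1])

print(model1_first)  # ['col1', 'col2', 'col3', 'col4', 'col100', 'col20']
print(model2_first)  # ['col4', 'col1', 'col2', 'col20', 'col3', 'col100']
```

Under the same assumption, applying the function to the example from the issue with table 2 (id, email) first and table 1 (id, name, email) second yields id, email, name.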

Related: https://github.com/dbt-labs/dbt-utils/issues/395 proposes upcasting columns when they are of incompatible types; I have cast my numbers as strings for the sake of this demo.

So for your example, you could get the desired outcome by doing {{ dbt_utils.union_relations([ref('table2'), ref('table1')]) }} instead of table1 then table2.

From what I’ve seen so far, I’m pretty comfortable with the current behaviour, because users can get the column order they want pretty easily. If your example was cut down to a minimal version to demonstrate the point, it’d be useful for you to share the actual case you’re trying to achieve (e.g. many more columns, or multiple relations being unioned together).

Let me know what you think!
