Ephemeral model bug: nested CTE cannot reference previous CTE

Given a query like:

with my_first_cte as (

  with my_sub_cte as (
  
    select 1 as fun
    
  )
  
  select * from my_sub_cte
  
),

my_second_cte as (

  with my_next_sub_cte as (
  
    select * from my_first_cte
    
  )
  
  select * from my_next_sub_cte
  
)

select * from my_second_cte

Spark returns the following error:

ERROR processing query/statement. Error Code: 0, SQL state: org.apache.spark.sql.AnalysisException: Table or view not found: my_first_cte; line 17 pos 18

This prevents us from chaining multiple ephemeral models in a dependency line, since the second CTE (ephemeral model) includes sub-CTEs that reference the first CTE (ephemeral model).

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
jtcohen6 commented, Aug 25, 2020

This bug is actually resolved in Spark 3.x, which offers greater support for nested CTEs/subqueries. While it’s still an issue in Spark 2.x, I’m going to close this, as we don’t plan to make any code changes.

2 reactions
drewbanin commented, Feb 11, 2020

@aaronsteers fyi, check out https://github.com/fishtown-analytics/dbt/issues/1248 and https://github.com/fishtown-analytics/dbt/pull/1283

I think sql parsing + inlining CTEs is a cool idea - it would definitely give us the best of both worlds:

  1. clean source SQL using CTEs
  2. valid compiled SQL that actually runs on spark 😃

But, dbt doesn’t do very much SQL parsing at all, and we generally tend to bet on databases better supporting the SQL standard over time rather than trying to work around their current limitations.

I don’t think we have any immediate plans to do this type of SQL parsing/rewriting, but it could definitely be in scope for the future. For the moment, the shortest path to supporting ephemeral models is probably going to be a subquery-based implementation.
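
For illustration only, here is a hand-written sketch of what a subquery-based form of the query from the issue could look like. It is not output from dbt and not necessarily how dbt would implement it; it simply replaces each CTE with an aliased subquery (reusing the original CTE names as aliases) so that no nested `with` clause has to resolve a name defined in an outer `with`.

```sql
-- Hand-written illustration, not dbt-generated SQL.
-- Each CTE from the original query becomes an aliased subquery.
select *
from (
  -- was: my_second_cte
  select *
  from (
    -- was: my_next_sub_cte
    select *
    from (
      -- was: my_first_cte
      select *
      from (
        -- was: my_sub_cte
        select 1 as fun
      ) as my_sub_cte
    ) as my_first_cte
  ) as my_next_sub_cte
) as my_second_cte
```

The trade-off is the one described above: the compiled SQL no longer depends on nested CTE name resolution, but it is harder to read than the CTE-based source.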

