Unable to query delta table version from Athena with SQL

Hello Delta team, I would like to clarify if the above scenario is actually a possibility.

Here are our current scenario steps:

Tooling Version:

  • AWS Glue - 3.0
  • Python version - 3
  • Spark version - 3.1
  • Delta.io version - 1.0.0

From AWS Glue jobs running PySpark code, we make several overwrite operations to a delta table, as follows:

df.write.format("delta").mode("overwrite").save(target_s3_path)

The operation succeeds with no issues.

We were also able to run the following operations successfully:

from delta.tables import DeltaTable

deltaTable = DeltaTable.forPath(spark, target_s3_path)

fullHistoryDF = deltaTable.history()        # full history of the table
lastOperationDF = deltaTable.history(1)     # the most recent operation only
preLastOperationDF = deltaTable.history(2)  # the two most recent operations

Those calls also succeeded, and we were able to read their content and check the multiple versions of the Delta table that had been written.

Here is the key concern: from PySpark code, we had no issues reading any specific version of the table, as follows:

spark.read.format("delta").option("versionAsOf", 2).load(target_s3_path)

We would like to run a very similar SQL query through Athena instead, in order to retrieve a specific version of a table, for example:

SELECT * FROM "delta_db"."delta_table" VERSION AS OF 2;

But running this syntax in Athena results in the following error: line 1:71: mismatched input 'AS'. Expecting: '(', ',', 'CROSS', 'EXCEPT', 'FULL', 'GROUP', 'HAVING', 'INNER', 'INTERSECT', 'JOIN', 'LEFT', 'LIMIT', 'NATURAL', 'OFFSET', 'ORDER', 'RIGHT', 'TABLESAMPLE', 'UNION', 'WHERE', <EOF>

Note: The manifest file creation step and the CREATE EXTERNAL TABLE step in Athena, both described in https://docs.delta.io/latest/presto-integration.html#presto-and-athena-to-delta-lake-integration, were also executed before the above SQL query attempt.
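
For readers following along, the two setup steps mentioned in the note can be sketched roughly as below. This is a minimal sketch based on the linked integration guide, not the exact code used in this issue: the helper names `generate_manifest` and `athena_ddl`, the column mapping, and the example S3 path are hypothetical placeholders.

```python
def generate_manifest(spark, table_path: str) -> None:
    """Write symlink manifest files for the table (requires Spark and delta-spark)."""
    # Imported lazily so the DDL helper below can be used without Spark installed.
    from delta.tables import DeltaTable

    DeltaTable.forPath(spark, table_path).generate("symlink_format_manifest")


def athena_ddl(database: str, table: str, columns: dict, table_path: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement pointing Athena at the manifest directory."""
    # `columns` maps column name -> Athena/Hive type; it is an illustrative input.
    cols = ", ".join(f"`{name}` {dtype}" for name, dtype in columns.items())
    location = table_path.rstrip("/") + "/_symlink_format_manifest/"
    return (
        f"CREATE EXTERNAL TABLE {database}.{table} ({cols})\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'\n"
        "STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'\n"
        "OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'\n"
        f"LOCATION '{location}'"
    )
```

The manifest lists only the data files of the table snapshot that existed when it was generated, so a table registered this way exposes a single version to Athena rather than the full Delta history.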

Any clarification regarding the matter would be really appreciated! Thanks in advance!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
zsxwing commented, Dec 6, 2021

AFAIK, the VERSION AS OF syntax is not supported by Presto/Athena. https://github.com/prestodb/presto/pull/16843 is building a native Presto connector which will support time travel like select * from mytable@v123.
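
One possible workaround, which is not proposed in the thread itself, is to materialize the wanted version with Spark and expose that copy to Athena as a separate table. The sketch below assumes this approach; `snapshot_path` and `materialize_version` are hypothetical helper names, and the `_v<version>` path layout is an arbitrary choice for illustration.

```python
def snapshot_path(base_path: str, version: int) -> str:
    """Derive a sibling path for the copy of one table version (hypothetical layout)."""
    return f"{base_path.rstrip('/')}_v{version}"


def materialize_version(spark, source_path: str, version: int) -> str:
    """Copy one Delta version to its own path and generate a manifest for it."""
    # Imported lazily so snapshot_path can be used without Spark installed.
    from delta.tables import DeltaTable

    target = snapshot_path(source_path, version)

    # Time travel works on the Spark side, as shown in the question.
    df = spark.read.format("delta").option("versionAsOf", version).load(source_path)
    df.write.format("delta").mode("overwrite").save(target)

    # Athena reads the copy through its own symlink manifest.
    DeltaTable.forPath(spark, target).generate("symlink_format_manifest")
    return target
```

The copy then needs its own CREATE EXTERNAL TABLE statement in Athena, and it duplicates the data, so this only suits cases where a few specific versions have to be queried.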

0 reactions
dennyglee commented, Jan 5, 2022

Closing this issue; please re-open it if you have any other questions.
