Spark can't read Iceberg table created from Presto
Spark can't read a table which was created in Presto:

```sql
create table iceberg.examples.test_table
with (format = 'parquet')
as select timestamp '2021-01-19 23:59:59.999999' as ts;
```
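One possible workaround on the Presto side is to store `timestamp with time zone` instead, which maps cleanly onto Spark's timestamp type. A hedged sketch (the table name is hypothetical; `with_timezone` is Presto's function for attaching a zone to a plain timestamp):

```sql
-- Hypothetical workaround: store a zoned timestamp, which Spark can read.
create table iceberg.examples.test_table_tz
with (format = 'parquet')
as select with_timezone(timestamp '2021-01-19 23:59:59.999999', 'UTC') as ts;
```

This changes the semantics of the column (it becomes an absolute instant rather than wall-clock fields), so it only fits when that interpretation is acceptable.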
In Spark:

```scala
spark.read
  .format("iceberg")
  .load("examples.test_table")
  .show(10, false)
```

fails with the exception:

```
java.lang.UnsupportedOperationException: Spark does not support timestamp without time zone fields
```
Please add support for the timestamp without time zone type to the Iceberg Spark runtime, because Iceberg supports this type.
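For reference, later Iceberg Spark runtimes (around 0.13) added a session property that lets the reader map Iceberg's timestamp without time zone onto Spark's timestamp type. This is a sketch, not a definitive fix: verify the property name against the runtime version you use, since it was removed again once Spark gained a native timestamp-without-zone type:

```sql
-- Assumes an Iceberg Spark runtime that supports the
-- handle-timestamp-without-timezone session property;
-- with it set, the read above no longer throws.
SET spark.sql.iceberg.handle-timestamp-without-timezone = true;
SELECT * FROM examples.test_table LIMIT 10;
```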
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 4
- Comments: 5 (2 by maintainers)
According to the Spark docs, TimestampType "represents values comprising values of fields year, month, day, hour, minute, and second, with the session local time-zone. The timestamp value represents an absolute point in time." So it's more like LocalDateTime. Why can't Iceberg expose timestamp without timezone as Spark timestamp?

As far as I understand, the problem is that Spark has only one timestamp type while Iceberg has two, which causes some inconsistency in the type mapping between Spark and Iceberg.
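To make the mismatch concrete: Iceberg distinguishes the two types at DDL time, while Spark (at the time of this issue) collapses everything into its single timestamp type. A minimal sketch in Presto SQL (the table name is hypothetical):

```sql
-- Hypothetical table showing Iceberg's two timestamp types side by side.
create table iceberg.examples.ts_demo (
    ts  timestamp,                 -- no zone: wall-clock fields, like LocalDateTime
    tsz timestamp with time zone   -- zoned: an absolute instant
)
with (format = 'parquet');
```

Reading `tsz` in Spark works, because it matches Spark's timestamp semantics; reading `ts` triggers the exception above.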
In my opinion it could be solved in the following way:
- Both Iceberg timestamp types (timestamp, timestamptz) could be exposed as Spark timestamp.
- The physical type written could be controlled by a configuration property, the same way Spark already does for Parquet (spark.sql.parquet.outputTimestampType=(INT96 | TIMESTAMP_MICROS | TIMESTAMP_MILLIS)). Why can't we have such logic for Iceberg?

Related issues: https://github.com/apache/iceberg/issues/2388 and https://github.com/apache/iceberg/issues/2244