Incorrect table definition after executing rollback_to_snapshot procedure

Apache Iceberg version

0.14.0 (latest release)

Query engine

Spark

Please describe the bug 🐞

Spark returns the latest table definition even after executing the rollback_to_snapshot procedure.

Steps to reproduce

> CREATE TABLE test USING iceberg AS SELECT 1 c1;
> ALTER TABLE test ADD COLUMN c2 int;
> INSERT INTO test VALUES (1, 1);
> SELECT * FROM iceberg_test.default.test.snapshots;
2022-08-19 07:32:29.499	2770581293596517273 ...
2022-08-19 07:32:50.006	6893045681966948046 ...

> DESC iceberg_test.default.test.snapshot_id_2770581293596517273;
c1                  	int

# Partitioning
Not partitioned

> CALL iceberg_test.system.rollback_to_snapshot('default.test', 2770581293596517273);
> DESC iceberg_test.default.test;
c1                  	int
c2                  	int

The result is the same even after I execute REFRESH TABLE iceberg_test.default.test following rollback_to_snapshot.

Relates to https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1660895079836159 and Trino issue https://github.com/trinodb/trino/issues/13699

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
rdblue commented, Aug 22, 2022

This isn’t a bug. Metadata and data updates are intended to be separate, although I can see why there are cases where you’d assume that they are not.

If you update the schema and commit data in a single job, then it isn’t unreasonable to assume the schema change would be rolled back. But if I concurrently add a column while someone else commits, then a rollback should be independent. Expectations can go both ways.

While expectations differ, Iceberg never rolls back to a previous schema because that operation is unsafe. For example, if someone deletes a required column and then tries to roll that back, there may have been data written without that column. You can recover the column, but you need to make it optional (or in the future, set a read default).
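
The separation described above can be illustrated with a toy model. This is a minimal sketch, not Iceberg's implementation: the `TableMetadata` and `Snapshot` classes and their methods are hypothetical, and the only assumption carried over is that the table tracks its current schema and its current snapshot as two independent pointers. Replaying the reproduction steps shows why DESC still lists c2 after the rollback.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Snapshot:
    snapshot_id: int
    schema_id: int  # schema that was current when this snapshot was committed


@dataclass
class TableMetadata:
    """Hypothetical, simplified stand-in for table metadata (not Iceberg code)."""
    schemas: Dict[int, List[str]] = field(default_factory=dict)
    snapshots: Dict[int, Snapshot] = field(default_factory=dict)
    current_schema_id: int = 0
    current_snapshot_id: Optional[int] = None

    def set_schema(self, columns: List[str]) -> None:
        # A schema change adds a schema and moves only the schema pointer.
        schema_id = len(self.schemas)
        self.schemas[schema_id] = list(columns)
        self.current_schema_id = schema_id

    def commit(self, snapshot_id: int) -> None:
        # A data commit adds a snapshot and moves only the snapshot pointer.
        self.snapshots[snapshot_id] = Snapshot(snapshot_id, self.current_schema_id)
        self.current_snapshot_id = snapshot_id

    def rollback_to_snapshot(self, snapshot_id: int) -> None:
        if snapshot_id not in self.snapshots:
            raise ValueError(f"unknown snapshot: {snapshot_id}")
        # Only the snapshot pointer moves; current_schema_id is left untouched.
        self.current_snapshot_id = snapshot_id

    def describe(self) -> List[str]:
        return self.schemas[self.current_schema_id]


table = TableMetadata()
table.set_schema(["c1"])
table.commit(2770581293596517273)        # CREATE TABLE test ... AS SELECT 1 c1
table.set_schema(["c1", "c2"])           # ALTER TABLE test ADD COLUMN c2 int
table.commit(6893045681966948046)        # INSERT INTO test VALUES (1, 1)
table.rollback_to_snapshot(2770581293596517273)
print(table.describe())                  # ['c1', 'c2']
```

In this model the older snapshot still remembers the schema it was written with, which is why describing that snapshot directly shows only c1 while describing the table shows both columns.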

0 reactions
findepi commented, Sep 5, 2022

From an end-user perspective, there is a difference between:

  1. querying table state at a given snapshot: at least in Trino, this uses the “schema current at that time”, so it includes columns that have been dropped since then

  2. querying table state after rollback_to_snapshot: if this uses the current schema, it doesn’t include columns that have been dropped since the snapshot

Now consider this example:

-- add new column
ALTER TABLE orders ADD COLUMN order_timestamp timestamp(6) with time zone;
-- fill in data for new column
UPDATE orders SET order_timestamp = CAST(json_value(order_data, '$.timestamp') AS timestamp(6) with time zone);
-- drop the now-redundant column
ALTER TABLE orders DROP COLUMN order_data;

-- imagine now that comparing this uncovered that `order_data` was encoded in a bad way, so we need to roll this all back
CALL rollback_to_snapshot(.....)

As a user, I would expect to see the order_data column back in my table. Per this issue, I understand this wouldn’t be the case. As a user, I would call that data loss (and so a bug).
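
The two read paths can be sketched side by side for the `orders` example. This is a toy model under stated assumptions, not engine code: the schema IDs, the snapshot IDs 100 and 200, and both helper functions are made up for illustration, and the time-travel path follows the Trino behavior described in point 1.

```python
from typing import Dict, List

# Hypothetical schema history for the `orders` example (IDs are invented).
schemas: Dict[int, List[str]] = {
    0: ["order_data"],                      # before ADD COLUMN
    1: ["order_data", "order_timestamp"],   # after ADD COLUMN
    2: ["order_timestamp"],                 # after DROP COLUMN order_data
}

# Hypothetical snapshots: snapshot id -> schema id current at commit time.
snapshot_schema_id: Dict[int, int] = {
    100: 0,  # data as it was before the migration
    200: 1,  # snapshot written by the UPDATE
}

# DROP COLUMN changed the schema without writing a snapshot.
current_schema_id = 2


def time_travel_columns(snapshot_id: int) -> List[str]:
    """Path 1: read a snapshot using the schema that was current at that time."""
    return schemas[snapshot_schema_id[snapshot_id]]


def current_table_columns() -> List[str]:
    """Path 2: read the table using the current schema, whatever the snapshot."""
    return schemas[current_schema_id]


# Rolling back to snapshot 100 moves only the snapshot pointer in this model,
# so current_schema_id stays at 2.
print(time_travel_columns(100))   # ['order_data']
print(current_table_columns())    # ['order_timestamp']
```

Under these assumptions, time travel to the old snapshot still exposes order_data, while the rolled-back table does not.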

cc @alexjo2144 @electrum
