question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[BUG] Fix DeltaTableBuilder metadata configuration

See original GitHub issue

Bug

Describe the problem

Currently, when DeltaTableBuilder is given a property that does not start with a delta. prefix, then that property is written to the table metadata in all lowercase.

Steps to reproduce

io.delta.tables.DeltaTable.create()
  .addColumn("bar", StringType)
  .property("dataSkippingNumIndexedCols", "33")
  .tableName("my_table")
  .execute()

Observed results

This will incorrectly write dataskippingnumindexedcols to the delta log metadata.

Expected results

We want to preserve the case of table properties. dataSkippingNumIndexedCols should be written to the metadata.

Implementation Requirement

In addition to fixing the above issue, we also need to be backwards compatible and be able to read older tables with these invalid properties.

As well, you need to include a flag that can be enabled to roll back to this previous behaviour.

After you fix this issue, please update the test and test output table here: https://github.com/delta-io/delta/blob/master/core/src/test/scala/org/apache/spark/sql/delta/DeltaWriteConfigsSuite.scala#L316

Also, please add tests with the feature flag enabled/disabled, as well as by reading an “older” table with these invalid delta properties (this should go into EvolvabilitySuite).

Issue Analytics

  • State:closed
  • Created a year ago
  • Comments:7 (6 by maintainers)

github_iconTop GitHub Comments

1reaction
zsxwingcommented, Jun 21, 2022

We only want properties that are prefixed with delta. to be written to the delta log, and we obviously want to preserve their case.

As this API is for setting table properties rather than options, we should allow arbitrary table properties. What we should fix is making it preserve the case.

0reactions
zsxwingcommented, Jul 5, 2022

@edmondo1984 done. Thanks for offering the help!

Read more comments on GitHub >

github_iconTop Results From Across the Web

Table batch reads and writes - Delta Lake Documentation
Table metadata. DESCRIBE DETAIL. DESCRIBE HISTORY. Configure SparkSession. Configure storage credentials. Spark configurations. SQL session configurations.
Read more >
Make Your Data Lakehouse Run, Faster With Delta Lake 1.1
Support for setting user metadata in the commit information when creating or replacing tables. Fix for an incorrect analysis exception in MERGE ...
Read more >
Azure Synapse Runtime for Apache Spark 3.3 is now in Public ...
It has a more intuitive API and a simpler configuration process. ... Fix for DeltaTableBuilder to preserve table property case of non-delta ...
Read more >
Merge Operation - The Internals of Delta Lake
Please note that the above commands leave us with an empty Delta table. Let's fix it. Scala SQL. import ...
Read more >
Configure metadata processing for Delta tables - Unravel Data
If Delta table processing fails, check the following error logs in the delta_file_handoff.log file located at /opt/unravel/logs/ . INFO table.
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found