[BUG] `LogStore` class not found, from pyspark


Bug

Describe the problem

I’ve upgraded python delta-spark to 1.2.0 (and pyspark to 3.3.0). When I run pyspark, I see the below error (in observed results).

I believe as part of #954, the LogStore class was changed to now use io.delta.storage.S3SingleDriverLogStore.

The problem appears to be that the LogStore class isn’t present in the Python artefact, which leads to the error below. Is this an oversight in the packaging, or should I be specifying an extra dependency to bring in this class?

I also see errors about the S3SingleDriverLogStore, which I assume are consequences of this.

Steps to reproduce

Upgrade to delta-spark 1.2.0, and run via python (with the default logger).

Observed results

py4j.protocol.Py4JJavaError: An error occurred while calling o2176.execute.
: com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: io/delta/storage/LogStore

Expected results

A successful run, using the LogStore class to log to S3.

Further details

n/a

Environment information

  • Delta Lake version: 1.2.0
  • Spark version: 3.2.1
  • Scala version: n/a

Willingness to contribute

The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the Delta Lake community.
  • No. I cannot contribute a bug fix at this time.

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 19 (4 by maintainers)

Top GitHub Comments

2 reactions
zsxwing commented, Jun 23, 2022

@hustnn so you have a delta-core jar in Spark’s jars directory? If so, this makes sense. This line https://github.com/delta-io/delta/blob/v1.2.1/core/src/main/scala/org/apache/spark/sql/delta/storage/DelegatingLogStore.scala#L159 requires that delta-core and delta-storage be loaded in the same way: either both exist in the jars directory, or both are loaded through --packages.
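One way to meet the "both loaded through --packages" condition from PySpark is to keep Spark's jars directory free of Delta jars and request delta-core through `spark.jars.packages`. The following is a minimal sketch, not a confirmed fix from the thread. It assumes the Scala 2.12 build and that delta-storage is resolved as a transitive dependency of delta-core, which is worth verifying for your version. The helper names `delta_package_conf` and `apply_conf` are hypothetical.

```python
def delta_package_conf(delta_version, scala_version="2.12"):
    """Build Spark settings that load Delta through spark.jars.packages."""
    return {
        # Same effect as passing --packages on the pyspark / spark-submit command line.
        "spark.jars.packages": f"io.delta:delta-core_{scala_version}:{delta_version}",
        # Standard Delta Lake session settings.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def apply_conf(builder, conf):
    """Apply each setting to a builder object that exposes .config(key, value)."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder
```

With pyspark installed, this could be used as `apply_conf(SparkSession.builder.appName("demo"), delta_package_conf("1.2.0")).getOrCreate()`. Keeping the settings in a plain dictionary makes it easy to print them and compare against what is actually on the cluster.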

1 reaction
zsxwing commented, Aug 29, 2022

@0xdarkman could you check your Spark’s jars directory and see if there are any delta jars there?
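The check suggested above can be scripted. This is a rough sketch that lists Delta jars in a given directory and flags the case where delta-core is present without delta-storage, or the reverse. The function names are hypothetical, and the `delta-*.jar` file-name pattern is an assumption about how the jars are named.

```python
from pathlib import Path


def find_delta_jars(jars_dir):
    """Return the sorted file names of jars in jars_dir that start with 'delta-'."""
    directory = Path(jars_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("delta-*.jar"))


def describe_delta_jars(jar_names):
    """Summarise whether delta-core and delta-storage appear together."""
    has_core = any(name.startswith("delta-core") for name in jar_names)
    has_storage = any(name.startswith("delta-storage") for name in jar_names)
    if has_core and has_storage:
        return "both delta-core and delta-storage are in the jars directory"
    if has_core:
        return "delta-core is in the jars directory but delta-storage is not"
    if has_storage:
        return "delta-storage is in the jars directory but delta-core is not"
    return "no delta-core or delta-storage jars in the jars directory"
```

A typical call would be `describe_delta_jars(find_delta_jars(os.path.join(os.environ["SPARK_HOME"], "jars")))` after `import os`. For a pip-installed pyspark, the jars directory normally sits inside the installed `pyspark` package instead.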
