[PySpark 3.2.0] (java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class)
See original GitHub issue
PySpark version: 3.2.0
Python version: 3.9.7
Reproduction Steps:
- Start the PySpark shell with:
$ pyspark --packages io.delta:delta-core_2.12:1.0.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
- Try reading a Delta table (this one was recently created using PySpark 3.1.2)
Error:
>>> delta_fs_root = '/my_delta_path'
>>> df = spark.read.format("delta").load(f"{delta_fs_root}/topic")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/spark/python/pyspark/sql/readwriter.py", line 158, in load
return self._df(self._jreader.load(path))
File "/usr/local/spark/python/lib/py4j-0.10.9.2-src.zip/py4j/java_gateway.py", line 1309, in __call__
File "/usr/local/spark/python/pyspark/sql/utils.py", line 111, in deco
return f(*a, **kw)
File "/usr/local/spark/python/lib/py4j-0.10.9.2-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o40.load.
: com.google.common.util.concurrent.ExecutionError: java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4789)
at org.apache.spark.sql.delta.DeltaLog$.apply(DeltaLog.scala:464)
at org.apache.spark.sql.delta.DeltaLog$.forTable(DeltaLog.scala:401)
at org.apache.spark.sql.delta.catalog.DeltaTableV2.deltaLog$lzycompute(DeltaTableV2.scala:73)
at org.apache.spark.sql.delta.catalog.DeltaTableV2.deltaLog(DeltaTableV2.scala:73)
at org.apache.spark.sql.delta.catalog.DeltaTableV2.toBaseRelation(DeltaTableV2.scala:139)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:177)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:243)
at scala.Option.map(Option.scala:230)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:188)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class
at java.base/java.lang.ClassLoader.defineClass1(Native Method)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
at java.base/java.net.URLClassLoader.defineClass(URLClassLoader.java:550)
at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:458)
at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:452)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:451)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.apache.spark.sql.delta.DeltaAnalysis.apply(DeltaAnalysis.scala:64)
at org.apache.spark.sql.delta.DeltaAnalysis.apply(DeltaAnalysis.scala:57)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:215)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:209)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:193)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:192)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:205)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:211)
at org.apache.spark.sql.Dataset$.apply(Dataset.scala:75)
at org.apache.spark.sql.delta.Snapshot.$anonfun$loadActions$1(Snapshot.scala:261)
at scala.collection.immutable.List.map(List.scala:293)
at org.apache.spark.sql.delta.Snapshot.loadActions(Snapshot.scala:261)
at org.apache.spark.sql.delta.Snapshot.stateReconstruction(Snapshot.scala:98)
at org.apache.spark.sql.delta.Snapshot.cachedState$lzycompute(Snapshot.scala:117)
at org.apache.spark.sql.delta.Snapshot.cachedState(Snapshot.scala:116)
at org.apache.spark.sql.delta.Snapshot.state(Snapshot.scala:120)
at org.apache.spark.sql.delta.Snapshot.$anonfun$computedState$1(Snapshot.scala:140)
at org.apache.spark.sql.delta.util.DeltaProgressReporter.withJobDescription(DeltaProgressReporter.scala:53)
at org.apache.spark.sql.delta.util.DeltaProgressReporter.withStatusCode(DeltaProgressReporter.scala:32)
at org.apache.spark.sql.delta.util.DeltaProgressReporter.withStatusCode$(DeltaProgressReporter.scala:27)
at org.apache.spark.sql.delta.Snapshot.withStatusCode(Snapshot.scala:55)
at org.apache.spark.sql.delta.Snapshot.computedState$lzycompute(Snapshot.scala:137)
at org.apache.spark.sql.delta.Snapshot.computedState(Snapshot.scala:136)
at org.apache.spark.sql.delta.Snapshot.metadata(Snapshot.scala:179)
at org.apache.spark.sql.delta.Snapshot.toString(Snapshot.scala:290)
at java.base/java.lang.String.valueOf(String.java:2951)
at java.base/java.lang.StringBuilder.append(StringBuilder.java:168)
at org.apache.spark.sql.delta.Snapshot.$anonfun$new$1(Snapshot.scala:293)
at org.apache.spark.sql.delta.Snapshot.$anonfun$logInfo$1(Snapshot.scala:270)
at org.apache.spark.internal.Logging.logInfo(Logging.scala:57)
at org.apache.spark.internal.Logging.logInfo$(Logging.scala:56)
at org.apache.spark.sql.delta.Snapshot.logInfo(Snapshot.scala:270)
at org.apache.spark.sql.delta.Snapshot.<init>(Snapshot.scala:293)
at org.apache.spark.sql.delta.SnapshotManagement.createSnapshot(SnapshotManagement.scala:223)
at org.apache.spark.sql.delta.SnapshotManagement.createSnapshot$(SnapshotManagement.scala:211)
at org.apache.spark.sql.delta.DeltaLog.createSnapshot(DeltaLog.scala:59)
at org.apache.spark.sql.delta.SnapshotManagement.getSnapshotAtInit(SnapshotManagement.scala:195)
at org.apache.spark.sql.delta.SnapshotManagement.getSnapshotAtInit$(SnapshotManagement.scala:186)
at org.apache.spark.sql.delta.DeltaLog.getSnapshotAtInit(DeltaLog.scala:59)
at org.apache.spark.sql.delta.SnapshotManagement.$init$(SnapshotManagement.scala:49)
at org.apache.spark.sql.delta.DeltaLog.<init>(DeltaLog.scala:63)
at org.apache.spark.sql.delta.DeltaLog$.$anonfun$apply$3(DeltaLog.scala:467)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
at org.apache.spark.sql.delta.DeltaLog$.$anonfun$apply$2(DeltaLog.scala:467)
at com.databricks.spark.util.DatabricksLogging.recordOperation(DatabricksLogging.scala:77)
at com.databricks.spark.util.DatabricksLogging.recordOperation$(DatabricksLogging.scala:67)
at org.apache.spark.sql.delta.DeltaLog$.recordOperation(DeltaLog.scala:367)
at org.apache.spark.sql.delta.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:106)
at org.apache.spark.sql.delta.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:91)
at org.apache.spark.sql.delta.DeltaLog$.recordDeltaOperation(DeltaLog.scala:367)
at org.apache.spark.sql.delta.DeltaLog$.$anonfun$apply$1(DeltaLog.scala:466)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4792)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
... 26 more
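For context: delta-core 1.0.0 was compiled against Spark 3.1.x, where org.apache.spark.sql.catalyst.plans.logical.UnaryNode was an abstract class; in Spark 3.2.0 it is a trait, which compiles to a Java interface, hence the IncompatibleClassChangeError when the old Delta bytecode is loaded. The Spark side of the mismatch can be confirmed from the same shell; a minimal check (spark.version is a standard PySpark attribute, and the Delta version is simply the --packages coordinate used above):
>>> spark.version
'3.2.0'
>>> import pyspark
>>> pyspark.__version__
'3.2.0'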
Issue Analytics
- Created: 2 years ago
- Comments: 10 (2 by maintainers)
@rhazegh Delta Lake doesn’t support Spark 3.2.0 yet. We are actively working on this and will make a new release to support Spark 3.2.0 soon.
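Until that release is available, a workaround consistent with the report above is to stay on the Spark line that delta-core 1.0.0 was built for (3.1.x; the table here was written with PySpark 3.1.2). A minimal sketch, assuming a pip-managed PySpark install:
$ pip install 'pyspark==3.1.2'
$ pyspark --packages io.delta:delta-core_2.12:1.0.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"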
@mattfysh yes! This was the issue: https://github.com/delta-io/delta/issues/1404
The fix is to replace io.delta:delta-core_2.12:1.0.0 with io.delta:delta-core_2.12:2.1.0 on https://delta.io/learn/getting-started.
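Applied to the launch command from the reproduction steps, the fix looks like the sketch below; note that delta-core 2.1.0 targets Spark 3.3.x, so the pyspark version has to be upgraded to match:
$ pyspark --packages io.delta:delta-core_2.12:2.1.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
>>> df = spark.read.format("delta").load("/my_delta_path/topic")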