Data class serialization fails with `v1.0.3` and Spark `3.2.0`

Hello,

There seems to be an issue with the above-mentioned versions of Spark and kotlin-spark-api. When trying to convert a `Dataset<Row>` to a `Dataset<MyDataClass>` using the `as` method, I get the following error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.objects.Invoke.<init>(Lorg/apache/spark/sql/catalyst/expressions/Expression;Ljava/lang/String;Lorg/apache/spark/sql/types/DataType;Lscala/collection/Seq;Lscala/collection/Seq;ZZ)V
	at org.apache.spark.sql.KotlinReflection$.$anonfun$serializerFor$16(KotlinReflection.scala:759)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
	at org.apache.spark.sql.KotlinReflection$.$anonfun$serializerFor$1(KotlinReflection.scala:749)
	at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:73)
	at org.apache.spark.sql.KotlinReflection.cleanUpReflectionObjects(KotlinReflection.scala:1013)
	at org.apache.spark.sql.KotlinReflection.cleanUpReflectionObjects$(KotlinReflection.scala:1012)
	at org.apache.spark.sql.KotlinReflection$.cleanUpReflectionObjects(KotlinReflection.scala:47)
	at org.apache.spark.sql.KotlinReflection$.serializerFor(KotlinReflection.scala:591)
	at org.apache.spark.sql.KotlinReflection$.serializerFor(KotlinReflection.scala:578)
	at org.apache.spark.sql.KotlinReflection.serializerFor(KotlinReflection.scala)
	at org.jetbrains.kotlinx.spark.api.ApiV1Kt.kotlinClassEncoder(ApiV1.kt:183)
	at org.jetbrains.kotlinx.spark.api.ApiV1Kt.generateEncoder(ApiV1.kt:170)
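
The missing constructor in the first line of the error is written as a JVM method descriptor, which is hard to read at a glance. As a rough aid, the sketch below decodes such a descriptor into plain parameter type names. `DescriptorDecoder` is a hypothetical helper written for this illustration; it is not part of Spark or kotlin-spark-api, and it only handles the parameter list, not the return type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark or kotlin-spark-api):
// turns the parameter section of a JVM method descriptor into readable type names.
final class DescriptorDecoder {

    static List<String> parameterTypes(String descriptor) {
        int close = descriptor.indexOf(')');
        if (!descriptor.startsWith("(") || close < 0) {
            throw new IllegalArgumentException("Not a method descriptor: " + descriptor);
        }
        List<String> result = new ArrayList<>();
        int i = 1;
        while (i < close) {
            // Each leading '[' adds one array dimension.
            int dims = 0;
            while (descriptor.charAt(i) == '[') {
                dims++;
                i++;
            }
            String base;
            char code = descriptor.charAt(i);
            if (code == 'L') {
                // Object types look like Lsome/package/Name;
                int semi = descriptor.indexOf(';', i);
                if (semi < 0 || semi > close) {
                    throw new IllegalArgumentException("Unterminated object type at index " + i);
                }
                base = descriptor.substring(i + 1, semi).replace('/', '.');
                i = semi + 1;
            } else {
                base = primitiveName(code);
                i++;
            }
            result.add(base + "[]".repeat(dims));
        }
        return result;
    }

    private static String primitiveName(char code) {
        switch (code) {
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'C': return "char";
            case 'S': return "short";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            default: throw new IllegalArgumentException("Unknown type code: " + code);
        }
    }

    public static void main(String[] args) {
        // Descriptor copied from the NoSuchMethodError above.
        String fromError = "(Lorg/apache/spark/sql/catalyst/expressions/Expression;"
                + "Ljava/lang/String;Lorg/apache/spark/sql/types/DataType;"
                + "Lscala/collection/Seq;Lscala/collection/Seq;ZZ)V";
        parameterTypes(fromError).forEach(System.out::println);
    }
}
```

Run against the descriptor from the error, this lists seven parameter types, the last two being `boolean`. The `NoSuchMethodError` means that the `Invoke` class on the runtime classpath has no constructor with exactly that parameter list.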

When using the latest code from the spark-3.2 branch, things are working fine, so I’m assuming that you have already found this issue and fixed it. Would it be possible to cut a release containing that fix? Sorry I haven’t had the time to submit a failing test as an example, but since the last release it seems like you have added quite a few, so this might be already covered. Please let me know if you think it would be valuable to add a test for that particular case.

Thanks!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9

Top GitHub Comments

4 reactions
Jolanrensen commented, May 13, 2022

@devictr Very soon! We expect to be able to release version 1.1.0 in about a week.

1 reaction
Jolanrensen commented, Jul 3, 2022

Unfortunately, we are indeed tied to patch versions of Spark. This is because we “replace” the core functionality of Catalyst's class encoding to add support for Kotlin data classes and similar types, which means we must match Spark at the bytecode level. We are currently working on providing exact builds for all Scala and all Spark versions. Most of the time the code base can be identical, but each combination still needs a separate build.
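
Because the library is tied to an exact Spark patch version, one way to surface a mismatch early is to compare the Spark version the library was built for with the one actually running. The following is a minimal sketch of such a check, not something kotlin-spark-api is known to provide. `SparkPatchVersionCheck` and both version strings are hypothetical; in a real application the runtime value could come from, for example, `SparkSession.version()`.

```java
import java.util.Arrays;

// Hypothetical helper (an assumption for illustration, not a kotlin-spark-api feature):
// checks that two Spark version strings agree on major, minor, and patch.
final class SparkPatchVersionCheck {

    static boolean samePatchVersion(String builtAgainst, String runtime) {
        return Arrays.equals(majorMinorPatch(builtAgainst), majorMinorPatch(runtime));
    }

    // Extracts the first three numeric components, ignoring suffixes such as "-SNAPSHOT".
    private static int[] majorMinorPatch(String version) {
        String[] parts = version.trim().split("[.\\-]");
        if (parts.length < 3) {
            throw new IllegalArgumentException("Expected major.minor.patch, got: " + version);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Both values are placeholders chosen for the example.
        String builtAgainst = "3.2.1";
        String runtime = "3.2.0";
        if (!samePatchVersion(builtAgainst, runtime)) {
            System.out.println("Spark " + runtime + " does not match the build target " + builtAgainst);
        }
    }
}
```

A check like this only turns a late `NoSuchMethodError` into an earlier, more readable message. It does not remove the need for a build that matches the running Spark version.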
