[SUPPORT] Hudi 0.10.1 throws NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V
Describe the problem you faced
I’m using Hudi 0.10.1 with Databricks 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12).
Loading a Hudi data set on S3 fails with this error:
NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V
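The part in parentheses is a JVM method descriptor, which is harder to read than a normal signature. As a reading aid, here is a minimal sketch of a decoder that turns such a descriptor into Java-style notation. The class and method names (DescriptorDecoder, decode, decodeType) are made up for this illustration and are not part of Hudi or Spark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: decodes a JVM method descriptor.
class DescriptorDecoder {

    // Decodes one type starting at pos[0] and advances pos[0] past it.
    static String decodeType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') { // each '[' adds one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'L': { // object type: L<internal/class/Name>;
                int end = d.indexOf(';', pos[0]);
                base = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            case 'V': base = "void"; break;
            case 'Z': base = "boolean"; break;
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'S': base = "short"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'F': base = "float"; break;
            case 'D': base = "double"; break;
            default:
                throw new IllegalArgumentException("Unknown type code: " + c);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    // Turns "(<params>)<return>" into "<return> name(<params>)".
    static String decode(String methodName, String descriptor) {
        int[] pos = {1}; // skip the opening '('
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(decodeType(descriptor, pos));
        }
        pos[0]++; // skip ')'
        String returnType = decodeType(descriptor, pos);
        return returnType + " " + methodName + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(decode("putLeafFiles",
            "(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V"));
        // prints: void putLeafFiles(org.apache.hadoop.fs.Path, org.apache.hadoop.fs.FileStatus[])
    }
}
```

In other words, the caller expects a method putLeafFiles that takes a Path and a FileStatus array and returns void, and the FileStatusCache class loaded at runtime does not provide one with exactly that shape.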
Environment Description
- Hudi version : 0.10.1
- Spark version : 3.1.2
- Hive version : N/A
- Hadoop version : N/A
- Storage (HDFS/S3/GCS…) : S3
- Running on Docker? (yes/no) : No
Stacktrace
NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V
at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:604)
at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:602)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:128)
at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:602)
at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:360)
at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:157)
at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:390)
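The trace shows HoodieFileIndex calling FileStatusCache.putLeafFiles, and a NoSuchMethodError means the FileStatusCache class on the runtime classpath has no method with that exact name, parameter types and return type. One way to confirm this on a given cluster is to probe the class by reflection. The following is a rough sketch only: MethodProbe and hasMethod are hypothetical names invented here, and the result depends entirely on which Spark jars are on the classpath where it runs.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical diagnostic helper, for illustration only.
class MethodProbe {

    // Returns true if className is loadable and has a public method with the
    // given name, return type and parameter types. Type names use the
    // Class.getName() format, so arrays look like "[Lsome.pkg.Type;".
    static boolean hasMethod(String className, String methodName,
                             String returnType, String... paramTypes) {
        try {
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            // 'false' avoids running the class's static initializers.
            Class<?> cls = Class.forName(className, false, loader);
            for (Method m : cls.getMethods()) {
                if (!m.getName().equals(methodName)) {
                    continue;
                }
                String[] actual = Arrays.stream(m.getParameterTypes())
                        .map(Class::getName)
                        .toArray(String[]::new);
                if (m.getReturnType().getName().equals(returnType)
                        && Arrays.equals(actual, paramTypes)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not on the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean present = hasMethod(
                "org.apache.spark.sql.execution.datasources.FileStatusCache",
                "putLeafFiles",
                "void",
                "org.apache.hadoop.fs.Path",
                "[Lorg.apache.hadoop.fs.FileStatus;");
        System.out.println("putLeafFiles(Path, FileStatus[]) present: " + present);
    }
}
```

If the probe reports false on the cluster where the class is otherwise loadable, the runtime's FileStatusCache differs from the one the Hudi bundle was compiled against, which would match the error above.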
Top GitHub Comments
Here is one fix that is in progress, #7088: it optionally falls back to using Spark’s data source with HoodieROTablePathFilter (how data source reads were implemented before the 0.9.0 release) instead of HoodieFileIndex, so queries on Hudi tables can work in the Databricks runtime.

hudi-spark3.1.2-bundle_2.12:0.10.1
But I believe this happens with other, higher versions of the Hudi bundle jars as well.