
[SUPPORT] Hudi 0.10.1 throws NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V


Describe the problem you faced

I'm using Hudi 0.10.1 with Databricks 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12).

Loading a Hudi dataset from S3 fails with this error: NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V
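The error message identifies the missing method by its JVM descriptor, which can be hard to read at a glance. As a rough aid, the sketch below decodes such a descriptor into parameter and return types. The helper names (`decode_descriptor`, `_read_type`) are hypothetical and written only for illustration; they are not part of Hudi or Spark.

```python
# Primitive type codes used in JVM method descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, i):
    """Read one type starting at desc[i]; return (readable name, next index)."""
    dims = 0
    while desc[i] == "[":          # each '[' adds one array dimension
        dims += 1
        i += 1
    if desc[i] == "L":             # object type: L<internal/name>;
        end = desc.index(";", i)
        name = desc[i + 1:end].replace("/", ".")
        i = end + 1
    else:                          # single-letter primitive code
        name = PRIMITIVES[desc[i]]
        i += 1
    return name + "[]" * dims, i


def decode_descriptor(desc):
    """Split '(params)return' into (list of parameter types, return type)."""
    if not desc.startswith("("):
        raise ValueError("a method descriptor must start with '('")
    i, params = 1, []
    while desc[i] != ")":
        type_name, i = _read_type(desc, i)
        params.append(type_name)
    ret, _ = _read_type(desc, i + 1)
    return params, ret


# Descriptor copied from the error message in this issue.
params, ret = decode_descriptor(
    "(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V"
)
print(params, ret)
```

Read this way, the Hudi bundle expects `FileStatusCache.putLeafFiles` to take a `Path` and a `FileStatus[]` and return `void`. A `NoSuchMethodError` at runtime means the `FileStatusCache` class that was actually loaded does not expose a method with exactly that signature.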

Environment Description

  • Hudi version : 0.10.1

  • Spark version : 3.1.2

  • Hive version : N/A

  • Hadoop version : N/A

  • Storage (HDFS/S3/GCS…) : S3

  • Running on Docker? (yes/no) : No

Stacktrace


NoSuchMethodError: org.apache.spark.sql.execution.datasources.FileStatusCache.putLeafFiles(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/FileStatus;)V
	at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:604)
	at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:602)
	at scala.collection.immutable.Map$Map1.foreach(Map.scala:128)
	at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:602)
	at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:360)
	at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:157)
	at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:390)

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 15 (10 by maintainers)

Top GitHub Comments

yihua commented, Nov 1, 2022 (2 reactions)

One fix is in progress in #7088: optionally fall back to Spark's data source with HoodieROTablePathFilter (how data source reads were implemented before the 0.9.0 release) instead of HoodieFileIndex, so that queries on Hudi tables can work in the Databricks runtime.
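A minimal sketch of what opting into such a fallback might look like from PySpark. It assumes the read option is named `hoodie.file.index.enable`; that key is an assumption here and should be checked against the configuration reference for the Hudi version in use. The helper names and the table path are hypothetical.

```python
def fallback_read_options():
    """Build read options that ask Hudi not to use HoodieFileIndex.

    ASSUMPTION: 'hoodie.file.index.enable' is the option key; verify it
    against the Hudi documentation for your release before relying on it.
    """
    return {"hoodie.file.index.enable": "false"}


def read_hudi_table(spark, base_path):
    """Load a Hudi table with the fallback options applied.

    `spark` is an existing SparkSession; `base_path` is a hypothetical
    table location such as an S3 path.
    """
    return (
        spark.read.format("hudi")
        .options(**fallback_read_options())
        .load(base_path)
    )


opts = fallback_read_options()
print(opts)
```

In a notebook this would be called as `read_hudi_table(spark, "<table base path>")`. Whether the option takes effect depends on the Hudi release, since the comment above describes the fix as still in progress at the time.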

makamhareesh commented, Aug 18, 2022 (1 reaction)

hudi-spark3.1.2-bundle_2.12:0.10.1

But I believe this also happens with later versions of the Hudi bundle jars.

