
[SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"


I use Spark SQL to insert records into Hudi. It works for a short time, but after a while it throws "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()".

Steps to reproduce the behavior:

I wrote a Scala function to build the insert SQL:

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types._

private def write2Table(row: Row)(implicit sparkSession: SparkSession): Unit = {
  // Render each field as a "<literal> as <name>" select expression:
  // string-like types are quoted, numeric/boolean types pass through,
  // and empty values become null.
  val fields = row.schema.fields.map { field =>
    if (row.getString(row.fieldIndex(field.name)).isEmpty) {
      s"null as ${field.name}"
    } else {
      field.dataType match {
        case StringType    => s"'${row.getAs[String](field.name)}' as ${field.name}"
        case BooleanType   => s"${row.getAs[Boolean](field.name)} as ${field.name}"
        case ByteType      => s"${row.getAs[Byte](field.name)} as ${field.name}"
        case ShortType     => s"${row.getAs[Short](field.name)} as ${field.name}"
        case IntegerType   => s"${row.getAs[Int](field.name)} as ${field.name}"
        case LongType      => s"${row.getAs[Long](field.name)} as ${field.name}"
        case FloatType     => s"${row.getAs[Float](field.name)} as ${field.name}"
        case DoubleType    => s"${row.getAs[Double](field.name)} as ${field.name}"
        case DateType      => s"'${row.getAs[String](field.name)}' as ${field.name}"
        case TimestampType => s"'${row.getAs[String](field.name)}' as ${field.name}"
      }
    }
  }.mkString(",")

  val insertSql = s"insert into ${row.getAs("database")}.${row.getAs("table")}_cow select $fields;"
  try {
    println(s"inserting into ${row.getAs("table")}_cow")
    sparkSession.sql(insertSql)
  } catch {
    case ex: Throwable =>
      // Dump the offending row and SQL before rethrowing, for debugging.
      println(row.prettyJson)
      println(insertSql)
      throw ex
  }
}

Then I call it inside foreachRDD() of a DStream:

saveRdd.foreachRDD { rdd =>
  rdd.collect().foreach { case (row, op) =>
    // Make sure the target table exists, then route by operation type.
    chackAndCreateTable(row)
    if (op.equals("INSERT")) {
      write2Table(row)
    }
  }
}

Expected behavior

The inserts should continue to succeed instead of failing with java.lang.NoSuchMethodError after running for a while.

Environment Description

  • Hudi version: 0.11
  • Spark version: 3.2.1
  • Hadoop version: 3.2.2
  • Storage (HDFS/S3/GCS…): HDFS
  • Running on Docker? (yes/no): no

Here is my SparkSession configuration code:

val spark = SparkSession.builder()
  .appName("SparkHudi")
  .master("spark://hadoop111:7077")
  .config("spark.sql.warehouse.dir", "/user/hive/warehouse")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
  .config("spark.sql.legacy.exponentLiteralAsDecimal.enabled", true)
  .enableHiveSupport()
  .config("hive.metastore.uris", "thrift://19.11.8.111:9083")
  .getOrCreate()

spark-submit:

spark-submit \
  --jars /home/kadm/module/hudi-0.11/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.12-0.11.0.jar \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1,org.apache.spark:spark-avro_2.12:3.2.1,org.apache.kafka:kafka-clients:3.1.0 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
  --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
  --conf "spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5445" \
  --master spark://hadoop111:7077 \
  SparkHudi-1.0-SNAPSHOT-shaded.jar

Stacktrace

22/06/06 09:47:13 ERROR Javalin: Exception occurred while servicing http-request
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
	at org.apache.hudi.io.storage.HoodieHFileReader.close(HoodieHFileReader.java:218)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.closeReader(HoodieBackedTableMetadata.java:574)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:567)
	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:554)
	at org.apache.hudi.metadata.HoodieMetadataFileSystemView.close(HoodieMetadataFileSystemView.java:83)
	at org.apache.hudi.common.table.view.FileSystemViewManager.clearFileSystemView(FileSystemViewManager.java:86)
	at org.apache.hudi.timeline.service.handlers.FileSliceHandler.refreshTable(FileSliceHandler.java:118)
	at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$19(RequestHandler.java:390)
	at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
	at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
	at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
	at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
	at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
	at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
	at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)
	at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
	at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
	at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
	at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
	at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
	at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
	at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
	at org.apache.hudi.org.apache.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)
	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)
	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)
	at java.lang.Thread.run(Thread.java:748)


Top GitHub Comments

shuai-xu commented, Jul 7, 2022 (6 reactions)

This problem is caused by the HBase 2.4.9 jars in the Maven repository being compiled against Hadoop 2.7. A quick fix is to compile HBase against Hadoop 3.*, mvn install it, and then compile Hudi.
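For context: the stacktrace shows the shaded HBase code expecting the Hadoop 2.7 signature ()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;, whereas in Hadoop 2.8+ ReadStatistics was promoted to a top-level class, so the method's return type changed. A quick way to confirm the mismatch is to inspect the jar with javap (a sketch; the jar name and path are assumptions, adjust them to your Hadoop installation):

# Hadoop 3.x should report the top-level return type
# org.apache.hadoop.hdfs.ReadStatistics, not DFSInputStream$ReadStatistics.
javap -cp hadoop-hdfs-client-3.2.2.jar \
    org.apache.hadoop.hdfs.client.HdfsDataInputStream | grep -i ReadStatistics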

dohongdayi commented, Jul 13, 2022 (5 reactions)

I resolved this on my own by packaging a new build of HBase 2.4.9 against our Hadoop 3 version, using the following command:

mvn clean install -Denforcer.skip -DskipTests -Dhadoop.profile=3.0 -Psite-install-step

Then I changed hbase.defaults.for.version in hudi-common/src/main/resources/hbase-site.xml (see the sketch below).
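For illustration, the property in question looks like this (a sketch; the 2.4.9 value is an assumption, use whatever version string your rebuilt HBase actually reports):

<!-- hudi-common/src/main/resources/hbase-site.xml -->
<property>
  <name>hbase.defaults.for.version</name>
  <value>2.4.9</value>
</property>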

After that, I changed hbase.version in Hudi's pom.xml, used versions-maven-plugin to create a new Hudi version, and packaged Hudi again.
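Putting the steps together, a minimal sketch of the rebuild sequence (the directory layout and the 0.11.0-hadoop3 version string are hypothetical; versions:set is the standard versions-maven-plugin goal):

# 1. Rebuild HBase 2.4.9 against Hadoop 3 and install it to the local Maven repo
cd hbase-2.4.9
mvn clean install -Denforcer.skip -DskipTests -Dhadoop.profile=3.0 -Psite-install-step

# 2. In the Hudi source tree: point hbase.version in pom.xml at the rebuilt
#    artifact, give the fork its own version, and repackage
cd ../hudi
mvn versions:set -DnewVersion=0.11.0-hadoop3   # hypothetical version string
mvn clean package -DskipTests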
