[SUPPORT] Reading Hudi data with flink-1.13.6 reports java.lang.NoSuchMethodError
Describe the problem you faced
Writing data with Flink works fine, and the written data can be read with Hive, but reading it with Flink itself fails with a java.lang.NoSuchMethodError (full stacktrace below).
Environment Description
- Hudi version : 0.11.0
- Flink version : 1.13.6
- Hive version : 2.1.1-cdh6.2.0
- Hadoop version : 3.0.0-cdh6.2.0
- Storage (HDFS/S3/GCS…) : HDFS
- Running on Docker? (yes/no) : no
Additional context
Hudi CREATE TABLE statement:
CREATE TABLE t2(
uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
name VARCHAR(10),
age INT,
ts timestamp(3),
part VARCHAR(20)
)
WITH (
'connector' = 'hudi',
'path' = 'hdfs:///user/hive/warehouse/hudi.db/t2',
'table.type' = 'MERGE_ON_READ',
'hoodie.datasource.write.recordkey.field'= 'uuid',
'write.precombine.field'= 'ts',
'write.tasks' = '1',
'write.rate.limit' = '2000',
'compaction.tasks' = '1',
'compaction.async.enabled' = 'true',
'compaction.trigger.strategy' = 'num_commits',
'compaction.delta_commits' = '1',
  'changelog.enabled' = 'true',
'read.streaming.enabled'= 'true',
'read.streaming.check-interval'= '3',
'hive_sync.enable' = 'true', -- Required. To enable hive synchronization
'hive_sync.mode' = 'hms', -- Required. Setting hive sync mode to hms, default jdbc
'hive_sync.metastore.uris' = 'thrift://xxx:9083', -- Required. The port need set on hive-site.xml
  'hive_sync.jdbc_url' = 'jdbc:hive2://xxx:10000',
'hive_sync.table'='t2', -- required, hive table name
'hive_sync.db'='hudi',
'hive_sync.username' = '',
'hive_sync.password' = '',
  'hive_sync.support_timestamp' = 'true'
);
Querying the t2 table with Flink's SQL client:
select * from t2;
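The same failing read can also be reproduced from a regular Flink job instead of the SQL client. A minimal sketch, assuming the hudi-flink bundle jar is on the job classpath (the class name ReadT2 is illustrative; the DDL is the CREATE TABLE above, elided here):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class ReadT2 {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        // Register the table with the same DDL as above (elided for brevity).
        tEnv.executeSql("CREATE TABLE t2 ( /* columns as above */ ) WITH ( /* options as above */ )");
        // The streaming read fails with the NoSuchMethodError shown in the stacktrace below.
        tEnv.executeSql("SELECT * FROM t2").print();
    }
}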
Stacktrace
More error information:
2022-05-10 15:41:49,278 INFO org.apache.hadoop.io.compress.CodecPool [] - Got brand-new decompressor [.gz]
2022-05-10 15:41:49,463 WARN org.apache.flink.runtime.taskmanager.Task [] - split_reader -> NotNullEnforcer(fields=[uuid]) (3/4)#0 (de4312b557275e636b33cacdeca84148) switched from RUNNING to FAILED with failure cause: java.lang.NoSuchMethodError: org.apache.parquet.bytes.BytesInput.toInputStream()Lorg/apache/parquet/bytes/ByteBufferInputStream;
at org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader.readPageV1(AbstractColumnReader.java:211)
at org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader.readToVector(AbstractColumnReader.java:156)
at org.apache.hudi.table.format.cow.vector.reader.ParquetColumnarRowSplitReader.nextBatch(ParquetColumnarRowSplitReader.java:311)
at org.apache.hudi.table.format.cow.vector.reader.ParquetColumnarRowSplitReader.ensureBatch(ParquetColumnarRowSplitReader.java:287)
at org.apache.hudi.table.format.cow.vector.reader.ParquetColumnarRowSplitReader.reachedEnd(ParquetColumnarRowSplitReader.java:266)
at org.apache.hudi.table.format.mor.MergeOnReadInputFormat$BaseFileOnlyFilteringIterator.reachedEnd(MergeOnReadInputFormat.java:509)
at org.apache.hudi.table.format.mor.MergeOnReadInputFormat.reachedEnd(MergeOnReadInputFormat.java:245)
at org.apache.hudi.source.StreamReadOperator.consumeAsMiniBatch(StreamReadOperator.java:186)
at org.apache.hudi.source.StreamReadOperator.processSplits(StreamReadOperator.java:166)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:359)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:323)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
at java.lang.Thread.run(Thread.java:748)
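A NoSuchMethodError with this descriptor means the JVM resolved org.apache.parquet.bytes.BytesInput from a parquet-common whose toInputStream() does not return ByteBufferInputStream: that return type was introduced in parquet 1.10, and CDH 6 ships an older parquet (1.9.x), so an old parquet jar on the classpath is shadowing the one Flink's vectorized reader was compiled against. A minimal diagnostic sketch to run on the same classpath as the TaskManager (the class name FindParquetClass is illustrative):

import java.lang.reflect.Method;

public class FindParquetClass {
    public static void main(String[] args) throws Exception {
        // Which jar actually serves the class at runtime?
        Class<?> clazz = Class.forName("org.apache.parquet.bytes.BytesInput");
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
        // The JVM matches methods by full descriptor, return type included, so
        // check what this parquet's toInputStream() returns: Flink expects
        // org.apache.parquet.bytes.ByteBufferInputStream (parquet >= 1.10).
        try {
            Method m = clazz.getMethod("toInputStream");
            System.out.println("toInputStream() returns " + m.getReturnType().getName());
        } catch (NoSuchMethodException e) {
            System.out.println("toInputStream() is missing entirely");
        }
    }
}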
Top GitHub Comments
I hit the same situation; here is my config:
- Hudi version : 0.11.0
- Flink version : 1.13.6
- Hive version : 2.1.1-cdh6.3.2
- Hadoop version : 3.0.0-cdh6.3.2
- Storage (HDFS/S3/GCS…) : HDFS
Appending the flink-parquet dependency fixed it for me. If you cannot find the class conflict, maybe try my way (version check sketch below).
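If you take this route, the flink-parquet artifact version has to match the Flink distribution (1.13.6 here), for example by dropping the matching flink-sql-parquet jar into flink/lib; the goal is to get a parquet-common of 1.10 or newer ahead of the cluster's 1.9.x on the classpath. A quick way to confirm which parquet actually wins after the change; a minimal sketch (the class name CheckParquetVersion is illustrative):

public class CheckParquetVersion {
    public static void main(String[] args) {
        // parquet-common reports its own build version; if this still prints
        // a 1.9.x-cdh string, the old CDH jar is still shadowing the new one.
        System.out.println(org.apache.parquet.Version.FULL_VERSION);
    }
}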
We removed the parquet shade pattern from hudi-flink-bundle in 0.11.0; maybe we should add it back 😃 (Shading relocates the parquet classes bundled inside the jar into a private package, so they cannot clash with the older parquet that ships with the cluster.)