[SUPPORT] KryoException when bulk inserting into Hudi with Flink
When bulk inserting into Hudi with Flink, the Flink job fails with the exception com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException.
-- hudi table DDL
CREATE TEMPORARY TABLE table_one (
  imp_date string,
  id bigint,
  name string,
  ts timestamp(3)
) PARTITIONED BY (imp_date) WITH (
  'connector' = 'hudi',
  'path' = ${hdfs_path},
  'write.operation' = 'bulk_insert',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.table.keygenerator.class' = 'org.apache.hudi.keygen.SimpleKeyGenerator',
  'hoodie.datasource.write.recordkey.field' = 'id',
  'write.precombine.field' = 'ts',
  'hive_sync.enable' = 'true',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://…',
  'hive_sync.db' = 'hive_db',
  'hive_sync.table' = 'table_one',
  'hive_sync.partition_fields' = 'imp_date',
  'hive_sync.partition_extractor_class' = 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
  'hoodie.datasource.write.hive_style_partitioning' = 'true',
  'hoodie.metadata.enable' = 'true'
);
-- insert SQL
insert into table_one
select
  DATE_FORMAT(ts, 'yyyyMMdd') || cast(hour(ts) as string) as dt
  ,id
  ,name
  ,ts
from source_table;
Environment Description
- Hudi version : 0.11 & 0.12
- Flink version : 1.13
- Storage (HDFS/S3/GCS…) : HDFS
Stacktrace
com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
cleaner (org.apache.flink.core.memory.MemorySegment)
segments (org.apache.flink.table.data.binary.BinaryRowData)
    at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:82)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
    at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:577)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:320)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:289)
    at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:577)
    at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:68)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:505)
    at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:266)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:69)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
    at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
    at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
    at org.apache.flink.table.runtime.util.StreamRecordCollector.collect(StreamRecordCollector.java:44)
    at org.apache.hudi.sink.bulk.sort.SortOperator.endInput(SortOperator.java:113)
    at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:91)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.endInput(OperatorChain.java:441)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:427)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:688)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:643)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:654)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:627)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:782)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:80)
    at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:488)
    at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:57)
    ... 28 more
Top GitHub Comments
override def getRegistration(klass: Class[_]) =
  if (isJavaLambda(klass)) {
    getClassResolver.getRegistration(classOf[ClosureSerializer.Closure])
  } else super.getRegistration(klass)
This may be a Flink problem. In com.twitter.chill.KryoBase, if the code above enters the first branch, it tries to get the serializer from a map in the classResolver without checking whether the result is null, which may cause an NPE.
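To make the suspected failure mode concrete, here is a minimal Java sketch of an unchecked map lookup next to a checked one. It does not use Kryo or chill: `Resolver`, `Registration`, `writeClassUnchecked`, and `writeClassChecked` are hypothetical stand-ins, and the real code paths may differ. It only shows how a missing map entry turns into a bare NullPointerException, like the one reported under "Caused by" in the stack trace.

```java
import java.util.HashMap;
import java.util.Map;

public class NullRegistrationSketch {
    /** Hypothetical stand-in for a registration entry; only the id matters here. */
    static final class Registration {
        final int id;
        Registration(int id) { this.id = id; }
    }

    /** Hypothetical resolver: returns null for classes that were never registered. */
    static final class Resolver {
        private final Map<Class<?>, Registration> byClass = new HashMap<>();

        void register(Class<?> klass, int id) {
            byClass.put(klass, new Registration(id));
        }

        Registration getRegistration(Class<?> klass) {
            return byClass.get(klass); // null when the class is missing from the map
        }
    }

    /** Unchecked variant: dereferences the lookup result directly. */
    static int writeClassUnchecked(Resolver resolver, Class<?> klass) {
        return resolver.getRegistration(klass).id; // NPE if the lookup returned null
    }

    /** Checked variant: fails with a descriptive error instead of a bare NPE. */
    static int writeClassChecked(Resolver resolver, Class<?> klass) {
        Registration reg = resolver.getRegistration(klass);
        if (reg == null) {
            throw new IllegalArgumentException("Class is not registered: " + klass.getName());
        }
        return reg.id;
    }

    public static void main(String[] args) {
        Resolver resolver = new Resolver();
        resolver.register(String.class, 1);

        // Registered class: both variants work.
        System.out.println(writeClassChecked(resolver, String.class));

        // Unregistered class: the unchecked variant fails with an NPE.
        try {
            writeClassUnchecked(resolver, Runnable.class);
        } catch (NullPointerException e) {
            System.out.println("unchecked lookup threw NullPointerException");
        }

        // Unregistered class: the checked variant names the offending class.
        try {
            writeClassChecked(resolver, Runnable.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The checked variant does not make serialization succeed. It only replaces the opaque NPE with an error that names the unregistered class, which would make a report like this one easier to diagnose.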
Thanks, the problem is expected to be fixed in #6571. Feel free to reopen the issue if the problem still exists.