[SUPPORT] Exception while Querying Hive _rt table
Describe the problem you faced
I am using a Spark DataFrame to persist a Hudi table, with Hive sync enabled. Queries against the *_ro table work fine, but querying the *_rt table fails with an exception.
I am using a custom payload class that implements `preCombine` and `combineAndUpdateValue`, so I have placed my jar in the ${Hive}/lib folder.
Also tried setting these configs in the Hive session:
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.fetch.task.conversion=none;
Environment:
- Hive: 2.3.7
- Spark: 2.x
- Hudi: 0.6.0 (hudi-hadoop-mr-bundle-0.6.0.jar)
Actual exception:
Caused by: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.avro.generic.GenericRecord
CREATE EXTERNAL TABLE `bhuvan_123_ro`(
`_hoodie_commit_time` string,
`_hoodie_commit_seqno` string,
`_hoodie_record_key` string,
`_hoodie_partition_path` string,
`_hoodie_file_name` string,
`ts_ms` bigint,
`pincode` double,
`image_link` string,
`_id` string,
`op` string,
`a` string,
`b` string,
`c` string,
`d` string,
`e` double)
PARTITIONED BY (
`db_name` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hudi.hadoop.HoodieParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
'file:/tmp/test/hudi-user-data/MOE_PRODUCT_INFO.bhuvan_123'
TBLPROPERTIES (
'last_commit_time_sync'='20201010202918',
'transient_lastDdlTime'='1602341935')
Exception:
org.apache.hudi.exception.HoodieException: Unable to instantiate payload class
at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:78) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.common.util.SpillableMapUtils.convertToHoodieRecordPayload(SpillableMapUtils.java:116) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:277) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:306) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:239) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:81) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.getMergedLogRecordScanner(RealtimeCompactedRecordReader.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.<init>(RealtimeCompactedRecordReader.java:55) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.constructRecordReader(HoodieRealtimeRecordReader.java:70) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.<init>(HoodieRealtimeRecordReader.java:47) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:186) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376) ~[hive-exec-2.3.7.jar:2.3.7]
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:438) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) ~[hadoop-mapreduce-client-core-2.10.0.jar:?]
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270) ~[hadoop-mapreduce-client-common-2.10.0.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_222]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_222]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_222]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_222]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_222]
at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
... 20 more
Caused by: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.generic.GenericData$Record cannot be cast to org.apache.avro.generic.GenericRecord
at com.moengage.dpm.jobs.MergeHudiPayload.<init>(MergeHudiPayload.java:41) ~[dpm-feed-spark-jobs-1.0.10-rc0.jar:?]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_222]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_222]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_222]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_222]
at org.apache.hudi.common.util.ReflectionUtils.loadPayload(ReflectionUtils.java:76) ~[hudi-hadoop-mr-bundle-0.6.0.jar:0.6.0]
Line where the exception is thrown (MergeHudiPayload.java:41):
public MergeHudiPayload(Option<GenericRecord> record) {
this(record.isPresent() ? record.get() : null, (record1) -> 0); // natural order
}
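For context on the root cause: hudi-hadoop-mr-bundle relocates Avro under org.apache.hudi.org.apache.avro, so the record the bundle produces at runtime is a different type from the unshaded org.apache.avro.generic.GenericRecord the payload class was compiled against. A minimal self-contained sketch of that failure mode (the class names below are hypothetical stand-ins, not the real Hudi/Avro classes):

```java
public class ShadingDemo {
    // Stand-in for the unshaded interface org.apache.avro.generic.GenericRecord.
    interface GenericRecord {}

    // Stand-in for the relocated copy bundled inside hudi-hadoop-mr-bundle:
    // org.apache.hudi.org.apache.avro.generic.GenericData$Record. To the JVM
    // it is an unrelated type that does NOT implement the unshaded interface.
    static class ShadedRecord {}

    public static void main(String[] args) {
        Object rec = new ShadedRecord();
        try {
            // Same failure mode as the stack trace above: casting the
            // relocated record to the unshaded interface.
            GenericRecord r = (GenericRecord) rec;
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Two classes with identical shapes but different (relocated) package names are unrelated types to the JVM, which is why the cast in the payload constructor fails even though the code compiles fine against unshaded Avro.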
Issue Analytics
- Created: 3 years ago
- Comments: 8 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@tandonraghav: Yes, you need to shade the jar containing the custom record payload. Here is some context: http://hudi.apache.org/releases.html#release-highlights-1
Look for the section starting with…
More context: https://issues.apache.org/jira/browse/HUDI-519
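The shading the maintainer describes can be sketched with maven-shade-plugin, assuming the payload jar is built with Maven (plugin version and relocation pattern below are illustrative and would need to match your build and Hudi version):

```xml
<!-- Hypothetical sketch: relocate Avro inside the payload jar so the
     payload class links against the same shaded Avro classes that
     hudi-hadoop-mr-bundle ships to Hive. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Must match the relocation used by hudi-hadoop-mr-bundle,
                 visible in the exception message above -->
            <pattern>org.apache.avro</pattern>
            <shadedPattern>org.apache.hudi.org.apache.avro</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this relocation applied, the payload class's references to GenericRecord resolve to the same shaded Avro types the bundle hands it at runtime, so the cast no longer fails.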
@bvaradar Thanks for the help. I was able to resolve it by deploying the shaded jar. I feel this should be documented better.
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hudi-considerations.html & https://hudi.apache.org/docs/querying_data.html