Demo : Unexpected result in some queries

I have two problems with the master branch (commit: ae3c02fb3). My steps:

  1. Use HDFSParquetImporter to import from Hive to Hudi.

  2. Use HoodieDeltaStreamer to import new data from Kafka. (I added an option to allow a missing checkpointStr.) The config is the same as in #779, with --disable-compaction. Afterwards, select distinct _hoodie_commit_time from rt_table/ro_table returns only the first commit time (I used max() to confirm that no newer commits are returned), even though there are newer .deltacommit files in the .hoodie folder.

  3. Restart the Spark job. In the Spark UI, the job hangs at collect at HoodieMergeOnReadTable.java:318 (it hangs every time).

org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
com.uber.hoodie.table.HoodieMergeOnReadTable.rollback(HoodieMergeOnReadTable.java:318)
com.uber.hoodie.HoodieWriteClient.doRollbackAndGetStats(HoodieWriteClient.java:884)
com.uber.hoodie.HoodieWriteClient.rollbackInternal(HoodieWriteClient.java:962)
com.uber.hoodie.HoodieWriteClient.rollback(HoodieWriteClient.java:773)
com.uber.hoodie.HoodieWriteClient.rollbackInflightCommits(HoodieWriteClient.java:1182)
com.uber.hoodie.HoodieWriteClient.startCommitWithTime(HoodieWriteClient.java:1050)
com.uber.hoodie.HoodieWriteClient.startCommit(HoodieWriteClient.java:1043)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.startCommit(DeltaSync.java:406)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:332)
com.uber.hoodie.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:227)
com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:382)
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
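
The mismatch in step 2 (newer .deltacommit files on disk, but only the first commit time visible to queries) can be checked mechanically. The following is a minimal sketch, not part of Hudi: it assumes timeline files are named `<commit_time>.deltacommit`, that you have already listed the `.hoodie` folder and run the `select distinct _hoodie_commit_time` query yourself, and the helper names and sample values are hypothetical.

```python
from typing import Iterable, List

# Assumption: a completed delta commit appears in .hoodie as "<commit_time>.deltacommit".
DELTACOMMIT_SUFFIX = ".deltacommit"


def completed_delta_commits(file_names: Iterable[str]) -> List[str]:
    """Extract commit times from file names that end in '.deltacommit'."""
    times = []
    for name in file_names:
        if name.endswith(DELTACOMMIT_SUFFIX):
            times.append(name[: -len(DELTACOMMIT_SUFFIX)])
    return sorted(times)


def commits_missing_from_query(
    file_names: Iterable[str], queried_times: Iterable[str]
) -> List[str]:
    """Return delta commit times present on disk but absent from the query result."""
    queried = set(queried_times)
    return [t for t in completed_delta_commits(file_names) if t not in queried]


if __name__ == "__main__":
    # Hypothetical listing of the .hoodie folder and a hypothetical query result.
    hoodie_files = [
        "20190725101500.deltacommit",
        "20190725103000.deltacommit",
        "hoodie.properties",
    ]
    seen_by_query = ["20190725101500"]
    print(commits_missing_from_query(hoodie_files, seen_by_query))
```

A non-empty result means the table view is not exposing commits that the timeline already records as delta commits, which is the behavior described in step 2.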

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 32 (31 by maintainers)

Top GitHub Comments

1 reaction
vinothchandar commented, Jul 30, 2019

Weird that it is intermittent. @bhasudha let's meet and take a stab at this sometime… This also blocks #751 and related efforts, which blocks the Spark upgrade, which blocks timestamp support 😃

0 reactions
vinothchandar commented, Dec 2, 2019

Should be fixed now.
