question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[SUPPORT] Hudi error while running HoodieMultiTableDeltaStreamer: Commit 20220809112130103 failed and rolled-back !

See original GitHub issue

Environment: AWS EMR Cluster: ‘emr-6.7.0’ version Hudi - 0.11.0 Hive - 3.1.3 Hadoop: 3.2.1 Spark: 3.2.1

Steps to reproduce the behavior: I am running Multi table streamer and as of now I am running it for 1 table only. We are using SQL server for CDC and debezium connector to push into Kafka and then reading kafka to pull records into hudi/hive tables.

kafka-source.properties

hoodie.deltastreamer.ingestion.tablesToBeIngested=default.rrm*******
hoodie.deltastreamer.ingestion.default.rrmencounter.configFile=s3://hudi-multistreamer-roobal/hudi-ingestion-config/rrm*******-config.properties
auto.offset.reset=earliest
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.TimestampBasedKeyGenerator
hoodie.deltastreamer.source.kafka.value.deserializer.class=io.confluent.kafka.serializers.KafkaAvroDeserializer
schema.registry.url=http://xx.xxx.xx.xxx:8080/apis/ccompat/v6
bootstrap.servers=b-2.XXXXXXXXXX.amazonaws.com:9092

rrmXXXXX-config.properties

hoodie.datasource.write.recordkey.field=rrm******id
hoodie.datasource.write.partitionpath.field=rec******
hoodie.deltastreamer.ingestion.targetBasePath=s3://hudi-multistreamer-roobal/hudi/rrmencounter
hoodie.datasource.hive_sync.database=default
hoodie.datasource.hive_sync.table=rrm*******
hoodie.datasource.hive_sync.partition_fields=rec******
hoodie.datasource.hive_sync.support_timestamp=true
hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd
hoodie.deltastreamer.keygen.timebased.timezone=GMT+8:00
hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled=true
hoodie.datasource.write.hive_style_partitioning=true
hoodie.deltastreamer.source.kafka.topic=ROOBXXXXXX.dbo.rrm*******
hoodie.deltastreamer.schemaprovider.registry.url=http://xx.xxx.xx.xxx:8080/apis/ccompat/v6/subjects/ROOB*****.dbo.rrm******-value/versions/latest
hoodie.deltastreamer.schemaprovider.registry.targetUrl=http://xx.xxx.xx.xxx:8080/apis/ccompat/v6/subjects/ROOBxxxxxx.dbo.rrm******-value/versions/latest

Spark command:

spark-submit  \
  --jars "/usr/lib/spark/external/lib/spark-avro.jar" \
  --master yarn --deploy-mode client \
  --class org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer s3://XXXXX/hudi-utilities-bundle_2.12-0.11.0_edfx.jar \
  --props s3://xxxxxxxx/hudi-ingestion-config/kafka-source.properties \
  --config-folder s3://xxxxxxxx/hudi-ingestion-config/ \
  --payload-class org.apache.hudi.common.model.debezium.MssqlDebeziumAvroPayload \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --source-class org.apache.hudi.utilities.sources.debezium.MssqlDebeziumSource \
  --source-ordering-field _event_lsn \
  --min-sync-interval-seconds 60 \
  --enable-hive-sync \
  --table-type COPY_ON_WRITE \
  --base-path-prefix s3:///xxxxxxxx/hudi \
  --target-table rrm******  --continuous \
  --op UPSERT

Output: It creates multilevel directories based on partition like 2022 / 06 / 11 but neither hive table nor parquet data file is getting generated.

Please note: HoodieDeltaStreamer is working fine for same table with following spark command but HoodieMultiTableDeltaStreamer is not working. May be I am missing something

Following works fine for HoodieDeltaStreamer :

spark-submit  \
  --jars "/usr/lib/spark/external/lib/spark-avro.jar" \
  --master yarn --deploy-mode client \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer s3://*******/hudi-utilities-bundle_2.12-0.11.0_edfx.jar \
  --table-type COPY_ON_WRITE --op UPSERT \
  --target-base-path s3://hudi-multistreamer-roobal/hudi/rrm******** \
  --target-table rrm*******  --continuous \
  --min-sync-interval-seconds 60 \
  --source-class org.apache.hudi.utilities.sources.debezium.MssqlDebeziumSource \
  --source-ordering-field _event_lsn \
  --payload-class org.apache.hudi.common.model.debezium.MssqlDebeziumAvroPayload \
  --hoodie-conf schema.registry.url=http://xx.xxx.xx.xxx:8080/apis/ccompat/v6 \
  --hoodie-conf hoodie.deltastreamer.schemaprovider.registry.url=http://xx.xxx.xx.xxx:8080/apis/ccompat/v6/subjects/ROOBxxxxxxxx.dbo.rrm*********-value/versions/latest \
  --hoodie-conf hoodie.deltastreamer.source.kafka.value.deserializer.class=io.confluent.kafka.serializers.KafkaAvroDeserializer \
  --hoodie-conf hoodie.deltastreamer.source.kafka.topic=ROOBxxxxxxxx.dbo.rrm******** \
  --hoodie-conf auto.offset.reset=earliest \
  --hoodie-conf hoodie.datasource.write.recordkey.field=rrm*******id \
  --hoodie-conf hoodie.datasource.write.partitionpath.field=rec******* \
  --hoodie-conf hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS \
  --hoodie-conf hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd \
  --hoodie-conf hoodie.deltastreamer.keygen.timebased.timezone=GMT+8:00 \
  --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.TimestampBasedKeyGenerator \
  --hoodie-conf hoodie.datasource.hive_sync.support_timestamp=true \
  --enable-hive-sync \
  --hoodie-conf hoodie.datasource.write.hive_style_partitioning=true \
  --hoodie-conf hoodie.datasource.hive_sync.database=default \
  --hoodie-conf hoodie.datasource.hive_sync.table=rrm******** \
  --hoodie-conf hoodie.datasource.hive_sync.partition_fields=rec******* \
  --hoodie-conf hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled=true \
  --hoodie-conf bootstrap.servers=b-2.xxxxxxxxxxxxxxxxxxxx.amazonaws.com:9092

Stacktrace

22/08/09 11:21:53 INFO SparkContext: Starting job: collect at SparkHoodieBackedTableMetadataWriter.java:166
22/08/09 11:21:53 INFO DAGScheduler: Job 21 finished: collect at SparkHoodieBackedTableMetadataWriter.java:166, took 0.000510 s
22/08/09 11:21:53 INFO MultipartUploadOutputStream: close closed:false s3://xxxxxxxxxxxxxxxx/hudi/rrm**************/.hoodie/20220809112149403.rollback
22/08/09 11:21:53 INFO SparkContext: Starting job: collectAsMap at HoodieSparkEngineContext.java:151
22/08/09 11:21:53 INFO DAGScheduler: Got job 22 (collectAsMap at HoodieSparkEngineContext.java:151) with 2 output partitions
22/08/09 11:21:53 INFO DAGScheduler: Final stage: ResultStage 51 (collectAsMap at HoodieSparkEngineContext.java:151)
22/08/09 11:21:53 INFO DAGScheduler: Parents of final stage: List()
22/08/09 11:21:53 INFO DAGScheduler: Missing parents: List()
22/08/09 11:21:53 INFO DAGScheduler: Submitting ResultStage 51 (MapPartitionsRDD[100] at mapToPair at HoodieSparkEngineContext.java:148), which has no missing parents
22/08/09 11:21:53 INFO MemoryStore: Block broadcast_29 stored as values in memory (estimated size 115.5 KiB, free 911.4 MiB)
22/08/09 11:21:53 INFO MemoryStore: Block broadcast_29_piece0 stored as bytes in memory (estimated size 41.1 KiB, free 911.4 MiB)
22/08/09 11:21:53 INFO BlockManagerInfo: Added broadcast_29_piece0 in memory on ip-10-151-46-163.us-west-2.compute.internal:40011 (size: 41.1 KiB, free: 912.2 MiB)
22/08/09 11:21:53 INFO SparkContext: Created broadcast 29 from broadcast at DAGScheduler.scala:1518
22/08/09 11:21:53 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 51 (MapPartitionsRDD[100] at mapToPair at HoodieSparkEngineContext.java:148) (first 15 tasks are for partitions Vector(0, 1))
22/08/09 11:21:53 INFO YarnScheduler: Adding task set 51.0 with 2 tasks resource profile 0
22/08/09 11:21:53 INFO TaskSetManager: Starting task 0.0 in stage 51.0 (TID 2020) (ip-10-151-46-136.us-west-2.compute.internal, executor 1, partition 0, PROCESS_LOCAL, 4440 bytes) taskResourceAssignments Map()
22/08/09 11:21:53 INFO TaskSetManager: Starting task 1.0 in stage 51.0 (TID 2021) (ip-10-151-46-136.us-west-2.compute.internal, executor 1, partition 1, PROCESS_LOCAL, 4436 bytes) taskResourceAssignments Map()
22/08/09 11:21:53 INFO BlockManagerInfo: Added broadcast_29_piece0 in memory on ip-10-151-46-136.us-west-2.compute.internal:34971 (size: 41.1 KiB, free: 1846.7 MiB)
22/08/09 11:21:53 INFO TaskSetManager: Finished task 1.0 in stage 51.0 (TID 2021) in 52 ms on ip-10-151-46-136.us-west-2.compute.internal (executor 1) (1/2)
22/08/09 11:21:53 INFO TaskSetManager: Finished task 0.0 in stage 51.0 (TID 2020) in 91 ms on ip-10-151-46-136.us-west-2.compute.internal (executor 1) (2/2)
22/08/09 11:21:53 INFO YarnScheduler: Removed TaskSet 51.0, whose tasks have all completed, from pool
22/08/09 11:21:53 INFO DAGScheduler: ResultStage 51 (collectAsMap at HoodieSparkEngineContext.java:151) finished in 0.101 s
22/08/09 11:21:53 INFO DAGScheduler: Job 22 is finished. Cancelling potential speculative or zombie tasks for this job
22/08/09 11:21:53 INFO YarnScheduler: Killing all running tasks in stage 51: Stage finished
22/08/09 11:21:53 INFO DAGScheduler: Job 22 finished: collectAsMap at HoodieSparkEngineContext.java:151, took 0.105640 s
22/08/09 11:21:53 ERROR HoodieDeltaStreamer: Shutting down delta-sync due to exception
org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:649)
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:331)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:675)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
22/08/09 11:21:53 ERROR HoodieAsyncService: Service shutdown with error
java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
        at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:189)
        at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:186)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.sync(HoodieMultiTableDeltaStreamer.java:416)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.main(HoodieMultiTableDeltaStreamer.java:247)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1000)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1089)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1098)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:709)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:649)
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:331)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:675)
        ... 4 more
22/08/09 11:21:53 ERROR HoodieMultiTableDeltaStreamer: error while running MultiTableDeltaStreamer for table: rrm********
org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:191)
        at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:186)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.sync(HoodieMultiTableDeltaStreamer.java:416)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.main(HoodieMultiTableDeltaStreamer.java:247)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1000)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1089)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1098)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
        at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:189)
        ... 16 more
Caused by: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:709)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:649)
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:331)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:675)
        ... 4 more
22/08/09 11:21:53 INFO Javalin: Stopping Javalin ...
22/08/09 11:21:53 INFO AbstractConnector: Stopped Spark@64387c17{HTTP/1.1, (http/1.1)}{0.0.0.0:8090}
22/08/09 11:21:53 ERROR Javalin: Javalin failed to stop gracefully
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.hudi.org.eclipse.jetty.server.AbstractConnector.doStop(AbstractConnector.java:333)
        at org.apache.hudi.org.eclipse.jetty.server.AbstractNetworkConnector.doStop(AbstractNetworkConnector.java:88)
        at org.apache.hudi.org.eclipse.jetty.server.ServerConnector.doStop(ServerConnector.java:248)
        at org.apache.hudi.org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
        at org.apache.hudi.org.eclipse.jetty.server.Server.doStop(Server.java:450)
        at org.apache.hudi.org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
        at io.javalin.Javalin.stop(Javalin.java:195)
        at org.apache.hudi.timeline.service.TimelineService.close(TimelineService.java:334)
        at org.apache.hudi.client.embedded.EmbeddedTimelineService.stop(EmbeddedTimelineService.java:137)
        at org.apache.hudi.utilities.deltastreamer.DeltaSync.close(DeltaSync.java:876)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.close(HoodieDeltaStreamer.java:811)
        at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.onDeltaSyncShutdown(HoodieDeltaStreamer.java:222)
        at org.apache.hudi.async.HoodieAsyncService.lambda$shutdownCallback$0(HoodieAsyncService.java:171)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
22/08/09 11:21:53 INFO Javalin: Javalin has stopped
22/08/09 11:21:53 INFO SparkUI: Stopped Spark web UI at http://ip-10-151-46-163.us-west-2.compute.internal:8090
22/08/09 11:21:53 INFO YarnClientSchedulerBackend: Interrupting monitor thread
22/08/09 11:21:53 INFO YarnClientSchedulerBackend: Shutting down all executors
22/08/09 11:21:53 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/08/09 11:21:53 INFO YarnClientSchedulerBackend: YARN client scheduler backend Stopped
22/08/09 11:21:53 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/08/09 11:21:53 INFO MemoryStore: MemoryStore cleared
22/08/09 11:21:53 INFO BlockManager: BlockManager stopped
22/08/09 11:21:53 INFO BlockManagerMaster: BlockManagerMaster stopped
22/08/09 11:21:53 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/08/09 11:21:53 INFO SparkContext: Successfully stopped SparkContext
22/08/09 11:21:53 INFO ShutdownHookManager: Shutdown hook called
22/08/09 11:21:53 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-1ac0b680-8bcb-46e4-89f7-3da932646847
22/08/09 11:21:53 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-2b324903-b982-45d3-944a-647af07783f9

Add the stacktrace of the error. org.apache.hudi.exception.HoodieException: Commit 20220809112130103 failed and rolled-back !

Issue Analytics

  • State:closed
  • Created a year ago
  • Comments:20 (11 by maintainers)

github_iconTop GitHub Comments

2reactions
rmahindra123commented, Aug 23, 2022

@ROOBALJINDAL When using DebeziumSource, please do not set the --schemaprovider-class, since the schema is applied in the source. Can you try after removing that config? I see that you did not apply that config for HoodieDeltastreamer

2reactions
nsivabalancommented, Aug 10, 2022

@pratyakshsharma : if any documentation needs to be improved wrt multi table deltastreamer, can you work on it. CC @bhasudha . we have seen few users from the community raised questions/issues around schema registry configs for multi table deltastreamer. would be beneficial if we can enhance our docs.

Read more comments on GitHub >

github_iconTop Results From Across the Web

[GitHub] [hudi] rmahindra123 commented on issue #6348 ...
... commented on issue #6348: [SUPPORT] Hudi error while running HoodieMultiTableDeltaStreamer: Commit 20220809112130103 failed and rolled-back !
Read more >
Streaming Ingestion - Apache Hudi
The HoodieDeltaStreamer utility (part of hudi-utilities-bundle ) provides the way to ingest from different sources such as DFS or Kafka, with the following ......
Read more >
Hudi: Uber Engineering's Incremental Processing Framework ...
Failed compaction could write partial parquet files. This is handled by the query layer, which filters file versions based on the commit ......
Read more >
Work with a Hudi dataset - Amazon EMR - AWS Documentation
Hudi supports inserting, updating, and deleting data in Hudi datasets through Spark. For more information, see Writing Hudi tables in Apache Hudi ......
Read more >
Hudi Failed to delete for commit time for certain records
Not sure why it is throwing this error while trying to delete some records but not for others. Here is the Hudi write...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found