Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].

Affected Version

>=0.15.1-incubating

Description

2019-09-28T12:17:34,157 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Encountered exception while running task.
java.util.concurrent.ExecutionException: org.apache.druid.java.util.common.ISE: Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:753) ~[druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:246) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:167) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: org.apache.druid.java.util.common.ISE: Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].
	at org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver.lambda$publishInBackground$8(BaseAppenderatorDriver.java:602) ~[druid-server-0.15.1-incubating.jar:0.15.1-incubating]
	... 4 more

Top GitHub Comments

pjain1 commented, Sep 30, 2021

One cause of this is resubmitting the supervisor to consume from a different topic without changing the supervisor name. In that case you see the following message in the overlord logs:

Not updating metadata, existing state[KafkaDataSourceMetadata{SeekableStreamStartSequenceNumbers=SeekableStreamEndSequenceNumbers{stream='STREAM1', partitionSequenceNumberMap={}] in metadata store doesn't match to the new start state[KafkaDataSourceMetadata{SeekableStreamStartSequenceNumbers=SeekableStreamStartSequenceNumbers{stream='STREAM2', partitionSequenceNumberMap={}, exclusivePartitions=[]}}]

Druid stores the end offsets of the last published segments for a topic in the druid_dataSource table and, for consistency, checks that they match the start offsets of the segments currently being published. It uses the datasource name (which is the same as the supervisor name) as the key for this metadata. So when you change the topic name, the start offsets passed to the task no longer match the stored end offsets, and the task fails.
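To make the check described above concrete, here is a deliberately simplified toy model in Python. It is not Druid's code: the names `StreamState` and `MetadataStore` are made up for illustration, and the real comparison is more involved. It only shows why a publish keyed by the same datasource name is rejected once the stream changes.

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    """Hypothetical stand-in for the stream name plus per-partition offsets."""
    stream: str
    offsets: dict = field(default_factory=dict)


class MetadataStore:
    """Toy model of the metadata row kept per datasource name."""

    def __init__(self):
        self._committed = {}  # datasource name -> last committed end state

    def publish(self, datasource, start, end):
        existing = self._committed.get(datasource)
        # The publish only goes through if the stored end state equals
        # the start state of the segments being published.
        if existing is not None and existing != start:
            raise RuntimeError("Aborting transaction!")
        self._committed[datasource] = end


store = MetadataStore()
store.publish("events", StreamState("STREAM1", {0: 0}), StreamState("STREAM1", {0: 100}))
store.publish("events", StreamState("STREAM1", {0: 100}), StreamState("STREAM1", {0: 200}))

try:
    # Same datasource name, but the supervisor now reads a different stream.
    store.publish("events", StreamState("STREAM2", {0: 0}), StreamState("STREAM2", {0: 50}))
except RuntimeError as exc:
    print(f"publish rejected: {exc}")
```

In this model, the third publish fails because the row for "events" still holds the end state of STREAM1, which is the situation the log message above reports.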

Solution:

1. If no existing datasource has the same name as the new topic/stream, terminate the existing supervisor and submit a new supervisor whose datasource name matches the new topic name.
2. If you want to keep the datasource name (this is a hack): terminate the currently running supervisor, delete the row for the corresponding dataSource in the druid_dataSource table, and resubmit the supervisor. Word of caution: this edits the metadata store directly, so do it at your own risk, as it may cause data consistency issues for your Druid datasource.

The general advice is to keep the datasource name the same as the topic name, and if you change the topic, to create a new supervisor with the changed name.
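The second option (terminate, delete the metadata row, resubmit) could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: the Overlord address, the helper names, and the `%s` placeholder style are all assumptions, and the code only builds the requests and the SQL statement without sending anything. The supervisor endpoints used are the Overlord's `/druid/indexer/v1/supervisor` paths; check them against the documentation for your Druid version, and back up the metadata store before deleting anything.

```python
import json
import urllib.parse
import urllib.request

OVERLORD = "http://localhost:8090"  # assumption: address of your Overlord


def terminate_request(supervisor_id):
    """Build (but do not send) the POST that terminates a supervisor."""
    sid = urllib.parse.quote(supervisor_id, safe="")
    return urllib.request.Request(
        f"{OVERLORD}/druid/indexer/v1/supervisor/{sid}/terminate",
        method="POST",
    )


def delete_metadata_statement(datasource):
    """SQL for the manual step, run against the metadata store.

    The %s placeholder is an assumption; it depends on your database driver.
    """
    return "DELETE FROM druid_dataSource WHERE dataSource = %s", (datasource,)


def submit_request(spec):
    """Build (but do not send) the POST that submits a supervisor spec."""
    return urllib.request.Request(
        f"{OVERLORD}/druid/indexer/v1/supervisor",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each request would be sent with `urllib.request.urlopen(...)` in the order listed: terminate first, then run the delete against the metadata database, then submit the new spec.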

marioramaglia commented, Jun 4, 2021

Confirmed with 0.20.2. It also seems related to the fact that the ingested Kafka topic has partitions: recreating the topic without partitions seems to fix the issue. Please help. Thanks.
