Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].

Affected Version

>=0.15.1-incubating

Description

2019-09-28T12:17:34,157 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Encountered exception while running task.
java.util.concurrent.ExecutionException: org.apache.druid.java.util.common.ISE: Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:753) ~[druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:246) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:167) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.15.1-incubating.jar:0.15.1-incubating]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: org.apache.druid.java.util.common.ISE: Failed to publish segments because of [java.lang.RuntimeException: Aborting transaction!].
	at org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver.lambda$publishInBackground$8(BaseAppenderatorDriver.java:602) ~[druid-server-0.15.1-incubating.jar:0.15.1-incubating]
	... 4 more

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 1
  • Comments: 28 (7 by maintainers)

Top GitHub Comments

4 reactions
pjain1 commented, Sep 30, 2021

One cause of this is resubmitting the supervisor to consume from a different topic without changing the supervisor name. In that case you see the following message in the overlord logs:

Not updating metadata, existing state[KafkaDataSourceMetadata{SeekableStreamStartSequenceNumbers=SeekableStreamEndSequenceNumbers{stream='STREAM1', partitionSequenceNumberMap={}] in metadata store doesn't match to the new start state[KafkaDataSourceMetadata{SeekableStreamStartSequenceNumbers=SeekableStreamStartSequenceNumbers{stream='STREAM2', partitionSequenceNumberMap={}, exclusivePartitions=[]}}]

Druid stores the end offsets of the last published segments for a topic in the druid_dataSource table and, for consistency, checks that they match the start offsets of the segments currently being published. It uses the datasource name (which is the same as the supervisor name) as the key for this metadata. So when you change the topic name, the start offsets of the new task will not match the stored end offsets, and the task fails.
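To make the check described above concrete, here is a minimal sketch in plain Java. It is not Druid's actual code: the class `PublishCheckSketch`, the `StreamState` type, the in-memory `metadataStore` map, and the `publish` method are all hypothetical stand-ins, and the real logic is transactional and more involved. It only illustrates why a start state for a different stream is rejected when the stored entry is keyed by datasource name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class PublishCheckSketch {
    /** Hypothetical stand-in for committed stream metadata: stream name plus per-partition offsets. */
    static final class StreamState {
        final String stream;
        final Map<Integer, Long> offsets;

        StreamState(String stream, Map<Integer, Long> offsets) {
            this.stream = stream;
            this.offsets = offsets;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof StreamState)) {
                return false;
            }
            StreamState other = (StreamState) o;
            return stream.equals(other.stream) && offsets.equals(other.offsets);
        }

        @Override
        public int hashCode() {
            return Objects.hash(stream, offsets);
        }
    }

    // Stand-in for the metadata table: one entry per datasource name (assumption for illustration).
    static final Map<String, StreamState> metadataStore = new HashMap<>();

    /**
     * Accepts the publish only if the task's start state equals the stored end state
     * (or nothing is stored yet), then records the new end state.
     */
    static boolean publish(String dataSource, StreamState start, StreamState end) {
        StreamState stored = metadataStore.get(dataSource);
        if (stored != null && !stored.equals(start)) {
            return false; // mismatch: the real system aborts the transaction here
        }
        metadataStore.put(dataSource, end);
        return true;
    }

    public static void main(String[] args) {
        StreamState s1Start = new StreamState("STREAM1", Collections.singletonMap(0, 0L));
        StreamState s1End = new StreamState("STREAM1", Collections.singletonMap(0, 100L));
        System.out.println("first publish accepted: " + publish("my_datasource", s1Start, s1End));

        // Same datasource name, but the supervisor now reads a different stream.
        StreamState s2Start = new StreamState("STREAM2", Collections.singletonMap(0, 0L));
        StreamState s2End = new StreamState("STREAM2", Collections.singletonMap(0, 50L));
        System.out.println("publish after topic change accepted: " + publish("my_datasource", s2Start, s2End));
    }
}
```

In this sketch the second call is rejected because the entry stored under `my_datasource` still describes `STREAM1`. Removing that entry, or using a different key, lets the new stream publish, which mirrors the two workarounds discussed in this comment.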

Solutions:

  1. If no existing datasource has the same name as the new topic/stream, terminate the existing supervisor and submit a new supervisor whose datasource name matches the new topic name.
  2. If you want to keep the datasource name (this is a hack): terminate the currently running supervisor, delete the row for the corresponding dataSource in the druid_dataSource table, and resubmit the supervisor. A word of caution: this edits the metadata store directly, so do it at your own risk, as it may cause data consistency issues for your Druid datasource.

The general advice is to keep the datasource name the same as the topic name, and if you change the topic, to create a new supervisor with the changed name.

0 reactions
marioramaglia commented, Jun 4, 2021

Confirmed with 0.20.2. It also seems related to the fact that the ingested Kafka topic has partitions: recreating the topic without partitions seems to fix the issue. Please help. Thanks.

