Infinite loop after BROKER_FAILURE - Broker does not exist
Hi guys! After a double BROKER_FAILURE, CC seems to be stuck in an infinite loop. I lost 2 brokers in a cluster of 3 brokers: Broker 1063 at 09:06:41 and Broker 1068 at 09:08:19. Since then, CC keeps trying to fix the anomaly but gets the exception "java.lang.IllegalArgumentException: Broker [1063, 1068] does not exist."
Does anyone have a clue about it? 😃
[2018-11-13 10:08:54,828] WARN BROKER_FAILURE detected {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
}. Self healing start time 13/11/2018 09:36:41. (com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier)
[2018-11-13 10:08:54,828] WARN Self-healing has been triggered. (com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier)
[2018-11-13 10:08:54,833] INFO Fixing anomaly {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
} (com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector)
[2018-11-13 10:08:54,839] WARN Anomaly handler received exception when try to fix the anomaly {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
}. (com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector)
com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker [1063, 1068] does not exist.
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.decommissionBrokers(KafkaCruiseControl.java:184)
at com.linkedin.kafka.cruisecontrol.detector.BrokerFailures.fix(BrokerFailures.java:44)
at com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector$AnomalyHandlerTask.fixAnomaly(AnomalyDetector.java:268)
at com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector$AnomalyHandlerTask.run(AnomalyDetector.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Broker [1063, 1068] does not exist.
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.sanityCheckBrokerPresence(KafkaCruiseControl.java:779)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.decommissionBrokers(KafkaCruiseControl.java:166)
... 10 more
[2018-11-13 10:08:54,839] WARN BROKER_FAILURE detected {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
}. Self healing start time 13/11/2018 09:36:41. (com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier)
[2018-11-13 10:08:54,839] WARN Self-healing has been triggered. (com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier)
[2018-11-13 10:08:54,844] INFO Fixing anomaly {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
} (com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector)
[2018-11-13 10:08:54,849] WARN Anomaly handler received exception when try to fix the anomaly {
Broker 1068 failed at 13/11/2018 09:08:19
Broker 1063 failed at 13/11/2018 09:06:41
}. (com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector)
com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker [1063, 1068] does not exist.
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.decommissionBrokers(KafkaCruiseControl.java:184)
at com.linkedin.kafka.cruisecontrol.detector.BrokerFailures.fix(BrokerFailures.java:44)
at com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector$AnomalyHandlerTask.fixAnomaly(AnomalyDetector.java:268)
at com.linkedin.kafka.cruisecontrol.detector.AnomalyDetector$AnomalyHandlerTask.run(AnomalyDetector.java:203)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Broker [1063, 1068] does not exist.
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.sanityCheckBrokerPresence(KafkaCruiseControl.java:779)
at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.decommissionBrokers(KafkaCruiseControl.java:166)
... 10 more
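To confirm from a log file that the fix attempts really are repeating back to back, the gaps between consecutive "Fixing anomaly" entries can be measured. The following is only a minimal Python sketch: the function name fix_attempt_gaps_ms is hypothetical (it is not part of Cruise Control), and it assumes the timestamp format shown in the excerpt above.

```python
import re
from datetime import datetime

# Matches the timestamp prefix used in the log excerpt above, e.g.
# "[2018-11-13 10:08:54,833] INFO Fixing anomaly {".
# Assumption: the log keeps this exact layout.
FIX_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] INFO Fixing anomaly"
)


def fix_attempt_gaps_ms(log_lines):
    """Return the gaps, in milliseconds, between consecutive fix attempts.

    Hypothetical helper for illustration; not part of Cruise Control.
    """
    stamps = []
    for line in log_lines:
        match = FIX_LINE.match(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            )
    # Pair each attempt with the next one and convert the difference to ms.
    return [
        round((later - earlier).total_seconds() * 1000)
        for earlier, later in zip(stamps, stamps[1:])
    ]


if __name__ == "__main__":
    sample = [
        "[2018-11-13 10:08:54,833] INFO Fixing anomaly {",
        "Broker 1068 failed at 13/11/2018 09:08:19",
        "[2018-11-13 10:08:54,844] INFO Fixing anomaly {",
    ]
    print(fix_attempt_gaps_ms(sample))
```

Gaps of only a few milliseconds, as in the two attempts logged above, suggest the anomaly is being re-queued immediately after each failed fix rather than after a back-off.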
Issue Analytics
- Created 5 years ago
- Comments: 6
Top GitHub Comments
@jocelyndrean Ah, I assumed that you were using a released version from the master branch (i.e. 0.1.*), not the version of CC that supports Kafka 2.0 (i.e. CC versions 2.*). It turns out that the patch I referred to above (i.e. https://github.com/linkedin/cruise-control/pull/352) was unintentionally not cherry-picked into the migrate_to_kafka_2_0 branch; hence, version 2.0.8 was missing that fix.

I just created version 2.0.9 with the fix (see https://github.com/linkedin/cruise-control/releases/tag/2.0.9), which should resolve the issue. Sorry for the confusion, and thanks for reporting this. Hope it helps!

Thanks for this release 2.0.9 😃