Rebalance stuck, partitions rebalance not progressing in Kafka

I am seeing an issue with the partition reassignments performed by Cruise Control (CC): any rebalance that triggers more than one batch of partition reassignments (tasks?) in Kafka never completes. It gets stuck in the second batch, with CC waiting for the reassignment to complete and the Kafka controller doing nothing. CC updates the reassign_partition zk node, but the Kafka controller does not trigger the rebalance. The only workaround is to delete the /controller node.
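The workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, not something taken from the issue: the function name `force_controller_reelection` is hypothetical, and `zk` is assumed to be a client with kazoo-style `exists`, `get`, and `delete` methods (for example a started `kazoo.client.KazooClient`). It also assumes the `/controller` znode holds a JSON document with a `brokerid` field.

```python
import json

CONTROLLER_PATH = "/controller"  # znode named in the issue


def force_controller_reelection(zk, path=CONTROLLER_PATH):
    """Delete the controller znode so the brokers elect a new controller.

    `zk` is assumed to expose kazoo-style exists/get/delete methods.
    Returns the broker id that held the controller role, or None if the
    znode was not present.
    """
    if zk.exists(path) is None:
        return None
    data, _stat = zk.get(path)
    # Assumption: the znode payload is JSON containing a "brokerid" field.
    broker_id = json.loads(data.decode("utf-8")).get("brokerid")
    zk.delete(path)
    return broker_id
```

Deleting the znode forces a controller election, so it is a blunt instrument. Returning the previous broker id makes it easier to check afterwards that a different broker (or the same one, freshly elected) picked up the pending reassignment.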

Working with Kafka 2.4.1 on the server side and a build of CC (master) that has the 2.4.1 client libs.

Looking at https://github.com/linkedin/cruise-control/blob/b386141146dfe4cc013b8233c19434e09b50027c/cruise-control/src/main/scala/com/linkedin/kafka/cruisecontrol/executor/ExecutorUtils.scala#L89, it seems that CC does not use ReassignPartitionsCommand but instead updates the zk node directly. Apparently this is not enough in newer versions of Kafka.
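To make the "updating the zk node directly" step concrete, here is a small sketch of the kind of JSON document a tool would write to the reassignment znode (commonly `/admin/reassign_partitions`), split into batches. The function names, the batch size, and the sample topic are hypothetical; the payload shape (`version`, `partitions`, `topic`, `partition`, `replicas`) is the format I understand the ZooKeeper-based reassignment path to use, and this is not CC's actual implementation.

```python
import json


def reassignment_payload(assignments):
    """Serialize {(topic, partition): [replica broker ids]} as znode JSON."""
    partitions = [
        {"topic": topic, "partition": partition, "replicas": list(replicas)}
        for (topic, partition), replicas in sorted(assignments.items())
    ]
    return json.dumps({"version": 1, "partitions": partitions}, sort_keys=True)


def batches(assignments, batch_size):
    """Yield the assignments in deterministic chunks of at most batch_size."""
    items = sorted(assignments.items())
    for start in range(0, len(items), batch_size):
        yield dict(items[start:start + batch_size])


# Hypothetical example: three partitions moved in batches of two.
assignments = {("t", 0): [1, 2], ("t", 1): [2, 3], ("t", 2): [3, 1]}
payloads = [reassignment_payload(batch) for batch in batches(assignments, 2)]
```

In the scenario described in the issue, the first payload would be processed and the second is the write the controller never acts on.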

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

3 reactions
isugimpy commented, Jun 23, 2020

A little late on this one since it’s closed, but running CC 2.4.8 on Kafka 2.4.0, I’m still experiencing this behavior. Would it be possible to reopen this issue, or should I open a new one?

1 reaction
amuraru commented, Apr 23, 2020

One user on Gitter pointed to https://issues.apache.org/jira/browse/KAFKA-9478 as the possible cause of this.

Top Results From Across the Web

Solving My Weird Kafka Rebalancing Problems & Explaining ...
The idea is that a consumer does not need to revoke a partition if the group coordinator reassigns the same partition to the...

Partitions processing stuck until state store is rebuilt during ...
So all the partitions - part1, part2 and part3 would be stuck till the rebalancing is complete.

Rebalancing stuck, never finishes - The Mail Archive
I grabbed some logs from the time when it was continuously rebalancing. Logs are mixed from 6 pods, but all pods have the...

Re: Kafka Streams application stuck rebalancing on startup
We see this when all four operators are in play. If you change the sample streams configuration to not do that final foreign...

The Unofficial Kafka Rebalance How-To - Tom Lee (dot co)
Rebalances give all consumers in a consumer group a chance to negotiate partition assignments amongst themselves. The exact mechanics of the ...
