Help: Commit can't be completed since group has already rebalanced...
Made the switch from pykafka to kafka-python over the weekend, which resolved an issue where my Producer would hang sending data to a Kafka cluster I don’t control.
An unforeseen consequence is that I can no longer commit my offsets, seemingly only for messages that take a while to process, though that assumption could be wrong (I’ve also seen some other messages processed that may be duplicates). I never noticed problems updating my offset with the other library, so I don’t think this has anything to do with Kafka broker settings; it is more likely something in my consumer.
For reference, I’m using kafka-python 1.2.2 and Kafka 0.9.
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
I’m simply creating a single consumer with a group, grabbing the first message available, processing it, and moving on.
from kafka import KafkaConsumer

consumer = KafkaConsumer(topic, bootstrap_servers=server_list, group_id=group, enable_auto_commit=False)
for message in consumer:
    process_message(message.value)
    consumer.commit()  # commit after each processed message
Top GitHub Comments
How long does your process_message take in the worst case? Are you using the default heartbeat and session timeout parameters? (It appears so from the consumer side, but you might verify that you haven’t modified the default server-side configs.)
pykafka maintains a custom leader election / partition assignment system, and I don’t know the details well enough to comment on it. kafka-python attempts to implement exactly the same group coordination system, algorithms, and configuration parameters as the official Java client. But you have found one of the issues with the official implementation, namely that “long” message processing can cause unwanted group rebalance operations and interfere with offset commits. Rather than implement and maintain our own system here, I prefer to follow the official implementation. They are currently discussing / implementing a background heartbeat mechanism that should help address this. Until then, the recommendation is to tune your heartbeat and session timeouts relative to your worst-case message processing time.
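One way to answer the worst-case question is to time every call to the processing function and keep the maximum. The following is only a minimal sketch: `WorstCaseTimer` is a hypothetical helper name, not part of kafka-python, and it assumes processing happens synchronously inside the wrapped callable.

```python
import time


class WorstCaseTimer:
    """Wraps a callable and remembers the longest observed call duration."""

    def __init__(self, func):
        self.func = func
        self.worst_seconds = 0.0

    def __call__(self, *args, **kwargs):
        start = time.monotonic()
        try:
            return self.func(*args, **kwargs)
        finally:
            # Record the duration even if the wrapped call raises.
            elapsed = time.monotonic() - start
            self.worst_seconds = max(self.worst_seconds, elapsed)
```

Wrapping the function from the issue as `process_message = WorstCaseTimer(process_message)` and logging `process_message.worst_seconds` periodically would give a figure to compare against the session timeout.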
group.max.session.timeout.ms is the broker configuration. It defaults to 30000 (30 seconds). If you increase that, you should be able to pass a larger value for session_timeout_ms to your KafkaConsumer instances.
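As a rough sketch of that tuning, the helper below derives consumer keyword arguments from a measured worst-case processing time. `consumer_timeout_kwargs` and `build_consumer` are hypothetical names, the 2x safety margin is an arbitrary assumption, and the 6000 ms and 30000 ms bounds assume the broker still uses its default group.min.session.timeout.ms and group.max.session.timeout.ms. `session_timeout_ms` and `heartbeat_interval_ms` are existing KafkaConsumer parameters; the heartbeat is set to a third of the session timeout, in line with the usual Kafka guidance.

```python
def consumer_timeout_kwargs(worst_case_seconds,
                            broker_min_session_ms=6000,
                            broker_max_session_ms=30000):
    """Return session/heartbeat settings sized to the worst-case processing time."""
    # Assumption: allow twice the worst observed processing time.
    needed_ms = int(worst_case_seconds * 1000 * 2)
    if needed_ms > broker_max_session_ms:
        raise ValueError(
            "need session_timeout_ms=%d, but the broker allows at most %d; "
            "raise group.max.session.timeout.ms on the broker first"
            % (needed_ms, broker_max_session_ms)
        )
    # The broker rejects session timeouts below its configured minimum.
    session_ms = max(needed_ms, broker_min_session_ms)
    return {
        "session_timeout_ms": session_ms,
        # Heartbeat at no more than one third of the session timeout.
        "heartbeat_interval_ms": session_ms // 3,
    }


def build_consumer(topic, server_list, group, worst_case_seconds):
    """Create a consumer like the one in the issue, with tuned timeouts."""
    # Imported lazily so the helper above works without kafka-python installed.
    from kafka import KafkaConsumer

    return KafkaConsumer(
        topic,
        bootstrap_servers=server_list,
        group_id=group,
        enable_auto_commit=False,
        **consumer_timeout_kwargs(worst_case_seconds)
    )
```

For example, a worst case of 10 seconds yields a 20000 ms session timeout, while a worst case of 20 seconds raises an error because it would exceed the default broker maximum.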