OffsetCommitRequest timeout causes consumers rebalancing

Description

Hello, we have been using the latest Kafka client library (1.2.0) with default settings. Our typical Kafka topic consumption loop reads an event and commits it, one at a time. Recently we have noticed many seemingly random Broker: Unknown member exceptions while committing event offsets.

The logs say:

{"Message":"[thrd:GroupCoordinator]: 
GroupCoordinator/3: Timed out HeartbeatRequest in flight (after 10963ms, timeout #0): possibly held back by preceeding OffsetCommitRequest with timeout in 48457ms",
"ClientInstance":"rdkafka#consumer-1","Facility":"REQTMOUT"}

then this:

{"Message":"[thrd:GroupCoordinator]: 
GroupCoordinator/3: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests",
"ClientInstance":"rdkafka#consumer-1","Facility":"REQTMOUT"}

And finally this happens (because of rebalancing):

Broker: Unknown member --->
Confluent.Kafka.KafkaException: Broker: Unknown member
   at Confluent.Kafka.Impl.SafeKafkaHandle.Commit(IEnumerable`1 offsets)
   at Confluent.Kafka.Consumer`2.Commit(ConsumeResult`2 result)

I’m wondering why we see this preceding OffsetCommitRequest when we commit offsets strictly one by one, sequentially.

Could you please help us figure out what is happening?

How to reproduce

NuGet packages installed: <PackageReference Include="Confluent.Kafka" Version="1.2.0" />

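while (true)
{
    // Wait up to 500 ms for the next message.
    consumeResult = _consumer.Consume(TimeSpan.FromMilliseconds(500));
    if (consumeResult == null)
    {
        return;
    }
    // Synchronously commit the offset of the message just consumed.
    _consumer.Commit(consumeResult);
}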
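For readers who want to see where the exception from the stack trace above surfaces, here is a minimal sketch of the same consume-and-commit loop with the commit wrapped in a handler. It is not a fix for the timeout itself and not the reporter's actual code: the broker address, group id, and topic name are hypothetical placeholders, and all other settings are left at their defaults as in the report.

```csharp
using System;
using Confluent.Kafka;

class CommitLoopSketch
{
    static void Main()
    {
        var config = new ConsumerConfig
        {
            BootstrapServers = "localhost:9092", // assumption: a broker reachable locally
            GroupId = "example-group"            // hypothetical consumer group id
        };

        using (var consumer = new ConsumerBuilder<Ignore, string>(config).Build())
        {
            consumer.Subscribe("example-topic"); // hypothetical topic name

            while (true)
            {
                // Wait up to 500 ms for the next message; null means nothing arrived.
                var consumeResult = consumer.Consume(TimeSpan.FromMilliseconds(500));
                if (consumeResult == null)
                {
                    continue;
                }

                try
                {
                    // Synchronous commit of the offset for the message just consumed.
                    consumer.Commit(consumeResult);
                }
                catch (KafkaException e) when (e.Error.Code == ErrorCode.UnknownMemberId)
                {
                    // The coordinator no longer recognizes this member, which is
                    // consistent with the group having rebalanced. The offset was
                    // not committed, so the message may be delivered again.
                    Console.Error.WriteLine($"Commit rejected: {e.Error.Reason}");
                }
            }
        }
    }
}
```

Catching only `ErrorCode.UnknownMemberId` keeps other commit failures visible. Whether to log and continue, as here, or to stop the consumer is an application decision that the issue does not settle.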

Issue Analytics

State:
Created 4 years ago
Comments: 17 (8 by maintainers)

Top GitHub Comments

3 reactions

alex-namely commented, Mar 6, 2020

@aouakki, @oleg-orlenko In my company we have been migrating everything to a much more stable Go-based client: https://github.com/Shopify/sarama

1 reaction

mhowlett commented, Mar 6, 2020

@alex-namely - we see a lot of people migrating to the Confluent Go client from sarama for the same reason. The Confluent Go client is used heavily by some of the largest users of Kafka. Can’t name names, but you’re most likely using more than one product powered by it.

@aouakki - we’re looking into this.