Subscribe :: Consume exception: Local: Maximum application poll interval (max.poll.interval.ms) exceeded
Description
We’ve been getting this error since upgrading to beta3:
Subscribe :: Consume exception: Local: Maximum application poll interval (max.poll.interval.ms) exceeded - Local: Maximum application poll interval (max.poll.interval.ms) exceeded (ConsumeException) - Confluent.Kafka.ConsumeException: Local: Maximum application poll interval (max.poll.interval.ms) exceeded
at Confluent.Kafka.Consumer`2.ConsumeImpl[K,V](Int32 millisecondsTimeout, IDeserializer`1 keyDeserializer, IDeserializer`1 valueDeserializer)
at Confluent.Kafka.Consumer`2.Consume(CancellationToken cancellationToken)
at Wayfair.Common.MessageQueue.Kafka.KafkaSubscriberChannel`2.Subscribe(IEnumerable`1 targetQueues, Action`2 messageHandler, ISubscriberEventHandler eventHandler)
We kept increasing the max.poll.interval.ms value to rule out a handler that takes too long. It is currently set to 30 minutes, which seems far longer than any processing we would have running.
We are mostly out of ideas; is there a recommended way to solve this issue?
How to reproduce
Use the config specified below. After a fairly long period of time, the error occurs. I’m not sure exactly how long; I’ve seen it happen within an hour, other times only after a few hours.
Checklist
Please provide the following information:
- Confluent.Kafka nuget version:
v1.0-beta3
- Apache Kafka version:
- Client configuration:
new ConsumerConfig {
EnableAutoCommit = true,
EnableAutoOffsetStore = false,
HeartbeatIntervalMs = 3000,
AutoCommitIntervalMs = 5000,
AutoOffsetReset = AutoOffsetReset.Earliest,
MaxPollIntervalMs = (int?) TimeSpan.FromMinutes(30).TotalMilliseconds
}
And…
new ConsumerBuilder<TKey, TMessage>(config)
.SetKeyDeserializer(keyDeserializer)
.SetValueDeserializer(valueDeserializer)
.SetErrorHandler(eventHandler.OnError)
.SetLogHandler(eventHandler.OnLog)
.SetOffsetsCommittedHandler(eventHandler.OnOffsetsCommitted)
.SetStatisticsHandler(eventHandler.OnStatistics)
.SetRebalanceHandler((c, e) =>
{
if (e.IsAssignment)
{
c.Assign(e.Partitions);
eventHandler.OnPartitionsAssigned(c, e);
}
else
{
c.Unassign();
eventHandler.OnPartitionsRevoked(c, e);
}
})
.Build();
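For context, a consume loop for this configuration has to call Consume frequently, and because EnableAutoOffsetStore = false, offsets must be stored explicitly after a message is processed. This is a minimal sketch (topic name, handler, and cancellation token are hypothetical), not the reporter's actual code:

```csharp
// Sketch of a consume loop matching the config above.
// With EnableAutoOffsetStore = false, StoreOffset must be called after
// processing; auto-commit then periodically commits the stored offsets.
consumer.Subscribe("my-topic"); // hypothetical topic name

while (!cts.IsCancellationRequested)
{
    var result = consumer.Consume(cts.Token); // each call resets the poll timer
    Process(result);                          // hypothetical handler; must finish
                                              // within MaxPollIntervalMs
    consumer.StoreOffset(result);             // mark the offset as processed
}
```

If Process can legitimately take longer than MaxPollIntervalMs, raising the interval only masks the problem; the pause-and-keep-polling pattern discussed in the comments below is usually the better fix.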
- Operating system: Windows and Linux
- Provide logs (with “debug” : “…” as necessary in configuration)
- Provide broker log excerpts
- Critical issue
Issue Analytics
- State:
- Created 5 years ago
- Comments: 14 (8 by maintainers)
Top GitHub Comments
The point of max.poll.interval.ms is to provide a liveness check between the application and the consumer: if the application has not called poll/consume within this interval, it is deemed dead, stalled, or malfunctioning, and the consumer leaves the group so its assigned partitions can be reassigned to a live application instance.
max.poll.interval.ms should thus be set to the maximum theoretical processing time, plus some margin.
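One common way to honor that contract while still handling an occasionally slow message is to pause the assigned partitions, run the work on a background task, and keep calling Consume so the poll timer keeps resetting. A hedged sketch (Pause/Resume/Assignment are Confluent.Kafka IConsumer members; the Process call and cancellation token are hypothetical):

```csharp
// Sketch: keep polling while a slow message is processed on another task,
// so max.poll.interval.ms is never exceeded.
var result = consumer.Consume(cancellationToken);
consumer.Pause(consumer.Assignment);          // stop fetching new messages

var work = Task.Run(() => Process(result));   // hypothetical slow handler
while (!work.IsCompleted)
{
    // Returns null while partitions are paused, but still counts as a poll
    // and resets the max.poll.interval timer.
    consumer.Consume(TimeSpan.FromMilliseconds(100));
}

consumer.Resume(consumer.Assignment);
consumer.StoreOffset(result);                 // only store after processing succeeds
```

This keeps the consumer in the group for arbitrarily long processing without inflating max.poll.interval.ms for every message.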
@mhowlett I am running into a similar issue. I have just upgraded Confluent.Kafka to v1.1.0. Here is the related log message
My question is: what is the best way to recover from this situation from within the code, without recycling the Windows service in which the consumer is running?
Some messages are going to take longer to process, and instead of adjusting max.poll.interval.ms, is there a way to force the consumer to rejoin the group when this issue occurs? Is there a way to detect this and then recover from it?
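One possible way to detect the condition in code is through the error handler. librdkafka reports this as a local error (RD_KAFKA_RESP_ERR__MAX_POLL_EXCEEDED), which the .NET client exposes as an ErrorCode; the sketch below assumes that mapping and is not a confirmed recovery recipe:

```csharp
// Sketch: flag the max-poll error from the error handler. After this error
// the consumer has left the group; the next Consume call should trigger a
// rejoin, or the consumer can be disposed and rebuilt as a last resort.
var maxPollExceeded = false;

var consumer = new ConsumerBuilder<string, string>(config)
    .SetErrorHandler((c, e) =>
    {
        if (e.Code == ErrorCode.Local_MaxPollExceeded) // assumed error-code name
            maxPollExceeded = true;
    })
    .Build();
```

The more robust fix is still to prevent the error in the first place with the pause-and-poll pattern above, rather than recovering after the group membership is lost.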