
Subscribe :: Consume exception: Local: Maximum application poll interval (max.poll.interval.ms) exceeded

See original GitHub issue

Description

We’ve been getting this error since upgrading to beta3:

Subscribe :: Consume exception: Local: Maximum application poll interval (max.poll.interval.ms) exceeded - Local: Maximum application poll interval (max.poll.interval.ms) exceeded (ConsumeException) - Confluent.Kafka.ConsumeException: Local: Maximum application poll interval (max.poll.interval.ms) exceeded
   at Confluent.Kafka.Consumer`2.ConsumeImpl[K,V](Int32 millisecondsTimeout, IDeserializer`1 keyDeserializer, IDeserializer`1 valueDeserializer)
   at Confluent.Kafka.Consumer`2.Consume(CancellationToken cancellationToken)
   at Wayfair.Common.MessageQueue.Kafka.KafkaSubscriberChannel`2.Subscribe(IEnumerable`1 targetQueues, Action`2 messageHandler, ISubscriberEventHandler eventHandler)

We kept increasing the max.poll.interval.ms value to rule out a process that takes too long. The value is currently set to 30 minutes, which seems far longer than any process we would have running.

We are mostly out of ideas and wanted to see if there is a recommended way to solve this issue.

How to reproduce

Use the config specified below. After a fairly long period of time, the error will occur. I’m not sure exactly how long; in the past I’ve seen it within an hour, other times only after a few hours.

Checklist

Please provide the following information:

  • Confluent.Kafka nuget version: v1.0-beta3
  • Apache Kafka version:
  • Client configuration:
new ConsumerConfig {
  EnableAutoCommit = true,
  EnableAutoOffsetStore = false,
  HeartbeatIntervalMs = 3000,
  AutoCommitIntervalMs = 5000,
  AutoOffsetReset = AutoOffsetReset.Earliest,
  MaxPollIntervalMs = (int?) TimeSpan.FromMinutes(30).TotalMilliseconds
}

And…

new ConsumerBuilder<TKey, TMessage>(config)
    .SetKeyDeserializer(keyDeserializer)
    .SetValueDeserializer(valueDeserializer)
    .SetErrorHandler(eventHandler.OnError)
    .SetLogHandler(eventHandler.OnLog)
    .SetOffsetsCommittedHandler(eventHandler.OnOffsetsCommitted)
    .SetStatisticsHandler(eventHandler.OnStatistics)
    .SetRebalanceHandler((c, e) =>
    {
        if (e.IsAssignment)
        {
            c.Assign(e.Partitions);
            eventHandler.OnPartitionsAssigned(c, e);
        }
        else
        {
            c.Unassign();
            eventHandler.OnPartitionsRevoked(c, e);
        }
    })
    .Build();
  • Operating system: Windows and Linux
  • Provide logs (with “debug” : “…” as necessary in configuration)
  • Provide broker log excerpts
  • Critical issue

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

4 reactions
edenhill commented, Apr 7, 2020

The point of max.poll.interval.ms is to provide a liveness check between the application and the consumer: if the application has not called poll/consume within this interval, it is deemed dead, stalled, stuck, or malfunctioning, and the consumer will leave the group so the assigned partitions can be reassigned to a live application instance.

max.poll.interval.ms should thus be set to the maximum (plus some) theoretical processing time.
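In practice this means the consume loop must call Consume at least once per interval, and per-message work must finish within it. A minimal sketch of such a loop, assuming a config like the one in the issue (ProcessMessage is a hypothetical stand-in for the application's handler, not something from this thread):

```csharp
using System;
using Confluent.Kafka;

class ConsumeLoop
{
    // Sketch only: each call to Consume resets the max.poll.interval.ms
    // timer, so per-message processing must stay well under that bound.
    public static void Run(IConsumer<string, string> consumer,
                           Action<string> processMessage)
    {
        while (true)
        {
            var result = consumer.Consume(TimeSpan.FromSeconds(1));
            if (result == null) continue;  // timeout: loop again, timer was still reset

            processMessage(result.Message.Value);  // must finish within the interval
            consumer.StoreOffset(result);          // pairs with EnableAutoOffsetStore = false
        }
    }
}
```

With the issue's config (EnableAutoCommit = true, EnableAutoOffsetStore = false), StoreOffset marks the message as processed and the auto-commit timer flushes it, so a crash mid-processing does not commit an unprocessed offset.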

4 reactions
vinodres commented, Sep 24, 2019

@mhowlett I am running into a similar issue. I have just upgraded the Confluent.Kafka to v 1.1.0. Here is the related log message

Application maximum poll interval (300000ms) exceeded by 375ms (adjust max.poll.interval.ms for long-running message processing): leaving group

My question is, what is the best way to recover from this situation from within the code, without recycling the Windows service in which the consumer is running?

Some messages are going to take longer to process, and instead of adjusting max.poll.interval.ms, is there a way to force the consumer to reconnect when this issue occurs? Is there a way to detect this and then recover from it?
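One commonly used pattern for occasional long-running messages (a sketch under assumptions, not an official answer from this thread): pause the assigned partitions, run the slow work on another thread, and keep calling Consume so the poll-interval timer stays alive; paused partitions return no messages, so the loop is cheap. HandleSlowMessage is a hypothetical handler name.

```csharp
using System;
using System.Threading.Tasks;
using Confluent.Kafka;

class LongRunningHandler
{
    // Sketch only: while the slow task runs, the paused consumer still calls
    // Consume, which resets max.poll.interval.ms without delivering messages
    // from the paused partitions.
    public static void Handle(IConsumer<string, string> consumer,
                              ConsumeResult<string, string> result,
                              Action<string> handleSlowMessage)
    {
        consumer.Pause(consumer.Assignment);
        var work = Task.Run(() => handleSlowMessage(result.Message.Value));

        while (!work.IsCompleted)
        {
            consumer.Consume(TimeSpan.FromSeconds(1));  // keeps group membership alive
        }

        consumer.Resume(consumer.Assignment);
        consumer.StoreOffset(result);
    }
}
```

The trade-off is that the partition makes no progress while paused, but the consumer never leaves the group, so no rebalance is triggered.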


Top Results From Across the Web

Kafka consumer gets stuck after exceeding max.poll. ...
The consumer process hangs and does not consume any more messages. The following error message gets logged. MAXPOLL|rdkafka#consumer-1| [thrd: ...

[Python] How to capture Application maximum poll interval ...
Hello, My microservice uses confluent-kafka-python. Once in a while it fails with this error %4|1654121013.314|MAXPOLL|rdkafka#consumer-1| ...

Long-Running Jobs - Karafka framework documentation
Long-Running Jobs. When working with Kafka, there is a setting called max.poll.interval.ms. It is the maximum delay between invocations of poll() commands...

Kafka Consumer configuration reference
max.poll.interval.ms: The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time ...

Recommended configurations for Apache Kafka clients
Increase poll processing timeout (max.poll.interval.ms); decrease message batch size to speed up processing; improve processing ...
