Detect max.poll.interval.ms expiration

See original GitHub issue

Description

Sometimes my consumer throws a ConsumeException:

Confluent.Kafka.ConsumeException: Local: Maximum application poll interval (max.poll.interval.ms) exceeded

at Confluent.Kafka.Consumer`2.ConsumeImpl[K,V](Int32 millisecondsTimeout, IDeserializer`1 keyDeserializer, IDeserializer`1 valueDeserializer)
at Confluent.Kafka.Consumer`2.Consume(TimeSpan timeout)

The log informs:

Application maximum poll interval (300000ms) exceeded by 2134298747ms (adjust max.poll.interval.ms for long-running message processing): leaving group

and the consumer stops receiving new messages (consume() returns null).

Is there a way to detect when max.poll.interval.ms has been exceeded?

Is there another way besides checking the exception message?

I was searching and found 931.

Has 2316 been released?

Lib Version: 1.2.2.0

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 14 (6 by maintainers)

Top GitHub Comments

2 reactions
AndyPook commented, Jan 9, 2020

Hi,

This usually happens because Consume is not being called often enough (the client can do some house-keeping with the broker during that call). Typically there’s a consume/handle loop (i.e. consume to get a message; process the message; repeat). If your handle method/processing takes longer than the interval, you will see this exception.

I have seen this where a circuit-breaker was added so that the consumer could be “paused”. That meant the consumer wasn’t able to do its house-keeping, so the interval expired and the exception was thrown. More simply, you can see it if you leave the debugger on a breakpoint for too long.

Usually, when the “loop” starts again, the underlying client is smart enough to restart the connection. Consume()==null can happen legitimately. Publish another message and you should see the consumer resume.

A massively simplified message loop…

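while (!cancellationToken.IsCancellationRequested)
{
  // Poll with a short timeout; passing cancellationToken instead is the other option.
  var msg = consumer.Consume(TimeSpan.FromMilliseconds(100));

  // Consume returns null when the timeout elapses without a message.
  if (msg == null || msg.IsPartitionEOF || cancellationToken.IsCancellationRequested)
    continue;

  Handle(msg);
}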

If Handle takes longer than 300000ms (5 minutes) then you have other problems! (Note: if you pass a cancellationToken, internally it’s just a loop of Consume(100) waiting for the ct to be cancelled. Depending on what you’re trying to do, take your choice…)

I hope this makes some sense? Happy to help further if you can share some more on what code pattern you are using.
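
On the original question of detecting the condition without matching on the exception message: one option is to compare the error code carried by the exception. The following is a minimal sketch and is not taken from the thread. It assumes the installed Confluent.Kafka version exposes the ErrorCode.Local_MaxPollExceeded enum member; the broker address, group id, topic name and the Handle method are hypothetical placeholders.

```csharp
using System;
using System.Threading;
using Confluent.Kafka;

class Program
{
    // Hypothetical handler; replace with the real message processing.
    static void Handle(ConsumeResult<Ignore, string> msg) =>
        Console.WriteLine($"{msg.TopicPartitionOffset}: {msg.Message.Value}");

    static void Main()
    {
        var config = new ConsumerConfig
        {
            BootstrapServers = "localhost:9092", // assumption: local broker
            GroupId = "example-group",           // assumption: placeholder group id
            MaxPollIntervalMs = 300000,          // 5 minutes, the value shown in the log above
        };

        using var cts = new CancellationTokenSource();
        Console.CancelKeyPress += (_, e) => { e.Cancel = true; cts.Cancel(); };

        using var consumer = new ConsumerBuilder<Ignore, string>(config).Build();
        consumer.Subscribe("example-topic");     // assumption: placeholder topic

        while (!cts.IsCancellationRequested)
        {
            try
            {
                var msg = consumer.Consume(TimeSpan.FromMilliseconds(100));
                if (msg == null || msg.IsPartitionEOF)
                    continue;

                Handle(msg);
            }
            catch (ConsumeException ex)
                when (ex.Error.Code == ErrorCode.Local_MaxPollExceeded)
            {
                // Detected via the error code instead of the message text.
                Console.Error.WriteLine(
                    $"max.poll.interval.ms exceeded: {ex.Error.Reason}");
                // The loop continues, so Consume is called again on the next pass.
            }
        }

        consumer.Close(); // leave the group cleanly
    }
}
```

The exception filter only handles the poll-interval case, so any other ConsumeException still propagates. Whether continuing to poll is enough for the consumer to rejoin the group is worth verifying against the library version in use.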

0 reactions
edenhill commented, Apr 28, 2020

You should not call Assign() outside of OnPartitionsAssigned/Revoked


Top Results From Across the Web

Should we use max.poll.records or max.poll.interval.ms to ...
The latter lets the group know that your application is still alive so can trigger a rebalance before max.poll.interval.ms expires. The polling ...

Kafka Consumer configuration reference
max.poll.interval.ms: The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of...

Dangerous default Kafka settings — Part 1 | by Irori
You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.

Kafka Consumer Important Settings: Poll & Internal ...
Kafka Consumer Poll Thread: If two .poll() calls are separated by more than max.poll.interval.ms time, then the consumer will be disconnected from...

KIP-517: Add consumer metrics to observe user poll behavior
Easily identify if/when max.poll.interval.ms needs to be changed (and to what value); View trends/patterns; Verify max.poll.interval.ms was hit ...
