DefaultAfterRollbackProcessor retries are not limited to 10 by default

Hello, I am trying to find a way to set a number of retries for Kafka consumers. Without transactional.id, I was able to control it by setting max-attempts in RetryTemplate. However, when using transaction-id-prefix, I cannot seem to control max-attempts per channel.

Looking at the spring-kafka documentation, DefaultAfterRollbackProcessor retries 10 times by default:

"Starting with version 2.2, the DefaultAfterRollbackProcessor can now recover (skip) a record that keeps failing. By default, after ten failures, the failed record is logged (at the ERROR level). You can configure the processor with a custom recoverer (BiConsumer) and maximum failures. Setting the maxFailures property to a negative number causes infinite retries."
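
To make the documented rule concrete, here is a minimal plain-Java model of "recover after maxFailures failures; a negative value retries forever". It is only a sketch of the behavior described in the quoted documentation, not spring-kafka's actual code: FailureTracker, onFailure, and deliveriesUntilRecovered are hypothetical names, and the record key string is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

public class MaxFailuresSketch {

    /** Hypothetical stand-in for a rollback processor's bookkeeping; NOT spring-kafka code. */
    static final class FailureTracker {
        private final BiConsumer<String, Exception> recoverer;
        private final int maxFailures;
        private final Map<String, Integer> failures = new HashMap<>();

        FailureTracker(BiConsumer<String, Exception> recoverer, int maxFailures) {
            this.recoverer = recoverer;
            this.maxFailures = maxFailures;
        }

        /** Returns true if the record was recovered (skipped), false if it should be redelivered. */
        boolean onFailure(String recordKey, Exception cause) {
            if (maxFailures < 0) {
                return false; // negative limit: never recover, retry indefinitely
            }
            int count = failures.merge(recordKey, 1, Integer::sum);
            if (count >= maxFailures) {
                failures.remove(recordKey);
                recoverer.accept(recordKey, cause); // e.g. log the record at ERROR level
                return true;
            }
            return false;
        }
    }

    /** Counts deliveries of one always-failing record, stopping at recovery or at the safety cap. */
    static int deliveriesUntilRecovered(int maxFailures, int cap) {
        FailureTracker tracker = new FailureTracker((key, ex) -> { }, maxFailures);
        int deliveries = 0;
        while (deliveries < cap) {
            deliveries++;
            if (tracker.onFailure("test-topic-0@0", new RuntimeException("simulated failure"))) {
                break;
            }
        }
        return deliveries;
    }

    public static void main(String[] args) {
        System.out.println("maxFailures=10: " + deliveriesUntilRecovered(10, 1000) + " deliveries");
        System.out.println("maxFailures=-1: " + deliveriesUntilRecovered(-1, 50) + " deliveries (cap reached)");
    }
}
```

In this model a record that always fails is delivered exactly maxFailures times before the recoverer runs, which is the behavior I expected from the documentation.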

However, when I actually test it, it seems to retry more than 10 times. My code is here:

Steps on how to reproduce:

  1. Start Zookeeper
  2. Start Kafka
  3. Produce an event to the topic named ‘test-topic’ (use console producer like $ ./kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic )
  4. Let @StreamListener do the work (it will result in CommitFailedException, so it will retry)
  5. On the console, look for ‘log this message’. There will be more than 10 tries.

My questions are:

  1. This behavior seems different compared to the documentation. Which one is the intended result?
  2. Based on my business requirements, I need to set max retries per channel. How can this be done when using transaction-id-prefix? I prefer doing it in application.yml in the spring-cloud-stream-binder-kafka way (a global max-attempts that can be overridden per channel) rather than in code.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 18 (10 by maintainers)

Top GitHub Comments

garyrussell commented, Mar 17, 2020 (1 reaction)

When not using transactions, add a SeekToCurrentErrorHandler to the container.

Then, when record a fails, we discard records b and c and re-fetch a, b, and c on the next poll.

You can set max-attempts to 1 and add retry configuration to the error handler (retry count and back off). Or, configure the error handler with 0 retries to use the max-attempts via retry instead.
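
The discard-and-re-fetch behavior described above can be sketched as a small plain-Java simulation. This is only an illustrative model, not the SeekToCurrentErrorHandler implementation: the run method, its parameters, and the trace strings are hypothetical, and the "partition" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

public class SeekToCurrentSketch {

    /**
     * Simulates repeated polls over an in-memory "partition". The record named
     * by `failing` throws on its first `failuresBeforeSuccess` deliveries.
     * Returns a trace of what happened, in order.
     */
    static List<String> run(List<String> log, String failing, int failuresBeforeSuccess) {
        List<String> trace = new ArrayList<>();
        int position = 0;      // next offset to fetch
        int failuresSeen = 0;
        while (position < log.size()) {
            // "poll": fetch everything from the current position onward
            List<String> batch = log.subList(position, log.size());
            trace.add("poll " + batch);
            for (String record : batch) {
                if (record.equals(failing) && failuresSeen < failuresBeforeSuccess) {
                    failuresSeen++;
                    trace.add("fail " + record);
                    // Seek to current: the position stays at the failed record,
                    // so the rest of this batch is discarded and re-fetched.
                    break;
                }
                trace.add("ok " + record);
                position++;
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        // Record a fails twice, then succeeds on its third delivery.
        run(List.of("a", "b", "c"), "a", 2).forEach(System.out::println);
    }
}
```

Each failure of a is followed by a fresh poll that returns a, b, and c again; b and c are only processed once a finally succeeds.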

garyrussell commented, Mar 16, 2020 (1 reaction)

"Is this still true then?"

No; with the binder, the retry properties will be used instead of the default (10).

"re-seek is happening which is the expected outcome."

There is no re-seek in this case (when retries are exhausted); it is simply that the offset was not committed, so the record is redelivered after the partition is re-assigned.
