Duplicate consumption on multithreaded scenario (concurrency > 1)
When testing the concurrency parameter on the Consumer Config for a Spring Cloud Stream microservice (with Kafka), I noticed that several messages are processed twice. This happens because the second thread (T2) joins a little later than the first one (T1), causing a rebalance before T1 commits its offsets, so T2 re-reads some messages from its newly assigned partitions.
I have the idempotence parameters set, but they do not help with the concurrency parameter set to two, since it may create a different producer for each thread, so exactly-once semantics are not actually achieved.
Here is an example log:
[
{
"@timestamp": "2021-04-15T09:19:11.321+02:00",
"@version": "1",
"message": "my-consumer-group: partitions assigned: [MY_AWESOME_TOPIC-0, MY_AWESOME_TOPIC-1]",
"logger_name": "org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder$1",
"thread_name": "KafkaConsumerDestination{consumerDestinationName='MY_AWESOME_TOPIC', partitions=0, dlqName='null'}.container-0-C-1",
"level": "INFO",
"level_value": 20000
},
{
"whatever": "some message processing...."
},
{
"@timestamp": "2021-04-15T09:19:21.226+02:00",
"@version": "1",
"message": "my-consumer-group: partitions assigned: [MY_AWESOME_TOPIC-0]",
"logger_name": "org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder$1",
"thread_name": "KafkaConsumerDestination{consumerDestinationName='MY_AWESOME_TOPIC', partitions=0, dlqName='null'}.container-1-C-1",
"level": "INFO",
"level_value": 20000
},
{
"@timestamp": "2021-04-15T09:19:21.227+02:00",
"@version": "1",
"message": "my-consumer-group: partitions assigned: [MY_AWESOME_TOPIC-1]",
"logger_name": "org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder$1",
"thread_name": "KafkaConsumerDestination{consumerDestinationName='MY_AWESOME_TOPIC', partitions=0, dlqName='null'}.container-0-C-1",
"level": "INFO",
"level_value": 20000
}
]
Any clue as to why this happens? It surprises me a bit that there is a first assignment and, 10 seconds later, the second thread joins and triggers the rebalance, when T1 has already started processing. Shouldn’t all N threads configured by the concurrency parameter start at the same time to avoid this?
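The sequence reported here (T1 processes some records, a rebalance lands before T1 commits, T2 resumes from the last committed offset) can be reproduced in a toy model. The sketch below is only an illustration in plain Java: it does not use the Kafka client or Spring, and the class name `RebalanceDuplicateDemo`, its methods, and the offsets are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a single partition; this is NOT the Kafka client API.
final class RebalanceDuplicateDemo {

    /**
     * T1 processes up to processedByT1 records, then the partition is
     * reassigned to T2, which reads to the end of the log.
     * Returns every offset that was processed, in processing order.
     */
    static List<Long> run(long logEnd, long processedByT1, boolean commitBeforeRevoke) {
        List<Long> processed = new ArrayList<>();
        long committed = 0; // offset the consumer group would resume from

        // T1 starts at the committed offset and processes without committing.
        long position = committed;
        for (long i = 0; i < processedByT1 && position < logEnd; i++) {
            processed.add(position++);
        }

        // Rebalance: T1 loses the partition. Only if it commits its position
        // now does the group learn how far T1 actually got.
        if (commitBeforeRevoke) {
            committed = position;
        }

        // T2 takes over and resumes from the last committed offset.
        for (long offset = committed; offset < logEnd; offset++) {
            processed.add(offset);
        }
        return processed;
    }

    /** Number of offsets that were processed more than once. */
    static long duplicates(List<Long> processed) {
        return processed.size() - processed.stream().distinct().count();
    }

    public static void main(String[] args) {
        System.out.println("no commit before revoke: " + duplicates(run(10, 4, false)));
        System.out.println("commit before revoke:    " + duplicates(run(10, 4, true)));
    }
}
```

In this model the number of duplicates equals the number of records T1 processed but had not committed when the partition moved, which matches the symptom in the log above.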
Issue Analytics
- State:
- Created 2 years ago
- Comments:8 (4 by maintainers)

Spring’s behavior depends on the container AckMode. With AckMode.BATCH (the default), any pending offsets for already processed records are committed in onPartitionsRevoked; with AckMode.RECORD, commits are done immediately after processing each record, so there is nothing to do in onPartitionsRevoked since there is nothing pending. This is not something the application needs to worry about.
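The difference between the two ack modes described in this reply can be sketched with a small plain-Java model. This is an assumption-laden illustration, not Spring Kafka code: the class `AckModeDemo` and its methods are hypothetical, only the names `BATCH`, `RECORD` and `onPartitionsRevoked` are borrowed from the reply, and the regular commit that BATCH performs after each processed batch is left out to keep the revoke path visible.

```java
// Toy model of the two ack modes; this does NOT use Spring Kafka.
final class AckModeDemo {

    enum AckMode { BATCH, RECORD }

    private final AckMode ackMode;
    private long committed = 0;     // offset the group would resume from
    private long nextToCommit = 0;  // offset just past the last processed record
    private int commitCalls = 0;

    AckModeDemo(AckMode ackMode) {
        this.ackMode = ackMode;
    }

    /** Processes one record; RECORD mode commits right away. */
    void process(long offset) {
        nextToCommit = offset + 1;
        if (ackMode == AckMode.RECORD) {
            commitPending();
        }
    }

    /** Called when the partition is taken away during a rebalance. */
    void onPartitionsRevoked() {
        // BATCH: flushes offsets of records that were already processed.
        // RECORD: nothing is pending, so this is a no-op.
        commitPending();
    }

    private void commitPending() {
        if (nextToCommit > committed) {
            committed = nextToCommit;
            commitCalls++;
        }
    }

    long committed() { return committed; }

    int commitCalls() { return commitCalls; }

    public static void main(String[] args) {
        for (AckMode mode : AckMode.values()) {
            AckModeDemo consumer = new AckModeDemo(mode);
            consumer.process(0);
            consumer.process(1);
            consumer.process(2);
            consumer.onPartitionsRevoked();
            System.out.println(mode + ": committed=" + consumer.committed()
                    + ", commit calls=" + consumer.commitCalls());
        }
    }
}
```

In both modes the committed offset after the revoke covers every processed record; the modes differ only in when the commits happen and how many there are.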
Thanks Gary, I will tune those parameters to avoid duplicates. I am closing the issue. Again, thanks a lot!