Kafka messages are being lost on INVALID_FETCH_SESSION_EPOCH

Recently we have observed that whenever the Kafka client logs INVALID_FETCH_SESSION_EPOCH, the corresponding message is skipped entirely and is therefore lost in transit. INVALID_FETCH_SESSION_EPOCH is a known issue that was fixed in Kafka client 2.3.0+ according to KAFKA-8052. The skipped message breaks the at-least-once delivery guarantee of Spring Cloud Stream, and I am not sure whether this is a known issue. With the most recent versions of Spring Cloud Stream this should not happen, but it raises the point that a message could be skipped under similar circumstances. I am raising the issue here because losing messages in transit is serious.

"@timestamp":"2021-08-11T13:40:44.081+00:00","message":"[Consumer clientId=consumer-9, groupId=tag-modifiers] Node 17 was unable to process the fetch request with (sessionId=1995608629, epoch=331): INVALID_FETCH_SESSION_EPOCH.","logger_name":"org.apache.kafka.clients.FetchSessionHandler","thread_name":"KafkaConsumerDestination{consumerDestinationName='edit-tag', partitions=9, dlqName='dlq'}.container-0-C-1","level":"INFO","profile":"beats","microservice":"data-service"

We observed this with Spring Cloud Stream 2.2.0, which uses Kafka client 2.0.1, but it should affect any version of Spring Cloud Stream that uses a Kafka client older than 2.3.0.
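
To make the version boundary concrete, here is a minimal sketch of a check that flags a kafka-clients version string as older than 2.3.0, the release that contains the KAFKA-8052 fix according to the description above. The class and method names (`KafkaClientVersionCheck`, `isBeforeFix`) are hypothetical and exist only for illustration; how the version string is obtained at runtime is left out and would need to be verified against your build.

```java
import java.util.Arrays;

public final class KafkaClientVersionCheck {
    // First kafka-clients release reported in this issue to contain the KAFKA-8052 fix.
    private static final int[] FIXED_IN = {2, 3, 0};

    /** Parses the leading numeric "major.minor.patch" part; missing parts count as 0. */
    public static int[] parse(String version) {
        int[] parts = new int[3];
        String[] tokens = version.trim().split("[.\\-]");
        for (int i = 0; i < parts.length && i < tokens.length; i++) {
            if (!tokens[i].matches("\\d+")) {
                break; // stop at qualifiers such as "SNAPSHOT"
            }
            parts[i] = Integer.parseInt(tokens[i]);
        }
        return parts;
    }

    /** Returns true when the given client version is older than 2.3.0. */
    public static boolean isBeforeFix(String version) {
        // Arrays.compare (Java 9+) compares the arrays lexicographically.
        return Arrays.compare(parse(version), FIXED_IN) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isBeforeFix("2.0.1")); // prints true
        System.out.println(isBeforeFix("2.3.0")); // prints false
    }
}
```

A check like this could run at startup and log a warning when the client on the classpath predates the fix.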

I was under the impression that a failed fetch in the Kafka client should trigger the consumer to retry the fetch, but it seems that’s not what’s happening.
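
One way to confirm whether messages are really skipped, rather than delayed and fetched again later, is to track the offsets the consumer actually receives per partition and report any jump. The sketch below is a hypothetical helper (`OffsetGapDetector` is not part of Kafka or Spring Cloud Stream) that you would feed from your own message handler with each record's partition and offset. Note that offset gaps can also be legitimate, for example on compacted topics or topics written with transactions, so a reported gap is a hint to investigate and not proof of loss.

```java
import java.util.HashMap;
import java.util.Map;

public final class OffsetGapDetector {
    // Next offset we expect to see for each partition number.
    private final Map<Integer, Long> nextExpected = new HashMap<>();

    /**
     * Records a consumed offset and returns how many offsets were skipped
     * since the highest offset seen so far on the same partition.
     * Returns 0 for the first record of a partition and for redeliveries.
     */
    public long record(int partition, long offset) {
        Long expected = nextExpected.get(partition);
        long skipped = (expected != null && offset > expected) ? offset - expected : 0L;
        // Redelivered (lower) offsets never move the expectation backwards.
        nextExpected.merge(partition, offset + 1, Long::max);
        return skipped;
    }

    public static void main(String[] args) {
        OffsetGapDetector detector = new OffsetGapDetector();
        detector.record(0, 10);
        detector.record(0, 11);
        // Offset 12 was never delivered, so one offset is reported as skipped.
        System.out.println(detector.record(0, 13)); // prints 1
    }
}
```

Logging the partition and the skipped count next to the INVALID_FETCH_SESSION_EPOCH log lines would show whether the two events actually coincide.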

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
garyrussell commented, Aug 19, 2021

I have no idea - ask the Kafka folks - as I said, it’s totally out of our control.

0 reactions
sobychacko commented, Aug 23, 2021

@mraliagha Closing the issue. Please feel free to re-open with more context if you think that is necessary.
