Kafka: Keeps expiring consumers
See original GitHub issue

Bug Report
Current behavior
We have 10 microservices that all interact with each other via Kafka. We have noticed that a service randomly fails to subscribe to a topic, or randomly stops working with a Kafka "heartbeat not received" error, while the service itself otherwise runs fine.
[Nest] 19 - 06/16/2021, 1:09:12 PM [ClientKafka] ERROR [Connection] Response Heartbeat(key: 12, version: 3) {"timestamp":"2021-06-16T13:09:12.779Z","logger":"kafkajs","broker":"kafka-0.kafka-headless.dev.svc.cluster.local:9092","clientId":"reviews-ts-service-client","error":"The group is rebalancing, so a rejoin is needed","correlationId":1241,"size":10} +2857ms
[Nest] 19 - 06/16/2021, 1:09:12 PM [ClientKafka] ERROR [Runner] The group is rebalancing, re-joining {"timestamp":"2021-06-16T13:09:12.779Z","logger":"kafkajs","groupId":"reviews-consumer-ts-customer-client","memberId":"reviews-ts-service-client-453b2860-fdab-4c01-aa98-e015667b8d3b","error":"The group is rebalancing, so a rejoin is needed","retryCount":0,"retryTime":330} +1m
[Nest] 21 - 06/16/2021, 6:49:52 PM [ClientKafka] ERROR [Connection] Response Heartbeat(key: 12, version: 3) {"timestamp":"2021-06-16T18:49:52.458Z","logger":"kafkajs","broker":"kafka-0.kafka-headless.dev.svc.cluster.local:9092","clientId":"captain-ps-service-client","error":"The coordinator is not aware of this member","correlationId":54,"size":10} +327904ms
[Nest] 21 - 06/16/2021, 6:49:52 PM [ClientKafka] ERROR [Runner] The coordinator is not aware of this member, re-joining the group {"timestamp":"2021-06-16T18:49:52.460Z","logger":"kafkajs","groupId":"captain-consumer-ps-client","memberId":"captain-ps-service-client-77090749-5dd9-4d17-a12b-aa072579caec","error":"The coordinator is not aware of this member","retryCount":7,"retryTime":30000} +1m
Input Code

import { KafkaOptions, Transport } from "@nestjs/microservices";
import appConfig from "config/appConfig";

export const microServiceConfig: KafkaOptions = {
  transport: Transport.KAFKA,
  options: {
    client: {
      clientId: "promocode-service",
      // KafkaHost is a comma-separated list of broker addresses;
      // split() already returns an array, so no spread is needed
      brokers: `${appConfig().KafkaHost}`.split(","),
    },
    consumer: {
      groupId: "promocode-consumer",
      sessionTimeout: 300000,
      retry: { retries: 30 },
    },
    subscribe: {
      fromBeginning: false,
    },
  },
};
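As a side note on the `brokers` line above: `String#split(",")` already returns an array, so spreading it into a new array literal is redundant. A minimal sketch of the parsing (the helper name `parseBrokers` is ours, not from the issue):

```typescript
// Hypothetical helper showing what the `brokers` expression evaluates to.
// Trimming and filtering empty entries makes it tolerant of stray
// whitespace or trailing commas in the env var.
export function parseBrokers(hosts: string): string[] {
  return hosts
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
}
```

For example, `parseBrokers("kafka-0:9092, kafka-1:9092")` yields `["kafka-0:9092", "kafka-1:9092"]`.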
Expected behavior
It is not clear why Kafka keeps timing out randomly: after a redeploy everything works, and then it stops again. Is the wrapper causing the issue? These intermittent failures make me wonder what the root cause is.
This is running on Kubernetes, the behavior is seen with only 1-2 consumers, and Kafka has enough memory.
All consumers have distinct group IDs, and all have a high session timeout as well.
Issue Analytics
- Created 2 years ago
- Reactions: 5
- Comments: 5 (1 by maintainers)
Hi @jayeshanandani,
By default, the heartbeat interval is 3 seconds (heartbeatInterval = 3000 ms), and the heartbeat method is invoked roughly every 5 seconds (maxWaitTimeInMs = 5000 ms).
What does that mean? Every ~5 seconds the library calls the heartbeat method, which decides whether it can actually send a heartbeat request to the Kafka broker by checking whether at least heartbeatInterval has elapsed since the last heartbeat.
In my case, the handler is a heavy process (parsing and formatting JSON) that takes more than 26 s per message. It looks like during that time my service cannot send the heartbeat signal to the Kafka broker, so my consumer is considered expired and is kicked out of the group.
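The gating condition described above can be sketched roughly like this (a simplified model of KafkaJS's behavior; the function name is ours, not the library's):

```typescript
// Simplified sketch: KafkaJS only issues a Heartbeat request when at least
// heartbeatInterval milliseconds have passed since the previous one. The
// check itself runs periodically (roughly every maxWaitTimeInMs) while the
// consumer polls; if the handler blocks the loop longer than
// sessionTimeout, the broker evicts the member.
function shouldSendHeartbeat(
  lastHeartbeatMs: number,
  nowMs: number,
  heartbeatIntervalMs: number,
): boolean {
  return nowMs - lastHeartbeatMs >= heartbeatIntervalMs;
}
```

With the defaults (heartbeatInterval = 3000), a check at t = 3000 ms after the last heartbeat sends one; a check at t = 2999 ms does not.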
HOW TO RESOLVE THIS ISSUE?
- sessionTimeout: should be greater than the processing time of the handler.
- heartbeatInterval: should be a fraction of sessionTimeout (the Kafka docs recommend no more than one third).
- maxWaitTimeInMs: must be greater than heartbeatInterval.
The issue was resolved by the configuration above.
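Put together, a NestJS config following those three rules might look like the sketch below. The timeout values are illustrative only (chosen to clear the ~26 s processing time mentioned above), not values from the issue:

```typescript
import { KafkaOptions, Transport } from "@nestjs/microservices";

// Illustrative values: sessionTimeout comfortably above the longest
// message-processing time, heartbeatInterval well under sessionTimeout,
// and maxWaitTimeInMs greater than heartbeatInterval so the periodic
// check always finds the interval already elapsed.
export const tunedKafkaConfig: KafkaOptions = {
  transport: Transport.KAFKA,
  options: {
    client: {
      clientId: "promocode-service",
      brokers: ["localhost:9092"], // replace with your broker list
    },
    consumer: {
      groupId: "promocode-consumer",
      sessionTimeout: 90000,    // > longest handler run (~26 s here)
      heartbeatInterval: 3000,  // <= 1/3 of sessionTimeout
      maxWaitTimeInMs: 5000,    // > heartbeatInterval
    },
  },
};
```
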
Note: the first time I configured it, it always showed an error:
=> The heartbeat was triggered too early: it fired after 30 s, but the condition for sending the request to the Kafka broker required 40 s, which is why the error happened.
@kamilmysliwiec do we need more information here? Any input would be of great help.