"Broker: Unknown member" on commit for consumer manually assigned to partitions
Description
I’m building an application which pushes messages that failed during processing to delay/retry topics. The primary consumer has no problem manually committing its offsets as it reads messages.
The delay consumers are each pointed at a different partition of a delay topic, with each partition representing a timed delay.
The delay consumers are unable to manually commit their offset after consuming a message and processing it. I have enabled broker and protocol Debug and see this preceding the error:
19:04:04%7|1605837844.917|SEND|rdkafka#consumer-3| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 43)
%7|1605837844.918|SEND|rdkafka#consumer-2| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 187 bytes @ 0, CorrId 34)
INF] Received message on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.RETRY.AVRO. Partition [2]. Offset [92].
[19:04:04 INF] %7|1605837844.950|RETRY|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Moved 1 retry buffer(s) to output queue
Triggering Primary Topic reprocessing on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.PRIMARY_TOPIC.AVRO%7|1605837844.950|SEND|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Sent OffsetCommitRequest (v7, 145 bytes @ 0, CorrId 14)
for RetryTransactionId 36898222-58b4-481e-8505-176465f4a5dd
%7|1605837844.952|RETRY|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Moved 1 retry buffer(s) to output queue
[%7|1605837844.952|RECV|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Received OffsetCommitResponse (v7, 61 bytes, CorrId 14, rtt 1.49ms)
19:04:04%7|1605837844.953|RECV|rdkafka#consumer-6| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 35, rtt 101.42ms)
%7|1605837844.953|SEND|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Sent OffsetCommitRequest (v7, 145 bytes @ 0, CorrId 14)
%7|1605837844.953|REQERR|rdkafka#consumer-5| [thrd:main]: GroupCoordinator/1: OffsetCommitRequest failed: Broker: Unknown member: explicit actions Refresh,Retry
INF%7|1605837844.954|RECV|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 111 bytes, CorrId 24, rtt 101.74ms)
%7|1605837844.954|RECV|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 47, rtt 100.73ms)
] %7|1605837844.955|SEND|rdkafka#consumer-6| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 36)
%7|1605837844.979|RECV|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Received OffsetCommitResponse (v7, 61 bytes, CorrId 14, rtt 25.98ms)
Seeking to specific message on %7|1605837844.980|SEND|rdkafka#consumer-5| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FindCoordinatorRequest (v2, 69 bytes @ 0, CorrId 48)
RE.DOTNET.KAFKA_RETRY_EXAMPLE2.PRIMARY_TOPIC.AVRO%7|1605837844.980|SEND|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 139 bytes @ 0, CorrId 25)
, partition %7|1605837844.981|SEND|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 48)
0%7|1605837844.981|REQERR|rdkafka#consumer-4| [thrd:main]: GroupCoordinator/1: OffsetCommitRequest failed: Broker: Unknown member: explicit actions Refresh,Retry
, offset %7|1605837844.983|RECV|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 379 bytes, CorrId 25, rtt 2.69ms)
1%7|1605837844.984|SEND|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FindCoordinatorRequest (v2, 69 bytes @ 0, CorrId 49)
%7|1605837844.984|SEND|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 139 bytes @ 0, CorrId 26)
.
[19:04:04 ERR] Failed to commit consumption of message on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.DELAY.AVRO. Partition [2]. Offset [18]. Reason [Broker: Unknown member]
Broker: Unknown member
at Confluent.Kafka.Impl.SafeKafkaHandle.Commit(IEnumerable`1 offsets)
at Confluent.Kafka.Consumer`2.Commit(IEnumerable`1 offsets)
at Confluent.Kafka.Consumer`2.Commit(ConsumeResult`2 result)
at Kafka.Retry.Consumers.ContinuousConsumer`2.HandleCommit(ConsumeResult`2 consumeResult) in C:\GitRepos\kafka-retry-module-.net\src\Consumers\SuperClasses\ContinuousConsumer.cs:line 105%7|1605837845.012|RECV|rdkafka#consumer-7| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 44, rtt 101.49ms)
In the logs, before it starts consuming, I see quite a few of these messages about the delay consumers joining and leaving brokers:
%7|1605835968.260|TOPBRK|rdkafka#consumer-6| [thrd::0/internal]: :0/internal: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [4]: joining broker (rktp 00000255C8D47F10, 0 message(s) queued)
%7|1605835968.261|TOPBRK|rdkafka#consumer-5| [thrd::0/internal]: :0/internal: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [2]: leaving broker (0 messages in xmitq, next broker localhost:9092/1, rktp 00000255C8B6B730)
%7|1605835968.261|TOPBRK|rdkafka#consumer-5| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [1]: joining broker (rktp 00000255C8D454A0, 0 message(s) queued)
How to reproduce
I’m using a consumer config with
_config = new ConsumerConfig
{
    BootstrapServers = "example-servers:9092",
    GroupId = "example-group",
    EnableAutoCommit = false,
    StatisticsIntervalMs = 5000,
    SessionTimeoutMs = 10000,
    AutoOffsetReset = AutoOffsetReset.Earliest,
    EnablePartitionEof = true,
    Debug = "broker"
};
var schemaRegistryConfig = new SchemaRegistryConfig
{
    Url = "example-schema:8081",
    RequestTimeoutMs = 10000
};
The consumer assigns itself to the topic and partition and consumes:
// Manually assign this consumer to one partition of each delay topic.
_consumer.Assign(_topics.Select(topic => new TopicPartition(topic, _partition)).ToList());
while (true)
{
    var consumeResult = _consumer.Consume(cancellationToken.Token);
    // processing logic. abstracted the condition logic to "readRecordAndDetermineIfDelayIsUp"
    if (readRecordAndDetermineIfDelayIsUp(consumeResult))
    {
        // Stop fetching from the partition, then resume it after the delay on a background task.
        _consumer.Pause(_topics.Select(topic => new TopicPartition(topic, _partition)).ToList());
        Task.Run(async () =>
        {
            await Task.Delay(delay);
            _consumer.Resume(_topics.Select(topic => new TopicPartition(topic, _partition)).ToList());
        });
    }
    // This is the call that fails with "Broker: Unknown member".
    _consumer.Commit(consumeResult);
}
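For anyone trying to narrow this down, the following is a minimal sketch of how the commit call could be wrapped so that the "Unknown member" rejection is told apart from other commit failures. `CommitHelper` and `TryCommit` are hypothetical names that are not part of the original application; the sketch only assumes the `Confluent.Kafka` types already visible in the stack trace above.

```csharp
using System;
using Confluent.Kafka;

public static class CommitHelper
{
    // Hypothetical helper (not from the original code): commits the offset of
    // one consumed message and reports whether the broker accepted the commit.
    public static bool TryCommit<TKey, TValue>(
        IConsumer<TKey, TValue> consumer,
        ConsumeResult<TKey, TValue> result)
    {
        try
        {
            consumer.Commit(result);
            return true;
        }
        catch (KafkaException ex) when (ex.Error.Code == ErrorCode.UnknownMemberId)
        {
            // The group coordinator did not recognise this consumer as a member
            // of the group named in GroupId, so the offset was not stored.
            Console.Error.WriteLine(
                $"Commit rejected for {result.TopicPartitionOffset}: {ex.Error.Reason}");
            return false;
        }
    }
}
```

Any other `KafkaException` still propagates to the caller, so only this specific broker error is logged and swallowed.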
EDIT: After additional testing, auto commit also receives the same error, so I don’t think the issue is with manually committing.
Issue Analytics
- Created 3 years ago
- Comments: 5 (1 by maintainers)
Top GitHub Comments
Some technical details, please? 😃 Would changing the consumerGroupName resolve this? And if so, how do I avoid getting the same message again from the topic? In fact, how do I get that message committed so that it goes away and doesn't show up when I invoke Consume()?
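To make the group-name question concrete, here is a minimal sketch of giving the manually assigned delay consumers their own group id. The group name `example-group-delay` is hypothetical. Whether this resolves the error in the original setup is an assumption: it rests on the primary consumer joining `example-group` via `Subscribe()`, in which case the broker can reject commits from consumers that use `Assign()` under the same group id because they are not members of that active group.

```csharp
using Confluent.Kafka;

// Primary consumer: assumed to use Subscribe() and take part in the group.
var primaryConfig = new ConsumerConfig
{
    BootstrapServers = "example-servers:9092",
    GroupId = "example-group",
    EnableAutoCommit = false
};

// Delay consumers: use Assign() under a separate, hypothetical group id, so
// their commits are not made against a group that has active members.
var delayConfig = new ConsumerConfig
{
    BootstrapServers = "example-servers:9092",
    GroupId = "example-group-delay",
    EnableAutoCommit = false,
    // A new group id has no committed offsets yet, so this decides where
    // the consumer starts reading the first time.
    AutoOffsetReset = AutoOffsetReset.Earliest
};
```

Committed offsets are stored per group id, so once a commit under the new group succeeds, that message is not returned again by `Consume()` after a restart.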
The error message seems confusing. I think it might be related to timeouts. In my case, when I slowly stepped through the code with the debugger I got this message, but when I quickly skipped all the breakpoints the commit succeeded. I also think so because there was something about pausing and leaving the group in the librdkafka logs.
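If timeouts are indeed involved, the two settings most likely to matter are sketched below. The values are illustrative only, not recommendations. Both settings govern group membership for consumers that use `Subscribe()`; whether they have any effect on manually assigned consumers like those in the original report is unclear.

```csharp
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "example-servers:9092",
    GroupId = "example-group",
    EnableAutoCommit = false,
    // How long the broker waits without heartbeats before it removes the
    // consumer from the group (illustrative value).
    SessionTimeoutMs = 30000,
    // Longest allowed gap between Consume() calls before the consumer is
    // considered failed and leaves the group (illustrative value).
    MaxPollIntervalMs = 600000
};
```

Sitting on a breakpoint for longer than either interval would be consistent with the behaviour described above, since a consumer that has left the group can no longer commit as a member.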