"Broker: Unknown member" on commit for consumer manually assigned to partitions

Description

I’m building an application that pushes messages that failed during processing to delay/retry topics. The primary consumer has no trouble manually committing its offsets as it reads messages.

The delay consumers are each pointed at a different partition of a delay topic, with each partition representing a timed delay.
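
As a rough illustration of the partition-per-delay layout, the sketch below maps each partition number to a delay and computes how long a consumer still has to wait before a message is due. The partition numbers, the delay values, and the `DelaySchedule` helper are hypothetical and are not taken from the original application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps a delay-topic partition to the delay it represents.
public static class DelaySchedule
{
    // Assumed values, for illustration only.
    private static readonly IReadOnlyDictionary<int, TimeSpan> DelayByPartition =
        new Dictionary<int, TimeSpan>
        {
            [0] = TimeSpan.FromSeconds(30),
            [1] = TimeSpan.FromMinutes(5),
            [2] = TimeSpan.FromMinutes(30),
        };

    // Returns how long to wait before the message is due, or zero if it is already due.
    public static TimeSpan RemainingDelay(int partition, DateTime producedAtUtc, DateTime nowUtc)
    {
        if (!DelayByPartition.TryGetValue(partition, out var delay))
            throw new ArgumentOutOfRangeException(nameof(partition), "No delay configured for this partition.");

        var remaining = producedAtUtc + delay - nowUtc;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

With Confluent.Kafka, `producedAtUtc` would typically come from `consumeResult.Message.Timestamp.UtcDateTime`.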

The delay consumers are unable to manually commit their offset after consuming a message and processing it. I have enabled broker and protocol Debug and see this preceding the error:

19:04:04%7|1605837844.917|SEND|rdkafka#consumer-3| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 43)
 %7|1605837844.918|SEND|rdkafka#consumer-2| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 187 bytes @ 0, CorrId 34)
INF] Received message on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.RETRY.AVRO. Partition [2]. Offset [92].
[19:04:04 INF] %7|1605837844.950|RETRY|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Moved 1 retry buffer(s) to output queue
Triggering Primary Topic reprocessing on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.PRIMARY_TOPIC.AVRO%7|1605837844.950|SEND|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Sent OffsetCommitRequest (v7, 145 bytes @ 0, CorrId 14)
 for RetryTransactionId 36898222-58b4-481e-8505-176465f4a5dd
%7|1605837844.952|RETRY|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Moved 1 retry buffer(s) to output queue
[%7|1605837844.952|RECV|rdkafka#consumer-5| [thrd:GroupCoordinator]: GroupCoordinator/1: Received OffsetCommitResponse (v7, 61 bytes, CorrId 14, rtt 1.49ms)
19:04:04%7|1605837844.953|RECV|rdkafka#consumer-6| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 35, rtt 101.42ms)
 %7|1605837844.953|SEND|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Sent OffsetCommitRequest (v7, 145 bytes @ 0, CorrId 14)
%7|1605837844.953|REQERR|rdkafka#consumer-5| [thrd:main]: GroupCoordinator/1: OffsetCommitRequest failed: Broker: Unknown member: explicit actions Refresh,Retry
INF%7|1605837844.954|RECV|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 111 bytes, CorrId 24, rtt 101.74ms)
%7|1605837844.954|RECV|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 47, rtt 100.73ms)
] %7|1605837844.955|SEND|rdkafka#consumer-6| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 36)
%7|1605837844.979|RECV|rdkafka#consumer-4| [thrd:GroupCoordinator]: GroupCoordinator/1: Received OffsetCommitResponse (v7, 61 bytes, CorrId 14, rtt 25.98ms)
Seeking to specific message on %7|1605837844.980|SEND|rdkafka#consumer-5| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FindCoordinatorRequest (v2, 69 bytes @ 0, CorrId 48)
RE.DOTNET.KAFKA_RETRY_EXAMPLE2.PRIMARY_TOPIC.AVRO%7|1605837844.980|SEND|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 139 bytes @ 0, CorrId 25)
, partition %7|1605837844.981|SEND|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 131 bytes @ 0, CorrId 48)
0%7|1605837844.981|REQERR|rdkafka#consumer-4| [thrd:main]: GroupCoordinator/1: OffsetCommitRequest failed: Broker: Unknown member: explicit actions Refresh,Retry
, offset %7|1605837844.983|RECV|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 379 bytes, CorrId 25, rtt 2.69ms)
1%7|1605837844.984|SEND|rdkafka#consumer-4| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FindCoordinatorRequest (v2, 69 bytes @ 0, CorrId 49)
%7|1605837844.984|SEND|rdkafka#consumer-9| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Sent FetchRequest (v11, 139 bytes @ 0, CorrId 26)
.
[19:04:04 ERR] Failed to commit consumption of message on RE.DOTNET.KAFKA_RETRY_EXAMPLE2.DELAY.AVRO. Partition [2]. Offset [18]. Reason [Broker: Unknown member]
                                Broker: Unknown member
                                   at Confluent.Kafka.Impl.SafeKafkaHandle.Commit(IEnumerable`1 offsets)
   at Confluent.Kafka.Consumer`2.Commit(IEnumerable`1 offsets)
   at Confluent.Kafka.Consumer`2.Commit(ConsumeResult`2 result)
   at Kafka.Retry.Consumers.ContinuousConsumer`2.HandleCommit(ConsumeResult`2 consumeResult) in C:\GitRepos\kafka-retry-module-.net\src\Consumers\SuperClasses\ContinuousConsumer.cs:line 105%7|1605837845.012|RECV|rdkafka#consumer-7| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Received FetchResponse (v11, 103 bytes, CorrId 44, rtt 101.49ms)

In the logs, before it starts consuming, I see quite a few of these messages about the delay consumers joining and leaving topics:

%7|1605835968.260|TOPBRK|rdkafka#consumer-6| [thrd::0/internal]: :0/internal: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [4]: joining broker (rktp 00000255C8D47F10, 0 message(s) queued)
%7|1605835968.261|TOPBRK|rdkafka#consumer-5| [thrd::0/internal]: :0/internal: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [2]: leaving broker (0 messages in xmitq, next broker localhost:9092/1, rktp 00000255C8B6B730)
%7|1605835968.261|TOPBRK|rdkafka#consumer-5| [thrd:localhost:9092/bootstrap]: localhost:9092/1: Topic RE.DOTNET.KAFKA_RETRY_EXAMPLE.DELAY.AVRO [1]: joining broker (rktp 00000255C8D454A0, 0 message(s) queued)

How to reproduce

I’m using a consumer config with

_config = new ConsumerConfig
{
    BootstrapServers = "example-servers:9092",
    GroupId = "example-group",
    EnableAutoCommit = false,
    StatisticsIntervalMs = 5000,
    SessionTimeoutMs = 10000,
    AutoOffsetReset = AutoOffsetReset.Earliest,
    EnablePartitionEof = true,
    Debug = "broker"
};
var schemaRegistryConfig = new SchemaRegistryConfig
{
    Url = "example-schema:8081",
    RequestTimeoutMs = 10000
};

The consumer assigns itself to the topic and partition and consumes:

_consumer.Assign(_topics.Select(topic => new TopicPartition(topic, partition)).ToList());

while (true)
{
    var consumeResult = _consumer.Consume(cancellationToken.Token);

    // processing logic. abstracted the condition logic to "readRecordAndDetermineIfDelayIsUp"
    if (readRecordAndDetermineIfDelayIsUp(consumeResult))
    {
        // Pause the partition, then resume it from a background task once the delay has elapsed.
        _consumer.Pause(_topics.Select(topic => new TopicPartition(topic, _partition)).ToList());
        Task.Run(async () => {
            await Task.Delay(delay);
            _consumer.Resume(_topics.Select(topic => new TopicPartition(topic, _partition)).ToList());
            return;
        });
    }

    _consumer.Commit(consumeResult);
}

EDIT: After additional testing auto commit also receives the same error, so I don’t think the issue is with manually committing.
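
To make the failure easier to inspect, the commit call can be wrapped so the broker error code is logged instead of the exception escaping the loop. This is only a minimal sketch: the `CommitHelper` class and its `TryCommit` method are hypothetical names, and the sketch shows the error rather than fixing its cause.

```csharp
using System;
using Confluent.Kafka;

// Hypothetical helper, not part of Confluent.Kafka.
public static class CommitHelper
{
    // Tries to commit the offset for a consumed message and reports whether it succeeded.
    public static bool TryCommit<TKey, TValue>(
        IConsumer<TKey, TValue> consumer,
        ConsumeResult<TKey, TValue> result)
    {
        try
        {
            consumer.Commit(result);
            return true;
        }
        catch (KafkaException ex)
        {
            // ErrorCode.UnknownMemberId is the code reported as "Broker: Unknown member".
            var hint = ex.Error.Code == ErrorCode.UnknownMemberId
                ? " (the group coordinator does not recognize this consumer as a group member)"
                : string.Empty;

            Console.Error.WriteLine(
                $"Commit failed for {result.TopicPartitionOffset}: {ex.Error.Code}: {ex.Error.Reason}{hint}");
            return false;
        }
    }
}
```

In the loop above, `_consumer.Commit(consumeResult);` would then become `CommitHelper.TryCommit(_consumer, consumeResult);`.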

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

8 reactions
aravindk777 commented, Mar 22, 2022

Some technical details, please? 😃 Would changing the consumerGroupName resolve this? If so, how do I avoid getting the same message from the topic again? In fact, how do I get that message committed so that it doesn't show up again when I invoke Consume()?

0 reactions
Arkemlar commented, Mar 20, 2023

The error message seems confusing. I think it might be related to timeouts. In my case, when I stepped through the code slowly with a debugger, it showed me this message, but when I quickly skipped all the breakpoints, it succeeded. I also think so because there was something about pausing and leaving the group in the librdkafka logs.
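
The two comments above hint at two things that could be tried, neither of which is confirmed in this thread as the fix: giving each manually assigned delay consumer a group id that is not shared with the primary consumer, and using more generous timeouts. A minimal configuration sketch under those assumptions follows; the `DelayConsumerConfigFactory` name, the group id pattern, and the timeout values are all hypothetical.

```csharp
using Confluent.Kafka;

// Hypothetical factory, for illustration only.
public static class DelayConsumerConfigFactory
{
    public static ConsumerConfig Create(int partition)
    {
        return new ConsumerConfig
        {
            BootstrapServers = "example-servers:9092",
            // Assumption: a group id used only by this manually assigned consumer,
            // so its commits are not mixed with the primary consumer's group.
            GroupId = $"example-group-delay-{partition}",
            EnableAutoCommit = false,
            AutoOffsetReset = AutoOffsetReset.Earliest,
            // Assumed values, more generous than the 10000 ms session timeout in the report.
            SessionTimeoutMs = 30000,
            MaxPollIntervalMs = 600000
        };
    }
}
```

Changing the group id means the new group starts with no committed offsets, so where consumption begins is then governed by `AutoOffsetReset`.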
