Why is Commit required after a Consumer reads a message off a topic?


Description

I want to understand why Commit() is required (either auto or manual commit) after reading messages off a topic (using either Poll() or Consume()).

Below is my .NET code, which combines polling with a manual CommitAsync(). It is modified from the AdvancedConsumer example: https://github.com/confluentinc/confluent-kafka-dotnet/blob/master/examples/AdvancedConsumer/Program.cs

        public static void Run_PollWithManualCommit(string brokerList, List<string> topics)
        {
            using (var consumer = new Consumer<Ignore, string>(constructConfig(brokerList, false), null, new StringDeserializer(Encoding.UTF8)))
            {
                // Note: All event handlers are called on the main thread.

                consumer.OnMessage += (_, msg)
                    =>
                {
                    Console.WriteLine($"Topic: {msg.Topic} Partition: {msg.Partition} Offset: {msg.Offset} {msg.Value}");
                    Console.WriteLine($"Committing offset");
                    var committedOffsets = consumer.CommitAsync(msg).Result;
                    Console.WriteLine($"Committed offset: [{string.Join(", ", committedOffsets.Offsets)}]");
                };

                consumer.OnPartitionEOF += (_, end)
                    => Console.WriteLine($"Reached end of topic {end.Topic} partition {end.Partition}, next message will be at offset {end.Offset}");

                // Raised on critical errors, e.g. connection failures or all brokers down.
                consumer.OnError += (_, error)
                    => Console.WriteLine($"Error: {error}");

                // Raised on deserialization errors or when a consumed message has an error != NoError.
                consumer.OnConsumeError += (_, msg)
                    => Console.WriteLine($"Error consuming from topic/partition/offset {msg.Topic}/{msg.Partition}/{msg.Offset}: {msg.Error}");

                // Observed: this handler is NOT called when auto-commit is disabled.
                consumer.OnOffsetsCommitted += (_, commit) =>
                {
                    Console.WriteLine($"[{string.Join(", ", commit.Offsets)}]");

                    if (commit.Error)
                    {
                        Console.WriteLine($"Failed to commit offsets: {commit.Error}");
                    }
                    else
                    {
                        // Only report success when the commit did not fail.
                        Console.WriteLine($"Successfully committed offsets: [{string.Join(", ", commit.Offsets)}]");
                    }
                };

                consumer.OnPartitionsAssigned += (_, partitions) =>
                {
                    Console.WriteLine($"Assigned partitions: [{string.Join(", ", partitions)}], member id: {consumer.MemberId}");
                    consumer.Assign(partitions);
                };

                consumer.OnPartitionsRevoked += (_, partitions) =>
                {
                    Console.WriteLine($"Revoked partitions: [{string.Join(", ", partitions)}]");
                    consumer.Unassign();
                };

                //consumer.OnStatistics += (_, json)
                //    => Console.WriteLine($"Statistics: {json}");

                //The subscribe() method controls which topics will be fetched in poll.
                consumer.Subscribe(topics);

                Console.WriteLine($"Subscribed to: [{string.Join(", ", consumer.Subscription)}]");

                var cancelled = false;
                Console.CancelKeyPress += (_, e) => {
                    e.Cancel = true; // prevent the process from terminating.
                    cancelled = true;
                };

                Console.WriteLine("Ctrl-C to exit.");
                while (!cancelled)
                {
                    consumer.Poll(TimeSpan.FromMilliseconds(100));
                }
            }
        }

        private static Dictionary<string, object> constructConfig(string brokerList, bool enableAutoCommit) =>
            new Dictionary<string, object>
            {
                { "group.id", "advanced-csharp-consumer" },
                { "enable.auto.commit", enableAutoCommit },
                { "auto.commit.interval.ms", 5000 },
                { "statistics.interval.ms", 60000 },
                { "bootstrap.servers", brokerList },
                { "default.topic.config", new Dictionary<string, object>()
                    {
                        { "auto.offset.reset", "smallest" }
                    }
                }
            };

I then commented out the commit lines as shown below and ran the consumer again; it still remembers the CORRECT offset. That is, the correct offset is used with Poll() and auto-commit disabled, even without calling CommitAsync().

      consumer.OnMessage += (_, msg)
                =>
            {
                Console.WriteLine($"Topic: {msg.Topic} Partition: {msg.Partition} Offset: {msg.Offset} {msg.Value}");
                //Console.WriteLine($"Committing offset");
               // var committedOffsets = consumer.CommitAsync(msg).Result;
               // Console.WriteLine($"Committed offset: [{string.Join(", ", committedOffsets.Offsets)}]");
            };

For example, here is the test:

Step 1: Run the consumer and consume one message.
Step 2: Kill it.
Step 3: Restart it.
Step 4: After each restart it still remembers the correct offset, as shown below.


D:\myStudio2\Kafka\confluent-kafka-dotnet\examples\AdvancedConsumer>dotnet run PollWithManualCommit localhost:9092 Advanced
Subscribed to: [Advanced]
Ctrl-C to exit.
Assigned partitions: [Advanced [0]], member id: rdkafka-e4310a50-4621-4c6b-9a26-73fc984b7072
Topic: Advanced Partition: 0 Offset: 0 hello
Reached end of topic Advanced partition 0, next message will be at offset 1

Below is the output of the producer, which shows the message was written at offset 0:

D:\myStudio2\Kafka\confluent-kafka-dotnet\examples\AdvancedProducer>dotnet run localhost:9092 Advanced

-----------------------------------------------------------------------
Producer rdkafka#producer-1 producing on topic Advanced.
-----------------------------------------------------------------------
To create a kafka message with UTF-8 encoded key/value message:
> key value<Enter>
To create a kafka message with empty key and UTF-8 encoded value:
> value<enter>
Ctrl-C to quit.

> hello
Partition: 0, Offset: 0
>
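One way to check what the restart test above actually demonstrates is to ask the broker which offset is stored for the consumer group, rather than inferring it from the messages that get redelivered. The helper below is a minimal sketch, not part of the original program: `OffsetDiagnostics` and `PrintCommitted` are hypothetical names, and the `Committed(partitions, timeout)` call and `Assignment` property are assumed to have the shape they have in the 0.11.x client. It needs a running broker and should be called only after partitions have been assigned.

```csharp
using System;
using Confluent.Kafka;

// Hypothetical helper for illustration; assumes the Confluent.Kafka 0.11.x API.
static class OffsetDiagnostics
{
    // Prints the offset stored for this consumer's group for every
    // currently assigned partition.
    public static void PrintCommitted(Consumer<Ignore, string> consumer)
    {
        // Blocking request to the group coordinator; the timeout is arbitrary.
        var committed = consumer.Committed(consumer.Assignment, TimeSpan.FromSeconds(10));

        foreach (var tpo in committed)
        {
            // An offset printed as invalid/unset would suggest that nothing
            // has been committed for this group and partition yet.
            Console.WriteLine($"{tpo.Topic} [{tpo.Partition}] committed: {tpo.Offset} error: {tpo.Error}");
        }
    }
}
```

Running this once with the CommitAsync() lines enabled and once with them commented out should show whether a commit was really stored in each case.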

I am trying to avoid Consume() because it might be deprecated according to this API doc: https://docs.confluent.io/current/clients/confluent-kafka-dotnet/api/Confluent.Kafka.Consumer.html#Confluent_Kafka_Consumer_Consume_Confluent_Kafka_Message__System_Int32_

How to reproduce

Checklist

Please provide the following information:

  • Confluent.Kafka nuget version: 0.11.3
  • Apache Kafka version: 2.12-1.0.1
  • Client configuration:
  • Operating system: Windows 10
  • Provide logs (with “debug” : “…” as necessary in configuration)
  • Provide broker log excerpts
  • Critical issue

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 21 (10 by maintainers)

Top GitHub Comments

1 reaction
mhowlett commented, Apr 13, 2018

@brunneus - your consumer.Assign call sets the consumption offset to 0. Try passing a List<TopicPartition> to Assign, not List<TopicPartitionOffset>, or set Offset to Offset.Invalid.
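The two options named in this comment can be sketched as follows. This is an illustration only: the code the comment refers to is not shown on this page, `AssignExamples` and its method names are hypothetical, and the `Assign` overloads and the `TopicPartitionOffset(TopicPartition, Offset)` constructor are assumed to match the 0.11.x client.

```csharp
using System.Collections.Generic;
using System.Linq;
using Confluent.Kafka;

// Hypothetical helpers for illustration; assumes the Confluent.Kafka 0.11.x API.
static class AssignExamples
{
    // Option 1: pass plain TopicPartition values, so no explicit start offset
    // is supplied with the assignment.
    public static void AssignWithoutOffsets(
        Consumer<Ignore, string> consumer, List<TopicPartition> partitions)
    {
        consumer.Assign(partitions);
    }

    // Option 2: pass TopicPartitionOffset values whose Offset is Offset.Invalid
    // instead of a concrete value such as 0.
    public static void AssignWithInvalidOffsets(
        Consumer<Ignore, string> consumer, List<TopicPartition> partitions)
    {
        consumer.Assign(partitions.Select(tp => new TopicPartitionOffset(tp, Offset.Invalid)));
    }
}
```

Either helper could be called from an OnPartitionsAssigned handler in place of an Assign call that hard-codes offset 0.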

1 reaction
mhowlett commented, Mar 22, 2018

What you describe is expected behavior. CommitAsync will have no effect in your scenario because you are using a different consumer group on each run of your program (well, I think so: you provide two variants of your program, one where you don’t and one where you do, so I’m just guessing which one you’re actually running). Anyway, this is core functionality that is well tested, so it’s unlikely there is any issue.
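The point about consumer groups can be illustrated with a toy model. The class below is not the Kafka client API; `ToyOffsetStore` is a hypothetical in-memory stand-in that only captures the idea that committed offsets are stored per group, topic and partition, and that a group with no stored offset falls back to the reset policy (modelled here as the earliest offset, as with `auto.offset.reset` set to `smallest`).

```csharp
using System.Collections.Generic;

// Toy in-memory model for illustration only; this is NOT the Kafka client API.
public class ToyOffsetStore
{
    // Committed offsets are keyed by (group, topic, partition).
    private readonly Dictionary<(string, string, int), long> committed
        = new Dictionary<(string, string, int), long>();

    // Records the offset of the next message the group should read.
    public void Commit(string group, string topic, int partition, long nextOffset)
    {
        committed[(group, topic, partition)] = nextOffset;
    }

    // Returns where a consumer in `group` would start: its committed offset if
    // one exists, otherwise the earliest offset (the modelled reset policy).
    public long StartOffset(string group, string topic, int partition, long earliest = 0)
    {
        return committed.TryGetValue((group, topic, partition), out var offset)
            ? offset
            : earliest;
    }
}
```

In this model, after `store.Commit("advanced-csharp-consumer", "Advanced", 0, 1)`, a call to `store.StartOffset("advanced-csharp-consumer", "Advanced", 0)` returns 1, while `store.StartOffset("some-other-group", "Advanced", 0)` returns 0. A commit made under one group id therefore has no visible effect on a run that uses a different group id.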
