Why is Commit required after reading message off a topic for Consumer
Description
I want to understand why a commit (automatic or manual) is required after reading messages off a topic with either Poll() or Consume().
Below is .NET code that combines Poll() with a manual CommitAsync(). It is modified from the AdvancedConsumer example: https://github.com/confluentinc/confluent-kafka-dotnet/blob/master/examples/AdvancedConsumer/Program.cs
public static void Run_PollWithManualCommit(string brokerList, List<string> topics)
{
    using (var consumer = new Consumer<Ignore, string>(constructConfig(brokerList, false), null, new StringDeserializer(Encoding.UTF8)))
    {
        // Note: All event handlers are called on the main thread.
        consumer.OnMessage += (_, msg) =>
        {
            Console.WriteLine($"Topic: {msg.Topic} Partition: {msg.Partition} Offset: {msg.Offset} {msg.Value}");
            Console.WriteLine($"Committing offset");
            // Blocks until the commit for this message has completed.
            var committedOffsets = consumer.CommitAsync(msg).Result;
            Console.WriteLine($"Committed offset: [{string.Join(", ", committedOffsets.Offsets)}]");
        };

        consumer.OnPartitionEOF += (_, end)
            => Console.WriteLine($"Reached end of topic {end.Topic} partition {end.Partition}, next message will be at offset {end.Offset}");

        // Raised on critical errors, e.g. connection failures or all brokers down.
        consumer.OnError += (_, error)
            => Console.WriteLine($"Error: {error}");

        // Raised on deserialization errors or when a consumed message has an error != NoError.
        consumer.OnConsumeError += (_, msg)
            => Console.WriteLine($"Error consuming from topic/partition/offset {msg.Topic}/{msg.Partition}/{msg.Offset}: {msg.Error}");

        // This is NOT called when auto-commit is disabled.
        consumer.OnOffsetsCommitted += (_, commit) =>
        {
            Console.WriteLine($"[{string.Join(", ", commit.Offsets)}]");
            if (commit.Error)
            {
                Console.WriteLine($"Failed to commit offsets: {commit.Error}");
            }
            else
            {
                // Report success only when the commit did not fail.
                Console.WriteLine($"Successfully committed offsets: [{string.Join(", ", commit.Offsets)}]");
            }
        };

        consumer.OnPartitionsAssigned += (_, partitions) =>
        {
            Console.WriteLine($"Assigned partitions: [{string.Join(", ", partitions)}], member id: {consumer.MemberId}");
            consumer.Assign(partitions);
        };

        consumer.OnPartitionsRevoked += (_, partitions) =>
        {
            Console.WriteLine($"Revoked partitions: [{string.Join(", ", partitions)}]");
            consumer.Unassign();
        };

        //consumer.OnStatistics += (_, json)
        //    => Console.WriteLine($"Statistics: {json}");

        // The Subscribe() method controls which topics will be fetched in Poll().
        consumer.Subscribe(topics);
        Console.WriteLine($"Subscribed to: [{string.Join(", ", consumer.Subscription)}]");

        var cancelled = false;
        Console.CancelKeyPress += (_, e) => {
            e.Cancel = true; // prevent the process from terminating.
            cancelled = true;
        };

        Console.WriteLine("Ctrl-C to exit.");
        while (!cancelled)
        {
            consumer.Poll(TimeSpan.FromMilliseconds(100));
        }
    }
}
private static Dictionary<string, object> constructConfig(string brokerList, bool enableAutoCommit) =>
    new Dictionary<string, object>
    {
        { "group.id", "advanced-csharp-consumer" },
        { "enable.auto.commit", enableAutoCommit },
        { "auto.commit.interval.ms", 5000 },
        { "statistics.interval.ms", 60000 },
        { "bootstrap.servers", brokerList },
        { "default.topic.config", new Dictionary<string, object>()
            {
                { "auto.offset.reset", "smallest" }
            }
        }
    };
I then commented out the commit lines as shown below and ran the consumer again. It still remembers the CORRECT offset. That is, the correct offset is used with Poll() and manual commit mode even though CommitAsync() is never called.
consumer.OnMessage += (_, msg) =>
{
    Console.WriteLine($"Topic: {msg.Topic} Partition: {msg.Partition} Offset: {msg.Offset} {msg.Value}");
    //Console.WriteLine($"Committing offset");
    // var committedOffsets = consumer.CommitAsync(msg).Result;
    // Console.WriteLine($"Committed offset: [{string.Join(", ", committedOffsets.Offsets)}]");
};
For example, here is the test:
Step 1: Run the consumer to consume one message.
Step 2: Kill it.
Step 3: Restart it.
Step 4: It still remembers the correct offset after each restart, as shown below.
D:\myStudio2\Kafka\confluent-kafka-dotnet\examples\AdvancedConsumer>dotnet run PollWithManualCommit localhost:9092 Advanced
Subscribed to: [Advanced]
Ctrl-C to exit.
Assigned partitions: [Advanced [0]], member id: rdkafka-e4310a50-4621-4c6b-9a26-73fc984b7072
Topic: Advanced Partition: 0 Offset: 0 hello
Reached end of topic Advanced partition 0, next message will be at offset 1
Below is the output of the producer, which shows that the message was written at offset 0:
D:\myStudio2\Kafka\confluent-kafka-dotnet\examples\AdvancedProducer>dotnet run localhost:9092 Advanced
-----------------------------------------------------------------------
Producer rdkafka#producer-1 producing on topic Advanced.
-----------------------------------------------------------------------
To create a kafka message with UTF-8 encoded key/value message:
> key value<Enter>
To create a kafka message with empty key and UTF-8 encoded value:
> value<enter>
Ctrl-C to quit.
> hello
Partition: 0, Offset: 0
>
I am trying to avoid Consume() because it may be deprecated, according to this API doc: https://docs.confluent.io/current/clients/confluent-kafka-dotnet/api/Confluent.Kafka.Consumer.html#Confluent_Kafka_Consumer_Consume_Confluent_Kafka_Message__System_Int32_
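As a rough way to picture what a commit changes, the sketch below models committed offsets as a table keyed by consumer group, topic and partition. It is a hypothetical in-memory model, not Confluent.Kafka code: OffsetStoreModel, Commit and StartPosition are names invented for illustration. It assumes the usual Kafka convention that the committed value is the offset of the next message to read, and that a group with no committed offset falls back to the auto.offset.reset policy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of committed offsets.
// None of these types or methods come from Confluent.Kafka.
public sealed class OffsetStoreModel
{
    // Committed offsets are keyed by (group, topic, partition).
    private readonly Dictionary<(string Group, string Topic, int Partition), long> _committed =
        new Dictionary<(string Group, string Topic, int Partition), long>();

    // Records the offset of the NEXT message this group should read.
    public void Commit(string group, string topic, int partition, long nextOffset)
    {
        _committed[(group, topic, partition)] = nextOffset;
    }

    // Where a (re)started consumer begins reading: the committed offset if the
    // group has one, otherwise the position chosen by auto.offset.reset.
    public long StartPosition(string group, string topic, int partition,
                              string autoOffsetReset, long logStart, long logEnd)
    {
        if (_committed.TryGetValue((group, topic, partition), out var committed))
        {
            return committed;
        }
        bool fromStart = autoOffsetReset == "smallest" || autoOffsetReset == "earliest";
        return fromStart ? logStart : logEnd;
    }
}

public static class OffsetStoreDemo
{
    public static void Main()
    {
        var store = new OffsetStoreModel();

        // The partition holds one message at offset 0: log start is 0, log end is 1.
        const long logStart = 0;
        const long logEnd = 1;

        // Nothing committed yet: a restart with "smallest" begins at offset 0 again.
        Console.WriteLine(store.StartPosition("g1", "Advanced", 0, "smallest", logStart, logEnd)); // prints 0

        // After committing the message at offset 0, the stored value is 1.
        store.Commit("g1", "Advanced", 0, 1);
        Console.WriteLine(store.StartPosition("g1", "Advanced", 0, "smallest", logStart, logEnd)); // prints 1

        // A different group has no committed offset and starts from the fallback.
        Console.WriteLine(store.StartPosition("g2", "Advanced", 0, "smallest", logStart, logEnd)); // prints 0
    }
}
```

Under this model, a consumer that never commits and restarts with auto.offset.reset set to smallest begins at offset 0 each time, so a test against a topic holding a single message would print offset 0 on every run whether or not anything was committed.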
How to reproduce
Checklist
Please provide the following information:
- Confluent.Kafka nuget version: 0.11.3
- Apache Kafka version: 2.12-1.0.1
- Client configuration:
- Operating system: Windows 10
- Provide logs (with “debug” : “…” as necessary in configuration)
- Provide broker log excerpts
- Critical issue
@brunneus - your consumer.Assign call sets the consumption offset to 0. Try passing a List<TopicPartition> to Assign, not List<TopicPartitionOffset>, or set Offset to Offset.Invalid.

What you describe is expected behavior. CommitAsync will have no effect in your scenario because you are using a different consumer group on each run of your program (well I think so, you provide two variants of your program one where you don’t and one where you do, so I’m just guessing which one you’re actually running). Anyway, this is core functionality that is well tested, so it’s unlikely there is any issue.
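
The Assign advice in the first reply above can be pictured with a small model of how a start position might be resolved. This is a hypothetical sketch, not Confluent.Kafka code: AssignModel and ResolveStart are invented names, and a null explicit offset stands in for passing a plain TopicPartition or Offset.Invalid. It assumes that an explicit offset given at assignment time takes precedence over any committed offset.

```csharp
using System;

// Hypothetical model of start-position resolution at assignment time.
// None of these names come from Confluent.Kafka.
public static class AssignModel
{
    // explicitOffset: offset passed with the assignment, or null to stand in for
    //                 "no offset given" (a plain TopicPartition or Offset.Invalid).
    // committedOffset: the group's committed offset, or null if none exists.
    // resetFallback:  position chosen by auto.offset.reset when nothing is committed.
    public static long ResolveStart(long? explicitOffset, long? committedOffset, long resetFallback)
    {
        if (explicitOffset.HasValue)
        {
            // An explicit offset wins, so committed offsets are ignored.
            return explicitOffset.Value;
        }
        return committedOffset ?? resetFallback;
    }
}

public static class AssignDemo
{
    public static void Main()
    {
        // The group has committed offset 5; the reset fallback is 0.
        Console.WriteLine(AssignModel.ResolveStart(0, 5, 0));       // prints 0
        Console.WriteLine(AssignModel.ResolveStart(null, 5, 0));    // prints 5
        Console.WriteLine(AssignModel.ResolveStart(null, null, 0)); // prints 0
    }
}
```

In this model, assigning with an explicit offset of 0 restarts consumption from 0 on every run regardless of earlier commits, which is why the reply suggests leaving the offset unspecified.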