Slow manual commits
Description
We need to expose manual sync commits to our clients, but we get really poor performance out of it when compared to async commits + callbacks.
Attached client logs with “debug: all” for sync and async versions. sync_commit.txt async_commit.txt
How to reproduce
It looks like the commit request itself is quite quick, but between the request being enqueued and sent to the broker it can take a while (~1 sec), and I can see quite a few FetchRequests in between commits even though the flow of our consumer is something like:
private Message<string, byte[]> ConsumeMessageSync()
{
    Message<string, byte[]> kafkaMessage;
    _consumer.Consume(out kafkaMessage, 100);
    return kafkaMessage;
}
var msg = ConsumeMessageSync();
var clientReadyMsg = process(msg);
emitMessageToClient(clientReadyMsg);
then client subscribes and commits after each emission...
What I don’t understand is why fetch requests are issued to the broker after the commit request is enqueued and while we wait for the commit result to come back. I played around with fetch.wait.max.ms, but that just changes the number of fetch requests that get sent in between.
Additionally, there are some weird PROTOERR-level messages like this:
7|2018-02-06 12:01:01.765|rdkafka#consumer-1|PROTOERR| [thrd:lonrs08346.my-domain.net:2182/bootstrap]: lonrs08346.my-domain.net:2182/2: Protocol parse failure at 1048332/1048648 (rd_kafka_msgset_reader_msg_v0_1:464) (incorrect broker.version.fallback?)
Probably not related but worth pointing out. Is there something I am missing? Thanks in advance!
Checklist
Please provide the following information:
- Confluent.Kafka nuget version: 0.11.3
- Apache Kafka version: 0.10.0.1
- Client configuration:
{"enable.auto.commit", "false"}, {"auto.offset.reset", "earliest"}
- Operating system: Win7x64
- Provide logs (with “debug” : “…” as necessary in configuration)
- Provide broker log excerpts
- Critical issue
Issue Analytics
- Created 6 years ago
- Reactions:1
- Comments:40 (20 by maintainers)
Thank you all for your patience.
I’ve now identified the issue: https://github.com/edenhill/librdkafka/blob/master/src/rdkafka_broker.c#L3187
When committing to a broker that we’re not fetching messages from, there is a high probability that queued ops (such as a Commit) will be delayed up to 1000ms before being sent, regardless of socket.blocking.max.ms.
I have a fix in place which I’ll test and then commit to master.
There is no workaround.
librdkafka issue: https://github.com/edenhill/librdkafka/issues/1787
@mhowlett I would suggest making it just a synchronous call, because that’s what it is. I would leave it up to the consumers of this library whether they want to wrap it in a Task.Run() or offload it onto another thread. To avoid making this a breaking change, you could implement a CommitAsync(this Consumer consumer, ...) extension method that wraps the synchronous Consumer.Commit() in a Task.Run(). Of course, since it’s a major release, it’s OK to make a breaking change.
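For illustration, the extension-method idea could look roughly like the sketch below. It deliberately does not use the real Confluent.Kafka types: ISyncCommitConsumer and RecordingConsumer are hypothetical stand-ins for a consumer with a blocking Commit(), and the actual signature and return type in the library may well differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a consumer whose Commit() blocks until the
// broker has acknowledged the offsets. Not the real Confluent.Kafka type.
public interface ISyncCommitConsumer
{
    void Commit();
}

public static class ConsumerCommitExtensions
{
    // Keeps an async-looking entry point for existing callers by running
    // the blocking Commit() on a thread-pool thread.
    public static Task CommitAsync(this ISyncCommitConsumer consumer)
    {
        if (consumer == null) throw new ArgumentNullException(nameof(consumer));
        return Task.Run(() => consumer.Commit());
    }
}

// Hypothetical test double that only counts how often Commit() was called.
public sealed class RecordingConsumer : ISyncCommitConsumer
{
    private int _commitCount;

    public int CommitCount => Volatile.Read(ref _commitCount);

    public void Commit() => Interlocked.Increment(ref _commitCount);
}
```

A caller that wants the old behaviour would write await consumer.CommitAsync(), while a caller that is already on a dedicated consume thread can call Commit() directly and avoid the thread-pool hop. Note that wrapping the call this way only moves the blocking wait to another thread; it does not make the commit itself any faster.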