can't calculate offset lag if consuming with KafkaConsumer
Hi, I've run into a strange issue while trying to calculate consumer offset lag. This is on Kafka 0.9.0.0.
How to reproduce: create a topic like this:
/opt/kafka/bin/kafka-topics.sh --create --topic topic_a --partitions 5 --replication-factor 3 --zookeeper localhost
Then produce some messages:
from kafka import SimpleClient, SimpleProducer
TOPIC = 'topic_a'  # the topic created above
client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
producer = SimpleProducer(client=client, async=False)
for i in xrange(10):
    producer.send_messages(TOPIC, str(i))
On the other side, half of the messages were consumed with:
from kafka import KafkaConsumer
GROUP = 'group_a'  # the group passed to the offset checker below
consumer = KafkaConsumer(TOPIC, bootstrap_servers='localhost:9092,localhost:9093,localhost:9094', group_id=GROUP)
print([next(consumer).value for _ in range(5)])
consumer.commit()
Now we see the correct total lag of 5:
/opt/kafka/bin/kafka-consumer-offset-checker.sh --group group_a --topic topic_a --zookeeper localhost
Group    Topic    Pid  Offset  logSize  Lag  Owner
group_a  topic_a  0    2       2        0    none
group_a  topic_a  1    0       2        2    none
group_a  topic_a  2    2       2        0    none
group_a  topic_a  3    1       2        1    none
group_a  topic_a  4    0       2        2    none
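As a quick check of the arithmetic, the total of 5 is just the sum of logSize minus Offset over the five partitions. The sketch below hard-codes the rows from the checker output above; the variable names are only illustrative.

```python
# (partition, committed offset, log size) rows copied from the checker output
rows = [
    (0, 2, 2),
    (1, 0, 2),
    (2, 2, 2),
    (3, 1, 2),
    (4, 0, 2),
]

# Per-partition lag is the log size minus the committed offset.
lag_by_partition = {pid: log_size - offset for pid, offset, log_size in rows}
total_lag = sum(lag_by_partition.values())

print(lag_by_partition)
print("total lag: {}".format(total_lag))
```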
I'm trying to get the same value with the following code:
from kafka.common import OffsetRequestPayload
client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
client.load_metadata_for_topics()
partitions = client.topic_partitions[TOPIC]
offset_requests = [OffsetRequestPayload(TOPIC, p, -1, 1) for p in partitions.keys()]
latest_offset_by_partition = {r.partition: r.offsets[0]
                              for r in client.send_offset_request(offset_requests)}
current_offset_by_partition = {r.partition: r.offset
                               for r in client.send_offset_fetch_request(GROUP, offset_requests)}
lag = 0
for part in partitions.keys():
    current = current_offset_by_partition.get(part, -1)
    latest = latest_offset_by_partition.get(part)
    lag += latest - current
print('lag: {}'.format(lag))
but I get UnknownTopicOrPartitionError: UnknownTopicOrPartitionError - 3 - This request is for a topic or partition that does not exist on this broker.
If I do the same (create, produce) but consume with:
from kafka import SimpleClient, SimpleConsumer
client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
consumer = SimpleConsumer(client, group=GROUP, topic=TOPIC)
consumer.get_messages(5)
consumer.commit()
I get the correct lag with my code, and it matches what kafka-consumer-offset-checker shows.
This is a bit confusing. I'm probably doing something wrong, so let me know if that's the case.
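One way to sidestep the mismatch might be to read both the committed and the latest offsets through KafkaConsumer itself. The sketch below is not from the issue: `group_lag` is a hypothetical helper, it assumes a kafka-python version that provides `KafkaConsumer.end_offsets()`, and it assumes a partition with no committed offset should count from offset 0. The `namedtuple` fallback only exists so the function can be exercised without kafka-python installed.

```python
try:
    from kafka import TopicPartition
except ImportError:
    # Fallback with the same fields, so the sketch can run without kafka-python.
    from collections import namedtuple
    TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])


def group_lag(consumer, topic):
    """Return {partition: lag} for the consumer's group on `topic`.

    `consumer` is expected to behave like a KafkaConsumer created with
    the group_id whose lag should be measured.
    """
    partitions = consumer.partitions_for_topic(topic) or set()
    tps = [TopicPartition(topic, p) for p in sorted(partitions)]
    end_offsets = consumer.end_offsets(tps)  # latest offset per partition
    lag = {}
    for tp in tps:
        committed = consumer.committed(tp)  # None if nothing was committed
        if committed is None:
            committed = 0  # assumption: no commit means nothing consumed yet
        lag[tp.partition] = end_offsets[tp] - committed
    return lag


# Usage against a real cluster (not executed here):
#   from kafka import KafkaConsumer
#   consumer = KafkaConsumer(bootstrap_servers='localhost:9092', group_id='group_a')
#   print(sum(group_lag(consumer, 'topic_a').values()))
```

The consumer here is never subscribed to the topic, so the lookup should not trigger a group rebalance or move any committed offsets.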
Issue Analytics
- State:
- Created 8 years ago
- Comments:14 (6 by maintainers)
Top GitHub Comments
The way I typically calculate lag is with the highwater() method of KafkaConsumer.
See also https://github.com/dpkp/kafka-python/pull/1643, which adds KafkaAdmin.list_consumer_group_offsets().
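The snippet that accompanied the highwater() comment did not survive on this page. A minimal sketch of that idea, under the assumption that lag is taken as highwater minus the consumer's current position for each assigned partition, could look like this. `lag_from_highwater` is a hypothetical helper name, not part of kafka-python.

```python
def lag_from_highwater(consumer):
    """Return {TopicPartition: lag} for the partitions assigned to `consumer`.

    highwater() only has a value after the consumer has fetched from a
    partition at least once, so partitions without one are skipped.
    """
    lag = {}
    for tp in consumer.assignment():
        highwater = consumer.highwater(tp)
        if highwater is None:
            continue  # no fetch response seen yet for this partition
        lag[tp] = highwater - consumer.position(tp)
    return lag


# Usage with a live consumer (not executed here):
#   consumer.poll(timeout_ms=1000)   # make sure at least one fetch happened
#   print(sum(lag_from_highwater(consumer).values()))
```

This measures the lag of the running consumer's own position rather than the group's last committed offset, so the two numbers can differ between commits.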