Can't calculate offset lag if consuming with KafkaConsumer


Hi, I ran into a strange issue while trying to calculate consumer offset lag. This is on Kafka 0.9.0.0.

How to reproduce: create a topic like this:

/opt/kafka/bin/kafka-topics.sh --create --topic topic_a  --partitions 5 --replication-factor 3 --zookeeper localhost

Then produce some messages:

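from kafka import SimpleClient, SimpleProducer

TOPIC = 'topic_a'  # the topic created above

client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
# async=False makes every send synchronous (Python 2 era kafka-python API)
producer = SimpleProducer(client=client, async=False)
for i in xrange(10):
    producer.send_messages(TOPIC, str(i))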

On the other side, half of the messages were consumed with:

from kafka import KafkaConsumer
GROUP = 'group_a'  # the group passed to the offset checker below

consumer = KafkaConsumer(TOPIC, bootstrap_servers='localhost:9092,localhost:9093,localhost:9094', group_id=GROUP)
print([next(consumer).value for _ in range(5)])
consumer.commit()

Now we see the correct total lag of 5:

/opt/kafka/bin/kafka-consumer-offset-checker.sh --group group_a --topic  topic_a --zookeeper localhost

Group    Topic    Pid  Offset  logSize  Lag  Owner
group_a  topic_a  0    2       2        0    none
group_a  topic_a  1    0       2        2    none
group_a  topic_a  2    2       2        0    none
group_a  topic_a  3    1       2        1    none
group_a  topic_a  4    0       2        2    none
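
The total of 5 is the sum of the Lag column, where each partition's lag is logSize minus the committed offset. A minimal sketch of that arithmetic, using the values copied from the checker output above (the variable names are only illustrative):

```python
# (partition, committed offset, log size) rows copied from the checker output
rows = [
    (0, 2, 2),
    (1, 0, 2),
    (2, 2, 2),
    (3, 1, 2),
    (4, 0, 2),
]

# Per-partition lag is the log size minus the committed offset.
lag_by_partition = {pid: log_size - offset for pid, offset, log_size in rows}
total_lag = sum(lag_by_partition.values())

print(lag_by_partition)
print('total lag: {}'.format(total_lag))
```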

I'm trying to get the same value with the following code:

from kafka import SimpleClient
from kafka.common import OffsetRequestPayload

client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
client.load_metadata_for_topics()

partitions = client.topic_partitions[TOPIC]
offset_requests = [OffsetRequestPayload(TOPIC, p, -1, 1) for p in partitions.keys()]

latest_offset_by_partition = {r.partition: r.offsets[0]
                              for r in client.send_offset_request(offset_requests)}
current_offset_by_partition = {r.partition: r.offset
                               for r in client.send_offset_fetch_request(GROUP, offset_requests)}
lag = 0
for part in partitions.keys():
    current = current_offset_by_partition.get(part, -1)
    latest = latest_offset_by_partition.get(part)
    lag += latest - current

print('lag: {}'.format(lag))

But I get this error: UnknownTopicOrPartitionError: UnknownTopicOrPartitionError - 3 - This request is for a topic or partition that does not exist on this broker.

If I repeat the same steps (create, produce) but consume with:

client = SimpleClient('localhost:9092,localhost:9093,localhost:9094')
consumer = SimpleConsumer(client, group=GROUP, topic=TOPIC)
consumer.get_messages(5)
consumer.commit()

I get the correct lag with my code, and it matches what kafka-consumer-offset-checker reports.

This is a bit confusing. I'm probably doing something wrong, so let me know if that's the case.

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 14 (6 by maintainers)

Top GitHub Comments

dpkp commented, Jun 19, 2017 (1 reaction)

The way I typically calculate lag is with the highwater() method of KafkaConsumer:

from kafka import TopicPartition

for msg in consumer:
    tp = TopicPartition(msg.topic, msg.partition)
    highwater = consumer.highwater(tp)
    # highwater is the offset the next produced message will get
    lag = (highwater - 1) - msg.offset
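
The "- 1" in the snippet above follows from what the highwater offset means: it is the offset that will be assigned to the next produced message, so the newest existing message sits at highwater - 1. Below is a small sketch of just that arithmetic. The helper name lag_after_message is hypothetical and not part of kafka-python. The None guard is an assumption based on highwater() being documented, as far as I know, to return None when the offset is not yet available.

```python
def lag_after_message(highwater, offset):
    """Number of messages still unread after consuming the message at `offset`.

    `highwater` is the offset the next produced message will get, so the
    newest existing message sits at `highwater - 1`.
    """
    if highwater is None:
        # Highwater not known yet, so the lag cannot be computed.
        return None
    return (highwater - 1) - offset


# A partition holding two messages (offsets 0 and 1) has highwater == 2.
print(lag_after_message(2, 0))  # the message at offset 1 is still unread
print(lag_after_message(2, 1))  # caught up
```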

jeffwidman commented, Nov 18, 2018 (0 reactions)

See also https://github.com/dpkp/kafka-python/pull/1643 which adds KafkaAdmin.list_consumer_group_offsets()
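
A rough sketch of how that method could be combined with KafkaConsumer.end_offsets() to get per-partition lag for a whole group. This is an assumption-laden example, not the maintainers' recommended recipe. In released kafka-python versions the admin class is, as far as I know, named KafkaAdminClient. The helpers compute_lag and group_lag are hypothetical names, and this has not been checked against a 0.9 broker like the one in the question. Partitions with no committed offset are treated as fully unread starting from offset 0, which is only right if the log has not been truncated.

```python
def compute_lag(committed, end_offsets):
    """Per-partition lag from two dicts keyed by partition.

    committed:   partition -> committed offset (None or -1 if nothing committed)
    end_offsets: partition -> offset of the next message to be written
    """
    lag = {}
    for tp, end in end_offsets.items():
        current = committed.get(tp)
        if current is None or current < 0:
            # Assumption: no commit means nothing was consumed and the log starts at 0.
            current = 0
        lag[tp] = end - current
    return lag


def group_lag(bootstrap_servers, group_id):
    """Fetch committed and end offsets from the cluster and return the lag."""
    # Imported here so compute_lag stays usable without kafka-python installed.
    from kafka import KafkaConsumer
    from kafka.admin import KafkaAdminClient

    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers)
    try:
        # Maps TopicPartition -> OffsetAndMetadata for partitions the group committed to.
        offsets = admin.list_consumer_group_offsets(group_id)
        committed = {tp: meta.offset for tp, meta in offsets.items()}
        end_offsets = consumer.end_offsets(list(committed))
        return compute_lag(committed, end_offsets)
    finally:
        consumer.close()
        admin.close()
```

With a running cluster, sum(group_lag('localhost:9092', 'group_a').values()) would give the total lag over the partitions the group has committed offsets for. Partitions the group never committed to are not included in that result.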


