
Iterator interface and poll block with two consumers reading from one topic

See original GitHub issue

I have a topic with a single partition and two KafkaConsumers in two separate processes, with different group_ids. When I try to get messages from this topic (there are none yet), either via the iterator interface (with consumer_timeout_ms=10) or via KafkaConsumer.poll, the consumer blocks until session_timeout_ms expires, although I believe it's supposed to block for at most consumer_timeout_ms. Is this assumption correct? What might be causing it to block for longer? If there is only one reader, or if group_id is None, everything works as expected. Here are the log messages from just before the consumer blocks:

2017-09-19 15:01:06,706 [INFO] cluster: Group coordinator for my-group is BrokerMetadata(nodeId=0, host='kafka', port=9092, rack=None)
2017-09-19 15:01:06,707 [INFO] base: Discovered coordinator 0 for group my-group
2017-09-19 15:01:06,707 [INFO] consumer: Revoking previously assigned partitions set() for group my-group
2017-09-19 15:01:06,708 [INFO] base: (Re-)joining group my-group
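For context, the semantics the question relies on - a bounded wait that returns even when no data has arrived - can be illustrated with a stdlib queue standing in for the consumer. This is a minimal sketch, no Kafka required; `empty_topic` is an invented stand-in, and the 10 ms figure mirrors the consumer_timeout_ms used above:

```python
import queue
import time

# Stand-in for a topic with no messages; no Kafka broker is assumed here.
empty_topic = queue.Queue()
consumer_timeout_ms = 10  # same value as in the question

start = time.monotonic()
try:
    # A bounded blocking read: returns (or raises Empty) after ~10 ms.
    empty_topic.get(timeout=consumer_timeout_ms / 1000)
except queue.Empty:
    pass
elapsed_ms = (time.monotonic() - start) * 1000

# The wait is on the order of the requested timeout,
# nowhere near the ~30 s block observed in the issue.
print(round(elapsed_ms), 'ms')
```

The issue is that the real consumer is not honoring this bound while a group rebalance is in flight.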

Edit: In case the debug logs are useful, here they are:

2017-09-19 18:41:55,821 [DEBUG] base: Sending group coordinator request for group my-group to broker 0
2017-09-19 18:41:55,822 [DEBUG] conn: <BrokerConnection node_id=0 host=kafka/172.23.0.2 port=9092> Request 3: GroupCoordinatorRequest_v0(consumer_group='my-group')
2017-09-19 18:41:55,822 [DEBUG] client_async: Sending metadata request MetadataRequest_v1(topics=['my-topic']) to node 0
2017-09-19 18:41:55,823 [DEBUG] conn: <BrokerConnection node_id=0 host=kafka/172.23.0.2 port=9092> Request 4: MetadataRequest_v1(topics=['my-topic'])
2017-09-19 18:41:55,823 [DEBUG] consumer: Cannot auto-commit offsets for group my-group because the coordinator is unknown
2017-09-19 18:41:55,823 [DEBUG] conn: <BrokerConnection node_id=0 host=kafka/172.23.0.2 port=9092> Response 3: GroupCoordinatorResponse_v0(error_code=0, coordinator_id=0, host='kafka', port=9092)
2017-09-19 18:41:55,824 [DEBUG] base: Received group coordinator response GroupCoordinatorResponse_v0(error_code=0, coordinator_id=0, host='kafka', port=9092)
2017-09-19 18:41:55,824 [DEBUG] cluster: Updating coordinator for my-group: GroupCoordinatorResponse_v0(error_code=0, coordinator_id=0, host='kafka', port=9092)
2017-09-19 18:41:55,824 [INFO] cluster: Group coordinator for my-group is BrokerMetadata(nodeId=0, host='kafka', port=9092, rack=None)
2017-09-19 18:41:55,824 [INFO] base: Discovered coordinator 0 for group my-group
2017-09-19 18:41:55,825 [DEBUG] conn: <BrokerConnection node_id=0 host=kafka/172.23.0.2 port=9092> Response 4: MetadataResponse_v1(brokers=[(node_id=0, host='kafka', port=9092, rack=None)], controller_id=0, topics=[(error_code=0, topic='my-topic', is_internal=False, partitions=[(error_code=0, partition=0, leader=0, replicas=[0], isr=[0])])])
2017-09-19 18:41:55,825 [DEBUG] cluster: Updated cluster metadata to ClusterMetadata(brokers: 1, topics: 1, groups: 1)
2017-09-19 18:41:55,825 [INFO] consumer: Revoking previously assigned partitions set() for group my-group
2017-09-19 18:41:55,826 [INFO] base: (Re-)joining group my-group
2017-09-19 18:41:55,826 [DEBUG] base: Sending JoinGroup (JoinGroupRequest_v0(group='my-group', session_timeout=30000, member_id='', protocol_type='consumer', group_protocols=[(protocol_name='range', protocol_metadata=b'\x00\x00\x00\x00\x00\x01\x00\x0fmy-topic\x00\x00\x00\x00'), (protocol_name='roundrobin', protocol_metadata=b'\x00\x00\x00\x00\x00\x01\x00\x0fmy-topic\x00\x00\x00\x00')])) to coordinator 0
2017-09-19 18:41:55,827 [DEBUG] conn: <BrokerConnection node_id=0 host=kafka/172.23.0.2 port=9092> Request 5: JoinGroupRequest_v0(group='my-group', session_timeout=30000, member_id='', protocol_type='consumer', group_protocols=[(protocol_name='range', protocol_metadata=b'\x00\x00\x00\x00\x00\x01\x00\x0fmy-topic\x00\x00\x00\x00'), (protocol_name='roundrobin', protocol_metadata=b'\x00\x00\x00\x00\x00\x01\x00\x0fmy-topic\x00\x00\x00\x00')])
2017-09-19 18:41:55,924 [DEBUG] consumer: No offsets to commit
2017-09-19 18:41:55,925 [DEBUG] consumer: Successfully auto-committed offsets for group my-group
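One detail worth highlighting in the JoinGroup request above is session_timeout=30000: during a rebalance the coordinator waits up to that long for all known members to rejoin, which lines up with the roughly 30-second block being reported. Pulling the number out of the quoted log line (stdlib only; the line is abbreviated here):

```python
import re

# The JoinGroup line quoted above, abbreviated.
log_line = ("JoinGroupRequest_v0(group='my-group', session_timeout=30000, "
            "member_id='', protocol_type='consumer', ...)")

session_timeout_ms = int(re.search(r"session_timeout=(\d+)", log_line).group(1))
consumer_timeout_ms = 10  # the value used in the question

# 30 000 ms vs. 10 ms: the observed block matches the group session
# timeout, not the iterator timeout.
print(session_timeout_ms, consumer_timeout_ms)
```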

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
lopuhin commented, Sep 21, 2017

Thanks for looking into it, @tvoinarovskyi! I tried to build a reproducible example but haven't 100% succeeded so far - I only managed to "reproduce" it when I don't close the consumer (as you suggested here: https://github.com/dpkp/kafka-python/issues/1223#issuecomment-330829458).

I’m using Python 3.5 and Ubuntu 16.04.

Here is issue_1223.py:

import logging

import kafka
import kafka.client


topic_name = 'my-topic'
group_name = 'my-group'


def main():
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s [%(levelname)s] %(module)s: %(message)s')

    logging.info('Creating topic')
    client = kafka.client.KafkaClient()
    topics = client.cluster.topics()
    if topic_name not in topics:
        client.add_topic(topic_name)
    logging.info('done')

    consumer = kafka.KafkaConsumer(
        topic_name,
        group_id=group_name,
        consumer_timeout_ms=1000,
    )
    try:
        while True:
            logging.info('reading consumer...')
            for _ in consumer:
                pass
            logging.info('done')
    except KeyboardInterrupt:
        logging.info('Ctrl+C pressed, closing consumer and exiting')
        consumer.close()


if __name__ == '__main__':
    main()

I start kafka + zookeeper like this:

docker run -it --rm --name kafka -p 2181:2181 -p 9092:9092 --env ADVERTISED_HOST=127.0.0.1 --env ADVERTISED_PORT=9092 spotify/kafka

Then if I start python issue_1223.py in one tab, and the same in another tab, everything works fine. If I then remove the last line of main (consumer.close()) and restart one of the consumers, I get the hang.

To sum up: I'll try to understand why I got the issue in my original code, where I think I didn't have any consumers exiting, and will post an update if I figure it out.
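The pattern described above - the hang appearing only when a consumer exits without close() - is consistent with how group membership works: close() sends a LeaveGroup request so the coordinator can rebalance immediately, whereas a member that simply vanishes is only evicted after session_timeout_ms without a heartbeat. A toy model of that bookkeeping (pure Python, names invented for illustration; this is not kafka-python's implementation):

```python
SESSION_TIMEOUT_MS = 30_000  # matches the session_timeout in the logs above

class ToyCoordinator:
    """Minimal sketch of group-membership bookkeeping (illustration only)."""

    def __init__(self):
        self.members = {}  # member_id -> last time we heard from it (ms)

    def heartbeat(self, member, now_ms):
        self.members[member] = now_ms

    def leave(self, member):
        # A clean consumer.close() tells the coordinator immediately.
        self.members.pop(member, None)

    def evict_expired(self, now_ms):
        # A member that just vanished is only dropped once the session
        # timeout has elapsed without a heartbeat; until then, rebalances
        # keep waiting for it.
        dead = [m for m, seen in self.members.items()
                if now_ms - seen > SESSION_TIMEOUT_MS]
        for m in dead:
            del self.members[m]
        return dead

coord = ToyCoordinator()
coord.heartbeat('consumer-a', now_ms=0)
coord.heartbeat('consumer-b', now_ms=0)

coord.leave('consumer-a')                            # clean shutdown: gone at once
evicted_early = coord.evict_expired(now_ms=10_000)   # too soon: nothing evicted
evicted_late = coord.evict_expired(now_ms=40_000)    # past timeout: dead member dropped
```

Until the dead member is evicted, a freshly (re-)joining consumer in the same group is stuck waiting for the rebalance to complete, which would explain the observed block.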

0 reactions
tvoinarovskyi commented, Sep 21, 2017

Could you provide a reproducible example? I fail to reproduce this on my local machine… It's consuming just fine for me.


Top Results From Across the Web

  • Chapter 4. Kafka Consumers: Reading Data from Kafka
    When multiple consumers are subscribed to a topic and belong to the same consumer group, each consumer in the group will receive messages...

  • KafkaConsumer (kafka 2.5.0 API)
    A client that consumes records from a Kafka cluster. This client transparently handles the failure of Kafka brokers, and transparently adapts as topic...

  • KafkaConsumer — kafka-python 2.0.2-dev documentation
    Consume records from a Kafka cluster. The consumer will transparently handle the failure of servers in the Kafka cluster, and adapt as topic-partitions...

  • kafka-python Documentation - Read the Docs
    partition assignment to multiple consumers in the same group ... Incompatible with iterator interface – use one or the other, not both.

  • Learning Kafka with Python – consuming data - LeftAsExercise
    Now let us see how we can actually read data from a topic. The library offers two options to do this. First, we...
