What's the problem with consumer groups?
I'm using the consumer group the way I believe is correct, but it doesn't behave the way I want. Here is my code:
#!/usr/bin/env python
import sys

from kafka.client import KafkaClient
from kafka.consumer import SimpleConsumer
from kafka.producer import SimpleProducer, KeyedProducer


def main():
    # Expect exactly one argument: "put" or "get".
    if len(sys.argv) != 2:
        sys.exit(0)
    kafka = KafkaClient("localhost:9092")
    if sys.argv[1] == "put":
        # Publish a single message to the topic.
        producer = SimpleProducer(kafka)
        resp = producer.send_messages("my-topic", "some message")
        print(resp)
    elif sys.argv[1] == "get":
        # Read from the topic as a member of "my-foo-group".
        consumer = SimpleConsumer(kafka, "my-foo-group", "my-topic")
        for message in consumer:
            print(message)


if __name__ == "__main__":
    main()
What I want is this: if I send a message to "my-topic", only one consumer in the group ("my-foo-group") should receive it. However, what I found is that no matter how many consumer processes I start, all of them eventually receive the message. Am I doing something wrong, or is this a problem with the Kafka Python client?
Issue Analytics
- State:
- Created 9 years ago
- Comments:9
Top GitHub Comments
Currently, the “high-level” JVM consumers use ZK to coordinate which partitions are read by which threads. Each consuming thread in the JVM consumer will be reading from at least one partition, and these consumer threads can exist across multiple JVMs. This means you can create one logical “consumer group” that consists of several threads across several JVMs, e.g. a topic with 32 partitions could be read by 4 JVMs with 8 threads each and the data would be evenly distributed among the consumers.
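To make the 32-partition example concrete, here is a minimal sketch of how partitions might be split evenly across consumer threads. It is not the ZooKeeper-based algorithm the JVM consumers actually run; `assign_partitions` and the thread names are hypothetical, and the contiguous-chunk strategy is just one simple way to get an even spread.

```python
def assign_partitions(num_partitions, members):
    """Split partition ids 0..num_partitions-1 into contiguous chunks.

    Illustrative only: every member gets an equal share, and the first
    few members absorb any remainder.
    """
    members = sorted(members)
    base, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


# 4 JVMs with 8 threads each, as in the example above (names are made up).
threads = ["jvm%d-thread%d" % (j, t) for j in range(4) for t in range(8)]
assignment = assign_partitions(32, threads)
```

With 32 partitions and 32 threads every thread ends up owning exactly one partition, so each message is read by only one member of the group.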
The reason we haven’t added this feature is that there is a complex algorithm involving ZooKeeper to make sure a thread is consuming the correct partition at the correct offset. There are plans to redesign this “coordinated consumption” in Kafka so that it does not depend on ZooKeeper. This will make it easier for clients like kafka-python to do this kind of thing.
So, in other words, we’ll have it eventually.
HTH
To be clear: kafka-python supports offset management and resumption. It does not support running C consumers against P partitions and automatically distributing the load so that no message has duplicate readers. If you need help getting resumption from an offset working, we'd be glad to help you out.
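Until automatic distribution exists, one possible workaround is to split the partitions by hand: each consumer process is started with its own index and reads only its share. The sketch below is an assumption-laden illustration, not something the maintainers proposed. `partitions_for_worker` is a hypothetical helper, the partition and worker counts are made up, and the commented-out call assumes that `SimpleConsumer` in your kafka-python version accepts a `partitions` argument.

```python
def partitions_for_worker(num_partitions, worker_index, worker_count):
    """Return the partition ids that one worker should read.

    Partitions are dealt out round-robin, so no two workers share one.
    """
    if not 0 <= worker_index < worker_count:
        raise ValueError("worker_index must be in range(worker_count)")
    return [p for p in range(num_partitions) if p % worker_count == worker_index]


# Hypothetical setup: a topic with 4 partitions read by 2 processes.
first = partitions_for_worker(4, 0, 2)   # partitions for process 0
second = partitions_for_worker(4, 1, 2)  # partitions for process 1

# Assumed usage (requires a running broker, so it is left commented out):
# consumer = SimpleConsumer(kafka, "my-foo-group", "my-topic", partitions=first)
```

This only helps when the topic has more than one partition. With a single partition, only one worker has anything to read, and nothing rebalances the assignment if a worker dies.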