Add context manager (with statement) for more Pythonic use of Consumers (and Producers)
This is a feature request. I’m interested in trying to write it myself, but I want feedback on the idea before making a large pull request. Let me know if there’s a better location to submit this than here.
I was watching this Raymond Hettinger talk, where starting here he demonstrates best practice for wrapping a Java interface for use in Python, and I instantly thought of kafka-python.
Currently code using a KafkaConsumer looks like this (copied from example.py):
...
consumer = KafkaConsumer(bootstrap_servers='localhost:9092',
                         auto_offset_reset='earliest',
                         consumer_timeout_ms=1000)
consumer.subscribe(['my-topic'])

while not self.stop_event.is_set():
    for message in consumer:
        print(message)
        if self.stop_event.is_set():
            break

consumer.close()
...
Note the need for a manual consumer.close() and consumer.subscribe(), as well as the checking of self.stop_event.is_set() in two places.
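As a point of comparison, the manual close() can already be avoided with contextlib.closing from the standard library, which calls close() when the block exits, even if the loop raises. Below is a minimal sketch of that idea. FakeConsumer is a hypothetical stand-in I made up so the snippet runs without a broker; it is not part of kafka-python and models only iteration and close().

```python
from contextlib import closing


class FakeConsumer:
    """Hypothetical stand-in for KafkaConsumer so the sketch runs without a broker."""

    def __init__(self, messages):
        self._messages = list(messages)
        self.closed = False

    def __iter__(self):
        return iter(self._messages)

    def close(self):
        self.closed = True


consumer = FakeConsumer(["a", "b", "c"])
seen = []

# closing() calls consumer.close() when the block exits, even on an exception.
with closing(consumer) as c:
    for message in c:
        seen.append(message)
```

This only covers the close() part. The subscription and the stop condition still have to be handled by hand.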
Mirroring the linked talk, a more Pythonic interface would do the close() for us and include the subscribe() in the initial call (though still allowing later subscribe() calls to change the subscription). It would look like this:
with KafkaConsumer(bootstrap_servers='localhost:9092',
                   auto_offset_reset='earliest',
                   subscribe=['my-topic'],
                   stop_if=self.stop_event.is_set) as consumer:
    for message in consumer:
        print(message)
The with statement here, with tweaks to the iterator, could do a bunch of things for you automatically:
- Open the consumer and optionally create an initial subscription.
- To avoid breaking existing implementations, an alternate constructor for the for loop iterator allows us to pass a bool variable, lambda or function that, when true, will raise StopIteration instead of returning another message. It would be called either because of a parameter in the init, or by using a line like for message in consumer.stop_if(self.stop_event.is_set): this eliminates an extra loop and an if statement by keeping all loop control logic in one place. The condition could even be removed later by consumer.stop_if = None, resulting in the old (current) behavior. itertools.takewhile(predicate, iterable) would probably be a good place to start when building this.
- Call close() and any other teardown logic, and optionally commit reads, without needing any additional code.
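The points above can be sketched roughly as follows. This is only an illustration of the idea, not an existing kafka-python API: ManagedConsumer, its stop_if parameter and its stop_if() method are hypothetical names, and FakeConsumer is a made-up stand-in so the snippet runs without a broker.

```python
import threading
from itertools import takewhile


class FakeConsumer:
    """Hypothetical stand-in for KafkaConsumer; only iteration and close() are modeled."""

    def __init__(self, messages):
        self._messages = iter(messages)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._messages)

    def close(self):
        self.closed = True


class ManagedConsumer:
    """Hypothetical wrapper: adds `with` support and a stop_if() iterator."""

    def __init__(self, consumer, stop_if=None):
        self._consumer = consumer
        self._stop_if = stop_if  # callable returning True when iteration should end

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._consumer.close()  # always runs, even if the loop body raised
        return False  # do not swallow exceptions

    def stop_if(self, predicate):
        """Alternate iterator: yield messages until predicate() returns True."""
        return takewhile(lambda _msg: not predicate(), self._consumer)

    def __iter__(self):
        if self._stop_if is None:
            return iter(self._consumer)  # old (current) behavior
        return self.stop_if(self._stop_if)


stop_event = threading.Event()
raw = FakeConsumer(range(10))
received = []

with ManagedConsumer(raw, stop_if=stop_event.is_set) as consumer:
    for message in consumer:
        received.append(message)
        if message == 3:
            stop_event.set()  # in real code another thread would set this
```

Two caveats with building this on takewhile: the predicate is only evaluated after the next message has been fetched, so the message fetched just before the stop is noticed gets dropped, and a blocked iterator will not notice the stop condition until a message arrives or the iteration times out. A real implementation would probably want to check the condition before fetching instead.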
See also #1101
Here’s the wrapper I wrote that does work but forces a consumer_timeout_ms value to do it.
Edit: this wrapper actually works now. I just needed to reset the generator and the next consumer timeout on every loop exception. It still uses a hidden timeout, though.