seek_to_beginning is difficult to use when using topic subscription

I have some issues using KafkaConsumer.seek_to_beginning. The only way I have gotten it to work so far is to call consumer.topics() before calling seek_to_beginning. I do not understand why this helps; I only found it by trial and error.

I suggest either updating the documentation to reflect this behaviour or changing the implementation of seek_to_beginning to be more intuitive. I’m more than happy to help with the documentation, although that might be of limited use since I don’t understand the current behaviour.

from kafka import KafkaConsumer

consumer = KafkaConsumer("some.topic",
                         bootstrap_servers=["kafka.example.com"],
                         group_id='some-group-id')

# consumer.topics()  # uncommenting this call makes seek_to_beginning() work
consumer.seek_to_beginning()

for message in consumer:
    print(message)

Without the consumer.topics() call I get the following exception:

$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 9, in <module>
    consumer.seek_to_beginning()
  File "./venv/lib/python3.5/site-packages/kafka/consumer/group.py", line 581, in seek_to_beginning
    assert partitions, 'No partitions are currently assigned'
AssertionError: No partitions are currently assigned

I'm using kafka-python 1.0.2.

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 17 (8 by maintainers)

Top GitHub Comments

dpkp commented, Mar 16, 2016 (16 reactions)

Agree, it is a bit strange. The API is modeled after the official Java client, which has the same issue. I am going to wait to see how they handle it before implementing any API changes. In the meantime, you can try a few different approaches:

(1) set auto_offset_reset='earliest' in your KafkaConsumer configuration. This will cause the consumer to fetch from the beginning of the topic/partition if the consumer group does not have a committed offset. So this works for the first run on a consumer group, but subsequent runs will resume at whatever offset the group last committed.

(2) in addition to (1), also set group_id=None. This is roughly similar to the console-consumer --from-beginning. It will not commit offsets, but it will also not do group coordination, which means you won't be able to run several consumers together and have the partitions automatically divided up and allocated between them.

(3) manually assign partitions via consumer.assign() instead of subscribing to topics via consumer.subscribe(). If you do this, seek_to_beginning() should work as expected.

There are a few other approaches, but these are the 3 I generally recommend at this point.
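
The three approaches above can be sketched as follows. This is a minimal illustration, not code from the thread: the helper names consumer_kwargs and consume_from_beginning are hypothetical, and the sketch assumes that partitions_for_topic() returns the partition ids once metadata is available (it may return None otherwise, depending on the kafka-python version).

```python
def consumer_kwargs(bootstrap_servers, group_id=None):
    """Keyword arguments for approach (1), or (2) when group_id is None."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": group_id,
        # Only takes effect when the group has no committed offset.
        "auto_offset_reset": "earliest",
    }


def consume_from_beginning(topic, bootstrap_servers):
    """Approach (3): assign partitions manually, then seek to the beginning."""
    # Imported here so consumer_kwargs() is usable without kafka-python installed.
    from kafka import KafkaConsumer, TopicPartition

    # No topic is passed to the constructor, so nothing is subscribed.
    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers, group_id=None)

    # Assumption: metadata for the topic is available at this point.
    partition_ids = consumer.partitions_for_topic(topic)
    if not partition_ids:
        raise RuntimeError("no partition metadata for topic %r" % topic)

    partitions = [TopicPartition(topic, p) for p in sorted(partition_ids)]
    consumer.assign(partitions)
    consumer.seek_to_beginning(*partitions)

    for message in consumer:
        print(message)
```

For approaches (1) and (2), the dictionary would be passed as KafkaConsumer("some.topic", **consumer_kwargs(["kafka.example.com"], group_id="some-group-id")), or with group_id left as None for approach (2).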

aure-olli commented, Jun 19, 2019 (3 reactions)

Apparently, the only thing to do is to call consumer._client.poll() before calling consumer.seek_to_beginning(). This will eventually send the metadata request, and dispatch the partitions.

from kafka import KafkaConsumer
consumer = KafkaConsumer('test', bootstrap_servers='localhost:9092')
# poll has a timeout, be sure that the response has arrived
while not consumer._client.poll(): continue
consumer.seek_to_beginning()

The client's poll() function calls client._maybe_refresh_metadata(), client._poll() and client._fire_pending_completed_requests(). No other function calls this sequence.

Unfortunately, the only functions calling consumer._client.poll() are consumer.__next__ and consumer.poll (which does not return the response, so it is impossible to check whether it has arrived), so there’s no cleaner way to do this currently.
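
A possible alternative that avoids the private _client attribute is to call the public consumer.poll() until consumer.assignment() is non-empty, and only then seek. The sketch below is an assumption rather than something proposed in the thread: the helper name seek_to_beginning_when_assigned and its timeout parameters are hypothetical, and any records returned by poll() during the wait are discarded, which is acceptable here only because the positions are reset to the beginning afterwards.

```python
import time


def seek_to_beginning_when_assigned(consumer, timeout_s=30.0, poll_ms=100):
    """Poll until partitions are assigned, then seek them to the beginning.

    `consumer` is expected to provide poll(), assignment() and
    seek_to_beginning(), as kafka-python's KafkaConsumer does.
    """
    deadline = time.monotonic() + timeout_s
    while not consumer.assignment():
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no partitions assigned within %.1f s" % timeout_s)
        # poll() drives metadata refresh and the group join; records it
        # returns are dropped because we re-read from the start anyway.
        consumer.poll(timeout_ms=poll_ms)
    # With no arguments, all currently assigned partitions are sought.
    consumer.seek_to_beginning()
    return consumer.assignment()
```

It would be called once after constructing the subscribed consumer and before the for message in consumer loop.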
