Messages lost with new topic and regex subscription
Describe the bug
When a regex subscription detects a new topic, it takes time before the subscription's cursor is set up for that topic. Because the cursor is set to the end of the topic, at least one message is lost, and since the setup can take up to 40 seconds, up to 40 seconds of data can be lost.
To Reproduce
If I set up a consumer with a regex subscription, for example:
/opt/pulsar/bin/pulsar-client consume --regex '.*' -s all -n 0
I then send a message on a NEW topic that matches the regex.
/opt/pulsar/bin/pulsar-client produce addtopic -m 'm1'
The consumer detects the new topic and sets up a subscription to it. This can take 30-40 seconds. However, it does not see the message (or any other messages sent before the subscription is set up).
Once it is set up, sending more data to the topic will be picked up by the consumer.
/opt/pulsar/bin/pulsar-client produce addtopic -m 'm2'
The consumer will display the message ‘m2’.
So although it works from then on, up to the first 40 seconds of data may have been lost.
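The timeline above can be sketched with a toy model. This is not Pulsar's implementation; `Topic`, `Cursor`, and `subscribe_at_latest` are hypothetical names used only to illustrate why a cursor created at the end of a topic never sees messages published before the subscription existed.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy append-only log standing in for a topic (not Pulsar's implementation)."""
    messages: list = field(default_factory=list)

    def publish(self, payload):
        self.messages.append(payload)


@dataclass
class Cursor:
    """Toy subscription cursor: index of the next message to deliver."""
    topic: Topic
    position: int

    def drain(self):
        # Deliver everything from the cursor position onward, then advance.
        delivered = self.topic.messages[self.position:]
        self.position = len(self.topic.messages)
        return delivered


def subscribe_at_latest(topic):
    # A cursor created at the end of the topic skips everything already stored.
    return Cursor(topic, position=len(topic.messages))


# Timeline from the reproduction steps.
topic = Topic()
topic.publish("m1")                  # producer creates the topic and sends m1
cursor = subscribe_at_latest(topic)  # consumer discovers the topic some time later
topic.publish("m2")                  # sent after the subscription exists
received = cursor.drain()
print(received)                      # ['m2']
```

In this model `m1` is still stored in the topic, but the subscription never delivers it, which matches the behavior described in the report.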
Expected behavior
All messages sent to the new topic should be seen by the consumer.
Screenshots
N/A
Desktop (please complete the following information):
CentOS 7
Pulsar 2.5.0, 2.5.1
Additional context
The initial message(s) are on the topic; one can see them with a reader. So a solution would be for the cursor of the new topic's subscription to be created pointing to the start of the topic, rather than the usual end, in this case.
Issue Analytics
- Created 3 years ago
- Comments:7 (3 by maintainers)
The same applies to a partitioned consumer. IMO, when a consumer finds new topics or partitions, the subscription's initial position should be changed to earliest, no matter what the original initial position is.
Usually, consumers use the latest initial position to discard outdated messages. However, assume that the number of partitions is increased dynamically, i.e. some producers and consumers are currently serving this partitioned topic. If producers find the new partitions before consumers do, then from the consumer's point of view, the messages published before it starts consuming shouldn't be considered outdated.
What do you think of this change? @sijie
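The proposal in this comment can be sketched as a small policy function. This is only an illustration under stated assumptions, not the Pulsar client's actual logic: `initial_position`, `start_offset`, the `events-partition-N` names, and the backlog sizes are all hypothetical.

```python
def initial_position(configured, topic, known_at_subscribe):
    """Hypothetical policy: topics or partitions discovered after the
    consumer subscribed start at 'earliest', regardless of configuration."""
    if topic in known_at_subscribe:
        return configured
    return "earliest"


def start_offset(position, backlog_length):
    # 'earliest' starts at the first stored message, 'latest' after the last one.
    return 0 if position == "earliest" else backlog_length


# Partitions that existed when the consumer first subscribed (assumed names).
known = {"events-partition-0", "events-partition-1"}

# Messages already stored per partition when the consumer attaches to each one.
# partition-2 was added later, and producers wrote 4 messages to it first.
backlogs = {
    "events-partition-0": 5,
    "events-partition-1": 3,
    "events-partition-2": 4,
}

offsets = {
    name: start_offset(initial_position("latest", name, known), size)
    for name, size in backlogs.items()
}
print(offsets)
```

With the configured position set to latest, the two original partitions still skip their old backlog, while the newly discovered partition starts at offset 0, so the messages producers wrote before the consumer noticed it are not dropped.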
@sijie - What is the official position on this? Is it suggested to use earliest? I see this issue has gone stale and has not been updated for two years. We are running into it as well, and the behavior is counter-intuitive to how a queue should work.