Admin getMessageById vs. API seek + readNext performance difference
The regular client API (org.apache.pulsar.client.api) is about 5x slower than the admin API (org.apache.pulsar.client.admin) when fetching single messages that were already consumed.
I am implementing a reader that needs to fetch one document at a time from BookKeeper in a non-sequential way.
With a regular Reader, fetching a single message takes around 119 ms, while with the admin API it takes around 20 ms. Is there any reason for this disparity?
I fetched 1000 messages using both methods to get those average times:
Reader API:
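import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.MessageId;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Reader;

PulsarClient client = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")
        .build();
Reader<byte[]> reader = client.newReader()
        .topic("persistent://apache/pulsar/baby-yoda")
        .startMessageId(MessageId.earliest)
        .startMessageIdInclusive()
        .receiverQueueSize(1) // only fetches a single document
        .create();
MessageId messageId = // some message ID
int counter = 0;
while (counter < 1000) {
    reader.seek(messageId); // reposition the reader on the target message
    Message<byte[]> document = reader.readNext();
    counter++;
}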
Admin API:
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.admin.PulsarAdminException;

PulsarAdmin admin = PulsarAdmin.builder().serviceHttpUrl("http://localhost:8080").build();
byte[] msg;
int counter = 0;
while (counter < 1000) {
    try {
        // arguments: topic, ledger ID, entry ID
        msg = admin.topics().getMessageById("testtopic3", 4, 982).getData();
    } catch (PulsarAdminException e) {
        e.printStackTrace();
    }
    counter++;
}
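The issue does not show how the average times were computed. A minimal sketch of one way to do it follows, assuming a plain wall-clock average over repeated calls. The class name `FetchTimer` and the method `averageMillis` are hypothetical and not part of Pulsar. The fetch operation is passed in as a `Callable`, so the same harness can wrap either the reader loop body or the admin call.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Pulsar): averages the wall-clock latency
// of a repeated fetch operation.
final class FetchTimer {
    private FetchTimer() {}

    /** Runs {@code fetch} {@code iterations} times and returns the mean latency in milliseconds. */
    static double averageMillis(Callable<?> fetch, int iterations) throws Exception {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long totalNanos = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            fetch.call(); // the operation under test, e.g. seek + readNext
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations;
    }
}

// Possible usage with the snippets above (assumes reader, admin and messageId exist):
//   double readerMs = FetchTimer.averageMillis(() -> { reader.seek(messageId); return reader.readNext(); }, 1000);
//   double adminMs  = FetchTimer.averageMillis(() -> admin.topics().getMessageById("testtopic3", 4, 982), 1000);
```

Timing each call individually keeps loop overhead out of the measurement. A few warm-up iterations before measuring would likely give steadier numbers, since the first calls include connection setup.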
Issue Analytics
- State:
- Created 3 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
The reader API needs to create an internal consumer that subscribes to the topic and sends flow permits to the broker; the broker then reads the data from BookKeeper or the tailing cache (as a batch read). The admin API, by contrast, reads the data from BookKeeper or the tailing cache directly, and reads only one message.
@DonHaul
Yes it is. Most notably, it was initially designed to read the objects as soon as they have been written by the writer (it uses some features in BK that are not used in Pulsar, like readUnconfirmed) in order to reduce latency. Feel free to open a ticket with your use case on the issue tracker on GitHub.
cc @nicoloboschi @diegosalvi @dianacle