Negative acknowledgement doesn't remove the message id from UnAckedMessageTracker when message id is instance of BatchMessageIdImpl

Describe the bug

The observed symptom: with the DLQ feature enabled, redelivery counts were inconsistent when negative acknowledgements were used. Messages were redelivered more times than the maxRedeliverCount configured on the DeadLetterPolicy.

I observed log messages like this in the output:

14:20:07.080 [pulsar-timer-4-1] WARN  o.a.p.c.impl.UnAckedMessageTracker - [ConsumerBase{subscription='Test-Subscriber', consumerName='f194e', topic='test-topic'}] 5 messages have timed-out

By debugging, I noticed that calling org.apache.pulsar.client.api.Consumer#negativeAcknowledge(org.apache.pulsar.client.api.MessageId) doesn't remove the message id from UnAckedMessageTracker when the message id is an instance of BatchMessageIdImpl.
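The mismatch can be illustrated with a minimal, self-contained sketch. EntryId and BatchEntryId below are hypothetical stand-ins for MessageIdImpl and BatchMessageIdImpl, not the Pulsar classes, and the sketch assumes that the batch id's equality and hash code take the batch index into account. Under that assumption, a collection keyed by entry-level ids never matches a batch id on removal:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class NackMismatchSketch {

    // Hypothetical stand-in for MessageIdImpl: identifies one ledger entry.
    static class EntryId {
        final long ledgerId;
        final long entryId;
        final int partitionIndex;

        EntryId(long ledgerId, long entryId, int partitionIndex) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.partitionIndex = partitionIndex;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            EntryId other = (EntryId) o;
            return ledgerId == other.ledgerId
                    && entryId == other.entryId
                    && partitionIndex == other.partitionIndex;
        }

        @Override
        public int hashCode() {
            return Objects.hash(ledgerId, entryId, partitionIndex);
        }
    }

    // Hypothetical stand-in for BatchMessageIdImpl: one message inside a batch entry.
    static final class BatchEntryId extends EntryId {
        final int batchIndex;

        BatchEntryId(long ledgerId, long entryId, int partitionIndex, int batchIndex) {
            super(ledgerId, entryId, partitionIndex);
            this.batchIndex = batchIndex;
        }

        // Drops the batch index, leaving the entry-level id.
        EntryId toEntryId() {
            return new EntryId(ledgerId, entryId, partitionIndex);
        }

        @Override
        public boolean equals(Object o) {
            return super.equals(o) && batchIndex == ((BatchEntryId) o).batchIndex;
        }

        @Override
        public int hashCode() {
            return 31 * super.hashCode() + batchIndex;
        }
    }

    public static void main(String[] args) {
        Set<EntryId> tracked = new HashSet<>();
        BatchEntryId received = new BatchEntryId(7L, 42L, 0, 3);

        // The message is tracked under its entry-level id.
        tracked.add(received.toEntryId());

        // Removing with the batch id finds nothing, so the entry stays tracked.
        System.out.println(tracked.remove(received));             // false
        // Removing with the converted id succeeds.
        System.out.println(tracked.remove(received.toEntryId())); // true
    }
}
```

In this model the id that stays behind after the failed removal is what would later be reported as timed out.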

Expected behavior

org.apache.pulsar.client.impl.UnAckedMessageTracker implementation should encapsulate the fact that the message id must be MessageIdImpl and not BatchMessageIdImpl.

Currently, the conversion from BatchMessageIdImpl is done on the calling side (examples: https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L521-L531 , https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1155-L1164 ).

Because the caller has to convert the message id before calling the UnAckedMessageTracker add or remove methods, the class is error-prone to use. The conversion to MessageIdImpl is currently missing in the negative acknowledgement method: https://github.com/apache/pulsar/blob/de57ddd572dbc74529a56fee68c6be37bd35cf7c/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L556-L557
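One way to picture the expected behavior is a tracker that normalizes ids inside its own add and remove methods, so that no caller can forget the conversion. The following is a rough sketch only: Tracker, EntryId and BatchEntryId are hypothetical stand-ins and do not reflect the actual UnAckedMessageTracker code or any eventual fix.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class NormalizingTrackerSketch {

    // Hypothetical stand-in for MessageIdImpl.
    static class EntryId {
        final long ledgerId;
        final long entryId;
        final int partitionIndex;

        EntryId(long ledgerId, long entryId, int partitionIndex) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.partitionIndex = partitionIndex;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            EntryId other = (EntryId) o;
            return ledgerId == other.ledgerId
                    && entryId == other.entryId
                    && partitionIndex == other.partitionIndex;
        }

        @Override
        public int hashCode() {
            return Objects.hash(ledgerId, entryId, partitionIndex);
        }
    }

    // Hypothetical stand-in for BatchMessageIdImpl.
    static final class BatchEntryId extends EntryId {
        final int batchIndex;

        BatchEntryId(long ledgerId, long entryId, int partitionIndex, int batchIndex) {
            super(ledgerId, entryId, partitionIndex);
            this.batchIndex = batchIndex;
        }

        @Override
        public boolean equals(Object o) {
            return super.equals(o) && batchIndex == ((BatchEntryId) o).batchIndex;
        }

        @Override
        public int hashCode() {
            return 31 * super.hashCode() + batchIndex;
        }
    }

    // Hypothetical tracker: callers may pass either id type.
    static final class Tracker {
        private final Set<EntryId> unacked = ConcurrentHashMap.newKeySet();

        // The conversion lives in exactly one place instead of at every call site.
        private static EntryId normalize(EntryId id) {
            if (id instanceof BatchEntryId) {
                return new EntryId(id.ledgerId, id.entryId, id.partitionIndex);
            }
            return id;
        }

        boolean add(EntryId id) {
            return unacked.add(normalize(id));
        }

        boolean remove(EntryId id) {
            return unacked.remove(normalize(id));
        }

        int size() {
            return unacked.size();
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        BatchEntryId received = new BatchEntryId(7L, 42L, 0, 3);

        tracker.add(received);
        System.out.println(tracker.remove(received)); // true
        System.out.println(tracker.size());           // 0
    }
}
```

Note that this sketch tracks at entry granularity, so removing one message of a batch stops tracking the whole entry. Handling individual batch indexes is a separate concern that the sketch does not attempt.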

Workaround

One workaround is to convert a possible BatchMessageIdImpl to MessageIdImpl before calling the negativeAcknowledge method. Something like this:

    // imports needed by the helper below
    import org.apache.pulsar.client.api.MessageId;
    import org.apache.pulsar.client.impl.BatchMessageIdImpl;
    import org.apache.pulsar.client.impl.MessageIdImpl;

    ...
    consumer.negativeAcknowledge(convertMessageIdForNack(message.getMessageId()));
    ...

    // workaround Pulsar bug regarding negative acknowledgements
    private static MessageId convertMessageIdForNack(MessageId messageId) {
        if (messageId instanceof BatchMessageIdImpl) {
            // use similar logic as there is in org.apache.pulsar.client.impl.NegativeAcksTracker#add
            BatchMessageIdImpl batchMessageId = (BatchMessageIdImpl) messageId;
            // drop the batch index so the id matches what UnAckedMessageTracker stores
            return new MessageIdImpl(batchMessageId.getLedgerId(), batchMessageId.getEntryId(),
                    batchMessageId.getPartitionIndex());
        } else {
            return messageId;
        }
    }

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

codelipenghui commented, Sep 23, 2020

@lhotari Sorry, I missed your last comment. We will try to fix it in 2.6.2.

codelipenghui commented, Jun 2, 2020

@lhotari Thanks for your feedback. It looks like #6052 only prevents the already acked messages of a batch message from being delivered to the consumer. We still need to find a way to handle negative acknowledgement: we need to remove the message ID from the unacked message tracker and hold off sending the redelivery request to the broker until all batch indexes of the batch message are processed (acked or negatively acked).
