

maxBatchSize preallocated memory may be thousands of times larger than actual message length

See original GitHub issue

Is your enhancement request related to a problem? Please describe.

During a stress test we found that the client's memory usage was very high, and full GCs even occurred. After analyzing the heap dump, we found that the memory occupied is much larger than the actual message size: serializing 1 KB of messages took up 1 MB.

batchedMessageMetadataAndPayload = PulsarByteBufAllocator.DEFAULT
        .buffer(Math.min(maxBatchSize, ClientCnx.getMaxMessageSize()));

maxBatchSize = Math.max(maxBatchSize, uncompressedSize);

Debugging found that maxBatchSize, which controls the size of the memory preallocated for ByteBufPair.b2, is stateful. As the largest batch or largest single message seen so far grows, the preallocated ByteBufPair.b2 buffer grows with it and never shrinks, so it may end up thousands of times larger than the payload of the current MessageImpl.

Lowering the value of batchingMaxMessages may reduce the risk, but a single oversized message can still trigger the problem.
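To make the reported behavior concrete, here is a minimal sketch (not Pulsar's actual code; the class and constant names are hypothetical) of how a stateful running-max preallocation over-allocates: once one large batch raises the watermark, every subsequent batch preallocates that much, even for tiny payloads.

```java
// Sketch of a stateful running-max preallocation, modeled on the two lines
// quoted from the Pulsar client above. Not the real implementation.
public class RunningMaxPrealloc {
    static final int MAX_MESSAGE_SIZE = 5 * 1024 * 1024; // stand-in for ClientCnx.getMaxMessageSize()
    static int maxBatchSize = 0;                          // stateful across batches, as in the report

    // Returns the number of bytes preallocated for a batch of the given uncompressed size.
    static int preallocate(int uncompressedSize) {
        maxBatchSize = Math.max(maxBatchSize, uncompressedSize); // watermark only ever grows
        return Math.min(maxBatchSize, MAX_MESSAGE_SIZE);
    }

    public static void main(String[] args) {
        int first = preallocate(1024 * 1024); // one 1 MB batch raises the watermark
        int later = preallocate(1024);        // a 1 KB batch still preallocates 1 MB
        System.out.println(first + " " + later); // prints "1048576 1048576"
    }
}
```

This matches the issue's observation that serializing 1 KB of messages can occupy 1 MB of buffer once a large batch has been seen.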

Describe the solution you'd like

Ideally, loop through the messages to be packed and calculate the exact memory size to allocate.

Describe alternatives you've considered

Let the user choose between exact allocation and preallocation.
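The proposed fix can be sketched as follows. This is a hypothetical helper, not Pulsar's API: it sums the serialized sizes of the messages queued for the current batch (plus an assumed per-message metadata overhead) and allocates exactly that.

```java
import java.util.List;

// Sketch of the "loop and calculate exactly" proposal. The per-message
// overhead constant is an assumption for illustration.
public class ExactBatchAlloc {
    static int exactBatchSize(List<byte[]> payloads, int perMessageOverhead) {
        int total = 0;
        for (byte[] p : payloads) {
            total += p.length + perMessageOverhead; // payload plus assumed metadata overhead
        }
        return total;
    }

    public static void main(String[] args) {
        List<byte[]> batch = List.of(new byte[1024], new byte[512]);
        System.out.println(exactBatchSize(batch, 32)); // 1024 + 512 + 2*32 = 1600
    }
}
```

The trade-off is an extra pass over the batch before allocation, in exchange for never over-allocating based on historical maxima.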

Additional context

maxPendingMessages=2000
maxPendingMessagesAcrossPartitions=40000
blockIfQueueFull=false
sendTimeoutMs=5000
batchingMaxPublishDelayMicros=50
batchingMaxMessages=2000
batchingMaxBytes=5242880
batchingEnabled=true


Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
tjiuming commented, Apr 6, 2022

How about using CompositeByteBuf: allocate a small amount of memory and let it grow? @codelipenghui Could you please assign it to me?
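The idea behind this suggestion is grow-on-demand buffering: start with a small allocation and expand only when the batch actually needs more. The sketch below uses the JDK's ByteArrayOutputStream as a stand-in for Netty's CompositeByteBuf (which grows by chaining buffer components rather than copying), since the principle is the same.

```java
import java.io.ByteArrayOutputStream;

// Grow-on-demand buffering: a small initial capacity, expanded only by
// actual writes. ByteArrayOutputStream is a JDK stand-in here, not Netty.
public class GrowOnDemand {
    public static void main(String[] args) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream(256); // small initial capacity
        batch.writeBytes(new byte[1024]); // buffer grows only because this write needs it
        System.out.println(batch.size()); // prints "1024"
    }
}
```

With this approach, a 1 KB batch never pays for a historical 1 MB high-water mark; the cost is occasional growth work when a large batch does arrive.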

0 reactions
github-actions[bot] commented, Jun 19, 2022

The issue has had no activity for 30 days; marking with the Stale label.

Read more comments on GitHub >

Top Results From Across the Web

Shared memory pre-allocation · Issue #1119 - GitHub
I am having a doubt regarding CUDA shared memory. Is it possible to allocate memory for max batch size and run inference for...
Read more >
How do I pre-allocate memory when using MATLAB?
Either the preallocated memory will either be too large, resulting in wasted memory; or the allotted memory will be too small for the...
Read more >
Red Hat Data Grid User Guide
Red Hat Data Grid is a distributed in-memory key/value data store with optional schema. It can be used both as an embedded Java...
Read more >
Create a list with initial capacity in Python - Stack Overflow
to preallocate a list (that is, to be able to address 'size' elements of the list instead of gradually forming the list by...
Read more >
Growing Objects and Loop Memory Pre-Allocation - R-bloggers
By Thiago Milagres Preallocating Memory This will be a short post about ... other value) with the total length and then run the...
Read more >
