
Allocating a JsonParser for a byte[] looks inefficient

See original GitHub issue

When profiling one of our Kafka consumers, I noticed that a new HeapByteBuffer is allocated for every message. This surprised me, because I assumed the buffers would be re-used, but it turns out that JsonFactory creates a new InputStreamReader for every byte array, and that InputStreamReader allocates HeapByteBuffers for its internal buffering.

Is this something Jackson could handle more efficiently by reusing the buffers? There seems to be no obvious way to reuse the StreamDecoder or the InputStreamReader, but since the source is known to be a byte array, a better approach should be possible.

Jackson version is 2.10.1. [Screenshot: allocation profile, 2020-01-14] (The API called in this screenshot is ObjectMapper.readValue(byte[]), but there seems to be no way to call it more efficiently, since all calls end up in com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructReader().)
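Jackson's actual bootstrap logic is more involved than this, but the JDK pattern the profiler is flagging can be reproduced with the standard library alone. The sketch below (class and method names are made up for illustration) contrasts wrapping a small byte[] in a fresh InputStreamReader per message, whose internal StreamDecoder allocates a new buffer each time, with decoding the byte[] directly when the encoding is already known:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ReaderDecodeDemo {

    // Per-message Reader path: each call wraps the byte[] in a fresh
    // InputStreamReader; its internal StreamDecoder allocates a new
    // HeapByteBuffer every time, even for tiny payloads.
    static String decodeViaReader(byte[] payload) throws IOException {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(payload), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder(payload.length);
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    // When the source is already a byte[] of known encoding, decoding
    // directly skips the Reader and its internal buffering entirely.
    static String decodeDirect(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] key = "{\"id\":42,\"type\":\"order\"}"
                .getBytes(StandardCharsets.UTF_8);
        // Both paths produce the same string; only the allocation differs.
        System.out.println(decodeViaReader(key).equals(decodeDirect(key))); // prints "true"
    }
}
```

This is why a byte-array source being forced through the Reader path is costly: the decode result is identical, but the Reader path pays for a throwaway intermediate buffer on every call.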

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

1 reaction
cowtowncoder commented, Jan 14, 2020

Yeah, that big buffer allocation for readers/streams by the JDK has been a huge overhead for years, and is the main reason for custom implementations of certain objects that I'd otherwise leave as-is (like the UTF-8 Reader). And I am guessing that your test case uses even smaller messages than 1 kB? (jvm-benchmark, for example, uses rather small JSON, about 400 bytes.)

0 reactions
CodingFabian commented, Jan 14, 2020

Yes, they are small. With Kafka you use a small key and a large payload; you look at the key to decide whether you want the message. The code that showed up as a hotspot was the code inspecting those roughly 1 kB JSON-serialized keys.

For me, I try to write minimalist JMH benchmarks. In this case: a very simple, small JSON document as bytes, then one ObjectMapper using a default factory and one with a feature flag disabled, to trigger the two different code paths. It really seems to be the HeapByteBuffer allocation that tips the scale; the Reader path also allocates 8x as much memory.
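The allocation difference between the two paths can also be observed without JMH. The rough, HotSpot-only sketch below (it relies on com.sun.management.ThreadMXBean, and the class, method names, and sample key are made up; the feature flag from the thread is not reproduced) compares per-thread allocated bytes for the Reader path versus direct decoding of a small key:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;

public class AllocPaths {
    static volatile int sink; // keeps the JIT from eliding the work

    // Reader path: a fresh InputStreamReader per message, whose
    // StreamDecoder allocates its own internal buffer each time.
    static void viaReader(byte[] json) {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(json), StandardCharsets.UTF_8)) {
            int c, sum = 0;
            while ((c = r.read()) != -1) sum += c;
            sink = sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Direct path: decode the byte[] with no intermediate Reader.
    static void direct(byte[] json) {
        sink = new String(json, StandardCharsets.UTF_8).length();
    }

    // Measures bytes allocated by the current thread while running
    // the given work `iterations` times (HotSpot-specific API).
    static long allocatedBytes(Runnable work, int iterations) {
        com.sun.management.ThreadMXBean tmx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long tid = Thread.currentThread().getId();
        long before = tmx.getThreadAllocatedBytes(tid);
        for (int i = 0; i < iterations; i++) work.run();
        return tmx.getThreadAllocatedBytes(tid) - before;
    }

    public static void main(String[] args) {
        byte[] key = "{\"topic\":\"orders\",\"id\":12345}"
                .getBytes(StandardCharsets.UTF_8);
        // warm-up, so JIT compilation does not pollute the counters
        for (int i = 0; i < 10_000; i++) { viaReader(key); direct(key); }
        long readerBytes = allocatedBytes(() -> viaReader(key), 10_000);
        long directBytes = allocatedBytes(() -> direct(key), 10_000);
        System.out.println("Reader path allocated: " + readerBytes + " bytes");
        System.out.println("Direct path allocated: " + directBytes + " bytes");
    }
}
```

On a typical HotSpot JVM the Reader path reports substantially more allocation per iteration, which matches the heap profile described above; exact numbers vary by JDK version.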


