
Large number of PoolThreadCache garbage objects

See original GitHub issue

We are using grpc-java version 1.16.0 in Apache Ratis for RPC communication. Apache Ratis is a Java implementation of the Raft consensus protocol and uses grpc for server-server and client-server communication. In this scenario, ~10 KB requests come from the client to the server, and the server replicates those requests to the other servers. Thousands of these requests arrive gradually at the server from the client. There is also other small metadata-related communication.

We are currently seeing a lot of GCs being triggered and a lot of garbage PoolThreadCache objects. Here is a snippet of the major unreachable heap objects:

org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache self 126Kb (< 0.1%), 2,021 object(s)
	.tinySubPageDirectCaches ↘ 173,584Kb (24.5%), 2,021 reference(s)
		org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$MemoryRegionCache[] self 284Kb (< 0.1%), 2,021 object(s)
			{array elements} ↘ 173,300Kb (24.4%), 64,672 reference(s)
				org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache self 2,021Kb (0.3%), 64,672 object(s)
					.queue ↘ 171,279Kb (24.1%), 64,672 reference(s)
						org.apache.ratis.thirdparty.io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue self 40,925Kb (5.8%), 64,672 object(s)
							.buffer ↘ 130,354Kb (18.4%), 64,672 reference(s)
								Object[] self 130,354Kb (18.4%), 64,672 object(s)

	.tinySubPageHeapCaches ↘ 173,584Kb (24.5%), 2,021 reference(s)
		org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$MemoryRegionCache[] self 284Kb (< 0.1%), 2,021 object(s)
			{array elements} ↘ 173,300Kb (24.4%), 64,672 reference(s)
				org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache self 2,021Kb (0.3%), 64,672 object(s)
					.queue ↘ 171,279Kb (24.1%), 64,672 reference(s)
						org.apache.ratis.thirdparty.io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue self 40,925Kb (5.8%), 64,672 object(s)
							.buffer ↘ 130,354Kb (18.4%), 64,672 reference(s)
								Object[] self 130,354Kb (18.4%), 64,672 object(s)

	.smallSubPageDirectCaches ↘ 13,641Kb (1.9%), 2,021 reference(s)
		org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$MemoryRegionCache[] self 63Kb (< 0.1%), 2,021 object(s)
			{array elements} ↘ 13,578Kb (1.9%), 8,084 reference(s)
				org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache self 252Kb (< 0.1%), 8,084 object(s)
					.queue ↘ 13,325Kb (1.9%), 8,084 reference(s)
						org.apache.ratis.thirdparty.io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue self 5,115Kb (0.7%), 8,084 object(s)
							.buffer ↘ 8,210Kb (1.2%), 8,084 reference(s)
								Object[] self 8,210Kb (1.2%), 8,084 object(s)

	.smallSubPageHeapCaches ↘ 13,641Kb (1.9%), 2,021 reference(s)
		org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$MemoryRegionCache[] self 63Kb (< 0.1%), 2,021 object(s)
			{array elements} ↘ 13,578Kb (1.9%), 8,084 reference(s)
				org.apache.ratis.thirdparty.io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache self 252Kb (< 0.1%), 8,084 object(s)
					.queue ↘ 13,325Kb (1.9%), 8,084 reference(s)
						org.apache.ratis.thirdparty.io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue self 5,115Kb (0.7%), 8,084 object(s)
							.buffer ↘ 8,210Kb (1.2%), 8,084 reference(s)
								Object[] self 8,210Kb (1.2%), 8,084 object(s)

	.normalDirectCaches ↘ 5,699Kb (0.8%), 2,021 reference(s)
	
	.normalHeapCaches ↘ 5,699Kb (0.8%), 2,021 reference(s)

	.freed ↘ 31Kb (< 0.1%), 2,021 reference(s)

It seems like a lot of garbage is being generated via the thread caches.
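For reference, below is a minimal sketch of how these per-thread caches could be shrunk or disabled through Netty's allocator properties. This assumes the stock io.netty.* property names; with the Netty relocated under org.apache.ratis.thirdparty the property names may also be relocated, and the settings only take effect if they run before the pooled allocator is first initialised (the equivalent -D JVM flags avoid that ordering concern).

    // Hedged workaround sketch: reduce or disable Netty's per-thread allocator caches.
    // Must run before anything touches PooledByteBufAllocator; otherwise pass the same
    // values as -D flags on the JVM command line.
    public final class NettyAllocatorCacheTuning {
        public static void apply() {
            // Only threads running on Netty's FastThreadLocalThread (e.g. event-loop
            // threads) keep a PoolThreadCache; plain application threads that allocate
            // buffers no longer get one.
            System.setProperty("io.netty.allocator.useCacheForAllThreads", "false");
            // Optionally shrink the per-thread cache sizes themselves.
            System.setProperty("io.netty.allocator.tinyCacheSize", "0");
            System.setProperty("io.netty.allocator.smallCacheSize", "0");
            System.setProperty("io.netty.allocator.normalCacheSize", "0");
        }
    }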

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 16 (6 by maintainers)

Top GitHub Comments

6 reactions
zandegran commented, Feb 21, 2020

We are also affected by this memory leak: large amounts of io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue.buffer objects, up to 30% of the heap.
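As a rough check of how many threads are actually holding a PoolThreadCache, the pooled allocator exposes metrics (Netty 4.1+). The sketch below assumes the application can reach the relevant allocator instance; with the shaded Netty used by Ratis/grpc the same API lives under the relocated package (e.g. org.apache.ratis.thirdparty.io.netty.buffer), and the allocator in use may not be PooledByteBufAllocator.DEFAULT.

    import io.netty.buffer.PooledByteBufAllocator;
    import io.netty.buffer.PooledByteBufAllocatorMetric;

    // Diagnostic sketch: report how many per-thread caches the default pooled
    // allocator is tracking and how its caches and arenas are sized.
    public final class AllocatorCacheReport {
        public static void main(String[] args) {
            PooledByteBufAllocatorMetric m = PooledByteBufAllocator.DEFAULT.metric();
            System.out.println("thread-local caches : " + m.numThreadLocalCaches());
            System.out.println("tiny/small/normal   : " + m.tinyCacheSize() + "/"
                    + m.smallCacheSize() + "/" + m.normalCacheSize());
            System.out.println("heap/direct arenas  : " + m.numHeapArenas() + "/"
                    + m.numDirectArenas());
            System.out.println("used direct memory  : " + m.usedDirectMemory() + " bytes");
        }
    }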

1 reaction
MajorHe1 commented, Jan 25, 2022

@ejona86 Thanks for your reply!! The screenshot indeed doesn’t look problematic, as you said. The problem is the number of objects of this kind that we see in that screenshot. The following screenshot (from a different dump) shows it better. I think that many of these Object[] instances are just arrays of null values, and they are also related to these caches. I’ve just now realised that the service didn’t run long enough to really demonstrate the memory leak, so I’ll monitor it for several more days.

Hello, is there any progress on solving this problem?

Read more comments on GitHub >

