Memory leak of ScanResults
Hi, firstly thank you very much for this library - it is very useful!
I just updated our usage from 2.1.x to 4.0.4 and noticed a lot of out of memory issues with some of our test code. Digging in a bit, I noticed a lot of `ScanResult` objects retained on the heap. Indeed, running repeated `ClassGraph().scan()`s ends up chewing up a bunch of memory.
The culprit, I believe, is the shutdown hook being added in the `ScanResult` constructor. This ends up holding a reference to the `ScanResult` (and all its objects) for the lifetime of the VM. I did a quick test of removing this bit of code and memory usage stayed constant regardless of how many iterations I ran.
It seems that this is the wrong place for this code. Seeing that you do provide a `close()` (cleanup) method, the user already has the ability to enable this functionality themselves - i.e. they could add the shutdown hook themselves if they wanted to, or somehow schedule cleanup as relevant to the lifecycle of their app.
Thoughts?
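To make the suggestion concrete, here is a minimal sketch of caller-managed cleanup. `ScanHandle` is a hypothetical stand-in for a scan result that owns temporary files, not ClassGraph's actual class, and `closeOnShutdown()` is an invented opt-in method. The idea it illustrates is that a shutdown hook is registered only on request and is removed again in `close()`, so a closed object is no longer reachable from the JVM's hook list.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a scan result that owns temporary files. */
class ScanHandle implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private Thread shutdownHook; // stays null unless the caller opts in

    /** Opt in to cleanup at JVM exit. The hook references this object until close(). */
    synchronized ScanHandle closeOnShutdown() {
        if (shutdownHook == null && !closed.get()) {
            shutdownHook = new Thread(this::close);
            Runtime.getRuntime().addShutdownHook(shutdownHook);
        }
        return this;
    }

    boolean isClosed() {
        return closed.get();
    }

    @Override
    public synchronized void close() {
        if (!closed.compareAndSet(false, true)) {
            return; // already closed, nothing to do
        }
        if (shutdownHook != null) {
            try {
                // Deregister so the JVM no longer holds a reference to this object.
                Runtime.getRuntime().removeShutdownHook(shutdownHook);
            } catch (IllegalStateException e) {
                // The JVM is already shutting down; the hook cannot be removed now.
            }
            shutdownHook = null;
        }
        // Temporary files would be deleted here.
    }
}

class ScanHandleDemo {
    public static void main(String[] args) {
        // Repeated short-lived scans: try-with-resources closes each one promptly.
        for (int i = 0; i < 1000; i++) {
            try (ScanHandle handle = new ScanHandle()) {
                // ... use the handle ...
            }
        }
        // A long-lived scan can opt in to cleanup at JVM exit instead.
        ScanHandle longLived = new ScanHandle().closeOnShutdown();
        System.out.println("closed early? " + longLived.isClosed());
    }
}
```

With this shape, tests that scan repeatedly never register a hook at all, while an application that scans once at startup can still get cleanup at exit.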
Top GitHub Comments
@jdeppe-pivotal Great sleuthing, thanks for digging down to find the root of the problem.
This is tricky to solve. When you call `close()`, not only are some memory resources freed up, but temporary files are also removed (such as inner jars-within-jars that had to be extracted in order to scan them). Those resources cannot be removed while the `ScanResult` is in scope, because calls like `ClassInfo#loadClass()` require those temporary files to be in place. And the `ClassInfo` objects obviously cannot be freed right after the scan, because they need to be returned by various `ScanResult` API calls.

There is already a method `ScanResult#finalize()` that calls `#close()`, and this is supposed to clean up resources on garbage collection (with the caveat that finalizers are supposedly not 100% reliable). I didn't think about the fact that the shutdown hook would hold a reference, so this was a great catch. Thanks for figuring that out, I never would have thought to look there for a held reference! I changed the code to use `WeakReference<ScanResult>` in the shutdown hook, which should solve that problem, and I made a number of other changes to try to free up resources more aggressively. Can you please test the master version and let me know if it solves the problem?

I'm also curious, though: how many scans are you performing? I'm surprised that the result of a scan caused a memory leak that was bad enough to even notice. Typical usage of ClassGraph is to run a single scan (or a small fixed number of scans) on startup. The classpath / module path is probably never going to change during the lifetime of the JVM invocation, so if you are scanning repeatedly, you are probably doing wasted / repeated work.
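For readers following along, the weak-reference approach described above can be sketched roughly as follows. This is not the actual ClassGraph change: `FakeScanResult` and `WeakShutdownHook` are hypothetical names used only for illustration. The point is that the hook thread captures a `WeakReference` rather than the object itself, so the hook alone does not keep the scan result reachable.

```java
import java.lang.ref.WeakReference;

/** Hypothetical stand-in for ScanResult; not the library's actual class. */
class FakeScanResult implements AutoCloseable {
    private volatile boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // a real implementation would delete temporary files here
    }
}

class WeakShutdownHook {
    /** Builds a hook thread that refers to the target only through a WeakReference. */
    static Thread forTarget(FakeScanResult target) {
        final WeakReference<FakeScanResult> ref = new WeakReference<>(target);
        // The lambda captures `ref` only, never `target`, so it holds no strong reference.
        return new Thread(() -> {
            FakeScanResult result = ref.get();
            if (result != null) {
                result.close(); // still reachable at shutdown: clean it up
            }
            // If result is null, the object was already garbage collected.
        });
    }
}

class WeakShutdownHookDemo {
    public static void main(String[] args) {
        FakeScanResult result = new FakeScanResult();
        Runtime.getRuntime().addShutdownHook(WeakShutdownHook.forTarget(result));
        // Once `result` becomes unreachable elsewhere, the hook does not pin it in memory.
    }
}
```

One caveat with this sketch: each registered hook `Thread` stays in the JVM's hook list until shutdown unless it is removed, so a small per-scan cost remains even though the scan result itself can be collected.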
@jdeppe-pivotal it looks like synthetic accessors are going away in JDK 11:
https://www.javacodegeeks.com/2018/08/nested-classes-private-methods.html