hanging threads on exit

Our test infrastructure has (very useful!) checks for hanging threads that we use to ensure our code properly shuts down all of its thread pools and doesn’t have resource leakage. After the most recent upgrade of Corretto, these tests have started failing with stack traces like:

[junit]      Testsuite: com.amazon.lucene.gcr.replica.WarmerTest
 [junit]      Feb 14, 2020 2:35:41 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
 [junit]      WARNING: Will linger awaiting termination of 3 leaked thread(s).
 [junit]      Feb 14, 2020 2:36:01 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
 [junit]      SEVERE: 3 threads leaked from SUITE scope at com.amazon.lucene.gcr.replica.WarmerTest: 
 [junit]         1) Thread[id=35, name=Native reference cleanup thread, state=TIMED_WAITING, group=TGRP-WarmerTest]
 [junit]              at java.base@11.0.6/java.lang.Object.wait(Native Method)
 [junit]              at java.base@11.0.6/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
 [junit]              at app/com.amazon.corretto.crypto.provider/com.amazon.corretto.crypto.provider.Janitor$Stripe.tryClean(Janitor.java:238)
 [junit]              at app/com.amazon.corretto.crypto.provider/com.amazon.corretto.crypto.provider.Janitor$Stripe.access$600(Janitor.java:158)
 [junit]              at app/com.amazon.corretto.crypto.provider/com.amazon.corretto.crypto.provider.Janitor$JanitorState.cleanerThread(Janitor.java:351)
 [junit]              at app/com.amazon.corretto.crypto.provider/com.amazon.corretto.crypto.provider.Janitor$JanitorState$$Lambda$175/0x0000000800399040.run(Unknown Source)
 [junit]              at java.base@11.0.6/java.lang.Thread.run(Thread.java:834)
 [junit]         2) Thread[id=34, name=ForkJoinPool.commonPool-worker-51, state=WAITING, group=TGRP-WarmerTest]
 [junit]              at java.base@11.0.6/jdk.internal.misc.Unsafe.park(Native Method)
 [junit]              at java.base@11.0.6/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
 [junit]              at java.base@11.0.6/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1628)
 [junit]              at java.base@11.0.6/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
 [junit]         3) Thread[id=36, name=ForkJoinPool.commonPool-worker-37, state=TIMED_WAITING, group=TGRP-WarmerTest]
 [junit]              at java.base@11.0.6/jdk.internal.misc.Unsafe.park(Native Method)
 [junit]              at java.base@11.0.6/java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:275)
 [junit]              at java.base@11.0.6/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1619)
 [junit]              at java.base@11.0.6/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
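
For readers unfamiliar with this kind of check, the general idea can be sketched with the standard library alone: record the set of live threads before a test, then report any thread that is still alive afterwards and was not in that set. This is only an illustration of the technique, not the randomizedtesting implementation; the class and method names (`ThreadLeakCheck`, `snapshot`, `leakedSince`) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating a before/after thread-leak check.
final class ThreadLeakCheck {

    /** Captures the threads that are alive right now. */
    public static Set<Thread> snapshot() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    /** Returns live threads that were not present in the earlier snapshot. */
    public static Set<Thread> leakedSince(Set<Thread> before) {
        Set<Thread> leaked = new HashSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !before.contains(t)) {
                leaked.add(t);
            }
        }
        return leaked;
    }

    public static void main(String[] args) {
        Set<Thread> before = snapshot();

        // Simulate code under test that starts a thread and never stops it.
        Thread lingering = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // Interrupted by the cleanup below; just exit.
            }
        }, "lingering-worker");
        lingering.setDaemon(true);
        lingering.start();

        // Report anything that appeared since the snapshot.
        for (Thread t : leakedSince(before)) {
            System.out.println("leaked: " + t.getName() + " state=" + t.getState());
        }

        lingering.interrupt();
    }
}
```

A check like this flags any thread the JVM or a library starts lazily during the test, which is why threads such as the `Native reference cleanup thread` in the report above can show up even though the test code never created them directly.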

In one case we’ve been able to work around this by changing our test implementation, e.g. by eliminating the usage of java.nio.file.Files.walkFileTree. It’s not clear how this is connected to the dangling threads, but removing our temp files in a different way did make the tests pass.

In other cases I have not yet been able to determine the triggering cause. We do have a mechanism for whitelisting instances of thread leakage such as this, and I can pursue that, but it would really be preferable if the JVM were not leaking threads – is it? Can we clean this up in Corretto?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
alvdavi commented, Feb 26, 2020

I’m sure the person that bundled ACCP is paying attention to this thread and will make sure the updated version is available in a timely manner.

0 reactions
SalusaSecondus commented, Feb 26, 2020

As a note, even after we release the new patch release with this fix, it will likely be some time before the new version is picked up and distributed by your internal systems. So, you will need to white-list these threads for at least a little while.
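
As a rough illustration of what such a white-list could look like, the sketch below matches leaked threads by the name seen in the stack trace above (`Native reference cleanup thread`). The class `KnownLeakAllowList` and its methods are hypothetical. In the randomizedtesting framework, a predicate like this would typically be wrapped in a `ThreadFilter` implementation (its `reject(Thread)` method) and registered with `@ThreadLeakFilters`; that wiring is left out here so the example depends only on the standard library.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical allow-list of thread names that are known, accepted leaks.
final class KnownLeakAllowList {

    // Name copied from the leak report in the issue; extend the list as needed.
    private static final List<Pattern> KNOWN_LEAKS = List.of(
            Pattern.compile("Native reference cleanup thread"));

    /** Returns true if a thread with this name should be ignored by the leak check. */
    public static boolean isKnownLeak(String threadName) {
        for (Pattern p : KNOWN_LEAKS) {
            if (p.matcher(threadName).matches()) {
                return true;
            }
        }
        return false;
    }

    /** Convenience overload that inspects a thread's name. */
    public static boolean isKnownLeak(Thread thread) {
        return isKnownLeak(thread.getName());
    }
}
```

Matching on the exact name keeps the filter narrow, so it is easy to delete once the fixed release has been rolled out and unrelated leaks are still reported in the meantime.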
