ThreadLocalPool (reactive) leaks connections upon Thread Scale up and down

Describe the bug

The ThreadLocalPool and its derived variant ThreadLocalPgPool leak connections when Quarkus scales down threads after idling. The root cause is that the thread-local pools are additionally kept in the threadLocalPools list in io.quarkus.reactive.datasource.runtime.ThreadLocalPool. When Quarkus scales up worker threads under load, additional thread-local pool instances are created, which is understandable. But once the thread pool shrinks, the instances that belonged to the terminated threads remain in the threadLocalPools list, and all of their connections stay open towards the DB. When the engine comes under load again and spawns new threads, new thread-local pool instances are created on top of the old ones. This breaks as soon as the underlying DB runs out of connections.

Expected behavior

The thread-local pools (not the pool itself, but its ThreadLocal “partitions”) should be closed properly, including their connections, when the threads are scaled down.

Actual behavior

As mentioned above, the pool creates a new thread-local pool “partition”, plus the corresponding DB connections, for each newly spawned worker thread, and the partitions of terminated threads are never released.
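
The mechanism described above can be reproduced in isolation. The following is a minimal sketch, not the Quarkus source: FakePool, LeakyThreadLocalPool and burst are hypothetical names, and the pool only counts connections instead of opening them. It assumes the values from the configuration in this report (max-size=10, max-threads=32) and models "scale down" as the worker threads simply terminating.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Simplified model of the reported leak. All class names here are hypothetical. */
public class LeakDemo {

    /** Stand-in for a per-thread connection pool holding open DB connections. */
    static final class FakePool {
        final int connections;
        FakePool(int connections) { this.connections = connections; }
    }

    /** Gives each thread its own pool and also records every pool in a shared list. */
    static final class LeakyThreadLocalPool {
        private final int maxSize;
        private final List<FakePool> allPools = new CopyOnWriteArrayList<>();
        private final ThreadLocal<FakePool> local;

        LeakyThreadLocalPool(int maxSize) {
            this.maxSize = maxSize;
            this.local = ThreadLocal.withInitial(this::create);
        }

        FakePool pool() { return local.get(); }

        private FakePool create() {
            FakePool p = new FakePool(maxSize);
            // This strong reference outlives the owning thread, so the pool is never released.
            allPools.add(p);
            return p;
        }

        int openConnections() {
            return allPools.stream().mapToInt(p -> p.connections).sum();
        }
    }

    /** Starts short-lived workers that each touch the pool, then waits for them to exit. */
    static void burst(LeakyThreadLocalPool pool, int threads) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(pool::pool);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LeakyThreadLocalPool pool = new LeakyThreadLocalPool(10);
        burst(pool, 32); // first load phase
        System.out.println(pool.openConnections()); // 320
        burst(pool, 32); // second load phase, after all first-phase threads have exited
        System.out.println(pool.openConnections()); // 640
    }
}
```

In this model the count doubles with every load cycle because nothing ever removes an entry from the shared list, which matches the doubling observed via pg_stat_activity.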

To Reproduce

Steps to reproduce the behavior:

  1. Deploy a bean with DB interaction (a simple query) using Hibernate Reactive.
  2. Put the system under load, enough that the DB pool is saturated.
  3. Check the open connections on the DB (e.g. on Postgres via SELECT * FROM pg_stat_activity;); it should show pool size * worker thread count open connections.
  4. Leave the system idle and wait until Quarkus reduces the worker threads.
  5. Repeat the load test.
  6. Check the open connections on the DB again with the same query; it should still show pool size * worker thread count open connections, but the number of connections has doubled.

Configuration

application.properties
quarkus.hibernate-orm.database.generation=create
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=***
quarkus.datasource.password=***
quarkus.datasource.reactive=true
quarkus.datasource.reactive.url=postgresql://localhost:5434/****
quarkus.hibernate-orm.log.sql=false
quarkus.datasource.reactive.max-size=10
quarkus.thread-pool.max-threads=32
quarkus.datasource.reactive.cache-prepared-statements=true
quarkus.datasource.reactive.postgresql.pipelining-limit=256

Environment (please complete the following information):

  • Output of uname -a or ver: Darwin localhost 20.2.0 Darwin Kernel Version 20.2.0: Wed Dec 2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64 x86_64
  • Output of java -version: java version “11.0.2” 2019-01-15 LTS
  • GraalVM version (if different from Java):
  • Quarkus version or git rev: 1.11.0.Beta2
  • Build tool (ie. output of mvnw --version or gradlew --version):

Gradle 6.5.1

Build time: 2020-06-30 06:32:47 UTC
Revision: 66bc713f7169626a7f0134bf452abde51550ea0a

Kotlin: 1.3.72
Groovy: 2.5.11
Ant: Apache Ant™ version 1.10.7 compiled on September 1 2019
JVM: 11.0.2 (Oracle Corporation 11.0.2+9-LTS)
OS: Mac OS X 10.16 x86_64

The workaround for me at the moment is to set quarkus.thread-pool.core-threads and quarkus.thread-pool.max-threads to the same value, which prevents any up and down scaling of the thread pool, e.g.:

quarkus.thread-pool.core-threads=32
quarkus.thread-pool.max-threads=32

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 27 (18 by maintainers)

Top GitHub Comments

2 reactions
Sanne commented, Jan 4, 2021

@voigtste might want to play with https://github.com/quarkusio/quarkus/pull/14102

It does NOT address:

  • the connection leak reported here
  • nor does it fix Hibernate Reactive Panache

but if you use Hibernate Reactive over its own API and without opening a Session directly, it will delegate work to the right context. You can set -Dorg.hibernate.reactive.common.InternalStateAssertions.ENFORCE=true to have it throw exceptions when one of the other APIs is used from the wrong thread (and it would fail with Panache Reactive if you use it at all from the wrong thread).

Assuming you don’t access the SQL Pool directly from a worker pool thread, and your code doesn’t fail when run with that flag enabled, it should avoid the connection leak as well.

1 reaction
Sanne commented, Jan 5, 2021

I didn’t want to spend too much time on this (as I don’t have much time and should really study Vert.x 4 for a long-term optimal solution), but since a leak isn’t acceptable to me I’m sending a PR with an alternative solution.

Essentially, while it would be more complicated to immediately clean up references on scale down, it’s quite straightforward to check for zombies when we scale in the other direction. This implies that while resources won’t be immediately released, the maximum cost will, in the worst case, only match the cost you’d have at maximum scale up of the pool.

So it’s not optimal yet but at least it’s not a leak which risks taking down a system, as a periodic scale up/down won’t result in additional memory costs.
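
The "check for zombies on scale up" idea can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: FakePool, SweepingThreadLocalPool and burst are hypothetical names, a "zombie" is assumed to be a pool whose owning thread is no longer alive, and the pool only counts connections instead of opening them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Sketch of sweeping dead threads' pools whenever a new thread registers. Names are hypothetical. */
public class ZombieSweepDemo {

    /** Stand-in for a per-thread connection pool holding open DB connections. */
    static final class FakePool {
        final int connections;
        private volatile boolean closed;
        FakePool(int connections) { this.connections = connections; }
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    static final class SweepingThreadLocalPool {
        /** Remembers which thread owns which pool so dead owners can be detected later. */
        private static final class Entry {
            final Thread owner;
            final FakePool pool;
            Entry(Thread owner, FakePool pool) { this.owner = owner; this.pool = pool; }
        }

        private final int maxSize;
        private final List<Entry> entries = new ArrayList<>();
        private final ThreadLocal<FakePool> local = new ThreadLocal<>();

        SweepingThreadLocalPool(int maxSize) { this.maxSize = maxSize; }

        FakePool pool() {
            FakePool p = local.get();
            if (p == null) {
                p = register(Thread.currentThread());
                local.set(p);
            }
            return p;
        }

        private synchronized FakePool register(Thread owner) {
            // Scale-up path: first close and drop pools whose owning thread has terminated.
            entries.removeIf(e -> {
                if (e.owner.isAlive()) {
                    return false;
                }
                e.pool.close();
                return true;
            });
            FakePool p = new FakePool(maxSize);
            entries.add(new Entry(owner, p));
            return p;
        }

        synchronized int openConnections() {
            return entries.stream().mapToInt(e -> e.pool.connections).sum();
        }
    }

    /** Starts workers that all hold a pool at the same time, then waits for them to exit. */
    static void burst(SweepingThreadLocalPool pool, int threads) throws InterruptedException {
        CountDownLatch allRegistered = new CountDownLatch(threads);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                pool.pool();
                allRegistered.countDown();
                try {
                    allRegistered.await(); // stay alive until every worker has its pool
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SweepingThreadLocalPool pool = new SweepingThreadLocalPool(10);
        burst(pool, 32);
        System.out.println(pool.openConnections()); // 320: not released yet, threads just exited
        burst(pool, 32);
        System.out.println(pool.openConnections()); // 320: old pools were swept on scale up
    }
}
```

As in the comment above, the pools of terminated threads are not released immediately; they are only reclaimed the next time a new thread asks for a pool, so the total stays bounded by the cost at maximum scale up.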
