Connection leak?

We recently updated some of our postgres client libraries and are now experiencing almost total pool exhaustion roughly 2~3h after deploying the new version in production on 2 out of 20 DBs. This never happened before.

We updated the following libraries: pg 7.4.3 ~> 7.6.1, pg-pool 2.0.3 ~> 2.0.4, pg-promise 7.5.4 ~> 8.5.2.

As the pool slowly exhausts, a majority of queries hitting an affected process fail with “timeout exceeded when trying to connect” (https://github.com/brianc/node-pg-pool/blob/v2.0.4/index.js#L178). The percentage of failing queries increases over time and the condition is persistent (~30min until we reverted).
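One way to watch this kind of exhaustion develop is to sample the pool's counters periodically. The following is a minimal sketch, assuming a pg-pool 2.x `Pool` instance, which exposes `totalCount`, `idleCount` and `waitingCount`. The helper name `poolSnapshot` and the derived `checkedOut` and `starved` fields are hypothetical and not part of the library:

```javascript
// Hypothetical helper: summarizes the counters a pg-pool 2.x Pool exposes.
// It accepts any object with totalCount, idleCount and waitingCount.
function poolSnapshot(pool) {
  // Clients that exist but are not idle are currently checked out.
  const checkedOut = pool.totalCount - pool.idleCount;
  return {
    total: pool.totalCount,
    idle: pool.idleCount,
    waiting: pool.waitingCount,
    checkedOut,
    // Assumption: "starved" means callers are queued while no client is idle.
    starved: pool.waitingCount > 0 && pool.idleCount === 0,
  };
}

// Usage sketch (assumes `pool` is an existing Pool instance):
// setInterval(() => console.log(poolSnapshot(pool)), 10000);
```

If `checkedOut` stays pinned at the pool size while `waiting` keeps growing, clients are being checked out and not returned, which would be consistent with a leak rather than a temporary load spike.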

I’ve been debugging this for several days now, but I’m having a hard time identifying the exact root cause. So far I suspect:

https://github.com/brianc/node-pg-pool/pull/86, which means we generally queue more work now, since pending queue items are no longer all dropped. However, we rarely saw “timeout exceeded when trying to connect” errors before, so this seems unlikely.

https://github.com/brianc/node-postgres/pull/1503: possibly some kind of race condition here, as both affected DBs are occasionally hit by queries that run into statement timeouts.

Any ideas, pointers, or potential areas for races would be really appreciated.

//cc @vitaly-t I know that pg-promise is not part of pg distribution, but you seemed really active here and I would prefer a single spot for discussion. Any insight would be appreciated.

We mainly (but not exclusively) use nested transactions via https://github.com/vitaly-t/pg-promise#transactions starting with SET LOCAL statement_timeout = 30000;
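For illustration, the pattern looks roughly like the sketch below. It assumes `db` is a pg-promise database object and uses only its `tx` and `none` methods. The names `runWithTimeout` and `work` are hypothetical, and this is not our actual production code:

```javascript
// Sketch only: `db` is assumed to be a pg-promise database object.
// `runWithTimeout` and `work` are hypothetical names for illustration.
function runWithTimeout(db, work) {
  return db.tx(async t => {
    // SET LOCAL applies until the end of the enclosing (outer) transaction.
    await t.none('SET LOCAL statement_timeout = 30000');
    // Calling tx on a transaction context opens a nested transaction,
    // which pg-promise runs on the same connection.
    return t.tx(inner => work(inner));
  });
}
```

Because every level shares one connection, that connection stays checked out of the pool until the outermost transaction resolves or rejects.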