Database disconnect handling
I have been doing some in-depth testing of how the library behaves when we suddenly lose the connection with the server, under both light and heavy load, and here are the results.
Under light load
Under light load everything behaves as expected: as long as we provide a pg.on('error', cb) handler, it's all good, all the way up to version 5.1.
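To illustrate why the handler matters, here is a minimal sketch using a plain Node.js EventEmitter as a stand-in for the object that emits connection errors. It assumes the library follows the standard Node convention for 'error' events, which the pg.on('error', cb) call suggests; the helper name emitConnectionLoss and the error message are made up for the example.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: simulate a lost connection by emitting 'error'.
// Node throws the emitted Error when no 'error' listener is registered.
function emitConnectionLoss(emitter) {
  try {
    emitter.emit('error', new Error('connection terminated'));
    return 'handled';
  } catch (err) {
    return 'thrown: ' + err.message;
  }
}

// Stand-ins for the emitting object, with and without a handler.
const withoutHandler = new EventEmitter();
const withHandler = new EventEmitter();

const seen = [];
withHandler.on('error', (err) => seen.push(err.message));
```

Without a listener, the emitted error is thrown and would normally crash the process; with one, it is simply passed to the callback.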
However, it all changes as soon as we go to version 6.0, where restoring the connection with the server no longer restores communications. After the connection is restored in 6.0, we keep getting the error Unable to set non-blocking to true for every single query until the process is restarted.
Under heavy load
Just as the server connection dies, in about 50% of all cases under heavy load I can see the connection pool suddenly making an abnormal call into client.end(). What I mean by that is that it is called on Client objects for which the connection has been allocated successfully (just before it got broken), and for which the client hasn't yet called done() to release the connection. I haven't been able to figure out the full implications of such an out-of-turn client release yet, but since it only happens in about 50% of cases under heavy load, I suspect it is a bug of some sort, as we then end up calling client.end() twice.
This abnormal call under heavy load happens consistently in both 5.1 and 6.0, i.e. with both the old and the new connection pools.
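One way to confirm the double client.end() call while debugging is to wrap the method so that repeat calls are counted and ignored. This is only a diagnostic sketch, not a fix for the underlying behaviour: guardEnd is a hypothetical helper, and it assumes nothing about the client beyond it having an end() method.

```javascript
// Hypothetical diagnostic helper: make end() idempotent on a client-like
// object and count how many extra calls were attempted.
function guardEnd(client) {
  let ended = false;
  let extraCalls = 0;
  const originalEnd = client.end.bind(client);

  client.end = function (...args) {
    if (ended) {
      // A second (out-of-turn) call: record it and do nothing.
      extraCalls += 1;
      return undefined;
    }
    ended = true;
    return originalEnd(...args);
  };

  return { extraEndCalls: () => extraCalls };
}
```

Applying guardEnd to each client right after it is acquired would show whether the extra call comes from the pool or from application code.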
Tested under Node.js 6.2.2, Windows 10 64-bit.
NOTE: By light load I mean connecting via the pool and making queries once a second, and by heavy load, doing the same but 10 times a second (which isn't really that heavy 😉).
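For anyone trying to reproduce this, the two load levels can be sketched as a simple timer-driven loop. This is not the actual test script: intervalForRate, startLoad and the runQuery callback are hypothetical names, and runQuery is assumed to take a client from the pool, run one query, and release the client.

```javascript
// Convert a target rate (queries per second) into a timer interval in ms.
function intervalForRate(perSecond) {
  return Math.round(1000 / perSecond);
}

// `runQuery` is a hypothetical caller-supplied function that performs one
// connect/query/release cycle. Returns a function that stops the load.
function startLoad(runQuery, perSecond) {
  const timer = setInterval(runQuery, intervalForRate(perSecond));
  return function stop() {
    clearInterval(timer);
  };
}
```

Light load would then be startLoad(runQuery, 1) and heavy load startLoad(runQuery, 10), with the server connection cut while the loop is running.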
Issue Analytics
- Created 7 years ago
- Comments: 10 (7 by maintainers)
Top GitHub Comments
Hello.
I work at Heroku on the Connect product, and my team uses this package in our system for monitoring thousands of our customers' Heroku Postgres databases. We've seen these error messages as well, and I recently investigated the errors on specific databases to see if there were any trends. I found a variety of events that took place with a Postgres database when these errors began to emit from our system:
heroku pg:reset [DATABASE]. This blows away all data in the database. I don't know the technical details of how this is implemented.
My current plan is to proactively reconnect with some kind of back-off when these errors start appearing. We are using 4.x of the library.
Hopefully this helps you in your efforts to reproduce the problem.
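The reconnect-with-back-off plan mentioned above might look roughly like the following sketch. It is not the code we run: backoffDelay and reconnectWithBackoff are hypothetical names, the delays and retry count are arbitrary defaults, and connect is assumed to be a caller-supplied function that returns a promise and rejects when the database is unreachable.

```javascript
// Exponential back-off delay in ms, capped at maxMs. `attempt` starts at 0.
function backoffDelay(attempt, baseMs = 100, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `connect` is a hypothetical function that returns a promise for a
// connection. Retries up to `retries` times, waiting longer after each failure.
async function reconnectWithBackoff(connect, options = {}) {
  const { retries = 5, baseMs = 100, maxMs = 30000 } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, baseMs, maxMs));
      }
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}
```

Capping the delay keeps a long outage from pushing the retry interval out indefinitely; adding random jitter would be a reasonable extension when many monitors reconnect at once.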
It doesn't appear that we use pg-native when we're using this package directly (e.g. we never use require('pg').native). We do include it in our dependencies, but it appears to be for use with sequelize.
There is no reason for us to be using an older version of the package, aside from this code being relatively old and us not being as proactive about bumping our Node dependencies as we should. We're primarily a Python team.
I am going to try and get us onto 6.x while working on this error handling. Our usage is pretty simple, and it appears the biggest change code-wise is no more pg.connect, which should be easy to fix.
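As a rough idea of what that change could look like, here is a sketch of a helper that runs a single query through an explicit pool object instead of pg.connect. It assumes the pool passed in behaves like a pg.Pool, with a connect callback of the form (err, client, done) and a client.query(text, values, callback) method; queryWithPool itself is a hypothetical name, not part of the library.

```javascript
// Hypothetical helper: run one query through a pool-like object.
// `pool` is assumed to expose connect(cb) with cb(err, client, done).
function queryWithPool(pool, text, values, callback) {
  pool.connect((connectErr, client, done) => {
    if (connectErr) {
      // No client was acquired, so there is nothing to release.
      return callback(connectErr);
    }
    client.query(text, values, (queryErr, result) => {
      // Always hand the client back to the pool, even when the query failed.
      done();
      callback(queryErr, result);
    });
  });
}
```

In real code the pool would be created once at startup and shared, with queryWithPool(pool, 'SELECT 1', [], cb) replacing each former pg.connect call site.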