No great pattern for doing a graceful shutdown with Apollo Server integration packages
Hello,
After updating to 2.22 (see PR #4981) and applying the changelog's recommendation to insert `await server.start()` between `server = new ApolloServer()` and `server.applyMiddleware`, we started observing that Apollo now listens for termination signals and stops handling in-flight requests, throwing:
{"errors": [{
"message": "Cannot execute GraphQL operations after the server has stopped.",
"extensions": {"code":"INTERNAL_SERVER_ERROR"}
}]}
We were already handling these signals and calling the express `close` method (which does not abort in-flight requests but rather stops accepting new ones and waits for the others to finish).
My impression was that when using a middleware integration such as express, rather than the standalone Apollo Server, these signals should not be handled by Apollo itself. At least they were not prior to 2.22.x.
To work around this issue we explicitly set `stopOnTerminationSignals: false`, and that seems to have resolved it.
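A minimal sketch of that workaround, assuming `apollo-server-express` 2.22 or later and an existing express `app`. The helper names `apolloOptions` and `attachApollo` are hypothetical, and `typeDefs` and `resolvers` stand in for your own schema:

```javascript
// Builds the ApolloServer constructor options with signal handling disabled.
function apolloOptions(typeDefs, resolvers) {
  return {
    typeDefs,
    resolvers,
    // Leave SIGINT/SIGTERM handling to the application's own shutdown code.
    stopOnTerminationSignals: false,
  };
}

async function attachApollo(app, typeDefs, resolvers) {
  // Required here so the options helper stays usable without the package installed.
  const { ApolloServer } = require('apollo-server-express');
  const server = new ApolloServer(apolloOptions(typeDefs, resolvers));
  await server.start();            // the step the 2.22 changelog recommends
  server.applyMiddleware({ app }); // mount the GraphQL endpoint on the express app
  return server;
}
```

With the option set to `false`, Apollo no longer stops itself on a termination signal, so it may still be worth calling `server.stop()` from your own handler once in-flight requests have finished.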
Some context: we deploy to a Kubernetes Deployment that does rolling updates. After the new version has started, Kubernetes sends a termination signal to the old version. Upon receiving this signal we make the readiness probe fail so that no new requests are routed to the pod, but keep express up for some more time until the in-flight requests are finished (or a timeout is triggered).
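The shutdown sequence described above could look roughly like the sketch below. It is not the poster's actual code: the helper names, the 25-second timeout, and the idea of exposing readiness through a dedicated handler are all assumptions made for illustration.

```javascript
// Hypothetical drain timeout; keep it below the pod's terminationGracePeriodSeconds.
const DRAIN_TIMEOUT_MS = 25000;

// Tracks whether the process has been asked to shut down.
function createLifecycle() {
  let shuttingDown = false;
  return {
    isReady: () => !shuttingDown,
    beginShutdown: () => { shuttingDown = true; },
  };
}

// Express-compatible readiness handler: 200 while serving, 503 once draining.
function readinessHandler(lifecycle) {
  return (req, res) => {
    const ready = lifecycle.isReady();
    res.statusCode = ready ? 200 : 503;
    res.end(ready ? 'ok' : 'shutting down');
  };
}

// `server` is the http.Server returned by app.listen().
function installSigtermHandler(server, lifecycle, timeoutMs = DRAIN_TIMEOUT_MS) {
  process.once('SIGTERM', () => {
    lifecycle.beginShutdown();            // readiness probe starts failing
    server.close(() => process.exit(0));  // stop accepting, wait for in-flight requests
    setTimeout(() => process.exit(1), timeoutMs).unref(); // give up after the timeout
  });
}
```

Wiring it up would be along the lines of `app.get('/ready', readinessHandler(lifecycle))` (the path is a placeholder) followed by `installSigtermHandler(app.listen(4000), lifecycle)`. Idle keep-alive connections can delay the `close` callback on some Node versions, which is one reason the timeout is there.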
I would like to throw this out there just so that it’s acknowledged: right now it looks like `stop` also shuts down the health check endpoint, which can then cause `readinessProbes` to fail in kubernetes, which then prevents the request from completing.
I haven’t looked into the code, so maybe I’m missing something, but it seems like that’s the case, and if so it would be good to keep in mind that in kubernetes the health check should stay active until the pod is ready to be removed. At least, that’s what I intuit, as once a pod enters `terminating` state it’s generally considered finished cleaning up once it stops being `ready`.
Something else could be going on; I’ll follow up after I figure out what’s up with this.
Edit: Upon further examination, the issue was caused by Istio needing a pod annotation of:
(starting on the inline-and-improve-stoppable project at https://github.com/apollographql/apollo-server/pull/5498 )