Subscriber maxes out CPU after exactly 60 minutes on GKE
Environment details
- OS: GKE
- Node.js version: 12.14.1
- npm version: –
- @google-cloud/pubsub version: pubsub@1.5.0 / grpc-js@0.6.16
Steps to reproduce
- Start a pod that subscribes to an idle subscription
- Wait 60 minutes
- Observe node process eat up all available CPU
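The steps above could be reproduced with something like the following minimal sketch. It is not the reporter's actual code: the helper names (`cpuFraction`, `startCpuLogger`, `attachSubscriber`) and the `SUBSCRIPTION_NAME` environment variable are assumptions made for illustration. It attaches a listener to a subscription and periodically logs the process's own CPU usage via `process.cpuUsage()`, so a spike can be timestamped against pod uptime. If the event loop is fully blocked during the spike, the interval timer will not fire either, and the absence of log lines is then itself the signal.

```javascript
// Sketch only: helper names and SUBSCRIPTION_NAME are hypothetical.
const SAMPLE_INTERVAL_MS = 10000;

// Fraction of one CPU core used, given a cpuUsage delta (microseconds)
// and the wall-clock time (milliseconds) over which it accumulated.
function cpuFraction(deltaUsage, elapsedMs) {
  if (elapsedMs <= 0) {
    return 0;
  }
  return (deltaUsage.user + deltaUsage.system) / (elapsedMs * 1000);
}

// Logs process uptime and CPU usage every intervalMs milliseconds.
function startCpuLogger(intervalMs = SAMPLE_INTERVAL_MS, log = console.log) {
  let lastUsage = process.cpuUsage();
  let lastTime = Date.now();
  const timer = setInterval(() => {
    const usage = process.cpuUsage();
    const now = Date.now();
    const delta = {
      user: usage.user - lastUsage.user,
      system: usage.system - lastUsage.system,
    };
    const percent = (cpuFraction(delta, now - lastTime) * 100).toFixed(1);
    log(`uptime=${Math.round(process.uptime())}s cpu=${percent}%`);
    lastUsage = usage;
    lastTime = now;
  }, intervalMs);
  timer.unref(); // the logger alone should not keep the process alive
  return timer;
}

// Attaches message and error listeners to a subscription of the given client.
function attachSubscriber(pubsub, subscriptionName, log = console.log) {
  const subscription = pubsub.subscription(subscriptionName);
  subscription.on('message', (message) => {
    log(`received message ${message.id}`);
    message.ack();
  });
  subscription.on('error', (err) => {
    log(`subscription error: ${err.message}`);
  });
  return subscription;
}

// Only runs when a subscription name is supplied via the environment.
if (process.env.SUBSCRIPTION_NAME) {
  const {PubSub} = require('@google-cloud/pubsub');
  startCpuLogger();
  attachSubscriber(new PubSub(), process.env.SUBSCRIPTION_NAME);
}
```

With an idle subscription no `received message` lines are expected, so the log should contain only the periodic CPU samples until the problem sets in.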
I have a pod that subscribes to a subscription which is 100% idle (i.e. no messages published on that topic) during the observation period.
Exactly 60 minutes after the pod starts, the node process eats up all available CPU and keeps doing so until the pod is deleted manually to force a restart.
This behavior does not appear when using the C++ gRPC bindings, as suggested in https://github.com/googleapis/nodejs-pubsub/issues/850#issuecomment-573900907
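For reference, switching to the C++ bindings amounts to passing the `grpc` module to the client constructor, as far as I understand the nodejs-pubsub README. The sketch below assumes the `grpc` package is installed alongside `@google-cloud/pubsub`; the helper `buildPubSubOptions` and the `USE_NATIVE_GRPC` and `SUBSCRIPTION_NAME` environment variables are hypothetical names introduced only for this example.

```javascript
// Sketch only: buildPubSubOptions, USE_NATIVE_GRPC and SUBSCRIPTION_NAME
// are hypothetical names, not part of @google-cloud/pubsub.

// Returns constructor options for PubSub. With useNativeGrpc set, the
// `grpc` package (C++ bindings) is loaded and handed to the client;
// otherwise the client falls back to its default, @grpc/grpc-js.
function buildPubSubOptions(useNativeGrpc, loadModule = require) {
  if (!useNativeGrpc) {
    return {};
  }
  return {grpc: loadModule('grpc')};
}

// Only runs when a subscription name is supplied via the environment.
if (process.env.SUBSCRIPTION_NAME) {
  const {PubSub} = require('@google-cloud/pubsub');
  const options = buildPubSubOptions(process.env.USE_NATIVE_GRPC === '1');
  const pubsub = new PubSub(options);
  const subscription = pubsub.subscription(process.env.SUBSCRIPTION_NAME);
  subscription.on('message', (message) => message.ack());
  subscription.on('error', (err) => console.error(err));
}
```

Keeping the choice behind a flag makes it easy to run the same pod with either transport and compare CPU behavior around the 60-minute mark.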
I have also enabled debug output through
GRPC_TRACE=all
GRPC_VERBOSITY=DEBUG
I’ve put the full debug output (as fetched from Google Cloud Logging) of the entire lifetime of the pod in this gist.
There is no further output after the last line, even though the pod kept running for a while, as can be seen from the graph above: the bump in CPU usage is the period during which the pod eats up all available CPU and doesn’t seem to log anything. (Times in the logs correspond to the times in the graph with a 1h timezone difference, so 10:51 in the logs = 11:51 in the graph.)
I can reproduce this problem with any of my subscriber pods.
I believe it is very likely that this problem also falls into the category of #868.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 24 (7 by maintainers)
Top GitHub Comments
I have now published @grpc/grpc-js version 0.6.18. I’m pretty sure I’ve fixed it this time, and if not, I added some more relevant logging. Can you try that out?
@ctavan I will take a look at updating the versions in nodejs-pubsub. Thank you for all the help in testing this stuff!