Sporadic cur != GRPC_CHANNEL_SHUTDOWN crashes the node process
Problem description
The company I work at runs multiple services that communicate with each other through the Node gRPC implementation. We are sporadically seeing E0714 23:39:13.907064150 18 connectivity_state.cc:154] assertion failed: cur != GRPC_CHANNEL_SHUTDOWN errors that shut the service down completely. The GRPC_CHANNEL_SHUTDOWN errors are usually (but not always) preceded by other errors in the service, such as Postgres statement timeouts and Elasticsearch errors. However, those errors are handled, and they should definitely not shut down the service. The crash also seems to happen mostly in one service that usually handles more data than the others.
Reproduction steps
You can use the code in the repo here to reproduce the problem. It also describes the steps to reproduce.
Environment
- OS name, version and architecture: Alpine Linux v3.11 docker running on Amazon Linux 2 x86_64
- Node version: v12.16.0
- Node installation method: docker
- Package name and version: "grpc": "1.22.2" and "@grpc/grpc-js": "0.7.0"
Additional context
- We are using Workers but the grpc server is created in the main thread.
- Server Options:
{
'grpc.max_send_message_length': 104857600,
'grpc.max_receive_message_length': 104857600,
'grpc.max_connection_idle_ms': 15000,
'grpc.max_connection_age_ms': 30000,
'grpc.keepalive_time_ms': 5000,
'grpc.keepalive_timeout_ms': 1000,
'grpc.keepalive_permit_without_calls': 1,
// Allow grpc pings from client without data.
// It must be 0 with Workers otherwise it throws RESOURCE_EXHAUSTED.
'grpc.http2.min_ping_interval_without_data_ms': 0
}
- Client Options:
{
'grpc.max_send_message_length': 104857600,
'grpc.max_receive_message_length': 104857600,
'grpc.keepalive_time_ms': 5000,
'grpc.keepalive_timeout_ms': 1000,
'grpc.keepalive_permit_without_calls': 1
}
- Error:
Jul 15 09:10:37 ca-cqoexb data-service-be460c54df5d E0715 13:10:37.418067384 17 connectivity_state.cc:154] assertion failed: cur != GRPC_CHANNEL_SHUTDOWN
Jul 15 09:10:37 ca-cqoexb data-service-be460c54df5d Aborted
[UPDATE]
I found that grpc core removed the assertion at connectivity_state.cc:154 that is causing the issue. However, that change was released in version 1.25. Here is the link to the PR that removed the line and the explanation of the changes. Is there an ETA for releasing version 1.25 of grpc-node? If not, what is the best option for us, assuming that version fixes the problem?
[UPDATE 2]
I was able to reproduce the problem and have updated the Reproduction steps section. Please note that removing max-age will cause RST_STREAMs when working with an NLB.
Top GitHub Comments
The list of supported options can be found here: https://github.com/grpc/grpc-node/blob/master/PACKAGE-COMPARISON.md.
The grpc package is now deprecated, so this change will likely not happen. We recommend switching to @grpc/grpc-js.
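For readers weighing the switch to @grpc/grpc-js recommended above, here is a minimal sketch of how the options from the issue could be passed to a server and a client. It is not the reporter's code. The helper names (createServer, createClient), the injected grpc parameter, the address, and the use of insecure credentials are all assumptions made for illustration. Whether each option is honored depends on the installed @grpc/grpc-js version, so check the PACKAGE-COMPARISON.md list linked above before relying on any of them.

```javascript
// Sketch only: option values are copied from the issue. Some of them
// (for example 'grpc.http2.min_ping_interval_without_data_ms') may be
// unsupported or ignored by @grpc/grpc-js; this has not been verified here.
const MAX_MESSAGE_BYTES = 100 * 1024 * 1024; // 104857600, as in the issue

const serverOptions = {
  'grpc.max_send_message_length': MAX_MESSAGE_BYTES,
  'grpc.max_receive_message_length': MAX_MESSAGE_BYTES,
  'grpc.max_connection_idle_ms': 15000,
  'grpc.max_connection_age_ms': 30000,
  'grpc.keepalive_time_ms': 5000,
  'grpc.keepalive_timeout_ms': 1000,
  'grpc.keepalive_permit_without_calls': 1,
  'grpc.http2.min_ping_interval_without_data_ms': 0,
};

const clientOptions = {
  'grpc.max_send_message_length': MAX_MESSAGE_BYTES,
  'grpc.max_receive_message_length': MAX_MESSAGE_BYTES,
  'grpc.keepalive_time_ms': 5000,
  'grpc.keepalive_timeout_ms': 1000,
  'grpc.keepalive_permit_without_calls': 1,
};

// Hypothetical helper: `grpc` is injected so the same code can be pointed
// at either the deprecated "grpc" package or "@grpc/grpc-js".
function createServer(grpc, options = serverOptions) {
  return new grpc.Server(options);
}

// Hypothetical helper: insecure credentials are an assumption for the
// sketch; a real deployment would likely use TLS credentials instead.
function createClient(grpc, address, options = clientOptions) {
  return new grpc.Client(address, grpc.credentials.createInsecure(), options);
}

// Usage (assumes @grpc/grpc-js is installed; the address is a placeholder):
//   const grpc = require('@grpc/grpc-js');
//   const server = createServer(grpc);
//   const client = createClient(grpc, 'localhost:50051');
```

Injecting the module keeps the option objects in one place, so trying the pure JavaScript implementation only requires changing the require call rather than every call site.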