Add KeepAlive support
With our use of gRPC Java across Google Compute Engine (GCE) L3 Load Balancers (Network Load Balancers), we seem to be hitting issues similar to those we had with gRPC in Go: https://github.com/grpc/grpc-go/issues/536
Basically, Google L3 load balancers silently drop long-lived TCP connections after 600 seconds.
While we were able to work around the issue by specifying a custom Dialer in Go:
// WithKeepAliveDialer returns a DialOption whose dialer enables TCP keepalive
// on each new connection, with the period taken from an application flag.
func WithKeepAliveDialer() grpc.DialOption {
    return grpc.WithDialer(func(addr string, timeout time.Duration) (net.Conn, error) {
        d := net.Dialer{Timeout: timeout, KeepAlive: *flagGrpcClientKeepAliveDuration}
        return d.Dial("tcp", addr)
    })
}
There seems to be no way of overriding the KeepAlive periods for NettyClientTransport. We know it’s possible to set the keep-alive period in the kernel of the machines, but it’s a bit of a stretch to expect user-code programmers to know about that.
Can we either:
- have the ability to specify the TCP keep-alive period at channel creation (sketched below), or
- add documentation around it, especially how it can cause hard-to-debug problems on GCE?
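To make the first ask concrete, here is a sketch of what enabling TCP keepalive at channel creation could look like if the Netty-based builder passed socket options through (the host and port are placeholders; note that SO_KEEPALIVE only switches keepalive on with the kernel-default period, so tuning the period itself would still need platform-specific options such as epoll's TCP_KEEPIDLE on Linux):

import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import io.netty.channel.ChannelOption;

public class TcpKeepaliveChannel {
    public static void main(String[] args) {
        // Illustrative only: enable TCP keepalive on the underlying sockets.
        // SO_KEEPALIVE alone uses the kernel's default probe period.
        ManagedChannel channel = NettyChannelBuilder.forAddress("example.com", 443)
            .withOption(ChannelOption.SO_KEEPALIVE, true)
            .build();
        channel.shutdownNow();
    }
}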
cc @ejona86 since he seems to have had opinions about it in https://github.com/grpc/grpc-java/issues/737
Top GitHub Comments
Here is an excerpt from the document I’m trying to get agreement on:
TCP keepalive is hard to configure in Java and Go. Enabling is easy, but one hour is far too infrequent to be useful; an application-level keepalive seems beneficial for configuration.
TCP keepalive is active even if there are no open streams. This wastes a substantial amount of battery on mobile; an application-level keepalive seems beneficial for optimization.
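As a minimal sketch of why enabling is easy but tuning is not, plain Java sockets expose only the on/off switch (host and port are placeholders):

import java.io.IOException;
import java.net.Socket;

public class EnableTcpKeepalive {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 443)) {
            // One call turns keepalive on, but Java offers no portable way to
            // set the probe period; that comes from kernel settings such as
            // Linux's net.ipv4.tcp_keepalive_time.
            socket.setKeepAlive(true);
        }
    }
}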
Application-level keepalive implies HTTP/2 PING. If we take a page from TCP keepalive’s book there are three parameters to tune: time (time since last receipt before sending a keepalive), interval (interval between keepalives when not receiving reply), and retry (number of times to retry sending keepalives). Interval and retry don’t quite apply to PING because the transport is reliable, so they will be replaced with timeout (equivalent to interval * retry), the time between sending a PING and not receiving any bytes to declare the connection dead.
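For a concrete picture of the two surviving parameters, here is a sketch using the keepAliveTime/keepAliveTimeout options that grpc-java's channel builder now exposes (the address and durations are placeholder values):

import java.util.concurrent.TimeUnit;

import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;

public class KeepaliveChannel {
    public static void main(String[] args) {
        ManagedChannel channel = NettyChannelBuilder.forAddress("example.com", 443)
            // time: idle period since the last read before sending a PING
            .keepAliveTime(5, TimeUnit.MINUTES)
            // timeout: how long to wait for any bytes after the PING before
            // declaring the connection dead
            .keepAliveTimeout(20, TimeUnit.SECONDS)
            .build();
        channel.shutdownNow();
    }
}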
Doing some form of keepalive is relatively straightforward. But avoiding DDoS is not as easy. Thus, avoiding DDoS is the most important part of the design. To mitigate DDoS, the design rests on the following points:
- Most RPCs are unary with quick replies, so keepalive is less likely to be triggered. It would primarily be triggered when there is a long-lived RPC.
- Since keepalive is not occurring on HTTP/2 connections without any streams, there will be a higher chance of failure for new RPCs following a long period of inactivity. To reduce the tail latency for these RPCs, it is important to not reset the 'keepalive time' when a connection becomes active; if a new stream is created and there has been greater than 'keepalive time' since the last read byte, then a keepalive PING should be sent (ideally before the HEADERS frame). Doing so detects the broken connection with a latency of 'keepalive timeout' instead of 'keepalive time + timeout'.
- 'keepalive time' is ideally measured from the time of the last byte read. However, simplistic implementations may choose to measure from the time of the last keepalive PING (aka, polling). Such implementations should take extra precautions to avoid issues due to latency added by outbound buffers, such as limiting the outbound buffer size and using a larger 'keepalive timeout'.
- As an optional optimization, when 'keepalive timeout' is exceeded, don't kill the connection. Instead, start a new connection. If the new connection becomes ready and the old connection still hasn't received any bytes, then kill the old connection. If the old connection wins the race, then kill the new connection mid-startup.
- The 'keepalive time' is expected to be an application-configurable option, with at least second precision. It is unspecified whether 'keepalive timeout' is application-configurable, but it should be at least multiple times the round-trip time to allow for lost packets and TCP retransmits. It may also need to be higher to account for long garbage collector pauses.
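The "do not reset 'keepalive time' on stream creation" rule in the list above can be sketched in a few lines; the class and method names here are hypothetical, not grpc-java internals, and the five-minute value is an assumed setting:

import java.util.concurrent.TimeUnit;

// Hypothetical idle tracker; not grpc-java's actual transport code.
final class KeepaliveIdleTracker {
    private final long keepAliveTimeNanos = TimeUnit.MINUTES.toNanos(5);
    private volatile long lastReadNanos = System.nanoTime();

    // The transport calls this whenever bytes are read from the connection.
    void onBytesRead() {
        lastReadNanos = System.nanoTime();
    }

    // Called before writing HEADERS for a new stream. Idleness is measured
    // from the last read and never reset on stream creation, so a dead
    // connection is detected within 'keepalive timeout' instead of
    // 'keepalive time + timeout'.
    boolean shouldPingBeforeNewStream() {
        return System.nanoTime() - lastReadNanos > keepAliveTimeNanos;
    }
}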
@ZedYu, true, but not all clients will necessarily ping. That strategy would only work in a perfectly homogeneous environment where the server knew a priori how each client’s ping interval is configured.
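One way around that heterogeneity is for the server to send its own keepalives and, separately, police how often clients may ping. A sketch with NettyServerBuilder's keepalive and enforcement options (port and durations are placeholder values):

import java.util.concurrent.TimeUnit;

import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;

public class KeepaliveServer {
    public static void main(String[] args) throws Exception {
        Server server = NettyServerBuilder.forPort(50051)
            // Server-initiated pings: no reliance on clients pinging at all.
            .keepAliveTime(5, TimeUnit.MINUTES)
            .keepAliveTimeout(20, TimeUnit.SECONDS)
            // Enforcement: reject clients that ping more often than this.
            .permitKeepAliveTime(1, TimeUnit.MINUTES)
            .permitKeepAliveWithoutCalls(false)
            .build()
            .start();
        server.shutdownNow();
    }
}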