BDP PINGs are sent much more frequently than necessary
See original GitHub issue
What version of gRPC-Java are you using?
1.37.0
What is your environment?
5.4.0-74-generic #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
What did you expect to see?
Client does not flood the server with PING frames when autoTuneFlowControl is enabled (the default).
What did you see instead?
Connection closed on server with
io.netty.handler.codec.http2.Http2Exception: Maximum number 10000 of outstanding control frames reached
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:108)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.handleOutstandingControlFrames(Http2ControlFrameLimitEncoder.java:96)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.writePing(Http2ControlFrameLimitEncoder.java:69)
at io.netty.handler.codec.http2.Http2FrameCodec.write(Http2FrameCodec.java:333)
Steps to reproduce the bug
Client makes request-response calls continuously such that there is a constant number of outstanding requests.
Server is a 3rd-party gRPC implementation based on Netty. It only acks received PING frames and does not send its own PING frames (ack=false).
Content of the acked frames is 1234.
Client and server are on the same host.
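For reference, a client loop of the shape described above can be sketched as follows (an illustrative reconstruction, not the reporter's actual code; the use of the standard grpc.health.v1.Health service, the port, and the concurrency of 64 are assumptions):

import io.grpc.ManagedChannel;
import io.grpc.health.v1.HealthCheckRequest;
import io.grpc.health.v1.HealthCheckResponse;
import io.grpc.health.v1.HealthGrpc;
import io.grpc.netty.NettyChannelBuilder;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.Semaphore;

public class ConstantLoadClient {
  public static void main(String[] args) throws InterruptedException {
    // autoTuneFlowControl is enabled by default on the Netty transport.
    ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 50051)
        .usePlaintext()
        .build();
    HealthGrpc.HealthStub stub = HealthGrpc.newStub(channel);
    HealthCheckRequest request = HealthCheckRequest.newBuilder().setService("").build();

    // Keep a roughly constant number (64) of outstanding request-response calls.
    Semaphore outstanding = new Semaphore(64);
    while (true) {
      outstanding.acquire();
      stub.check(request, new StreamObserver<HealthCheckResponse>() {
        @Override public void onNext(HealthCheckResponse value) {}
        @Override public void onError(Throwable t) { outstanding.release(); }
        @Override public void onCompleted() { outstanding.release(); }
      });
    }
  }
}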
Eventually (after several seconds) the connection is closed by the server with
io.netty.handler.codec.http2.Http2Exception: Maximum number 10000 of outstanding control frames reached
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:108)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.handleOutstandingControlFrames(Http2ControlFrameLimitEncoder.java:96)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.writePing(Http2ControlFrameLimitEncoder.java:69)
at io.netty.handler.codec.http2.Http2FrameCodec.write(Http2FrameCodec.java:333)
There is a workaround, NettyChannelBuilder.flowControlWindow(int), which happens to disable autoTuneFlowControl.
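A minimal sketch of that workaround on the client side (the 1 MiB window and the address are placeholder values, not a recommendation from the thread):

import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;

// Setting the window explicitly pins it and turns off autoTuneFlowControl,
// so no BDP PINGs are emitted for this channel.
ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 50051)
    .flowControlWindow(1024 * 1024)
    .usePlaintext()
    .build();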
Issue Analytics
- State:
- Created 2 years ago
- Comments: 10 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@voidzcy It is painful to get grpc-java into a compilable state on a local machine, so I composed a gRPC-only project where this behavior is trivially reproduced as well. It is a property of the Netty-based grpc-java client, as the server only does what is mandated by the HTTP/2 spec: acking received PINGs.
With the above example, NettyServerHandler.onPingRead is called at a frequency comparable to the inbound request rate, but the connection is not torn down, because of the overly aggressive buffer flushing of the grpc-java library (that's why Http2ControlFrameLimitEncoder does not kick in).
With autoTuneFlowControl enabled, it seems the rate of PINGs grows with a) the number of requests [1], [2]; and b) a decrease in the PING round-trip time [3]. I think the algorithm needs to be adjusted for the high-RPS, low-latency scenario.
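(To make that observation concrete, the auto-tuning can be modelled roughly as follows. This is an editor's simplification for illustration only, not the actual grpc-java implementation, and the class and method names are invented.)

// Simplified model: a BDP ping is sent when a DATA frame arrives and the
// previous ping has already been acked. The next ping therefore goes out on
// the first DATA frame after each ack, so the ping rate tracks both the
// inbound data-frame rate and 1/RTT; on loopback, with constant traffic,
// that approaches one PING per (tiny) round trip.
final class BdpPingModel {
  private boolean pingOutstanding = false;
  private long bytesSincePing = 0;

  /** Called for every inbound DATA frame. */
  void onDataRead(int bytes, PingSender sender) {
    bytesSincePing += bytes;
    if (!pingOutstanding) {
      sender.sendPing();
      pingOutstanding = true;
    }
  }

  /** Called when the peer acks our BDP ping. */
  void onPingAck() {
    // bytesSincePing approximates the bandwidth-delay product and would be
    // used to decide whether to grow the flow-control window.
    pingOutstanding = false;
    bytesSincePing = 0;
  }

  interface PingSender {
    void sendPing();
  }
}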
I came across this issue when doing some work to improve throughput / reduce cost. In our setup, both incoming and outgoing gRPC requests proxy through a local HTTP2 sidecar over loopback or a domain socket. As a result I’d expect the RTT to be very small and the max BDP PINGs per second to be very high.
I disabled autoTuneFlowControl for one service and saw a 3-4% reduction in the number of machines needed to handle the service's throughput, which is a non-trivial cost reduction for us. When I compared production profiles to before the change, I observed that the largest reduction in CPU was in reading and writing file descriptors (not surprising). I saw an unexpected increase in CPU spent in Native.eventFDWrite stemming from AbstractStream$TransportState.requestMessagesFromDeframer. I suspect that increase is due to the IO event loop being less likely to already be running at the moment the application thread requests messages (since the IO event loop is now doing less work as a result of reducing the PINGs).
Another observation is that most of the gain from disabling autoTuneFlowControl actually came from reduced CPU usage in the local H2 proxy (the thing responding to all those PING frames), rather than from reduced CPU usage of the grpc-java application process.
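(For completeness: the Netty server builder exposes an analogous flowControlWindow(int) setting. A hedged sketch, assuming it behaves the same way as on the client; the port, window size, and health-service registration are placeholders.)

import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import io.grpc.services.HealthStatusManager;

// Pinning the flow-control window so that BDP auto-tuning (and its PINGs)
// is not used for connections accepted by this server.
Server server = NettyServerBuilder.forPort(50051)
    .flowControlWindow(1024 * 1024)
    .addService(new HealthStatusManager().getHealthService())
    .build();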