BDP PINGs are sent much more frequently than necessary
See original GitHub issue
What version of gRPC-Java are you using?
1.37.0
What is your environment?
5.4.0-74-generic #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
What did you expect to see?
Client does not flood the server with PING frames when autoTuneFlowControl is enabled (the default).
What did you see instead?
Connection closed on server with
io.netty.handler.codec.http2.Http2Exception: Maximum number 10000 of outstanding control frames reached
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:108)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.handleOutstandingControlFrames(Http2ControlFrameLimitEncoder.java:96)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.writePing(Http2ControlFrameLimitEncoder.java:69)
at io.netty.handler.codec.http2.Http2FrameCodec.write(Http2FrameCodec.java:333)
Steps to reproduce the bug
Client makes request-response calls continuously such that there is a constant number of outstanding requests.
Server is a 3rd-party gRPC implementation based on Netty. It only acks received PING frames and does not send its own PING frames (ack=false).
Content of the acked frames is 1234.
Client and server are on the same host.
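For reference, a client loop of the shape described above can be sketched as follows (an illustrative reconstruction, not the reporter's actual code; the use of the standard grpc.health.v1.Health service, the port, and the concurrency of 64 are assumptions):

import io.grpc.ManagedChannel;
import io.grpc.health.v1.HealthCheckRequest;
import io.grpc.health.v1.HealthCheckResponse;
import io.grpc.health.v1.HealthGrpc;
import io.grpc.netty.NettyChannelBuilder;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.Semaphore;

public class ConstantLoadClient {
  public static void main(String[] args) throws InterruptedException {
    // autoTuneFlowControl is enabled by default on the Netty transport.
    ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 50051)
        .usePlaintext()
        .build();
    HealthGrpc.HealthStub stub = HealthGrpc.newStub(channel);
    HealthCheckRequest request = HealthCheckRequest.newBuilder().setService("").build();

    // Keep a roughly constant number (64) of outstanding request-response calls.
    Semaphore outstanding = new Semaphore(64);
    while (true) {
      outstanding.acquire();
      stub.check(request, new StreamObserver<HealthCheckResponse>() {
        @Override public void onNext(HealthCheckResponse value) {}
        @Override public void onError(Throwable t) { outstanding.release(); }
        @Override public void onCompleted() { outstanding.release(); }
      });
    }
  }
}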
Eventually (after several seconds) the connection is closed by the server with
io.netty.handler.codec.http2.Http2Exception: Maximum number 10000 of outstanding control frames reached
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:108)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.handleOutstandingControlFrames(Http2ControlFrameLimitEncoder.java:96)
at io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder.writePing(Http2ControlFrameLimitEncoder.java:69)
at io.netty.handler.codec.http2.Http2FrameCodec.write(Http2FrameCodec.java:333)
There is a workaround, NettyChannelBuilder.flowControlWindow(int), which happens to disable autoTuneFlowControl.
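A minimal sketch of that workaround on the client side (the 1 MiB window and the address are placeholder values, not a recommendation from the thread):

import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;

// Setting the window explicitly pins it and turns off autoTuneFlowControl,
// so no BDP PINGs are emitted for this channel.
ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 50051)
    .flowControlWindow(1024 * 1024)
    .usePlaintext()
    .build();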
Issue Analytics
- State:
- Created 2 years ago
- Comments: 10 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@voidzcy It is painful to get grpc-java into a compilable state on a local machine, so I composed a gRPC-only project where this behavior is trivially reproduced as well. It is a property of the Netty-based grpc-java client, as the server only does what is mandated by the HTTP/2 spec: acking received PINGs.
With the above example, NettyServerHandler.onPingRead is called at a frequency comparable to the inbound request rate, but the connection is not torn down, because of the overly aggressive buffer flushing of the grpc-java library (that's why Http2ControlFrameLimitEncoder does not kick in).
With autoTuneFlowControl enabled, it seems the rate of PINGs grows with a) the number of requests [1], [2]; and b) a decrease in the PING round-trip time [3]. I think the algorithm needs to be adjusted for the high-RPS, low-latency scenario.
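(To make that observation concrete, the auto-tuning can be modelled roughly as follows. This is an editor's simplification for illustration only, not the actual grpc-java implementation, and the class and method names are invented.)

// Simplified model: a BDP ping is sent when a DATA frame arrives and the
// previous ping has already been acked. The next ping therefore goes out on
// the first DATA frame after each ack, so the ping rate tracks both the
// inbound data-frame rate and 1/RTT; on loopback, with constant traffic,
// that approaches one PING per (tiny) round trip.
final class BdpPingModel {
  private boolean pingOutstanding = false;
  private long bytesSincePing = 0;

  /** Called for every inbound DATA frame. */
  void onDataRead(int bytes, PingSender sender) {
    bytesSincePing += bytes;
    if (!pingOutstanding) {
      sender.sendPing();
      pingOutstanding = true;
    }
  }

  /** Called when the peer acks our BDP ping. */
  void onPingAck() {
    // bytesSincePing approximates the bandwidth-delay product and would be
    // used to decide whether to grow the flow-control window.
    pingOutstanding = false;
    bytesSincePing = 0;
  }

  interface PingSender {
    void sendPing();
  }
}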
I came across this issue when doing some work to improve throughput / reduce cost. In our setup, both incoming and outgoing gRPC requests proxy through a local HTTP2 sidecar over loopback or a domain socket. As a result I’d expect the RTT to be very small and the max BDP PINGs per second to be very high.
I disabled autoTuneFlowControl for one service and saw a 3-4% reduction in the number of machines needed to handle the service's throughput, which is a non-trivial cost reduction for us. When I compared production profiles to before the change, I observed that the largest reduction in CPU was in reading and writing file descriptors (not surprising). I saw an unexpected increase in CPU spent in Native.eventFDWrite stemming from AbstractStream$TransportState.requestMessagesFromDeframer. I suspect that increase is due to the IO event loop being less likely to already be running at the moment the application thread requests messages (since the IO event loop is now doing less work as a result of reducing the PINGs).
Another observation is that most of the gain from disabling autoTuneFlowControl actually came from reduced CPU usage in the local H2 proxy (the thing responding to all those PING frames), rather than from reduced CPU usage of the grpc-java application process.
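(For completeness: the Netty server builder exposes an analogous flowControlWindow(int) setting. A hedged sketch, assuming it behaves the same way as on the client; the port, window size, and health-service registration are placeholders.)

import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import io.grpc.services.HealthStatusManager;

// Pinning the flow-control window so that BDP auto-tuning (and its PINGs)
// is not used for connections accepted by this server.
Server server = NettyServerBuilder.forPort(50051)
    .flowControlWindow(1024 * 1024)
    .addService(new HealthStatusManager().getHealthService())
    .build();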