Bidi calls never close, leaking memory client and server
Using v1.8.0.
The following client/server leaves the bidi RPC call open, holding onto resources and ultimately going OOM if enough calls are made:
// proto
message Request {}
message Response {}
service TestService {
  rpc Ping(stream Request) returns (stream Response);
}
/////////////////////////////
// server impl
private static class ServerServiceImpl extends TestServiceGrpc.TestServiceImplBase {
  @Override
  public StreamObserver<TestRpc.Request> ping(StreamObserver<TestRpc.Response> responseObserver) {
    return new StreamObserver<TestRpc.Request>() {
      @Override
      public void onNext(TestRpc.Request request) {
        // Reply once and immediately complete the server side of the stream.
        responseObserver.onNext(TestRpc.Response.newBuilder().build());
        responseObserver.onCompleted();
      }
      @Override public void onError(Throwable throwable) {}
      @Override public void onCompleted() {}
    };
  }
}
/////////////////////////////
// client impl
private void oneCall(Channel chan) {
  ClientCall<TestRpc.Request, TestRpc.Response> call =
      chan.newCall(TestServiceGrpc.getPingMethod(), CallOptions.DEFAULT);
  call.start(new ClientCall.Listener<TestRpc.Response>() {}, new Metadata());
  call.sendMessage(TestRpc.Request.newBuilder().build());
  call.request(1);
  // No call.halfClose() here, so the client never completes its side of the stream.
}

for (int i = 0; i < 1000; ++i) {
  oneCall(channel);
}
// If I attach a debugger to the client here, I can see 1000 instances of DefaultStream and 1000 instances of a bunch of other grpc/netty bits, even after > 5 minutes and repeated attempts at GC.
Thread.sleep(9999999);
Replacing the client’s ClientCall.Listener with one that calls .halfClose() upon completion works around the issue.
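The workaround code itself is not shown in the report; below is a minimal sketch of what it might look like, assuming the half-close is issued from the listener’s onClose() callback (that placement is an assumption, not something stated in the issue):

// client workaround sketch (halfClose() in onClose() is an assumption, not from the report)
private void oneCall(Channel chan) {
  ClientCall<TestRpc.Request, TestRpc.Response> call =
      chan.newCall(TestServiceGrpc.getPingMethod(), CallOptions.DEFAULT);
  call.start(new ClientCall.Listener<TestRpc.Response>() {
    @Override
    public void onClose(io.grpc.Status status, Metadata trailers) {
      // Tell the transport the client is done sending so the stream can be retired.
      call.halfClose();
    }
  }, new Metadata());
  call.sendMessage(TestRpc.Request.newBuilder().build());
  call.request(1);
}

Half-closing right after the last sendMessage() (i.e. call.halfClose() at the end of oneCall) should achieve the same effect without waiting for the server to finish.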
Top GitHub Comments
In HTTP (and gRPC) the bi-di stream is done once the server finishes responding. We should simply throw away anything the client tries to send past that point (which is core to our design).
I would fully believe okhttp has the same bug. In fact, I’d expect okhttp to have it before netty.
https://github.com/grpc/grpc-java/pull/4222 contains a fix for this problem: it changes the Netty client to send a reset stream when it receives an end-of-stream from the server without having already half-closed. There are more details in the comments on #4222, but sending the reset stream frees the stream resources on the client, and receipt of the reset stream frees the stream resources on the server.
Since we can’t assume all clients will be updated to send the reset stream, I’ll need to send out another PR to let the server release stream resources even without this behavior change in the client. But the client updates in #4222 alone are enough to solve this problem, at least as far as I’ve been able to reproduce and test it.
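For completeness, the generated async stub avoids this pattern entirely: calling onCompleted() on the request observer half-closes the underlying ClientCall, so the stream is retired on both sides once the server finishes. A minimal sketch against the proto above (stub and observer names follow standard grpc-java codegen):

// client via generated async stub (sketch)
TestServiceGrpc.TestServiceStub stub = TestServiceGrpc.newStub(channel);
StreamObserver<TestRpc.Request> requests =
    stub.ping(new StreamObserver<TestRpc.Response>() {
      @Override public void onNext(TestRpc.Response response) {}
      @Override public void onError(Throwable t) {}
      @Override public void onCompleted() {}
    });
requests.onNext(TestRpc.Request.newBuilder().build());
requests.onCompleted(); // half-closes the underlying ClientCall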