Server dies on keep-alive ack timeout
The RSocket server dies on a keep-alive ack timeout. I’ve tried adding onErrorResume, but to no avail. How can I prevent the server from closing its socket on error?
Error
[2019-05-20 11:12:21.438] ERROR [parallel-1] RegistryRSocketServer: Error occurred during session
io.rsocket.exceptions.ConnectionErrorException: No keep-alive acks for 60000 ms
at io.rsocket.keepalive.KeepAliveConnection.lambda$startKeepAlives$1(KeepAliveConnection.java:97)
at reactor.core.publisher.LambdaMonoSubscriber.onNext(LambdaMonoSubscriber.java:137)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1476)
at reactor.core.publisher.MonoProcessor.onNext(MonoProcessor.java:389)
at io.rsocket.keepalive.KeepAliveHandler.doCheckTimeout(KeepAliveHandler.java:112)
at io.rsocket.keepalive.KeepAliveHandler$Server.onIntervalTick(KeepAliveHandler.java:128)
at io.rsocket.keepalive.KeepAliveHandler.lambda$start$0(KeepAliveHandler.java:63)
at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:130)
at reactor.core.publisher.FluxInterval$IntervalRunnable.run(FluxInterval.java:123)
at reactor.core.scheduler.PeriodicWorkerTask.call(PeriodicWorkerTask.java:59)
at reactor.core.scheduler.PeriodicWorkerTask.run(PeriodicWorkerTask.java:73)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Version: 0.12.2-RC2
Server Code
server = RSocketFactory
    .receive()
    .frameDecoder(ZERO_COPY)
    .addConnectionPlugin(micrometerDuplexConnectionInterceptor)
    .errorConsumer(e -> log.error("Error occurred during session", e))
    .acceptor(socketAcceptor)
    .transport(serverTransport)
    .start()
    .onErrorResume(e -> Mono.empty())
    .subscribe();
Acceptor
@Override
public Mono<RSocket> accept(ConnectionSetupPayload connectionSetupPayload, RSocket rSocket) {
    return Mono.just(new RegistryRSocket(scheduler));
}
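If the goal is to react per dropped connection (log it, clean up, re-register) rather than to keep the whole server alive, note that the rSocket handed to the acceptor is the requester for that peer, and it exposes onClose(). A sketch along those lines:

// Sketch: observing a single connection's lifecycle. In the intended design,
// onClose() terminates when this one connection drops (e.g. on a keep-alive
// timeout) while other sessions and the listener keep running.
@Override
public Mono<RSocket> accept(ConnectionSetupPayload setup, RSocket rSocket) {
    rSocket.onClose()
        .doFinally(signal -> log.info("Client connection closed: {}", signal))
        .subscribe();
    return Mono.just(new RegistryRSocket(scheduler));
}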
requestStream
return Mono.just(payload)
    .map(this::getRequestFromPayload)
    .flux()
    .map(/*..Does Something..*/)
    .onErrorResume(throwable ->
        Flux.just(createPayloadFromThrowable(throwable)));

private Payload createPayloadFromThrowable(Throwable t) {
    return ByteBufPayload.create(ErrorFrameFlyweight.encode(DEFAULT, 0, t));
}
Any help would be greatly appreciated.
Issue Analytics
- State: Closed
- Created: 4 years ago
- Comments: 27 (12 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@mostroverkhov It seems to be much, much more resilient now. I actually managed to run out of ENIs in Amazon, so I haven’t been able to do all the testing I’ve wanted so far. I’ll report back on the issue tomorrow and let you know whether it resolves it or not; it seems like it does. Thank you for your help.
I’ve upgraded the instance I was testing on, and it seems remarkably stable. I am fetching gigabytes of data over the socket with no problems, except that heavy load occasionally causes keep-alives to not go through, killing the connection. (This is probably my own fault.) I am closing this ticket, as the most recent snapshot fixes my problems. I’m very much looking forward to it being released, as I can’t live without it!
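For anyone hitting the residual keep-alive drops under heavy load mentioned above: per the RSocket protocol, the keep-alive interval and timeout the server enforces come from the values the client sends in its SETUP frame, so the budget is widened on the client side. A sketch assuming the 0.12.x ClientRSocketFactory setters (names changed in later releases, e.g. RSocketConnector.keepAlive(...) in 1.x) and a hypothetical clientTransport:

// Sketch: widening the client's keep-alive budget (0.12.x API assumption;
// clientTransport is hypothetical). The server derives its timeout window
// from these values, carried in the SETUP frame.
RSocket client = RSocketFactory
    .connect()
    .keepAliveTickPeriod(Duration.ofSeconds(30))   // send a KEEPALIVE every 30s
    .keepAliveAckTimeout(Duration.ofSeconds(120))  // tolerate 120s without an ack
    .transport(clientTransport)
    .start()
    .block();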