Linkerd cancels long-lasting grpc requests

Issue Type:

  • Bug report
  • Feature request

What happened: I updated Linkerd from version 1.6.1 to version 1.6.2, and since then Linkerd cancels long-lasting gRPC requests (the request takes 15 seconds). My client receives the gRPC Canceled code: Error: rpc error: code = Canceled desc = stream terminated by RST_STREAM with error code: CANCEL. The gRPC service does not cancel this call. The client and the service are written in Go and use the grpc v1.24.0 module.

The Linkerd 1.6.2 changelog does not mention anything that could cause this problem. I tried updating Linkerd to the latest version, 1.7.0, but the problem is still there. Trace log:

 W 0930 10:23:12.838 UTC THREAD28 TraceId:7835fadfedce8425: Exception propagated to the default monitor (upstream address: /10.240.0.5:57834, downstream address: /10.244.3.226:10001, label: #/io.l5d.k8s/prod/grpc/eventhub-inserter-v823). 
 Reset.Cancel 
  
 E 0930 10:23:12.872 UTC THREAD34: [S L:/10.244.4.189:4143 R:/10.240.0.5:57834] dispatcher failed 
 com.twitter.finagle.ChannelClosedException: ChannelException at remote address: /10.240.0.5:57834. Remote Info: Not Available 
 at com.twitter.finagle.netty4.transport.ChannelTransport$$anon$2.channelInactive(ChannelTransport.scala:175) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) 
 at com.twitter.finagle.netty4.channel.ChannelRequestStatsHandler.channelInactive(ChannelRequestStatsHandler.scala:41) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390) 
 at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355) 
 at io.netty.handler.codec.http2.Http2ConnectionHandler.channelInactive(Http2ConnectionHandler.java:427) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) 
 at com.twitter.finagle.netty4.channel.ChannelStatsHandler.channelInactive(ChannelStatsHandler.scala:229) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:390) 
 at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:355) 
 at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1050) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) 
 at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) 
 at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) 
 at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) 
 at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) 
 at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) 
 at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) 
 at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:335) 
 at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) 
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at com.twitter.finagle.util.BlockingTimeTrackingThreadFactory$$anon$1.run(BlockingTimeTrackingThreadFactory.scala:23) 
 at io.netty.util.concurrent.FastThreadLocalRu
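
To see which hop is involved, it can help to pull the addresses out of the warning line. In the log above, the upstream address in the warning (10.240.0.5:57834) is the same as the remote address (R:) in the "dispatcher failed" line. The sketch below is only an illustration: `parseHop` and the `hop` type are hypothetical names, and the regular expression assumes the warning format shown in this log rather than any documented Linkerd log format.

```go
package main

import (
	"fmt"
	"regexp"
)

// monitorLine matches the "Exception propagated to the default monitor"
// warning as it appears in the log above (format assumed from that one line).
var monitorLine = regexp.MustCompile(
	`upstream address: /([0-9.]+:\d+), downstream address: /([0-9.]+:\d+), label: ([^)]+)\)`)

// hop holds the fields extracted from one warning line.
type hop struct {
	Upstream   string
	Downstream string
	Label      string
}

// parseHop returns the addresses and label from a warning line,
// or false if the line does not match the assumed format.
func parseHop(line string) (hop, bool) {
	m := monitorLine.FindStringSubmatch(line)
	if m == nil {
		return hop{}, false
	}
	return hop{Upstream: m[1], Downstream: m[2], Label: m[3]}, true
}

func main() {
	line := "W 0930 10:23:12.838 UTC THREAD28 TraceId:7835fadfedce8425: " +
		"Exception propagated to the default monitor (upstream address: /10.240.0.5:57834, " +
		"downstream address: /10.244.3.226:10001, label: #/io.l5d.k8s/prod/grpc/eventhub-inserter-v823)."
	if h, ok := parseHop(line); ok {
		fmt.Println(h.Upstream, h.Downstream, h.Label)
	}
}
```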

What you expected to happen: Linkerd should not throttle or cancel long-lasting gRPC requests.

Environment:

  • linkerd/namerd version, config files: 1.6.2
  • Platform, version, and config files (Kubernetes, DC/OS, etc): Kubernetes v1.14.6
  • Cloud provider or hardware configuration: AKS

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 13 (8 by maintainers)

Top GitHub Comments

zaharidichev commented, Oct 7, 2019 (1 reaction)

So yes, I actually just exposed this conf parameter in this branch, so it's not in the latest release yet. Once this PR gets merged (hopefully) and we release a new version, it will be present in the docs.

PeterUherek commented, Oct 7, 2019 (1 reaction)

"One question that I have: is this an l5d-to-l5d request that is failing?"

It looks like the connection between l5d and the server is failing.

"Do both your server and client have a linkerd proxy running in front of them?"

Yes, the client and the server both have a linkerd proxy running in front of them.

"I assume the situation does not change when you take namerd out of the equation?"

Yes, it does not change.
