Linkerd pods keep getting restarted

Issue Type: Linkerd daemon set keeps restarting, causing application failures

What happened: We occasionally see linkerd pods getting restarted. For example, this pod has restarted 95 times in 3 days:

NAME        READY     STATUS    RESTARTS   AGE
l5d-****   1/1       Running   95         3d

When I look at the logs, it looks like the liveness probe is timing out:

  Warning  Unhealthy  26m (x1766 over 3d)  kubelet, ******  Liveness probe failed: Get http://******/admin/ping: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
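The failure above is a client-side timeout on the admin endpoint, so one way to narrow it down is to time the same request by hand. The following is a minimal sketch, not part of the original report: `ping_admin` is a hypothetical helper, and the 1-second default mirrors the kubelet's default probe `timeoutSeconds`. It treats any status from 200 to 399 as healthy, which is how an HTTP liveness probe judges a response.

```python
import time
import urllib.request


def ping_admin(url, timeout_s=1.0):
    """Issue one GET and return (healthy, elapsed_seconds).

    Hypothetical helper: any status from 200 to 399 counts as healthy;
    connection errors, timeouts, and HTTP error statuses do not.
    """
    # Ignore proxy settings so the request goes straight to the target.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            healthy = 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        healthy = False
    return healthy, time.monotonic() - start
```

Calling `ping_admin("http://<pod-ip>:<admin-port>/admin/ping")` in a loop from the same node (the address is a placeholder; 9990 is assumed to be the admin port unless the config overrides it) shows whether response times approach the probe timeout while the pod is under load.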

Environment:

  • linkerd/namerd version, config files: we are running version 1.6.0-openj9-experimental
  • Platform, version, and config files (Kubernetes, DC/OS, etc): Kubernetes
  • Cloud provider or hardware configuration: GKE

I verified that CPU and memory usage on the GKE nodes were under the threshold.

On top of this, we did zone resiliency testing. We have a GKE regional cluster spread across 3 zones, and we cordoned and drained the nodes in one zone, so all pods from that zone were rescheduled onto the remaining 2 zones in the cluster. This caused issues for services talking over linkerd: the linkerd pods kept restarting.

Any idea what's going on?

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

1 reaction
AkshathPatkar commented, Jun 18, 2019

OK, I added -XX:-UseBiasedLocking to the LOCAL_JVM_OPTIONS variable. I will update after testing.

https://github.com/linkerd/linkerd/blob/d553ec678e6218c479ac852e49a940944b3c42f3/project/LinkerdBuild.scala#L307-L334

0 reactions
adleong commented, Jul 8, 2019

Closing due to inactivity.
