Inaccessible pods on other nodes for high availability cluster

I created a 3-node cluster on EC2 and wanted to launch a generic application to make sure everything is accessible. I installed MicroK8s on each machine and joined them into an HA cluster. When I launched the microbot deployment from the Ubuntu tutorial, each machine could only access its own pod.

When running microk8s kubectl get pods -o wide, I get the following:

NAME                        READY   STATUS    RESTARTS   AGE    IP            NODE               NOMINATED NODE   READINESS GATES
microbot-5f5499d479-ngz56   1/1     Running   0          179m   10.1.94.76    ip-172-31-18-128   <none>           <none>
microbot-5f5499d479-nkctv   1/1     Running   1          175m   10.1.162.72   ip-172-31-21-37    <none>           <none>
microbot-5f5499d479-zkjcn   1/1     Running   1          175m   10.1.162.73   ip-172-31-21-37    <none>           <none>

This is with the deployment scaled to 3.

If I curl from the machine ending in 128, I have a 1/3 chance of hitting its own pod and a 2/3 chance of hitting a pod on the machine ending in 37. On my third machine, which runs no pod, curl always hangs: it has to reach one of the other two machines and does not seem to be able to.
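One way to narrow this down is to bypass the service and curl each pod IP directly from every node, which shows which node pairs cannot reach each other. The sketch below is only an illustration: pod_ips and probe_pods are hypothetical helper names, and port 80 is an assumption about where microbot listens.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -o wide` output on stdin
# and print the value of the IP column for each pod.
pod_ips() {
    awk '
        NR == 1 {
            # Locate the IP column from the header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "IP") col = i
            }
            next
        }
        col && NF >= col { print $col }
    '
}

# Hypothetical helper: read pod IPs on stdin and try each one directly.
# Port 80 is an assumption; adjust it to the container port in use.
probe_pods() {
    while read -r ip; do
        if curl --silent --output /dev/null --max-time 3 "http://$ip:80/"; then
            echo "$ip reachable"
        else
            echo "$ip UNREACHABLE"
        fi
    done
}

# Example (run on each node in turn):
#   microk8s kubectl get pods -o wide | pod_ips | probe_pods
```

If pods on the local node report as reachable while pods on other nodes time out, that would point at cross-node pod networking rather than at the application itself.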

ufw is disabled, and I’ve tried running

sudo iptables -P FORWARD ACCEPT
sudo apt-get install iptables-persistent

on each machine, to no avail. I can reach the machines on other services fine. I enabled ingress without a service, and each node returns a 404 error, so traffic can clearly be routed to them.

I’ve attached the inspection logs: inspection-report-20210112_132731.tar.gz

Issue Analytics

State:
Created 3 years ago
Comments: 7 (1 by maintainers)

Top GitHub Comments

2 reactions

rockaut commented, Jan 14, 2021

I also had problems on my Ubuntu cluster. For me it turned out to be a problem with net_bridge, so I had to enable the kernel modules and the sysctl settings.

Added to /etc/modules-load.d/modules.conf:

overlay
br_netfilter
bridge

and to /etc/sysctl.conf:

net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1

Rebooted and all worked.
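For anyone who wants to check a node against this list before rebooting, here is a minimal sketch. missing_sysctl_keys is a hypothetical helper, not part of MicroK8s; it only inspects a config file and does not read the live kernel values.

```shell
#!/bin/sh
# Hypothetical helper: print each of the three settings listed above that
# is NOT set to 1 in the given sysctl-style file. Commented-out lines are
# treated as missing.
missing_sysctl_keys() {
    conf_file="$1"
    for key in net.bridge.bridge-nf-call-ip6tables \
               net.ipv4.ip_forward \
               net.bridge.bridge-nf-call-iptables; do
        if ! awk -F= -v key="$key" '
                {
                    # Strip whitespace around the key and the value.
                    gsub(/[[:space:]]/, "", $1)
                    gsub(/[[:space:]]/, "", $2)
                }
                $1 == key && $2 == "1" { found = 1 }
                END { exit !found }
            ' "$conf_file" 2>/dev/null; then
            echo "$key"
        fi
    done
}

# Example:
#   missing_sysctl_keys /etc/sysctl.conf
```

To inspect the running kernel instead, sysctl net.ipv4.ip_forward and lsmod | grep br_netfilter are the usual commands. The net.bridge.* keys only exist once br_netfilter is loaded. Running sudo modprobe br_netfilter followed by sudo sysctl -p may apply the settings without a reboot, though the comment above only confirms the reboot path.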

I previously had Docker installed and then uninstalled it, and I also played around with CNI and Podman, so something might have been broken along the way by that fiddling.

0 reactions

stale[bot] commented, Dec 17, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
