Best practices for configuring nodes with two network interfaces
Hi. I've tried MicroK8s, and it worked like a charm on AWS. Then I switched to dedicated servers, which have two network interfaces. As a single-node cluster it all works as it should, but after adding another node I started intermittently getting errors like the following:
```
Normal   Scheduled               4m36s  default-scheduler  Successfully assigned default/pod-using-nfs to gphotos-k8s-1
Warning  FailedCreatePodSandBox  4m5s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5377466d8cbd8e69533fd050f4e37087791367b044c377599138956e7b9ab1a2": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
Warning  FailedCreatePodSandBox  3m24s  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "90494effa859015a65e4ae2937913faa47e200d9cb310ea76cf82970fef45f92": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
Warning  FailedCreatePodSandBox  2m43s  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "04bdcc98491e8a8ddcb45f588d470b16d4ac49e244d5cff61fed9079490fb375": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
Warning  FailedCreatePodSandBox  117s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2f61a1974a74309ce1fc1efba82640af2824ada98503caebb226ecada46645dc": Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/ippools: dial tcp 10.152.183.1:443: i/o timeout
Normal   Pulling                 105s   kubelet            Pulling image "alpine"
Normal   Pulled                  105s   kubelet            Successfully pulled image "alpine" in 406.538693ms
Normal   Created                 105s   kubelet            Created container app
Normal   Started                 105s   kubelet            Started container app
```
What could be the cause? Inspection report attached: inspection-report-20210323_221244.tar.gz
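The timeouts above mean the node cannot reach the kubernetes service ClusterIP (`10.152.183.1`), which Calico uses to query the API server. A quick diagnostic sketch from the failing node (assuming `nc` is installed; `10.152.183.1` is the default MicroK8s service address, so adjust if yours is customised):

```shell
# Check whether the kubernetes service ClusterIP is reachable from this node.
nc -vz -w 5 10.152.183.1 443

# Show which route/source IP the kernel would use to reach it --
# on a dual-homed host this often reveals traffic leaving the wrong interface.
ip route get 10.152.183.1
```

If `nc` times out while the same check succeeds on the control-plane node, the problem is almost certainly routing or firewalling between the two interfaces rather than Kubernetes itself.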
Issue created 2 years ago · 8 comments
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Wanted to capture this here for anyone else stumbling into similar issues, or in case the above fixes don't work for their use case.

My deployment is on-prem with dual interfaces and ufw. The host's default interface is configured for a public subnet; the second interface is on a private subnet.

I put in the suggested ufw rules captured here under the "My dns and dashboard pods are CrashLooping" and "My pods can't reach the internet or each other (but my MicroK8s host machine can)" sections, but was still hitting the same `dial tcp 10.152.183.1:443: i/o timeout` errors.

In my case, it was an issue with the kube-apiserver args. MicroK8s doesn't set `--advertise-address`, and when left unset it defaults to `--bind-address`, which in turn defaults to the host's default interface (a public IP in my case). Even after adding a ufw rule to allow the private subnet in to the public address, I'd still get occasional timeouts.

All I did was add `--advertise-address=<my-private-subnet-ip>` to `/var/snap/microk8s/current/args/kube-apiserver` and things worked like a charm! Hope this helps someone else!
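As a sketch of the fix described above (the private IP `10.0.0.11` is a placeholder, so substitute your node's own; the args path is the standard MicroK8s location, and a stop/start is needed for the flag to take effect):

```shell
# Hypothetical private IP for this node -- replace with your own.
PRIVATE_IP=10.0.0.11

# Append the advertise-address flag to the kube-apiserver args file.
echo "--advertise-address=${PRIVATE_IP}" | \
  sudo tee -a /var/snap/microk8s/current/args/kube-apiserver

# Restart MicroK8s so the API server picks up the new flag.
sudo microk8s stop && sudo microk8s start

# Confirm the flag is present in the args file.
grep advertise-address /var/snap/microk8s/current/args/kube-apiserver
```

Repeat on each node with that node's own private IP before forming the cluster.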
@mjuarez I'm in the same situation as you (default interface is public, cluster interface is private) and attempted your solution, but I noticed that after setting `--advertise-address=<my-private-ip>` in all nodes' kube-apiserver args and then attempting to create a cluster, the other nodes seem to join successfully (the join completes with no error), yet attempting to access `kubectl` from those nodes yields:

`Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "10.152.183.1")`

and the other nodes never appear in the primary node's node list, even after a long wait.

I did notice that the ca.crt was mirrored over to the joined node from the primary, so I then tried to refresh `server.crt` on the joined node, which yielded:

`req: Error on line 1 of config file "/var/snap/microk8s/3597/certs/csr.conf" unable to find 'distinguished_name' in config problems making Certificate Request`

So I noticed that `microk8s.refresh-certs` was for some reason using csr.conf (which just contains the text `changeme`) instead of csr.conf.rendered. For a successful workaround, I ran:

```shell
sudo cp /var/snap/microk8s/current/certs/csr.conf.rendered /var/snap/microk8s/current/certs/csr.conf
sudo microk8s refresh-certs -e server.crt && sudo microk8s refresh-certs -e front-proxy-client.crt
```

After this, the partially joined nodes were visible to the cluster.
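One way to sanity-check the refreshed certificate (a hedged sketch; `openssl` 1.1.1+ is assumed, and the path is the standard MicroK8s certs directory) is to confirm the server certificate's Subject Alternative Names now include the advertised private IP:

```shell
# Print the SANs of the MicroK8s server certificate; the advertised
# private IP and the service IP 10.152.183.1 should both be listed.
sudo openssl x509 -in /var/snap/microk8s/current/certs/server.crt \
  -noout -ext subjectAltName
```

If the private IP is missing from the SAN list, the x509 error above will recur, since clients connecting via `--advertise-address` validate the certificate against that IP.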
Note that I had tried temporarily renaming `/var/snap/microk8s/current/var/lock/no-cert-reissue` (and restarting `snap.microk8s.daemon-apiserver-kicker`) to no avail.

My context: Ubuntu 22.04, MicroK8s 1.24/stable.