Best practices for configuring nodes with two network interfaces

Hi. I’ve tried microk8s, and it worked like a charm on AWS. Then I switched to dedicated servers, which have two network interfaces. With a single-node cluster, everything works as it should. But after adding another node, I’ve started getting intermittent errors like the ones below:

  Normal   Scheduled               4m36s  default-scheduler  Successfully assigned default/pod-using-nfs to gphotos-k8s-1
  Warning  FailedCreatePodSandBox  4m5s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5377466d8cbd8e69533fd050f4e37087791367b044c377599138956e7b9ab1a2": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  3m24s  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "90494effa859015a65e4ae2937913faa47e200d9cb310ea76cf82970fef45f92": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  2m43s  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "04bdcc98491e8a8ddcb45f588d470b16d4ac49e244d5cff61fed9079490fb375": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  117s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2f61a1974a74309ce1fc1efba82640af2824ada98503caebb226ecada46645dc": Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/ippools: dial tcp 10.152.183.1:443: i/o timeout
  Normal   Pulling                 105s   kubelet            Pulling image "alpine"
  Normal   Pulled                  105s   kubelet            Successfully pulled image "alpine" in 406.538693ms
  Normal   Created                 105s   kubelet            Created container app
  Normal   Started                 105s   kubelet            Started container app

What could be the cause? Attached: inspection-report-20210323_221244.tar.gz

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8

Top GitHub Comments

4 reactions
mjuarez commented, Nov 8, 2021

Wanted to capture this here for anyone else who stumbles into similar issues or finds that the above fixes don’t work for their use case.

My deployment is on-prem with dual interfaces and ufw. The host’s default interface is on a public subnet; the second interface is on a private subnet.

I put in the suggested ufw rules from the “My dns and dashboard pods are CrashLooping” and “My pods can’t reach the internet or each other (but my MicroK8s host machine can)” troubleshooting sections, but was still hitting the dial tcp 10.152.183.1:443: i/o timeout errors shown above.

In my case, the problem was the kube-apiserver args. microk8s doesn’t define --advertise-address, and when that flag is unset the API server falls back to --bind-address; if that is also unspecified, it uses the host’s default interface (a public IP in my case).
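A minimal sketch for checking this on a node, assuming the args file path mentioned in this thread. The helper name advertise_address_of is hypothetical and not part of MicroK8s; it only reads the file and prints the configured value, or "unset" when the flag is missing.

```shell
#!/usr/bin/env bash
# Sketch: report the --advertise-address configured in a kube-apiserver args file.
# advertise_address_of is a hypothetical helper, not part of MicroK8s.
advertise_address_of() {
  local args_file="$1" value
  # If the flag appears more than once, report the last occurrence (a choice made here).
  value="$(sed -n 's/^--advertise-address=//p' "$args_file" | tail -n 1)"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
  else
    printf '%s\n' "unset"
  fi
}

# On a node, compare the result with the host's default route and with the
# endpoint the API server currently advertises:
#   advertise_address_of /var/snap/microk8s/current/args/kube-apiserver
#   ip -4 route show default
#   microk8s kubectl get endpoints kubernetes
```

If the helper prints "unset" and the kubernetes endpoint shows the public IP, the node is likely in the situation described above.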

Even after adding a ufw rule to allow traffic from the private subnet to the public address, I’d still get occasional timeouts.

All I did was add --advertise-address=<my-private-subnet-ip> to /var/snap/microk8s/current/args/kube-apiserver and things worked like a charm!
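The change above could be scripted roughly as follows. This is a sketch, not the commenter's exact procedure: set_advertise_address is a hypothetical helper, 10.0.0.11 is a placeholder private IP, and the restart step is an assumption (the comment does not say how the new flag was picked up).

```shell
#!/usr/bin/env bash
# Sketch: idempotently set --advertise-address in a kube-apiserver args file.
# set_advertise_address is a hypothetical helper, not part of MicroK8s.
set_advertise_address() {
  local args_file="$1" ip="$2" tmp
  tmp="$(mktemp)"
  # Keep every line except an existing --advertise-address entry.
  grep -v -- '^--advertise-address=' "$args_file" > "$tmp" || true
  # Append the new value on its own line.
  printf '%s\n' "--advertise-address=${ip}" >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$args_file"
  rm -f "$tmp"
}

# Usage on each node, run as root (10.0.0.11 is a placeholder):
#   set_advertise_address /var/snap/microk8s/current/args/kube-apiserver 10.0.0.11
# Assumption: MicroK8s needs a restart to pick up the changed args:
#   microk8s stop && microk8s start
```

Running the helper twice leaves a single --advertise-address line, so it is safe to re-run after changing the IP.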

Hope this helps someone else!

1 reaction
matt-deboer commented, Aug 9, 2022

@mjuarez I’m in the same situation as you (default interface is public, cluster interface is private), and I tried your solution. After setting --advertise-address=<my-private-ip> in the kube-apiserver args on all nodes and then creating a cluster, the other nodes appear to join successfully (the join completes with no error). However, running kubectl on the other nodes yields:

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "10.152.183.1")

and the other nodes still do not appear in the node list on the primary node, even after a long wait.

I did notice that ca.crt had been mirrored from the primary to the joined node, so I tried to refresh server.crt on the joined node, which yielded:

req: Error on line 1 of config file "/var/snap/microk8s/3597/certs/csr.conf"
unable to find 'distinguished_name' in config
problems making Certificate Request

It turned out that microk8s.refresh-certs was using csr.conf for some reason (which just contains the text changeme) instead of csr.conf.rendered. So, for a successful workaround, I ran:

  • sudo cp /var/snap/microk8s/current/certs/csr.conf.rendered /var/snap/microk8s/current/certs/csr.conf
  • sudo microk8s refresh-certs -e server.crt && sudo microk8s refresh-certs -e front-proxy-client.crt

After this, the partially joined nodes were visible to the cluster.
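The copy step of this workaround can be guarded so it only overwrites a csr.conf that is still a placeholder. The sketch below assumes the certs directory named in this thread; restore_csr_conf is a hypothetical helper, not part of MicroK8s, and the backup file name csr.conf.bak is chosen here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: replace a placeholder csr.conf with csr.conf.rendered, keeping a backup.
# restore_csr_conf is a hypothetical helper, not part of MicroK8s.
restore_csr_conf() {
  local certs_dir="$1"
  local conf="${certs_dir}/csr.conf"
  local rendered="${certs_dir}/csr.conf.rendered"
  # Without a rendered file there is nothing to copy from.
  [ -f "$rendered" ] || return 1
  # Leave csr.conf alone if it already contains a distinguished_name setting.
  if [ -f "$conf" ] && grep -q 'distinguished_name' "$conf"; then
    return 0
  fi
  # Back up the placeholder before overwriting it.
  if [ -f "$conf" ]; then
    cp "$conf" "${conf}.bak"
  fi
  cp "$rendered" "$conf"
}

# Usage on the joined node, run as root:
#   restore_csr_conf /var/snap/microk8s/current/certs
#   microk8s refresh-certs -e server.crt && microk8s refresh-certs -e front-proxy-client.crt
```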

Note that I had tried temporarily renaming /var/snap/microk8s/current/var/lock/no-cert-reissue (and restarting snap.microk8s.daemon-apiserver-kicker) to no avail.

My context: Ubuntu 22.04, microk8s 1.24/stable
