
Pods are stuck in "ContainerCreating" state when running microk8s behind proxy

See original GitHub issue

After installing microk8s behind a proxy, containers are stuck in “ContainerCreating” state.

Reproduce

To test running microk8s behind a proxy, I set up a Squid proxy on my local machine and started an Ubuntu virtual machine with Vagrant (10.10.1.98 is the local IP of my host):

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu2004"

  config.vm.define "ubuntu" do |ubuntu|
    ubuntu.vm.hostname = "ubuntu"
    config.vm.network "public_network"
  end

  config.proxy.http = "http://10.10.1.98:3128"
  config.proxy.https = "http://10.10.1.98:3128"
  config.proxy.no_proxy = "localhost,127.0.0.1"

  config.vm.provider "virtualbox" do |vb|
    vb.gui = false
    vb.memory = "4096"
  end

  config.vm.provision "shell", path: "init.sh"
end
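
Bringing the VM up is then the usual Vagrant workflow, roughly as follows (note that the config.proxy.* settings rely on the vagrant-proxyconf plugin):

vagrant plugin install vagrant-proxyconf   # provides the config.proxy.* options
vagrant up ubuntu
vagrant ssh ubuntu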

The “init.sh” script configures iptables so that all outbound traffic has to go through the proxy:

echo "Accept 22"
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
sudo iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo iptables -A OUTPUT -p tcp --dport 22 -j ACCEPT
sudo iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
echo "Accept lo"
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A OUTPUT -o lo -j ACCEPT
echo "Drop all"
sudo iptables -P INPUT DROP
sudo iptables -P OUTPUT DROP
echo "Allow to proxy"
sudo iptables -A OUTPUT -d 10.10.1.98 -j ACCEPT
echo "Allow pods to communicate with api"
sudo iptables -P FORWARD ACCEPT
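
A quick way to check that the rules really force everything through the proxy (a rough check using the proxy address from the Vagrantfile): direct egress should fail, while egress via Squid succeeds:

# Direct egress should fail: DNS and outbound traffic are dropped by the OUTPUT policy
curl -m 5 https://snapcraft.io || echo "direct egress blocked"

# Egress through the Squid proxy should work (curl lets the proxy resolve the hostname)
https_proxy=http://10.10.1.98:3128 curl -m 5 -sI https://snapcraft.io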

The installation of microk8s went smoothly, with no errors:

microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    dashboard            # The Kubernetes dashboard
    dns                  # CoreDNS
    ha-cluster           # Configure high availability on the current node
    helm3                # Helm 3 - Kubernetes package manager
    metrics-server       # K8s Metrics Server for API access to service metrics

Following the recommendations in https://microk8s.io/docs/install-proxy, I edited the file /var/snap/microk8s/current/args/containerd-env and then restarted microk8s:

HTTPS_PROXY=http://10.10.1.98:3128
NO_PROXY=10.1.0.0/16,10.152.183.0/24
ulimit -n 65536 || true
ulimit -l 16384 || true
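
For reference, restarting microk8s after editing containerd-env can be done with:

sudo microk8s stop
sudo microk8s start
# (or equivalently: sudo snap restart microk8s)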

The problem

However, pods get stuck in the “ContainerCreating” state. For example, when I start a busybox pod:

$ microk8s kubectl run busybox --image=busybox --restart=Never
$ microk8s kubectl describe po/busybox
Name:         busybox
Namespace:    default
Priority:     0
Node:         ubuntu/10.10.1.105
Start Time:   Wed, 21 Apr 2021 07:53:31 +0000
Labels:       run=busybox
Annotations:  <none>
Status:       Pending
IP:           
IPs:          <none>
Containers:
  busybox:
    Container ID:  
    Image:         busybox
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Args:
      sh
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xd6vb (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-xd6vb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xd6vb
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                From               Message
  ----     ------                  ----               ----               -------
  Normal   Scheduled               17m                default-scheduler  Successfully assigned default/busybox to ubuntu
  Warning  FailedCreatePodSandBox  17m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "96a8ebda2e835f197fa08af300e83c2eb108935426cc45846cb86904fbb4a6d5": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  16m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3c437fa525d79df4464cb757fbbc8f79572c52bc06a548d6809c5328c04f1fda": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  15m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "de39dec3b0ef2edc42ad7e8e27e954b1a82522c50f644ca713e08df9b9c108a6": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  15m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "737ba67fca5cb1d561073308816bce3be7cc8c6c88a70722584b0f1bfd49de7d": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  14m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "86a479b8fd35eb3259ad441b323b29b39a2435c097dc03ab29cf2a6c2f8f06ea": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  13m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "acac350f10619d85640756ec94b2a98c228456f06dbfe97130ab56cb323a92fb": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  13m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "ba3e51c5a50c5ada49860198706062d048a70d49b61aad441292a65876e76b27": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  12m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a847fcd68fd4bcff60a933128136b7459cd033a7109715e05f43555397cc6303": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  11m                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "403ff37f42ac74871515287bb7da714e01f12a5a23bf19609b7c9109764f9c84": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout
  Warning  FailedCreatePodSandBox  5s (x16 over 10m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7475ba83a8a8c721c87e6df0e9d5f57b38601a69c71efa7d62b3530f9613d3e9": error getting ClusterInformation: Get https://[10.152.183.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.152.183.1:443: i/o timeout

As soon as I accept output traffic on my default interface (eth0), the container starts:

sudo iptables -A OUTPUT -o eth0 -j ACCEPT

When analyzing the network traffic with Wireshark, I noticed that the virtual machine tried to reach https://10.152.183.1:443 via the proxy (the proxy logs show it returned a 503). From what I understand, this should not happen because of the NO_PROXY variable configured in /var/snap/microk8s/current/args/containerd-env.
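
Two rough checks run on the node help narrow this down (the pgrep pattern below is an assumption about the containerd process name):

# Is the in-cluster API service VIP reachable directly from the node, bypassing the proxy?
curl --noproxy '*' -k -m 5 https://10.152.183.1:443/version

# Did containerd actually pick up the NO_PROXY value from containerd-env?
sudo cat /proc/$(pgrep -f containerd | head -n1)/environ | tr '\0' '\n' | grep -i proxy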

Maybe I missed a step?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
polyedre commented, May 7, 2021

Update on this:

I started from scratch with a new virtual machine, and instead of using iptables rules to restrict internet access, I attached the VM to a private network connected only to the machine hosting the proxy service.

I just needed to set the variables in /var/snap/microk8s/current/args/containerd-env. I added localhost,127.0.0.1 and the IP used by the node (obtained from microk8s kubectl get node -o jsonpath='{..address}').

HTTPS_PROXY=http://X.X.X.X:3128
NO_PROXY=localhost,127.0.0.1,10.1.0.0/16,10.152.183.0/24,Y.Y.Y.Y

This works perfectly!
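
For anyone reproducing this, the node IP (the Y.Y.Y.Y above) can be extracted with a jsonpath filter along these lines (a sketch; it assumes a single node reporting an InternalIP):

microk8s kubectl get node -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'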

0 reactions
polyedre commented, Apr 21, 2021

I do not think that the iptables rules are forcing the kubelet to use the proxy. Here are the relevant iptables rules:

:INPUT DROP [621:68575]
:FORWARD ACCEPT [0:0]
:OUTPUT DROP [3498:209880]
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A FORWARD -s 10.1.0.0/16 -m comment --comment "generated for MicroK8s pods" -j ACCEPT
-A FORWARD -d 10.1.0.0/16 -m comment --comment "generated for MicroK8s pods" -j ACCEPT
-A OUTPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -d 10.10.1.98/32 -j ACCEPT

And the variables http_proxy and https_proxy were already configured in the file /etc/environment:

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
HTTP_PROXY="http://10.10.1.98:3128"
http_proxy="http://10.10.1.98:3128"

HTTPS_PROXY="http://10.10.1.98:3128"
https_proxy="http://10.10.1.98:3128"

NO_PROXY="localhost,127.0.0.1,10.152.183.1,10.152.183.2,[REDACTED],10.152.183.254,10.152.183.255"
no_proxy="localhost,127.0.0.1,10.152.183.1,10.152.183.2,[REDACTED],10.152.183.254,10.152.183.255"

It is strange, because when I accept all output through the eth0 interface (the default route), the cluster pulls the images through the proxy and all network packets from the VM go through the proxy.

