Workers stuck in ContainerCreating state
I’m not sure how this has happened, but I have a bunch of workers (20 or so) that are stuck in ContainerCreating with no IP assigned. The standard trick of just deleting them (`kubectl delete po`) isn’t working, and adding `--force --grace-period 0` to `kubectl delete po` doesn’t work either. Any suggestions as to how to get rid of these phantom pods?

This is with dask-gateway 0.5.0 on GKE.
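A diagnostic step worth trying first (this suggestion isn’t from the original thread, just standard kubectl): describe one of the stuck pods and check the Events section, which usually names the cause (image pull failure, missing secret or volume, CNI/IP allocation), e.g.:

kubectl describe po dask-gateway-anaconda-worker-092560ec333a4f449b81fc4b41434d1c -n dask-gateway
kubectl get events -n dask-gateway --sort-by=.lastTimestamp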
Here’s the `kubectl get po` output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dask-gateway-anaconda-scheduler-e23bd3b9461842bc90cd84972f5ed4e9 1/1 Running 0 2m19s 10.60.208.8 gke-gartner-dask-2-default-pool-0a3f0337-kp1w <none> <none>
dask-gateway-anaconda-worker-092560ec333a4f449b81fc4b41434d1c 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-rxzr <none> <none>
dask-gateway-anaconda-worker-0d33b613dc894f68bfa39d8493783266 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-9g8p <none> <none>
dask-gateway-anaconda-worker-1029f444f66842aa9ee643c599d3f4f0 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-default-pool-0a3f0337-cl8b <none> <none>
dask-gateway-anaconda-worker-2fe03b1dfce24997b2c2729726f1bc31 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-schv <none> <none>
dask-gateway-anaconda-worker-30fc2943215545ab947b7303149452d5 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-3t2c <none> <none>
dask-gateway-anaconda-worker-332a1c83124240f1928701b4447220cd 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-n0fc <none> <none>
dask-gateway-anaconda-worker-48d0669ccedc498caf8ce7e50353712d 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-cq7b <none> <none>
dask-gateway-anaconda-worker-4c22c12f2db04bb9a2990b036127f5de 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-zv8z <none> <none>
dask-gateway-anaconda-worker-50da68a1611040cb934f9c24520d684b 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-w57r <none> <none>
dask-gateway-anaconda-worker-5696afe163dd42a0bf912077bc17f991 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-l04h <none> <none>
dask-gateway-anaconda-worker-62990c1424ca4f1db00abf6361760e5b 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-schv <none> <none>
dask-gateway-anaconda-worker-6937adc44b214c559ae1f6a4d50c8441 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-qqsd <none> <none>
dask-gateway-anaconda-worker-70ccd6ea870c4fd593d1723921b13226 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-w57r <none> <none>
dask-gateway-anaconda-worker-72951189bb784fbd9b1b7770caac0a49 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-cz3v <none> <none>
dask-gateway-anaconda-worker-781749a34a724ae5b791fb80a51f1f97 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-mpww <none> <none>
dask-gateway-anaconda-worker-815a3050b1fd41cf98b90b64b3f25217 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-default-pool-0a3f0337-cl8b <none> <none>
dask-gateway-anaconda-worker-854cf1edee9d4dd691e2ca9d1380086a 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-n0fc <none> <none>
dask-gateway-anaconda-worker-8e1045f6ef8c4c7baa3998069d8749b5 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-18lp <none> <none>
dask-gateway-anaconda-worker-8e43e9c13e1a475299f2610f9a7e481c 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-9g8p <none> <none>
dask-gateway-anaconda-worker-90805c3872024b0e98bc141a3a8f8270 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-w57r <none> <none>
dask-gateway-anaconda-worker-9438089332cb45ce98b5077d50c8d31a 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-bmv1 <none> <none>
dask-gateway-anaconda-worker-9c1e3170e0d04e85b402c64c2a270466 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-3t2c <none> <none>
dask-gateway-anaconda-worker-a48305d3a9984fc399f41d67ce0c733e 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-v9q2 <none> <none>
dask-gateway-anaconda-worker-b23599394f7443e9b409d6067fb62f29 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-n0fc <none> <none>
dask-gateway-anaconda-worker-b2b89841256c45bbb23dd4f2c8a2f78b 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-default-pool-0a3f0337-n4wx <none> <none>
dask-gateway-anaconda-worker-bf97f8a1a27940eabe14c23cbff2e1a4 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-schv <none> <none>
dask-gateway-anaconda-worker-c0ff5c10146242629633502bdb10136c 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-default-pool-0a3f0337-2lbp <none> <none>
dask-gateway-anaconda-worker-c4a3eccc12794cc8b28cd82b564492b2 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-default-pool-0a3f0337-cl8b <none> <none>
dask-gateway-anaconda-worker-da7a7c9c535a4041a7e2be972ce1b1a1 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-bmv1 <none> <none>
dask-gateway-anaconda-worker-eadd4a757ac54c3b8856fdece6c2158c 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-l04h <none> <none>
dask-gateway-anaconda-worker-f4eeea877fa4413c8c4cadeffe5382c9 0/1 ContainerCreating 0 5h34m <none> gke-gartner-dask-2-user-pool-829c4a5c-v9q2 <none> <none>
gateway-dask-gateway-66565d4c7d-zf2f9 1/1 Running 0 21h 10.60.173.12 gke-gartner-dask-2-user-pool-829c4a5c-bmv1 <none> <none>
scheduler-proxy-dask-gateway-69db6d9bbf-z99bw 1/1 Running 0 20h 10.60.173.19 gke-gartner-dask-2-user-pool-829c4a5c-bmv1 <none> <none>
web-proxy-dask-gateway-69769d57d9-lngkm 1/1 Running 0 22h 10.60.0.185 gke-gartner-dask-2-default-pool-0a3f0337-26wz <none> <none>
`kubectl delete po` fails:
kubectl get po -n dask-gateway | grep worker | cut -d " " -f 1 | xargs kubectl delete po
Error from server (NotFound): pods "dask-gateway-anaconda-worker-092560ec333a4f449b81fc4b41434d1c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-0d33b613dc894f68bfa39d8493783266" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-1029f444f66842aa9ee643c599d3f4f0" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-2fe03b1dfce24997b2c2729726f1bc31" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-30fc2943215545ab947b7303149452d5" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-332a1c83124240f1928701b4447220cd" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-48d0669ccedc498caf8ce7e50353712d" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-4c22c12f2db04bb9a2990b036127f5de" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-50da68a1611040cb934f9c24520d684b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-5696afe163dd42a0bf912077bc17f991" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-62990c1424ca4f1db00abf6361760e5b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-6937adc44b214c559ae1f6a4d50c8441" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-70ccd6ea870c4fd593d1723921b13226" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-72951189bb784fbd9b1b7770caac0a49" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-781749a34a724ae5b791fb80a51f1f97" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-815a3050b1fd41cf98b90b64b3f25217" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-854cf1edee9d4dd691e2ca9d1380086a" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-8e1045f6ef8c4c7baa3998069d8749b5" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-8e43e9c13e1a475299f2610f9a7e481c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-90805c3872024b0e98bc141a3a8f8270" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-9438089332cb45ce98b5077d50c8d31a" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-9c1e3170e0d04e85b402c64c2a270466" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-a48305d3a9984fc399f41d67ce0c733e" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-b23599394f7443e9b409d6067fb62f29" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-b2b89841256c45bbb23dd4f2c8a2f78b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-bf97f8a1a27940eabe14c23cbff2e1a4" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-c0ff5c10146242629633502bdb10136c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-c4a3eccc12794cc8b28cd82b564492b2" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-da7a7c9c535a4041a7e2be972ce1b1a1" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-eadd4a757ac54c3b8856fdece6c2158c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-f4eeea877fa4413c8c4cadeffe5382c9" not found
Being extra aggressive about the delete doesn’t do anything either:
kubectl get po -n dask-gateway | grep worker | cut -d " " -f 1 | xargs kubectl delete --force --grace-period 0 po
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (NotFound): pods "dask-gateway-anaconda-worker-092560ec333a4f449b81fc4b41434d1c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-0d33b613dc894f68bfa39d8493783266" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-1029f444f66842aa9ee643c599d3f4f0" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-2fe03b1dfce24997b2c2729726f1bc31" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-30fc2943215545ab947b7303149452d5" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-332a1c83124240f1928701b4447220cd" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-48d0669ccedc498caf8ce7e50353712d" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-4c22c12f2db04bb9a2990b036127f5de" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-50da68a1611040cb934f9c24520d684b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-5696afe163dd42a0bf912077bc17f991" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-62990c1424ca4f1db00abf6361760e5b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-6937adc44b214c559ae1f6a4d50c8441" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-70ccd6ea870c4fd593d1723921b13226" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-72951189bb784fbd9b1b7770caac0a49" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-781749a34a724ae5b791fb80a51f1f97" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-815a3050b1fd41cf98b90b64b3f25217" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-854cf1edee9d4dd691e2ca9d1380086a" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-8e1045f6ef8c4c7baa3998069d8749b5" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-8e43e9c13e1a475299f2610f9a7e481c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-90805c3872024b0e98bc141a3a8f8270" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-9438089332cb45ce98b5077d50c8d31a" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-9c1e3170e0d04e85b402c64c2a270466" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-a48305d3a9984fc399f41d67ce0c733e" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-b23599394f7443e9b409d6067fb62f29" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-b2b89841256c45bbb23dd4f2c8a2f78b" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-bf97f8a1a27940eabe14c23cbff2e1a4" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-c0ff5c10146242629633502bdb10136c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-c4a3eccc12794cc8b28cd82b564492b2" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-da7a7c9c535a4041a7e2be972ce1b1a1" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-eadd4a757ac54c3b8856fdece6c2158c" not found
Error from server (NotFound): pods "dask-gateway-anaconda-worker-f4eeea877fa4413c8c4cadeffe5382c9" not found
Top GitHub Comments
If I find a way to reproduce this then I’ll update the bug report.
I could reproduce this state with the following `cluster_options`, i.e. the worker memory config was tiny.
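The options themselves are not shown above, so here is only a hypothetical sketch of that kind of reproduction using the dask-gateway client API. The gateway address is a placeholder, and the `worker_memory` option name assumes the deployment exposes worker memory as a cluster option:

from dask_gateway import Gateway

# Hypothetical repro sketch; the original cluster_options were not preserved.
# Assumes the gateway exposes a "worker_memory" option (value in GiB).
gateway = Gateway("http://<gateway-address>")
options = gateway.cluster_options()
options.worker_memory = 0.1  # deliberately tiny, so workers stall in ContainerCreating
cluster = gateway.new_cluster(options)
cluster.scale(20)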