
[PWK] Default kube-dns dnsmasq configuration only allows 150 concurrent DNS req's

Hi All,

Awesome work with PWK. I tried this myself on top of the existing PWD, but had issues exposing the necessary bits and overlays to satisfy kubeadm. (I noticed this is now CentOS-based, which makes sense… but that’s another conversation for another day!)

I wanted to highlight an issue with the default dnsmasq configuration inside kube-dns: it bombs out at a configured limit of 150 concurrent requests, which leads to a failed health check and a restart. As a result, DNS goes away cluster-wide for 15+ seconds at random times.

It’s pretty easy to reach 150 requests even across a demo cluster, especially since real-world DNS requests get stuck for a little while because there’s no forward resolver configured.
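
For context, dnsmasq exposes this limit through its --dns-forward-max option, whose documented default is 150. The thread above does not say how the limit should be raised, so the following is only a minimal sketch: with_forward_max is a hypothetical helper, the sample argument list is illustrative, and the value 1000 is an arbitrary assumption rather than a recommendation.

```python
def with_forward_max(args, limit):
    """Return a copy of a dnsmasq argument list with --dns-forward-max set.

    Any existing --dns-forward-max entry is dropped first, so the flag
    appears exactly once, at the end of the list.
    """
    prefix = "--dns-forward-max="
    kept = [arg for arg in args if not arg.startswith(prefix)]
    return kept + [f"{prefix}{limit}"]


# Illustrative arguments only; a real kube-dns deployment may differ.
sample_args = ["--cache-size=1000", "--server=/ip6.arpa/127.0.0.1#10053"]
print(with_forward_max(sample_args, 1000))
```

Raising the limit only buys headroom. If queries pile up because no forward resolver is configured, the underlying cause is still there.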

Affected clusters will have the following logs:

kubectl --namespace=kube-system logs <kube-dns-podxyz> dnsmasq

I0809 13:43:47.485776      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)
I0809 13:43:57.500488      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)
I0809 13:44:07.512986      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)

kubectl --namespace=kube-system logs <kube-dns-podxyz> sidecar

ERROR: logging before flag.Parse: W0809 13:40:41.250002       7 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:35703->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W0809 13:40:54.273636       7 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:45732->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W0809 13:41:01.274108       7 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:46985->127.0.0.1:53: i/o timeout

The dnsmasq restart count also keeps increasing in the output of kubectl --namespace=kube-system describe po <kube-dns-podxyz>
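
To check saved log output for this symptom, a small script can count how often the limit message appears and report the configured maximum. This is a rough sketch that assumes the message format matches the dnsmasq log lines quoted above; summarize_limit_hits is a hypothetical helper name.

```python
import re

# Matches the dnsmasq message quoted in the logs above and captures the limit.
LIMIT_RE = re.compile(
    r"Maximum number of concurrent DNS queries reached \(max: (\d+)\)"
)


def summarize_limit_hits(log_text):
    """Return (hit_count, configured_max); configured_max is None without hits."""
    limits = [int(match.group(1)) for match in LIMIT_RE.finditer(log_text)]
    if not limits:
        return 0, None
    return len(limits), limits[-1]


# The three dnsmasq log lines from this issue.
SAMPLE = """\
I0809 13:43:47.485776      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)
I0809 13:43:57.500488      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)
I0809 13:44:07.512986      45 nanny.go:108] dnsmasq[64]: Maximum number of concurrent DNS queries reached (max: 150)
"""

print(summarize_limit_hits(SAMPLE))
```

The function takes plain text, so it can be fed the saved output of the kubectl logs command shown earlier.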

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
jjo commented, Aug 15, 2017

Just for completeness, as per #181, I can work around it to allow external DNS resolution with:

kubectl get deployment --namespace=kube-system kube-dns -oyaml|sed -r 's,(.*--server)=(/ip6.arpa/.*),&\n\1=8.8.8.8,'|kubectl apply -f -
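
To make the sed expression easier to follow: it finds the line carrying the --server=/ip6.arpa/... argument and appends a copy of that line's prefix with --server=8.8.8.8 directly after it. The same substitution can be sketched in Python. The sample manifest fragment is an illustrative assumption, not output captured from a real cluster, and add_upstream_server is a hypothetical name.

```python
import re

# Same pattern as the sed expression: group 1 keeps the indentation and
# "- --server" prefix; the unescaped "." is carried over from the original.
SED_PATTERN = re.compile(r"(.*--server)=(/ip6.arpa/.*)")


def add_upstream_server(manifest, upstream="8.8.8.8"):
    """Insert a '--server=<upstream>' line after the ip6.arpa server line."""
    return SED_PATTERN.sub(
        lambda m: f"{m.group(0)}\n{m.group(1)}={upstream}", manifest
    )


# Illustrative fragment of the dnsmasq container args (assumed layout).
SAMPLE_ARGS = (
    "        - --server=/in-addr.arpa/127.0.0.1#10053\n"
    "        - --server=/ip6.arpa/127.0.0.1#10053\n"
)

print(add_upstream_server(SAMPLE_ARGS))
```

Like the sed one-liner, this is not idempotent: running it on an already patched manifest adds a second 8.8.8.8 line.
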
0 reactions
marcosnils commented, Aug 16, 2017

Hey @luxas, is there anything we can do from the kubeadm perspective to bootstrap kube-dns with this option by default? I haven’t seen anything for that, AFAIK.
