Splunk-connect consuming lot of API Server resources | 98% of the API connections are consumed by Fluentd
Hello Team,
We were using splunk-connect 1.0.1 and it was working fine. We have now upgraded to splunk-connect 1.4.0 and the number of API connections has increased dramatically. We see two issues in our clusters as a result:
- The API server is very busy processing these API calls and is consuming a lot of resources, even with a light workload in the cluster.
- We have audit logging enabled, and the audit log fills up very quickly; 98% of the events come from the Splunk pods.
We need to understand why fluentd is opening so many connections and consuming so many resources. We understand it uses the watch API, but how can we minimize the consumption?
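As a rough way to quantify which client dominates the API traffic, something along these lines can be run on a control-plane node. This is a sketch only: it assumes JSON-formatted audit logs, and the log path shown is an assumption that varies by distribution.

```sh
# Tally audit events per authenticated user to see which client dominates API traffic.
# Assumption: audit logging writes JSON lines to this path; adjust for your distribution.
AUDIT_LOG=/var/log/kube-apiserver/audit.log

jq -r '.user.username' "$AUDIT_LOG" | sort | uniq -c | sort -rn | head

# Count only the pod-watch requests, broken down by user, since those are the
# calls showing up repeatedly in the audit log.
jq -r 'select(.requestURI | test("^/api/v1/watch/pods")) | .user.username' "$AUDIT_LOG" \
  | sort | uniq -c | sort -rn
```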
What happened: As above, the upgrade consumes CPU and memory resources on the API server, along with disk space for audit logs.
What you expected to happen: We expected the upgrade to run smoothly, without putting significant stress on the API server.
How to reproduce it (as minimally and precisely as possible): Install splunk-connect 1.4.0 on OpenShift 3.11.x / Docker Enterprise with Kubernetes v1.14.
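For reference, a Helm-based install along these lines should reproduce the setup. The chart repository URL, release name, and values file are assumptions and may differ for your environment.

```sh
# Sketch of a Helm-based install of Splunk Connect for Kubernetes 1.4.0.
# Assumptions: the chart is published at this repository URL, and values.yaml
# holds your HEC endpoint/token and other cluster-specific settings.
# (Helm 3 syntax shown; with Helm 2 use "helm install --name splunk-connect ...")
helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/
helm repo update
helm install splunk-connect splunk/splunk-connect-for-kubernetes \
  --version 1.4.0 -f values.yaml
```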
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): v1.14.8
- Ruby version (use `ruby --version`):
- OS (e.g. `cat /etc/os-release`): CentOS 7.7
- Splunk version: 7.x
- Others:
Thanks @matthewmodestino - it’s working very well now!
We are also seeing this occur: every few days the CPU utilisation on the control plane nodes (API server) increases to 100%. Restarting the splunk-connect logging pods resolves the issue for another few days.
During the periods of high API-server usage, the audit logs show a large number of repeated queries to
/api/v1/watch/pods?fieldSelector=spec.nodeName%3Dus-prod-kubewrk-001.atl01.example.org&resourceVersion=78464987
for each of the 6 worker nodes. These requests appear to be made at a rate of approximately 1600 requests/min on each of the 3 API nodes.
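For anyone else hitting this, the temporary workaround described above (restarting the splunk-connect logging pods) can be done as shown below. The namespace and label selector are assumptions and depend on how the chart was installed in your cluster.

```sh
# Temporary workaround mentioned above: delete the splunk-connect logging pods so
# the DaemonSet recreates them and the API connections are re-established.
# Assumptions: the release runs in the "splunk-connect" namespace and the logging
# pods carry the "app=splunk-kubernetes-logging" label; adjust both as needed.
kubectl -n splunk-connect delete pods -l app=splunk-kubernetes-logging
```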