Splunk-connect consuming lot of API Server resources | 98% of the API connections are consumed by Fluentd
Hello Team,
We were using splunk-connect 1.0.1 and it was working fine. We have now upgraded to splunk-connect 1.4.0 and the number of API connections has increased dramatically. We see two issues in our clusters as a result:
- The API server is very busy processing these API calls and is consuming a lot of resources, even with a light workload in the cluster.
- We have audit logging enabled, and the audit log fills up very quickly; 98% of the events come from the Splunk pods.
We need to understand why fluentd is opening so many connections and consuming so many resources. We understand it uses the watch API, but how can we minimize the consumption?
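As a rough way to quantify which client dominates the API traffic, something along these lines can be run on a control-plane node. This is a sketch only: it assumes JSON-formatted audit logs, and the log path shown is an assumption that varies by distribution.

```sh
# Tally audit events per authenticated user to see which client dominates API traffic.
# Assumption: audit logging writes JSON lines to this path; adjust for your distribution.
AUDIT_LOG=/var/log/kube-apiserver/audit.log

jq -r '.user.username' "$AUDIT_LOG" | sort | uniq -c | sort -rn | head

# Count only the pod-watch requests, broken down by user, since those are the
# calls showing up repeatedly in the audit log.
jq -r 'select(.requestURI | test("^/api/v1/watch/pods")) | .user.username' "$AUDIT_LOG" \
  | sort | uniq -c | sort -rn
```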
What happened: As above, the upgrade consumes CPU and memory resources on the API server, along with disk space for audit logs.
What you expected to happen: We expected the upgrade to run smoothly, without putting significant stress on the API server.
How to reproduce it (as minimally and precisely as possible): Install splunk-connect 1.4.0 on OpenShift 3.11.x / Docker Enterprise with Kubernetes v1.14.
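For reference, a Helm-based install along these lines should reproduce the setup. The chart repository URL, release name, and values file are assumptions and may differ for your environment.

```sh
# Sketch of a Helm-based install of Splunk Connect for Kubernetes 1.4.0.
# Assumptions: the chart is published at this repository URL, and values.yaml
# holds your HEC endpoint/token and other cluster-specific settings.
# (Helm 3 syntax shown; with Helm 2 use "helm install --name splunk-connect ...")
helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/
helm repo update
helm install splunk-connect splunk/splunk-connect-for-kubernetes \
  --version 1.4.0 -f values.yaml
```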
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): v1.14.8
- Ruby version (use `ruby --version`):
- OS (e.g. `cat /etc/os-release`): CentOS 7.7
- Splunk version: 7.x
- Others:
Thanks @matthewmodestino - it’s working very well now!
We are also seeing this occur: every few days the CPU utilisation on the control plane nodes (API server) increases to 100%. Restarting the splunk-connect logging pods resolves the issue for another few days.
During the periods of high API-server usage, the audit logs show a large number of repeated queries to
/api/v1/watch/pods?fieldSelector=spec.nodeName%3Dus-prod-kubewrk-001.atl01.example.org&resourceVersion=78464987
for each of the 6 worker nodes. These requests appear to be made at a rate of approximately 1600 requests/min on each of the 3 API nodes.
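For anyone else hitting this, the temporary workaround described above (restarting the splunk-connect logging pods) can be done as shown below. The namespace and label selector are assumptions and depend on how the chart was installed in your cluster.

```sh
# Temporary workaround mentioned above: delete the splunk-connect logging pods so
# the DaemonSet recreates them and the API connections are re-established.
# Assumptions: the release runs in the "splunk-connect" namespace and the logging
# pods carry the "app=splunk-kubernetes-logging" label; adjust both as needed.
kubectl -n splunk-connect delete pods -l app=splunk-kubernetes-logging
```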