AKS deploy not sending to SplunkCloud - lots of errors in splunk container logs

Splunk Connect install using these references: https://github.com/splunk/splunk-connect-for-kubernetes

Two indexes created in our Splunk Cloud deploy: k8s-event-index & k8s-metrics-index

Our HEC is deployed and was tested by pushing events via Postman (this works once I make sure the HEC has “Enable indexer acknowledgement” disabled in the config). Note: there is some weirdness in the HEC when using multiple indexes (event and metrics indexes).

For simplicity, we tested the Splunk Connect install using these two different Helm commands (both of which deployed successfully):

First deploy options:

helm install --name my-splunk-connect --set global.splunk.hec.host=https://http-inputs-travelport.splunkcloud.com/services/collector/event --set global.splunk.hec.token=123456 --set splunk-kubernetes-metrics.splunk.hec.indexName=k8s-metrics-index --set splunk-kubernetes-logging.splunk.hec.indexName=k8s-event-index --set splunk-kubernetes-objects.splunk.hec.indexName=k8s-event-index https://github.com/splunk/splunk-connect-for-kubernetes/releases/download/1.1.0/splunk-connect-for-kubernetes-1.1.0.tgz

Second deploy options:

helm install --name my-splunk-connect --set global.splunk.hec.insecureSSL=true --set splunk-kubernetes-objects.kubernetes.insecureSSL=true --set global.splunk.hec.protocol=https --set global.splunk.hec.host=https://http-inputs-travelport.splunkcloud.com/services/collector/event --set global.splunk.hec.token=123456 --set splunk-kubernetes-metrics.splunk.hec.indexName=k8s-metrics-index --set splunk-kubernetes-logging.splunk.hec.indexName=k8s-event-index --set splunk-kubernetes-objects.splunk.hec.indexName=k8s-event-index https://github.com/splunk/splunk-connect-for-kubernetes/releases/download/1.1.0/splunk-connect-for-kubernetes-1.1.0.tgz

Deployment details following install:

Name                Location    ResourceGroup          KubernetesVersion    ProvisioningState    Fqdn
------------------  ----------  ---------------------  -------------------  -------------------  ----------------------------------------------------
mySplunkAKSCluster  eastus      mySplunkResourceGroup  1.11.9               Succeeded            mysplunkakscluster-dns-9b7c1652.hcp.eastus.azmk8s.io

Pods:
NAME                                                              READY   STATUS    RESTARTS   AGE
azure-vote-back-5db676f7d-d2tvx                                   1/1     Running   0          2d
azure-vote-front-559f8954f4-296xn                                 1/1     Running   0          2d
my-splunk-connect-splunk-kubernetes-logging-j5bdt                 1/1     Running   0          38m
my-splunk-connect-splunk-kubernetes-metrics-agg-6bfb66b8cf6pnv4   1/1     Running   0          38m
my-splunk-connect-splunk-kubernetes-metrics-whnmc                 1/1     Running   0          38m
my-splunk-connect-splunk-kubernetes-objects-869f988945-hbh4w      1/1     Running   0          38m

Errors that continue: requesting logs for the ‘objects’, ‘metrics’, and ‘logging’ containers returns the following:

2019-04-08 11:27:11 +0000 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2019-04-08 11:27:12 +0000 chunk="586031de827e46b2889726c87c564adc" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:27:11 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:27:12 +0000 [warn]: #0 failed to flush the buffer. retry_time=1 next_retry_seconds=2019-04-08 11:27:13 +0000 chunk="586031de827e46b2889726c87c564adc" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:27:12 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:27:13 +0000 [warn]: #0 failed to flush the buffer. retry_time=2 next_retry_seconds=2019-04-08 11:27:15 +0000 chunk="586031de827e46b2889726c87c564adc" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:27:13 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:27:15 +0000 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2019-04-08 11:27:20 +0000 chunk="586031de827e46b2889726c87c564adc" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:27:15 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:27:20 +0000 [error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=3 records=22 error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:27:20 +0000 [error]: #0 suppressed same stacktrace

kubectl logs -lapp=splunk-kubernetes-metrics --all-containers=true
2019-04-08 11:39:06 +0000 [warn]: #0 failed to flush the buffer. retry_time=2 next_retry_seconds=2019-04-08 11:39:08 +0000 chunk="5860347ad852591e06621bee1c570b92" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Try again)"
  2019-04-08 11:39:06 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:39:13 +0000 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2019-04-08 11:39:17 +0000 chunk="5860347ad852591e06621bee1c570b92" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Try again)"
  2019-04-08 11:39:13 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:39:18 +0000 [info]: #0 Use URL https://mysplunkakscluster-dns-9b7c1652.hcp.eastus.azmk8s.io/api/v1/pods for scraping limits requests metrics
2019-04-08 11:39:19 +0000 [info]: #0 Use URL https://mysplunkakscluster-dns-9b7c1652.hcp.eastus.azmk8s.io/api/v1/nodes for scraping node metrics
2019-04-08 11:39:19 +0000 [info]: #0 Use URL https://mysplunkakscluster-dns-9b7c1652.hcp.eastus.azmk8s.io/api/v1/nodes for scraping node metrics
2019-04-08 11:39:19 +0000 [info]: #0 Use URL https://mysplunkakscluster-dns-9b7c1652.hcp.eastus.azmk8s.io/api/v1/nodes/aks-agentpool-39541254-0:10250/proxy/stats/summary for scraping resource usage metrics
2019-04-08 11:39:22 +0000 [error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=3 records=606 error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Try again)"
  2019-04-08 11:39:22 +0000 [error]: #0 suppressed same stacktrace
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.0.2/lib/restclient/request.rb:642:in `transmit'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.0.2/lib/restclient/request.rb:145:in `execute'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.0.2/lib/restclient/request.rb:52:in `execute'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-kubernetes-metrics-1.1.0/lib/fluent/plugin/in_kubernetes_metrics.rb:633:in `scrape_cadvisor_metrics'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run_once'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.3/lib/cool.io/loop.rb:88:in `run'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
  2019-04-08 10:57:31 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2019-04-08 10:57:31 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper

kubectl logs -lapp=splunk-kubernetes-logging --all-containers=true
2019-04-08 11:40:33 +0000 [warn]: #0 failed to flush the buffer. retry_time=2 next_retry_seconds=2019-04-08 11:40:35 +0000 chunk="586034d42ba67c76daf16aad429aea3d" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:40:33 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:40:35 +0000 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2019-04-08 11:40:38 +0000 chunk="586034d42ba67c76daf16aad429aea3d" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:40:35 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:40:38 +0000 [error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=3 records=53 error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:40:38 +0000 [error]: #0 suppressed same stacktrace
2019-04-08 11:40:39 +0000 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2019-04-08 11:40:40 +0000 chunk="586034dcc127bba3d9f6bac69f659281" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:40:39 +0000 [warn]: #0 suppressed same stacktrace
2019-04-08 11:40:40 +0000 [warn]: #0 failed to flush the buffer. retry_time=1 next_retry_seconds=2019-04-08 11:40:41 +0000 chunk="586034dcc127bba3d9f6bac69f659281" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"
  2019-04-08 11:40:40 +0000 [warn]: #0 suppressed same stacktrace
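
Every failure in the logs above names the same connection target, `https:443`, rather than the Splunk Cloud hostname. As a minimal sketch, the target can be pulled out of a fluentd warning line like this, using one line copied from the output above (the variable names are illustrative and not part of Splunk Connect):

```shell
#!/bin/sh
# One warning line copied from the splunk-kubernetes-logging output above.
line='2019-04-08 11:40:33 +0000 [warn]: #0 failed to flush the buffer. retry_time=2 next_retry_seconds=2019-04-08 11:40:35 +0000 chunk="586034d42ba67c76daf16aad429aea3d" error_class=SocketError error="Failed to open TCP connection to https:443 (getaddrinfo: Temporary failure in name resolution)"'

# Extract the host:port pair that the output plugin tried to connect to.
target=$(printf '%s\n' "$line" | sed -n 's/.*Failed to open TCP connection to \([^ ]*\) (.*/\1/p')

conn_host=${target%:*}   # text before the last colon
conn_port=${target##*:}  # text after the last colon

echo "host=$conn_host port=$conn_port"
```

The name being resolved is the literal string `https`, which suggests that the scheme from the configured `global.splunk.hec.host` value ended up where a bare hostname was expected.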


Top GitHub Comments

matthewmodestino commented, Apr 10, 2019 (1 reaction)

The host should be only http-inputs-yourStackName.splunkcloud.com, not the full URI path, and then you can set the port to 443, because that's what Splunk Cloud uses.

#global settings
global:
  logLevel: info 
  splunk:
    hec:
      protocol: https
      insecureSSL: false
      host: http-inputs-yourStackName.splunkcloud.com
      token: <yourToken>

#local config for logging chart
splunk-kubernetes-logging:
  kubernetes:
    clusterName: eks 
  journalLogPath: /run/log/journal
  splunk:
    hec:
      port: 443
      indexName: eks_logs 
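
To apply the same idea to the `helm install --set` form used in the issue, the full HEC URL can be split into its parts first. This is a rough sketch rather than a verified fix: the URL is the one from the issue, the `--set` keys are the ones that already appear in this thread, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Full HEC endpoint as it was passed to global.splunk.hec.host in the issue.
hec_url='https://http-inputs-travelport.splunkcloud.com/services/collector/event'

hec_protocol=${hec_url%%://*}   # scheme only: "https"
rest=${hec_url#*://}            # drop the scheme
hec_host=${rest%%/*}            # drop the URI path, keep the bare hostname
hec_port=443                    # port suggested in the comment above

# Print the flags so they can be reviewed before being added to helm install.
echo "--set global.splunk.hec.protocol=$hec_protocol"
echo "--set global.splunk.hec.host=$hec_host"
echo "--set splunk-kubernetes-logging.splunk.hec.port=$hec_port"
```

The printed flags would replace the `global.splunk.hec.host=https://...` setting in the original commands. Whether the metrics and objects charts also need their own port setting is not confirmed in this thread, so check the chart's values.yaml before relying on it.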

hhagewood commented, Apr 12, 2019 (0 reactions)

For the time being I’ll go with a single index deploy.
