
Not Seeing Metrics Data in Splunk Cloud Index


What happened: We are not seeing metrics data in Splunk Cloud when implementing Splunk Connect for Kubernetes. We are, however, able to see log and object data. We see the following in the metrics pod logs:

2019-07-23 15:44:57 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:45:12 +0000 [debug]: #0 Received new chunk, size=1046393
2019-07-23 15:45:12 +0000 [debug]: #0 Sending 1046393 bytes to Splunk.
2019-07-23 15:45:23 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:45:26 +0000 [debug]: #0 Received new chunk, size=1046240
2019-07-23 15:45:26 +0000 [debug]: #0 Sending 1046240 bytes to Splunk.
2019-07-23 15:45:27 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:45:40 +0000 [debug]: #0 Received new chunk, size=1047250
2019-07-23 15:45:40 +0000 [debug]: #0 Sending 1047250 bytes to Splunk.
2019-07-23 15:45:48 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:45:57 +0000 [debug]: #0 Received new chunk, size=1047460
2019-07-23 15:45:57 +0000 [debug]: #0 Sending 1047460 bytes to Splunk.
2019-07-23 15:46:07 +0000 [debug]: #0 taking back chunk for errors. chunk="58e5b164f15e9a50596c723c5ca003d1"
2019-07-23 15:46:07 +0000 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2019-07-23 15:46:08 +0000 chunk="58e5b164f15e9a50596c723c5ca003d1" error_class=SocketError error="Failed to open TCP connection to http-inputs-<redacted>.splunkcloud.com:443 (getaddrinfo: Try again)"
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:939:in `rescue in block in connect'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:936:in `block in connect'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/timeout.rb:93:in `block in timeout'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/timeout.rb:103:in `timeout'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:935:in `connect'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:1541:in `begin_transport'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:1493:in `transport_request'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/2.5.0/net/http.rb:1467:in `request'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/net-http-persistent-3.0.1/lib/net/http/persistent.rb:951:in `block in request'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/net-http-persistent-3.0.1/lib/net/http/persistent.rb:648:in `connection_for'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/net-http-persistent-3.0.1/lib/net/http/persistent.rb:945:in `request'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-splunk-hec-1.1.0/lib/fluent/plugin/out_splunk_hec.rb:355:in `send_to_hec'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-splunk-hec-1.1.0/lib/fluent/plugin/out_splunk_hec.rb:167:in `write'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin/output.rb:1125:in `try_flush'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin/output.rb:1425:in `flush_thread_run'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin/output.rb:454:in `block (2 levels) in start'
2019-07-23 15:46:07 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2019-07-23 15:46:08 +0000 [debug]: #0 Received new chunk, size=1047460
2019-07-23 15:46:08 +0000 [debug]: #0 Sending 1047460 bytes to Splunk.
2019-07-23 15:46:11 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:46:11 +0000 [warn]: #0 retry succeeded. chunk_id="58e5b164f15e9a50596c723c5ca003d1"
2019-07-23 15:46:12 +0000 [debug]: #0 Received new chunk, size=1123692
2019-07-23 15:46:12 +0000 [debug]: #0 Sending 1123692 bytes to Splunk.
2019-07-23 15:46:12 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
2019-07-23 15:46:25 +0000 [debug]: #0 Received new chunk, size=1130330
2019-07-23 15:46:25 +0000 [debug]: #0 Sending 1130330 bytes to Splunk.
2019-07-23 15:46:28 +0000 [debug]: #0 [Response] POST https://http-inputs-<redacted>.splunkcloud.com/services/collector: #<Net::HTTPOK 200 OK readbody=true>
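
The pattern above, where successful 200 OK responses are interleaved with getaddrinfo: Try again failures, points at flaky name resolution inside the pod rather than Splunk rejecting the payloads. A minimal sketch of checking this from the metrics pod follows; the pod name is a placeholder, and the fluentd image may not ship nslookup or curl, in which case a throwaway busybox pod works instead.

# Placeholder pod name; find it with: kubectl -n test3 get pods
# Check that the HEC hostname resolves from inside the metrics pod
kubectl -n test3 exec <metrics-pod-name> -- nslookup http-inputs-<redacted>.splunkcloud.com

# Check that the HEC health endpoint is reachable over TLS (expects an HTTP 200)
kubectl -n test3 exec <metrics-pod-name> -- \
  curl -sk https://http-inputs-<redacted>.splunkcloud.com:443/services/collector/health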

What you expected to happen:

From the metrics pod logs, it looks like some POSTs to Splunk Cloud succeed and others do not. We are also not seeing any data in the metrics index in Splunk Cloud. Since the HEC connection is a global setting, is there a setting we need to add for the metrics collector? Is there something specific about the failing POSTs that needs to be accounted for in the Helm template?
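
One thing worth ruling out on the Splunk Cloud side is the index itself: HEC only accepts metric payloads into an index created as a metrics index, and the token must be allowed to write to it. Below is a minimal hand test against the collector endpoint, a sketch only: substitute the real token and index name, and note that the event/fields/metric_name/_value layout is the standard single-metric HEC JSON format, not something taken from this chart.

curl -sk https://http-inputs-<redacted>.splunkcloud.com:443/services/collector \
  -H "Authorization: Splunk <redacted-token>" \
  -d '{"event":"metric","source":"hec-test","index":"<redacted>-metrics","fields":{"metric_name":"test.metric","_value":1}}'
# A {"text":"Success","code":0} reply means the token can write metric events to that index;
# any other code (for example "Incorrect index") points at the token/index configuration
# rather than at the Kubernetes side.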

How to reproduce it (as minimally and precisely as possible):

Install Splunk Connect for Kubernetes with Helm, using the following values.yaml:

global:
  logLevel: debug
  splunk:
    hec:
      protocol: https
      insecureSSL: false
      host: http-inputs-<redacted>.splunkcloud.com
      token: <redacted>
      port: 443
splunk-kubernetes-logging:
  journalLogPath: /run/log/journal
  splunk:
    hec:
      indexName: <redacted>-log
splunk-kubernetes-objects:
  rbac:
    create: true
  serviceAccount:
    create: true
    name: splunk-kubernetes-objects
  kubernetes:
    insecureSSL: false
  objects:
    core:
      v1:
        - name: pods
          interval: 30s
        - name: namespaces
          interval: 30s
        - name: nodes
          interval: 30s
        - name: services
          interval: 30s
        - name: config_maps
          interval: 30s
        - name: persistent_volumes
          interval: 30s
        - name: service_accounts
          interval: 30s
        - name: persistent_volume_claims
          interval: 30s
        - name: resource_quotas
          interval: 30s
        - name: component_statuses
          interval: 30s
        - name: events
          mode: watch
    apps:
      v1:
        - name: deployments
          interval: 30s
        - name: daemon_sets
          interval: 30s
        - name: replica_sets
          interval: 30s
        - name: stateful_sets
          interval: 30s
  splunk:
    hec:
      indexName: <redacted>-objects
splunk-kubernetes-metrics:
  rbac:
    create: true
  serviceAccount:
    create: true
    name: splunk-kubernetes-metrics
  splunk:
    hec:
      indexName: <redacted>-metrics

Helm command - helm install --name k8-connect splunk-connect-for-kubernetes-1.2.0.tgz --namespace test3 -f values.yaml
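
To see what those values actually turn into, the chart can be rendered locally and the generated HEC settings inspected before anything reaches the cluster. This is a sketch using Helm 2 syntax to match the install command above; the grep pattern assumes the fluent-plugin-splunk-hec parameter names hec_host, hec_port, protocol, insecure_ssl and index.

helm template --name k8-connect splunk-connect-for-kubernetes-1.2.0.tgz -f values.yaml \
  | grep -E "hec_host|hec_port|protocol|insecure_ssl|index"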

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.13-EKS
  • OS (e.g. cat /etc/os-release): amazonlinux
  • Splunk version:
  • Others:

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

1 reaction
matthewmodestino commented, Jul 23, 2019

Looks like HEC config.

Let's double-check this:

2019-07-23 15:46:07 +0000 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2019-07-23 15:46:08 +0000 chunk="58e5b164f15e9a50596c723c5ca003d1" error_class=SocketError error="Failed to open TCP connection to http-inputs-<redacted>.splunkcloud.com:443 (getaddrinfo: Try again)"

Is your Splunk Cloud endpoint http or https? Also, is the URL broken in the ConfigMap for the pod?

try kubectl describe cm <splunk-kubernetes-metrics-configmap-name-here>
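
For example (the ConfigMap name below is a guess based on the usual <release>-<chart> naming convention; use whatever kubectl get actually shows):

kubectl -n test3 get configmaps
kubectl -n test3 describe configmap k8-connect-splunk-kubernetes-metrics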

0 reactions
BobWieberdink commented, Jul 29, 2019

Interestingly enough, the issue so far seems to be environment related. I replaced the HEC host DNS name with one of the IPs in the DNS record and I don't see the issue anymore. I am now working to recreate it in one of our other Kubernetes environments to see whether it happens there as well.
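
If swapping the hostname for an IP makes the symptom disappear, cluster DNS is the likely culprit rather than the chart. A rough way to check, as a sketch only (the kube-dns label and busybox image are common defaults, not values taken from this issue):

# Are the CoreDNS/kube-dns pods healthy and not restarting?
kubectl -n kube-system get pods -l k8s-app=kube-dns

# Does an arbitrary pod resolve the HEC hostname reliably?
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup http-inputs-<redacted>.splunkcloud.com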

