Failed to flush the buffer due to "too many connection resets"

What happened: The logs show that Fluentd frequently fails to flush the buffer due to “too many connection resets”:

2020-03-16 13:19:56 +0000 [warn]: #0 failed to flush the buffer. retry_time=1 next_retry_seconds=2020-03-16 13:21:01 +0000 chunk="5a0f8a89e26f39ee8d09457092bcfae5" error_class=Net::HTTP::Persistent::Error error="too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 70098224503720, last used 1584364796.8142624 seconds ago"
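One detail in the warning is worth a closer look: the value after “last used” is about 1.58 billion seconds, roughly 50 years. A minimal Python sketch, assuming the value is just “now minus a last-use time of zero”, reads it as a Unix timestamp. The variable names are illustrative only.

```python
from datetime import datetime, timezone

# Value copied from the "last used ... seconds ago" part of the warning.
last_used_seconds_ago = 1584364796.8142624

# Assumption: treat the number as seconds since 1970-01-01 UTC.
as_timestamp = datetime.fromtimestamp(last_used_seconds_ago, tz=timezone.utc)
formatted = as_timestamp.strftime("%Y-%m-%d %H:%M:%S")

print(formatted)
```

The result is the same wall-clock time as the warning itself (2020-03-16 13:19:56 +0000). Together with “after 0 requests”, this suggests the connection had never completed a request before the read timeout. That is an interpretation of the message, not something the issue confirms.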

What you expected to happen:

No “failed to flush the buffer” warnings should occur.

How to reproduce it (as minimally and precisely as possible):

Deploy splunk-connect-for-kubernetes 1.3.0 to OpenShift with Helm.

Anything else we need to know?: My buffer settings:

  buffer:
    '@type': memory
    total_limit_size: 4000m
    chunk_limit_size: 8m
    chunk_limit_records: 10000
    flush_at_shutdown: true
    flush_interval: 3s
    flush_thread_count: 10
    flush_thread_interval: 0.1
    flush_thread_burst_interval: 0.01
    overflow_action: block
    retry_forever: true
    retry_wait: 60
    compress: gzip

I did not see any related logs on the Splunk master node.

Some pods generate about 1.5 million log lines within one hour, and developers want to monitor these logs in real time. I am not that familiar with Fluentd, but it should be able to handle 3K logs per second, right?
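As a rough sanity check on these numbers, the sketch below converts the stated hourly volume into a per-second rate for one pod and estimates how many records arrive during one flush_interval of 3 seconds. It assumes a constant log rate and a single pod; the issue does not say how many pods log at this volume, so the aggregate rate may be higher.

```python
# Figures taken from the issue text and buffer settings.
logs_per_hour = 1_500_000        # "about 1.5 million logs within one hour"
flush_interval_seconds = 3       # flush_interval: 3s
chunk_limit_records = 10_000     # chunk_limit_records: 10000

# Assumption: logs are emitted at a constant rate by a single pod.
logs_per_second = logs_per_hour / 3600
records_per_flush = logs_per_second * flush_interval_seconds

print(f"{logs_per_second:.0f} logs per second per pod")
print(f"{records_per_flush:.0f} records per flush interval")
print(f"fits in one chunk: {records_per_flush <= chunk_limit_records}")
```

Under these assumptions, one such pod produces well under 3K logs per second, so 3K per second would correspond to several pods logging at this volume.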

Environment:

  • Kubernetes version (use kubectl version):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-06-09T23:23:08Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-04-10T17:49:11Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

$ oc version
oc v3.11.117
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://internal-master.ocp.local:443
openshift v3.11.104
kubernetes v1.11.0+d4cacc0

  • Ruby version (use ruby --version):
  • OS (e.g: cat /etc/os-release):

NAME="Red Hat Enterprise Linux Server"
VERSION="7.7 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.7"
PRETTY_NAME="Red Hat Enterprise Linux"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.7:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.7
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.7"

  • Splunk version: Splunk Enterprise Version: 7.2.3

  • Others:

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

crdnb commented, May 7, 2020 (1 reaction)

In my case, the default index was set to “main”, on which the HEC token had no write permission. Setting the default index to an existing, writable index solved the problem.
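One way to check this outside of Fluentd is to send a single test event to the Splunk HTTP Event Collector (HEC) with an explicit index. The Python sketch below only builds such a request. The host, token, and index name are hypothetical placeholders, and build_hec_test_request is an illustrative helper, not part of splunk-connect-for-kubernetes.

```python
import json
import urllib.request


def build_hec_test_request(hec_url, token, index):
    """Build (but do not send) an HEC request that writes one event to `index`."""
    payload = {"event": "hec index permission check", "index": index}
    return urllib.request.Request(
        url=f"{hec_url.rstrip('/')}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values: replace with your own HEC endpoint, token, and index.
request = build_hec_test_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    "k8s_logs",
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (requires network access to the HEC endpoint):
# urllib.request.urlopen(request, timeout=10)
```

Sending the request once with the index from the Helm values and once with an index the token is known to have access to should show, from the response status and body, whether the index setting is the problem.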

github-actions[bot] commented, Dec 17, 2021 (0 reactions)

This issue was closed because it has been inactive for 14 days since being marked as stale.
