Optimize fluentd buffers and guidance
What would you like to be added:
The Fluentd buffers should be optimized and guidance and simple test results should be shared.
buffer:
  "@type": memory
  total_limit_size: 600m
  chunk_limit_size: 10m
  chunk_limit_records: 10000
  flush_interval: 3s
  flush_thread_count: 1
  overflow_action: block
  retry_max_times: 5
  retry_type: periodic
Why is this needed:
chunk_limit_size has tested best at 10-20 MB in high-volume environments when customers have been running scale/stress tests. We also want to keep throughput high and chunk size low, so we should try never to send a payload bigger than that.
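To make the sizing concrete, here is a rough sketch of the arithmetic behind the proposed values. It assumes Fluentd's size suffixes are 1024-based; the function names and the 10 MiB/s ingest rate are hypothetical and only for illustration, not measured results.

```python
# Rough buffer arithmetic for the proposed settings (illustrative only).
# Assumption: size suffixes k/m/g/t are 1024-based, as in Fluentd's "size" type.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a size string such as '600m' into a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def max_chunks(total_limit_size, chunk_limit_size):
    """Number of full-size chunks that fit in the buffer."""
    return parse_size(total_limit_size) // parse_size(chunk_limit_size)


def seconds_until_full(total_limit_size, ingest_bytes_per_sec):
    """Time until the buffer fills if nothing can be flushed."""
    return parse_size(total_limit_size) / ingest_bytes_per_sec


if __name__ == "__main__":
    print(max_chunks("600m", "10m"))                   # 60
    # Hypothetical ingest rate of 10 MiB/s while the output is down:
    print(seconds_until_full("600m", 10 * 1024**2))    # 60.0
```

With overflow_action: block, input stalls once that point is reached, so the ratio of total_limit_size to the real ingest rate is roughly how long an outage the buffer can absorb.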
retry_type: periodic keeps us away from Fluentd's exponential backoff timer, which can cause serious lag. We should lean toward always sending data as fast as we can; we don't want to be holding buffers. We saw an instance where, after failed attempts, Fluentd would wait 24 hours before sending again.
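The lag described here can be illustrated with a simplified model of the retry schedule. The sketch below assumes the wait before retry n is retry_wait * retry_exponential_backoff_base^(n-1), optionally capped by retry_max_interval; it ignores retry_randomize and retry_timeout, and the function names are hypothetical.

```python
# Simplified model of retry wait times (illustrative, not Fluentd's code).
# Assumptions: retry_wait = 1s and backoff base = 2 unless overridden;
# randomization (retry_randomize) and retry_timeout are ignored.

def backoff_waits(retries, retry_wait=1.0, base=2.0, max_interval=None):
    """Wait (seconds) before each retry with exponential backoff."""
    waits = []
    for n in range(retries):
        wait = retry_wait * base ** n
        if max_interval is not None:
            wait = min(wait, max_interval)  # cap, like retry_max_interval
        waits.append(wait)
    return waits


def periodic_waits(retries, retry_wait=1.0):
    """Wait (seconds) before each retry with a fixed interval."""
    return [retry_wait] * retries


if __name__ == "__main__":
    print(backoff_waits(5))                    # [1.0, 2.0, 4.0, 8.0, 16.0]
    print(backoff_waits(18)[-1] / 3600)        # about 36.4 hours before retry 18
    print(backoff_waits(8, max_interval=30))   # capped at 30 after the 5th retry
    print(periodic_waits(5, retry_wait=60))    # [60, 60, 60, 60, 60]
```

Under these assumptions, an uncapped exponential schedule reaches multi-hour waits after a long outage, while a periodic schedule or a capped one keeps the worst-case delay bounded.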
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 3
- Comments: 11 (3 by maintainers)
Setting retry_max_interval to something reasonable should work around the silly one-day retry wait intervals seen with the default config, and should also cope with very short outages better than a fixed periodic 60s wait time.
This issue was closed because it has been inactive for 14 days since being marked as stale.