New Feature/Enhancement Request: Support Gzip Compressed Logs over HTTP
Is your feature request related to a problem? Please describe.
We are using the HTTP sink in some of our applications to write to a Fluentd agent. One application creates logs that are larger than conventional logs due to certain requirements: in most cases the logs are larger than 5 MB, and sometimes as large as 10 MB. Another application that uses this sink produces logs that are mostly under ~500 KB, but it emits thousands of logs per second. To support the large size and high volume of logs, I am using a custom array batch formatter with a size of 16 MB. Now I want to conserve network bandwidth and reduce latency by applying Gzip compression to the logs before sending them over HTTP.
This would benefit anyone using the HTTP sink who wants to conserve bandwidth, reduce latency, and decrease the number of HTTP requests needed to ship logs. For applications running in the cloud, it would also reduce ingress/egress costs. (Fluentd and Elasticsearch both support compressed logs over HTTP; links are provided in the additional context section.)
Pros:
- Reduce network bandwidth for large logs
- Reduce HTTP request latency for large payloads
- Each log will take up less storage on the file system
- More logs can be sent over one HTTP request using the same batch size
- Fluentd and Elasticsearch both support compressed logs over HTTP
- Reduce ingress/egress data transfer costs when running in the cloud (costs for AWS)
Cons:
- Application performance (CPU/memory) may be impacted when compressing logs. We would need to benchmark to measure the impact, as it would vary with log size and throughput.
- The backend application needs to support Gzip compressed HTTP requests.
Describe the solution you’d like
I’ve done some investigation into integrating it with ‘DurableHttpUsingFileSizeRolledBuffers’, and here are my findings. Going through the sink code, I found that the File sink is used internally for the file buffer. The File sink has a FileLifecycleHooks extension point that supports a GZip hook, which can be leveraged to write Gzip compressed logs into the buffer file (I was able to get this working after making changes locally and can share these changes if you’d like to see them). A minimal sketch of the idea follows.
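The sketch assumes the FileLifecycleHooks extension point from Serilog.Sinks.File; the class name GZipBufferHooks is hypothetical, and a production version would also need to consider flushing and rolling behavior:

using System.IO;
using System.IO.Compression;
using System.Text;
using Serilog.Sinks.File;

// Hypothetical hook: wraps the buffer file's stream so every write is Gzip compressed.
public class GZipBufferHooks : FileLifecycleHooks
{
    public override Stream OnFileOpening(Stream underlyingStream, Encoding encoding)
        => new GZipStream(underlyingStream, CompressionLevel.Optimal);
}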
Next, we would need to implement an HTTP client that supports Gzip compressed request payloads. This would require changing the existing HTTP client or creating a new one. A new batch formatter would also be needed to make sure the batch is created in Gzip format and appends Gzip logs correctly.
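As a rough sketch of what such a client could look like (a standalone wrapper, not the sink’s actual IHttpClient contract, whose shape may differ; the class name and method signature here are assumptions for illustration):

using System;
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical client that Gzip compresses each batch before posting it.
public sealed class GZipHttpClient : IDisposable
{
    private readonly HttpClient httpClient = new HttpClient();

    public async Task<HttpResponseMessage> PostAsync(string requestUri, Stream contentStream)
    {
        // Compress the batch into an in-memory buffer.
        var compressed = new MemoryStream();
        using (var gzipStream = new GZipStream(compressed, CompressionLevel.Optimal, leaveOpen: true))
        {
            await contentStream.CopyToAsync(gzipStream);
        }
        compressed.Position = 0;

        // Tell the receiver the payload is Gzip encoded.
        var content = new StreamContent(compressed);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
        content.Headers.ContentEncoding.Add("gzip");

        return await httpClient.PostAsync(requestUri, content);
    }

    public void Dispose() => httpClient.Dispose();
}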
Before requesting this feature, I tested that logs can be sent compressed over HTTP and are received properly by the end client. In the code shared below, I take a string serialized as JSON, convert it into a byte stream, compress it, and send it over HTTP.
Is it possible that this can be integrated into the Http Sink? If so, I can contribute by working on this feature, and I would also be able to test it as this is something we would like to use.
Describe alternatives you’ve considered
I haven’t considered any alternatives other than sending the logs uncompressed over HTTP.
Additional context
HTTP compression support:
The sample code I used to create a compressed payload and send it over HTTP to a Fluentd instance:
using System;
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Text;

private const string jsonValue1 = "{\"LogLevel\":\"Information\",\"Message\":\"Test1\",\"RequestId\":\"06c1b946-1474-4e63-bf49-944cf0e3541d\",\"LogTime\":\"02/12/2021 17:54:31.193324+00:00\"}";
private const string jsonValue2 = "{\"LogLevel\":\"Information\",\"Message\":\"Test2\",\"RequestId\":\"1a810f65-a289-4e1c-873d-7dcd9078db5f\",\"LogTime\":\"02/13/2021 18:54:31.193324+00:00\"}";

// Placeholder endpoint; replace with your Fluentd in_http address.
private const string requestUri = "http://localhost:9880/app.log";

static void Main(string[] args)
{
    // Create byte stream
    var buffer = ConvertToBytes($"[ {jsonValue1}, {jsonValue2} ]");
    using var input = new MemoryStream(buffer);

    // Compress the byte stream into a Gzip stream
    using var output = new MemoryStream();
    using (var gzipStream = new GZipStream(output, CompressionLevel.Optimal))
    {
        input.CopyTo(gzipStream);
    }

    // Create the HTTP content stream w/ headers
    using var contentStream = new MemoryStream(output.ToArray());
    var content = new StreamContent(contentStream);
    content.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
    content.Headers.Add("Content-Encoding", "gzip");

    // HTTP post method
    using var client = new HttpClient();
    var response = client.PostAsync(content: content, requestUri: requestUri).Result;
    Console.WriteLine(response.IsSuccessStatusCode);
}

private static byte[] ConvertToBytes(string value) => Encoding.UTF8.GetBytes(value);
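For a quick sanity check without a running Fluentd instance, the compressed payload can be round-tripped locally. A small helper like the following (my own addition, not part of the original test) should print the original JSON array:

// Decompress a Gzip payload back into a UTF-8 string for verification.
private static string Decompress(byte[] gzipBytes)
{
    using var compressedInput = new MemoryStream(gzipBytes);
    using var gzipStream = new GZipStream(compressedInput, CompressionMode.Decompress);
    using var reader = new StreamReader(gzipStream, Encoding.UTF8);
    return reader.ReadToEnd();
}

// Usage (inside Main, after compressing): Console.WriteLine(Decompress(output.ToArray()));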
Top GitHub Comments
v8.0.0-beta.3 is now available on nuget.org. Please try it out and get back to me with feedback.
Thanks for the collaboration!
I agree; if we go ahead with Gzip, we should also look at Brotli.
But before we do anything, I would like to thank you for the time and effort you’ve spent analyzing the issue. It’s much appreciated! ❤️