Use of ThreadLocal storage in HTTPServer seems to result in inefficient memory allocation


Disclaimer

I might be missing something in the design of the class (I'm sorry if this is the case), and yet I think it's worth checking and possibly suggesting alternative implementations.

TL;DR

The class HTTPServer in client_java/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java seems to allocate more memory than is actually needed, and to hold on to it for longer than necessary.

Description

To handle HTTP requests, the class relies on a ByteArrayOutputStream held in ThreadLocal storage. This buffer is:

  • as big as the serialized reply, and allocated 5 times, once per thread in the pool
  • never re-used, since its content is overwritten for each incoming request
  • never de-allocated or cleaned up after the reply has been delivered

Basically, it looks as if the class is implemented to avoid instantiating new ByteArrayOutputStream objects and, to achieve this, it pays the maximum cost in terms of retained memory.
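For illustration only, here is a minimal, self-contained sketch (not code from the project) of the allocation pattern described above, assuming the 5-thread pool and the 1 << 20 initial capacity discussed in this issue: every worker thread that ever serves a request ends up retaining its own buffer through the ThreadLocal, and that buffer stays reachable for as long as the thread lives.

import java.io.ByteArrayOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadLocalBufferDemo {

    // Same pattern as HTTPServer's LocalByteArray: one 1 MiB buffer per thread.
    private static final ThreadLocal<ByteArrayOutputStream> BUFFER =
            ThreadLocal.withInitial(() -> new ByteArrayOutputStream(1 << 20));

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(5); // HTTPServer uses a 5-thread pool

        // Simulate 20 "scrapes": each task reuses its own thread's buffer,
        // but every distinct worker thread allocates and retains its own copy.
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> {
                ByteArrayOutputStream response = BUFFER.get(); // allocated on first use per thread
                response.reset();                              // old content discarded, capacity kept
                System.out.println(Thread.currentThread().getName()
                        + " -> buffer " + System.identityHashCode(response));
            });
        }

        // In a long-lived server the pool is never shut down, so up to 5 buffers of at
        // least 1 MiB each stay reachable through the worker threads' ThreadLocal maps
        // for the whole lifetime of the process.
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}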

Code walkthrough

The following code is from the HTTPServer class, as it is currently on the master branch, version 0.12.1-SNAPSHOT. The class has a member response of type LocalByteArray, which is a ThreadLocal ByteArrayOutputStream:

private static class LocalByteArray extends ThreadLocal<ByteArrayOutputStream> {
    @Override
    protected ByteArrayOutputStream initialValue() {
        return new ByteArrayOutputStream(1 << 20);
    }
}

This is how the class handles incoming HTTP requests; the comments are mine:

public void handle(HttpExchange t) throws IOException {
    String query = t.getRequestURI().getRawQuery();
    String contextPath = t.getHttpContext().getPath();
    // take, from the thread local storage, the output stream containing the previous response
    // this is the only reference in the code to `this.response`
    ByteArrayOutputStream response = this.response.get();
    // throw away the previous response
    response.reset();
    // wrap the output stream with a writer
    OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
    if ("/-/healthy".equals(contextPath)) {
        osw.write(HEALTHY_RESPONSE);
    } else {
        String contentType = TextFormat.chooseContentType(t.getRequestHeaders().getFirst("Accept"));
        t.getResponseHeaders().set("Content-Type", contentType);
        Predicate<String> filter = sampleNameFilterSupplier == null ? null : sampleNameFilterSupplier.get();
        filter = SampleNameFilter.restrictToNamesEqualTo(filter, parseQuery(query));
        // write the reply content into the writer
        if (filter == null) {
            TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
        } else {
            TextFormat.writeFormat(contentType, osw, registry.filteredMetricFamilySamples(filter));
        }
    }

    // flush the writer and write the new reply into `response`
    osw.close();

    // decide whether or not to compress the data
    if (shouldUseCompression(t)) {
        t.getResponseHeaders().set("Content-Encoding", "gzip");
        t.sendResponseHeaders(HttpURLConnection.HTTP_OK, 0);
        final GZIPOutputStream os = new GZIPOutputStream(t.getResponseBody());
        try {
            // send the data from `response` to the output stream
            response.writeTo(os);
        } finally {
            os.close();
        }
    } else {
        long contentLength = response.size();
        t.getResponseHeaders().set("Content-Length", String.valueOf(contentLength));
        if (t.getRequestMethod().equals("HEAD")) {
            contentLength = -1;
        }
        t.sendResponseHeaders(HttpURLConnection.HTTP_OK, contentLength);
        // send the data from `response` to the output stream
        response.writeTo(t.getResponseBody());
    }
    t.close();
    // `response` still contains the reply content
}

Possible solutions

Option A

The simplest alternative implementation would be to get rid of the response field and instantiate a new ByteArrayOutputStream within the handle method (a sketch of the resulting code follows the list below). Basically, this means replacing

ByteArrayOutputStream response = this.response.get();
response.reset();

with

ByteArrayOutputStream response = new ByteArrayOutputStream(1 << 20);

Resulting in:

  • PRO: lower retained memory; we no longer keep 5 response buffers in memory at all times.
  • CON: we instantiate one new object each time the endpoint is scraped. This object, though, becomes garbage-collectable right after the method exits.
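As a rough sketch (not a patch against the actual repository, and with the rest of the method body abbreviated), Option A could look like the fragment below. The initial capacity of 1 << 15 is just an assumption; since ByteArrayOutputStream grows on demand, the buffer could start much smaller than the current 1 << 20.

public void handle(HttpExchange t) throws IOException {
    String query = t.getRequestURI().getRawQuery();
    String contextPath = t.getHttpContext().getPath();
    // A fresh buffer per request: it becomes garbage-collectable as soon as handle() returns.
    ByteArrayOutputStream response = new ByteArrayOutputStream(1 << 15);
    OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
    // ... the rest of the method is unchanged: write the metrics into `osw`,
    // then copy `response` to t.getResponseBody(), compressed or not ...
}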

Option B

Another option came to my mind, but it seems more complex. Given that the method ends up writing to the OutputStream returned by t.getResponseBody(), an alternative would be to wrap that stream in a writer and let TextFormat.writeFormat write directly to it (a rough sketch follows the list below). This way, no new ByteArrayOutputStream needs to be created to handle the request. Going this way, however, the changes are not trivial; for example, the lines

long contentLength = response.size();
t.getResponseHeaders().set("Content-Length", String.valueOf(contentLength));

rely on the full reply being buffered in the ByteArrayOutputStream before the headers are sent.

In this case:

  • PRO: the same PROs as before, plus no ByteArrayOutputStream allocation at all
  • CON: a deeper code refactoring and some open points
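For completeness, here is a rough, untested sketch of what Option B might look like, assuming the Content-Length header is simply dropped in favour of chunked transfer (passing 0 to sendResponseHeaders lets com.sun.net.httpserver choose chunked encoding). Compression, HEAD requests and the /-/healthy path are left out; they are part of the open points mentioned above, as is the error-handling concern raised in the comments below.

public void handle(HttpExchange t) throws IOException {
    String contentType = TextFormat.chooseContentType(t.getRequestHeaders().getFirst("Accept"));
    t.getResponseHeaders().set("Content-Type", contentType);
    // No Content-Length: a length of 0 makes the built-in server use chunked encoding.
    t.sendResponseHeaders(HttpURLConnection.HTTP_OK, 0);
    // Write the metrics straight into the response body, with no intermediate buffer.
    OutputStreamWriter osw = new OutputStreamWriter(t.getResponseBody(), Charset.forName("UTF-8"));
    try {
        TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
    } finally {
        osw.close();
        t.close();
    }
    // Note: any exception thrown by writeFormat() now happens after the 200 status has
    // already been sent, so the response code can no longer be corrected.
}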

EDIT: Description improved as per comment https://github.com/prometheus/client_java/issues/703#issuecomment-930136163

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 14 (11 by maintainers)

Top GitHub Comments

1 reaction
brian-brazil commented, Jan 17, 2022

The TextFormat.writeFormat can throw exceptions, which will cause the request to fail.

1 reaction
brian-brazil commented, Jan 17, 2022

We need to know if there’s an exception when generating the output to set the response code correctly.
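To illustrate the maintainers' point with an assumption-laden sketch (not code from the project): with a buffered response, a failure in TextFormat.writeFormat can still be turned into a non-200 status, because no headers have been sent yet. With the direct-streaming approach of Option B, the 200 status is already on the wire by the time the exception surfaces.

// Hypothetical fragment: buffering first keeps the status code under our control.
ByteArrayOutputStream response = new ByteArrayOutputStream();
OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
try {
    TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
    osw.flush();
} catch (IOException e) {
    // Nothing has been sent yet, so the failure can still be reported properly.
    t.sendResponseHeaders(HttpURLConnection.HTTP_INTERNAL_ERROR, -1);
    t.close();
    return;
}
t.sendResponseHeaders(HttpURLConnection.HTTP_OK, response.size());
response.writeTo(t.getResponseBody());
t.close();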
