Use of ThreadLocal storage in HTTPServer seems to result in inefficient memory allocation


Disclaimer

I might be missing something in the design of the class (I'm sorry if this is the case), and yet I think it's worth checking and possibly suggesting alternative implementations.

TL;DR

The class HTTPServer in client_java/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java seems to allocate more memory than is actually needed, and to hold on to it for longer than necessary.

Description

To handle HTTP requests, the class relies on a ByteArrayOutputStream held in ThreadLocal storage. This buffer is:

  • as big as the serialized reply, and allocated 5 times, once per thread in the pool
  • never re-used, since its content is overwritten for each incoming request
  • never de-allocated or cleaned up after the reply has been delivered

Basically, it looks as if the class is implemented to avoid instantiating new ByteArrayOutputStream objects and, to achieve this, it pays the maximum cost in terms of retained memory.
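For illustration only, here is a minimal, self-contained sketch (not code from the project) of the allocation pattern described above, assuming the 5-thread pool and the 1 << 20 initial capacity discussed in this issue: every worker thread that ever serves a request ends up retaining its own buffer through the ThreadLocal, and that buffer stays reachable for as long as the thread lives.

import java.io.ByteArrayOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadLocalBufferDemo {

    // Same pattern as HTTPServer's LocalByteArray: one 1 MiB buffer per thread.
    private static final ThreadLocal<ByteArrayOutputStream> BUFFER =
            ThreadLocal.withInitial(() -> new ByteArrayOutputStream(1 << 20));

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(5); // HTTPServer uses a 5-thread pool

        // Simulate 20 "scrapes": each task reuses its own thread's buffer,
        // but every distinct worker thread allocates and retains its own copy.
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> {
                ByteArrayOutputStream response = BUFFER.get(); // allocated on first use per thread
                response.reset();                              // old content discarded, capacity kept
                System.out.println(Thread.currentThread().getName()
                        + " -> buffer " + System.identityHashCode(response));
            });
        }

        // In a long-lived server the pool is never shut down, so up to 5 buffers of at
        // least 1 MiB each stay reachable through the worker threads' ThreadLocal maps
        // for the whole lifetime of the process.
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}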

Code walkthrough

The following code is from the HTTPServer class, as it is currently on the master branch, version 0.12.1-SNAPSHOT. The class has a member response of type LocalByteArray, which is a ThreadLocal ByteArrayOutputStream:

private static class LocalByteArray extends ThreadLocal<ByteArrayOutputStream> {
    @Override
    protected ByteArrayOutputStream initialValue() {
        return new ByteArrayOutputStream(1 << 20);
    }
}

This is how the class handles incoming HTTP requests; the comments are mine:

public void handle(HttpExchange t) throws IOException {
    String query = t.getRequestURI().getRawQuery();
    String contextPath = t.getHttpContext().getPath();
    // take, from the thread local storage, the output stream containing the previous response
    // this is the only reference in the code to `this.response`
    ByteArrayOutputStream response = this.response.get();
    // throw away the previous response
    response.reset();
    // wrap the output stream with a writer
    OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
    if ("/-/healthy".equals(contextPath)) {
        osw.write(HEALTHY_RESPONSE);
    } else {
        String contentType = TextFormat.chooseContentType(t.getRequestHeaders().getFirst("Accept"));
        t.getResponseHeaders().set("Content-Type", contentType);
        Predicate<String> filter = sampleNameFilterSupplier == null ? null : sampleNameFilterSupplier.get();
        filter = SampleNameFilter.restrictToNamesEqualTo(filter, parseQuery(query));
        // write the reply content into the writer
        if (filter == null) {
            TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
        } else {
            TextFormat.writeFormat(contentType, osw, registry.filteredMetricFamilySamples(filter));
        }
    }

    // flush the writer and write the new reply into `response`
    osw.close();

    // decide whether or not to compress the data
    if (shouldUseCompression(t)) {
        t.getResponseHeaders().set("Content-Encoding", "gzip");
        t.sendResponseHeaders(HttpURLConnection.HTTP_OK, 0);
        final GZIPOutputStream os = new GZIPOutputStream(t.getResponseBody());
        try {
            // send the data from `response` to the output stream
            response.writeTo(os);
        } finally {
            os.close();
        }
    } else {
        long contentLength = response.size();
        t.getResponseHeaders().set("Content-Length", String.valueOf(contentLength));
        if (t.getRequestMethod().equals("HEAD")) {
            contentLength = -1;
        }
        t.sendResponseHeaders(HttpURLConnection.HTTP_OK, contentLength);
        // send the data from `response` to the output stream
        response.writeTo(t.getResponseBody());
    }
    t.close();
    // `response` still contains the reply content
}

Possible solutions

Option A

The simplest alternative implementation would be to get rid of the response field and instantiate a new ByteArrayOutputStream within the handle method (a sketch of the resulting code follows the list below). Basically, this means replacing

ByteArrayOutputStream response = this.response.get();
response.reset();

with

ByteArrayOutputStream response = new ByteArrayOutputStream(1 << 20);

Resulting in:

  • PRO: lower retained memory; we no longer keep 5 response buffers in memory at all times.
  • CON: we instantiate one new object each time the endpoint is scraped. This object, though, becomes garbage-collectable right after the method exits.
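As a rough sketch (not a patch against the actual repository, and with the rest of the method body abbreviated), Option A could look like the fragment below. The initial capacity of 1 << 15 is just an assumption; since ByteArrayOutputStream grows on demand, the buffer could start much smaller than the current 1 << 20.

public void handle(HttpExchange t) throws IOException {
    String query = t.getRequestURI().getRawQuery();
    String contextPath = t.getHttpContext().getPath();
    // A fresh buffer per request: it becomes garbage-collectable as soon as handle() returns.
    ByteArrayOutputStream response = new ByteArrayOutputStream(1 << 15);
    OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
    // ... the rest of the method is unchanged: write the metrics into `osw`,
    // then copy `response` to t.getResponseBody(), compressed or not ...
}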

Option B

Another option came to my mind, but it seems more complex. Given that the method ends up writing to the OutputStream returned by t.getResponseBody(), an alternative would be to wrap that stream in a writer and let TextFormat.writeFormat write directly to it (a rough sketch follows the list below). This way, no new ByteArrayOutputStream needs to be created to handle the request. Going this way, however, the changes are not trivial; for example, the lines

long contentLength = response.size();
t.getResponseHeaders().set("Content-Length", String.valueOf(contentLength));

rely on the full reply being buffered in the ByteArrayOutputStream before the headers are sent.

In this case:

  • PRO: the same PROs as before, plus no ByteArrayOutputStream allocation at all
  • CON: a deeper code refactoring and some open points
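For completeness, here is a rough, untested sketch of what Option B might look like, assuming the Content-Length header is simply dropped in favour of chunked transfer (passing 0 to sendResponseHeaders lets com.sun.net.httpserver choose chunked encoding). Compression, HEAD requests and the /-/healthy path are left out; they are part of the open points mentioned above, as is the error-handling concern raised in the comments below.

public void handle(HttpExchange t) throws IOException {
    String contentType = TextFormat.chooseContentType(t.getRequestHeaders().getFirst("Accept"));
    t.getResponseHeaders().set("Content-Type", contentType);
    // No Content-Length: a length of 0 makes the built-in server use chunked encoding.
    t.sendResponseHeaders(HttpURLConnection.HTTP_OK, 0);
    // Write the metrics straight into the response body, with no intermediate buffer.
    OutputStreamWriter osw = new OutputStreamWriter(t.getResponseBody(), Charset.forName("UTF-8"));
    try {
        TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
    } finally {
        osw.close();
        t.close();
    }
    // Note: any exception thrown by writeFormat() now happens after the 200 status has
    // already been sent, so the response code can no longer be corrected.
}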

EDIT: Description improved as per comment https://github.com/prometheus/client_java/issues/703#issuecomment-930136163

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 14 (11 by maintainers)

Top GitHub Comments

1 reaction
brian-brazil commented, Jan 17, 2022

The TextFormat.writeFormat can throw exceptions, which will cause the request to fail.

1 reaction
brian-brazil commented, Jan 17, 2022

We need to know if there’s an exception when generating the output to set the response code correctly.
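To illustrate the maintainers' point with an assumption-laden sketch (not code from the project): with a buffered response, a failure in TextFormat.writeFormat can still be turned into a non-200 status, because no headers have been sent yet. With the direct-streaming approach of Option B, the 200 status is already on the wire by the time the exception surfaces.

// Hypothetical fragment: buffering first keeps the status code under our control.
ByteArrayOutputStream response = new ByteArrayOutputStream();
OutputStreamWriter osw = new OutputStreamWriter(response, Charset.forName("UTF-8"));
try {
    TextFormat.writeFormat(contentType, osw, registry.metricFamilySamples());
    osw.flush();
} catch (IOException e) {
    // Nothing has been sent yet, so the failure can still be reported properly.
    t.sendResponseHeaders(HttpURLConnection.HTTP_INTERNAL_ERROR, -1);
    t.close();
    return;
}
t.sendResponseHeaders(HttpURLConnection.HTTP_OK, response.size());
response.writeTo(t.getResponseBody());
t.close();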
