A connection to <address> was leaked. Did you forget to close a response body?
I understand this topic comes up from time to time in the issues here, but I’m not actually able to find the cause of the problem I’m encountering.
I’m using the Fabric8 Kubernetes Client (https://github.com/fabric8io/kubernetes-client), which uses OkHttp to make calls to a RESTful API (Kubernetes, to be specific). They raised a similar issue for their Watch Manager (https://github.com/square/okhttp/pull/4374), but I’m still seeing the problem when I make direct calls through their client.
I’ve tracked it down to the following method in the Kubernetes Client:
```java
protected <T> T handleResponse(OkHttpClient client, Request.Builder requestBuilder, Class<T> type,
    Map<String, String> parameters) throws ExecutionException, InterruptedException, KubernetesClientException, IOException {
  VersionUsageUtils.log(this.resourceT, this.apiVersion);
  Request request = requestBuilder.build();
  Response response = client.newCall(request).execute();
  try (ResponseBody body = response.body()) {
    assertResponseCode(request, response);
    if (type != null) {
      return Serialization.unmarshal(body.byteStream(), type, parameters);
    } else {
      return null;
    }
  } catch (Exception e) {
    if (e instanceof KubernetesClientException) {
      throw e;
    }
    throw requestException(request, e);
  } finally {
    if (response != null && response.body() != null) {
      response.body().close();
    }
  }
}
```
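For comparison, my understanding of the pattern OkHttp recommends is to treat the whole Response as the closeable resource, so the body (and its connection) is released on every exit path. A minimal sketch of that pattern, not the Fabric8 code itself:

```java
import java.io.IOException;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

class ClosePattern {
  static String fetch(OkHttpClient client, String url) throws IOException {
    Request request = new Request.Builder().url(url).build();
    // Response implements Closeable, so try-with-resources releases the
    // body (and returns the connection to the pool) even on exceptions.
    try (Response response = client.newCall(request).execute()) {
      if (!response.isSuccessful()) {
        throw new IOException("Unexpected code " + response);
      }
      return response.body().string(); // fully consumes the body
    }
  }
}
```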
The scenario is multiple calls to the same client in quick succession (10–50 at once), and sporadically I see the error:
2018-11-19 10:57:46,735 [OkHttp ConnectionPool] [WARN ] okhttp3.OkHttpClient - A connection to https://<host>:6443/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
When I set the logger to FINE, I see it complaining about the “client.newCall(request).execute();” call in “handleResponse” above.
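For completeness, here is roughly how I enabled that output. The handler setup is my own addition: the warning only mentions setting the logger level, but java.util.logging’s default console handler prints INFO and above, so a FINE-level handler is also needed to see the allocation stack traces.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import okhttp3.OkHttpClient;

class LeakTracing {
  static void enable() {
    // Raise the OkHttpClient logger to FINE, as the warning suggests...
    Logger logger = Logger.getLogger(OkHttpClient.class.getName());
    logger.setLevel(Level.FINE);
    // ...and attach a handler that will actually emit FINE records.
    ConsoleHandler handler = new ConsoleHandler();
    handler.setLevel(Level.FINE);
    logger.addHandler(handler);
  }
}
```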
When I create a new client for each request, I still see the issue, but it is dramatically reduced (from roughly 10 messages per 100 calls to 1 per 100). This leads me to believe there may be some sort of race condition, or perhaps the exception handling in that path is reporting the error incorrectly. I ran my code for the entire weekend under YourKit and saw thousands of the messages, but oddly, after forcing a GC I didn’t see a large growth in any OkHttp objects (StreamAllocation objects increased by 160).
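For context, my test driver is shaped roughly like this: a burst of concurrent GET calls against one client. The URL and counts are placeholders, and this goes through OkHttp directly rather than the Fabric8 wrapper:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

class BurstRepro {
  public static void main(String[] args) throws InterruptedException {
    OkHttpClient client = new OkHttpClient();
    Request request = new Request.Builder()
        .url("https://kubernetes.example:6443/apis/apps/v1/deployments") // placeholder
        .build();

    // Fire 50 calls across 10 threads to mimic the burst described above.
    ExecutorService pool = Executors.newFixedThreadPool(10);
    for (int i = 0; i < 50; i++) {
      pool.submit(() -> {
        try (Response response = client.newCall(request).execute()) {
          response.body().string(); // drain and close each body
        } catch (Exception e) {
          e.printStackTrace();
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.MINUTES);
  }
}
```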
I attempted to create a test, but MockWebServer does not reproduce the issue, and I’m not aware of a public server you might want to use for your own testing. I even tried the Kubernetes mock server and couldn’t reproduce it there either.
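My MockWebServer attempt looked roughly like this (the response body is a stand-in for the real Kubernetes payloads):

```java
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.mockwebserver.MockResponse;
import okhttp3.mockwebserver.MockWebServer;

class MockRepro {
  public static void main(String[] args) throws Exception {
    MockWebServer server = new MockWebServer();
    for (int i = 0; i < 100; i++) {
      server.enqueue(new MockResponse().setBody("{\"items\":[]}"));
    }
    server.start();

    OkHttpClient client = new OkHttpClient();
    Request request = new Request.Builder()
        .url(server.url("/apis/apps/v1/deployments"))
        .build();
    for (int i = 0; i < 100; i++) {
      try (Response response = client.newCall(request).execute()) {
        response.body().string(); // drain and close each body
      }
    }
    // No leak warning appears here, unlike against the real cluster.
    server.shutdown();
  }
}
```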
I know it’s odd to come here for this issue, but the code they’re using looks solid, and the warning seems to point at the “client.newCall(request).execute();” call.
Any help would be appreciated!
Top GitHub Comments
Not at a computer right now, but the interesting thing is that no exceptions are thrown. In the Fabric8 test class I made, the only things reported are the resource-leak warnings.
As for reproducing it, I was only able to do so against Kubernetes 1.8 clusters and below. I tried 6–8 different clusters, and each time I hit a Kubernetes v1.9+ API it worked fine (no warnings). In each case the clusters used the same client-cert authentication (not a token). It was also specific to certain endpoints (apps/Deployment, but not v1/nodes, for example).
All the pieces together seem to indicate that in situations with many/constant calls to the API, the InputStream is not closed in all cases. It’s possible an exception is swallowed somewhere, but there were no indications of that in the log.
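To be concrete about the failure mode I mean, this is the sort of path that would produce the warning — illustrative only, not the actual Fabric8 code:

```java
import java.io.InputStream;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

class LeakShape {
  static void leakyRead(OkHttpClient client, Request request) throws Exception {
    Response response = client.newCall(request).execute();
    InputStream in = response.body().byteStream();
    byte[] buf = new byte[4096];
    in.read(buf); // partial read; neither the stream nor the response is closed
    // When the unclosed response is later garbage collected, OkHttp's
    // ConnectionPool logs "A connection to ... was leaked."
  }
}
```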
I’ll try to replicate it with your code when I log in tonight 😃
It looks like this issue magically fixed itself. I’ve been trying different things (large JSON responses, etc.) and cannot reproduce it.
Suddenly, even the clusters that were encountering the issue are no longer hitting it. Given that, and the fact that everyone should be on 1.10 and above by now (https://www.zdnet.com/article/kubernetes-first-major-security-hole-discovered/), versions which always worked for me, I’m going to close this.
Thanks for looking into it!