Zuul leaves some connections in the CLOSE_WAIT state for later reuse, but some are never reused and stay in that state forever, eventually blocking further requests
I have a Zuul server that proxies all my requests to routes autodiscovered via Eureka.
This works fine most of the time. However, I have noticed some very odd behaviour that occurs sporadically and can only partially be recreated.
After making multiple simultaneous requests, for example loading the swagger-ui.html page for a given API description (which fetches not only the page itself but also numerous webjars and resources), some connections end up in a CLOSE_WAIT state.
tcp6 1 0 host:54470 host:37612 CLOSE_WAIT user 425593599 4396/java
tcp6 1 0 host:57724 host:37612 CLOSE_WAIT user 426384390 4396/java
tcp6 1 0 host:59402 host:52887 CLOSE_WAIT user 425517966 4396/java
tcp6 1 0 host:59403 host:52887 CLOSE_WAIT user 425489000 4396/java
tcp6 1 0 host:59404 host:52887 CLOSE_WAIT user 425518687 4396/java
tcp6 1 0 host:59405 host:52887 CLOSE_WAIT user 425469338 4396/java
tcp6 1 0 host:59406 host:52887 CLOSE_WAIT user 425518688 4396/java
tcp6 1 0 host:59407 host:52887 CLOSE_WAIT user 425476214 4396/java
tcp6 1 0 host:60118 host:37612 CLOSE_WAIT user 426773630 4396/java
tcp6 1 0 host:60154 host:37612 CLOSE_WAIT user 426810662 4396/java
tcp6 1 0 host:60155 host:37612 CLOSE_WAIT user 426824573 4396/java
tcp6 1 0 host:60156 host:37612 CLOSE_WAIT user 426821100 4396/java
tcp6 1 0 host:60157 host:37612 CLOSE_WAIT user 426825547 4396/java
tcp6 1 0 host:60158 host:37612 CLOSE_WAIT user 426820353 4396/java
tcp6 1 0 host:60159 host:37612 CLOSE_WAIT user 426618721 4396/java
tcp6 1 0 host:60160 host:37612 CLOSE_WAIT user 426802727 4396/java
tcp6 1 0 host:60161 host:37612 CLOSE_WAIT user 426825548 4396/java
tcp6 1 0 host:60162 host:37612 CLOSE_WAIT user 426824574 4396/java
tcp6 1 0 host:60163 host:37612 CLOSE_WAIT user 426618722 4396/java
tcp6 1 0 host:60167 host:37612 CLOSE_WAIT user 426689993 4396/java
tcp6 1 0 host:60168 host:37612 CLOSE_WAIT user 426618745 4396/java
tcp6 1 0 host:60169 host:37612 CLOSE_WAIT user 426796620 4396/java
tcp6 1 0 host:60170 host:37612 CLOSE_WAIT user 426824617 4396/java
tcp6 1 0 host:60171 host:37612 CLOSE_WAIT user 426827273 4396/java
Process 4396 in this case is my IDE, from which I was debugging the Zuul server. When I refresh the same page in the browser again, many of the connections are successfully closed, though more pop up after a while. The behaviour also happens, although less frequently, when making numerous cURL requests to any given route.
I dug around in SimpleHostRoutingFilter, which uses a PoolingHttpClientConnectionManager, and noticed something peculiar:
- The TTL of the default configuration is set to -1, i.e. infinite.
- The connections that are in CLOSE_WAIT get reused for establishing new connections in line 318 of PoolingHttpClientConnectionManager (something which I find extremely odd, but I am unsure if this might be a standard Java approach).
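To illustrate the TTL point, here is a minimal sketch assuming plain Apache HttpClient 4.x (the class and constructors are from that library; the 30-second value is purely illustrative, not anything Zuul configures). It shows the difference between the default infinite TTL and an explicit one:

import java.util.concurrent.TimeUnit;

import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class ConnectionManagerTtlSketch {
    public static void main(String[] args) {
        // Default constructor: time-to-live is -1, so pooled connections are
        // considered reusable indefinitely, however long they have been idle.
        PoolingHttpClientConnectionManager infiniteTtl =
                new PoolingHttpClientConnectionManager();

        // Explicit TTL: a pooled connection older than 30 seconds is closed
        // and replaced instead of being leased out again.
        PoolingHttpClientConnectionManager boundedTtl =
                new PoolingHttpClientConnectionManager(30, TimeUnit.SECONDS);
    }
}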
However, some connections live on eternally in a CLOSE_WAIT state and I cannot get rid of them. The other end of the route does not have any open connections still lying around; it is only the Zuul server that fails to close the connections stuck in CLOSE_WAIT.
Eventually these connections clog up the pool and I stop getting responses from my services altogether; I have not seen Zuul clean them up even after more than a day.
What is also odd is that the cap seems to be 50 connections, although the maxPerRoute parameter is set to 20.
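For context, these limits should be tunable from the outside. A minimal sketch of the relevant Spring Boot properties, assuming the zuul.host.* keys exposed by Spring Cloud Netflix (property names and default values as I understand them, worth verifying against your version):

# overall pool size and per-route limit used by SimpleHostRoutingFilter
zuul.host.maxTotalConnections=200
zuul.host.maxPerRouteConnections=20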
Is this a known issue? Is there a known workaround? I was planning to subclass or replace SimpleHostRoutingFilter with my own implementation and pass it a connection pool manager configured with some TTL value to see whether that improves things, but since the effort required is non-trivial, I thought I should first ask whether this is a known issue. A sketch of the kind of replacement I had in mind follows below.
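For completeness, this is roughly what I would try, assuming the Apache HttpClient 4.5 builder API (evictExpiredConnections and evictIdleConnections exist on HttpClientBuilder, and PoolingHttpClientConnectionManager has a TTL constructor). The class name and the 30-second values are my own placeholders, and whether SimpleHostRoutingFilter can be handed such a client cleanly is exactly what would need verifying:

import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class TtlAwareHttpClientFactory {

    // Builds a client whose pooled connections expire after a fixed TTL and
    // whose idle/expired connections are actively evicted by a background
    // thread, instead of lingering in CLOSE_WAIT until a reuse attempt.
    public static CloseableHttpClient newClient() {
        PoolingHttpClientConnectionManager cm =
                new PoolingHttpClientConnectionManager(30, TimeUnit.SECONDS);
        cm.setMaxTotal(200);           // mirror the zuul.host.* limits
        cm.setDefaultMaxPerRoute(20);

        return HttpClients.custom()
                .setConnectionManager(cm)
                .evictExpiredConnections()
                .evictIdleConnections(30, TimeUnit.SECONDS)
                .build();
    }
}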
we’ll talk about it tomorrow morning.
@tkvangorder we will most likely try to get the release done next week.