TTL on connections
What kind of issue is this?
- Feature Request.
We are using OkHttp as an HTTP client in a multi-service application on AWS. Our services sit behind ELBs, and we use a connection pool per client (with one client per downstream service). The ELBs use DNS round robin with a TTL of 60 seconds; our idle timeout is set to 50 seconds.
We are seeing persistent connections that stay in use for long periods of time. In particular, we observe errors when a downstream service/stack gets replaced (which replaces the ELB nodes) and the pooled connections become stale.
Beyond the errors, we are also not fully leveraging scaled-out ELB nodes: connections in the pool live significantly longer than the DNS TTL, so new nodes receive no traffic.
This is unlikely to be a problem when OkHttp is used in mobile clients, where many clients connect to comparatively few load balancer nodes; in our case, a much smaller set of client nodes connects to the ELB nodes.
We have a couple of options here (as far as I can tell):
- Don't use ConnectionPools/persistent connections (setting `maxIdleConnections` to 0 seems to mostly do this).
- Manage our own implementation that periodically evicts connections regardless of their idle time.
The former might be OK for the parts that are not directly on the user request path, but it would be unfortunate if we couldn't use connection pooling at all. The latter is the approach we have implemented, but it has some gremlins: care needs to be taken to properly manage the lifetime of the client, its pool, and the scheduled task that periodically evicts connections.
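For illustration, a minimal sketch of the first option, assuming OkHttp 3.x: a pool configured to retain zero idle connections, so sockets are closed as soon as they fall idle.

```java
import java.util.concurrent.TimeUnit;

import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;

public final class NoIdleConnections {
  public static void main(String[] args) {
    // maxIdleConnections = 0: connections are still shared by concurrent
    // in-flight calls, but none are retained once idle, which effectively
    // disables persistent connections between requests.
    OkHttpClient client = new OkHttpClient.Builder()
        .connectionPool(new ConnectionPool(0, 1, TimeUnit.SECONDS))
        .build();
    // ... issue requests with `client` as usual.
  }
}
```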
Is there interest in allowing clients to specify a connection time-to-live in addition to the idle timeout? Managing this as part of the client or (more likely) the pool seems much better in terms of properly managing the lifecycle of the pool and its management thread.
Or are we simply misusing OkHttp here?
Top GitHub Comments
That is not how HTTP works. Clients aren't supposed to need special configuration for their endpoints; in particular, browsers don't do this.
Instead, the solution must be server-side.
As one option, yes. Another option would be to monitor DNS changes and only recycle connections when the DNS records actually change.
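For illustration, a rough sketch of that second idea under OkHttp 3.x; the wrapper class and its name are hypothetical, not an OkHttp API. It remembers the last resolution per host and evicts idle pooled connections whenever the record set changes.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import okhttp3.ConnectionPool;
import okhttp3.Dns;

// Hypothetical wrapper: resolve via the system resolver, and when a host's
// record set differs from the previous answer, evict idle pooled connections.
final class DnsChangeEvictingDns implements Dns {
  private final ConnectionPool pool;
  private final Map<String, Set<InetAddress>> lastSeen = new ConcurrentHashMap<>();

  DnsChangeEvictingDns(ConnectionPool pool) {
    this.pool = pool;
  }

  @Override public List<InetAddress> lookup(String hostname) throws UnknownHostException {
    List<InetAddress> addresses = Dns.SYSTEM.lookup(hostname);
    Set<InetAddress> current = new HashSet<>(addresses);
    Set<InetAddress> previous = lastSeen.put(hostname, current);
    if (previous != null && !previous.equals(current)) {
      pool.evictAll(); // Coarse-grained: drops all idle connections, not just this host's.
    }
    return addresses;
  }
}
```

Wired up via `new OkHttpClient.Builder().dns(new DnsChangeEvictingDns(pool)).connectionPool(pool)`. One caveat: OkHttp only performs DNS lookups when it establishes a new connection, so this alone won't notice record changes while all traffic rides existing pooled connections.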
We ended up creating our own `Call.Factory` that uses a scheduled task to evict the pool on a fixed schedule (currently 5 minutes). This effectively imposes an upper bound on the connection TTL (modulo connections that are currently active and won't be evicted). Instead of doing this from the consumer side, I think it would be beneficial if `okhttp3.ConnectionPool#cleanup` cleaned up not only idle connections but also any connection that isn't currently in use and has exceeded a connection TTL configured on the pool.
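A minimal sketch of that workaround, assuming OkHttp 3.x and a pool shared with the client (the 5-minute interval mirrors the schedule mentioned above, the 50-second keep-alive mirrors the idle timeout from the issue):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;

public final class ScheduledPoolEviction {
  public static void main(String[] args) {
    // 50-second idle timeout, matching the setup described in the issue.
    ConnectionPool pool = new ConnectionPool(5, 50, TimeUnit.SECONDS);
    OkHttpClient client = new OkHttpClient.Builder()
        .connectionPool(pool)
        .build();

    // Evict the pool every 5 minutes, bounding connection lifetime. Note that
    // evictAll() only closes idle connections: calls in flight when the task
    // fires keep their connections until they are returned to the pool.
    ScheduledExecutorService evictor = Executors.newSingleThreadScheduledExecutor();
    evictor.scheduleAtFixedRate(pool::evictAll, 5, 5, TimeUnit.MINUTES);

    // Remember to shut down `evictor` when the client is no longer needed;
    // that is exactly the lifecycle gremlin described in the issue.
  }
}
```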