Support for customizing the max connections and max pools
When Boto is used to perform multiple requests concurrently, for example to speed up write throughput for DynamoDB, it uses the default parameters of the requests HTTPAdapter [1], whose pool size is currently 10.
Any connection beyond that limit is dropped rather than returned to the pool, so when a new connection is needed while the first 10 are busy, a full TCP handshake round trip is performed, adding significant latency to the overall operation.
The idea would be to add the appropriate fields to customize the maximum number of connections and pools, giving users the freedom to set these values to something other than the defaults.
Does this make sense to you? If it does, I can try to send a pull request with a tentative solution.
[1] https://github.com/boto/botocore/blob/develop/botocore/vendored/requests/adapters.py#L82
Issue Analytics
- State:
- Created 8 years ago
- Reactions:4
- Comments:10 (3 by maintainers)

@jamesls your understanding of the issue is correct. urllib3 maintains a cache of idle TCP connections (the “pool”), but it’s set at a fixed size. If there are no connections available from the pool, it creates a new one. When it’s finished with that new connection it tries to return it to the pool, but if the pool is already full then it simply discards it. urllib3 issues a logging warning when it does so, since this essentially represents an inefficiency where the pool size is smaller than the number of simultaneous requests – future connections will perform the full TCP connection set up. This is not a correctness issue, it’s just an efficiency / performance issue.
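The discard-when-full behavior described above can be sketched with a small stdlib-only model (a hypothetical `Pool` class for illustration, not urllib3's actual implementation, though urllib3's real pool is likewise backed by a `queue.LifoQueue`):

```python
import queue


class Pool:
    """Toy model of urllib3's fixed-size idle-connection pool."""

    def __init__(self, maxsize=10):
        self._idle = queue.LifoQueue(maxsize)
        self.discarded = 0  # connections dropped because the pool was full

    def get_connection(self):
        try:
            return self._idle.get(block=False)  # reuse an idle connection
        except queue.Empty:
            # No idle connection available: "open" a new one (this is
            # where the real pool pays the TCP handshake cost).
            return object()

    def release(self, conn):
        try:
            self._idle.put(conn, block=False)  # return to the pool for reuse
        except queue.Full:
            self.discarded += 1  # pool full: connection is simply dropped


pool = Pool(maxsize=10)
# Simulate 15 simultaneous requests: all 15 connections checked out at once.
conns = [pool.get_connection() for _ in range(15)]
for c in conns:
    pool.release(c)
print(pool.discarded)  # 5 connections could not be re-pooled
```

The 5 discarded connections are exactly the ones that trigger urllib3's "pool is full" warning; subsequent bursts of 15 requests will repeat the extra handshakes.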
Both urllib3 and the Python requests library that wraps it expose settings to change the pool size, but because the connections are created under the hood by boto we don’t have access to these settings from the boto client (without monkeypatching or similar).
Plumbing through the `pool_connections` setting that the requests HTTPAdapter exposes (as mentioned in the original post on this issue) would allow us to set the pool size to whatever is appropriate for our load. It should also be easy to do: it's simply exposing a parameter and then passing that value through to the underlying requests library. A more general solution would be to expose overrides for all the default HTTPAdapter settings, but this is the only one I really care about.

I'll note that although this issue is probably worst for services like DynamoDB, in our case we trigger the warning when loading more than 10 photos from S3 simultaneously, so hopefully whatever solution you arrive at is not specific to DynamoDB.
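For reference, this is what setting those knobs looks like when you control the requests Session yourself (which is exactly what we can't do through boto today); the pool sizes here are arbitrary example values:

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# pool_connections: number of per-host pools cached in memory.
# pool_maxsize: idle connections kept in each per-host pool.
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=50)
# Mount the adapter for all http/https URLs handled by this session.
session.mount("https://", adapter)
session.mount("http://", adapter)
```

With `pool_maxsize=50`, up to 50 idle connections per host are kept for reuse instead of being discarded after the first 10.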
Lastly, simply for reference if you’re curious, here’s the place in the urllib3 connection pool code where the warning is issued:
https://github.com/shazow/urllib3/blob/65b8c52c16dee5c3a523de2c1c21853ba0e581f2/urllib3/connectionpool.py#L257
and here’s the docs for the connection pool:
https://urllib3.readthedocs.io/en/1.4/pools.html
and here’s a sort of how-to article on the setting in the requests library with way more information than you probably want on the subject:
https://laike9m.com/blog/requests-secret-pool_connections-and-pool_maxsize,89/
Thanks!
Thanks for the info. One last thing I noticed: while `pool_connections` is the number of connection pools to be cached in memory at any given point (1 connection pool per host), `pool_maxsize` is the total number of connections to keep in a single connection pool. `pool_maxsize` is what we'd need to accommodate the multithreaded scenario that's been outlined here, but I'm not sure if `pool_connections` matters that much (it would matter if you're accessing multiple AWS services). In requests, they use the same default value for both.

I wonder if we should simplify and do something similar: expose a `max_poolsize` and just set that value for both `pool_connections` and `pool_maxsize`.
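A minimal sketch of that simplification (the `max_poolsize` name and the helper below are hypothetical, just illustrating the mapping of one user-facing knob onto requests' two parameters):

```python
def adapter_options(max_poolsize=10):
    """Map a single user-facing pool-size knob onto both HTTPAdapter settings."""
    return {
        "pool_connections": max_poolsize,  # number of per-host pools cached
        "pool_maxsize": max_poolsize,      # idle connections kept per pool
    }


print(adapter_options(50))
# {'pool_connections': 50, 'pool_maxsize': 50}
```

This mirrors what requests itself does, since it uses one default (10) for both values.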