Support for customizing the max connections and max pools


When Boto is used to perform multiple requests concurrently, for example to speed up write throughput for DynamoDB, it uses the default parameters of the HTTPAdapter [1], where the pool size is currently set to 10.

This means that any connections beyond this limit are dropped rather than recycled. When a new connection is needed and the first 10 are busy, a full TCP handshake is performed, which adds significant latency to the overall operation.

The idea is to add the proper fields to customize the current max connections and max pools, giving the user the freedom to set these values to something other than the defaults.

Does this make sense to you? If it does, I can try to send a pull request with a tentative solution.

[1] https://github.com/boto/botocore/blob/develop/botocore/vendored/requests/adapters.py#L82

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Reactions: 4
  • Comments: 10 (3 by maintainers)

Top GitHub Comments

8 reactions
wiltzius commented, Sep 4, 2016

@jamesls your understanding of the issue is correct. urllib3 maintains a cache of idle TCP connections (the “pool”), but it’s set at a fixed size. If there are no connections available from the pool, it creates a new one. When it’s finished with that new connection it tries to return it to the pool, but if the pool is already full then it simply discards it. urllib3 issues a logging warning when it does so, since this essentially represents an inefficiency where the pool size is smaller than the number of simultaneous requests – future connections will perform the full TCP connection set up. This is not a correctness issue, it’s just an efficiency / performance issue.
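To make the behavior described above concrete, here is a minimal standard-library sketch of a fixed-size pool that discards connections it has no room for. It is a toy model, not urllib3's actual code: `FakeConnection`, `FixedSizePool`, and `run_burst` are hypothetical names, and the pool size of 10 and burst size of 25 are example values.

```python
import queue


class FakeConnection:
    """Stand-in for a TCP connection; counts how many were ever opened."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1


class FixedSizePool:
    """Toy pool: keeps at most `maxsize` idle connections, discards the rest."""

    def __init__(self, maxsize):
        self._idle = queue.LifoQueue(maxsize)
        self.discarded = 0

    def get(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            return FakeConnection()         # pool empty: open a new connection

    def put(self, conn):
        try:
            self._idle.put_nowait(conn)     # return to the pool if there is room
        except queue.Full:
            self.discarded += 1             # pool full: the connection is dropped


def run_burst(pool, concurrency):
    """Check out `concurrency` connections at once, then return them all."""
    conns = [pool.get() for _ in range(concurrency)]
    for conn in conns:
        pool.put(conn)


pool = FixedSizePool(maxsize=10)
run_burst(pool, concurrency=25)  # 25 connections opened, 15 discarded on return
run_burst(pool, concurrency=25)  # 10 reused from the pool, 15 opened again
print(FakeConnection.opened, pool.discarded)
```

With a pool of 10 and two bursts of 25, the sketch opens 40 connections and discards 30. With a pool of 25, the second burst would reuse all 25 connections from the first.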

Both urllib3 and the Python requests library that wraps it expose settings to change the pool size, but because the connections are created under the hood by boto we don’t have access to these settings from the boto client (without monkeypatching or similar).

Plumbing through the pool_connections setting that the requests HTTPAdapter exposes (as mentioned in the original post on this issue) would allow us to set the pool size to whatever is appropriate for our load. It should also be easy to do: it’s simply a matter of exposing a parameter and then passing its value to the underlying requests library. A more general solution would be to expose overrides for all the default HTTPAdapter settings, but this is the only one I really care about.

I’ll note that although this issue is probably worst for services like DynamoDB, in our case we trigger the warning when loading more than 10 photos from S3 simultaneously, so hopefully whatever solution you arrive at is not specific to DynamoDB.

Lastly, simply for reference if you’re curious, here’s the place in the urllib3 connection pool code where the warning is issued:

https://github.com/shazow/urllib3/blob/65b8c52c16dee5c3a523de2c1c21853ba0e581f2/urllib3/connectionpool.py#L257

and here’s the docs for the connection pool:

https://urllib3.readthedocs.io/en/1.4/pools.html

and here’s a sort of how-to article on the setting in the requests library with way more information than you probably want on the subject:

https://laike9m.com/blog/requests-secret-pool_connections-and-pool_maxsize,89/

Thanks!

1 reaction
jamesls commented, Sep 8, 2016

Thanks for the info. One last thing I noticed: while pool_connections is the number of connection pools to be cached in memory at any given point (1 connection pool per host), pool_maxsize is the total number of connections to keep in a single connection pool. pool_maxsize is what we’d need to accommodate the multithreaded scenario that’s been outlined here, but I’m not sure pool_connections matters that much (it would matter if you’re accessing multiple AWS services). In requests, they use the same default value for both:


class HTTPAdapter(BaseAdapter):
    def __init__(self, pool_connections=DEFAULT_POOLSIZE,
                 pool_maxsize=DEFAULT_POOLSIZE, max_retries=DEFAULT_RETRIES,
                 pool_block=DEFAULT_POOLBLOCK):

I wonder if we should simplify and do something similar: expose a max_poolsize and just set that value for both pool_connections and pool_maxsize.
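For reference, this is roughly what the single-value approach looks like when using the requests library directly. It is only a sketch of the idea, not how boto wires it up: `make_session` and `max_poolsize` are hypothetical names, and 50 is an arbitrary example value.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(max_poolsize):
    """Build a Session whose adapter uses one value for both pool settings.

    `make_session` and `max_poolsize` are hypothetical names for this sketch.
    """
    adapter = HTTPAdapter(
        pool_connections=max_poolsize,  # number of per-host pools to cache
        pool_maxsize=max_poolsize,      # connections kept in each pool
    )
    session = requests.Session()
    # Replace the default adapters so every request goes through this one.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_session(max_poolsize=50)  # 50 is an arbitrary example value
```

Requests made through this session from up to 50 threads against one host should then be able to return their connections to the pool instead of having them discarded.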
