Closed keep-alive connections not handled correctly
I have run into a case where a session-based GET request fails:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 376, in _make_request
httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 378, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/adapters.py", line 370, in send
timeout=timeout
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 609, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/util/retry.py", line 245, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/packages/six.py", line 309, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 378, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
requests.packages.urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 11, in <module>
r = s.get(url, allow_redirects=False)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 480, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/adapters.py", line 412, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
I finally narrowed the case to a situation where:
- a request is made where the connection is kept open (Session), e.g. keep-alive is enabled
- the connection is to a host that is reverse-proxied by nginx
- nginx configuration is reloaded
- a new request is made using the session
I believe this is because nginx closes the idle connection (at step 3), and when the next request is generated (step 4) the client tries to write to the closed socket, which in turn generates a connection abort exception.
As far as I understand Keep-Alive, nginx is doing everything correctly. From RFC 2616:
A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the “idle” connection. From the server’s point of view, the connection is being closed while it was idle, but from the client’s point of view, a request is in progress.
This means that clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent (see section 9.1.2). Non-idempotent methods or sequences MUST NOT be automatically retried, although user agents MAY offer a human operator the choice of retrying the request(s). Confirmation by user-agent software with semantic understanding of the application MAY substitute for user confirmation. The automatic retry SHOULD NOT be repeated if the second sequence of requests fails.
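The retry rule quoted above can be sketched in a few lines. This is only an illustration of the RFC wording, not how requests or urllib3 implement retries; `send_with_single_retry` and its `send` callable are hypothetical names, and the method set follows RFC 2616 section 9.1.2.

```python
import http.client

# Methods RFC 2616 section 9.1.2 treats as idempotent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def send_with_single_retry(send, method):
    """Call send() and retry exactly once after a dropped connection.

    `send` is a hypothetical zero-argument callable that performs the
    request and returns the response. Non-idempotent methods are never
    retried, and a failure of the second attempt propagates to the caller.
    """
    try:
        return send()
    except ConnectionError:
        # http.client.RemoteDisconnected is a ConnectionError subclass.
        if method.upper() not in IDEMPOTENT_METHODS:
            raise
        return send()
```

A caller would wrap the actual request in a closure, for example `send_with_single_retry(lambda: do_get(url), "GET")`, where `do_get` stands for whatever function issues the request.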
What nginx does on configuration reload is to iterate over connections and close any that it considers idle. (It will allow pending requests to finish, but will close them after completion, too.) So to the client it appears as if the remote end has just hung up. Since the socket state is checked only when a new request is being sent, the error appears at that point.
I created a pair of test programs (https://gist.github.com/santtu/a38bb50c44623a162df72cb6d0a45f0a): a simple socket server which fakes a keep-alive HTTP/1.1 server but terminates the connection after a short delay, and another program that does busy-loop GETs on a URL using a persistent session.
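To illustrate what "checking the socket state" can look like, here is a minimal sketch using plain standard-library sockets. It is an assumption about one possible check, not what requests or urllib3 actually do, and `peer_has_closed` is a hypothetical helper. Even with such a check a race remains, because the server can close the connection between the check and the next write.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the remote end has already closed an idle connection."""
    # An idle, open connection has nothing to read; a closed one is
    # immediately readable and yields an empty read (EOF).
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False
    try:
        # MSG_PEEK leaves any pending data in the buffer for the real reader.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True
```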
Here are the results. First is the server (test2.py) and then the client (test.py):
$ python3 test2.py
127.0.0.1 wrote:
b'GET / HTTP/1.1\r\nHost: localhost:8080\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate\r\nUser-Agent: python-requests/2.8.1\r\nConnection: keep-alive'
$ python3 test.py http://localhost:8080/
200 OK
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 376, in _make_request
httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 378, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/adapters.py", line 370, in send
timeout=timeout
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 609, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/util/retry.py", line 245, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/packages/six.py", line 309, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/packages/urllib3/connectionpool.py", line 378, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
requests.packages.urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 11, in <module>
r = s.get(url, allow_redirects=False)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 480, in get
return self.request('GET', url, **kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.5/site-packages/requests-2.8.1-py3.5.egg/requests/adapters.py", line 412, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
The first 200 OK comes from the faked response, after which the client ends up blocking on the socket and finally gets a connection abort when the server side closes it a few seconds later.
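For anyone without the gist at hand, a similar failure can be reproduced with only the standard library. This is a sketch and not the code from the gist: the server here closes the connection right after its first response (closer to the nginx reload case), and `serve_once_then_close` and `reproduce` are hypothetical names.

```python
import http.client
import socket
import threading


def serve_once_then_close(listener):
    """Answer one request with a keep-alive 200, then close the connection."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                return
            data += chunk
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: 0\r\n"
            b"Connection: keep-alive\r\n\r\n"
        )
    # Leaving the with-block closes the socket while the client is idle.


def reproduce():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once_then_close, args=(listener,))
    server.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    client.request("GET", "/")
    first = client.getresponse()
    first.read()
    server.join()  # the server has closed its end by now

    try:
        # Second request on the same, now dead, persistent connection.
        client.request("GET", "/")
        client.getresponse()
        outcome = "no error"
    except ConnectionError as exc:
        outcome = type(exc).__name__
    finally:
        client.close()
        listener.close()
    return first.status, outcome
```

Calling `reproduce()` returns the status of the first response together with the name of the exception raised by the second request; which `ConnectionError` subclass shows up can depend on the platform and on timing.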
Any thoughts on this?
Issue Analytics
- Created 7 years ago
- Comments:6 (4 by maintainers)
The server has not read from the socket layer yet, correct. However, the requests side of the connection cannot know that: from our perspective, our (blocking) writes have succeeded, which means that we have transferred data to the server. We are now expecting a response. Naturally, then, if the server closed the connection five minutes ago we will detect it (and if you want to test this case, you’ll find that we correctly re-open the connection).
Yes, we can do that. In practice, we like to leave that choice up to our users, who are better placed to decide what to retry than we are.
Neither of those statements is true, and the documentation calls them out as untrue. Note here, which provides a max_retries parameter that can be set to an integer value. If you set it to 1, this problem should be resolved for you.
This is incorrect thinking when working with asynchronous protocols. As noted in the RFC, the server has closed the connection when it is idle. The closing may have happened 5 minutes ago, with the TCP FIN already at the client end. In this case the client could check whether the connection has been closed (after all, the underlying TCP socket has been closed) before trying to write to it. That it can detect. But generally even that would leave a race condition, since in the asynchronous case the client can write and the server can close at the same time.
Mind you, it is entirely (as pointed out in the RFC from 1999) possible for the client to detect an aborted connection (like here, from the exception) and for GET or HEAD to retry the request once.
Since this behavior is described in the HTTP/1.1 specification as valid for persistent connections, I do not think the claim of having “100%” support for keep-alives is valid. Without a retry, a purely administrative operation (configuration reload) on a server designed for zero-downtime operation (nginx) will show up as errors when using sessions. The only real workaround currently is to not use keep-alives at all. The alternative is to implement the low-level retry-once semantic in all client code wishing to use requests. Neither of these workarounds is terribly good: one loses the benefits of persistent connections, and the other requires implementing the same handling code everywhere that persistent connection terminations need to be handled correctly.
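For reference, the max_retries setting mentioned in the maintainer reply is passed to requests.adapters.HTTPAdapter and mounted on the session. A minimal sketch, assuming a retry count of 1 is what you want; `make_session` is a hypothetical helper, and whether a single retry covers every case discussed in this thread is not something this snippet demonstrates.

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(retries=1):
    """Build a Session whose adapters are configured with max_retries."""
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retries)
    # Mounting on both prefixes replaces the default adapters.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


s = make_session()
# Example use against the test server from this issue:
# r = s.get("http://localhost:8080/", allow_redirects=False)
```

This keeps the retry policy in one place instead of repeating retry handling at every call site.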