[QUESTION] interplay with multiprocessing
Describe the bug
I’m trying to run parallel tasks with a per-task timeout (using multiprocessing) inside an API method. When I terminate the child processes after the time limit has passed, the server process shuts down and disconnects.
To Reproduce
- Create a file repro.py:
import os
import time
import uvicorn
from concurrent.futures import ProcessPoolExecutor


def simple_routine(sleep_for):
    print(f"PID {os.getpid()} has sleep time: {sleep_for}")
    time.sleep(sleep_for)
    return "done"


def test_endpoint():
    print(f"main process: {os.getpid()}")
    START_TIME = time.time()
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(simple_routine, 1),
            pool.submit(simple_routine, 10),
        ]
        results = []
        for fut in futures:
            try:
                results.append(fut.result(timeout=2))
            except:
                results.append("not done")
        # terminate the processes which are still running
        for pid, proc in pool._processes.items():
            print("terminating pid ", pid)
            proc.terminate()
    print("exiting at: ", int(time.time() - START_TIME))
    return True


async def app(scope, receive, send):
    await send({
        'type': 'http.response.start',
        'status': 200,
        'headers': [
            [b'content-type', b'text/plain'],
        ]
    })
    test_endpoint()
    await send({
        'type': 'http.response.body',
        'body': b'Hello, world!',
    })


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=5000)
- Run it as python repro.py.
- Open another Python interpreter and make this web request:
import requests
for _ in range(20):
    print(requests.get("http://localhost:5000/").text)
- The server process shuts down after the first request.
Expected behavior
We start 2 processes, one of which exceeds the time limit, and then we try to terminate it. The server should not shut down; it should continue serving requests. Interestingly, the server doesn’t actually exit until the long-running process is complete.
INFO: Started server process [7041]
INFO: Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
INFO: Waiting for application startup.
INFO: ASGI 'lifespan' protocol appears unsupported.
INFO: Application startup complete.
INFO: 127.0.0.1:44954 - "GET / HTTP/1.1" 200 OK
main process: 7041
PID 7060 has sleep time: 1
PID 7061 has sleep time: 10
terminating pid 7060
terminating pid 7061
exiting at: 10
INFO: Shutting down
INFO: Finished server process [7041]
With Flask, the behavior of an identical app is as expected.
main process: 1015
PID 1035 has run time: 1
PID 1039 has run time: 1
PID 1038 has run time: 10
terminating pid 1035
terminating pid 1038
terminating pid 1039
exiting at: 2
127.0.0.1 - - [09/Jan/2020 08:51:37] "POST /test-endpoint HTTP/1.1" 200 -
Environment
- OS: Ubuntu 18.04.1 LTS
- Uvicorn Version: 0.11.1
- Python version: 3.6.8
Additional context
This came up while trying to port a WSGI application to FastAPI - link. At the suggestion of @dmontagu, I tried to reproduce it with Starlette and with uvicorn alone, and the error persists in both.
Hypercorn shows similar behavior in that the application shuts down after serving the first request. So, the issue likely has something to do with how async servers manage processes? Could you please point to where I might look to solve this?
Thank you for looking.
Hi @selimb, here is an explanation of what is happening and why.
asyncio sets up signal handling in a specific way: it calls signal.set_wakeup_fd and passes the fd of an open socket. From then on, any signal sent to the process is written to this socket/fd.
A child process inherits not only the signal handlers but also the open socket. As a result, when a signal is sent to the child process, it is written to the shared socket and the parent process receives it too, even though the signal was not sent to it. Likewise, if you send a signal to the parent process, the child process will receive it too.
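The inheritance described above can be observed in isolation, without asyncio or uvicorn. The following is only an illustrative, Unix-specific sketch: it registers one end of a socket pair as the wakeup fd, forks a child, sends SIGUSR1 to the child alone, and then reads the signal number in the parent. The function name, the choice of SIGUSR1, and the 5-second timeout are assumptions made for the demonstration and do not come from the thread.

```python
import os
import signal
import socket
import time


def signal_child_and_read_in_parent():
    """Return the bytes the parent reads after a signal is sent only to its child."""
    parent_end, wakeup_end = socket.socketpair()
    parent_end.settimeout(5.0)     # fail the read instead of hanging forever
    wakeup_end.setblocking(False)  # set_wakeup_fd requires a non-blocking fd

    # A Python-level handler must be installed, otherwise nothing is written
    # to the wakeup fd when the signal arrives.
    old_handler = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
    old_fd = signal.set_wakeup_fd(wakeup_end.fileno())
    try:
        pid = os.fork()
        if pid == 0:
            # Child: it inherited both the handler and the wakeup fd.
            # It only idles until the parent kills it.
            while True:
                time.sleep(0.05)
        try:
            os.kill(pid, signal.SIGUSR1)  # the signal targets the child only
            return parent_end.recv(16)    # yet the parent can read it here
        finally:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
    finally:
        # Restore the previous signal configuration of this process.
        signal.set_wakeup_fd(old_fd)
        signal.signal(signal.SIGUSR1, old_handler)
        parent_end.close()
        wakeup_end.close()
```

In an asyncio server the reading side is the event loop itself, so a byte that arrives this way is treated as if the server process had received the signal.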
To avoid this behavior, you can call signal.set_wakeup_fd(-1) at the very beginning of the child process.
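One possible way to apply that reset with ProcessPoolExecutor is a worker initializer, sketched below. This is not code from the thread: the function names are hypothetical, the initializer argument requires Python 3.7 or later (the report uses 3.6.8), and restoring the default SIGTERM and SIGINT handlers is an additional assumption, made so that terminate() actually ends a worker that inherited the server's handlers.

```python
import signal
from concurrent.futures import ProcessPoolExecutor


def reset_inherited_signal_state():
    """Hypothetical initializer: runs once in each worker before any task."""
    # Stop writing incoming signal numbers to the fd inherited from the parent.
    signal.set_wakeup_fd(-1)
    # Assumption (not from the thread): also restore the default handlers so
    # that SIGTERM and SIGINT end the worker instead of running the
    # handlers inherited from the server process.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGINT, signal.SIG_DFL)


def make_pool(max_workers=2):
    # `initializer` is available on ProcessPoolExecutor from Python 3.7.
    return ProcessPoolExecutor(
        max_workers=max_workers,
        initializer=reset_inherited_signal_state,
    )
```

On older Python versions the same call could instead be placed at the top of the task function itself, for example as the first statement of simple_routine in the reproduction.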
PS. I downloaded the example from https://bugs.python.org/issue43064 and added signal.set_wakeup_fd(-1) as the first line of the worker_sync method. With that change I got the expected result (one call for the main process and three calls for the child).
For FastAPI usage, I solved this by setting multiprocessing to use the "spawn" method in FastAPI’s startup event handler.
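The original snippet for this did not survive in the thread, so the following is only a sketch of what such a handler might look like. The function name is hypothetical, force=True is an assumption to avoid a RuntimeError when a start method was already chosen, and the FastAPI registration is shown as a comment rather than imported.

```python
import multiprocessing


def use_spawn_start_method():
    """Hypothetical startup hook: make new worker processes start via "spawn"."""
    # With "spawn", each worker is a fresh interpreter rather than a fork of
    # the server, so it does not inherit the wakeup fd or signal handlers.
    # force=True is an assumption: it overrides a start method set earlier.
    multiprocessing.set_start_method("spawn", force=True)


# In a FastAPI app this could be registered as a startup handler, e.g.:
#
#     @app.on_event("startup")
#     def on_startup():
#         use_spawn_start_method()
```

With "spawn", the functions submitted to the pool must be importable from a module, and starting each worker is slower than with "fork".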