[BUG] Gunicorn Workers Hang and Consume Memory Forever
Describe the bug
I have deployed a FastAPI app that queries the database and returns the results. I made sure to close the DB connection and so on. I'm running gunicorn with this line:
gunicorn -w 8 -k uvicorn.workers.UvicornH11Worker -b 0.0.0.0 app:app --timeout 10
After exposing it to the web, I ran a load test that makes 30-40 requests in parallel against the FastAPI app. And the problem starts here. Watching htop in the meantime, I see that RAM usage keeps growing; it seems no task is killed after completing its job. Then I checked the task count: same story, the gunicorn workers do not seem to get killed. After some time RAM usage hits its maximum and the app starts to throw errors. So I killed the gunicorn app, but the processes spawned by the main gunicorn process did not get killed and were still using all the memory.
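For context, a load test along these lines reproduces the behaviour (a minimal sketch, assuming the requests library and the app listening on http://localhost:8000/; the original load-testing tool is not named in the issue):

```python
from concurrent.futures import ThreadPoolExecutor

import requests

def hit(_):
    # One GET against the deployed FastAPI endpoint.
    return requests.get("http://localhost:8000/").status_code

# Fire ~40 requests in parallel, repeatedly, and watch RAM in htop.
with ThreadPoolExecutor(max_workers=40) as pool:
    for round_no in range(100):
        statuses = list(pool.map(hit, range(40)))
        print(f"round {round_no}: {statuses.count(200)}/40 OK")
```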
Environment:
- OS: Ubuntu 18.04
- FastAPI version: 0.38.1
- Python version: 3.7.4
Issue Analytics
- Created: 4 years ago
- Comments: 63 (16 by maintainers)
Top GitHub Comments
Hi everyone,
I just read the source code of FastAPI and tested it myself. First of all, this should not be a memory leak; the problem is that if your machine has a lot of CPUs, the app will occupy a lot of memory.
The key difference is in the request_response() function in starlette.routing. If your REST interface is not async, the handler runs via loop.run_in_executor, but Starlette does not specify an executor there, so the default thread pool size is os.cpu_count() * 5. My test machine has 40 CPUs, so the pool holds 200 threads. After each request, the objects held by a thread are not released until that thread is reused by a later request, which occupies a lot of memory; but in the end it is not a memory leak. Below is my test code if you want to reproduce it.
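A minimal sketch in that spirit (a hypothetical reconstruction, not the original snippet): a sync endpoint that allocates a large object per request, served by the default thread pool:

```python
from fastapi import FastAPI

app = FastAPI()

# A *sync* (non-async) handler, so Starlette dispatches it to the default
# thread pool via loop.run_in_executor. Per the analysis above, each worker
# thread keeps references from its last call until it is reused, so this
# ~100 MB payload lingers in memory after the response is sent.
@app.get("/")
def read_root():
    payload = bytearray(100 * 1024 * 1024)
    return {"size": len(payload)}
```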
Even though it's not a memory leak, I still think this is not a good implementation, because it's sensitive to your CPU count; when you run a large deep learning model in FastAPI, you will find it occupies a ton of memory. So I suggest we make the thread pool size configurable.
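In the meantime, one workaround is to replace the event loop's default executor at application startup (a sketch, not an official FastAPI/Starlette setting; max_workers=4 is an arbitrary example value):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def limit_thread_pool():
    # Swap the loop's default executor for a smaller pool so sync
    # endpoints cannot fan out to os.cpu_count() * 5 threads.
    loop = asyncio.get_running_loop()
    loop.set_default_executor(ThreadPoolExecutor(max_workers=4))
```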
If you are interested in how I worked through the source code, please refer to my blog and give it a like (https://www.jianshu.com/p/e4595c48d091).
Sorry, I only write blogs in Chinese 😃
Current Solution
- Python 3.9 already limits the number of threads in the default pool (the cap actually landed in 3.8), as shown in the snippet after this list. If 32 threads is not too large for your program, you can upgrade Python to avoid this issue.
- Use async to define your interface; the request then runs on the event loop itself (see the second sketch below), but throughput may be affected.
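For reference, the cap comes from the default ThreadPoolExecutor size in CPython 3.8+ (concurrent.futures.thread); when no max_workers is given, the pool is capped at 32 threads regardless of CPU count:

```python
import os

# Default pool size in CPython 3.8+ when ThreadPoolExecutor(max_workers=None):
# at most 32 threads, even on a 40-CPU machine.
max_workers = min(32, (os.cpu_count() or 1) + 4)
```

And a minimal sketch of the async variant (a hypothetical handler, not from the original issue), which keeps the request on the event loop instead of dispatching it to the thread pool:

```python
from fastapi import FastAPI

app = FastAPI()

# `async def` handlers run on the event loop itself, so no
# thread-pool thread (and its lingering references) is involved.
@app.get("/")
async def read_root():
    return {"status": "ok"}
```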
Some statistics for Python 3.7, Python 3.8, and async: