FastAPI: Some requests are failing due to 10s timeout

First Check

I added a very descriptive title to this issue.
I used the GitHub search to find a similar issue and didn’t find it.
I searched the FastAPI documentation, with the integrated search.
I already searched in Google “How to X in FastAPI” and didn’t find any information.
I already read and followed all the tutorial in the docs and didn’t find an answer.
I already checked if it is not related to FastAPI but to Pydantic.
I already checked if it is not related to FastAPI but to Swagger UI.
I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

I commit to help with one of those options 👆

Example Code

The gunicorn configuration we are using:

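"""gunicorn server configuration."""
import os

# Note: gunicorn documents `threads` as affecting only the gthread worker type,
# so it should have no effect with the UvicornWorker class set below.
threads = 2
workers = 4  # number of worker processes
timeout = 60  # seconds a worker may stay silent before it is killed and restarted
keepalive = 1800  # seconds to wait for requests on a keep-alive connection
graceful_timeout = 1800  # seconds workers get to finish requests after a restart signal
bind = f":{os.environ.get('PORT', '80')}"
worker_class = "uvicorn.workers.UvicornWorker"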

Description

We have deployed a model prediction service in production that uses FastAPI, and unfortunately some of the requests are failing due to a 10s timeout. In terms of concurrency, we typically see only 2 to 3 requests per second, so I wouldn’t expect that to put much strain on FastAPI. The first thing we tried was to isolate the FastAPI framework from the model itself. When we performed some tracing, we noticed that a lot of time (6 seconds) was spent in this segment: starlette.exceptions:ExceptionMiddleware.__call__.

I would really appreciate some guidance on what the above segment implies and what is causing timeouts for some requests under a fairly light load.
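
One possible reading of that trace, offered as an assumption rather than a diagnosis: an ASGI middleware's `__call__` awaits everything it wraps, so time attributed to it can include the endpoint itself unless the tracer reports the inner calls separately. The sketch below uses a hypothetical `TimingMiddleware` and a stand-in `slow_endpoint` (neither comes from the issue or from Starlette's source) to show that an outer wrapper's elapsed time includes the inner layers.

```python
import asyncio
import time


class TimingMiddleware:
    """Hypothetical pure-ASGI middleware that records how long the wrapped app takes."""

    def __init__(self, app, name, timings):
        self.app = app
        self.name = name
        self.timings = timings  # dict mapping middleware name -> elapsed seconds

    async def __call__(self, scope, receive, send):
        start = time.perf_counter()
        try:
            # Everything below this layer (inner middleware, endpoint) runs here.
            await self.app(scope, receive, send)
        finally:
            self.timings[self.name] = time.perf_counter() - start


async def slow_endpoint(scope, receive, send):
    """Stand-in endpoint: blocks for 0.2 s, like a synchronous model call."""
    time.sleep(0.2)
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_once():
    """Send one fake request through outer -> inner -> endpoint and return the timings."""
    timings = {}
    inner = TimingMiddleware(slow_endpoint, "inner", timings)
    app = TimingMiddleware(inner, "outer", timings)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        pass  # discard the response; only the timings matter here

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return timings


if __name__ == "__main__":
    for name, seconds in asyncio.run(run_once()).items():
        print(f"{name}: {seconds:.3f}s")
```

Both layers report roughly the endpoint's duration even though neither does any work of its own, so a large number next to a middleware segment does not by itself show that the middleware is slow.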

Wanted Solution

I would like to optimize my FastAPI application so that latency is consistent across all requests and there are no outliers that error out due to the 10s timeout.

Wanted Code

N/A

Alternatives

No response

Operating System

macOS

Operating System Details

No response

FastAPI Version

0.65.2

Python Version

3.7.10

Additional Context

No response

Issue Analytics

State:
Created 2 years ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

2 reactions

jgould22 commented, Dec 6, 2021

You need to do your heavy processing outside of the event loop.

Async Python uses cooperative multitasking: if one of your requests gets hold of the event loop and doesn’t await anything for a long period of time, the other async tasks on the event loop will not make any progress (in your Prometheus code, for example).
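
A minimal sketch of that effect using only the standard library. The handler names and the 0.3 s durations are made up for illustration. A background `heartbeat` task records timestamps, and the largest gap between them shows how long the loop was unable to run anything else.

```python
import asyncio
import time


async def heartbeat(ticks, interval=0.01):
    """Record a timestamp every `interval` seconds, whenever the loop lets us run."""
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(interval)


async def blocking_handler():
    """Hypothetical handler that never awaits: it holds the event loop for 0.3 s."""
    time.sleep(0.3)


async def cooperative_handler():
    """Hypothetical handler that waits for 0.3 s but yields to the loop while doing so."""
    await asyncio.sleep(0.3)


async def max_gap(handler):
    """Return the longest pause (in seconds) between heartbeats while `handler` runs."""
    ticks = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.05)  # let the heartbeat tick a few times first
    await handler()
    await asyncio.sleep(0.05)  # and a few times afterwards
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"blocking handler:    {asyncio.run(max_gap(blocking_handler)):.3f}s max gap")
    print(f"cooperative handler: {asyncio.run(max_gap(cooperative_handler)):.3f}s max gap")
```

The same heartbeat idea can serve as a rough event-loop lag probe in a real service, although where to report the measured gap is left open here.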

Offload your CPU-bound tasks to another process (people often use Celery) so that you are not blocking the loop.
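
Celery is one option. As a lighter-weight illustration of the same idea, the sketch below hands a CPU-bound function to a `ProcessPoolExecutor` through `loop.run_in_executor`. The `cpu_bound_predict` function is a hypothetical stand-in for model inference, and the pool size of 2 is an arbitrary assumption.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


def cpu_bound_predict(n):
    """Hypothetical stand-in for model inference: a pure-Python CPU-heavy loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


async def predict(executor, n):
    """Run the CPU-bound call in `executor` and await its result.

    Passing None uses the loop's default executor (a thread pool); passing a
    ProcessPoolExecutor moves the work to a separate process.
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, cpu_bound_predict, n)


async def main():
    # Create the pool once and reuse it; starting processes per request is costly.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = await asyncio.gather(predict(pool, 200_000), predict(pool, 300_000))
    print(results)


if __name__ == "__main__":
    asyncio.run(main())
```

Arguments and return values cross the process boundary by pickling, so a large model would normally be loaded once inside each worker process rather than passed with every call.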

I think you could check this by changing your controller code to a plain def, so that FastAPI runs it in a thread pool, as described here: https://fastapi.tiangolo.com/async/
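
To illustrate the difference without depending on FastAPI itself, here is a standard-library sketch. `inline` mimics an `async def` handler that calls blocking code directly, and `in_thread_pool` mimics, only roughly, what a framework does when it runs a plain `def` handler in a worker thread. All names are hypothetical, and `time.sleep` stands in for the blocking work.

```python
import asyncio
import time


def blocking_predict():
    """Hypothetical synchronous handler body; time.sleep stands in for blocking work."""
    time.sleep(0.3)
    return "done"


async def inline():
    # Blocking call made directly on the event loop, as in an `async def` handler.
    return blocking_predict()


async def in_thread_pool():
    # Blocking call handed to the loop's default thread pool instead.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_predict)


async def count_ticks_while(work):
    """Run `work()` and count how often a 10 ms ticker task got to run meanwhile."""
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    result = await work()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return result, ticks


if __name__ == "__main__":
    print("inline:        ", asyncio.run(count_ticks_while(inline)))
    print("in thread pool:", asyncio.run(count_ticks_while(in_thread_pool)))
```

In the inline case the ticker never gets a turn, while in the thread-pool case it keeps running. A thread pool helps most when the blocking work releases the GIL (sleeping, I/O, many native-library calls). For pure-Python CPU work a separate process is usually the safer choice.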

If you are already doing that, then I am not sure what the issue might be.

0 reactions

rileyhun commented, Dec 6, 2021

It’s a BERT model, so yes, but there’s also middleware that sends metrics to Prometheus, and that seems to be taking a long time as well.