[QUESTION] Validation in the FastAPI response handler is a lot heavier than expected

First check

  • I used the GitHub search to find a similar issue and didn’t find it.
  • I searched the FastAPI documentation, with the integrated search.
  • I already searched in Google “How to X in FastAPI” and didn’t find any information.

Description

So I have built a Tortoise ORM to Pydantic adaptor, and it’s about stable, so I started profiling and found something interesting.

Pydantic validates the data I fetch from the DB, which seems redundant as the DB content is already validated, so we are doing double validation. On further profiling I found that the majority of the time is spent by FastAPI preparing the data for serialisation, then validating it, and then actually serialising it. (This specific step is what https://github.com/tiangolo/fastapi/issues/1224#issuecomment-617243856 refers to.)

So I am doing essentially triple validation…

I then saw that there is the orjson integration, I tried that… and it made no difference that I could tell. (I’ll get to this later)

I did a few experiments (none of them properly tested, but just to get an idea) with a simple benchmark: (The database was populated with 200 junk user profiles generated by hypothesis, response is 45694 bytes)

Key:

  • R1 → Using FastAPI to serialise a List[User] model automatically (where User is a Pydantic model)
  • R2 → Using FastAPI to serialise a List[User] model automatically, but with the validation step in serialize_response disabled
  • R3 → Manually serialised the data using an ORJSONResponse
  • R4 → Using FastAPI to serialise a List[User] model automatically, bypassing the jsonable_encoder as I’m serialising with orjson
  • R5 → Using FastAPI to serialise a List[User] model automatically, bypassing both validation and jsonable_encoder
  • C1 → Use the provided pydantic from_orm
  • C2 → Custom constructor that doesn’t validate

My results are:

  • R1 + C1 → 42req/s
  • R1 + C2 → 43req/s (Seems the 3 FastAPI steps overpower the validation overhead of from_orm)
  • R2 + C1 → 56req/s (Disabling the validation IN FastAPI has a much bigger impact?)
  • R2 + C2 → 63req/s (So, no extra validation)
  • R3 + C1 → 75req/s
  • R3 + C2 → 160req/s
  • R4 + C1 → 53req/s (This orjson-specific optimization gave us a 26% speedup here!)
  • R4 + C2 → 64req/s (This orjson-specific optimization gave us a 48% speedup here!)
  • R5 + C1 → 74req/s (So, almost as fast as bypassing the FastAPI response handler)
  • R5 + C2 → 147req/s

I was somewhat surprised by these results. It seems that disabling all validation AND skipping the FastAPI response handler gave me a nearly 4x improvement!
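
As a quick check of the figures above, the speedups can be recomputed directly from the reported request rates. The snippet below is only arithmetic on the numbers quoted in this issue; the `RESULTS` dictionary and the `speedup` helper are illustrative names, not part of any benchmark code from the original post.

```python
# Requests per second as reported above, keyed by (response mode, constructor).
RESULTS = {
    ("R1", "C1"): 42, ("R1", "C2"): 43,
    ("R2", "C1"): 56, ("R2", "C2"): 63,
    ("R3", "C1"): 75, ("R3", "C2"): 160,
    ("R4", "C1"): 53, ("R4", "C2"): 64,
    ("R5", "C1"): 74, ("R5", "C2"): 147,
}


def speedup(mode, ctor, baseline=("R1", "C1")):
    """Return how many times faster a combination is than the baseline."""
    return RESULTS[(mode, ctor)] / RESULTS[baseline]


if __name__ == "__main__":
    for (mode, ctor), rate in sorted(RESULTS.items()):
        print(f"{mode} + {ctor}: {rate:>3} req/s, {speedup(mode, ctor):.2f}x vs R1 + C1")
```

Measured this way, R3 + C2 comes out at roughly 3.8 times the R1 + C1 baseline, which is where the "nearly 4x" figure comes from.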

Outcomes:

  1. Doing an optimal build from ORM to Pydantic doesn’t help much by itself, but with an optimal response handler, it can fly!
  2. When using orjson, we should really consider what https://github.com/tiangolo/fastapi/issues/1224#issuecomment-617243856 proposed, as it gives a big improvement by itself!
  3. Validation in the FastAPI response handler is a lot heavier than expected

Questions: How advisable/possible is it to have a way to disable validation in the FastAPI response handler? Is it dangerous to do so? Should it be conditionally bypassed? e.g. we specify a way to mark it as safe?
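
To make the question concrete, here is a minimal, framework-free sketch of what a conditional bypass could look like. It is not FastAPI's implementation or API: `User`, `validate_user`, `to_jsonable`, `render`, and the `trusted` flag are all hypothetical names, and the standard-library `json` module stands in for orjson. It only mirrors the three steps described above (prepare, validate, serialise) and lets the caller mark data as already validated.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class User:
    # Hypothetical model; dataclasses do not validate field types themselves.
    id: int
    name: str


def validate_user(data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for response validation: check field types."""
    user_id = data.get("id")
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError(f"invalid id: {user_id!r}")
    if not isinstance(data.get("name"), str):
        raise ValueError(f"invalid name: {data.get('name')!r}")
    return data


def to_jsonable(user: User) -> Dict[str, Any]:
    """Hypothetical stand-in for the 'prepare for serialisation' step."""
    return asdict(user)


def render(users: List[User], *, trusted: bool = False) -> bytes:
    """Serialise users, skipping validation when the caller marks them trusted."""
    rows = [to_jsonable(u) for u in users]
    if not trusted:
        rows = [validate_user(row) for row in rows]
    return json.dumps(rows, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    users = [User(id=1, name="Ada"), User(id=2, name="Linus")]
    print(render(users))                # validated path
    print(render(users, trusted=True))  # bypass for data validated upstream
```

The trade-off is visible in the sketch: with `trusted=True` a malformed row is emitted as-is instead of raising, so the bypass only makes sense where the data really has been validated upstream.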

I’m just trying to get rid of some bottlenecks, and to understand the system.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 22
  • Comments: 31 (6 by maintainers)

Top GitHub Comments

8 reactions
havardthom commented, Sep 27, 2021

Based on discussions here and other issues I’ve decided to skip FastAPI’s response handler along with its validation and serialization.

  • The response validation is just a performance hit, since our application code already uses pydantic models which are validated; we work on models, not on dicts.
  • The serialization with jsonable_encoder was created before Pydantic’s own .json() utility was released (source: https://github.com/tiangolo/fastapi/issues/1107#issuecomment-612963659), and jsonable_encoder seems to have worse performance. We would much rather use Pydantic’s .json() utility, which also has support for custom json encoders and alternative json libraries such as ujson or orjson.

So we use a custom Pydantic BaseModel to leverage orjson and custom encoders:

import orjson
from pydantic import BaseModel as PydanticBaseModel
from bson import ObjectId

def orjson_dumps(v, *, default):
    # orjson.dumps returns bytes, to match standard json.dumps we need to decode
    return orjson.dumps(v, default=default, option=orjson.OPT_NON_STR_KEYS).decode()

class BaseModel(PydanticBaseModel):
    class Config:
        json_loads = orjson.loads
        json_dumps = orjson_dumps
        json_encoders = {ObjectId: lambda x: str(x)}

And we use a custom FastAPI response class to give us the flexibility of returning either the Pydantic model itself or an already serialized model (if we want more control over alias, include, exclude etc.). Returning a response class also makes the data flow more explicit and easier to understand for outside eyes/new developers.

from typing import Any

from fastapi.responses import JSONResponse
from pydantic import BaseModel

class PydanticJSONResponse(JSONResponse):
    def render(self, content: Any) -> bytes:
        if content is None:
            return b""
        if isinstance(content, bytes):
            return content
        if isinstance(content, BaseModel):
            return content.json(by_alias=True).encode(self.charset)
        return content.encode(self.charset)

Just wanted to share my approach, thanks for reading.

Full example:

from typing import Any, List, Optional

import orjson
from fastapi import FastAPI, status
from fastapi.responses import JSONResponse
from pydantic import BaseModel as PydanticBaseModel
from bson import ObjectId


def orjson_dumps(v, *, default):
    # orjson.dumps returns bytes, to match standard json.dumps we need to decode
    return orjson.dumps(v, default=default, option=orjson.OPT_NON_STR_KEYS).decode()


class BaseModel(PydanticBaseModel):
    class Config:
        json_loads = orjson.loads
        json_dumps = orjson_dumps
        json_encoders = {ObjectId: lambda x: str(x)}


class Item(BaseModel):
    name: str
    description: Optional[str] = None
    price: float
    tax: Optional[float] = None
    tags: List[str] = []


class PydanticJSONResponse(JSONResponse):
    def render(self, content: Any) -> bytes:
        if content is None:
            return b""
        if isinstance(content, bytes):
            return content
        if isinstance(content, BaseModel):
            return content.json(by_alias=True).encode(self.charset)
        return content.encode(self.charset)


app = FastAPI()


@app.post("/items/", status_code=status.HTTP_200_OK, response_model=Item)
async def create_item():
    item = Item(name="Foo", price=50.2)
    return PydanticJSONResponse(content=item)

5 reactions
sm-Fifteen commented, May 19, 2020

Skip validation when not needed? (I would prefer an automatic way, but don’t know how to make it automatic)

I do find output validation to be useful in production to ensure that the API format contracts are always respected. That can be difficult to prove otherwise, since your data source may end up throwing a null value in a place you didn’t expect. So I don’t know about disabling it automatically, but I would agree that being able to disable it for routes that return very large payloads would be useful performance-wise.

As for skipping jsonable_encoder, I’m 100% with you that it should be skippable if the framework user knows the json response renderer can handle any configuration of datatypes returned by the given route.
