`create_cloned_field`: slow performance with many models

First Check

I added a very descriptive title to this issue.
I used the GitHub search to find a similar issue and didn’t find it.
I searched the FastAPI documentation, with the integrated search.
I already searched in Google “How to X in FastAPI” and didn’t find any information.
I already read and followed all the tutorial in the docs and didn’t find an answer.
I already checked if it is not related to FastAPI but to Pydantic.
I already checked if it is not related to FastAPI but to Swagger UI.
I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

I commit to help with one of those options 👆

Example Code

import fastapi
import pydantic


class NestedModel(pydantic.BaseModel):
    x: pydantic.BaseModel
    y: pydantic.BaseModel


def create_app():
    for _ in range(100):
        fastapi.routing.APIRoute(
            "/test", endpoint=lambda: ..., response_model=NestedModel
        )


# PROFILING

import yappi

yappi.set_clock_type("CPU")

with yappi.run():
    create_app()

stats = yappi.get_func_stats()
stats.save("fastapi.pprof", type="pstat")

Description

When building a FastAPI application with nested Pydantic models, the create_cloned_field utility in the APIRoute initialization is quite slow.

For the trivial example, you can see that create_cloned_field dominates the runtime with 90% of CPU time. The majority of this is spent deep copying.
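A profile saved in pstat format can be inspected with the standard library's `pstats` module. The following is a minimal sketch, not the reporter's actual workflow: `clone_many`, `top_functions`, and the temporary file are hypothetical, and the stand-in profile is produced with `cProfile` so the example runs without FastAPI or yappi installed.

```python
import cProfile
import copy
import io
import os
import pstats
import tempfile


def clone_many(n: int = 200) -> None:
    """Stand-in workload: repeatedly deep copy a small nested structure."""
    payload = {"x": [1, 2, 3], "y": {"z": (4, 5)}}
    for _ in range(n):
        copy.deepcopy(payload)


def top_functions(path: str, limit: int = 10) -> str:
    """Return a text report of the `limit` entries with the highest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


# Write a stand-in profile with cProfile so the example is self-contained.
profile_path = os.path.join(tempfile.mkdtemp(), "example.pprof")
profiler = cProfile.Profile()
profiler.runcall(clone_many)
profiler.dump_stats(profile_path)

report = top_functions(profile_path)
print(report)
```

Pointing `top_functions` at the `fastapi.pprof` file written by the example code should list the most expensive calls in the same way.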

Note: all timings are CPU time, not wall-clock time.
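The distinction matters because CPU time excludes time spent sleeping or waiting. A small standard-library sketch of the difference (the `measure` helper is hypothetical and is not part of the profiling setup above):

```python
import time
from typing import Callable, Tuple


def measure(func: Callable[[], None]) -> Tuple[float, float]:
    """Run `func` once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process only
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping consumes wall-clock time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```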

If we replace this trivial application with the one from Prefect (from prefect.orion.api.server import create_app), we can see that the cost is also significant in a real-world example.

With a patch to retain the cache across calls to this function, we can get this down to 50% of the CPU time with a ~5x overall speedup.
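The idea behind retaining the cache can be sketched without FastAPI or Pydantic. This is an illustration only, not the actual patch: `Model`, `clone_model`, and `clones_made` are hypothetical names, and a plain dict stands in for whatever cache the patch uses.

```python
from typing import Dict, List, Optional


class Model:
    """Hypothetical stand-in for a response model class."""


# Records every clone that is actually created, so the two runs can be compared.
clones_made: List[type] = []


def clone_model(original: type, cache: Optional[Dict[type, type]] = None) -> type:
    """Return a subclass clone of `original`, reusing `cache` when one is given."""
    if cache is None:
        cache = {}  # fresh cache per call: nothing is ever reused
    cloned = cache.get(original)
    if cloned is None:
        cloned = type(original.__name__, (original,), {})
        cache[original] = cloned
        clones_made.append(cloned)
    return cloned


# Without a shared cache, every "route" clones the model again.
for _ in range(100):
    clone_model(Model)
without_shared_cache = len(clones_made)

# With a cache retained across calls, the model is cloned once and then reused.
clones_made.clear()
shared: Dict[type, type] = {}
for _ in range(100):
    clone_model(Model, cache=shared)
with_shared_cache = len(clones_made)

print(without_shared_cache, with_shared_cache)  # 100 1
```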

This speedup persists and is even more significant in a real-world application with create_cloned_field accounting for only 11% of the CPU time.

Profiling of Prefect app creation with patch

Operating System

macOS

Operating System Details

No response

FastAPI Version

0.74.0

Python Version

3.8.12

Additional Context

This may also be resolvable with https://github.com/samuelcolvin/pydantic/issues/1008 as mentioned in https://github.com/tiangolo/fastapi/issues/894#issuecomment-576484427

Issue Analytics

State:
Created 2 years ago
Reactions: 14
Comments: 10 (1 by maintainers)

Top GitHub Comments

3 reactions

ddanier commented, Sep 13, 2022

We are currently using a monkey patched version of the normal main.py like this:

# flake8: noqa
"""
This is a faster version of the main.py.

This version is faster by monkey patching the internals of FastAPI. It is
intended to NEVER be used in production. It is only for testing and
local development.

To activate this file, create a `docker-compose.override.yml`
with the following contents:

---
version: "3.6"

services:
  api:
    command:
      [
        "poetry",
        "run",
        "uvicorn",
        "something.fast_main:app",
        "--host=0.0.0.0",
        "--reload",
      ]
---
"""

from dataclasses import is_dataclass
from typing import Optional, cast
from weakref import WeakKeyDictionary

from pydantic import BaseModel, create_model
from pydantic.fields import ModelField
from pydantic.utils import lenient_issubclass


def patched_create_cloned_field(
        field: ModelField,
        *,
        cloned_types: Optional[dict[type[BaseModel], type[BaseModel]]] = WeakKeyDictionary(),
) -> ModelField:
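    # cloned_types holds the types cloned so far, to support recursive models.
    # The default WeakKeyDictionary is created once and shared across calls,
    # which is what makes it act as a cache here.
    # Note: the dict[...] and type[...] annotations above require Python 3.9+.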
    if cloned_types is None:
        cloned_types = WeakKeyDictionary()
    original_type = field.type_
    if is_dataclass(original_type) and hasattr(original_type, "__pydantic_model__"):
        original_type = original_type.__pydantic_model__
    use_type = original_type
    if lenient_issubclass(original_type, BaseModel):
        original_type = cast(type[BaseModel], original_type)
        use_type = cloned_types.get(original_type)
        if use_type is None:
            use_type = create_model(original_type.__name__, __base__=original_type)
            cloned_types[original_type] = use_type
            for f in original_type.__fields__.values():
                use_type.__fields__[f.name] = patched_create_cloned_field(
                    f, cloned_types=cloned_types,
                )
    new_field = fastapi.utils.create_response_field(name=field.name, type_=use_type)
    new_field.has_alias = field.has_alias
    new_field.alias = field.alias
    new_field.class_validators = field.class_validators
    new_field.default = field.default
    new_field.required = field.required
    new_field.model_config = field.model_config
    new_field.field_info = field.field_info
    new_field.allow_none = field.allow_none
    new_field.validate_always = field.validate_always
    if field.sub_fields:
        new_field.sub_fields = [
            patched_create_cloned_field(sub_field, cloned_types=cloned_types)
            for sub_field in field.sub_fields
        ]
    if field.key_field:
        new_field.key_field = patched_create_cloned_field(
            field.key_field, cloned_types=cloned_types,
        )
    new_field.validators = field.validators
    new_field.pre_validators = field.pre_validators
    new_field.post_validators = field.post_validators
    new_field.parse_json = field.parse_json
    new_field.shape = field.shape
    new_field.populate_validators()
    return new_field


import fastapi  # noqa

fastapi.routing.create_cloned_field = patched_create_cloned_field


from something.main import app  # noqa

Note that the docs at the top only fit our own setup, where everything runs in Docker, and that I removed the app name (replaced by “something”).

Anyway, this seems to work for us now and we do not have any issues. I cannot reproduce the problems we had anymore, and RAM usage is still down by a huge amount.

The nice thing about this additional file plus the monkey patch is that we can still build a normal production version that does not include this.
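The patch assigns to `fastapi.routing.create_cloned_field` rather than to the function in `fastapi.utils`. As far as I can tell, that is because the routing module imports the function by name, so it holds its own reference. A minimal sketch of that behaviour with two hypothetical stand-in modules (`demo_utils` and `demo_routing` are invented for illustration and are not FastAPI code):

```python
import sys
import types

# Hypothetical stand-in for the module that defines the function.
demo_utils = types.ModuleType("demo_utils")
exec(
    "def create_cloned_field(field):\n"
    "    return ('original', field)\n",
    demo_utils.__dict__,
)
sys.modules["demo_utils"] = demo_utils

# Hypothetical stand-in for a consumer that imports the function by name.
demo_routing = types.ModuleType("demo_routing")
exec(
    "from demo_utils import create_cloned_field\n"
    "def build_route(field):\n"
    "    return create_cloned_field(field)\n",
    demo_routing.__dict__,
)


def patched(field):
    return ("patched", field)


# Patching the defining module does not change the name the consumer already imported.
demo_utils.create_cloned_field = patched
after_utils_patch = demo_routing.build_route("f")

# Patching the consumer's own global changes what its functions call.
demo_routing.create_cloned_field = patched
after_routing_patch = demo_routing.build_route("f")

print(after_utils_patch)    # ('original', 'f')
print(after_routing_patch)  # ('patched', 'f')
```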

1 reaction

teebu commented, Jun 4, 2022

Any update on this?