
[QUESTION] enhance serialization speed

See original GitHub issue

Description

I sometimes have to return big objects, and I'm constrained in that chunking them is not an option.

An example of such an object would be a dict {"key": value} where value is a list of lists, say 20 lists of 10k elements each.

I wrote this simple test case, which shows the massive hit quite clearly in several scenarios (run with pytest tests/test_serial_speed.py --log-cli-level=INFO). Here's the output:

(fastapi) ➜  fastapi git:(slow_serial) ✗ pytest tests/test_serial_speed.py --log-cli-level=INFO
=========================== test session starts ===========================
platform linux -- Python 3.6.8, pytest-5.0.0, py-1.8.0, pluggy-0.12.0
rootdir: /home/lotso/PycharmProjects/fastapi
plugins: cov-2.7.1
collected 1 item

tests/test_serial_speed.py::test_routes 
------------------------------ live log call ------------------------------
INFO     tests.test_serial_speed:test_serial_speed.py:39 route1: 0.05402565002441406
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route1, 9.395180225372314, ['http_status:200', 'http_method:GET', 'time:wall']
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route1, 9.395131000000001, ['http_status:200', 'http_method:GET', 'time:cpu']
INFO     tests.test_serial_speed:test_serial_speed.py:52 route1: 0.049863576889038086
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route2, 10.358616590499878, ['http_status:200', 'http_method:GET', 'time:wall']
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route2, 10.358592000000002, ['http_status:200', 'http_method:GET', 'time:cpu']
INFO     tests.test_serial_speed:test_serial_speed.py:64 route1: 0.05589580535888672
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route3, 11.318845272064209, ['http_status:200', 'http_method:GET', 'time:wall']
INFO     tests.test_serial_speed:test_serial_speed.py:18 app.tests.test_serial_speed.route3, 11.318446000000002, ['http_status:200', 'http_method:GET', 'time:cpu']
PASSED                                                                    [100%]

=========================== 1 passed in 31.60 seconds ===========================

All routes do the same thing, with slight variations:

  1. build a big object: a dict with one key whose value is a list of lists, 3 sublists of 100k elements each; a little extreme maybe, but it shows the impact clearly
  2. return the object
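As a rough illustration of step 1 (this is not the author's actual test file, and build_big_object is a hypothetical helper), the big object can be built and serialized with plain json to get a baseline timing:

```python
import json
import time

def build_big_object(n_sublists=3, n_elements=100_000):
    # One key whose value is a list of lists, matching the shape described above.
    return {"key": [list(range(n_elements)) for _ in range(n_sublists)]}

t0 = time.perf_counter()
data = build_big_object()
build_time = time.perf_counter() - t0  # building is cheap, hundredths of a second

t0 = time.perf_counter()
payload = json.dumps(data)
dump_time = time.perf_counter() - t0  # plain json.dumps is nowhere near 9s

print(f"build: {build_time:.3f}s, dumps: {dump_time:.3f}s")
```

This makes the point of the test case concrete: neither constructing the object nor raw JSON encoding accounts for the 9+ seconds spent per route.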

As you can see, the time taken to build such an object is small, around 0.05s, but…

route1 just returns it and takes 9s. route2 returns it but has response_model=BigData in the signature, and takes 1s more. route3 is not intuitive to me: I thought that by building a BigData object up front and returning it there would be no penalty, but it is slower still.

How can I […] improve performance?

Edit: the tests are available on this branch; I can open a PR should you want one: https://github.com/euri10/fastapi/tree/slow_serial

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 28 (17 by maintainers)

Top GitHub Comments

4 reactions
dmontagu commented, Oct 3, 2019

Yeah, it’s also easy enough to write a decorator that performs the conversion to a response for endpoints you know are safe. Something like:

from functools import wraps

from starlette.responses import UJSONResponse  # requires the ujson package


def go_fast(f):
    @wraps(f)
    async def wrapped(*args, **kwargs):
        return UJSONResponse(await f(*args, **kwargs))
    return wrapped

(Might want to use inspect.iscoroutinefunction to also handle def endpoints.)

1 reaction
dmontagu commented, Oct 3, 2019

I investigated – the problem is that jsonable_encoder is very slow for objects like this since it has to make many isinstance calls for each value in the returned list.

This seems like a pretty substantial shortcoming – I think there should be a way to override the use of jsonable_encoder with something faster in cases where you know you don’t need its functionality. (Currently you can provide a custom_encoder, but it won’t speed things up in cases where you are returning a list/dict since jsonable_encoder will still loop over each entry and perform lots of isinstance checks.)

A 6x overhead is not good!
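To make that overhead concrete, here is a toy recursive encoder, an illustrative stand-in for jsonable_encoder rather than its real code, compared against a direct json.dumps of the same payload:

```python
import datetime
import decimal
import json
import time

def toy_encoder(obj):
    # Illustrative stand-in for fastapi.encoders.jsonable_encoder: it recurses
    # into containers and runs isinstance checks on every single element.
    if isinstance(obj, dict):
        return {k: toy_encoder(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [toy_encoder(v) for v in obj]
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    if isinstance(obj, decimal.Decimal):
        return float(obj)
    return obj

data = {"key": [list(range(100_000)) for _ in range(3)]}

t0 = time.perf_counter()
encoded = toy_encoder(data)
walk = time.perf_counter() - t0  # Python-level recursion over 300k elements

t0 = time.perf_counter()
json.dumps(data)
direct = time.perf_counter() - t0  # C-level serialization of the same data

print(f"encoder walk: {walk:.3f}s, json.dumps: {direct:.3f}s")
```

Even this simplified walk is several times slower than the dumps call it precedes, which is the shape of the overhead being described: the per-element type dispatch happens in Python, before serialization even starts.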


