Lowering the overhead of Cyberbrain
Cyberbrain adds a huge overhead to program execution, both in time spent and memory usage. This issue is for discussing possible improvements.
Time
I profiled one run with py-spy:
sudo py-spy record -o profile.svg -- python -m examples.password.password examples/password/sonnets/* -s 1 --l33t
Here’s the result: https://laike9m.github.io/images/cyberbrain_profile.svg
I only did a brief check. In summary, the overhead of `sys.settrace` is smaller than expected: it took up ~1/6 of the extra time.
The major time-consuming operations:
- JSON pickle
  - ~~[ ] https://github.com/jsonpickle/jsonpickle/issues/326~~
  - Cache JSON strings so that they don’t need to be recomputed.
- Protobuf encoding
- Value stack operations
- https://github.com/alexmojaki/cheap_repr/issues/16, took ~8% of the total time
- `parameters = inspect.signature(handler).parameters` in `value_stack.py`. Kinda unexpected.
- The `log` function in `logger.py`. This is also unexpected.
- TODO: More to add
Apparently there is some low-hanging fruit, and we should fix that first.
Ultimately, we need to rewrite part of Cyberbrain in C/C++. There are many options, but I’d like to automate it as much as I can, so I will first look into Nuitka and mypyc. If they don’t work well, Cython is also a good option.
Probably the only good news is that `sys.settrace` contributes only a small portion of the overhead. So in the short term I won’t bother replacing it. Once we have optimized the other stuff to the extent that `sys.settrace` becomes the majority of the overhead, we’ll come back to it.
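To get a rough feel for the baseline cost of `sys.settrace` on its own, separate from whatever the trace function does, one can time a workload with and without a do-nothing tracer. This is a hedged sketch for illustration only: `workload`, `noop_tracer`, and `timed` are hypothetical names, and the numbers will vary by machine and Python version.

```python
import sys
import time


def workload(n):
    """A small CPU-bound loop to trace (hypothetical example workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def noop_tracer(frame, event, arg):
    # Returning the tracer keeps line-level tracing enabled for this frame.
    return noop_tracer


def timed(func, *args, tracer=None):
    """Run func(*args) under the given tracer and return (result, seconds)."""
    previous = sys.gettrace()
    sys.settrace(tracer)
    start = time.perf_counter()
    try:
        result = func(*args)
    finally:
        elapsed = time.perf_counter() - start
        sys.settrace(previous)  # restore whatever tracer was active before
    return result, elapsed
```

Comparing `timed(workload, 100_000)` with `timed(workload, 100_000, tracer=noop_tracer)` gives the cost of the hook alone; anything beyond that in a real profile comes from the work done inside the trace function.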
Optimize JSON pickle
Cyberbrain uses the jsonpickle library to convert Python objects to JSON, so that they can be displayed in the devtools console. jsonpickle is pure Python and really slow: it took ~23% of the total time, which makes it the biggest performance bottleneck.
The conversion to JSON can’t be parallelized, since we have to do it before the original object gets modified. Thus the only way left is to speed up the library. Some options I’ve considered or tried:
- Rewrite it in C++. Though jsonpickle is a relatively small library, rewriting it in C++ by myself is still a huge amount of work and not really realistic, not to mention that it’s hard to keep up with upstream changes.
- Use Nuitka. I tried it on my Mac; it works surprisingly well and cut the execution time from 8.5s to 6s. However, Nuitka isn’t really designed for libraries, but more for applications. Some reasons:
  - Nuitka doesn’t support cross-compilation, and it’s hard for library owners to compile a shared library for every platform.
  - Nuitka does not let users use the generated C file as a C extension, only the shared library.
- Use Cython. Cython seems to do the job of compiling a Python package to C and lets you use it as a C extension, but I haven’t tried it. Some refs:
Memory
TBD
Top GitHub Comments
@laike9m thanks for the clarification. Yes, I am thinking about multi-frame tracing. In my vision, we can do this, or get close enough. One way to lower the overhead of Cyberbrain in multi-frame tracing is to cut some tracing branches.
There are two types of branches that can be cut.
The pure function calls
Every invocation produces a snapshot of the variables currently in use. If we collect snapshots before and after an invocation in the current frame, and the invocation makes no non-pure calls, the detailed events of that invocation in the deeper frame can be re-calculated later.
As for how to determine whether an invocation in a frame is non-pure: yes, we cannot do that. Maybe let the user decide which frames are pure calculations, so that some tracing branches can be cut.
In the above example, we only need to save the arguments and the return value of the `add` function calls. We can re-calculate the details if the user interacts to view the invocations of the `add` function. There is no need to deep copy everything. The snapshots in `multiply` only need to record modifications to the variable `answer`.
Needs further discussion.
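The record-and-replay idea for pure calls can be sketched as follows. This is an illustration under stated assumptions, not how Cyberbrain works: the `add` and `multiply` functions here are hypothetical stand-ins for the ones discussed, and `pure`, `recorded_calls`, and `replay` are invented names. The user marks a function as pure; during the run only its arguments and return value are stored, and the line-by-line events are recovered later by re-running the call under `sys.settrace`.

```python
import functools
import sys

recorded_calls = []  # (original function, args, return value)


def pure(func):
    """Hypothetical user-facing marker: store only args and return value."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        recorded_calls.append((func, args, result))
        return result
    return wrapper


@pure
def add(a, b):
    total = a + b
    return total


def multiply(a, b):
    answer = 0
    for _ in range(b):
        answer = add(answer, a)
    return answer


def replay(func, args):
    """Re-run a recorded pure call and collect its line and return events."""
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # ignore every frame except the replayed function
        if event in ("line", "return"):
            events.append((event, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return events
```

After `multiply(3, 2)`, `recorded_calls` holds two small tuples instead of full snapshots of `add`'s frame, and `replay(func, args)` on either entry reconstructs the local variables at each line only when someone asks for them. This is only valid if the function really is pure, which is why the marker is left to the user.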
After https://github.com/laike9m/Cyberbrain/commit/9789ab0cf804d7990ead9842b6fd372ed39f4cac (Replaced protobuf with msgpack)
py-spy result: https://laike9m.github.io/images/9789ab0.svg
Message encoding is not a bottleneck anymore.