beartype performance is substantially slower than no beartype
See original GitHub issue

Hello,
I was quite interested when I found this Stack Overflow answer about beartype. As a POC, I cooked up a performance test using beartype, Enthought traits, traitlets, and plain-ole Python duck-typing.

But I found that beartype was pretty slow in my test. As an attempt to be as fair as possible, I used assert to enforce types in my duck-typed function (ref main_duck_assert()).

I also confess that Enthought traits are compiled, so the Enthought traits data below is mostly just an FYI.
Running my comparison 100,000 times:

```
$ python test_type.py
timeit duck getattr time: 0.0395 seconds
timeit duck assert time: 0.0417 seconds
timeit traits time: 0.0633 seconds
timeit traitlets time: 0.5236 seconds
timeit bear time: 0.0782 seconds
$
```
Question: Am I doing something wrong with bear-typing (ref my POC code below)? Is there a way to improve the “beartyped” performance?
My rig:

- Linux under VMWare (running on a Lenovo T430); kernel version 4.19.0-12-amd64
- Python 3.7.0
- beartype version 0.8.1
- Enthought traits version 6.3.0
- traitlets version 5.1.0
```python
from beartype import beartype
from traits.api import HasTraits as eHasTraits
from traits.api import Unicode as eUnicode
from traits.api import Int as eInt
from traitlets import HasTraits as tHasTraitlets
from traitlets import Unicode as tUnicode
from traitlets import Integer as tInteger
from timeit import timeit


def main_duck_getattr(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and getattr"""
    getattr(arg01, "capitalize")  # Type-checking with attributes
    getattr(arg02, "to_bytes")  # Type-checking with attributes
    str_len = len(arg01) + arg02
    getattr(str_len, "to_bytes")
    return ("duck_bar", str_len)


def main_duck_assert(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and assert"""
    assert isinstance(arg01, str)
    assert isinstance(arg02, int)
    str_len = len(arg01) + arg02
    assert isinstance(str_len, int)
    return ("duck_bar", str_len)


class MainTraits(eHasTraits):
    """Proof of concept code implementing Enthought traits args"""
    arg01 = eUnicode()
    arg02 = eInt()

    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traits_bar", self.str_len)


class MainTraitlets(tHasTraitlets):
    """Proof of concept code implementing traitlets args"""
    arg01 = tUnicode()
    arg02 = tInteger()

    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traitlets_bar", self.str_len)


@beartype
def main_bear(arg01: str = "__undefined__", arg02: int = 0) -> tuple:
    """Proof of concept code implementing bear-typed args"""
    str_len = len(arg01) + arg02
    return ("bear_bar", str_len)


if __name__ == "__main__":
    num_loops = 100000
    duck_result_getattr = timeit('main_duck_getattr("foo", 1)', setup="from __main__ import main_duck_getattr", number=num_loops)
    print("timeit duck getattr time:", round(duck_result_getattr, 4), "seconds")
    duck_result_assert = timeit('main_duck_assert("foo", 1)', setup="from __main__ import main_duck_assert", number=num_loops)
    print("timeit duck assert time:", round(duck_result_assert, 4), "seconds")
    traits_result = timeit('mm.run("foo", 1)', setup="from __main__ import MainTraits; mm = MainTraits()", number=num_loops)
    print("timeit traits time:", round(traits_result, 4), "seconds")
    traitlets_result = timeit('tt.run("foo", 1)', setup="from __main__ import MainTraitlets; tt = MainTraitlets()", number=num_loops)
    print("timeit traitlets time:", round(traitlets_result, 4), "seconds")
    bear_result = timeit('main_bear("foo", 1)', setup="from __main__ import main_bear", number=num_loops)
    print("timeit bear time:", round(bear_result, 4), "seconds")
```
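For reference, a completely unchecked baseline would make it easier to isolate the cost of the type checks themselves. This helper is my own addition, not part of the original script:

```python
from timeit import timeit

def main_plain(arg01="__undefined__", arg02=0):
    """Baseline with no type checking at all (hypothetical addition)."""
    str_len = len(arg01) + arg02
    return ("plain_bar", str_len)

# globals=globals() lets timeit see main_plain without a "from __main__ import ..." setup string.
plain_result = timeit('main_plain("foo", 1)', number=100_000, globals=globals())
print("timeit plain time:", round(plain_result, 4), "seconds")
```

Subtracting this baseline from each checked variant's timing gives the per-strategy checking overhead directly.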
Issue Analytics

- Created 2 years ago
- Comments: 13 (7 by maintainers)
Top GitHub Comments
Hah-hah! I love fielding questions like this, because overly scrupulous fixation on efficiency is my middle name(s).
Thankfully, according to the wizened sages of old and our own timeit timings, @beartype is still as blazing fast at call time as it always was. In general, @beartype adds anywhere from 1µsec (i.e., 10⁻⁶ seconds) in the worst case to 0.01µsec (i.e., 10⁻⁸ seconds) in the best case of call-time overhead to each decorated callable. This superficially seems reasonable – but is it? Let's delve deeper.
Formulaic Formulas: They’re Back in Fashion
First, let’s formalize how exactly we arrive at the call-time overheads above.
Given any pair of reasonably fair timings (which yours absolutely are) between an undecorated callable and its equivalent @beartype-decorated callable, let:

- n be the number of times (i.e., loop iterations) each callable is repetitiously called.
- γ be the total time in seconds of all calls to that undecorated callable.
- λ be the total time in seconds of all calls to that @beartype-decorated callable.

Then the call-time overhead Δ(n, γ, λ) added by @beartype to each call is:

    Δ(n, γ, λ) = (λ − γ) / n

Plugging in n = 100000, γ = 0.0435s, and λ = 0.0823s from your excellent timings, we see that @beartype on average adds call-time overhead of 0.388µsec to each decorated call: e.g.,

    Δ(100000, 0.0435s, 0.0823s) = (0.0823s − 0.0435s) / 100000 ≈ 0.388µsec

Again, this superficially seems reasonable – but is it? Let's delve deeper.
Function Call Overhead: The New Glass Ceiling
Next, the added cost of calling @beartype-decorated callables is a residual artifact of the added cost of stack frames (i.e., function and method calls) in Python. The mere act of calling any pure-Python callable adds a measurable overhead – even if the body of that callable is just a noop doing absolutely nothing. This is the minimal cost of Python function calls.

Since Python decorators almost always add at least one additional stack frame (typically as a closure call) to the call stack of each decorated call, this measurable overhead is the minimal cost of doing business with Python decorators. Even the fastest possible Python decorator necessarily pays that cost.
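That floor can be measured directly by timing a call to a do-nothing function against an inline pass statement (a sketch of my own; absolute numbers vary by machine and interpreter):

```python
from timeit import timeit

def noop():
    """A pure-Python callable whose body does absolutely nothing."""
    pass

n = 100_000
call_cost = timeit("noop()", number=n, globals=globals())  # n function calls
inline_cost = timeit("pass", number=n)                     # n inline no-op statements
print(f"approx. per-call overhead: {(call_cost - inline_cost) / n * 1e6:.3f} µsec")
```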
Our quandary thus becomes: “Is 1µsec–0.01µsec of call-time overhead reasonable, or is this sufficiently embarrassing as to bring multigenerational shame upon our entire extended family tree, including that second cousin twice-removed who never sends a kitsch greeting card featuring Santa playing with mischievous kittens at Christmas time?”
We can answer that by first inspecting the theoretical maximum efficiency for a pure-Python decorator that performs minimal work by wrapping the decorated callable with a closure that just defers to the decorated callable. This excludes the identity decorator (i.e., decorator that merely returns the decorated callable unmodified), which doesn’t actually perform any work whatsoever. The fastest meaningful pure-Python decorator is thus:
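The code block that originally followed this sentence did not survive the page scrape. A minimal sketch of such a decorator, assuming the usual functools.wraps pattern (the decorated function main_fast is my own illustration):

```python
from functools import wraps

def fastest_decorator(func):
    """Wrap func in a closure that does nothing but defer to func."""
    @wraps(func)  # preserve func's name, docstring, and signature metadata
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@fastest_decorator
def main_fast(arg01: str = "__undefined__", arg02: int = 0) -> tuple:
    """Same body as main_bear, but with the do-nothing decorator."""
    return ("fast_bar", len(arg01) + arg02)
```

Swapping @beartype for @fastest_decorator in the timing harness then measures pure decoration overhead, with zero type-checking work.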
By replacing @beartype with @fastest_decorator in your awesome snippet, we can expose the minimal cost of Python decoration. Again, plugging in n = 100000, γ = 0.0889s, and λ = 0.1185s from your excellent timings, we see that @fastest_decorator on average adds call-time overhead of 0.3µsec to each decorated call: e.g.,

    Δ(100000, 0.0889s, 0.1185s) = (0.1185s − 0.0889s) / 100000 ≈ 0.3µsec

Holy Balls of Flaming Dumpster Fires
Holy balls, people. I’m actually astounded myself.
Above, we saw that @beartype on average only adds call-time overhead of 0.388µsec to each decorated call. But 0.388µsec − 0.3µsec = 0.088µsec, so @beartype only adds 0.1µsec (generously rounding up) of additional call-time overhead above and beyond that necessarily added by the fastest possible Python decorator.

Not only is @beartype within the same order of magnitude as the fastest possible Python decorator, it's effectively indistinguishable from the fastest possible Python decorator on a per-call basis.

Of course, even a negligible time delta accumulated over 100,000 function calls becomes slightly less negligible. Still, it's pretty clear that @beartype remains the fastest possible runtime type-checker for now and all eternity. Amen.

But, but… That's Not Good Enough!
Yeah. None of us are pleased with the performance of the official CPython interpreter anymore, are we? CPython is that geriatric old man down the street that everyone puts up with because they’ve seen Pixar’s Up! and he means well and he didn’t really mean to beat your equally geriatric 20-year-old tomcat with a cane last week. Really, that cat had it comin’.
If @beartype still isn't ludicrously speedy enough for you under CPython, we also officially support PyPy3 – where you're likely to extract even more ludicrous speed.

Does that fully satisfy your thirsty cravings for break-neck performance? If so, feel free to toggle that Close button. If not, I'd be happy to hash it out over a casual presentation of further timings, fake math, and Unicode abuse.
tl;dr
@beartype (and every other runtime type checker) will always be negligibly slower than hard-coded inlined runtime type-checking, thanks to the negligible (but surprisingly high) cost of Python function calls. Where this is unacceptable, PyPy3 is your code's new BFFL.

You've gone above and beyond the bear call of duty. Now, I can only inundate you with my prophetic memes.