The Future Sound of Beartype: Going Deep on Type-checking
2020 was an overloaded faceplant for humanity, right? Thanks for devoting your scarce attention in trying times to this small slice of the meaty pie that is Python quality assurance. You are awesome.
Greetings from a distant past, fellow Pythonista. By the time you read this, everything below has already become a nostalgic time capsule obsoleted by the sandy hands of Time and Guido. Still, let’s do it.
In this pinned issue, we discuss where `beartype` is going, where `beartype` has been, and how we can get from here to 1.0.0 without hitting any more code pigeons with whom we had a deal.
But first…
Let’s Talk About You!
…because you are awesome. Here are all the ways we hope to build a vibrant `O(1)` community that will outlive even cyborg head-in-a-jar @leycec.
Forum
Our forum “GitHub Discussions” is now up. Thanks to the explosive growth in both GitHub stars and page views, GitHub automatically unlocked its beta forum implementation for us. Hooray!
Ask me anything (AMA). I promise to answer at least 10% of all questions – with particular attention to weeb genre tags like Japan, video games, heavy metal, scifi, fantasy, and the intersection of those five hot topics. So Japanese scifi-fantasy video games with metal OSTs. They exist, people.
Wiki
Our wiki is now open to public modifications. Since `beartype` is trivial to install (`pip`), configure (no configuration), and use (no API), there's currently no incentive for a wiki. I acknowledge this. My glorious dream is that the wiki will be an extemporaneous idea factory and whiteboard for unsound and probably dangerous methods of constant-time runtime type checking.
If that fails to manifest and the wiki just devolves into a spam-clogged cesspit of black-hat depravity, we’ll reassess. Until then, do what thou wilt shall be the whole of the Law.
Pull Requests
We greedily welcome any pull request no matter how small or thanklessly ambitious. However, note that volunteer contributions will be… complicated. On the one hand, `beartype` is meticulously documented, commented, and tested. On the other hand, `beartype` is internally implemented as a two-pass type checker:

- A stupidly fast `O(1)` type checker that only tests whether all passed parameters and returned values satisfy all type hints.
- A stupidly slow `O(n)` type checker that raises human-readable exceptions in the event that one or more passed parameters or returned values violate a type hint.
Also, did we mention that the first pass is stupidly overtuned with cray-cray memoization, micro-optimizations, and caching? Despite my best efforts, this means that meaningful pull requests may never happen. I admit that this is non-ideal – but also unavoidable. Speed and PEP-compliance (in that order) are our primary motivations here. Maintainability and discoverability are tertiary concerns.
This is the high cost of adrenaline. So it goes.
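To make that two-pass split concrete, here's a deliberately simplified sketch of the *shape* of a wrapper for a `list[int]` parameter – not `beartype`'s actual generated code, which is heavily specialized, memoized, and unrolled per hint:

```python
from random import randint

def _make_wrapper(func):
    def wrapper(muh_list):
        # First pass, O(1): a shallow isinstance() check plus a spot-check of
        # one randomly indexed item, rather than every item.
        if isinstance(muh_list, list) and (
            not muh_list or
            isinstance(muh_list[randint(0, len(muh_list) - 1)], int)
        ):
            return func(muh_list)

        # Second pass, O(n): only runs on a violation, re-walking the
        # offending object to build a human-readable message. (The real
        # wrapper raises exception types from "beartype.roar" instead.)
        if not isinstance(muh_list, list):
            raise TypeError(f'Parameter muh_list={muh_list!r} not a list.')
        bad = next(item for item in muh_list if not isinstance(item, int))
        raise TypeError(
            f'Parameter muh_list={muh_list!r} violates list[int] '
            f'(offending item: {bad!r}).')
    return wrapper
```

The happy path costs one `isinstance()` call and one random index; the linear walk is only ever paid when an exception is about to be raised anyway.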
The Future: It Looks Better Than the Past
Sloppy Markdown calendars >>>> real project management that middle-management always stuffs into Excel macro-driven Gantt charts, so:
| Year | Month | Thing @leycec Is Doing for beartype |
|---|---|---|
| 2020 | December | Success! |
| 2021 | January | Success! |
| 2021 | February | Success! beartype 0.6.0, Part Un. (That means "one" in Quebec.) |
| 2021 | March | beartype 0.6.0, Part Deux. Ship it. |
| 2021 | April | beartype 0.7.0, Part Un. |
| 2021 | May | beartype 0.7.0, Part Deux. Ship it. |
| 2021 | June | beartype 0.8.0, Part Un. |
| 2021 | July | beartype 0.8.0, Part Deux. Ship it. |
| 2021 | August | beartype 0.9.0, Part Un. |
| 2021 | September | beartype 0.9.0, Part Deux. Ship it. |
| 2021 | October | beartype 1.0.0, Part Un. |
| 2021 | November | beartype 1.0.0, Part Deux. But we are not ready yet! |
| 2021 | December | beartype 1.0.0, Part Trois. Ship it. |
The official roadmap says one year to 1.0.0. Will we make it? Defly not. We'll fumble the pass in the final yard, stumble into the enemy mascot, and land in the celebratory Gatorade™ bucket as the ref slashes the air with a red card to mass hysteria from the crowbar-wielding rowdy crowd.
The official roadmap cannot be denied, however.
Beartype 0.6.0: The Mappings Are Not the Territory
Beartype 0.6.0 intends to extend deep type-checking support to core data structures and abstract base classes (ABCs) implemented via the `hash()` builtin, including:
- `dict`
- `frozenset`
- `collections.ChainMap`
- `collections.OrderedDict`
- `collections.defaultdict`
- `collections.abc.ItemsView`
- `collections.abc.KeysView`
- `collections.abc.Mapping`
- `collections.abc.MutableMapping`
- `collections.abc.MutableSet`
- `collections.abc.Set`
- `collections.abc.ValuesView`
- `typing.ChainMap`
- `typing.DefaultDict`
- `typing.Dict`
- `typing.FrozenSet`
- `typing.ItemsView`
- `typing.KeysView`
- `typing.Mapping`
- `typing.MutableMapping`
- `typing.MutableSet`
- `typing.OrderedDict`
- `typing.Set`
- `typing.ValuesView`
`beartype` currently only shallowly type-checks these type hints. We can do better. We must do better! The future itself may very well depend upon it.

These are among the last big-ticket hints we need to deeply type-check, but they're also the least trivial. Although the C-based CPython implementation almost certainly stores both set members and dictionary keys and values as hash bucket sequences, it fails to expose those sequences to the Python layer. This means `beartype` has no efficient random access to arbitrary set members or dictionary keys and values.
Does that complicate `O(1)` runtime type-checking of sets and dictionaries? Yes. Yes, it does. I do have a number of risky ideas here, most of which revolve around internal caches of `KeysView`, `ValuesView`, and `ItemsView` iterators (i.e., iterators over the view objects returned by the `dict.keys()`, `dict.values()`, and `dict.items()` methods). I don't want to blow anything up, so this requires care, forethought, and a rusty blood-flecked scalpel.
Those view iterators only provide efficient access to the next item of the dictionary they iterate. This means the only efficient means of deeply type-checking one unique dictionary item per call to a `@beartype`-decorated callable is for `beartype` to internally cache and reuse those iterators across calls. This must be done with maximum safety. To avoid memory leaks, cached iterators must be held as weak rather than strong references. To avoid exhausting memory, cached iterators must be stored in a bounded rather than unbounded data structure.
The stdlib `@functools.lru_cache` decorator is the tortoise of Python's caching world. Everyone thinks it's fast until they inspect its implementation. Then they just define their own caching mechanism. Of course, `beartype` did exactly that with respect to an unbounded cache: our private `beartype._util.cache.utilcachecall` submodule defines the fastest-known pure-Python unbounded cache decorator, which we liberally call everywhere to memoize internal `beartype` callables.

Now, we'll need to define a similar bounded cache that caches no more than a caller-specified maximum number of entries. This isn't hard, but it's still work, and work takes resources. This is why my face looks constipated on a daily basis.
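A minimal sketch of the kind of bounded cache this calls for – assuming a simple least-recently-used eviction policy, which is one plausible choice rather than a commitment:

```python
from collections import OrderedDict

class BoundedCache:
    '''
    Minimal bounded mapping evicting its least-recently-used entry once a
    caller-specified maximum size is exceeded.
    '''

    def __init__(self, max_size: int = 256) -> None:
        self._max_size = max_size
        self._entries = OrderedDict()

    def __contains__(self, key) -> bool:
        return key in self._entries

    def __getitem__(self, key):
        # Mark this key as most-recently-used on access.
        self._entries.move_to_end(key)
        return self._entries[key]

    def get(self, key, default=None):
        return self[key] if key in self._entries else default

    def __setitem__(self, key, value) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict the least-recently-used entry on overflow.
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)
```

The `max_size` default above is arbitrary; whatever `beartype` eventually ships would presumably expose that knob.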
But that's not all. We then need to populate that cache with weak references to view iterators dynamically created and cached at call time for `@beartype`-decorated callables in `O(1)` time with negligible constants. For the specific case of item views, this might resemble the following Pythonic pseudocode:
```python
from collections.abc import Mapping as MappingType
from weakref import ref

#FIXME: Currently, this is unbounded. Define this as some sort of bounded
#dictionary containing only recently accessed key-value pairs.
dict_id_to_items_iter = {}
'''
Bounded dictionary cache mapping from the unique identifier of each mapping
recently passed to a :mod:`beartype`-decorated callable to a weak reference to
an iterator over that mapping's key-value pairs.
'''


def get_dict_nonempty_next_item(mapping: MappingType) -> object:
    '''
    Get the next key-value pair from the items iterator internally cached by
    :mod:`beartype` for the passed non-empty mapping.

    Specifically, this getter (in order):

    #. If either :mod:`beartype` has yet to internally cache an items iterator
       for this mapping *or* the prior call to this function returned the last
       key-value pair from that iterator, internally creates and caches a new
       items iterator for this mapping.
    #. Returns the next item from that iterator.

    Caveats
    ----------
    **This mapping is assumed to be non-empty.** If this is *not* the case,
    this getter raises a :class:`StopIteration` exception.

    Parameters
    ----------
    mapping : MappingType
        Non-empty mapping to be iterated.

    Returns
    ----------
    object
        Next key-value pair from the items iterator internally cached by
        :mod:`beartype` for this non-empty mapping.

    Raises
    ----------
    StopIteration
        If this mapping is empty. Ergo, this getter should *only* be passed
        mappings known to be non-empty.
    '''

    # Integer uniquely identifying this mapping.
    mapping_id = id(mapping)

    # Weak reference to the items iterator previously cached for this mapping
    # if any *OR* "None" otherwise.
    items_iter_ref = dict_id_to_items_iter.get(mapping_id, None)

    # Items iterator referred to by that weak reference if that reference is
    # both non-None and still alive *OR* "None" otherwise.
    items_iter = items_iter_ref() if items_iter_ref is not None else None

    # If this is the first call to a decorated callable passed or returning
    # this mapping (or the previously cached iterator has since died)...
    if items_iter is None:
        #FIXME: Protect this both here and below with a
        #"try: ... except Exception: ..." block, where the body of the
        #"except Exception:" condition should probably just return
        #"beartype._util.utilobject.SENTINEL", as the only type hints
        #that would ever satisfy that sentinel are type hints *ALL* objects
        #already satisfy (e.g., "Any", "object").
        #
        # Note that CPython's built-in dict iterators may not support weak
        # references at all, in which case a thin wrapper object (or a
        # different caching scheme) would be needed in practice.
        items_iter = iter(mapping.items())
        dict_id_to_items_iter[mapping_id] = ref(items_iter)
    # Else, this iterator was previously cached.

    # Attempt to return the next key-value pair from this iterator.
    try:
        return next(items_iter)
    # If we get to the end (i.e., the prior call to next() raises a
    # "StopIteration" exception) *OR* anything else happens (i.e., the prior
    # call to next() raises a "RuntimeError" exception due to the underlying
    # mapping having since been externally mutated), silently start over. :p
    except Exception:
        # Note that we could also recursively call ourselves here: e.g.,
        #     return get_dict_nonempty_next_item(mapping)
        # However, that would be both inefficient and dangerous.
        items_iter = iter(mapping.items())
        dict_id_to_items_iter[mapping_id] = ref(items_iter)
        return next(items_iter)
```
`@beartype` would then generate wrappers internally calling the above `get_dict_nonempty_next_item()` function to obtain a basically arbitrary (…yes, yes, insertion order, I know, but let's just pretend because it's late and I'm tired) mapping key, value, or key-value pair to be deeply type-checked in `O(1)` time. This is more expensive than randomly deeply type-checking sequence items, but hopefully not prohibitively so. It's all relative here. As long as we can shave this down to milliseconds of overhead per call, we're still golden, babies.
Note that we basically can't do this under Python < 3.8, due to the lack of assignment expressions there. Since `get_dict_nonempty_next_item()` returns a new key-value pair on each call, we can't repeatedly call it for each child pith and expect the same key-value pair to be returned. So, assignment expressions under Python >= 3.8 only. </shrug>
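Purely for illustration, the generated check for a hypothetical `muh_dict: dict[str, int]` parameter might lean on an assignment expression so that both halves of the sampled pair are tested against the *same* pair – a sketch of the idea, not `beartype`'s actual code generation:

```python
# Hypothetical fragment of a generated wrapper. The walrus operator binds the
# sampled key-value pair exactly once, so the key check and the value check
# both observe the same pair returned by get_dict_nonempty_next_item().
is_valid = isinstance(muh_dict, dict) and (
    not muh_dict or (
        isinstance((muh_item := get_dict_nonempty_next_item(muh_dict))[0], str) and
        isinstance(muh_item[1], int)
    )
)
```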
Under Python < 3.8, `beartype` will fall back to just unconditionally deeply type-checking the first key, value, item, or member of each passed or returned mapping and set. That's non-ideal, but Python < 3.8 is the past and the past is bad, so nobody cares. It's best that way.
Beartype 0.7.0: Well-hung Low-hanging Fruit
Beartype 0.7.0 intends to extend deep type-checking support to non-core data structures and abstract base classes (ABCs) – each of which is trivial to support in isolation but all of which together will break me like a shoddy arrow over their knees:
- `type`
- `collections.deque`
- `collections.Counter`
- `collections.abc.Collection`
- `collections.abc.Container`
- `collections.abc.Iterable`
- `collections.abc.Reversible`
- `typing.Collection`
- `typing.Container`
- `typing.Counter`
- `typing.Deque`
- `typing.Iterable`
- `typing.NamedTuple`
- `typing.Reversible`
- `typing.Type`
- `typing.TypedDict`
It is doable. Will it be done? Four out of five respondents merely shrug.
Beartype 0.8.0: Calling All Callables
Beartype 0.8.0 intends to extend deep type-checking support to callables:
- `collections.abc.AsyncIterable`
- `collections.abc.Awaitable`
- `collections.abc.Callable`
- `collections.abc.Coroutine`
- `typing.AsyncIterable`
- `typing.Awaitable`
- `typing.Callable`
- `typing.Coroutine`
Dynamically type-checking callables at runtime in `O(1)` time is highly non-trivial and (maybe) even infeasible.
Some types of callables can’t reasonably be deeply type-checked at all at runtime. This includes one-time-only constructs like generators and iterators, which can’t be iterated without being destroyed. So, the Heisenberg Uncertainty Principle of Python objects.
Most types of callables can reasonably be deeply type-checked at runtime, but it's unclear how that can happen in `O(1)` time. The only non-harmful approach is to ignore the callables themselves and mono-focus instead on the annotations on those callables. Specifically:
- Ignore unannotated callables, because we can't reasonably call them to deeply type-check them without horrible side effects destroying the fabric of your fragile web app like tissue paper.
- Generate code deeply type-checking callable annotations without iteration in `@beartype`-decorated callables. Since the arguments subscripting a callable type hint are finite and typically quite small in number (e.g., `collections.abc.Callable[[int, str], dict[float, bool]]`), there is little incentive to randomize here. Instead, we generate code resembling the code we currently generate for fixed-length tuples (e.g., `tuple[int, str, float]`) – see the sketch after this list.
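As a hedged illustration of that tuple-style strategy – the helper name below is made up for this sketch, not a `beartype` API – the generated logic would compare the callable's own declared annotations against the hint's finite list of child hints, one by one, with no iteration over data:

```python
from typing import get_type_hints

def _callable_matches_hint_sketch(func, param_hints, return_hint) -> bool:
    # Introspect the callable's declared annotations once. Like a
    # fixed-length tuple check, each expected child hint is compared against
    # the corresponding declared annotation. (A real checker would need
    # subtype-aware comparison rather than simple identity.)
    hints = get_type_hints(func)
    declared_return = hints.pop('return', None)
    declared_params = list(hints.values())
    return (
        len(declared_params) == len(param_hints) and
        all(declared is expected
            for declared, expected in zip(declared_params, param_hints)) and
        declared_return is return_hint
    )

# E.g., does "func" look like "Callable[[int, str], bool]"?
def func(x: int, y: str) -> bool: ...
assert _callable_matches_hint_sketch(func, (int, str), bool)
```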
This will necessitate looking up annotations for stdlib callables in the infamous third-party `typeshed`, which complicates matters. Whose bright idea was it, anyway, to offload annotations for stdlib callables onto some third-party repo? That's what PEP 563 is for, official Python developers. PEP 563 means nobody has to care about the space or time costs associated with annotations anymore. Maybe we should actually use that. </sigh>
Beartype 0.9.0: Type Variables Mean Pain
Beartype 0.9.0 intends to extend deep type-checking support to parametrized type hints (i.e., type hints subscripted by one or more type variables). I won't even begin to speculate how this will happen. I have a spontaneous aneurysm every time I try thinking about going deeper than a toy example like `def uhoh(it_hurts: T) -> list[T]`. It's when you get into type variables subscripting arbitrarily nested container and union type hints that my already meagre mental faculties begin unravelling under the heavy cognitive load.
It is possible to do this. But doing this could cost me the last threads of my already badly frayed sanity. Challenge accepted.
Beartype 1.0.0: All the Remaining Things
Beartype 1.0.0 intends to extend deep type-checking support to everything that’s left – some of which may not even reasonably be doable at runtime. The most essential of these include:
- Class decoration. Generalizing the `@beartype` decorator to support both callables and classes is both critical and trivial – with one notable exception: parametrized classes (i.e., user-defined generic classes subclassing either `typing.Generic` or `typing.Protocol`, subscripted by one or more type variables). Supporting parametrized classes will probably prove to be non-trivial, because it means constraining types not merely within each class method but across all class methods annotated by parametrized type hints, which means that state needs to be internally preserved between method calls, which we currently do not do at all, because that is unsafe and hard. But that's fine. That's what we're here for. We're not here for Easy Mode™ type checking. We're here because this is the Soulsborne of the type-checking world. If it isn't brutal, ugly, and constantly stealing your soul(s), it ain't `beartype`.
- `typing.Literal`, the `typing` hint formerly known as PEP 586. While `typing.Literal` itself is mostly inconsequential, supporting that hint in `beartype` requires a significant refactoring that should yield speedups across the board for all other hints. Well, isn't that special?
Let’s trip over the above issue-laden minefields when we actually survive the preceding milestones with intact sanity, keyboard, and fingernails. Phew.
Beyond the Future: What Does That Even Mean?
Beartype 1.0.0 brings us perfect constant-time compliance with all annotation standards. That's good. But is that it?
It is not.
Beartype + Sphinx + Read the Docs = A Match Made in Docstrings
The `beartype` API should be up on Read the Docs. It isn't, because we are slothful and full of laziness. Righting this wrong is a two-step process:
#. Enable local generation of Sphinx-based HTML documentation. Fortunately, we've judiciously documented every class, callable, and global throughout the codebase with properly-formatted reStructuredText (reST) in preparation for this wondrous day. The only real work here will be adding a top-level `docs/` directory containing the requisite Sphinx directory structure (see the sketch after this list). Still, it's probably one to two weeks worth of hard-headed volunteerism.
#. Enable remote hosting of that documentation on Read the Docs. I've never actually done this part before, so this will be the learning exercise. We'll probably need to wire up our GitHub Actions-based release automation to generate and publish new documentation with each stable release of `beartype`.
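For reference, the bare minimum such a `docs/` skeleton usually needs is a `conf.py` and an `index.rst`. A plausible (but assumed, not yet existing) minimal `conf.py` for that skeleton might look like:

```python
# docs/source/conf.py -- minimal Sphinx configuration (assumed layout, not
# the eventual beartype docs/ tree).
project = 'beartype'
author = 'beartype contributors'

# Extensions pulling API documentation straight from our reST docstrings.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.viewcode',
]

# Read the Docs' official theme, installed via the "sphinx_rtd_theme" package.
html_theme = 'sphinx_rtd_theme'
```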
tl;dr: critical and trivial, if a little time-consuming. Let’s do docs! And this leads us directly to…
Beartype Configuration API: It’s Happening
`beartype` currently has no public API… effectively. Technically, of course:

- The `beartype.cave` submodule is a public clearing house for common types and tuples of types, which was essential in the early days before we implemented full-blown PEP 484 compliance. Now? Vestigial, bro.
- The `beartype.roar` submodule exposes public exceptions raised by the `@beartype` decorator at decoration time and by wrapper functions generated by that decorator at call time. Since there's probably no valid use case for actually catching and handling any of these exceptions in external third-party code, this submodule doesn't do terribly much for anyone either.
Neither of those submodules could be considered to be an actual API. Post `beartype 1.0.0`, we'd like to change that. The most meaningful change is the change everyone (including me, actually!) really wants: the flexibility to configure `beartype` to deeply type-check more than merely one container item per nesting level per call. While `O(1)` constant-time type-checking will always be the `beartype` default, we'd also like to enable callers to both locally and globally enable:
- `O(log n)` logarithmic-time type-checking. Type-checking only a logarithmic number of container items per nesting level per call strikes a comfortable balance between type-checking just one item and type-checking all container items. Of course, this will still necessitate randomly selecting container items. Rather than just generating one random index each call, however, callables decorated with `O(log n)` type-checking will need to generate a logarithmic number of random indices each call. Since the Python standard library provides no sufficiently efficient means of doing so, we need to look further afield. The most efficient means that is commonly available is the `numpy.random.default_rng().integers()` method (see the sketch after this list). On systems lacking `numpy`, `beartype` will probably ignore attempts to enable `O(log n)` type-checking by emitting a non-fatal warning and falling back to `O(1)` type-checking. Other than that, this seems mostly trivial to implement. Good 'nuff.
- `O(n)` linear-time type-checking, affectionately referred to as "full fat" type-checking by an early adopter in a thread I've long since misplaced. Linear-time type-checking is reasonable only under certain ideal conditions that most callables fail to satisfy – like, say, callables guaranteed to receive and return either containers no larger than a certain size or larger containers that are only ever received or returned (and thus type-checked) exactly once throughout the entire codebase. But sometimes you absolutely know those things are the case (…at least for now) and you're willing to play dice with the DevGodsOfCrunch that that will absolutely, probably, hopefully never change. This is even more trivial to implement. But don't blame us when your entire web app crunches to a halt and you get "the call" on a Saturday night. We told you so.
- Caller-configurable hybrid type-checking. In this use case, callers configure container sizes at which they want various type-checking strategies to kick in. For example, callers might stipulate that for containers of arbitrary size `n`:
  - If `n <= 10`, perform `O(n)` type-checking on those containers.
  - If `n <= 100`, perform `O(log n)` type-checking on those containers.
  - For all other sizes, fall back to `O(1)` type-checking on those containers.
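Here's roughly what generating that logarithmic number of random indices could look like with NumPy – a sketch only, not `beartype`'s eventual implementation:

```python
from math import log2
import numpy as np

def sample_indices_logn(container_size: int) -> np.ndarray:
    '''
    Return a logarithmic number of random indices into a container of the
    passed size, suitable for O(log n) spot-checking of that container.
    '''
    # Check at least one item, and ~log2(n) items for larger containers.
    sample_count = max(1, int(log2(container_size)))
    # NumPy's Generator API produces all indices in a single vectorized call.
    return np.random.default_rng().integers(0, container_size, size=sample_count)

# For a 1,024-item container, this yields 10 random indices in [0, 1024).
print(sample_indices_logn(1024))
```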
Here's how this API might shape out in practice. First, we define a new `beartype.config` submodule declaring a new enumeration `ContainerStrategy = Enum('ContainerStrategy', 'O1 Ologn On Hybrid')` enabling callers to differentiate between these strategies. Then, we augment the `@beartype` decorator to accept an optional `container_strategy` parameter whose value is a member of this enumeration that (wait for it) defaults to `O1`.
Here’s how that API would be used in practice:
```python
from beartype import beartype
from beartype.config import ContainerStrategy

@beartype(container_strategy=ContainerStrategy.Ologn)
def i_know_what_im_doing(promise: list[list[list[int]]]) -> int:
    '''
    A logarithmic number of items at all three nesting levels of
    the passed "promise" parameter will be deeply type-checked
    on each call to this function.
    '''
    return promise[0][0][0]
```
Let's say you decide you like that. Thanks to the magic of the `functools.partial()` function, you could then mandate that strategy throughout your codebase as follows:
```python
from beartype import beartype
from beartype.config import ContainerStrategy
from functools import partial

beartype_Ologn = partial(beartype, container_strategy=ContainerStrategy.Ologn)

@beartype_Ologn
def i_still_know_what_im_doing(i_swear: list[list[list[int]]]) -> int:
    '''
    A logarithmic number of items at all three nesting levels of
    the passed "i_swear" parameter will be deeply type-checked
    on each call to this function.
    '''
    return i_swear[0][0][0]
```
You would then decorate callables with your custom `@beartype_Ologn` decorator rather than the default `@beartype` decorator. This conveniently circumvents the need for `beartype` to define a global configuration API, which I'm reluctant to do, because I'm lazy. Full stop.

Of course, we should probably just publicly define `@beartype_Ologn` and `@beartype_On` decorators for everybody so that nobody even has to explicitly bother with `functools.partial()`. We should. But… we're lazy!
And now for something completely different.
Beyond Typing PEPs, There Lies Third-Party Typing
Third-party types include all of the scientific ones that made Python a household name in machine learning, data science, material science, and biotechnology over the past decade – including:
- Multidimensional NumPy arrays.
- Multidimensional Pandas data frames.
- Graphs, including:
  - NetworkX graphs, digraphs, and multigraphs.
  - GraphViz-based graphs such as in PyDot.
  - Probably many, many more.
- Tensors, including:
  - PyTorch named and unnamed tensors.
  - TensorFlow tensors.
  - Probably many, many more.
- Geometric constructs like SciPy-based Voronoi diagrams and KD trees.
There will probably never be a standard for type-hinting these types, because these types reside outside the Python standard library.
But that doesn’t mean we’re done here. We can produce type hints that are both PEP-compliant and generically usable at runtime by any runtime type checker (as well as by static type checkers with explicit support for those hints).
How? By leveraging PEP 3119 – "Introducing Abstract Base Classes", which standardized the `__instancecheck__()` and `__subclasscheck__()` metaclass dunder methods. These methods enable third-party developers to dynamically create new classes on-the-fly that:
- Technically comply with PEP 484, because “user-defined classes (including those defined in the standard library or third-party modules)” are explicitly PEP-compliant.
- Can be used as PEP-compliant type hints to validate the structure of arbitrary third-party data structures, including those listed above.
The idea here is to extract that machinery into a new PyPI-hosted package named `deeptyping` (or something something) that declares one type hint factory for each of the above data structures. `deeptyping` will be designed from the ground up to be implicitly useful at both static time and runtime by performing deep type-checking on runtime calls to the `isinstance()` and `issubclass()` builtins. This stands in stark contrast to the standard `typing` module, which almost always prohibits calls to those builtins and is thus mostly useless at runtime. Ergo, "deep" typing.
For example, here’s what PEP-compliant NumPy array type hints might look like:
```python
from beartype import beartype
from deeptyping import NumpyArray

# This constrains the passed "funbag_of_fun" parameter to be a two-dimensional
# NumPy array with any float dtype (e.g., "np.float32", "np.float64").
@beartype
def catch_the_funbag(funbag_of_fun: NumpyArray(dtype=float, ndim=2)) -> float:
    return funbag_of_fun[0, 0]
```
The `deeptyping.NumpyArray()` class factory function would have a signature resembling:

```python
from numpy import dtype as _dtype

def NumpyArray(dtype: _dtype, ndim: int) -> type: ...
```
That function would be implemented as a class factory dynamically creating and returning new memoized classes compliant with PEPs 3119 and 484. For example, here’s untested pseudocode (that will blow up horribly, which will make me feel bad, so don’t try this) for the metaclass of the class that function might create when passed the above parameters:
```python
import numpy as np
from numpy import ndarray

class _NumpyArrayDtypeFloatNdim2Metaclass(type):
    '''
    Metaclass dynamically generated by the :func:`deeptyping.NumpyArray`
    class factory function for deep type-checking NumPy arrays of float
    dtype and dimensionality 2 in a manner compliant with PEPs 3119 and 484.
    '''

    def __instancecheck__(cls, obj: object) -> bool:
        # An object satisfies this hint only if it is a NumPy array of some
        # float dtype with exactly two dimensions.
        return (
            isinstance(obj, ndarray) and
            np.issubdtype(obj.dtype, np.floating) and
            obj.ndim == 2
        )

    def __subclasscheck__(cls, subcls: type) -> bool:
        return issubclass(subcls, ndarray)
```
There are probably substantially better ways to do that. Hopefully, it would suffice to statically declare merely one generic metaclass that would then be shared between all dynamically created classes returned by the `deeptyping.NumpyArray()` factory function.
Anyways. Everything above is pure speculation. The overarching point is that neither `beartype` nor any other runtime type checker will need to be refactored to depend on or otherwise explicitly support `deeptyping`, because all PEP 484-compliant runtime type checkers already implicitly support that sort of thing. It is good.
Technically, there do exist third-party packages violating all annotation standards that support type hinting of a few of the above data structures. The third-party `Traits` package, for example, supports NumPy-specific type hints – but only by violating all annotation standards. That rather defeats the point of standards. From the important perspective of PEP-compliance, it is bad.

Thus, `deeptyping`.
Beyond `typing_inspect`, There Lies `pepitup`
The dirty little secret behind Python's annotation standards is that they all lack usable APIs. No, the `typing.get_type_hints()` function doesn't count, because that function doesn't do anything useful that you can't do yourself while doing a whole lot that's non-useful (like being face-stabbingly slow). The only exception is PEP 544 – "Protocols: Structural subtyping (static duck typing)", which surprisingly does have a usable API. That's nice.
This pains us. In theory, it shouldn't be possible to get any Python Enhancement Proposal (PEP) that lacks a usable API past the peer review process – let alone the seven PEPs lacking usable APIs that `beartype` currently supports.
But here we are. It happened. It happened because the official Python developer community wrongly perceived static type checking to be the only useful kind of type checking. Most annotation PEPs never even use the adjective “runtime” except in an incidental or derogatory manner. PEP 484, for example, declares:
> The `Sequence[int]` notation works at runtime by implementing `__getitem__()` in the metaclass (but its significance is primarily to an offline type checker).
No, its significance is to all type checkers. Who wrote that?
> This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information. Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.
Of course, it’s never explained why static type-checking is the most (by which they mean “only”) important goal. It’s just assumed a priori that runtime type-checking is sufficiently insane, inefficient, and ineffectual as to be universally unimportant – when, in fact, static type-checking of dynamically-typed languages is insane by definition.
Runtime type checkers can decide entire classes of decision problems undecidable by static type checkers – classes of problems posed by dynamic idioms that are industry standard throughout the Python community.
Runtime type checkers also never report false negatives (not even for one-shot objects unintrospectable at runtime like generators and iterators). Runtime type checkers never have to guess, infer, or otherwise derive types. But static type checkers always do those things, because guesswork is all they do.
So `beartype` challenges common assumptions merely by existing. But that's not enough. We may have solved runtime type-checking by internally implementing private stand-in APIs for all these PEPs, but nobody else can safely access that or reuse our efforts.
We don't have our DeLorean yet, so we can't retroactively go back and fix this in the past. Even adding public APIs for these PEPs to the Python standard library wouldn't really help anyone, because Python 3.9 will still be alive through most of 2025. But that doesn't mean we just have to "suck it up."
The third-party `typing_inspect` package partially solves this problem by defining a limited public API for these PEPs. But it only supports:

- The most recent stable release of Python.
- A sharply limited subset of these PEPs.

Because of these limitations, you couldn't reimplement `beartype` based on `typing_inspect`, for example. But we can publicize `beartype` internals as our own separate third-party package that supports:

- All actively maintained releases of Python.
- The complete feature set of these PEPs.
`beartype` currently sequesters these internals to the private nested `beartype._util.hint.pep` subpackage, with associated machinery haphazardly strewn about. The idea here is to extract that machinery into a new PyPI-hosted package named `pepitup` (or something something) and then refactor `beartype` to depend on that package.
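To make the gap concrete: the stdlib's own `typing.get_origin()` and `typing.get_args()` (added in Python 3.8) cover only a slice of what such a package would need, but they illustrate the flavor of API we mean. A hypothetical `pepitup`-style helper built on them, purely for illustration:

```python
from typing import Dict, List, get_args, get_origin

def describe_hint(hint: object) -> str:
    '''
    Render a crude one-line description of the passed type hint -- the sort
    of introspection a hypothetical pepitup package would generalize across
    all supported PEPs and Python versions.
    '''
    origin = get_origin(hint)    # e.g., dict for Dict[str, int]
    children = get_args(hint)    # e.g., (str, int) for Dict[str, int]
    if origin is None:
        return f'{hint!r}: unsubscripted hint'
    return f'{hint!r}: origin={origin.__name__}, children={children!r}'

print(describe_hint(Dict[str, List[int]]))  # origin=dict, children=(str, List[int])
print(describe_hint(int))                   # unsubscripted hint
```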
It’s a nice idea. But nice ideas often never happen. Let’s see if this one does!
Influencer versus Introvert: Introvert Wins!
And last but certainly not least… influencing. Let's recount the ways @leycec should start banging on that social influencing drum to hype up the nascent `O(1)` runtime type-checking scene:
- Academic publication. This is the biggie. `beartype` is the table flip of the type-checking world. Cue that emoji. (╯°□°)╯︵ ┻━┻ `beartype` behaves fundamentally differently (both theoretically and practically) from all existing type checkers – runtime or not. This means we have more than a few novel things to talk about. Clearly, I like talking. Let's be honest, right? So that's not the issue. The issue is that passing peer review as an independent scientist with no current University affiliation is ~~balls hard~~ non-trivial. But I'm committed to doing this. Without publication, no one can formally cite `beartype` in their own publications, which means `beartype` is invisible to academia, which is bad. Moreover, publication constitutes a soft (maybe hard) prerequisite for securing eventual grant funding from governmental agencies in Canada and the U.S., the two nations I hold dual citizenship in. For sanity, publication will probably happen in incremental stages:
  - arXiv preprint. In this stage, we publish a preliminary technical report documenting `beartype` features, tradeoffs, and complications. I'm fluent in LyX (which I love) and conversant in LaTeX (which I loathe), so this should be "fun" for several qualifying definitions of "fun."
  - Informal peer review. In this stage, I beg various `beartype` aficionados, motivators, and early adopters with University affiliation for their totally nice and life-affirming constructive criticism. After my fragile ego recovers from the death blows by e-mail, I'll scrap the whole report and rewrite it from the ground up for…
  - Journal publication! In this stage, we summit the high mountain of formal peer review with minimal blood loss, festering head wounds, and coronary artery disease. We can safely assume whatever journal(s) I submit to will have blatantly pitiable impact factors and submission standards suggestive of attention-seeking desperation, which I for one welcome.
- Blog article series. My wife and I don't even have blogs anymore… because we got lazy and they sorta bit-rotted and now we really wish we hadn't let that happen. So that's a blocker. But I promise the uncaring world this: we will resurrect those unseemly blogs from their digital graves in 2021 and it shall be glorious! We're talking "GeoCities circa 1998" tier with dancing unicorns, Under Construction signs, Comic Sans typography, and ~~seizure-inducing~~ hypnotic flashing text littering every ad-strewn header.
- Twitter feed. Just… "Ugh." I know I should. But I'm a lakeside hermit with an unwieldy penchant for wilderness areas, Japanese culture (早稲田大学生, represent), and Python type-checking. The intersection of these three interests is the empty set. But yeah. Gotta snag those sweet retweet amplifiers if we want `O(1)` to go the whole nine yards. Let's see if @leycec can extricate himself from the seductive yet limiting INTP shell long enough to actualize this.
Phew. That real-world stuff really isn’t as easy as it looks.
And… We’re Done
🤯
Top GitHub Comments
`beartype 0.11.0` has just impacted the fragile surface of PyPI and `conda-forge`, turning 😑 into 🤯. Since `pip` is our friend in all things, so much typing goodness is in store for your codebase. Play the highlight reel, boys! Alternately, read the infodump and weep for your free time.
Colour: Beyond Monochrome
Is that… No, but it couldn't be. Yes, but it is! It's thematically appropriate and aesthetically pleasing ANSI pigmentation in type-checking violations raised by @beartype:
Colour me impressed, @beartype. For safety, colourization only happens conditionally when standard output is attached to an interactive terminal. Let us know if you detest this scheme. B-b-but… how could you? It’s beautiful.
Praise be to machine learning guru @justinchuby (Justin Chu) for his outstanding volunteerism in single-handedly making all this juiciness happen at PR #162. 😮
pyright: @beartype No Longer Disagrees with You
@beartype now officially supports two static type-checkers:

- mypy
- pyright

VSCode's rampant popularity makes Pylance and thus pyright the new ad-hoc standard for Python typing. Microsoft's will be done. I'm pretty certain – but not certain – that @beartype is the first "large" Python project to support multiple competing static type-checkers. We lack common sense.
Unofficially, we don't advise doing this at home. The @beartype codebase is now a festering litter dump of `# type: ignore[curse-your-cheating-eyes-mypy]` and `# pyright: ignore[TheseAreNotTheFalsePositivesYouAreLookingFor]`. Avert thy eyes, all who code there.

Class Decoration: Save Your Wrists with @beartype
Rejoice, RSI-wracked buddies! @beartype now decorates classes, too – including non-trivial nested classes with self-referential annotations postponed under PEP 563, just 'cause:
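Here's a minimal sketch of the kind of thing that now works – assuming PEP 563 postponement is enabled via the `from __future__ import annotations` import; the class and method names are made up:

```python
from __future__ import annotations  # PEP 563: postpone annotation evaluation

from beartype import beartype

@beartype
class GrizzledGrizzly(object):
    # The self-referential return hint "GrizzledGrizzly" is a mere string at
    # definition time thanks to PEP 563, yet is still resolved and
    # type-checked at call time.
    def cub(self) -> GrizzledGrizzly:
        return GrizzledGrizzly()

# Every method of the decorated class is now type-checked.
assert isinstance(GrizzledGrizzly().cub(), GrizzledGrizzly)
```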
Say goodbye to decorating methods manually. Your wrists that are throbbing with pain will thank you.
But this isn’t simply a nicety. If your codebase is object-oriented (…please Guido, let it be so), let us now suggest that you incrementally refactor your existing usage of @beartype from methods to classes. Why? Because:
- The `typing` module is moving towards class-centric type hints (e.g., `typing.Self`).
- `typing.Self` explodes onto the typing scene with Python 3.11, enabling methods to trivially annotate that they return instances of their classes.
- `typing.Self` is probably the first in a new category of class-centric type hints. Exciting! But @beartype will only support `typing.Self` when decorating classes – not methods. Less exciting.

beartype.door: The Decidedly Object-oriented Runtime-checker
Oh, boy. Now we hittin’ the Hard Stuff™.
Has anyone ever tried to actually use type hints at runtime? Like, not merely annotate classes, callables, and attributes with type hints but actually use those type hints for a productive purpose? Anybody? Anybody? …helllllllllllllo?
Fret not, bear bros. @beartype now enables anyone to introspect, query, sort, compare, type-check, or otherwise manhandle type hints at any time in constant time. Dare your codebase open… the DOOR (Decidedly Object-oriented Runtime-checker)? spooky Halloween sounds
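A taste of that, using the functional checkers `beartype.door` exposes (if memory serves – treat the exact names as best-effort rather than gospel):

```python
from beartype.door import die_if_unbearable, is_bearable

# Constant-time checks of arbitrary objects against arbitrary type hints,
# anywhere, any time -- no decorator required.
print(is_bearable(['I', 'am', 'a', 'list', 'of', 'strings'], list[str]))  # True
print(is_bearable(['I', 'am', 'a', 'list', 'of', 'strings'], list[int]))  # False

# Same check, but raising a human-readable violation on failure.
die_if_unbearable(0xFEEDFACE, int)
```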
Praise be to Harvard microscopist and lead @napari dev @tlambert03 (Talley Lambert) for his phenomenal volunteerism in single-handedly building a new typing world. He ran the labyrinthian gauntlet of many, many painful PRs so that you didn’t have to.
beartype.peps: This Day All PEP 563 Dies
So.
`beartype.door`. That's great and all. Usable type hints. Blah, blah, blah. But what if you actually want to use `typing`-centric Python Enhancement Proposals (PEPs) that break runtime, were never intended to be used at runtime, and have no official runtime API? Of course, I speak of PEP 563.

Fret not even more, bear bros. @beartype now enables anyone to resolve PEP 563-postponed type hints without actually needing to sleep on a bed of nails with this GitHub issue as your only comfort:
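If memory serves, the new public resolver lives at `beartype.peps.resolve_pep563()`; a minimal sketch of the idea (treat the exact signature as best-effort):

```python
from __future__ import annotations  # PEP 563: all annotations become strings

from beartype.peps import resolve_pep563

def grizzly_speed(in_km_per_h: float) -> float:
    return in_km_per_h / 1.609

# Under PEP 563, annotations are unevaluated strings at this point...
assert grizzly_speed.__annotations__['in_km_per_h'] == 'float'

# ...until resolved back into real objects, in place, on demand.
resolve_pep563(grizzly_speed)
assert grizzly_speed.__annotations__['in_km_per_h'] is float
```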
Now run that, because you trust @leycec implicitly. You do trust @leycec implicitly, don’t you? 🥲
`beartype.peps`: the final word on PEP 563.

Lastly but not leastly…
…to financially feed @leycec and his friendly @beartype through our new GitHub Sponsors profile. Come for the candid insider photos of a sordid and disreputable life in the Canadian interior; stay for the GitHub badge and warm feelings of general goodwill.
Cue hypnagogic rave music that encourages fiscal irresponsibility.
Shoutouts to the Beasts in Back
Greets to the bear homies: @tlambert03, @justinchuby, @rskol, @langfield, @posita, @braniii, @dosisod, @bitranox, @jdogburck, @da-geek-incite, @MaxSchoenau, @gelatinouscube42, @stevemarin, @rbroderi, @kloczek, @twoertwein, @wesselb, and @patrick-kidger. 🏩
Beartype 0.6.0 has been released to a fanfare of cat mewls, crackling icicles, and the quietude of a snow-blanketed Canadian winter. ☃️
We hope you are warm and safe, coding as only you can code. This release brings explicit support for `None`, subscripted generics, and PEP 561 compliance after resolving 10 issues and merging 8 pull requests. Changes include:

Compatibility Improved
- The `None` singleton. As a return type hint, `None` is typically used to annotate callables containing no explicit `return` statement and thus implicitly returning `None`. `@beartype` now implicitly reduces `None` at all nesting levels of type hints to that singleton's type per PEP 484.
- `beartype` now fully conforms to PEP 561, resolving issue #25 kindly submitted by best macOS package manager ever @harens. In useful terms, this means that:
  - `beartype` now complies with mypy, Python's popular third-party static type checker. If your package had no mypy errors or warnings before adding `beartype` as a mandatory dependency, your package will still have no mypy errors or warnings after adding `beartype` as a mandatory dependency.
  - `beartype` preserves PEP 561 compliance. If your package was PEP 561-compliant before adding `beartype` as a mandatory dependency, your package will still be PEP 561-compliant after adding `beartype` as a mandatory dependency. Of course, if your package currently is not PEP 561-compliant, `beartype` can't help you there. We'd love to, really. It's us. Not you.
  - The `beartype` codebase is now mostly statically rather than dynamically typed, much to our public shame. Thus begins the eternal struggle to preserve duck typing in a world that hates bugs.
  - The `beartype` package now contains a top-level `py.typed` file, publicly declaring this package to be PEP 561-compliant.

Compatibility Broken
Packaging Improved
- New installation extras enabling `beartype` developers and automation tooling to trivially install recommended (but technically optional) dependencies. These include:
  - `pip install -e .[dev]`, installing `beartype` in editable mode as well as all dependencies required to both locally test `beartype` and build documentation for `beartype` from the command line.
  - `pip install beartype[doc-rtd]`, installing `beartype` as well as all dependencies required to build documentation from the external third-party Read The Docs (RTD) host.
- The `README.rst` file now documents `beartype` installation with both Homebrew and MacPorts on macOS, entirely courtesy the third-party Homebrew tap and Portfile maintained by build automation specialist and mild-mannered student @harens. Thanks a London pound, Haren!

Features Added
- Public `beartype.cave` types and type tuples, including:
  - `beartype.cave.CallableCTypes`, a tuple of all C-based callable types (i.e., types whose instances are callable objects implemented in low-level C rather than high-level Python).
  - `beartype.cave.HintGenericSubscriptedType`, the C-based type of all subscripted generics if the active Python interpreter targets Python >= 3.9 or `beartype.cave.UnavailableType` otherwise. This type was previously named `beartype.cave.HintPep585Type` before we belatedly realized this type broadly applies to numerous categories of PEP-compliant type hints, including PEP 484-compliant subscripted generics.

Features Optimized
- `O(n)` → `O(1)` exception handling. `@beartype` now internally raises human-readable exceptions in the event of type-checking violations with an `O(1)` rather than `O(n)` algorithm, significantly reducing time complexity for the edge case of invalid large sequences either passed to or returned from `@beartype`-decorated callables. For forward compatibility with a future version of `beartype` enabling users to explicitly switch between constant- and linear-time checking, the prior `O(n)` exception-handling algorithm has been preserved in a presently disabled form.
- `O(n)` → `O(1)` callable introspection during internal memoization. `@beartype` now avoids calling the inefficient stdlib `inspect` module from our private `@beartype._util.cache.utilcachecall.callable_cached` decorator memoizing functions throughout the `beartype` codebase. The prior `O(n)` logic performed by that call has been replaced by equivalent `O(1)` logic performed by a call to our newly defined `beartype._util.func.utilfuncarg` submodule, optimizing function argument introspection without the unnecessary overhead of `inspect`.
- `@beartype` now temporarily caches the code object for the currently decorated callable to support efficient introspection of that callable throughout the decoration process. Relatedly, this also has the beneficial side effect of explicitly raising human-readable exceptions from the `@beartype` decorator on attempting to decorate C-based callables, which `@beartype` now explicitly does not support, because C-based callables have no code objects and thus no efficient means of introspection. Fortunately, sane code only ever applies `@beartype` to pure-Python callables anyway. …right, sane code? Right!?!?

Features Deprecated
- The `beartype.cave.HintPep585Type` type, to be officially removed in `beartype` 1.0.0.

Issues Resolved
- Unsafe `str.replace()` calls. `@beartype` now wraps all unsafe internal calls to the low-level `str.replace()` method with calls to the considerably safer high-level `beartype._util.text.utiltextmunge.replace_str_substrs()` function, guaranteeing that memoized placeholder strings are properly unmemoized during decoration-time code generation. Thanks to temperate perennial flowering plant @Heliotrop3 for this astute observation and resolution to long-standing background issue #11.
- `KeyPool` release validation. `@beartype` now validates that objects passed to the `release()` method of the private `beartype._util.cache.pool.utilcachepool.KeyPool` class have been previously returned from the `acquire()` method of that class. Thanks to @Heliotrop3, the formidable bug assassin, for their unswerving dedication to the cause of justice with this resolution to issue #13.
- `@beartype` now internally provides a highly microoptimized Least Recently Used (LRU) cache for subsequent use throughout the codebase, particularly with respect to caching iterators over dictionaries, sets, and other non-sequence containers. This resolves issue #17, again graciously submitted by open-source bug mercenary @Heliotrop3.
- `@beartype` now internally provides a private `beartype._util.func.utilfuncorigin.get_callable_origin_label` getter synthesizing human-readable labels for the files declaring arbitrary callables, a contribution by master code-mangler @Heliotrop3 resolving issue #18. Thanks again for all the insidious improvements, Tyler! You are the master of everyone's code domain.
- Switched our release automation from the `create-release` GitHub Action to @ncipollo's actively maintained `release-action`, resolving issue #22 kindly submitted by human-AI-hybrid @Heliotrop3.

Tests Improved
- Replaced the third-party `tox-gh-actions` GitHub Action streamlining `tox` usage with our own ad-hoc build matrix that appears to be simpler and faster despite offering basically identical functionality.
- A new test exercising the `beartype` codebase for unresolved `FIXME:` comments.
- Our `tox` and GitHub Actions-based continuous integration (CI) configurations now both correctly exercise themselves against both PyPy 3.6 and 3.7, resolving the upstream actions/setup-python#171 issue for `beartype`.
- A new mypy functional test, exercised whenever the third-party `mypy` package is installed under CPython. This test is sufficiently critical that we perform it under our CI workflow, guaranteeing test failures on any push or PR violating mypy expectations.
- A new `README.rst` functional test, optionally exercising the syntactic validity of our front-facing `README.rst` documentation when the third-party `docutils` package (i.e., the reference reST parser) is installed. This test is sufficiently expensive that we currently avoid performing it under our CI workflow.
- New tests exercising the `beartype._util.text.utiltextmunge` submodule with lavish attention to regex-based fuzzy testing of the critical `number_lines()` function. Humble `git log` shout outs go out to @Heliotrop3 for this mythic changeset that warps the fragile fabric of the GitHub cloud to its own pellucid yet paradoxically impenetrable intentions, resolving issue #24.

Documentation Revised
- The `beartype` repository now defines a largely unpopulated skeleton for Sphinx-generated documentation formatted as reST and typically converted to HTML to be hosted at Read The Docs (RTD), generously contributed by @felix-hilden, Finnish computer vision expert and our first contributor! This skeleton enables:
  - Standard Sphinx extensions (e.g., `autodoc`, `viewcode`).
  - `sphinx_rtd_theme`, a third-party Sphinx extension providing RTD's official Sphinx HTML theme.
  - A build status badge in our `README.rst` documentation signifying the success of the most recent attempt to build and host this skeleton at RTD.
  - A `sphinx` script, building Sphinx-based package documentation when manually run from the command line by interactive developers.
- A new `beartype` mascot, "Mr. Nectar Palm" – again courtesy @felix-hilden, because sleep is for the weak and Felix has never known the word.
- An expanded comparison of `beartype` and existing type checkers, resolving clarity concerns raised by @kevinjacobs-progenity at issue #7. Thanks for the invaluable commentary, Kevin!
- New documentation exercising `beartype` in a live manner.

API Changed
- Added: `beartype.cave.CallableCTypes`.
- Added: `beartype.cave.HintGenericSubscriptedType`.
- Deprecated: `beartype.cave.HintPep585Type`.