[REQUEST] Enable customization of repr() of built-in types, add hook points to enable customization of external classes (e.g. from other libraries) without `@auto` or monkey-patching `__rich_repr__`
`rich` is amazing, and thank you @willmcgugan for creating it and the community around it.
Questionnaire
- Consider posting in https://github.com/willmcgugan/rich/discussions for feedback before raising a feature request.
- This is almost certainly a terrible idea
- I don’t think community discussion would help here
- I took a look and searched, but did not find anything relevant (searched for “quote” and “json”)
- Have you checked the issues for similar suggestions?
- I searched the issue tracker for any closed issues that mention “quote” or “json”
- I also spent a good amount of time with the docs and source code
Preface
`rich.pretty.pprint` is amazing, I use it in so many things! My terminal is pretty, output is well-formatted, and coworkers look upon me with awe. 😉
One of those things is to print out objects consisting only of basic Python types (`str`, `int`, `list`, `dict`), which means that the output of `pprint` (after setting `indent_guides=False`) for an object hierarchy of these types is ALSO valid JSON! 🎉
…sometimes.
Problem
JSON only accepts strings with double-quotes. By default, Python’s `str.__repr__` returns a string that uses single-quotes, unless the string contains a single quote but does not contain a double-quote.
This means that a very large percentage of strings will be single-quoted, which is not valid JSON. This is annoying.
>>> 'asdf'
'asdf'
>>> 'asdf"'
'asdf"'
>>> 'asdf\''
"asdf'"
>>> 'asdf\'\"'
'asdf\'"'
This can be more readily seen by actually using Python’s `json` module. Note the change from double-quotes to single-quotes.
>>> import json
>>> json.loads('{"key": "value"}')
{'key': 'value'}
>>> json.loads("{'key': 'value'}")
JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
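To put the two behaviours side by side, here is a small sketch (the sample strings are just the ones from the examples above) comparing what `repr()` and `json.dumps()` produce for the same values. Only the `json.dumps()` form is guaranteed to round-trip through `json.loads()`.

```python
import json

# The same strings used in the repr() examples above.
samples = ["asdf", 'asdf"', "asdf'", "asdf'\""]

for s in samples:
    # repr() picks the quote style based on the content;
    # json.dumps() always emits double quotes and escapes as needed.
    print(f"repr: {s!r:12} json: {json.dumps(s)}")

# Every json.dumps() result parses back to the original string.
assert all(json.loads(json.dumps(s)) == s for s in samples)
```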
How would you improve Rich?
Give as much detail as you can. Example code of how you would like it to work would help.
Easy Fix
Give `rich` power users the ability to hook the repr’d output of all types (including builtins like `str`).
Move `to_repr` outside the scope of `_traverse` in `rich.pretty`, and just pass in `max_string` as an argument instead of relying on variable capture.
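A rough sketch of what the easy fix could look like follows. This is not the actual `rich` source: the function body, the `+N` truncation suffix, and the error fallback are assumptions for illustration. The only point is that `max_string` arrives as a parameter instead of being captured from an enclosing function. `json_to_repr` is a hypothetical user-side wrapper showing what a module-level function would make possible.

```python
import json
from typing import Any, Optional


def to_repr(obj: Any, max_string: Optional[int] = None) -> str:
    """Return repr(obj), truncating long str/bytes values.

    Hypothetical module-level version: max_string is an argument,
    not a variable captured from an enclosing scope.
    """
    if (
        max_string is not None
        and isinstance(obj, (str, bytes))
        and len(obj) > max_string
    ):
        truncated = len(obj) - max_string
        # Illustrative truncation format: repr of the prefix plus a count.
        return f"{obj[:max_string]!r}+{truncated}"
    try:
        return repr(obj)
    except Exception as error:
        # Never let a broken __repr__ abort the whole pretty-print.
        return f"<repr-error {str(error)!r}>"


def json_to_repr(obj: Any, max_string: Optional[int] = None) -> str:
    """Hypothetical user hook: JSON-style quoting for plain str objects."""
    if type(obj) is str:
        return json.dumps(obj)
    return to_repr(obj, max_string)


print(to_repr("hello world", max_string=5))
print(json_to_repr("asdf"))
```

With the function at module level, swapping in something like `json_to_repr` becomes a one-line patch rather than a fork of the traversal code.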
Robust Fix
Give `rich` users more robust options for controlling the display of objects they do not have control over, without requiring class wrapping (`rich.repr.auto`) or monkey patching (`some_class.__rich_repr__ = myFancyRepr`).
Option 1: Hooking based on `isinstance()`. This would allow for pretty-printing any sub-type of an object.
Option 2: Hooking based on `type()`. This would allow for EXACT class matches, useful for class hierarchies where the base class may actually be instantiated and `isinstance()` is too broad.
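The two options could be sketched as a small registry. Everything here (`register_repr`, `find_hook`, `hooked_repr`, the `exact` flag) is hypothetical and does not exist in `rich`; it only illustrates how exact `type()` matches and `isinstance()` matches might coexist, with exact matches checked first.

```python
import json
from typing import Any, Callable, Dict, Optional

ReprHook = Callable[[Any], str]

# Hypothetical registries; nothing like this exists in rich today.
_exact_hooks: Dict[type, ReprHook] = {}     # Option 2: type(obj) is cls
_subclass_hooks: Dict[type, ReprHook] = {}  # Option 1: isinstance(obj, cls)


def register_repr(cls: type, hook: ReprHook, *, exact: bool = False) -> None:
    """Register a repr hook for cls, either exact-match or subclass-aware."""
    (_exact_hooks if exact else _subclass_hooks)[cls] = hook


def find_hook(obj: Any) -> Optional[ReprHook]:
    """Prefer an exact type match, then fall back to isinstance() matches."""
    hook = _exact_hooks.get(type(obj))
    if hook is not None:
        return hook
    for cls, candidate in _subclass_hooks.items():
        if isinstance(obj, cls):
            return candidate
    return None


def hooked_repr(obj: Any) -> str:
    """Use a registered hook if one applies, otherwise plain repr()."""
    hook = find_hook(obj)
    return hook(obj) if hook is not None else repr(obj)


class Tagged(str):
    """A str subclass, used to show the difference between the options."""


register_repr(str, json.dumps, exact=True)

print(hooked_repr("asdf"))          # exact match: json.dumps is used
print(hooked_repr(Tagged("asdf")))  # subclass: no exact match, plain repr()
```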
What problem does it solve for you?
What problem do you have that this feature would solve? I may be able to suggest an existing way of solving it.
Copy-pasting `rich.pretty.pprint` output for basic types would be valid JSON (after hacking in `json.dumps()` as the `str.__repr__` equivalent).
This functionality could also be used to add custom display logic to types that were not built with `rich` in mind – for example, having a specially-formatted pretty-print for some library / dependency object class, without the need to monkey-patch `__repr__` or `__rich_repr__` on that type, or relying on `auto`.
By the way, `rich.repr.auto` is pretty great, but I wish there was a variant that used `obj.__dict__` instead of the signature of `obj.__init__`. I realize this flies in the face of the actual purpose of `repr()`, but hey – customization is fun! (And very few people write `repr()` in a manner that can actually re-constitute objects.)
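For what it’s worth, a `__dict__`-based variant can be approximated today with a small decorator. The sketch below is hypothetical (`auto_from_dict` is not part of `rich`), and skipping underscore-prefixed attributes is an arbitrary choice made here. It relies only on the documented `__rich_repr__` protocol, where yielding a `(name, value)` tuple produces a keyword-style argument.

```python
from typing import Any, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


def auto_from_dict(cls: Type[T]) -> Type[T]:
    """Hypothetical decorator: build __rich_repr__ from the instance __dict__."""

    def __rich_repr__(self: Any) -> Iterator[Tuple[str, Any]]:
        # Yield (name, value) pairs for every public instance attribute.
        for name, value in vars(self).items():
            if not name.startswith("_"):
                yield name, value

    cls.__rich_repr__ = __rich_repr__
    return cls


@auto_from_dict
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y
        self.norm = (x * x + y * y) ** 0.5  # not an __init__ parameter
        self._cache = None                  # private, skipped by the hook


print(list(Point(3, 4).__rich_repr__()))
```

Passed to `rich.pretty.pprint`, an instance should then show `norm` alongside `x` and `y`, which a signature-based repr would not include.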
Research
I figure I haven’t been the first guy to want something like this, so to Google, Python.org, and StackOverflow we go.
There are a few threads about forcing quoting in `str.__repr__`, but everything that talks about JSON is just using e.g. `json.dumps` or `json.loads`. As mentioned earlier, these ARE great and DO properly convert objects to valid JSON – regardless of how Python might decide to print them.
But I want pretty output. That’s what `rich` is all about, right?
__rich_repr__
After actually RTFM’ing, I saw a shimmer of hope when I read about `__rich_repr__`! `rich` does support custom presentation!
This was a wonderful finding! My prayers had been answered. So I tried it out:
import rich.pretty

def rich_repr(self):
    yield 'rich_repr'
    yield 1
    yield 2

def repr_(self): return 'repr_'
def str_(self): return 'str_'

class A:
    __rich_repr__ = rich_repr
    __repr__ = repr_
    __str__ = str_

a = A()
obj = {'a': a}
print('a:', a)
# a: str_
print('a!r:', repr(a))
# a!r: repr_
print(f'str: {obj!s}')
# str: {'a': repr_}
print(f'repr: {obj!r}')
# repr: {'a': repr_}
rich.pretty.pprint(obj)
# {'a': A('rich_repr', 1, 2)}
Turns out it does not do what I need it to do, but it’s still a neat feature! Very powerful and very cool. 😎
Can we even use `__rich_repr__` on `str`? No, we can’t.
>>> str.__rich_repr__ = rich_repr
TypeError: can't set attributes of built-in/extension type 'str'
👎 Conclusion: `__rich_repr__` is tangential to my goal (and cool!) but doesn’t do what I need (and cannot be added to built-in types anyway)
__repr__
Let’s just use built-in Python features. Continuing on the above example:
class B:
    __repr__ = repr_
    __str__ = str_

b = B()
obj = {'a': a, 'b': b}
rich.pretty.pprint(obj)
# {'a': A('rich_repr', 1, 2), 'b': repr_}
This yields something closer to what we want, and shows that `rich` does indeed use `__repr__` when there is no `__rich_repr__`.
No surprises here, and everything looks normal (the `repr_` value is unquoted, since we didn’t add quotes in the `repr_` function).
🤙 Conclusion: Absent `__rich_repr__`, `rich` will use good old `__repr__`
Override str.__repr__
However, there’s a big problem: We cannot override the value of `str.__repr__` – attributes of built-in types cannot be reassigned.
>>> str.__repr__
<slot wrapper '__repr__' of 'str' objects>
>>> str.__repr__ = repr_
TypeError: can't set attributes of built-in/extension type 'str'
👎 Conclusion: We cannot override str.__repr__
Subclassing str
Perhaps we can subclass `str` and just rely on `json.dumps()` to do the heavy lifting!
import json

class Str(str):
    def __repr__(self):
        return json.dumps(self)

value = Str('value_string')
rich.pretty.pprint({'key': value})
# {'key': "value_string"}
Voila! The key is single-quoted, as described very far above. However, the value, an instance of our fun new class `Str`, is double-quoted! Success!
However, this would require walking the object hierarchy of my `dict` and `list` objects, and replacing every `str` object with a funky `Str` object. This is definitely achievable! Let’s give it a go.
👍 Conclusion: Subclassing `str` to use a custom `__repr__` works
Replacing Objects
Let’s see if we can replace objects with our `Str` type.
For simplicity, let’s just worry about a basic dictionary with two keys and two values.
>>> x1 = 'x1'
>>> x2 = Str('x2')
>>> obj = { x1: 'str_value', x2: Str('Str_value') }
>>> obj
{'x1': 'str_value', "x2": "Str_value"}
This is expected, and we’re closer. Let’s not worry about other types, and just replace the keys and values with our `Str` type. Note we iterate over a copy of `obj` so we can modify `obj` from the loop.
>>> for k, v in dict(obj).items():
...     obj[Str(k)] = Str(v)
...
>>> obj
{'x1': "str_value", "x2": "Str_value"}
Closer! This might seem wrong at a quick glance, but it’s correct, since we updated the value in the dictionary.
>>> x1
'x1'
>>> Str(x1)
"x1"
>>> x1 == Str(x1)
True
>>> hash(x1) == hash(Str(x1))
True
Since a matching key was found, the value was updated – but the key itself was not.
💰 If you find this line, mention it in your comment and I’ll toss you $20 for coffee.
Let’s start over and try that loop again.
>>> obj = { x1: 'str_value', x2: Str('Str_value') }
>>> obj
{'x1': 'str_value', "x2": "Str_value"}
>>> for k in dict(obj):
...     v = obj.pop(k)
...     obj[Str(k)] = Str(v)
...
>>> obj
{"x1": "str_value", "x2": "Str_value"}
And huzzah! We’re on the right track.
👍 Conclusion: Manually replacing `str` objects with subclassed objects works, but is potentially tedious. Definitely not an ideal situation. `rich` already has to handle iteration of container types (effectively) with `_CONTAINERS` and `_BRACES` etc.
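For completeness, the manual replacement generalizes to nested data with a short recursive helper. This is only a sketch: `with_json_strings` is a made-up name, it handles just `dict` and `list`, and it builds a new structure instead of mutating in place (which sidesteps the key-replacement problem above). Values such as `True` or `None` would still print in their Python spelling, so this is not a general JSON serializer.

```python
import json
from typing import Any


class Str(str):
    """str subclass whose repr is the JSON encoding of the string."""

    def __repr__(self) -> str:
        return json.dumps(self)


def with_json_strings(obj: Any) -> Any:
    """Return a copy of obj with every str (keys included) replaced by Str."""
    if isinstance(obj, str):
        return Str(obj)
    if isinstance(obj, dict):
        # Rebuilding the dict means the new Str keys are actually stored.
        return {with_json_strings(k): with_json_strings(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [with_json_strings(item) for item in obj]
    return obj  # ints, floats, etc. are left untouched


data = {"x1": "str_value", "nested": {"items": ["a", "b"], "count": 2}}
converted = with_json_strings(data)
print(repr(converted))

# For this data, the repr is valid JSON that parses back to the original.
assert json.loads(repr(converted)) == data
```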
Rich Pretty Internals
I decided to dive into the internals of `rich` to see how `__rich_repr__` is actually implemented, and see if there were any hook points.
For this spelunking, I used the following code snippet, for simplicity:
import rich.pretty
rich.pretty.pprint(['hello'])
Ultimately, you end up in `pretty.py` at the `_traverse` function, specifically the inner function `to_repr`, which performs the actual call to `repr` on the object.
It looks like there is no way to intercept this, since `to_repr` is declared dynamically inside `_traverse`:
ipdb> bt
/Users/zachriggle/github.com/rich/wtf.py(36)<module>()
---> 36 rich.pretty.pprint(['hello'])
...
/Users/zachriggle/github.com/rich/rich/pretty.py(717)pretty_repr()
--> 717 node = traverse(_object, max_length=max_length, max_string=max_string)
/Users/zachriggle/github.com/rich/rich/pretty.py(684)traverse()
--> 684 node = _traverse(_object, root=True)
/Users/zachriggle/github.com/rich/rich/pretty.py(669)_traverse()
--> 669 child_node = _traverse(child)
/Users/zachriggle/github.com/rich/rich/pretty.py(680)_traverse()
--> 680 node = Node(value_repr=to_repr(obj), last=root)
> /Users/zachriggle/github.com/rich/rich/pretty.py(478)to_repr()
--> 478 obj_repr = repr(obj)
ipdb> p obj
'hello'
👎 Conclusion: There are no extension / hook points in `rich` to manually override the presentation of specific types, because `to_repr` is declared dynamically inside the scope of `_traverse`
Closing Notes
It looks like there is not currently a way to arbitrarily customize the output of random classes (specifically builtins) with `rich.print` or `rich.pretty.pprint`.
We cannot add new properties to `str` (e.g. `__rich_repr__`), nor can we redefine its existing ones (e.g. `__repr__`), and it looks like `rich.pretty.pretty_repr` is only called with the top-level object (which we could hook) – which for `str` types goes directly to `repr` via `to_repr`, which we cannot hook.
It looks like an easy fix is to move `to_repr` outside `traverse`, as its only dependency on that scope is the `max_string` value – this could simply be passed in via argument rather than capture.
Did I help
If I was able to resolve your problem, consider sponsoring my work on Rich, or buy me a coffee to say thanks.
I hope so! I definitely look forward to contributing some coffee money your way!
Closing this, because I don’t think that pretty.py is the best solution for JSON. I think I can implement a more reliable JSON pretty printing solution.
My case is lucky that I don’t have to worry about this too much, but those are good points.
Oh, I don’t expect this to be something Rich officially supports! Thus:
But exposing the function `do_repr` per #1415 is a small change and enables my hacking of the functionality to do what I need it to.
This is news to me! How does this actually work? Are there docs on this? I must’ve missed them.
I’m more interested in the way that Rich formats (i.e. spaces and newlines, keeping some `dict`s on the same line, etc) the data than the colorizing. I don’t think the `json` pretty-printing (as shown) achieves the density of information while still being readable that Rich does.