[REQUEST] Enable customization of repr() of built-in types, add hook points to enable customization of external classes (e.g. from other libraries) without `@auto` or monkey-patching `__rich_repr__`

rich is amazing, and thank you @willmcgugan for creating it and the community around it.

Questionnaire

  • Consider posting in https://github.com/willmcgugan/rich/discussions for feedback before raising a feature request.
    • This is almost certainly a terrible idea
    • I don’t think community discussion would help here
    • I took a look and searched, but did not find anything relevant (searched for “quote” and “json”)
  • Have you checked the issues for similar suggestions?
    • I searched the issue tracker for any closed issues that mention “quote” or “json”
    • I also spent a good amount of time with the docs and source code

Preface

rich.pretty.pprint is amazing, I use it in so many things! My terminal is pretty, output is well-formatted, and coworkers look upon me with awe. 😉

One of those things is to print out objects consisting only of basic Python types (str, int, list, dict), which means that the output of pprint (after setting indent_guides=False) for an object hierarchy of these types is ALSO valid JSON! 🎉

…sometimes.

Problem

JSON only accepts strings with double-quotes. By default, Python’s str.__repr__ returns a string that uses single-quotes, unless the string contains a single quote but does not contain a double-quote.

This means that a very large percentage of strings will be single-quoted, which is not valid JSON. This is annoying.

>>> 'asdf'
'asdf'
>>> 'asdf"'
'asdf"'
>>> 'asdf\''
"asdf'"
>>> 'asdf\'\"'
'asdf\'"'

This can be more readily seen by actually using Python’s json module. Note the change in double-quotes to single-quotes.

>>> import json
>>> json.loads('{"key": "value"}')
{'key': 'value'}
>>> json.loads("{'key': 'value'}")
JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
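
To make the mismatch concrete, here is a small sketch that feeds the repr() of a few strings back through the JSON parser and compares it with json.dumps(). It only uses the standard library; the helper name is_valid_json is made up for this illustration.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON (illustrative helper name)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

samples = ['asdf', 'asdf"', "asdf'"]

for s in samples:
    # repr() chooses the quote style per string; json.dumps() always
    # emits double quotes and escapes embedded double quotes.
    print(repr(s), is_valid_json(repr(s)), json.dumps(s))
```

Only the string containing a single quote happens to get a double-quoted repr(), so it is the only one whose repr() also parses as JSON.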

How would you improve Rich?

Give as much detail as you can. Example code of how you would like it to work would help.

Easy Fix

Give rich power users the ability to hook the repr’d output of all types (including builtins like str).

Move to_repr outside the scope of _traverse in rich.pretty, and just pass in max_string as an argument instead of relying on variable capture.

Robust Fix

Give rich users more robust options for controlling the display of objects they do not have control over, without requiring class wrapping (rich.repr.auto) or monkey patching (some_class.__rich_repr__ = myFancyRepr).

Option 1: Hooking based on isinstance(). This would allow for pretty-printing any sub-type of an object.

Option 2: Hooking based on type(). This would allow for EXACT class matches, useful for class hierarchies where the base class may actually be instantiated and isinstance() is too broad.
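
The difference between the two options can be sketched with a tiny registry. Everything here (exact_hooks, subclass_hooks, custom_repr, the Tag class) is hypothetical and only illustrates the lookup rules; rich has no such API that I know of.

```python
import json

exact_hooks = {}     # Option 2: matched with type(obj), exact class only
subclass_hooks = []  # Option 1: matched with isinstance(), checked in order

def custom_repr(obj):
    """Return a hooked repr if one is registered, else the normal repr()."""
    hook = exact_hooks.get(type(obj))
    if hook is not None:
        return hook(obj)
    for cls, hook in subclass_hooks:
        if isinstance(obj, cls):
            return hook(obj)
    return repr(obj)

# Exact match: only real str objects get JSON-style quoting.
exact_hooks[str] = json.dumps
# isinstance match: also catches subclasses (bool is a subclass of int).
subclass_hooks.append((int, lambda n: f"<int {int(n)}>"))

class Tag(str):
    """A str subclass, used to show that the exact hook skips it."""

print(custom_repr('x'))
print(custom_repr(Tag('x')))
print(custom_repr(True))
```

The bool case shows why isinstance() can be too broad: a hook registered for int also fires for True and False, whereas the type() lookup leaves subclasses alone.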

What problem does it solve for you?

What problem do you have that this feature would solve? I may be able to suggest an existing way of solving it.

Copy-pasting rich.pretty.pprint output for basic types would be valid JSON (after hacking in json.dumps() for str.__repr__ equivalent).

This functionality could also be used to add custom display logic to types that were not built with rich in mind – for example, having a specially-formatted pretty-print for some library / dependency object class, without the need to monkey-patch __repr__ or __rich_repr__ on that type, or relying on auto.

By the way rich.repr.auto is pretty great, but I wish there was a variant that used obj.__dict__ instead of the prototype of obj.__init__. I realize this flies in the face of the actual purpose of repr(), but hey – customization is fun! (And very few people write repr() in a manner that can actually re-constitute objects)
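
A minimal sketch of such a __dict__-based variant, assuming rich's documented convention that a __rich_repr__ generator may yield (name, value) pairs for keyword-style fields. The decorator name auto_from_dict and the rule of skipping underscore-prefixed attributes are my own assumptions, not part of rich.

```python
def dict_rich_repr(self):
    """Yield (name, value) pairs taken from the instance __dict__."""
    for name, value in vars(self).items():
        if not name.startswith('_'):  # assumption: hide private attributes
            yield name, value

def auto_from_dict(cls):
    """Hypothetical decorator: attach the __dict__-based __rich_repr__."""
    cls.__rich_repr__ = dict_rich_repr
    return cls

@auto_from_dict
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.norm_sq = x * x + y * y  # not an __init__ parameter
        self._cache = None            # skipped by the underscore rule

pairs = list(Point(1, 2).__rich_repr__())
print(pairs)
```

Unlike an approach driven by the __init__ signature, this picks up norm_sq, which exists only as instance state.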

Research

I figure I haven’t been the first guy to want something like this, so to Google, Python.org, and StackOverflow we go.

There’s a few threads about forcing quoting in str.__repr__, but everything that talks about JSON is just using e.g. json.dumps or json.loads. As mentioned earlier, these ARE great and DO properly convert objects to valid JSON – regardless of how Python might decide to print them.

But I want pretty output. That’s what rich is all about, right?

__rich_repr__

After actually RTFM’ing, I saw a shimmer of hope when I read about __rich_repr__! rich does support custom presentation!

This was a wonderful finding! My prayers had been answered. So I tried it out:

import rich.pretty

def rich_repr(self):
    yield 'rich_repr'
    yield 1
    yield 2

def repr_(self):
    return 'repr_'

def str_(self):
    return 'str_'

class A:
    __rich_repr__ = rich_repr
    __repr__ = repr_
    __str__ = str_

a = A()
obj = {'a': a}

print('a:', a)
# a: str_

print('a!r:', repr(a))
# a!r: repr_

print(f'str: {obj!s}')
# str: {'a': repr_}

print(f'repr: {obj!r}')
# repr: {'a': repr_}

rich.pretty.pprint(obj)
# {'a': A('rich_repr', 1, 2)}

Turns out it does not do what I need to do, but it’s still a neat feature! Very powerful and very cool. 😎

Can we even use __rich_repr__ on str? No, we can’t.

>>> str.__rich_repr__ = rich_repr
TypeError: can't set attributes of built-in/extension type 'str'

👎 Conclusion: __rich_repr__ is tangential to my goal (and cool!) but doesn’t do what I need (and cannot be added to built-in types anyway)

__repr__

Let’s just use built-in Python features. Continuing on the above example:

class B:
    __repr__ = repr_
    __str__ = str_

b = B()
obj = {'a': a, 'b': b}
rich.pretty.pprint(obj)
# {'a': A('rich_repr', 1, 2), 'b': repr_}

This yields something closer to what we want, and shows that rich does indeed use __repr__ when there is no __rich_repr__.

No surprises here, and everything looks normal (the repr_ value is unquoted, since we didn’t add quotes in repr_ the function).

🤙Conclusion: Absent __rich_repr__, rich will use good old __repr__

Override str.__repr__

However, there’s a big problem: We cannot override the value of str.__repr__ – it’s immutable.

>>> str.__repr__
<slot wrapper '__repr__' of 'str' objects>
>>> str.__repr__ = repr_
TypeError: can't set attributes of built-in/extension type 'str'

👎 Conclusion: We cannot override str.__repr__
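
Both failures can be reproduced in a script with a small helper (try_set is an illustrative name). The sketch deliberately prints whatever message the interpreter produces, since the exact wording may differ between Python versions.

```python
def try_set(attr_name):
    """Attempt to set an attribute on the built-in str type."""
    try:
        setattr(str, attr_name, lambda self: 'patched')
    except TypeError as error:
        return f"TypeError: {error}"
    return "ok"

# Neither an existing slot nor a brand-new attribute can be assigned.
print(try_set('__repr__'))
print(try_set('__rich_repr__'))
```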

Subclassing str

Perhaps we can subclass str and just rely on json.dumps() to do the heavy lifting!

import json

class Str(str):
    def __repr__(self):
        return json.dumps(self)

value = Str('value_string')

rich.pretty.pprint({'key': value})
# {'key': "value_string"}

Voila! The key is single-quoted, as described very far above. However, the value, an instance of our fun new class Str, is double-quoted! Success!

However, this would require walking the object hierarchy of my dict and list objects, and replacing every str object with a funky Str object. This is definitely achievable! Let’s give it a go.

👍 Conclusion: Subclassing str to use a custom __repr__ works

Replacing Objects

Let’s see if we can replace objects with our Str type.

For simplicity, let’s just worry about a basic dictionary with two keys and two values.

>>> x1 = 'x1'
>>> x2 = Str('x2')
>>> obj = { x1: 'str_value', x2: Str('Str_value') }
>>> obj
{'x1': 'str_value', "x2": "Str_value"}

This is expected, and we’re closer. Let’s not worry about other types, and just replace the keys and values with our Str type. Note we iterate over a copy of obj so we can modify obj from the loop.

>>> for k, v in dict(obj).items():
...     obj[Str(k)] = Str(v)
...
>>> obj
{'x1': "str_value", "x2": "Str_value"}

Closer! This might seem wrong at quick glance, but it’s correct, since we updated the value in the dictionary.

>>> x1
'x1'
>>> Str(x1)
"x1"
>>> x1 == Str(x1)
True
>>> hash(x1) == hash(Str(x1))
True

Since an equal key (same hash, compares equal) was already present, the value was updated, but the original key object was kept.

💰 If you find this line, mention it in your comment and I’ll toss you $20 for coffee.

Let’s start over and try that loop again.

>>> obj = { x1: 'str_value', x2: Str('Str_value') }
>>> obj
{'x1': 'str_value', "x2": "Str_value"}
>>> for k in dict(obj):
...     v = obj.pop(k)
...     obj[Str(k)] = Str(v)
...
>>> obj
{"x1": "str_value", "x2": "Str_value"}

And huzzah! We’re on the right track.

👍 Conclusion: Manually replacing str objects with subclassed objects works, but is potentially tedious. Definitely not an ideal situation. rich already has to handle iteration of container types (effectively) with _CONTAINERS and _BRACES etc.
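
For nested data, the manual replacement generalizes to a short recursive helper. This is only a sketch: with_json_strings is a hypothetical name, and it handles dict and list only, leaving every other type untouched. Building new containers instead of mutating in place also sidesteps the kept-key behavior seen above.

```python
import json

class Str(str):
    """str subclass whose repr is the JSON encoding of the string."""
    def __repr__(self):
        return json.dumps(self)

def with_json_strings(obj):
    """Return a copy of obj with every str replaced by Str (hypothetical helper)."""
    if isinstance(obj, str):
        return Str(obj)
    if isinstance(obj, dict):
        # A fresh dict is built, so the new Str keys are actually stored.
        return {with_json_strings(k): with_json_strings(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [with_json_strings(item) for item in obj]
    return obj

data = {'x1': ['a', {'b': 'c'}], 'n': 1}
converted = with_json_strings(data)
print(repr(converted))
```

The result still compares equal to the original data, because Str inherits equality and hashing from str.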

Rich Pretty Internals

I decided to dive into the internals of rich to see how __rich_repr__ is actually implemented, and see if there were any hook points.

For this spelunking, I used the following code snippet, for simplicity:

import rich.pretty
rich.pretty.pprint(['hello'])

Ultimately, you end up in pretty.py at the _traverse function, specifically the inner function to_repr, which is what actually calls repr on the object.

https://github.com/willmcgugan/rich/blob/0c4704324c179c857db634b65d34f78bd01a2a0a/rich/pretty.py#L464-L478

https://github.com/willmcgugan/rich/blob/0c4704324c179c857db634b65d34f78bd01a2a0a/rich/pretty.py#L675-L676

It looks like there is no way to intercept this, since to_repr is defined as a nested function inside _traverse and so is not reachable as a module attribute.

ipdb> bt
  /Users/zachriggle/github.com/rich/wtf.py(36)<module>()
---> 36 rich.pretty.pprint(['hello'])

...

  /Users/zachriggle/github.com/rich/rich/pretty.py(717)pretty_repr()
--> 717         node = traverse(_object, max_length=max_length, max_string=max_string)

  /Users/zachriggle/github.com/rich/rich/pretty.py(684)traverse()
--> 684     node = _traverse(_object, root=True)

  /Users/zachriggle/github.com/rich/rich/pretty.py(669)_traverse()
--> 669                         child_node = _traverse(child)

  /Users/zachriggle/github.com/rich/rich/pretty.py(680)_traverse()
--> 680             node = Node(value_repr=to_repr(obj), last=root)

> /Users/zachriggle/github.com/rich/rich/pretty.py(478)to_repr()
--> 478                 obj_repr = repr(obj)

ipdb> p obj
'hello'

👎 Conclusion: There are no extension / hook points in rich to manually override the presentation of specific types, because to_repr is declared dynamically inside the scope of _traverse

Closing Notes

It looks like there is not currently a way to arbitrarily customize the output of random classes (specifically builtins) with rich.print or rich.pretty.pprint.

We cannot add new attributes to str (e.g. __rich_repr__), nor can we redefine its existing ones (e.g. __repr__). It also looks like rich.pretty.pretty_repr is only called with the top-level object (which we could hook); str values go directly to repr via to_repr, which we cannot hook.

It looks like an easy fix is to move to_repr outside _traverse, as its only dependency on that scope is the max_string value, which could simply be passed in as an argument rather than captured.

Did I help

If I was able to resolve your problem, consider sponsoring my work on Rich, or buy me a coffee to say thanks.

I hope so! I definitely look forward to contributing some coffee money your way!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
willmcgugan commented, Aug 28, 2021

Closing this, because I don’t think that pretty.py is the best solution for JSON. I think I can implement a more reliable JSON pretty printing solution.

1 reaction
zachriggle commented, Aug 18, 2021

I suspect there are differences in escaping strings in Python. Also None -> null, False -> false, True -> true etc.

My case is lucky that I don’t have to worry about this too much, but those are good points.
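
For reference, a quick sketch of the non-quote differences mentioned above, comparing repr() with json.dumps() on a few basic values using only the standard library (the sample values are arbitrary):

```python
import json

values = [None, True, False, 'caf\u00e9', 'tab\there', 1.5]

for value in values:
    # Left: what Python's repr() shows. Right: the JSON encoding.
    print(repr(value), '->', json.dumps(value))
```

With default settings json.dumps() also escapes non-ASCII characters, so string output can differ from repr() in more than the quote style.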

I get you are proposing an extension mechanism since Rich pprint is already pretty close

Oh, I don’t expect this to be something Rich officially supports! Thus:

This is almost certainly a terrible idea

But exposing the function to_repr per #1415 is a small change and enables my hacking of the functionality to do what I need it to.

Rich can still colorize the output.

This is news to me! How does this actually work? Are there docs on this? I must’ve missed them.

colorize the output

I’m more interested in the way that Rich formats (i.e. spaces and newlines, keeping some dicts on the same line, etc) the data than the colorizing. I don’t think the json pretty-printing (as shown) achieves the density of information while still being readable that Rich does.
