Consider implementing Rich Repr Protocol for classes in hf_api
Is your feature request related to a problem? Please describe.
Currently, the __repr__ methods for classes in the hf_api module are nice and compact but can require a lot of side-scrolling to view all of the information. For example, the __repr__ for ModelInfo looks like:
ModelInfo: {
modelId: flyswot/convnext-tiny-224_flyswot
sha: c6d4b2138e10efeafef8f5305ce16270ca583618
lastModified: 2022-04-05T16:08:35.000Z
tags: ['pytorch', 'convnext', 'image-classification', 'dataset:image_folder', 'transformers', 'generated_from_trainer', 'model-index']
pipeline_tag: image-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='.gitignore'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='preprocessor_config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='training_args.bin')]
config: {'architectures': ['ConvNextForImageClassification'], 'model_type': 'convnext'}
id: flyswot/convnext-tiny-224_flyswot
author: flyswot
private: False
downloads: 999
library_name: transformers
model-index: [{'name': 'convnext-tiny-224_flyswot', 'results': [{'task': {'name': 'Image Classification', 'type': 'image-classification'}, 'dataset': {'name': 'image_folder', 'type': 'image_folder', 'args': 'default'}, 'metrics': [{'name': 'F1', 'type': 'f1', 'value': 0.9756290792360154}]}]}]
cardData: {'tags': ['generated_from_trainer'], 'datasets': ['image_folder'], 'metrics': ['f1'], 'model-index': [{'name': 'convnext-tiny-224_flyswot', 'results': [{'task': {'name': 'Image Classification', 'type': 'image-classification'}, 'dataset': {'name': 'image_folder', 'type': 'image_folder', 'args': 'default'}, 'metrics': [{'name': 'F1', 'type': 'f1', 'value': 0.9756290792360154}]}]}]}
}
Describe the solution you’d like
rich is a:
Python library for rich text and beautiful formatting in the terminal.
One of the features rich offers is its own print function. However:
Rich is able to syntax highlight any output, but the formatting is restricted to built-in containers, dataclasses, and other objects Rich knows about, such as objects generated by the attrs library. To add Rich formatting capabilities to custom objects, you can implement the rich repr protocol. -source
It might be nice to add this for the container classes under hf_api. This requires implementing __rich_repr__.
As an example, I have created a simple version of this for ModelInfo:
def __rich_repr__(self: ModelInfo):
    # Yield each attribute as a (name, value) pair for rich to render.
    for key, val in self.__dict__.items():
        yield key, val
This results in the following when using rich’s version of print:
ModelInfo(
    modelId='flyswot/convnext-tiny-224_flyswot',
    sha='c6d4b2138e10efeafef8f5305ce16270ca583618',
    lastModified='2022-04-05T16:08:35.000Z',
    tags=[
        'pytorch',
        'convnext',
        'image-classification',
        'dataset:image_folder',
        'transformers',
        'generated_from_trainer',
        'model-index'
    ],
    pipeline_tag='image-classification',
    siblings=[
        ModelFile(rfilename='.gitattributes'),
        ModelFile(rfilename='.gitignore'),
        ModelFile(rfilename='README.md'),
        ModelFile(rfilename='config.json'),
        ModelFile(rfilename='preprocessor_config.json'),
        ModelFile(rfilename='pytorch_model.bin'),
        ModelFile(rfilename='training_args.bin')
    ],
    config={'architectures': ['ConvNextForImageClassification'], 'model_type': 'convnext'},
    id='flyswot/convnext-tiny-224_flyswot',
    author='flyswot',
    private=False,
    downloads=999,
    library_name='transformers',
    model-index=[
        {
            'name': 'convnext-tiny-224_flyswot',
            'results': [
                {
                    'task': {
                        'name': 'Image Classification',
                        'type': 'image-classification'
                    },
                    'dataset': {
                        'name': 'image_folder',
                        'type': 'image_folder',
                        'args': 'default'
                    },
                    'metrics': [{'name': 'F1', 'type': 'f1', 'value': 0.9756290792360154}]
                }
            ]
        }
    ],
    cardData={
        'tags': ['generated_from_trainer'],
        'datasets': ['image_folder'],
        'metrics': ['f1'],
        'model-index': [
            {
                'name': 'convnext-tiny-224_flyswot',
                'results': [
                    {
                        'task': {
                            'name': 'Image Classification',
                            'type': 'image-classification'
                        },
                        'dataset': {
                            'name': 'image_folder',
                            'type': 'image_folder',
                            'args': 'default'
                        },
                        'metrics': [
                            {'name': 'F1', 'type': 'f1', 'value': 0.9756290792360154}
                        ]
                    }
                ]
            }
        ]
    }
)
It also comes with nice colour highlighting.
There is scope for making this nicer, but even though this version is much longer, it does allow everything to be visible, even on narrow displays.
One of the reasons I think this could be a candidate for inclusion is that:
you can add rich_repr methods to third-party libraries without including Rich as a dependency. If Rich is not installed, then nothing will break. Hopefully more third-party libraries will adopt Rich repr methods in the future.
This means that implementing a __rich_repr__ doesn’t require this library, or its end users, to have rich installed. If they like the features of rich, they get the benefits, but it’s not forced on anyone.
Describe alternatives you’ve considered
Not doing this! Although I find even the naive __rich_repr__ quite nice, this is a matter of personal preference. There is also potential for scope creep in implementing __rich_repr__ across more and more code. However, I think the container classes in hf_api are particularly good candidates because they contain information you might want to inspect in a terminal, print out in a notebook, etc.
Additional context
If this is something that is deemed worth exploring more I would be happy to make an initial pull request to begin implementing this.
Top GitHub Comments
I’d love to give it a go! The only thing is I’m afk until the start of June but if you’re okay to wait until then, I’m happy to make a start on this.
Hey @davanstrien, that’s a very interesting proposal, thank you!
I think what we’d ideally reach for here is for the __repr__ to be updated to output similar information to what is displayed by the __rich_repr__ you mention here, without necessarily implementing syntax highlighting.
rich looks like a cool package, but as we aim for huggingface_hub to be a very central package to be used across many different libraries, we aim for it to be as lean as possible. This means that we want it lean in terms of dependencies, of course (not impacted by this as rich is optional), but also lean in terms of code readability, class attributes, etc. My personal feeling is that having an optional method that only benefits users that have a certain library installed and that adds features that could be implemented so that all users benefit from it without leveraging that optional dependency goes against that principle.
What do you think?
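One possible reading of this suggestion, as a sketch only: a plain __repr__ that prints one attribute per line using the standard library's pprint.pformat, with no dependency on rich and no syntax highlighting. ModelInfoSketch is a hypothetical stand-in, not the actual huggingface_hub implementation, and the width of 80 is an arbitrary assumption.

```python
from pprint import pformat


class ModelInfoSketch:
    """Hypothetical stand-in for a container class such as ModelInfo."""

    def __init__(self, **kwargs):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(kwargs)

    def __repr__(self):
        lines = []
        for key, val in self.__dict__.items():
            # pformat wraps long containers across several lines.
            formatted = pformat(val, width=80)
            # Indent continuation lines so nested values stay aligned.
            formatted = formatted.replace("\n", "\n    ")
            lines.append(f"    {key}={formatted},")
        body = "\n".join(lines)
        return f"{type(self).__name__}(\n{body}\n)"


info = ModelInfoSketch(
    modelId="flyswot/convnext-tiny-224_flyswot",
    tags=["pytorch", "convnext"],
)
text = repr(info)
print(text)
```

Every user gets the multi-line layout through the built-in print, at the cost of the colour highlighting that rich would add.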