Model Zoo Revamp

Problem

As of now, our model zoo doesn’t make it clear which models are available. docs/model_zoo.md is not well maintained; for example, it doesn’t include BERT or MMF. It also doesn’t show how users can submit a new model to the zoo. Making a PR is not enough: the S3 bucket is not publicly writable, and we give users no instructions on how to upload things, assuming instead that they know how to use the AWS API.

This creates problems since:

Submitted examples can’t have good test cases, because we can’t check in mar files
It limits code reuse between teams in open source: teams can’t share mar files with each other, and it’s not clear what has already been done

Solution

A better solution needs to make sure the zoo is public, searchable, and automatically updated, and that it allows user submissions. It also needs to prevent abuse, such as users uploading unwanted or harmful objects to an S3 bucket we maintain.

Should we use something like PyTorch Hub? The Hugging Face Hub? A basic homegrown S3 hub?

Current Experience

Currently, the TorchServe team maintains an S3 bucket containing common models that users care about. Only the team has write access to it.

Pros

Curated models that work

Cons

Doesn’t allow community contributions, which prevents a rich set of examples, higher-quality unit tests and overall growth

PyTorch Hub

Pros

PyTorch brand, curated

Cons

May require some work to support the mar file format
Cannot host weights without code review, and does not allow arbitrary files to be stored

HuggingFace Hub

Pros

Can upload arbitrary files including mar files from either a web UI or CLI
Model Hub discovery is good
No code review process

Cons

Anyone can submit (not sure how they deal with spam and harmful content)
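As a rough illustration of the upload path described above, here is a minimal sketch using the `huggingface_hub` Python client. The repository id, the file name `bert.mar`, the model name and the tag values are all hypothetical placeholders, and the upload function assumes you are already authenticated (for example via `huggingface-cli login`). It is one possible shape for a proof of concept, not an agreed-upon TorchServe workflow.

```python
from __future__ import annotations

from pathlib import Path


def build_model_card(model_name: str, tags: list[str], datasets: list[str]) -> str:
    """Return README.md text: YAML front matter (metadata) plus a short body."""
    lines = ["---", "tags:"]
    lines += [f"- {tag}" for tag in tags]
    if datasets:
        lines.append("datasets:")
        lines += [f"- {name}" for name in datasets]
    lines += ["---", "", f"# {model_name}", "", "TorchServe model archive (.mar)."]
    return "\n".join(lines) + "\n"


def upload_mar(repo_id: str, mar_path: str, card_text: str) -> None:
    """Upload a .mar file and its model card. Needs network access and a token."""
    # Imported lazily so the card builder works without the client installed.
    from huggingface_hub import HfApi  # pip install huggingface_hub

    api = HfApi()
    api.create_repo(repo_id=repo_id, exist_ok=True)
    api.upload_file(
        path_or_fileobj=mar_path,
        path_in_repo=Path(mar_path).name,
        repo_id=repo_id,
    )
    api.upload_file(
        path_or_fileobj=card_text.encode("utf-8"),
        path_in_repo="README.md",
        repo_id=repo_id,
    )


# Hypothetical model name and tags, for illustration only.
card = build_model_card("bert-seq-classification", ["torchserve", "mar"], [])

# Example call (not executed here; repo id and path are placeholders):
# upload_mar("your-username/bert-seq-classification", "bert.mar", card)
```

The model card is just a `README.md` whose YAML front matter carries the metadata, so it can be generated in CI alongside the archive.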

Homegrown Hub

Create our own model hub, or maybe standardize the mar format further and revamp PyTorch Hub?

Pros

Most flexible, can support any data format we like

Cons

Need to host a service so community members can submit and inspect available models
Need to deal with security, spam and harmful content: if users can submit anything, simply unzipping a random file from the internet is a security risk

Issue Analytics

Created 2 years ago
Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions

osanseviero commented, Mar 16, 2022

Hey all, Omar from HF here 🤗

We’d love to support your use case on the Hugging Face Hub if it makes sense! Just for clarification, the Hub is not constrained to 🤗 transformers models (or models created with Trainer). The Hub uses git-based repositories that anyone can create and upload models to. We actually have integrations with different libraries, many of which are neither transformers-based nor NLP-focused.

One thing that you might find useful is that model cards have metadata that allow reporting things such as the dataset, metrics, tags, etc. This can help with discoverability and even comparison of evaluation results.

There is also the community Inference API that enables widgets to try out the models directly in the browser (or through HTTP requests), or Spaces for fancier demos such as the ones at https://huggingface.co/pytorch.

Let us know if we can help 😄 🦙

cc @LysandreJik @julien-c

1 reaction

msaroufim commented, Mar 18, 2022

Hi @osanseviero, I think this makes sense. At least for the hosting and model card part, your hub is a good experience. I’m embarrassed to admit I couldn’t find instructions to upload directories or files and populate a simple model card on the hub directly, so if you can link me to some I can whip up a POC very quickly.

For everything else let’s talk more. My email is my first name and last name at fb.com
