Problem

As of now our model zoo doesn’t make it clear which models are available. docs/model_zoo.md is not well maintained; for example, it doesn’t include BERT or MMF. It also doesn’t show how users can submit a new model to the zoo. Making a PR is not enough, since the S3 bucket is not publicly available and we don’t give users instructions on how to upload things; we assume they know how to use the AWS API.

This creates problems:

  1. Submitted examples can’t have good test cases, because we can’t check in mar files
  2. Code reuse between open source teams is limited: they can’t share mar files with each other, and it’s not clear what has already been done

Solution

A better solution needs to make sure the zoo is public, searchable and automatically updated, and that it allows user submissions. It also needs to prevent abuse, such as users uploading unwanted or harmful objects to an S3 bucket we maintain.

Should we use something like PyTorch Hub? The Hugging Face Hub? A basic homegrown S3 hub?
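
Whichever hub is chosen, the "searchable, automatically updated" requirement could be met by generating the listing from the archives themselves rather than editing docs/model_zoo.md by hand. The following is a minimal sketch, not an existing TorchServe tool. It assumes that a mar file is a zip archive containing `MAR-INF/MANIFEST.json` with a `model` object holding `modelName`, `modelVersion` and `handler`; the function names are hypothetical.

```python
import json
import zipfile
from pathlib import Path

# Assumption: location of the manifest inside a .mar archive.
MANIFEST_PATH = "MAR-INF/MANIFEST.json"


def read_manifest(mar_path):
    """Return the parsed manifest of a .mar file, or None if it has none."""
    with zipfile.ZipFile(mar_path) as archive:
        if MANIFEST_PATH not in archive.namelist():
            return None
        return json.loads(archive.read(MANIFEST_PATH))


def build_index(zoo_dir):
    """Collect one row per .mar file found directly under zoo_dir."""
    rows = []
    for mar_path in sorted(Path(zoo_dir).glob("*.mar")):
        manifest = read_manifest(mar_path) or {}
        model = manifest.get("model", {})
        rows.append({
            "file": mar_path.name,
            "name": model.get("modelName", "unknown"),
            "version": model.get("modelVersion", "unknown"),
            "handler": model.get("handler", "unknown"),
        })
    return rows


def to_markdown(rows):
    """Render the rows as a markdown table suitable for a zoo listing."""
    lines = [
        "| Model | Version | Handler | Archive |",
        "| --- | --- | --- | --- |",
    ]
    for row in rows:
        lines.append(
            f"| {row['name']} | {row['version']} | {row['handler']} | {row['file']} |"
        )
    return "\n".join(lines)
```

Running `to_markdown(build_index("model_store"))` in CI after each submission would keep the listing in sync with what is actually hosted; the `model_store` directory name is only an example.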

Current Experience

Currently, the TorchServe team maintains an S3 bucket of common models users care about, and only the team has write access to it.

Pros

  1. Curated models that work

Cons

  1. Doesn’t allow community contributions, which prevents a rich set of examples, higher quality unit tests and overall growth

PyTorch Hub

Pros

  1. PyTorch brand, curated

Cons

  1. May require some work to support a mar file format
  2. Cannot host weights without code review, does not allow arbitrary files to be stored

HuggingFace Hub

Pros

  1. Can upload arbitrary files including mar files from either a web UI or CLI
  2. Model Hub discovery is good
  3. No code review process

Cons

  1. Anyone can submit (not sure how they deal with spam and harmful content)
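
For the programmatic upload path, a minimal sketch using the `huggingface_hub` Python client might look like the following. It relies on `HfApi.create_repo` and `HfApi.upload_file`; the `publish_mar` helper, the model card template, the `torchserve` tag and the repo and file names in the usage comment are all hypothetical. The API object is passed in so the helper can be exercised without network access.

```python
from pathlib import Path

# Hypothetical minimal model card; the YAML front matter carries Hub metadata.
MODEL_CARD_TEMPLATE = """---
tags:
- torchserve
---
# {name}

TorchServe model archive: `{filename}`
"""


def publish_mar(api, mar_path, repo_id):
    """Upload a .mar file and a minimal model card to repo_id.

    `api` is expected to behave like huggingface_hub.HfApi.
    Returns the list of paths written to the repo.
    """
    mar_path = Path(mar_path)
    # Create the model repo if it does not exist yet.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    # Upload the archive itself under its own file name.
    api.upload_file(
        path_or_fileobj=str(mar_path),
        path_in_repo=mar_path.name,
        repo_id=repo_id,
        repo_type="model",
    )
    # Upload the model card as README.md, passed as raw bytes.
    card = MODEL_CARD_TEMPLATE.format(name=mar_path.stem, filename=mar_path.name)
    api.upload_file(
        path_or_fileobj=card.encode("utf-8"),
        path_in_repo="README.md",
        repo_id=repo_id,
        repo_type="model",
    )
    return [mar_path.name, "README.md"]


# Usage (needs `pip install huggingface_hub` and a logged-in token):
#   from huggingface_hub import HfApi
#   publish_mar(HfApi(), "model_store/bert.mar", "your-username/bert-torchserve")
```

Because each Hub repo is git-based, every upload becomes a commit, which gives submissions a history even without a code review step.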

Homegrown Hub

Create our own model hub, or maybe standardize mar format more and revamp torch hub?

Pros

  1. Most flexible, can support any data format we like

Cons

  1. Need to host a service so community members can submit and inspect available models
  2. Need to deal with security, spam and harmful content: if users can submit anything, simply unzipping a random file from the internet is a security risk
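
To make the unzip risk concrete, here is a minimal sketch of a pre-extraction check that a homegrown hub could run on each submission. It assumes submissions are zip-format mar files; the `check_archive` name and the size and member limits are hypothetical. It only inspects the archive listing, so it catches path traversal and oversized archives, not malicious handler code inside the archive.

```python
import zipfile
from pathlib import PurePosixPath

# Hypothetical limits; tune them to the models the zoo expects.
MAX_TOTAL_BYTES = 2 * 1024**3
MAX_MEMBERS = 10_000


def check_archive(path, max_total_bytes=MAX_TOTAL_BYTES, max_members=MAX_MEMBERS):
    """Return a list of problems found; an empty list means no check failed."""
    if not zipfile.is_zipfile(path):
        return ["not a zip archive"]

    problems = []
    with zipfile.ZipFile(path) as archive:
        infos = archive.infolist()
        if len(infos) > max_members:
            problems.append(f"too many members: {len(infos)}")

        total = 0
        for info in infos:
            name = info.filename.replace("\\", "/")
            parts = PurePosixPath(name).parts
            # Reject absolute paths, parent-directory hops and drive letters.
            if name.startswith("/") or ".." in parts or (parts and ":" in parts[0]):
                problems.append(f"unsafe path: {info.filename}")
            # file_size is the size declared in the zip header, not a measurement.
            total += info.file_size

        if total > max_total_bytes:
            problems.append(f"declared size {total} exceeds {max_total_bytes} bytes")
    return problems
```

A check like this is a first filter at most. Since a model archive can carry handler code that runs when the model is loaded, untrusted submissions would likely still need review or sandboxing.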

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions
osanseviero commented, Mar 16, 2022

Hey all, Omar from HF Here 🤗

We’d love to support your use case on the Hugging Face Hub if it makes sense! Just for clarification, the Hub is not constrained to 🤗 transformers models (or models created with Trainer). The Hub uses git-based repositories that anyone can create and upload models to. We actually have integrations with different libraries, many of which are neither transformers-based nor NLP-focused.

One thing that you might find useful is that model cards have metadata that allow reporting things such as the dataset, metrics, tags, etc. This can help with discoverability and even comparison of evaluation results.

There is also the community Inference API that enables widgets to try out the models directly in the browser (or through HTTP requests), or Spaces for fancier demos such as the ones at https://huggingface.co/pytorch.

Let us know if we can help 😄 🦙

cc @LysandreJik @julien-c

1 reaction
msaroufim commented, Mar 18, 2022

Hi @osanseviero, I think this makes sense. At least for the hosting and model card part, your hub is a good experience. I’m embarrassed to admit I couldn’t find instructions to upload directories or files and populate a simple model card on the hub directly, so if you can link me one I can whip up a POC very quickly.

For everything else let’s talk more. My email is my first name and last name at fb.com
