
MiniLM: releasing all models?

See original GitHub issue

Hi there,

First of all: great work on distilling a strong teacher into a well-performing student and eliminating the issue of parameter-size discrepancy in teacher-student models! I am always happy to see smaller, usable models.

I was wondering if you plan to release the small MiniLM model (L6xH384). It says "We release the uncased 12-layer and 6-layer MiniLM models with 384 hidden size [...]", but I can only find a link to the 12-layer model.

Thanks so much

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

3 reactions
nreimers commented, Jun 9, 2021

Hi @WenhuiWang0824

Great, thank you for the great work and for releasing the models. The MiniLMv1 models work great for bi-encoders and cross-encoders, so I’m eager to test the v2 models.

It would be great if the models could also be added to the huggingface model hub: https://huggingface.co/microsoft

This would make it easy to load and use the models.

Let me know if you need help putting the models on the hub.
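
For context, once a checkpoint is on the hub it can be loaded in a few lines. Below is a minimal sketch using the transformers library; the hub ID is an assumption for illustration (the uncased 12-layer MiniLM checkpoint was later published under microsoft/MiniLM-L12-H384-uncased):

    from transformers import AutoModel, AutoTokenizer

    # Hub ID assumed for illustration; MiniLM uses a BERT-style
    # architecture, so the Auto* classes resolve it directly.
    model_id = "microsoft/MiniLM-L12-H384-uncased"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)

    inputs = tokenizer("MiniLM is a compact distilled transformer.",
                       return_tensors="pt")
    outputs = model(**inputs)
    # The last dimension reflects the 384 hidden size.
    print(outputs.last_hidden_state.shape)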

2 reactions
wenhui0924 commented, Jun 9, 2021

Hi @volker42maru and @maksymbevza,

We have released the monolingual and multilingual MiniLMv2 models distilled from different teachers. Please find the model links in the MiniLM folder.

Thanks
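
As a quick illustration of the bi-encoder use case mentioned above, here is a minimal sketch with the sentence-transformers library, using the community all-MiniLM-L6-v2 checkpoint (a sentence-embedding model built on a 6-layer, 384-dimensional MiniLM student; see the README linked in the results below):

    from sentence_transformers import SentenceTransformer, util

    # Community sentence-embedding model built on a 6-layer,
    # 384-dimensional MiniLM student.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    sentences = [
        "Will the 6-layer MiniLM model be released?",
        "Release plans for the small MiniLM checkpoint",
    ]
    embeddings = model.encode(sentences)

    # Cosine similarity between the two sentence embeddings.
    print(util.cos_sim(embeddings[0], embeddings[1]))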

Read more comments on GitHub >

Top Results From Across the Web

README.md · sentence-transformers/all-MiniLM-L6-v2 at main
We're on a journey to advance and democratize artificial intelligence through open source and open science.
Read more >
MiniLM: Deep Self-Attention Distillation for Task ... - arXiv
Comparison between the publicly released 6-layer models with 768 hidden size distilled from BERT-Base.
Read more >
MINILM: Deep Self-Attention Distillation for ... - NIPS papers
Table 2: Comparison between the publicly released 6-layer models with 768 hidden size distilled from BERT-Base. We compare task-agnostic distilled models without ...
Read more >
Compatible third party NLP models - Elastic
The Elastic Stack machine learning features support transformer models that ... All MiniLM L12 v2. Suitable similarity functions: dot_product, cosine, ...
Read more >
https://raw.githubusercontent.com/microsoft/unilm/...
[Model Release] September 2022: BEiT ... Both English and multilingual MiniLM models are released. "MiniLMv2: Multi-Head Self-Attention ...
Read more >
