
Adding Universal Language Model Fine-tuning (ULMFiT) pre-trained LM to spaCy and allowing a simple way to train new models

See original GitHub issue

Feature description

Universal Language Model Fine-tuning for Text Classification presented a novel method for fine-tuning a pre-trained universal language model to a particular classification task, achieving beyond-state-of-the-art results (an 18-24% reduction in error rate) on multiple benchmark text classification tasks. The fine-tuning requires very few labeled examples (as few as 100) to achieve very good results.

Here is an excerpt of the abstract, which provides a good TL;DR of the paper (duh):

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. Our method significantly outperforms the state-of-the-art on six text classification tasks, reducing the error by 18-24% on the majority of datasets. Furthermore, with only 100 labeled examples, it matches the performance of training from scratch on 100× more data. We open-source our pretrained models and code.

I propose that spaCy add their pre-trained models, and a simple way to fine-tune them to a new task, as a core feature of the library.
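
To make the proposal concrete, here is a minimal sketch of the three-stage ULMFiT recipe as it later appeared in the fastai v1 text API (which postdates this issue). The file name, column layout, epoch counts, and learning rates are illustrative assumptions; the stages themselves (fine-tune the pre-trained LM on the target corpus, save its encoder, then train a classifier on top with gradual unfreezing and discriminative learning rates) follow the paper.

    from fastai.text import (TextLMDataBunch, TextClasDataBunch, AWD_LSTM,
                             language_model_learner, text_classifier_learner)

    path = '.'  # hypothetical directory containing texts.csv (label,text rows)

    # Stage 1: fine-tune the pre-trained AWD-LSTM language model on the corpus.
    data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
    lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
    lm.fit_one_cycle(1, 1e-2)   # train the new LM head first
    lm.unfreeze()
    lm.fit_one_cycle(1, 1e-3)   # then fine-tune the whole LM
    lm.save_encoder('ft_enc')   # keep the fine-tuned encoder

    # Stage 2: train a classifier on top of the fine-tuned encoder.
    data_clas = TextClasDataBunch.from_csv(path, 'texts.csv',
                                           vocab=data_lm.train_ds.vocab, bs=32)
    clf = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
    clf.load_encoder('ft_enc')
    clf.fit_one_cycle(1, 1e-2)                          # classifier head only
    clf.freeze_to(-2)                                   # gradual unfreezing
    clf.fit_one_cycle(1, slice(1e-2 / 2.6**4, 1e-2))    # discriminative LRs
    clf.unfreeze()
    clf.fit_one_cycle(1, slice(1e-3 / 2.6**4, 1e-3))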

Could the feature be a custom component or spaCy plugin?

If so, we will tag it as project idea so other users can take it on.

This seems like a core feature of spaCy, greatly increasing its industrial potential. I would argue for making it a first-class citizen, if the authors and the licensing of this work permit that.

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 46
  • Comments: 10 (3 by maintainers)

Top GitHub Comments

22 reactions
sebastianruder commented, May 18, 2018

Author here. I'd love to see this happen, and I'm sure @jph00 would also be on board. Fast.ai is working on pre-trained models for other languages, and we'll be working to simplify the code and make it more robust.

9 reactions
honnibal commented, May 19, 2018

Super keen on this! @jph00 The vision for plugging in other libraries is to have Thinc as a thin wrapper on top. I've just merged a PR on this, and have fixed up an example of wrapping a PyTorch BiLSTM model and inserting it into a Thinc model: https://github.com/explosion/thinc/blob/master/examples/pytorch_lstm_tagger.py#L122

You can find the wrapper here: https://github.com/explosion/thinc/blob/master/thinc/extra/wrappers.py#L13

This wrapping approach is the long-standing plan for plugging "foreign" models into spaCy and Prodigy. We want to have similar wrappers for TensorFlow, DyNet, MXNet, etc. The Thinc API is pretty minimal, so it's easy to wrap models this way.
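
As an illustration of what that wrapping looks like in practice, here is a minimal sketch against the Thinc 6.x-era PyTorchWrapper linked above. The layer sizes and dummy data are arbitrary assumptions, and the exact call signatures may have shifted in later Thinc versions.

    import numpy
    import torch.nn as nn
    from thinc.extra.wrappers import PyTorchWrapper

    # Any torch.nn.Module can stand in for the "foreign" model here.
    torch_model = nn.Sequential(
        nn.Linear(300, 128),
        nn.ReLU(),
        nn.Linear(128, 64),
    )

    # Wrap it so it exposes Thinc's minimal model interface.
    model = PyTorchWrapper(torch_model)

    X = numpy.zeros((8, 300), dtype='f')            # batch of 8 dummy vectors
    Y, backprop = model.begin_update(X)             # forward pass + callback
    dX = backprop(numpy.ones(Y.shape, dtype='f'))   # backward pass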

Btw, as well as a plugin, I'm very interested in finding the right solution for pre-training the "embed" and "encode" steps in spaCy's NER, parser, etc. The catch is that our performance target is 10k words per second per CPU core, which I think rules out a BiLSTM. The CNN architecture I've got is actually pretty good, and we're currently only a little off the target (7.5k words per second in my latest tests).
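
For orientation, the embed/encode split he mentions looks roughly like this in Thinc 6.x-era terms: a hash embedding table ("embed") feeding a stack of residual CNN layers ("encode"), similar in spirit to spaCy 2's Tok2Vec. This is a sketch from memory with assumed module paths and arbitrary sizes, not the actual spaCy code.

    from thinc.api import chain, clone
    from thinc.i2v import HashEmbed        # "embed": hashed word-vector table
    from thinc.t2t import ExtractWindow    # concatenate neighbouring vectors
    from thinc.v2v import Model, Maxout
    from thinc.misc import Residual

    width = 128
    with Model.define_operators({'>>': chain, '**': clone}):
        # One CNN layer: mix a 3-token window back down to `width` dims.
        cnn = Residual(ExtractWindow(nW=1) >> Maxout(width, width * 3))
        # Embed, then encode with 4 stacked CNN layers.
        tok2vec = HashEmbed(width, 5000) >> cnn ** 4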


Top Results From Across the Web

Universal Language Model Fine-tuning for Text Classification
We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and ...

Multi-label classification of symptom terms from free-text ...
ULMFiT is a technique to fine-tune a pre-trained sequential LSTM-based LM to a target corpus and then fine-tune a classifier to a target...

Khmer Language Model Using ULMFiT | by Phylypo Tum
Previously, we implemented a machine learning (ML) approach to categorize traffic accidents of Khmer news articles for our website.
