SpaCy NER training example from version 1.5.0 doesn't work in 1.6.0
I tried to use the training example here:
https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py
with SpaCy 1.6.0. I get results like this:
Who is Shaka Khan?
Who 1228 554 WP 2
is 474 474 VBZ PERSON 3
Shaka 57550 129921 NNP PERSON 1
Khan 12535 48600 NNP LOC 3
? 482 482 . LOC 3
I like London and Berlin
I 467 570 PRP LOC 3
like 502 502 VBP LOC 1
London 4003 24340 NNP LOC 3
and 470 470 CC PERSON 3
Berlin 11964 60816 NNP PERSON 1
The tagging is odd: Khan is recognized as a LOC and Berlin as a PERSON. If I back up to version 1.5.0, the result is as expected:
Who is Shaka Khan?
Who 1228 554 WP 2
is 474 474 VBZ 2
Shaka 57550 129921 NNP PERSON 3
Khan 12535 48600 NNP PERSON 1
? 482 482 . 2
I like London and Berlin
I 467 570 PRP 2
like 502 502 VBP 2
London 4003 24340 NNP LOC 3
and 470 470 CC 2
Berlin 11964 60816 NNP LOC 3
Could this be an issue with the off-the-shelf English model that spacy.en.download fetched for 1.6.0?
Issue Analytics
- Created 7 years ago
- Comments: 7 (5 by maintainers)
Top GitHub Comments
TL;DR
I made a bug fix to thinc for 1.6 that’s messed up the example as it’s written. The best fix is to not call .end_training() after updating the model. I’m working on making this less confusing.
What’s going on
spaCy 1.x uses the Averaged Perceptron algorithm for all its machine learning. You can read about the algorithm in the POS tagger blog post, where you can also find a straight-forward Python implementation: https://explosion.ai/blog/part-of-speech-pos-tagger-in-python
AP uses the Averaged Parameter Trick for SGD. There are two copies of the weights: the current weights and their running averages. During training, predictions are made with the current weights, and the averaged weights are updated in the background. At the end of training, we swap the current weights for the averages. This makes a huge difference in most training scenarios.
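As a toy sketch (this is an illustration of the trick, not thinc’s actual implementation), the idea can be reduced to a single weight that keeps a running total alongside its current value:

```python
# Toy averaged-parameter trick: predict with `current` during training,
# accumulate a running total, and swap in the average when training ends.
class AveragedWeight:
    def __init__(self, value=0.0):
        self.current = value   # used for predictions during training
        self.total = 0.0       # sum of `current` over all update steps
        self.steps = 0

    def update(self, gradient, lr=0.1):
        # accumulate the pre-update value, then apply the step
        self.total += self.current
        self.steps += 1
        self.current += lr * gradient

    def end_training(self):
        # replace the current weight with its average over the run
        if self.steps:
            self.current = self.total / self.steps

w = AveragedWeight()
for _ in range(10):
    w.update(gradient=1.0)
w.end_training()   # w.current is now 0.45, the average over the run
```

The averaged weights are more stable than the final weights, which is why the swap at the end of training usually helps.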
However, when I wrote the code, I didn’t pay much attention to the use-case of “resuming” training in order to add another class. I recently fixed a long-standing error in the averaged perceptron code:
After loading a model, Thinc was not initialising the averages to the newly loaded weights. This saved memory, because the averages require another copy of the weights, plus some additional book-keeping. The consequence of this bug was that when you updated a feature after resuming training, you wiped the weights that were previously associated with it. This is really bad: it means that as you trained on new examples, you deleted all the information previously associated with those features.
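A toy reconstruction of the failure mode (the mechanics here are assumed for illustration, not thinc’s real data structures): if loading doesn’t seed an averaging record for each weight, the first resumed update to a feature starts it from zero, discarding the loaded value.

```python
# Updates go through the per-feature averaging records; a missing record
# defaults to zero, so the loaded weight is silently wiped on first update.
def update_feature(records, name, gradient, lr=0.1):
    current, total, steps = records.get(name, (0.0, 0.0, 0))
    current += lr * gradient
    records[name] = (current, total + current, steps + 1)
    return current

loaded = {"B-PER": 5.0}                                # pretrained weight
buggy_records = {}                                     # averages never initialised
fixed_records = {k: (w, 0.0, 0) for k, w in loaded.items()}  # seeded on load

w_buggy = update_feature(buggy_records, "B-PER", 1.0)  # 0.1: the loaded 5.0 is gone
w_fixed = update_feature(fixed_records, "B-PER", 1.0)  # 5.1: loaded weight preserved
```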
I finally fixed this bug in this commit: https://github.com/explosion/thinc/commit/09b030b4aa0e58fd3eef0eda5340795fd079b248
The consequence is that the corrected code makes the model behave differently on these small-data example cases.
What’s still unclear is how we should compute an average between the old weights and the new ones. The old weights were trained with about 20 passes over about 80,000 annotated sentences, so the new 5 passes over 5 examples shouldn’t change the weights at all if we take an unbiased average. This seems undesirable.
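The arithmetic behind that claim, using the approximate numbers above:

```python
# Share of the resumed updates in an unbiased average over all training steps.
old_steps = 20 * 80_000   # ~20 passes over ~80,000 sentences
new_steps = 5 * 5         # 5 passes over 5 examples
share = new_steps / (old_steps + new_steps)
print(f"new data's share of the average: {share:.6%}")
```

The new examples account for roughly 0.0016% of the steps, so an unbiased average leaves the model essentially unchanged.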
If you have so little data, it’s probably not a good idea to average.
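A toy calculation (assumed numbers, not thinc’s code) of what swapping in the average does after a short resume run: the average is dominated by the many steps of the original training, so calling .end_training() throws away nearly all of the new learning.

```python
# Resume training from a loaded weight; compare swapping in the average
# (end_training=True) with keeping the current weights (end_training=False).
def resume(loaded, history_steps, gradients, lr=0.1, end_training=True):
    current = loaded
    total, steps = loaded * history_steps, history_steps  # average seeded at `loaded`
    for g in gradients:
        total += current
        steps += 1
        current += lr * g
    return total / steps if end_training else current

w_avg = resume(1.0, 1_000_000, [1.0] * 5)                      # ~1.0: resume run vanishes
w_cur = resume(1.0, 1_000_000, [1.0] * 5, end_training=False)  # 1.5: new learning kept
```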
About NER and training more generally (making this the megathread)
#762, #612, #701, #665. Attn: @savvopoulos, @viksit
People are having a lot of pain with training the NER system. Some of the problems are easy to fix — the current workflow around saving and loading data is pretty bad, and it’s made worse by some Python 2/3 unicode save/load bugs in the example scripts.
What’s hard to solve is that people seem to want to train the NER system on, like, 5 examples. The current algorithm expects more like 5,000. I realise I never wrote this anywhere, and the examples all show five examples. I guess I’ve been doing this stuff too long, and it’s no longer obvious to me what is and isn’t obvious. I think this has been the root cause of a lot of confusion.
Things will improve a little with spaCy 2.0. You might be able to get a useful model with as few as 500 or 1,000 sentences annotated with a new NER class. Maybe.
We’re working on ways to make all of this more efficient. We’re working on making annotation projects less expensive and more consistent, and we’re working on algorithms that require fewer annotated examples. But there will always be limits.
The thing is… I think most teams should be annotating literally 10,000x as much data as they’re currently trying to get away with. You should have at least 1,000 sentences of evaluation data alone, data that your machine learning model never sees. Otherwise how will you know that your system is working? By typing stuff into it manually? You wouldn’t test your other code like that, would you? 😃
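A minimal sketch of what holding out evaluation data looks like in practice (the function name and sizes are illustrative):

```python
import random

# Hold out `dev_size` examples the model never trains on, so accuracy
# can be measured on unseen data instead of eyeballing predictions.
def train_dev_split(examples, dev_size=1000, seed=0):
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    return shuffled[dev_size:], shuffled[:dev_size]

sentences = [f"sentence {i}" for i in range(5000)]
train, dev = train_dev_split(sentences)
# train on `train`; score the model on `dev` with a metric
```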
@honnibal Thanks for your explanation.
Currently, the example code for training and updating NER in the documentation only uses 2 sentences, which is obviously not enough (I realized this after reading your comment).
I think it would be better if you put your explanation in the documentation. Everyone tries to read the docs to learn something; they only go to the issues if they can’t find what they want in the docs.
More problems about the example code
How do you use the updated NER model? Update: found an example here: https://spacy.io/docs/usage/training#train-entity
It seems the example retrains a NER model from scratch rather than updating the original one?