Retrain JUST the NER component to have character CNN features?
Recent work, (Gu, et al., 2020; “PubMedBERT”) and (Veysel and Talby, 2020), seems to show that OOVs (writ large, meaning what your word-level vocabulary does or does not cover) are detrimental to NER performance, at least in biomedical domains (e.g., gene and protein tagging). The latter publication uses a character CNN feeding an LSTM and reports outperforming the state of the art, including Stanza and BERT-derived models such as PubMedBERT.
With that in mind, I tried to retrain just a NER model using the ScispaCy^^ base models (en_core_sci_md-0.3.0), and I got the following error when turning on the --chr/--use-chars flag.
The training command:
$ python -m spacy train en /path/to/output/dir /path/to/train.json /path/to/dev.json --base-model 'en_core_sci_md' --pipeline ner -R -v 'en_core_sci_md' -ne 2 --meta-path /path/to/model/en_core_sci_md/meta.json --chr
This gives the following error. (Sorry I can’t copy the whole thing; I’m visually copying and typing from screen to screen at the moment.)
...
ValueError [E149] Error deserializing model. Check that the config used to create the component matches the model being loaded.
...
This happened when the model was reloaded (presumably) from the ground up – parsing, tagging, NER, etc. – to run on the validation/dev set after the first iteration.
Note that the model trains fine, for several iterations, and saves a working model if I omit the --chr/--use-chars flag. This is doubtless because the models have tied parameters and there is no char CNN component to the Tok2Vec features for any of the other parts of the whole pipeline (tagging and parsing). I don’t want to retrain the parser and tagger to use char CNN features, so is there a workaround?
(^^Perhaps I should crosspost there, but this seems to be a spaCy issue – something about model component mismatch.)
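To make the suspected mismatch concrete, here is a toy sketch of how a config difference can surface as a deserialization error. It is not spaCy's actual code: `ToyTok2Vec`, its `use_chars` argument, and the parameter names are all hypothetical, and the only assumption is that turning on character features changes which weights a component allocates.

```python
import json


class ToyTok2Vec:
    """Hypothetical stand-in for a component whose weights depend on its config."""

    def __init__(self, use_chars: bool, width: int = 4):
        self.use_chars = use_chars
        # Assumption for illustration: char features add an extra weight table.
        self.params = {"word_embed": [[0.0] * width for _ in range(3)]}
        if use_chars:
            self.params["char_embed"] = [[0.0] * width for _ in range(2)]

    def to_bytes(self) -> bytes:
        # Serialize the parameters only; the config is not stored with them.
        return json.dumps(self.params).encode("utf8")

    def from_bytes(self, data: bytes) -> "ToyTok2Vec":
        loaded = json.loads(data.decode("utf8"))
        # The saved parameter names must match what this config allocated.
        if set(loaded) != set(self.params):
            raise ValueError(
                "Error deserializing model: saved params "
                f"{sorted(loaded)} != expected {sorted(self.params)}"
            )
        self.params = loaded
        return self


# Weights saved by a component built with character features...
saved = ToyTok2Vec(use_chars=True).to_bytes()

# ...load fine into a component built with the same config,
ToyTok2Vec(use_chars=True).from_bytes(saved)

# ...but fail when the receiving component was built without them.
try:
    ToyTok2Vec(use_chars=False).from_bytes(saved)
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
```

If something like this is what happens here, the error would come from the reload step constructing a component with a different config than the one that was trained, rather than from the training itself.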
Your Environment
- Operating System: macOS High Sierra 10.13.6 (I know, out of date, but probably not relevant!)
- Python Version Used: Python 3.7.9
- spaCy Version Used: 2.3.2 (compatible with ScispaCy model)
- Environment Information: BASH (?)
Top GitHub Comments
https://nightly.spacy.io/api/architectures#CharacterEmbed
https://nightly.spacy.io/usage/layers-architectures#sublayers
(Hmm, there’s really no good way to search the nightly docs, which doesn’t make things easy to find…)