spaCy 3.0 - specify my own candidate generator to use a custom UMLS path
I'm in the process of updating my Python project from spaCy 2 to spaCy 3. My project uses its own versions of scispaCy's candidate_generation.py and linking_utils.py so that it can load a custom concept_aliases.json and other UMLS data files. How do I accomplish this in spaCy 3 (scispaCy 0.4.0)?
In v2, I simply made copies of candidate_generation.py and linking_utils.py that reference local copies of the KB data files, instantiated the CandidateGenerator from my copy of candidate_generation.py, and passed that object to the EntityLinker constructor like so:
_candidate_generator = CandidateGenerator()
_el = EntityLinker(resolve_abbreviations=True, name="umls",
                   candidate_generator=_candidate_generator)
My candidate_generation.py, in turn, references my linking_utils.py.
Issue Analytics
- Created 3 years ago
- Comments: 9
Top GitHub Comments
Daniel, thank you very much. I finally got it working: I had neglected to download a recent copy of concept_aliases.json and was still using an older one.
The line
response = self._nlp(document.normalized_text)
is executed in a method that is called in response to a user action. The other assignments are done at class init time. Let me try your example, although the code defaults to umls if you omit the name, and umls is what I want. Thanks.
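Since the root cause here was an outdated local copy of concept_aliases.json, a small startup check can flag missing or old data files before the linker is built. The following is only a sketch using the standard library: the function name find_stale_files, the 180-day threshold, and the data/ directory are assumptions for illustration, not part of scispaCy. It also relies on the file's modification time, which reflects when the file was written locally and not which UMLS release it contains.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def find_stale_files(paths, max_age_days=180, now=None):
    """Return (path, reason) pairs for files that are missing or too old.

    Hypothetical helper: a file counts as stale when its modification
    time is more than max_age_days before `now`.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    problems = []
    for path in map(Path, paths):
        if not path.is_file():
            problems.append((path, "missing"))
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if modified < cutoff:
            problems.append((path, f"last modified {modified:%Y-%m-%d}"))
    return problems


if __name__ == "__main__":
    # Assumed location of the local KB copy; adjust to your project layout.
    for path, reason in find_stale_files([Path("data") / "concept_aliases.json"]):
        print(f"{path}: {reason}")
```

Running a check like this at class init time, before the candidate generator loads its files, would surface an old copy early instead of through confusing linking results.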