A way to tell what tokens `LatinBackOffLemmatizer()` has failed to lemmatize
See original GitHub issueIn LatinBackOffLemmatizer()
and the lemmatizers in its chain I can’t seem to find an option to return an empty value (such as in OldEnglishDictionaryLemmatizer()
's best_guess=False
option), instead of returning the input value, when the lemmatizer fails to assign a lemma.
Without such an option, it doesn’t seem possible to tell successful from unsuccessful lemmatization attempts programmatically, severely limiting the range of the lemmatizer’s applications.
Issue Analytics
- State:
- Created 9 months ago
- Comments:6 (5 by maintainers)
Top Results From Across the Web
8.1.8. cltk.lemmatize package - The Classical Language Toolkit
Tagging of individual words is performed by the choose_tag() method, which should be ... If a tagger is unable to determine a tag...
Read more >NLTK Lemmatization: How to Lemmatize Words with NLTK?
Create the tokens with “word_tokenize” from the text. Lemmatize the tokens with “lemmatizer.lemmatize()”. Append the lemmas into a list. Unite ...
Read more >spacy 3.0 warnings about lemmatization and POS · Issue #5036
[W108] The rule-based lemmatizer did not find POS annotation for the token 'hesitated'. Check that your pipeline includes components that assign ...
Read more >Lemmatization Approaches with Examples in Python
Lemmatization is the process of converting a word to its base form. ... We will see how to optimally implement and compare the...
Read more >Is Spacy lemmatization not working properly or does it not ...
The spaCy lemmatizer is not failing, it's performing as expected. ... See how the verb has "consult" as a lemma, while the noun...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Ok, I made something that might help you.
self.backoff1
is now aDefaultLemmatizer
instance that returns None if no result was found.And you get: