stop words missing for en_core_web_md
I'm new to spaCy and want to configure stop words.
The regular spacy.en.STOP_WORDS do not seem to apply when loading the larger en_core_web_md model.
How can I configure the larger model to use the regular stop words?
Issue Analytics
- State:
- Created 6 years ago
- Reactions:2
- Comments:10 (5 by maintainers)
Top GitHub Comments
@ines, I'm using en_core_web_md v2.0.0 and this continues to be an issue. Works just fine with the small model.

Also having this problem with en_core_web_md v2.0.0. I had to use the following as a workaround:
nlp.vocab.add_flag(lambda s: s.lower() in spacy.lang.en.stop_words.STOP_WORDS, spacy.attrs.IS_STOP)
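
For anyone who would rather not register a flag function, here is a rough sketch of an alternative: setting is_stop directly on the lexeme of each stop word. It assumes Lexeme.is_stop is writable in your installed spaCy version and that en_core_web_md is installed. The names mark_stop_words and demo are made up for this example and are not part of spaCy.

```python
def mark_stop_words(vocab, stop_words):
    """Set is_stop on the lexeme of each stop word.

    Covers the word as given, its capitalized form, and its upper-case
    form, since each spelling has its own lexeme.
    Returns the number of lexemes that were marked.
    """
    marked = 0
    for word in stop_words:
        # A set avoids touching the same spelling twice (e.g. "A" for "a").
        for form in {word, word.capitalize(), word.upper()}:
            vocab[form].is_stop = True
            marked += 1
    return marked


def demo():
    # Assumption: spaCy and the en_core_web_md model are installed.
    import spacy
    from spacy.lang.en.stop_words import STOP_WORDS

    nlp = spacy.load("en_core_web_md")
    mark_stop_words(nlp.vocab, STOP_WORDS)
    doc = nlp("This is a sentence about stop words.")
    return [(token.text, token.is_stop) for token in doc]
```

Calling demo() returns (text, is_stop) pairs for a sample sentence, so you can check whether the flags now behave like they do with the small model. Unlike the add_flag workaround, this only covers the three spellings listed, so mixed-case forms such as "tHe" would still need the lambda-based approach.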