stop words missing for en_core_web_md

I'm new to spaCy and want to configure stop words. The regular spacy.en.STOP_WORDS do not seem to apply when I load the larger en_core_web_md model. How can I configure the larger model to use the regular stop words?

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 2
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

jmidyet commented, Dec 24, 2017 (2 reactions)

@ines, I’m using en_core_web_md v2.0.0 and this continues to be an issue. It works just fine with the small model.

georgek commented, Jan 24, 2018 (1 reaction)

I'm also having this problem with en_core_web_md v2.0.0. I had to use the following as a workaround:

nlp.vocab.add_flag(lambda s: s.lower() in spacy.lang.en.stop_words.STOP_WORDS, spacy.attrs.IS_STOP)
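
For anyone who would rather not register a flag getter, here is a minimal sketch of another way to get the same effect: setting is_stop directly on the lexemes. The helper name mark_stop_words is hypothetical and not part of spaCy. It only assumes that indexing the vocab with a string returns a lexeme whose is_stop attribute is writable, which is how nlp.vocab behaves to my knowledge. Because the flag lives on each surface form, the sketch also marks the capitalized and upper-case variants.

```python
def mark_stop_words(vocab, stop_words):
    """Set is_stop on the lexeme of every stop word (hypothetical helper).

    `vocab` is assumed to behave like spaCy's `nlp.vocab`: indexing it with
    a string returns a lexeme object with a writable `is_stop` attribute.
    Returns the number of surface forms that were marked.
    """
    marked = 0
    for word in stop_words:
        # The flag is stored per surface form, so cover common casings too.
        for form in {word, word.capitalize(), word.upper()}:
            vocab[form].is_stop = True
            marked += 1
    return marked


# Usage sketch (assumes spaCy and the en_core_web_md model are installed):
#   import spacy
#   from spacy.lang.en.stop_words import STOP_WORDS
#   nlp = spacy.load("en_core_web_md")
#   mark_stop_words(nlp.vocab, STOP_WORDS)
```

Unlike the lambda above, this only covers the casings it explicitly marks, so mixed-case tokens such as "tHe" would still report is_stop as False.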

Top Results From Across the Web

stop words missing for en_core_web_md and ... - GitHub
en_core_web_md and en_core_web_lg models are giving 'False' for all words in the sentence using "is_stop" attribute.
missing stop words from spacy en_core_web_lg
I downloaded en_core_web_lg(en_core_web_lg-2.0.0) but when I load it and used it on spacy. But it seems to miss lots of basic common stop...
How to use custom stop words in spaCy - BotFlo
First, create a file called stop_words.py and add the following code to it. As you can see, I am simply iterating over all...
How To Get Rid Of Noun Tag For Unknown Words In Spacy ...
For now we'll be considering stop words as words that just contain no meaning and we want to remove them. You can do...
Stop words with NLTK - Python Programming Tutorials
This word means nothing, unless of course we're searching for someone who is maybe lacking confidence, is confused, or hasn't practiced much speaking....
