The library considers the verb "see" as a stop word!
In a string such as "she loves to see him", the verb "see" is taken as a stop word and is removed from the final array, which is not right.
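One possible workaround, sketched here under assumptions, is to filter with a list that leaves "see" out. The helper `filterStopwords` and the list `sampleStopwords` below are hypothetical names made up for illustration. They are not the library's API or its real English list.

```javascript
// Hypothetical helper, not the library's API: drop every token found in
// `stopwords`, except tokens explicitly listed in `keep`.
function filterStopwords(tokens, stopwords, keep = []) {
  const stop = new Set(stopwords.map((w) => w.toLowerCase()));
  for (const w of keep) stop.delete(w.toLowerCase());
  return tokens.filter((t) => !stop.has(t.toLowerCase()));
}

// Illustrative stop word list (an assumption, not the library's real list).
const sampleStopwords = ["she", "to", "see", "him", "the", "a"];

const tokens = "she loves to see him".split(" ");

// With the list as-is, "see" is dropped along with the other stop words.
const withoutKeep = filterStopwords(tokens, sampleStopwords);
// With "see" on the keep list, the verb survives.
const withKeep = filterStopwords(tokens, sampleStopwords, ["see"]);

console.log(withoutKeep); // [ 'loves' ]
console.log(withKeep); // [ 'loves', 'see' ]
```

The keep list is applied by deleting entries from a copy of the stop word set, so the original list is left unchanged.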
Issue Analytics
- Created 2 years ago
- Comments: 6 (2 by maintainers)
Top GitHub Comments
Also, stopword-trainer can be used to continually update your list of stopwords.
From a theoretical standpoint it’s not correct, but sometimes it will be correct in practical use, for example when the word "see" is used in a lot of documents. I thought of generating stopword lists sorted by stopwordiness, but for now that’s too much work. We can take it out. Can you create a PR?
I’ve created a library to generate lists of stopwords where you can redlist words that you want to keep (as opposed to a stopword list being a blacklist). It will also work better for the tribal language used within an organization or group of people. https://github.com/eklem/stopword-trainer/
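The idea of a generated list with a redlist could be sketched roughly as follows. This is not stopword-trainer's actual algorithm or API. It assumes that "stopwordiness" can be approximated by document frequency (the share of documents that contain a word), and the function `rankStopwordCandidates` is a hypothetical name.

```javascript
// Hypothetical sketch: rank words by document frequency and skip any word
// on the redlist, so redlisted words never become stop word candidates.
function rankStopwordCandidates(documents, redlist = []) {
  const keep = new Set(redlist.map((w) => w.toLowerCase()));
  const docCounts = new Map();
  for (const doc of documents) {
    // Count each word at most once per document.
    const words = new Set(doc.toLowerCase().split(/\s+/).filter(Boolean));
    for (const word of words) {
      if (keep.has(word)) continue;
      docCounts.set(word, (docCounts.get(word) || 0) + 1);
    }
  }
  // Highest document frequency first; ties are broken alphabetically.
  return [...docCounts.entries()]
    .map(([word, count]) => ({ word, score: count / documents.length }))
    .sort((a, b) => b.score - a.score || a.word.localeCompare(b.word));
}

const docs = [
  "she loves to see him",
  "we see the point",
  "they want to see the sea",
];

// Without a redlist, "see" appears in every document and ranks first.
const unfiltered = rankStopwordCandidates(docs);
// With "see" redlisted, it is left out of the candidates entirely.
const ranked = rankStopwordCandidates(docs, ["see"]);

console.log(unfiltered[0].word); // see
console.log(ranked.slice(0, 2).map((r) => r.word)); // [ 'the', 'to' ]
```

A cutoff on `score` would then decide which candidates end up in the stop word list. Where to put that cutoff depends on the corpus.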