Finalize Marathi stopword list
@MaheshBhosale thank you for your additions! I had to merge this manually because of several conflicts.
I need you to do a few things:
- Take a look at the formatting and changes I made to your code: https://github.com/cltk/cltk/commit/da0164455dc13d852897570cee069fad8ee99a82 (I also deleted the redundant stops/Marathi/; use lowercase).
- Check each of the commands in the docs to ensure that they work correctly.
- Important: Please tell us how you made the stops list. What texts did you use? And were those texts pre-modern? As a toolkit for historians of languages and literature, it is important that we represent the old versions of languages, not their modern forms (which are sometimes quite different, even under the same name).
- From the Marathi stops list, remove any nouns, adjectives, verbs, and adverbs.
- To the stops list, also add the various inflections of words for things like articles and pronouns. I can explain more if you’re unsure what I mean. (@diyclassics can help too).
- Add docs for Marathi stops.
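
To make the request about inflections concrete, here is a minimal sketch in plain Python. It does not use the CLTK API; the helper names (`normalize`, `build_stops`, `remove_stops`) are hypothetical, and the few words shown are illustrative placeholders only, not the proposed list (they are modern forms and would need checking against pre-modern texts).

```python
import unicodedata


def normalize(word: str) -> str:
    """Trim whitespace and apply NFC so identical-looking Devanagari strings compare equal."""
    return unicodedata.normalize("NFC", word.strip())


def build_stops(base_forms, inflections):
    """Combine base function words with their inflected forms into one sorted, de-duplicated list."""
    stops = set()
    for base in base_forms:
        stops.add(normalize(base))
        for form in inflections.get(base, []):
            stops.add(normalize(form))
    return sorted(stops)


def remove_stops(tokens, stops):
    """Return the tokens that are not in the stops list."""
    stop_set = set(stops)
    return [t for t in tokens if normalize(t) not in stop_set]


# Illustrative placeholders (assumption): one pronoun, one conjunction.
BASE_FORMS = ["तो", "आणि"]
# Hypothetical inflection table: base form -> inflected forms to include.
INFLECTIONS = {"तो": ["त्याला", "त्याने", "त्याचा"]}

STOPS = build_stops(BASE_FORMS, INFLECTIONS)

tokens = "तो आणि राम".split()
filtered = remove_stops(tokens, STOPS)
```

Keeping the base forms and their inflections in separate structures makes it easier to review that only function words (pronouns, conjunctions, and the like) are present, and that no nouns, adjectives, verbs, or adverbs slipped in.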
Thanks @kylepjohnson
Gotcha. Thanks for the clarification. Working on it.