Add case_sensitive option to WhitespaceTokenizer
Description of Problem:
WhitespaceTokenizer does not have the case_sensitive option. This is in reference to a discussion with @Ghostvv on the Rasa Community Forum: https://forum.rasa.com/t/case-sensitivity/2541/8
Overview of the Solution:
Add the case_sensitive option to WhitespaceTokenizer to allow models using this tokenizer to be case insensitive.
Examples (if relevant):
If a user types “Burger” while the WhitespaceTokenizer is configured with case_sensitive: false, the slot should still be filled even if the training data contains only “burger”.
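The requested behaviour can be illustrated with a minimal sketch. This is not Rasa's implementation: `whitespace_tokenize` and `known_values` are hypothetical names used only for illustration, and the sketch assumes that case insensitivity means lowercasing the text before splitting on whitespace.

```python
from typing import List


def whitespace_tokenize(text: str, case_sensitive: bool = True) -> List[str]:
    """Split text on whitespace; lowercase it first when case_sensitive is False."""
    if not case_sensitive:
        text = text.lower()
    return text.split()


# Hypothetical training vocabulary that contains only the lowercase form.
known_values = {"burger"}

# With case_sensitive=False, "Burger" is tokenized as "burger" and matches.
tokens = whitespace_tokenize("I want a Burger", case_sensitive=False)
matched = [t for t in tokens if t in known_values]
print(matched)  # prints ['burger']
```

With the default `case_sensitive=True`, the token stays as "Burger" and would not match the lowercase training value, which is the problem described above.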
Blockers (if relevant): Not sure.
Definition of Done:
- Add the case_sensitive option to the WhitespaceTokenizer.
- Test a model and ensure that when this option is specified in the pipeline configuration, a slot is filled regardless of case (that is, matching is case insensitive).
Issue Analytics
- State:
- Created 4 years ago
- Comments: 12 (9 by maintainers)
Top GitHub Comments
Thank you everyone! I have tested this functionality and it is working as intended.
Closed the PR comments