Add case_sensitive option to WhitespaceTokenizer

Description of Problem: WhitespaceTokenizer does not have a case_sensitive option. This request follows a discussion with @Ghostvv on the Rasa Community Forum: https://forum.rasa.com/t/case-sensitivity/2541/8

Overview of the Solution: Add the case_sensitive option to WhitespaceTokenizer to allow models using this tokenizer to be case insensitive.

Examples (if relevant): If a user types “Burger” while the WhitespaceTokenizer is configured with case_sensitive: false, the slot should still be filled, even if the training data only contains “burger”.
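
To make the request concrete, here is a minimal, standalone sketch of what a whitespace tokenizer with such a flag could look like. It is not Rasa's implementation: the class name SimpleWhitespaceTokenizer, the Token dataclass, and the choice to keep character offsets from the original message are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str   # token text, lowercased when case_sensitive is False
    start: int  # character offset of the token in the original message


class SimpleWhitespaceTokenizer:
    """Hypothetical stand-in for illustration, not Rasa's WhitespaceTokenizer."""

    def __init__(self, case_sensitive: bool = True) -> None:
        self.case_sensitive = case_sensitive

    def tokenize(self, text: str) -> List[Token]:
        tokens = []
        # Offsets are taken from the original text, so lowercasing
        # cannot shift them.
        for match in re.finditer(r"\S+", text):
            word = match.group()
            if not self.case_sensitive:
                word = word.lower()
            tokens.append(Token(word, match.start()))
        return tokens


tokenizer = SimpleWhitespaceTokenizer(case_sensitive=False)
print([t.text for t in tokenizer.tokenize("I want a Burger")])
```

With case_sensitive left at True, the same call would keep “Burger” capitalized, which is the behavior the issue wants to make configurable.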

Blockers (if relevant): Not sure.

Definition of Done:

  • Add the case_sensitive option to the WhitespaceTokenizer
  • Train and test a model with this option set in the pipeline configuration, and confirm that the slot is filled regardless of the case of the user's input (that is, matching is case insensitive).
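
The second item could be checked along the lines of the toy test below. This is only a sketch under stated assumptions: the fill_slot helper and the dictionary-style TOKENIZER_CONFIG are hypothetical, and a simple set lookup stands in for a trained model, so it illustrates the expected behavior rather than how Rasa extracts slots.

```python
from typing import Dict, Optional, Set

# Hypothetical pipeline entry mirroring the option proposed in the issue.
TOKENIZER_CONFIG: Dict[str, object] = {
    "name": "WhitespaceTokenizer",
    "case_sensitive": False,
}


def fill_slot(
    message: str, known_values: Set[str], config: Dict[str, object]
) -> Optional[str]:
    """Return the first token that matches a known slot value, if any."""
    # Assumed default: case sensitive unless the option says otherwise.
    case_sensitive = bool(config.get("case_sensitive", True))
    for token in message.split():
        candidate = token if case_sensitive else token.lower()
        if candidate in known_values:
            return candidate
    return None


def test_slot_filled_regardless_of_case() -> None:
    training_values = {"burger"}  # training data only contains lowercase
    assert fill_slot("Burger", training_values, TOKENIZER_CONFIG) == "burger"
    assert fill_slot("BURGER please", training_values, TOKENIZER_CONFIG) == "burger"

    # With the option switched on, the capitalized input no longer matches.
    strict = {**TOKENIZER_CONFIG, "case_sensitive": True}
    assert fill_slot("Burger", training_values, strict) is None


test_slot_filled_regardless_of_case()
```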

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 12 (9 by maintainers)

Top GitHub Comments

ccelotto commented, Jul 21, 2019 (5 reactions)

Thank you everyone! I have tested this functionality and it is working as intended.

sibbsnb commented, Jul 17, 2019 (1 reaction)

Closed the PR comments


