Cannot configure token_pattern for CountVectorsFeaturizer in config.yml?
Rasa version: 1.1.2
Rasa X version (if used & relevant):
Python version: 3.7.3
Operating system (windows, osx, …):
Issue: In config.yml I want to set token_pattern: r'(?u)\b\w+\b' for Chinese under the CountVectorsFeaturizer component, but it doesn't work.
CountVectorsFeaturizer reads my value r'(?u)\b\w+\b' as an ordinary string rather than as a regex, so the train method fails and falls into the exception branch:
try:
    # noinspection PyPep8Naming
    X = self.vectorizer.fit_transform(lem_exs).toarray()
except ValueError:
    self.vectorizer = None  # execution ends up here
    return
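A likely reason for landing in that branch, sketched with only the standard `re` module: YAML has no raw-string syntax, so the unquoted value `r'(?u)\b\w+\b'` is loaded as a plain string that still contains the leading `r` and both quote characters. Used as a regex, that string matches no token, and scikit-learn's CountVectorizer raises ValueError when the vocabulary comes out empty. The helper `tokens` and the sample sentence are illustrative assumptions, not Rasa code.

```python
import re
import warnings

# What the component receives for the unquoted YAML value r'(?u)\b\w+\b':
# the leading r and the two quote characters are part of the string.
from_yaml_raw_style = "r'(?u)\\b\\w+\\b'"

# What a YAML single-quoted value '(?u)\b\w+\b' yields: just the regex.
from_yaml_single_quoted = "(?u)\\b\\w+\\b"


def tokens(pattern, text):
    """Return all regex matches, or [] if the pattern is rejected."""
    with warnings.catch_warnings():
        # Older Python versions only warn about an inline flag such as (?u)
        # that is not at the start of the pattern.
        warnings.simplefilter("ignore")
        try:
            return re.findall(pattern, text)
        except re.error:
            # Newer Python versions refuse to compile such a pattern.
            return []


# Hypothetical tokenizer output joined with spaces.
text = "我 想 订 机票"

print(tokens(from_yaml_raw_style, text))      # no tokens: empty vocabulary
print(tokens(from_yaml_single_quoted, text))  # every token is found
```

With the first value nothing is extracted from any training example, which is consistent with the ValueError branch shown above.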
Error (including full traceback):
Command or request that led to error:
Content of configuration file (config.yml) (if relevant):
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: "zh"
pipeline:
- name: "JiebaTokenizer"
  dictionary_path: "jieba_dict"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
  token_pattern: r'(?u)\b\w+\b'
- name: "EmbeddingIntentClassifier"
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
- name: MemoizationPolicy
- name: KerasPolicy
- name: MappingPolicy
- name: "FallbackPolicy"
  nlu_threshold: 0.3
  core_threshold: 0.3
  fallback_action_name: 'action_default_fallback'
Content of domain file (domain.yml) (if relevant):
Issue Analytics
- State:
- Created 4 years ago
- Comments:9 (6 by maintainers)
Top GitHub Comments
Thanks for the help on this @psds01!
@psds01 After changing it to token_pattern: "(?u)\b\w+\b" in config.yml, it seems OK now.
Thank you very much :)
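For context on why this pattern matters for Chinese, here is a minimal sketch using only the standard `re` module. It compares scikit-learn's documented default token_pattern, which requires at least two word characters, with the pattern from this issue, which accepts one. The sample sentence is a made-up example, and it assumes the tokenizer output reaches the vectorizer as space-joined text.

```python
import re

# scikit-learn's documented default: a token needs two or more word characters.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"

# The pattern from this issue: a token needs one or more word characters.
CUSTOM_PATTERN = r"(?u)\b\w+\b"

# Hypothetical Jieba output joined with spaces (an assumption for illustration).
sentence = "我 想 订 机票"

default_tokens = re.findall(DEFAULT_PATTERN, sentence)
custom_tokens = re.findall(CUSTOM_PATTERN, sentence)

print(default_tokens)  # single-character words are dropped
print(custom_tokens)   # single-character words are kept
```

Single-character words are common in Chinese, so the default pattern can discard much of a sentence. When writing the pattern in config.yml, YAML single quotes are a safe choice because backslashes stay literal inside them, whereas inside YAML double quotes a backslash starts an escape sequence.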