Possible bug with punctuation in a text
Hi guys, when training the MITIE model with the following sample:
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "Hi, thank you for your interest, the last price I can go for it is 200,000.",
        "intent": "new_offer",
        "entities": [
          {
            "start": 0,
            "end": 2,
            "value": "Hi",
            "entity": "greeting"
          },
          {
            "start": 67,
            "end": 74,
            "value": "200,000",
            "entity": "price"
          }
        ]
      }
    ]
  }
}
I am getting this error:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 65, in <module>
    do_train(config)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 59, in do_train
    trainer.train(training_data)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 25, in train
    self.entity_extractor = self.train_entity_extractor(data.entity_examples)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 42, in train_entity_extractor
    start, end = self.start_and_end(tokens, val_tokens)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 31, in start_and_end
    start, end = locs[0], locs[0] + len(entity_tokens)
IndexError: list index out of range
I am not sure how to fix this, but I am positive it is related to the punctuation in the text string: when I remove the final '.' from the "text" field of the training sample, training works. Thanks for your amazing product!
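A minimal sketch of why the trailing period could produce this IndexError. The token list is copied from the warning quoted elsewhere in this thread; `find_token_span` is a hypothetical stand-in for the lookup in `start_and_end`, not the actual rasa_nlu code. Because the tokenizer keeps `200,000.` as a single token, an exact-match search for the entity token `200,000` finds nothing, and indexing the empty result list would fail.

```python
# Tokens exactly as reported in the warning quoted in this thread.
tokens = ["Hi", ",", "thank", "you", "for", "your", "interest", ",",
          "the", "last", "price", "I", "can", "go", "for", "it", "is",
          "200,000."]


def find_token_span(tokens, entity_tokens):
    """Hypothetical stand-in for the trainer's lookup.

    Returns (start, end) token indices of the first exact match,
    or None when the entity tokens do not occur in the token list.
    """
    n = len(entity_tokens)
    locs = [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == entity_tokens]
    if not locs:
        # Reading locs[0] without this guard is what raises IndexError.
        return None
    return locs[0], locs[0] + n


print(find_token_span(tokens, ["Hi"]))       # (0, 1)
print(find_token_span(tokens, ["200,000"]))  # None: the token is "200,000."
```

Returning `None` instead of indexing blindly is only one possible design; the point is that the mismatch comes from tokenization, not from the character offsets in the sample.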
Issue Analytics
- State:
- Created 7 years ago
- Comments:5 (3 by maintainers)
I am now just getting a warning:
[u'Hi', u',', u'thank', u'you', u'for', u'your', u'interest', u',', u'the', u'last', u'price', u'I', u'can', u'go', u'for', u'it', u'is', u'200,000.']. Entities must span whole tokens.
But what actually happens with this sample? Is it ignored?

No, such samples are not ignored; what happens to them is not defined. It depends on the entity extractor (MITIE / spaCy).
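Since the behaviour for such samples is undefined, it may help to find them before training. The sketch below is one way to do that, assuming you already have the token list for each example (here copied from the warning above). The helpers `token_spans` and `misaligned_entities` are hypothetical names, not part of rasa_nlu; they simply check whether each entity's `start` and `end` offsets coincide with token boundaries.

```python
def token_spans(text, tokens):
    """Map each token to its (start, end) character offsets in text."""
    spans, cursor = [], 0
    for tok in tokens:
        start = text.index(tok, cursor)  # search from the previous token's end
        end = start + len(tok)
        spans.append((start, end))
        cursor = end
    return spans


def misaligned_entities(text, tokens, entities):
    """Return the entities whose offsets do not fall on token boundaries."""
    spans = token_spans(text, tokens)
    starts = {s for s, _ in spans}
    ends = {e for _, e in spans}
    return [ent for ent in entities
            if ent["start"] not in starts or ent["end"] not in ends]


text = ("Hi, thank you for your interest, "
        "the last price I can go for it is 200,000.")
tokens = ["Hi", ",", "thank", "you", "for", "your", "interest", ",",
          "the", "last", "price", "I", "can", "go", "for", "it", "is",
          "200,000."]
entities = [
    {"start": 0, "end": 2, "value": "Hi", "entity": "greeting"},
    {"start": 67, "end": 74, "value": "200,000", "entity": "price"},
]

bad = misaligned_entities(text, tokens, entities)
print([ent["entity"] for ent in bad])  # ['price']
```

Here the `price` entity ends at offset 74, while the token `200,000.` ends at 75, so it is flagged. As noted in the original report, removing the trailing period from the text makes the entity line up with a whole token.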