Possible bug with punctuation in a text

Hi guys, when training the MITIE model with the following sample:

{
  "rasa_nlu_data": {
    "common_examples": [
     
      {
        "text": "Hi, thank you for your interest, the last price I can go for it is 200,000.",
        "intent": "new_offer",
        "entities": [
          {
            "start": 0,
            "end": 2,
            "value": "Hi",
            "entity": "greeting"
          },
          {
            "start": 67,
            "end": 74,
            "value": "200,000",
            "entity": "price"
          }
        ]
      }
    ]
  }
}

I am getting this error:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 65, in <module>
    do_train(config)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 59, in do_train
    trainer.train(training_data)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 25, in train
    self.entity_extractor = self.train_entity_extractor(data.entity_examples)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 42, in train_entity_extractor
    start, end = self.start_and_end(tokens, val_tokens)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 31, in start_and_end
    start, end = locs[0], locs[0] + len(entity_tokens)
IndexError: list index out of range

I am not sure how to fix this, but I am positive it is related to the punctuation in the text string, because when I remove the final '.' from the training sample's text field, training works. Thanks for your amazing product!
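A minimal sketch of why the lookup in the traceback can come back empty. It assumes `start_and_end` searches the sentence's token list for an exact match of the entity's tokens; only the final `locs[0]` line is visible in the traceback, so the search itself is a guess. The token list is the one printed in the warning quoted further down this thread, and `find_token_span` is a hypothetical stand-in, not the rasa_nlu function.

```python
# Token list copied from the warning quoted in this thread (as plain str).
TOKENS = ["Hi", ",", "thank", "you", "for", "your", "interest", ",", "the",
          "last", "price", "I", "can", "go", "for", "it", "is", "200,000."]


def find_token_span(tokens, entity_tokens):
    """Hypothetical stand-in for start_and_end (search logic is assumed).

    Returns (start, end) token indices, or None when there is no exact match.
    """
    n = len(entity_tokens)
    locs = [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == entity_tokens]
    if not locs:
        # An unguarded `locs[0]`, as in the traceback, raises IndexError here.
        return None
    return locs[0], locs[0] + n


greeting_span = find_token_span(TOKENS, ["Hi"])
price_span = find_token_span(TOKENS, ["200,000"])
print(greeting_span)  # "Hi" is a token of its own, so it is found
print(price_span)     # "200,000" only exists as part of "200,000."
```

Under that assumption, "Hi" matches because the comma after it is split into a separate token, while "200,000" never matches because the trailing period stays attached to the last token.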

Issue Analytics

Created 7 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

tmbo commented, Mar 14, 2017

No, they are not ignored; what happens to them is not defined. It depends on the entity extractor (MITIE / spaCy).

mikkelam commented, Mar 9, 2017

I am now just getting a warning:

[u'Hi', u',', u'thank', u'you', u'for', u'your', u'interest', u',', u'the', u'last', u'price', u'I', u'can', u'go', u'for', u'it', u'is', u'200,000.'].Entities must span whole tokens.

But what actually happens with this sample? Is it ignored?
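For anyone who wants to find such samples before training, here is a rough sketch of a check for the "Entities must span whole tokens" condition. It is not rasa_nlu code: `token_offsets` and `misaligned_entities` are hypothetical helpers, and the token list is assumed to be whatever the tokenizer in use returns (here, the list from the warning above).

```python
TEXT = ("Hi, thank you for your interest, "
        "the last price I can go for it is 200,000.")
TOKENS = ["Hi", ",", "thank", "you", "for", "your", "interest", ",", "the",
          "last", "price", "I", "can", "go", "for", "it", "is", "200,000."]
ENTITIES = [
    {"start": 0, "end": 2, "value": "Hi", "entity": "greeting"},
    {"start": 67, "end": 74, "value": "200,000", "entity": "price"},
]


def token_offsets(text, tokens):
    """Return (start, end) character offsets for each token, in order."""
    offsets, cursor = [], 0
    for tok in tokens:
        start = text.index(tok, cursor)  # raises ValueError if tok is missing
        end = start + len(tok)
        offsets.append((start, end))
        cursor = end
    return offsets


def misaligned_entities(text, tokens, entities):
    """Return the entities whose start or end is not on a token boundary."""
    offsets = token_offsets(text, tokens)
    starts = {s for s, _ in offsets}
    ends = {e for _, e in offsets}
    return [e for e in entities
            if e["start"] not in starts or e["end"] not in ends]


bad = misaligned_entities(TEXT, TOKENS, ENTITIES)
print([e["entity"] for e in bad])

# Same sample without the final period, as tried in the original report.
fixed = misaligned_entities(TEXT[:-1], TOKENS[:-1] + ["200,000"], ENTITIES)
print(fixed)
```

With the tokens from the warning, only the price entity is flagged, because its end offset (74) falls inside the token "200,000." rather than on its boundary. Dropping the final period makes both entities line up, which matches the behaviour described in the report.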
