Possible bug with punctuation in a text

Hi guys, when training the MITIE model with the following sample:

{
  "rasa_nlu_data": {
    "common_examples": [
     
      {
        "text": "Hi, thank you for your interest, the last price I can go for it is 200,000.",
        "intent": "new_offer",
        "entities": [
          {
            "start": 0,
            "end": 2,
            "value": "Hi",
            "entity": "greeting"
          },
          {
            "start": 67,
            "end": 74,
            "value": "200,000",
            "entity": "price"
          }
        ]
      }
    ]
  }
}

I am getting this error:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/Cellar/python/2.7.12_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 65, in <module>
    do_train(config)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/train.py", line 59, in do_train
    trainer.train(training_data)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 25, in train
    self.entity_extractor = self.train_entity_extractor(data.entity_examples)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 42, in train_entity_extractor
    start, end = self.start_and_end(tokens, val_tokens)
  File "/usr/local/lib/python2.7/site-packages/rasa_nlu/trainers/mitie_trainer.py", line 31, in start_and_end
    start, end = locs[0], locs[0] + len(entity_tokens)
IndexError: list index out of range

I am not sure how to fix this, but I am positive it is related to the punctuation in the text string, because when I remove the final '.' from the text field of the training sample, it works. Thanks for your amazing product!
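The failure can be reproduced outside the trainer. The sketch below is not the rasa_nlu source: `find_sublist` is a hypothetical helper that assumes `locs` is built by looking for the entity's tokens as a contiguous run inside the sentence tokens. The token list is the one printed in the warning quoted in the comments further down, where the tokenizer keeps the trailing period attached as `200,000.`. The token list without the period is an assumption about what the tokenizer would emit.

```python
def find_sublist(tokens, entity_tokens):
    """Return every index where entity_tokens occurs as a contiguous run in tokens."""
    n = len(entity_tokens)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == entity_tokens]


# Tokens as printed in the warning later in this thread: the final "." stays
# glued to the number, so the last token is "200,000." rather than "200,000".
tokens = ["Hi", ",", "thank", "you", "for", "your", "interest", ",", "the",
          "last", "price", "I", "can", "go", "for", "it", "is", "200,000."]

# Assumed tokenization of the entity value "200,000".
entity_tokens = ["200,000"]

locs = find_sublist(tokens, entity_tokens)
print(locs)  # empty list: "200,000" never equals "200,000."

try:
    start, end = locs[0], locs[0] + len(entity_tokens)
except IndexError as exc:
    print("IndexError:", exc)

# Assumption: without the trailing period the last token would be "200,000".
tokens_without_period = tokens[:-1] + ["200,000"]
print(find_sublist(tokens_without_period, entity_tokens))
```

Under these assumptions the lookup returns no matches for the original sentence, so indexing `locs[0]` fails in the same way as the last frame of the traceback, while the sentence without the period yields a match.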

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
tmbo commented, Mar 14, 2017

No, they are not ignored; what happens to them is not defined. It depends on the entity extractor (MITIE / spaCy).

0 reactions
mikkelam commented, Mar 9, 2017

I am now just getting a warning:

[u'Hi', u',', u'thank', u'you', u'for', u'your', u'interest', u',', u'the', u'last', u'price', u'I', u'can', u'go', u'for', u'it', u'is', u'200,000.'].Entities must span whole tokens.

but what actually happens with this sample? Is it ignored?
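To find such samples before training, the character offsets of each entity can be compared against token boundaries. This is a minimal sketch, not the check rasa_nlu performs: `token_spans` and `spans_whole_tokens` are hypothetical helpers, and the token list is copied from the warning above.

```python
def token_spans(text, tokens):
    """Map each token to its (start, end) character offsets, scanning left to right."""
    spans, pos = [], 0
    for tok in tokens:
        start = text.find(tok, pos)
        if start == -1:
            raise ValueError("token %r not found in text" % tok)
        spans.append((start, start + len(tok)))
        pos = start + len(tok)
    return spans


def spans_whole_tokens(spans, start, end):
    """True if [start, end) begins at a token start and finishes at a token end."""
    return start in {s for s, _ in spans} and end in {e for _, e in spans}


text = "Hi, thank you for your interest, the last price I can go for it is 200,000."
tokens = ["Hi", ",", "thank", "you", "for", "your", "interest", ",", "the",
          "last", "price", "I", "can", "go", "for", "it", "is", "200,000."]
spans = token_spans(text, tokens)

# Entity offsets taken from the training sample in the original post.
for name, start, end in [("greeting", 0, 2), ("price", 67, 74)]:
    print(name, spans_whole_tokens(spans, start, end))
```

With this tokenization the `greeting` entity lines up with the token `Hi`, but the `price` entity ends at offset 74 while the token `200,000.` ends at 75, which is the misalignment the warning reports.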
