EntityExtractor().setRequireSentences(True) throws error.

EntityExtractor().setRequireSentences(True)

throws this error:

Py4JJavaError: An error occurred while calling o461.getParam.
: java.util.NoSuchElementException: Param requireSentences does not exist.

From the EntityExtractor.scala file, I found the sentences parameter defined on lines 99 to 105, but when I changed lines 191 and 192 in annotator.py from:

def setRequireSentences(self, value):
    return self._set(requireSentences=value)

to:

def setRequireSentences(self, value):
    return self._set(sentences=value)

the error was essentially the same:

Py4JJavaError: An error occurred while calling o474.getParam.
: java.util.NoSuchElementException: Param sentences does not exist.

Issue Analytics

Created: 6 years ago
Comments: 14 (8 by maintainers)

Top GitHub Comments

saif-ellafi commented, Jan 31, 2018

We will update the website ASAP as well. setInsideSentences decides whether matching respects sentence boundaries. It is set to True by default.

For example, suppose you search for the text ‘This is a sentence. This is another’. If InsideSentences is True, it will never match, because the target spans two different sentences. In most cases, targets do not span sentences.
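The behaviour described in this comment can be illustrated with a small plain-Python sketch. It does not use Spark NLP: split_sentences and phrase_matches are hypothetical helpers, and the sentence splitter is deliberately naive. It only shows why a phrase that crosses a sentence boundary cannot match when matching is restricted to single sentences.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def phrase_matches(text, phrase, inside_sentences=True):
    """Return True if `phrase` occurs in `text`.

    With inside_sentences=True the phrase must fit entirely inside one sentence,
    mirroring the idea behind setInsideSentences (not its actual implementation).
    """
    if inside_sentences:
        return any(phrase in sentence for sentence in split_sentences(text))
    return phrase in text


text = "This is a sentence. This is another one."
phrase = "This is a sentence. This is another"

crosses_boundary_bounded = phrase_matches(text, phrase, inside_sentences=True)
crosses_boundary_unbounded = phrase_matches(text, phrase, inside_sentences=False)
single_sentence_bounded = phrase_matches(text, "This is another", inside_sentences=True)
```

A target that lives inside one sentence, such as "This is another", is found either way; only targets that straddle a boundary are affected by the setting.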

Feel free to share feedback anytime

saif-ellafi commented, Jan 31, 2018

@sofianeh Wow, we found the bug. Actually, it is not a bug; it's a bad design decision. We normalize the input file you provide (testfile.txt), which lowercases all of its content. Hence it does not match ‘Hello World’, but it would match ‘hello world’.

We’ll fix this quickly: the input will not be normalized unless the user uses RecursivePipelines (a new Pipeline object that allows transforming the input with the same Pipeline the user is using).

For now, your pipeline will work if you include a normalizer:

normalizer = Normalizer()\
  .setInputCols(["token"])\
  .setOutputCol("normal")
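As a rough illustration of the mismatch described in this comment, here is a plain-Python sketch. It does not call Spark NLP: load_phrases and find_matches are hypothetical stand-ins, and lowercasing the document stands in for normalizing it. The assumption is only that the phrase file is lowercased on load while the document text keeps its original case.

```python
def load_phrases(lines, normalize=True):
    """Hypothetical loader: strips each line and lowercases it when normalize=True."""
    phrases = [line.strip() for line in lines if line.strip()]
    return [p.lower() for p in phrases] if normalize else phrases


def find_matches(document, phrases):
    """Case-sensitive substring search for each phrase in the document."""
    return [p for p in phrases if p in document]


phrases_file = ["Hello World"]  # stand-in for the contents of testfile.txt
document = "I said Hello World to everyone."

normalized_phrases = load_phrases(phrases_file)

# The phrases were lowercased but the document was not, so nothing matches.
matches_raw_document = find_matches(document, normalized_phrases)

# Normalizing the document side as well puts both sides in the same form.
matches_normalized_document = find_matches(document.lower(), normalized_phrases)
```

This is why adding a normalizer to the pipeline works as an interim fix: the tokens being searched end up in the same form as the already-normalized phrases.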

thank you
