spaCy NER scorer returns all zeros and duplicate labels

Hi all,

I am trying to get scores on my test set using Scorer. Here is my simple code:

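from spacy.gold import GoldParse  # spaCy 2.x API
from spacy.scorer import Scorer

# `nlp` is the trained pipeline, loaded beforehand.
test_set = [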
    ('Your 40577 is finished', [(5,10,'LABEL1')] ),
    ('Finished with SODE20915', [(14,23,'LABEL2')] )
]

scorer = Scorer()
for text, annot in test_set:
    doc_gold_text = nlp(text)  # I also tried: doc_gold_text = nlp.make_doc(text)
    gold = GoldParse(doc_gold_text, entities=annot)
    pred_value = nlp(text)
    # print(gold.words, gold.tags, gold.labels, gold.ner)
    scorer.score(pred_value, gold)
print(scorer.scores)

And what I am getting as output is all zeros and duplicate LABEL1 as ‘LABEL1’ and “‘LABEL1’”:

{'uas': 0.0, 'las': 0.0, 'las_per_type': {'': {'p': 0.0, 'r': 0.0, 'f': 0.0}}, 'ents_p': 0.0, 'ents_r': 0.0, 'ents_f': 0.0, 'ents_per_type': {"'LABEL1'": {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'LABEL1': {'p': 0.0, 'r': 0.0, 'f': 0.0},"'LABEL2'": {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'LABEL2': {'p': 0.0, 'r': 0.0, 'f': 0.0}}, 'tags_acc': 0.0, 'token_acc': 100.0, 'textcat_score': 0.0, 'textcats_per_cat': {}}

I don’t understand why I have duplicate labels. Also, the model made correct predictions in both sentences, yet the P, R and F1 scores are all 0. Maybe I missed something?

When I print GoldParse object (uncomment print line in the loop) I am getting this:

['Your', '40577','is','finished'] [None, None,None,None] [None, None,None,None] ['O', 'U-CNTNUM','O','O']
['Finished ', 'with', '41735'] [None, None, None] [None, None, None] ['O', 'O', 'U-CNTNUM']

I checked my training data and it seems OK. Here is an example of the training data:

[('This paper BIME 13935 will be done soon', {'entities': [(11, 21, 'LABEL2')]}), 
('What is status of 43391', {'entities': [(18, 23, 'LABEL1')]})]
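One quick way to double-check annotations like these is to slice each text with its character offsets and look at what comes out. The sketch below is plain Python and does not call spaCy; `entity_texts` is a hypothetical helper name introduced only for illustration.

```python
# Sanity check for spaCy-style training examples: each (start, end, label)
# should slice out exactly the intended entity text.
train_data = [
    ("This paper BIME 13935 will be done soon", {"entities": [(11, 21, "LABEL2")]}),
    ("What is status of 43391", {"entities": [(18, 23, "LABEL1")]}),
]


def entity_texts(examples):
    """Return (label, substring) pairs, one per annotated span (hypothetical helper)."""
    pairs = []
    for text, annotations in examples:
        for start, end, label in annotations["entities"]:
            pairs.append((label, text[start:end]))
    return pairs


for label, span_text in entity_texts(train_data):
    # Leading or trailing whitespace in a span usually means the offsets are off by one.
    flag = "" if span_text == span_text.strip() else "  <-- check offsets"
    print(f"{label}: {span_text!r}{flag}")
```

For the two examples above, the slices are 'BIME 13935' and '43391', so the offsets line up with the intended entities.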

The model is trained on GPU using minibatches.

Your Environment

Operating System: Linux Ubuntu
Python Version Used: python 3.7
spaCy Version Used: spacy 2.2.4
Environment Information: GPU

Top GitHub Comments

ivangru90 commented, Apr 21, 2020

Oh I missed that, now it works. Great, thank you!

adrianeboyd commented, Apr 21, 2020

Thanks, a full example makes this much easier to troubleshoot!

This may not be the same problem as in your original code where you don’t have the same mistake, but in the example above the mistake is here:

scorer.score(doc_gold_text, gold)

You’re giving the scorer the blank tokens-only document instead of the annotated one. You need to have:

scorer.score(pred_value, gold)

With this change you can see non-zero scores for the demo example:

'ents_p': 50.0, 'ents_r': 50.0, 'ents_f': 50.0, 'ents_per_type': {'LOC': {'p': 50.0, 'r': 100.0, 'f': 66.66666666666666}, 'PERSON': {'p': 0.0, 'r': 0.0, 'f': 0.0}}
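To see why scoring the tokens-only document gives all zeros, here is a simplified sketch of exact-match entity scoring. It is an illustration in plain Python, not spaCy's actual `Scorer` implementation; the function name `prf` is an assumption, and the gold span is taken from the second test sentence in the question.

```python
def prf(predicted, gold):
    """Exact-match precision/recall/F1 (in percent) over (start, end, label) spans."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # spans where offsets and label both match
    precision = 100.0 * tp / len(predicted) if predicted else 0.0
    recall = 100.0 * tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"p": precision, "r": recall, "f": f1}


# Gold annotation for 'Finished with SODE20915'
gold_spans = [(14, 23, "LABEL2")]

# Case 1: the document handed to the scorer carries no entities (tokens only),
# so nothing can match and every score is 0.
print(prf([], gold_spans))  # {'p': 0.0, 'r': 0.0, 'f': 0.0}

# Case 2: the predicted document carries the matching entity.
print(prf([(14, 23, "LABEL2")], gold_spans))  # {'p': 100.0, 'r': 100.0, 'f': 100.0}
```

The point of the sketch is only that the scores depend on the entities attached to the document you pass in, which is why the predicted document, not the blank one, has to go to `scorer.score`.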
