
None and false prediction score when evaluating my custom model using GoldParse

See original GitHub issue

Hello there,

After training my custom spaCy NER model, I wanted to calculate the prediction scores, but I get the same all-zero result for every prediction:

from spacy.scorer import Scorer
from spacy.gold import GoldParse

scorer = Scorer()
# tokenize the raw text so the gold annotations share the model's tokenization
doc_gold_text = model.make_doc(labeled_data[1][0])
gold = GoldParse(doc_gold_text, labeled_data[1][1]['entities'])
# run the full pipeline to get the predicted entities
pred_value = model(labeled_data[1][0])
scorer.score(pred_value, gold)
print(model.evaluate([(pred_value, gold)]).scores)

gives me:

{'uas': 0.0, 'las': 0.0, 'las_per_type': {'': {'p': 0.0, 'r': 0.0, 'f': 0.0}}, 'ents_p': 0.0, 'ents_r': 0.0, 'ents_f': 0.0, 'ents_per_type': {}, 'tags_acc': 0.0, 'token_acc': 100.0, 'textcat_score': 0.0, 'textcats_per_cat': {}}
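
For reference, a minimal sketch of the usual spaCy v2 scoring pattern (assuming labeled_data is a list of (text, {'entities': [(start, end, label), ...]}) tuples, and passing the offsets via the entities keyword; evaluate_ner is just an illustrative helper name):

from spacy.scorer import Scorer
from spacy.gold import GoldParse

def evaluate_ner(ner_model, examples):
    # examples: list of (text, {'entities': [(start, end, label), ...]}) tuples
    scorer = Scorer()
    for text, annot in examples:
        doc_gold_text = ner_model.make_doc(text)
        gold = GoldParse(doc_gold_text, entities=annot['entities'])
        pred_value = ner_model(text)
        scorer.score(pred_value, gold)
    return scorer.scores

print(evaluate_ner(model, labeled_data)['ents_per_type'])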

Looking at the GoldParse heads, I get a list of None values: print(gold.heads) gives [None, None, None, ...]. My test input is:

(text1,{'entities': [(0,31, 'Titre'),
                (98,106, 'Label'),(122,155, 'Valeur'),
                (193,210, 'Label'),(217,224, 'Valeur'),
                (291,295, 'Label'),(315,354, 'Valeur'),
                (422,428, 'Label'),(446,466, 'Valeur'),
                (504,520, 'Label'),(528,542, 'Valeur'),
                (608,623, 'Label'),(632,637, 'Valeur'),
                (675,687, 'Label'),(699,704, 'Valeur'),
                (768,785, 'Label'),(792,807, 'Valeur'),
                (845,860, 'Label'),(869,884, 'Valeur'),
                (954,975, 'Label'),(978,993, 'Valeur'),
                (1059,1074, 'Label'),(1083,1087, 'Valeur'),
                (1197,1203, 'Label'),(1221,1224, 'Valeur'),
                (3301,3323, 'Label'),(3325,3361, 'Valeur'),
                (3428,3445, 'Label'),(3452,3502, 'Valeur'),
                (4748,4760, 'Label'),(4772,4819, 'Valeur'),
                (8361,8370, 'Label'),(8384,8442, 'Valeur'),
                (8514,8521, 'Label'),(8538,8549, 'Valeur'),
                (8618,8627, 'Label'),(8641,8692, 'Valeur')]})
  • Does anyone have an idea about that? (See also the alignment-check sketch below.)

  • And I was wondering: is there a training parameter that makes the NER model adjust its weights according to a validation set?
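
One common reason for all-zero entity scores is that the character offsets in the annotations do not line up with token boundaries, in which case the gold entities are silently dropped. A quick way to rule that out (a sketch, reusing model and labeled_data from above):

from spacy.gold import biluo_tags_from_offsets

doc = model.make_doc(labeled_data[1][0])
tags = biluo_tags_from_offsets(doc, labeled_data[1][1]['entities'])
# '-' marks tokens whose entity offsets do not match token boundaries;
# such entities are dropped from the gold standard
print(tags.count('-'))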

Your Environment

  • Operating System: Darwin-19.3.0-x86_64-i386-64bit
  • Python Version Used: 3.7.6
  • spaCy Version Used: 2.2.4

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
Todaime commented, Apr 17, 2020

@svlandeg Thank you for your quick answer ! 😉

  • To train my model I used:

import random
import spacy

nlp = spacy.blank('xx')  # create a blank model

if 'ner' not in nlp.pipe_names:
    ner = nlp.create_pipe('ner')
    nlp.add_pipe(ner, last=True)
else:
    ner = nlp.get_pipe('ner')

# register every entity label seen in the training data
for _, annotations in data_set:
    for ent in annotations.get('entities'):
        ner.add_label(ent[2])

other_pipe = [pipe for pipe in nlp.pipe_names if pipe != 'ner']

with nlp.disable_pipes(*other_pipe):
    optimizer = nlp.begin_training()

compteur_iteration = 0  # iteration counter
for ite in range(nb_iteration):
    compteur_iteration += 1
    random.shuffle(train_set)
    losses = {}
    for text, annotation in train_set:
        nlp.update(
            [text],
            [annotation],
            drop=0.3,
            sgd=optimizer,
            losses=losses,
        )

So the output is the trained nlp model 😃
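
On the validation-set question above: this training loop has no built-in parameter for that, since nlp.update only uses the training examples. A common workaround is to score a held-out dev set after each iteration and keep the weights that do best on it. A rough sketch, assuming a dev_set list in the same (text, {'entities': [...]}) format (score_dev is just an illustrative helper):

from spacy.scorer import Scorer
from spacy.gold import GoldParse

def score_dev(nlp, dev_set):
    scorer = Scorer()
    for text, annotations in dev_set:
        gold = GoldParse(nlp.make_doc(text), entities=annotations['entities'])
        scorer.score(nlp(text), gold)
    return scorer.scores

best_f = 0.0
for ite in range(nb_iteration):
    random.shuffle(train_set)
    losses = {}
    for text, annotation in train_set:
        nlp.update([text], [annotation], drop=0.3, sgd=optimizer, losses=losses)
    scores = score_dev(nlp, dev_set)
    if scores['ents_f'] > best_f:
        best_f = scores['ents_f']
        nlp.to_disk('best_model')  # keep the weights that score best on the dev set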

@adrianeboyd, I added the entities param name (it doesn’t change the metric values), and gold.ner gives me this:

['B-Titre', 'I-Titre', 'I-Titre', 'I-Titre', 'L-Titre', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'U-Label', 'O', 'B-Valeur', 'I-Valeur', 'I-Valeur', 'I-Valeur', 'I-Valeur', 'I-Valeur', 'I-Valeur', 'I-Valeur', 'L-Valeur', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-Label', 'L-Label', 'O', 'U-Valeur', 'O', 'O', 'O', 'O', 'O', 'O',....
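
Since gold.ner looks correct, another simple sanity check (a minimal snippet, reusing pred_value from the first post) is whether the model predicts any entities at all; if pred_value.ents is empty, ents_p/ents_r/ents_f will all be 0.0 no matter what the gold annotations contain:

pred_value = model(labeled_data[1][0])
print(len(pred_value.ents))
print([(ent.text, ent.label_) for ent in pred_value.ents][:10])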

Again, thank you for your time 😃

Devly yours 💻

0 reactions
github-actions[bot] commented, Oct 31, 2021

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Read more comments on GitHub >

Top Results From Across the Web

Evaluation in a Spacy NER model - python - Stack Overflow
You can find different metrics including F-score, recall and precision in spaCy/scorer.py. This example shows how you can use it:
Read more >
Training Pipelines & Models · spaCy Usage Documentation
Train and update components on your own data and integrate custom models. spaCy's tagger, parser, text categorizer and many other components are powered...
Read more >
Evaluating Precision and Recall of NER - Prodigy Support
The model predicts named entities in the text in the evaluation dataset. Each entity predicted with a score above the threshold is compared ......
Read more >
NLP - Ocean Ode
Language Model - stats model for NLP tasks · Basic Cleaning by SpaCy · Tokenizing Text · POS tagging - tensorizer · Dependency...
Read more >
The and-or trick in python - The Kitchin Research Group
for value in ('', 0, None, [], (), {}, False): if value: print value, ... With the and operator, each argument is evaluated,...
Read more >
