
Training baseline scores vary despite random seeds fixed


I first raised this issue on the Prodigy support forum, but it's actually a spaCy issue.

I have been using Prodigy to train a 'textcat' model like so:

python -m prodigy train textcat my_annotations en_vectors_web_lg --output ./my_model

and I noticed that the baseline score varies hugely between runs (0.2-0.55). This is even more puzzling given that fix_random_seed(0) is called at the beginning of training.

I tracked these variations down to the model output. Here is a minimal example to reproduce this behaviour.

How to reproduce the behaviour

import spacy

component = 'textcat'
pipe_cfg = {"exclusive_classes": False}

for i in range(5):
    spacy.util.fix_random_seed(0)

    nlp = spacy.load('en_vectors_web_lg')

    example = ("Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.",
               {'cats': {'Label1': 1.0, 'Label2': 0.0, 'Label3': 0.0}})

    # Set up the component pipe
    nlp.add_pipe(nlp.create_pipe(component, config=pipe_cfg), last=True)
    pipe = nlp.get_pipe(component)
    for label in set(example[1]['cats']):
        pipe.add_label(label)

    # Set up training and the optimiser
    optimizer = nlp.begin_training(component_cfg={component: pipe_cfg})

    # Run one document through the textcat NN for scoring
    print(f"Scoring '{example[0]}'")
    print(f"Result: {pipe.model([nlp.make_doc(example[0])])}")

As far as I understand, calling fix_random_seed should produce the same output given a fixed seed and no weight updates. It does for the linear model, but not for the CNN model, if I read the model architecture correctly here: https://github.com/explosion/spaCy/blob/908dea39399bbc0c966c131796f339af5de54140/spacy/_ml.py#L708 So the output from the first half of the first layer stays the same for each iteration, but the second half does not.
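For reference, this is the contract I'd expect from a seed-fixing helper. A minimal sketch using only Python's stdlib random and NumPy, not spaCy's actual implementation (which, as I understand it, also seeds other backend RNGs), with a hypothetical fix_seeds helper:

```python
# Minimal sketch (not spaCy's implementation) of what a helper like
# spacy.util.fix_random_seed is expected to guarantee: once every RNG
# in play is reseeded, repeated draws are bit-identical.
import random
import numpy as np

def fix_seeds(seed=0):
    random.seed(seed)     # Python's stdlib RNG
    np.random.seed(seed)  # NumPy's global RNG

fix_seeds(0)
first = np.random.rand(3)
fix_seeds(0)
second = np.random.rand(3)
assert np.array_equal(first, second)  # identical draws given the same seed
```

The bug report above is exactly a violation of this contract: one part of the model's computation behaves like `first == second`, while another part still draws from an unseeded source.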

Your Environment

  • spaCy version: 2.2.4
  • Platform: Darwin-18.7.0-x86_64-i386-64bit
  • Python version: 3.7.7
  • thinc version: 7.4.0

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

2 reactions
svlandeg commented, Jul 9, 2020

Hi @michel-ds, we found the problem and resolved it in PR #5735. I added your specific test to the test suite and it now runs without error: https://github.com/explosion/spaCy/blob/develop/spacy/tests/regression/test_issue5551.py This will be fixed from spaCy 3.0 onwards.

1 reaction
michel-ds commented, Jul 10, 2020

Hi @svlandeg, I can confirm that I am getting identical numbers with the develop branch version of spaCy.

Scoring 'Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.'
Result: (array([[0.37729517, 0.7529206 , 0.46667254]], dtype=float32), <function forward.<locals>.backprop at 0x1149c64d0>)
Scoring 'Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.'
Result: (array([[0.37729517, 0.7529206 , 0.46667254]], dtype=float32), <function forward.<locals>.backprop at 0x1127b1c20>)
Scoring 'Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.'
Result: (array([[0.37729517, 0.7529206 , 0.46667254]], dtype=float32), <function forward.<locals>.backprop at 0x1149ddf80>)
Scoring 'Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.'
Result: (array([[0.37729517, 0.7529206 , 0.46667254]], dtype=float32), <function forward.<locals>.backprop at 0x113bf3560>)
Scoring 'Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.'
Result: (array([[0.37729517, 0.7529206 , 0.46667254]], dtype=float32), <function forward.<locals>.backprop at 0x1127b8a70>)

I had to use a blank model in the code snippet above (nlp = spacy.blank("en")), but I hope that didn't invalidate the results of my test.

Thanks for fixing! Looking forward to version 3.0.
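The verification above follows a generic determinism-check pattern: rerun the same seeded computation several times and assert every output matches the first. A spaCy-free sketch of that pattern, where a hypothetical seeded_run stands in for "reseed, rebuild the pipeline, score one document":

```python
import random

def seeded_run(seed=0):
    # Stand-in for: fix_random_seed(seed), rebuild the pipeline,
    # and score a single document.
    random.seed(seed)
    return [round(random.random(), 6) for _ in range(3)]

# Repeat the whole run several times, as the repro script does.
results = [seeded_run(0) for _ in range(5)]

# Every run must reproduce the first one exactly; any mismatch
# means some source of randomness escaped the seed.
assert all(r == results[0] for r in results)
```

This is essentially what the regression test linked above checks against the real textcat pipeline.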
