OSError: [E050] in nlp.initialize()

I am trying to train an entity linker but I am getting the error

OSError: [E050] Can't find model 'corpus/en_vectors'. It doesn't seem to be a Python package or a valid path to a data directory.

in initialize().

I’ve looked at the docs here, but am struggling to see why the error is occurring.
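The error message names two things the string could have been: an installed Python package or a valid path to a data directory. A quick way to see why 'corpus/en_vectors' satisfies neither is to check both conditions by hand. The sketch below uses only the standard library; `diagnose_model_reference` is a hypothetical helper written for this illustration, not part of spaCy, and spaCy's own lookup may differ in detail.

```python
import importlib.util
from pathlib import Path


def diagnose_model_reference(name: str) -> dict:
    """Report whether `name` looks like an installed package or an existing directory.

    Hypothetical helper for illustration only; it is not part of spaCy.
    """
    # A top-level package name must be a valid Python identifier, so a string
    # containing "/" (such as "corpus/en_vectors") can only be meant as a path.
    is_package = name.isidentifier() and importlib.util.find_spec(name) is not None
    path = Path(name)
    return {
        "name": name,
        "is_installed_package": is_package,
        # Relative paths are resolved against the current working directory.
        "resolved_path": str(path.resolve()),
        "is_existing_directory": path.is_dir(),
    }


if __name__ == "__main__":
    for ref in ("corpus/en_vectors", "en_core_web_lg"):
        print(diagnose_model_reference(ref))
```

If both checks come back false for 'corpus/en_vectors', the relative path most likely points at a directory that exists only on the machine where the pipeline was built, which would suggest the value comes from the pipeline's stored settings rather than from the script above.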

How to reproduce the behaviour

import random

import spacy
from spacy.kb import KnowledgeBase
from spacy.training import Example
from spacy.util import minibatch, compounding

nlp = spacy.load('en_core_web_lg')
# here I usually load my local vocab path, but the same error occurs without this
# nlp.vocab.from_disk(self.vocab_path)
nlp.vocab.vectors.name = "spacy_pretrained_vectors"

def create_kb(vocab):
    entity_vector_length = 300
    kb = KnowledgeBase(vocab=vocab, entity_vector_length=entity_vector_length)
    # here I usually load my local knowledge base, but the same error occurs if you don't add anything
    # kb.from_disk(self.kb_path)
    return kb

entity_linker = nlp.add_pipe("entity_linker")
entity_linker.set_kb(create_kb)

train_data = []
text_1 = "Russ Cochran his reprints include EC Comics."
dict_1 = {(0, 12): {"Q7381115": 1.0, "Q2146908": 0.0}}
train_data.append((text_1, {"links": dict_1}))
text_2 = "Russ Cochran has been publishing comic art."
dict_2 = {(0, 12): {"Q7381115": 1.0, "Q2146908": 0.0}}
train_data.append((text_2, {"links": dict_2}))
text_3 = "Russ Cochran captured his first major title with his son as caddie."
dict_3 = {(0, 12): {"Q7381115": 0.0, "Q2146908": 1.0}}
train_data.append((text_3, {"links": dict_3}))
text_4 = "Russ Cochran was a member of University of Kentucky's golf team."
dict_4 = {(0, 12): {"Q7381115": 0.0, "Q2146908": 1.0}}
train_data.append((text_4, {"links": dict_4}))

examples = []
for text, annotation in train_data:
    doc = nlp.make_doc(text)
    example = Example.from_dict(doc, annotation)
    examples.append(example)

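n_iter = 10  # assumed value; the original report does not define n_iter
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "entity_linker"]
with nlp.select_pipes(disable=other_pipes):
    optimizer = nlp.initialize()  # the OSError [E050] is raised here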
    for itn in range(n_iter):
        random.shuffle(examples)
        losses = {}
        batches = minibatch(examples, size=compounding(4.0, 32.0, 1.001))
        for batch in batches:
            nlp.update(
                batch, drop=0.2, losses=losses, sgd=optimizer,
            )

I’m getting this same issue when I try to load my own knowledge base and vocab. For this I thought maybe I needed to change the config file to point to the correct vectors location (which is “local_kb_path/vocab/vectors”), so I tried:

config = {"initialize": {"vectors": 'local_kb_path/vocab/vectors'}}
entity_linker = nlp.add_pipe("entity_linker", config=config)

but this gives ‘extra fields not permitted’.

Many thanks!

Your Environment

spaCy version: 3.0.0rc2
Platform: Darwin-18.6.0-x86_64-i386-64bit
Python version: 3.7.9
Pipelines: en_core_web_md (3.0.0a0), en_core_web_lg (3.0.0a0)

Top GitHub Comments

svlandeg commented, Jan 4, 2021

Sounds good! I’ll close this in the meantime, but feel free to reopen or open a new issue if you can’t get it to work. (for usage questions, you can also use our new discussion board btw! https://github.com/explosion/spaCy/discussions)

lizgzil commented, Jan 4, 2021

Thank you @svlandeg I will let you know how I get on with that advice 👍
