OSError: [E050] in nlp.initialize()

See original GitHub issue

I am trying to train an entity linker, but I am getting the following error in initialize():

OSError: [E050] Can't find model 'corpus/en_vectors'. It doesn't seem to be a Python package or a valid path to a data directory.

I’ve looked at the docs, but I’m struggling to see why the error is occurring.
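
A quick way to see where 'corpus/en_vectors' is coming from is to print the pipeline-level [initialize] block, since that is where nlp.initialize() resolves its settings from. Whether the stale path actually lives there is an assumption, but the check is cheap:

import spacy

nlp = spacy.load("en_core_web_lg")

# Show the settings nlp.initialize() will resolve. If this block contains a
# vectors entry such as "corpus/en_vectors" that does not exist on disk,
# initialization fails with E050.
print(nlp.config["initialize"])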

How to reproduce the behaviour

import random

import spacy
from spacy.kb import KnowledgeBase
from spacy.training import Example
from spacy.util import minibatch
from thinc.api import compounding

nlp = spacy.load('en_core_web_lg')
# here I usually load my local vocab path, but the same error occurs without this
# nlp.vocab.from_disk(self.vocab_path)
nlp.vocab.vectors.name = "spacy_pretrained_vectors"

def create_kb(vocab):
    entity_vector_length = 300
    kb = KnowledgeBase(vocab=vocab, entity_vector_length=entity_vector_length)
    # here I usually load my local knowledge base, but the same error occurs if you don't add anything
    # kb.from_disk(self.kb_path)
    return kb

entity_linker = nlp.add_pipe("entity_linker")
entity_linker.set_kb(create_kb)

train_data = []
text_1 = "Russ Cochran his reprints include EC Comics."
dict_1 = {(0, 12): {"Q7381115": 1.0, "Q2146908": 0.0}}
train_data.append((text_1, {"links": dict_1}))
text_2 = "Russ Cochran has been publishing comic art."
dict_2 = {(0, 12): {"Q7381115": 1.0, "Q2146908": 0.0}}
train_data.append((text_2, {"links": dict_2}))
text_3 = "Russ Cochran captured his first major title with his son as caddie."
dict_3 = {(0, 12): {"Q7381115": 0.0, "Q2146908": 1.0}}
train_data.append((text_3, {"links": dict_3}))
text_4 = "Russ Cochran was a member of University of Kentucky's golf team."
dict_4 = {(0, 12): {"Q7381115": 0.0, "Q2146908": 1.0}}
train_data.append((text_4, {"links": dict_4}))

examples = []
for text, annotation in train_data:
    doc = nlp.make_doc(text)
    example = Example.from_dict(doc, annotation)
    examples.append(example)

n_iter = 10  # arbitrary number of training iterations
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "entity_linker"]
with nlp.select_pipes(disable=other_pipes):
    optimizer = nlp.initialize()
    for itn in range(n_iter):
        random.shuffle(examples)
        losses = {}
        batches = minibatch(examples, size=compounding(4.0, 32.0, 1.001))
        for batch in batches:
            nlp.update(
                batch, drop=0.2, losses=losses, sgd=optimizer,
            )
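
For reference, a knowledge base for this kind of setup is typically populated with the KnowledgeBase API (add_entity / add_alias) before being saved to disk. The sketch below uses a placeholder helper name, entity IDs, frequencies, and zero vectors rather than real data, so it only illustrates the shape of the input:

from spacy.kb import KnowledgeBase

def build_demo_kb(vocab):
    # Two candidate entities for the alias "Russ Cochran", with dummy
    # frequencies and 300-dimensional zero vectors.
    kb = KnowledgeBase(vocab=vocab, entity_vector_length=300)
    kb.add_entity(entity="Q7381115", freq=32, entity_vector=[0.0] * 300)
    kb.add_entity(entity="Q2146908", freq=17, entity_vector=[0.0] * 300)
    kb.add_alias(
        alias="Russ Cochran",
        entities=["Q7381115", "Q2146908"],
        probabilities=[0.5, 0.5],
    )
    return kb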

I’m getting this same issue when I try to load my own knowledge base and vocab. For this I thought maybe I needed to change the config file to point to the correct vectors location (which is “local_kb_path/vocab/vectors”), so I tried:

config = {"initialize": {"vectors": 'local_kb_path/vocab/vectors'}}
entity_linker = nlp.add_pipe("entity_linker", config=config)

but this gives ‘extra fields not permitted’.
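
Possibly related: vectors looks like a pipeline-level [initialize] setting rather than a component setting, which would explain the 'extra fields not permitted' validation error. Below is a sketch of overriding it on nlp.config instead; whether this is the right place for it, and whether it resolves the E050, is an assumption:

import spacy

nlp = spacy.load("en_core_web_lg")

# Assumption: the stale "corpus/en_vectors" path lives in the [initialize]
# block of the loaded model's config. Override it before calling
# nlp.initialize(): either point it at a pipeline that has vectors, or set
# it to None to keep the vectors already attached to nlp.vocab.
nlp.config["initialize"]["vectors"] = None  # or e.g. "en_core_web_lg"

# ... then add the entity_linker, set its KB, and call nlp.initialize()
# as in the snippet above.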

Many thanks!

Your Environment

  • spaCy version: 3.0.0rc2
  • Platform: Darwin-18.6.0-x86_64-i386-64bit
  • Python version: 3.7.9
  • Pipelines: en_core_web_md (3.0.0a0), en_core_web_lg (3.0.0a0)

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
svlandeg commented, Jan 4, 2021

Sounds good! I’ll close this in the meantime, but feel free to reopen or open a new issue if you can’t get it to work. (for usage questions, you can also use our new discussion board btw! https://github.com/explosion/spaCy/discussions)

1 reaction
lizgzil commented, Jan 4, 2021

Thank you @svlandeg I will let you know how I get on with that advice 👍

Read more comments on GitHub >

Top Results From Across the Web

  • OSError: [E050] Can't find model 'en' - Stack Overflow
    When using spaCy we have to download the model using python -m spacy download en_core_web_sm. If you have already done that, make sure...
  • OSError: [E050] Can't find model 'en'. It doesn't seem to be a ...
    I wanted to use the chatterbot spacy collaborate system and trained data on chatterbot and created a response chat system.
  • OSError: [E050] Can't find model 'en'. It doesn't seem to be a ...
    I followed the github one. So as ines said on github: import en_core_web_sm; nlp = en_core_web_sm.load(). Those commands were running successfully...
  • Issue with exported spacy models - solved - Prodigy Support
    OSError: [E050] Can't find model 'en_model.vectors'. It doesn't seem to be a shortcut link, a Python package or a valid path to...
  • Models & Languages · spaCy Usage Documentation
    Use spacy.load() ... python -m spacy download en_core_web_sm ... import spacy ... nlp ... Initializing the language object directly yields the same result as...
