local model does not link/package properly

How to reproduce the behaviour

I am creating a model with a pattern matcher. These are the rules:

{"label":"PROGLANG","pattern":[{"LOWER":"golang"}]}
{"label":"PROGLANG","pattern":[{"LOWER":"go", "POS": {"NOT_IN": ["VERB"]}}]}
{"label":"PROGLANG","pattern":[{"LOWER":"sql"}]}
{"label":"PROGLANG","pattern":[{"LOWER":"python"}]}
{"label":"PROGLANG","pattern":[{"LOWER":{"REGEX":"(python\\d+\\.?\\d*.?\\d*)"}}]}
{"label":"PROGLANG","pattern":[{"LOWER":"python"}, {"TEXT":{"REGEX":"(\\d+\\.?\\d*.?\\d*)"}}]}
{"label":"PROGLANG","pattern":[{"LOWER": {"IN": ["node", "nodejs", "js", "javascript"]}}]}
{"label":"PROGLANG","pattern":[{"LOWER": {"IN": ["node", "nodejs", "js", "javascript"]}}, {"TEXT": {"REGEX": "(\\d+\\.?\\d*.?\\d*)"}}]}
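
A side note on the rules file: spaCy's matcher documentation describes IN and NOT_IN as taking a list of values, so a bare string operand may not behave as intended. Below is a minimal sketch of a pre-flight check for a patterns file, using only the standard library. The helper name find_non_list_set_values and the two sample rules are made up for illustration; they are not part of spaCy.

```python
import json

# Operators that the spaCy matcher docs describe as taking a list of values.
SET_OPERATORS = {"IN", "NOT_IN"}


def find_non_list_set_values(jsonl_text):
    """Return (line_number, attribute, operator) for each IN/NOT_IN operand that is not a list."""
    problems = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        rule = json.loads(line)
        pattern = rule["pattern"]
        if isinstance(pattern, str):
            continue  # phrase patterns are plain strings, nothing to check
        for token_spec in pattern:
            for attribute, value in token_spec.items():
                if not isinstance(value, dict):
                    continue  # simple attribute match such as {"LOWER": "sql"}
                for operator, operand in value.items():
                    if operator in SET_OPERATORS and not isinstance(operand, list):
                        problems.append((line_number, attribute, operator))
    return problems


# Hypothetical sample: the first rule passes a string where a list is expected.
rules = (
    '{"label":"PROGLANG","pattern":[{"LOWER":"go","POS":{"NOT_IN":"VERB"}}]}\n'
    '{"label":"PROGLANG","pattern":[{"LOWER":{"IN":["node","nodejs"]}}]}'
)
problems = find_non_list_set_values(rules)
print(problems)
```

Running this over the contents of a .jsonl rules file before handing it to the EntityRuler gives an early warning about operands that need wrapping in a list.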

This is the script that creates the model and saves it to disk.

import pathlib

import spacy
from spacy.lang.en import English
from spacy.pipeline import EntityRuler


if __name__ == "__main__":
    path = pathlib.Path('matcher-rules/proglang.jsonl')
    # note that we could have also used `English()` as a starting point
    # if our matching rules weren't using part of speech 
    nlp = spacy.load("en_core_web_sm")

    # create a new rule based NER detector loading in settings from disk
    ruler = EntityRuler(nlp).from_disk(path)
    print(f"Will now create model for {path}.")

    # add the detector to the model
    nlp.add_pipe(ruler, name="proglang-detector")

    # save the model to disk
    nlp.meta["name"] = "custom-proglang-model"
    nlp.to_disk(nlp.meta["name"])
    print(f"spaCy model saved over at {nlp.meta['name']}.")

I now create this model and link it, because linking feels like the formal thing to do.

> python mkmodel.py
Will now create model for matcher-rules/proglang.jsonl.
spaCy model saved over at custom-proglang-model.
> python -m spacy link custom-proglang-model proglang-model --force
✔ Linking successful
You can now load the model via spacy.load('proglang-model')

However, when I now load the linked model, it fails with a FileNotFoundError:

(venv) ➜  rasa-spacy-integration git:(master) ✗ python
Python 3.6.8 (v3.6.8:3c6b436a57, Dec 24 2018, 02:04:31) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> spacy.load("proglang-model") 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/vincent/Development/rasa-spacy-integration/venv/lib/python3.6/site-packages/spacy/__init__.py", line 30, in load
    return util.load_model(name, **overrides)
  File "/Users/vincent/Development/rasa-spacy-integration/venv/lib/python3.6/site-packages/spacy/util.py", line 162, in load_model
    return load_model_from_link(name, **overrides)
  File "/Users/vincent/Development/rasa-spacy-integration/venv/lib/python3.6/site-packages/spacy/util.py", line 176, in load_model_from_link
    cls = import_file(name, path)
  File "/Users/vincent/Development/rasa-spacy-integration/venv/lib/python3.6/site-packages/spacy/compat.py", line 157, in import_file
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 674, in exec_module
  File "<frozen importlib._bootstrap_external>", line 780, in get_code
  File "<frozen importlib._bootstrap_external>", line 832, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/Users/vincent/Development/rasa-spacy-integration/venv/lib/python3.6/site-packages/spacy/data/proglang-model/__init__.py'
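
The last frame shows the loader looking for an __init__.py inside the linked directory. A directory written by nlp.to_disk holds model data (a meta.json plus component folders) and, as far as I can tell, no __init__.py, which would explain why the link is created but the import fails. Here is a minimal sketch for checking which kind of directory you are dealing with. describe_model_dir is a hypothetical helper, not a spaCy function, and the temporary directories only stand in for real ones.

```python
import tempfile
from pathlib import Path


def describe_model_dir(path):
    """Classify a directory as an importable package, raw model data, or neither."""
    path = Path(path)
    if (path / "__init__.py").is_file():
        return "package"      # something Python can import
    if (path / "meta.json").is_file():
        return "model-data"   # looks like the output of nlp.to_disk()
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a directory saved with nlp.to_disk().
    saved = Path(tmp) / "custom-proglang-model"
    saved.mkdir()
    (saved / "meta.json").write_text("{}", encoding="utf8")

    # Stand-in for an installed model package.
    packaged = Path(tmp) / "en_proglang"
    packaged.mkdir()
    (packaged / "__init__.py").write_text("", encoding="utf8")

    kind_saved = describe_model_dir(saved)
    kind_packaged = describe_model_dir(packaged)

print(kind_saved, kind_packaged)
```

If the link target comes back as "model-data", loading it by path should work while loading it through the link may not, which matches the behaviour described here.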

The irony is that if I point directly to the folder (which probably isn’t what you’d want to recommend to folks for production), it works just fine.

>>> spacy.load("custom-proglang-model")
<spacy.lang.en.English object at 0x1221314e0>

This is independent of the nlp.meta["name"] = "custom-proglang-model" line in mkmodel.py.

Info about spaCy

  • spaCy version: 2.2.4
  • Platform: Darwin-19.3.0-x86_64-i386-64bit
  • Python version: 3.6.8

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

adrianeboyd commented, Apr 22, 2020

I think the installed package will be called en_proglang. After running pip install dist/model.tar.gz you should be able to find the directory for the installed package in venv/lib/python3.x/site-packages to double-check the name, too.

spacy.load("en_proglang")

As an alternative, you can also try importing the package directly:

import en_proglang
nlp = en_proglang.load()

You have to run python setup.py sdist to regenerate the dist folder.
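
For anyone wondering where a name like en_proglang comes from: in spaCy 2.x, as far as I am aware, the package command joins the lang and name fields of the model's meta.json with an underscore. The sketch below reproduces that naming rule so you can predict what to pass to spacy.load. expected_package_name is a hypothetical helper, and the meta values are illustrative, not taken from the original model.

```python
import json
import tempfile
from pathlib import Path


def expected_package_name(model_dir):
    """Build '<lang>_<name>' from the meta.json found in a saved model directory."""
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    return f"{meta['lang']}_{meta['name']}"


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative meta.json; a real one is written by nlp.to_disk().
    meta = {"lang": "en", "name": "proglang", "version": "0.0.0"}
    (Path(tmp) / "meta.json").write_text(json.dumps(meta), encoding="utf8")
    package_name = expected_package_name(tmp)

print(package_name)
```

Choosing a name without hyphens keeps the result a valid Python identifier, so the direct import route (import en_proglang) stays available.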
