Deserialization fails based on whether "nlp" object was used yet
When I try to deserialize using a fresh nlp object, spaCy crashes, though it deserializes fine if it's the same nlp object that was originally used to parse the text. I can't tell whether I'm using the library in a way it wasn't intended to be used, or whether this is a bug.
(This is spaCy version 1.5.0 on Linux with Python 2.7)
import spacy
from spacy.tokens.doc import Doc

nlp = spacy.load("en")
text = u"Hello world."

# Parse it. Works.
doc = nlp(text)
print(len(set([o for o in doc])))

# Save a serialized copy (binary mode, since to_bytes() returns raw bytes)
with open("out", "wb") as f:
    f.write(doc.to_bytes())

# Deserialize: works
doc2 = Doc(nlp.vocab)
with open("out", "rb") as f:
    doc2.from_bytes(f.read())
print(len(set([o for o in doc2])))

# Deserialize with a fresh nlp object: crashes
nlp = spacy.load("en")
doc3 = Doc(nlp.vocab)
with open("out", "rb") as f:
    doc3.from_bytes(f.read())
print(len(set([o for o in doc3])))
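One way to rule out the file handling as a cause is to confirm that the bytes survive the trip to disk unchanged, which needs no spaCy at all. The following is a minimal sketch under that assumption: `save_bytes` and `load_bytes` are hypothetical helper names, and the payload is a stand-in for whatever `doc.to_bytes()` returns.

```python
import os
import tempfile


def save_bytes(path, payload):
    """Write a bytes payload to path in binary mode."""
    with open(path, "wb") as f:
        f.write(payload)


def load_bytes(path):
    """Read the whole file back as bytes."""
    with open(path, "rb") as f:
        return f.read()


# Stand-in payload (an assumption); in the report this would be doc.to_bytes().
payload = b"\x00\x01\r\n\xff serialized doc"
path = os.path.join(tempfile.mkdtemp(), "out")
save_bytes(path, payload)
restored = load_bytes(path)
```

If `restored` equals `payload`, the file step is not altering the data, and the difference has to lie in the `nlp` object or the model data it loaded.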
Issue Analytics
- State:
- Created 7 years ago
- Comments:7 (3 by maintainers)
Top GitHub Comments
Oh now it works! I reinstalled the model and now it has both these two directories in spacy/data:
Before, it only had
en-1.1.0
after I installed the model.
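Since the fix came down to which model directories were present under spacy/data, it can help to list them before and after reinstalling. A minimal sketch, assuming the data directory is an ordinary folder on disk; the helper name `list_model_dirs` is hypothetical, and the exact location of the data folder may differ between spaCy versions.

```python
import os


def list_model_dirs(data_dir):
    """Return the sorted names of the subdirectories of data_dir."""
    if not os.path.isdir(data_dir):
        # A missing data directory is reported as "no models found".
        return []
    return sorted(
        name for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )
```

With spaCy installed, `os.path.join(os.path.dirname(spacy.__file__), "data")` is a plausible value for `data_dir`, though that path is an assumption based on the comment above rather than something confirmed in this thread.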