spaCy does not return a vector if the GPU is enabled

I’m using a GPU-enabled Google Colab notebook.

After installing the requisite libraries and models

!pip install spacy[cuda100]~=2.2 scispacy~=0.2.4

!pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_lg-0.2.4.tar.gz

import spacy
spacy.prefer_gpu()
nlp = spacy.load("en_core_sci_lg")

text = """
Myeloid derived suppressor cells (MDSC) are immature 
myeloid cells with immunosuppressive activity. 
They accumulate in tumor-bearing mice and humans 
with different types of cancer, including hepatocellular 
carcinoma (HCC).
"""

doc = nlp(text)

Running

doc.ents

produces

(Myeloid,
 suppressor cells,
 MDSC,
 immature,
 myeloid cells,
 immunosuppressive activity,
 accumulate,
 tumor-bearing mice,
 humans,
 cancer,
 hepatocellular 
 carcinoma,
 HCC)

as expected, but

doc.vector

produces this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-19-40a48203c66b> in <module>()
----> 1 doc.vector

doc.pyx in __iter__()

cupy/core/core.pyx in cupy.core.core.ndarray.__array_ufunc__()

cupy/core/_kernel.pyx in cupy.core._kernel.ufunc.__call__()

cupy/core/_kernel.pyx in cupy.core._kernel._preprocess_args()

TypeError: Unsupported type <class 'numpy.ndarray'>

Commenting out

spacy.prefer_gpu()

solves the issue. Does this mean that token vectors cannot be retrieved when the GPU is enabled?

I’m not sure if this is related to #81 or #3431 in spaCy.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
josesho commented, Mar 2, 2020

Hi @danielkingai2,

I also see your problem.

For full documentation’s sake:

import spacy
spacy.prefer_gpu()
nlp_core = spacy.load("en_core_web_lg")

text1 = "The effect of anxiogenic treatments on three rodent models of anxiety: \
         the open field test, the elevated plus-maze, and the light-dark box."

doc1 = nlp_core(text1)

doc1.vector

produces

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-531ef58ab65a> in <module>()
      1 doc1 = nlp_core(text1)
      2 
----> 3 doc1.vector

doc.pyx in __iter__()

cupy/core/core.pyx in cupy.core.core.ndarray.__add__()

cupy/core/_kernel.pyx in cupy.core._kernel.ufunc.__call__()

cupy/core/_kernel.pyx in cupy.core._kernel._preprocess_args()

TypeError: Unsupported type <class 'numpy.ndarray'>
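
The `__add__` frame in this traceback suggests that a cupy array is being added to a numpy array while the token vectors are summed. One possible workaround (a minimal sketch, not spaCy's own fix) is to copy each per-token vector to the host before averaging. `to_host` and `mean_vector` are hypothetical helper names. The only library behaviour relied on is that `cupy.ndarray` has a `.get()` method returning a numpy array. It is an assumption that `token.vector` itself still works per token with the GPU enabled.

```python
def to_host(array):
    """Return a host (CPU) copy of a GPU array; pass anything else through.

    cupy.ndarray exposes .get(), which copies the data into a numpy.ndarray.
    numpy arrays have no .get(), so they are returned unchanged.
    """
    get = getattr(array, "get", None)
    return get() if callable(get) else array


def mean_vector(vectors):
    """Average an iterable of vectors after moving each one to the host."""
    host_vectors = [to_host(v) for v in vectors]
    if not host_vectors:
        raise ValueError("cannot average an empty sequence of vectors")
    total = host_vectors[0]
    for vec in host_vectors[1:]:
        total = total + vec  # every operand is on the host by now
    return total / len(host_vectors)
```

With the `doc1` from the snippet above, this would be called as `mean_vector(token.vector for token in doc1)`.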

Relevant package versions:

import cupy
print(spacy.__version__)
print(cupy.__version__)

2.2.3
7.2.0
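
When collecting versions for a report like this, the import name and the installed distribution name can differ: the `spacy[cuda100]` extra used in the original post pulls in a CUDA-specific cupy wheel. Below is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). `installed_versions` is a hypothetical helper, and the distribution name `cupy-cuda100` is an assumption inferred from that extra.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "cupy-cuda100" is an assumed distribution name (see the lead-in).
    names = ["spacy", "cupy", "cupy-cuda100"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```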

Closing this for now, as it seems to be a spaCy issue rather than a scispaCy one.

0 reactions
dakinggg commented, Feb 28, 2020

Any update on this?
