PhraseMatcher: OverflowError: Python int too large to convert to C long
How to reproduce the behaviour
Greetings,
I am trying to use the spaCy PhraseMatcher with the token attribute 'IS_SENT_START'. I assumed that, like other attributes such as 'IS_PUNCT', it should be possible to use the is_sent_start attribute, yet I got the following error.
`matcher = PhraseMatcher(nlp.vocab, attr = "IS_SENT_START")
Traceback (most recent call last):
  File "<ipython-input-74-c8e7d50d6aec>", line 1, in <module>
    matcher = PhraseMatcher(nlp.vocab, attr = "IS_SENT_START")
  File "phrasematcher.pyx", line 63, in spacy.matcher.phrasematcher.PhraseMatcher.__init__
OverflowError: Python int too large to convert to C long`
Here is the sample code:
import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.load('en_core_web_sm')
sub_list = ['however', 'although', 'on the other hand', 'nonetheless']
patterns = [nlp.make_doc(text) for text in sub_list]
# This line raises the OverflowError shown above
matcher = PhraseMatcher(nlp.vocab, attr='IS_SENT_START')
Why does this happen? I am trying to check whether the phrases in the list have sent_start = True or False. This works with Matcher, but not with PhraseMatcher. Can you help me with a solution or a workaround?
Your Environment
- Operating System: macOS Catalina
- Python Version Used: 3.7.9
- spaCy Version Used: 2.0
- Environment Information: Anaconda-Spyder
Issue Analytics
- State:
- Created 3 years ago
- Comments:10 (6 by maintainers)
Top GitHub Comments
Hi!
The error the matcher throws should definitely be more user-friendly, and the documentation should be updated too.
But you can make this work with attr='SENT_START'. Note that this will make the PhraseMatcher look at sentence segmentation, and so you'll have to compile your patterns with nlp() instead of nlp.make_doc(text) to ensure that the parser runs (which does the sentence segmentation in en_core_web_sm).
This should work:
The pattern will now be [True, None, None, None, True, None]
This prints
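For the goal stated in the question (checking whether a listed phrase starts a sentence), a different route from the attr='SENT_START' suggestion is to match on 'LOWER' and then read Token.is_sent_start on the first token of each match. The sketch below is only an illustration under a few assumptions: it uses a blank English pipeline with the rule-based sentencizer instead of en_core_web_sm, it assumes spaCy 2.2.2 or later for the list-style matcher.add call, and the function name phrases_with_sentence_start and the key "CONNECTIVES" are hypothetical names chosen for this example.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Assumption: a blank pipeline plus the rule-based sentencizer stands in
# for en_core_web_sm, so that sentence boundaries are set without a parser.
nlp = spacy.blank("en")
try:
    nlp.add_pipe("sentencizer")  # spaCy v3 style
except ValueError:
    nlp.add_pipe(nlp.create_pipe("sentencizer"))  # spaCy v2 style

sub_list = ["however", "although", "on the other hand", "nonetheless"]

# LOWER is a lexical attribute, so nlp.make_doc is enough for the patterns.
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
matcher.add("CONNECTIVES", [nlp.make_doc(text) for text in sub_list])


def phrases_with_sentence_start(doc):
    """Return (matched text, starts_sentence) pairs in document order."""
    results = []
    for _match_id, start, end in sorted(matcher(doc), key=lambda m: m[1]):
        # is_sent_start may be None when boundaries are unknown; coerce to bool.
        results.append((doc[start:end].text, bool(doc[start].is_sent_start)))
    return results


doc = nlp("However, this is fine. I agree, although it is late.")
for phrase, starts_sentence in phrases_with_sentence_start(doc):
    print(phrase, starts_sentence)
```

With a pretrained pipeline such as en_core_web_sm, the same function should work on any doc produced by nlp(text), since the parser sets the sentence boundaries there.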