Process exits with status 139 when trying to instantiate a SentenceTransformer
Description
Hi all, I’m running into a strange error while trying to run the following:
import sentence_transformers as st
encoder = st.SentenceTransformer("all-mpnet-base-v2") # also tried sentence-t5-base
I get an error: Process finished with exit code 139 (interrupted by signal 11: SIGSEGV), and if I run it as a script, I get zsh: segmentation fault python. I also get a warning: multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appears to be 1 leaked semaphore objects to clean up at shutdown.
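For readers unfamiliar with the number 139: POSIX shells conventionally report a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11 (SIGSEGV). The sketch below decodes a status that way. The helper name `describe_exit_status` is hypothetical, and the 128 + N rule is a convention, so a program that deliberately exits with a code above 128 would be misread.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain an exit status using the common 128 + N shell convention."""
    if status < 0:
        # Python's subprocess module reports death-by-signal as -N.
        signum = -status
    elif status > 128:
        # POSIX shells report death-by-signal as 128 + N.
        signum = status - 128
    else:
        return f"exited normally with code {status}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform.
        name = f"signal {signum}"
    return f"terminated by {name} (signal {signum})"


print(describe_exit_status(139))
```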
Reproducibility
I’m afraid I don’t know how to reproduce this problem; the code was working until it suddenly stopped. Here are my specs:
- MacBook Pro 2019, Intel Core i9, 32GB RAM
- macOS Monterey 12.0.1
- python 3.10.2
- sentence_transformers==2.2.0
- torch==1.11.0
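A spec list like the one above can be gathered automatically, which helps when comparing a working environment with a broken one. This is a minimal sketch using only the standard library; the function name `collect_versions` is hypothetical, and the package names are assumed to be the PyPI distribution names of the two libraries listed.

```python
import platform
from importlib import metadata


def collect_versions(packages=("sentence-transformers", "torch")):
    """Return interpreter, OS, and installed package versions as a dict."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            # Reads installed metadata without importing the package,
            # so it is safe even if importing the package would crash.
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

Because the versions are read from package metadata rather than by importing, the report still works in an environment where `import torch` itself segfaults.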
What I’ve tried so far
- Clearing the cache at ~/.cache/torch/sentence_transformers
- Reinstalling sentence_transformers (also tried downgrading to 2.1.0)
- Reinstalling torch
- Recreating my virtual environment
- Rebooting
I’ve also managed to trace this problem to the following point in the torch codebase.
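For anyone trying to trace a similar crash, one option is Python's built-in faulthandler module, which prints the Python-level traceback when the interpreter receives SIGSEGV. The sketch below runs a snippet in a child interpreter with `-X faulthandler` so the parent process survives the crash. The helper names `run_with_faulthandler` and `diagnose_model_load` are hypothetical, and this is only one way to narrow down where the fault occurs.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(code: str, timeout: float = 600.0):
    """Run `code` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a child killed by a signal
    has a negative return code, e.g. -11 for SIGSEGV.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def diagnose_model_load(model_name: str = "all-mpnet-base-v2") -> None:
    """Try to load the model in a child process and report any segfault."""
    snippet = (
        "import sentence_transformers as st\n"
        f"st.SentenceTransformer({model_name!r})\n"
    )
    returncode, stderr = run_with_faulthandler(snippet)
    if returncode == -signal.SIGSEGV:
        print("Child process segfaulted; faulthandler output follows:")
    else:
        print(f"Child process exited with code {returncode}")
    print(stderr)


# Call diagnose_model_load() manually to run the check.
```

The traceback printed by faulthandler shows the Python frames that were active at the time of the fault, which is usually enough to identify the import or call that triggers it.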
Any help would be appreciated, thanks in advance!
Update: I managed to fix the problem by downgrading to Python 3.9.11. I’m not sure whether the Python version is actually to blame, or whether simply starting from a fresh Python installation did the trick (perhaps recreating the virtual environment wasn’t enough). Either way, this is a workaround rather than a solution, so I think we should keep this ticket open.
These build on top of Sentence Transformers. Python 3.10 is sadly still buggy with PyTorch; see https://github.com/pytorch/pytorch/issues/66424
So we have to wait until Python 3.10 is fully supported by PyTorch.
@nreimers I just updated the issue with a workaround, but just in case this issue returns, can you recommend an alternative library that I could try? It seems to me that txtai and Top2Vec could be used to obtain sentence embeddings – would you say that these are reasonable alternatives?