Multiprocessing pipe hangs when using en_core_web_trf

How to reproduce the behaviour

When running nlp.pipe with n_process > 1 and the en_core_web_trf model, multiprocessing seems to get stuck. Here is a simple PoC:

import spacy

nlp = spacy.load("en_core_web_trf")

texts = ["Hello world" for _ in range(20)]
for doc in nlp.pipe(texts=texts, n_process=2):
    pass

It looks like spaCy gets stuck here: https://github.com/explosion/spaCy/blob/ace6ae435b0f1dc95f489099252b097929c9b78f/spacy/language.py#L1466

Using en_core_web_sm works fine.
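
To confirm the hang without blocking a terminal indefinitely, the PoC can be run in a fresh interpreter with a timeout. This is a minimal sketch using only the standard library. `run_with_timeout` and `REPRO` are hypothetical names introduced for illustration, not part of spaCy, and the 300-second limit is an arbitrary assumption.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout: float) -> str:
    """Run `code` in a fresh interpreter and return 'ok', 'error', or 'hung'."""
    try:
        result = subprocess.run([sys.executable, "-c", code], timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child interpreter when the timeout expires.
        return "hung"
    return "ok" if result.returncode == 0 else "error"


# The PoC from this issue, as a string to execute in the child interpreter.
REPRO = """
import spacy
nlp = spacy.load("en_core_web_trf")
texts = ["Hello world" for _ in range(20)]
for doc in nlp.pipe(texts=texts, n_process=2):
    pass
"""

# Example call (not executed here; needs spaCy and the model installed):
#     print(run_with_timeout(REPRO, timeout=300))
```

Only the child interpreter is killed on timeout. Worker processes started by nlp.pipe may outlive it and need to be cleaned up by hand.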

Info about spaCy

  • spaCy version: 3.0.0rc2
  • Platform: Linux-5.10.3-arch1-1-x86_64-with-glibc2.2.5
  • Python version: 3.8.7
  • Pipelines: en_core_web_sm (3.0.0a0), en_core_web_trf (3.0.0a0)

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 9 (7 by maintainers)

Top GitHub Comments

martincjespersen commented, Mar 8, 2021 (4 reactions)

Hi, I am having the same issue with spacy==3.0.3, related to pickling the spans when running inference on a fine-tuned RoBERTa model using nlp.pipe with n_process > 1. Is there any news on this issue?
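
The comment above mentions pickling. Python's multiprocessing module passes objects between processes by pickling them, so one generic diagnostic is to test which values in a dictionary fail to pickle. The sketch below is standard-library only. `unpicklable_items` is a hypothetical helper, and the thread does not confirm that a pickling failure causes this particular hang.

```python
import pickle


def unpicklable_items(mapping):
    """Return {key: error message} for every value that pickle.dumps rejects."""
    failures = {}
    for key, value in mapping.items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises several different error types
            failures[key] = f"{type(exc).__name__}: {exc}"
    return failures


# A lambda cannot be pickled by the standard pickle module; an int can.
sample = {"count": 1, "callback": lambda x: x}
failed_keys = list(unpicklable_items(sample))
```

The same check could be applied to any dictionary attached to a document, for instance doc.user_data, to narrow down which entry is the problem.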

github-actions[bot] commented, Oct 21, 2021 (0 reactions)

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.


