Setting progress_bar=True freezes parallel_apply execution before reaching 1% completion on all CPUs
See original GitHub issue

When `progress_bar=True`, I noticed that the execution of my `parallel_apply` task stopped right before all parallel processes reached the 1% progress mark. Here are some further details of what I was encountering (a minimal reproduction sketch follows this list):

- I turned on logging with `DEBUG` messages, but no messages were displayed when the execution stopped, and there were no error messages either. The dataframe rows simply stopped processing, and the process appeared to be frozen.
- I have two CPUs, and the progress bars seem to update only in 1% increments. One of the progress bars reached the 1% mark, but when the number of processed rows reached the 2% mark (which I assume corresponds to the second progress bar also updating to 1%), the process froze.
- The process runs fine with `progress_bar=False`.
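For context, here is a minimal sketch of the kind of setup that triggers this behavior. The dataframe and row function are hypothetical stand-ins, since the reporter's actual workload is not shown in the issue; the `pandarallel.initialize` and `parallel_apply` calls are the library's standard API.

```python
import pandas as pd
from pandarallel import pandarallel

# Enable the progress bar; nb_workers=2 mirrors the two-CPU setup
# described above.
pandarallel.initialize(progress_bar=True, nb_workers=2)

# Hypothetical stand-in workload.
df = pd.DataFrame({"x": range(100_000)})

def square(row):
    return row["x"] ** 2

# With progress_bar=True this reportedly freezes near the 1% mark;
# with progress_bar=False the same call completes normally.
result = df.parallel_apply(square, axis=1)
```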
Issue Analytics
- Created: 3 years ago
- Reactions: 12
- Comments: 22 (5 by maintainers)
Similar issue here, and I'm only working on about 12k rows. It gets to about 300 completed items on each core, then all of the forked processes just seem to die, almost as if it's trying to create new threads but then just sits there, with all cores basically unused.

Python 3.6.9 on Ubuntu 18.04 (WSL2)
**Edit:** I removed the progress_bar option in my little console application, and whatever deadlock was occurring has disappeared; it now seems to be progressing pretty well. A sketch of this workaround is shown below.
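For reference, a minimal sketch of that workaround: initialize pandarallel without the progress bar (`progress_bar` defaults to `False`). The dataframe and row function are hypothetical stand-ins sized to match the ~12k-row workload mentioned above.

```python
import pandas as pd
from pandarallel import pandarallel

# Workaround: leave the progress bar disabled. The freeze reported
# above only occurs when progress_bar=True.
pandarallel.initialize(progress_bar=False)

# Hypothetical ~12k-row workload matching the comment above.
df = pd.DataFrame({"x": range(12_000)})
result = df.parallel_apply(lambda row: row["x"] * 2, axis=1)
```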
I’m assuming this has been fixed.