Python Interactive Window doesn't parallelize with dask properly
Environment data
- VS Code version: Version: 1.62.0-insider
- Extension version (available under the Extensions sidebar): v2021.11.1313923388-dev
- OS and version: macOS 11.6 (Apple Silicon M1)
- Python version (& distribution if applicable, e.g. Anaconda): 3.9.7 (conda-forge)
- Type of virtual environment used (N/A | venv | virtualenv | conda | …): conda
- Relevant/affected Python packages and their versions: dask 2021.9.1
- Relevant/affected Python-related VS Code extensions and their versions: Jupyter v2021.10.1001362801
- Value of the python.languageServer setting: Default
Actual behaviour
I have a .py file and am using # %% to run cells in the Python Interactive Window (which I love!).
I have a function (fiber_analysis) that takes 1 min to run on an image (essentially all of it in the last step), using 1 python process on my computer at 100% CPU. When I spin up the Interactive Window kernel I see 4 python3.9 processes in my Activity Monitor.
I have 4 performance cores and 4 efficiency cores, and this computation should be able to run in parallel since the computations are independent per image.
So I thought dask would help, and I made a dask array of my images, chunked by image.
(Note: I pared down my data set to just 6 images so the tests would be quicker.)
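Roughly, the setup looks like this (a sketch with placeholder shapes, using da.from_array; not the exact original code):

import numpy as np
import dask.array as da

# placeholder image stack; the real data is a stack of images, one chunk per image
images = np.random.rand(6, 1024, 1024)
all_samples_g = da.from_array(images, chunks=(1, 1024, 1024))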
I’ve set up to iterate over the slices and run the function as delayed:
from dask import delayed

output_s1 = []
for sample in all_samples_g:
    dvals = delayed(fiber_analysis)(sample)
    output_s1.append(dvals)
I can then compute this:
import dask
out_dvals = dask.compute(*output_s1)
And this uses one of the existing python3.9 processes at 115-120% CPU, essentially the same as just running on one image. This takes 7.5 min. So that's not really progress, but I understand that this should be using the dask threaded scheduler by default, so only non-python code will be parallelized. So next I try:
import dask
out_dvals = dask.compute(*output_s1, scheduler='processes')
This launches 1 additional python3.9 process and 1 additional process named python, and that python process uses ~125% CPU. Oddly, this computation takes >10 min, so much longer than just running the function on the images one by one. I killed it. Maybe I need to add workers? So let's try that: 4 workers, for my 4 performance cores:
import dask
out_dvals = dask.compute(*output_s1, num_workers=4)
Same as without the argument: 1 python3.9 process (out of the 4) at 115% CPU.
Run time: 7 min. So no great shakes.
What about with processes and 4 workers?
Same as before: 1 extra python3.9 process and 1 python process running at 125% CPU.
And again >10 min, so I killed it.
It appears that no speed benefit can be gained by using the regular dask schedulers in the Python Interactive Window… Maybe related to: https://github.com/microsoft/vscode-jupyter/issues/2962
Anyhow, by accident I reran:
out_dvals = dask.compute(*output_s1, scheduler='processes', num_workers=4)
with a stack of 15 images instead of 6.
Now 3 python processes spawn, each at 115% CPU, and 1 essentially idle extra python3.9 process.
So promising!
…but after 20 min I killed it, because that’s longer than just running each image one by one.
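A diagnostic sketch (not one of the runs above) to confirm how many distinct worker processes the tasks actually land on; it reuses fiber_analysis and all_samples_g from above and returns each worker's PID alongside the result:

import os
import dask
from dask import delayed

def traced_analysis(sample):
    # return the worker PID so we can see which process ran each task
    return os.getpid(), fiber_analysis(sample)

tasks = [delayed(traced_analysis)(sample) for sample in all_samples_g]
results = dask.compute(*tasks, scheduler='processes', num_workers=4)
print("distinct worker PIDs:", {pid for pid, _ in results})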
Expected behaviour
When rerunning the same code as a Jupyter notebook (.ipynb), with the 15-image stack, processed using
out_dvals = dask.compute(*output_s1, scheduler='processes', num_workers=4)
5 python3.9 processes were spawned in total by the command. 4 python3.9 processes go to 100% CPU, with all 4 performance cores totally slammed at 100%; the 5 remaining python3.9 processes are essentially idle. 4 min in, only 2 processes are at 100%, the rest idle; 5 min in, 1 process at 100%, the rest idle. And it finished in 5 min 15 s! 🎉
Steps to reproduce:
Not sure how I can make something that is easily shared, any advice?
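Maybe a synthetic CPU-bound function in place of fiber_analysis would make a shareable repro, along these lines (a sketch, untested; no image data needed):

import time
import dask
from dask import delayed

def cpu_bound(i):
    # pure-python busy loop standing in for fiber_analysis; holds the GIL the whole time
    total = 0
    for n in range(30_000_000):
        total += n * n
    return total

if __name__ == "__main__":  # keeps the processes scheduler happy if run as a plain script
    tasks = [delayed(cpu_bound)(i) for i in range(6)]
    start = time.perf_counter()
    dask.compute(*tasks, scheduler='processes', num_workers=4)
    print("elapsed:", time.perf_counter() - start)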
Logs
N/A
Top GitHub Comments
This might help: https://code.visualstudio.com/docs/python/debugging
Essentially you have to attach a debugger to the python kernel and then debug the execution of your dask code.
At some point the execution will go through this function: https://github.com/ipython/ipykernel/blob/fdda069bba36cafcc25df4d2353b26fbdb9e4d15/ipykernel/ipkernel.py#L294
A breakpoint is something that tells the debugger (after you attach) to stop execution at a specific line in a piece of code.
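For example, assuming debugpy is installed in the kernel's environment (an assumption, it may not be), a minimal way to attach is to run this in the Interactive Window and then attach the VS Code debugger to localhost:5678:

import debugpy

debugpy.listen(5678)        # open a debug adapter port inside the kernel process
debugpy.wait_for_client()   # pause here until the VS Code debugger attaches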
Ah sorry, I misunderstood. It works fine in a notebook but not in the IW (both running in VS Code). Then the disableZMQSupport flag will have no effect. The difference between running an IW kernel and a Jupyter kernel is just the things we set in it, like __file__, which is set in the IW but not in the notebook. Not sure how that would affect dask though. This likely requires debugging the kernel itself to see why it isn't parallelizing.
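A quick way to see that difference is to run the same check in both the IW and a notebook (just a one-line sketch):

# prints whether __file__ exists in this kernel and, if so, its value
print('__file__' in globals(), globals().get('__file__'))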