
Python Interactive Window doesn't parallelize with dask properly


Environment data

  • VS Code version: Version: 1.62.0-insider
  • Extension version (available under the Extensions sidebar): v2021.11.1313923388-dev
  • OS and version: macOS 11.6 (Apple Silicon M1)
  • Python version (& distribution if applicable, e.g. Anaconda): 3.9.7 (conda-forge)
  • Type of virtual environment used (N/A | venv | virtualenv | conda | …): conda
  • Relevant/affected Python packages and their versions: dask 2021.9.1
  • Relevant/affected Python-related VS Code extensions and their versions: Jupyter v2021.10.1001362801
  • Value of the python.languageServer setting: Default

Actual behaviour

I have a .py file and am using # %% to run cells in the Python Interactive Window (which I love!). I have a function (fiber_analysis) that takes about 1 min to run on a single image, using 1 python process at 100% CPU; it is essentially the entire last step of my pipeline. When I spin up the Interactive Window kernel I see 4 python3.9 processes in Activity Monitor. My machine has 4 performance cores and 4 efficiency cores, and this computation should parallelize well since each image is processed independently, so I thought dask would help. I made a dask array of my images, chunked by image. (Note: I pared the data set down to just 6 images so the tests would run quicker.) I then iterate over the slices and wrap the function calls as delayed:

from dask import delayed

output_s1 = []

# Build one delayed call per image slice; nothing runs until dask.compute().
for sample in all_samples_g:
    dvals = delayed(fiber_analysis)(sample)
    output_s1.append(dvals)
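
For context, all_samples_g was built roughly along these lines; the random stand-in data and the image shapes below are assumptions, not the original loading code:

import numpy as np
import dask.array as da

# Stand-in stack: 6 images, chunked so each image is its own chunk.
images = np.random.random((6, 2048, 2048))
all_samples_g = da.from_array(images, chunks=(1, 2048, 2048))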

I can then compute this:

import dask
out_dvals = dask.compute(*output_s1)

This uses one of the existing python3.9 processes at 115-120% CPU, essentially the same as running on a single image, and takes 7.5 min. That's not really progress, but I understand that dask uses the threaded scheduler by default, so only code that releases the GIL (i.e. non-Python code) will actually run in parallel. So next I try:

import dask
out_dvals = dask.compute(*output_s1, scheduler='processes')

This launches 1 additional python3.9 process plus an additional process named python, and that python process uses ~125% CPU. Oddly, this computation takes >10 min, much longer than just running the function on the images one by one, so I killed it. Maybe I need to add workers? So let's try that: 4 workers, for my 4 performance cores:

import dask
out_dvals = dask.compute(*output_s1, num_workers=4)

Same as without the argument: 1 python3.9 process (out of 4) at 115% CPU, run time 7 min. No great shakes. What about scheduler='processes' combined with 4 workers? Same as before: 1 extra python3.9 process and 1 python process at 125% CPU, and again >10 min before I killed it.

It appears that no speedup can be gained from the regular dask schedulers in the Python Interactive Window… Maybe related to: https://github.com/microsoft/vscode-jupyter/issues/2962
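
For reference, the other common way to get process-level parallelism is an explicit dask.distributed client; this is only a sketch, and I have not tested it in the IW:

import dask
from dask.distributed import Client

# Local cluster with 4 single-threaded worker processes. While a Client
# is active, dask.compute routes work through the distributed scheduler.
client = Client(n_workers=4, threads_per_worker=1)
out_dvals = dask.compute(*output_s1)
client.close()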

Anyhow, by accident I reran:

out_dvals = dask.compute(*output_s1, scheduler='processes', num_workers=4)

with a stack of 15 images instead of 6. Now 3 python processes spawn, each at 115% CPU, plus 1 essentially idle extra python3.9 process. Promising! …but after 20 min I killed it, because that's already longer than running each image one by one.

Expected behaviour

When I rerun the same code as a Jupyter notebook (.ipynb) with the 15-image stack, processed using

out_dvals = dask.compute(*output_s1, scheduler='processes', num_workers=4)

the command spawns 5 additional python3.9 processes (9 in total, counting the 4 kernel processes). 4 of them go to 100% CPU, with all 4 performance cores completely slammed; the remaining 5 python3.9 processes sit essentially idle. 4 min in, only 2 processes are still at 100%, the rest idle; 5 min in, just 1 process is at 100%. And it finished in 5 min 15 s! 🎉

Steps to reproduce:

Not sure how I can make something that is easily shared; any advice? (A possible starting point is sketched below.)
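
One sketch of a self-contained repro, under the assumption that any pure-Python, CPU-bound function shows the same behaviour; fake_fiber_analysis below is a hypothetical stand-in, not the real function:

import time
import numpy as np
import dask
from dask import delayed

def fake_fiber_analysis(image):
    # Pure-Python busy loop: it holds the GIL, so only the processes
    # scheduler can actually run these calls in parallel.
    total = 0.0
    for _ in range(5_000_000):
        total += 1.0
    return total + float(image.sum())

images = [np.random.random((512, 512)) for _ in range(6)]
tasks = [delayed(fake_fiber_analysis)(img) for img in images]

start = time.time()
out = dask.compute(*tasks, scheduler='processes', num_workers=4)
print(f"elapsed: {time.time() - start:.1f}s")

Running this same cell once from a .py file via the Interactive Window and once from an .ipynb notebook should make the difference in spawned processes and run time directly comparable.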

Logs

N/A

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
rchiodo commented, Oct 21, 2021

This might help: https://code.visualstudio.com/docs/python/debugging

Essentially you have to attach a debugger to the python kernel and then debug the execution of your dask code.

At some point the execution will go through this function: https://github.com/ipython/ipykernel/blob/fdda069bba36cafcc25df4d2353b26fbdb9e4d15/ipykernel/ipkernel.py#L294

A breakpoint tells the debugger (after you attach) to stop execution at a given line of code.
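
One way to do the attach step is with debugpy (a sketch: the host and port below are arbitrary, and if the kernel has already started its own debug server, listen() may raise):

import debugpy

# Make the kernel process listen for a debugger attach; 5678 is arbitrary.
debugpy.listen(("localhost", 5678))
debugpy.wait_for_client()  # blocks until VS Code attaches via "Remote Attach"
debugpy.breakpoint()       # stops here once a client is attached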

1 reaction
rchiodo commented, Oct 20, 2021

Ah sorry, I misunderstood: it works fine in a notebook but not in the IW (both running in VS Code). In that case the disableZMQSupport flag will have no effect.

The only difference between running an IW kernel and a Jupyter kernel is the things we set in it. For example, __file__ is set in the IW but not in the notebook. Not sure how that would affect dask, though.
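
A quick way to see that difference from inside a cell (a sketch):

# Expect True plus a path in the Interactive Window; False/None in a notebook.
print('__file__' in globals(), globals().get('__file__'))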

This likely requires debugging the kernel itself to see why it isn’t parallelizing.
