Dask client not running on Colab


Dask client is not running on Colab. My code is:

from dask.distributed import Client, progress
client = Client()
client

and the error is:

/usr/local/lib/python3.6/dist-packages/distributed/bokeh/core.py:57: UserWarning: 
Port 8787 is already in use. 
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
  warnings.warn('\n' + msg)
tornado.application - ERROR - Multiple exceptions in yield list
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 828, in callback
    result_list.append(f.result())
  File "/usr/local/lib/python3.6/dist-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 1069, in run
    yielded = self.gen.send(value)
  File "/usr/local/lib/python3.6/dist-packages/distributed/deploy/local.py", line 229, in _start_worker
    raise gen.TimeoutError("Worker failed to start")
tornado.gen.TimeoutError: Worker failed to start
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tornado/gen.py in callback(f)
    827                 try:
--> 828                     result_list.append(f.result())
    829                 except Exception as e:

33 frames
TimeoutError: Worker failed to start

During handling of the above exception, another exception occurred:

TimeoutError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/distributed/deploy/local.py in _start_worker(self, death_timeout, **kwargs)
    227         if w.status == 'closed' and self.scheduler.status == 'running':
    228             self.workers.remove(w)
--> 229             raise gen.TimeoutError("Worker failed to start")
    230 
    231         raise gen.Return(w)

TimeoutError: Worker failed to start

To reproduce the error, see the following Colab gist:

https://colab.research.google.com/gist/cornhundred/70524e9425db14f3adec9e86660aa313/trying_dask_on_colab.ipynb

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 26 (13 by maintainers)

Top GitHub Comments

randerzander commented, May 30, 2019 (3 reactions)

@cornhundred I suspect the dashboard is inaccessible because the client info printed in Colab shows a localhost address, which only resolves inside the Colab VM and not from your browser:

Dashboard: http://localhost:8787/status

I’ve dug around for ways to expose it, and the simplest I’ve seen is using ngrok like so:

# expose Dask's status dashboard to a public URL
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip ngrok-stable-linux-amd64.zip

get_ipython().system_raw('./ngrok http 8787 &')

!curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

That should print a public link that you can click to open the Dask status dashboard for your Colab instance.
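For anyone who would rather stay in Python than pipe curl into a one-liner, here is a minimal sketch of the same lookup. It assumes the ngrok agent is already running and serving its local API at http://localhost:4040/api/tunnels, as in the snippet above; the helper names `public_urls` and `fetch_public_urls` are made up for illustration and are not part of ngrok or Dask.

```python
import json
from urllib.request import urlopen

# Local ngrok agent API, same endpoint as the curl command above.
NGROK_API = "http://localhost:4040/api/tunnels"


def public_urls(payload):
    """Return every public_url found in a decoded /api/tunnels response."""
    return [
        tunnel["public_url"]
        for tunnel in payload.get("tunnels", [])
        if "public_url" in tunnel
    ]


def fetch_public_urls(api_url=NGROK_API, timeout=5.0):
    """Query the local ngrok agent and return its public tunnel URLs."""
    with urlopen(api_url, timeout=timeout) as response:
        return public_urls(json.load(response))
```

Calling `fetch_public_urls()` in a cell after starting ngrok should list the tunnel addresses. Unlike indexing `['tunnels'][0]`, it returns an empty list rather than raising if the tunnel has not registered yet, so you can simply re-run the cell a moment later.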

The above with a LocalCluster(processes=False) worked for me just a few moments ago.
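A minimal sketch of that setup, assuming a reasonably recent dask/distributed install. The function name `start_threaded_client` and its default worker counts are illustrative choices, not something taken from the issue. With `processes=False` the workers run as threads inside the notebook process instead of being spawned as separate processes, which is the configuration reported to work above.

```python
def start_threaded_client(n_workers=1, threads_per_worker=2):
    """Start an in-process Dask cluster and return (cluster, client)."""
    # Imported lazily so a missing install fails here, at call time,
    # with an ordinary ImportError.
    from dask.distributed import Client, LocalCluster

    # processes=False: workers are threads in this process, so no
    # separate worker processes have to be launched.
    cluster = LocalCluster(
        processes=False,
        n_workers=n_workers,
        threads_per_worker=threads_per_worker,
    )
    client = Client(cluster)
    return cluster, client
```

In a notebook you would call `cluster, client = start_threaded_client()` once, use `client` as usual, and call `client.close()` followed by `cluster.close()` when finished. The trade-off is that pure-Python work shares the GIL across threads, so this mostly helps with NumPy- or pandas-heavy tasks.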

riversdark commented, Jul 20, 2022 (1 reaction)

Any update on this? I’m getting the same error messages on Colab.

Okay, it seems that running !pip install -U "dask[complete]" can fix the problem.


