Installing dask and distributed packages can be confusing


Our installation docs recommend that people do the following:

conda install dask distributed -c conda-forge

or

pip install dask[complete] distributed --upgrade

However, we shouldn’t expect most people to do their due diligence and read the installation docs. We all tend to guess that conda install name-of-project works, and most of the time it does. Unfortunately, if you’ve heard that Dask does distributed computing, run conda install dask, and then try any distributed example, you’re likely to get an import error, which makes for a bad first impression.

There are a few ways that we could resolve this problem:

  1. We could provide informative errors whenever someone tries to use anything under dask.distributed. These would point them to the installation docs. This wouldn’t help if they just did import distributed, though I think most of the public materials we produce at this point import from dask.distributed.
  2. We could switch out the conda package dask with a metapackage that included both dask and distributed. This would be foolproof in the conda case but would be a bit of an organizational hassle from a packaging perspective. We would rename the existing package dask-core (or something similar) and then switch in the dask metapackage. We would have to do this on conda-forge at the same time.
  3. We could find some way within conda to have a dependency cycle (dask includes distributed, distributed includes dask).
  4. Other suggestions?

I’m in favor of starting with option 1, though I would love to find a more thorough alternative.
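
For illustration only, option 1 could look roughly like the sketch below. The helper name import_or_explain and its parameters are hypothetical and are not part of dask; the install commands in the hint are the ones quoted from the installation docs above. This is one possible shape for such a check, not what the project actually shipped.

```python
import importlib

# Install commands quoted from the installation docs in this issue.
INSTALL_HINT = (
    "dask.distributed requires the separate 'distributed' package.\n"
    "Install it with one of:\n"
    "    conda install dask distributed -c conda-forge\n"
    "    pip install dask[complete] distributed --upgrade"
)


def import_or_explain(module_name, install_hint=INSTALL_HINT):
    """Import ``module_name``; on failure, raise an ImportError with install help.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays in the traceback.
        raise ImportError(install_hint) from exc
```

A package-level shim could then call import_or_explain("distributed") before re-exporting names. Note that catching every ImportError also catches failures raised inside the imported package itself, so a stricter version might inspect the original exception before replacing its message.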

cc @pzwang @ilanschnell

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 27 (14 by maintainers)

Top GitHub Comments

2 reactions
TomAugspurger commented, Apr 29, 2018

We’ve all made that mistake at least once 😃

On Sun, Apr 29, 2018 at 6:09 AM, YorT notifications@github.com wrote:

@TomAugspurger sorry, ignore me, total noob mistake on my part. I had a test file called dask.py, which was being imported instead of the actual dask! Lesson learned!
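
A quick way to check for this kind of shadowing is to ask Python which file it would load for a given module name. The sketch below uses only the standard library; the function names are hypothetical, and looks_shadowed only catches the single-file case described here (a dask.py sitting directly in the given directory), not a local package directory.

```python
import importlib.util
import os


def find_module_origin(name):
    """Return the file path Python would load for ``name``, or None.

    None is returned when the module cannot be found or has no file
    location (for example built-in modules).
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    return spec.origin


def looks_shadowed(name, directory="."):
    """Return True if ``name`` resolves to a file directly inside ``directory``."""
    origin = find_module_origin(name)
    if origin is None:
        return False
    return os.path.dirname(os.path.abspath(origin)) == os.path.abspath(directory)
```

Running find_module_origin("dask") from the working directory and seeing a path that ends in a local dask.py, rather than a path under site-packages, would point to the mistake described above.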


2 reactions
jzwinck commented, Feb 22, 2018

Why is the Conda package for dask.distributed not called dask.distributed or dask-distributed (what I would expect to install, but does not exist)? It’s called distributed which is super-confusing and there is nothing helpful in the output of conda info distributed -c conda-forge. Further, distributed does not express a dependency on dask. Am I even looking at the right package?

distributed 1.18.0 py35_0
-------------------------
file name   : distributed-1.18.0-py35_0.tar.bz2
name        : distributed
version     : 1.18.0
build string: py35_0
build number: 0
channel     : defaults
size        : 632 KB
arch        : x86_64
date        : 2017-08-16
license     : BSD 3-Clause
license_family: BSD
md5         : d0c3b75432a2037425478d76ba0870bc
noarch      : None
platform    : linux
url         : https://repo.continuum.io/pkgs/free/linux-64/distributed-1.18.0-py35_0.tar.bz2
dependencies:
    bokeh >=0.12.3
    click >=6.6
    cloudpickle >=0.2.2
    msgpack-python
    psutil
    python 3.5*
    six
    sortedcontainers
    tblib
    toolz >=0.7.4
    tornado >=4.4
    zict >=0.1.2
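
As a cross-check from inside Python, the standard library can list the requirements an installed distribution declares. This is a minimal sketch with a hypothetical helper name. It reads the installed Python package metadata, which is an assumption here: that metadata is not the conda recipe shown above and may differ from it or be incomplete in a conda environment.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings declared by an installed distribution.

    Returns None if the distribution is not installed, and an empty list
    if it is installed but declares no requirements.
    """
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints whatever the local environment reports; no output is assumed here.
    print(declared_requirements("distributed"))
```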