gaps between distributed and dask releases to anaconda main channels results in incompatible environments
What happened:
Twice in the last two months, LightGBM’s continuous integration has been broken by the following situation:
- `distributed` changes in a way that makes it incompatible with older versions of `dask`
- the newest release of `distributed` is published to anaconda’s main channels several days before the corresponding `dask` version
- something like `conda install -y dask distributed` results in an environment with incompatible versions of `dask` and `distributed`
- any tests involving Dask fail
I’ve documented the most recent instance of this problem in https://github.com/microsoft/LightGBM/issues/4285.
We ended up with an environment like this:
dask-2021.4.0 | pyhd3eb1b0_0 5 KB
dask-core-2021.4.0 | pyhd3eb1b0_0 670 KB
distributed-2021.4.1 | py37h06a4308_0 1.0 MB
And saw all Dask tests in that project fail with this error:
> from distributed.protocol.core import dumps_msgpack
E ImportError: cannot import name 'dumps_msgpack' from 'distributed.protocol.core' (/root/miniconda/envs/test-env/lib/python3.7/site-packages/distributed/protocol/core.py)
This is caused by the fact that `distributed.protocol.core.dumps_msgpack()` was removed in 2021.4.1 (#4677), but `dask` 2021.4.0 still relies on it.
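A mismatch like the one above can be caught before any test runs. The following is a minimal sketch, not something either project ships: it assumes that `dask` and `distributed` releases meant to work together carry the same version number, and the helper names `parse_release`, `releases_match`, and `check_installed` are hypothetical. It uses `importlib.metadata`, which needs Python 3.8 or newer, so it would not run unchanged on the Python 3.7 environment reported here.

```python
import re
from importlib import metadata  # standard library from Python 3.8 onwards


def parse_release(version):
    """Turn '2021.04.1' into (2021, 4, 1) so zero-padding does not matter."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def releases_match(dask_version, distributed_version):
    """Assumption: matching releases of the two packages share one version number."""
    return parse_release(dask_version) == parse_release(distributed_version)


def check_installed():
    """Raise early if the installed dask and distributed versions differ."""
    dask_version = metadata.version("dask")
    distributed_version = metadata.version("distributed")
    if not releases_match(dask_version, distributed_version):
        raise RuntimeError(
            f"dask {dask_version} and distributed {distributed_version} "
            "are different releases and may be incompatible"
        )
```

Calling `check_installed()` at the start of a CI job would turn the `ImportError` above into a more direct message about the version gap.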
What you expected to happen:
I expected that since `dask` and `distributed` are so tightly connected to each other, new versions of these libraries would be published to the main anaconda channels at the same time.
Minimal Complete Verifiable Example:
It’s hard to create an MCVE for this since it relies on external state in a package manager, but as of 12 hours ago the steps at https://github.com/microsoft/LightGBM/issues/4285#issuecomment-841000102 could reproduce this issue.
If you need more details than that please let me know and I can try to produce a tighter reproducible example.
Anything else we need to know?:
Environment:
- Dask version: 2021.4.0
- Python version: 3.7
- Operating System: Ubuntu 20.04
- Install method (conda, pip, source): conda
Top GitHub Comments
Thanks for reporting @jameslamb! FWIW some folks also ran into this with the `2021.04.1` release on `conda-forge` (see the discussion starting here https://github.com/dask/community/issues/150#issuecomment-826844711). I think the core issue here is that we don’t specify maximum allowed versions for our `dask` and `distributed` dependencies.

Over in https://github.com/dask/community/issues/155#issuecomment-841278326 I’m proposing we start pinning `dask` and `distributed` more tightly to avoid these types of version inconsistency issues. If you have any thoughts on the topic, please feel free to engage over in that issue.

Closing as discussion moved over to the `dask/community` issue tracker and the relevant folks have been pinged here for visibility.
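
Until tighter pins exist in the packages themselves, a CI script could request the same version of both libraries explicitly. The sketch below is a user-side workaround, not the pinning scheme proposed in the linked discussion; the function name `pinned_install_command` is hypothetical, and it assumes the requested version of both packages is available in the configured channels.

```python
def pinned_install_command(version, packages=("dask", "distributed")):
    """Build a conda command that asks for the same version of every package."""
    specs = [f"{name}={version}" for name in packages]
    return ["conda", "install", "-y", *specs]


if __name__ == "__main__":
    # Show the command for the last release that both packages shared.
    print(" ".join(pinned_install_command("2021.4.0")))
```

The returned list can be passed to `subprocess.run`. If one of the two packages has not yet been published at the requested version, the install should fail at solve time instead of producing a mixed environment.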