dask-scheduler: Failed to deserialize, KeyError: 'lz4'
We are using Dask distributed for some custom calculations. Dask is deployed in Docker via docker-compose. The dask-scheduler container runs the latest daskdev/dask image, but the worker is custom in the sense that it is started from within our own code. Given the lz4 error, I think it is relevant that we do install lz4 in the worker's interpreter.
It seems that since the worker has the lz4 package, it uses it to compress outgoing messages, while the scheduler, which lacks the package, cannot decompress them.
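The mechanism behind this can be sketched in plain Python. This is a simplified stand-in for distributed's compression registry, not its actual code: each process only registers the codecs importable in its own interpreter, the sender records its codec choice in the message header, and the receiver looks that name up in its own registry — which is exactly the `compressions[header['compression']]` lookup that raises in the traceback below. Here zlib stands in for lz4 so the sketch runs anywhere.

```python
import zlib

def build_compression_registry(lz4_installed):
    # Simplified stand-in for distributed.protocol's registry: each side
    # registers only the codecs importable in *its* interpreter.
    registry = {None: {'compress': lambda b: b, 'decompress': lambda b: b}}
    if lz4_installed:
        # zlib stands in for lz4, which only the worker can import here
        registry['lz4'] = {'compress': zlib.compress, 'decompress': zlib.decompress}
    return registry

worker = build_compression_registry(lz4_installed=True)      # pip-installed lz4
scheduler = build_compression_registry(lz4_installed=False)  # image without lz4

# Worker side: picks the best codec it has and records the choice in the header.
header = {'compression': 'lz4'}
frame = worker['lz4']['compress'](b'task result')

# Scheduler side: the equivalent of the failing lookup in the traceback
# (distributed/protocol/core.py, loads_msgpack).
try:
    decompress = scheduler[header['compression']]['decompress']
except KeyError as exc:
    print('KeyError:', exc)  # the same error the scheduler logs
```

Installing lz4 on only one side is therefore enough to break the connection in one direction.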
Also, the daskdev/dask image on Docker Hub was updated about a day ago, so this may also be a version mismatch.
CPython on worker: 3.6.7
distributed on worker: 1.25.3
docker-compose part for dask-scheduler:

```yaml
dask_scheduler:
  image: daskdev/dask
  ports:
    - 8787:8787
  command: ['dask-scheduler']
  container_name: dask_scheduler
```
Worker code:

```python
from threading import Thread
from time import sleep

from distributed import Worker
from redis import StrictRedis
from sqlalchemy import create_engine
from tornado.ioloop import IOLoop
import json_logging
# some more imports

if __name__ == '__main__':
    if config.JSON_LOGGING:
        json_logging.COMPONENT_NAME = 'controller'
        json_logging.ENABLE_JSON_LOGGING = True
        json_logging.init()

    loop = IOLoop.current()
    t = Thread(target=loop.start, daemon=True)
    t.start()

    w = Worker(f'tcp://{config.DASK_SCHEDULER_HOST}:{config.DASK_SCHEDULER_PORT}',
               loop=loop, ncores=4)
    ...
    # here I init some redis, kafka, etc, later passed into Runner
    ...
    w.runner = Runner(...)  # custom stuff used for stateful calculations
    w.start()
    while True:
        sleep(1)
```
Error from dask-scheduler (timestamps stripped; the log lines arrived interleaved and are reassembled here, with the repeated tornado `gen.py` scheduling frames elided):

```
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 192, in loads_msgpack
    decompress = compressions[header['compression']]['decompress']
KeyError: 'lz4'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 109, in loads
    msg = loads_msgpack(small_header, small_payload)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 196, in loads_msgpack
    " installed" % str(header['compression']))
ValueError: Data is compressed as lz4 but we don't have this installed

distributed.core - ERROR - Data is compressed as lz4 but we don't have this installed
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/distributed/core.py", line 346, in handle_comm
    result = yield result
  File "/opt/conda/lib/python3.7/site-packages/distributed/scheduler.py", line 2036, in add_client
    yield self.handle_stream(comm=comm, extra={'client': client})
  File "/opt/conda/lib/python3.7/site-packages/distributed/core.py", line 386, in handle_stream
    msgs = yield comm.read()
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/tcp.py", line 207, in read
    deserializers=deserializers)
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/utils.py", line 82, in from_frames
    res = _from_frames()
  File "/opt/conda/lib/python3.7/site-packages/distributed/comm/utils.py", line 68, in _from_frames
    deserializers=deserializers)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 109, in loads
    msg = loads_msgpack(small_header, small_payload)
  File "/opt/conda/lib/python3.7/site-packages/distributed/protocol/core.py", line 196, in loads_msgpack
    " installed" % str(header['compression']))
ValueError: Data is compressed as lz4 but we don't have this installed

distributed.scheduler - INFO - Remove client Client-b99af26c-5acc-11e9-8006-0242c0a81004
distributed.scheduler - INFO - Close client connection: Client-b99af26c-5acc-11e9-8006-0242c0a81004
```
Issue Analytics
- State:
- Created: 4 years ago
- Comments: 8 (6 by maintainers)
Top GitHub Comments
I think so – thanks for following up here @GenevieveBuckley
Yes, you’ll need to ensure that your client, scheduler, and workers have the same software versions. You might consider pinning the same docker image everywhere, or installing software in some other consistent manner. If you’d like to get a printout of all version mismatches you can try the following:
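The snippet originally attached to this comment is not shown above; in distributed, the usual call for this is `Client.get_versions(check=True)`, which gathers Python and package versions from the scheduler and every worker and flags mismatches. For the image-pinning approach, a minimal compose sketch — the tag and the worker service definition are illustrative assumptions, not taken from the original setup:

```yaml
# Pin one explicit tag for every Dask role; "latest" drifts as Docker Hub updates.
dask_scheduler:
  image: daskdev/dask:1.25.3        # example tag -- use whatever your workers run
  ports:
    - 8787:8787
  command: ['dask-scheduler']

dask_worker:
  image: daskdev/dask:1.25.3        # same tag => same distributed/lz4 everywhere
  command: ['dask-worker', 'tcp://dask_scheduler:8786']
  depends_on:
    - dask_scheduler
```

With every role on one image, optional dependencies like lz4 are either present everywhere or absent everywhere, so the compression negotiation cannot go one-sided.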
This should help point you in the right direction.