TypeError: can not serialize 'Delayed' object for sklearn models on dask-ml==1.8.0
See original GitHub issueOriginally reported in https://stackoverflow.com/questions/67313754/dask-with-tensor-flow-is-failing-with-critical-failed-to-serialize-error
I was able to reproduce the problem without SciKeras/Keras/TensorFlow using just sklearn.linear_model.LogisticRegression
, hence I’m posting here to get some input.
What happened:
Dask crashes giving an error about serializing Delayed
objects.
Minimal Complete Verifiable Example:
Here is a notebook reproducing the issue: link.
For posteriority, here is the code:
# data
import numpy as np
X, y = np.random.uniform(size=(100,)), np.random.randint(0, 2, size=(100,))
# model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
# dask
from dask.distributed import Client
from dask_ml.model_selection import HyperbandSearchCV
client = Client()
search = HyperbandSearchCV(model, {}, 1)
search.fit(X, y)
And the installed versions:
pip freeze
absl-py==0.12.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.1.0
appdirs==1.4.4
argon2-cffi==20.1.0
astor==0.8.1
astropy==4.2.1
astunparse==1.6.3
async-generator==1.10
atari-py==0.2.6
atomicwrites==1.4.0
attrs==20.3.0
audioread==2.1.9
autograd==1.3
Babel==2.9.0
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==3.3.0
blis==0.4.1
bokeh==2.3.1
Bottleneck==1.3.2
branca==0.4.2
bs4==0.0.1
CacheControl==0.12.6
cachetools==4.2.1
catalogue==1.0.0
certifi==2020.12.5
cffi==1.14.5
chainer==7.4.0
chardet==3.0.4
click==7.1.2
cloudpickle==1.6.0
cmake==3.12.0
cmdstanpy==0.9.5
colorcet==2.0.6
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.3.2
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.3
cvxopt==1.2.6
cvxpy==1.0.31
cycler==0.10.0
cymem==2.0.5
Cython==0.29.22
daft==0.0.4
dask==2021.4.1
dask-glm==0.2.0
dask-ml==1.8.0
datascience==0.10.6
debugpy==1.0.0
decorator==4.4.2
defusedxml==0.7.1
descartes==1.1.0
dill==0.3.3
distributed==2021.4.1
dlib==19.18.0
dm-tree==0.1.6
docopt==0.6.2
docutils==0.17
dopamine-rl==1.0.5
earthengine-api==0.1.260
easydict==1.9
ecos==2.0.7.post1
editdistance==0.5.3
en-core-web-sm==2.2.5
entrypoints==0.3
ephem==3.7.7.1
et-xmlfile==1.0.1
fa2==0.3.5
fancyimpute==0.4.3
fastai==1.0.61
fastdtw==0.3.4
fastprogress==1.0.0
fastrlock==0.6
fbprophet==0.7.1
feather-format==0.4.1
filelock==3.0.12
firebase-admin==4.4.0
fix-yahoo-finance==0.0.22
Flask==1.1.2
flatbuffers==1.12
folium==0.8.3
fsspec==2021.4.0
future==0.16.0
gast==0.3.3
GDAL==2.2.2
gdown==3.6.4
gensim==3.6.0
geographiclib==1.50
geopy==1.17.0
gin-config==0.4.0
glob2==0.7
google==2.0.3
google-api-core==1.26.3
google-api-python-client==1.12.8
google-auth==1.28.1
google-auth-httplib2==0.0.4
google-auth-oauthlib==0.4.4
google-cloud-bigquery==1.21.0
google-cloud-bigquery-storage==1.1.0
google-cloud-core==1.0.3
google-cloud-datastore==1.8.0
google-cloud-firestore==1.7.0
google-cloud-language==1.2.0
google-cloud-storage==1.18.1
google-cloud-translate==1.5.0
google-colab==1.0.0
google-pasta==0.2.0
google-resumable-media==0.4.1
googleapis-common-protos==1.53.0
googledrivedownloader==0.4
graphviz==0.10.1
greenlet==1.0.0
grpcio==1.32.0
gspread==3.0.1
gspread-dataframe==3.0.8
gym==0.17.3
h5py==2.10.0
HeapDict==1.0.1
hijri-converter==2.1.1
holidays==0.10.5.2
holoviews==1.14.3
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httplib2shim==0.0.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.10
imageio==2.4.1
imagesize==1.2.0
imbalanced-learn==0.4.3
imblearn==0.0
imgaug==0.2.9
importlib-metadata==3.10.1
importlib-resources==5.1.2
imutils==0.5.4
inflect==2.1.0
iniconfig==1.1.1
intel-openmp==2021.2.0
intervaltree==2.1.0
ipykernel==4.10.1
ipython==5.5.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.6.3
itsdangerous==1.1.0
jax==0.2.12
jaxlib==0.1.65+cuda110
jdcal==1.4.1
jedi==0.18.0
jieba==0.42.1
Jinja2==2.11.3
joblib==1.0.1
jpeg4py==0.1.4
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.3.5
jupyter-console==5.2.0
jupyter-core==4.7.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.0
kaggle==1.5.12
kapre==0.1.3.1
Keras==2.4.3
Keras-Preprocessing==1.1.2
keras-vis==0.4.1
kiwisolver==1.3.1
knnimpute==0.1.0
korean-lunar-calendar==0.2.1
librosa==0.8.0
lightgbm==2.2.3
llvmlite==0.34.0
lmdb==0.99
locket==0.2.1
LunarCalendar==0.0.9
lxml==4.2.6
Markdown==3.3.4
MarkupSafe==1.1.1
matplotlib==3.2.2
matplotlib-venn==0.11.6
missingno==0.4.2
mistune==0.8.4
mizani==0.6.0
mkl==2019.0
mlxtend==0.14.0
more-itertools==8.7.0
moviepy==0.2.3.5
mpmath==1.2.1
msgpack==1.0.2
multipledispatch==0.6.0
multiprocess==0.70.11.1
multitasking==0.0.9
murmurhash==1.0.5
music21==5.5.0
natsort==5.5.0
nbclient==0.5.3
nbconvert==5.6.1
nbformat==5.1.3
nest-asyncio==1.5.1
networkx==2.5.1
nibabel==3.0.2
nltk==3.2.5
notebook==5.3.1
np-utils==0.5.12.1
numba==0.51.2
numexpr==2.7.3
numpy==1.19.5
nvidia-ml-py3==7.352.0
oauth2client==4.1.3
oauthlib==3.1.0
okgrade==0.4.3
opencv-contrib-python==4.1.2.30
opencv-python==4.1.2.30
openpyxl==2.5.9
opt-einsum==3.3.0
osqp==0.6.2.post0
packaging==20.9
palettable==3.3.0
pandas==1.1.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
pandocfilters==1.4.3
panel==0.11.2
param==1.10.1
parso==0.8.2
partd==1.2.0
pathlib==1.0.1
patsy==0.5.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
pip-tools==4.5.1
plac==1.1.3
plotly==4.4.1
plotnine==0.6.0
pluggy==0.7.1
pooch==1.3.0
portpicker==1.3.1
prefetch-generator==1.0.1
preshed==3.0.5
prettytable==2.1.0
progressbar2==3.38.0
prometheus-client==0.10.1
promise==2.3
prompt-toolkit==1.0.18
protobuf==3.12.4
psutil==5.4.8
psycopg2==2.7.6.1
ptyprocess==0.7.0
py==1.10.0
pyarrow==3.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.2
pycparser==2.20
pyct==0.4.8
pydata-google-auth==1.1.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyerfa==1.7.2
pyglet==1.5.0
Pygments==2.6.1
pygobject==3.26.1
pymc3==3.7
PyMeeus==0.5.11
pymongo==3.11.3
pymystem3==0.2.0
PyOpenGL==3.1.5
pyparsing==2.4.7
pyrsistent==0.17.3
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==3.6.4
python-apt==0.0.0
python-chess==0.23.11
python-dateutil==2.8.1
python-louvain==0.15
python-slugify==4.0.1
python-utils==2.5.6
pytz==2018.9
pyviz-comms==2.0.1
PyWavelets==1.1.1
PyYAML==3.13
pyzmq==22.0.3
qdldl==0.1.5.post0
qtconsole==5.0.3
QtPy==1.9.0
regex==2019.12.20
requests==2.23.0
requests-oauthlib==1.3.0
resampy==0.2.2
retrying==1.3.3
rpy2==3.4.3
rsa==4.7.2
scikit-image==0.16.2
scikit-learn==0.24.2
scipy==1.4.1
screen-resolution-extra==0.0.0
scs==2.1.3
seaborn==0.11.1
Send2Trash==1.5.0
setuptools-git==1.2
Shapely==1.7.1
simplegeneric==0.8.1
six==1.15.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==5.0.0
snowballstemmer==2.1.0
sortedcontainers==2.3.0
SoundFile==0.10.3.post1
spacy==2.2.4
Sphinx==1.8.5
sphinxcontrib-serializinghtml==1.1.4
sphinxcontrib-websupport==1.2.4
SQLAlchemy==1.4.7
sqlparse==0.4.1
srsly==1.0.5
statsmodels==0.10.2
sympy==1.7.1
tables==3.4.4
tabulate==0.8.9
tblib==1.7.0
tensorboard==2.4.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.4.1
tensorflow-datasets==4.0.1
tensorflow-estimator==2.4.0
tensorflow-gcs-config==2.4.0
tensorflow-hub==0.12.0
tensorflow-metadata==0.29.0
tensorflow-probability==0.12.1
termcolor==1.1.0
terminado==0.9.4
testpath==0.4.4
text-unidecode==1.3
textblob==0.15.3
textgenrnn==1.4.1
Theano==1.0.5
thinc==7.4.0
threadpoolctl==2.1.0
tifffile==2021.4.8
toml==0.10.2
toolz==0.11.1
torch==1.8.1+cu101
torchsummary==1.5.1
torchtext==0.9.1
torchvision==0.9.1+cu101
tornado==5.1.1
tqdm==4.41.1
traitlets==5.0.5
tweepy==3.10.0
typeguard==2.7.1
typing-extensions==3.7.4.3
tzlocal==1.5.1
uritemplate==3.0.1
urllib3==1.24.3
vega-datasets==0.9.0
wasabi==0.8.2
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wordcloud==1.5.0
wrapt==1.12.1
xarray==0.15.1
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==0.9.1
zict==2.0.0
zipp==3.4.1
Issue Analytics
- State:
- Created 2 years ago
- Comments:9 (7 by maintainers)
Top Results From Across the Web
Dask distributed library giving serialization error - Stack Overflow
The error says that there is some object that you're trying to send to a worker that is not serializable. The type is...
Read more >Scikit-learn Pipeline Persistence and JSON Serialization
These special Python objects cannot be serialized in JSON, as it is limited to at most bool , int , float , and...
Read more >how to save your sklearn models | Analytics Vidhya - Medium
The saving of data is called Serialization, where we store an object as a stream of bytes to save on a disk. Loading...
Read more >Save and Load Machine Learning Models in Python with scikit ...
Finding an accurate machine learning model is not the end of the project. ... Pickle is the standard way of serializing objects in...
Read more >Saving and sharing data — BIOS-823-2020 1.0 documentation
Unfortunately, the get_params method returns values that are Python objects suhc as StandardScaler() that cannot be directly serialized to JSON. [27]:.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Probably! Am I the only one who can do that? I’ll try to make some time this week.
I’ve been meaning to expand the pool of maintainers too. I’ll get to that too.
Should a new (minor?) version of Dask-ML be released if the
main
branch has the necessary fix? This issue has popped up twice in the Dask issue trackers, which sounds like a lot to me.cc @TomAugspurger