save/load models in py2 and py3

I need to work with different Python versions on the same system, with the same package versions.

Description

I know there is no guarantee of compatibility across scikit-learn versions or architectures, but it seems there is no compatibility across Python versions either. I have a simple KMeans model that I train and export in py2 and also want to use in py3, but py3 reports that the model is not fitted. Retraining fails as well, because some internal attributes are missing.

Steps/Code to Reproduce

training code

import numpy as np
from sklearn import cluster

locations = np.random.random((250, 2)) * 5

kmean = cluster.KMeans(n_clusters=60, verbose=True)
kmean.fit(locations)
np.savez(open('filename.npz', 'wb'), dict(kmeans=kmean, data=locations))

dump = np.load(open('filename.npz', 'rb'))
dump = dict(dump[dump.files[0]].tolist())
dump['kmeans'].predict(locations)

from sklearn.externals import joblib
joblib.dump(kmean, 'filename.pkl')
kmean2 = joblib.load('filename.pkl')
kmean2.predict(locations)

prediction code

import numpy as np

try:  # for py3
    dump = np.load('filename.npz', encoding='bytes')
    dump = dict(dump[dump.files[0]].tolist())
    dump = {str(k.decode('utf-8')): dump[k] for k in dump}
except Exception:  # for py2
    dump = np.load(open('filename.npz', 'rb'))
    dump = dict(dump[dump.files[0]].tolist())
    # dump = {str(k): dump[k] for k in dump}

dump['kmeans'].predict(dump['data'])
dump['kmeans'].fit(dump['data'])
dump['kmeans'].predict(dump['data'])

from sklearn.externals import joblib
kmean = joblib.load('filename.pkl')

kmean.predict(dump['data'])
kmean.fit(dump['data'])
kmean.predict(dump['data'])

Expected Results

Predicted values, or at least the option to retrain the model.

Actual Results

Model reported as not fitted:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-d6f7fad3d86c>", line 1, in <module>
    dump['kmeans'].predict(dump['data'])
  File "/usr/local/lib/python3.5/dist-packages/sklearn/cluster/k_means_.py", line 955, in predict
    check_is_fitted(self, 'cluster_centers_')
  File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 690, in check_is_fitted
    raise _NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This KMeans instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

Not compatible (retraining fails):

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-2cb7b6e4f231>", line 1, in <module>
    dump['kmeans'].fit(dump['data'])
  File "/usr/local/lib/python3.5/dist-packages/sklearn/cluster/k_means_.py", line 879, in fit
    random_state = check_random_state(self.random_state)
AttributeError: 'KMeans' object has no attribute 'random_state'
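
One possible reading of both tracebacks, offered as an assumption rather than a confirmed diagnosis: when a Python 2 pickle is loaded in Python 3 with encoding='bytes', attribute names come back as bytes. The instance then holds b'random_state' instead of random_state, so every lookup by the normal name fails. The sketch below reproduces that state on a hypothetical stand-in class and repairs it by decoding the keys. `Model` and `decode_attribute_names` are illustrative names, not scikit-learn API.

```python
class Model(object):
    """Hypothetical stand-in for an estimator; not a scikit-learn class."""

    def __init__(self, random_state=None):
        self.random_state = random_state


def decode_attribute_names(obj, encoding="utf-8"):
    """Rename bytes keys in obj.__dict__ to text so normal attribute access works."""
    state = vars(obj)  # the instance __dict__ itself, so changes are applied in place
    for key in list(state):  # copy the keys because the dict is modified in the loop
        if isinstance(key, bytes):
            state[key.decode(encoding)] = state.pop(key)
    return obj


# Simulate an instance whose attribute names were restored as bytes.
broken = Model.__new__(Model)
broken.__dict__.update({b"random_state": 0, b"n_clusters": 60})
was_visible_before = hasattr(broken, "random_state")

fixed = decode_attribute_names(broken)
is_visible_after = hasattr(fixed, "random_state")
```

The helper only renames top-level attribute names. Nested objects and array contents are left untouched, so a real estimator may need more than this.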

Versions

python: 2.7.12 and 3.5.2
numpy: 1.13.1
scikit-learn: 0.18.1

Issue Analytics

Created 6 years ago
Comments: 5 (5 by maintainers)

Top GitHub Comments

Borda commented, Nov 29, 2017

OK, the reconstruction seems to work, but it is neither a nice nor a clean solution…

kmean3 = cluster.KMeans()
kmean3.set_params(**kmean.get_params())
kmean3.cluster_centers_, kmean3.labels_, kmean3.inertia_, kmean3.n_iter_ = \
    kmean.cluster_centers_, kmean.labels_, kmean.inertia_, kmean.n_iter_

kmean3.predict(locations)
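
The same reconstruction idea can be taken one step further by storing only plain data instead of a pickled object. The sketch below is an assumption about how that could look, not something proposed in the thread. It writes hyperparameters and cluster centres to JSON, which reads identically in py2 and py3, and assigns points to the nearest centre by squared Euclidean distance. The helper names are hypothetical. With a real model the inputs would come from kmean.get_params() and kmean.cluster_centers_.tolist().

```python
import json


def to_portable(params, centers):
    """Serialize hyperparameters and cluster centres to a JSON string."""
    return json.dumps({"params": params, "centers": centers}, sort_keys=True)


def from_portable(text):
    """Inverse of to_portable: return (params, centers)."""
    state = json.loads(text)
    return state["params"], state["centers"]


def predict(points, centers):
    """Return, for each point, the index of the nearest centre."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))
        for point in points
    ]


# Toy values standing in for a fitted model.
params = {"n_clusters": 2, "random_state": 0}
centers = [[0.0, 0.0], [5.0, 5.0]]

text = to_portable(params, centers)
loaded_params, loaded_centers = from_portable(text)
labels = predict([[0.5, 0.2], [4.0, 4.5]], loaded_centers)
```

This covers prediction only. Other fitted attributes such as labels_ and inertia_ would have to be stored the same way if they are needed.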

amueller commented, Nov 29, 2017

Yeah, but that’s really a limitation of Python/pickle.
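
To illustrate the pickle side of this, here is a small sketch using only the standard library. The byte string is a hand-written protocol 0 pickle of the dict {'random_state': 0} in the form Python 2 emits for its str type. It is an illustrative payload, not data taken from the issue. Loading it in Python 3 with encoding='bytes' yields bytes keys, while encoding='latin1' yields text keys. The Python documentation names latin1 as the encoding required for NumPy arrays pickled by Python 2. Whether it fully restores a scikit-learn estimator was not verified in the thread.

```python
import pickle

# Protocol 0 pickle of {'random_state': 0} with a Python 2 str key (STRING opcode).
PY2_PICKLE = b"(dp0\nS'random_state'\np1\nI0\ns."

# encoding='bytes' keeps Python 2 str objects as bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

# encoding='latin1' decodes Python 2 str objects to Python 3 str.
as_text = pickle.loads(PY2_PICKLE, encoding="latin1")

bytes_keys = sorted(as_bytes)
text_keys = sorted(as_text)
```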
