save/load models in py2 and py3


I need to run the same system, with the same package versions, on different Python versions.

Description

I know there is no guarantee of compatibility across scikit-learn versions or architectures, but there also seems to be no compatibility across Python versions. I have a simple KMeans model that I train and export in py2 and also want to use in py3, but py3 reports that the model is not fitted. Worse, retraining fails because some internal attributes are missing.

Steps/Code to Reproduce

training code

import numpy as np
from sklearn import cluster

# 250 random 2-D points in [0, 5)
locations = np.random.random((250, 2)) * 5

kmean = cluster.KMeans(n_clusters=60, verbose=True)
kmean.fit(locations)
# the dict is stored as an object array, so the estimator is pickled inside the .npz
np.savez(open('filename.npz', 'wb'), dict(kmeans=kmean, data=locations))

# round trip through the .npz file in the same interpreter
dump = np.load(open('filename.npz', 'rb'))
dump = dict(dump[dump.files[0]].tolist())
dump['kmeans'].predict(locations)

# same round trip through joblib
from sklearn.externals import joblib
joblib.dump(kmean, 'filename.pkl')
kmean2 = joblib.load('filename.pkl')
kmean2.predict(locations)

prediction code

import numpy as np

try:  # for py3
    # encoding='bytes' loads py2 strings as bytes, so the dict keys are decoded back to str
    dump = np.load('filename.npz', encoding='bytes')
    dump = dict(dump[dump.files[0]].tolist())
    dump = {str(k.decode('utf-8')): dump[k] for k in dump}
except:  # for py2
    dump = np.load(open('filename.npz', 'rb'))
    dump = dict(dump[dump.files[0]].tolist())
    # dump = {str(k): dump[k] for k in dump}

# model restored from the .npz file: predict, retrain, predict again
dump['kmeans'].predict(dump['data'])
dump['kmeans'].fit(dump['data'])
dump['kmeans'].predict(dump['data'])

# model restored from the joblib file: same three steps
from sklearn.externals import joblib
kmean = joblib.load('filename.pkl')

kmean.predict(dump['data'])
kmean.fit(dump['data'])
kmean.predict(dump['data'])

Expected Results

Predicted values, or at least the option to retrain the model.

Actual Results

The model is reported as not fitted:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-d6f7fad3d86c>", line 1, in <module>
    dump['kmeans'].predict(dump['data'])
  File "/usr/local/lib/python3.5/dist-packages/sklearn/cluster/k_means_.py", line 955, in predict
    check_is_fitted(self, 'cluster_centers_')
  File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 690, in check_is_fitted
    raise _NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This KMeans instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

Retraining fails because an attribute is missing:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-2cb7b6e4f231>", line 1, in <module>
    dump['kmeans'].fit(dump['data'])
  File "/usr/local/lib/python3.5/dist-packages/sklearn/cluster/k_means_.py", line 879, in fit
    random_state = check_random_state(self.random_state)
AttributeError: 'KMeans' object has no attribute 'random_state'
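
One plausible reading of both tracebacks, offered as an assumption rather than something the issue confirms: loading with encoding='bytes' turns every Python 2 string in the pickle into bytes, including the attribute names in the estimator's state. The restored object would then carry b'cluster_centers_' and b'random_state' instead of the text names that scikit-learn looks up. The sketch below uses only the standard library. PY2_PICKLE is a hand-written protocol-0 payload standing in for what Python 2 would write for {'random_state': 0}, and Model and decode_keys are hypothetical names used for illustration.

```python
import pickle

# Hand-written stand-in (assumption) for the protocol-0 pickle that
# Python 2 produces for {'random_state': 0}; the key is a py2 str.
PY2_PICKLE = b"(dp0\nS'random_state'\np1\nI0\ns."

# encoding='bytes' keeps py2 strings as bytes objects.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")
# encoding='latin1' decodes py2 strings to text.
as_text = pickle.loads(PY2_PICKLE, encoding="latin1")


class Model(object):
    """Hypothetical stand-in for an estimator restored from its state dict."""


def decode_keys(state):
    """Return a copy of state with bytes keys decoded to text (hypothetical helper)."""
    return {
        (key.decode("utf-8") if isinstance(key, bytes) else key): value
        for key, value in state.items()
    }


broken = Model()
broken.__dict__.update(as_bytes)   # attribute stored under b'random_state'

fixed = Model()
fixed.__dict__.update(decode_keys(as_bytes))

print(hasattr(broken, "random_state"))  # False: only the bytes key exists
print(hasattr(fixed, "random_state"))   # True
```

If this reading is right, decoding only the keys of the outer dict, as the prediction code does, leaves the estimator's own attribute names untouched.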

Versions

python: 2.7.12 and 3.5.2
numpy: 1.13.1
scikit-learn: 0.18.1

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
Borda commented, Nov 29, 2017

OK, the reconstruction seems to work, but it is neither a nice nor a clean solution…

kmean3 = cluster.KMeans()
kmean3.set_params(**kmean.get_params())
kmean3.cluster_centers_, kmean3.labels_, kmean3.inertia_, kmean3.n_iter_ = \
    kmean.cluster_centers_, kmean.labels_, kmean.inertia_, kmean.n_iter_

kmean3.predict(locations)
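
A variation on this reconstruction, sketched under assumptions rather than taken from the thread: store only the fitted arrays as plain NumPy data, so that nothing pickled has to cross the py2/py3 boundary. The helpers save_centers, load_centers and predict are hypothetical names. In practice centers would be kmean.cluster_centers_, and predict mirrors the nearest-center assignment that KMeans.predict performs, without needing scikit-learn on the loading side.

```python
import io

import numpy as np


def save_centers(file, centers):
    """Write the cluster centers as a plain float array (no pickled objects)."""
    np.savez(file, cluster_centers=np.asarray(centers, dtype=np.float64))


def load_centers(file):
    """Read the centers back; allow_pickle=False rejects any pickled payload."""
    with np.load(file, allow_pickle=False) as archive:
        return archive["cluster_centers"]


def predict(centers, points):
    """Label each point with the index of its nearest center (squared Euclidean)."""
    diffs = points[:, np.newaxis, :] - centers[np.newaxis, :, :]
    return np.argmin((diffs ** 2).sum(axis=2), axis=1)


# Toy centers standing in for kmean.cluster_centers_ (assumption).
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

buffer = io.BytesIO()          # in-memory stand-in for 'filename.npz'
save_centers(buffer, centers)
buffer.seek(0)

restored = load_centers(buffer)
labels = predict(restored, np.array([[0.2, 0.1], [4.8, 5.3]]))
print(labels)
```

The trade-off is that only prediction survives the transfer. Retraining still needs a fresh KMeans built from get_params(), as in the snippet above.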

0 reactions
amueller commented, Nov 29, 2017

yeah, but that’s really a limitation of Python/pickle

