Cannot unpickle data frame made with 0.19.2 after upgrade to 0.20.1

Hello,

Problem description

When we create a data frame with pandas ≤ 0.19.2 and pickle it (using pickle.dump), it is not possible to unpickle it using pandas 0.20.1.

# Using pandas 0.19.2
import pandas as pd
import pickle as pkl
data = pd.DataFrame({'x': [1, 2]})
# Use a context manager so the file is flushed and closed after writing
with open("data_pd_0.19.2.pkl", "wb") as f:
    pkl.dump(data, f)
# After upgrade to pandas 0.20.1
import pandas as pd
import pickle as pkl
pkl.load(open("data_pd_0.19.2.pkl", "rb"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pandas.indexes'

First analysis

  • It seems that pandas.indexes has been refactored to pandas.core.indexes.
  • I don’t know whether there are other incompatible changes of this kind.

Proposal

It would be great to have:

  • A deprecation warning when unpickling an old data frame
  • Support for loading old data frames, with automatic conversion to the new format, so that we can upgrade by re-pickling the unpickled data frames

Thanks a lot for your help. Best regards.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

6 reactions
matjazk commented, Jun 1, 2017

Going from pandas 0.18.1 to 0.20.1, I encountered the same problem when loading with joblib. joblib.load fails with exactly the same error: ImportError: No module named 'pandas.indexes'

Once that is fixed (see the first workaround), a second error appears: AttributeError: module 'pandas.core.base' has no attribute 'FrozenNDArray'

After the second workaround, the files load. In my case this seems to be more of a question for the joblib devs.

Two (ugly) workarounds:

import sys
# 1: alias the old module path to its new location
import pandas.core.indexes
sys.modules['pandas.indexes'] = pandas.core.indexes
# 2: re-expose FrozenNDArray on the module where old pickles look for it
import pandas.core.base, pandas.core.indexes.frozen
setattr(sys.modules['pandas.core.base'], 'FrozenNDArray', pandas.core.indexes.frozen.FrozenNDArray)
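
A less global alternative to the first workaround is to remap module paths only while unpickling, by overriding `find_class` on a `pickle.Unpickler` subclass. The sketch below is illustrative only: `RenamingUnpickler`, `remap_module`, `load_with_renames` and `RENAMED_PREFIXES` are hypothetical names, and the mapping contains just the one rename reported in this issue. It does not handle classes that moved between modules, such as the `FrozenNDArray` case covered by the second workaround.

```python
import pickle

# Old-to-new module prefixes. The single entry comes from the error reported
# in this issue; whether other modules moved is an open question here.
RENAMED_PREFIXES = {"pandas.indexes": "pandas.core.indexes"}


def remap_module(module, renames):
    """Return the new module path if `module` falls under a renamed prefix."""
    for old, new in renames.items():
        if module == old or module.startswith(old + "."):
            return new + module[len(old):]
    return module


class RenamingUnpickler(pickle.Unpickler):
    """Hypothetical helper: rewrites module paths before pickle imports them."""

    def __init__(self, file, renames=None, **kwargs):
        super().__init__(file, **kwargs)
        self._renames = RENAMED_PREFIXES if renames is None else renames

    def find_class(self, module, name):
        # Delegate to the standard lookup, but with the remapped module path.
        return super().find_class(remap_module(module, self._renames), name)


def load_with_renames(file, renames=None):
    """Unpickle from a binary file object, applying the prefix renames."""
    return RenamingUnpickler(file, renames).load()
```

To try it, open the old file in binary mode and pass the file object to `load_with_renames`. Unlike patching `sys.modules`, this leaves the interpreter's module table untouched, but it only affects loads that go through this helper.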

4 reactions
jreback commented, May 24, 2017

The big red box is clear that pd.read_pickle is the pickle reader and makes things backward compatible. Further, the whatsnew notes have quite a large section on what changed here.

Sure, a direct call to pickle.loads will work, but this is not guaranteed across versions.
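
Following that advice, the upgrade path requested in the issue can be sketched with pandas' own reader and writer. This is a minimal sketch, assuming the old file is one that `pd.read_pickle` can read under the newer pandas, as the comment above states; `upgrade_pickle` and its path arguments are hypothetical names used only for illustration.

```python
import pandas as pd


def upgrade_pickle(old_path, new_path):
    """Load a pickled DataFrame with pd.read_pickle and re-save it."""
    # pandas' own reader, which the comment above describes as backward compatible
    frame = pd.read_pickle(old_path)
    # Re-pickle with the currently installed pandas
    frame.to_pickle(new_path)
    return frame
```

Calling it with the old file name from the issue and a new destination name would produce a file written by the installed pandas. It does not cover files written with joblib, which the first comment reports as failing with the same error.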
