
Hi Tianhe, there are some problems with the dataset from http://rail.eecs.berkeley.edu/datasets/mil_data.zip. I use the following code to check the files in './data/sim_push':

import glob
import pickle

file_dir = './data/sim_push'
file_list = glob.glob(file_dir + "/*.pkl")
bad_file = []
for i in range(len(file_list)):
    try:
        # Pickle data is binary, so open the file in 'rb' mode.
        with open(file_list[i], 'rb') as f:
            data = pickle.load(f)
    except Exception:
        # Record every file that fails to unpickle.
        bad_file.append(file_list[i])
print("open(file_list): bad_file: ", len(bad_file))
print(bad_file)

There are 78 files that can't be loaded using pickle. If I change open(file_list[i], 'rb') to open(file_list[i]), 753 files fail to load (none of the files can be loaded normally).
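To see why each file fails instead of only counting failures, a variant of the check above could keep the exception for every bad file. This is a minimal sketch for Python 3; the helper name `collect_load_errors` is hypothetical and not part of the original code.

```python
import glob
import pickle
from collections import Counter


def collect_load_errors(file_dir):
    """Try to unpickle every .pkl file in file_dir; map failing paths to the error."""
    errors = {}
    for path in sorted(glob.glob(file_dir + "/*.pkl")):
        try:
            with open(path, "rb") as f:
                pickle.load(f)
        except Exception as exc:  # keep the exception instead of discarding it
            errors[path] = exc
    return errors


if __name__ == "__main__":
    errors = collect_load_errors("./data/sim_push")
    # Group failures by exception type, e.g. UnicodeDecodeError vs. EOFError.
    print(Counter(type(exc).__name__ for exc in errors.values()))
```

If most failures share one exception type, that usually points at a single cause (for example an encoding mismatch) rather than many individually corrupted files.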

I tried both Python 2.7.6 and Python 3.6.3 to load your pickle files, but got the same problem. For Python 2.7.6, print(pickle.format_version) gives 2.0. For Python 3.6.3, print(pickle.format_version) gives 4.0.

I guess that the problem may be the version of the pickle package. Could you tell me which version of pickle you used when you dumped the pickle files?
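Note that pickle.format_version describes the pickle module of the running interpreter, not any particular file. To check which protocol a file was written with, one option is to read its first two bytes: pickles written with protocol 2 or higher begin with the PROTO opcode (0x80) followed by the protocol number. The sketch below assumes Python 3, and the helper name `pickle_protocol` is hypothetical.

```python
import pickle


def pickle_protocol(path):
    """Return the protocol number declared at the start of a pickle file.

    Returns None when the file does not begin with the PROTO opcode (0x80),
    which is the case for protocol 0 and 1 pickles.
    """
    with open(path, "rb") as f:
        header = f.read(2)
    if len(header) == 2 and header[0:1] == b"\x80":
        return header[1]  # indexing bytes yields an int in Python 3
    return None
```

A file that reports protocol 2 here but still fails under Python 3 was plausibly written by Python 2, in which case the failure may come from string decoding rather than from the protocol version itself.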

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
igiorgi commented, Mar 22, 2021

Try this:

results = pickle.load(f, encoding="latin1")
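The encoding argument exists only in Python 3. It controls how 8-bit strings pickled by Python 2 are decoded, and the default is ASCII. A minimal sketch of a loader that tries the default first and falls back to latin1 could look like this; the function name `load_py2_pickle` is hypothetical, and whether latin1 is the right choice for this dataset is an assumption based on the suggestion above.

```python
import pickle


def load_py2_pickle(path):
    """Load a pickle, retrying with encoding="latin1" if ASCII decoding fails."""
    with open(path, "rb") as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            f.seek(0)  # rewind before the second attempt
            return pickle.load(f, encoding="latin1")
```

With latin1, every byte maps to one character, so decoding cannot fail. If the original values were raw binary data rather than text, encoding="bytes" is an alternative that keeps them as bytes objects.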

0 reactions
raozhongyu commented, Oct 30, 2020

Did you solve this problem?
