Dataset problem
Hi, Tianhe. There are some problems with the dataset from http://rail.eecs.berkeley.edu/datasets/mil_data.zip. I used the following code to check the dataset in './data/sim_push':
import glob
import pickle

file_dir = './data/sim_push'
file_list = glob.glob(file_dir + "/*.pkl")
bad_file = []
for i in range(len(file_list)):
    try:
        # Pickle files must be opened in binary mode.
        with open(file_list[i], 'rb') as f:
            data = pickle.load(f)
    except Exception:
        # Record every file that fails to unpickle.
        bad_file.append(file_list[i])
print("open(file_list): bad_file: ", len(bad_file))
print(bad_file)
There are 78 files that can't be loaded with pickle. If I change open(file_list[i], 'rb') to open(file_list[i]), 753 files fail to load (that is, none of the files load normally).
I used Python 2.7.6 and Python 3.6.3 to load your pickle files and got the same problem with both. For Python 2.7.6, print(pickle.format_version) gives 2.0. For Python 3.6.3, print(pickle.format_version) gives 4.0.
I guess the problem may be the version of the pickle package. Could you tell me which version of pickle you used when you dumped the pickle files?
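The check above counts failures but discards the reason for each one. A minimal sketch that records the exception type per file may help tell truncated or corrupt files (typically EOFError or pickle.UnpicklingError) apart from string-decoding problems (typically UnicodeDecodeError under Python 3). The function name scan_pickles is hypothetical and not part of the original repository; the directory path is the one used in the post.

```python
import glob
import pickle
from collections import Counter


def scan_pickles(file_dir):
    """Return {path: "ExceptionType: message"} for every .pkl file that fails to load.

    scan_pickles is a hypothetical helper written for this sketch.
    """
    failures = {}
    for path in sorted(glob.glob(file_dir + "/*.pkl")):
        try:
            with open(path, "rb") as f:
                pickle.load(f)
        except Exception as exc:
            # Keep the exception type so different failure causes can be separated.
            failures[path] = "{}: {}".format(type(exc).__name__, exc)
    return failures


if __name__ == "__main__":
    failures = scan_pickles("./data/sim_push")
    # Summarize how many files failed with each exception type.
    print(Counter(msg.split(":")[0] for msg in failures.values()))
```

If most failures turn out to share one exception type, that points at a single cause rather than 78 independently damaged files.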
Issue Analytics
- State:
- Created 5 years ago
- Comments:5 (1 by maintainers)

Try this:
results = pickle.load(f, encoding="latin1")
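The encoding argument exists only in Python 3's pickle.load, so the suggestion applies when reading files written by Python 2 from Python 3. A minimal sketch of a loader that tries the default first and falls back to latin1 is shown here. The name load_pickle_compat is hypothetical, and whether the 78 failing files are actually fixed by this is an assumption the thread does not confirm.

```python
import pickle


def load_pickle_compat(path):
    """Load a pickle under Python 3, retrying with encoding="latin1" on decode errors.

    load_pickle_compat is a hypothetical helper written for this sketch.
    """
    with open(path, "rb") as f:
        try:
            # Default behaviour: Python 2 str objects are decoded as ASCII.
            return pickle.load(f)
        except UnicodeDecodeError:
            # Python 2 str objects with non-ASCII bytes land here.
            # latin1 maps every byte to a code point, so decoding cannot fail.
            f.seek(0)
            return pickle.load(f, encoding="latin1")
```

The fallback only handles decoding errors. A file that is truncated or otherwise corrupt will still raise, which is the intended behaviour for a check like the one in the original post.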
Did you solve this problem?