AttributeError: 'DatasetDict' object has no attribute 'train_test_split'
The following code fails with "'DatasetDict' object has no attribute 'train_test_split'". Am I doing something wrong?
from datasets import load_dataset
dataset = load_dataset('csv', data_files='data.txt')
dataset = dataset.train_test_split(test_size=0.1)
AttributeError: 'DatasetDict' object has no attribute 'train_test_split'
dataset = load_dataset('csv', data_files=['files/datasets/dataset.csv'])
dataset = dataset['train']
dataset = dataset.train_test_split(test_size=0.1)
Hi @david-waterworth!
As indicated in the error message, load_dataset("csv") returns a DatasetDict object, which is a mapping of str to Dataset objects. I believe in this case the behavior is to return a train split with all the data. train_test_split is a method of the Dataset object, so you will need to select the train split from the DatasetDict first and then call train_test_split on that Dataset.
Please let me know if this helps. 🙂