Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

The dataset should at least contain 2 batches to be split

See original GitHub issue
import pandas as pd
import numpy as np
import autokeras as ak
from tensorflow.keras.datasets import cifar10
from tensorflow.python.keras.utils.data_utils import Sequence
from tensorflow.keras.models import model_from_json
import os

def build_model():
    # Functional AutoKeras graph: two ConvBlocks, two DenseBlocks, 10-class head.
    input_layer = ak.Input()
    cnn_layer = ak.ConvBlock()(input_layer)
    cnn_layer2 = ak.ConvBlock()(cnn_layer)
    dense_layer = ak.DenseBlock()(cnn_layer2)
    dense_layer2 = ak.DenseBlock()(dense_layer)
    output_layer = ak.ClassificationHead(num_classes=10)(dense_layer2)
    automodel = ak.auto_model.AutoModel(
        input_layer, output_layer, max_trials=20, seed=123, project_name="automl"
    )
    return automodel

def build():
    (trainX, trainY), (testX, testY) = cifar10.load_data()
    automodel = build_model()
    # The ValueError below is raised here, while AutoKeras splits off the validation set.
    automodel.fit(trainX, trainY, validation_split=0.2, epochs=40, batch_size=64)

if __name__ == '__main__':
    build()

I got this error even when trying the example in the docs:


    automodel.fit(trainX,trainY,validation_split=0.2,epochs=40,batch_size=64)
  File "S:\Anaconda\envs\tensor37\lib\site-packages\autokeras\auto_model.py", line 276, in fit
    validation_split=validation_split,
  File "S:\Anaconda\envs\tensor37\lib\site-packages\autokeras\auto_model.py", line 409, in _prepare_data
    dataset, validation_split
  File "S:\Anaconda\envs\tensor37\lib\site-packages\autokeras\utils\data_utils.py", line 47, in split_dataset
    "The dataset should at least contain 2 batches to be split."
ValueError: The dataset should at least contain 2 batches to be split.


autokeras 1.0.8, keras 2.3.1, tensorflow 2.1.0, numpy 1.19.1, pandas 1.1.1, keras-tuner 1.0.2rc1, python 3.7.7
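
For context, the traceback points at split_dataset in autokeras/utils/data_utils.py: when validation_split is passed, AutoKeras batches the data and then carves the validation set off the batched dataset, which therefore has to contain at least two batches. A minimal standalone sketch of that constraint (illustrative only, not AutoKeras's actual code):

import math

def check_can_split(num_samples, batch_size):
    # Mirrors the "at least 2 batches" requirement behind the ValueError above.
    num_batches = math.ceil(num_samples / batch_size)
    if num_batches < 2:
        raise ValueError(
            "The dataset should at least contain 2 batches to be split."
        )
    return num_batches

# With the reproduction above, 50,000 CIFAR-10 training samples at
# batch_size=64 give 782 batches, comfortably above the minimum.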

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 4
  • Comments: 19 (5 by maintainers)

Top GitHub Comments

1 reaction
haifeng-jin commented, Oct 12, 2020

@ciessielski @jisho-iemoto @Cariaga Would you confirm that the training data you are using has enough samples to form at least 2 batches, i.e. more than the batch_size? For example, if your batch_size is 32 (the default), then your data should contain at least 33 samples.
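
Put differently, a quick sanity check before calling fit() could look like the sketch below (using the trainX and batch_size names from the reproduction script above; adjust to whatever you actually pass in):

batch_size = 64  # the value passed to fit() in the reproduction above
assert len(trainX) > batch_size, (
    f"Only {len(trainX)} samples; need more than {batch_size} "
    f"(i.e. at least 2 batches) for validation_split to work."
)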

0 reactions
neel04 commented, Dec 3, 2021

Converting to numpy arrays actually seems to help.
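
For anyone hitting this with pandas DataFrames, Python lists, or a generator, the workaround above amounts to converting the inputs to plain numpy arrays before calling fit(). A sketch, assuming trainX and trainY are whatever you would otherwise pass in directly:

import numpy as np

trainX = np.asarray(trainX)  # e.g. from a list or DataFrame.values
trainY = np.asarray(trainY)
automodel.fit(trainX, trainY, validation_split=0.2, epochs=40, batch_size=64)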

Read more comments on GitHub >

Top Results From Across the Web

Training on batch: how do you split the data? - Zero with Dot
This approach involves splitting a dataset into a series of smaller data chunks that are handed to the model one at a time....
Read more >
tensorflow:Your input ran out of data - python - Stack Overflow
Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 2 batches). You may need...
Read more >
Structured Data Classification - AutoKeras
The data should be two-dimensional with numerical or categorical values. For the classification labels, AutoKeras accepts both plain labels, i.e. strings or ...
Read more >
Load text | TensorFlow Core
Split the dataset into training and test sets. The Keras TextVectorization layer also batches and pads the vectorized data. Padding is required because...
Read more >
Prepare training data | Vertex AI | Google Cloud
The dataset must have at least 2 and no more than 1,000 columns. For datasets that train AutoML models, one column must be...
Read more >
