How to get classes from generator in case of Shuffle=True

I have 10100 images split into 101 folders (classes) that I want to load as a single dataset using Keras’ datagen.flow_from_directory.

Here is the code:

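from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)

generator = datagen.flow_from_directory(
        train_data_dir,  # root folder containing the 101 class subfolders
        target_size=(150, 150),
        batch_size=12,
        class_mode='sparse',
        shuffle=True)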

However, I want to shuffle the data while doing so, so ideally generator.classes should look something like [0, 4, 5, 3, 2, … <some random number between 0-100>], with the class indices of all 10100 samples in random order. Instead, the actual output shows no randomness at all: 100 0’s, followed by 100 1’s, and so on up to 100 100’s. Is this a bug, or is there a workaround?

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 10

Top GitHub Comments

18 reactions
jwc-rad commented, Jul 30, 2017

This works for both shuffle=False and shuffle=True:

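for batch_x, batch_y in generator:
    # batch_x contains a batch of images
    # batch_y contains the matching labels: integer class indices for
    # class_mode='sparse', one-hot vectors for class_mode='categorical'
    break  # the generator loops indefinitely, so stop explicitly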

Hope this helps.
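
To recover the labels in the order they were actually served, the loop above can be wrapped in a small helper. This is only a sketch: `collect_labels` and `fake_generator` are hypothetical names introduced for illustration, and the stand-in generator replaces a real Keras one so the example runs without image files.

```python
from itertools import islice


def collect_labels(batches, steps):
    """Return labels in the order the generator served them.

    `batches` is any iterable of (batch_x, batch_y) pairs; `steps` is the
    number of batches to consume, because the generator never stops by itself.
    """
    labels = []
    for _batch_x, batch_y in islice(batches, steps):
        labels.extend(batch_y)  # works for lists and 1-D arrays alike
    return labels


# Hypothetical stand-in for a Keras generator with sparse labels.
def fake_generator():
    while True:
        yield ["img_a", "img_b"], [3, 0]
        yield ["img_c", "img_d"], [1, 3]


served = collect_labels(fake_generator(), steps=2)
print(served)  # [3, 0, 1, 3]
```

With the real generator, `steps` would be the number of batches per epoch (for example `len(generator)` in Keras versions where the iterator supports `len()`). For class_mode='categorical', each one-hot row would first need to be converted to a class index.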

4 reactions
varun-bankiti commented, Oct 31, 2016

generator.classes gives the class assigned to each sample based on the sorted order of the folder names (you can check it here). It is just a list of length nb_samples (10100 in your case) in which each entry is that sample’s class index; it is not shuffled at this point.

The samples are shuffled within the batch generator (here), so that when fit_generator or evaluate_generator requests a batch, random samples are returned.

Hope this helps.
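
The distinction can be illustrated with a toy model in plain Python. This is a simplified sketch of the idea, not Keras’ actual implementation; `batch_labels` and the miniature dataset are hypothetical.

```python
import random

# Hypothetical miniature dataset: 3 classes with 4 samples each.
# Sorted by class, the way generator.classes is laid out.
classes = [c for c in range(3) for _ in range(4)]


def batch_labels(classes, batch_size, shuffle, seed=0):
    """Yield one epoch of label batches drawn through an index permutation."""
    order = list(range(len(classes)))
    if shuffle:
        random.Random(seed).shuffle(order)  # only the index order changes
    for start in range(0, len(order), batch_size):
        yield [classes[i] for i in order[start:start + batch_size]]


shuffled = [y for batch in batch_labels(classes, 4, shuffle=True) for y in batch]
print(classes)   # unchanged: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
print(shuffled)  # the same labels, in a permuted order
```

The stored class list is never reordered; shuffling only affects which indices go into each batch, which is why inspecting generator.classes shows the sorted layout even with shuffle=True.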
