Got different accuracy between history and evaluate
I fit a model as follows:

```python
history = model.fit(x_train, y_train, epochs=50, verbose=1, validation_data=(x_val, y_val))
```

and got this output:

```
Epoch 48/50
49/49 [==============================] - 0s 3ms/step - loss: 0.0228 - acc: 0.9796 - val_loss: 3.3064 - val_acc: 0.6923
Epoch 49/50
49/49 [==============================] - 0s 3ms/step - loss: 0.0186 - acc: 1.0000 - val_loss: 3.3164 - val_acc: 0.6923
Epoch 50/50
49/49 [==============================] - 0s 2ms/step - loss: 0.0150 - acc: 1.0000 - val_loss: 3.3186 - val_acc: 0.6923
```
However, when I evaluate the model on the training set with `model.evaluate(x_train, y_train)`, I get:

```
[4.552013397216797, 0.44897958636283875]
```

I have no idea how this happens. Thank you.
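This mismatch is usually expected rather than a bug: the `acc` that `fit` prints is a running average over mini-batches computed while the weights are still changing, and layers such as `Dropout` and `BatchNormalization` behave differently in training and inference mode, whereas `model.evaluate` scores the *final* weights in inference mode. A minimal sketch of the comparison (the synthetic data and small architecture here are illustrative assumptions, not taken from the issue):

```python
import numpy as np
from tensorflow import keras

# Tiny synthetic binary-classification problem (illustrative only).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(256, 8)).astype("float32")
y_train = (x_train.sum(axis=1) > 0).astype("float32")

model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(8,)),
    keras.layers.Dropout(0.5),  # active during fit, disabled during evaluate
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

history = model.fit(x_train, y_train, epochs=5, verbose=0)

# fit's accuracy is a per-epoch running average taken *during* training;
# evaluate runs the final weights once, in inference mode.
fit_acc = history.history["accuracy"][-1]
eval_loss, eval_acc = model.evaluate(x_train, y_train, verbose=0)
print(f"fit acc: {fit_acc:.3f}  evaluate acc: {eval_acc:.3f}")
```

With enough epochs the two numbers converge, but early in training (or with heavy dropout) they can differ substantially even on the same data.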
Issue Analytics
- Created: 5 years ago
- Reactions: 13
- Comments: 27 (3 by maintainers)

Same issue here. When training with `fit_generator`, both the training and validation accuracy are much higher than what I get when I evaluate the model manually on the training and test data.
I have found a fix for this issue. I encountered it while using `ImageDataGenerator`s, and my model's accuracy on both the training and validation sets was far lower with `model.evaluate()` than the values returned in the model history. The fix is to set `shuffle=False` when creating your validation generator; your accuracy will then match on the validation set. For the training set it may not match, since we generally keep `shuffle=True` for training. Below is an example of how to create a validation generator for reproducible results (set `shuffle=False`):
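The original example was cut off in the thread, so here is a sketch of the fix using `ImageDataGenerator.flow` on in-memory arrays (the synthetic shapes and rescale factor are assumptions; with `flow_from_directory` the same `shuffle=False` argument applies):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Synthetic validation data (illustrative only).
rng = np.random.default_rng(0)
x_val = rng.random((32, 16, 16, 3)).astype("float32")
y_val = rng.integers(0, 2, size=(32,)).astype("float32")

val_datagen = ImageDataGenerator(rescale=1.0 / 255)

# shuffle=False keeps batches in dataset order, so per-batch predictions
# line up with the labels and match what model.evaluate computes.
val_generator = val_datagen.flow(x_val, y_val, batch_size=8, shuffle=False)

first_batch_x, first_batch_y = next(val_generator)
```

Because the order is deterministic, the first batch's labels are exactly `y_val[:8]`, which is what makes generator-based metrics reproducible.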