Different results between model.evaluate() and model.predict()

I get different results from model.evaluate() and model.predict(). Could someone point out what is wrong in my calculation below? Note that the model, X_test_features, and y_regression_test are identical in the two approaches.

Thank you very much!

Directly use model.evaluate() to get the loss and metrics:

model = define_top_model()
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae', 'mape'])
model.load_weights(model_weights_file)
scores = model.evaluate(X_test_features, y_regression_test, batch_size=batch_size)
logger.info('mse=%f, mae=%f, mape=%f' % (scores[0],scores[1],scores[2]))

The output is: mse=0.551147, mae=0.589529, mape=10.979756

Get the preds NumPy array using model.predict(), then use the Keras metrics functions to calculate the metrics:

from keras import backend as K
from keras import metrics
model = define_top_model()
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae', 'mape'])
model.load_weights(model_weights_file)
preds = model.predict(X_test_features, batch_size=batch_size)
tf_session = K.get_session()
mse = metrics.mean_squared_error(y_regression_test, preds)
mae = metrics.mean_absolute_error(y_regression_test, preds)
mape = metrics.mean_absolute_percentage_error(y_regression_test, preds)
logger.info('mse=%f, mae=%f, mape=%f' % (mse.eval(session=tf_session),
                                                             mae.eval(session=tf_session),
                                                             mape.eval(session=tf_session)))

The output is: mse=0.678286, mae=0.654362, mape=12.249291
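
For a cross-check that involves neither Keras nor a TensorFlow session, the three metrics can be computed in plain NumPy. The following is a minimal sketch, not code from the issue: regression_metrics is a hypothetical helper, the epsilon value is an assumption meant to mirror the Keras backend default, and the function raises on mismatched shapes instead of letting NumPy broadcast them.

```python
import numpy as np

EPSILON = 1e-7  # assumption: mirrors the Keras backend default epsilon


def regression_metrics(y_true, y_pred):
    """Return (mse, mae, mape) computed with plain NumPy."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # Refuse mismatched shapes instead of letting NumPy broadcast them.
    if y_true.shape != y_pred.shape:
        raise ValueError("shape mismatch: %s vs %s" % (y_true.shape, y_pred.shape))
    err = y_true - y_pred
    mse = float(np.mean(np.square(err)))
    mae = float(np.mean(np.abs(err)))
    # Percentage error relative to |y_true|, guarded against division by zero.
    mape = float(100.0 * np.mean(np.abs(err) / np.maximum(np.abs(y_true), EPSILON)))
    return mse, mae, mape
```

Calling regression_metrics(y_regression_test, preds) with the arrays from the snippets above would then either return the three values or fail loudly if the two arrays do not have the same shape.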

Issue Analytics

Created 7 years ago
Reactions: 17
Comments: 13 (3 by maintainers)

Top GitHub Comments

29 reactions

bstriner commented, Jan 24, 2017

@parkerzf The issue is that your y has the wrong shape and dtype. Keras automatically fixes the shape for you inside the model, so you get different results depending on whether you go through the model or compute the metrics yourself.

This line fixes the issue:

y_regression = y_regression.astype(np.float32).reshape((-1,1))
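
To see why the shape matters, here is a minimal NumPy sketch with small synthetic arrays (not the arrays from the issue). Subtracting a (N, 1) array from a (N,) array broadcasts to (N, N), so every target gets paired with every prediction and the mean is taken over the wrong set of differences.

```python
import numpy as np

rng = np.random.default_rng(0)
preds = rng.normal(size=(5, 1)).astype(np.float32)  # model output, shape (5, 1)
y_flat = rng.normal(size=5)                         # targets, shape (5,), float64

# (5,) - (5, 1) broadcasts to (5, 5): every target meets every prediction.
diff_broadcast = y_flat - preds
print(diff_broadcast.shape)  # (5, 5)

# After the fix both arrays are (5, 1), so the difference is element-wise.
y_col = y_flat.astype(np.float32).reshape((-1, 1))
diff_aligned = y_col - preds
print(diff_aligned.shape)  # (5, 1)

mse_broadcast = np.mean(np.square(diff_broadcast))
mse_aligned = np.mean(np.square(diff_aligned))
print("broadcast mse=%f, aligned mse=%f" % (mse_broadcast, mse_aligned))
```

Only the diagonal of the (5, 5) result holds the differences you actually want; the off-diagonal entries are what shift the metric.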

This script should give the same values whether you go through numpy, evaluate, or the backend.

import numpy as np
from keras.models import Model
from keras.layers import Input
from keras.metrics import mse, mae, mape
from keras import backend as K

def test(preds, y_regression):
	print("preds: {}, {}".format(preds.dtype, preds.shape))
	print("y_regression: {}, {}".format(y_regression.dtype, y_regression.shape))

	print('manual result: mse=%f' % np.mean(np.square(y_regression - preds)))

	a = mse(y_regression, preds)
	b = mae(y_regression, preds)
	c = mape(y_regression, preds)
	f = K.function([], [a,b,c])
	print('backend result: mse={}, mae={}, mape={}'.format(*f([])))

	x = Input(preds.shape[1:])
	m = Model(x, x)
	m.compile(loss='mse', optimizer='rmsprop', metrics=['mae', 'mape'])
	scores = m.evaluate(preds, y_regression, batch_size=32, verbose=0)

	print('\nevaluate result: mse={}, mae={}, mape={}'.format(*scores))

# np.load accepts a path directly, which avoids leaving file handles open
preds = np.load('preds.npy')
y_regression = np.load('y_regression.npy')
print("Before fix")
test(preds, y_regression)
print("Fixing dtype and shape")
# this line fixes the issue: y_regression now matches the dtype and shape of preds
y_regression = y_regression.astype(np.float32).reshape((-1,1))
print("After fix")
test(preds, y_regression)

Before fix
preds: float32, (7666L, 1L)
y_regression: float64, (7666L,)
manual result: mse=0.581871
backend result: mse=0.581870887686, mae=0.602397143873, mape=11.7284955271
evaluate result: mse=0.511590141619, mae=0.565474303248, mape=11.0147933211
Fixing dtype and shape
After fix
preds: float32, (7666L, 1L)
y_regression: float32, (7666L, 1L)
manual result: mse=0.511590
backend result: mse=0.511590123177, mae=0.565474271774, mape=11.014793396
evaluate result: mse=0.511590141619, mae=0.565474303248, mape=11.0147933211

Cheers

12 reactions

kev5 commented, Apr 30, 2018

How do I plot the confusion matrix in my code?

from getEmbeddings import getEmbeddings
import matplotlib.pyplot as plt
import numpy as np
import keras
from keras import backend as K
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, Embedding, Input, RepeatVector
from keras.optimizers import SGD
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import scikitplot.plotters as skplt


def plot_cmat(yte, ypred):
    '''Plotting confusion matrix'''
    skplt.plot_confusion_matrix(yte,ypred)
    plt.show()

xtr,xte,ytr,yte = getEmbeddings("datasets/train.csv")
np.save('./xtr',xtr)
np.save('./xte',xte)
np.save('./ytr',ytr)
np.save('./yte',yte)

xtr = np.load('./xtr.npy')
xte = np.load('./xte.npy')
ytr = np.load('./ytr.npy')
yte = np.load('./yte.npy')


def baseline_model():
    '''Neural network with 3 hidden layers'''
    model = Sequential()
    model.add(Dense(256, input_dim=300, activation='relu', kernel_initializer='normal'))
    model.add(Dropout(0.3))
    model.add(Dense(256, activation='relu', kernel_initializer='normal'))
    model.add(Dropout(0.5))
    model.add(Dense(80, activation='relu', kernel_initializer='normal'))
    model.add(Dense(2, activation="softmax", kernel_initializer='normal'))

    # gradient descent
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    
    # compile() only configures the loss, optimizer, and metrics; training happens in fit()
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model


model = baseline_model()
model.summary()
x_train, x_test, y_train, y_test = train_test_split(xtr, ytr, test_size=0.2, random_state=42)
label_encoder = LabelEncoder()
label_encoder.fit(y_train)
encoded_y = np_utils.to_categorical(label_encoder.transform(y_train))
# reuse the encoder fitted on the training labels so class indices match
encoded_y_test = np_utils.to_categorical(label_encoder.transform(y_test))
estimator = model.fit(x_train, encoded_y, epochs=20, batch_size=64)
print("Model Trained!")
score = model.evaluate(x_test, encoded_y_test)
print("")
print("Accuracy = " + format(score[1]*100, '.2f') + "%")   # 92.62%
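
One possible way to approach the confusion-matrix question: a confusion matrix is built from class indices, while the model above outputs softmax probabilities and the targets are one-hot encoded, so both need an argmax first. The sketch below uses only NumPy and small stand-in arrays; confusion_counts is a hypothetical helper, not part of Keras or scikit-plot.

```python
import numpy as np


def confusion_counts(y_true_onehot, y_prob, n_classes):
    """Count (true class, predicted class) pairs; rows are true classes."""
    true_idx = np.argmax(y_true_onehot, axis=1)
    pred_idx = np.argmax(y_prob, axis=1)
    counts = np.zeros((n_classes, n_classes), dtype=int)
    # Unbuffered add so repeated (true, pred) pairs are all counted.
    np.add.at(counts, (true_idx, pred_idx), 1)
    return true_idx, pred_idx, counts


# Stand-in arrays; in the script above these would be encoded_y_test and
# the output of model.predict(x_test).
y_true_onehot = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
y_prob = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]])
true_idx, pred_idx, counts = confusion_counts(y_true_onehot, y_prob, n_classes=2)
print(counts)
```

Assuming plot_cmat expects label arrays, as its use of plot_confusion_matrix suggests, the index arrays true_idx and pred_idx are what you would pass to it.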