
AutoModel.predict crashes with custom metric


Bug Description

After fitting an ak.AutoModel with a custom Keras metric, calling predict() crashes: the tuner reloads the best model via tf.keras.models.load_model() without passing the custom metric through custom_objects, so deserialization fails (see the ValueError quoted in the comments below).

Bug Reproduction

Code for reproducing the bug:

# Only the imports actually used by the reproduction below.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, matthews_corrcoef

import autokeras as ak
import keras.backend as K

def matthewsCorrelation(yTrue, yPred):
    """Matthews correlation coefficient as a Keras backend metric."""
    yPredPos = K.round(K.clip(yPred, 0, 1))
    yPredNeg = 1 - yPredPos

    yPos = K.round(K.clip(yTrue, 0, 1))
    yNeg = 1 - yPos

    # Confusion-matrix counts.
    tp = K.sum(yPos * yPredPos)
    tn = K.sum(yNeg * yPredNeg)
    fp = K.sum(yNeg * yPredPos)
    fn = K.sum(yPos * yPredNeg)

    numerator = tp * tn - fp * fn
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    # K.epsilon() guards against division by zero.
    return numerator / (denominator + K.epsilon())

x, y = load_breast_cancer(return_X_y=True)
xTrain, xTest, yTrain, yTest = train_test_split(x, y, random_state=0)

MAXTRIALS = 20
inputNode = ak.StructuredDataInput()
outputNode = ak.StructuredDataBlock(categorical_encoding=True)(inputNode)
outputNode = ak.ClassificationHead(num_classes=2)(outputNode)

clf = ak.AutoModel(
    inputNode, 
    outputNode, 
    overwrite=True,
    objective="val_loss",
    metrics=[matthewsCorrelation],
    max_trials=MAXTRIALS)

clf.fit(xTrain, yTrain)

predictedClasses = clf.predict(xTest)  # crashes here when a custom metric is used

cm = confusion_matrix(yTest, predictedClasses)

print(cm)

matthew = matthews_corrcoef(yTest, predictedClasses)

print('MCC:', matthew)
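
A quick sanity check (my addition, not part of the original report): on a toy example, the backend implementation of the metric agrees with scikit-learn's matthews_corrcoef. The data here is hypothetical, chosen only to exercise the function:

yTrueToy = [1.0, 0.0, 1.0, 1.0, 0.0]
yPredToy = [1.0, 0.0, 0.0, 1.0, 0.0]
print(float(matthewsCorrelation(K.constant(yTrueToy), K.constant(yPredToy))))  # ~0.6667
print(matthews_corrcoef(yTrueToy, yPredToy))  # 0.6666666666666666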

Data used by the code: the scikit-learn Breast Cancer dataset (loaded above via load_breast_cancer).

Expected Behavior

predict() should return predictions from the best model found during the search, whether or not a custom metric was passed to the AutoModel constructor.

Setup Details

Details about the versions used:

  • OS type and version: OSX 10.13.6
  • Python: 3.8.5
  • autokeras: 1.0.10
  • keras-tuner: 1.0.2rc3
  • scikit-learn: 0.23.2
  • numpy: 1.18.5
  • pandas: 1.1.3
  • tensorflow: 2.3.0

Additional context

I have come up with a solution to this problem by editing two of the autokeras files. In autokeras/auto_model.py, I changed the predict() function to:

def predict(self, x, batch_size=32, custom_objects={}, **kwargs):
    """Predict the output for a given testing data.

    # Arguments
        x: Any allowed types according to the input node. Testing data.
        batch_size: Number of samples per batch. Defaults to 32.
        custom_objects: Dict mapping names to custom classes or functions
            (e.g. the custom metric), passed through to load_model().
        **kwargs: Any arguments supported by keras.Model.predict.

    # Returns
        A list of numpy.ndarray objects or a single numpy.ndarray.
        The predicted results.
    """
    if isinstance(x, tf.data.Dataset):
        if self._has_y(x):
            x = x.map(lambda x, y: x)
    self._check_data_format((x, None), predict=True)
    dataset = self._adapt(x, self.inputs, batch_size)
    pipeline = self.tuner.get_best_pipeline()
    # Pass the custom objects through so load_model() can deserialize
    # the model that was saved with the custom metric.
    if custom_objects:
        model = self.tuner.get_best_model(custom_objects=custom_objects)
    else:
        model = self.tuner.get_best_model()
    dataset = pipeline.transform_x(dataset)
    y = utils.predict_with_adaptive_batch_size(
        model=model, batch_size=batch_size, x=dataset, **kwargs
    )
    return pipeline.postprocess(y)

Then, in autokeras/engine/tuner.py, I changed the get_best_model() function to:

def get_best_model(self, custom_objects={}):
    with hm_module.maybe_distribute(self.distribution_strategy):
        if custom_objects:
            model = tf.keras.models.load_model(
                self.best_model_path, custom_objects=custom_objects
            )
        else:
            model = tf.keras.models.load_model(self.best_model_path)
    return model

Lastly, in the reproduction code above, I replaced the plain predictedClasses = clf.predict(xTest) call so that the custom metric is passed through custom_objects, as shown below. With these changes made, everything runs as I would expect.
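
With all three changes in place, the prediction step from the reproduction script reads:

predictedClasses = clf.predict(
    xTest, custom_objects={'matthewsCorrelation': matthewsCorrelation}
)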

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 4
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

2 reactions
garyee commented, Feb 8, 2021

Hi,

I got the same error, using the StructuredDataClassifier. The error message was: ValueError: Unable to restore custom object of type _tf_keras_metric currently. Please make sure that the layer implements 'get_config' and 'from_config' when saving. In addition, please use the 'custom_objects' arg when calling 'load_model()'. Apparently the custom metric is not saved with the model.
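
As an aside (my suggestion, not from this thread): Keras also supports registering custom objects globally, which in principle lets tf.keras.models.load_model() resolve the metric by name without threading a custom_objects dict through autokeras. A minimal sketch, untested against autokeras 1.0.10:

import tensorflow as tf

# Register the metric under the name it was compiled with, so that
# load_model() can deserialize it by name from the global registry.
tf.keras.utils.get_custom_objects()["matthewsCorrelation"] = matthewsCorrelation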

0 reactions
JuliaWasala commented, Jan 28, 2022

I can confirm that combining the other edits suggested in this issue with compiling the model with the custom metric inside evaluate() solves the issue. For me, it works with the following version of AutoModel.evaluate:

def evaluate(self, x, y=None, batch_size=32, verbose=1, custom_objects={}, **kwargs):
    """Evaluate the best model for the given data.

    # Arguments
        x: Any allowed types according to the input node. Testing data.
        y: Any allowed types according to the head. Testing targets.
            Defaults to None.
        batch_size: Number of samples per batch.
            If unspecified, batch_size will default to 32.
        verbose: Verbosity mode. 0 = silent, 1 = progress bar.
            Controls the verbosity of
            [keras.Model.evaluate](http://tensorflow.org/api_docs/python/tf/keras/Model#evaluate)
        custom_objects: Dict mapping names to custom classes or functions
            (e.g. the custom metric), passed through to load_model().
        **kwargs: Any arguments supported by keras.Model.evaluate.

    # Returns
        Scalar test loss (if the model has a single output and no metrics) or
        list of scalars (if the model has multiple outputs and/or metrics).
        The attribute model.metrics_names will give you the display labels for
        the scalar outputs.
    """
    self._check_data_format((x, y))
    if isinstance(x, tf.data.Dataset):
        dataset = x
        x = dataset.map(lambda x, y: x)
        y = dataset.map(lambda x, y: y)
    x = self._adapt(x, self.inputs, batch_size)
    y = self._adapt(y, self._heads, batch_size)
    dataset = tf.data.Dataset.zip((x, y))
    pipeline = self.tuner.get_best_pipeline()
    dataset = pipeline.transform(dataset)
    if custom_objects:
        model = self.tuner.get_best_model(custom_objects=custom_objects)
        # Recompile so the custom metric is reported; only metrics are
        # taken from custom_objects for now.
        model.compile(metrics=[val for key, val in custom_objects.items()])
    else:
        model = self.tuner.get_best_model()
    return utils.evaluate_with_adaptive_batch_size(
        model=model, batch_size=batch_size, x=dataset, verbose=verbose, **kwargs
    )
I compile the model with only the metrics provided in custom_objects.
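
Presumably, mirroring the calling convention of the patched predict(), this version of evaluate() would be invoked like so (the call itself is my extrapolation, not quoted from the thread):

results = clf.evaluate(
    xTest, yTest, custom_objects={'matthewsCorrelation': matthewsCorrelation}
)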
