Can't run a classifier

Hello

My name is Iván. I’ve been stuck for several days on the problem described below. I’m following Daniel Nouri’s deep learning tutorial (http://danielnouri.org/notes/category/deep-learning/) and tried to adapt his example to a classification dataset. If I treat the dataset as a regression problem, it works properly, but if I try to perform classification, it fails. I wrote two reproducible examples.

Regression (it works well)

import lasagne
from sklearn import datasets
import numpy as np
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[iris.target<2]  # keep only the samples of the first two classes (0 and 1)
Y = iris.target[iris.target<2]
stdscaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = stdscaler.fit_transform(X).astype(np.float32)
y = np.asmatrix((Y-0.5)*2).T.astype(np.float32)

print X.shape, type(X)
print y.shape, type(y)

net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(None, 4),  # 4 input features per sample
    hidden_num_units=10,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=1,  # 1 target value

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=True,  # flag to indicate we're dealing with regression problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )

net1.fit(X, y)

Classification (it raises an array dimensionality error, which I paste below)

import lasagne
from sklearn import datasets
import numpy as np
from lasagne import layers
from lasagne.nonlinearities import softmax
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[iris.target<2]  # keep only the samples of the first two classes (0 and 1)
Y = iris.target[iris.target<2]
stdscaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = stdscaler.fit_transform(X).astype(np.float32)
y = np.asmatrix((Y-0.5)*2).T.astype(np.int32)

print X.shape, type(X)
print y.shape, type(y)

net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(None, 4),  # 4 input features per sample
    hidden_num_units=10,  # number of units in hidden layer
    output_nonlinearity=softmax,  # output layer uses the softmax function
    output_num_units=1,  # 1 target value

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=False,  # flag to indicate we're dealing with classification problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,
    )

net1.fit(X, y)

This is the failing output I get from the second example:


(100, 4) <type 'numpy.ndarray'>
(100, 1) <type 'numpy.ndarray'>
  input                 (None, 4)               produces       4 outputs
  hidden                (None, 10)              produces      10 outputs
  output                (None, 1)               produces       1 outputs
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-13-184a45e5abaa> in <module>()
     40     )
     41 
---> 42 net1.fit(X, y)

/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in fit(self, X, y)
    291 
    292         try:
--> 293             self.train_loop(X, y)
    294         except KeyboardInterrupt:
    295             pass

/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in train_loop(self, X, y)
    298     def train_loop(self, X, y):
    299         X_train, X_valid, y_train, y_valid = self.train_test_split(
--> 300             X, y, self.eval_size)
    301 
    302         on_epoch_finished = self.on_epoch_finished

/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/nolearn/lasagne/base.pyc in train_test_split(self, X, y, eval_size)
    399                 kf = KFold(y.shape[0], round(1. / eval_size))
    400             else:
--> 401                 kf = StratifiedKFold(y, round(1. / eval_size))
    402 
    403             train_indices, valid_indices = next(iter(kf))

/Users/ivanvallesperez/anaconda/lib/python2.7/site-packages/sklearn/cross_validation.pyc in __init__(self, y, n_folds, shuffle, random_state)
    531         for test_fold_idx, per_label_splits in enumerate(zip(*per_label_cvs)):
    532             for label, (_, test_split) in zip(unique_labels, per_label_splits):
--> 533                 label_test_folds = test_folds[y == label]
    534                 # the test split can be too big because we used
    535                 # KFold(max(c, self.n_folds), self.n_folds) instead of

IndexError: too many indices for array

What is going on here? Am I doing something wrong? I think I have tried everything, but I am not able to figure out what is happening.

Note that I updated my dependencies today using the command: pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt

Thanks in advance

Edit

I managed to make it work with the following changes, but I still have some doubts:

I defined y as a one-dimensional vector with 0/1 values: y = Y.astype(np.int32)
I had to change the parameter output_num_units=1 to output_num_units=2, and I’m not sure I understand why: I’m working on a binary classification problem, and I think this multilayer perceptron should have only 1 output neuron, not 2. Am I wrong?
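
The shape change described above can be reproduced without nolearn. The following is a minimal sketch using only NumPy: `test_folds`, `y_column` and `y_flat` are hypothetical stand-ins for the arrays in the traceback, not the actual scikit-learn internals. It suggests why a (100, 1) target fails at the `test_folds[y == label]` step while a flat (100,) target does not.

```python
import numpy as np

# Hypothetical stand-ins: `test_folds` is 1-D, like the array indexed in
# the traceback; the labels mimic the two iris classes kept in the issue.
labels = np.array([0] * 50 + [1] * 50, dtype=np.int32)
test_folds = np.zeros(labels.shape[0], dtype=int)

y_column = labels.reshape(-1, 1)  # shape (100, 1), like the failing y
y_flat = labels                   # shape (100,), like the working y

try:
    test_folds[y_column == 1]     # 2-D boolean mask on a 1-D array
    column_error = None
except IndexError as exc:
    column_error = exc

selected = test_folds[y_flat == 1]  # 1-D mask on a 1-D array works

print("column target raised: %r" % (column_error,))
print("flat target selected shape: %s" % (selected.shape,))
```

In this sketch the labels are 0/1, as in the working version; the failing example also mapped them to -1/+1, which is a separate difference from the shape.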

I also tried to change the cost function to ROC AUC. I know there’s a parameter called objective_loss_function, which defaults to objective_loss_function=lasagne.objectives.categorical_crossentropy, but how can I use ROC AUC as the cost function instead of the categorical cross-entropy?

Thanks

Top GitHub Comments

BenjaminBossan commented, Nov 21, 2015

Hi Iván,

Regarding your first question: since you apply a softmax function on your output layer, the output values have to sum to 1 for each row. With a single output unit, that value would always be exactly 1, which is why you need 2 outputs. It is a little unintuitive, but you figured out what to do.
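
To make the softmax point concrete, here is a small NumPy sketch. The `softmax` helper and the input values are written for illustration only and are not taken from Lasagne. It shows that a softmax over one unit is constant, while a softmax over two units yields a usable pair of class probabilities.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical pre-activation values for three samples.
one_unit = np.array([[-3.0], [0.0], [5.0]])
two_units = np.array([[-3.0, 1.0], [0.0, 0.0], [5.0, 2.0]])

single = softmax(one_unit)  # every row is [1.0], whatever the input
pair = softmax(two_units)   # every row holds two values that sum to 1

print(single.ravel())
print(pair)
```

A single output unit is the usual choice when the output nonlinearity is a sigmoid rather than a softmax, but the thread does not cover that setup.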

Regarding ROC AUC, there is no straightforward implementation of that cost function, since you cannot really differentiate it, and a differentiable loss is necessary to perform backprop. There are proxy metrics for ROC AUC, but in my experience they are not worth the trouble: they are much more unstable than cross-entropy and did not lead to different results in the end.
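
One way to see the differentiability problem is to nudge the predicted scores slightly and compare both quantities. The sketch below is an illustration with NumPy only: `pairwise_auc` and `cross_entropy` are helpers written for this example, and the scores are made up. ROC AUC depends only on the ranking of the scores, so small changes leave it untouched, whereas cross-entropy responds to every change.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties between a positive and a negative score count as half a pair.
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def cross_entropy(y_true, scores):
    """Mean binary cross-entropy of predicted probabilities."""
    return float(-np.mean(y_true * np.log(scores)
                          + (1 - y_true) * np.log(1 - scores)))

# Made-up labels and predicted probabilities for four samples.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.10, 0.40, 0.35, 0.80])
nudged = scores + np.array([0.01, -0.01, 0.01, -0.01])  # ranking unchanged

print("AUC: %.2f -> %.2f" % (pairwise_auc(y_true, scores),
                             pairwise_auc(y_true, nudged)))
print("cross-entropy: %.4f -> %.4f" % (cross_entropy(y_true, scores),
                                       cross_entropy(y_true, nudged)))
```

Because the AUC stays flat under small changes, its gradient carries no signal for the optimizer, which is consistent with using ROC AUC as an evaluation metric and training with cross-entropy.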

Hope that helps

jmwoloso commented, Jan 2, 2016

Never mind, I believe I have figured it out. The first prediction in the output is p(x=0), with the second one being p(x=1).
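
Assuming the column ordering described in this comment, a two-unit output can be read as follows. The `probs` array is made up for illustration and is not output from the network in the issue.

```python
import numpy as np

# Hypothetical softmax outputs for three samples, two output units each.
probs = np.array([[0.9, 0.1],
                  [0.3, 0.7],
                  [0.5, 0.5]])

p_class0 = probs[:, 0]            # first column: p(x=0)
p_class1 = probs[:, 1]            # second column: p(x=1)
predicted = probs.argmax(axis=1)  # ties resolve to the lower index

print(predicted)
```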