Which loss function works for multi-label classification tasks?
I need to train a multi-label classifier for a text topic classification task. Having searched around the internet, I followed the suggestion to use sigmoid + binary_crossentropy. But I can't get good results (i.e. subset accuracy) on the validation set even though the loss is very small. After reading the Keras source code, I found that the binary_crossentropy loss is implemented like this:
from keras import backend as K

def binary_crossentropy(y_true, y_pred):
    # Per-sample loss: element-wise binary crossentropy,
    # averaged uniformly over all label dimensions.
    return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
My doubt is whether taking the mean makes sense for a multi-label classification task. Suppose the label set has dimension 30 and each training sample carries only two or three of the labels. Since most of the labels are zero in most of the samples, I suspect this loss encourages the classifier to predict a tiny probability in every output dimension.
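To compensate for the sparsity, one idea I've been toying with is up-weighting the positive entries. A rough sketch (my own code, not from Keras; the pos_weight value is arbitrary and would need tuning):

from keras import backend as K

def weighted_binary_crossentropy(pos_weight=10.0):
    # Up-weight loss terms where the true label is 1, so the few positive
    # labels are not drowned out by the many zeros.
    def loss(y_true, y_pred):
        bce = K.binary_crossentropy(y_true, y_pred)
        weights = y_true * pos_weight + (1.0 - y_true)
        return K.mean(weights * bce, axis=-1)
    return loss

This would be passed in as model.compile(optimizer='adam', loss=weighted_binary_crossentropy(pos_weight=10.0)), but I haven't verified that it actually fixes the problem.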
Following the idea in https://github.com/keras-team/keras/issues/2826, I also gave categorical_crossentropy a try, but had no luck with it either.
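For reference, these are the two setups I compared (a minimal sketch; the layer sizes and input dimension are placeholders):

from keras.models import Sequential
from keras.layers import Dense

# Multi-label setup: one independent sigmoid per label.
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=300))  # 300 = placeholder feature size
model.add(Dense(30, activation='sigmoid'))              # 30 labels, each scored in [0, 1]
model.compile(optimizer='adam', loss='binary_crossentropy')

# The categorical_crossentropy variant assumes a softmax output, which forces
# the 30 scores to sum to 1 and so cannot represent "two or three labels on":
# model.add(Dense(30, activation='softmax'))
# model.compile(optimizer='adam', loss='categorical_crossentropy')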
Any tips on choosing a loss function for multi-label classification tasks are more than welcome. Thanks in advance.
Top answer
For multi-label classification, you can try tanh + hinge with {-1, 1} labels, e.g. (1, -1, -1, 1), or sigmoid + Hamming loss with {0, 1} labels, e.g. (1, 0, 0, 1). In my case, sigmoid + focal loss with {0, 1} labels worked well. You can check this paper: https://arxiv.org/abs/1708.02002 (a sketch of such a loss follows below).
I found an implementation of multi-label focal loss here:
https://github.com/Umi-you/FocalLoss
EDIT: It seems that implementation doesn't work.
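Since that repository doesn't seem to work, here is a rough sketch of a binary focal loss for multi-label targets, adapted from the formula in the paper linked above (not a vetted implementation; gamma and alpha are the paper's default values and should be tuned):

from keras import backend as K

def binary_focal_loss(gamma=2.0, alpha=0.25):
    def loss(y_true, y_pred):
        eps = K.epsilon()
        y_pred = K.clip(y_pred, eps, 1.0 - eps)
        # p_t is the predicted probability of the true class for each label.
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        # The modulating factor (1 - p_t)^gamma down-weights easy examples,
        # including the many confidently predicted zeros.
        return K.mean(-alpha_t * K.pow(1.0 - p_t, gamma) * K.log(p_t), axis=-1)
    return loss

It can be plugged in with model.compile(optimizer='adam', loss=binary_focal_loss()).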