
Which loss function works for a multi-label classification task?


I need to train a multi-label classifier for a text topic classification task. Following suggestions I found online, I use sigmoid + binary_crossentropy, but I can't get good results (i.e. subset accuracy) on the validation set even though the loss is very small. After reading the Keras source code, I found that the binary_crossentropy loss is implemented like this:

from keras import backend as K

def binary_crossentropy(y_true, y_pred):
    # element-wise binary cross-entropy, averaged over the last (label) axis
    return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)

My doubt is whether this averaging makes sense for a multi-label task. Suppose the label set has 30 dimensions and each training sample carries only two or three positive labels. Since most labels are zero in most samples, I suspect this loss encourages the classifier to predict a tiny probability in every output dimension.
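To make that concern concrete, here is a minimal NumPy sketch; the 30-dimensional label vector, the two positive labels, and the 0.05/0.9 probabilities are made-up values for illustration. A classifier that predicts a uniformly low probability for every label never recovers the positives, yet its mean binary cross-entropy already looks small:

import numpy as np

def mean_bce(y_true, y_pred, eps=1e-7):
    # mean binary cross-entropy over the label axis, mirroring the Keras code above
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.zeros(30)
y_true[[3, 17]] = 1.0                      # only 2 of the 30 labels are positive

y_lazy = np.full(30, 0.05)                 # predicts ~0 everywhere, misses both positives
y_good = np.where(y_true == 1, 0.9, 0.05)  # actually identifies the positives

print(mean_bce(y_true, y_lazy))   # ~0.25  -- small, despite zero subset accuracy
print(mean_bce(y_true, y_good))   # ~0.055 -- the 28 easy negatives dilute the gap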

Following the idea in https://github.com/keras-team/keras/issues/2826, I also gave categorical_crossentropy a try, but had no luck with that either.
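For what it's worth, one plausible reason categorical_crossentropy struggles here is that it pairs with softmax, which forces the outputs to sum to 1, so two simultaneously true labels must share that probability mass. A quick sketch with hypothetical logits:

import numpy as np

logits = np.array([4.0, 4.0, -2.0, -2.0])   # suppose labels 0 and 1 are both true

softmax = np.exp(logits) / np.exp(logits).sum()
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax)  # ~[0.499, 0.499, 0.001, 0.001] -- each true label capped near 0.5
print(sigmoid)  # ~[0.982, 0.982, 0.119, 0.119] -- each label scored independently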

Any tips on choosing a loss function for multi-label classification are more than welcome. Thanks in advance.

Top GitHub Comments

daniel410 commented on Jun 12, 2018 (19 reactions)

For multi-label classification, you can try tanh + hinge loss with {-1, 1} labels, e.g. (1, -1, -1, 1), or sigmoid + Hamming loss with {0, 1} labels, e.g. (1, 0, 0, 1). In my case, sigmoid + focal loss with {0, 1} labels worked well. You can check the paper at https://arxiv.org/abs/1708.02002.
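Here is a minimal sketch of that focal loss option, written against the Keras backend. The gamma and alpha defaults follow the paper (Lin et al., https://arxiv.org/abs/1708.02002); treat this as an illustration rather than the commenter's exact code:

from keras import backend as K

def binary_focal_loss(gamma=2.0, alpha=0.25):
    def loss(y_true, y_pred):
        eps = K.epsilon()
        y_pred = K.clip(y_pred, eps, 1.0 - eps)
        # p_t is the predicted probability of the true class for each label
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        # (1 - p_t)^gamma down-weights the many easy, mostly-negative labels
        return K.mean(-alpha_t * K.pow(1.0 - p_t, gamma) * K.log(p_t), axis=-1)
    return loss

# usage: model.compile(optimizer='adam', loss=binary_focal_loss())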

BovineEnthusiast commented on Dec 7, 2018 (10 reactions)

I found an implementation of multi-label focal loss here:

https://github.com/Umi-you/FocalLoss

EDIT: Seems like his implementation doesn’t work.


