Accuracy computation with variable classes
❓ Questions/Help/Support
I’ve been using the `Accuracy` class (in `ignite.metrics`) successfully so far in a multi-class setting to compute test set accuracy.
Recently, I’ve hit a new scenario/dataset where the test set has a different number of classes for each example. You can think of this as a multiple-choice selection task, where each example has a different number of candidates.
Now in this scenario, when I used `Accuracy`, I got the error below:

`ERROR:ignite.engine.engine.Engine:Current run is terminating due to exception: Input data number of classes has changed from 13 to 81.`
I dove into the code and I see that the base class for `Accuracy`, i.e., `_BaseClassification`, assumes that the number of classes/candidates is fixed across test set examples. That is why I’m getting the above error at run time.
However, for multi-class accuracy, as you can see in the `update()` method here, we simply compute the number of correct predictions via `argmax` along dimension 1 and measure matches against `y`.
This computation of `correct` should be accurate even if the number of candidates changes across two examples from the test set, right? So in effect, the computed accuracy would be correct even in such a case, because the underlying variables are computed correctly?
What do you think, @vfdev-5? Is it possible to have a version of `Accuracy` for multi-class where there is no expectation that the number of classes is the same across all examples in the test set?
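If it helps, the per-example accumulation I have in mind can be sketched in plain Python (illustrative only; `update`, `compute`, and the `state` dict are hypothetical names, not ignite’s API):

```python
def update(state, y_pred_logits, y_true):
    """Accumulate correct/total counts for one example.

    y_pred_logits: one score per candidate (length may differ
                   between examples).
    y_true: index of the correct candidate.
    """
    # argmax over the candidate dimension, whatever its size
    predicted = max(range(len(y_pred_logits)), key=lambda i: y_pred_logits[i])
    state["correct"] += int(predicted == y_true)
    state["total"] += 1

def compute(state):
    return state["correct"] / state["total"]

state = {"correct": 0, "total": 0}
update(state, [0.1, 0.7, 0.2], 1)             # 3 candidates, correct
update(state, [0.4, 0.1, 0.2, 0.9, 0.3], 0)   # 5 candidates, wrong
print(compute(state))  # 0.5
```

The candidate count never enters the accumulated state, so it is free to vary per example.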
Issue Analytics
- State:
- Created: 4 years ago
- Comments: 6
Top GitHub Comments
@vfdev-5 Thanks a lot for the quick response! Here’s an example model: `GPT2DoubleHeadsModel` from the `transformers` library by Hugging Face. For the multiple-choice classification task, in principle one could have a varying number of candidates/choices across examples in the test set.
So `x1` might have `C1` candidates, `x2` might have `C2` candidates, `x3` might have `C3` candidates, and so on, like you showed above. The model would output logits over the candidate set for `x1`, another set of logits over the candidate set for `x2`, etc. And we would know the multiple-choice label for `x1`, the multiple-choice label for `x2`, etc. So we should be able to compute the accuracy by computing the match b/w logits and labels for `x1`, then for `x2`, and so on. And then finally we would have the total test set accuracy.

I think padding to the maximum number of classes should work, but that can cause memory issues when the maximum number of classes is very high compared to most of the examples.
Fixing the number of classes might make sense at training time to make life easier, but that might not be the case at inference time.
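For reference, the padding workaround can be sketched like this (an illustrative helper, `pad_logits` is a made-up name): pad each example’s logits to the maximum candidate count with `-inf`, so every row’s argmax, and hence the accuracy, is unchanged.

```python
import math

def pad_logits(batch_logits):
    """Pad variable-length logit rows to a common width with -inf,
    which leaves each row's argmax unchanged."""
    max_c = max(len(row) for row in batch_logits)
    return [row + [-math.inf] * (max_c - len(row)) for row in batch_logits]

batch = [[0.1, 0.7],                   # 2 candidates
         [0.4, 0.1, 0.2, 0.9, 0.3]]    # 5 candidates
padded = pad_logits(batch)
print([len(row) for row in padded])  # [5, 5]
```

The memory cost is the concern raised above: every example is stored at the width of the largest candidate set.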
@g-karthik maybe `TopKCategoricalAccuracy` can help?