
User guide formulas for macro average recall and precision seem wrong

See original GitHub issue

To clarify, the calculation provided by recall_score(y_true, y_pred, average='macro') is correct, but the formula in the User Guide seems to be wrong. Here’s an example that shows that recall_score and precision_score calculate the correct numbers:

In [1]: import numpy as np

In [2]: from sklearn.metrics import confusion_matrix, recall_score, precision_score

In [3]: y_true = [0, 1, 2, 0, 1, 2]

In [4]: y_pred = [0, 2, 1, 0, 0, 1]

In [5]: C = confusion_matrix(y_true, y_pred)

In [6]: C
Out[6]: 
array([[2, 0, 0],
       [1, 0, 1],
       [0, 2, 0]])

In [7]: recall_score(y_true, y_pred, average='macro')
Out[7]: 0.3333333333333333

In [8]: precision_score(y_true, y_pred, average='macro')
Out[8]: 0.2222222222222222

In [9]: np.mean(np.diagonal(C) / np.sum(C, axis=1))   # mean of per-class recall: divide by row sums (true label counts)
Out[9]: 0.3333333333333333

In [10]: np.mean(np.diagonal(C) / np.sum(C, axis=0))  # mean of per-class precision: divide by column sums (predicted label counts)
Out[10]: 0.2222222222222222
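
For what it’s worth, the per-class numbers behind those macro averages can also be pulled out directly with average=None; on this data they should come out to roughly [1, 0, 0] for recall and [2/3, 0, 0] for precision, whose means are exactly the macro values above (a sketch, not copied from a session):

recall_score(y_true, y_pred, average=None)      # per-class recall, roughly [1.0, 0.0, 0.0]
precision_score(y_true, y_pred, average=None)   # per-class precision, roughly [0.667, 0.0, 0.0]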

However, this is what the User Guide gives for the precision and recall formulas:

[Two screenshots of the macro-averaged precision and recall formulas from the User Guide]

So the User Guide says that recall is calculated for each class l by dividing by the number of samples with predicted label l, when it should divide by the number of samples with true label l, and vice versa for the precision definition (see the sketch below).
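
To spell out what the corrected formulas should compute, here is a minimal counting sketch (my own illustration with a hypothetical helper, not scikit-learn code or the guide’s notation): for each label l, recall divides the correct predictions for l by the number of samples whose true label is l, while precision divides by the number of samples predicted as l.

def per_class_scores(y_true, y_pred, labels):
    recalls, precisions = [], []
    for l in labels:
        tp = sum(t == l and p == l for t, p in zip(y_true, y_pred))  # correctly predicted as l
        n_true = sum(t == l for t in y_true)   # samples with true label l      -> recall denominator
        n_pred = sum(p == l for p in y_pred)   # samples with predicted label l -> precision denominator
        recalls.append(tp / n_true if n_true else 0.0)
        precisions.append(tp / n_pred if n_pred else 0.0)
    return recalls, precisions

recalls, precisions = per_class_scores(y_true, y_pred, labels=[0, 1, 2])
sum(recalls) / 3      # 0.333..., matches recall_score(y_true, y_pred, average='macro')
sum(precisions) / 3   # 0.222..., matches precision_score(y_true, y_pred, average='macro')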

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

2 reactions
jnothman commented, Dec 19, 2020

I think the confusing notation should be fixed, and where possible we should make the functions take arguments true then predicted.

I believe strongly that precision and recall should be defined in terms of set notation. It makes it much easier to reason about variants of the metric (e.g. micro averages in multiclass classification where some class is considered the majority class to be ignored; or the usages of precision and recall in information retrieval and information extraction where you can’t presume to be able to count the total number of items being considered for prediction). It also makes the equivalence between F1 and the Dice coefficient more apparent.
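
As a rough illustration of that set-based view (my own sketch on the example above, not scikit-learn’s notation): treat the true and predicted assignments as sets of (sample, label) pairs; precision and recall are then just set ratios, and F1 has the same form as the Dice coefficient of the two sets.

pairs_true = {(i, l) for i, l in enumerate(y_true)}   # true (sample, label) pairs
pairs_pred = {(i, l) for i, l in enumerate(y_pred)}   # predicted (sample, label) pairs

precision = len(pairs_true & pairs_pred) / len(pairs_pred)   # 2/6, the micro-averaged precision
recall    = len(pairs_true & pairs_pred) / len(pairs_true)   # 2/6, the micro-averaged recall
f1 = 2 * len(pairs_true & pairs_pred) / (len(pairs_true) + len(pairs_pred))  # same form as the Dice coefficient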

1 reaction
Rubix982 commented, Nov 21, 2020

Hi, guys! I have a paper starting next Tuesday and I’m kind of busy with preparation this weekend. I’ll be back next week after my papers end. Hope you don’t mind, @CameronBieganek 😄

Read more comments on GitHub >

Top Results From Across the Web

  • Choosing Performance Metrics. Accuracy, recall, precision, F1 ...
    Accuracy, recall, precision, F1 score––how do you choose a metric for judging model performance? And once you choose, do you want the macro...
  • How to Calculate Precision, Recall, and F-Measure for ...
    In this tutorial, you will discover how to calculate and develop an intuition for precision and recall for imbalanced classification. After ...
  • Micro-average & Macro-average Scoring Metrics - Python
    In this post, you will learn about how to use micro-averaging and macro-averaging methods for evaluating scoring metrics (precision, recall, ...
  • Why are precision, recall and F1 score equal when using ...
    In a recent project I was wondering why I get the exact same value for precision, recall and the F1 score when using...
  • Confusion matrix- Machine learning | Clairvoyant Blog
    Accuracy is calculated for the model as a whole but recall and precision are calculated for individual classes. We use macro or micro...
