Potential bug/confusion when is_multilabel=True for some metrics
Hello,

in the documentation for Precision, Recall and ClassificationReport, if the task is multilabel, examples like the following are given:
Multilabel case, the shapes must be (batch_size, num_categories, ...)
.. testcode:: 2

    metric = ClassificationReport(output_dict=True, is_multilabel=True)
    metric.attach(default_evaluator, "cr")
    y_true = torch.Tensor([
        [0, 0, 1],
        [0, 0, 0],
        [0, 0, 0],
        [1, 0, 0],
        [0, 1, 1],
    ]).unsqueeze(0)
    y_pred = torch.Tensor([
        [1, 1, 0],
        [1, 0, 1],
        [1, 0, 0],
        [1, 0, 1],
        [1, 1, 0],
    ]).unsqueeze(0)
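For reference, the unsqueeze(0) calls above turn the 5 x 3 tensors into 1 x 5 x 3 (a quick check, using nothing but torch):

    import torch

    y_true = torch.Tensor([[0, 0, 1], [0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 1]])
    print(y_true.shape)               # torch.Size([5, 3])   -> 5 samples, 3 labels
    print(y_true.unsqueeze(0).shape)  # torch.Size([1, 5, 3]) -> batch_size becomes 1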
In all metric docs, a first dim is added by unsqueeze(0) to both y_true and y_pred, but the docs also state that the shapes must be (batch_size, num_categories, ...). From my understanding of the latter, both y_true and y_pred should have shape n_samples x num_categories. In this example, that amounts to 5 x 3, i.e. 5 examples/samples and three output labels, each binary.
- Why do we add the dummy singleton dim in the first place? If we don't add it, the per-label computation is not correct, because the sample dimension is taken as the label dimension, i.e. if 256 examples are given, 256 metrics are computed, one per sample instead of one per label (see the sketch below).
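A minimal sketch of the two interpretations, using Precision directly (assuming ignite 0.4.x; to my understanding ClassificationReport is built on top of Precision and Recall, so the same shape handling applies):

    import torch
    from ignite.metrics import Precision

    y_true = torch.Tensor([[0, 0, 1], [0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 1]])
    y_pred = torch.Tensor([[1, 1, 0], [1, 0, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0]])

    precision = Precision(is_multilabel=True, average=True)

    # Interpretation A: shapes are (batch_size=5, num_categories=3),
    # matching the documented (batch_size, num_categories, ...) requirement.
    precision.update((y_pred, y_true))
    print(precision.compute())

    # Interpretation B: the doc example's unsqueeze(0) turns the shapes into
    # (1, 5, 3), so batch_size=1 and the sample dim lands where num_categories
    # is expected.
    precision.reset()
    precision.update((y_pred.unsqueeze(0), y_true.unsqueeze(0)))
    print(precision.compute())

The two calls accumulate over different dimensions, which is exactly the ambiguity asked about above.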
Thanks!
@vfdev-5 I’ll do it, if you will
Sorry, I'm a little busy these days; I have not started this one yet. @nishantb06 you can take this one. I will focus on #2423