sklearn.multiclass.OneVsRestClassifier documentation
Is this documentation for sklearn.multiclass.OneVsRestClassifier correct?
Also known as one-vs-all, this strategy consists in fitting one classifier per class. For each classifier, the class is fitted against all the other classes. … In the multilabel learning literature, OvR is also known as the binary relevance method.
There is similar documentation in the user guide:
This strategy, also known as one-vs-all, is implemented in OneVsRestClassifier. The strategy consists in fitting one classifier per class. For each classifier, the class is fitted against all the other classes.
The preceding section of the user guide gives the following example of a multilabel classification target.
An array such as
np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
represents label 0 in the first sample, labels 1 and 2 in the second sample, and no labels in the third sample.
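To make that reading of the indicator array concrete, here is a small sketch that decodes each row into the set of label indices it marks. The variable names are mine, not from the user guide.

```python
import numpy as np

# The multilabel indicator array quoted from the user guide:
# rows are samples, columns are labels, 1 means "label present".
Y = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])

# For each sample, collect the column indices that are set to 1.
label_sets = [tuple(int(j) for j in np.flatnonzero(row)) for row in Y]

print(label_sets)  # [(0,), (1, 2), ()]
```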
Taken together, these do not make sense to me. The wording “For each classifier, the class is fitted against all the other classes” seems to imply a multiclass problem, not a multilabel problem. In the implementation, it seems each label is actually fitted against its own binary indicator, independently of the other labels. I am further confused because the documentation says one-vs-rest is the same as binary relevance, but the Wikipedia article on multi-label classification seems to say otherwise:
This method of dividing the task into multiple binary tasks has something in common with the one-vs.-all (OvA, or one-vs.-rest, OvR) method for multiclass classification. Note though that it is not the same method: in binary relevance we train one classifier for each label, not one classifier for each possible value for the label.
For what it’s worth, it seems like sklearn.multiclass.OneVsRestClassifier implements what that Wikipedia article would call binary relevance, not one-vs-rest.
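A rough way to check this observation is to compare OneVsRestClassifier on a multilabel indicator target against one binary classifier fitted separately on each label column. This is only a sketch: the synthetic data, the LogisticRegression base estimator, and the variable names are my assumptions, not something taken from the documentation.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic data (an assumption for illustration only).
rng = np.random.RandomState(0)
X = rng.randn(60, 4)
# Three binary labels per sample, stored as a multilabel indicator matrix.
Y = (X[:, :3] + 0.5 * rng.randn(60, 3) > 0).astype(int)

base = LogisticRegression()

# OneVsRestClassifier given a multilabel indicator target.
ovr = OneVsRestClassifier(base).fit(X, Y)

# "Binary relevance" done by hand: one independent classifier per label column.
manual = [clone(base).fit(X, Y[:, j]) for j in range(Y.shape[1])]
manual_pred = np.column_stack([m.predict(X) for m in manual])

same = np.array_equal(ovr.predict(X), manual_pred)
n_estimators = len(ovr.estimators_)
multilabel = ovr.multilabel_

print(same, n_estimators, multilabel)
```

If the predictions agree and there is one fitted estimator per label column, that is consistent with each label being fitted independently of the others when the target is a multilabel indicator matrix. It does not say anything about the multiclass case, where the target is a single column of class values.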
Issue Analytics
- Created 6 years ago
- Comments: 15 (11 by maintainers)
Top GitHub Comments
It did indeed reach my inbox. I don’t think there was anything wrong with posting your comment; it displayed how difficult it is to deal with the mix of terminology. But hopefully there aren’t terribly many problems to solve here.
@jnothman Sorry about that! I agree that there isn’t as much of an issue as I had initially thought, and wish I hadn’t made the comment in the first place. I deleted my comment on GitHub soon after I had made it. But, I think because of email replying, it still reached your inbox.