Confusion Matrix Representation / Return Value
Describe the workflow you want to enable
An enhancement to the output of the confusion_matrix function, to better represent the true and predicted values for multiclass problems.
Current representation, with code:
from sklearn.metrics import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
Output:
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
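For reference, the returned ndarray follows scikit-learn's convention that rows correspond to true labels and columns to predicted labels, in the order given by labels. A short sketch of reading it:

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

cm = confusion_matrix(y_true, y_pred, labels=labels)

# cm[i, j] counts samples whose true label is labels[i] and whose
# predicted label is labels[j].
print(cm[2, 0])  # samples that are truly "cat" but predicted "ant" -> 1
```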
Describe your proposed solution
With multiple classes it is difficult to read the returned ndarray and associate each row and column with the corresponding true and predicted labels.
The proposed output should look similar to the table below, which makes the confusion matrix easier to read.
| True \ Predicted | ant | bird | cat |
|---|---|---|---|
| ant | 2 | 0 | 0 |
| bird | 0 | 0 | 1 |
| cat | 1 | 0 | 2 |
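A labeled view like this can already be approximated today by wrapping the returned ndarray in a pandas DataFrame; a minimal sketch (not part of the proposed API):

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

cm = confusion_matrix(y_true, y_pred, labels=labels)
# Attach the class names to both axes so the matrix is self-describing.
df = pd.DataFrame(
    cm,
    index=pd.Index(labels, name="True"),
    columns=pd.Index(labels, name="Predicted"),
)
print(df)  # rows are true labels, columns are predicted labels
```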
Possible Solutions:
- Provide a parameter to pretty-print the matrix, e.g. printMatrix [type: bool].
- Include another parameter so that, along with the ndarray, the function returns the index as the true values and the columns as the predicted values.
For example:
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
index=["true:ant", "true:bird", "true:cat"]
columns=["pred:ant", "pred:bird", "pred:cat"]
return cm, index, columns
The result can then be easily converted into a DataFrame for further use.
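Under this proposal, the caller could combine the three return values into a DataFrame in one step. A sketch using hypothetical return values (today confusion_matrix returns only the ndarray, not index/columns):

```python
import numpy as np
import pandas as pd

# Hypothetical return values mirroring the proposal above; these are
# not produced by the current confusion_matrix API.
cm = np.array([[2, 0, 0], [0, 0, 1], [1, 0, 2]])
index = ["true:ant", "true:bird", "true:cat"]
columns = ["pred:ant", "pred:bird", "pred:cat"]

# One-step conversion to a labeled table.
df = pd.DataFrame(cm, index=index, columns=columns)
print(df.loc["true:cat", "pred:ant"])  # -> 1
```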
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 1
- Comments: 18 (10 by maintainers)
Top GitHub Comments
Returning tuples as keys within the nested dict works well with casting to a DataFrame; please refer to the dict structure below representing the same.
Screenshot: (not preserved)
Another option: a flat dict. The problem with the flat-dict option is that casting to a DataFrame needs extra steps. If we want a one-step conversion, then the nested dict with tuples as keys is better, as seen above.
Flat dict:
Screenshot: (not preserved)
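Since the screenshots are not preserved, here is one possible flat-dict shape and the extra step it needs before it becomes a 2-D table (an assumed structure, not necessarily the exact one from the comment):

```python
import pandas as pd

# Assumed flat-dict shape: keys are (true_label, predicted_label) pairs.
flat = {
    ("ant", "ant"): 2, ("ant", "bird"): 0, ("ant", "cat"): 0,
    ("bird", "ant"): 0, ("bird", "bird"): 0, ("bird", "cat"): 1,
    ("cat", "ant"): 1, ("cat", "bird"): 0, ("cat", "cat"): 2,
}
# Extra step: build a Series with a MultiIndex, then unstack the
# predicted level into columns to get a 2-D confusion table.
s = pd.Series(flat)
s.index = pd.MultiIndex.from_tuples(s.index, names=["true", "pred"])
df = s.unstack("pred")
print(df)
```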
I think the nested dict with tuples as keys represents the data very well, preserving class names and allowing easy conversion to a DataFrame. Please let me know your thoughts, and I'll make the changes accordingly. Thanks!
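As an illustration of the one-step conversion, a sketch of a nested dict with tuple keys (the exact structure in the screenshot is not preserved, so this shape is an assumption):

```python
import pandas as pd

# Assumed structure: outer tuple keys name the predicted labels, inner
# tuple keys name the true labels.
cm_dict = {
    ("pred", "ant"):  {("true", "ant"): 2, ("true", "bird"): 0, ("true", "cat"): 1},
    ("pred", "bird"): {("true", "ant"): 0, ("true", "bird"): 0, ("true", "cat"): 0},
    ("pred", "cat"):  {("true", "ant"): 0, ("true", "bird"): 1, ("true", "cat"): 2},
}
# One-step conversion: pandas builds the labeled table directly from
# the nested dict, with no reshaping required.
df = pd.DataFrame(cm_dict)
print(df)
```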
@jnothman @glemaitre taking your comments into due consideration, I've added another change so that it returns output in a default data type, i.e. dict(), eliminating the need for hard dependencies and increasing usability with any other 3rd-party libs too.
Please refer to the example code and screenshots. Thanks!
Example:
from sklearn.metrics._classification import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"], pprint=True)
Code: Output as a dict()
Output with pprint=False: (screenshot not preserved)
Output with pprint=True: (screenshot not preserved)
Changes required for this solution: (screenshot not preserved)
Option 2: for a better understanding of the true and predicted values.
Output: (screenshot not preserved)