What-If Tool: thresholded inference problem (confusion matrix/ROC)

Version info:

  • TensorBoard 1.12.0a0
  • TensorFlow 1.8.0
  • MacOS 10.13.6
  • Python 2.7

Description: Running 2-class classification with a custom estimator produces incorrect confusion matrix and ROC curve values. Dragging the threshold slider changes the “actual yes/no” percentages (see screenshots). In addition, when a vocab file is used to specify the labels (“False”, “True”), the legend shows “False” and “undefined”. The inference scores appear to be returned correctly.

[Three screenshots taken 2018-09-26, 3:51 to 3:56 pm; images not reproduced here]

Even if my model were incorrect, I would expect the “actual” sample counts to be independent of the threshold.
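The expectation above can be checked with a minimal sketch. This is not the What-If Tool's code; the `confusion_counts` helper and the toy labels and scores are hypothetical and only illustrate that the "actual" totals come from the ground truth alone, so they should stay fixed while the threshold moves.

```python
import numpy as np

def confusion_counts(labels, positive_scores, threshold):
    """Return (tp, fp, fn, tn) for binary labels at a given score threshold."""
    labels = np.asarray(labels).astype(bool)
    predicted = np.asarray(positive_scores) >= threshold
    tp = int(np.sum(predicted & labels))
    fp = int(np.sum(predicted & ~labels))
    fn = int(np.sum(~predicted & labels))
    tn = int(np.sum(~predicted & ~labels))
    return tp, fp, fn, tn

# Hypothetical toy data, not taken from the issue.
labels = [0, 0, 0, 1, 0, 1, 0, 0]
scores = [0.1, 0.4, 0.2, 0.8, 0.6, 0.3, 0.05, 0.7]

for threshold in (0.25, 0.5, 0.75):
    tp, fp, fn, tn = confusion_counts(labels, scores, threshold)
    # Actual-yes (tp + fn) and actual-no (fp + tn) depend only on the labels.
    print(threshold, "actual yes:", tp + fn, "actual no:", fp + tn)
```

If a tool reports different "actual" totals at different thresholds, the labels are probably being misread somewhere rather than the model being wrong.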

Context: The classification API is used as

signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: tf.estimator.export.ClassificationOutput(scores=softmax, classes=None)

where softmax is a (?, 2)-shaped Tensor. This leads to the following signature:

The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: softmax/Reshape_1:0
Method name is: tensorflow/serving/classify

The ground truth is specified as an integer label, 0 or 1 (about 97% of examples are 0 and 3% are 1).
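For reference, here is a rough sketch of how ROC points could be derived from a two-column score array and integer labels like those described above. It assumes that column 1 of the scores holds the probability of class 1. The issue does not confirm that, and the `roc_points` helper and toy data are hypothetical.

```python
import numpy as np

def roc_points(labels, softmax_scores, thresholds):
    """Return a list of (fpr, tpr) pairs, one per threshold."""
    labels = np.asarray(labels).astype(bool)
    # Assumption: column 1 is the score for the positive class (label 1).
    positive = np.asarray(softmax_scores)[:, 1]
    n_pos = max(int(np.sum(labels)), 1)
    n_neg = max(int(np.sum(~labels)), 1)
    points = []
    for t in thresholds:
        predicted = positive >= t
        tpr = np.sum(predicted & labels) / n_pos
        fpr = np.sum(predicted & ~labels) / n_neg
        points.append((float(fpr), float(tpr)))
    return points

# Hypothetical toy data: five examples, two of them positive.
toy_labels = [0, 0, 1, 0, 1]
toy_scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7], [0.55, 0.45]]
points = roc_points(toy_labels, toy_scores, [0.0, 0.5, 1.01])
print(points)
```

The denominators `n_pos` and `n_neg` are computed once from the labels, which is why the "actual" counts cannot vary with the threshold in this formulation.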

[Screenshot taken 2018-09-26, 3:51 pm; image not reproduced here]

Inference result as shown in the datapoint editor: [screenshot taken 2018-09-26, 3:57 pm; image not reproduced here]

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

jameswex commented, Sep 28, 2018 (1 reaction)

Thanks for the doc reference. I created #1471 to have the what-if tool handle empty labels by using indices for the class labels, which seems to do the right thing for your case.
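The idea of falling back to indices when no class labels are provided might look roughly like the sketch below. This is not the change made in #1471; the `class_labels` function and its parameters are hypothetical and only illustrate the behavior described in the comment.

```python
def class_labels(num_classes, vocab=None):
    """Return display labels, using the class index where no label is given."""
    vocab = vocab or []
    labels = []
    for i in range(num_classes):
        # Use the vocab entry when it exists and is non-empty, else the index.
        if i < len(vocab) and vocab[i]:
            labels.append(vocab[i])
        else:
            labels.append(str(i))
    return labels

print(class_labels(2))                     # no vocab: indices as labels
print(class_labels(2, ["False", "True"]))  # full vocab: labels kept as-is
```

With a fallback like this, a missing or short label list yields an index string instead of an "undefined" entry in the legend.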

jameswex commented, Sep 27, 2018 (1 reaction)

@reinhouthooft Thanks for the bug report. Would you be willing to provide the saved model and a tf record file of examples for me to reproduce the problem with? Or is the model and/or data not for sharing?
