Misleading plot example for Isolation Forest method

Describe the issue linked to the documentation

I have observed that the plotted features receive the color of their expected label, i.e., train, test, or outlier. https://github.com/scikit-learn/scikit-learn/blob/778b11904e8ec0286e977582d37e7ca495947ee5/examples/ensemble/plot_isolation_forest.py#L57-L62

Instead, I would expect to plot the predicted labels, i.e., the color of the plotted features according to the prediction of the method. In the published example, the variables defined below are not used for the plot. https://github.com/scikit-learn/scikit-learn/blob/778b11904e8ec0286e977582d37e7ca495947ee5/examples/ensemble/plot_isolation_forest.py#L45-L47

Suggest a potential alternative/fix

The expected plot would be something like this:

b1 = plt.scatter(X_train[:, 0], X_train[:, 1], c=['white' if x > 0 else 'red' for x in y_pred_train],
                 s=20, edgecolor='k')
b2 = plt.scatter(X_test[:, 0], X_test[:, 1], c=['green' if x > 0 else 'red' for x in y_pred_test],
                 s=20, edgecolor='k')
c = plt.scatter(X_outliers[:, 0], X_outliers[:, 1], c=['yellow' if x > 0 else 'red' for x in y_pred_outliers],
                s=20, edgecolor='k')

With this change, the resulting plot would look different (image: isolation_forest_doc_sklearn).
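
For readers who want to try the suggestion without the rest of the example, here is a minimal self-contained sketch. The data generation and the `IsolationForest` settings are modelled on the linked example but should be treated as assumptions, and `colors_from_predictions` is a hypothetical helper introduced only for this illustration. The `Agg` backend is selected so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)

# Two Gaussian clusters for train/test plus uniform noise as "outliers"
# (sizes and ranges are assumptions modelled on the linked example).
X = 0.3 * rng.randn(100, 2)
X_train = np.r_[X + 2, X - 2]
X = 0.3 * rng.randn(20, 2)
X_test = np.r_[X + 2, X - 2]
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))

clf = IsolationForest(max_samples=100, random_state=42)
clf.fit(X_train)


def colors_from_predictions(y_pred, inlier_color):
    """Hypothetical helper: +1 (inlier) keeps the group color, -1 turns red."""
    return [inlier_color if label > 0 else "red" for label in y_pred]


# Color each group by what the model predicts, not by where the data came from.
groups = [(X_train, "white"), (X_test, "green"), (X_outliers, "yellow")]
fig, ax = plt.subplots()
for data, inlier_color in groups:
    y_pred = clf.predict(data)
    ax.scatter(data[:, 0], data[:, 1],
               c=colors_from_predictions(y_pred, inlier_color),
               s=20, edgecolor="k")
ax.set_title("IsolationForest: points colored by predicted label")
```

Any point drawn in red is one the model flagged as an outlier, regardless of which group it was generated in.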

Link to the online example: https://scikit-learn.org/stable/auto_examples/ensemble/plot_isolation_forest.html

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
pedrormjunior commented, Jan 14, 2020

I agree with you guys: plotting a discrete decision boundary would be nice!
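
One way to draw such a discrete boundary is to call `predict` on a grid and fill the two resulting regions. The sketch below is only an illustration under assumed data, grid limits, and colors, not the change that was eventually made to the scikit-learn example.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)

# Training data: two Gaussian clusters (an assumption for illustration).
X = 0.3 * rng.randn(100, 2)
X_train = np.r_[X + 2, X - 2]

clf = IsolationForest(max_samples=100, random_state=0).fit(X_train)

# Evaluate the hard prediction (+1 inlier, -1 outlier) on a regular grid.
xx, yy = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
grid = np.c_[xx.ravel(), yy.ravel()]
Z = clf.predict(grid).reshape(xx.shape)

fig, ax = plt.subplots()
# Two filled levels: below 0 is the outlier region, above 0 the inlier region.
ax.contourf(xx, yy, Z, levels=[-1.5, 0, 1.5], colors=["salmon", "lightblue"])
ax.scatter(X_train[:, 0], X_train[:, 1], c="white", s=20, edgecolor="k")
ax.set_title("IsolationForest: discrete decision boundary (predict)")
```

Using `predict` instead of `decision_function` gives a two-color map, so the boundary between the regions is exactly where the predicted label flips.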

0 reactions
cmarmo commented, Sep 21, 2022

Related to #22406.


