[Feature Suggestion] Add histogram plot of residual errors to ResidualsPlot

The current ResidualsPlot shows training and testing residuals as a scatter plot. By eye we can get an idea of whether more errors fall above or below the 0 line. Adding a histogram of the testing errors might make it clearer whether the errors follow a Normal distribution.
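As a numeric complement to reading the histogram by eye, the sample skewness of the residuals gives a rough indication of asymmetry. The sketch below is only an illustration and is not part of Yellowbrick: the helper name `sample_skewness` and the residual values are made up, and residuals are assumed to be predicted minus actual, as in the snippet in this issue.

```python
def sample_skewness(values):
    """Return the (biased) sample skewness m3 / m2**1.5 of a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical residuals with a few large negative errors (long left tail)
residuals = [-6.0, -3.0, -1.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 1.0]

skew = sample_skewness(residuals)
if skew < 0:
    print("negatively skewed: the left tail is longer")
elif skew > 0:
    print("positively skewed: the right tail is longer")
else:
    print("symmetric about the mean")
```

A value near zero is consistent with a symmetric distribution, although it does not by itself show that the errors are Normal.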

In the following example I have some large positive and negative errors. From the histogram it looks as though I have a negatively skewed distribution, which might tell me something about my training examples:

import matplotlib.pyplot as plt
import pandas as pd
from yellowbrick.regressor import ResidualsPlot

# clf, clone_estimator and the train/test splits are not shown in the issue
fig, ax = plt.subplots(figsize=(8, 6))
model = ResidualsPlot(clone_estimator(clf), ax=ax)
model.fit(X_train, y_train)
model.score(X_test, y_test)

# add histogram of residual errors as an inset (figure coordinates)
left, bottom, width, height = [0.65, 0.17, 0.2, 0.2]
ax2 = fig.add_axes([left, bottom, width, height])

# residuals here are predicted minus actual
testing_residuals = pd.Series(model.predict(X_test) - y_test)
testing_residuals.plot(kind="hist", bins=50, title="Residuals on Predicted", ax=ax2)
ax2.vlines(0, ymin=0, ymax=ax2.get_ylim()[1])  # add x == 0 line

model.poof()

It isn’t obvious where the best location for the histogram would be. Annoyingly, I cannot set an alpha value for ax2 either (I’d hoped to make it semi-transparent so that its location was less of an issue).
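One possible way to get a semi-transparent inset, sketched here with plain matplotlib rather than Yellowbrick, is to set the alpha of the inset's background patch and of the histogram bars separately. The data below are randomly generated stand-ins for the predictions and residuals, and the alpha values are arbitrary choices; whether this matches what the issue intended is an assumption.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

random.seed(0)
# Hypothetical stand-ins for predicted values and (predicted - actual) residuals
predicted = [random.uniform(0, 10) for _ in range(500)]
residuals = [random.gauss(0, 1) for _ in range(500)]

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(predicted, residuals, s=10)
ax.axhline(0, color="black", linewidth=1)

# Inset axes in figure coordinates: [left, bottom, width, height]
ax2 = fig.add_axes([0.65, 0.17, 0.2, 0.2])
ax2.patch.set_alpha(0.5)                 # semi-transparent inset background
ax2.hist(residuals, bins=50, alpha=0.6)  # semi-transparent bars
ax2.axvline(0, color="black", linewidth=1)
ax2.set_title("Residuals", fontsize=8)

fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() as needed
```

With the background and bars both partly transparent, scatter points underneath the inset remain visible, so the exact placement matters less.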

Issue Analytics

Created: 6 years ago
Reactions: 1
Comments: 12 (12 by maintainers)

Top GitHub Comments

2 reactions

ianozsvald commented, Jun 21, 2017

Well, thank you all too for putting this library together, I’ve got a bunch of my own hacky viz tools but you’ve built something far more useful here. @rebeccabilbro’s talk for us at the conference (and the book signing she joined me for) was ace 😃

0 reactions

ianozsvald commented, Jun 18, 2018

Overlaying test and train probably looks fine (given the PDF). That should be more comparable than having one stacked on the other? The PDF should look lovely regardless. I look forward to seeing it 😃
