`QuantileDifferenceReason` and `StandardDeviationReason`

Hey! I was wondering whether it would make sense to add two more reasons for regression tasks, namely something like HighLeveragePointReason and HighStudentizedResidualReason.

Citing Wikipedia:

  • Leverage is a measure of how far away the independent variable values of an observation are from those of the other observations. High-leverage points, if any, are outliers with respect to the independent variables.
  • A studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. […] This is an important technique in the detection of outliers.
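
To make the two definitions concrete, here is a minimal sketch for a one-feature least-squares fit. The function name, the toy data, and the cut-off `2 * p / n` are assumptions made for illustration and are not part of the library or of the proposal above. With several features, the leverages would instead be the diagonal of the hat matrix.

```python
from math import sqrt


def leverage_and_studentized_residuals(x, y):
    """Fit y = a + b * x by least squares; return (leverages, studentized residuals)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Leverage h_i: closed form of the hat-matrix diagonal for one feature plus intercept.
    leverages = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Residual variance estimate with n - 2 degrees of freedom (two fitted parameters).
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    # Internally studentized residual: residual divided by its estimated standard deviation.
    studentized = [r / sqrt(sigma2 * (1 - h)) for r, h in zip(residuals, leverages)]
    return leverages, studentized


# Toy data (made up): the last observation lies far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]
leverages, studentized = leverage_and_studentized_residuals(x, y)

# Assumed rule of thumb: flag leverage above 2 * p / n, with p = 2 fitted parameters.
high_leverage = [h > 2 * 2 / len(x) for h in leverages]
```

Only the last observation is flagged here, because its `x` value is far from the rest. A reason built on studentized residuals could threshold `studentized` in the same way, for example on its absolute value.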

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 31 (31 by maintainers)

Top GitHub Comments

5 reactions
FBruzzesi commented, Dec 19, 2021

The following results are mean scores across 500 different random states for each (reason, %shuffled) pair:

| reason | recall | precision | fpr | %shuffled |
|---|---|---|---|---|
| QuantileDifferenceReason | 0.31 | 0.05 | 0.051 | 1% |
| QuantileDifferenceReason | 0.40 | 0.24 | 0.042 | 5% |
| QuantileDifferenceReason | 0.37 | 0.42 | 0.033 | 10% |
| QuantileDifferenceReason | 0.29 | 0.62 | 0.023 | 20% |
| BoxplotReason | 0.10 | 0.07 | 0.004 | 1% |
| BoxplotReason | 0.13 | 0.34 | 0.003 | 5% |
| BoxplotReason | 0.11 | 0.47 | 0.002 | 10% |
| BoxplotReason | 0.06 | 0.52 | 0.0007 | 20% |
| StandardDeviationReason | 0.29 | 0.072 | 0.037 | 1% |
| StandardDeviationReason | 0.34 | 0.296 | 0.029 | 5% |
| StandardDeviationReason | 0.30 | 0.461 | 0.023 | 10% |
| StandardDeviationReason | 0.26 | 0.703 | 0.014 | 20% |
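
For readers unfamiliar with the three columns, the sketch below shows how recall, precision and false positive rate (fpr) can be computed from per-row flags. It assumes that a "positive" is a row whose target was shuffled and that each reason yields one boolean flag per row. That is a guess at the benchmark setup, not a description of the notebook's code, and `flag_metrics` is a hypothetical helper.

```python
def flag_metrics(flagged, shuffled):
    """Return (recall, precision, fpr) of boolean flags against boolean ground truth."""
    pairs = list(zip(flagged, shuffled))
    tp = sum(1 for f, s in pairs if f and s)          # flagged and really shuffled
    fp = sum(1 for f, s in pairs if f and not s)      # flagged but untouched
    fn = sum(1 for f, s in pairs if not f and s)      # shuffled but missed
    tn = sum(1 for f, s in pairs if not f and not s)  # untouched and not flagged
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, precision, fpr


# Toy example (made up): 10 rows, the first two had their target shuffled.
shuffled = [True, True] + [False] * 8
flagged = [True, False, True] + [False] * 7
recall, precision, fpr = flag_metrics(flagged, shuffled)
```

Under this reading, precision depends on how many rows were shuffled: when only 1% are, even a small fpr can produce more false positives than true positives, which is consistent with precision rising with %shuffled in the table.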

2 reactions
FBruzzesi commented, Dec 27, 2021

@koaning I finally found the time to write a notebook; you can find it here.
