Percentile Linear returns incorrect value.
The percentile function does not return the expected values. It seems to be getting the linear distance between two points the wrong way around and returns a value closer to the lower number.
For example:
array([ 110531353, 167471747, 167471747, 183000406,
200000000, 759174457, 921094606, 931142911,
1300000000, 1341797102, 1380317195, 1380317195,
1500000000, 1500000000, 1500000000, 1830004057,
1932444073, 2000000000, 2000000000, 2345976525, 2500000000,
2745006085, 2847019692, 3000000000, 3000000000, 3000000000,
3000000000, 3312761268, 3500000000, 3588824707, 4000000000,
4140951585, 5000000000, 6600000000, 7100000000, 7717299940,
8445515490, 8972061767, 9662220364, 11000000000, 11042537559,
12422854754, 13000000000, 13000000000, 13803171949, 15000000000,
17000000000, 20000000000, 22085075118, 24845709508, 32025070994,
34000000000, 35000000000, 36000000000, 36000000000, 39453076027,
40000000000, 46930784627, 82819031694, 110425375592], dtype=int64)
>>> np.percentile(a, 75)
14102378961.75
>>> np.percentile(a, 50)
3794412353.5
>>> np.percentile(a, 25)
1747503042.75
expected values:
75 = 14700792987.25
50 = 3794412353.5
25 = 1582501014.25
From the documentation, the linear method itself seems correct, but there may be an issue with the fraction it uses to interpolate between the two neighbouring values.
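Both sets of numbers can be reproduced by hand, which suggests the difference lies in where the percentile position is placed rather than in the interpolation step. The sketch below is plain Python and not NumPy's actual implementation; the helper names (`interpolate_at`, `position_n_minus_one`, `position_n_plus_one`) are made up for illustration. It assumes the reported results use a 0-based position of (n - 1) * q / 100 and the expected ones a 1-based rank of (n + 1) * q / 100.

```python
import math

# Data from the report, already sorted (n = 60).
a = [
    110531353, 167471747, 167471747, 183000406, 200000000, 759174457,
    921094606, 931142911, 1300000000, 1341797102, 1380317195, 1380317195,
    1500000000, 1500000000, 1500000000, 1830004057, 1932444073, 2000000000,
    2000000000, 2345976525, 2500000000, 2745006085, 2847019692, 3000000000,
    3000000000, 3000000000, 3000000000, 3312761268, 3500000000, 3588824707,
    4000000000, 4140951585, 5000000000, 6600000000, 7100000000, 7717299940,
    8445515490, 8972061767, 9662220364, 11000000000, 11042537559,
    12422854754, 13000000000, 13000000000, 13803171949, 15000000000,
    17000000000, 20000000000, 22085075118, 24845709508, 32025070994,
    34000000000, 35000000000, 36000000000, 36000000000, 39453076027,
    40000000000, 46930784627, 82819031694, 110425375592,
]


def interpolate_at(sorted_vals, pos):
    """Linearly interpolate at a fractional 0-based position (hypothetical helper)."""
    pos = min(max(pos, 0.0), len(sorted_vals) - 1)  # clamp to the valid range
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def position_n_minus_one(sorted_vals, q):
    # Assumed convention behind the reported values: 0-based position (n - 1) * q / 100.
    return interpolate_at(sorted_vals, (len(sorted_vals) - 1) * q / 100)


def position_n_plus_one(sorted_vals, q):
    # Assumed convention behind the expected values: 1-based rank (n + 1) * q / 100,
    # shifted by one to index a 0-based list.
    return interpolate_at(sorted_vals, (len(sorted_vals) + 1) * q / 100 - 1)


for q in (75, 50, 25):
    print(q, position_n_minus_one(a, q), position_n_plus_one(a, q))
```

Under these assumptions the 75th percentile sits 0.25 of the way from 13803171949 to 15000000000 with the first convention and 0.75 of the way with the second, so the interpolation fraction is not reversed; the two conventions simply land at different positions. The median is the same either way.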
Issue Analytics
- State:
- Created 7 years ago
- Comments:21 (15 by maintainers)
Top GitHub Comments
There is nothing wrong with it, but there are different ways to calculate a percentile, see gh-10736. In particular, like many other default values, NumPy’s is a sample percentile and not a population estimate (there are many population estimate choices!). Comments on that PR are very welcome.
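As a small illustration of how much the choice of definition matters, here is a sketch that assumes NumPy 1.22 or later, where `np.percentile` accepts a `method` argument (earlier releases used an `interpolation` argument with fewer options). The four-element sample is made up for the example.

```python
import numpy as np

data = [10, 20, 30, 40]  # made-up sample, not the data from the report

# Default definition ("linear"): position (n - 1) * q / 100 in the sorted data.
default_q25 = np.percentile(data, 25)

# One of the population-estimate style alternatives: "weibull",
# which places the percentile at rank (n + 1) * q / 100.
weibull_q25 = np.percentile(data, 25, method="weibull")

print(default_q25, weibull_q25)
```

On this sample the two definitions should give 17.5 and 12.5 respectively, so neither result is a bug; they answer slightly different questions.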
@apbard I completely agree with you that Method1 should be available, and that the names of the other interpolation methods are confusing. I posted this as a separate issue before finding your message in this thread.