[BUG] Numerical inaccuracy in summation based routines


Describe the bug

Bottleneck’s implementations of routines that contain a summation yield different results than numpy for floating-point input. This seems to stem from the fact that numpy uses some sort of compensated summation algorithm to increase accuracy, while bottleneck uses a straight running sum, e.g. in bottleneck/src/reduce_template.c:

FOR {
    const npy_DTYPE0 ai = AI(DTYPE0);
    if (!bn_isnan(ai)) {
        asum += ai;
    }
}

To Reproduce

import numpy as np
import bottleneck

# Adding float32 eps to 2.0f yields 2.0f again, so a more careful scheme
# (e.g. Kahan summation) is needed to get a result != 2.0f.
arr = np.hstack(([np.float32(2.)], np.repeat(np.finfo(np.float32).eps, 100000).astype(np.float32)))
print('numpy: ', np.nansum(arr))
print('bottleneck: ', bottleneck.nansum(arr))

Output:

numpy:  2.011919
bottleneck:  2.0

System: Linux-5.11.11-arch1-1-x86_64-with-glibc2.33, Python 3.9.2 (default, Feb 20 2021, 18:40:11) [GCC 10.2.0], bottleneck 1.3.2
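
The reproduction relies on a float32 rounding effect that can be checked in isolation. The following is a minimal sketch using only numpy; the variable names are illustrative and not taken from the issue. It shows that the gap between 2.0 and the next float32 is twice the machine epsilon, so adding a single epsilon lands exactly halfway and rounds back to 2.0.

```python
import numpy as np

two = np.float32(2.0)
eps = np.finfo(np.float32).eps  # 2**-23, the float32 spacing just above 1.0

# Spacing between 2.0 and the next representable float32 above it.
gap = np.spacing(two)

print(gap == 2 * eps)    # the gap at 2.0 is twice eps
print(two + eps == two)  # a single eps is absorbed (round-half-to-even)
```

Because every individual addition is absorbed in the same way, a plain running sum that starts at 2.0 never moves, no matter how many epsilons follow.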

Expected behavior

Because the implementation can be switched for non-obvious reasons (such as a fallback to numpy routines for non-native byte order), results from bottleneck routines and numpy should match. If a complete match is not attainable, the documentation should state clearly that bottleneck does not always reproduce numpy results.

Additional context

https://github.com/astropy/astropy/issues/11492

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 5

Top GitHub Comments

qwhelan commented, Aug 4, 2021 (1 reaction)

@sebasv Sorry for the lack of response here, I’ve had significantly less bandwidth this year.

I believe it’s possible to match numpy’s output while also being faster. I have some unfinished local commits that accomplish part of this; the biggest issue is that I would want to fix this for every function in one release.

sebasv commented, Aug 7, 2021 (0 reactions)

I now believe that NumPy uses pairwise summation (see numpy/numpy#3685). In terms of worst-case rounding-error growth, naive summation is O(n), pairwise summation is O(log(n)), and Kahan summation is O(1). With a large base case, naive and pairwise summation run at essentially the same speed (only minimal recursion overhead), whereas Kahan requires about 4 times as many additions. So perhaps pairwise is the best fit for Bottleneck?
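
To make the comparison concrete, here is a rough pure-Python sketch of the three strategies applied to the array from the reproduction. The function names and the base-case size of 128 are assumptions chosen for illustration; this is not NumPy's or Bottleneck's actual implementation, only a sketch of the techniques named in the comment.

```python
import numpy as np

def naive_sum(values):
    """Straight running sum in float32."""
    total = np.float32(0.0)
    for v in values:
        total = np.float32(total + v)
    return total

def pairwise_sum(values, block=128):
    """Pairwise summation: split in halves, with a naive sum as the base case.
    The block size of 128 is an illustrative assumption."""
    n = len(values)
    if n <= block:
        return naive_sum(values)
    mid = n // 2
    left = pairwise_sum(values[:mid], block)
    right = pairwise_sum(values[mid:], block)
    return np.float32(left + right)

def kahan_sum(values):
    """Kahan (compensated) summation: carry the rounding error of each step."""
    total = np.float32(0.0)
    comp = np.float32(0.0)  # running compensation for lost low-order bits
    for v in values:
        y = np.float32(v - comp)
        t = np.float32(total + y)
        comp = np.float32((t - total) - y)
        total = t
    return total

eps = np.finfo(np.float32).eps
arr = np.hstack(([np.float32(2.0)], np.repeat(eps, 100000))).astype(np.float32)
exact = 2.0 + 100000 * float(eps)  # reference value computed in float64

for name, fn in [("naive", naive_sum), ("pairwise", pairwise_sum), ("kahan", kahan_sum)]:
    result = fn(arr)
    print(f"{name:9s} {result:.7f}  abs error {abs(float(result) - exact):.2e}")
```

In this sketch the naive sum stays at 2.0, while the pairwise variant only loses the small terms that share a base-case block with the leading 2.0 and the Kahan variant recovers almost all of them. The Kahan loop performs four floating-point additions or subtractions per element, which matches the cost estimate in the comment.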


