Unexpected results for the mean of a DataFrame of ufloat from the uncertainties package.
Related to #6898.

I find it very convenient to use a DataFrame of ufloat objects from the uncertainties package. Each entry is a (value, error) pair and could represent the result of Monte Carlo simulations or an experiment.
At present, taking sums along both axes gives the expected result, but taking the mean does not.
import pandas as pd
import numpy as np
from uncertainties import unumpy

# Nominal values and uncertainties for a 3x4 grid
value = np.arange(12).reshape(3, 4)
err = 0.01 * np.arange(12).reshape(3, 4) + 0.005

# Array of ufloat objects (nominal value +/- standard deviation)
data = unumpy.uarray(value, err)
df = pd.DataFrame(data, index=['r1', 'r2', 'r3'], columns=['c1', 'c2', 'c3', 'c4'])
Examples:
print(df)
c1 c2 c3 c4
r1 0.000+/-0.005 1.000+/-0.015 2.000+/-0.025 3.000+/-0.035
r2 4.00+/-0.04 5.00+/-0.06 6.00+/-0.07 7.00+/-0.08
r3 8.00+/-0.09 9.00+/-0.10 10.00+/-0.11 11.00+/-0.12
df.sum(axis=0) # This works
c1 12.00+/-0.10
c2 15.00+/-0.11
c3 18.00+/-0.13
c4 21.00+/-0.14
dtype: object
df.sum(axis=1) # This works
r1 6.00+/-0.05
r2 22.00+/-0.12
r3 38.00+/-0.20
dtype: object
df.mean(axis=0) # This does not work
Series([], dtype: float64)
Expected (`df.apply(lambda x: x.sum() / x.size)`)
c1 4.000+/-0.032
c2 5.00+/-0.04
c3 6.00+/-0.04
c4 7.00+/-0.05
dtype: object
df.mean(axis=1) # This does not work
r1 NaN
r2 NaN
r3 NaN
dtype: float64
Expected (`df.T.apply(lambda x: x.sum() / x.size)`)
r1 1.500+/-0.011
r2 5.500+/-0.031
r3 9.50+/-0.05
dtype: object
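Until mean() handles object-dtype columns, the sum-based expressions above can be wrapped in a small helper. This is only a sketch; the name mean_object is mine, not a pandas API:

# Workaround sketch: divide the working sum() by the number of
# elements averaged over along the chosen axis.
def mean_object(frame, axis=0):
    n = frame.shape[axis]
    return frame.sum(axis=axis) / n

mean_object(df, axis=0)   # reproduces the expected column means above
mean_object(df, axis=1)   # reproduces the expected row means above

This is equivalent to the `df.apply(lambda x: x.sum() / x.size)` expressions used for the expected output.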
Seen from the outside, it looks like in both cases Pandas decrees that the result of `mean()` should be of type `float64`: in @rth's example above the NumPy array actually contains integers, that are converted to `float64` (which is doable); in the case of `uncertainties.UFloat` numbers with uncertainty, forcing the result to `float64` is mostly meaningless (as this would get rid of the uncertainty) and `mean()` does not produce the expected result.

In contrast, as the original post shows, Pandas is more open on the data type of `sum()`, which is, correctly, `object`, for `uncertainties.UFloat` objects. I think that it is desirable that since Pandas is able to `sum()`, it be able to get the `mean()` too (since the mean is not much more than a sum).

I just wanted to be sure that you're not using subclassing or something else like that. In any case, I think this is probably a pandas bug (but would need someone to work through/figure out). We should have a fallback implementation of `mean` (like NumPy's `mean`) that works on object arrays.
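For illustration, here is a minimal sketch of what such an object-dtype fallback could look like (my own code, not pandas internals); it assumes the elements support addition and division by an integer, as `uncertainties.UFloat` does:

import numpy as np

# Fallback sketch: reduce the Python objects with + and divide by the
# count, which is essentially what np.mean does for an object array.
def object_mean(values):
    arr = np.asarray(values, dtype=object)
    return arr.sum() / arr.size

# e.g. object_mean(df['c1']) reproduces the expected 4.000+/-0.032 shown above.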