Unexpected results for the mean of a DataFrame of ufloat from the uncertainties package.

Related to #6898.

I find it very convenient to use a DataFrame of ufloat from the uncertainties package. Each entry consists of (value, error) and could represent the result of Monte Carlo simulations or an experiment.

At present, taking sums along either axis gives the expected result, but taking the mean does not.

import numpy as np
import pandas as pd
from uncertainties import unumpy

# Nominal values 0..11 and their standard deviations, both as 3x4 arrays
value = np.arange(12).reshape(3, 4)
err = 0.01 * np.arange(12).reshape(3, 4) + 0.005

# Object array of ufloat entries (value +/- err)
data = unumpy.uarray(value, err)

df = pd.DataFrame(data, index=['r1', 'r2', 'r3'], columns=['c1', 'c2', 'c3', 'c4'])

Examples:

print(df)
               c1             c2             c3             c4
r1  0.000+/-0.005  1.000+/-0.015  2.000+/-0.025  3.000+/-0.035
r2    4.00+/-0.04    5.00+/-0.06    6.00+/-0.07    7.00+/-0.08
r3    8.00+/-0.09    9.00+/-0.10   10.00+/-0.11   11.00+/-0.12

df.sum(axis=0) # This works

c1    12.00+/-0.10
c2    15.00+/-0.11
c3    18.00+/-0.13
c4    21.00+/-0.14
dtype: object

df.sum(axis=1) # This works

r1     6.00+/-0.05
r2    22.00+/-0.12
r3    38.00+/-0.20
dtype: object

df.mean(axis=0) # This does not work

Series([], dtype: float64)

Expected (`df.apply(lambda x: x.sum() / x.size)`)

c1    4.000+/-0.032
c2      5.00+/-0.04
c3      6.00+/-0.04
c4      7.00+/-0.05
dtype: object

df.mean(axis=1) # This does not work

r1   NaN
r2   NaN
r3   NaN
dtype: float64

Expected (`df.T.apply(lambda x: x.sum() / x.size)`)

r1    1.500+/-0.011
r2    5.500+/-0.031
r3      9.50+/-0.05
dtype: object
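
Until mean() handles object columns, the sum-then-divide idea behind the two "Expected" expressions can be wrapped in a small helper. The following is only a sketch: `object_mean` is a hypothetical name, not a pandas function, and `fractions.Fraction` stands in for ufloat so that the example runs without the uncertainties package. It assumes there are no missing values.

```python
from fractions import Fraction

import numpy as np
import pandas as pd


def object_mean(df, axis=0):
    """Mean of an object-dtype DataFrame computed as sum / count.

    Hypothetical helper: relies only on df.sum() and on each element
    supporting division by an integer. No NaN handling.
    """
    return df.sum(axis=axis) / df.shape[axis]


# Fraction is only a stand-in for ufloat here (an assumption of this sketch);
# the central values 0..11 match the layout used in the post.
data = np.array(
    [[Fraction(4 * r + c) for c in range(4)] for r in range(3)],
    dtype=object,
)
df = pd.DataFrame(data, index=["r1", "r2", "r3"], columns=["c1", "c2", "c3", "c4"])

col_mean = object_mean(df, axis=0)  # one entry per column
row_mean = object_mean(df, axis=1)  # one entry per row
```

With the df from this post, the same call would presumably reproduce the "Expected" output, since it uses nothing beyond df.sum() and dividing each entry by an integer.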

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 3
  • Comments: 15 (9 by maintainers)

Top GitHub Comments

7 reactions
lebigot commented, Sep 7, 2016

Seen from the outside, it looks like Pandas decrees in both cases that the result of mean() should be of type float64. In @rth’s example above, the NumPy array actually contains integers, which are converted to float64 (which is doable). In the case of uncertainties.UFloat numbers with uncertainty, forcing the result to float64 is mostly meaningless, as this would get rid of the uncertainty, and mean() does not produce the expected result.

In contrast, as the original post shows, Pandas is more open on the data type of sum(), which is, correctly, object, for uncertainties.UFloat objects.

Since Pandas is able to sum(), I think it is desirable that it be able to compute the mean() too (the mean is not much more than a sum).

2 reactions
shoyer commented, Sep 6, 2016

I just wanted to be sure that you’re not using subclassing or something else like that.

In any case, I think this is probably a pandas bug (but it would need someone to work through it and figure it out). We should have a fallback implementation of mean (like NumPy’s mean) that works on object arrays.
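
One way to picture such a fallback is to reduce the object array with the elements' own + operator and then divide by the count. The sketch below is not pandas or NumPy internals. `fallback_mean` and the `Measured` class are hypothetical names, and `Measured` is a toy stand-in for a ufloat that assumes independent errors (added in quadrature), whereas the uncertainties package also tracks correlations.

```python
import math
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Measured:
    """Toy value-with-error type (hypothetical, assumes independent errors)."""
    value: float
    std: float

    def __add__(self, other):
        # Independent standard deviations combine in quadrature.
        return Measured(self.value + other.value, math.hypot(self.std, other.std))

    def __truediv__(self, n):
        # Dividing by an exact number scales value and error alike.
        return Measured(self.value / n, self.std / n)


def fallback_mean(values, axis=0):
    """Mean of an object array as (sum along axis) / (count along axis)."""
    arr = np.asarray(values, dtype=object)
    count = arr.shape[axis]
    if count == 0:
        raise ValueError("mean of an empty axis is undefined")
    total = np.add.reduce(arr, axis=axis)  # calls each element's __add__
    return total / count                   # calls each element's __truediv__


# Same 3x4 layout of values and errors as in the original post.
value = np.arange(12).reshape(3, 4)
err = 0.01 * np.arange(12).reshape(3, 4) + 0.005
data = np.frompyfunc(Measured, 2, 1)(value, err)  # object array of Measured

col_mean = fallback_mean(data, axis=0)
row_mean = fallback_mean(data, axis=1)
```

For the data in the post, this gives a first-column mean of 4.0 with an uncertainty of about 0.032, in line with the expected output quoted there. Missing values are not skipped in this sketch.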
