BUG: Inconsistent behaviour when averaging Decimals, floats and ints

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.


from decimal import Decimal

import pandas as pd

# Decimal column next to a float column
df = pd.DataFrame({'col_1': [Decimal(1.5), Decimal(4.0)], 'col_2': [5.0, 10.0]})
df.mean(axis=1) # returns 5, 10 on pandas 1.0.3 -- the Decimal column is ignored in the averaging

# Decimal column next to an int column
df2 = pd.DataFrame({'col_1': [Decimal(1.5), Decimal(4.0)], 'col_2': [5, 10]})
df2.mean(axis=1) # returns 3.25, 7 on pandas 1.0.3 -- the Decimal column is included in the averaging

Problem description

Decimal values are averaged inconsistently depending on whether they are averaged with ints or with floats. Is it expected that the two dataframes above return different results?

Expected Output

I would expect in both cases to see 3.25 and 7 as the mean of the rows.
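The expected values can be checked by hand by converting every cell to a single numeric type before averaging. The following is a minimal sketch using only the standard library; the helper name `mean_as_float` is hypothetical and is not part of pandas.

```python
from decimal import Decimal


def mean_as_float(row):
    # Convert each cell to float first so Decimal and float values can be summed together.
    values = [float(v) for v in row]
    return sum(values) / len(values)


# Same data as the first DataFrame above, one list per row.
rows = [[Decimal(1.5), 5.0], [Decimal(4.0), 10.0]]
print([mean_as_float(r) for r in rows])  # [3.25, 7.0]
```

In pandas, the analogous step would presumably be casting the Decimal column with `astype(float)` before calling `mean`, at the cost of giving up Decimal's exact arithmetic.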

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.6.6.final.0
python-bits : 64
OS : Darwin
OS-release : 18.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.3
numpy : 1.18.3
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3
Cython : None
pytest : 5.4.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.5 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.13.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : 1.3.16
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.2.8
numba : None

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
dsaxton commented, May 19, 2020

@cuchoi I can close. Thanks for the report nonetheless, it’s an interesting edge case

1 reaction
dsaxton commented, May 16, 2020

Now the second row doesn’t include in the average the Decimal in the operation. Is this expected behaviour as well?

Yes, that’s expected / consistent with the above. It tries to do the averaging across all rows and, if that fails, falls back on only the “numeric” columns.
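The fallback described in this comment can be modelled in plain Python. This is only a toy sketch under stated assumptions, not pandas' actual implementation: the function name `row_mean_with_fallback` is hypothetical, and "numeric" is approximated here as built-in `int` and `float` values. It relies on the fact that adding a `Decimal` to a `float` raises `TypeError`, while adding a `Decimal` to an `int` works.

```python
from decimal import Decimal


def row_mean_with_fallback(row):
    """Toy model of the try-then-fall-back behaviour; not pandas' real code."""
    try:
        # First attempt: average every value in the row.
        return sum(row) / len(row)
    except TypeError:
        # Decimal + float raises TypeError, so retry with only int/float values.
        numeric = [v for v in row if isinstance(v, (int, float))]
        return sum(numeric) / len(numeric)


# Decimal next to a float: the first attempt fails, so the Decimal is dropped.
print(row_mean_with_fallback([Decimal("1.5"), 5.0]))  # 5.0

# Decimal next to an int: the first attempt succeeds, so the Decimal is included.
print(row_mean_with_fallback([Decimal("1.5"), 5]))  # 3.25
```

The two printed values match the first row of each DataFrame in the report, which suggests why the int and float cases diverge.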
