pd.testing.assert_frame_equal doesn't do precision according to the doc

Code Sample, a copy-pastable example if possible

import pandas as pd
import pandas.testing
df1 = pd.DataFrame([0.00016,                -0.154526,            -0.20580199999999998])
df2 = pd.DataFrame([0.00015981824253685772, -0.15452557802200317, -0.20580188930034637])
pd.testing.assert_frame_equal(df1, df2, check_exact=False, check_less_precise=3)

Problem description

This raises an AssertionError even though all values are identical in the first 3 digits after the decimal point.

AssertionError: DataFrame.iloc[:, 0] are different

DataFrame.iloc[:, 0] values are different (33.33333 %)
[left]:  [0.00016, -0.154526, -0.20580199999999998]
[right]: [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]

No error is raised if check_less_precise=2 is used instead, so something is not right here. Is there some kind of rounding issue?
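One hypothesis that would reproduce this exact pattern (one value out of three failing at 3 digits, none failing at 2) is that the tolerance is applied to the ratio of the two values and not to their difference. This is an assumption, not something confirmed in this thread. The sketch below is plain Python and does not call pandas; the helper names `absolute_ok` and `relative_ok` are made up for illustration.

```python
left = [0.00016, -0.154526, -0.20580199999999998]
right = [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]


def absolute_ok(a, b, decimal):
    """Absolute check: |a - b| must be below half a unit of the last kept digit."""
    return abs(a - b) < 0.5 * 10.0 ** -decimal


def relative_ok(a, b, decimal):
    """Relative check (hypothetical): the ratio b / a must be that close to 1."""
    return abs(1 - b / a) < 0.5 * 10.0 ** -decimal


for decimal in (2, 3):
    abs_results = [absolute_ok(a, b, decimal) for a, b in zip(left, right)]
    rel_results = [relative_ok(a, b, decimal) for a, b in zip(left, right)]
    print(f"decimal={decimal} absolute={abs_results} relative={rel_results}")
```

With these inputs the absolute check accepts all three pairs at 3 digits, while the relative check rejects only the first pair at 3 digits and accepts everything at 2, because a small value such as 0.00016 has a comparatively large relative error.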

Doc:

check_less_precise : bool or int, default False

Specify comparison precision. Only used when check_exact is False. 5 digits (False) or 3 digits (True) after decimal points are compared. If int, then specify the digits to compare

I understand the doc says check_less_precise defines how many digits after the decimal point are compared.

Unrelated: The doc should probably say “decimal point” (singular), as there is only one. Also, “specify the digits to compare” is vague; perhaps “If int, then specify how many digits after the decimal point to compare”?

Here is a proposed updated doc entry:

Specify comparison precision. Only used when check_exact is False. int: How many digits after the decimal point to compare, False: 5 digits, True: 3 digits.

Expected Output

No assertion error for check_less_precise up to 4 in this example; the numbers start to diverge at the 5th digit after the decimal point.

It is also still unclear whether rounding is performed or not.
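Whether rounding is performed matters for exactly these numbers. The sketch below uses only the standard-library `decimal` module to compare the two columns digit by digit, once with truncation and once with rounding. It illustrates the two readings of the doc and makes no claim about what pandas does internally; `to_digits` and `all_match` are hypothetical helper names.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

left = [0.00016, -0.154526, -0.20580199999999998]
right = [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]


def to_digits(value, digits, rounding):
    """Keep `digits` places after the decimal point using the given rounding mode."""
    step = Decimal(1).scaleb(-digits)  # digits=3 gives a step of 0.001
    return Decimal(repr(value)).quantize(step, rounding=rounding)


def all_match(digits, rounding):
    """True if every pair agrees once both sides are cut to `digits` places."""
    return all(
        to_digits(a, digits, rounding) == to_digits(b, digits, rounding)
        for a, b in zip(left, right)
    )


for digits in range(2, 8):
    truncated = all_match(digits, ROUND_DOWN)
    rounded = all_match(digits, ROUND_HALF_EVEN)
    print(f"digits={digits} truncated={truncated} rounded={rounded}")
```

Under truncation the values agree through 4 digits and differ at 5, which matches the expectation stated above. Under rounding they still agree at 5 digits, so the two interpretations give different answers for this example.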

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-43-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_CA.UTF-8
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.24.0
pytest: 4.0.2
pip: 19.0.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: 4.7.1
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 4
  • Comments: 13 (5 by maintainers)

Top GitHub Comments

wudstrand commented, Mar 12, 2020 (4 reactions)

Any updates?

loikein commented, Mar 7, 2020 (4 reactions)

Any updates? It’s been several updates, but the problem seems to persist.
