Int64 numbers from Pandas DataFrame.to_markdown() incorrectly displayed

Summary

When a Pandas DataFrame contains a large 64-bit integer and the .to_markdown() method is called on the DataFrame, the printed integer is incorrect. The value fits in 64 bits, so this is not an integer overflow; the digits are lost when the integer is formatted as a float.

This behavior is passed along by the tabulate package, which Pandas uses for .to_markdown(), but it comes down to how Python formats an integer with a float format spec. I bring this up here because the Pandas .head() method does print the correct number. Should Pandas handle this case so that users see a consistent view of DataFrame data regardless of method?

If this fix is outside the scope of Pandas, perhaps the Pandas documentation should be updated with a warning.
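To make the question concrete, here is a minimal standard-library sketch of type-aware cell formatting: the float format is applied only to floats, and integers are rendered exactly. The helpers `format_cell` and `to_markdown_rows` are hypothetical names invented for this illustration; this is not how Pandas or tabulate is implemented, and it only recognizes plain Python `int` values (NumPy integer scalars would need their own check).

```python
def format_cell(value, floatfmt=".0f"):
    """Hypothetical helper: apply floatfmt to floats only, keep ints exact."""
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, int):
        # Integer presentation type: no conversion to float, no precision loss.
        return format(value, "d")
    if isinstance(value, float):
        return format(value, floatfmt)
    return str(value)


def to_markdown_rows(header, rows, floatfmt=".0f"):
    """Hypothetical helper: render rows as a simple right-aligned pipe table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---:" for _ in header) + "|",
    ]
    for row in rows:
        cells = (format_cell(value, floatfmt) for value in row)
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


table = to_markdown_rows(["colA", "colB"], [[503498111827123021, 2.5]], floatfmt=".1f")
print(table)
```

The design choice is simply to pick the format by cell type instead of applying one float format to every numeric cell.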

Reproduction

Test 64-bit int with Pandas head()

import pandas as pd
df = pd.DataFrame({'colA': [503498111827123021]})
df.head()
                 colA
0  503498111827123021

Test 64-bit int with Pandas to_markdown()

import pandas as pd
df = pd.DataFrame({'colA': [503498111827123021]})
print(df.to_markdown(floatfmt='.0f'))
|    |               colA |
|---:|-------------------:|
|  0 | 503498111827123008 |

Test with Python format()

>>> format(503498111827123021, '.0f')
'503498111827123008'
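The result above can be explained with the standard library alone. The sketch below assumes Python 3.9 or later (for `math.ulp`) and uses illustrative variable names; it shows that the value is too large to be represented exactly as a 64-bit float, and that an integer format spec keeps every digit.

```python
import math

n = 503498111827123021  # value from the report; fits in a signed 64-bit integer

# A 64-bit float has a 53-bit significand, so not every integer above 2**53
# can be represented exactly.
assert 2**53 < n < 2**63

as_float = float(n)        # the conversion a float format spec performs implicitly
gap = math.ulp(as_float)   # spacing between adjacent floats at this magnitude

float_text = format(n, ".0f")  # goes through float: rounded to the nearest representable value
int_text = format(n, "d")      # stays an integer: exact

print(gap)         # 64.0
print(float_text)  # 503498111827123008
print(int_text)    # 503498111827123021
```

At this magnitude neighboring floats are 64 apart, so the last digits of the integer cannot survive a float format such as '.0f'.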

Pandas Version

Python 3.9.6 (default, Aug  5 2022, 15:21:02)
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : 91111fd99898d9dcaa6bf6bedb662db4108da6e6
python           : 3.9.6.final.0
python-bits      : 64
OS               : Darwin
OS-release       : 21.6.0
Version          : Darwin Kernel Version 21.6.0: Thu Sep 29 20:12:57 PDT 2022; root:xnu-8020.240.7~1/RELEASE_X86_64
machine          : x86_64
processor        : i386
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.5.1
numpy            : 1.23.4
pytz             : 2022.6
dateutil         : 2.8.2
setuptools       : 58.0.4
pip              : 21.2.4
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
matplotlib       : None
numba            : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : None
snappy           : None
sqlalchemy       : None
tables           : None
tabulate         : 0.9.0
xarray           : None
xlrd             : None
xlwt             : None
zstandard        : None
tzdata           : None

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
jbencina commented, Nov 7, 2022

Good point. I’ll find out when the next version is expected to be released and circle back here with a PR once it is available.

1 reaction
jbencina commented, Nov 5, 2022

Confirmed this is fixed in the upcoming release of tabulate.
