BUG: Integer column index breaks json roundtrip with orient=table

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

col1 = [1.0, 2.0, 3.5, 6.75]
col2 = [2.1, 3.1, 4.1, 5.1]
# Integer column labels (1 and 2) trigger the bug
df = pd.DataFrame({1: col1, 2: col2}, index=[110, 112, 113, 121])
df.index.name = 'ID'
s = df.to_json(orient='table')
new = pd.read_json(s, orient='table')

Issue Description

After the round trip, the DataFrame new keeps the index and column labels but every value is NaN:

      1   2
ID         
110 NaN NaN
112 NaN NaN
113 NaN NaN
121 NaN NaN
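
A likely contributing factor, offered here as an assumption rather than a diagnosis confirmed in the issue: JSON object keys are always strings, so integer labels cannot survive as integers inside per-row records. The sketch below uses only the standard-library json module to show that general behaviour; it does not call pandas, and the names row, int_labels and lookup are purely illustrative.

```python
import json

# A single record keyed by integer labels, like one row of the frame above.
row = {1: 1.0, 2: 2.1}

# Encoding turns the integer keys into strings, because JSON only allows
# string keys in objects.
encoded = json.dumps(row)
decoded = json.loads(encoded)

# Looking the decoded record up by the original integer labels finds nothing,
# which is the same shape of mismatch that would leave a column empty.
int_labels = [1, 2]
lookup = [decoded.get(label) for label in int_labels]

print(encoded)
print(lookup)
```

If the reader matches the string keys in the records against integer column labels, a lookup miss like this one would be consistent with the all-NaN result.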

Expected Behavior

The expected dataframe would look like this:

        1    2
ID            
110  1.00  2.1
112  2.00  3.1
113  3.50  4.1
121  6.75  5.1

Using strings instead of integers as the column labels gives the expected result:

import pandas as pd

col1 = [1.0, 2.0, 3.5, 6.75]
col2 = [2.1, 3.1, 4.1, 5.1]
# String column labels ('1' and '2') round-trip correctly
df = pd.DataFrame({'1': col1, '2': col2}, index=[110, 112, 113, 121])
df.index.name = 'ID'
s = df.to_json(orient='table')
new = pd.read_json(s, orient='table')
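
Since string labels round-trip correctly, one possible workaround is to convert the labels to strings before writing and restore their original dtype after reading. This is a minimal sketch, not a fix from the pandas maintainers. The helper name roundtrip_table_json is hypothetical, and wrapping the JSON text in io.StringIO is an assumption made so the call also works on pandas versions that warn about passing literal JSON strings.

```python
import io

import pandas as pd


def roundtrip_table_json(df):
    """Hypothetical helper: round-trip df through orient='table' JSON."""
    # Write with string column labels, which the issue reports as working.
    as_text = df.copy()
    as_text.columns = as_text.columns.astype(str)
    payload = as_text.to_json(orient="table")

    # Read back from a file-like object, then restore the original label dtype.
    restored = pd.read_json(io.StringIO(payload), orient="table")
    restored.columns = restored.columns.astype(df.columns.dtype)
    return restored


col1 = [1.0, 2.0, 3.5, 6.75]
col2 = [2.1, 3.1, 4.1, 5.1]
df = pd.DataFrame({1: col1, 2: col2}, index=[110, 112, 113, 121])
df.index.name = "ID"

new = roundtrip_table_json(df)
print(new)
```

The sketch assumes every column label has the same dtype. A frame with mixed integer and string labels would need a per-label mapping instead of a single astype call.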

Installed Versions

This crashed in my environment with the error "assert '_distutils' in core.__file__, core.__file__", raised from lib/python3.9/site-packages/_distutils_hack/__init__.py, line 59, in ensure_local_distutils.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
jmg-duarte commented, Dec 9, 2022

@coatless I discussed a potential fix in https://github.com/pandas-dev/pandas/issues/46392#issuecomment-1242696492 but got no response as you can see 😕

0 reactions
coatless commented, Dec 5, 2022

@jmg-duarte did you end up solving the issue?

If not, @mroeschke could you suggest a way for @jmg-duarte to diff between summary runs? It’s not ideal that the JSON being generated is invalid.

Top Results From Across the Web

BUG: Index type casting in read_json with orient='table' and ...
Problem description. Round trip should recover the original DataFrame. But the result index has been cast from float to integer.

Pandas read_json(orient="table") returns NaN if the column is ...
To work around the issue we can loop over the fields dataframe_table_schema.schema.fields and check if the field name is an integer if it...

IO tools (text, CSV, HDF5, …) - pandas 1.5.2 documentation
Any orient option that encodes to a JSON object will not preserve the ordering of index and column labels during round-trip serialization. If...

apache_beam.dataframe.io module - Apache Beam
If list-like, all elements must either be positional (i.e. integer indices into the document columns) or strings that correspond to column names provided ......

IO tools (text, CSV, HDF5, …) - Pandas 中文
Indicate number of NA values placed in non-numeric columns. ... In [290]: df.index.name = 'index' In [291]: df.to_json('test.json', orient='table') In ...
