to_json should make separators configurable (similar to json.dump)
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd
print(pd.DataFrame([1]).to_json(indent=4))
outputs
{
    "0":{
        "0":1
    }
}
Problem description
Any JSON prettifier (including bare VS Code) wants to replace every ":" with ": " (a colon followed by a space). So the output of to_json should behave as json.dump does, at least when indent is not None (in which case the user clearly does not care about file size):
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
https://docs.python.org/3/library/json.html#:~:text=To get the most compact JSON representation
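For comparison, here is a small sketch of how the standard library's json.dumps reacts to the separators argument described in the quote. It uses only the json module; the dictionary is hand-written to mirror the one-cell DataFrame from the code sample.

```python
import json

# Same shape as pd.DataFrame([1]) serialized column-wise.
data = {"0": {"0": 1}}

# With indent set, the default separators are (',', ': ').
pretty = json.dumps(data, indent=4)

# Forcing (',', ':') reproduces the spacing that to_json(indent=4) printed.
tight = json.dumps(data, indent=4, separators=(",", ":"))

# No indent and (',', ':') gives the most compact representation.
compact = json.dumps(data, separators=(",", ":"))

print(pretty)
print(tight)
print(compact)
```

The first print matches the Expected Output in this issue; the second matches what to_json currently emits.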
Expected Output
{
    "0": {
        "0": 1
    }
}
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None python : 3.8.2.final.0 python-bits : 64 OS : Linux OS-release : 4.12.14-lp151.28.32-default machine : x86_64 processor : x86_64 byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8
pandas : 1.0.3 numpy : 1.18.2 pytz : 2019.3 dateutil : 2.8.1 pip : 20.0.2 setuptools : 46.1.1 Cython : None pytest : None hypothesis : None sphinx : None blosc : None feather : None xlsxwriter : 1.2.8 lxml.etree : 4.5.0 html5lib : None pymysql : None psycopg2 : None jinja2 : None IPython : None pandas_datareader: None bs4 : 4.8.0 bottleneck : None fastparquet : None gcsfs : None lxml.etree : 4.5.0 matplotlib : 3.2.1 numexpr : 2.7.1 odfpy : None openpyxl : 3.0.3 pandas_gbq : None pyarrow : None pytables : None pytest : None pyxlsb : None s3fs : None scipy : 1.4.1 sqlalchemy : None tables : None tabulate : None xarray : None xlrd : 1.2.0 xlwt : None xlsxwriter : 1.2.8 numba : None
Issue Analytics
- Created 3 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
@ss-is-master-chief if you or anyone else wants to work on this, I would suggest looking at the ujson source for the JSON_NO_EXTRA_WHITESPACE checks:
https://github.com/pandas-dev/pandas/blob/25d893bb9a5d79afc6a88e84de89fa32b385ae09/pandas/_libs/src/ujson/lib/ultrajsonenc.c#L962
Right now this is a compile-time constant defined in ultrajson.h, but I think we can make it a runtime check and just replace those with if (enc->indent) > 0 checks.
This would only add spaces when indent is provided, which is a little different from the request, but I think it still solves the same problem.
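To make the suggestion concrete, here is a rough Python model of the idea: choose the key separator at run time from indent rather than from a compile-time flag. This is not the ujson C code. The names encode and _wrap are hypothetical, and the toy encoder handles only dicts, lists and scalars that json.dumps accepts.

```python
import json


def _wrap(open_, close, items, indent, level):
    """Join already-encoded items, adding newlines and padding if indent > 0."""
    if indent > 0:
        pad = "\n" + " " * indent * (level + 1)
        end = "\n" + " " * indent * level
        return open_ + pad + ("," + pad).join(items) + end + close
    return open_ + ",".join(items) + close


def encode(value, indent=0, _level=0):
    """Toy JSON encoder (hypothetical, for illustration only)."""
    # Runtime check standing in for the compile-time JSON_NO_EXTRA_WHITESPACE:
    # emit a space after ':' only when an indent was requested.
    key_sep = ": " if indent > 0 else ":"
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            json.dumps(str(k)) + key_sep + encode(v, indent, _level + 1)
            for k, v in value.items()
        ]
        return _wrap("{", "}", items, indent, _level)
    if isinstance(value, list):
        if not value:
            return "[]"
        items = [encode(v, indent, _level + 1) for v in value]
        return _wrap("[", "]", items, indent, _level)
    # Scalars (numbers, strings, booleans, None) are delegated to the stdlib.
    return json.dumps(value)


print(encode({"0": {"0": 1}}))
print(encode({"0": {"0": 1}}, indent=4))
```

With indent left at 0 the output stays compact, so callers who care about file size would see no change under this approach.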
I’d be happy to write a patch that passes through separators to json.dumps, the same as is done for indent.
Personally, it feels simpler to just pass any extra keyword arguments to json.dumps, but I have no problem with the explicit approach.
Edit: Oh, dang, ujson doesn't support separators. I guess it'd have to be done as described in https://github.com/pandas-dev/pandas/issues/33014#issuecomment-607494037
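Since ujson has no separators option, one user-side workaround is to re-serialize the string that to_json returns with the standard library. The sketch below assumes that approach; reformat_json is a hypothetical helper, not a pandas API, and the literal string stands in for pd.DataFrame([1]).to_json() so the snippet runs without pandas.

```python
import json


def reformat_json(text, indent=4, separators=None):
    """Re-serialize a JSON string with json.dumps (hypothetical helper)."""
    # separators=None lets json.dumps pick its defaults:
    # (',', ': ') when indent is set, (', ', ': ') otherwise.
    return json.dumps(json.loads(text), indent=indent, separators=separators)


# Stand-in for pd.DataFrame([1]).to_json(), which returns this compact string.
compact = '{"0":{"0":1}}'

pretty = reformat_json(compact)
print(pretty)
```

The extra parse and dump pass costs time on large frames, which is presumably why a fix inside the encoder was discussed instead.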