No way with to_json to write only date out of datetime
Code Sample, a copy-pastable example if possible
In [2]: pd.Series(pd.to_datetime(['2017-03-15'])).to_csv()
Out[2]: '0,2017-03-15\n'
In [3]: pd.Series(pd.to_datetime(['2017-03-15'])).to_json(date_format='iso')
Out[3]: '{"0":"2017-03-15T00:00:00.000Z"}'
Problem description
From this SO comment
By default, to_csv() drops times of day if they are all midnight. By default, to_json instead writes dates as epoch timestamps, but if passed date_format='iso' it will always write times.
Not sure whether it would be optimal to have date_format='iso' behave as to_csv(), or rather add a new (for instance) date_format='iso_date', or maybe even a free-form format like the one accepted by pd.Series.dt.strftime(), e.g. date_format="%Y-%m-%d".
Expected Output
Out[3]: '{"0":"2017-03-15"}'
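Until to_json supports something like this, one possible workaround (a minimal sketch, not a pandas feature) is to format the datetimes as strings with Series.dt.strftime() before serializing. The variable names below are only illustrative, and the sketch assumes the values carry no meaningful time of day, since the format string simply discards it.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(['2017-03-15']))

# Turn each datetime into a plain 'YYYY-MM-DD' string first.
# to_json then writes the strings as-is instead of full timestamps.
date_only = s.dt.strftime('%Y-%m-%d')

result = date_only.to_json()
print(result)  # {"0":"2017-03-15"}
```

The trade-off is that the values are plain strings in the JSON, so a reader has to parse them back into dates itself.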
Output of pd.show_versions()
pandas: 0.20.1 pytest: 3.0.6 pip: 9.0.1 setuptools: None Cython: 0.25.2 numpy: 1.12.1 scipy: 0.19.0 xarray: 0.9.2 IPython: 5.1.0.dev sphinx: 1.5.6 patsy: 0.4.1 dateutil: 2.6.0 pytz: 2017.2 blosc: None bottleneck: 1.2.1 tables: 3.3.0 numexpr: 2.6.1 feather: 0.3.1 matplotlib: 2.0.2 openpyxl: 2.3.0 xlrd: 1.0.0 xlwt: 1.1.2 xlsxwriter: 0.9.6 lxml: 3.7.1 bs4: 4.5.3 html5lib: 0.999999999 sqlalchemy: 1.0.15 pymysql: None psycopg2: None jinja2: 2.9.6 s3fs: None pandas_gbq: None pandas_datareader: 0.2.1
Issue Analytics
- Created 6 years ago
- Reactions: 6
- Comments: 10 (7 by maintainers)
Top GitHub Comments
-1 on this. This does not conform to JSON and this creates non-standard formatting for dates.
#12997 is completely orthogonal to this and is an actual bug.
Any movement on this? It’s a real problem when generating JSONL files for Redshift. Redshift rejects the file because it has a time portion for a field that’s a DATE. (I don’t own the table so switching to TIMESTAMP isn’t a viable solution.)
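For the JSONL case, a rough sketch of the same string-formatting workaround applied to a DataFrame might look like this. The column names `id` and `day` are made up for illustration, and whether the resulting file loads cleanly into a given Redshift table is an assumption not verified here.

```python
import json

import pandas as pd

# Hypothetical frame; "day" stands in for the column that maps to a DATE field.
df = pd.DataFrame({
    "id": [1, 2],
    "day": pd.to_datetime(["2017-03-15", "2017-03-16"]),
})

# Replace the datetime column with 'YYYY-MM-DD' strings, leaving df itself untouched.
out = df.assign(day=df["day"].dt.strftime("%Y-%m-%d"))

# orient="records" with lines=True writes one JSON object per line (JSONL).
jsonl = out.to_json(orient="records", lines=True)

# Parse the lines back to confirm that no time portion was written.
rows = [json.loads(line) for line in jsonl.splitlines() if line]
print(rows)
```

Passing a path as the first argument to to_json writes the same content straight to a file.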