Inconsistency between read_csv(... parse_dates) and read_csv followed by astype('datetime')
Write time-zones out in csv file
Datetimes are written to csv in readable UTC. pd.read_csv('file.csv', parse_dates=['date']) expects UTC, whereas astype('datetime64') expects readable dates with no timezone specified to be local (which also makes sense). But this leads to the following inconsistency:
import pandas as pd
from StringIO import StringIO
df = pd.DataFrame({'date' : [0]})
df['date'] = df.date.astype('datetime64[s]')
print df.date[0]
s = StringIO()
df.to_csv(s)
s = s.getvalue()
print 'csv:'
print s
with_parse_dates = pd.read_csv(StringIO(s), parse_dates=['date'])
print 'parse_date:', with_parse_dates.date[0]
with_astype = pd.read_csv(StringIO(s))
with_astype.date = with_astype.date.astype('datetime64')
print 'astype :', with_astype.date[0]
with_astypeZ = pd.read_csv(StringIO(s))
with_astypeZ.date[0] += 'Z'
with_astypeZ.date = with_astypeZ.date.astype('datetime64')
print 'astype w/Z:', with_astypeZ.date[0]
output (for me):
1970-01-01 00:00:00
csv:
,date
0,1970-01-01 00:00:00
parse_date: 1970-01-01 00:00:00
astype : 1970-01-01 08:00:00
astype w/Z: 1970-01-01 00:00:00
This would be easily fixed by indicating UTC in the csv output, possibly with just the 'Z' suffix.
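For comparison, here is a minimal sketch of a round trip where the offset does appear in the csv. It assumes Python 3 and a recent pandas, and it assumes the column is made timezone-aware before writing (via utc=True), which the original report does not do. The variable names csv_text and roundtrip are only for illustration.

```python
from io import StringIO

import pandas as pd

# Epoch second 0 as a timezone-aware (UTC) datetime column.
# Assumption: the column is tz-aware, unlike the naive column in the report.
df = pd.DataFrame({'date': pd.to_datetime([0], unit='s', utc=True)})

# A tz-aware column is written with its UTC offset.
csv_text = df.to_csv(index=False)
print(csv_text)

# With the offset present in the text, the parser can recover the timezone.
roundtrip = pd.read_csv(StringIO(csv_text), parse_dates=['date'])
print(roundtrip['date'][0])
```

With an explicit offset in the file, the reader no longer has to guess between UTC and local time.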
Issue Analytics
- State:
- Created 10 years ago
- Comments:6 (4 by maintainers)
Isn’t that why you add +00:00 to the end of a datestamp? Pandas doesn’t recognize this, though. Putting these formats in CSV file:
are all parsed as dtype='datetime64[ns]', even though these represent UTC and so I would expect dtype='datetime64[ns, UTC]'.
And it also gives you UTC:
but only if you specify to return timezone-aware datetimes.
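As a rough sketch of what asking for timezone-aware datetimes can look like, assuming a recent pandas and its pd.to_datetime(..., utc=True) option: the sample strings below are my own illustration, not the formats from the comment above, which did not survive in this copy.

```python
import pandas as pd

# Illustrative spellings of the same instant (assumed, not from the comment):
# naive, explicit +00:00 offset, and 'Z' suffix.
stamps = [
    '1970-01-01 00:00:00',
    '1970-01-01 00:00:00+00:00',
    '1970-01-01T00:00:00Z',
]

# utc=True requests timezone-aware results; naive input is treated as UTC.
parsed = {s: pd.to_datetime(s, utc=True) for s in stamps}
for text, ts in parsed.items():
    print(text, '->', ts, ts.tz)
```

Each string is parsed on its own here so that the differing formats do not interfere with one another.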