Timestamp.strftime(): missing support for nanoseconds
When parsing text into a Timestamp object we can specify a format string. Currently %f is documented with this note:
note that “%f” will parse all the way up to nanoseconds
See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html, in particular the description of the format parameter. The note about %f was added in this patch: https://github.com/pandas-dev/pandas/pull/8904
The fact that we can parse text using nanosecond precision is great, and here I make use of that behavior (showing two methods yielding the same result):
# Implicit format:
>>> t = pd.to_datetime('2019-10-03T09:30:12.133333337')
>>> t
Timestamp('2019-10-03 09:30:12.133333337')
# Explicit format using %f:
>>> t = pd.to_datetime('2019-10-03T09:30:12.133333337', format='%Y-%m-%dT%H:%M:%S.%f')
>>> t
Timestamp('2019-10-03 09:30:12.133333337')
But when I now want to invert that process using strftime(), the fractional part is truncated to microsecond precision:
>>> t.strftime('%Y-%m-%dT%H:%M:%S.%f')
'2019-10-03T09:30:12.133333'
On the one hand, this is inconsistent with the meaning of %f while parsing. On the other hand, it corresponds to what’s documented in Python’s stdlib documentation (which says that %f means “Microsecond as a decimal number, zero-padded on the left.”).
In any case, I think it would make sense to have a format string specifier that allows us to turn the timestamp into a string with nanosecond precision.
If I am not mistaken, we otherwise have to work around the absence of that format specifier by using the nanosecond property:
>>> t.nanosecond
337
>>> t.strftime('%Y-%m-%dT%H:%M:%S.%f') + str(t.nanosecond)
'2019-10-03T09:30:12.133333337'
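One caveat with this workaround: str(t.nanosecond) is not zero-padded, so it only gives the right digits when the nanosecond component is 100 or larger (as it is for 337). A minimal sketch of a padded variant follows. The helper name strftime_ns is hypothetical and not part of pandas, and it assumes the format string ends in %f so that the three nanosecond digits can simply be appended.

```python
import pandas as pd


def strftime_ns(ts: pd.Timestamp, fmt: str = '%Y-%m-%dT%H:%M:%S.%f') -> str:
    """Format ts with nanosecond precision (hypothetical helper, not a pandas API).

    Assumes fmt ends in %f: strftime() renders the 6 microsecond digits,
    and the remaining 3 nanosecond digits are appended, zero-padded.
    """
    if not fmt.endswith('%f'):
        raise ValueError('this sketch only supports formats ending in %f')
    return ts.strftime(fmt) + f'{ts.nanosecond:03d}'


t = pd.to_datetime('2019-10-03T09:30:12.133333337')
print(strftime_ns(t))  # 2019-10-03T09:30:12.133333337

# A nanosecond component below 100 needs the padding to stay correct:
t2 = pd.to_datetime('2019-10-03T09:30:12.000000007')
print(strftime_ns(t2))  # 2019-10-03T09:30:12.000000007
```

Without the :03d padding, the second example would come out as '2019-10-03T09:30:12.0000007', which reads as a different fraction of a second.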
Do you agree that we should have a format specifier for that? Or do we have one, and it’s just not documented?
INSTALLED VERSIONS
commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.7-200.fc30.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.2
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.0
pip : 19.0.3
setuptools : 40.8.0
Cython : 0.29.13
pytest : 5.2.1
hypothesis : 4.41.3
sphinx : 2.2.0
blosc : 1.8.1
feather : None
xlsxwriter : 1.2.2
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.8.0
pandas_datareader : None
bs4 : 4.8.1
bottleneck : 1.2.1
fastparquet : 0.3.2
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.0
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : 0.3.5
scipy : 1.3.1
sqlalchemy : 1.3.10
tables : 3.6.0
xarray : 0.14.0
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.2
@jgehrcke Note that for Periods there is a different convention in pandas: %u means microseconds and %n means nanoseconds. This is mostly because for such objects, %f means “fiscal year”.