AmbiguousTimeError on Timestamp.floor during DST change
Code Sample, a copy-pastable example if possible
import pandas as pd
pd.Timestamp('2018-11-04 01:55:17.869342-0500', tz='America/New_York').floor('T')
Traceback (most recent call last):
File "/venv/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 3265, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-42bf9ce66d1d>", line 1, in <module>
pd.Timestamp('2018-11-04 01:55:17.869342-0500', tz='America/New_York').floor('T')
File "pandas/_libs/tslibs/timestamps.pyx", line 696, in pandas._libs.tslibs.timestamps.Timestamp.floor
File "pandas/_libs/tslibs/timestamps.pyx", line 667, in pandas._libs.tslibs.timestamps.Timestamp._round
File "pandas/_libs/tslibs/timestamps.pyx", line 903, in pandas._libs.tslibs.timestamps.Timestamp.tz_localize
File "pandas/_libs/tslibs/conversion.pyx", line 963, in pandas._libs.tslibs.conversion.tz_localize_to_utc
pytz.exceptions.AmbiguousTimeError: Cannot infer dst time from '2018-11-04 01:55:00', try using the 'ambiguous' argument
Problem description
AmbiguousTimeError is raised when flooring a timestamp whose result falls within a DST change.
Similar issues happen with ceil and round:
pd.Timestamp('2018-11-04 01:55:17.869342-0500', tz='America/New_York').ceil('T')
pd.Timestamp('2018-11-04 01:55:17.869342-0500', tz='America/New_York').round('T')
Expected Output
Timestamp('2018-11-04 01:55:00-0500', tz='America/New_York')
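One possible way to get the expected output without hitting the error is to floor in UTC and convert back, so the local wall-clock time never has to be re-localized. This is only a sketch, not the fix adopted by pandas: the helper name floor_via_utc is hypothetical, and it assumes the zone's UTC offset is a whole multiple of the rounding frequency (true for minute-level rounding in America/New_York).

```python
import pandas as pd


def floor_via_utc(ts, freq):
    """Floor a tz-aware Timestamp in UTC, then convert back to its own zone.

    Hypothetical helper: only equivalent to a local floor when the zone's
    UTC offset is a whole multiple of `freq`.
    """
    return ts.tz_convert("UTC").floor(freq).tz_convert(ts.tz)


# Same instant as in the report: 01:55:17 at UTC-05:00, i.e. after the
# clocks were set back in New York on 2018-11-04.
ts = pd.Timestamp("2018-11-04 01:55:17.869342-0500").tz_convert("America/New_York")

floored = floor_via_utc(ts, "min")
print(floored)
```

Because the rounding happens on the UTC timeline, the -0500 offset is carried through instead of being inferred again from an ambiguous wall-clock time.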
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.15.2
scipy: None
pyarrow: None
xarray: None
IPython: 7.0.1
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments: 8 (2 by maintainers)
Top GitHub Comments
I think that the problems here are deeper, and cannot be solved by tacking the ambiguous keyword argument onto floor and others.
The thing is: 2018-11-04 01:31:33-0700 and 2018-11-04 01:31:33-0600 are not ambiguous times, so it makes no sense to require the programmer to use the ambiguous keyword to specify whether they are in standard or daylight savings time. If given a UTC offset and a location like America/Edmonton or America/New_York it is possible to infer whether that time is in standard or daylight savings time. If you look at the traceback from above: it seems that pandas strips the time offset while processing the time, which is very strange since the offset is crucial information.
Can someone please explain why this has been closed? It still seems to be an issue (at least on Windows). In my view, adamkpickering’s comment at https://github.com/pandas-dev/pandas/issues/23521#issuecomment-441087553 perfectly describes the issue and is still valid.