Allow choosing the utc timezone class in pd.to_datetime
Code Sample, a copy-pastable example if possible
from datetime import datetime, timezone
import pandas as pd
dt1 = pd.to_datetime(datetime(2020, 3, 11), utc=True)
print(repr(dt1))
print(type(dt1.tz))
dt2 = pd.to_datetime(datetime(2020, 3, 11, tzinfo=timezone.utc))
print(repr(dt2))
print(type(dt2.tz))
# dragons here
print(dt1 - dt2)
outputs
Timestamp('2020-03-11 00:00:00+0000', tz='UTC')
<class 'pytz.UTC'>
Timestamp('2020-03-11 00:00:00+0000', tz='UTC')
<class 'datetime.timezone'>
TypeError: Timestamp subtraction must have the same timezones or no timezones
Problem description
There is no way to specify which “UTC” class the resulting Timestamp should use. I suggest extending the interface of pd.to_datetime() to accept something like utc_cls=pytz.UTC.
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-40-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.1
numpy : 1.17.4
pytz : 2019.2
dateutil : 2.7.3
pip : 19.3.1
setuptools : 42.0.1
Cython : 0.29.14
pytest : 5.3.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.10.0
pandas_datareader: None
bs4 : 4.8.1
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pytest : 5.3.1
pyxlsb : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.12
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 4
- Comments: 8 (6 by maintainers)
Top GitHub Comments
The bug here is in tz_compare not considering the two different UTCs as equivalent. I agree with @mroeschke that we shouldn't change the to_datetime API.
I would be +1 on changing our defaults from pytz to the stdlib tzinfo (and zoneinfo going forward).
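To illustrate the idea of treating different UTC objects as equivalent, here is a minimal stdlib-only sketch. It is not pandas' actual tz_compare implementation: the helpers is_utc and tz_equivalent and the stand-in class OtherUTC (playing the role of a second UTC implementation such as pytz.UTC) are hypothetical names introduced only for this example. The zero-offset check is also a heuristic, since any fixed +00:00 zone would count as UTC.

```python
from datetime import timedelta, timezone, tzinfo
from typing import Optional


class OtherUTC(tzinfo):
    """Hypothetical stand-in for a second UTC implementation (e.g. pytz.UTC)."""

    def utcoffset(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return timedelta(0)


def is_utc(tz: Optional[tzinfo]) -> bool:
    """Heuristic: treat a tzinfo as UTC if its fixed offset is zero.

    Zones with varying offsets typically return None for utcoffset(None),
    so they are not classified as UTC here.
    """
    if tz is None:
        return False
    return tz.utcoffset(None) == timedelta(0)


def tz_equivalent(a: Optional[tzinfo], b: Optional[tzinfo]) -> bool:
    """Compare two tzinfo objects, treating all UTC flavours as equal."""
    if is_utc(a) and is_utc(b):
        return True
    # Fall back to ordinary equality (also covers the naive None/None case).
    return a == b


stdlib_utc = timezone.utc
other_utc = OtherUTC()

print(stdlib_utc == other_utc)               # plain equality on the two objects
print(tz_equivalent(stdlib_utc, other_utc))  # UTC-aware comparison
```

With a comparison along these lines, the subtraction in the code sample would not need both operands to carry the same UTC class.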
I’d be -0.5 on allowing utc to accept non-booleans, as IMO it’s idiomatic to use tz_convert after to_datetime.
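As a rough sketch of that workaround, the two timestamps from the code sample can both be passed through tz_convert so that they end up with whatever UTC object the installed pandas version prefers. Which class that is depends on the pandas version, so the snippet only checks that the two match; the variable names are made up for this example.

```python
from datetime import datetime, timezone

import pandas as pd

# The same instant built two ways, as in the code sample in the issue.
dt1 = pd.to_datetime(datetime(2020, 3, 11), utc=True)
dt2 = pd.to_datetime(datetime(2020, 3, 11, tzinfo=timezone.utc))

# Passing the string "UTC" lets pandas choose its own UTC object for both,
# so the two results carry the same timezone class.
dt1_norm = dt1.tz_convert("UTC")
dt2_norm = dt2.tz_convert("UTC")

delta = dt1_norm - dt2_norm
print(type(dt1_norm.tz) is type(dt2_norm.tz))
print(delta)
```

A specific tzinfo object can be passed to tz_convert instead of the string when a particular UTC class is required.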