Allow choosing the utc timezone class in pd.to_datetime

See original GitHub issue

Code Sample, a copy-pastable example if possible

from datetime import datetime, timezone
import pandas as pd

dt1 = pd.to_datetime(datetime(2020, 3, 11), utc=True)
print(repr(dt1))
print(type(dt1.tz))

dt2 = pd.to_datetime(datetime(2020, 3, 11, tzinfo=timezone.utc))
print(repr(dt2))
print(type(dt2.tz))

# Raises TypeError: pandas treats the two UTC tzinfo objects as different timezones
print(dt1 - dt2)

outputs

Timestamp('2020-03-11 00:00:00+0000', tz='UTC')
<class 'pytz.UTC'>
Timestamp('2020-03-11 00:00:00+0000', tz='UTC')
<class 'datetime.timezone'>

TypeError: Timestamp subtraction must have the same timezones or no timezones

Problem description

There is no way to specify which UTC tzinfo class the resulting Timestamp should use. I suggest extending the interface of pd.to_datetime() with a parameter such as utc_cls=pytz.UTC.

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.5.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.3.0-40-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.1
numpy            : 1.17.4
pytz             : 2019.2
dateutil         : 2.7.3
pip              : 19.3.1
setuptools       : 42.0.1
Cython           : 0.29.14
pytest           : 5.3.1
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : None
pymysql          : None
psycopg2         : 2.8.4 (dt dec pq3 ext lo64)
jinja2           : 2.10.3
IPython          : 7.10.0
pandas_datareader: None
bs4              : 4.8.1
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.5.0
matplotlib       : 3.1.2
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 0.16.0
pytables         : None
pytest           : 5.3.1
pyxlsb           : None
s3fs             : None
scipy            : 1.2.1
sqlalchemy       : 1.3.12
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

1 reaction
jbrockmendel commented, Jun 8, 2020

The bug here is that tz_compare does not consider the two different UTCs equivalent. I agree with @mroeschke that we shouldn't change the to_datetime API.

I would be +1 on changing our defaults from pytz to the stdlib tzinfo (and zoneinfo going forward).
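
To illustrate the kind of equivalence check the comment above describes, here is a minimal standard-library sketch. It is not pandas's tz_compare. The names OtherUTC, is_utc and same_zone are hypothetical and exist only for this example, and OtherUTC is a stand-in for a third-party UTC class such as pytz.UTC.

```python
from datetime import timedelta, timezone, tzinfo


class OtherUTC(tzinfo):
    """Hypothetical stand-in for a third-party UTC class such as pytz.UTC."""

    def utcoffset(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return timedelta(0)


def is_utc(tz):
    """Illustrative check: a zero-offset zone whose name is "UTC"."""
    if tz is None:
        return False
    return tz.utcoffset(None) == timedelta(0) and tz.tzname(None) == "UTC"


def same_zone(a, b):
    """Treat any two UTC implementations as equivalent, else fall back to ==."""
    if is_utc(a) and is_utc(b):
        return True
    return a == b


std_utc = timezone.utc
other_utc = OtherUTC()

# Plain equality sees two unrelated objects, even though both mean UTC.
print(std_utc == other_utc)
# The helper compares what the zones represent rather than their classes.
print(same_zone(std_utc, other_utc))
```

A name-and-offset check like this is only one possible design, and a real implementation would likely need to handle more cases than this sketch does.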

1 reaction
mroeschke commented, Mar 11, 2020

I’d be -0.5 on allowing utc to accept non-booleans, as IMO it’s idiomatic to use tz_convert after to_datetime:

In [3]: pd.to_datetime(datetime(2020, 3, 11), utc=True).tz
Out[3]: <UTC>

In [5]: pd.to_datetime(datetime(2020, 3, 11), utc=True).tz_convert(timezone.utc).tz
Out[5]: datetime.timezone.utc
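
Applied to the original example, the tz_convert approach might look like the sketch below. It assumes both values are scalar Timestamps and that datetime.timezone.utc is the UTC object you want. The variable names are illustrative.

```python
from datetime import datetime, timezone
import pandas as pd

dt1 = pd.to_datetime(datetime(2020, 3, 11), utc=True)
dt2 = pd.to_datetime(datetime(2020, 3, 11, tzinfo=timezone.utc))

# Convert both Timestamps to the same tzinfo object before doing arithmetic.
dt1_std = dt1.tz_convert(timezone.utc)
dt2_std = dt2.tz_convert(timezone.utc)

# Both sides now share one timezone object, so subtraction is well defined.
delta = dt1_std - dt2_std
print(delta)
```

Converting both operands, rather than only the one produced with utc=True, avoids relying on which UTC class a given pandas version picks by default.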