DatetimeIndex lookups tz inconsistency


TL;DR: enforcing tz-aware vs. tz-naive compatibility in DatetimeIndex comparisons (#18162) appears to be inconsistent with the current slicing behavior.

The following example is based on tests.series.test_indexing.test_getitem_setitem_datetimeindex:

import numpy as np
import pandas as pd

# 50 hourly tz-aware timestamps (recent pandas spells the alias 'h')
dti = pd.date_range('1/1/1990', periods=50, freq='H', tz='US/Eastern')
ts = pd.Series(np.random.randn(50), index=dti)

# tz-naive string bounds
lb = '1990-01-01 04:00:00'
rb = '1990-01-01 07:00:00'

The behavior that #18162 wants to enforce requires that dti < pd.Timestamp(lb) raise, and likewise dti < lb. At the moment, both bounds are effectively treated as UTC. But if we change this so that the comparison does raise, the following line from test_getitem_setitem_datetimeindex breaks pretty irrevocably:

ts[(ts.index >= lb) & (ts.index <= rb)]

There is also ts[lb:rb], which should raise if we're being strict, but at least we could make that keep working. (BTW, this implicitly casts lb and rb to US/Eastern, albeit in different places. So far that appears to be a related but distinct can of worms.)
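For comparison, here is a minimal sketch of the fully explicit variant, where both bounds are tz-aware before they ever meet the index. It is not taken from the pandas test suite; the deterministic values and the names by_mask and by_slice are assumptions made for illustration. With aware bounds, neither the comparison nor the slice depends on how tz-naive values are interpreted.

```python
import numpy as np
import pandas as pd

# Same shape as the issue's example, with deterministic values for readability.
dti = pd.date_range("1990-01-01", periods=50, freq=pd.offsets.Hour(), tz="US/Eastern")
ts = pd.Series(np.arange(50.0), index=dti)

# Bounds carry the index's timezone explicitly, so no implicit cast is needed.
lb = pd.Timestamp("1990-01-01 04:00:00", tz="US/Eastern")
rb = pd.Timestamp("1990-01-01 07:00:00", tz="US/Eastern")

# Boolean-mask selection: aware index compared against aware bounds.
by_mask = ts[(ts.index >= lb) & (ts.index <= rb)]

# Label-based slice with the same aware bounds (inclusive on both ends).
by_slice = ts.loc[lb:rb]

print(by_mask)
print(by_mask.equals(by_slice))
```

Both selections cover the four hourly entries from 04:00 through 07:00 US/Eastern. The open question in the issue is what should happen when lb and rb are left as tz-naive strings or Timestamps instead.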

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 25 (21 by maintainers)

Top GitHub Comments

1kastner commented, Nov 24, 2017 (1 reaction)

@jbrockmendel

If a DatetimeIndex is tznaive and an indexer/slicer is tzaware, then #17920 assumes the DatetimeIndex is UTC. I would much rather raise and require the user explicitly make the index tzaware. Is there a reason why this wouldn’t work in your use case?

It would work in my use case; the assumption was made so as not to break other people’s code unnecessarily. I agree that your solution is cleaner, but for me pandas means having a lot of convenience, and today's convenience creates lots of edge cases. If you look at the documentation, you can pass many string labels that do not look like datetime-like objects at all but are just the year, the year and the month, or something similar. We should not assume that users provide only ISO 8601-conformant labels. And how would you make labels like these timezone-aware? A small sample:

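import pandas as pd

# pd.datetime was removed in pandas 2.0; pd.Timestamp is a drop-in replacement here
t = pd.date_range(start=pd.Timestamp(2000, 1, 1), periods=400, freq='D')
df = pd.DataFrame(index=t, data={"val": range(len(t))})
df.loc["2000-01":"2000-03"]  # partial-string slice: January through March
df.loc["2000-01"]            # partial-string label: all of January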

I see no way to add timezones to these kinds of labels that would still read intuitively. The strictness you suggest would only suit this library if every possible kind of label allowed adding a timezone. To me that would be a major change, something better discussed for a similar but completely new library.
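As a point of reference for the explicit route suggested in the quoted comment, here is a small sketch in which the index is made tz-aware with DataFrame.tz_localize and the partial-string labels are left as they are. The choice of "UTC" and the names df_utc, january and first_quarter are assumptions for illustration. As far as I know, pandas interprets such tz-less labels in the index's own timezone, which is precisely the implicit cast this thread is debating.

```python
import pandas as pd

# 400 daily, tz-naive timestamps starting 2000-01-01, as in the sample above.
t = pd.date_range(start="2000-01-01", periods=400, freq=pd.offsets.Day())
df = pd.DataFrame(index=t, data={"val": range(len(t))})

# Make the index tz-aware explicitly instead of relying on an assumed timezone.
df_utc = df.tz_localize("UTC")

# The labels themselves still carry no timezone.
january = df_utc.loc["2000-01"]
first_quarter = df_utc.loc["2000-01":"2000-03"]

print(len(january), len(first_quarter))
```

The labels stay readable, but only because the timezone is inferred from the index rather than written in the label.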

jbrockmendel commented, Dec 6, 2018 (0 reactions)

In what way has it been resolved?

I thought that #17920 had been merged; apparently I was wrong about that.

