DISC: Period.__le__
There are a bunch of issues outstanding that relate to Period ops and comparison. AFAICT making a decision about the comparison issue will snowball into resolving (some of) the ops issues.
- #5202 ENH: Period ops NaT & timedelta ops
- #10798 Date / Datetime in Period Index
- #6779 Adding Period and Offset not implemented
- #13077 ENH/API: Decide what to return Period - Period subtraction
- #17112 MultiIndex - Comparison with Mixed Frequencies (and other FUBAR)
Right now two Period objects are comparable iff they have equal freq attributes. AFAICT this is to avoid guessing in cases where the “correct” answer is ambiguous. But there are some other cases with an obviously correct answer. Namely, if per1.end_time < per2.start_time, then it should be the case that per1 < per2 unambiguously. This intuition also extends to datetime and Timestamp objects that do not lie between per.start_time and per.end_time.
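To make the non-overlapping case concrete, here is a minimal sketch using only the standard library. Span is a hypothetical stand-in for a Period (it is not the pandas class), and unambiguous_lt is an illustrative helper name, not an existing or proposed API. It answers only when the two spans do not overlap and returns None otherwise.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a Period: a closed [start, end] time span."""
    start: datetime
    end: datetime


def bounds(obj: Union[Span, datetime]) -> Tuple[datetime, datetime]:
    # A bare datetime is treated as a span whose start and end coincide.
    if isinstance(obj, datetime):
        return obj, obj
    return obj.start, obj.end


def unambiguous_lt(a, b) -> Optional[bool]:
    """Return True/False when the spans do not overlap, None when they do."""
    a_start, a_end = bounds(a)
    b_start, b_end = bounds(b)
    if a_end < b_start:
        return True   # a lies entirely before b
    if b_end < a_start:
        return False  # a lies entirely after b
    return None       # overlap: no obviously correct answer


jan = Span(datetime(2017, 1, 1), datetime(2017, 1, 31, 23, 59, 59))
mar_1 = Span(datetime(2017, 3, 1), datetime(2017, 3, 1, 23, 59, 59))
jan_15 = Span(datetime(2017, 1, 15), datetime(2017, 1, 15, 23, 59, 59))

print(unambiguous_lt(jan, mar_1))                   # True
print(unambiguous_lt(jan, jan_15))                  # None
print(unambiguous_lt(jan, datetime(2016, 12, 25)))  # False
```

The None branch corresponds to the cases where comparison is currently refused; only the first two branches are the "obviously correct" ones.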
For cases with overlap there are a couple of reasonable approaches. My preferred approach is lexicographic: first compare per1.start_time with per2.start_time. If they are equal, then compare the end_times. Then we treat datetime and Timestamp objects as analogous to zero-duration periods.
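A rough sketch of the lexicographic rule, again with the standard library only. Span, sort_key, and lex_lt are hypothetical names chosen for illustration; nothing here is pandas code. Python tuples already compare lexicographically, so the rule reduces to comparing (start, end) pairs.

```python
from datetime import datetime
from typing import NamedTuple, Tuple, Union


class Span(NamedTuple):
    """Hypothetical stand-in for a Period: a (start, end) pair."""
    start: datetime
    end: datetime


def sort_key(obj: Union[Span, datetime]) -> Tuple[datetime, datetime]:
    # A bare datetime is treated as a zero-duration span.
    if isinstance(obj, Span):
        return (obj.start, obj.end)
    return (obj, obj)


def lex_lt(a, b) -> bool:
    # Tuple comparison is lexicographic: starts first, ends break ties.
    return sort_key(a) < sort_key(b)


jan_1_day = Span(datetime(2017, 1, 1), datetime(2017, 1, 1, 23, 59, 59))
jan_month = Span(datetime(2017, 1, 1), datetime(2017, 1, 31, 23, 59, 59))

print(lex_lt(jan_1_day, jan_month))              # True: same start, earlier end
print(lex_lt(datetime(2017, 1, 1), jan_1_day))   # True: zero-duration sorts first
print(lex_lt(jan_month, datetime(2017, 1, 15)))  # True: start precedes the timestamp
```

Note the last call: the month compares as less than a timestamp that falls inside it, which is exactly the kind of result that is consistent under this rule but may surprise users.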
Thoughts?
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
Yeah, in an ideal world we would actually use an Interval to back a Period. Yes, we should clarify the semantics here.
Another point of reference is Interval, which uses lexicographic ordering on (start, stop, closed): https://github.com/pandas-dev/pandas/blob/d3be81ad595c5338781bed9963c729a9702e6611/pandas/_libs/interval.pyx#L91-L94

So if we care about ordering periods with different frequencies, this would be a reasonable choice. On the other hand, if we don’t have use cases for this, from a usability perspective it is probably better to stick with refusing to order potentially overlapping periods.
So period < timestamp would effectively compare period.start_time < timestamp? I suppose we could argue that this is consistent, but here I really don’t think it’s a good idea. It’s better to force users to be explicit about what they want rather than allow them to make mistaken assumptions (period.end_time < timestamp is at least as plausible).
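To illustrate why the two readings matter, here is a small standard-library sketch. The variables period_start and period_end are plain datetimes standing in for period.start_time and period.end_time, and the helper names are hypothetical. When the timestamp falls inside the period, the two explicit comparisons disagree.

```python
from datetime import datetime


def starts_before(period_start: datetime, timestamp: datetime) -> bool:
    # Reading 1: "period < timestamp" means the period starts before it.
    return period_start < timestamp


def ends_before(period_end: datetime, timestamp: datetime) -> bool:
    # Reading 2: "period < timestamp" means the period is over by then.
    return period_end < timestamp


# Stand-ins for a January 2017 monthly period and a mid-month timestamp.
period_start = datetime(2017, 1, 1)
period_end = datetime(2017, 1, 31, 23, 59, 59)
timestamp = datetime(2017, 1, 15)

print(starts_before(period_start, timestamp))  # True
print(ends_before(period_end, timestamp))      # False
```

Writing the comparison against start_time or end_time explicitly makes the intended reading visible at the call site.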