[BUG] Absolute to relative ForecastingHorizon

Describe the bug

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-40dcb829593a> in <module>
      4 y_train, y_test = temporal_train_test_split(y, test_size=120)
      5 fh = ForecastingHorizon(y_test.index[0:5], is_relative=False)
----> 6 fh.to_relative(cutoff=y_train.index[-1])

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in to_relative(self, cutoff)
    223                 values = _coerce_duration_to_int(values, unit=_get_unit(cutoff))
    224 
--> 225             return self._new(values, is_relative=True)
    226 
    227     @lru_cache(typed=True)

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in _new(self, values, is_relative)
    160         if is_relative is None:
    161             is_relative = self.is_relative
--> 162         return type(self)(values, is_relative)
    163 
    164     @property

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in __init__(self, values, is_relative)
    137         if not isinstance(is_relative, bool):
    138             raise TypeError("`is_relative` must be a boolean")
--> 139         values = _check_values(values)
    140 
    141         # check types, note that isinstance() does not work here because index

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in _check_values(values)
    105     # check values does not contain duplicates
    106     if len(values) != values.nunique():
--> 107         raise ValueError("`values` must not contain duplicates.")
    108 
    109     # return sorted values

ValueError: `values` must not contain duplicates.

To Reproduce

from sktime.forecasting.all import *
y = load_airline()
y.index = y.index.to_timestamp()
y_train, y_test = temporal_train_test_split(y, test_size=120)
fh = ForecastingHorizon(y_test.index[0:5], is_relative=False)
fh.to_relative(cutoff=y_train.index[-1])

Expected behavior

No exception

Additional context

Versions

Current GitHub master

Top GitHub Comments

mloning commented, Dec 30, 2020

Yes, I asked about it on their Gitter: https://gitter.im/pydata/pandas

The issue is that we often work with incomplete indices (e.g. a forecasting horizon or windows in temporal CV), and I'm not sure how best to handle that yet.
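To illustrate the "incomplete indices" concern, here is a minimal pandas-only sketch. The dates are made up for illustration and are not taken from the issue. It assumes that a DatetimeIndex with a gap carries no frequency, so to_period() cannot infer one and needs an explicit freq:

```python
import pandas as pd

# Hypothetical monthly horizon with a gap (February is missing).
incomplete = pd.DatetimeIndex(["1951-01-01", "1951-03-01", "1951-04-01"])

# With irregular spacing, pandas has no frequency to attach or infer.
has_freq = incomplete.freq is not None or incomplete.inferred_freq is not None

# Without a frequency, to_period() is expected to raise a ValueError.
try:
    incomplete.to_period()
    raised = False
except ValueError:
    raised = True

# Passing the frequency explicitly sidesteps the inference step.
explicit = incomplete.to_period("M")
print(has_freq, raised, list(explicit.astype(str)))
```

This is why a conversion that relies on the index's own freq only covers complete, regularly spaced indices.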

aiwalter commented, Dec 28, 2020

@mloning I had a look into it and I think the best way is what you wrote here:

fh.to_pandas().to_period("M") - cutoff.to_period("M")

So basically what you already said: when doing fh.to_relative(), just convert it to a PeriodIndex first in case it is a DatetimeIndex. The good thing is that pandas can also convert the DatetimeIndex just like this:

from sktime.forecasting.all import *
y = load_airline()
y = y.to_timestamp()
cutoff = y.index[-1]
y.index.to_period()

and it works without giving the freq="M" argument:

PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
             '1949-07', '1949-08', '1949-09', '1949-10',
             ...
             '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
             '1960-09', '1960-10', '1960-11', '1960-12'],
            dtype='period[M]', name='Period', length=144, freq='M')

The bad thing is that to_period() does not work for a single Timestamp, as shown below. I think I will raise this as an issue with pandas, as this should also work imho.

cutoff.to_period()

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.freq_to_dtype_code()

AttributeError: 'pandas._libs.tslibs.offsets.MonthBegin' object has no attribute '_period_dtype_code'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-122-1571c54e24e5> in <module>
      1 cutoff = y.index[-1]
----> 2 cutoff.to_period()

pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps._Timestamp.to_period()

pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.Period.__new__()

pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.freq_to_dtype_code()

ValueError: Invalid frequency: {0}

So I now have a good workaround to convert the Timestamp to a Period without a hardcoded freq value or extra if statements:

date = pd.DatetimeIndex([cutoff], freq=cutoff.freq)
cutoff = date.to_period()[0]
cutoff

This results in Period('1949-01', 'M'). 🎉 So now we can do the delta calculation as usual with the PeriodIndex in order to get the relative fh. What do you think? I can try to implement it.
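The delta calculation described above can be sketched with plain pandas. This is not the sktime implementation: to_relative_steps is a hypothetical helper, and it takes the frequency as an explicit argument rather than reading cutoff.freq, on the assumption that recent pandas versions no longer expose a freq attribute on Timestamp. The dates mirror the reproduction, assuming the training data ends at 1950-12.

```python
import pandas as pd


def to_relative_steps(absolute, cutoff, freq="M"):
    """Hypothetical helper: integer steps between each timestamp and the cutoff."""
    periods = absolute.to_period(freq)      # DatetimeIndex -> PeriodIndex
    cutoff_period = cutoff.to_period(freq)  # Timestamp -> Period
    # PeriodIndex - Period yields offset objects; .n is the integer multiple.
    offsets = periods - cutoff_period
    return pd.Index([offset.n for offset in offsets], dtype="int64")


absolute = pd.DatetimeIndex(
    ["1951-01-01", "1951-02-01", "1951-03-01", "1951-04-01", "1951-05-01"]
)
cutoff = pd.Timestamp("1950-12-01")

relative = to_relative_steps(absolute, cutoff)
print(list(relative))
```

Because the subtraction happens in period space, each month counts as exactly one step regardless of how many days it has, so the resulting values cannot collide.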
