[BUG] Absolute to relative ForecastingHorizon

Describe the bug

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-40dcb829593a> in <module>
      4 y_train, y_test = temporal_train_test_split(y, test_size=120)
      5 fh = ForecastingHorizon(y_test.index[0:5], is_relative=False)
----> 6 fh.to_relative(cutoff=y_train.index[-1])

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in to_relative(self, cutoff)
    223                 values = _coerce_duration_to_int(values, unit=_get_unit(cutoff))
    224 
--> 225             return self._new(values, is_relative=True)
    226 
    227     @lru_cache(typed=True)

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in _new(self, values, is_relative)
    160         if is_relative is None:
    161             is_relative = self.is_relative
--> 162         return type(self)(values, is_relative)
    163 
    164     @property

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in __init__(self, values, is_relative)
    137         if not isinstance(is_relative, bool):
    138             raise TypeError("`is_relative` must be a boolean")
--> 139         values = _check_values(values)
    140 
    141         # check types, note that isinstance() does not work here because index

c:\Users\Martin\Desktop\sktime\sktime\sktime\forecasting\base\_fh.py in _check_values(values)
    105     # check values does not contain duplicates
    106     if len(values) != values.nunique():
--> 107         raise ValueError("`values` must not contain duplicates.")
    108 
    109     # return sorted values

ValueError: `values` must not contain duplicates.

To Reproduce

from sktime.forecasting.all import *
y = load_airline()
y.index = y.index.to_timestamp()
y_train, y_test = temporal_train_test_split(y, test_size=120)
fh = ForecastingHorizon(y_test.index[0:5], is_relative=False)
fh.to_relative(cutoff=y_train.index[-1])
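For context, here is a pandas-only sketch of what the subtraction inside `to_relative` has to work with once the index has been converted to timestamps. It assumes a synthetic monthly index in place of `load_airline()`; the names `index`, `absolute` and `day_gaps` are illustrative and not part of sktime. It does not reproduce sktime's internal `_coerce_duration_to_int` logic, which is not shown in the issue.

```python
import pandas as pd

# Assumption: a month-start index covering the same span as the airline data.
index = pd.date_range("1949-01-01", periods=144, freq="MS")

# Same split as in the reproduction: the last 120 points are the test set.
cutoff = index[23]        # last training timestamp, 1950-12-01
absolute = index[24:29]   # first five test timestamps

# Subtracting a Timestamp from a DatetimeIndex gives durations, not months.
deltas = absolute - cutoff
day_gaps = [int(d) for d in deltas.days]
print(day_gaps)  # [31, 62, 90, 121, 151]
```

The gaps are not whole multiples of a fixed month length, which may be why converting these durations to integer steps is fragile. The exact cause of the duplicates is not stated in the issue.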

Expected behavior

No exception

Additional context

Versions

Current GitHub master

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13

Top GitHub Comments

1 reaction
mloning commented, Dec 30, 2020

Yes I asked about it on their Gitter: https://gitter.im/pydata/pandas

The issue is that we often work with incomplete indices (e.g. a forecasting horizon or windows in temporal CV), and I'm not sure how best to handle that yet.

1 reaction
aiwalter commented, Dec 28, 2020

@mloning I had a look into it and I think the best way is what you wrote here:

fh.to_pandas().to_period("M") - cutoff.to_period("M")

So basically what you already said: when calling fh.to_relative(), first convert the values to a PeriodIndex if they are a DatetimeIndex. The good thing is that pandas can also convert the DatetimeIndex just like this:

from sktime.forecasting.all import *
y = load_airline()
y = y.to_timestamp()
cutoff = y.index[-1]
y.index.to_period()

and it works without giving the freq="M" argument:

PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
             '1949-07', '1949-08', '1949-09', '1949-10',
             ...
             '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
             '1960-09', '1960-10', '1960-11', '1960-12'],
            dtype='period[M]', name='Period', length=144, freq='M')

The bad thing is that to_period() does not work for a single Timestamp, as shown below. I think I will raise this as an issue with pandas, as this should also work imho.

cutoff.to_period()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.freq_to_dtype_code()

AttributeError: 'pandas._libs.tslibs.offsets.MonthBegin' object has no attribute '_period_dtype_code'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-122-1571c54e24e5> in <module>
      1 cutoff = y.index[-1]
----> 2 cutoff.to_period()

pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps._Timestamp.to_period()

pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.Period.__new__()

pandas\_libs\tslibs\period.pyx in pandas._libs.tslibs.period.freq_to_dtype_code()

ValueError: Invalid frequency: {0}

So I now have a good workaround to convert the Timestamp to a Period without a hardcoded freq value or extra if statements:

date = pd.DatetimeIndex([cutoff], freq=cutoff.freq)
cutoff = date.to_period()[0]
cutoff

This results in Period('1949-01', 'M'). 🎉 So now we can do the delta calculation as usual with the PeriodIndex in order to get the relative fh. What do you think? I can try to implement it.
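The period-based delta calculation described above can be sketched with pandas alone. This is a minimal illustration, not the sktime implementation: `to_relative_months` is a hypothetical helper, the index is synthetic, and the frequency "M" is passed explicitly rather than read from `cutoff.freq`, since that attribute is not available on a Timestamp in newer pandas releases.

```python
import pandas as pd

# Assumption: a month-start index standing in for the airline data.
index = pd.date_range("1949-01-01", periods=144, freq="MS")
cutoff = index[23]        # last training timestamp
absolute = index[24:29]   # absolute forecasting horizon (timestamps)


def to_relative_months(absolute_index, cutoff_timestamp):
    """Hypothetical helper: month offsets of each timestamp from the cutoff."""
    # Convert both sides to monthly periods so the difference counts
    # calendar months instead of days.
    periods = absolute_index.to_period("M")
    cutoff_period = cutoff_timestamp.to_period("M")
    # Period.ordinal is an integer count of periods, so the difference of
    # two ordinals is the number of months between them.
    return [p.ordinal - cutoff_period.ordinal for p in periods]


relative = to_relative_months(absolute, cutoff)
print(relative)  # [1, 2, 3, 4, 5]
```

Because each month maps to a distinct ordinal, the resulting values are unique, which is what the duplicate check in `_check_values` requires.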
