[BUG] AutoARIMA not working with Temporal Cross Validation

Describe the bug

I would like statistical models such as ARIMA and ETS to work with the Temporal Cross Validation flow. When I use that flow, I am not able to reproduce the results of the standalone AutoARIMA.

To Reproduce

Setup

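import numpy as np
from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.model_selection import temporal_train_test_split

y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)  # hold out the last 36 observations
fh = ForecastingHorizon(np.arange(len(y_test)) + 1, is_relative=True)  # relative steps 1..36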

Regular AutoARIMA (Baseline for test)

forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)

0.04117062370076287
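For readers comparing the scores in this report, the following is a minimal sketch of the symmetric MAPE in its common form, mean(2 * |y - y_hat| / (|y| + |y_hat|)). It is an assumption that `smape_loss` follows this definition; the issue itself does not state the formula, and the helper name `smape` is hypothetical.

```python
def smape(y_true, y_pred):
    """Symmetric MAPE: mean of 2 * |y - y_hat| / (|y| + |y_hat|).

    Assumed definition, not taken from the issue. Undefined when an
    actual value and its forecast are both zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    terms = [
        2.0 * abs(t - p) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


# A perfect forecast scores 0; larger values mean larger relative errors.
print(smape([100.0, 200.0], [100.0, 200.0]))
print(smape([100.0, 200.0], [110.0, 180.0]))
```

Under this definition, lower is better, so a score near 0.04 indicates a noticeably closer forecast than one near 0.11.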

Using Temporal Cross Validation

Version 1: ‘sp’ in `forecaster_param_grid` (does not work)

forecaster_param_grid = {'sp': [12]}
forecaster = AutoARIMA(suppress_warnings=True)

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.90), start_with_window=True)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=True)

gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)

0.11346208431398466

Version 2: ‘sp’ in `forecaster` (works but defeats the purpose of Grid Search)

forecaster_param_grid = {}
forecaster = AutoARIMA(sp=12, suppress_warnings=True)

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.90), start_with_window=True)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=True)

gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)

0.04117062370076287 (matches standalone AutoARIMA without GridSearch above)

Expected behavior

The best_estimator does not appear to pick up the seasonality value of 12 when ‘sp’ is passed only through forecaster_param_grid. It works only if ‘sp’ is set directly in the forecaster initialization.

Additional context

I would like to create a unified flow around sktime to build and compare multiple models (ARIMA, ETS, Random Forest, SVM, etc.), including hyperparameter tuning for the statistical models. The examples folder shows how this can be done for native scikit-learn models, but I wanted to recreate the same for the statistical models.

Versions

System:
python: 3.6.12 |Anaconda, Inc.| (default, Sep 9 2020, 00:29:25) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\xxxx\AppData\Local\Continuum\anaconda3\envs\sktime\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.3
setuptools: 49.6.0
sklearn: 0.23.2
numpy: 1.19.2
scipy: 1.5.2
Cython: 0.29.17
pandas: 1.1.3
matplotlib: 3.3.2
joblib: 0.17.0
numba: None
pmdarima: 1.7.1
tsfresh: None

Issue Analytics

Created 3 years ago
Comments: 7 (3 by maintainers)

Top GitHub Comments

ngupta23 commented, Dec 12, 2020

Hi @mloning, thank you for suggesting the alternative. You will need to change the argument assignments in _AutoARIMA from just <param_name> to self.<param_name>; it will then pick up the updated parameter value after the grid search has assigned it. It is definitely cleaner, since we don’t have to rename parameters such as sp.

Would you like me to submit a PR for this?
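The change described in this comment can be illustrated with a minimal, library-free sketch. It assumes the problem is that the inner model is configured from the constructor's local arguments, so values applied later through set_params never reach it. The classes `FrozenAtInit` and `ReadAtFit`, the `_inner_config` attribute, and the helper `seasonal_period_after_grid_step` are all hypothetical stand-ins, not sktime or pmdarima code.

```python
class FrozenAtInit:
    """Hypothetical forecaster that captures `sp` while it is constructed."""

    def __init__(self, sp=1):
        self.sp = sp
        # The inner configuration is built from the local argument `sp`,
        # so later changes to self.sp never reach it.
        self._inner_config = {"m": sp}

    def set_params(self, **params):
        # Same idea as scikit-learn style set_params: overwrite attributes.
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self):
        # Returns the seasonal period the inner model would actually use.
        return self._inner_config["m"]


class ReadAtFit(FrozenAtInit):
    """Same interface, but reads self.<param_name> when fit is called."""

    def fit(self):
        inner_config = {"m": self.sp}
        return inner_config["m"]


def seasonal_period_after_grid_step(forecaster_cls, sp):
    # Mimics one grid-search step: build with defaults, then set_params.
    forecaster = forecaster_cls()
    forecaster.set_params(sp=sp)
    return forecaster.fit()


print(seasonal_period_after_grid_step(FrozenAtInit, 12))
print(seasonal_period_after_grid_step(ReadAtFit, 12))
```

In this sketch the first call reports 1 (the default captured at construction) and the second reports 12, which mirrors the difference between Version 1 and Version 2 in the report.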

ngupta23 commented, Dec 7, 2020

Hi @mloning, Thanks for the reference to the repo! I will go through it and let you know if I have any further questions.
