[BUG] AutoARIMA not working with Temporal Cross Validation


Describe the bug

I would like statistical models such as ARIMA, ETS, etc. to work with the temporal cross-validation flow. When I use that flow, I cannot reproduce the results of the standalone AutoARIMA.

To Reproduce

Setup

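import numpy as np
from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.model_selection import temporal_train_test_split
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)

fh = ForecastingHorizon(np.arange(len(y_test)) + 1, is_relative=True)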

Regular AutoARIMA (Baseline for test)

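from sktime.forecasting.arima import AutoARIMA
from sktime.performance_metrics.forecasting import smape_loss
from sktime.utils.plotting import plot_series

# Baseline: seasonal period set directly in the constructor
forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)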

0.04117062370076287
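For readers comparing the scores in this report, here is a minimal sketch of a symmetric MAPE, assuming the common definition mean(2 * |y - y_hat| / (|y| + |y_hat|)). The helper name `smape` is hypothetical and this is not sktime's own implementation; it is only meant to show how a lower value corresponds to a closer forecast.

```python
def smape(y_true, y_pred):
    """Symmetric MAPE under the assumed definition mean(2|y - yhat| / (|y| + |yhat|)).

    Hypothetical helper for illustration; not the sktime function.
    Raises ZeroDivisionError if an actual and its forecast are both zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    terms = [
        2.0 * abs(t - p) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


# A perfect forecast scores 0; larger errors push the score up.
perfect = smape([100.0, 200.0], [100.0, 200.0])
close = smape([100.0, 200.0], [110.0, 180.0])
far = smape([100.0, 200.0], [150.0, 100.0])
```

Under this definition the score is bounded between 0 and 2, so the gap between roughly 0.04 and 0.11 reported here is a substantial difference in forecast quality.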


Using Temporal Cross Validation

Version 1: ‘sp’ in forecaster_param_grid (does not work)

from sktime.forecasting.model_selection import ForecastingGridSearchCV, SlidingWindowSplitter

# 'sp' is supplied only through the grid, not the constructor
forecaster_param_grid = {'sp': [12]}
forecaster = AutoARIMA(suppress_warnings=True)

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.90), start_with_window=True)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=True)

gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)

0.11346208431398466


Version 2: ‘sp’ in forecaster (works but defeats the purpose of Grid Search)

forecaster_param_grid = {}
forecaster = AutoARIMA(sp=12, suppress_warnings=True)

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.90), start_with_window=True)
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=True)

gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"]);
smape_loss(y_test, y_pred)

0.04117062370076287 (matches standalone AutoARIMA without GridSearch above)


Expected behavior

The best estimator should use the seasonality value of 12 when ‘sp’ is passed through forecaster_param_grid. Instead, it does not appear to pick that value up; it only works if ‘sp’ is set directly when the forecaster is initialized.

Additional context

Basically, I would like to create a unified flow around sktime to build and compare multiple models (ARIMA, ETS, Random Forest, SVM, etc.), including hyperparameter tuning for the statistical models. The examples folder shows how this can be done for native scikit-learn models, but I wanted to recreate the same for the statistical models.

Versions

System:
python: 3.6.12 |Anaconda, Inc.| (default, Sep 9 2020, 00:29:25) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\xxxx\AppData\Local\Continuum\anaconda3\envs\sktime\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.3
setuptools: 49.6.0
sklearn: 0.23.2
numpy: 1.19.2
scipy: 1.5.2
Cython: 0.29.17
pandas: 1.1.3
matplotlib: 3.3.2
joblib: 0.17.0
numba: None
pmdarima: 1.7.1
tsfresh: None

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

ngupta23 commented, Dec 12, 2020

Hi @mloning, Thank you for suggesting the alternative. You will need to change the argument assignments in _AutoARIMA from just <param_name> to self.<param_name>; the model will then pick up the updated parameter value after the grid search has assigned it. This is definitely cleaner, since we don’t have to rename parameters such as sp.

Would you like me to submit a PR for this?
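The change described in this comment can be sketched in plain Python. This is a minimal illustration, not sktime's code: `InnerModel`, `BaseWrapper`, `BuggyForecaster` and `FixedForecaster` are hypothetical names, and the sketch assumes the wrapped model was being built from the bare constructor arguments, so that a later `set_params` call updated the attribute but not the model.

```python
class InnerModel:
    """Hypothetical stand-in for the wrapped ARIMA model."""

    def __init__(self, sp=1):
        self.sp = sp


class BaseWrapper:
    """Minimal scikit-learn-style parameter handling (assumed behaviour)."""

    def __init__(self, sp=1):
        self.sp = sp

    def set_params(self, **params):
        # Grid search updates hyperparameters by setting attributes.
        for name, value in params.items():
            setattr(self, name, value)
        return self


class BuggyForecaster(BaseWrapper):
    def __init__(self, sp=1):
        super().__init__(sp=sp)
        # Inner model is frozen here from the bare argument `sp`,
        # so later set_params calls never reach it.
        self.model = InnerModel(sp=sp)

    def fit(self):
        return self


class FixedForecaster(BaseWrapper):
    def fit(self):
        # Inner model is built from self.sp at fit time,
        # so it sees whatever set_params assigned.
        self.model = InnerModel(sp=self.sp)
        return self


buggy = BuggyForecaster().set_params(sp=12).fit()
fixed = FixedForecaster().set_params(sp=12).fit()
print(buggy.sp, buggy.model.sp)  # attribute updated, inner model not
print(fixed.sp, fixed.model.sp)  # both reflect the grid value
```

Reading hyperparameters from `self` at fit time keeps the public parameter names unchanged, which matches the point above about not having to rename parameters such as sp.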

ngupta23 commented, Dec 7, 2020

Hi @mloning, Thanks for the reference to the repo! I will go through it and let you know if I have any further questions.

