
Refit with ForecastingGridSearchCV when using a Grid Search based forecasters may not use "best hyperparameters"


Describe the bug

When a grid-search-based forecaster (such as AutoARIMA, or ReducedRegressionForecaster wrapping scikit-learn's GridSearchCV) is used inside ForecastingGridSearchCV, the best hyperparameters found on the initial_window may not be the ones used in the final refit.

The outer grid search (the one in ForecastingGridSearchCV) works fine and picks the right hyperparameters for the final fit; the inner grid search is the problem. The final refit simply calls fit on the full data, which reruns the entire "inner" grid search. That rerun may select a different set of best hyperparameters than the ones selected during the fit on the initial_window.

Since the final hyperparameters can differ from those used during the sliding cross-validation, the cross-validation results can be rendered invalid: they will not correspond to the hyperparameters of the final refit model.
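To see the failure mode in isolation, here is a toy pure-Python sketch (not sktime code; `inner_grid_search` and its scoring rule are invented for illustration). The "best" hyperparameter depends on the data it is searched on, so re-running the search on the full series during refit can silently pick a different value than it did on the initial window:

```python
# Toy illustration of the bug: an inner grid search re-run on a longer
# series can pick different hyperparameters than on the initial window.

def inner_grid_search(y, candidates):
    # Dummy rule: the "best" candidate depends on the data length,
    # mimicking AutoARIMA choosing a different model on more data.
    return min(candidates, key=lambda c: abs(c - len(y) // 4))

initial_window = list(range(8))   # first half of the series
full_series = list(range(16))     # all of the data

best_on_window = inner_grid_search(initial_window, [2, 4, 6])
best_on_refit = inner_grid_search(full_series, [2, 4, 6])

print(best_on_window, best_on_refit)  # 2 4 -- the refit silently switched
```

The cross-validation scores were computed with the first value, but the refit model uses the second, which is exactly the mismatch reported below.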

To Reproduce

import numpy as np
from sktime.datasets import load_airline
from sktime.forecasting.arima import AutoARIMA
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)

y = load_airline()
fh = np.arange(1, 13)

y_train, y_test = temporal_train_test_split(y, test_size=len(fh))

cv = SlidingWindowSplitter(
    initial_window=int(len(y_train) * 0.5),
    start_with_window=True,
)

# ARIMA
forecaster_param_grid = {"sp": [12, 24]}
forecaster = AutoARIMA(suppress_warnings=True)

gscv_arima = ForecastingGridSearchCV(
    forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=True
)
gscv_arima.fit(y_train)

During the grid search, the run with sp=12 selects only 2 AR lags along with one seasonal lag. (I have left out the parameters for sp=24 since it is not selected as the final best_forecaster_.)

# Initial Window Fit
forecaster._get_fitted_param_names()
['intercept', 'ar.L1', 'ar.L2', 'ar.S.L12', 'sigma2']
forecaster._get_fitted_params()
array([ 7.99081197,  0.56606485,  0.1956277 , -0.47478481, 68.31476303])

Best Estimator

self.best_forecaster_
AutoARIMA(sp=12, suppress_warnings=True)

Final Refit: This indicates that 3 AR lags should be used, meaning the hyperparameters selected by the final fit are not the same as those selected by the fit on the "initial_window". Note that the actual parameter values will differ, and that is OK, but the hyperparameters (param names below) should be the same for the initial window and the final refit.

self.best_forecaster_._get_fitted_param_names()
['intercept', 'ar.L1', 'ar.L2', 'ar.L3', 'sigma2']
self.best_forecaster_._get_fitted_params()
array([  5.53410033,   0.70489183,   0.25742152,  -0.14344774, 101.09688613])

Expected behavior

The param names should be the same as those selected in the initial window.

Additional context

The same issue can potentially be observed when ReducedRegressionForecaster with scikit-learn's GridSearchCV is used in ForecastingGridSearchCV.

An example is shown below. (Note that in this case it happens to pick the same hyperparameters in the final refit as on the initial_window, but since the GridSearchCV object is fit again at the end without any regard to the hyperparameters selected on the initial_window, it could have picked different ones.)

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sktime.forecasting.compose import ReducedRegressionForecaster

regressor_param_grid = {"n_estimators": [200, 300]}
forecaster_param_grid = {"window_length": [12, 15]}
tscv = TimeSeriesSplit()

# create a tunable regressor with GridSearchCV
regressor = GridSearchCV(
    RandomForestRegressor(),
    param_grid=regressor_param_grid,
    cv=tscv,
    scoring="neg_mean_squared_error",
    verbose=1,
)
forecaster = ReducedRegressionForecaster(
    regressor, strategy="recursive"
)

gscv = ForecastingGridSearchCV(
    forecaster, cv=cv, param_grid=forecaster_param_grid, verbose=1
)
gscv.fit(y_train)

Possible Solution

Implement a refit method in the forecasters.

  • For grid-search-based forecasters, refit will pluck the "best hyperparameters" found by the "inner" grid search and set them explicitly on the underlying forecaster (ARIMA for AutoARIMA, RandomForestRegressor for ReducedRegressionForecaster, etc.) instead of performing the "inner" grid search again.
  • For non-grid-search-based forecasters, refit can simply alias the fit method.
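A minimal sketch of the proposed refit() contract, in pure Python (the class, scoring rule, and attribute names here are hypothetical stand-ins, not sktime API): the forecaster remembers the hyperparameters chosen by its first inner search, and refit() reuses them instead of searching again.

```python
class GridSearchForecaster:
    """Toy stand-in for a forecaster that runs an inner grid search."""

    def __init__(self, param_grid):
        self.param_grid = param_grid
        self.best_params_ = None

    def _score(self, params, y):
        # Dummy scoring rule: prefer the smallest number of lags.
        return -params["lags"]

    def fit(self, y):
        # Inner grid search: pick the best hyperparameters for this data.
        candidates = [{"lags": v} for v in self.param_grid["lags"]]
        self.best_params_ = max(candidates, key=lambda p: self._score(p, y))
        return self

    def refit(self, y):
        # Proposed behaviour: skip the inner search, reuse best_params_.
        if self.best_params_ is None:
            return self.fit(y)  # fall back to a full fit
        # ... fit the underlying model on y with self.best_params_ fixed ...
        return self


f = GridSearchForecaster({"lags": [1, 2, 3]}).fit([1.0, 2.0, 3.0])
chosen = dict(f.best_params_)
f.refit([1.0, 2.0, 3.0, 4.0])          # refit on more data
assert f.best_params_ == chosen        # hyperparameters are preserved
```

With this contract, ForecastingGridSearchCV could call refit() on best_forecaster_ for the final fit, guaranteeing the cross-validation results correspond to the refit model's hyperparameters.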

Versions

System:
    python: 3.6.12 |Anaconda, Inc.| (default, Sep  9 2020, 00:29:25) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\Nikhil\.conda\envs\sktime_dev\python.exe
   machine: Windows-10-10.0.18362-SP0

Python dependencies:
          pip: 20.3.3
   setuptools: 51.0.0.post20201207
      sklearn: 0.24.0
       sktime: 0.5.1
  statsmodels: 0.12.1
        numpy: 1.19.4
        scipy: 1.5.4
       Cython: 0.29.17
       pandas: 1.1.5
   matplotlib: 3.3.3
       joblib: 1.0.0
        numba: 0.52.0
     pmdarima: 1.8.0
      tsfresh: 0.17.0

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction · ngupta23 commented, Jan 26, 2021

Hi @mloning, thank you for the alternate code for ReducedRegressionForecaster. I verified that this alternative (without using GridSearchCV in the regressor) works great with scikit-learn based regressors.

However, the issue still exists with AutoARIMA since it is essentially an intelligent grid search. Do you have an alternative that works for AutoARIMA as well?

1 reaction · ngupta23 commented, Jan 15, 2021

@fkiraly Thanks for the feedback on this topic. @mloning , Thanks for the alternate code. Let me try this out and get back in case there are any further questions. If this works as expected, it might be good to document this in the forecasting example notebook.
