One-step prediction CV using all previous data

I have daily data from 2017 and 2018 (X_train) and from 2019 (X_test).

I want to do a one-step-ahead CV prediction for each day in 2019, trained on all the data up to the day before. That is, first use all the training data (2017, 2018) to predict 01-01/2019, then use all the training data plus 01-01/2019 to predict 02-01/2019, and so on.

As far as I understand, the parameter initial is the training window, i.e. setting initial="365 days" and horizon="1 day" would use the last 365 days to predict the next day: a rolling window of 365 days.

The problem is that I want to use all the data in the past, not just the last 365 days, so initial would first be 730, then 731, then 732, etc. According to the docstring for initial: “The first training period will include at least this much data. If not provided, 3 * horizon is used.” That seems to mean it would use 3 days for forecasting if initial is not set?

The question is: would the following code


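import pandas as pd
from prophet import Prophet  # "fbprophet" in releases before 1.0
from prophet.diagnostics import cross_validation

# X_train = data from 2017 and 2018
# X_test = data from 2019
model = Prophet()
model.fit(pd.concat((X_train, X_test)))

# One cutoff per 2019 date, except the last one
cutoffs = list(pd.to_datetime(X_test["ds"].values))
df_cv = cross_validation(model, cutoffs=cutoffs[:-1], horizon="1 day")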

provide df_cv with:

  1. One-step prediction for 2019 using all data in the past?
  2. One-step prediction for 2019 using just the last 3 days?
  3. Something else

Thanks a bunch for an awesome package!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

Jakobhenningjensen commented, Sep 18, 2020 (1 reaction)

Fantastic! It all makes sense now.

Thanks a bunch!

bletham commented, Sep 16, 2020 (0 reactions)

Correct on both points! Essentially, cutoffs = initial + i * period, where i increments from 0 up to however large it can get before less than horizon of data is left.

(For the sake of exactness: cutoffs are actually computed backwards in time. The first cutoff is placed at end - horizon, and subsequent cutoffs are placed at end - horizon - i * period, where i increments until we reach initial. But the principle is the same: a cutoff is placed every period between initial and end - horizon.)
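
The backwards placement described above can be sketched in plain pandas. This is a simplified illustration, not Prophet's actual implementation: the helper names sketch_cutoffs and training_sizes are hypothetical, and the sketch assumes that each fold trains on every row with ds at or before its cutoff (an expanding window). The dates mirror the daily 2017-2019 data from the question.

```python
import pandas as pd


def sketch_cutoffs(ds, horizon, initial, period):
    """Place cutoffs backwards from end - horizon, one every `period`,
    stopping once less than `initial` of history would remain.

    Hypothetical helper: a simplified reading of the comment above.
    """
    ds = pd.Series(pd.to_datetime(ds))
    start, end = ds.min(), ds.max()
    cutoffs = []
    cutoff = end - horizon
    while cutoff >= start + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


def training_sizes(ds, cutoffs):
    """Count rows with ds <= cutoff for each cutoff.

    Assumption: every fold uses all history up to its cutoff.
    """
    ds = pd.Series(pd.to_datetime(ds))
    return [int((ds <= c).sum()) for c in cutoffs]


# Daily dates covering 2017 through 2019, as in the question.
ds = pd.date_range("2017-01-01", "2019-12-31", freq="D")

cutoffs = sketch_cutoffs(
    ds,
    horizon=pd.Timedelta("1 day"),
    initial=pd.Timedelta("730 days"),
    period=pd.Timedelta("1 day"),
)
sizes = training_sizes(ds, cutoffs)

print("first cutoff:", cutoffs[0])
print("last cutoff:", cutoffs[-1])
print("number of folds:", len(cutoffs))
print("training rows in the first three folds:", sizes[:3])
```

Under these assumptions the training set grows by one day per fold, so initial only bounds how early the first cutoff may fall; it does not cap the window length.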
