One-step prediction CV using all previous data
I have daily data from 2017 and 2018 (X_train) and 2019 (X_test).
I want to do a one-step-ahead CV prediction for each day in 2019, trained on all the data up to the day before, i.e. first use all training data (2017, 2018) to predict 01-01/2019, then use all training data + 01-01/2019 to predict 02-01/2019, and so on.
As far as I understand, the parameter initial is the training window, i.e. setting initial="365 days" and horizon="1 day" would use the last 365 days to predict the next day (a rolling window of 365 days).
The problem is that I want to use all the data in the past, not just the last 365 days, so initial would first be 730, then 731, then 732, etc.
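To make the desired scheme concrete, here is a minimal sketch of the expanding-window splits described above, using only the standard library. The helper name expanding_window_splits is hypothetical and not part of Prophet; it only enumerates which days would be trained on and which day would be predicted.

```python
from datetime import date, timedelta


def expanding_window_splits(train_start, first_test_day, last_test_day):
    """Yield (train_start, train_end, predict_day) for each one-step forecast.

    The training window always begins at train_start and grows by one day
    per step, so no past data is ever dropped.
    """
    day = first_test_day
    while day <= last_test_day:
        yield train_start, day - timedelta(days=1), day
        day += timedelta(days=1)


splits = list(
    expanding_window_splits(date(2017, 1, 1), date(2019, 1, 1), date(2019, 12, 31))
)

# Show the first three steps: the training window grows 730, 731, 732 days.
for start, end, target in splits[:3]:
    n_train = (end - start).days + 1
    print(f"train on {n_train} days ({start} .. {end}) -> predict {target}")
```

A rolling window would instead move the start date forward by one day at each step, keeping the training size fixed at 365.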
According to the docstring for initial: "The first training period will include at least this much data. If not provided, 3 * horizon is used." That seems to mean it would use 3 days for forecasting if initial is not set?
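The question is: would the following code

import pandas as pd
from prophet import Prophet  # `fbprophet` in releases before 1.0
from prophet.diagnostics import cross_validation

# X_train = data from 2017 and 2018
# X_test = data from 2019
model = Prophet()
model.fit(pd.concat((X_train, X_test)))
# One cutoff per day in 2019; drop the last so a 1-day horizon still fits in the data
cutoffs = list(pd.to_datetime(X_test["ds"].values))
df_cv = cross_validation(model, cutoffs=cutoffs[:-1], horizon="1 day")

provide df_cv with: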
- One-step prediction for 2019 using all data in the past?
- One-step prediction for 2019 using just the last 3 days?
- Something else
Thanks a bunch for an awesome package!
Issue Analytics
- Created 3 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Fantastic! It all makes sense now.
Thanks a bunch!
Correct on both points! Essentially cutoffs = initial + i * period, where i increments from 0 to however large it can get before we have less than horizon data left.
(For the sake of exactness: cutoffs are actually computed backwards in time, where the first cutoff is placed at end - horizon, and then subsequent cutoffs are placed at end - horizon - i * period, where i increments until we reach initial. But the principle is the same: cutoffs are placed once every period between initial and end - horizon.)
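The backwards placement described in this comment can be sketched in a few lines. This is a simplified illustration, not the library's actual implementation: the function name sketch_cutoffs is hypothetical, and it assumes initial is measured from the first date in the data and that there are no gaps in the daily series.

```python
from datetime import date, timedelta


def sketch_cutoffs(start, end, initial, period, horizon):
    """Place cutoffs backwards from end - horizon, one per period.

    Stops once a cutoff would leave less than `initial` of training data
    (assumption: `initial` is counted from `start`, the first date).
    """
    earliest = start + initial
    cutoff = end - horizon
    cutoffs = []
    while cutoff >= earliest:
        cutoffs.append(cutoff)
        cutoff -= period
    return cutoffs[::-1]  # return in chronological order


cutoffs = sketch_cutoffs(
    start=date(2017, 1, 1),
    end=date(2019, 12, 31),
    initial=timedelta(days=730),
    period=timedelta(days=1),
    horizon=timedelta(days=1),
)

# Each cutoff splits the data: train on everything up to and including the
# cutoff, then forecast the days in (cutoff, cutoff + horizon].
for c in cutoffs[:3]:
    print(f"cutoff {c}: train through {c}, predict {c + timedelta(days=1)}")
```

With these daily settings the sketch places one cutoff per day from 2019-01-01 through 2019-12-30, so the first forecast day is 2019-01-02. Predicting 2019-01-01 itself would need a cutoff at 2018-12-31.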