Question/Error for add_new_observation for ARIMA one-step ahead prediction

Description

I want to make one-step-ahead predictions for the test samples. My application is weather forecasting: I predict only one day ahead, and once that day's temperature is known, I use it to predict the next day. However, with the code below, the prediction is always the same and does not seem to depend on the new observations I feed into the model at all.

Steps/Code to Reproduce

import numpy as np
import matplotlib.pyplot as plt
from pmdarima.arima import auto_arima


x=np.array([-1.02676385, -0.46451953, -1.01906986, -0.31565596, -0.20805649,
       -0.12734656, -0.05142027,  0.27602999, -0.10876694, -0.08089352,
       -0.30637015, -0.6179563 , -0.93839935, -0.60254471, -0.28821944,
       -0.23155955,  0.61213451,  0.66176357,  0.43377408,  1.14219008,
        1.14310312,  0.76193474, -0.19938661, -0.6360834 , -0.69639544,
       -0.48678292, -0.00564601, -0.52564729, -0.68984075, -0.79059337,
       -0.24680773, -0.18874286, -0.08292224, -0.27194204, -0.87429535,
       -0.7811314 , -0.48947481, -0.84639831, -0.55538127, -0.31155128,
       -0.60231845, -0.69728486, -0.66874828, -0.13013291, -0.02630958,
       -0.09312909, -0.15161505, -0.57901022, -0.21259026,  0.14004154,
       -0.09863094, -0.12033935,  0.12694016, -0.58078907, -0.52504222,
       -0.30477033, -0.41261187, -0.19480561,  0.11248883, -0.0658716 ,
       -0.208641  , -0.50981766, -0.50981766, -0.50981766, -0.14157162,
       -0.17896511, -0.06271138,  0.16935137, -0.13398295, -0.14928922,
       -0.32424737, -0.38455942, -0.2444819 , -0.43076258, -0.38022048,
       -0.29660021, -0.77816602, -0.86361237, -0.57311869,  0.18337378,
       -0.27191842, -0.32583919, -0.65511553, -0.42710242, -0.78900441,
       -0.50006333,  0.4843478 , -0.38859325, -0.4729003 , -0.47381334,
       -0.24651062, -0.63039451, -0.61135438,  0.28877331,  0.46982563,
        1.26704303,  0.38615813,  0.19287024, -0.48269386,  0.08909415,
        0.03608642, -0.03349582,  0.10221914,  0.13347123,  0.01359677,
        0.38707116,  0.6125478 ,  0.63884502,  0.32992714,  0.19692768,
        0.43804217,  0.45276698,  0.76255067,  0.58421385,  0.47545928,
        0.44484674, -0.04443668,  0.4088268 ,  0.34178104,  0.44106278,
        0.12153277, -0.33464886,  0.1108734 ,  0.12344055,  0.3370132 ,
        1.25188932,  0.20399012,  0.49133139,  1.00821474,  1.07622077,
        0.76552405,  0.67327313,  0.92915967,  0.28080583,  0.26697044,
        0.00498974, -0.02468615,  1.15194441,  0.46459727, -0.21888708,
       -0.28902716, -0.79519305, -0.04915729,  0.22846127,  0.13714701,
       -0.12198926, -0.12439679, -0.59961856, -0.52391853, -0.60662577,
       -0.84436159, -0.42034509, -0.33332253, -0.44054814, -0.21842656,
       -0.27326036, -0.21568744, -0.44456635, -0.34497665, -0.43869844,
       -0.55575208, -0.322     , -0.30387291,  0.00253211, -0.17671774,
        0.31752054,  0.19005741,  2.02583011, -0.6233793 , -0.52897073,
       -0.70268425, -0.06631611,  0.22104877,  0.74429978,  1.0742629 ,
        0.91245336,  0.18637078, -0.29702111,  0.48209282,  0.16708077,
        0.27766142, -0.1549621 ,  0.24547268,  0.38032183,  0.73815838,
        0.55743764, -0.0565934 , -0.06916056,  0.22104877,  0.79804152,
        0.0232193 ,  0.72800676,  0.03614165,  0.18028461,  0.41959663,
        1.01616375, -0.56666418, -0.13008053, -0.47748397,  0.00964179,
        0.31853605,  0.70401976,  1.55858383,  0.42388834,  0.78512717,
        1.38930656,  0.67533061,  0.18932054, -0.658131  , -0.9702986 ,
       -0.1191132 , -0.31331413, -0.36906097, -0.70275796, -1.03259214,
       -0.68156016,  0.63078536, -0.59204836, -0.80869171,  0.35479024,
        0.49958836,  1.20211284,  0.66682891,  0.34367035,  0.30807932,
       -0.28635374, -0.06427939, -0.46288809, -1.13480003, -0.92676371,
       -0.98522606, -0.56732735, -0.46466694,  0.21064728,  0.21064728,
       -0.17212093, -0.21471917, -0.35748857,  0.22288285, -0.34276375,
       -0.77787652, -1.21933334, -1.04188595, -0.06107975, -0.08920305,
        0.63418764,  0.725065  ,  0.63103524,  0.62829612,  0.2605026 ,
        0.29949591, -0.52800246, -0.29696588, -0.33798791,  0.66820247,
        1.2270445 ,  0.53426635,  0.2824925 ,  0.82785434,  1.13819587,
        0.24598843,  0.75974299,  0.45176177,  0.18194252,  0.45151189,
        0.23827083,  0.4630843 ,  0.45176177,  0.24441223,  0.41505505,
        0.6992283 , -0.03706113, -0.43115187, -0.39896312,  0.48905794,
       -0.36926362,  0.33143478, -0.2360379 , -0.21857397,  0.21222348,
        0.09981675,  0.40439569,  0.90769352,  0.31330769, -0.01305051,
        0.17323017,  0.57208875,  0.22917966,  0.10204811, -0.14429474,
        0.83607266, -0.38515935, -0.41485885, -0.359112  ,  0.29898816,
       -0.27549173, -0.07788852,  0.15621876,  0.21196561,  0.38352147])

x_test = np.array([ 1.00059960e+00,  5.33298082e-01,  1.50990396e-01, -7.59344140e-01,
       -2.53178251e-01, -2.45589584e-01,  2.83295821e-02, -5.80070671e-01,
       -7.68177429e-01, -9.19116954e-01, -3.69568725e-01, -3.90434940e-01,
        9.27070739e-02, -1.62219188e-01, -1.28204363e-01, -4.57917608e-01,
       -6.87959440e-01, -4.11004048e-01, -6.39219801e-01, -6.57346897e-01,
       -5.16403574e-01, -3.57083280e-01, -2.28125652e-01, -1.82125132e-01,
       -4.80609902e-01, -3.93337472e-01, -6.10230694e-01, -2.95905426e-01,
       -1.26833086e+00, -1.26127642e+00, -1.25604805e+00, -1.54001065e+00,
       -1.24766728e+00, -1.18669208e+00, -1.13526055e+00, -1.21187361e+00,
       -1.01086812e+00, -6.58923101e-01, -5.83958656e-02,  2.97423720e-02,
        2.80603187e-01, -6.07296960e-02, -1.59074781e-01, -4.75810859e-02,
       -1.12824386e-01, -1.09172228e-01, -8.40483170e-01, -5.82817790e-01,
       -4.57077758e-02,  2.36428746e-01, -1.24810081e-01, -1.11661471e-01,
       -2.40619099e-01,  2.58795692e-02,  2.79229627e-01, -2.15007070e-04,
       -9.07349905e-01, -1.28783151e+00, -9.80763329e-01, -8.09207464e-01,
       -2.80728093e-01,  2.58795692e-02, -9.33317310e-02, -8.21240390e-01,
       -8.29823803e-01])

# Fit the model on the training data x
arima = auto_arima(x, start_p=1, start_q=1, max_p=5, max_q=5,
                   stepwise=True, suppress_warnings=True,
                   error_action='ignore', trace=True)

# One-step-ahead prediction for x_test
preds = np.zeros(len(x_test))
conf_int = np.zeros((len(x_test), 2))
# Predict the next day's temperature (horizon = 1), then add the actual
# temperature for that day and continue
for i in range(len(x_test)):
    new_preds, new_conf_int = arima.predict(n_periods=1, return_conf_int=True)
    arima.add_new_observations([x_test[i]])
    preds[i] = new_preds
    conf_int[i] = new_conf_int

Actual Results

Also, when I check arima.summary() after calling add_new_observations, the number of samples does not change.

Versions

Windows-10-10.0.17134-SP0
Python 3.6.7 |Anaconda custom (64-bit)| (default, Dec 10 2018, 20:35:02) [MSC v.1915 64 bit (AMD64)]
NumPy 1.15.4
SciPy 1.2.1
Scikit-Learn 0.20.3
Pandas 0.24.2
Statsmodels 0.9.0
Pmdarima 1.1.0

Another way of doing this

In case add_new_observations does not store the values in the ARIMA model, I also tried passing all the data up to the day I want to predict on every iteration, but the predictions are still all the same:

for i in range(len(x_test)):
    if i == 0:
        new_preds, new_conf_int = arima.predict(n_periods=1, return_conf_int=True)
        preds[i] = new_preds
        conf_int[i] = new_conf_int
    else:
        arima.add_new_observations(x_test[:i])
        new_preds, new_conf_int = arima.predict(n_periods=1, return_conf_int=True)
        preds[i] = new_preds
        conf_int[i] = new_conf_int

Am I using it incorrectly? Or does the function simply not support one-step-ahead prediction with new real measurements fed in?
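
One thing worth checking in the second loop, assuming add_new_observations appends whatever it receives to the series stored in the model (the thread does not spell this out): passing x_test[:i] on every iteration re-adds the earlier test values each time, so they pile up as duplicates. The sketch below uses plain lists and a hypothetical helper, simulate_appends, to count how many values each strategy would append over 65 test steps. It does not call pmdarima.

```python
def simulate_appends(n_test):
    """Count values appended over n_test steps (hypothetical helper)."""
    x_test = list(range(n_test))  # placeholder values, only the counts matter
    cumulative = []   # mirrors add_new_observations(x_test[:i]) at each step
    incremental = []  # mirrors appending only the newest actual value
    for i in range(n_test):
        if i > 0:
            cumulative.extend(x_test[:i])      # re-adds everything seen so far
            incremental.append(x_test[i - 1])  # adds one new value
    return len(cumulative), len(incremental)


cumulative_count, incremental_count = simulate_appends(65)
print(cumulative_count, incremental_count)
```

Under that assumption, the cumulative variant would hand the model far more points than the test set contains, which is a reason to append only the single newest observation per step.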

Add comment:

If I set seasonal=False in

arima = auto_arima(x,start_p=1, start_q=1, max_p=5, max_q=5, 
                  stepwise=True, suppress_warnings=True,
                  error_action='ignore',seasonal=False, trace=True)

then my one-step-ahead predictions seem more reasonable. However,

in_sample_preds = arima.predict_in_sample()

returns only 299 in-sample fitted values, even though my training data has 300 points. And when I plot them, the fitted values look far off from the training data.

I assumed seasonal=False would not affect the results, but it turns out it does, and it even causes trouble for my fitted values.

Top GitHub Comments

tgsmith61591 commented, Mar 26, 2019

#109 should have fixed this. I’ll be releasing a v1.1.1 patch soon.

Using all of your setup code, and the following (slightly amended):

for i in range(len(x_test)):
      new_preds, new_conf_int = arima.predict(n_periods=1, return_conf_int=True)
      arima.update([x_test[i]])
      preds[i] = new_preds
      conf_int[i] = new_conf_int

Here’s the output:

>>> arima.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
                           Statespace Model Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  301
Model:               SARIMAX(1, 1, 1)   Log Likelihood                -170.729
Date:                Tue, 26 Mar 2019   AIC                            349.457
Time:                        17:22:54   BIC                            364.272
Sample:                             0   HQIC                           355.386
                                - 301
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept     -0.0001      0.000     -0.294      0.769      -0.001       0.001
ar.L1          0.6425      0.039     16.451      0.000       0.566       0.719
ma.L1         -0.9992      0.097    -10.272      0.000      -1.190      -0.809
sigma2         0.1834      0.018     10.037      0.000       0.148       0.219
===================================================================================
Ljung-Box (Q):                       48.47   Jarque-Bera (JB):                73.67
Prob(Q):                              0.17   Prob(JB):                         0.00
Heteroskedasticity (H):               1.57   Skew:                             0.17
Prob(H) (two-sided):                  0.03   Kurtosis:                         5.40
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
"""

Notice the summary includes the new number of samples. With respect to the second portion of your issue, I don’t really understand how seasonal=False is making your code not work… but predict_in_sample does appear to return 301 samples, as expected:

>>> arima.predict_in_sample().shape
(301,)
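
A possible reading of the 299-versus-300 mismatch reported in the question, offered as an assumption that this thread does not confirm: if the selected model uses d=1, as the SARIMAX(1, 1, 1) in the summary above does, the series is first-differenced, and differencing n points leaves n-1 values. If in-sample predictions were reported on that differenced scale, they would also look far off when plotted against the raw training data. The helper name first_difference and the sample values are illustrative only.

```python
def first_difference(series):
    """Return consecutive differences; the result is one element shorter."""
    return [b - a for a, b in zip(series[:-1], series[1:])]


levels = [10.0, 10.5, 10.2, 10.9, 11.3]  # made-up values on the original scale
diffs = first_difference(levels)          # values on the differenced scale

print(len(levels), len(diffs))
print(diffs)
```

The differenced values hover around zero while the original values sit near 10, which is the kind of scale gap that would make a plot of one against the other look wrong.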

Finally, the preds are not the same any longer:

>>> preds
array([ 0.31408198,  0.667704  ,  0.40843767,  0.193331  , -0.32219308,
       -0.04226993, -0.04039324,  0.11236622, -0.23331992, -0.34539467,
       -0.44096565, -0.13397046, -0.14929239,  0.1252347 , -0.02143114,
       -0.00783168, -0.19421307, -0.33560071, -0.17650232, -0.31110942,
       -0.32605371, -0.24961845, -0.16175163, -0.08972028, -0.06505728,
       -0.23790608, -0.18939635, -0.31820696, -0.14191564, -0.70241455,
       -0.71036907, -0.77560899, -0.96036023, -0.78606551, -0.75458671,
       -0.72870007, -0.78247309, -0.66323714, -0.44838843, -0.07476588,
       -0.02056666,  0.13371806, -0.07343361, -0.13648112, -0.06474216,
       -0.10642029, -0.10411575, -0.57823669, -0.41552317, -0.06970567,
        0.11579691, -0.1189743 , -0.11050851, -0.19460383, -0.02099652,
        0.1468136 , -0.03553684, -0.63006517, -0.88352281, -0.68709393,
       -0.57913402, -0.23512272, -0.03450596, -0.11233004, -0.58979941])
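
The loop above follows a walk-forward pattern: forecast one step, then give the model the actual value before forecasting again. As a minimal sketch of that pattern without pmdarima, the block below uses a hypothetical stand-in class, TinyAR1, that fits a single AR(1) coefficient by least squares. The class, its method names, and the sample data are assumptions made for illustration and are not part of any library.

```python
class TinyAR1:
    """Hypothetical stand-in model (not pmdarima): y[t] ~ phi * y[t-1]."""

    def __init__(self, history):
        self.history = list(history)
        self._fit()

    def _fit(self):
        # Least-squares slope through the origin over lagged pairs.
        y = self.history
        num = sum(a * b for a, b in zip(y[:-1], y[1:]))
        den = sum(a * a for a in y[:-1])
        self.phi = num / den if den else 0.0

    def predict_next(self):
        # One-step-ahead forecast from the latest stored observation.
        return self.phi * self.history[-1]

    def update(self, new_values):
        # Append newly observed values, then refit on the longer history.
        self.history.extend(new_values)
        self._fit()


def walk_forward(model, test_values):
    """Forecast each test point before the model sees its actual value."""
    preds = []
    for actual in test_values:
        preds.append(model.predict_next())
        model.update([actual])
    return preds


train = [0.5, 0.4, 0.1, -0.2, -0.3, -0.1, 0.2, 0.4]  # made-up data
test = [0.6, 0.3, -0.1, -0.4]                        # made-up data

model = TinyAR1(train)
n_train = len(model.history)
preds = walk_forward(model, test)
print(preds)
print(len(model.history) - n_train)
```

Two quick checks mirror what the summary and preds output show in this comment: the stored history should grow by one per step, and the forecasts should vary from step to step rather than repeating a single value.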

tgsmith61591 commented, Jul 17, 2019

Please post a reproducible code block, or there's not much I can do to help you.