
Difference between R and Python implementation of auto.arima


Question: The documentation states:

pmdarima is designed to behave as similarly to R’s well-known auto.arima as possible

Could someone explain what the differences are?

  1. I tried to run a simple ARIMA (the order is not important at the moment) on my dataset (1,000 data points), something like this:
import pmdarima as pm
arima = pm.ARIMA(order=(0,1,1))
arima_fit = arima.fit(ts_data)

And it’s pretty fast.

  2. When I use the same dataset with the auto_arima function (e.g. pm.auto_arima(ts_data)), it takes noticeably longer (measured with timeit; a reproduction sketch with a public dataset follows this list):

1.07 s ± 53.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The R implementation of auto.arima is roughly 10 times faster. What’s the reason? Is there a way to improve that?

  3. When I take the same dataset and use the R and Python implementations of auto ARIMA, I get different results (depending on the data). The default parameters seem to be the same. What’s the reason for that?
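
Since the original ts_data isn’t shared, here is a rough sketch of how the two timings above could be reproduced on a public dataset (wineind ships with pmdarima); the absolute numbers will of course differ from those quoted:

import timeit
import pmdarima as pm

# Stand-in for the unavailable ts_data: the wineind dataset bundled with pmdarima
y = pm.datasets.load_wineind()

# Single fixed-order fit (fast)
fixed = timeit.timeit(lambda: pm.ARIMA(order=(0, 1, 1)).fit(y), number=1)

# Full stepwise search (slower: it fits many candidate models)
searched = timeit.timeit(lambda: pm.auto_arima(y, suppress_warnings=True), number=1)

print(f"fixed-order fit: {fixed:.2f}s, auto_arima search: {searched:.2f}s")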

Versions

  • Windows-10-10.0.17134-SP0
  • Python 3.6.8 |Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)]
  • pmdarima 1.2.1
  • NumPy 1.16.4
  • SciPy 1.2.1
  • Scikit-Learn 0.20.2
  • Statsmodels 0.9.0

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

3 reactions
tgsmith61591 commented, Jul 18, 2019

Doing a side-by-side on wineind:

Python:

As stated, this does take a while. Timings for each model are shown via trace=True:

# For exact replicability, you can run this in a docker image:
# $ docker run --rm -it tgsmith61591/pmdarima:1.2.1
import pmdarima as pm

y = pm.datasets.load_wineind()
model = pm.auto_arima(y, seasonal=True, m=12, suppress_warnings=True, trace=True)
# Fit ARIMA: order=(2, 1, 2) seasonal_order=(1, 1, 1, 12); AIC=3069.879, BIC=3094.629, Fit time=1.109 seconds
# Fit ARIMA: order=(0, 1, 0) seasonal_order=(0, 1, 0, 12); AIC=3133.376, BIC=3139.564, Fit time=0.046 seconds
# Fit ARIMA: order=(1, 1, 0) seasonal_order=(1, 1, 0, 12); AIC=3099.734, BIC=3112.109, Fit time=0.289 seconds
# Fit ARIMA: order=(0, 1, 1) seasonal_order=(0, 1, 1, 12); AIC=3066.930, BIC=3079.305, Fit time=0.262 seconds
# Fit ARIMA: order=(0, 1, 1) seasonal_order=(1, 1, 1, 12); AIC=3067.842, BIC=3083.311, Fit time=0.931 seconds
# Fit ARIMA: order=(0, 1, 1) seasonal_order=(0, 1, 0, 12); AIC=3090.512, BIC=3099.793, Fit time=0.051 seconds
# Fit ARIMA: order=(0, 1, 1) seasonal_order=(0, 1, 2, 12); AIC=3068.080, BIC=3083.549, Fit time=0.702 seconds
# Fit ARIMA: order=(0, 1, 1) seasonal_order=(1, 1, 2, 12); AIC=3067.847, BIC=3086.410, Fit time=5.332 seconds
# Fit ARIMA: order=(1, 1, 1) seasonal_order=(0, 1, 1, 12); AIC=3066.760, BIC=3082.229, Fit time=0.851 seconds
# Fit ARIMA: order=(1, 1, 0) seasonal_order=(0, 1, 1, 12); AIC=3094.571, BIC=3106.946, Fit time=0.225 seconds
# Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=3066.742, BIC=3085.305, Fit time=0.621 seconds
# Fit ARIMA: order=(2, 1, 3) seasonal_order=(0, 1, 1, 12); AIC=3070.634, BIC=3095.384, Fit time=1.760 seconds
# Fit ARIMA: order=(1, 1, 2) seasonal_order=(1, 1, 1, 12); AIC=3067.487, BIC=3089.143, Fit time=1.790 seconds
# Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 0, 12); AIC=3090.957, BIC=3106.426, Fit time=0.178 seconds
# Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 2, 12); AIC=3067.731, BIC=3089.387, Fit time=1.667 seconds
# Fit ARIMA: order=(1, 1, 2) seasonal_order=(1, 1, 2, 12); AIC=3068.627, BIC=3093.377, Fit time=4.410 seconds
# Fit ARIMA: order=(0, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=3066.787, BIC=3082.256, Fit time=0.334 seconds
# Fit ARIMA: order=(2, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=3068.673, BIC=3090.330, Fit time=0.818 seconds
# Fit ARIMA: order=(1, 1, 3) seasonal_order=(0, 1, 1, 12); AIC=3068.810, BIC=3090.467, Fit time=0.876 seconds
# Total fit time: 22.316 seconds

R:

Returns almost immediately:

> library(forecast)
> auto.arima(wineind, seasonal=TRUE)
Series: wineind
ARIMA(1,1,2)(0,1,1)[12]

Coefficients:
         ar1      ma1     ma2     sma1
      0.4299  -1.4673  0.5339  -0.6600
s.e.  0.2984   0.2658  0.2340   0.0799

sigma^2 estimated as 5399312:  log likelihood=-1497.05
AIC=3004.1   AICc=3004.48   BIC=3019.57

Explanation and root cause analysis

First of all, when you cross the language border, there is not always such a thing as a 1-to-1 rewrite, so that is the core challenge here. Under the hood, we are using statsmodels to fit our ARMA, ARIMA and SARIMAX models, and what we’re providing is the optimization code, tests of seasonality and stationarity, etc.

Just to make sure it isn’t a bottleneck on our side, I ran an experiment on the slowest of the models shown above:

from statsmodels import api as sm
import time


def timed(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        res = func(*args, **kwargs)
        print(f"complete in {time.time() - start:.4f} seconds")
        return res
    return wrapper


@timed
def create_model():
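    # `y` is the wineind series loaded in the pmdarima example above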
    sm_sarimax = sm.tsa.statespace.SARIMAX(
        endog=y, exog=None, order=(0, 1, 1),
        seasonal_order=(1, 1, 2, 12), trend='c',
        enforce_stationarity=True)
    res_wrapper = sm_sarimax.fit(
        start_params=None, trend='c',
        method='lbfgs', transparams=True,
        solver='lbfgs', maxiter=50, disp=0)
    return res_wrapper

create_model()
# complete in 5.2814 seconds
# <statsmodels.tsa.statespace.sarimax.SARIMAXResultsWrapper object at 0x7f33cefd2a50>

When you dig deeper, R’s ARIMA code is almost 100% C, and statsmodels’ is almost 100% Python. While that will account for a lot of it, I’m not one to be swayed by that fact alone… my opinion is that there is surely something going on in the SARIMAX class that is causing this to drag, and I don’t have a perfect explanation for you right now.

RE: AIC, BIC, etc., my guess is that statsmodels computes those values in a slightly different fashion than R does. I wish I had a better answer for you… Now, it could totally be that the manner in which we’re using the SARIMAX interface (or one of the default options we’re passing) is causing a slow-down, and that’s something I’m not really certain about. But I do think it warrants a deep dive by someone on our side to see if it’s anything we have control over.
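
For what it’s worth, the fitted statsmodels result exposes the same quantities R prints (log-likelihood, AIC, BIC), so one way to gauge how far apart the two computations are is to read them directly off a fit with matching orders. A minimal sketch, reusing the create_model() helper above:

res = create_model()
# Compare these against the values R prints for the same (order, seasonal_order)
print(res.llf, res.aic, res.bic)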

Thank you for bringing this up; hopefully we can get a satisfactory explanation, if not solution, to you in a reasonable time.

0 reactions
tgsmith61591 commented, Nov 25, 2019

It seems to me there is no problem with the different AIC results, but the algorithm has to differ in something. Do you know more about that?

There was a time when R added more rules to its stepwise search algorithm, so the two diverged. The upcoming 1.5.0 release will reconcile these differences. However, one reason the same algorithm may not even search the same orders is the AIC of the intermediate results: the stepwise search refines itself based on the AIC, and since R and Python compute it in subtly different ways, the same progression of models may not yield the same refinement, if that makes sense.
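
One quick way to check whether the divergence comes from the stepwise path itself (rather than from the candidate models) is to disable the stepwise search and let pmdarima fit the full grid; if the exhaustive search lands on the same model R picks, the difference lies in the AIC-driven refinement. A sketch, again on wineind (the exhaustive search is much slower):

import pmdarima as pm

y = pm.datasets.load_wineind()

# Stepwise search (the default, mirroring R's auto.arima)
stepwise_model = pm.auto_arima(y, seasonal=True, m=12, stepwise=True,
                               suppress_warnings=True)

# Exhaustive grid search over the same bounds
grid_model = pm.auto_arima(y, seasonal=True, m=12, stepwise=False,
                           suppress_warnings=True)

print(stepwise_model.order, stepwise_model.seasonal_order)
print(grid_model.order, grid_model.seasonal_order)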

Another thing I’ve noticed recently is that R is faster because it approximates more liberally:

> auto.arima
function (y, d = NA, D = NA, max.p = 5, max.q = 5, max.P = 2,
    max.Q = 2, max.order = 5, max.d = 2, max.D = 1, start.p = 2,
    start.q = 2, start.P = 1, start.Q = 1, stationary = FALSE,
    seasonal = TRUE, ic = c("aicc", "aic", "bic"), stepwise = TRUE,
    trace = FALSE, approximation = (length(x) > 150 | frequency(x) >
        12), truncate = NULL, xreg = NULL, test = c("kpss", "adf",
        "pp"), seasonal.test = c("seas", "ocsb", "hegy", "ch"),
    allowdrift = TRUE, allowmean = TRUE, lambda = NULL, biasadj = FALSE,
    parallel = FALSE, num.cores = 2, x = y, ...)
{
    ...
}

Notice approximation = (length(x) > 150 | frequency(x) > 12). The Python version does not approximate unless you explicitly pass simple_differencing=True, and even then the results still differ slightly. That has to do with the number of iterations run in the optimizer: R’s BFGS optimization defaults to 100 iterations (run optim in R to see the function), while pmdarima’s ARIMA defaults to 50. If you set simple_differencing=True, you might also crank up the maxiter param to refine it further.
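
To make those two knobs concrete, here is a small sketch using statsmodels directly (the library pmdarima wraps), since how simple_differencing is forwarded through pmdarima itself depends on the version. simple_differencing is a SARIMAX constructor flag and maxiter caps the optimizer iterations; the order used below is the one R selected above and is purely illustrative:

from statsmodels import api as sm
import pmdarima as pm

y = pm.datasets.load_wineind()

# simple_differencing=True differences the data up front (an approximation,
# closer in spirit to R's approximation=TRUE) instead of modeling the
# differencing inside the state space representation
model = sm.tsa.statespace.SARIMAX(
    y, order=(1, 1, 2), seasonal_order=(0, 1, 1, 12),
    simple_differencing=True)

# Raise maxiter from the 50 pmdarima uses toward R's 100 BFGS iterations
res = model.fit(method='lbfgs', maxiter=100, disp=0)
print(res.aic)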
