Bug in LASSO AIC/BIC formula
Hi, I think that the calculation of sigma in this line is wrong:
According to Zou et al. (2007), Eq. 2.12, the value of sigma is sigma_ols; it should be calculated from the OLS residuals R as:
sigma2 = np.var(R)
I hope this is useful.
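To make the suggestion concrete, here is a minimal sketch of estimating the noise variance from OLS residuals, in the spirit of Zou et al. (2007), Eq. 2.12. The helper name `ols_noise_variance` and the synthetic data are illustrative assumptions, not code from the library under discussion; the sketch assumes `X` has full column rank and `n_samples > n_features`.

```python
import numpy as np

def ols_noise_variance(X, y):
    """Estimate the noise variance as the variance of the OLS
    residuals (sigma2 = np.var(R), as proposed in the issue).

    Illustrative sketch; assumes X has full column rank and
    n_samples > n_features.
    """
    # Ordinary least squares fit via numpy's least-squares solver
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    # np.var divides by n_samples (the biased / ML estimate)
    return np.var(residuals)

# Synthetic example: true noise standard deviation is 0.5
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.0, -2.0]) + rng.normal(scale=0.5, size=50)
sigma2 = ols_noise_variance(X, y)
```

Note that `np.var` divides by `n_samples`, so this is the maximum-likelihood (biased) estimate; the unbiased variant is discussed further down in the thread.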
Issue Analytics
- Created: 3 years ago
- Comments: 9 (7 by maintainers)
Top GitHub Comments
So, looking a bit more at the literature, I found the following:
https://www.sciencedirect.com/science/article/abs/pii/S0893965917301623
In short, it confirms the section “Compare with least squares” from the Wikipedia page:
https://en.wikipedia.org/wiki/Akaike_information_criterion
We can use a surrogate of the AIC by discarding the constant term and computing only the remaining data-dependent terms.
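The exact surrogate expression was not reproduced in the issue text, so the form below is an assumption: one common surrogate for Gaussian errors drops the additive constant `n * (log(2*pi) + 1)` and keeps `n * log(RSS / n) + 2 * df`. A minimal sketch:

```python
import numpy as np

def aic_surrogate(rss, n_samples, df):
    """One common AIC surrogate for Gaussian errors, dropping the
    additive constant n * (log(2*pi) + 1):

        AIC ~ n * log(RSS / n) + 2 * df

    This particular form is an assumption for illustration; the
    exact expression from the original comment was not included
    in the issue text.
    """
    return n_samples * np.log(rss / n_samples) + 2 * df

# Lower is better: a slightly smaller RSS does not justify
# seven extra degrees of freedom in this hypothetical comparison.
aic_small = aic_surrogate(rss=10.0, n_samples=100, df=3)
aic_big = aic_surrogate(rss=9.9, n_samples=100, df=10)
```

Since the constant dropped is the same for every model on the same data, model rankings are unchanged.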
After some more reading and math sketching, AIC and Cp are not equal. It gets more confusing because Cp can be defined in different ways (cf. https://en.wikipedia.org/wiki/Mallows's_Cp#cite_note-4). The relationship between Cp and AIC as defined in ESL is shown here: https://stats.stackexchange.com/a/492385/121348
So we can at least better document which definition of AIC we are using. In addition, we have a bug in the estimator of the noise variance of the OLS. If we want an unbiased estimator, one possible choice is
RSS / (n_samples - n_features)
. However, it does not seem to be well defined for n_features > n_samples, which is problematic. We probably need to look into this a bit more. In addition, for the OLS estimator, @ogrisel was proposing to fit a ridge with a very low penalty to get a stable result even with collinearity. In this case, we control the OLS model and ensure that we always apply the same penalty, which is not the case with the last linear predictor in the LARS path right now.
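The two ideas above can be sketched together: the unbiased estimator `RSS / (n_samples - n_features)`, fitted through a ridge with a very small penalty for numerical stability under collinearity. The function name, the `alpha` value, and the guard for `n_samples <= n_features` are illustrative assumptions, not the library's actual implementation:

```python
import numpy as np

def noise_variance_unbiased(X, y, alpha=1e-8):
    """Unbiased noise-variance estimate RSS / (n_samples - n_features),
    using a ridge fit with a very small penalty for numerical stability
    under collinearity (the proposal discussed in the issue). The alpha
    value is an arbitrary illustrative choice.
    """
    n_samples, n_features = X.shape
    if n_samples <= n_features:
        # The estimator is not defined here; this is the open
        # problem discussed in the issue.
        raise ValueError("need n_samples > n_features")
    # Closed-form ridge solution: (X^T X + alpha * I)^{-1} X^T y
    coef = np.linalg.solve(
        X.T @ X + alpha * np.eye(n_features), X.T @ y
    )
    rss = np.sum((y - X @ coef) ** 2)
    # Divide by the residual degrees of freedom, not n_samples
    return rss / (n_samples - n_features)

# Synthetic example: true noise variance is 0.25
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=200)
sigma2_hat = noise_variance_unbiased(X, y)
```

The tiny `alpha` leaves the solution essentially equal to OLS on well-conditioned problems, while keeping `X.T @ X + alpha * I` invertible when columns of `X` are collinear.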