[Question] LOO-CV with CRPS
Hi,
Is it true that each element of `mu` and `sigma2` in leave_one_out_pseudo_likelihood.py represents the mean and variance that the posterior distribution would have had if the corresponding data point had been left out?
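For reference, the closed-form leave-one-out predictive moments (Rasmussen & Williams, GPML, Eq. 5.12) are mu_i = y_i - [K^-1 y]_i / [K^-1]_ii and sigma2_i = 1 / [K^-1]_ii. A minimal torch sketch of that identity, assuming `K_hat` is the full training covariance including observation noise (all names here are illustrative, not GPyTorch internals):

```python
import torch


def loo_moments(K_hat: torch.Tensor, y: torch.Tensor):
    """Closed-form LOO predictive moments (GPML, Eq. 5.12)."""
    identity = torch.eye(K_hat.shape[-1], dtype=K_hat.dtype, device=K_hat.device)
    K_inv = torch.cholesky_solve(identity, torch.linalg.cholesky(K_hat))
    sigma2 = 1.0 / K_inv.diagonal()  # sigma2_i = 1 / [K^-1]_ii
    mu = y - (K_inv @ y) * sigma2    # mu_i = y_i - [K^-1 y]_i / [K^-1]_ii
    return mu, sigma2
```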
If so, I am curious about your thoughts on modifying this MLL so that, instead of returning the MLL, it returns the summed (negated) CRPS based on `mu` and `sigma2`, so that it can still be maximized. Per Gneiting & Raftery (2007), Eq. 21, we could replace line 63 onward with:
```python
# Assumes `math` and `torch` are imported at module level, as they already
# are in leave_one_out_pseudo_likelihood.py.
z = (target - mu) / sigma2.sqrt()  # standardized residuals
normal = torch.distributions.Normal(0, 1)
pdf_term = normal.log_prob(z).exp()  # phi(z), standard normal pdf
cdf_term = normal.cdf(z)  # Phi(z), standard normal cdf
# Negated CRPS of N(mu, sigma2) at `target` (Gneiting & Raftery, 2007,
# Eq. 21): maximizing this objective minimizes the CRPS.
crps = sigma2.sqrt() * (
    1.0 / math.sqrt(math.pi)
    - 2 * pdf_term
    - z * (2 * cdf_term - 1)
)
res = crps.sum(dim=-1)
return res
```
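As a quick sanity check (standalone, not part of the patch, and assuming nothing beyond the formula above): at `target == mu`, the CRPS of N(mu, sigma^2) equals sigma * (sqrt(2) - 1) / sqrt(pi), so the negated score should come out to about -0.2337 * sigma:

```python
import math

import torch

mu = torch.zeros(5)
sigma2 = torch.full((5,), 4.0)  # sigma = 2
target = mu.clone()  # evaluate the score at the predictive mean
z = (target - mu) / sigma2.sqrt()
normal = torch.distributions.Normal(0.0, 1.0)
neg_crps = sigma2.sqrt() * (
    1.0 / math.sqrt(math.pi)
    - 2 * normal.log_prob(z).exp()
    - z * (2 * normal.cdf(z) - 1)
)
expected = -sigma2.sqrt() * (math.sqrt(2.0) - 1.0) / math.sqrt(math.pi)
assert torch.allclose(neg_crps, expected)  # approx. -0.4674 for sigma = 2
```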
Any thoughts?
Top GitHub Comments
@dme65 @jacobrgardner OK, so I did an actual leave-one-out cross-validation by training a model 47 times (once per held-out point) on a noisy, 30-dimensional dataset of 47 data points obtained from physical experiments, with each of the exact MLL, the LOOCV pseudo-MLL, and the LOOCV CRPS “mll” as the training objective.
There were four outputs (observables), so I repeated the procedure for each observable. The LOOCV pseudo-MLL did perform a little better than the exact MLL, but the LOOCV CRPS approach was much better: in the figure below there is a clear difference between LOOCV CRPS (solid lines) and LOOCV pseudo-MLL (dashed lines). I did not plot results from the exact MLL runs at all, since they cannot be distinguished from the LOOCV pseudo-MLL results. The blue, orange, and green observables are very noisy, while the red one is expected to be less noisy.
Based on these results, I think we should add the LOOCV CRPS approach to GPyTorch, but I don't know how best to structure it. We could add a `crps` kwarg to the constructor of the LOOCV pseudo-MLL class (defaulting to False), and then, if `self.crps == True`, evaluate the lines above instead of the current ones in the forward call. Another option would be to split the LOOCV pseudo-MLL into an abstract base class from which specific LOOCV pseudo-MLL and LOOCV CRPS “mll” classes would inherit common code; a sketch of a shared scoring helper that either design could call is given below.
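A minimal sketch of that helper, assuming only that the LOO predictive moments `mu` and `sigma2` have already been computed (the function name is hypothetical, not existing GPyTorch API):

```python
import math

import torch


def negative_crps_gaussian(
    mu: torch.Tensor, sigma2: torch.Tensor, target: torch.Tensor
) -> torch.Tensor:
    """Negated CRPS of N(mu, sigma2) at `target` (Gneiting & Raftery, 2007, Eq. 21).

    Negated so that maximizing it, like a log likelihood, minimizes the CRPS.
    """
    sigma = sigma2.sqrt()
    z = (target - mu) / sigma
    normal = torch.distributions.Normal(0.0, 1.0)
    return sigma * (
        1.0 / math.sqrt(math.pi)
        - 2.0 * normal.log_prob(z).exp()
        - z * (2.0 * normal.cdf(z) - 1.0)
    )
```

Both the kwarg branch and a dedicated subclass's forward could then end with `return negative_crps_gaussian(mu, sigma2, target).sum(dim=-1)`.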
Sounds good!
Yes, there are actually two obvious outliers in the dataset: one experiment yielded values of the observables that differed substantially from the rest, and it is no surprise that the model(s) had a hard time predicting these values; this experiment corresponds to the most extreme bumps in each plot. I used a linear mean, and a small difference in the slope could have caused the discrepancy between the LOOCV pseudo-likelihood and LOOCV CRPS (I should rerun the entire thing multiple times just to be sure...). The other outlier, at -10 for red, is just a noisy experiment.