Add method for simulating from the posterior (or just add an example to the documentation)
Estimating the mean and confidence intervals (using prediction_intervals) is great. In some cases, it can be useful to simulate from the posterior distribution of the model’s coefficients. An example is given on pages 242–243 of [1].
I think the following code snippet does the trick for a LinearGAM:
import numpy as np


def simulate_from_posterior(linear_gam, X, n_simulations):
    """Simulate from the posterior of a LinearGAM a certain number of times.

    Inputs
    ------
    linear_gam : pygam.LinearGAM
        A fitted model.
    X : array of shape (n_samples, n_features)
    n_simulations : int
        The number of simulations from the posterior to compute

    Returns
    -------
    simulations : array of shape (n_samples, n_simulations)
    """
    # Draw coefficient vectors from a normal centered on the fitted
    # coefficients, using the estimated coefficient covariance.
    beta_replicates = np.random.multivariate_normal(
        linear_gam.coef_, linear_gam.statistics_['cov'], size=n_simulations)
    # Map every coefficient draw through the model matrix for X.
    return linear_gam._modelmat(X).dot(beta_replicates.T)
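One way such simulations might be used is to summarise them into a pointwise band. The sketch below is only an illustration under stated assumptions: it does not call pyGAM, and `model_matrix`, `coef`, and `cov` are made-up stand-ins for `linear_gam._modelmat(X)`, `linear_gam.coef_`, and `linear_gam.statistics_['cov']`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a fitted model (not pyGAM output).
n_samples, n_coefs, n_simulations = 50, 4, 2000
model_matrix = rng.normal(size=(n_samples, n_coefs))
coef = np.array([1.0, -0.5, 0.25, 2.0])
cov = 0.01 * np.eye(n_coefs)

# Draw coefficient replicates, then map each one to a simulated mean curve.
beta_replicates = rng.multivariate_normal(coef, cov, size=n_simulations)
simulations = model_matrix.dot(beta_replicates.T)  # (n_samples, n_simulations)

# Pointwise 95% band and average across the simulations.
lower, upper = np.percentile(simulations, [2.5, 97.5], axis=1)
posterior_mean = simulations.mean(axis=1)
```

Each column of `simulations` is one simulated mean curve, so any functional of the curve (a maximum, a difference between two points) can be summarised the same way.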
I’m not sure if this should be added as an example in the documentation or added to the code (or both).
To implement this in general, I think we’d want to add a method to each Distribution that draws a given number of samples (called sample or random_variate?), so we’d have NormalDist.sample, BinomialDist.sample, and so on. Then GAM.simulate could just call self.dist.sample(self.coef_, self.statistics_['cov'], size=n_simulations)? I’m not sure yet how best to handle the link functions for these simulations…
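One possible way to handle the link function, sketched here as an assumption and not as pyGAM's design: simulate coefficients, form the linear predictor, apply the inverse link to get the mean, and only then draw from the response distribution. The function `simulate_response` and its arguments are hypothetical names, and the example uses a Poisson response with a log link.

```python
import numpy as np


def simulate_response(model_matrix, coef, cov, inverse_link, sample,
                      n_simulations, rng):
    """Hypothetical helper: simulate responses for a GLM-style model."""
    # 1. Coefficient draws, conditional on the smoothing parameters.
    beta = rng.multivariate_normal(coef, cov, size=n_simulations)
    # 2. Linear predictor for every draw: (n_samples, n_simulations).
    eta = model_matrix @ beta.T
    # 3. Inverse link turns the linear predictor into the mean.
    mu = inverse_link(eta)
    # 4. Draw one response per simulated mean.
    return sample(mu)


rng = np.random.default_rng(0)
model_matrix = np.column_stack([np.ones(30), np.linspace(-1, 1, 30)])
coef = np.array([0.5, 1.0])        # made-up coefficients
cov = 0.01 * np.eye(2)             # made-up covariance

# Poisson response with a log link, so the inverse link is exp.
y_sim = simulate_response(model_matrix, coef, cov,
                          inverse_link=np.exp, sample=rng.poisson,
                          n_simulations=500, rng=rng)
```

Keeping the inverse link and the sampler as separate arguments means the coefficient simulation does not need to know which distribution is in use.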
As pointed out on pages 256–257 of [1], this procedure simulates the coefficients conditional on the smoothing parameters, lambda (lam). To simulate from the coefficients without conditioning on lam, one may use bootstrap samples to get simulations of both the coefficients and the smoothing parameters; an example is given on page 257 of [1].
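The bootstrap idea can be sketched generically as follows. This is not the procedure from [1] and not pyGAM code: it assumes a plain row-resampling bootstrap, uses a ridge penalty on a polynomial basis as a stand-in for a GAM's smoothing penalty, and picks lam by generalized cross-validation on each resample. All function names are hypothetical.

```python
import numpy as np


def fit_penalized(B, y, lam):
    """Minimize ||y - B b||^2 + lam * ||b||^2; return coefficients and EDF."""
    A = B.T @ B + lam * np.eye(B.shape[1])
    coef = np.linalg.solve(A, B.T @ y)
    edf = np.trace(B @ np.linalg.solve(A, B.T))  # trace of the hat matrix
    return coef, edf


def select_lam_by_gcv(B, y, lam_grid):
    """Return the (lam, coef) pair with the smallest GCV score on the grid."""
    n = len(y)
    best = None
    for lam in lam_grid:
        coef, edf = fit_penalized(B, y, lam)
        rss = np.sum((y - B @ coef) ** 2)
        gcv = n * rss / (n - edf) ** 2
        if best is None or gcv < best[0]:
            best = (gcv, lam, coef)
    return best[1], best[2]


def bootstrap_coefs(B, y, lam_grid, n_boot, rng):
    """Refit on resampled rows so that lam varies between replicates."""
    n = len(y)
    lams, coefs = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        lam, coef = select_lam_by_gcv(B[idx], y[idx], lam_grid)
        lams.append(lam)
        coefs.append(coef)
    return np.array(lams), np.array(coefs)


rng = np.random.default_rng(0)
x = np.linspace(0, 1, 80)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
B = np.vander(x, 6, increasing=True)  # polynomial basis as a toy smoother
lam_grid = np.logspace(-6, 2, 9)

boot_lams, boot_coefs = bootstrap_coefs(B, y, lam_grid, n_boot=50, rng=rng)
```

The spread of `boot_coefs` then reflects uncertainty in lam as well as in the coefficients, which the conditional simulation leaves out.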
[1] S. Wood. Generalized Additive Models: An Introduction with R (First Edition). Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2006.

@cbrummitt so awesome man!!
thanks for this 😃
I forked the repo and made a new branch, simulate-from-posterior. I added a method simulate_from_coef_posterior_conditioned_on_smoothing_parameters (a clunky, long name that should be improved) and added a sample method to each distribution using numpy. I have not gotten a chance to test these sample methods and to make sure that I got all the parameters right, and in particular the scale parameter.
• Normal: np.random.normal(loc=mu, scale=standard_deviation, size=None) where standard_deviation = self.scale**0.5 if self.scale else 1.0.
• Binomial: np.random.binomial(n=number_of_trials, p=success_probability, size=None) where number_of_trials = self.levels and success_probability = mu / number_of_trials.
• Poisson: np.random.poisson(lam=mu, size=None).
• Gamma: np.random.gamma(shape=shape, scale=scale, size=None) where shape = 1. / self.scale and scale = mu / shape.
• InvGaussian: np.random.wald(mean=mu, scale=self.scale, size=None).
The size parameters are all None so that the result has the same shape as mu.
I also made sample an abstract method. I’m not sure whether you’d like to make Distribution an abstract base class.
I haven’t yet tried implementing bootstrap samples to get random samples of the smoothing parameters, too.
Update: Fixed the arguments to the numpy methods in the list above.
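Since the comment above says the parameters were untested, a quick sanity check along these lines may help. It is a sketch with hypothetical standalone functions (not the methods on the branch) that copies four of the listed parameterizations and checks only that the sample mean is close to mu. The inverse Gaussian case is left out here because its scale convention is the part most in need of separate verification.

```python
import numpy as np


def sample_normal(mu, scale=None):
    # scale is treated as a variance, as in the list above.
    standard_deviation = scale ** 0.5 if scale else 1.0
    return np.random.normal(loc=mu, scale=standard_deviation, size=None)


def sample_binomial(mu, levels=1):
    number_of_trials = levels
    success_probability = mu / number_of_trials
    return np.random.binomial(n=number_of_trials, p=success_probability,
                              size=None)


def sample_poisson(mu):
    return np.random.poisson(lam=mu, size=None)


def sample_gamma(mu, scale):
    shape = 1.0 / scale
    # numpy's gamma has mean shape * scale, so this keeps the mean at mu.
    return np.random.gamma(shape=shape, scale=mu / shape, size=None)


np.random.seed(0)
mu = np.full(200_000, 3.0)  # constant mean so the sample mean is easy to check

draws = {
    "normal": sample_normal(mu, scale=4.0),
    "binomial": sample_binomial(mu, levels=10),
    "poisson": sample_poisson(mu),
    "gamma": sample_gamma(mu, scale=0.5),
}
sample_means = {name: values.mean() for name, values in draws.items()}
```

With size=None every result has the same shape as mu, and each entry of `sample_means` should land near 3.0. Checking the variances against each distribution's variance function would be the natural next step for the scale parameter.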