Add method for simulating from the posterior (or just add an example to the documentation)
Estimating the mean and confidence intervals (using prediction_intervals) is great. In some cases, it can also be useful to simulate from the posterior distribution of the model’s coefficients. An example is given on pages 242–243 of [1].
I think the following code snippet does the trick for a LinearGAM:
import numpy as np


def simulate_from_posterior(linear_gam, X, n_simulations):
    """Simulate from the posterior of a LinearGAM a certain number of times.

    Parameters
    ----------
    linear_gam : pygam.LinearGAM
        A fitted model.
    X : array of shape (n_samples, n_features)
    n_simulations : int
        The number of simulations from the posterior to compute

    Returns
    -------
    simulations : array of shape (n_samples, n_simulations)
    """
    # Draw coefficient vectors from N(coef_, cov), conditional on lam.
    beta_replicates = np.random.multivariate_normal(
        linear_gam.coef_, linear_gam.statistics_['cov'], size=n_simulations)
    # Map each draw through the model matrix to get simulated mean curves.
    return linear_gam._modelmat(X).dot(beta_replicates.T)
I’m not sure if this should be added as an example in the documentation or added to the code (or both).
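One use of such simulations is to get an interval for any quantity derived from the fitted curve. The sketch below needs only NumPy. The `simulations` array is a synthetic stand-in with the same `(n_samples, n_simulations)` shape that the function above returns, and `summarize_simulations` is a hypothetical helper name, not part of pyGAM.

```python
import numpy as np


def summarize_simulations(simulations, width=0.95):
    """Pointwise mean and percentile band across posterior simulations.

    simulations : array of shape (n_samples, n_simulations)
    """
    tail = 100.0 * (1.0 - width) / 2.0
    lower, upper = np.percentile(simulations, [tail, 100.0 - tail], axis=1)
    return simulations.mean(axis=1), lower, upper


# Synthetic stand-in for the output of simulate_from_posterior
# (a straight-line model with made-up coefficients and covariance).
rng = np.random.default_rng(0)
model_matrix = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])
beta_hat = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.0], [0.0, 0.09]])
beta_replicates = rng.multivariate_normal(beta_hat, cov, size=2000)
simulations = model_matrix.dot(beta_replicates.T)  # shape (50, 2000)

mean, lower, upper = summarize_simulations(simulations)

# Simulations also give the distribution of nonlinear functionals,
# e.g. the row index at which each simulated curve is largest.
argmax_per_simulation = simulations.argmax(axis=0)
```

The same pattern applies to any functional of the curve (a difference between two points, a maximum, an integral): compute it per column, then take percentiles.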
To implement this in general, I think we’d want to add a method to each Distribution that draws a certain number of samples (called sample or random_variate?), so we’d have NormalDist.sample, BinomialDist.sample, and so on. Then GAM.simulate could just call self.dist.sample(self.coef_, self.statistics_['cov'], size=n_simulations)? I’m not sure yet how best to handle the link functions for these simulations…
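One possible way to handle the link function is to draw coefficient replicates, form the linear predictor, map it through the inverse link to get the mean `mu`, and only then draw from the response distribution. The sketch below uses NumPy only. `simulate_response` and its `inverse_link` and `sample` arguments are hypothetical stand-ins for illustration, not pyGAM API.

```python
import numpy as np


def simulate_response(model_matrix, coef, cov, inverse_link, sample,
                      n_simulations, rng):
    """Hypothetical helper: simulate responses from a fitted GLM-like model.

    Returns an array of shape (n_samples, n_simulations).
    """
    # 1. Coefficient replicates from the approximate posterior N(coef, cov).
    beta_replicates = rng.multivariate_normal(coef, cov, size=n_simulations)
    # 2. Linear predictor for every replicate.
    eta = model_matrix.dot(beta_replicates.T)
    # 3. The inverse link maps the linear predictor to the mean.
    mu = inverse_link(eta)
    # 4. Draw one response per entry of mu.
    return sample(mu, rng)


rng = np.random.default_rng(0)
model_matrix = np.column_stack([np.ones(20), np.linspace(-1.0, 1.0, 20)])
coef = np.array([0.5, 1.0])   # made-up coefficients
cov = 0.01 * np.eye(2)        # made-up covariance

# Poisson response with a log link: the inverse link is exp.
counts = simulate_response(
    model_matrix, coef, cov,
    inverse_link=np.exp,
    sample=lambda mu, rng: rng.poisson(lam=mu),
    n_simulations=500, rng=rng)
```

Keeping the link and the distribution as separate arguments means the sampling method only ever needs `mu`, so it does not have to know about coefficients or covariances.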
As pointed out on pages 256–257 of [1], this procedure simulates the coefficients conditioned on the smoothing parameters, lambda (lam). To simulate the coefficients without conditioning on fixed smoothing parameters, one may use bootstrap samples to get simulations of both the coefficients and the smoothing parameters; an example is given on page 257 of [1].
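A rough sketch of one way such a bootstrap could look follows. It is not taken from [1] or from pyGAM: it swaps in a small ridge-penalized least-squares model with a cosine basis as a stand-in for a GAM, picks the smoothing parameter by GCV over a grid, and uses a parametric bootstrap. All function names here are hypothetical.

```python
import numpy as np


def fit_penalized(B, y, lam):
    """Penalized least squares with a ridge penalty (stand-in for a GAM).

    Returns coefficients, their covariance given lam, the scale, and GCV.
    """
    n, p = B.shape
    A_inv = np.linalg.inv(B.T.dot(B) + lam * np.eye(p))
    coef = A_inv.dot(B.T.dot(y))
    edof = np.trace(A_inv.dot(B.T.dot(B)))   # effective degrees of freedom
    rss = np.sum((y - B.dot(coef)) ** 2)
    scale = rss / (n - edof)                 # residual variance estimate
    gcv = n * rss / (n - edof) ** 2
    return coef, scale * A_inv, scale, gcv


def select_lam(B, y, lam_grid):
    """Pick the smoothing parameter on the grid that minimizes GCV."""
    return min(lam_grid, key=lambda lam: fit_penalized(B, y, lam)[3])


def simulate_with_lam_uncertainty(B, y, lam_grid, n_bootstraps, rng):
    """Draw (lam, beta) pairs so that beta reflects uncertainty in lam."""
    lam_hat = select_lam(B, y, lam_grid)
    coef_hat, _, scale_hat, _ = fit_penalized(B, y, lam_hat)
    fitted = B.dot(coef_hat)
    lams, betas = [], []
    for _ in range(n_bootstraps):
        # Parametric bootstrap data, used only to re-estimate lam.
        y_star = fitted + rng.normal(scale=np.sqrt(scale_hat), size=len(y))
        lam_star = select_lam(B, y_star, lam_grid)
        # Posterior of beta given the original data and this lam.
        coef, cov, _, _ = fit_penalized(B, y, lam_star)
        lams.append(lam_star)
        betas.append(rng.multivariate_normal(coef, cov))
    return np.array(lams), np.array(betas)


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 80)
B = np.cos(np.pi * np.outer(x, np.arange(6)))   # cosine basis, shape (80, 6)
y = np.cos(2.0 * np.pi * x) + 0.5 * x + rng.normal(scale=0.3, size=80)
lam_grid = np.logspace(-6, 2, 9)

lams, betas = simulate_with_lam_uncertainty(B, y, lam_grid, 25, rng)
```

Each row of `betas` is drawn conditional on a different bootstrap estimate of `lam`, so the spread of the rows includes the variability of the smoothing parameter.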
[1] S. Wood. Generalized Additive Models: An Introduction with R (First Edition). Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2006.
Top GitHub Comments
@cbrummitt so awesome man!!
thanks for this 😃
I forked the repo and made a new branch, simulate-from-posterior. I added a method simulate_from_coef_posterior_conditioned_on_smoothing_parameters (a clunky, long name that should be improved) and added a sample method to each distribution using numpy. I have not gotten a chance to test these sample methods and to make sure that I got all the parameters right, in particular the scale parameter.
• Normal: np.random.normal(loc=mu, scale=standard_deviation, size=None) where standard_deviation = self.scale**0.5 if self.scale else 1.0.
• Binomial: np.random.binomial(n=number_of_trials, p=success_probability, size=None) where number_of_trials = self.levels and success_probability = mu / number_of_trials.
• Poisson: np.random.poisson(lam=mu, size=None).
• Gamma: np.random.gamma(shape=shape, scale=scale, size=None) where shape = 1. / self.scale and scale = mu / shape.
• InvGaussian: np.random.wald(mean=mu, scale=self.scale, size=None).
The size parameters are all None so that the result has the same shape as mu.
I also made sample an abstract method. I’m not sure whether you’d like to make Distribution an abstract base class.
I haven’t yet tried implementing bootstrap samples to get random samples of the smoothing parameters, either.
Update: Fixed the arguments to the numpy methods in the list above.
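For reference, here is a self-contained sketch of how the sample methods listed above could sit on an abstract base class. The classes are simplified, hypothetical stand-ins rather than the code on the branch. The parameterizations are copied from the list as written, and the check at the end only confirms that the sample means are close to mu. It does not validate the scale parameter.

```python
from abc import ABC, abstractmethod

import numpy as np


class Distribution(ABC):
    """Hypothetical base class; scale and levels follow the names above."""

    def __init__(self, scale=None, levels=1):
        self.scale = scale
        self.levels = levels

    @abstractmethod
    def sample(self, mu):
        """Return random draws with the same shape as mu."""


class NormalDist(Distribution):
    def sample(self, mu):
        standard_deviation = self.scale ** 0.5 if self.scale else 1.0
        return np.random.normal(loc=mu, scale=standard_deviation, size=None)


class BinomialDist(Distribution):
    def sample(self, mu):
        number_of_trials = self.levels
        success_probability = mu / number_of_trials
        return np.random.binomial(
            n=number_of_trials, p=success_probability, size=None)


class PoissonDist(Distribution):
    def sample(self, mu):
        return np.random.poisson(lam=mu, size=None)


class GammaDist(Distribution):
    def sample(self, mu):
        shape = 1.0 / self.scale
        scale = mu / shape
        return np.random.gamma(shape=shape, scale=scale, size=None)


class InvGaussDist(Distribution):
    def sample(self, mu):
        return np.random.wald(mean=mu, scale=self.scale, size=None)


np.random.seed(0)
mu = np.full(200000, 3.0)
draws = {
    "normal": NormalDist(scale=4.0).sample(mu),
    "binomial": BinomialDist(levels=10).sample(mu),
    "poisson": PoissonDist().sample(mu),
    "gamma": GammaDist(scale=0.5).sample(mu),
    "inv_gaussian": InvGaussDist(scale=2.0).sample(mu),
}
sample_means = {name: d.mean() for name, d in draws.items()}
```

Because sample is abstract, Distribution itself cannot be instantiated, and any new distribution that forgets to define sample fails at construction time instead of at simulation time.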