Add method for simulating from the posterior (or just add an example to the documentation)

Estimating the mean and confidence intervals (using prediction_intervals) is great. In some cases, it can be useful to simulate from the posterior distribution of the model’s coefficients. An example is given in pages 242–243 of [1].

I think the following code snippet does the trick for a LinearGAM:

import numpy as np


def simulate_from_posterior(linear_gam, X, n_simulations):
    """Simulate from the posterior of a LinearGAM a certain number of times.

    Parameters
    ----------
    linear_gam : pygam.LinearGAM
        A fitted model.

    X : array of shape (n_samples, n_features)

    n_simulations : int
        The number of simulations from the posterior to compute

    Returns
    -------
    simulations : array of shape (n_samples, n_simulations)
    """
    # Draw coefficient vectors from a normal centered on the fitted
    # coefficients, using the estimated coefficient covariance.
    beta_replicates = np.random.multivariate_normal(
        linear_gam.coef_, linear_gam.statistics_['cov'], size=n_simulations)
    # Map each coefficient draw through the model matrix of X.
    return linear_gam._modelmat(X).dot(beta_replicates.T)

I’m not sure if this should be added as an example in the documentation or added to the code (or both).
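
To make the idea in the snippet above concrete without depending on a fitted GAM, here is a minimal NumPy-only sketch. It uses ordinary least squares as a stand-in: `X_design`, `coef`, and `cov` are illustrative substitutes for the model matrix, `linear_gam.coef_`, and `linear_gam.statistics_['cov']`, and the toy data are made up. It is only meant to show the draw-then-project step and how the simulated curves might be summarized.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a straight line plus Gaussian noise (illustrative only).
n_samples = 200
x = np.linspace(0.0, 1.0, n_samples)
X_design = np.column_stack([np.ones(n_samples), x])  # stand-in model matrix
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n_samples)

# Least-squares estimate and its covariance, standing in for
# linear_gam.coef_ and linear_gam.statistics_['cov'].
coef = np.linalg.lstsq(X_design, y, rcond=None)[0]
residuals = y - X_design @ coef
sigma2 = residuals @ residuals / (n_samples - X_design.shape[1])
cov = sigma2 * np.linalg.inv(X_design.T @ X_design)

# Draw coefficient replicates and map them through the design matrix.
n_simulations = 1000
beta_replicates = rng.multivariate_normal(coef, cov, size=n_simulations)
simulations = X_design @ beta_replicates.T  # (n_samples, n_simulations)

# Pointwise 95% band for the mean function from the simulated curves.
lower, upper = np.percentile(simulations, [2.5, 97.5], axis=1)
```

Each column of `simulations` is one simulated mean curve, so any functional of the curve (a maximum, a difference between two points) can be summarized the same way as the percentile band here.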

To implement this in general, I think we’d want to add a method to each Distribution that draws a given number of samples (called sample or random_variate?), so we’d have NormalDist.sample, BinomialDist.sample, and so on. Then GAM.simulate could just call self.dist.sample(self.coef_, self.statistics_['cov'], size=n_simulations)? I’m not sure yet how best to handle the link functions for these simulations…
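
One way the pieces might fit together, sketched with NumPy only: draw coefficients, form the linear predictor, apply the inverse link to get the mean, then let the distribution sample around that mean. The function `simulate_response` and its `inverse_link` and `sample` arguments are hypothetical names for illustration, not part of pyGAM, and the Poisson/log-link numbers are made up.

```python
import numpy as np


def simulate_response(model_matrix, coef, cov, inverse_link, sample,
                      n_simulations, rng):
    """Hypothetical helper: coefficient draws -> mean -> response draws."""
    beta = rng.multivariate_normal(coef, cov, size=n_simulations)
    linear_predictor = model_matrix @ beta.T  # (n_samples, n_simulations)
    mu = inverse_link(linear_predictor)       # move to the mean scale
    return sample(mu)                         # one response per entry of mu


rng = np.random.default_rng(0)
model_matrix = np.column_stack([np.ones(5), np.linspace(0.0, 1.0, 5)])
coef = np.array([0.5, 1.0])   # made-up fitted coefficients
cov = 0.01 * np.eye(2)        # made-up coefficient covariance

# Poisson response with a log link, so the inverse link is exp.
draws = simulate_response(
    model_matrix, coef, cov,
    inverse_link=np.exp,
    sample=lambda mu: rng.poisson(lam=mu),
    n_simulations=100,
    rng=rng,
)
```

In this arrangement the link function is handled before the distribution is involved, so a per-distribution sample method would only need the mean (and any dispersion), not the coefficients.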

As pointed out on pages 256–257 of [1], this procedure simulates the coefficients conditional on the smoothing parameters, lambda (lam). To simulate the coefficients without conditioning on fixed smoothing parameters, one may use bootstrap samples to get simulations of both the coefficients and the smoothing parameters; an example is given on page 257 of [1].
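
A rough sketch of one bootstrap variant follows. It is not necessarily the procedure described in [1], and it uses ridge regression with the penalty chosen by generalized cross-validation as a simple stand-in for a GAM with a smoothing parameter. All names (`fit_ridge`, `bootstrap_coef_draws`, `lam_grid`) and the toy data are assumptions made for illustration. Each resample re-selects the penalty, so the pooled coefficient draws reflect uncertainty in it as well.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Penalized least squares: coefficients, covariance, and GCV score."""
    n, p = X.shape
    A = np.linalg.inv(X.T @ X + lam * np.eye(p))
    coef = A @ X.T @ y
    edf = np.trace(A @ X.T @ X)           # effective degrees of freedom
    rss = np.sum((y - X @ coef) ** 2)
    sigma2 = rss / (n - edf)
    gcv = n * rss / (n - edf) ** 2
    return coef, sigma2 * A, gcv


def bootstrap_coef_draws(X, y, lam_grid, n_bootstrap, rng):
    """Resample rows, re-select lam, then draw one coefficient vector."""
    n = X.shape[0]
    draws, lams = [], []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)  # rows sampled with replacement
        fits = [fit_ridge(X[idx], y[idx], lam) for lam in lam_grid]
        best = min(range(len(lam_grid)), key=lambda i: fits[i][2])
        coef, cov, _ = fits[best]
        draws.append(rng.multivariate_normal(coef, cov))
        lams.append(lam_grid[best])
    return np.array(draws), np.array(lams)


rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 1.0, size=n)
X = np.vander(x, N=6, increasing=True)    # polynomial basis as a toy smoother
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
lam_grid = np.array([1e-3, 1e-2, 1e-1, 1.0])

draws, lams = bootstrap_coef_draws(X, y, lam_grid, n_bootstrap=50, rng=rng)
```

The spread of `lams` across resamples gives a quick sense of how much the selected penalty varies, which is exactly the variability that the conditional simulation ignores.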

[1] S. Wood. Generalized Additive Models: An Introduction with R (First Edition). Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2006.

Issue Analytics

Created 6 years ago
Reactions: 1
Comments: 9 (4 by maintainers)

Top GitHub Comments

2 reactions

dswah commented, Dec 10, 2017

@cbrummitt so awesome man!!

thanks for this 😃

1 reaction

cbrummitt commented, Jul 31, 2017

I forked the repo and made a new branch, simulate-from-posterior. I added a method simulate_from_coef_posterior_conditioned_on_smoothing_parameters (a clunky, long name that should be improved) and added a sample method to each distribution using numpy. I have not yet had a chance to test these sample methods or to confirm that I got all the parameters right, in particular the scale parameter.

• Normal: np.random.normal(loc=mu, scale=standard_deviation, size=None) where standard_deviation = self.scale**0.5 if self.scale else 1.0.
• Binomial: np.random.binomial(n=number_of_trials, p=success_probability, size=None) where number_of_trials = self.levels and success_probability = mu / number_of_trials.
• Poisson: np.random.poisson(lam=mu, size=None).
• Gamma: np.random.gamma(shape=shape, scale=scale, size=None) where shape = 1. / self.scale and scale = mu / shape.
• InvGaussian: np.random.wald(mean=mu, scale=self.scale, size=None).

The size parameters are all None so that the result has the same shape as mu.
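
As a quick check of the shape behaviour, the sketch below runs four of the calls from the list on a 2-D array of means. It uses NumPy's Generator API rather than the legacy np.random functions, and the values of `mu`, `scale`, and `levels` are made up. The InvGaussian case is left out because its scale mapping was flagged above as untested.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])  # made-up means
scale = 0.5    # hypothetical dispersion, standing in for self.scale
levels = 10    # hypothetical number of trials, standing in for self.levels

# With size=None, each call broadcasts over mu and returns mu's shape.
normal_draw = rng.normal(loc=mu, scale=scale ** 0.5, size=None)
binomial_draw = rng.binomial(n=levels, p=mu / levels, size=None)
poisson_draw = rng.poisson(lam=mu, size=None)

# Gamma: shape * scale equals mu, so the draws have mean mu.
shape = 1.0 / scale
gamma_draw = rng.gamma(shape=shape, scale=mu / shape, size=None)
```

Because the shape follows `mu`, passing a matrix of means with one column per coefficient draw yields one simulated response per column without any extra looping.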

I also made sample an abstract method. I’m not sure whether you’d like to make Distribution an abstract base class.
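
For reference, this is roughly what the abstract-base-class option looks like with the standard library's abc module. The class bodies are a hypothetical sketch, not the code in the branch: only the names Distribution and sample come from the discussion above, and PoissonDist here is a toy subclass.

```python
from abc import ABC, abstractmethod

import numpy as np


class Distribution(ABC):
    """Hypothetical base class; subclasses must implement sample."""

    @abstractmethod
    def sample(self, mu):
        """Return random draws with the same shape as mu."""


class PoissonDist(Distribution):
    """Toy subclass used only to illustrate the pattern."""

    def sample(self, mu):
        return np.random.poisson(lam=mu, size=None)


poisson_draws = PoissonDist().sample(np.ones(3))
```

With this setup, instantiating Distribution directly, or a subclass that forgets to define sample, raises TypeError at construction time rather than failing later when sample is called.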

I haven’t yet tried implementing bootstrap samples to get random samples of the smoothing parameters, too.

Update: Fixed the arguments to the numpy methods in the list above.
