WAIC/LOO for models with multiple observed variables
Related issue: #794 - cc @rpgoldman
First, thank you for this great library!
I was told by @ahartikainen here that multiple observations were not supported in arviz (at least for numpyro and pyro, but I guess for other libraries like PyMC3 as well).
My question is: to go from one observed variable to several, isn’t it enough to sum the log_likelihoods of all observed variables to compute the WAIC and LOO?
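Indeed, the WAIC and LOO are computed based on $p(y_i \mid \theta_s)$, where $y_i$ are all the examples (observations) and $\theta_s$ are all the samples from the posterior of all parameters and hidden variables.
- In the case where the observed variables $y_i^{(1)}, \dots, y_i^{(K)}$ are conditionally independent given all the hidden variables and parameters, then we can use $p(y_i \mid \theta_s) = \prod_k p(y_i^{(k)} \mid \theta_s)$, i.e. $\log p(y_i \mid \theta_s) = \sum_k \log p(y_i^{(k)} \mid \theta_s)$, and use this quantity as the likelihood of our observation. (Correct me if I’m mistaken.)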
- When there are some dependencies between the observed variables, then it seems to me that it still holds - maybe except in the case where there are circular dependencies between variables.
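The summation idea in the two bullets above can be sketched with plain NumPy, without relying on any particular library's API. This is only an illustration under stated assumptions: `waic_elpd` and `normal_logpdf` are hypothetical helper names (not ArviZ or numpyro functions), the posterior draws of `theta` are random stand-ins rather than real MCMC output, and each school `i` is assumed to be the unit of prediction, with both observed variables belonging to it.

```python
import numpy as np

def normal_logpdf(y, loc, scale):
    """Log density of Normal(loc, scale) evaluated at y (hypothetical helper)."""
    return -0.5 * np.log(2 * np.pi * scale**2) - 0.5 * ((y - loc) / scale) ** 2

def waic_elpd(log_lik):
    """elpd_waic from a (n_samples, n_units) array of pointwise log-likelihoods."""
    n_samples = log_lik.shape[0]
    # Log pointwise predictive density per unit, via a stable log-mean-exp.
    max_ll = log_lik.max(axis=0)
    lppd = max_ll + np.log(np.exp(log_lik - max_ll).sum(axis=0)) - np.log(n_samples)
    # Penalty term: sample variance of the log-likelihood across draws, per unit.
    p_waic = log_lik.var(axis=0, ddof=1)
    return float(np.sum(lppd - p_waic))

rng = np.random.default_rng(0)
n_samples, n_units, sigma = 1000, 8, 1.0

# Stand-ins (assumptions): posterior draws of theta and the two observed variables.
theta = rng.normal(0.0, 1.0, size=(n_samples, n_units))
y1 = rng.normal(0.0, 1.0, size=n_units)
y2 = rng.normal(0.0, 1.0, size=n_units)

# One pointwise log-likelihood array per observed variable.
ll_obs1 = normal_logpdf(y1, theta, sigma)
ll_obs2 = normal_logpdf(y2, theta, sigma)

# Under conditional independence, the joint log-likelihood of unit i is the sum.
ll_joint = ll_obs1 + ll_obs2
elpd_joint = waic_elpd(ll_joint)
```

Summing per unit like this treats "leave out school `i`" as leaving out both of its observations together. If the goal is instead to predict each individual measurement, the two arrays would be concatenated along the unit axis rather than added, so the choice depends on the prediction task.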
Below I write examples of numpyro models:
for one observation:
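import numpyro
import numpyro.distributions as dist

def eight_schools(J, sigma, y=None):
    # Population-level mean and scale
    mu = numpyro.sample('mu', dist.Normal(0, 5))
    tau = numpyro.sample('tau', dist.HalfCauchy(5))
    with numpyro.plate('J', J):
        # One latent effect per school, one observed variable
        theta = numpyro.sample('theta', dist.Normal(mu, tau))
        numpyro.sample('obs', dist.Normal(theta, sigma), obs=y)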
for several observations which are conditionally independent:
def eight_schools(J, sigma, y1=None, y2=None):
    mu = numpyro.sample('mu', dist.Normal(0, 5))
    tau = numpyro.sample('tau', dist.HalfCauchy(5))
    with numpyro.plate('J', J):
        theta = numpyro.sample('theta', dist.Normal(mu, tau))
        numpyro.sample('obs1', dist.Normal(theta, sigma), obs=y1)
        numpyro.sample('obs2', dist.Normal(theta, sigma), obs=y2)
for several observations which are not conditionally independent:
def eight_schools(J, sigma, y1=None, y2=None):
    mu = numpyro.sample('mu', dist.Normal(0, 5))
    tau = numpyro.sample('tau', dist.HalfCauchy(5))
    with numpyro.plate('J', J):
        theta = numpyro.sample('theta', dist.Normal(mu, tau))
        obs1 = numpyro.sample('obs1', dist.Normal(theta, sigma), obs=y1)
        numpyro.sample('obs2', dist.Normal(obs1, sigma), obs=y2)
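For this last model, a quick numerical check suggests why summing the per-site log-likelihoods still gives the joint one: by the chain rule, log p(y1, y2 | theta) = log p(y1 | theta) + log p(y2 | y1). The NumPy sketch below is an illustration only; `normal_logpdf` is a hypothetical helper, the `theta` draws and the data are random stand-ins, and the cross-check relies on the fact that (y1, y2) given theta is bivariate normal with mean (theta, theta) and covariance sigma^2 * [[1, 1], [1, 2]] in this model.

```python
import numpy as np

def normal_logpdf(y, loc, scale):
    """Log density of Normal(loc, scale) evaluated at y (hypothetical helper)."""
    return -0.5 * np.log(2 * np.pi * scale**2) - 0.5 * ((y - loc) / scale) ** 2

rng = np.random.default_rng(1)
n_samples, n_units, sigma = 500, 8, 2.0

# Stand-ins (assumptions): posterior draws of theta and simulated data.
theta = rng.normal(0.0, 1.0, size=(n_samples, n_units))
y1 = rng.normal(0.0, sigma, size=n_units)
y2 = rng.normal(y1, sigma)

# Per-site log-likelihoods, mirroring the 'obs1' and 'obs2' sites of the model.
ll_obs1 = normal_logpdf(y1, theta, sigma)  # shape (n_samples, n_units)
ll_obs2 = normal_logpdf(y2, y1, sigma)     # shape (n_units,), does not involve theta
ll_joint = ll_obs1 + ll_obs2               # broadcasts to (n_samples, n_units)

# Cross-check against the bivariate normal density of (y1, y2) given theta.
cov = sigma**2 * np.array([[1.0, 1.0], [1.0, 2.0]])
dev = np.stack([y1 - theta, y2 - theta], axis=-1)  # (n_samples, n_units, 2)
quad = np.einsum("sui,ij,suj->su", dev, np.linalg.inv(cov), dev)
ll_direct = -np.log(2 * np.pi) - 0.5 * np.linalg.slogdet(cov)[1] - 0.5 * quad
```

In this particular model the `obs2` term is the same for every posterior draw, so it shifts each unit's log-likelihood by a constant and adds nothing to the variance-based penalty in WAIC.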
I think we should have multiple options for different methods. LFO also needs the ability to resample (there is code to do this in pystan, but we really need to refine the InferenceData class to enable it).
We should have the same default as in loo2.
Closing this as the original issue has been fixed. For usage and interpretation questions on loo/waic, please ask on stan or pymc discourse forums.