sample_posterior_predictive interferes with the summary of the previous sample
See original GitHub issue
For example, if I use the following code:
with pm.Model() as model:
    mu = pm.Normal('mu')
    sigma = pm.HalfNormal('sigma')
    pm.Normal('like', mu, sigma, observed=X)
    trace_X = pm.sample()
    Y_ = pm.Normal('Y_', mu, sigma)
    Y = pm.Deterministic('Y', Y_ + 2)
    ppc = pm.sample_posterior_predictive(trace_X, vars=[Y])
and then call
az.summary(trace_X)
I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-14-475fd4d1b7ee> in <module>
----> 1 az.summary(trace_X)
~/anaconda3/lib/python3.6/site-packages/arviz/stats/stats.py in summary(data, var_names, fmt, round_to, include_circ, stat_funcs, extend, credible_interval, order, index_origin)
843
844 """
--> 845 posterior = convert_to_dataset(data, group="posterior")
846 var_names = _var_names(var_names, posterior)
847 posterior = posterior if var_names is None else posterior[var_names]
~/anaconda3/lib/python3.6/site-packages/arviz/data/converters.py in convert_to_dataset(obj, group, coords, dims)
160 xarray.Dataset
161 """
--> 162 inference_data = convert_to_inference_data(obj, group=group, coords=coords, dims=dims)
163 dataset = getattr(inference_data, group, None)
164 if dataset is None:
~/anaconda3/lib/python3.6/site-packages/arviz/data/converters.py in convert_to_inference_data(obj, group, coords, dims, **kwargs)
81 return from_pystan(**kwargs)
82 elif obj.__class__.__name__ == "MultiTrace": # ugly, but doesn't make PyMC3 a requirement
---> 83 return from_pymc3(trace=kwargs.pop(group), **kwargs)
84 elif obj.__class__.__name__ == "EnsembleSampler": # ugly, but doesn't make emcee a requirement
85 return from_emcee(sampler=kwargs.pop(group), **kwargs)
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in from_pymc3(trace, prior, posterior_predictive, coords, dims)
224 posterior_predictive=posterior_predictive,
225 coords=coords,
--> 226 dims=dims,
227 ).to_inference_data()
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in to_inference_data(self)
208 **{
209 "posterior": self.posterior_to_xarray(),
--> 210 "sample_stats": self.sample_stats_to_xarray(),
211 "posterior_predictive": self.posterior_predictive_to_xarray(),
212 "prior": self.prior_to_xarray(),
~/anaconda3/lib/python3.6/site-packages/arviz/data/base.py in wrapped(cls, *args, **kwargs)
30 if all([getattr(cls, prop_i) is None for prop_i in prop]):
31 return None
---> 32 return func(cls, *args, **kwargs)
33
34 return wrapped
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in sample_stats_to_xarray(self)
104 name = rename_key.get(stat, stat)
105 data[name] = np.array(self.trace.get_sampler_stats(stat, combine=False))
--> 106 log_likelihood, dims = self._extract_log_likelihood()
107 if log_likelihood is not None:
108 data["log_likelihood"] = log_likelihood
~/anaconda3/lib/python3.6/site-packages/arviz/data/base.py in wrapped(cls, *args, **kwargs)
30 if all([getattr(cls, prop_i) is None for prop_i in prop]):
31 return None
---> 32 return func(cls, *args, **kwargs)
33
34 return wrapped
~/anaconda3/lib/python3.6/site-packages/arviz/data/base.py in wrapped(cls, *args, **kwargs)
30 if all([getattr(cls, prop_i) is None for prop_i in prop]):
31 return None
---> 32 return func(cls, *args, **kwargs)
33
34 return wrapped
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in _extract_log_likelihood(self)
81 chain_likelihoods = []
82 for chain in self.trace.chains:
---> 83 log_like = [log_likelihood_vals_point(point) for point in self.trace.points([chain])]
84 chain_likelihoods.append(np.stack(log_like))
85 return np.stack(chain_likelihoods), coord_name
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in <listcomp>(.0)
81 chain_likelihoods = []
82 for chain in self.trace.chains:
---> 83 log_like = [log_likelihood_vals_point(point) for point in self.trace.points([chain])]
84 chain_likelihoods.append(np.stack(log_like))
85 return np.stack(chain_likelihoods), coord_name
~/anaconda3/lib/python3.6/site-packages/arviz/data/io_pymc3.py in log_likelihood_vals_point(point)
73 log_like_vals = []
74 for var, log_like in cached:
---> 75 log_like_val = utils.one_de(log_like(point))
76 if var.missing_values:
77 log_like_val = log_like_val[~var.observations.mask]
~/anaconda3/lib/python3.6/site-packages/pymc3/model.py in __call__(self, *args, **kwargs)
1280 def __call__(self, *args, **kwargs):
1281 point = Point(model=self.model, *args, **kwargs)
-> 1282 return self.f(**point)
1283
1284 compilef = fastfn
~/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
884 raise TypeError("Missing required input: %s" %
885 getattr(self.inv_finder[c], 'variable',
--> 886 self.inv_finder[c]))
887 if c.provided > 1:
888 restore_defaults()
TypeError: Missing required input: Y_
(see the gist here)
Expected behavior: it worked as expected with the previous pymc3 version (3.7).
@aakhmetz I had some time to look into this. The issue is actually not related to the one I thought; it is basically that this use case is not supported by ArviZ.
In from_pymc3, ArviZ tries to extract all the information from the trace object and store it in multidimensional labeled arrays. One piece of information it tries to extract is the log likelihood, which in pymc3 can be computed by calling var.logp_elemwise on observed variables. In your case, like is an observed variable. The issue is with the call signature of this function: logp_elemwise expects as input a dictionary whose keys are variable names and whose values are the value of each variable at a given draw. ArviZ tries to call it with the samples stored in the trace; however, since the trace contains neither Y_ nor Y (which are now expected after including them in the model), it raises an error.
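Roughly, the failing call looks like this (a sketch, not the exact ArviZ code path; the stored points use the transformed names pymc3 keeps in the trace, e.g. sigma_log__):

like = model.observed_RVs[0]   # the observed variable 'like' from the snippet above
point = trace_X.point(0)       # one stored draw, e.g. {'mu': ..., 'sigma_log__': ...}
like.logp_elemwise(point)      # TypeError: Missing required input: Y_
# The compiled function takes all free variables of the (now extended) model as
# inputs, so a point that lacks a value for Y_ can no longer be evaluated.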
I don't know how to solve this, though; I can only offer two workarounds.
The first is to convert the trace to inference data before modifying the model:
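A minimal sketch of this (reusing the observed data X and the model from the snippet at the top of the issue):

import arviz as az
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal('mu')
    sigma = pm.HalfNormal('sigma')
    pm.Normal('like', mu, sigma, observed=X)   # X: the same observed data as above
    trace_X = pm.sample()
    # convert while the model still matches the trace, so the log
    # likelihood of 'like' can be evaluated for every stored draw
    idata = az.from_pymc3(trace_X)

with model:
    # extending the model afterwards no longer affects idata
    Y_ = pm.Normal('Y_', mu, sigma)
    Y = pm.Deterministic('Y', Y_ + 2)
    ppc = pm.sample_posterior_predictive(trace_X, vars=[Y])

az.summary(idata)   # summarize the InferenceData instead of the raw trace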
This has the advantage of including the proper log likelihood in the inference data object; I don't know if you need this info.
The second option is to tweak the model so that ArviZ believes there are no observed variables and therefore does not try to compute log likelihoods.
This will not make the log likelihood data available, but it does allow including the posterior predictive samples in the inference data object:
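For example, something along these lines (a rough sketch; it assumes observed_RVs on a pymc3 Model is a plain list attribute that can be temporarily emptied and restored):

# continuing right after the original snippet, where model, trace_X and ppc already exist
observed = model.observed_RVs
model.observed_RVs = []          # hide the observed variables from ArviZ
idata = az.from_pymc3(trace=trace_X, posterior_predictive=ppc)
model.observed_RVs = observed    # restore the model afterwards

az.summary(idata)                # no log_likelihood in sample_stats, but the summary works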
Converting “manually” to inference data has the advantage of avoiding the conversion every single time you call an ArviZ function, and it also allows you to use custom named coordinates and dimensions and to take advantage of all the groups. However, I think that none of the options above handles the constant_data group properly; I don't think it is relevant here though.
@OriolAbril Thank you very much for your time! Yes, it looks reasonable. The command az.from_pymc3 is new to me, so I will try to read more about it. After your reply I realized that I also used separate blocks before:
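Roughly like this (a sketch with the same imports and X as above; mu and sigma are re-declared under the same names in a second, predictive-only model, here called pred_model, so that sample_posterior_predictive can pick their values out of trace_X):

with pm.Model() as model:
    mu = pm.Normal('mu')
    sigma = pm.HalfNormal('sigma')
    pm.Normal('like', mu, sigma, observed=X)
    trace_X = pm.sample()

with pm.Model() as pred_model:
    mu = pm.Normal('mu')
    sigma = pm.HalfNormal('sigma')
    Y_ = pm.Normal('Y_', mu, sigma)
    Y = pm.Deterministic('Y', Y_ + 2)
    ppc = pm.sample_posterior_predictive(trace_X, vars=[Y])

az.summary(trace_X)   # the sampling model never sees Y_ or Y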
That, I believe, should also work, but some time ago I adopted my previous shortcut.
Thank you again for checking