Question: Massage InferenceData for multi-variable model comparison?
Short Description
I have a PyMC3 model that is partitioned into 6 sub-models, with the observations partitioned into 6 matching subsets. This lets me apply different parameters depending on the values of the independent variables without complex indexing. I can't use a mixture because there is no distribution over the independent variables.

I know model comparison only works for models with a single observed RV, but I was wondering: is there some way to "unpartition" the InferenceData so that ArviZ can treat the six vectors of observations as one big vector and compute model comparison metrics?

I can compare the submodels individually, but that does not properly account for the hyperparameters that link them.
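
For concreteness, here is a minimal sketch of the kind of partitioned setup I mean (the variable names and toy data are hypothetical, not my actual model): six observed RVs, one per data subset, linked through shared hyperparameters.

```python
import numpy as np
import pymc3 as pm

# Toy stand-in for the six partitioned observation subsets (hypothetical data).
rng = np.random.default_rng(0)
y_subsets = [rng.normal(size=20) for _ in range(6)]

with pm.Model() as model:
    # Shared hyperparameters that link the sub-models.
    mu_hyper = pm.Normal("mu_hyper", mu=0.0, sigma=1.0)
    sigma_hyper = pm.HalfNormal("sigma_hyper", sigma=1.0)

    for i, y in enumerate(y_subsets):
        # One set of parameters and one observed RV per partition.
        mu_i = pm.Normal(f"mu_{i}", mu=mu_hyper, sigma=sigma_hyper)
        pm.Normal(f"y_{i}", mu=mu_i, sigma=1.0, observed=y)

    idata = pm.sample(return_inferencedata=True)
```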
Issue Analytics
- Created: 4 years ago
- Comments: 8 (8 by maintainers)
@ahartikainen I would like to compare the models as a whole (rather than comparing the submodels), so I think I want to do what @OriolAbril suggests: effectively concatenate all the observed random variables into one big random variable. This is reasonable because the observations are treated as interchangeable by the model.
After combining all the log_likelihood data into a single array (stored in `sample_stats.log_likelihood`), `waic` and `loo` will calculate the IC assuming all observations are conditionally independent (or independent; I am not completely sure, as I have not found time to do the math). Both functions already work with n-dimensional arrays, so if that assumption holds in your case the results will be correct; otherwise you would have to implement the correct version of the algorithm or wait a little longer.

Note: ArviZ currently looks at `sample_stats` first in order to raise a warning if the log likelihood is still stored there, so doing this will only trigger a deprecation warning and not the annoying

"Found several log likelihood arrays {}, var_name cannot be None"

error.
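
For illustration, a minimal sketch of the combining step, assuming the more recent InferenceData layout where pointwise log likelihoods live in a dedicated `log_likelihood` group with one variable per observed RV (the hypothetical `y_0` … `y_5` from the sketch above); on older ArviZ versions the combined array would go under `sample_stats.log_likelihood` as described:

```python
import numpy as np
import arviz as az

# idata is assumed to hold one pointwise log-likelihood array per observed RV,
# e.g. variables "y_0" ... "y_5", each shaped (chain, draw, n_obs_i).
ll = idata.log_likelihood

# Concatenate the per-partition arrays along the observation axis so the six
# vectors of pointwise log likelihoods become one big vector. The reshape
# flattens any trailing observation dimensions.
combined = np.concatenate(
    [ll[name].values.reshape(ll.sizes["chain"], ll.sizes["draw"], -1)
     for name in ll.data_vars],
    axis=-1,
)

# Build a new InferenceData whose log likelihood is a single variable, so that
# az.waic / az.loo see one observed RV instead of six.
idata_combined = az.from_dict(
    posterior={k: v.values for k, v in idata.posterior.items()},
    log_likelihood={"y_all": combined},
)

print(az.loo(idata_combined))
print(az.waic(idata_combined))
```

Concatenating along the observation axis keeps the (chain, draw) structure intact, which is all `waic` and `loo` need for their pointwise computations.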