InferenceData extensions [Discussion]

InferenceData objects are central to ArviZ. A common subset of tasks using InferenceData can be done directly with ArviZ plotting and stats functions, but any task that deviates from that subset quickly becomes long and convoluted.

The aim of this issue is to start a discussion about new capabilities to add to InferenceData and to produce a proposal (which will be added to xarray_examples for discussion with the xarray team).

I also think the candidate functions fall into several groups, which may help to start brainstorming or to write a separate proposal per group. Ideas on all levels are welcome!

Straightforward extensions to xr.Dataset methods

.sel is a good example of this. I think several methods could fit in this category and very roughly follow a similar pattern:

def idata_extension(self, groups, **kwargs):  # plus other ArviZ-specific args
    # "method" below is a placeholder for the xr.Dataset method being extended
    for group in groups:
        if group not in self._groups:
            raise KeyError(f"Group {group!r} not found in InferenceData")
        # some kind of check to make the method as convenient as possible;
        # an example is sel using only the dimensions present in the current group to index
        dataset = getattr(self, group)
        setattr(self, group, dataset.method(**kwargs))
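
A minimal, dependency-free sketch of how this pattern could look for sel. ToyDataset and ToyInferenceData are hypothetical stand-ins for xr.Dataset and InferenceData (they are not ArviZ or xarray classes), and the behaviour of silently dropping indexers a group does not have is an assumption about what "convenient" means here:

```python
class ToyDataset:
    """Hypothetical stand-in for xr.Dataset: one list of coordinate values per dimension."""

    def __init__(self, coords):
        self.coords = dict(coords)  # dimension name -> list of coordinate values

    @property
    def dims(self):
        return tuple(self.coords)

    def sel(self, **indexers):
        # Keep only the requested coordinate values along each indexed dimension.
        new = {dim: list(vals) for dim, vals in self.coords.items()}
        for dim, wanted in indexers.items():
            if dim not in new:
                raise KeyError(dim)
            new[dim] = [v for v in new[dim] if v in wanted]
        return ToyDataset(new)


class ToyInferenceData:
    """Hypothetical stand-in for InferenceData: groups stored as attributes."""

    def __init__(self, **groups):
        self._groups = list(groups)
        for name, dataset in groups.items():
            setattr(self, name, dataset)

    def sel(self, groups=None, **kwargs):
        groups = self._groups if groups is None else groups
        for group in groups:
            if group not in self._groups:
                raise KeyError(f"Group {group!r} not found")
            dataset = getattr(self, group)
            # Only index with the dimensions this group actually has.
            valid = {dim: val for dim, val in kwargs.items() if dim in dataset.dims}
            setattr(self, group, dataset.sel(**valid))


idata = ToyInferenceData(
    posterior=ToyDataset({"chain": [0, 1], "draw": [0, 1, 2], "school": ["a", "b"]}),
    observed_data=ToyDataset({"school": ["a", "b"]}),
)
# "chain" is ignored for observed_data, which has no such dimension.
idata.sel(chain=[0], school=["a"])
```

Like the pattern above, this sketch modifies the groups in place; whether to do that or return a new object is exactly the kind of choice an inplace/copy argument would control.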

In addition to groups, we should think about other ArviZ-specific arguments that would be common to most of these functions and not passed on to xarray. Maybe inplace and/or copy?

Also, the groups argument could accept both proper group names and metagroups, so that one keyword represents several proper groups. We could go as far as adding the metagroups dict to rcParams. One metagroup example could be "posteriors" -> ("posterior", "sample_stats", "log_likelihood", "posterior_predictive")
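
A rough sketch of how metagroup expansion might work. The function name expand_groups, the METAGROUPS table and the choice to skip metagroup members that are absent from the object are all assumptions made for illustration; nothing like this is confirmed to exist in ArviZ:

```python
# Hypothetical metagroup table; the idea above is that it could live in rcParams.
METAGROUPS = {
    "posteriors": ("posterior", "sample_stats", "log_likelihood", "posterior_predictive"),
}


def expand_groups(groups, available, metagroups=METAGROUPS):
    """Replace metagroup names by their member groups, keeping order and dropping duplicates."""
    if isinstance(groups, str):
        groups = [groups]
    expanded = []
    for name in groups:
        if name in metagroups:
            # Metagroup members missing from this object are skipped silently.
            members = [g for g in metagroups[name] if g in available]
        elif name in available:
            members = [name]
        else:
            raise KeyError(f"Group {name!r} not found")
        for group in members:
            if group not in expanded:
                expanded.append(group)
    return expanded


available = ["posterior", "sample_stats", "prior"]
selected = expand_groups(["prior", "posteriors"], available)
```

An explicitly named group that is missing still raises, so typos are not hidden by the more lenient metagroup handling.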

Some ideas of functions that could fit in this category are:

  • .isel
  • stack and unstack
  • rename, rename_dims and rename_vars
  • .load and other Dask-related methods like chunk, which would become interesting once ArviZ starts becoming Dask friendly.

Many dataset methods make sense to extend, so I think we should focus on the ones that solve the most issues on our side. For example, if we make an apply_ufunc extension compatible with inference data, or extend the map method, then mean, median, max… are not really necessary, only convenient, whereas other methods may have no alternative.
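
To illustrate why a generic map would make the individual reductions merely convenient, here is a small sketch. It uses plain nested dicts (group -> variable -> list of draws) as a toy stand-in for InferenceData, and map_groups is a hypothetical name, not an existing ArviZ function:

```python
from statistics import mean, median


def map_groups(idata, fun, groups=None, **kwargs):
    """Apply fun to every variable of the selected groups and return the results."""
    groups = list(idata) if groups is None else groups
    return {
        group: {var: fun(values, **kwargs) for var, values in idata[group].items()}
        for group in groups
    }


# Toy stand-in for InferenceData: group -> variable -> flat list of draws.
idata = {
    "prior": {"mu": [0.0, 2.0, 4.0]},
    "posterior": {"mu": [1.0, 2.0, 6.0]},
}

means = map_groups(idata, mean)
medians = map_groups(idata, median)
maxima = map_groups(idata, max, groups=["posterior"])
```

With one such method available, dedicated mean, median or max methods would only save the user from passing the function explicitly.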

Commenting on the ones you expect to use the most seems like a good way to choose where to begin.

Specific inference data methods

This category requires a much more detailed and custom implementation. Some examples that would fall here are:

  • InferenceData html repr
  • InferenceData method to print all dims, coords and variables in all groups at once. Maybe add an option to show values? By default, no values would be shown, to keep the output readable. In Jupyter this may not make much sense because the html repr could already cover it, but in terminal-like environments it would.
  • Extension to xr.where to select from one group with a condition on another, ideally similar to pandas query function
  • InferenceData compatible apply_ufunc to apply the same transform to several groups, e.g. shift and rescale all values in prior_predictive, posterior_predictive, predictions and observed_data (could also be an extension to map but apply_ufunc should be more versatile)
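
As an illustration of the cross-group selection idea in the list above, the following sketch keeps the posterior draws for which a condition on sample_stats holds. The data layout (nested dicts of flat lists), the function name where_from and the diverging variable are assumptions made for this example. Note that it drops the non-matching draws, in the spirit of a pandas-style query, whereas xr.where would keep the shape and fill in replacement values:

```python
def where_from(idata, group, cond_group, cond_var, predicate):
    """Return the draws of `group` for which predicate(cond_group[cond_var]) holds."""
    mask = [predicate(value) for value in idata[cond_group][cond_var]]
    return {
        var: [v for v, keep in zip(values, mask) if keep]
        for var, values in idata[group].items()
    }


# Toy stand-in for InferenceData: group -> variable -> flat list of draws.
idata = {
    "posterior": {"mu": [0.1, 0.5, 0.9, 1.3]},
    "sample_stats": {"diverging": [False, True, False, False]},
}

# Keep only the posterior draws that did not diverge.
non_divergent = where_from(
    idata, "posterior", "sample_stats", "diverging", lambda d: not d
)
```

A real implementation would also need to align the two groups on their shared dimensions (chain and draw) before masking, which this flat-list sketch sidesteps.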

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

4 reactions
percygautam commented, Mar 1, 2020

Regarding the specific inference data method "InferenceData html repr":

I have been working on this for some time. A possible implementation, which uses xarray’s implementation as a reference, is here: https://dfm.io/nbview/?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpercygautam%2Farviz-examples%2Fmaster%2FHTML%2520repr.ipynb

1 reaction
OriolAbril commented, Mar 1, 2020

This got me thinking, should inference data objects have a name attribute?
