Unexpected behavior when using `pm.hpd` with multidimensional (ndim>2) posterior arrays

Describe the bug

I am using hpd on posteriors with more than 2 dimensions. One example of where this comes up is when a 2D array represents the coefficients for the interaction between the levels of 2 different predictors. The output trace of this 2D array of coefficients from PyMC3 is then 3D, with the 0th dimension representing the MCMC samples.

I find that with a 3D trace/array, hpd treats dimension 1 as the MCMC dimension (instead of dimension 0, the actual MCMC dimension).

To Reproduce

import numpy as np
import pymc3 as pm

# 12000 MCMC samples from a 10x3-dimensional distribution
a = np.random.normal(size=(12000, 10, 3))
pm.hpd(a).shape

Output: (3, 12000, 2)
Should instead be: (10, 3, 2)

This happens because hpd assumes that any array with ndim>1 has exactly ndim==2: https://github.com/arviz-devs/arviz/blob/master/arviz/stats/stats.py#L346

`for row in ary.T` works properly for arrays that have 2 dimensions, but for anything larger than that, it infers the wrong MCMC dimension.
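The shape mismatch can be seen with NumPy alone. The sketch below only illustrates how transposing and iterating a 3D array behaves; it does not call hpd, and the variable names are chosen for this example.

```python
import numpy as np

# Same shape as the reproduction above: (samples, 10, 3)
a = np.zeros((12000, 10, 3))

# .T reverses all axes, so the sample axis ends up last
transposed_shape = a.T.shape

# Iterating over a.T walks the first axis of the transposed array (length 3);
# each "row" is still 2D, with the samples along its last axis
row_shapes = [row.shape for row in a.T]

print(transposed_shape)  # (3, 10, 12000)
print(row_shapes)        # [(10, 12000), (10, 12000), (10, 12000)]
```

For a 2D array of shape (samples, variables), each row of the transpose is a 1D vector of samples for one variable, which is why the loop only behaves as intended in that case.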

Suggestion: Would it make sense to have an axis parameter to specify the axis of the MCMC samples, much like np.mean and similar functions?
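As a rough illustration of what such an axis parameter could look like, here is a minimal NumPy sketch. The function name `hpd_along_axis`, its signature, and the default interval of 0.94 are assumptions made for this example; this is not the PyMC3 or ArviZ implementation. It finds the narrowest interval containing the requested fraction of sorted samples along the chosen axis.

```python
import numpy as np


def hpd_along_axis(ary, axis=0, credible_interval=0.94):
    """Hypothetical helper: narrowest interval along `axis` (not library code)."""
    ary = np.asarray(ary, dtype=float)
    # Move the sample axis to the end and sort the samples
    sorted_ary = np.sort(np.moveaxis(ary, axis, -1), axis=-1)
    n = sorted_ary.shape[-1]
    # Number of index steps spanned by one candidate interval
    span = int(np.floor(credible_interval * n))
    n_intervals = n - span
    if n_intervals <= 0:
        raise ValueError("Too few samples for the requested interval")
    # Width of every candidate interval, then pick the narrowest one
    widths = sorted_ary[..., span:] - sorted_ary[..., :n_intervals]
    lo_idx = np.argmin(widths, axis=-1)[..., np.newaxis]
    lower = np.take_along_axis(sorted_ary, lo_idx, axis=-1)[..., 0]
    upper = np.take_along_axis(sorted_ary, lo_idx + span, axis=-1)[..., 0]
    # Last axis holds (lower, upper); the sample axis is gone
    return np.stack([lower, upper], axis=-1)


rng = np.random.default_rng(0)
a = rng.normal(size=(12000, 10, 3))
intervals = hpd_along_axis(a, axis=0)
print(intervals.shape)  # (10, 3, 2)
```

With the sample axis stated explicitly, the result keeps the remaining dimensions in their original order, which matches the expected (10, 3, 2) shape from the report.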

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
percygautam commented, Feb 17, 2020

I’d like to work on this.

1 reaction
narendramukherjee commented, Oct 31, 2019

@AlexAndorra Yeah, I used the concatenation and transposition idea that you outlined for my use case, but I’m wondering whether it would be good to have some sort of general structure in arviz that clears up the confusion that multiple array dimensions might cause.
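The workaround referred to in this comment is not reproduced on this page. One plausible reading, sketched here under that assumption, is to flatten the trailing parameter dimensions into a 2D (samples, variables) array, run a 2D-only interval routine, and reshape the result back. The `interval_2d` function is a hypothetical stand-in so the example runs without PyMC3 or ArviZ; it computes an equal-tailed percentile interval, not an HPD interval.

```python
import numpy as np


def interval_2d(samples_2d):
    # Stand-in for a routine that only handles (samples, variables) input.
    # NOTE: equal-tailed percentile interval, used here in place of HPD.
    return np.percentile(samples_2d, [3, 97], axis=0).T  # (variables, 2)


rng = np.random.default_rng(0)
trace = rng.normal(size=(12000, 10, 3))

n_samples = trace.shape[0]
param_shape = trace.shape[1:]  # (10, 3)

# Collapse all parameter dimensions into one: (12000, 30)
flat = trace.reshape(n_samples, -1)

# 2D-only computation: one (lower, upper) pair per flattened variable
flat_intervals = interval_2d(flat)  # (30, 2)

# Restore the original parameter layout: (10, 3, 2)
intervals_nd = flat_intervals.reshape(param_shape + (2,))
print(intervals_nd.shape)  # (10, 3, 2)
```

Because both reshapes use NumPy's default row-major order, `intervals_nd[i, j]` corresponds to the samples in `trace[:, i, j]`.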
