az.from_cmdstanpy fails from latest version of cmdstanpy (v0.9.68) for 2d params

Describe the bug

After updating to the latest version of cmdstanpy, az.from_cmdstanpy fails when the dims provided for a parameter have length > 1.

The error message is:

ValueError: different number of dimensions on data and dims: 3 vs 4

The stacktrace is provided below.

The problem appears to be that az.data.io_cmdstanpy._unpack_fit returns a flattened result for each parameter, rather than mirroring the shape of the parameter along the lines of what cmdstanpy.CmdStanMCMC.stan_variable does. This does not cause an error for scalar or 1d parameters, but does cause an error for 2+d parameters where the dims are provided.
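The mismatch can be sketched with NumPy alone. This is only an illustration of the shapes involved, not the ArviZ or cmdstanpy code: the sizes, the dimension names "row" and "col", and the column-major ordering of the flattened columns are assumptions made for the example.

```python
import numpy as np

# Hypothetical sizes for illustration: 4 chains, 100 draws, a 2x3 matrix parameter.
n_chains, n_draws = 4, 100
n_rows, n_cols = 2, 3

# Flattened layout: the parameter occupies a single trailing axis of 6 columns.
# Assumption: columns are in column-major order
# (theta[1,1], theta[2,1], theta[1,2], ...).
flat = np.arange(n_chains * n_draws * n_rows * n_cols, dtype=float).reshape(
    n_chains, n_draws, n_rows * n_cols
)

# Dimension names for the parameter, after "chain" and "draw" are prepended.
# "row" and "col" are hypothetical names.
dims = ["chain", "draw", "row", "col"]

# 3 axes of data vs 4 dimension names: the same mismatch the error reports.
print(flat.ndim, len(dims))

# Restoring the parameter's shape makes the counts agree. order="F" keeps the
# chain and draw axes in place and unpacks the last axis column-major.
shaped = flat.reshape(n_chains, n_draws, n_rows, n_cols, order="F")
print(shaped.shape)
```

For a scalar or 1d parameter the flattened and shaped arrays have the same number of axes, which is consistent with the error appearing only for parameters with two or more dimensions.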

To Reproduce

See this gist

Expected behavior

The InferenceData object should be created from a CmdStanMCMC object.

Additional context

Relevant parts of the stacktrace:

~/projects/workflow2/workflow/models/stanmodel.py in prepare_inference_data(cls, fit, coords, prior_fit, stan_data, **kwargs)
    244         if prior_fit is not None:
    245             input_args = dict(prior=prior_fit, **input_args)
--> 246         idata = az.from_cmdstanpy(**input_args)
    247         # add information about the stan model class, etc.
    248         run_id = cls._get_run_id(idata)

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/io_cmdstanpy.py in from_cmdstanpy(posterior, posterior_predictive, predictions, prior, prior_predictive, observed_data, constant_data, predictions_constant_data, log_likelihood, coords, dims, save_warmup)
    657     InferenceData object
    658     """
--> 659     return CmdStanPyConverter(
    660         posterior=posterior,
    661         posterior_predictive=posterior_predictive,

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/io_cmdstanpy.py in to_inference_data(self)
    333             save_warmup=self.save_warmup,
    334             **{
--> 335                 "posterior": self.posterior_to_xarray(),
    336                 "sample_stats": self.sample_stats_to_xarray(),
    337                 "posterior_predictive": self.posterior_predictive_to_xarray(),

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/base.py in wrapped(cls, *args, **kwargs)
     44                 if all([getattr(cls, prop_i) is None for prop_i in prop]):
     45                     return None
---> 46             return func(cls, *args, **kwargs)
     47 
     48         return wrapped

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/io_cmdstanpy.py in posterior_to_xarray(self)
     93 
     94         return (
---> 95             dict_to_dataset(data, library=self.cmdstanpy, coords=coords, dims=dims),
     96             dict_to_dataset(data_warmup, library=self.cmdstanpy, coords=coords, dims=dims),
     97         )

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/base.py in dict_to_dataset(data, attrs, library, coords, dims, skip_event_dims)
    238     data_vars = {}
    239     for key, values in data.items():
--> 240         data_vars[key] = numpy_to_data_array(
    241             values, var_name=key, coords=coords, dims=dims.get(key), skip_event_dims=skip_event_dims
    242         )

~/.local/share/virtualenvs/workflow2-PgZLfFHB/src/arviz/arviz/data/base.py in numpy_to_data_array(ary, var_name, coords, dims, skip_event_dims)
    197     # filter coords based on the dims
    198     coords = {key: xr.IndexVariable((key,), data=coords[key]) for key in dims}
--> 199     return xr.DataArray(ary, coords=coords, dims=dims)
    200 
    201 

~/.local/share/virtualenvs/workflow2-PgZLfFHB/lib/python3.8/site-packages/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, indexes, fastpath)
    401             data = _check_data_shape(data, coords, dims)
    402             data = as_compatible_data(data)
--> 403             coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    404             variable = Variable(dims, data, attrs, fastpath=True)
    405             indexes = dict(

~/.local/share/virtualenvs/workflow2-PgZLfFHB/lib/python3.8/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
    119         dims = tuple(dims)
    120     elif len(dims) != len(shape):
--> 121         raise ValueError(
    122             "different number of dimensions on data "
    123             "and dims: %s vs %s" % (len(shape), len(dims))

ValueError: different number of dimensions on data and dims: 3 vs 4