Shape broadcast error (in PyMC) for model with


Hello all,

Thanks again for this awesome package! I am having some trouble getting Bambi to build a model with a fairly complex set of terms (a spline term interacting with a categorical predictor, plus a three-way interaction between the same categorical predictor and two continuous predictors, plus a group-specific effect). However, the problem does not seem to be caused by Bambi but by PyMC: formulae will build the design matrices (though it will not convert them to a pandas DataFrame), but PyMC throws a ValueError when trying to broadcast operands together. I will head to the PyMC repo if needed, but thought I would ask here first!

Here is a minimal reproducible example that mimics the data structure and the exact specification I am using. It gives the same error, though the actual dataset is much bigger:

import bambi as bmb
import numpy as np
import pandas as pd

# 10 sessions x 6 states = 60 rows
test_data = {
    'session': np.repeat([f'pid_{x}' for x in np.arange(0, 10)], 6),
    'state': np.tile(['lonely', 'depressed', 'hopeful', 'stressed', 'positive', 'isolated'], 10),
    'level': np.repeat(np.random.normal(0, 1, size=5), 12),
    'trait': np.repeat(np.random.normal(0, 1, size=10), 6),
    'time': np.repeat(np.arange(0, 5), 12),
    'rating': np.random.randint(1, 6, size=60),
}

# Store the grouping and state columns as categoricals
test_data = (pd.DataFrame(test_data)
             .assign(state=lambda x: x['state'].astype('category'),
                     session=lambda x: x['session'].astype('category'))
            )

# Spline-by-state interaction, three-way interaction, and a group-specific intercept
tester_mod = bmb.Model('rating ~ bs(time, degree=2, df=3) * state + state * level * trait + (1|session)', data=test_data)
tester_mod.build()

I get a ValueError from PyMC: operands could not be broadcast together with shapes (5,) (15,). I can’t think why this is happening, but the offending term seems to be bs(time, degree=2, df=3) * state: dropping the interaction lets the model build.
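For reference, the same message can be triggered with plain NumPy, independent of Bambi or PyMC. This is only a minimal sketch of the shape clash. Treating the 5 as one value per non-reference level of `state` and the 15 as 3 spline columns times those 5 levels is an assumption for illustration, not something the traceback confirms. The helper `try_add` is hypothetical.

```python
import numpy as np


def try_add(a, b):
    """Return a + b, or the ValueError message if the shapes do not broadcast."""
    try:
        return a + b
    except ValueError as err:
        return str(err)


# Assumed interpretation of the two shapes in the error message
per_level = np.zeros(5)         # e.g. one entry per non-reference level of `state`
per_interaction = np.zeros(15)  # e.g. 3 spline columns x 5 non-reference levels

message = try_add(per_level, per_interaction)
print(message)
```

Arrays of length 5 and 15 cannot be broadcast because neither length is 1 and they are not equal, which is all the error is reporting.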

Is this related to the coords Bambi/formulae give categorical predictors? I coded my own model in PyMC to work around this using patsy, and I noted that the intercept term (and thus one of the state level predictions) has basically no variability after estimation, and I am guessing Bambi does something clever behind the scenes to negate this! Any help hugely appreciated!

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
tomicapretto commented, Apr 27, 2022

@alexjonesphd I’m sorry about that issue. And yes, please open an issue in formulae to see if we can solve it.

1 reaction
tomicapretto commented, Apr 26, 2022

@alexjonesphd this was a problem with how formulae was handling the levels of interactions involving numerical terms with more than one column (as is the case with a bs() call). If you install formulae from the development version, it should work now.
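To illustrate why a multi-column numerical term changes the number of interaction columns, here is a rough NumPy sketch. It is not formulae's implementation: the spline basis is replaced by random numbers, and treatment coding with the first level as reference is assumed for the 6-level factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 60

# Stand-in for the 3-column basis of bs(time, degree=2, df=3).
# Random values are used here; this is NOT a real spline basis.
basis = rng.normal(size=(n_obs, 3))

# Treatment-coded dummies for a 6-level factor (assumed coding):
# 5 columns, with level 0 as the reference level.
levels = np.tile(np.arange(6), 10)
dummies = (levels[:, None] == np.arange(1, 6)).astype(float)

# Interaction: every basis column multiplied by every dummy column, row by row
interaction = np.einsum("ij,ik->ijk", basis, dummies).reshape(n_obs, -1)

print(basis.shape, dummies.shape, interaction.shape)
```

Under these assumptions the interaction has 3 x 5 = 15 columns, so any labels or coordinates attached to it need 15 entries rather than 5, which matches the two shapes in the reported error.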

Do

pip uninstall formulae
pip install git+https://github.com/bambinos/formulae.git

It’s working on my side now
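After reinstalling, it can help to confirm which versions are actually installed in the active environment. The snippet below is a small sketch using only the standard library. The helper name `installed_version` is made up for illustration, and the version string reported for a development install will depend on the state of the repository at install time.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("formulae:", installed_version("formulae"))
print("bambi:", installed_version("bambi"))
```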


