Design how we're going to extend Bambi


The following is a list of features that are missing, or only partially covered, in Bambi:

  • Distributional models (we model more than the mean parameter of the response)
  • Multivariate models (i.e., the response follows a multivariate distribution)
  • Non-linear models.
  • Survival models/Models with censored data.
  • Ordinal models.
  • Zero and Zero-One inflated models.

The last three items (survival/censored, ordinal, and zero/zero-one inflated models) are covered by the first two (distributional and multivariate models) if we implement them appropriately. The third item, non-linear models, is a separate problem. Below are a few things I’ve been thinking about lately.


Distributional models

Some API proposals

import bambi as bmb

# Proposed API: bmb.formula() and a dictionary-valued link are not implemented at the time of writing.
formula = bmb.formula(
    "y ~ a + b",
    "sigma ~ a",
)
priors = {
    "a": bmb.Prior("Normal", mu=0, sigma=1),
    "b": bmb.Prior("Normal", mu=0, sigma=1),
    "sigma_Intercept": bmb.Prior("Normal", mu=0, sigma=1),
    "sigma_a": bmb.Prior("Normal", mu=0, sigma=1),
}
link = {"mu": "identity", "sigma": "log"}
# priors and link are passed by keyword because the third positional argument of Model is family.
model = bmb.Model(formula, data, priors=priors, link=link)

  • We need a formula object that can hold multiple formula parts. I propose to call it bmb.formula(). There’s an open discussion in #423.
  • We need a naming scheme for the terms associated with the auxiliary parameters. I propose {param}_{term}, such as sigma_a.
  • We need to transform the linear predictor of each auxiliary parameter onto a scale that is valid for that parameter. I propose we have defaults for the built-in families that can be overridden with a dictionary. Note that the link argument in Model does not currently accept a dictionary.
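
The third point can be sketched in plain Python. This is only an illustration of "family defaults overridden by a dictionary": DEFAULT_LINKS, resolve_links and prior_name are hypothetical names, and the default table is made up for the example rather than taken from Bambi.

```python
# Hypothetical table of per-family default links (illustrative values only).
DEFAULT_LINKS = {
    "gaussian": {"mu": "identity", "sigma": "log"},
}


def resolve_links(family, link=None):
    """Merge user-supplied links over the defaults of the given family."""
    defaults = DEFAULT_LINKS[family]
    if link is None:
        return dict(defaults)
    if isinstance(link, str):
        # A single string keeps today's behaviour: it applies to the mean only.
        link = {"mu": link}
    unknown = set(link) - set(defaults)
    if unknown:
        raise ValueError(f"Unknown parameters for '{family}': {sorted(unknown)}")
    return {**defaults, **link}


def prior_name(param, term):
    """Build the proposed {param}_{term} label for an auxiliary-parameter term."""
    return f"{param}_{term}"


links = resolve_links("gaussian", {"sigma": "identity"})
```

Accepting both a string and a dictionary would keep existing calls working while allowing per-parameter overrides.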

I haven’t thought much about the implementation details yet, and other concerns may appear there. For the moment, I think it’s good to discuss the API we want. Any objections, suggestions, or drawbacks I’m not seeing?

Multivariate models

We currently support some multivariate families, such as "categorical" and "multinomial". I feel we should think more about the implementation: it could be made more general so that we don’t have to handle every family as a special case. That said, there are other things to discuss.

  • What do we use to indicate a multivariate response?
"c(y1, y2, ..., yn) ~ ..."
"mvbind(y1, y2, ..., yn) ~ ..."
bmb.formula("y1 ~ ...", "y2 ~ ...", "y3 ~ ...")

Note that the last alternative allows different predictors to be included for each response.

  • How much do we want to support multivariate families?

I’m not an expert in this area, but I have the feeling that things can get very complex very quickly. I’m also not sure this feature is in high demand.

For now, I tend to think we should offer minimal support that lets users, and us, explore the available possibilities and refine the API.
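
To make the third alternative for declaring a multivariate response concrete, here is a toy container that accepts several formula strings and keeps one right-hand side per response. Formula is a hypothetical stand-in for the proposed bmb.formula(), not an existing Bambi class, and it only splits strings at "~" without parsing terms.

```python
class Formula:
    """Toy stand-in for the proposed bmb.formula(); not a real Bambi class."""

    def __init__(self, *parts):
        self.parts = {}
        for part in parts:
            if "~" not in part:
                raise ValueError(f"Missing '~' in formula part: {part!r}")
            response, predictors = part.split("~", 1)
            response = response.strip()
            if response in self.parts:
                raise ValueError(f"Duplicated response: {response!r}")
            # Each response keeps its own set of predictors.
            self.parts[response] = predictors.strip()

    @property
    def responses(self):
        return list(self.parts)


formula = Formula("y1 ~ x1 + x2", "y2 ~ x1", "y3 ~ x3")
```

The same container could serve distributional models if the left-hand side were allowed to name an auxiliary parameter instead of a response.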

Non-linear models

This has been discussed a little in #448. I think it’s a very nice feature to have, but I haven’t worked it out in my mind yet. All I have are some API proposals, and I don’t see how to implement them without a huge effort.

First:

formula = bmb.formula(
    "y ~ b1 * np.exp(b2 * x)",
    nlpars=("b1", "b2")    
)

But this comes with a major problem: how do we override the meaning of the * operator in the formula syntax? If we pass something like that to formulae, it won’t multiply things by b1 or b2; it will try to construct full interaction terms between the operands. I like how this approach looks, but it would require a huge amount of effort to parse terms and parameters.
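
The clash can be seen with a toy expansion of the * operator as formula syntax reads it, where a * b stands for a + b + a:b. The helper expand_star is hypothetical and handles only a single top-level *; it is not how formulae parses expressions.

```python
def expand_star(expression):
    """Expand 'left * right' as formula syntax does: main effects plus interaction.

    Toy version: splits only at the first '*' and does no real parsing.
    """
    left, right = (side.strip() for side in expression.split("*", 1))
    return [left, right, f"{left}:{right}"]


terms = expand_star("b1 * np.exp(b2 * x)")
# terms holds a main effect for b1, one for np.exp(b2 * x), and their interaction,
# rather than the arithmetic product the non-linear model intends.
```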

Another alternative would be to use a function.

def f(x, b1, b2):
    return b1 * np.exp(b2 * x)

formula = bmb.formula(
    "y ~ f(x, b1, b2)",
    nlpars=("b1", "b2")    
)

This would work on the formulae side, but again we would need to do some parsing to recover the non-linear relationship between the parameters (b1 and b2) and the predictor x. How do we handle arbitrarily complex functions? I’m not sure.
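
One part of the function-based approach is straightforward: telling data arguments apart from non-linear parameters by inspecting the function signature. The sketch below uses the standard-library inspect module; split_arguments is a hypothetical helper and math.exp replaces np.exp so the example has no dependencies. It treats f as a black box and does not address how priors would be attached to b1 and b2.

```python
import inspect
import math


def f(x, b1, b2):
    return b1 * math.exp(b2 * x)


def split_arguments(func, nlpars):
    """Separate a function's argument names into data names and non-linear parameters."""
    names = list(inspect.signature(func).parameters)
    missing = [name for name in nlpars if name not in names]
    if missing:
        raise ValueError(f"nlpars not found in signature: {missing}")
    data = [name for name in names if name not in nlpars]
    params = [name for name in names if name in nlpars]
    return data, params


data_names, param_names = split_arguments(f, nlpars=("b1", "b2"))
```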

Survival models/Models with censored data.

#543 adds support for survival analysis with right-censored data. One drawback of the proposal is that family="exponential" always implies right-censored data. I think we should have something more general.

I imagine all of the following cases working:

bmb.Model("y ~ ...", data, family="exponential")
bmb.Model("censored(y, status) ~ ...", data, family="exponential")
bmb.Model("censored(y, status, 'left') ~ ...", data, family="exponential")

The challenge is that censored() should be a function that returns an array-like structure (so formulae knows how to handle it) with some attribute that enables Bambi to figure out the characteristics of the censoring. I’m not sure how to implement this but I know it’s feasible.
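
A minimal sketch of the idea, assuming a plain Python list subclass is enough to show the shape of the solution. CensoredResponse and its status and kind attributes are hypothetical; a real implementation would more likely wrap a NumPy array so that formulae can place it in the design matrices.

```python
class CensoredResponse(list):
    """Observed values that also carry censoring metadata as attributes."""

    def __init__(self, values, status, kind="right"):
        if kind not in ("left", "right"):
            raise ValueError("kind must be 'left' or 'right'")
        values, status = list(values), list(status)
        if len(values) != len(status):
            raise ValueError("values and status must have the same length")
        super().__init__(values)
        self.status = status  # one censoring flag per observation
        self.kind = kind      # direction of the censoring


def censored(values, status, kind="right"):
    """Function meant to be called inside a formula, e.g. censored(y, status, 'left')."""
    return CensoredResponse(values, status, kind)


y = censored([2.5, 4.0, 1.2], [1, 0, 1], "left")
```

A model builder could then check for the kind attribute on the response to decide whether, and how, to apply censoring.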

Ordinal models and Zero and Zero-One inflated models.

I think these come almost for free if we do a good job with the tasks above.

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 9

Top GitHub Comments

2 reactions
canyon289 commented, Jul 15, 2022

Other than the technical implementation, frankly I don’t think it’ll be all that useful, and there’s not a huge user base for it. If people want non-linear models they can just use PyMC to code those up.

The other use cases, imo, are much easier to implement in Bambi and will have a wider user base.

1 reaction
zwelitunyiswa commented, Oct 22, 2022

Nice. I like that structure - very clear. Distributional models are a very cool addition!

On Sat, Oct 22, 2022 at 15:08 Tomás Capretto wrote:

I have new ideas for distributional models

Instead of this

formula = bmb.formula(
    "y ~ a + b",
    "sigma ~ a",
)
priors = {
    "a": bmb.Prior("Normal", mu=0, sigma=1),
    "b": bmb.Prior("Normal", mu=0, sigma=1),
    "sigma_Intercept": bmb.Prior("Normal", mu=0, sigma=1),
    "sigma_a": bmb.Prior("Normal", mu=0, sigma=1),
}
link = {"mu": "identity", "sigma": "log"}
model = bmb.Model(formula, data, priors=priors, link=link)

have this (notice the priors)

formula = bmb.formula(
    "y ~ a + b",
    "sigma ~ a",
)
priors = {
    "y": {
        "a": bmb.Prior("Normal", mu=0, sigma=1),
        "b": bmb.Prior("Normal", mu=0, sigma=1),
    },
    "sigma": {
        "Intercept": bmb.Prior("Normal", mu=0, sigma=1),
        "a": bmb.Prior("Normal", mu=0, sigma=1),
    },
}
link = {"mu": "identity", "sigma": "log"}
model = bmb.Model(formula, data, priors=priors, link=link)

It adds more structure and saves us from having to parse strings to decide which response or parameter a prior belongs to. Also, "_" is very common in variable names, so string parsing would be highly likely to get it wrong.
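
The ambiguity is easy to reproduce. In the sketch below, the data has a column x and another column that happens to be called sigma_x, so the flat key "sigma_x" has two valid readings. flat_key_readings is a hypothetical helper and the string values stand in for bmb.Prior objects.

```python
def flat_key_readings(key, params, columns, response="y"):
    """List every (component, term) pair a flat prior key could refer to."""
    readings = []
    if key in columns:
        # The key is itself a column name: a term of the response model.
        readings.append((response, key))
    for param in params:
        prefix = param + "_"
        if key.startswith(prefix) and key[len(prefix):] in columns:
            # The key also parses as {param}_{term}.
            readings.append((param, key[len(prefix):]))
    return readings


readings = flat_key_readings("sigma_x", params=["sigma"], columns=["x", "sigma_x"])

# The nested layout needs no parsing: each prior is addressed by two explicit keys.
nested = {"y": {"sigma_x": "prior A"}, "sigma": {"x": "prior B"}}
```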

