Generic ParametricUnivariate mixture cure modelSee original GitHub issue
Would it be of interest to have a ParametricUnivariateFitter that adds a ‘cure’ component to a specified non-cure model? This could operate in a manner similar to the flexsurvreg R package - wrapping the functions of some baseline fitter. I suggest it because I’ve found such a thing useful in my own work, when selecting which underlying model to use.
Here is a quickly put together example of what I mean. If there is interest, I’d be willing to get into a better shape, with guidance.
class MixtureCureModel(ParametricUnivariateFitter): def __init__(self, base_fitter, *args, **kwargs): self._base_fitter = base_fitter self._fitted_parameter_names = ["theta__"] + base_fitter._fitted_parameter_names self._bounds = [(0.0, 1.0)] + base_fitter._bounds super().__init__(*args, **kwargs) def _cumulative_hazard(self, params, times): theta__, lambda_, k_ = params sf = self._base_fitter._survival_function(params[1:], times) return -np.log(theta__ + (1 - theta__)*sf) def _create_initial_point(self, Ts, E, *args): self._base_fitter._censoring_type = self._censoring_type base_point = self._base_fitter._create_initial_point(Ts, E, *args) return np.array([.5] + list(base_point))
- Created 4 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
Here are all the versions I’ve considered, with the assumption that long term you’ll want to parallel the capabilities of flexsurvcure, where they make sense. I think you’ve been trying to avoid having any mapping between string parameters and fitters, so there won’t be anything like dist=‘weibull’.
1. Class builder function: This would parallel the way that the sklearn_adapter function works. It takes a non-cure class and a few parameters, and returns a new class that the user then can instantiate.
def cure_model(base_fitter_klass: ParametricUnivariateFitter, link=expit, mixture=True) -> Type['MixtureCureFitter']: ... return new_klass MixtureCureLogLogistic = cure_model(LogLogisticFitter, link=lambda x: exp(-exp(x)), mixture=True) fitter = CureLogLogistic()
2. A Cure Model class with passed instance: A pair of MixtureCureModelFitter and NonMixtureCureModelFitter classes that take instances of a ParametricUnivariateFitter and are implemented in terms of that fitter. Changing things like the cure parameter link function can just be done via subclassing. This has the advantage of totally decoupling how the cure model portion and the base fitter are instantiated. I think it’s probably the easiest to implement approach, but that’s just a guess. It’s what I’ve actually used for my own work.
class MixtureCureModelFitter(ParametricUnivariateFitter): def __init__(self, base_fitter, *args, **kwargs): self._base_fitter = base_fitter ... super().__init__(*args, **kwargs) fitter = MixtureCureModelFitter(LogLogisticFitter())
3. A Cure Model Fitter class with passed class and arguments: A pair of MixtureCureModelFitter and NonMixtureCureModelFitter classes that have a parameter which takes a type that is a subclass of ParametricUnivariateFitter and any needed arguments. Then it instantiates the class itself as needed. I think this is probably a more awkward interface than the previous approach, but there is a lot of state stored in the fitters, and it would allow the Fitter to ‘reset’ the base_fitter if that becomes necessary.
class MixtureCureModelFitter(ParametricUnivariateFitter): def __init__(self, base_fitter, *args, **kwargs): self._base_fitter = base_fitter ... super().__init__(*args, **kwargs) fitter = MixtureCureModel(LogLogisticFitter())
4. Split apart fitters and equations: This is a more aggressive solution that involves a big refactoring of…everything. The portion of a class which represents the survival model equations could be separated from the actual mechanics of fitting the data. Then the equation for the non-cure portion could be wrapped in what I would expect is a straightforward way. I think this is closer to how the R packages do it, but I’m not sure that makes it correct.
Actually, that by itself doesn’t work well. The allowed parameter bounds and default value should probably depend on the specific link function to use (or some kind of future smart default would need to have an inverse function). Maybe the way sklearn scorer specification works would be a reasonable model: allow either a string to use something predefined, or allow an object with a specific interface. Maybe something as simple as a namedtuple that gives the desired function, its inverse, and default bounds.
For reference, here’s the code for R/flexsurvcure (link functions are dealt with about halfway down): https://rdrr.io/cran/flexsurvcure/src/R/flexsurvcure.R