CoxPH throws 'delta contains nan value(s). Convergence halted'
There was a closed issue for this at https://github.com/CamDavidsonPilon/lifelines/issues/242, but it was closed with the direction to ensure no columns had constant values. None of my columns have constant values (my data is attached compressed; GitHub wouldn’t let me attach it as a 2,000-row CSV).
I’m able to fit an AalenAdditiveFitter to this data using the code below:
model = AalenAdditiveFitter()
model.fit(lifelines_df, duration_col='duration', event_col='event_observed')
but I get “ValueError: delta contains nan value(s). Convergence halted.” when fitting a CoxPH model with similar code:
model = CoxPHFitter()
model.fit(lifelines_df, duration_col='duration', event_col='event_observed')
Thanks for this library!
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Here was my work (this is using the latest lifelines, 0.12, released recently):
I got a lifelines warning about two columns, and my fitting failed with a NaN calc. However the warning tells us what the problem is:
So I dropped those:
Again, the fitting failed, this time with a singular-matrix error. This implies some linear dependence between columns.
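For anyone who wants to check for both problems before fitting, here is a minimal sketch using only pandas and numpy. The function name `find_problem_columns` and the two thresholds are assumptions made for illustration; they are not part of lifelines, and the cutoffs lifelines uses for its own warning may differ.

```python
import numpy as np
import pandas as pd


def find_problem_columns(df, exclude=("duration", "event_observed"),
                         min_variance=0.01, corr_threshold=0.999):
    """Return (low_variance_columns, highly_correlated_pairs).

    Hypothetical helper: the thresholds are illustrative assumptions,
    not values taken from lifelines.
    """
    # Only look at numeric covariates, not the duration/event columns.
    covariates = df.drop(columns=list(exclude), errors="ignore")
    numeric = covariates.select_dtypes(include="number")

    # Columns that are almost always the same value (e.g. a rare dummy).
    variances = numeric.var()
    low_variance = variances[variances < min_variance].index.tolist()

    # Pairs of columns that are (nearly) perfectly linearly related.
    corr = numeric.corr().abs()
    cols = list(corr.columns)
    pairs = [
        (cols[i], cols[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
        if corr.iloc[i, j] >= corr_threshold
    ]
    return low_variance, pairs
```

Calling `find_problem_columns(lifelines_df)` would list candidate columns to drop. Pairwise correlation only catches dependence between two columns; a dependence that involves three or more columns would need a rank check on the covariate matrix instead.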
Thanks Cam,
Yeah, the 100% correlation between some columns (‘num7’, ‘num6’) is caused by my doing development on a very small amount of data. In my last comment, I had dropped the same columns you dropped above, but got stuck at
After upgrading to lifelines 0.12 (I had the latest conda-forge version, 0.11.2) and dropping the low-variance variables you dropped (‘dummy1’, ‘dummy7’), I can now reproduce your results. Those “low-variance” dummies were almost always 0 in this split of the data, but not across all of my data.
With cross-validation I don’t always know in advance which dummy vars might be ‘low variance’ for a given data split. I use a cross-validated pipeline and wanted to use a thinly wrapped lifelines model as my estimator at the end of the pipeline. But I can’t dynamically drop whichever dummy columns are low-variance at the end of my pipeline, because the final step (the model) needs to see the same columns in fit and predict. I’ll give this some more thought and maybe post an SO question if I can’t figure it out. Thanks again for the help and the library. Feel free to close this issue.
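One possible way around the fit/predict mismatch is to decide which columns to drop once, on the training split, and then reuse that decision on every later frame. The sketch below assumes a small helper class, `LowVarianceDropper`, which is hypothetical and not part of lifelines or scikit-learn; the variance threshold is also an assumption.

```python
import pandas as pd


class LowVarianceDropper:
    """Hypothetical helper: learn the columns to drop at fit time and
    drop exactly the same columns on every later frame."""

    def __init__(self, min_variance=0.01,
                 protected=("duration", "event_observed")):
        self.min_variance = min_variance  # illustrative threshold
        self.protected = tuple(protected)  # never drop these columns
        self.dropped_ = None

    def fit(self, df):
        candidates = (
            df.drop(columns=list(self.protected), errors="ignore")
              .select_dtypes(include="number")
        )
        variances = candidates.var()
        # Remember the decision made on the training split.
        self.dropped_ = variances[variances < self.min_variance].index.tolist()
        return self

    def transform(self, df):
        if self.dropped_ is None:
            raise RuntimeError("call fit() before transform()")
        # Apply the stored decision, even if the column varies in this frame.
        return df.drop(columns=self.dropped_, errors="ignore")
```

Inside a thin wrapper, the training frame would go through `fit` and `transform` before being passed to `CoxPHFitter().fit(...)`, and any frame used for prediction would go through `transform` alone, so the model sees the same columns both times.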