CoxTimeVaryingFitter is actually faster than CoxPHFitter...
CoxPHFitter test:
import pandas as pd
import time
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi
df = load_rossi()
# stack 20 copies of the dataset so that durations are heavily tied
df = pd.concat([df] * 20)
cp = CoxPHFitter()
start_time = time.time()
cp.fit(df, duration_col="week", event_col="arrest")
print("--- %s seconds ---" % (time.time() - start_time))
cp.print_summary()
This takes about 2.3 seconds.
CoxTimeVaryingFitter test:
import time
import pandas as pd
from lifelines import CoxTimeVaryingFitter
from lifelines.datasets import load_rossi
from lifelines.utils import to_long_format
df = load_rossi()
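# same 20x stacked dataset as in the CoxPHFitter test
df = pd.concat([df] * 20)
# move the old index into an "index" column, used as id_col below
df = df.reset_index()
# convert to (start, stop] format: adds a "start" column and renames "week" to "stop"
df = to_long_format(df, duration_col='week')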
ctv = CoxTimeVaryingFitter()
start_time = time.time()
ctv.fit(df, id_col="index", event_col="arrest", start_col="start", stop_col="stop")
time_took = time.time() - start_time
print("--- %s seconds ---" % time_took)
ctv.print_summary()
This takes about 1.65 seconds.
Note that the two datasets are identical, and so are the results (as expected). The internal difference is that CoxPHFitter looks at each row individually, while CoxTimeVaryingFitter looks at all rows grouped by duration. The latter is much more efficient when there are lots of ties (i.e. when the ratio of unique durations to row count is low).
This is kinda shocking to me. It means I can improve CoxPHFitter performance by roughly 30%.
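To make the row-by-row vs. grouped-by-duration distinction concrete, here is a rough sketch. It is not the lifelines implementation: it assumes Breslow handling of ties for simplicity, the per-row version is deliberately naive, and the function names and toy data are made up for illustration. Both functions compute the same partial log-likelihood, but the grouped one updates the risk set and takes a log only once per unique duration.

```python
import math
from itertools import groupby


def loglik_per_row(durations, events, scores):
    """Breslow partial log-likelihood, visiting every event row individually."""
    total = 0.0
    for t_i, e_i, s_i in zip(durations, events, scores):
        if not e_i:
            continue  # censored rows contribute no term of their own
        # Risk set: every subject still under observation at time t_i.
        risk = sum(math.exp(s_j) for t_j, s_j in zip(durations, scores) if t_j >= t_i)
        total += s_i - math.log(risk)
    return total


def loglik_grouped(durations, events, scores):
    """Same quantity, with one risk-set update per unique duration."""
    # Sort by duration, longest first, so the risk set only ever grows.
    rows = sorted(zip(durations, events, scores), key=lambda r: r[0], reverse=True)
    total = 0.0
    risk = 0.0
    for _, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        # All rows tied at this duration enter the risk set together.
        risk += sum(math.exp(s) for _, _, s in group)
        n_events = sum(e for _, e, _ in group)
        if n_events:
            total += sum(s for _, e, s in group if e) - n_events * math.log(risk)
    return total


# Hypothetical toy data: 6 rows but only 3 unique durations (lots of ties).
durations = [5, 5, 3, 3, 3, 8]
events = [1, 0, 1, 1, 0, 1]
scores = [0.2, -0.1, 0.4, 0.0, 0.3, -0.5]  # stand-in for x @ beta

per_row = loglik_per_row(durations, events, scores)
grouped = loglik_grouped(durations, events, scores)
tie_ratio = len(set(durations)) / len(durations)
```

With the dataset stacked 20 times, the number of unique durations stays fixed while the row count grows, so the grouped loop does the same number of iterations and the per-row loop does 20 times as many.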
Top GitHub Comments
Down to ~0.17 seconds with #609
Down to 0.67 seconds with #595