Using weighted data in proportional_hazard_test() for CoxPH
See original GitHub issueI’ve been comparing CoxPH results for R’s Survival and Lifelines, and I’ve noticed huge differences for the output of the test for proportionality when I use weights instead of repeated rows. The hazard ratio estimate and CI’s are very close, but the proportionality chisq is very different.
I’ve attached a csv (txt because Github) with sample data.
The R (Survival) code is:
> hr_data <- lapply(hr, as.numeric) #only need this when using weights
> res.cox <- coxph(Surv(daysToEvent, hasOutcome) ~ cohort, data = hr_data, weights = count, robust = FALSE)
> x <- cox.zph(res.cox, transform = "km")
> summary(res.cox)
> print(x)
The Lifelines code is:
cph = CoxPHFitter()
cph.fit(
df=hr_df, # hr data as a dataframe
duration_col="daysToEvent",
event_col="hasOutcome",
weights_col="count",
show_progress=False,
robust=False,
)
hazard_ratio = cph.hazard_ratios_.at["cohort"]
lower_ci = math.exp(cph.confidence_intervals_.values[0][0])
upper_ci = math.exp(cph.confidence_intervals_.values[0][1])
proportional_test = proportional_hazard_test(cph, hr_df, time_transform="km")
For the attached data, using weights, I get from Lifelines:
"hazardRatio": 0.5070670424582165,
"lower_ci": 0.2374970008403611,
"upper_ci": 1.0826115051454892
"chisq": 6.030892236208823,
"dof": 1.0,
"p": 0.014057625681369732
from R:
exp(coef) exp(-coef) lower .95 upper .95
cohort 0.5071 1.972 0.2375 1.083
chisq df p
cohort 30.3 1 3.7e-08
Whereas using a row per entry and no weights, I get Lifelines:
"hazardRatio": 0.48501838000070696,
"lower_ci": 0.22532011968933832,
"upper_ci": 1.0440382743576246
"chisq": 29.78133316357846,
"dof": 1.0,
"p": 4.836262532790787e-08
R:
exp(coef) exp(-coef) lower .95 upper .95
cohort 0.485 2.062 0.2253 1.044
chisq df p
cohort 31.7 1 1.8e-08
So the hazard ratio values and errors are in good agreement, but the chi-square for proportionality is way off when using weights in Lifelines (6 vs 30). thanks. hr.txt
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (3 by maintainers)
Top Results From Across the Web
coxphw: Weighted Estimation in Cox Regression
Cox Analysis of Survival Data with Non-Proportional Hazard Functions. J R ... regression (coxphw with template = "PH" or coxph) or weighted.
Read more >Weighted Cox Regression Using the R Package coxphw
Cox's regression model for the analysis of survival data relies on the proportional hazards assumption. However, this assumption is often violated in practice ......
Read more >Does the weight option in coxph function fit the weighted cox ...
No, using the weights gives you a weighted estimator rather than a weighted model. The model is still λ(t,z)=λ0(t)ezβ.
Read more >Testing the proportional hazard assumptions - lifelines
Testing the proportional hazard assumptions¶. This Jupyter notebook is a small tutorial on how to test and fix proportional hazard problems.
Read more >The Stratified Cox Proportional Hazards Regression Model
Once we stratify the data, we fit the Cox proportional hazards model ... The data set we'll use to illustrate the procedure of...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @CamDavidsonPilon , thanks for figuring this out. I can see how these numbers will be different from different regressors/implementations.
I guess tho from my perspective the more immediate issue was that using weighted vs unweighted data produced totally different results.
ack sorry, it’s a high priority but am stuck on it. I haven’t made much progress, unfortunately.