if self.median_ != np.inf
check
cluster_col
is CoxPHFitter: https://lifelines.readthedocs.io/en/latest/Examples.html#correlations-between-subjects-in-a-cox-model. Another solution is to stratify per machine in the CoxPHFitter.
Disappointingly, 0.53 is a bit on the low end. Have you tried a LogNormalAFT model? It can fit some datasets better.
What is the reference of the range of 0.55-0.7?
I think I saw it in Frank H.'s work, maybe his blog?
No, you can't compare CoxPH and WeibullAFT log-likelihood values, mostly because the CoxPH value is a partial likelihood.
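To spell out why the two numbers are not on the same scale, here is a rough sketch in standard textbook notation (the symbols are my own, not from the lifelines docs; ties are ignored). With event indicator $\delta_i$, covariates $x_i$, and risk set $R(t_i)$ of subjects still under observation at time $t_i$:

```latex
% Cox partial likelihood: only the ordering of events matters,
% the baseline hazard cancels out and is never part of the product.
L_{\text{partial}}(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp(x_i^\top \beta)}{\sum_{j \in R(t_i)} \exp(x_j^\top \beta)}

% Parametric (e.g. Weibull AFT) full likelihood: density for observed
% events, survival function for right-censored subjects.
L_{\text{full}}(\theta) = \prod_{i}
  f(t_i \mid x_i; \theta)^{\delta_i}\, S(t_i \mid x_i; \theta)^{1 - \delta_i}
```

The first is a product of probabilities over event orderings, the second a product of densities over the actual durations, so their logs are not comparable (and neither are AIC values built from them).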
Hello. I'm reading the docs on coxph regression (https://lifelines.readthedocs.io/en/latest/Survival%20Regression.html).
a) the badfit image appears to be broken
b) in the rossi dataset example provided, plotting the KM curve against the baseline survival does not show much spread. Is there an included dataset that could be used for this example that would show a bigger spread, like the one in the goodfit picture?
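import lifelines, matplotlib.pyplot as plt
from lifelines.datasets import load_rossi
rossi_dataset = load_rossi()
# Cox model on all rossi covariates (assumed; the original fit was not shown)
cox_prop_hazards = lifelines.CoxPHFitter().fit(rossi_dataset, 'week', 'arrest')
kmf = lifelines.KaplanMeierFitter()
kmf.fit(rossi_dataset['week'], rossi_dataset['arrest'])
fig, ax = plt.subplots()
ax.plot(cox_prop_hazards.baseline_survival_, color='b')
ax.plot(kmf.survival_function_, color='r')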
rossi end-to-end for the regression documentation here, unless I'm misjudging the plot and this really is a good fit
I don't know how to modify the plot produced by lifelines, since I am unfamiliar with "cph.plot_covariate_groups". Unfortunately, there doesn't seem to be a detailed description of it at this link: https://lifelines.readthedocs.io/en/latest/Survival%20Regression.html .
What I am looking for is: (1) how to shorten the time range on the X axis (event days). I don't want to show so many days on the survival curve; ideally it would stop at 4000. (2) If possible, I would also like to remove the baseline survival curve from my image. (3) I would also like to change the colors of the survival curves from orange/blue to something else.
Can anyone give me some feedback, please?
@daturkel_twitter the rossi fit is so-so, partly because the model is so simple (no interaction terms, no higher-order terms, and some variables fail the proportional hazards test). Generally we shouldn't expect huge separation. That visual test is just one test you can use. Predictive performance (looking at the c-index) is another, and so is the log-likelihood ratio test.
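For anyone unsure what the c-index measures, here is a minimal pure-Python sketch of Harrell's concordance index. It is only an illustration: the function name and the "higher score means higher risk" convention are my own choices, tied event times are skipped for simplicity, and lifelines has its own implementation (a fitted model exposes `concordance_index_`) that handles more cases.

```python
from itertools import combinations

def harrell_c_index(times, risk_scores, events):
    """Fraction of comparable pairs where the higher-risk subject fails first.

    times       -- observed durations
    risk_scores -- predicted risk (higher = expected to fail sooner)
    events      -- 1/True if the event was observed, 0/False if censored
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the true order is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 means the ranking is no better than random and 1.0 means a perfect ranking, which is why a score like 0.53 reads as weak.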
What I suggest is to start with a baseline model, and then ask it questions to see if it improves the fit. Ex: do I satisfy the proportional hazards assumption? Does adding a quadratic term improve fit (and make sense)?
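One way to make the "does adding a quadratic term improve fit" question concrete is a likelihood ratio test between the baseline model and the model with the extra term. The sketch below assumes the two models are nested and differ by exactly one parameter (so the test has 1 degree of freedom), and that both were fit on the same data. The helper name and the log-likelihood numbers are hypothetical; in lifelines you would read the real values from each fitted model's `log_likelihood_`.

```python
import math

def likelihood_ratio_test_1df(ll_reduced, ll_full):
    """LRT for two nested models that differ by ONE parameter.

    Returns (test statistic, p-value). For a chi-squared distribution with
    1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (ll_full - ll_reduced)
    if stat < 0:
        raise ValueError("the full model should not fit worse than the reduced one")
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only (not from a real fit)
stat, p = likelihood_ratio_test_1df(ll_reduced=-660.0, ll_full=-658.08)
print(f"LR statistic = {stat:.2f}, p-value = {p:.3f}")
```

This works for two Cox models because both partial likelihoods come from the same family; it does not make a Cox model comparable to a parametric one.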