`predict_survival_function` gives me a dataframe of dimensions 57 x 59, and doing that operation gives me an array of 59 x 57. It seems that in the latter operation the rows correspond to the individuals (the unit of analysis), while in the former the columns correspond to the individuals. When I inspect one row of the result of the latter, I see values like 1.63576020e+01, which is over 100% (i.e. 16.36 * 100%). Shouldn't the former and the latter both give me values in [0, 1]? The former is the probability of surviving past time t, and the latter the probability of death occurring at time t (given that death has not yet occurred?)
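A minimal sketch of the two calls, using the `rossi` dataset and a spline-based `CoxPHFitter` as stand-ins for the data and model described above: both `predict_survival_function` and `predict_hazard` come back with times as rows and subjects as columns, survival probabilities stay in [0, 1], but the hazard is a rate rather than a probability, so values above 1 are possible.

```python
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

# Stand-in data and model: the rossi dataset with a spline baseline hazard.
rossi = load_rossi()
cph_spline = CoxPHFitter(baseline_estimation_method="spline", n_baseline_knots=3)
cph_spline.fit(rossi, duration_col="week", event_col="arrest")

sf = cph_spline.predict_survival_function(rossi)  # rows = times, columns = subjects
hz = cph_spline.predict_hazard(rossi)             # same orientation: times x subjects

print(sf.shape, hz.shape)

# Survival probabilities are bounded in [0, 1] ...
assert ((sf >= 0) & (sf <= 1)).all().all()
# ... but the hazard is a rate, not a probability, so values above 1 are possible.
print(hz.max().max())
```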
Also, for the spline model, what's the difference between `predict_cumulative_hazard` and `predict_hazard`? I would suspect that the cumulative hazard is just the cumulative sum of the hazard; however, when I try to match them up with `cph_spline.predict_hazard(sdf).iloc[:, 0].cumsum()`, the outputs do not match.
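A rough sketch of why a plain cumsum won't line up, assuming `cph_spline` and `sdf` from the question above: the cumulative hazard is the integral of the hazard over time, so each hazard value has to be weighted by the width of its time step, and even then the Riemann sum only approximates the closed-form cumulative hazard of the spline model.

```python
import numpy as np

# Assumes cph_spline and sdf from the question above.
times = np.arange(1, 53)
hz = cph_spline.predict_hazard(sdf, times=times)              # instantaneous hazard h(t)
ch = cph_spline.predict_cumulative_hazard(sdf, times=times)   # H(t) = integral of h(s) ds

# Riemann-sum approximation of the integral: hazard value times step width.
dt = np.gradient(times)
approx = (hz.iloc[:, 0] * dt).cumsum()

print(ch.iloc[:, 0].tail())
print(approx.tail())  # close to, but not exactly, the closed-form H(t)
```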
the `rossi` dataset, or if using some aggregation over the whole lifecycle of the machine's sensor data
Hi @CamDavidsonPilon - I am looking to add time-varying sensor data as covariates to my existing CoxPH survival regression model (currently fit on static time-to-event data). I can do this following the instructions in "Time Varying Survival Regression" in the lifelines documentation, correct? The only change I would need to make is converting the dataframe to the "long" format - is that right?
I also have a related question about predictions for this problem formulation. Does using rolling/lagged features for time-varying covariates help with predictions at future times? Thanks for your time!
@NewsJunkie8590_twitter that's right, the dataset will need to be "long" (read the docs carefully though, as adding time-varying covariates is tricky).
Prediction with lagged features makes sense, up to the length of the lag. The trouble is deciding what your covariate matrix should look like beyond known/observed times.
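A minimal sketch of the long-format conversion and time-varying fit, assuming hypothetical `base_df` (one row per machine, with a `machine_id` column, static covariates, and duration/event columns) and `sensor_df` (one row per machine per time with the sensor readings):

```python
from lifelines import CoxTimeVaryingFitter
from lifelines.utils import to_long_format, add_covariate_to_timeline

# base_df: one row per machine, with machine_id, static covariates, "duration" and "event".
# sensor_df: one row per (machine_id, time) with the time-varying sensor readings.
long_df = to_long_format(base_df, duration_col="duration")
long_df = add_covariate_to_timeline(
    long_df, sensor_df, id_col="machine_id", duration_col="time", event_col="event"
)

ctv = CoxTimeVaryingFitter()
ctv.fit(long_df, id_col="machine_id", event_col="event",
        start_col="start", stop_col="stop")
ctv.print_summary()
```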
`bs` (basis splines), and the next version has that.
`bs` was the biggest motivation for me to include formulas in lifelines in the first place, so I don't want to regress that feature for users.
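A minimal sketch of what a `bs()` term in a lifelines formula looks like, assuming a lifelines/formulaic version where `bs` is supported; the `rossi` columns are just for illustration.

```python
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()

# A basis-spline term on `age` inside a formula, alongside a couple of linear terms.
cph = CoxPHFitter()
cph.fit(rossi, duration_col="week", event_col="arrest",
        formula="bs(age, df=3) + fin + prio")
cph.print_summary()
```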
Hi @CamDavidsonPilon! First of all, thanks for this awesome package.
I am running Cox PH models to evaluate potential interactions between covariates and treatment. Initially I was using a likelihood ratio test (model with interaction - model without interaction) to decide its importance. But I was concerned about overfitting the model (I don't have a lot of subjects), so I decided to repeat the analysis with Monte Carlo CV and measure the importance again. I feel that the permutation analysis should be more robust; what do you think? Thanks a lot for your time, and sorry if the question is a bit general.
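A minimal sketch of the likelihood ratio test described above, assuming the `rossi` dataset and a `fin * age` interaction as stand-ins for the actual treatment and covariate:

```python
from scipy import stats
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()

# Nested models: with and without the treatment x covariate interaction.
cph_full = CoxPHFitter().fit(rossi, "week", "arrest", formula="fin * age + prio")
cph_reduced = CoxPHFitter().fit(rossi, "week", "arrest", formula="fin + age + prio")

# Likelihood ratio test: twice the difference in log-likelihoods, compared to a
# chi^2 distribution with degrees of freedom equal to the number of extra parameters.
lr_stat = 2 * (cph_full.log_likelihood_ - cph_reduced.log_likelihood_)
dof = len(cph_full.params_) - len(cph_reduced.params_)
p_value = stats.chi2.sf(lr_stat, dof)
print(f"LR statistic = {lr_stat:.3f}, df = {dof}, p = {p_value:.4f}")
```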