@CamDavidsonPilon for what it's worth, here's a short snippet showing a slightly misleading error involving pandas.DataFrame.apply that took me a day to debug
Task: use Cox to predict the event probability for censored items at their current duration
    import lifelines as ll
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.randint(0, 100, size=(10, 2)), columns=['regressor', 'duration'])
    df['event'] = np.random.choice([True, False], 10)
    display(df)

    # uncomment to lose the bool and fix the TypeError
    #df['event'] = df['event'].astype(int)

    cf = ll.CoxPHFitter()
    cf.fit(df, duration_col='duration', event_col='event')

    # select only censored items
    df = df[df['event'] == 0]

    func = lambda row: cf.predict_survival_function(row[['regressor']], times=row['duration'])
    df.apply(func, axis=1)
'Misleading' because the error says the regressor column is non-numerical, when the actual culprit is the bool event column...
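For anyone hitting the same thing, here is a minimal pandas-only sketch of what I believe is going on (it does not call lifelines, and the toy values are made up): with `axis=1`, each row is handed to the function as a single Series, and a row mixing int and bool values ends up with `object` dtype, so even the integer `regressor` entry looks non-numerical downstream.

```python
import pandas as pd

# Toy frame mirroring the snippet above: two integer columns plus a boolean event flag.
df = pd.DataFrame({
    "regressor": [5, 17, 42],
    "duration": [10, 20, 30],
    "event": [True, False, True],
})

# With axis=1 each row is one Series that has to hold int and bool values together,
# so pandas falls back to the generic object dtype.
row_dtypes_bool = df.apply(lambda row: str(row.dtype), axis=1).unique().tolist()

# Casting the event column to int leaves only integer columns, so rows stay numeric.
df_int = df.assign(event=df["event"].astype(int))
row_dtypes_int = df_int.apply(lambda row: str(row.dtype), axis=1).unique().tolist()

print(row_dtypes_bool, row_dtypes_int)
```

That would explain why the `astype(int)` line in the snippet makes the TypeError go away.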
    from lifelines import WeibullAFTFitter

    df['start_time'] = df['start_time'].map(map_to_seconds)
    df['sin_start_time'] = np.sin(2*np.pi*df['start_time']/seconds_in_day)
    df['cos_start_time'] = np.cos(2*np.pi*df['start_time']/seconds_in_day)
    df = df.drop('start_time', axis=1)

    wf = WeibullAFTFitter().fit(df, "duration")
    wf.predict_survival_function(df)
    wf.predict_median(df)
conditional_after kwarg in the predict_* methods as well
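As I understand it, conditional_after conditions the prediction on the subject having already survived some time s, i.e. it gives P(T > s + t | T > s) = S(s + t) / S(s) on a timeline that restarts at 0. A rough numpy sketch of just that arithmetic follows; it does not call lifelines, and the Weibull parameters `lambda_` and `rho_` are hypothetical values rather than anything fitted.

```python
import numpy as np

def weibull_survival(t, lambda_=50.0, rho_=1.5):
    """S(t) = exp(-(t / lambda_) ** rho_); parameter values are illustrative only."""
    return np.exp(-(np.asarray(t, dtype=float) / lambda_) ** rho_)

def conditional_survival(t, already_survived):
    """P(T > already_survived + t | T > already_survived) for the curve above."""
    t = np.asarray(t, dtype=float)
    return weibull_survival(already_survived + t) / weibull_survival(already_survived)

# Remaining-time grid for a subject that has already survived 30 time units.
remaining = np.array([0.0, 10.0, 20.0])
cond = conditional_survival(remaining, already_survived=30.0)
print(cond)
```

For a censored item, plugging in its current duration as `already_survived` gives the survival curve over its remaining time.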
wf = WeibullAFTFitter().fit(df, "duration") exception throw
id col in your model
    from lifelines import WeibullAFTFitter
    from lifelines.datasets import load_rossi

    rossi_dataset = load_rossi()

    aft = WeibullAFTFitter()
    aft.fit(rossi_dataset, duration_col='week', event_col='arrest')

    X = rossi_dataset.loc[:10]
    aft.predict_survival_function(X)
@julianspaeth It depends on the model. Recall that the c-index depends only on the ranking of values. For the Cox model, summing the cumulative hazard won't change the ranking, so it won't matter which you use. For an AFT model, it may change the ranking.
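To make the ranking argument concrete, here is a small sketch with a naive O(n^2) c-index written from scratch (not lifelines' implementation, and scored so that a higher value means higher risk). Under proportional hazards, H(t | x) = H0(t) * exp(x @ beta), so summing over a time grid only multiplies exp(x @ beta) by a positive constant. The data and the baseline values below are made up.

```python
import numpy as np

def concordance_index(durations, scores, events):
    """Naive c-index: share of comparable pairs in which the subject with the
    earlier observed event has the higher risk score (tied scores count 0.5)."""
    concordant, comparable = 0.0, 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            # Comparable pair: i has an observed event strictly before j's duration.
            if events[i] and durations[i] < durations[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

rng = np.random.default_rng(0)
durations = rng.integers(1, 100, size=30)
events = rng.random(30) < 0.7
linear_predictor = rng.normal(size=30)  # stands in for x @ beta

# Made-up baseline cumulative hazard on a small time grid.
baseline_cumhaz = np.array([0.1, 0.3, 0.7, 1.2])
summed_cumhaz = np.exp(linear_predictor) * baseline_cumhaz.sum()

c_linear = concordance_index(durations, linear_predictor, events)
c_summed = concordance_index(durations, summed_cumhaz, events)
print(c_linear, c_summed)
```

Both scores order the subjects identically, so the two c-index values agree. In an AFT model the covariates rescale time instead of multiplying a shared baseline, so the same shortcut isn't guaranteed.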
Alternatively, you can choose a point in time, and use the CHF at that