
:wave: minor version of lifelines released: https://github.com/CamDavidsonPilon/lifelines/releases/tag/v0.22.1

Hi all. I'm somewhat new to using lifelines. When I run `check_assumptions` on a CoxPHFitter, I end up with a warning that reads as follows: `RuntimeWarning: overflow encountered in exp scores = weights * np.exp(np.dot(X, self.params_))`

Any suggestions on dealing with this issue? I'm starting down the road of normalization, but I'm not sure if that's 100% correct.
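A minimal sketch of why normalization tends to help here, using only the standard library. The covariate values and the trial coefficient are made-up assumptions for illustration, and `standardize` and `partial_hazard` are hypothetical helpers, not lifelines functions. The point is only that `exp(beta * x)` blows up when `x` is on a large scale and the optimizer tries a coefficient that is not tiny, while the same coefficient is harmless on a standardized column.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Rescale a covariate to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def partial_hazard(x, beta):
    """exp(beta * x): the per-subject term inside the Cox partial likelihood."""
    return math.exp(beta * x)


# Hypothetical covariate on a large scale (for example, income in dollars).
income = [20_000.0, 35_000.0, 50_000.0, 80_000.0]
# Hypothetical coefficient value that an optimizer might try mid-iteration.
trial_beta = 0.5

try:
    raw_scores = [partial_hazard(x, trial_beta) for x in income]
except OverflowError:
    # math.exp raises on overflow; NumPy would warn and return inf instead.
    raw_scores = None

# The same trial coefficient on the standardized column stays finite.
scaled_scores = [partial_hazard(x, trial_beta) for x in standardize(income)]
```

Standardizing changes the scale of the fitted coefficients (they become "per standard deviation"), so interpretations need to be converted back if the original units matter.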

Hi! I am currently trying to create mixed cure models using the lifelines fitter. I saw that there is example code on GitHub under experiments. I was going to use this as a starting point and then adjust accordingly, but when I run that code I get an error saying: `AttributeError: 'CureModel' object has no attribute '_primary_parameter_name'`

I don't have a full understanding of the input arguments for _cumulative_hazard so I am not sure what is causing this error. Thank you!

If not, try upgrading. Otherwise, if you are still getting the error, can you post the entire stack trace?

In reference to the subclass: my computer doesn't recognize `ParametricRegressionFitter` as an option, but it does recognize `ParametericRegressionFitter`. Perhaps that is also because of the version?

Hi folks, does anyone have experience explaining concordance index to a nontechnical audience (like execs), or even devising an alternative method of presenting model accuracy? I don't think describing the model's predictions in terms of ordered pairs is likely to be of interest - they just want to know how accurate the model is in terms of customer retention/LTV.

@blissfulchar_twitter personally I like using the survival probabilities to calculate the CLV assuming contractual settings (e.g. Berry & Linoff, 2004). I don't like it for a non-technical audience. I normally try to link survival to CLV for execs. Anyone correct me if I am wrong, but concordance is a "global" index for validating the predictive ability of a survival model. It represents how well the variables allow you to predict survival: observations with longer survival times should have higher survival probabilities predicted by your model.
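To make the "ordered pairs" idea concrete, here is a small pure-Python sketch of a concordance calculation. It is an illustration under simplifying assumptions (pairs with tied durations are skipped), not the implementation lifelines uses; lifelines ships its own `concordance_index` in `lifelines.utils`. The four-subject data set is made up.

```python
from itertools import combinations


def concordance_index(durations, predicted_scores, event_observed):
    """Fraction of comparable pairs that the model orders correctly.

    predicted_scores: higher means the model expects longer survival.
    A pair is comparable only if the subject with the shorter duration
    had an observed event; otherwise we cannot tell who failed first.
    """
    concordant = 0.0
    comparable = 0
    subjects = list(zip(durations, predicted_scores, event_observed))
    for first, second in combinations(subjects, 2):
        if first[0] == second[0]:
            continue  # simplification: ignore tied durations
        shorter, longer = sorted((first, second), key=lambda s: s[0])
        if not shorter[2]:
            continue  # shorter duration is censored, pair is not comparable
        comparable += 1
        if shorter[1] < longer[1]:
            concordant += 1.0  # model ranked the pair correctly
        elif shorter[1] == longer[1]:
            concordant += 0.5  # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: months observed, predicted score, churn observed (1) or censored (0).
durations = [2, 4, 6, 8]
scores = [0.1, 0.3, 0.2, 0.9]
events = [1, 1, 0, 1]
c_index = concordance_index(durations, scores, events)
```

For an executive audience, one hedged way to phrase the result: "if we pick two customers at random where we know who left first, the model ranks them correctly about `c_index` of the time", with 0.5 being a coin flip and 1.0 a perfect ranking.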

@blissfulchar_twitter we used the survival probabilities under each curve (cohort) and the monthly payment to calculate CLV. We didn't use individual customers, but customers grouped in the survival curves. This option has some limitations but gives us an idea of an estimated CLV. What you say should be very interesting. I think there are other approaches to calculate predictions of individual CLV.
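A rough sketch of the cohort-level calculation described above: sum the monthly payment weighted by the probability that the cohort is still active in each month. The function name, the optional discount rate, and the numbers are assumptions for illustration only, not anything from lifelines or from the posters' actual code.

```python
def expected_clv(survival_probs, monthly_payment, monthly_discount_rate=0.0):
    """Expected revenue per customer over the horizon covered by survival_probs.

    survival_probs[k] is the cohort's probability of still being a customer
    in month k + 1 (for example, read off a fitted survival curve).
    """
    total = 0.0
    for month, prob in enumerate(survival_probs, start=1):
        discount = (1.0 + monthly_discount_rate) ** month
        total += monthly_payment * prob / discount
    return total


# Made-up cohort: 95%, 90%, 80% still active in months 1 to 3, paying 10 per month.
cohort_survival = [0.95, 0.90, 0.80]
clv = expected_clv(cohort_survival, monthly_payment=10.0)
```

The result only covers the months supplied, so a truncated survival curve gives a lower bound on CLV rather than a lifetime figure.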

:wave: minor lifelines release. Important thing is that scipy 1.3 can be used with it now: https://github.com/CamDavidsonPilon/lifelines/releases/tag/v0.22.2

Hi, what is the best way to retrieve log likelihood of a fit? it is shown via 'model.print_summary()' but not via 'model.summary', which only shows a summary of the parameters.

I managed to get it via model._log_likelihood, but had to look into the source code for that.

Thanks and kudos for the library!

Hi @aleva85, that currently is the best way, but you bring up a good observation that it’s not easy to find. Maybe in a future release I’ll promote it and document it well

Hi, sorry, I was wondering if any maintainers/users of lifelines based in Europe would like to do a tutorial/workshop or talk about this package at our conference, Python in Pharma (PyPharma), in Basel? Apologies for the unrequested advertisement; I will delete it if this is a problem. The conference will be free to attend (by invitation) and 100% volunteer run. It will take place on November 21-22 and our target is 100-150 attendees. We would be really happy if lifelines were represented at this event.

^ no need to apologize, this message is welcome here. I would love to join; hopefully someone can take my place. There have been a few European speakers on lifelines already: Linda Uruchurtu, Lorna Brightmore and Elena Sharova have all recently (in the past few years) given talks on lifelines. You can search for their videos online.

Unrelated: :wave: new minor (but important) version of lifelines released: https://github.com/CamDavidsonPilon/lifelines/releases/tag/v0.22.3

@julianspaeth the only issue with pysurvival is that its support isn't as good as lifelines'.

After you fit a CoxPHFitter, try `.baseline_cumulative_hazard_`

i just raised your `NotImplementedError` for `conditional_after` for CoxPH. it felt like running into a wall after reading about the new argument in the docs. i even bit the bullet and switched from the conda to the pip package ;)

your commit message does not sound too hopeful for that one, are you still working on it?

ps: still an awesome library

Hi @CamDavidsonPilon, I am new to survival analysis and am using it for trying to predict customer churn. I created a model using CoxPHFitter and I wanted to evaluate how well the model performed by comparing the survival after 12 months (using the correct row from predict_survival_function output), to the observed churn rate (1-survival rate). I noticed that it is consistently getting a higher survival rate compared to actual (~10%).

I pared back the model so that it was based only on the baseline hazard (passed no extra variables), and I still get a difference in survival rates.

I have tried this on open data, and can reproduce the result:

```
import lifelines
import numpy as np
import pandas as pd
churn_data = pd.read_csv('https://raw.githubusercontent.com/'
'treselle-systems/customer_churn_analysis/'
'master/WA_Fn-UseC_-Telco-Customer-Churn.csv')
event_col = 'Churn'
duration_col = 'tenure'
churn_data[event_col] = churn_data[event_col].map({'No':0, 'Yes':1})
# .copy() so that adding the churn_12 column below does not modify a view of churn_data
churn_data_example = churn_data[[event_col, duration_col]].copy()
cph = lifelines.CoxPHFitter()
cph.fit(churn_data[[event_col, duration_col]], duration_col=duration_col, event_col=event_col)
# cph.print_summary()
# get predicted churn:
unconditioned_sf = cph.predict_survival_function(churn_data_example)
predicted_survival = unconditioned_sf[[0]].T[12.0][0]
predicted_churn = 1 - predicted_survival
#Create churn at tenure = 12: logic is
# if tenure > 12 then they didnt churn => churn_12 =0;
# if they have tenure < 12 and churn=1, then the churn_12 =1;
# if tenure < 12 and churn=0, dont know if they churn => churn_12 = np.nan
churn_data_example['churn_12'] = churn_data_example['Churn']
churn_data_example.loc[(churn_data_example.tenure < 12) & (churn_data_example.churn_12 == 0), 'churn_12'] = np.nan
churn_data_example.loc[(churn_data_example.tenure > 12) , 'churn_12'] = 0
actual_churn = churn_data_example['churn_12'].mean()
print(f'actual churn: {round(actual_churn,2)}')
print(f'predicted churn: {round(predicted_churn,2)}')
print(f'ratio: {round(predicted_churn/actual_churn,2)}')
```

The results are:

actual churn: 0.17

predicted churn: 0.15

ratio: 0.89

And it deviates further as tenure increases.

Have you got any idea why I am seeing this behaviour? I feel it is either to do with me not understanding what `predict_survival_function` returns, or I am miscalculating the 'actual churn'?

@gabrown, I am able to replicate what you are seeing locally. If I understand correctly, your definition of churn is "fraction of uncensored users who died before 12 months". I think this is going to bias your churn rate up, as you are not taking into account censoring. In an extreme case, where all but one subject is censored, then your def of churn will give 0% or 100%. But, that feels a bit strange, no? If they died early on, and the other subjects were censored later, we should feel that the churn isn't 100%.

Please correct me if I am mistaken, or I am not making sense. Happy to discuss more!
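A small pure-Python sketch of the point about censoring, assuming a made-up six-subject data set. `naive_churn` mirrors the `churn_12` logic from the question (subjects censored before the horizon are dropped), and `kaplan_meier_survival` is a hand-rolled Kaplan-Meier estimate written for illustration, not the lifelines `KaplanMeierFitter`.

```python
def kaplan_meier_survival(durations, event_observed, horizon):
    """Kaplan-Meier estimate of the probability of surviving past `horizon`."""
    survival = 1.0
    event_times = sorted(
        {t for t, e in zip(durations, event_observed) if e and t <= horizon}
    )
    for t in event_times:
        # Censored subjects still count as at risk until they drop out.
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, event_observed) if e and d == t)
        survival *= 1.0 - deaths / at_risk
    return survival


def naive_churn(durations, event_observed, horizon):
    """Churn rate after discarding subjects censored before `horizon`."""
    outcomes = []
    for d, e in zip(durations, event_observed):
        if d < horizon and not e:
            continue  # outcome unknown, so the naive approach drops the subject
        outcomes.append(1 if (e and d <= horizon) else 0)
    return sum(outcomes) / len(outcomes)


# Made-up data: tenure in months, and whether churn was observed (1) or censored (0).
durations = [3, 5, 6, 8, 15, 20]
events = [1, 0, 0, 1, 0, 0]

naive = naive_churn(durations, events, horizon=12)
km_churn = 1.0 - kaplan_meier_survival(durations, events, horizon=12)
```

In this toy example the naive rate comes out higher than the Kaplan-Meier one, because the two subjects censored at months 5 and 6 contributed churn-free time that the naive calculation throws away.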

@CamDavidsonPilon can you please help me understand why using `value_and_grad(negative_log_likelihood)` in the minimization function, in fitters, helps? Why not simply minimize the `negative_log_likelihood` directly?

In the same file: it seems that `class ParametericAFTRegressionFitter(ParametricRegressionFitter)` contains an extra 'e' :D