    Jacki Buros Novik
    @jburos
    sure, we could do a skype if you like. let me just confirm my skype username
    Will Adler
    @wtadler
    i'm w.t.adler
    Jacki Buros Novik
    @jburos
    Just sent you a contact request
    Will Adler
    @wtadler
    awesome
    very helpful
    thanks so much! hopefully i'll have some results in the next few days
    adam-haber
    @adam-haber
    Hello! I'm trying to use survivalstan and encountered some problems. Two of them are possible bugs; reported them on github.
    Another thing I encountered which I'm not sure if it's a bug or not is a "RuntimeError: Initialization failed." msg... any idea on how to approach this?
    Jacki Buros Novik
    @jburos

    Hi adam-haber. Thanks for reporting those bugs! I will take a look. For which model did the initialization fail? This tended to happen to me for the weibull model, and for this you can pass in an initialization function which will set the random initial values to more reasonable ranges.

    For other models, this can sometimes happen if your data look different from the test data. You can try giving pystan a more narrow init value range, using the parameter init_r. I would try a value of 1 or 0.5 (the default is 2).
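
A minimal sketch of the initialization-function idea described above. The parameter names (`alpha_raw`, `mu`, `beta_raw`) are hypothetical and would need to match the parameters declared in the Stan model actually being fit; the spreads are illustrative, not recommended values.

```python
import random


def make_init_fn(n_covariates, seed=None):
    """Return a zero-argument function that draws modest initial values."""
    rng = random.Random(seed)

    def init_fn():
        # Hypothetical parameter names: adjust to the Stan model in use.
        return {
            "alpha_raw": rng.gauss(0.0, 0.1),  # keep the shape term near 0
            "mu": rng.gauss(0.0, 1.0),
            "beta_raw": [rng.gauss(0.0, 1.0) for _ in range(n_covariates)],
        }

    return init_fn


init_fn = make_init_fn(n_covariates=2, seed=42)
example_inits = init_fn()
```

PyStan 2's `sampling` accepts a callable for `init` and a float for `init_r`. Assuming `fit_stan_survival_model` forwards extra keyword arguments to PyStan, the call would look like `fit_stan_survival_model(..., init=init_fn)` or `fit_stan_survival_model(..., init_r=0.5)`.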

    adam-haber
    @adam-haber
    The init failed for the PEM model; setting the init_r to 0.5 or 1 didn't work. Is there any output I can produce that may help in debugging?
    Also, what do you mean by "the data looks different than the test data"?
    Jacki Buros Novik
    @jburos
    So, there are a few things that would help me debug this.
    1. if you can share the console output (if you're running in jupyter this will be in the stdout of that process, or the jupyter log if you turned logging on). It should look something like this:
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
    validate transformed params: Sigma_baseline_hazard is not positive definite.
    Rejecting initial value:
    Rejecting initial value:
      Error evaluating the log probability at the initial value.
      Error evaluating the log probability at the initial value.
    (this is an example from a similar error seen with a different model)
    2. Some description of your data. Maybe a head(), number of samples, number of events, typical follow-up time, and covariates.
    3. If you can share a sample of the data (de-identified, etc) that would help but I recognize this isn't always possible.
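
A rough sketch of the kind of data description requested above, using only the standard library. The column names `t` and `event` and the toy rows are assumptions made for illustration; with a pandas DataFrame the same numbers could be read off `df.head()` and `df.describe()`.

```python
from statistics import median


def summarize_survival(records, time_col="t", event_col="event"):
    """Summarize sample size, event count and follow-up time."""
    times = [r[time_col] for r in records]
    n_events = sum(1 for r in records if r[event_col])
    return {
        "n_samples": len(records),
        "n_events": n_events,
        "median_followup": median(times),
        "min_time": min(times),
        "max_time": max(times),
    }


# Toy rows invented for illustration only.
toy_records = [
    {"t": 5.0, "event": True},
    {"t": 12.0, "event": False},
    {"t": 3.5, "event": True},
    {"t": 20.0, "event": False},
]
summary = summarize_survival(toy_records)
```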
    adam-haber
    @adam-haber

    Rejecting initial value:
    Log probability evaluates to log(0), i.e. negative infinity.
    Stan can't start sampling from this initial value.
    Rejecting initial value:
    Log probability evaluates to log(0), i.e. negative infinity.
    Stan can't start sampling from this initial value.

    Initialization between (-2, 2) failed after 100 attempts.
    Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

    Oops, tried to post this as code... what's the post-code syntax in gitter? :smile:
    The first bit (Log probability ...) returned many times
    Jacki Buros Novik
    @jburos
    Great, thanks. Based on your github issue response, it might be that your data are not yet transformed. For the PEM model you need to run prep_data_long_surv first.
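    import survivalstan

    # d is the original DataFrame with one row per subject.
    # Convert it to the long format the PEM model expects.
    dlong = survivalstan.prep_data_long_surv(df=d, event_col='event', time_col='t')

    # Fit the piecewise-exponential model on the long-format data.
    fit = survivalstan.fit_stan_survival_model(
        model_code = survivalstan.models.pem_survival_model,
        df = dlong,
        sample_col = 'index',
        timepoint_end_col = 'end_time',
        event_col = 'end_failure',
        formula = '~ age_centered + sex'
    )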
    Let's see if I can escape this correctly - you'll want to use triple-backtick to put things in blocks, like: "```"
    single-backtick to post code inline: "`"
    One shortcoming of the way the package is currently designed is that it won't do a lot of these things for you. I do have an issue open to create a higher-level interface that would wrap a lot of these steps, but it's not currently ready
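
To illustrate what a "long" format means here, the following is a simplified sketch and not survivalstan's actual implementation. It assumes every distinct observed time becomes an interval endpoint and reuses the column names `index`, `end_time` and `end_failure` from the call above.

```python
def to_long_format(records, time_col="t", event_col="event"):
    """Expand one row per subject into one row per subject per interval."""
    # Distinct observed times define the interval endpoints.
    timepoints = sorted({r[time_col] for r in records})
    long_rows = []
    for index, r in enumerate(records):
        for end_time in timepoints:
            if end_time > r[time_col]:
                break  # the subject is no longer at risk
            long_rows.append({
                "index": index,
                "end_time": end_time,
                # The failure flag is set only on the subject's last interval.
                "end_failure": bool(r[event_col]) and end_time == r[time_col],
            })
    return long_rows


toy_records = [
    {"t": 1, "event": True},
    {"t": 2, "event": False},
    {"t": 3, "event": True},
]
long_rows = to_long_format(toy_records)
```

Three subjects become six rows in this toy case. The row count grows with both the number of subjects and the number of distinct times, so the long table can be much larger than the original data.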
    adam-haber
    @adam-haber
    I'm trying this right now, will let you know how it goes
    After prep, I now get: OSError: [Errno 12] Cannot allocate memory
    Jacki Buros Novik
    @jburos
    yes, this can be a problem depending on the size of your data
    adam-haber
    @adam-haber
    The shape of the dlong object is (31223, 28)
    And running htop shows I have more than 10G to spare...
    Perhaps the code creates many copies under the hood? The dlong object isn't that big by itself...
    Jacki Buros Novik
    @jburos
    hmm. agreed, that should be doable, but will depend on your machine. There is a copy created in the format that pystan wants as input, but this shouldn't account for the issue. The error is coming when you call fit_stan_survival_model, correct?
    adam-haber
    @adam-haber
    Yes
    I also got memory errors in the prep stage, so I use a subset of the data in the prep stage and then pass the prep-ed subset to the fit function
    Jacki Buros Novik
    @jburos
    It could be within pystan -- the MCMC process itself has to deal with a lot of data. but the "prep" stage also isn't currently that efficient. I have versions of these models in the works that don't require the "long" data format, which is where i'm currently spending my effort.
    adam-haber
    @adam-haber
    OK... thanks for your time! I'll wait patiently :-)
    Jacki Buros Novik
    @jburos

    No problem! The weibull & exp models don't require this currently, if you want to give them a try.

    Also, I don't know offhand how well Python cleans up after the prep-data stage. this might be an issue if you maxed out your memory earlier.

    I'm going to make a few new issues so I can prioritize effectively
    Phil
    @philarnold4242
    Hi. I just started working through the survivalstan tutorial. I'm running a model on simulated data (haven't tried real data yet) and encounter lots of divergent transitions (>5%). Increasing warmup iterations and / or adapt delta doesn't really help. How do you usually deal with that situation? Thanks!!
    Jacki Buros Novik
    @jburos
    Hi Phil - this can indicate that the model as written has a hard time exploring certain parts of your posterior distribution. sometimes it's an indication that the model is a poor fit to the simulated data, and sometimes your data needs to be rescaled or your model re-parameterized so that the sampling can be more efficient
    if you don't mind sharing the code you've used to simulate data (or the functions if you're re-using certain functions) &/or the model you are fitting, i'll take a look sometime this upcoming weekend.
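
For reference, a small sketch of how the fraction of divergent transitions can be computed. It assumes the input has the shape returned by PyStan 2's `fit.get_sampler_params(inc_warmup=False)`: a list with one dict per chain, each holding a `divergent__` sequence of 0/1 flags. The `fake_params` values are invented for illustration.

```python
def divergence_rate(sampler_params):
    """Fraction of post-warmup iterations flagged as divergent."""
    n_divergent = 0
    n_total = 0
    for chain in sampler_params:
        flags = chain["divergent__"]
        n_divergent += sum(1 for flag in flags if flag)
        n_total += len(flags)
    return n_divergent / n_total if n_total else float("nan")


# Invented flags for two short chains.
fake_params = [
    {"divergent__": [0, 0, 1, 0]},
    {"divergent__": [0, 1, 0, 0]},
]
rate = divergence_rate(fake_params)
```

In PyStan 2 the target acceptance rate is raised with `control={"adapt_delta": 0.95}` (or higher) in the call to `sampling`. If the rate stays high after that, rescaling or re-parameterizing is usually the next thing to try.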
    Phil
    @philarnold4242
    Hi Jacki. Thanks for getting back to me so quickly. I'm aware that Stan models sometimes need reparametrization and data sometimes need rescaling. I was just surprised to see that using Survivalstan to simulate data (simple exponential model) and then infer parameters of an exponential model results in a posterior distribution with such bad neighbourhoods. Would be great if you could have a look at the code.
    Thanks a lot!!
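
One way to sanity-check a simulate-then-fit exercise like this outside of Stan is sketched below. It is not survivalstan's simulation code: the function names and the fixed censoring time are assumptions. It draws exponential survival times, censors them, and recovers the rate with the closed-form maximum-likelihood estimate (events divided by total time at risk).

```python
import random


def simulate_exponential(n, rate, censor_time, seed=0):
    """Simulate right-censored exponential survival times."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        true_time = rng.expovariate(rate)
        rows.append({
            "t": min(true_time, censor_time),
            "event": true_time <= censor_time,
        })
    return rows


def exponential_rate_mle(rows):
    """MLE of the exponential rate: number of events / total time at risk."""
    n_events = sum(1 for r in rows if r["event"])
    total_time = sum(r["t"] for r in rows)
    return n_events / total_time


rows = simulate_exponential(n=5000, rate=0.1, censor_time=20.0, seed=1)
rate_hat = exponential_rate_mle(rows)
```

If the posterior for the rate from the Stan fit sits far from this estimate, that points to a data-preparation or time-scale problem rather than a sampler problem.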
    Jacki Buros Novik
    @jburos
    Agreed -- thanks for taking a look & for letting me know. I'll review sometime this week/weekend and get back to you
    there are a bunch of updates i'm planning to make here, but in the meantime if you're also comfortable in R you might find the recent work on rstanarm::stan_surv helpful
    Phil
    @philarnold4242
    cool, thanks. did not know there is work planned for R as well. will definitely have a look. thanks again.
    c-farmer
    @c-farmer
    Good morning/afternoon everyone. First thank you for your amazing work on survivalstan and offering to take questions. I have what will in hindsight be a relatively rudimentary question but one that continues to confound me. Specifically, I was hoping you could point me toward the reference used to set the priors for the shape and scale parameters of the weibull_survival_model.stan. Probably related, I am also curious about the role of alpha in the exp(-lp[n]/alpha) of the scale parameter. I am sure this is all in literature, but I have been unable to find the proper resource in my reading. Thank you in advance for any assistance!
    Jacki Buros Novik
    @jburos
    These are great questions. First, I don't have a specific reference for the priors on the weibull model. These have worked well for my use case, but the package in general is meant for more of a "pro user" who is comfortable editing the Stan file to customize the priors to their application. The alternative would be to develop a much more complicated Stan file & python-UI by which a user could customize the priors (à la rstanarm), or a stan-code-generator à la brms .. neither of which makes it easy to expand the set of models implemented. This decision has fairly obvious & particular limitations, but .. it's what we decided to do thus far. The selections here are fairly common/generic in example BUGS implementations, but i think a thorough treatment would take into consideration the time-scale .. e.g. different priors would likely work better for survival measured in years vs days.
    to your second question regarding alpha, I've gotten this question a few times so I posted the answer on my (almost-nonexistent) blog. You can see a blurb about this here: https://biostats-blog.netlify.com/post/parameterizing-the-weibull-proportional-hazards/ .. HTH & please let me know if I can help to clarify further
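
A quick numeric check of why dividing by alpha matters, written as a sketch rather than as a summary of the blog post. It assumes Stan's Weibull parameterization with shape `alpha` and scale `sigma`, and the scale `sigma = exp(-lp / alpha)` from the question. Under that choice the hazard works out to `alpha * t**(alpha - 1) * exp(lp)`, so a difference in `lp` acts as a log hazard ratio that does not depend on `t`.

```python
import math


def weibull_hazard(t, alpha, sigma):
    """Hazard of a Weibull distribution with shape alpha and scale sigma."""
    return (alpha / sigma) * (t / sigma) ** (alpha - 1)


def scale_from_lp(lp, alpha):
    """Scale parameter implied by a linear predictor on the log-hazard scale."""
    return math.exp(-lp / alpha)


alpha = 1.5  # illustrative values only
lp_baseline, lp_treated = 0.2, 0.9

hazard_ratios = [
    weibull_hazard(t, alpha, scale_from_lp(lp_treated, alpha))
    / weibull_hazard(t, alpha, scale_from_lp(lp_baseline, alpha))
    for t in (0.5, 1.0, 4.0)
]
expected_ratio = math.exp(lp_treated - lp_baseline)
```

Each entry of `hazard_ratios` equals `expected_ratio` up to floating-point error, which is the proportional-hazards property.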
    c-farmer
    @c-farmer
    Thank you so much! I thought I had run google ragged looking up articles but your blog post addresses a number of questions. I wouldn't dare to call myself a pro user, I just tend to only work with models I know how to build from scratch to avoid any mistaken assumption on my part. I remain curious about the configuration of bg_prior_lp where a normal(0,10) is scaling up a vector of square-root inv chi-squared(1.0) elements. I'm sure there is an important nuance I am missing there as it seems like a lot of mechanics considering beta_raw is centered on normal(0,1).
    Jacki Buros Novik
    @jburos
    Yeah that's another one I should write up -- also one I borrowed from Ben Goodrich :)