These are chat archives for bayespy/bayespy

21st
Jan 2016
Jaakko Luttinen
@jluttine
Jan 21 2016 13:57
rae89: i suppose there are different local optima for the vb approximation. you can run the inference from several initializations and compare the lower bound values to choose the best one. also, deterministic annealing might help in finding a better optimum. it might also be that there's no significant difference in the lower bound between different numbers of clusters, that is, they are all good interpretations of the data
mixture models are perhaps more about density estimation (flexible multimodal distributions) than about trying to find some magical number which tells the "true" number of clusters. for instance, a single cluster may be represented by a mixture of more than one distribution if the shape of the cluster is complex. still, it's only one cluster if you look at it visually.
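The restart-and-compare idea above can be sketched in a few lines. This is only an illustration, not BayesPy code: it uses plain EM on a 1-D Gaussian mixture as a stand-in for VB, with the log-likelihood playing the role of the lower bound. The names `fit_em`, `log_normal` and `log_sum_exp`, the synthetic data, and the choice of 10 restarts are all assumptions made for the example.

```python
import math
import random


def log_normal(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def fit_em(data, k, rng, iters=100):
    """Run plain EM from a random start; return (log_likelihood, sorted means)."""
    means = rng.sample(data, k)  # random initialization: k distinct data points
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_normal(x, means[j], variances[j])
                    for j in range(k)]
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate means, variances and weights.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10  # guard against empty components
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj + 1e-6
            weights[j] = nj / len(data)
    # Objective value of the final parameters (stand-in for the VB lower bound).
    loglik = sum(
        log_sum_exp([math.log(weights[j]) + log_normal(x, means[j], variances[j])
                     for j in range(k)])
        for x in data)
    return loglik, sorted(means)


# Synthetic data (assumption): two well-separated clusters.
data_rng = random.Random(0)
data = ([data_rng.gauss(-3.0, 1.0) for _ in range(100)]
        + [data_rng.gauss(3.0, 1.0) for _ in range(100)])

# Several random restarts; keep the run with the highest objective value.
runs = [fit_em(data, 2, random.Random(seed)) for seed in range(10)]
best_loglik, best_means = max(runs, key=lambda run: run[0])
print(best_loglik, best_means)
```

One caveat about the stand-in: the raw log-likelihood is only a fair comparison between runs with the same number of components, since adding components generally does not decrease it. The VB lower bound includes a complexity penalty through the priors, which is what makes comparing it across different numbers of clusters meaningful.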
Jaakko Luttinen
@jluttine
Jan 21 2016 14:03
but anyway, the learning may sometimes work badly, and you may need to utilize tricks: better initialization, annealing, stochastic inference, collapsed inference