Kai(luo) Wang
@kailuowang

One thing RandomVariable provided was the ability to generalize the composition of Models and their variables, which is useful when writing generalized libraries that are somewhat model-agnostic. Is there a replacement for supporting such generalized composition?
To better illustrate my thinking, it's tempting for me, based on my own use cases, to write a replacement for RandomVariable as something like:

```scala
case class RandomVariable[A](v: A, model: Model) {
  def mapWith[B, T](that: RandomVariable[B])(f: (A, B) => T): RandomVariable[T] =
    RandomVariable(f(v, that.v), model.merge(that.model))
}
```

Would such a thing make sense with the new design?
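The proposed combinator can be exercised with a toy stand-in for Model that just accumulates merged parts; the real rainier Model is more involved, so this is only a sketch of the composition pattern:

```scala
// Toy stand-in for rainier's Model: it just tracks a set of likelihood
// labels, so merge is set union. The real Model is more involved.
case class Model(parts: Set[String]) {
  def merge(that: Model): Model = Model(parts ++ that.parts)
}

// The proposed wrapper: pair a value with the model it came from.
case class RandomVariable[A](v: A, model: Model) {
  def mapWith[B, T](that: RandomVariable[B])(f: (A, B) => T): RandomVariable[T] =
    RandomVariable(f(v, that.v), model.merge(that.model))
}

val mu    = RandomVariable(0.0, Model(Set("prior(mu)")))
val sigma = RandomVariable(1.0, Model(Set("prior(sigma)")))

// Composing two random variables merges their models automatically.
val pair = mu.mapWith(sigma)((m, s) => (m, s))
```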

Darren Wilkinson
@darrenjw
I have a related question. The docs suggest that there is an overhead associated with merging models. Is this significant? It seems like folds over collections of models will be necessary for many hierarchical models.
Avi Bryant
@avibryant
sorry that I didn't see these messages earlier, I need to figure out how to get my gitter notifications right. (meta: is gitter the right venue? happy to move to anything)
@kailuowang I agree that higher level wrappers like you're illustrating there would be useful. Model is, deliberately, a little bit lower level. But I'm hoping that you'll find it easier and more flexible to define those now. The problem we were having before had to do with the (mandatory) stack of RandomVariable[Generator[T]]. It's quite awkward to force people into this kind of monad transformer stack. I'd rather see what kinds of abstractions (monadic or otherwise) people come up with for a bit; maybe down the line we can officially upstream one.
@darrenjw the overhead isn't new, and is effectively just the question of how vectorizable the code is.
if you have a model that does some kind of recursive fold over the data points, then necessarily each data point will be considered singularly
Avi Bryant
@avibryant
if you can express it instead as a map rather than a fold, that can be more tractable
in particular, this is to do with the emitted code size: folds get "unrolled" in the DAG, whereas maps don't have to be
and if your DAG gets too big that can cause problems all down the line (autodiff, compilation, execution)
however, it's not obvious to me that hierarchical models will necessarily require folds, so a concrete example would be helpful.
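A toy expression DAG (not rainier's actual compute graph) makes the size difference concrete: a fold emits one node per data point, while a vectorized map/sum can stay a single node regardless of data size.

```scala
// Toy expression DAG, only to illustrate graph size; rainier's real
// compute graph is different.
sealed trait Expr { def size: Int }
case class Const(x: Double) extends Expr { val size = 1 }
case class Add(a: Expr, b: Expr) extends Expr { val size = 1 + a.size + b.size }
case class SumOver(xs: Seq[Double]) extends Expr { val size = 1 } // one vectorized node

val data = Seq.fill(1000)(1.0)

// Fold: "unrolled" in the DAG, one Add and one Const per data point.
val folded: Expr = data.foldLeft(Const(0.0): Expr)((acc, x) => Add(acc, Const(x)))

// Vectorized form: a single node no matter how large the data is.
val vectorized: Expr = SumOver(data)
```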
Darren Wilkinson
@darrenjw
Your tutorial example (Vectors and Variables) illustrates it quite well. I think it's most natural to fit each group on the data for that group, then merge the models. Sure, you can unroll everything and do the conditioning with one Model.observe, but that doesn't seem so compositional, and won't necessarily adapt easily to more complex scenarios/data structures. I'm also thinking about DLMs, where it would probably be most natural to condition each state on its observation, then foldLeft over the resulting sequence of models. But again, you could build the full prior model and condition on all of the data in one go. Am I right in thinking that for good performance it is best to build the full prior model and then condition on all of the data with a single Model.observe, in order to avoid merging models where possible?
Kai(luo) Wang
@kailuowang
@avibryant thanks for the reply. I feel that it's an improvement that the current API removes the mandatory monadic composition for most users. And it seems easier to define compositional constructs than what I saw inside the old RandomVariable. I am going to play with some constructs in my projects and see if anything turns out worth upstreaming.
Avi Bryant
@avibryant
@darrenjw yes, though to be clear: merging, say, 10s of models is fine. Merging probably hundreds or certainly thousands of models is going to cause performance problems, and to the extent that you can express it in a single Model.observe, it will be better.
I guess I can imagine eventually adding something like Model.fold
but for the moment my brain hurts thinking about how to do that.
Darren Wilkinson
@darrenjw
That makes sense. Is Model.merge associative? i.e. would it make sense to define a Semigroup instance for |+| syntax?
Avi Bryant
@avibryant
yes, associative and commutative
Model is just keeping track of a set of likelihood functions
(and the associated observations)
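Since merge is associative and commutative, a Semigroup instance is a natural fit. Here is a self-contained sketch with a stand-in Model; in practice you would implement cats.Semigroup[Model] and get |+| syntax for free:

```scala
// Stand-in Model: merge as set union, hence associative and commutative.
case class Model(parts: Set[String]) {
  def merge(that: Model): Model = Model(parts ++ that.parts)
}

// Minimal Semigroup, inlined so the sketch is self-contained; cats
// provides the real typeclass and the |+| operator.
trait Semigroup[A] { def combine(x: A, y: A): A }
implicit val modelSemigroup: Semigroup[Model] =
  (x: Model, y: Model) => x.merge(y)

// Folding a collection of models down to one merged model.
def combineAll(ms: Seq[Model])(implicit s: Semigroup[Model]): Model =
  ms.reduce(s.combine)

val models = Seq(Model(Set("L1")), Model(Set("L2")), Model(Set("L3")))
```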
Avi Bryant
@avibryant
so, if you think of one model as $\sum_{i=1}^n L(\theta;x_i)$
then merged models are something like
$\sum_{m=1}^k \sum_{i=1}^n L_m(\theta;x_{mi})$
Avi Bryant
@avibryant
and because each $L_m$ has to be compiled separately, it's better to have small $k$ and large $n$ than vice versa
Darren Wilkinson
@darrenjw
Yes. I have been getting some warnings about stuff being too big to JIT. I guess this is why.
Avi Bryant
@avibryant
so in general the compiler breaks things into individual methods that are under the JIT threshold,
but there is an issue right now where, if n_models * n_parameters gets large enough, the generated top-level dispatch method is too large to JIT
I'll try to fix that soon
Avi Bryant
@avibryant
BTW: I have been working on multivariate normal support, as well as mass matrix adaptation. As far as I can tell all the math is right, but I'm having a lot of problems with poor mixing when I try to actually fit a covariance matrix from data.
Darren Wilkinson
@darrenjw
Have you had a look at how they do it in Stan, or other libraries? People often just fit a diagonal mass matrix. If you are going to learn a full covariance matrix, I would use a shrinkage estimator to try and keep all of the eigenvalues away from zero.
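One common form of the shrinkage idea (a hedged sketch of the general technique, not anything rainier does today) is to pull the sample covariance toward its own diagonal, which bounds the eigenvalues away from zero when the estimate is near-singular:

```scala
// Shrink a sample covariance toward its diagonal:
//   shrunk = (1 - lambda) * cov + lambda * diag(cov)
// The diagonal is unchanged; off-diagonals are scaled by (1 - lambda),
// which keeps the smallest eigenvalue away from zero.
def shrinkToDiagonal(cov: Array[Array[Double]], lambda: Double): Array[Array[Double]] =
  cov.zipWithIndex.map { case (row, i) =>
    row.zipWithIndex.map { case (v, j) => if (i == j) v else (1.0 - lambda) * v }
  }

// A nearly singular 2x2 covariance (correlation ~ 0.999):
// its eigenvalues are 1 +- 0.999, so the smallest is 0.001.
val cov    = Array(Array(1.0, 0.999), Array(0.999, 1.0))
val shrunk = shrinkToDiagonal(cov, 0.1)
// After shrinking, the off-diagonal is 0.999 * 0.9 = 0.8991 and the
// smallest eigenvalue rises to 1 - 0.8991 = 0.1009.
```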
Avi Bryant
@avibryant
I modeled it after what pymc3 does. I know Stan at least optionally does a full mass matrix, but I'd love to hear more about the shrinkage estimator if you have any links.
it's a good point though that I should try starting with a diagonal mass matrix adaptation
which I haven't tried
Darren Wilkinson
@darrenjw
Avi Bryant
@avibryant
thanks
I should have been more precise, though. My mixing problems are when trying to sample the covariance matrix of multivariate normal data, using an LKJ prior. I had hoped that learning the mass matrix would help but it's not obvious that it does
(one thing that's confusing to think about when debugging is that the mass matrix then ends up being the covariances of the elements in the [cholesky decomposition of the] covariance matrix of the data)
anyway it feels like some kind of numerical instability, maybe, which is always hard to track down
Kai(luo) Wang
@kailuowang
@avibryant quick question: the ability to plot in the REPL is removed, right? The only way to plot now is in a notebook?
Avi Bryant
@avibryant
yeah, though again that could be easily reintroduced and probably should be. It was just easier for me to delete with a broad brush and reintroduce as needed. Do you actually use that?
@kailuowang ^
Darren Wilkinson
@darrenjw
I would use that, too!
i.e. have `show` use EvilPlot's `displayPlot` method to pop up the plot from the Scala REPL. Does that make sense?
Avi Bryant
@avibryant
oh I see, rather than ascii plots like we had before?
yes that would be great
created stripe/rainier#488, I probably won't work on this right now but PRs very welcome
(the original motivation for the ascii plots was for tut docs, but mdoc makes it easier to support images)
Avi Bryant
@avibryant
BTW do you guys use ammonite or sbt console or what, as a REPL?
Darren Wilkinson
@darrenjw
I typically just use sbt console (old-school, I know)
Avi Bryant
@avibryant
ok I asked this to some others via DM but posting here in case maybe @darrenjw has thoughts: