Avi Bryant
@avibryant
yeah, though again that could be easily reintroduced and probably should be. It was just easier for me to delete with a broad brush and reintroduce as needed. Do you actually use that?
@kailuowang ^
Darren Wilkinson
@darrenjw
I would use that, too!
i.e. have "show" use EvilPlot's "displayPlot" method to pop up the plot from the Scala REPL. Does that make sense?
Avi Bryant
@avibryant
oh I see, rather than ascii plots like we had before?
yes that would be great
created stripe/rainier#488, I probably won't work on this right now but PRs very welcome
(the original motivation for the ascii plots was for tut docs, but mdoc makes it easier to support images)
Avi Bryant
@avibryant
BTW do you guys use ammonite or sbt console or what, as a REPL?
Darren Wilkinson
@darrenjw
I typically just use sbt console (old-school, I know)
Avi Bryant
@avibryant
ok I asked this to some others via DM but posting here in case maybe @darrenjw has thoughts:
[screenshot attachment: Screen Shot 2020-02-25 at 11.05.33 AM.png]
this is me trying to understand something basic about mass matrix adaptation and HMC
Darren Wilkinson
@darrenjw
I'm not sure I completely understand the problem. But when thinking about this stuff conceptually, I often find it helpful to think about the dynamics of the process in continuous time. In continuous time there is no leap frog, no step size, and there is no step-size adaptation, but there are still good and bad mass matrices, and still good and bad integration times, though these could be related.
Avi Bryant
@avibryant
yes, agreed. But in the discrete world, I guess one way of framing my question is: is there any reason we don't normalize mass matrices (e.g., for a diagonal matrix, such that the elements are all <= 1)?
I realize this is a bit hand wavy but it seems like that's beneficial numerically to the leapfrog
but it's clearly not common practice so I fear I'm missing something.
I guess maybe, more correctly, such that the diagonal elements, treated as a vector, have a length of 1?
Darren Wilkinson
@darrenjw
Or trace 1 would have been my instinct (total variance).
But it feels like it shouldn't be necessary
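For concreteness, a rough sketch of the three normalizations being floated here, for a diagonal mass matrix held as a plain Array[Double]; the names and representation are placeholders, not Rainier's API:

```scala
object DiagonalMassMatrix {
  // scale so every element is <= 1 (the largest element becomes 1)
  def maxOne(diag: Array[Double]): Array[Double] = {
    val m = diag.max
    diag.map(_ / m)
  }

  // scale so the diagonal, viewed as a vector, has Euclidean length 1
  def unitLength(diag: Array[Double]): Array[Double] = {
    val norm = math.sqrt(diag.map(x => x * x).sum)
    diag.map(_ / norm)
  }

  // scale so the trace (the total variance) is 1
  def unitTrace(diag: Array[Double]): Array[Double] = {
    val trace = diag.sum
    diag.map(_ / trace)
  }
}
```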
Avi Bryant
@avibryant
so apart from empirical observations, which could be related to implementation bugs etc, the reason it seems theoretically useful to me is that
(thinking out loud a bit here)
Darren Wilkinson
@darrenjw
It feels like in the case of an MVN target, the mass matrix should reflect the covariance/precision matrix, and not some normalised version. But this isn't something I've implemented in practice, so I could easily be missing something important...
Avi Bryant
@avibryant
in the leapfrog, we advance the parameters by stepSize * (momentum * variance)
and then we advance the momentum by stepSize * gradient(parameters)
where the gradient is scaled to a delta of 1 on the parameters
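As a reference point, a minimal sketch of one leapfrog step along those lines, using the standard half/full/half splitting for a single parameter with a diagonal mass matrix entry `variance`; a generic illustration, not Rainier's implementation:

```scala
// `gradient` is assumed to return the derivative of the log density
// with respect to the parameter.
def leapfrog(
    parameter: Double,
    momentum: Double,
    variance: Double,
    stepSize: Double,
    gradient: Double => Double
): (Double, Double) = {
  val pHalf = momentum + 0.5 * stepSize * gradient(parameter) // momentum half step
  val qFull = parameter + stepSize * (pHalf * variance)       // parameter full step
  val pFull = pHalf + 0.5 * stepSize * gradient(qFull)        // momentum half step
  (qFull, pFull)
}
```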
Darren Wilkinson
@darrenjw
Yes it's certainly true that bad choices can make the numerics stiffer, but I feel like the good choices make the numerics better.
Avi Bryant
@avibryant
and I guess the thing that's getting me right now is that if variance is very far from 1,
does that make a good choice of stepSize for the parameter full step,
a bad choice of stepSize for the momentum half step?
I guess I can't really justify why normalizing it would be any better or worse.
Darren Wilkinson
@darrenjw
But my feeling is that you would only choose a variance far from one if different step sizes for the parameters and momentum are appropriate. But I'm seriously hand-waving at this point!
Avi Bryant
@avibryant
ok, I can see the intuition there, I think: if a given parameter has low variance, then its component of the gradient will also be small
and so you want a large stepSize in both cases
Darren Wilkinson
@darrenjw
Yes, I think so.
Avi Bryant
@avibryant
empirically though I'm having a lot of trouble with it. Which could mean there's something wrong with step size adaptation I guess.
Darren Wilkinson
@darrenjw
I would definitely shrink (heavily) towards the identity, and you might additionally want some hard bounds on how big or small masses can get. In high dimensions, estimating covariance matrices stably and well is notoriously difficult, generally, but estimating them from badly tuned MCMC traces is obviously even worse.
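A rough sketch of that kind of regularization for a diagonal estimate, shrinking towards the identity and clamping to hard bounds; the shrinkage weight and bounds here are arbitrary placeholders, not tuned values:

```scala
def regularize(
    estimated: Array[Double],
    shrinkage: Double = 0.8,  // weight on the identity (1.0 on the diagonal)
    lower: Double = 1e-3,
    upper: Double = 1e3
): Array[Double] =
  estimated.map { v =>
    val shrunk = shrinkage * 1.0 + (1.0 - shrinkage) * v
    math.min(upper, math.max(lower, shrunk))
  }
```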
Avi Bryant
@avibryant
thanks for the thoughts. I'll keep banging my head on it :)
Darren Wilkinson
@darrenjw
Thanks for your efforts! It's greatly appreciated.
Avi Bryant
@avibryant
for posterity, the problem was this: https://twitter.com/avibryant/status/1238321823839711234
Darren Wilkinson
@darrenjw
:thumbsup:
Zhenhao Li
@Zhen-hao
hi, I haven't tried Rainier yet, but is it possible to sample from a user-defined non-uniform discrete distribution?
for example, with numpy you can do numpy.random.choice(actions, size=len(actions), replace=False, p=my_sampling_probabilities_vector)
Zhenhao Li
@Zhen-hao
where actions are the values of the random variable whose distribution is given by my_sampling_probabilities_vector
Avi Bryant
@avibryant
@Zhen-hao there's currently no support for discrete parameters. If you're just looking to generate from a non-uniform discrete distribution, we had support for that in 0.2.x that I removed in 0.3.x in anticipation of adding it back in when I do discrete parameters :/
either way, I'd love to hear what your use case is
Zhenhao Li
@Zhen-hao
@avibryant thanks for your quick response. the use case is that some businesses want specifically defined discrete distributions, such as one based on inverse rank (in a recommendation system)
for now, it is not hard for me to write one using a uniform distribution U(0,1) for my use case.
so I will go down that path
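For reference, a sketch of that approach: a single U(0,1) draw inverted through the cumulative weights, in plain Scala with placeholder names, not Rainier's API:

```scala
import scala.util.Random

def choose[A](actions: IndexedSeq[A], probs: IndexedSeq[Double], rng: Random): A = {
  val u = rng.nextDouble() * probs.sum              // tolerate unnormalized weights
  val cumulative = probs.scanLeft(0.0)(_ + _).tail  // running totals
  val i = cumulative.indexWhere(_ >= u)
  actions(if (i >= 0) i else actions.length - 1)    // guard against rounding at the top end
}

// e.g. choose(Vector("a", "b", "c"), Vector(0.7, 0.2, 0.1), new Random())
```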
Avi Bryant
@avibryant
sounds good, let me know if you run into any problems