MMesch
@MMesch
nice, thanks for writing this!
Alexey Kuleshevich
@lehins
:+1:
Gregory Nwosu
@gregnwosu_gitlab
happy new year all!
is anyone else experiencing long compilation times with Frames?
Tim Pierson
@o1lo01ol1o
@gregnwosu_gitlab Yes, some of the higher-kinded generics can get quite long depending on what you're trying to do.
Marco Z
@ocramz
happy new year! :tada:
Aleksey Khudyakov
@Shimuuar

Hi!

I'm proposing a new lens-based API for statistics: bos/statistics#162. What do you think about it?

TL;DR example of use: meanOf (each . filtered (>0) . to log) will compute the mean of the logarithm of every positive number.

Marco Z
@ocramz
@Shimuuar I like it a lot; by coincidence I've also been tinkering quite a bit with lenses in the past few days, in particular the microlens set of libraries
Aleksey Khudyakov
@Shimuuar
I suspect that the full lens library will be necessary here for this to be useful. But I didn't check
Tim Pierson
@o1lo01ol1o
have you considered optics?
Aleksey Khudyakov
@Shimuuar
Not really. lens is sort of standard now so I just went with it.
Tim Pierson
@o1lo01ol1o
The errors and documentation on optics are quite good, though. I think you sacrifice some cases of compositionality, but, in terms of people new to lens on a project that uses it (those poor, dreaded newbs), it presents a strong case
Marco Z
@ocramz
yeah the thing is that lens and friends are perfectly integrated with base concepts : (.), traverse, etc. My personal journey with this stuff has been extremely slow and gradual; I only found a need for it while refactoring my vega bindings, which have very nested datatypes
Marco Z
@ocramz
(I'm saying this because others might be in a similar position: not needing a full-blown optics library but just some convenience functionality)
Aleksey Khudyakov
@Shimuuar
Yes, error messages are better. But libraries are more likely to provide van Laarhoven lenses, and newbies will have to learn lens anyway. There's no escape from it
Tim Pierson
@o1lo01ol1o

libraries are more likely to provide van Laarhoven lenses, and newbies will have to learn lens anyway. There's no escape from it

be the change u want to see in 2020. :sunglasses:

Marco Z
@ocramz
@Shimuuar do you plan to introduce Prism and Iso in statistics as well?
Aleksey Khudyakov
@Shimuuar
I don't know yet. Currently the type signatures look like: meanOf :: Getting (Endo (Endo MeanKBN)) s Double -> s -> Double, and yes, one could plug prisms in there
Aleksey Khudyakov
@Shimuuar
@o1lo01ol1o Having to wrangle both libraries seems a much more likely outcome
Marco Z
@ocramz
@Shimuuar please also consider the downstream cost of having to compile lens and/or optics while building criterion etc.
Aleksey Khudyakov
@Shimuuar
I think that just the ability to work with all sorts of containers instead of just vectors is such a big improvement that it justifies the lens dependency. However, with lens it could be possible to provide the API without incurring a lens dependency.
Also, various workarounds are possible, such as splitting the package into core algorithms with a less nice API and nice API wrappers.
Yves Parès
@YPares
@Shimuuar that's a cool idea :) I've never strayed away from the blessed-yet-horrific lens myself (even if quick ways to convert exist, considering another lib for the sake of err messages could be a good idea), but in the meantime it's safe to stick to lens due to widespreadness anyway
You'd expose prisms if some statistics couldn't be computed for some input data
Aleksey Khudyakov
@Shimuuar
The beauty of it is that you don't need to know about prisms. The API is very simple: you give me a Getting, you get an answer. And since prisms are valid Gettings, everything just magically works:
> meanOf (each . _Right . each) [Left "err", Right [1,2,3], Right [10] ]
4.0
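Why this works without the caller knowing anything about prisms can be sketched with base alone, since a van Laarhoven optic is just a function. Everything below is a hypothetical stand-in rather than the proposed API: the primed names (each', _Right', filtered', to') mimic their lens counterparts, and MeanAcc is a naive count-and-sum accumulator used in place of MeanKBN.

```haskell
import Data.Functor.Const (Const (..))

-- Same shape as lens's Getting: a function into the Const functor.
type Getting r s a = (a -> Const r a) -> s -> Const r s

-- Assumption: a plain count-and-sum accumulator, not the compensated
-- summation that MeanKBN stands for in the real signature.
data MeanAcc = MeanAcc !Int !Double

instance Semigroup MeanAcc where
  MeanAcc n x <> MeanAcc m y = MeanAcc (n + m) (x + y)

instance Monoid MeanAcc where
  mempty = MeanAcc 0 0

-- Visit every element of a Traversable container.
each' :: (Traversable t, Applicative f) => (a -> f a) -> t a -> f (t a)
each' = traverse

-- A prism used as a traversal: visit the payload of Right, skip Left.
_Right' :: Applicative f => (b -> f b) -> Either a b -> f (Either a b)
_Right' f (Right b) = Right <$> f b
_Right' _ (Left a)  = pure (Left a)

-- Visit a value only if it satisfies the predicate.
filtered' :: Applicative f => (a -> Bool) -> (a -> f a) -> a -> f a
filtered' p f a
  | p a       = f a
  | otherwise = pure a

-- Turn a plain function into a getter.
to' :: (s -> a) -> Getting r s a
to' g f s = Const (getConst (f (g s)))

-- Mean of every Double the optic visits; NaN if it visits nothing.
meanOf :: Getting MeanAcc s Double -> s -> Double
meanOf l s =
  case getConst (l (\x -> Const (MeanAcc 1 x)) s) of
    MeanAcc 0 _     -> 0 / 0
    MeanAcc n total -> total / fromIntegral n
```

Loaded into GHCi, the sketch reproduces the result above:

> meanOf (each' . _Right' . each') [Left "err", Right [1,2,3], Right [10]]
4.0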
Yves Parès
@YPares
@Shimuuar I was wondering, could meanOf be a Lens itself? That'd require some notion of setting a mean, but that's quite possible by adding µ'-µ to each sample
same for stddev, multiply each sample by σ'/σ
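A minimal sketch of that "setter" idea, assuming a non-empty list of Doubles and the population standard deviation. The names setMean and setStdDev are hypothetical. One deliberate deviation: multiplying the raw samples by σ'/σ would also rescale the mean, so the sketch rescales the deviations around the mean instead.

```haskell
-- Arithmetic mean; NaN for an empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Population standard deviation (divides by n, not n - 1).
stdDev :: [Double] -> Double
stdDev xs = sqrt (sum [(x - m) ^ (2 :: Int) | x <- xs] / fromIntegral (length xs))
  where
    m = mean xs

-- Shift every sample by (target - current mean).
setMean :: Double -> [Double] -> [Double]
setMean target xs = map (+ shift) xs
  where
    shift = target - mean xs

-- Rescale the deviations from the mean by (target / current stddev),
-- which leaves the mean unchanged. Not meaningful when the current
-- standard deviation is 0, because the scale factor divides by it.
setStdDev :: Double -> [Double] -> [Double]
setStdDev target xs = [m + (x - m) * k | x <- xs]
  where
    m = mean xs
    k = target / stdDev xs
```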
Aleksey Khudyakov
@Shimuuar
@YPares It doesn't look like a really good idea, since setting the mean sounds weird and is not always possible. What if we have a list of Ints with a noninteger mean? We certainly can't set their mean!
Adam Conner-Sax
@adamConnerSax
There are also notions of "optics" that might be a different/useful fit here. I'm thinking of another Chris Penner post: https://chrispenner.ca/posts/algebraic, about "Algebraic Lenses", which he characterizes thusly: "an Algebraic lens allows us to run some aggregation over a collection of substates of our input, then use the result of the aggregation to pick some result to return." Maybe something to keep an eye on. I'm thinking about how those optics and "Kaleidoscopes" fit in for map/reduce type operations.
Anyway, the lens thing is cool! I wonder how easy/hard it would be to have conversions between the Control.Foldl versions of these things. The applicative instance of Folds is useful for combining one-pass operations on the same data. Is there some equally straightforward way to combine meanOf folded and sumOf folded so that the fold only happens once? That might go back to the idea of “meanOf” as the optic and then maybe there’s a way to compose the optics?
Aleksey Khudyakov
@Shimuuar
Thanks for the pointer. I haven't seen that post!

As for foldl-like functionality, I'm not sure. It seems difficult since the accumulator type leaks into the type signatures, but it could be possible.

Another problem is that one frequently needs multiple passes over the data to avoid precision loss. Numerically stable computation of the variance requires computing the mean first
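To make the precision point concrete, here is a rough sketch contrasting the single-pass sum-of-squares formula with the mean-first version. Both function names are hypothetical and neither is taken from statistics; both compute the population variance of a non-empty list.

```haskell
import Data.Foldable (foldl')

-- Strict accumulator: count, sum, and sum of squares.
data Acc = Acc !Double !Double !Double

-- Single pass using E[x^2] - E[x]^2. Subtracting two large, nearly
-- equal numbers loses precision when the mean is large relative to
-- the spread of the data.
varianceOnePass :: [Double] -> Double
varianceOnePass xs =
  case foldl' step (Acc 0 0 0) xs of
    Acc n s q -> q / n - (s / n) ^ (2 :: Int)
  where
    step (Acc n s q) x = Acc (n + 1) (s + x) (q + x * x)

-- Mean first, then squared deviations from it. The deviations are
-- small numbers, so squaring them does not suffer the same cancellation.
varianceTwoPass :: [Double] -> Double
varianceTwoPass xs = sum [(x - m) ^ (2 :: Int) | x <- xs] / n
  where
    n = fromIntegral (length xs)
    m = sum xs / n
```

For example, varianceTwoPass [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16] gives 22.5, while the single-pass formula has to square values around 1e18, beyond the range where a Double represents every integer exactly.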

Adam Conner-Sax
@adamConnerSax
Foldl dodges that accumulator-type-leaking problem by quantifying over it and then adding an extra bit to the data type so you can return something else, thus separating the accumulator type from the return type. Then it is an Applicative in that returned type rather than the accumulator type.
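A cut-down sketch of that design, written against base only so it stands alone. The Fold type has the same shape as the one in Control.Foldl, but the rest (Pair, sumF, lengthF, meanF, sumAndMean) is illustrative and may differ from the library in detail.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Foldable (foldl')

-- The accumulator type x is hidden; only input a and result b show up
-- in the type. The final function (x -> b) is the "extra bit".
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

-- Strict pair, so the combined accumulator does not build up thunks.
data Pair x y = Pair !x !y

instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (\() -> b)
  Fold stepL beginL doneL <*> Fold stepR beginR doneR =
    Fold step (Pair beginL beginR) done
    where
      -- Both folds see each element once, in the same traversal.
      step (Pair l r) a = Pair (stepL l a) (stepR r a)
      done (Pair l r) = doneL l (doneR r)

-- Run a Fold over any Foldable in a single strict left fold.
fold :: Foldable f => Fold a b -> f a -> b
fold (Fold step begin done) xs = done (foldl' step begin xs)

sumF :: Fold Double Double
sumF = Fold (+) 0 id

lengthF :: Fold a Int
lengthF = Fold (\n _ -> n + 1) 0 id

meanF :: Fold Double Double
meanF = (/) <$> sumF <*> (fromIntegral <$> lengthF)

-- Sum and mean together, still one traversal of the input.
sumAndMean :: Fold Double (Double, Double)
sumAndMean = (,) <$> sumF <*> meanF
```

In GHCi:

> fold sumAndMean [1,2,3,4]
(10.0,2.5)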
Not sure how the multi-pass part could work but it'd be very cool if two multi-pass computations could be composed such that they need only do the max of the individual number of passes each needs. I wonder if there is a way to stack them, as a type-level list, like a list of effect handlers?
Yves Parès
@YPares
@Shimuuar Yep, I think what @adamConnerSax is mentioning is realizable via foldOver http://hackage.haskell.org/package/foldl-1.4.5/docs/Control-Foldl.html#v:foldOver to which you give any lens/prism/traversal, and a Fold from http://hackage.haskell.org/package/foldl-statistics
(cc @MMesch didn't you use that pattern in the past?)
Aleksey Khudyakov
@Shimuuar
Another question is what to do with statistics that can't be expressed cleanly as folds. Median, for example. I can only think of building a temporary vector, working on it and then discarding it
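The materialise-then-discard approach could look roughly like this. It is a sketch under assumptions: a sorted list stands in for the temporary vector to avoid extra dependencies, and the name median and its Maybe result are illustrative rather than part of any proposed API.

```haskell
import Data.Foldable (toList)
import Data.List (sort)

-- Exact median of any Foldable of Doubles; Nothing for empty input.
-- The whole input is copied into a temporary sorted list, which is
-- garbage once the result has been computed.
median :: Foldable f => f Double -> Maybe Double
median xs
  | null sorted = Nothing
  | even n      = Just ((sorted !! (h - 1) + sorted !! h) / 2)
  | otherwise   = Just (sorted !! h)
  where
    sorted = sort (toList xs)
    n = length sorted
    h = n `div` 2
```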
Tony Day
@tonyday567
I think that’s all there is for an exact median but there are approximate methods eg https://github.com/tonyday567/online/blob/master/online/src/Online/Medians.hs#L23
Aleksey Khudyakov
@Shimuuar
Both are needed
Torsten Scholak
@tscholak
hi, hugging face has released a new Rust implementation of their byte pair tokenizers: https://github.com/huggingface/tokenizers
they are looking for language binding contributions. Haskell bindings would be more than welcome. they would fill a gap we currently have with hasktorch where we work towards pre-trained Bert and gpt-2 models people can use in their projects
Marco Z
@ocramz
@tscholak do you work primarily within NLP?
Torsten Scholak
@tscholak
yes
Stefan Dresselhaus
@Drezil
thanks for the hint @tscholak .. i use that now in my pytorch thingy .. :)
last time i looked at hasktorch it was missing things (or at least i had the impression). I would prefer to migrate to hasktorch in the next months/years .. provided my supervisor is ok with that & i can train ppl. on the job with it.. ;)
But then there will maybe be more contributions from my side ..
Torsten Scholak
@tscholak
hehe, thanks @Drezil
I'm steadily working towards what I informally call "haskformers"...
watch it happen here: hasktorch/hasktorch#269
this is just a transformer language model (gpt-2 style) implementation.
later we want to be able to load hugging face transformers models and also use their tokenizers. hence the call for help
Torsten Scholak
@tscholak
the PR got merged, https://hasktorch.github.io/hasktorch-0.2.0.0/html/Torch-Typed-NN-Transformer.html
next I'm going to set up the training