dracon98
@dracon98
Hi, I am new to Haskell and data science. I wonder what the advantage is of implementing machine learning through DataHaskell compared to the standard implementations?
Austin Huang
@austinvhuang

+1 hvega from me too. vega-lite itself has been such a boon to other languages, thanks for your work on hvega @DougBurke.

@YPares did you ever find a sql solution? i'm using sqlite-simple a lot... i can live with sqlite-simple but i think i have to bite the bullet on some minimal string builder on my current project. it'd be nice to have more lightweight string-builder middle-ground options between hard-coding queries as strings and full-on orm commitment.
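A minimal sketch of the "hard-coding queries as strings" end of that spectrum with sqlite-simple; the scores table and its columns below are made up purely for illustration.

  {-# LANGUAGE OverloadedStrings #-}
  -- Minimal sqlite-simple usage: the query text is a hard-coded string and
  -- parameters are passed via ? placeholders. The `scores` table is hypothetical.
  import Database.SQLite.Simple (Only (..), close, execute, execute_, open, query)

  main :: IO ()
  main = do
    conn <- open "results.db"
    execute_ conn "CREATE TABLE IF NOT EXISTS scores (name TEXT, score REAL)"
    execute conn "INSERT INTO scores (name, score) VALUES (?, ?)"
      ("alice" :: String, 0.93 :: Double)
    rows <- query conn "SELECT name, score FROM scores WHERE score > ?"
              (Only (0.5 :: Double)) :: IO [(String, Double)]
    mapM_ print rows
    close conn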

@frasermince can ping us in the hasktorch slack if you'd like to get a hasktorch kaggle effort going. will say kaggle is probably a space where haskell has few comparative advantages (there's no penalty to scripts being hacky/throwaway as long as they score the test data, this is not true of "real-world" ml).

Guillaume Desforges
@GuillaumeDesforges

Hi all. I'm working as a data engineer for a startup and I'm trying to push forward functional programming for our data processing. I read an interesting discussion while looking into Scala's capabilities. They say:

I think the main thing holding us back is a good data frame implementation that the community settles on.

What's the current state of that in Haskell? If I google "haskell dataframe" I stumble upon the Frames lib. Is it the "community standard" lib to do data processing?

Adam Conner-Sax
@adamConnerSax
@GuillaumeDesforges I’m not sure there is a “standard.” Frames is what I use most of the time. But sometimes Haskell’s limitations in constraint manipulation—though maybe this could be improved by a plugin or using newer type-level features in ghc—has me reaching for a representation more akin to Heidi (https://hackage.haskell.org/package/heidi), that is, something less precisely typed. Frames is great when you are writing functions to manipulate a known set of columns. Trying to write polymorphic Vinyl/Frames code can be tricky because you have to prove a lot to the compiler to convince it that the manipulation you are doing is correctly typed. In a sense, this is the point! Catch a whole bunch of errors at compile time! But it’s sometimes harder than it seems like it ought to be. Of course, YMMV.
Matthieu Coudron
@teto
My feeling as a newcomer is that few people use either library, because Frames at least is not trivial to use (the fact that there are only a few users means a lack of examples/support). As an ex-pandas user and average Haskell programmer I find Frames great and well-designed, but I miss some features. In my experience the compiler errors can be completely misleading. I guess a GHC plugin (like the one for polysemy) could make the experience a lot better, but I don't know enough to be sure.
Agustin Jimenez
@agustinjimenez_gitlab

Hi! How are you? Does anyone have the Udemy course "Haskell: Data Analysis Made Easy"?

I've been looking for this course for a while... but actually I have 0 money.

Thanks guys

Marco Z
@ocramz
@adamConnerSax glad you like the Heidi interface ^^ I call it "friendly-typed"
Marco Z
@ocramz
Speaking of which, heidi works and is "usable" (albeit a bit rough around the edges), but I got stuck on how to pretty-print dataframes (ocramz/heidi#2) and how to reconcile tries-with-lists-as-keys with relational joins (ocramz/heidi#13). The first one is a somewhat mind-bending puzzle involving generics, and the second one is more of a UX problem. Having other eyes on this would be a great help!
Marco Z
@ocramz
my "give" to the community instead is this: https://hackage.haskell.org/package/splitmix-distributions-0.7.0.0/docs/System-Random-SplitMix-Distributions.html a new package with various (univariate) random samplers, both continuous and discrete.
Adam Conner-Sax
@adamConnerSax
@ocramz Interesting! I’ll look at the issues. I once did a bunch of stuff with generics-sop but as I recall, it takes some time to get your head around…
A question: is there a way to move directly to a useful Heidi row without declaring an instance for a record? What I really want, what is often hard in Frames, is to add columns on the fly. For example, recently I have been using Stan to model things and then produce results with confidence intervals. So I get, e.g., a Map, keyed by one kind of data, with values as lists of doubles representing 5%, 50%, 95% or whatever. I’d like to make those into Heidi rows, add some cols, merge, via stacking and/or joining, with other rows, maybe compute an average or whatever, then feed all that to hvega for charting. Nowhere in there do I really need a corresponding record, just named and typed columns which I can efficiently access, join, concatenate and fold over, then extract by name and type. Does that make sense?
Marco Z
@ocramz
@adamConnerSax it does! Internally heidi rows are essentially Maps. In addition, the library provides the generic encoding stuff if your data is already parsed into Haskell values
but you certainly should be able to work with the Row types directly
Adam Conner-Sax
@adamConnerSax
@ocramz I sort of figured but then I couldn’t figure out the best way in. Do you have an example of that sort of use, or a suggestion for where in the Heidi code to look? Or, more simply, what is the type of a “row” in Heidi? E.g., in a Vinyl.Record rs -> Heidi.?? function what is ???
Marco Z
@ocramz
@adamConnerSax the interface to Row is here : https://hackage.haskell.org/package/heidi-0.0.0/docs/Heidi.html#g:13
it's pretty much Map-like, and there are some lenses for extracting values at the right type: https://hackage.haskell.org/package/heidi-0.0.0/docs/Heidi.html#v:int
more lenses, for example at can be used both for lookup and insertion: https://hackage.haskell.org/package/heidi-0.0.0/docs/Heidi.html#v:at
Adam Conner-Sax
@adamConnerSax
@ocramz Cool. How do you suggest I handle the typing bit when building rows? I can see how the generics works but I want to do that bit by hand. E.g., I have a Map Text [Double] and I want Heidi.Frame (Row [TC] VP) (I think). I can sort of see how to do it with rowFromList except toVal is not exposed so I can’t make my values into VP’s. What’s the canonical way to encode directly?
Adam Conner-Sax
@adamConnerSax
Sorry, more like:
 let toHeidiRows :: Map Text [Double] -> Heidi.Frame (Heidi.Row [Heidi.TC] Heidi.VP)
     toHeidiRows m = Heidi.rowFromList $ fmap f $ M.toList m where
       f (s, [lo, mid, hi]) = [ (Heidi.mkTyN "State", Heidi.toVal s)
                              , (Heidi.mkTyN "lo", Heidi.toVal lo)
                              , (Heidi.mkTyN "mid", Heidi.toVal mid)
                              , (Heidi.mkTyN "hi", Heidi.toVal hi)
                              ]
Marco Z
@ocramz
ah, good point, toVal should be exposed. Yes that looks about right!
Adam Conner-Sax
@adamConnerSax
Okay. I’m going to fork, fix that, and see what else I need to interoperate with Frames, and then submit a PR...
@ocramz ^
Adam Conner-Sax
@adamConnerSax
@ocramz May I also bump the base version in cabal?
Adam Conner-Sax
@adamConnerSax
using toVal doesn’t work since I want VP not Val. To do that directly I would have to expose the VP constructors? But you don’t want to do that (according to the comments). So how do you suggest I directly encode values into the list I need for rowFromList?
Adam Conner-Sax
@adamConnerSax
@ocramz So, exposing the constructors of VP works for that one case, but I don’t think that’s a good way forward. What I want is something like Vinyl.AllConstrained Heidi.Heidi rs => Vinyl.Record rs -> Heidi.Row [Heidi.TC] Heidi.VP or something like that. But the path from toVal at the element level, to a row is not clear to me.
Adam Conner-Sax
@adamConnerSax
And now I’m sort of lost as to what Val is since there’s the version as Fixpoint and a different version...
Marco Z
@ocramz
ah, I'm starting to see what you need. A vinyl record is already in the same form as a generics-sop flat product type, so we just need a traversal over that
Val is just an internal type to deal with recursive values, but it all gets flattened down to a single layer
Adam Conner-Sax
@adamConnerSax
That makes sense. But I’m not sure what is best in the [TC] part? I’d like to put the column names from the Record in there but I think you want the last element of the list to represent the type of the column, right? So add a TC to that list for the column name or is there some better way to do this?
Marco Z
@ocramz
@adamConnerSax a TC carries the name of a type and that of the constructor which the value comes from. So let's say you have a record type data P = P { p1 :: String, p2 :: Int }, the first TC would carry "P", "p1", etc.
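To make that concrete, a hedged sketch of the generic route for the P record above: the Heidi class and the Frame/Row/TC/VP types are the ones discussed in this thread, but the encode name for the list-of-records-to-Frame entry point is an assumption; check the Heidi module docs for the exact function.

  {-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
  -- Sketch only: `encode` is an assumed name for the records-to-Frame entry point.
  import GHC.Generics (Generic)
  import Heidi (Heidi, Frame, Row, TC, VP, encode)

  data P = P { p1 :: String, p2 :: Int }
    deriving (Eq, Show, Generic, Heidi)

  -- each row's keys are [TC] values carrying ("P", "p1") and ("P", "p2")
  frameP :: Frame (Row [TC] VP)
  frameP = encode [P "a" 1, P "b" 2]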
Marco Z
@ocramz
but if starting from a vinyl Record, I think rfoldMap https://hackage.haskell.org/package/vinyl-0.13.1/docs/Data-Vinyl-Core.html#v:rfoldMap is the right combinator
where each output key would be a list with a single TC inside
Marco Z
@ocramz
👋
Kevin Brubeck Unhammer
@unhammer
Marco Z
@ocramz
@unhammer perhaps @austinvhuang or @tscholak have updates on that?
Kevin Brubeck Unhammer
@unhammer
joined their slack; judging from their chat logs, it seems they're slowly working on it themselves …
João Paulo Andrade
@JooPaul99652025_twitter
Hi, I'm starting out with Haskell, but I have a degree in mathematics and a master's in computational modelling. I'd like to know if I can contribute to the project.
Austin Huang
@austinvhuang
@unhammer which idea are you referring to? The huggingface integration?
Kevin Brubeck Unhammer
@unhammer
Yeah
Austin Huang
@austinvhuang

@unhammer fwiw transformer models are on the critical path for a project i'm working on. can't speak for torsten but i think it's pretty important for the things he's interested in too. it's also a space of capabilities that i think are hard to accomplish any other way in haskell, so i think it's important to the ecosystem from that perspective.

bottom line is, these factors are going to move this effort forward in the medium term, and we're definitely looking for help from others who want to see this happen.

The implementation path is pretty clear; there are 3 main things that need to get done:

1) mature the haskell bindings to rust tokenizers https://github.com/hasktorch/tokenizers

2) for cases where training is desired, we'll want to implement the haskell architecture and a weight parser that's 1:1 with the pytorch version so its weights can be deserialized. For cases where you just want to do inference and use haskell to deploy a model, the usual torchscript inference + tokenizers will probably be sufficient.

3) address library performance pain points along the way (for example - mutable optimizer patterns for training w/ large # of parameters).

João Paulo Andrade
@JooPaul99652025_twitter
has anyone installed haskell.do on wsl?
Justin Le
@mstksg
thinking of getting back into the game maybe :)
some time away has given me some distance. i wonder how much value there is to having backprop be "automatic" in terms of automatically inferring the graph. it's flashy and cute but do people really mind writing explicit data deps?
@ocramz 's article on a scala api seems like it's not too bad really. and it'd blend well with my mutable library too maybe :)
Marco Z
@ocramz
hi @mstksg !
Justin Le
@mstksg
hi @ocramz !
Marco Z
@ocramz
are you thinking of updating backprop? what do you mean by "explicit data deps"?
Justin Le
@mstksg
building the graph explicitly instead of implicitly, so no magic unsafePerformIO, and the ability to treat the graph like a real thing you can pass around and mutate/reset in place
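For concreteness, a toy sketch of that explicit-graph style: nodes are pushed onto a tape that you pass around as an ordinary value, and the backward pass walks the tape in reverse. None of this is the backprop API; all names here are made up.

  -- Toy reverse-mode AD with an explicit tape: the "graph" is a value you
  -- build, pass around and walk backwards. All names here are made up.
  import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
  import qualified Data.Map.Strict as M

  -- one node: its primal value plus (parent index, local derivative) pairs
  data Node = Node { nodeVal :: Double, parents :: [(Int, Double)] }

  newtype Tape = Tape (IORef [Node])   -- stored newest-first

  newTape :: IO Tape
  newTape = Tape <$> newIORef []

  -- push a node onto the tape and return its index
  push :: Tape -> Node -> IO Int
  push (Tape r) n = atomicModifyIORef' r (\ns -> (n : ns, length ns))

  nodes :: Tape -> IO [Node]
  nodes (Tape r) = reverse <$> readIORef r   -- oldest-first, matching indices

  constant :: Tape -> Double -> IO Int
  constant t x = push t (Node x [])

  add :: Tape -> Int -> Int -> IO Int
  add t i j = do
    ns <- nodes t
    push t (Node (nodeVal (ns !! i) + nodeVal (ns !! j)) [(i, 1), (j, 1)])

  mul :: Tape -> Int -> Int -> IO Int
  mul t i j = do
    ns <- nodes t
    let (xi, xj) = (nodeVal (ns !! i), nodeVal (ns !! j))
    push t (Node (xi * xj) [(i, xj), (j, xi)])

  -- backward pass: walk the tape in reverse, accumulating adjoints by node index
  gradients :: Tape -> Int -> IO (M.Map Int Double)
  gradients t out = do
    ns <- nodes t
    let go adj i
          | i < 0     = adj
          | otherwise =
              let d    = M.findWithDefault 0 i adj
                  adj' = foldr (\(p, dp) m -> M.insertWith (+) p (d * dp) m) adj
                               (parents (ns !! i))
              in go adj' (i - 1)
    pure (go (M.singleton out 1) (length ns - 1))

  main :: IO ()
  main = do
    tape <- newTape
    x  <- constant tape 3
    y  <- constant tape 4
    xy <- mul tape x y
    z  <- add tape xy x                     -- z = x*y + x
    gs <- gradients tape z
    print (M.lookup x gs, M.lookup y gs)    -- (Just 5.0, Just 3.0)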
Marco Z
@ocramz
@mstksg I've been looking at AD from various angles in the past few months. After ad-delcont I tried writing a Core plugin pass that differentiates given functions, which works but quickly becomes a mess due to polymorphism
this btw is how Zygote works in Julia
Marco Z
@ocramz
the 'observable sharing' trick used in 'ad' and 'backprop' is still completely inscrutable to me.
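A stripped-down illustration of that trick: take a StableName of each subexpression under unsafePerformIO, so two occurrences of the same heap node can be recognized and the expression tree collapses back into a DAG. This is only the idea, not the actual ad/backprop internals.

  -- Observable sharing in miniature: StableNames identify heap nodes, so a
  -- shared subexpression is visited once. Not the ad/backprop internals.
  import Control.Exception (evaluate)
  import Data.IORef (newIORef, readIORef, writeIORef)
  import qualified Data.Set as S
  import System.IO.Unsafe (unsafePerformIO)
  import System.Mem.StableName (hashStableName, makeStableName)

  data Expr = Lit Double | Add Expr Expr | Mul Expr Expr

  -- count distinct heap nodes reachable from an expression
  countNodes :: Expr -> Int
  countNodes e0 = unsafePerformIO $ do
    seenRef <- newIORef S.empty
    let go e' = do
          e  <- evaluate e'                 -- force to WHNF so thunk and value agree
          sn <- makeStableName e            -- identity of this heap node
          seen <- readIORef seenRef
          let h = hashStableName sn         -- NB: hash collisions ignored in this sketch
          if h `S.member` seen
            then pure ()                    -- same node seen before: sharing observed
            else do
              writeIORef seenRef (S.insert h seen)
              case e of
                Lit _   -> pure ()
                Add a b -> go a >> go b
                Mul a b -> go a >> go b
    go e0
    S.size <$> readIORef seenRef

  main :: IO ()
  main = do
    let x      = Lit 2
        shared = Add x x                    -- x appears twice but is one heap node
        expr   = Mul shared shared
    -- as a tree this has 7 nodes; with sharing observed it is typically 3.
    -- GHC optimisations can change what sharing survives, which is exactly
    -- why this trick is considered delicate.
    print (countNodes expr)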