    Snektron
    @snektron:matrix.org
    [m]
    Each lexeme is specified by some regex, and to lex it, all I need to do is a map, a scan, another map, and a filter
    it does require quite a large lookup table though (~2.4 MB for my test lexer, which handles 12 keywords, 13 sigils, identifiers, numbers, whitespace, and comments)
    Some lexers I tested earlier required up to 42 MB, for example when entering all of C's keywords
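    The map/scan/map/filter pipeline described above can be sketched in Python (illustrative only; the real lexer is in Futhark, and the 3-state DFA and token classes here are hypothetical). The key idea is that each character maps to a full DFA transition row, row composition is associative, so a prefix scan yields the state after every character, and a filter then picks out token boundaries:

    ```python
    # Illustrative scan-based lexing (hypothetical DFA, not the real lexer).
    # Hypothetical 3-state DFA: 0 = start, 1 = in-number, 2 = in-identifier.
    N_STATES = 3

    def char_row(c):
        """Map one character to a full transition row: state -> next state."""
        if c.isdigit():
            return tuple(1 for _ in range(N_STATES))  # any state -> number
        if c.isalpha():
            return tuple(2 for _ in range(N_STATES))  # any state -> identifier
        raise ValueError(f"unhandled character: {c!r}")

    def compose(a, b):
        """Compose two transition rows (apply a, then b); associative."""
        return tuple(b[a[s]] for s in range(N_STATES))

    def lex(src):
        rows = [char_row(c) for c in src]             # map
        prefix, acc = [], tuple(range(N_STATES))      # inclusive scan with
        for r in rows:                                # the identity row
            acc = compose(acc, r)
            prefix.append(acc)
        states = [row[0] for row in prefix]           # map: state after char i
        # filter: positions where the state changes are token boundaries
        starts = [i for i in range(len(src))
                  if i == 0 or states[i] != states[i - 1]]
        return [(states[s], src[s:e])
                for s, e in zip(starts, starts[1:] + [len(src)])]
    ```

    Because `compose` is associative, the sequential loop above is exactly the part that becomes a parallel scan on the GPU; the lookup-table sizes mentioned come from precomputing such rows for every input byte.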
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    @athas Do you have any rough estimation of when accumulators could be merged? In a couple of months? Within the year? I just don't want to build a strong dependence on features that will be highly experimental in the long term.
    Troels Henriksen
    @athas
    @Gusten_Isfeldt_gitlab Unless we find some fundamental soundness bug in the design, likely within a month or two.
    We have finally started working on the compiler features that the whole thing was meant for (AD), which will likely exercise the backend implementation enough that I will feel comfortable making it directly available to users.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Good, that's about what I expected.
    It's nice to have some form of answer if my supervisors ask.
    Troels Henriksen
    @athas
    I am pretty confident that the source language feature is sound, so I don't really expect any nasty surprises.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    It feels very intuitive to me, though it would be nice if you could improve on the behavior of uniqueness types used with higher-order functions. Piping accumulators with |> and so on would be nice for code structure.
    Troels Henriksen
    @athas
    I agree. It's tricky, though...
    Snektron
    @snektron:matrix.org
    [m]
    Just tested on a machine with a 2080 Ti and it lexes the input file in 360 µs, excluding file transfers
    that's over 5 GB/s 😮
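    A quick sanity check on the arithmetic behind that claim: at 360 µs, a 5 GB/s rate implies the test input was roughly 1.8 MB (the actual file size is not stated in the chat):

    ```python
    # Back-of-the-envelope check: what input size gives 5 GB/s in 360 µs?
    time_s = 360e-6            # 360 microseconds
    rate = 5e9                 # 5 GB/s
    size_bytes = rate * time_s # ~1.8e6 bytes, i.e. about 1.8 MB of input
    ```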
    Troels Henriksen
    @athas
    I must admit I have little intuition for whether that is fast.
    Orestis
    @omhepia:matrix.org
    [m]
    what are you using your lexer for?
    Snektron
    @snektron:matrix.org
    [m]
    It's going to be used in combination with the parser I was also working on, for a compiler
    munksgaard
    @munksgaard:matrix.org
    [m]
    Snektron: Is the compiler also implemented in Futhark?
    Snektron
    @snektron:matrix.org
    [m]
    We're trying to do as much as possible in Futhark, yes
    Orestis
    @omhepia:matrix.org
    [m]
    what kind of compiler is this?
    Snektron
    @snektron:matrix.org
    [m]
    I'm mostly working on the front-end (lexing, parsing, and semantic analysis), while a friend works on the backend
    munksgaard
    @munksgaard:matrix.org
    [m]
    How do you get around the lack of recursion? I'd think that was pretty important in a functional parser/compiler?
    Snektron
    @snektron:matrix.org
    [m]
    Mostly by doing bottom-up computation
    Some parts can be parallelized, rather surprisingly
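    One common way recursion gets replaced by bottom-up computation is to store the tree in flat arrays and process it level by level, deepest first, so every child is ready before its parent. A minimal sketch, with a made-up node encoding (this is not the actual compiler's representation):

    ```python
    # Hedged sketch: evaluating an expression tree bottom-up, no recursion.
    # The tree lives in flat arrays (a common GPU-friendly layout); the
    # ('lit'/'add'/'mul') node encoding is hypothetical.

    def eval_bottom_up(ops, args, depth):
        """ops[i]: ('lit', v) or ('add',)/('mul',); args[i]: child indices."""
        order = sorted(range(len(ops)), key=lambda i: -depth[i])
        val = [None] * len(ops)
        for i in order:                  # on a GPU: one parallel pass per level
            if ops[i][0] == 'lit':
                val[i] = ops[i][1]
            elif ops[i][0] == 'add':
                val[i] = val[args[i][0]] + val[args[i][1]]
            elif ops[i][0] == 'mul':
                val[i] = val[args[i][0]] * val[args[i][1]]
        return val[0]                    # node 0 is the root

    # (1 + 2) * 3, with node 0 the root:
    ops   = [('mul',), ('add',), ('lit', 1), ('lit', 2), ('lit', 3)]
    args  = [(1, 4), (2, 3), (), (), ()]
    depth = [0, 1, 2, 2, 1]
    ```

    Within one level all nodes are independent, which is where the surprising parallelism comes from: the sequential loop over `order` collapses into one parallel map per tree level.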
    Troels Henriksen
    @athas
    It sounds painful, but for the brave and true, everything is possible. Are you familiar with Co-dfns? https://github.com/Co-dfns/Co-dfns
    Snektron
    @snektron:matrix.org
    [m]
    There is some related work by Aaron Hsu that shows this
    (Aaron Hsu wrote Co-dfns)
    Troels Henriksen
    @athas
    From what I remember, Aaron claims that APL's "keyhole" operator (I think that's what they call it) is a very important tool. As I recall, it's a histogram/partition-like operator.
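    The histogram/partition-like behavior being described can be sketched in Python (illustrative only; this is not APL, and whether it matches the operator's exact semantics is exactly what the message above is unsure about): group the indices of equal keys together.

    ```python
    # Hedged sketch of a histogram/partition-style operation: for each
    # distinct key, collect the indices where it occurs.
    from collections import defaultdict

    def key_partition(keys):
        groups = defaultdict(list)
        for i, k in enumerate(keys):
            groups[k].append(i)
        return dict(groups)
    ```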
    Snektron
    @snektron:matrix.org
    [m]
    I've read a decent part of his thesis, yeah. Sadly he doesn't really describe parsing, though I think that's less work for APL specifically, whereas our approach is more generic.
    There is also a surprising amount of related work from the 80s, mainly about PRAM algorithms for parsing
    It mostly seems to have died out around the 90s, though there was some later work as well (like the parsing algorithm I implemented, which is from a paper from around 2007)
    Troels Henriksen
    @athas
    Does parallel parsing... work? I mean, is it worth the bother?
    Aren't you IO-limited in most applications where parsing is the bottleneck?
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    I will probably soon need to make a Futhark module available for a Matlab user. I plan to just give them a wrapped Futhark-generated C library so they don't need to install the compiler, but I think they use Windows, and I don't have any experience building C code on Windows. Are there any dependencies that are not installed on a default Windows system?
    Snektron
    @snektron:matrix.org
    [m]
    and that GPU code also constructs a parse tree
    whereas the CPU code only verifies
    that's on larger input of course; if the input is more like a few kB, like any reasonable source code is, it's a different story
    Futhark on CPU is a lot slower, mainly because the parsing algorithm I implemented does a few passes, and so it's limited by memory throughput
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    What grammar are you parsing?
    Snektron
    @snektron:matrix.org
    [m]
    I only tested with a really basic test grammar, the standard expression grammar that's given in every textbook
    so it's definitely not a good benchmark
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Do you think your algorithm would scale to something more complex?
    Snektron
    @snektron:matrix.org
    [m]
    I did make a more complex grammar, something that is more representative of a real programming language
    The way it works is that for every pair of characters, there is only allowed to be one possible part of a left-most derivation
    to check the input, a set of stack changes is generated for every input pair: an array of items that are popped from a stack, and an array of items that are pushed
    to verify the input, all one needs to do is check whether that sequence of changes is balanced
    So the complexity of a parse is mainly related to two things: the size of the input (obviously), and the average number of stack changes per pair of characters
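    The verification step described above can be sketched in Python (a sequential sketch, assuming each input pair has already been mapped to a (pops, pushes) change; the grammar symbols are made up, and this is an illustration of the balance check, not the actual Futhark implementation):

    ```python
    # Hedged sketch of the verification step: each adjacent input pair
    # contributes a list of popped symbols and a list of pushed symbols.
    # The input is accepted iff, replaying the changes left to right,
    # every pop matches the current stack top and the stack ends empty.

    def balanced(changes):
        """changes: list of (pops, pushes) pairs, one per input pair."""
        stack = []
        for pops, pushes in changes:
            for sym in pops:
                if not stack or stack.pop() != sym:
                    return False
            stack.extend(reversed(pushes))  # pushes[0] ends up on top
        return not stack
    ```

    The loop here is sequential for clarity, but composing two adjacent (pops, pushes) changes into one is an associative operation (much like parallel bracket matching), which is what makes a scan-based parallel version possible.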
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Interesting. It's such an odd thing to do on a GPU.
    I was wondering how one would go about flattening it but it seems reasonable.
    Snektron
    @snektron:matrix.org
    [m]
    Those lists seem to average around 2 for the simple grammar, and around 3 (with some up to 7) for the larger one
    so, for example, I haven't found a way to parse something like if <expr> <block> else <block>