    Troels Henriksen
    @athas
    But the main performance problem is the naive inlining policy. I think significant gains could be made just by making that a bit smarter.
    It will never be a fast compiler, though.
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    I'd rather have fast code than a fast compiler :)
    Troels Henriksen
    @athas
    Well, that's the idea. Originally my success criterion was for Futhark to be regarded like Stalin Scheme: technically impressive, but not practically useful due to excessive compile times. We've grown more ambitious since then, though.
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    This is of course only until Snektron implements futhark in futhark.
    Troels Henriksen
    @athas
    And algorithmically, the Futhark compiler isn't doing anything ludicrous like Stalin is. The most dubious thing we have is a general policy of writing passes that generate excessive code, then afterwards applying a heavy-duty simplifier to shrink it again. That means individual passes don't have to care about producing minimal code, but it does increase compile times.
    Yes!
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    I need to read more about Stalin. What a thing to say.
    Troels Henriksen
    @athas
    It's "brutally optimizing".
    Its main trick is intense (and super expensive) control flow analysis to aggressively monomorphise and unbox values.
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    "Stalin is free and open(source)"
    Troels Henriksen
    @athas
    A similar design is found in the MLton compiler, although MLton is much more practical.
    Snektron
    @snektron:matrix.org
    [m]
    I've grown to love the |> operator
    I did not realize I needed to have something like that in my life
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    Makes sense that a snake would like a pipe if you ask me
    Troels Henriksen
    @athas
    I took it from F#. Despite years of Haskell, I think I already like it better than Haskell's $ operator.
    Of course, in Futhark it should be called the thorn (ᚦ) operator.
    munksgaard
    @philip:matrix.munksgaard.me
    [m]
    That's why I've started using & in Haskell! It's probably not idiomatic, but it is nicer than $ imo.
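For readers who have not met the operator, here is a minimal sketch of the same computation written with nested application and with |>. The function names are made up for illustration; x |> f simply means f x.

```futhark
-- Sum of the squares of the even numbers in xs, written two ways.
-- Both function names are hypothetical.

-- Nested application: read from the inside out.
def sum_even_squares_nested (xs: []i32) : i32 =
  i32.sum (map (\x -> x * x) (filter (\x -> x % 2 == 0) xs))

-- Pipeline: read left to right, one stage per |>.
def sum_even_squares_piped (xs: []i32) : i32 =
  xs
  |> filter (\x -> x % 2 == 0)
  |> map (\x -> x * x)
  |> i32.sum
```

The two definitions should compute the same value; the piped form just lists the stages in the order the data flows through them.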
    Snektron
    @snektron:matrix.org
    [m]
    I did some preliminary testing of the lexer vs. flex. flex reaches about 150-180 MB/s depending on the compiler, on my Ryzen 3700X. Running Futhark on an RTX 2080 Ti yielded about 5 GB/s, though I can't actually double-check since my uni broke that machine. On my RX 580 it does about 650 MB/s, and futhark c does about 80 MB/s.
    The latter can probably be explained by the large lookup table I use, which isn't very cache-friendly of course.
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    They broke the computer?
    Troels Henriksen
    @athas
    That sounds quite impressive (although I have no intuition for whether flex is considered fast). Too bad about your university messing with your machines.
    Snektron
    @snektron:matrix.org
    [m]
    I don't think it's physically broken, but my faculty has been messing with these machines ever since the start of this academic year.
    I haven't tried any flex alternatives, although the name (fast lexical analyzer generator) would lead me to believe that it should be decently competitive. It's GNU software though, so its methods may be dated by now.
    Snektron
    @snektron:matrix.org
    [m]
    This isn't even a cluster system. There are just a few machines you can use as a student, and another few you can use as staff. All fair-use, although that won't stop students from running R programs that run for several months and require 700GB of memory <grumble>
    Up until recently students could log into staff machines so I had just been using those.
    Every once in a while a machine breaks magically and usually it's fixed within a few hours, but not this machine I guess
    Snektron
    @snektron:matrix.org
    [m]
    Is it intentional that (a).b is an invalid module expression? I guess it's kind of weird, but I'm trying to import a parametric module and apply it simultaneously

    so i have something like
    a.fut:

    module a (T: integral) = {
      let add (a: T.t) (b: T.t): T.t = a T.+ b
    }

    and then in b.fut:

    module a_u8 = (import "a").a u8
    Troels Henriksen
    @athas
    Good question. No, I think that could be allowed.
    (x).y is a special expression in the term language (because x.y is handled as a single lexeme), and I guess we just never added it similarly to the module language.
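Until that form is accepted, one possible workaround is to bind the import to a module name first and then apply the parametric module through a qualified name. This is only a sketch: a_file and three are made-up names, and it assumes a.fut itself type-checks.

```futhark
-- b.fut
-- Bind the imported file to a module name first (a_file is hypothetical).
module a_file = import "a"

-- Then apply the parametric module via a qualified name instead of
-- writing (import "a").a directly.
module a_u8 = a_file.a u8

-- Example use of the instantiated module.
def three : u8 = a_u8.add 1 2
```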
    gusten
    @gusten:matrix.org
    [m]
    Just curious; what does the 'bikeshed' label mean? It's only on the AD issue as far as I can tell?
    Troels Henriksen
    @athas
    It's for issues where discussion of superficial issues is encouraged or even necessary.
    Based on this: http://bikeshed.org/
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    Interesting, haven't heard of it before.
    And regarding mailing lists, I sure wish they had that functionality sometimes
    Second link does not work for me
    Troels Henriksen
    @athas
    It's not my site. It's old hacker lore!
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    I realised that just after writing...
    I shall check the archives
    Gusten Isfeldt
    @gusten:matrix.org
    [m]
    archive.org saves the day
    Rowan Goemans
    @rowanG077
    Hello everyone :). I'm new to GPU programming and Futhark, but I want to try to speed up a rendering process of points, and would like some input on how it could be handled. I have a large set of 2D points where each coordinate is a float32. I want to "rasterize" these points and then count how many of them are visible. I understand I essentially have to implement a rasterization algorithm. However, I'm wondering if this can be done efficiently at all on a GPU. Essentially I seem to have 2 choices. Either loop through every point and set the corresponding cell to 1, or loop through the grid and check whether each cell contains at least a single point. But option 1 has concurrent write access to the grid, which afaik is bad. And option 2 is slow because it's hard to determine whether a point falls into a cell.
    Are there any other ways to attack this?
    In fact I don't really need a final image. I just want the number of cells which contain at least a single point.
    Troels Henriksen
    @athas
    @rowanG077 GPUs can indeed be used for graphics. Some of them even allow you to connect a monitor! Most modern 3D is based on rasterisation, too. Futhark doesn't expose most of the graphics facilities provided by GPUs, but people have written various kinds of rasterisers. One of my colleagues wrote a paper with some techniques: https://futhark-lang.org/publications/array19.pdf
    I think your option 1 is best. Concurrent write accesses are not great, but in Futhark it would be a "generalised histogram" (reduce_by_index), for which Futhark has a high-quality implementation.
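A rough sketch of what option 1 could look like with reduce_by_index, counting the occupied cells without producing an image. Everything specific here is an assumption for illustration: the name occupied_cells, the h-by-w grid, and the convention that visible points lie in [0,1) x [0,1).

```futhark
-- Hypothetical helper: count the cells of an h-by-w grid that contain
-- at least one point.  Assumes visible points lie in [0,1) x [0,1).
def occupied_cells [n] (h: i64) (w: i64) (points: [n](f32, f32)) : i64 =
  -- Flat cell index of a point, or -1 if the point is off-screen.
  let cell ((x, y): (f32, f32)) : i64 =
    if x < 0.0 || x >= 1.0 || y < 0.0 || y >= 1.0
    then -1
    else let i = i64.min (h - 1) (i64.f32 (y * f32.i64 h))
         let j = i64.min (w - 1) (i64.f32 (x * f32.i64 w))
         in i * w + j
  -- One boolean per cell, combined with logical or.  reduce_by_index
  -- ignores out-of-bounds indices, so off-screen points are dropped.
  let hits =
    reduce_by_index (replicate (h * w) false) (||) false
                    (map cell points) (replicate n true)
  in i64.sum (map i64.bool hits)
```

Because the combining operator is associative and commutative, several points landing in the same cell are handled by the histogram machinery rather than by hand-written synchronisation.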
    Connor Clark
    @crclark

    I am using reduce_by_index to update cells in the neighborhoods around many cells. I am trying to generalize to allow the user to pass in the radius of the neighborhood, but I think I am stuck because I need more expressive shape types.

    Right now, the radius is hardcoded to 1, so we have 9 cells in the neighborhood, and my type is

    neigborhood (i: i64) (j: i64): [9](i64, i64)

    I would like to write

    neigborhood (i: i64) (j: i64) (radius: i64) : [(2*radius + 1)**2](i64, i64)

    which is currently impossible. Is there a workaround for this? Radius would be ultimately passed in at my entry point.

    munksgaard
    @philip:matrix.munksgaard.me
    [m]
    @rowanG077: Have you looked at scatter_2d?
    Troels Henriksen
    @athas
    The workaround is to compute (2*radius+1)**2 as a named variable, and then pass that along.
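A sketch of how that might look, with the size computed once at the entry point and passed down as an ordinary parameter. The names neighborhood, n, and main, and the row-major ordering of the cells, are assumptions for illustration rather than the code from the question.

```futhark
-- Hypothetical version of the neighborhood function: the caller passes
-- the number of cells n, which is expected to equal (2*radius+1)**2.
def neighborhood (n: i64) (radius: i64) (i: i64) (j: i64) : [n](i64, i64) =
  let side = 2 * radius + 1
  -- Enumerate the side-by-side square around (i, j) in row-major order.
  in tabulate n (\k -> (i + k / side - radius, j + k % side - radius))

-- The entry point computes the size as a named variable and passes it along.
entry main (radius: i64) (i: i64) (j: i64) : ([]i64, []i64) =
  let n = (2 * radius + 1) ** 2
  in unzip (neighborhood n radius i j)
```

The type checker does not verify that n matches the radius in this sketch; keeping the two consistent is the caller's responsibility.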
    Aly
    @aly:scuttlebug.space
    [m]
    Can I make it so that I can use Futhark code as a library that will first attempt to use OpenCL and then fall back to sequential C code if OpenCL cannot be used?