    El Kharroubi Michaël
    @michael-elkh
    But thanks for the idea
    El Kharroubi Michaël
    @michael-elkh
    (attached: image.png)
    I think I found the issue, but I don't know if I can fix it, or if it has to be done on the Futhark compiler side?
    Troels Henriksen
    @athas
    Ugh. Looks like CUDA uses thread-local storage for its context. I think you have to keep all Futhark calls in the same thread.
    Eventually maybe Futhark could do something smarter here, but I'm not sure about the fine print in this part of the CUDA API.
    El Kharroubi Michaël
    @michael-elkh
    I see. Ok, thank you for your help.
    I think something can be done with CUresult cuCtxSetCurrent(CUcontext ctx) or CUresult cuCtxPushCurrent(CUcontext ctx)
    Do you want me to open an issue?
    Troels Henriksen
    @athas
    Sure.
    El Kharroubi Michaël
    @michael-elkh
    Done
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    The abysmal performance on my Vega 8 was largely because of my mistake of not adding the tuning file to my .gitignore, so it used CPU tuning parameters. It performs alright now.
    Troels Henriksen
    @athas
    Good to hear!
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Are right-shift operations on negative numbers well defined in Futhark? As in, will I get the same results regardless of backend/hardware?
    In the interpreter, they are arithmetic shifts (which is what I want), but I've read that shifts can be implementation dependent in some cases.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    kernel extraction issues again
    I just concatenated all related files this time, but if it helps I can try to reduce it a bit more.
    Troels Henriksen
    @athas
    Negative shifts are backend-dependent.
    No need; I can probably shrink it.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Ok thanks, and good to know
    Troels Henriksen
    @athas
    Hardly my favourite part of the language design.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    I guess it's hard to get around when you compile to different backends
    Troels Henriksen
    @athas
    Not particularly. It wouldn't be hard to generate code with the same semantics everywhere, but it would add a significant number of instructions to operations that tend to be used only in fairly tight and tuned code in the first place.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    That's what I meant: if you're using bit shifts, you probably want the performance.
    Either way, for the place I was looking at, it would definitely be considered premature optimisation.
    It's just so tempting when you only deal with powers of two
    Troels Henriksen
    @athas
    Well, you can define your own wrapper with a branch!
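    A minimal sketch of such a wrapper (the name ashr and the exact formulation are assumptions, not from the chat; it relies only on shifts of non-negative values by a non-negative amount):

    ```futhark
    -- Deterministic arithmetic right shift for i32: branch on the sign so
    -- that only non-negative values are ever shifted. For negative x,
    -- floor(x / 2**k) = -1 - floor((-1 - x) / 2**k), and -1 - x >= 0.
    def ashr (x: i32) (k: i32) : i32 =
      if x >= 0
      then x >> k
      else -1 - ((-1 - x) >> k)
    ```

    Since everything involved is scalar, the branch should be close to free, as the discussion below notes.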
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    What is the cost of branches really?
    Troels Henriksen
    @athas
    Almost free.
    At least when everything involved is a scalar.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Is it through a transform to a branchless definition?
    Aren't branches usually rather expensive?
    Troels Henriksen
    @athas
    Due to branch divergence? Only when the branch bodies are large.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    I see
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Is it fair to assume that all hardware uses two's complement? I use a little trick for power-of-two modulo operations that behaves exactly the way I want, but only assuming the system uses two's complement.
    Troels Henriksen
    @athas
    Two's complement is required/specified by Futhark.
    So you can rely on it.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Good
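    The trick itself isn't shown in the chat, but the classic two's complement version is a mask; a minimal sketch (the name mod_pow2 is an assumption, and n must be a power of two):

    ```futhark
    -- x % n for a power-of-two n, computed with a mask. With two's
    -- complement this yields the non-negative remainder, matching
    -- Futhark's floored %, even for negative x. Assumes n = 2**k, k >= 0.
    def mod_pow2 (x: i32) (n: i32) : i32 =
      x & (n - 1)
    ```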
    Troels Henriksen
    @athas
    @Gusten_Isfeldt_gitlab what kind of program are you writing, anyway? You've been pretty good at finding exotic bugs in rarely-exercised code paths.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Well, it's far from finished, but the end result should be rigid-body simulation software with Stokes flow and Coulomb interactions (both solved using spectral methods), and neural-network-accelerated pair interactions (i.e. fancy interpolation in a really awkward coordinate space).
    It's really only supposed to be a tool for my research
    but I will publish the software too
    Troels Henriksen
    @athas
    Cool! I think that is basically in the sweet spot of what Futhark is supposed to be used for.
    Sorry the last bug took a while to get fixed, but it was surprisingly tricky, and I'm stuck in a tent with my laptop at a hacker camp, which is not ideal for productivity.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Yeah, when I first heard of Futhark from a friend (whom I know in a completely unrelated context) I got really excited
    Still fast, and it hasn't slowed me down anyway
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    What do you think about adding an unfold function to the language? I am thinking something like unfold 'a 'b (n: i32) (f: a -> (b, a)) (seed: a) : ([]b, a). I know that this is easy to implement as a loop, but, like in the case of permute, the array must be initialized with some value. In this case, however, no potentially nondeterministic behaviour would be added by not doing so.
    I don't think it would be critical for performance, but I just sort of felt it was missing when I was looking at pseudorandom number generation yesterday.
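    For reference, a sketch of the proposed unfold written as the loop mentioned above (the dummy parameter supplies the initial array contents that a builtin could avoid; i64 is used for the size, as current Futhark requires, rather than the i32 in the proposed signature):

    ```futhark
    -- The proposed unfold as an ordinary sequential loop. `dummy` only
    -- initialises the result array; every element is overwritten with a
    -- value produced by f before the loop finishes.
    def unfold 'a 'b (n: i64) (f: a -> (b, a)) (dummy: b) (seed: a) : ([n]b, a) =
      loop (out, s) = (replicate n dummy, seed) for i < n do
        let (y, s') = f s
        in (out with [i] = y, s')
    ```

    For example, unfold 10 (\s -> (s, s + 1)) 0 0 yields the array [0, 1, ..., 9] together with the final state 10.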
    Troels Henriksen
    @athas
    It looks harmless, but I'm fairly reluctant about adding builtins unless they are likely to be very useful. The maintenance burden is high.
    Gusten Theodor Isfeldt
    @Gusten_Isfeldt_gitlab
    Fair enough. For me it's relatively uncommon to create arrays sequentially anyway, so it would not be something I use often.