    Alexander Botev
    @botev
    and when you construct it, it does not actually do any computation
    but rather only when you run it
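    for example, as a self-contained toy (not gir's actual API, just the idea):

    // constructing the expression only builds graph nodes - no arithmetic yet
    enum Node {
        X,
        Const(f64),
        Mul(Box<Node>, Box<Node>),
    }

    // the multiplication is only performed when the graph is run
    fn run(n: &Node, x: f64) -> f64 {
        match n {
            Node::X => x,
            Node::Const(c) => *c,
            Node::Mul(a, b) => run(a, x) * run(b, x),
        }
    }

    fn main() {
        let y = Node::Mul(Box::new(Node::X), Box::new(Node::Const(2.0))); // build only
        println!("{}", run(&y, 3.0)); // computation happens here - prints 6
    }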
    Roman Pearah
    @neverfox
    does gir have any automatic optimizations at this stage?
    like if it gets x * 1 will it just drop the multiplication?
    or is it presumed that optimizations are the responsibility of something downstream?
    Alexander Botev
    @botev
    so at this stage no
    in general there should be 5 layers, as in LLVM:
    1. Interface - since it's written in Rust, this does not exist in Rust itself, but you can export it to Python, etc., where it will have an API
    2. IR - this is what gir_core currently is
    3. Backend-agnostic optimization on the IR - this is where rewrites like the x * 1 example would live (see the sketch below)
    4. Backend-specific optimization - this will be the downstream backend's job
    5. Backend code generation/compilation/linking
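    a backend-agnostic pass of the kind layer 3 would do - e.g. dropping the x * 1 from above - might look like this (minimal toy sketch, not gir code):

    // a layer-3 style pass: algebraic simplification on a toy IR
    #[derive(Debug, Clone, PartialEq)]
    enum Expr {
        Var(String),
        Const(f64),
        Mul(Box<Expr>, Box<Expr>),
    }

    fn simplify(e: Expr) -> Expr {
        match e {
            Expr::Mul(a, b) => {
                let (a, b) = (simplify(*a), simplify(*b));
                if b == Expr::Const(1.0) {
                    a // x * 1 -> x
                } else if a == Expr::Const(1.0) {
                    b // 1 * x -> x
                } else {
                    Expr::Mul(Box::new(a), Box::new(b))
                }
            }
            other => other,
        }
    }

    fn main() {
        let e = Expr::Mul(Box::new(Expr::Var("x".into())), Box::new(Expr::Const(1.0)));
        assert_eq!(simplify(e), Expr::Var("x".into())); // the multiplication is dropped
    }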
    Roman Pearah
    @neverfox
    I didn't think it did
    what would you say the relative impact is on performance of graph optimization vs just having the ability to calculate on a fast backend?
    right, which is the MXNet model
    Alexander Botev
    @botev
    so if the backend is very fast you can potentially get away without too much impact
    however memory optimization is not possible on the fly
    since you don't know whether you are going to need something later, e.g. for gradients
    when the graph is complete you can look back and say - ah, this is no longer needed after step X, so I can recycle its memory
    this is why, for instance, Theano and TensorFlow have almost 50% of the memory usage of PyTorch
    Roman Pearah
    @neverfox
    gotcha
    Alexander Botev
    @botev
    MXNet has even less, as it does even more aggressive memory optimization
    Roman Pearah
    @neverfox
    and is that an easy thing to do?
    Alexander Botev
    @botev
    actually that is relatively easy, yes
    if you have a given operator schedule
    basically you know the last operation each tensor is part of
    so you know that after that point it can be dropped, or even reused for in-place operations
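    a minimal sketch of that bookkeeping (toy code, not gir's - a real scheduler would work on graph nodes rather than strings):

    use std::collections::HashMap;

    // given a linear operator schedule, record the last step each tensor
    // is read at; after that step its buffer can be freed or reused in-place
    fn last_uses(schedule: &[(usize, Vec<&str>)]) -> HashMap<String, usize> {
        let mut last = HashMap::new();
        for (step, inputs) in schedule {
            for t in inputs {
                last.insert(t.to_string(), *step); // later uses overwrite earlier ones
            }
        }
        last
    }

    fn main() {
        let schedule = vec![
            (0, vec!["x"]),      // step 0 reads x
            (1, vec!["x", "h"]), // step 1 reads x and h - x's last use
            (2, vec!["h"]),      // step 2 reads only h
        ];
        // {"x": 1, "h": 2} - x's buffer can be recycled from step 2 onwards
        println!("{:?}", last_uses(&schedule));
    }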
    jonysy
    @jonysy
    Parenchyma 0.0.3! 🎉✌️ https://github.com/lychee-eng/parenchyma
    Alexander Botev
    @botev
    congrats 👏
    jonysy
    @jonysy
    Thanks!
    jonysy
    @jonysy
    Here’s another graph-based ML library https://github.com/millardjn/alumina
    Alexander Botev
    @botev
    that one seems to be mainly targeting Rust
    jonysy
    @jonysy

    @botev I checked out your arrayfire crate for GIR. You aren't doing any source/kernel generation; you're simply using arrayfire Arrays instead of compiling the source to a kernel and then loading it in arrayfire.

    Is there a reason for that?

    Alexander Botev
    @botev
    you cannot do the kernel generation in Arrayfire
    the reason is arrayfire makes it easy to get things going, as it implements all of this and works on anything
    I'm currently working on the opencl bit
    where kernel generation will happen
    Arrayfire is a nice abstraction to use, and to show how the graph works, without needing to do kernel generation
    jonysy
    @jonysy
    I understand. You’re basically creating a heavily optimized transpiler - which is a huge undertaking
    Alexander Botev
    @botev
    what is a transpiler? it also illustrates how you can use the autodiff of gir with other packages which already have numerical routines
    jonysy
    @jonysy
    You aren’t essentially transpiling Rust to CL?
    Alexander Botev
    @botev
    is that like translating?
    I'm sorry, I've never come across the term
    jonysy
    @jonysy
    Yes, basically. I should probably use the word translator, anyway
    Or compiler
    Alexander Botev
    @botev
    then yes, however I like to think of it as a Meta-LLVM
    what LLVM does is take some code in language X and translate it to binary for architecture Y
    jonysy
    @jonysy
    A transpiler is a source-to-source language translator. So, it compiles (or translates, for that matter) a source to another source
    Alexander Botev
    @botev
    here we abstract the architecture to a numerical framework (opencl, cuda, arrayfire, etc.), which is a level above binary code
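    as a hypothetical Rust sketch (made-up names, not gir's actual types), the "architecture" becomes a trait each framework implements:

    struct Graph;    // stand-in for the IR
    struct Compiled; // stand-in for whatever the backend produces

    // each numerical framework (opencl, cuda, arrayfire, ...) sits behind this
    trait Backend {
        // backend-specific optimization and code generation happen in here
        fn compile(&self, graph: &Graph) -> Compiled;
    }

    struct ArrayFireBackend;
    impl Backend for ArrayFireBackend {
        fn compile(&self, _graph: &Graph) -> Compiled {
            // an arrayfire backend can map IR ops straight to arrayfire calls,
            // with no kernel generation at all
            Compiled
        }
    }

    fn main() {
        let _compiled = ArrayFireBackend.compile(&Graph);
    }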
    jonysy
    @jonysy
    Right. So a sigmoid function 1.0 / (1.0 + exp(-x)) in native Rust could compile to CL like so:
    __kernel void sigmoid(__global const float *a, __global float *b) {
        size_t i = get_global_id(0); // get_global_id returns size_t
        b[i] = 1.0f / (1.0f + exp(-a[i])); // float literals avoid requiring fp64
    }
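    for comparison, the Rust-side elementwise loop would be something like:

    // illustrative Rust counterpart, applied elementwise over a buffer
    fn sigmoid(a: &[f32], b: &mut [f32]) {
        for (x, y) in a.iter().zip(b.iter_mut()) {
            *y = 1.0 / (1.0 + (-x).exp());
        }
    }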
    Alexander Botev
    @botev
    yes, elementwise functions would all look like that
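    so a naive generator could just template the expression into the kernel source - a hypothetical sketch, not the planned opencl backend:

    // template an elementwise expression into CL source; a real generator
    // would derive the expression from the graph IR instead of taking a string
    fn elementwise_kernel(name: &str, expr: &str) -> String {
        format!(
            "__kernel void {}(__global const float *a, __global float *b) {{\n    size_t i = get_global_id(0);\n    b[i] = {};\n}}\n",
            name, expr
        )
    }

    fn main() {
        // reproduces the sigmoid kernel above
        print!("{}", elementwise_kernel("sigmoid", "1.0f / (1.0f + exp(-a[i]))"));
    }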
    matrixbot
    @matrixbot
    neverfox If you do the second style, why not Reshape::new("reshape")...?
    matrixbot
    @matrixbot
    cathal Looks like a much funner way to write kernels :)