    Stanislav Böhm
    @spirali
    Version 0.1.0 released!
    Stanislav Böhm
    @spirali
    First public issue reported & fixed. Yay!
    Stanislav Böhm
    @spirali
    Version 0.2.0 released!
    Weiyuan Wu
    @dovahcrow
    Hi, I just got to know this very interesting project through Reddit. May I ask what the motivation behind this project is? Is it a better Hadoop/Spark in the engineering sense (reinventing a better-written wheel) rather than in the academic sense (new algorithms, new types of computation support, etc.)?
    Stanislav Böhm
    @spirali
    @dovahcrow Hi. Our work is mostly engineering. But rather than improving on Spark, I would say our starting point is Dask/Distributed. That said, we also have some research questions around the scheduler. Our current goal, however, is to find new users, understand what they really need, and then test our scheduler ideas against that.
    Weiyuan Wu
    @dovahcrow
    @spirali Sounds cool. I'm a Master's student in the database area and just found this interesting project, which is related to my research.
    Stanislav Böhm
    @spirali
    @dovahcrow Our problem is how to schedule tasks efficiently when we do not know much in advance (task durations, sizes of results, etc.) and even a user often cannot provide a good estimate (how long does it take to train a neural network on a particular part of a data set?). We have some ideas, ranging from better heuristics to algorithms based on belief propagation. We can provide basic mentoring if you want to try some experiments.
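
One of the simplest heuristics in this space can be sketched as follows. This is only an illustration, not Rain's scheduler: the names `DurationEstimator` and `assign` are hypothetical, and the sketch assumes tasks can be grouped by a "kind" whose past durations are a usable predictor. It keeps a running mean of observed durations per kind, falls back to a default guess for unseen kinds, and greedily places each task on the worker with the least estimated load.

```python
from collections import defaultdict


class DurationEstimator:
    """Running mean of observed durations per task kind (hypothetical helper)."""

    def __init__(self, default=1.0):
        self.default = default  # guess used before any observation exists
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def observe(self, kind, seconds):
        # Record one finished task of the given kind.
        self.total[kind] += seconds
        self.count[kind] += 1

    def estimate(self, kind):
        # Fall back to the default when nothing has been observed yet.
        if self.count[kind] == 0:
            return self.default
        return self.total[kind] / self.count[kind]


def assign(tasks, workers, estimator):
    """Place each (name, kind) task on the worker with the least estimated load."""
    load = {w: 0.0 for w in workers}
    placement = {}
    for name, kind in tasks:
        target = min(workers, key=lambda w: load[w])
        placement[name] = target
        load[target] += estimator.estimate(kind)
    return placement


estimator = DurationEstimator(default=1.0)
estimator.observe("train", 10.0)
estimator.observe("train", 14.0)

tasks = [("t1", "train"), ("t2", "parse"), ("t3", "parse"), ("t4", "train")]
placement = assign(tasks, ["w1", "w2"], estimator)
print(placement)
```

The sketch ignores data transfer costs and task dependencies, which are exactly the parts that make the real problem hard.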
    Weiyuan Wu
    @dovahcrow
    @spirali Thanks! I'm currently working on another project and will come back to you later. This sounds like an interesting problem to me.
    Tanapat Xu
    @tnpxu
    Howdy
    Has anyone here ever used Rain for long-running tasks like streaming?
    Stanislav Böhm
    @spirali
    As far as I know, no one has reported this use case to us, and our background is more batch-oriented. However, we have had some ideas for streaming improvements, so if you have a use case or a missing feature, please let us know.
    raja sekar
    @rajasekarv
    Hi
    Is it possible to use distributed file systems like GlusterFS or HDFS along with Rain?
    Stanislav Böhm
    @spirali
    @rajasekarv Hi, Rain does not have any explicit support for distributed file systems. Of course, Rain can read files from any file system that is mounted on the workers in the usual way.
    robbertvc
    @robbertvc
    Hi all.
    Is Rain development still ongoing? GitHub metrics show some decay in activity. I'm considering Rain for a project. Is Rain feature-complete enough for production? Anyway, thanks for the project, it looks great.
    Stanislav Böhm
    @spirali

    @robbertvc The project is still alive. We consider the basic infrastructure more or less finished. There is some technical debt, but nothing serious from a user's perspective. So without an outside impulse, we have no big plans for the near future. But we will try to fix any bug that appears.

    On the other hand, we are still working on related things. We are experimenting with schedulers (https://github.com/spirali/estee) and we would like to implement a common scheduling API into several projects, including Rain.

    robbertvc
    @robbertvc
    Cool! That's nice to hear. I will play around with the project then.
    Estee looks interesting as well.
    robbertvc
    @robbertvc
    Is Estee a tool to compare frameworks like Ray and Dask (and Rain)?
    Stanislav Böhm
    @spirali
    @robbertvc No, Estee is for comparing scheduling algorithms. However, we plan to create a common API that can be integrated into Dask and Rain, so in some sense, we will be able to compare the Dask and Rain runtimes with the same scheduler.
    Tanapat Xu
    @tnpxu
    Guys, any ideas on dynamically adding a new remote worker and pushing new executor code?
    Stanislav Böhm
    @spirali
    @tnpxu It is now possible to add a new worker dynamically. What is not yet finished is that the server currently assumes that all workers support all executors, but with some changes in the scheduler, this can be implemented relatively easily.
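
The kind of scheduler change mentioned here could look roughly like the sketch below. It is not Rain code: `eligible_workers` and the worker/executor names are hypothetical, and it assumes each worker reports the set of executors it supports. The scheduler would then only consider workers that advertise the executor a task needs.

```python
def eligible_workers(workers, executor):
    """Return the names of workers that advertise support for `executor`.

    `workers` maps a worker name to the set of executor names it supports
    (an assumed data layout, for illustration only).
    """
    return [name for name, executors in workers.items() if executor in executors]


workers = {
    "w1": {"python", "cpp"},
    "w2": {"python"},
    "w3": {"rust"},
}

python_capable = eligible_workers(workers, "python")
print(python_capable)
```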
    Tanapat Xu
    @tnpxu
    What's the practical way to add a new node? (Let's say I start with 3 governor nodes defined in the host file, and then I want to add more nodes to this cluster.)
    Pedro Larroy
    @larroy
    Hey folks, I'm considering using Rain for a project. I have one question: how are node failures handled?
    Vojtech Cima
    @vojtechcima
    @larroy Sorry for the tardy reply. Node (worker) failures are still not handled gracefully, as node failures are not very common in our environment. Nonetheless, we are aware of this issue (substantic/rain#39) and would love to hear more about your use case.
    Vojtech Cima
    @vojtechcima
    @tnpxu Just spin up another governor process (e.g. on a new host) using "rain governor <SERVER-ADDRESS>", which points it to the server. The server then adds the new resources to the resource pool. You can find more info in the docs (https://substantic.github.io/rain/docs/install.html#starting-governors-manually).
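
If new governors are started from a script, the command quoted above can be wrapped as in this minimal sketch. It assumes the `rain` binary is on the PATH of the new host; the helper names and the address `rain-server.example.org` are placeholders, and only the `rain governor <SERVER-ADDRESS>` form from the message is used, with no extra flags.

```python
import shutil
import subprocess


def governor_command(server_address):
    """Build the argv for `rain governor <SERVER-ADDRESS>`."""
    if not server_address:
        raise ValueError("server address must not be empty")
    return ["rain", "governor", server_address]


def start_governor(server_address):
    """Start a governor as a child process.

    Returns the Popen handle, or None if the `rain` binary is not installed.
    """
    if shutil.which("rain") is None:
        return None
    return subprocess.Popen(governor_command(server_address))


# Placeholder address; replace it with the address of the running Rain server.
cmd = governor_command("rain-server.example.org")
print(" ".join(cmd))
```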
    Vincent yu
    @yutiansut

    Hello, I just found the Rain repo (https://github.com/substantic/rain/) and noticed that it has not been updated since 2018. I'm a quant and I'm trying to find a solution for streaming computation, like:

    RUST SCHEDULE — py worker / cpp worker / rust worker

    for real-time/historical analysis of financial data (like an agent model).

    So I'm curious why Rain is no longer updated. I like the idea behind Rain.
    Thanks

    Regards

    yutian

    Below are my GitHub profile, https://github.com/yutiansut, and my repo, https://github.com/QUANTAXIS/QUANTAXIS