Adam Morris
@abmorris
ok I will try a local deployment before moving to openstack
thanks
Adam Morris
@abmorris
Hi. So I've been added to the LHCb Analysis Preservation Openstack project and I see there's no Share/Share Storage quota. I requested some CephFS space, but I was asked by our computing team why I can't use EOS instead. I'm not sure I can answer definitively.
I'm also thinking whether it's possible to use a volume + NFS
6 replies
Adam Morris
@abmorris
Is there a helm 3.2.x installation on lxplus?
3 replies
Adam Morris
@abmorris

This is where I am with the NFS provisioner: https://codimd.web.cern.ch/s/ryXsju3gw#NFS-share

I feel like I'm missing something. When I describe the pods, the one called db is in a crash loop, and both the NFS provisioner and the REANA server pod show the warning "pod has unbound immediate PersistentVolumeClaims".

Adam Morris
@abmorris
right, I had to create a PersistentVolume
Adam Morris
@abmorris

ok, I'm stuck. Could someone check that CodiMD link ^ and tell me where I've gone wrong?
I have an OpenStack volume connected to a PersistentVolume. The NFS provisioner is running and the PersistentVolumeClaim created by REANA is attached successfully.

The db pod is still in a crash loop, while the server and workflow-controller pods are stuck in ContainerCreating

6 replies
Tibor Šimko
@tiborsimko
@abmorris Two meta-comments while inspecting the PVC situation: (1) If you plan to test some LHCb workflows, would it be feasible to use reana.cern.ch instance? There is a possibility to access restricted resources via keytab if you wish, we have some ATLAS and CMS examples. We could perhaps talk in CERN's REANA Mattermost channel about your particular use case? (2) Concerning supporting EOS, it is currently necessary to have some Ceph/NFS shared storage for cluster nodes indeed. One can do stage-in/stage-out of data from/to EOS already, but this is not practical to do for all the workflow steps... We are planning to look into a possibility of running user workflows directly on the user EOS workspace; that might perhaps be of interest?
5 replies
Chris Hollowell
@hollowec
Hi all. We've deployed a test REANA instance on a k8s cluster at BNL. Ideally, we would want to run this in our openshift cluster instead. Has anyone successfully done this without admin-level privileges in openshift? I'm getting some errors about access to clusterrolebindings when using helm to install (potentially related to https://github.com/reanahub/reana/pull/291/). We would also need to use openshift routes instead of Traefik ingress. Also, does anyone have documentation or an example of configuring REANA to use K5/non-CERN IDP auth, instead of local accounts?
Stian Soiland-Reyes
@stain
Hi, is https://reana.cern.ch/ meant to be down/firewalled?
Tibor Šimko
@tiborsimko
Hi @stain, yes reana.cern.ch is still behind CERN firewall and used for physics workflows. We don't have any public "demo" site open to non-CERN users yet, but hopefully this will change in the coming weeks.
Tibor Šimko
@tiborsimko
Hi @hollowec, sorry for the late reply! It is now possible to install REANA easily into a particular namespace, so some of the things necessary for the RIVER deployment are now available out of the box. However we haven't looked systematically into the limited privileges yet... We can revive that thread though! Also, haven't tried with OpenShift yet (this could be less work) or with non-local non-CERN SSO options (this could be more work). Have tried to deploy on Google Cloud GKE, that was working well out of the box (using local accounts).
Chris Hollowell
@hollowec
OK thanks @tiborsimko. It would be nice to integrate with our IDP if possible, and that would likely be needed before this could be adopted on a larger scale at BNL. If we are using local accounts, is there a way to disable the sign-up functionality on the web interface? I would prefer to create users manually on the reana server command-line (flask user-create) as needed. Does user-create allow the assignment of a web-ui password? Also, is there a way for users to change their password via the web interface?
If you can provide an openshift-compatible helm chart, or associated helm values needed for use with openshift with the existing chart that would be great.
Tibor Šimko
@tiborsimko
We used to have it exactly as you described in 0.6 and past releases, that is, there was no sign-up on the web and only the admin could create user accounts. As of the 0.7 release, users can sign up by themselves, but please note that this action is not sufficient to launch any workflows yet; the user has to request an access token via a web interface button, which sends an email to the admin, and it is up to the admin to grant or refuse the request. Only after the admin grants the token can the signed-up user start using the system. Would this be sufficient for you? Or would you still like to block sign-up? If so, please open an issue on GitHub, we can make the sign-up feature optional upon deployment!
Chris Hollowell
@hollowec
Hi @tiborsimko , thanks yes I saw I needed to grant a user a token before they can use the system. The issue is if the web interface were to be opened to the world, someone could go through and create accounts for others, because there is no email verification needed to create the account. For instance, I could register an account for someone, say "someuser@bnl.gov", and then the real "someuser@bnl.gov" can't create an account unless I delete the original bogus user first. Also if it was fully open, external entities could fill up the user database with entries. Therefore, I think the signup should be an option: I will open a ticket in github. Thanks!
Mattias de Hollander
@mdehollander
Hi there. I have been keeping an eye on this project for a while and I really like it :) I have also been following the development of Renku. I like features from both: 1) with Reana you can submit CWL workflows to a cluster; 2) with Renku you can launch an interactive RStudio/Jupyter session. If what I found is correct, there were attempts in 2018 to integrate both systems. There is also an integration label on github (https://github.com/reanahub/reana/issues?q=label%3Aintegration%2Frenku+). What is the current stage/plan for this integration? And now that Reana-UI has an option to launch an interactive session for Jupyter (reanahub/reana-ui#77), would this integration still be needed? Thanks for the feedback, looking forward to start playing with it.
Kyle Cranmer
@cranmer
Hi all... I wanted to discuss a REANA deployment at NYU using the Slurm backend. I'm aware of the efforts there, but I'm not sure what the status is. When I look at the docs I only see references to Slurm at CERN:
http://docs.reana.io/advanced-usage/compute-backends/slurm/
and I don't see anything about deploying at scale:
http://docs.reana.io/development/deploying-at-scale/
I also don't see much about deploying for HTCondor in the docs, and I think @hollowec would also be interested in that.
Tibor Šimko
@tiborsimko
@mdehollander Hi and sorry for the late reply! We've been in touch with RENKU in the past, exactly around the idea of running CWL workflows produced by RENKU on the REANA backend... but we haven't pursued the discussions lately. REANA focuses mostly on running the batch workflows, so the support for running interactive notebooks is a bit "secondary" there. The current aim is basically to allow people to open a notebook alongside running batch workflow for some quick explorations of intermediate data on the workflow's workspace. Kind of getting a remote shell on the workspace, if you will... That said, it might be interesting to bridge the notebook world and the workflow world better, e.g. to allow dispatching REANA workflows directly from the notebook, for those users who use notebooks as the main entry interface. We plan to enrich web interface with some R/W features in 2021, so we could perhaps revive this topic soon!
Tibor Šimko
@tiborsimko
@cranmer Hi, REANA contains a job controller abstraction that is described in this mini-paper: http://cds.cern.ch/record/2696223/files/CERN-IT-2019-004.pdf. This allows dispatching jobs to CERN HTCondor and CERN Slurm installations (using Kerberos authentication). We publish corresponding Docker images for CERN HTC/HPC backends here: https://hub.docker.com/r/reanahub/reana-job-controller-htcondorcern-slurmcern/tags?page=1&ordering=last_updated The deployment is easy: enable the backends in the configuration and replace the standard reanahub/reana-job-controller image (working only with Kubernetes) by the one that contains CERN HPC/HTC support. IOW, one would basically use the same Helm deployment recipe and only change the image. WRT adding support for the NYU cluster, it should be possible by sub-classing the abstract job controller class and overriding methods to use NYU specifics instead of CERN specifics, see Figure 2 in the cited PDF mini-paper. This was done successfully by the SCAILFIN@NotreDame team for the VC3, so I guess the design abstraction is already tested to be relatively CERN-agnostic. If you also use Kerberos kinit for NYU Slurm, it should not be that difficult an adaptation I guess... Perhaps just replacing the name of the head node etc... We could take this as an occasion to better parametrise the Slurm backend.
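The sub-classing idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, method names, and the head node hostname are hypothetical and do not reproduce the actual reana-job-controller API, and container handling and Kerberos setup are left out. The commands are built but only printed unless a real runner is injected.

```python
import shlex
from abc import ABC, abstractmethod


class JobManager(ABC):
    """Hypothetical stand-in for an abstract job manager (not the real REANA API)."""

    def __init__(self, cmd, workspace):
        self.cmd = cmd
        self.workspace = workspace

    @abstractmethod
    def execute(self):
        """Submit the job and return a backend job identifier."""

    @abstractmethod
    def stop(self, backend_job_id):
        """Cancel a previously submitted job."""


class SlurmJobManager(JobManager):
    """Site-agnostic Slurm logic; subclasses only supply site details."""

    head_node = None  # set by each site-specific subclass

    def __init__(self, cmd, workspace, run=None):
        super().__init__(cmd, workspace)
        # `run` takes an argv list and returns the command's stdout. It is
        # injectable so the sketch can be exercised without a cluster.
        self.run = run or self._dry_run

    @staticmethod
    def _dry_run(argv):
        print("would run:", " ".join(argv))
        return "0"

    def submit_command(self):
        # The remote command is quoted as one string because ssh hands it
        # to a shell on the head node.
        remote = shlex.join(
            ["sbatch", "--parsable", "--chdir", self.workspace, "--wrap", self.cmd]
        )
        return ["ssh", self.head_node, remote]

    def execute(self):
        # With --parsable, sbatch prints "jobid" or "jobid;cluster".
        return self.run(self.submit_command()).strip().split(";")[0]

    def stop(self, backend_job_id):
        self.run(["ssh", self.head_node, shlex.join(["scancel", backend_job_id])])


class NYUSlurmJobManager(SlurmJobManager):
    """Site subclass: only the head node differs (hostname is a placeholder)."""

    head_node = "slurm-head.example.org"


if __name__ == "__main__":
    manager = NYUSlurmJobManager("echo hello", "/scratch/demo")
    manager.execute()
```

Keeping the shared Slurm logic in a middle class means a new site mostly overrides attributes such as the head node, which is in line with the suggestion to parametrise the backend rather than duplicate it.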
Kyle Cranmer
@cranmer
Thanks @tiborsimko
Chris Hollowell
@hollowec
Thanks @tiborsimko I will take a look at the document!
Tibor Šimko
@tiborsimko
@hollowec Please see also our latest paper from CHEP2019 that has some more details on this https://www.epj-conferences.org/articles/epjconf/abs/2020/21/epjconf_chep2020_06041/epjconf_chep2020_06041.html
Kyle Cranmer
@cranmer
Thanks again @tiborsimko that's also useful. Is there a plan to add this information on SLURM to the admin documentation ?
Tibor Šimko
@tiborsimko
Indeed, we have focused mostly on user-level documentation; the admin-level documentation regarding HTC/HPC deployment is lagging behind...
Chris Hollowell
@hollowec
Thanks @tiborsimko I will take a look at the CHEP paper
Mattias de Hollander
@mdehollander

@mdehollander Hi and sorry for the late reply! We've been in touch with RENKU in the past, exactly around the idea of running CWL workflows produced by RENKU on the REANA backend... but we haven't pursued the discussions lately. REANA focuses mostly on running the batch workflows, so the support for running interactive notebooks is a bit "secondary" there. The current aim is basically to allow people to open a notebook alongside running batch workflow for some quick explorations of intermediate data on the workflow's workspace. Kind of getting a remote shell on the workspace, if you will... That said, it might be interesting to bridge the notebook world and the workflow world better, e.g. to allow dispatching REANA workflows directly from the notebook, for those users who use notebooks as the main entry interface. We plan to enrich web interface with some R/W features in 2021, so we could perhaps revive this topic soon!

Thanks for the update @tiborsimko

Kyle Cranmer
@cranmer
image.png
making progress with REANA @ NYU with slurm backend:
Michael R. Crusoe
@mr-c
:sparkling_heart:
Tibor Šimko
@tiborsimko
Nice and green! :four_leaf_clover:
Tibor Šimko
@tiborsimko

is there a way to disable the sign-up functionality on the web interface?

@hollowec We have just released REANA 0.7.2 which has a deployment option to disable the sign-up functionality. Please see https://blog.reana.io/posts/2021/reana-0.7.2/

Chris Hollowell
@hollowec
Thanks @tiborsimko !
jordidem
@jordidem
Dear all,
Long time since the last post (2019!!!)... We are testing REANA at PIC, here in Barcelona. We have a version working with Kubernetes and now our idea is to integrate REANA with our HTCondor system. We checked the documentation and the code and saw that the default configuration provided is quite CERN-dependent, so we are wondering if there is any detailed procedure for configuring other Compute Backends that use other HTCondor environments. We reached https://github.com/reanahub/reana-job-controller/tree/master/etc and found the 10_cernsubmit.config and krb5.conf files, which look to me like what we have to modify or extend with our configuration. Is that correct? Do we need to install HTCondor packages and daemons to turn our REANA into a submit machine?
Diego
@diegodelemos

detailed procedure to change the configuration to configure other Compute Backends using other HTCondor environments

Hello @jordidem. To have your own HTCondor support you would have to subclass the JobManager class, you can find a detailed explanation on how it works in this paper http://cds.cern.ch/record/2696223/files/CERN-IT-2019-004.pdf. Probably there will be many commonalities with the CERN implementation, in that case, we could have a look at it and perhaps create a superclass with the common code?

Do we need to install HTCondor packages and daemons to convert our reana on a submit machine?

Yes, that is the other part that needs extension. As an example, at CERN we build REANA Job Controller with the htcondorcern build argument, which includes the necessary packages to submit jobs to CERN HTCondor. In your case you would have to add the instructions to install the packages you need.

Chris Hollowell
@hollowec
Hi everyone. Quick question: we now have CVMFS mounted on our k8s hosts. I tried accessing a CVMFS volume in my test workflow YAML, using the instructions listed here: https://docs.reana.io/advanced-usage/code-repositories/cvmfs/, specifying the cvmfs repo under 'resources', but it does not mount in the container. Is there anything else I need to do to enable this? Are any changes to the reana-job-controller image necessary?
Tibor Šimko
@tiborsimko
Hi @hollowec, two considerations: (1) REANA cluster templates are set up so that CVMFS volumes are mounted onto job pods via CSI PVCs. So CVMFS does not actually have to be mounted on the nodes themselves; it will get mounted for the jobs "on demand". This technique may be CERN-K8s-specific, if you don't use CSI on your end. Also, the resources clause applies to these mounts. (2) If you have /cvmfs/... mounted on your nodes already, then you would basically need to just expose these host mounts to job pods without CSI. This can be done at cluster creation time, e.g. for development purposes, if I have a node where /cvmfs is mounted, I create the cluster as follows: reana-dev cluster-create -m /var/reana:/var/reana -j /cvmfs:/cvmfs. Here, the resources clause does not have to be used, since any host path could be mounted in a similar way, such as /mybigdata/.... So this second technique isn't actually CVMFS-specific. Please tell me which one you would like to try and I can point you to the appropriate code.
Chris Hollowell
@hollowec
Thanks @tiborsimko! We do not have any CSI drivers setup in our k8s cluster currently, so the solution of bindmounting /cvmfs into the containers seems to be the simplest path forward
crusoe
@mr-c:matrix.org
[m]
@tiborsimko we got a request for CWL v1.2 on Kubernetes in another chat room, so I opened reanahub/reana-workflow-engine-cwl#187
Let me know if you have difficulties with the cwltool upgrade, many changes since 2019 🤓
Tibor Šimko
@tiborsimko
@mr-c:matrix.org Thanks! Yes we are still on cwltool 1.0.2019... Do you recommend to jump to 3.0.2021... or 3.1.2021...? Any particular incompatibilities or things to look at?
2 replies
Tibor Šimko
@tiborsimko

@hollowec Sorry for the late reply! Please have a look at the documentation of components.reana_workflow_controller.environment.REANA_JOB_HOSTPATH_MOUNTS in our helm readme and the explanations in our commons code. Basically, when you create the cluster you should pass something like:

components:
  reana_workflow_controller:
    environment:
      REANA_JOB_HOSTPATH_MOUNTS: '[{"name": "cvmfs", "hostPath": "/cvmfs", "mountPath": "/cvmfs"}]'
      REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES: failed
    image: reanahub/reana-workflow-controller

This will make the /cvmfs path from the nodes available to job pods via host path mounting.
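Since the value of REANA_JOB_HOSTPATH_MOUNTS above is a JSON list embedded in a string, it may be less error-prone to generate it than to type it by hand. A minimal sketch, assuming only the three keys shown in the snippet above (name, hostPath, mountPath); the helper function name and the absolute-path check are illustrative and not part of REANA:

```python
import json


def hostpath_mounts_value(mounts):
    """Serialise (name, host_path, mount_path) triples to a JSON string.

    Hypothetical helper: it only mirrors the key names used in the Helm
    values snippet and checks that both paths are absolute.
    """
    entries = []
    for name, host_path, mount_path in mounts:
        if not host_path.startswith("/") or not mount_path.startswith("/"):
            raise ValueError(f"paths for mount {name!r} must be absolute")
        entries.append(
            {"name": name, "hostPath": host_path, "mountPath": mount_path}
        )
    return json.dumps(entries)


# Reproduces the single /cvmfs mount from the snippet above.
value = hostpath_mounts_value([("cvmfs", "/cvmfs", "/cvmfs")])
print(value)
```

The resulting string can then be pasted, in single quotes, as the REANA_JOB_HOSTPATH_MOUNTS value; further host paths are added as extra triples in the list.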

Tibor Šimko
@tiborsimko
REANA 0.7.4 is released! This minor release update allows users and cluster administrators to specify memory limits for workflow jobs running on the Kubernetes compute backend platform. The release also improves the REANA client functionality for workflow parameter validation and contains other minor improvements and bug fixes. Please see the detailed blog post for more information :point_right: https://blog.reana.io/posts/2021/reana-0.7.4/
Chris Hollowell
@hollowec
Thanks for the info on REANA_JOB_HOSTPATH_MOUNTS @tiborsimko !
Irina Espejo
@irinaespejo

Hello developers! I'm re-posting here a comment by @sinclert on Mattermost. We at NYU are really interested in this ongoing bug.

Hi team :wave:

We have been recently trying to run a workflow on your REANA Kubernetes instance, but receiving cryptic logs without any useful debugging information (just node: <...> failed. reason: unknown). This is the reference to the workflow: https://reana.cern.ch/details/d834f61f-3da0-42a6-8606-f5117927539b

Are there any deeper logs you could access as admin, that could help us debug what is going on?

As a hint: the workflow in question was working, on the CERN Kubernetes instance, on the very early versions of REANA 0.7 (0.7.0, 0.7.1...) but eventually it stopped working. Could there be any Kubernetes-level resource policies involved?

Thanks!

Rohit Sarkar
@rsarky
Hey everyone!
I want to set up a minimal reana development environment locally.
According to the docs here: http://docs.reana.io/administration/deployment/deploying-locally/
it recommends to: Fork and clone all REANA source code repositories:
This in turn tries to fork and clone several repositories, including examples
Which repositories do I need to clone to set up a minimal local dev env?