    Martin Cech
    @martenson
    In common deployments you either use NFS or Pulsar to stage in/out. Is Galaxy k8s different?
    Nuwan Goonasekera
    @nuwang
    It’s using NFS at the moment, but the plan is to migrate fully to Pulsar as the default (once Pulsar's remote object fetching is fully tested and ready)
    Martin Cech
    @martenson
    So in that way the Galaxy through Helm mimics the common deployment (file_path on NFS)?
    Nuwan Goonasekera
    @nuwang
    yes
    Martin Cech
    @martenson
    Alright, splendid. Thank you both.
    (And thanks for joining committers Nuwan :tada: )
    Nuwan Goonasekera
    @nuwang
    Np and glad to!
    Nate Coraor
    @natefoo
    I'm trying to figure out how to set cpu and memory limits and I assume it's done there?
    This seems like it probably shouldn't return a dict if default_resource_set is unset: https://github.com/galaxyproject/galaxy-helm/blob/master/galaxy/files/rules/k8s_container_mapper.py#L68
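A minimal sketch of the kind of guard being suggested for that rule: return nothing (no resource constraints) rather than an empty or partial dict when no default_resource_set is configured. The function and parameter names here are hypothetical, not the actual code in k8s_container_mapper.py.

```python
# Hypothetical guard: names are illustrative, not the real rule code.
def _get_resource_set(params, resource_sets):
    resource_set_name = params.get("default_resource_set")
    if not resource_set_name:
        # Nothing configured: signal "no limits" explicitly rather than
        # handing the runner an empty dict it might treat as a real spec.
        return None
    return resource_sets.get(resource_set_name)
```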
    Nuwan Goonasekera
    @nuwang
    @natefoo Did you run into an issue?
    Nate Coraor
    @natefoo
    I don't think there's a way to set any of these (or the walltime) for Pulsar coexecution pods
    Nuwan Goonasekera
    @nuwang
    Luke and I were just talking about that the other day. A lot of these options in the k8s runner will need to be ported over
    Nate Coraor
    @natefoo
    Yep
    Can I get kubectl describe (or another command) to output a spec? I don't see an option for it
    ah get ... -o yaml
    I'll test on Test
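For reference, the same check that `kubectl get ... -o yaml` gives can be scripted with the official Python client; the job name and namespace below are placeholders, not values from this deployment.

```python
# Sketch: dump a submitted Job's spec to confirm resources/activeDeadlineSeconds landed.
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig; use load_incluster_config() in-cluster
batch = client.BatchV1Api()
job = batch.read_namespaced_job("gxy-example-job", "default")  # placeholder name/namespace
print(job.spec.active_deadline_seconds)                        # Job-level walltime, if set
print(job.spec.template.spec.containers[0].resources)          # per-container requests/limits
```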
    Marius van den Beek
    @mvdbeek
    Before replicating all this between Galaxy and Pulsar in slightly different ways, it might be a good idea to converge this in Pulsar, and maybe make all of this templateable?
    Longer term ofc
    Nuwan Goonasekera
    @nuwang
    Yes, that would be good. There was a preliminary attempt at this which was abandoned pending the shift to Pulsar, but perhaps the template etc. can still be used? galaxyproject/galaxy#10714
    Nate Coraor
    @natefoo
    pfft copy paste is the best method ;D
    Nate Coraor
    @natefoo
    (yes, you're right, but if this works I'm PRing it anyway as I need it to work in production and we can refactor it later)
    Marius van den Beek
    @mvdbeek
    :+1:
    Nate Coraor
    @natefoo
    requests and limits work but activeDeadlineSeconds doesn't - does this work on your regular (non-Pulsar) k8s jobs, @nuwang?
    Nuwan Goonasekera
    @nuwang
    Yes. The job should stop, but it won’t get cleaned up
    or may not get cleaned up
    Nate Coraor
    @natefoo
    ahh the option is at the wrong level of the dict
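For context on the nesting being discussed: in a Kubernetes Job manifest, `activeDeadlineSeconds` is a field of the spec (Job spec, and also the pod template spec), not of a container, while requests/limits sit per container under `resources`. A sketch with placeholder values, showing only where the fields live:

```python
# Placeholder values; this only illustrates where the fields nest in a batch/v1 Job.
job_manifest = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "activeDeadlineSeconds": 86400,          # walltime lives at the spec level
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "galaxy-job",
                        "image": "example/image:tag",
                        "resources": {           # requests/limits live per container
                            "requests": {"cpu": "1", "memory": "2Gi"},
                            "limits": {"cpu": "2", "memory": "4Gi"},
                        },
                    }
                ],
            }
        },
    },
}
```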
    Nate Coraor
    @natefoo
    Nice, fixed
    Nuwan Goonasekera
    @nuwang
    :+1:
    Pablo Moreno
    @pcm32
    Hi, question @nuwang: is it me, or do the docs in https://github.com/galaxyproject/galaxy-docker-k8s not really explain how to build that docker image? The repo doesn't have a Dockerfile, yet all the explanations use docker build. I'm guessing you need to use the Ansible playbook there, but it would be great if the readme were explicit about those steps. Thanks!
    Gulsum Gudukbay
    @gulsumgudukbay
    Hi, I am following this tutorial: https://training.galaxyproject.org/training-material/topics/admin/tutorials/k8s-deploying-galaxy/tutorial.html#deploying-a-cvmfs-enabled-configuration to deploy Galaxy on k8s. However, when I reach the end of the tutorial and run "kubectl get pods", the pod "galaxy-metrics-scraper-9844bf959-8m8wk" has status "CreateContainerConfigError" and all the other pods are "Pending". To understand why this pod has the ConfigError, I checked the logs and there seems to be a problem with influxdb secret creation (Error: couldn't find key influxdb-user in Secret default/galaxy-galaxy-secrets). How can I solve this problem?
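One way to confirm which keys the secret actually contains, a sketch using the `kubernetes` Python client; the secret and namespace names are taken from the error message above and may differ in your release.

```python
# Sketch: list the keys present in the secret the metrics scraper expects.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
secret = core.read_namespaced_secret("galaxy-galaxy-secrets", "default")  # names from the error
print(sorted((secret.data or {}).keys()))  # "influxdb-user" should appear if the secret is complete
```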
    Alexandru Mahmoud
    @almahmoud
    @pcm32 I replied to your issue on GitHub
    Alexandru Mahmoud
    @almahmoud
    @gulsumgudukbay Could you give some more details on the type of environment you're deploying in? Is it on a managed k8s (gke, aks, eks, etc...) or a cluster you built yourself? Are you looking to deploy only 1 Galaxy or multiple ones in the same cluster? Are you looking to run a single-node or multi-node cluster? Do you have a shared filesystem eg NFS (you will need one if you have more than one node)? And finally what version of k8s are you running in, and what values did you change, if any?
    Those tutorials are a bit out of date since the defaults have changed in the charts. This might be more useful to you for a basic installation with CVMFS, but happy to help more for potential specific things for your use-case: https://github.com/galaxyproject/galaxy-helm/tree/2009readme#example-installation-for-a-single-galaxy-instance-with-cvmfs-tools
    Gulsum Gudukbay
    @gulsumgudukbay
    Hi @almahmoud, I set up a k8s cluster that has 1 kubemaster and 1 node, and I want to deploy only 1 Galaxy instance on that cluster. It is a multi-node cluster. I also tried the CVMFS section of the tutorial and got the same error. The kube version is Major:"1", Minor:"20", GitVersion:"v1.20.4", for both nodes.
    I will try the github link you mentioned and let you know if there's any problem. Thanks!
    Alexandru Mahmoud
    @almahmoud
    Do you have a ReadWriteMany storage class? Also, I'm not sure if we've tested the chart with 1.20; there might be some API versions out of date or such things... Let me know how it goes though, I'd be happy to help start pushing to make it compatible with 1.20 if things don't work
    Gulsum Gudukbay
    @gulsumgudukbay

    @almahmoud I checked the PVC using kubectl describe pvc and I got "FailedBinding 2m55s (x17605 over 3d1h) persistentvolume-controller no persistent volumes available for this claim and no storage class is set" (I am not sure if this is the correct way to check it).
    Also I tried the tutorial which is in the link you sent, however I am getting "Error: rendered manifests contain a resource that already exists. Unable to continue with install: StorageClass "cvmfs-gxy-data" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "my-galaxy-release": current value is "cvmfs"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "default": current value is "cvmfs""
    Error after I try to execute "helm install my-galaxy-release cloudve/galaxy --set cvmfs.enabled=true --set cvmfs.deploy=true"

    I tried to delete every pod, every namespace and every helm deployment to do a clean start, however, I still get this error.

    Alexandru Mahmoud
    @almahmoud
    Regarding the first error, you need a storage class to dynamically provision volumes for you, or you need to provision volumes yourself for persistence (this would be for persistent volumes for data, not the node disks that hold the ephemeral storage). If you're planning on running all of the Galaxy pods on one node, any volume should be enough. If you want to run across multiple nodes, you will need some sort of shared filesystem. We have used NFS and CephFS in the past.
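A minimal sketch of the "provision volumes yourself" route: a statically provisioned, NFS-backed ReadWriteMany PersistentVolume created with the `kubernetes` Python client. The server address, export path, size, and names are placeholders; a StorageClass that provisions dynamically is the other route mentioned above.

```python
# Sketch: statically provision one ReadWriteMany NFS volume for the chart's claims to bind.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
pv = client.V1PersistentVolume(
    metadata=client.V1ObjectMeta(name="galaxy-nfs-pv"),           # placeholder name
    spec=client.V1PersistentVolumeSpec(
        capacity={"storage": "50Gi"},                             # placeholder size
        access_modes=["ReadWriteMany"],                           # needed when pods span nodes
        nfs=client.V1NFSVolumeSource(server="10.0.0.10",          # placeholder NFS server
                                     path="/export/galaxy"),      # placeholder export path
    ),
)
core.create_persistent_volume(pv)
```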
    Regarding the second issue, it seems you have resources from a previous helm release that were not deleted. This is likely due to an uninstall that has not properly ended in the past. From that error, it seems you still have leftover resources from a CVMFS installation under cvmfs namespace. This is only an issue because some resources, like the storage class, are not namespaced. So while the new release installed all the namespaced resources in default namespace for the new release, it conflicted with existing resources when trying to also create the cluster-wide resources such as the storage class. If it's not too hard, it might be worth just starting from a new cluster, adding the storage, and then trying again, rather than trying to clean up the existing one
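A sketch for spotting leftover cluster-scoped resources from an old release, by checking the Helm ownership annotations quoted in the error above (uses the `kubernetes` Python client; shown for StorageClasses only).

```python
# Sketch: find StorageClasses still annotated as belonging to some Helm release.
from kubernetes import client, config

config.load_kube_config()
storage = client.StorageV1Api()
for sc in storage.list_storage_class().items:
    annotations = sc.metadata.annotations or {}
    release = annotations.get("meta.helm.sh/release-name")
    if release:  # StorageClass still claimed by a Helm release (e.g. the old "cvmfs" one)
        print(sc.metadata.name, release, annotations.get("meta.helm.sh/release-namespace"))
```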
    Gulsum Gudukbay
    @gulsumgudukbay
    I want to run Galaxy across multiple nodes, so I will need a shared filesystem.
    I will try to set up the k8s cluster again from the beginning. Thanks for the advice. I will let you know whether it works. Thanks!
    Alexandru Mahmoud
    @almahmoud
    Hope it helps, let me know if you encounter other issues
    Po Foster
    @pofoster_gitlab
    I am running Galaxy on AWS EC2 using cloudlaunch (https://launch.usegalaxy.org/appliances) and the GVL appliance. It seems that I need a .pem file to ssh into the EC2 instance after Galaxy was set up. There is a keypair in my AWS console for cloudlaunch, but I don't have the .pem file (there is no way to download it) and password authentication seems to be disabled. Is there a way to ssh into the machine?
    Nuwan Goonasekera
    @nuwang
    The .pem file is available for download for an hour after launch through the cloudlaunch interface. If you didn’t download it, and you need to ssh in, the easiest thing is to relaunch a fresh instance. It’s often easier to just create a keypair yourself in AWS, and select that keypair at launch time.
    If you really need access to this instance itself, then it’s a bit more of an involved process with something like: https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-ec2reset.html or https://sennovate.com/how-to-reset-the-forgotten-root-password-in-aws-ec2-instance/
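The "create a keypair yourself and select it at launch" route can also be scripted with boto3 before launching; the key name and output path below are placeholders, and AWS credentials are assumed to be configured locally.

```python
# Sketch: create an EC2 key pair up front and keep the .pem so the instance is reachable over ssh.
import os
import boto3

ec2 = boto3.client("ec2")                                    # assumes AWS credentials are configured
key = ec2.create_key_pair(KeyName="cloudlaunch-gvl")         # placeholder key name
pem_path = os.path.expanduser("~/.ssh/cloudlaunch-gvl.pem")  # placeholder output path
with open(pem_path, "w") as fh:
    fh.write(key["KeyMaterial"])                             # private key is only returned at creation time
os.chmod(pem_path, 0o400)                                    # ssh refuses keys with loose permissions
```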
    Po Foster
    @pofoster_gitlab
    @nuwang Thanks, that works. I created a keypair on AWS, then used the advanced deployment option in cloudlaunch to select the keypair.