    Helena
    @hexylena:matrix.org
    [m]

    ConditionPathExists=/etc/slurm/slurm.conf was not met

    looks like your config file is missing.

    in the defaults for the slurm role, you will see:

    slurm_config_dir: "{{ '/etc/slurm-llnl' if __slurm_debian else '/etc/slurm' }}"

    but if that is giving you the wrong result, you can override it in your group variables and set it manually:

    slurm_config_dir: /etc/slurm-llnl
    bvalot
    @bvalot
    OK, I'll try.
    bvalot
    @bvalot
    Same error, with this log:
    Sep 29 14:42:32 chrono-galaxy slurmd[29495]: (null): _log_init: Unable to open logfile `/var/log/slurm-llnl/slurmd.log': No such file or dir>
    It seems the __slurm_debian variable is not set correctly.
    bvalot
    @bvalot
    any idea?
    Helena
    @hexylena:matrix.org
    [m]
    this seems like a different error, no?
    the log dir is missing?
    bvalot
    @bvalot
    As far as I can see, Slurm works differently on Debian, which is why the role has a special case for Debian systems,
    but I don't know why __slurm_debian is not set to true:
    __slurm_debian: "{{ ansible_os_family == 'Debian' }}"
    hexylena
    @hexylena:matrix.org
    [m]
    you can check the results of that with something like the following (on the target system)
    ansible -m setup -i localhost, localhost -e ansible_connection=local
    and see what ansible_os_family is set to.
    Is it not the text Debian?
    bvalot
    @bvalot
    "ansible_os_family": "Debian"
    Helena
    @hexylena:matrix.org
    [m]
    🤔
    bvalot
    @bvalot
    When I override this, I get the log error instead of the original config file error:
    slurm_config_dir: /etc/slurm
    Helena
    @hexylena:matrix.org
    [m]
    one error at a time then :) so override it, and then on to the log error.
    wow it's the one thing Nate Coraor didn't make a variable
    bvalot
    @bvalot
    To me, it is linked to the same thing.
    Helena
    @hexylena:matrix.org
    [m]
    it's a bit complicated to override the log dir specifically

    you can try setting

    __slurm_debian: true

    in your group variables; that should take precedence over the role's value.

    bvalot
    @bvalot
    that doesn't work
    bvalot
    @bvalot
    I tried overriding __slurm_debian: true directly in the role, but I still have the same problem with bad Slurm configs.
    Nate Coraor
    @natefoo:matrix.org
    [m]

    You can override the old llnl paths with the following (among other options in your slurm_config):

    slurm_config:
      SlurmctldLogFile: /var/log/slurm/slurmctld.log
      SlurmctldPidFile: /run/slurmctld.pid
      StateSaveLocation: /var/lib/slurm/slurmctld
      SlurmdLogFile: /var/log/slurm/slurmd.log
      SlurmdPidFile: /run/slurmd.pid
      SlurmdSpoolDir: /var/lib/slurm/slurmd

    This should match the newer Debian default paths.

    I'll need to figure out what version of Debian and what version of Ubuntu these changed in and update the role accordingly.
    bvalot
    @bvalot
    That works, nice!
    Now I need to configure Slurm properly for my server.
    Nate Coraor
    @natefoo:matrix.org
    [m]
    Here's an issue to track: galaxyproject/ansible-slurm#20
    Lucille Delisle
    @lldelisle
    Hi there,
    Are there any Galaxy instances (especially the usegalaxy.* ones) that have pre-indexed MAF files (i.e. a non-empty indexed_maf_files.loc)?
    Nate Coraor
    @natefoo:matrix.org
    [m]
    I should have some from about 10 years ago.
    bvalot
    @bvalot
    Hello. Could anyone help me configure the Slurm parameters for a server that is dedicated entirely to Galaxy?
    Nate Coraor
    @natefoo:matrix.org
    [m]
    @bvalot: Are you using Slurm just to run jobs directly on the Galaxy server or is there a cluster you are setting up?
    bvalot
    @bvalot
    just to run jobs directly on the galaxy server
    Nate Coraor
    @natefoo:matrix.org
    [m]
    Ah ok - Is Slurm up and running now (does it respond to e.g. the sinfo command)?
    bvalot
    @bvalot
    Yes, that works, and I can run jobs on Galaxy with Slurm:
    sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    debug*       up   infinite      1   idle localhost
    Nate Coraor
    @natefoo:matrix.org
    [m]
    Ah ok, perfect. So what still needs to be configured, then?
    bvalot
    @bvalot
    The slurm_config in group_vars?
    How do I let Slurm use all of the server's resources?
    Do I need to configure memory?
    bvalot
    @bvalot
    I tried a more resource-intensive job, a SPAdes assembly of a bacterial genome, and got this error:
    { "code_desc": "Out of memory error occurred", "desc": "Out of memory error: Out of memory error occurred", "error_level": 4, "match": "Cannot allocate memory", "stream": "stdout", "type": "regex" }
    bvalot
    @bvalot
    My current configuration, derived from the training:
    slurm_config:
      SelectType: select/cons_res
      SelectTypeParameters: CR_CPU_Memory
    Nate Coraor
    @natefoo:matrix.org
    [m]
    What do your slurm_partitions and slurm_nodes vars look like?
    bvalot
    @bvalot
    slurm_nodes:
    - name: localhost
      CPUs: 24
    And I have no slurm_partitions.
    Nate Coraor
    @natefoo:matrix.org
    [m]
    I am not sure whether there is a default memory allocation when using CR_CPU_Memory with the consumable resources plugin (check the Slurm Consumable Resources docs), but you can request a specific amount by adding --mem=N to the native specification in your Galaxy job conf, where N is an integer number of megabytes.
    SPAdes uses a lot of memory, so the problem (if Slurm is not limiting memory in this configuration) could be that your server does not have enough.
    jfeketet
    @jfeketet
    Hello, previously someone asked about an issue running the tool "Filter SAM or BAM, output SAM or BAM". I could not find any solution to it, and I have the same problem on my instance of Galaxy. I get the following error: "samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory", which looks like it could be a dependency problem with samtools. Any tips for fixing this? Thank you very much in advance for your help!