These are chat archives for nextflow-io/nextflow

17th
May 2018
Pierre Lindenbaum
@lindenb
May 17 2018 08:11

cool, I'll try and let you know.

too many jobs failed, I think I'll restart from scratch, but I'll keep in mind that it may be possible to run some specific jobs by hand.

Paolo Di Tommaso
@pditommaso
May 17 2018 08:11
:+1:
Tobias Neumann
@t-neumann
May 17 2018 08:19
@pditommaso I logged in again today, ran it and suddenly it worked. One of the many mysteries of coding... thanks for your help!
Paolo Di Tommaso
@pditommaso
May 17 2018 08:20
very good
Tobias Neumann
@t-neumann
May 17 2018 09:01
ok sorry, I was too quick to cheer. The task was stuck in the Slurm queue and not actually running, and I get the same error. I checked the .command.run file and there's no singularity call to be found
Paolo Di Tommaso
@pditommaso
May 17 2018 09:02
therefore there's a problem with your config
what's the output of the nextflow config command in the launch directory?
Tobias Neumann
@t-neumann
May 17 2018 09:04
There's none, but I use a GitHub repo workflow
nextflow run obenauflab/snv-calling-nf
Paolo Di Tommaso
@pditommaso
May 17 2018 09:05
is this the project you are trying to run?
Tobias Neumann
@t-neumann
May 17 2018 09:05
exactly - with a bunch of parameters appended and a slurm profile
Paolo Di Tommaso
@pditommaso
May 17 2018 09:07
let me check
Tobias Neumann
@t-neumann
May 17 2018 09:09
the latest commit also has the container directive in the process, because that was the only way to get it running for now - but in the version I'm trying, the directive is not there
Paolo Di Tommaso
@pditommaso
May 17 2018 09:17
Ok, there's a glitch in the config
if you look at the output of nextflow config you will see
process {
   publishDir = ['path':'./results', 'mode':'copy', 'overwrite':'true']
   errorStrategy = 'retry'
   maxRetries = 3
   maxForks = 3
   cpus = 1
   time = { 1.h * task.attempt }
   memory = { 1.GB * task.attempt }
   withName:gatk {
      container = 'docker://broadinstitute/gatk:4.0.4.0'
   }
   executor = 'local'
}

timeline {
   enabled = true
}

singularity {
   enabled = false
}

docker {
   enabled = true
}
what version of NF are you using?
Luca Cozzuto
@lucacozzuto
May 17 2018 09:50
Dear all, sometimes in the folder I find some links named input.1, input.2, etc.
do you know what they are?
I got them when using
    input: 
  file("*") from Aln_folders.collect()
Paolo Di Tommaso
@pditommaso
May 17 2018 09:51
when you have declared an input file which is not actually a file
therefore it creates that file automatically for you
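For context, a rough sketch of that behaviour (paths and the process name are made up): when a channel emits plain strings rather than file objects, a file('*') input stages them as auto-named links such as input.1 and input.2.

    // hypothetical sketch - strings instead of file objects end up as input.N
    strings_ch = Channel.from('/data/sample_A', '/data/sample_B')   // plain strings
    // files_ch = Channel.fromPath('/data/*')                       // real file objects keep their names

    process listInputs {
        input:
        file('*') from strings_ch.collect()   // staged as input.1, input.2

        """
        ls -l
        """
    }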
Luca Cozzuto
@lucacozzuto
May 17 2018 09:52
mmm I don't see where I did it..
Paolo Di Tommaso
@pditommaso
May 17 2018 09:53
it means that Aln_folders contains strings, not files
maybe the file path as a string?
Maxime Garcia
@MaxUlysse
May 17 2018 09:53
@lucacozzuto It has happened to me from time to time; try going back through your channels and you might find a value which is just a path (a string) and not an actual file
Luca Cozzuto
@lucacozzuto
May 17 2018 09:54
ok, I'll double check this! Many thanks
Luca Cozzuto
@lucacozzuto
May 17 2018 10:00
So the Aln_folders channel contains:
Got: [sim, /nfs/software/bi/biocore_tools/git/nextflow/isoExpression/work/e9/806270335fedb1f3592694924adcb1/SALMON_sim]
Got: [test, /nfs/software/bi/biocore_tools/git/nextflow/isoExpression/work/af/ac522aa9ecc2933254006a1cefa563/SALMON_test]
Paolo Di Tommaso
@pditommaso
May 17 2018 10:08
I think you need to use groupTuple instead of collect
what are sim and test?
Luca Cozzuto
@lucacozzuto
May 17 2018 11:12
two ids
Luca Cozzuto
@lucacozzuto
May 17 2018 11:19
btw using groupTuple gives me only input.1 and input.2...
Bioninbo
@Bioninbo
May 17 2018 11:20
Hello Paolo. Thanks a lot for completing nextflow-io/nextflow#256
Tobias Neumann
@t-neumann
May 17 2018 11:35
@pditommaso nextflow version 0.28.0.4779. ok I see the same thing for the config on my side. any apparent mistakes in the way I'm setting up the config?
Luca Cozzuto
@lucacozzuto
May 17 2018 11:36
so it's the ids that are messing things up... just removing them solved everything. Thanks @pditommaso!
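For the record, a minimal sketch of what removing the ids could look like here (channel and process names are assumed from the snippets above): map each [id, path] tuple down to the path before collecting, so only directories reach the file('*') input.

    // hypothetical sketch - drop the id element before collect()
    Aln_folders
        .map { id, path -> path }   // keep only the SALMON_* directory
        .collect()
        .set { aln_dirs_ch }

    process mergeCounts {
        input:
        file('*') from aln_dirs_ch

        """
        ls -l
        """
    }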
Paolo Di Tommaso
@pditommaso
May 17 2018 11:56
@t-neumann the problem is these two lines
remove them and it will work
Tobias Neumann
@t-neumann
May 17 2018 11:57
ok. I kept those lines in to say that the Docker engine should be used instead of Singularity for the standard profile. Can this still be done differently?
Paolo Di Tommaso
@pditommaso
May 17 2018 11:58
yes, use the curly brackets notation
and put it only in the profile scope
Bioninbo
@Bioninbo
May 17 2018 11:59

I have a question related to #256: I'm trying to set up a variable that has different values in different processes from the config file. It seems to work but I get a warning message. Can I get rid of this warning? Or is there a better way to do it? Here is the code:
in the config file:

process$process1.current_path = '/res_process1'

in the process process1:

publishDir path: "${current_path}"

warning message:

WARN: Unknown directive `current_path` for process `process1`
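No answer follows in this log, but as a hedged aside: the warning appears because current_path is not a recognised process directive, so one possible workaround (parameter names below are made up) is to carry the path in params and reference it from publishDir.

    // nextflow.config - hypothetical sketch using params instead of a custom directive
    params.process1_path = '/res_process1'

    // in the script, inside process process1
    publishDir path: "${params.process1_path}"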
Paolo Di Tommaso
@pditommaso
May 17 2018 11:59
(sorry I'm in a call now)
/offline
Tobias Neumann
@t-neumann
May 17 2018 11:59
@pditommaso you mean like this?
profiles {

        singularity {
            enabled = false
        }

        docker {
            enabled = true
        }

        standard {
            process.executor = 'local'
            process.maxForks = 3
        }

        sge {
            process.executor = 'sge'
            process.penv = 'smp'
            process.queue = 'public.q'
        }


         slurm {
            process.executor = 'slurm'
            process.clusterOptions = '--qos=medium'
            process.cpus = '28'
          }

}
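A hedged reading of the suggestion above (an illustrative sketch, not a confirmed fix): blocks named singularity and docker placed directly inside profiles { } are treated as profiles of those names rather than as the container scopes, so the enabled flags would instead sit inside each execution profile, e.g.:

    profiles {

        standard {
            process.executor = 'local'
            process.maxForks = 3
            docker {
                enabled = true
            }
        }

        slurm {
            process.executor = 'slurm'
            process.clusterOptions = '--qos=medium'
            process.cpus = 28
            singularity {
                enabled = true
            }
        }
    }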
Tobias Neumann
@t-neumann
May 17 2018 12:29
anyway, even removing the lines did not help
Kevin Sayers
@KevinSayers
May 17 2018 14:00
@t-neumann it may help with portability to remove the docker:// from your container. Nextflow will automatically add it when you run with Singularity. Did you get your config working?
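As a quick illustration of that point (the image name is taken from the config shown earlier): keeping the bare registry name works with both engines, since Nextflow adds the docker:// prefix itself when Singularity is enabled.

    // instead of: container = 'docker://broadinstitute/gatk:4.0.4.0'
    process.container = 'broadinstitute/gatk:4.0.4.0'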
Edgar
@edgano
May 17 2018 14:32
Hey guys,
I'm trying to use a singularity container but NF is complaining all the time.
I tried to pull directly from shub but it doesn't work; I assume it's because I don't have sudo permission on the cluster.
So I pulled the image with sudo on my local machine and transferred the container to the cluster.
In my nextflow.config I have: container = 'file:///<PATH to image>'
but I get this error: ERROR : Failed to mount image in (read only): Invalid argument
Any hints?
Félix C. Morency
@fmorency
May 17 2018 14:32
@edgano can you singularity shell yourimage.img in a terminal?
Paolo Di Tommaso
@pditommaso
May 17 2018 14:33
there's only one rule in this chat: format the code/stdout properly
:smile:
Edgar
@edgano
May 17 2018 14:34
@fmorency yes, it works on the terminal
Félix C. Morency
@fmorency
May 17 2018 14:34
@edgano does it also work on your cluster?
Paolo Di Tommaso
@pditommaso
May 17 2018 14:35
I guess it's some host mount path making a mess
Edgar
@edgano
May 17 2018 14:35
where can I check it @pditommaso? .command.env?
Paolo Di Tommaso
@pditommaso
May 17 2018 14:36
.command.run - look at the singularity command line
Edgar
@edgano
May 17 2018 14:36
great, thanks
Félix C. Morency
@fmorency
May 17 2018 14:36
are you using the same singularity version on both local and cluster?
Edgar
@edgano
May 17 2018 14:37
nope...
Félix C. Morency
@fmorency
May 17 2018 14:37
you might be trying to use an image built with a newer singularity on an older version
Edgar
@edgano
May 17 2018 14:38
yeah... it makes sense! thanks @fmorency
Tobias Neumann
@t-neumann
May 17 2018 15:41
@KevinSayers yes it's working now. Haven't tried the docker version for the local profile yet, though
Eric Davis
@davisem
May 17 2018 17:43
Has anyone here had success running Linux loader commands using the beforeScript directive? I can't seem to get these to stick in the running environment. Any reason this approach won't work?
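No reply appears below, but for reference a hedged sketch of the directive's usual syntax (the module name is made up); whether exported variables persist into the task environment depends on how the job wrapper is run on the cluster.

    process align {
        beforeScript 'module load samtools/1.8'   // hypothetical module

        """
        samtools --version
        """
    }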
tbugfinder
@tbugfinder
May 17 2018 18:03
My workflow has to process a few thousand input files using AWS Batch. I noticed that job submission is really slow because the API throttles those requests. Has anybody else faced the same issue and has an idea how to improve that? It would be great to be able to configure AWS Batch array jobs in Nextflow, though.
tbugfinder
@tbugfinder
May 17 2018 19:01
Well, does Nextflow also throttle job submission, or does it just not scale well?
Shawn Rynearson
@srynobio
May 17 2018 19:06
Does anyone know if there is a method or variable that returns the path where processing occurred? i.e. work/64/e68cf1cb7c6dc3a4be114178801b59
Mike Smoot
@mes5k
May 17 2018 19:29
@tbugfinder if you have a bunch of small, fast jobs, then I wonder if you're running into problems related to Nextflow's internal job queue. For instance, AWS Batch could exhaust the internal queue, and then based on Nextflow's polling interval it might take a minute or more before it re-populates its internal queue. That's just a guess, though. There are lots of parameters that can be tweaked.
Or you could batch your files and process them in larger groups. Not as elegant, but sometimes necessary to make it worth spawning a new job.
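A hedged sketch of that batching idea (the bucket path and group size are made up): the buffer operator can group input files so each submitted job handles several of them.

    Channel
        .fromPath('s3://my-bucket/inputs/*.vcf')    // hypothetical input location
        .buffer(size: 50, remainder: true)          // emit lists of up to 50 files
        .set { batched_files }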
tbugfinder
@tbugfinder
May 17 2018 19:59
hm, it takes about 30-60 min to process a single file. I would like to submit/process about 10000-50000 files in parallel. I can't yet figure out which parameter is the limit now...
Mike Smoot
@mes5k
May 17 2018 20:02
Ah, much different problem. That being said, the queueSize parameter does limit the number of nextflow processes that will run in parallel. I believe the default is 100, so if you haven't changed that then you'll only see 100 files getting processed at once.
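For reference, a minimal sketch of raising that limit in nextflow.config (the values are arbitrary):

    executor {
        queueSize    = 1000      // max tasks Nextflow keeps submitted/queued at once
        pollInterval = '30 sec'  // how often job status is checked (assumed value)
    }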
tbugfinder
@tbugfinder
May 17 2018 20:49
It looks like a max of 1000 jobs are sent into running status. Any hint which parameter could change that?
my fault, the AWS CLI limits that.
Mike Smoot
@mes5k
May 17 2018 20:52
Not sure if batch is the same, but AWS has always been happy to increase instance limits when I ask. More money for them.
tbugfinder
@tbugfinder
May 17 2018 20:55
I have to correct this again, it really looks like a max of 1000
Paolo Di Tommaso
@pditommaso
May 17 2018 21:53
as said queueSize controls the max number of NF parallel jobs
but you may have hit some AWS service limit
Paolo Di Tommaso
@pditommaso
May 17 2018 22:01
@srynobio in the command script you can use the $PWD bash variable (escaping the dollar)
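A tiny sketch of what that looks like (the process name is made up): escaping the dollar keeps Nextflow from treating PWD as a pipeline variable, so bash expands it inside the task's work directory.

    process whereAmI {
        """
        echo "running in: \$PWD"
        """
    }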
Shawn Rynearson
@srynobio
May 17 2018 22:14
@pditommaso unfortunately that doesn't work with aws-batch. Decided to use awscli s3 instead. Thanks!