These are chat archives for nextflow-io/nextflow

1st
Feb 2018
Stephen Zhang
@zsteve
Feb 01 2018 02:41
Hi all, is there any way to produce an output channel explicitly from within a function? I.e. would I be able to use Channel.fromPath within a process exec: field and forward that to downstream processes
Paolo Di Tommaso
@pditommaso
Feb 01 2018 08:43
@srynobio No, nextflow is not required in the container nor the ami
@zsteve a process is meant to receive data from a channel, so the approach you are suggesting is not supported
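For reference, a minimal sketch of the supported pattern (the glob pattern, process name and command are made up): the channel is created at the script level with Channel.fromPath and passed to the process through its input: block, rather than being built inside exec: and forwarded from there.

samples_ch = Channel.fromPath('data/*.fastq')   // channel created outside the process

process count_reads {
    input:
    file sample from samples_ch

    output:
    file 'count.txt' into counts_ch             // forwarded to downstream processes

    """
    wc -l < ${sample} > count.txt
    """
}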
Szilveszter Juhos
@szilvajuhos
Feb 01 2018 09:35
We are using Singularity on a cluster, and it looks like for one of our containers we have to use the -c directive, i.e. singularity.runOptions = "--bind /scratch -c" in the nextflow config file
the issue is that it is needed only for a single process and a single container - and as a side effect it breaks other containers. Where/how should we set this for this particular process only?
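For reference, the global setting being discussed looks like the sketch below (enabled = true is assumed); as the following messages point out, it applies to every Singularity task, which is exactly the problem.

// nextflow.config -- runOptions in the singularity scope applies to all processes
singularity {
    enabled    = true
    runOptions = '--bind /scratch -c'
}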
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:42
I'm guessing that since runOptions is in the singularity scope, it'll be difficult to access it for a specific process
Or can we use a closure like singularity.runOptions = { process == specificprocess ? "--bind /scratch -c" : "--bind /scratch" }
?
Paolo Di Tommaso
@pditommaso
Feb 01 2018 09:50
no, that's not possible
unfortunately this is an open issue nextflow-io/nextflow#415
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:50
Good to know
I'll follow this issue
We should have thought of looking at the issues before asking here
Paolo Di Tommaso
@pditommaso
Feb 01 2018 09:51
no pb
Szilveszter Juhos
@szilvajuhos
Feb 01 2018 09:56
OK, what I can do is dissect these parts from the main workflow - they are going to be split out anyway in the near future, so I will just do it earlier
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:57
Regarding this dissection, I had some questions Paolo
I'm following #238 closely
Paolo Di Tommaso
@pditommaso
Feb 01 2018 09:58
out of curiosity, why do you need the -c option?
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:58
Problem with one of the clusters we're using
They're asking us to use that for the moment
Paolo Di Tommaso
@pditommaso
Feb 01 2018 09:58
I see, ok
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:59
but we have a problem with only one container
So we need -c for this one
Paolo Di Tommaso
@pditommaso
Feb 01 2018 09:59
could it be a problem with how the image is built?
Maxime Garcia
@MaxUlysse
Feb 01 2018 09:59
and if we use it for all, then another container breaks
Not only
The image is a complete mess
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:00
:)
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:00
It's for VEP
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:00
VEP ?
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:00
Variant Effect Predictor
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:00
:+1:
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:00
At first I was making an image myself
but Ensembl is not very consistent in their install methods
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:01
ah
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:01
so after having to change the procedure in my container several times, I gave up, and I'm using the container they provide
But it is a mess
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:02
not a very good reproducibility user story .. :)
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:02
Definitely not
But I think that Ensembl is a little messy
But at some point it was working on this cluster
But they did some updates, and now it's broken...
Whatever
We have a project to split our Cancer Pipeline into at least two parts to start with, and I was wondering about #238
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:05
(need to leave now)
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:06
See you
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:18
I want to work on the modularisation stuff soon, but it will hardly be available before the summer
Maxime Garcia
@MaxUlysse
Feb 01 2018 10:22
OK
So I'm guessing we'll start with some duplicates, and then sort this all out when the modules finally arrive
Paolo Di Tommaso
@pditommaso
Feb 01 2018 10:29
makes sense
Martin Šošić
@Martinsos
Feb 01 2018 13:04
When running nextflow cloud shutdown <clusterName> I get the error ERROR ~ You are not authorized to perform this operation. (Service: AmazonEC2; Status Code: 403; Error Code: UnauthorizedOperation; Request ID: xxx). I was wondering, how does nextflow cloud shutdown obtain the needed permissions? When creating a cluster, we provide a configuration file with accessKey, secret and similar. However, when we wish to shut it down, there is no mention of a config file. Is it relying on that same config file again, so it should be present, or is it relying on the .ssh key that was on the machine when the cluster was created, or how does that work? Thanks!
Paolo Di Tommaso
@pditommaso
Feb 01 2018 13:11
in the same manner as when it creates the cluster
maybe you have insufficient permission for that action ?
Martin Šošić
@Martinsos
Feb 01 2018 13:21
Thanks for quick answer!
If I run it directly from the terminal it works, hm, but when I run it from my nodejs script it does not. Actually from one script it does, from another it does not. So I certainly have the permissions, but something else is the problem. Maybe the location on disk from which I am running the script?
What does "in the same manner" mean? Is it relying on .ssh or on the config file?
Paolo Di Tommaso
@pditommaso
Feb 01 2018 13:24
does the create work when you run it from nodejs?
Martin Šošić
@Martinsos
Feb 01 2018 13:27
Right. I execute it from node as a child process. Then I delete the config file. However, when I run shutdown from the same nodejs script somewhat later, I get this permissions error. If I run it manually from the same machine, it works. If I run it from another nodejs script (a test script), it works. So it seems like the config file is not important, but something is affecting whether it has permissions or not. I assumed it is the .ssh key, but then I don't understand why one script works and another does not, hm! Maybe one of them can't find the .ssh key?
Is it using the .ssh key at all? That is why I am asking how it gets those permissions.
Martin Šošić
@Martinsos
Feb 01 2018 13:43
By the way, how is it that just the cluster name is enough information for nextflow to shut down the cluster? Does that mean it is saving some extra data on my machine?
Ok, I see now that there is a cluster-name tag on AWS, cool -> so that is how it identifies it
Paolo Di Tommaso
@pditommaso
Feb 01 2018 13:48
yes
so, the aws credentials are looked up in the nextflow config file, then in the aws config file, then in the standard AWS environment variables
if it works for the create, it should work also for the shutdown
if you set them in the nextflow config file, make sure you're setting the correct work dir when you are executing the external process
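A sketch of the first option, i.e. putting the credentials in the nextflow config file (the key values and region are placeholders), which both cloud create and cloud shutdown can pick up as long as the external process runs from the right working directory:

// nextflow.config -- AWS credentials used by `nextflow cloud create` / `cloud shutdown`
aws {
    accessKey = '<YOUR_ACCESS_KEY>'
    secretKey = '<YOUR_SECRET_KEY>'
    region    = 'eu-west-1'
}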
Martin Šošić
@Martinsos
Feb 01 2018 13:51
Ok got it, that makes sense. Actually I was silly to think it could shut it down using .ssh, that is only for connecting to the master instance.
Hm I still have to figure out why one script works and another doesn't, but at least now I know what to look for, thanks
Ido Tamir
@idot
Feb 01 2018 15:01
Hi, is it possible to specify a range of "cpus" for a cluster and then get the value that the cluster sets (e.g. in our case it's $NSLOTS) as ${task.cpus}?
Martin Šošić
@Martinsos
Feb 01 2018 15:28
@pditommaso following up on my previous question: thanks a lot for the help, I managed to get it working by specifying the config file with '-C'!
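For anyone else hitting this, the -C option makes Nextflow use an explicit config file regardless of the working directory the command is launched from (the path and cluster name below are just examples):

nextflow -C /path/to/nextflow.config cloud shutdown my-cluster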
Paolo Di Tommaso
@pditommaso
Feb 01 2018 15:34
:+1:
Hi is it possible to specify a range of "cpus"
is this a gridengine feature? how exactly do you specify the range of cpus ?
Ido Tamir
@idot
Feb 01 2018 16:07
you can specify -pe smp 4-10
then gridengine gives me back an environment variable, in our case $NSLOTS, that tells me how many cores it has actually reserved for me on the host where the program is now running
-pe parallel_environment n[-[m]]|[-]m,...
    Available for qsub, qsh, qrsh, qlogin and qalter only.
    Parallel programming environment (PE) to instantiate. For more detail about PEs, please see the sge_types(1).

    Qalter allows changing this option even while the job executes. The modified parameter will only be in effect after a restart or migration of the job, however.

    If this option or a corresponding value in qmon is specified then the parameters pe_name, pe_min and pe_max will be passed to configured JSV instances, where pe_name will be the name of the parallel environment and the values pe_min and pe_max represent the values n and m which have been provided with the -pe option. A missing specification of m will be expanded as value 9999999 in JSV scripts and it represents the value infinity. (see -jsv option above or find more information concerning JSV in jsv(1))
Paolo Di Tommaso
@pditommaso
Feb 01 2018 16:13
I see, currently there's no support for this in NF, please open a feature request on GitHub
Ido Tamir
@idot
Feb 01 2018 16:13
so this is a Sun Grid Engine feature, I am not sure if the environment variable is standardised or set by the administrator
Paolo Di Tommaso
@pditommaso
Feb 01 2018 16:13
however you can still handle this use case using clusterOptions = '-pe smp 4-10'
Ido Tamir
@idot
Feb 01 2018 16:14
ok! that's good!
Paolo Di Tommaso
@pditommaso
Feb 01 2018 16:14
then using the \$NSLOTS variable in your command
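Putting the two together, a sketch along these lines should work, assuming an SGE executor and a scheduler that exports NSLOTS (the tool and file names are made up); note the escaped \$NSLOTS so Nextflow passes the variable through to the runtime shell instead of trying to resolve it itself:

process align {
    executor 'sge'
    clusterOptions '-pe smp 4-10'   // ask the scheduler for 4 to 10 slots

    """
    # NSLOTS is set by Grid Engine to the number of slots actually granted
    my_tool --threads \$NSLOTS input.bam > output.bam
    """
}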
Shawn Rynearson
@srynobio
Feb 01 2018 19:07

I have a strange issue with a nextflow aws-batch workflow.

I'm launching a generic process with a docker image as a test.

process echo {
    container = 'lethalfang/samtools'

    """
    touch myfile.txt
    echo "hello world" > myfile.txt
    """
}

I can get the job to do the following:

  1. Launch my AMI (with the AWS CLI and Docker installed), and shut it down on failure.
  2. Create .command.sh & .command.run files in s3.

However the following happens:

  1. The job hangs at the RUNNABLE step (in the AWS Batch console)
  2. It fails with only the following error: Status reason: Job killed by NF

My question: Is this due to the docker image not being set up correctly as noted here and, if so, does anyone have a good example Dockerfile or dockerhub image to test with?

Given that the .command.sh file gets written to s3 and the aws-batch queue starts, I would guess that I've set up AWS correctly (region, etc).

Paolo Di Tommaso
@pditommaso
Feb 01 2018 19:22
if the job remains in the runnable state it sounds like a batch configuration issue
make sure you are able to launch a basic job using the aws cli tool
(have a look also at a recent thread in the nf google group in this regard)
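A basic sanity check along these lines with the AWS CLI can confirm the queue and compute environment are wired up (the queue and job definition names are placeholders that must already exist in your AWS Batch setup):

aws batch submit-job \
    --job-name nf-smoke-test \
    --job-queue my-queue \
    --job-definition my-job-definition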
Shawn Rynearson
@srynobio
Feb 01 2018 20:06
when you say 'batch configuration' do you mean the job queues or the compute env (aws)?
Shawn Rynearson
@srynobio
Feb 01 2018 20:12
I'll dig some more. Thanks @pditommaso
Vladimir Kiselev
@wikiselev
Feb 01 2018 20:31
maybe a silly question. Is it possible to use conda environments in NF?
Paolo Di Tommaso
@pditommaso
Feb 01 2018 20:38
there's no built-in support (yet), see nextflow-io/nextflow#493
but you can import conda envs before launching NF
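For example, something along these lines before launching the pipeline (the environment name is hypothetical):

# activate the conda environment first, then launch the workflow inside it
source activate my-tools-env
nextflow run main.nf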
Vladimir Kiselev
@wikiselev
Feb 01 2018 20:40
ok, cool, thanks!
Paolo Di Tommaso
@pditommaso
Feb 01 2018 20:43
:+1: