These are chat archives for nextflow-io/nextflow

8th
Nov 2016
amacbride
@amacbride
Nov 08 2016 00:07
@pditommaso I saw that 0.22.4 is out -- does the S3 fix mentioned in changelog.txt fix the "Relative path cannot be made absolute" problem for S3 buckets?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 06:27
@amacbride yes
Johan Viklund
@viklund
Nov 08 2016 12:02
Hola, is there any way I can make publishDir depend on one of the val inputs entering the workflow?
that is, not directly dependent on the filename
my current strategy is to incorporate the val argument in the filename and then use that to generate the correct directory/filename
but that feels very hacky
Maxime Garcia
@MaxUlysse
Nov 08 2016 12:03
have you tried something like publishDir $val ?
Johan Viklund
@viklund
Nov 08 2016 12:03
yep
both with and without the $, nextflow complains that the var does not exist
the input is almost like function arguments, but not quite :)
Maxime Garcia
@MaxUlysse
Nov 08 2016 12:08
I have somewhere something like publishDir outDir["process"] that is working
Johan Viklund
@viklund
Nov 08 2016 12:09
but then outDir is not in your input: right?
Maxime Garcia
@MaxUlysse
Nov 08 2016 12:09
yes
it's my output
That's the idea behind publishDir as far as I understand
Johan Viklund
@viklund
Nov 08 2016 12:10
My problem is that I want to be able to run on hundreds of files, each with its own output directory, specified in a runfile
fileA outputA
fileB outputB
and so forth
Maxime Garcia
@MaxUlysse
Nov 08 2016 12:15
ok, I see
Not sure how to do it, but the answer will definitely be of interest to me too
Johan Viklund
@viklund
Nov 08 2016 12:16
currently I'm doing it with filename munging (replacing / with ....)
it will be a PITA to maintain this kind of thing
no
I will do it with a dictionary, like you did, but index it with the filename
of course
thanks @MaxUlysse for the help
Maxime Garcia
@MaxUlysse
Nov 08 2016 12:18
I'm not sure it was a good help, but at least it's a fix that works
Johan Viklund
@viklund
Nov 08 2016 12:19
publishDir saveAs { outdir_for[$it] }
approximately
Johan Viklund
@viklund
Nov 08 2016 12:36
still need to encode stuff in the filename :(
though, the hash-table makes it a bit easier to understand what's happening
Johan Viklund
@viklund
Nov 08 2016 12:50
ahh, it seems like the val is accessible from within a saveAs closure
need to check some more
yes, that's the solution
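A minimal sketch of the saveAs approach described here, assuming the DSL1 syntax used elsewhere in this chat. The channel name `samples`, the input names `name` and `outdir`, and the `results` base directory are hypothetical, chosen only for illustration:

```groovy
// Hypothetical (name, output directory) pairs; in practice these would come from a runfile
samples = Channel.from(['fileA', 'outputA'], ['fileB', 'outputB'])

process foo {
    // The closure runs per task, so the val input 'outdir' is visible inside it
    publishDir 'results', saveAs: { filename -> "$outdir/$filename" }

    input:
    set val(name), val(outdir) from samples

    output:
    file 'file.txt'

    """
    echo $name > file.txt
    """
}
```

With this layout each task should publish its file under `results/<outdir>/`, without encoding the directory in the file name.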
Paolo Di Tommaso
@pditommaso
Nov 08 2016 12:52
this
process foo { 
  publishDir "result_$x"

  input: 
  val x from 'a'
  output:
  file 'file.txt'

  '''
  touch file.txt
  '''
}
Johan Viklund
@viklund
Nov 08 2016 12:53
yes
yes, it works in a string I guess
publishDir x
didn't work
Paolo Di Tommaso
@pditommaso
Nov 08 2016 12:54
yes, you need to either interpolate the variable in a string or use a closure, i.e. publishDir { x }
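Applied to the runfile use case from earlier in the conversation (one `file outputDir` pair per line), this could look roughly as follows. It is a sketch under assumptions: the runfile name `runfile.txt`, the space separator, the process name `analyse` and the `wc -l` command are all hypothetical.

```groovy
// Read "fileA outputA" style lines and turn each into a (file, directory) pair
Channel
    .fromPath('runfile.txt')
    .splitCsv(sep: ' ')
    .map { row -> tuple(file(row[0]), row[1]) }
    .set { samples }

process analyse {
    // String interpolation is resolved per task, so 'outdir' is available here;
    // the closure form publishDir { outdir } is the alternative
    publishDir "$outdir"

    input:
    set file(infile), val(outdir) from samples

    output:
    file 'result.txt'

    """
    wc -l $infile > result.txt
    """
}
```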
Johan Viklund
@viklund
Nov 08 2016 12:55
it's tricky since different parts of the code are evaluated and compiled at different times :)
Paolo Di Tommaso
@pditommaso
Nov 08 2016 12:56
that's true
I've tried to strike a compromise between readability and expressivity of the syntax
Johan Viklund
@viklund
Nov 08 2016 12:58
yes, it's mostly good, but sometimes I get into these issues where I'm not really sure where or when a thing gets evaluated
and the inherent dynamic nature of workflows doesn't really help
Paolo Di Tommaso
@pditommaso
Nov 08 2016 13:10
True, for this reason tests are even more important
Johan Viklund
@viklund
Nov 08 2016 13:10
yes
Maxime Garcia
@MaxUlysse
Nov 08 2016 14:24
Sorry, but I'm still stuck on this variable as an output
I'm doing publishDir "VariantCalling/MuTect1/" + { idSampleTumor }
and it creates the first part of the path correctly, but the last part is like _nf_script_...
Paolo Di Tommaso
@pditommaso
Nov 08 2016 14:27
this won't work
you should use either
publishDir "VariantCalling/MuTect1/$idSampleTumor"
or
publishDir { "VariantCalling/MuTect1/" + idSampleTumor }
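The `_nf_script_...` suffix is consistent with plain Groovy behaviour: adding a closure to a string never calls the closure, it only appends the closure object's text representation. A small standalone sketch, with `tumor1` as a made-up sample id:

```groovy
def idSampleTumor = 'tumor1'   // hypothetical value for illustration

// String + closure: the closure is converted with toString() and never evaluated,
// so the result ends with a generated class name instead of the sample id
def broken = "VariantCalling/MuTect1/" + { idSampleTumor }

// Interpolating the variable inside the string gives the intended path
def interpolated = "VariantCalling/MuTect1/$idSampleTumor"

// Wrapping the whole expression in a closure also works once the closure is called,
// which is what Nextflow does for closure-valued directives
def lazy = { "VariantCalling/MuTect1/" + idSampleTumor }

println broken
println interpolated
println lazy.call()
```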
Maxime Garcia
@MaxUlysse
Nov 08 2016 14:28
Oh I see
now I understand why it was working well in my tag as tag { idSampleTumor}
thanks
Paolo Di Tommaso
@pditommaso
Nov 08 2016 14:29
yes, the same for tag, you could do
tag "$idSampleTumor"
Maxime Garcia
@MaxUlysse
Nov 08 2016 14:32
so it works perfectly, except when I add , mode: 'copy' afterwards
Paolo Di Tommaso
@pditommaso
Nov 08 2016 14:33
publishDir "VariantCalling/MuTect1/$idSampleTumor", mode: 'copy'
should work
Maxime Garcia
@MaxUlysse
Nov 08 2016 14:40
working well
Thanks again
Paolo Di Tommaso
@pditommaso
Nov 08 2016 14:42
:+1:
Maxime Garcia
@MaxUlysse
Nov 08 2016 15:50
One last thing (hopefully...)
I want to do different output directories for different variant callers
no problem with that
But since these variant callers give me multiple VCF files, I want to concatenate them into just one using one process
no problem here either, I guess
But I want the output directory to have the variant caller name in it
and I think I have my answer
RubberDucking for the win !!!
sorry for the inconvenience
Johan Viklund
@viklund
Nov 08 2016 15:53
:)
Paolo Di Tommaso
@pditommaso
Nov 08 2016 15:59
you can use the collectFile operator to merge them into a single file
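A rough sketch of what a collectFile-based merge could look like. The glob pattern, the output name `merged.vcf` and the `VariantCalling/merged` directory are assumptions; in a real pipeline the input would more likely be the output channel of the variant calling process.

```groovy
// Gather all per-chunk VCF files and append them into one file
Channel
    .fromPath('VariantCalling/MuTect1/*.vcf')
    .collectFile(name: 'merged.vcf', storeDir: 'VariantCalling/merged')
    .subscribe { merged -> println "Merged file: $merged" }
```

Note that this simply appends file contents, so every VCF header would be repeated in the result; a VCF-aware concatenation step may still be needed.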
Félix C. Morency
@fmorency
Nov 08 2016 19:17
"Apache Ignite is packaged with Nextflow itself, so you won't need to install it separately or configure other third party software."
Oooooooooooooooh?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 19:18
em
so?
:)
Félix C. Morency
@fmorency
Nov 08 2016 19:18
that. is. awesome.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 19:18
thanks
it is :)
Félix C. Morency
@fmorency
Nov 08 2016 19:19
def. gonna test this asap
Paolo Di Tommaso
@pditommaso
Nov 08 2016 19:19
on aws ?
Félix C. Morency
@fmorency
Nov 08 2016 19:19
on-premise
Paolo Di Tommaso
@pditommaso
Nov 08 2016 19:19
:+1:
Félix C. Morency
@fmorency
Nov 08 2016 19:19
cloud is too expensive atm. need much cpu/memory/disk
Paolo Di Tommaso
@pditommaso
Nov 08 2016 19:20
yes, I tend to agree
Félix C. Morency
@fmorency
Nov 08 2016 20:07
is there a way to kill the daemon?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 20:12
kill <pid>
Félix C. Morency
@fmorency
Nov 08 2016 20:12
yeah ok
what iperf output should I expect if things work?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 20:13
or, if you want to stop all daemons after a run: nextflow run foo -cluster.shutdownOnComplete
I don't remember it exactly, but it should be self-explanatory
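The shutdown option mentioned here can presumably also be set in the configuration file rather than on the command line. A sketch of such a `nextflow.config`, assuming the Ignite executor is the one in use:

```groovy
// nextflow.config (sketch, not taken from this chat)
process.executor = 'ignite'

cluster {
    // Stop the cluster daemons once the pipeline run completes
    shutdownOnComplete = true
}
```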
Félix C. Morency
@fmorency
Nov 08 2016 20:17
mmm still not sure. i see one is sending stuff but I don't see any output on the other side
Paolo Di Tommaso
@pditommaso
Nov 08 2016 20:17
that's not good
Félix C. Morency
@fmorency
Nov 08 2016 20:18
ok. will try using a shared path
Félix C. Morency
@fmorency
Nov 08 2016 20:21
yeah multicast is running
Paolo Di Tommaso
@pditommaso
Nov 08 2016 20:21
:+1:
Félix C. Morency
@fmorency
Nov 08 2016 20:21
on both nodes
Félix C. Morency
@fmorency
Nov 08 2016 20:44
is there a way to list all nodes of the cluster?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 20:45
not at this time
however they should be listed in the log file
Félix C. Morency
@fmorency
Nov 08 2016 20:45
ok good
Mike Smoot
@mes5k
Nov 08 2016 21:30
Hi Paolo, I'm trying to debug some problems I'm having with nextflow cloud and AWS. The immediate problem is that the master and worker nodes come up but without mounting efs. I see you construct a mount command in AmazonCloudDriver.scriptMountEFS but I'm not sure where the output of that command might be captured. Is it captured anywhere?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:32
yes, but frankly I don't remember
it's under /var/cloud-init-something ..
what AMI are u using ?
Mike Smoot
@mes5k
Nov 08 2016 21:33
One I created.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:33
which distro ?
Mike Smoot
@mes5k
Nov 08 2016 21:33
centos
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:34
are u able to mount it manually ?
Mike Smoot
@mes5k
Nov 08 2016 21:34
Yeah, that's what's weird.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:35
it's not mounted at all or only in the container ?
Mike Smoot
@mes5k
Nov 08 2016 21:36
Not mounted at all.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:36
strange
Mike Smoot
@mes5k
Nov 08 2016 21:38
Well, we're doing some hacky stuff to allow dns to resolve our internal host names, so I'm not terribly surprised.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:38
the EFS storage is mounted using the boothook
while the remaining part by using a shell script
not sure if this could be a problem with centos
I would suggest running a preliminary test with our AMI
Mike Smoot
@mes5k
Nov 08 2016 21:40
I see the boothook and I see that the boothook had an error in /var/log/cloud-init.log, but not what the error was.
Do I have access to your AMI in my region? I've only got access to us-west-2
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:40
that would allow you to understand if it's a DNS problem or something else
you should copy it to that region
ami-43f49030
Mike Smoot
@mes5k
Nov 08 2016 21:41
Ok, I'll give that a try
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:42
:+1:
Mike Smoot
@mes5k
Nov 08 2016 21:43
Naturally, I don't have permissions to copy an AMI. Sigh.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:44
ah
Mike Smoot
@mes5k
Nov 08 2016 21:44
Did you start with ubuntu for your image?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:44
don't u have a full access account ?
Mike Smoot
@mes5k
Nov 08 2016 21:44
No
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:44
ah
no, it's an Amazon Linux
Mike Smoot
@mes5k
Nov 08 2016 21:44
It's a corporate account. Maybe I should just do all my work from my personal account.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:45
let me see if I can copy to your region
Mike Smoot
@mes5k
Nov 08 2016 21:45
Isn't Amazon Linux based on centos? Or maybe Redhat?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:46
it should be from that tree (centos/redhat)
Mike Smoot
@mes5k
Nov 08 2016 21:47
I just ran the boothook manually and it worked. Given how early the boothook runs I'd guess that we don't have DNS properly configured when it runs.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:48
well, it's time you fix this DNS ;)
Mike Smoot
@mes5k
Nov 08 2016 21:49
It's time I fix a lot of things! :)
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:49
which region is us-west-2 supposed to be?
Mike Smoot
@mes5k
Nov 08 2016 21:49
US Oregon
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:50
copying there
you should find ami-ede3408d in a while
Mike Smoot
@mes5k
Nov 08 2016 21:51
Great, I'll give it a try. Many thanks!
Paolo Di Tommaso
@pditommaso
Nov 08 2016 21:51
welcome
Mike Smoot
@mes5k
Nov 08 2016 23:19
Hi Paolo, quick question about the ~/.nextflow/scm file. Does that need to be built into my AMI for nextflow cloud? Or is there another way handling those credentials?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:20
oops, I didn't take that into consideration .. :/
are u using a private repo?
Mike Smoot
@mes5k
Nov 08 2016 23:21
Yes
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:22
but why do u need ~/.nextflow/scm to build the ami ?
to avoid configuring it later ?
Mike Smoot
@mes5k
Nov 08 2016 23:22
Right, just to avoid configuring later.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:23
you can include it if so
Mike Smoot
@mes5k
Nov 08 2016 23:23
Would it be possible to specify that stuff in the nextflow config file when I run nextflow cloud
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:24
not at this time, but it's a nice improvement
have u solved the mount problem ?
Mike Smoot
@mes5k
Nov 08 2016 23:25
Not yet. We've got a custom cloud config that I was experimenting with. My changes still didn't help. I think trying your AMI will be next, but if I can't access my local servers for our git and docker repos, then that'll be a no-go too.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:26
well, you can just copy the ~/.nextflow/scm file once you have ssh'd into it ..
ahhh
Mike Smoot
@mes5k
Nov 08 2016 23:27
In any case, your ami isn't there in Oregon yet, so I'll need to wait anyway.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:27
no?
Mike Smoot
@mes5k
Nov 08 2016 23:28
I've searched but can't find it.
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:29
it turns out the copy didn't copy the permissions
now I've made it public, you should see it
Mike Smoot
@mes5k
Nov 08 2016 23:32
still not finding it - I'm guessing it just takes a while for their database to update
back to the scm question, will the workers need that file as well? Or is the checkout only happening on the master node in a cluster?
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:36
back to the scm question, will the workers need that file as well?
no, it's needed only in the launcher node
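For reference, a sketch of what the `~/.nextflow/scm` file on the launcher node might contain. The `github` provider and the credentials are placeholders; a self-hosted git server like the one mentioned in this conversation would need its own provider entry instead.

```groovy
// ~/.nextflow/scm on the launcher node only (placeholder credentials)
providers {
    github {
        user = 'your-user'
        password = 'your-password'
    }
}
```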
Mike Smoot
@mes5k
Nov 08 2016 23:37
Ok, great
Paolo Di Tommaso
@pditommaso
Nov 08 2016 23:37
:+1:
leaving (and enjoy the exit-polls.. :))
Mike Smoot
@mes5k
Nov 08 2016 23:45
Thanks - hopefully we all wake up to good news!