These are chat archives for nextflow-io/nextflow

10th
Apr 2017
Paolo Di Tommaso
@pditommaso
Apr 10 2017 07:58
@karinlag they have two completely different semantics
in the first, it transforms a string file path into a file object (think of a file handle), e.g.
my_genome_file = file('/data/hg19.fa')
in a process context it means that the input is a file that needs to be staged in the process directory
Paolo Di Tommaso
@pditommaso
Apr 10 2017 08:03
note that the input value needs to be a file object (otherwise a file is implicitly created with the value itself)
then you can have
process foo {
  input: file 'hg19.fa' from genome_ch
  """
  STAR --genomeFastaFiles hg19.fa
  """
}
and
process bar {
  input: file genome from genome_ch
  """
  STAR --genomeFastaFiles ${genome}
  """
}
in the process foo you are requesting that the genome file be staged with the name hg19.fa (whatever the original name is)
Paolo Di Tommaso
@pditommaso
Apr 10 2017 08:09
in the process bar the genome file is staged using the original file name, then you use the variable genome to reference it
final note, parentheses are optional when a function has at least one argument, thus
file 'x'
is the same as
file('x')
and the same goes for any other function/method
println 'Hello world'
println('Hello world')
Karin Lagesen
@karinlag
Apr 10 2017 08:59
I think this makes things a bit clearer...
from what you say, parentheses for file when defining files going through channels are optional, right?
and that also goes for whether the file variable name is fixed or based on a variable, right?
Paolo Di Tommaso
@pditommaso
Apr 10 2017 09:03
generally parentheses on function calls are optional
Karin Lagesen
@karinlag
Apr 10 2017 09:03
ok, that helps clear up some of the confusion :)
Paolo Di Tommaso
@pditommaso
Apr 10 2017 09:04
good
Karin Lagesen
@karinlag
Apr 10 2017 09:05
and also, when I send a file along a channel, I understand that a link to it is being staged in the work directory
so if I want to use the same files in several processes, I need to be doubly sure that I don't change the input files in any way, yes?
Paolo Di Tommaso
@pditommaso
Apr 10 2017 09:07
yes, that's the golden rule of NF
Karin Lagesen
@karinlag
Apr 10 2017 09:07
good, that means that things are sinking in, gradually :)
Paolo Di Tommaso
@pditommaso
Apr 10 2017 09:09
exactly, NF uses a functional execution model in which a process given the same inputs produces the same outputs
if you modify the inputs, you break this rule
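A minimal sketch of what this rule looks like in practice, in the same process syntax used earlier in this chat. The process name edit_copy, the output name genome.upper.fa, the channel edited_ch and the tr command are all made up for illustration; only the /data/hg19.fa path comes from the discussion above. The idea is simply that a change is written to a new output file rather than applied to the staged input.

```groovy
// Channel carrying the genome file (path taken from the example above)
genome_ch = Channel.fromPath('/data/hg19.fa')

// Hypothetical process: the staged input is only read, never edited in place
process edit_copy {
  input:
  file genome from genome_ch

  output:
  file 'genome.upper.fa' into edited_ch

  """
  # Illustration only: tr upper-cases a, c, g, t everywhere, header lines included.
  # The result goes to a new file; the staged input link is left untouched.
  tr 'acgt' 'ACGT' < ${genome} > genome.upper.fa
  """
}
```

Because the original file is never touched, any other process that receives the same file sees the same content.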
marchoeppner
@marchoeppner
Apr 10 2017 11:41
Hi everyone.
Evan Floden
@evanfloden
Apr 10 2017 11:42
Hi @marchoeppner !
marchoeppner
@marchoeppner
Apr 10 2017 11:42
I have a quick groovy/nextflow question - I am building a process that uses a reference file (bloom filter); I would like to include that file with my git repo, but cannot figure out how to use groovy to determine the location of the pipeline script (so I can derive the path of the bloom filter file for my pipeline)
other languages have something like FILE for that
uhm, underscore underscore FILE underscore underscore ;)
Paolo Di Tommaso
@pditommaso
Apr 10 2017 11:43
um? python ?
marchoeppner
@marchoeppner
Apr 10 2017 11:43
python, ruby, php - I think
groovy doesn't seem to have that
Paolo Di Tommaso
@pditommaso
Apr 10 2017 11:45
Groovy surely not; file is just a NF helper method that returns a Path object given a string/URI path
marchoeppner
@marchoeppner
Apr 10 2017 11:46
right, but how would you get the path of a folder in your git repo (i.e. where my nextflow file lives) using e.g. groovy? I would like to have my config file determine the location of the subfolder so I can use it in any compute environment without hardcoding anything (which wouldn't work for obvious reasons)
Evan Floden
@evanfloden
Apr 10 2017 11:47
You mean like $baseDir?
Paolo Di Tommaso
@pditommaso
Apr 10 2017 11:48
yep
marchoeppner
@marchoeppner
Apr 10 2017 11:48
that would work on a nextflow file object, but say I have a config file with: params.filter = "/home/marc/git/some_pipeline/filters/a_filter.bf" - only that the whole path to that filter file needs to be determined in the nextflow config, based on its own physical location
Maxime Garcia
@MaxUlysse
Apr 10 2017 11:49
maybe try using $workflow.projectDir
marchoeppner
@marchoeppner
Apr 10 2017 11:49
isn't that where I am executing the pipeline?
Maxime Garcia
@MaxUlysse
Apr 10 2017 11:50
It's where the workflow project is stored on the computer
marchoeppner
@marchoeppner
Apr 10 2017 11:50
ah! that should do it, thanks
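A hedged sketch of how the suggestions above could look in a config file. It assumes that baseDir (the variable Evan mentions) is also resolved inside nextflow.config, which is worth checking against the Nextflow version in use. The filters/a_filter.bf path is the one from the question; the variable name filter_file is made up.

```groovy
// nextflow.config
// baseDir is assumed to point at the directory holding the pipeline script,
// so the path follows the repository wherever it is cloned
params.filter = "${baseDir}/filters/a_filter.bf"

// main.nf
// turn the configured string into a file object, as described earlier in this chat
filter_file = file(params.filter)
```

Inside the pipeline script itself, workflow.projectDir as suggested by Maxime should give the same directory.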
Félix C. Morency
@fmorency
Apr 10 2017 14:33
Very nice. FYI, I'm currently running our NF pipelines using Singularity on our SLURM cluster
Paolo Di Tommaso
@pditommaso
Apr 10 2017 14:34
Cool, I would love to learn more
any plan to blog or publish something about that ?
Félix C. Morency
@fmorency
Apr 10 2017 14:34
No dependencies other than NF (headnode and nodes) and Singularity (nodes) are deployed on the cluster
Maybe. Some people here were interested in our use-case
Félix C. Morency
@fmorency
Apr 10 2017 14:56
@pditommaso anything in particular you would like to know more about?
Paolo Di Tommaso
@pditommaso
Apr 10 2017 14:59
well yes, what kind of application workflows you have developed, how many cpus/nodes you use on average to deploy your workflows, typical input/output size, average time of your jobs, etc
but I need to leave now :/
Félix C. Morency
@fmorency
Apr 10 2017 15:00
Ok. That gives me some ideas. Thanks
Paolo Di Tommaso
@pditommaso
Apr 10 2017 15:01
possibly also pros & cons compared with other tools you have tried in the past
Félix C. Morency
@fmorency
Apr 10 2017 20:33
@pditommaso how does -bg work when there is no active user session?
Paolo Di Tommaso
@pditommaso
Apr 10 2017 20:33
do you mean whether the process continues to run?
Félix C. Morency
@fmorency
Apr 10 2017 20:34
yes. how is it run using the right user
Paolo Di Tommaso
@pditommaso
Apr 10 2017 20:35
the process is detached from the current user, so you can sign off safely
i.e. it continues to run in the background