These are chat archives for nextflow-io/nextflow

1st
Jun 2017
Simone Baffelli
@baffelli
Jun 01 2017 13:08
Hello. Is there a way to know the reasons why a process is rerun when nextflow is called with the resume option? I suppose the process could be modifying its own input or something like that, but I just cannot find it on my own.
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:10
use -dump-hashes and compare the differences between two runs
Simone Baffelli
@baffelli
Jun 01 2017 13:10
Excellent, exactly what I was looking for
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:10
the most likely reason is a different order of inputs
though the output is not very human friendly
Simone Baffelli
@baffelli
Jun 01 2017 13:10
that could be
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:11
use a difftool to compare the outputs
Simone Baffelli
@baffelli
Jun 01 2017 13:11
so if i collect several files using file("a*.something")
and they are received in a different order, the process is rerun
even though the files are the same
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:13
Umm
Are collecting files? How?
Simone Baffelli
@baffelli
Jun 01 2017 13:15
No wait, actually it is the step afterwards that is rerun: collect --> do something --> something else (this is rerun)
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:16
*are you
Simone Baffelli
@baffelli
Jun 01 2017 13:17
channel
.collect()
.into{someotherchannel}
process someprocess{
  input:
    file(files:"somefiles*.ext") from someotherchannel
 output:
  file something into nextchannel

 shell:
'''
dummy-command !{files.join(" ")}
'''

}
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:19
collect result is order neutral, strange
Can you share the differences between the dumps?
Simone Baffelli
@baffelli
Jun 01 2017 13:19
sure
but in what sense?
can I get the dumps from previously run processes?
sorry workflows
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:29
Let me check
Simone Baffelli
@baffelli
Jun 01 2017 13:30
now it runs from the cache
I don't get it
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:41
mmm, though the order is not taken into consideration, I think the problem is that the files are not renamed in the same sequence, e.g. somefiles1.ext is not the same file in a following run, hence it will end up with a different hash key
Simone Baffelli
@baffelli
Jun 01 2017 13:42
I see
but actually it was the following process that was being rerun
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:42
two solutions: 1) you don't rename the input files or 2) you sort the channel by using collect(sort:true)
but actually it was the following process that was being rerun
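A minimal sketch of both suggestions, written in the same pre-DSL2 syntax as the snippet above. The source glob, the channel name `sorted_files` and the `cat` command are placeholders, not taken from the chat:

```groovy
// Hypothetical source channel; the glob is illustrative only.
Channel
    .fromPath('data/*.ext')
    .collect(sort: true)      // suggestion 2: same ordering on every run
    .set { sorted_files }

process someprocess {
    input:
    // suggestion 1: no name pattern, so the files keep their original names
    file(files) from sorted_files

    output:
    file 'something' into nextchannel

    shell:
    '''
    cat !{files} > something
    '''
}
```

Either change on its own should be enough; they are combined here only to keep the sketch short.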
Simone Baffelli
@baffelli
Jun 01 2017 13:43
by rename you mean using input(somename:"newname*.ext")?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:43
weird
yes
Simone Baffelli
@baffelli
Jun 01 2017 13:43
but just input(somename)?
and then use that?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:44
that should work
Simone Baffelli
@baffelli
Jun 01 2017 13:44
cool, didn't read the docs carefully then...it always bothers me to rely on fixed names
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:45
I mean a variable handle not a fixed file name
Simone Baffelli
@baffelli
Jun 01 2017 13:45
yes
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:45
ok
Simone Baffelli
@baffelli
Jun 01 2017 13:45
somename in this case
It will be a FileList
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:45
yes
Simone Baffelli
@baffelli
Jun 01 2017 13:45
or whatever name it has in nextflow
:)
i can't remember
excellent!
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:46
in principle List<Path> :smile:
Simone Baffelli
@baffelli
Jun 01 2017 13:46
Yes, but internally I saw another name:)
I had to cast it to a list
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:47
I don't remember either, most stupid name ever chosen ;)
no, you should not
Simone Baffelli
@baffelli
Jun 01 2017 13:47
I remember that I wanted to transpose it
with another list, but it did not let me
unless I would cast it to List
stacking_columns = [unw_ls as List, bl.collect{item->seconds_to_day(item as long)}].transpose()
without unw_ls as List it complained
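For reference, a plain Groovy sketch of what that line does. The values of `unw_ls` and `bl` and the body of `seconds_to_day` are assumptions made up for illustration; in the pipeline `unw_ls` is the collection of staged files rather than an ordinary list:

```groovy
// Stand-in values (hypothetical); only the variable names come from the chat.
def unw_ls = ['a.unw', 'b.unw', 'c.unw']
def bl = [86400, 172800, 259200]                      // assumed to be seconds
def seconds_to_day = { long s -> s.intdiv(86400) }    // assumed meaning: seconds -> whole days

// transpose() pairs up the i-th element of each inner list.
// The explicit `as List` mirrors the cast that was needed in the chat.
def stacking_columns = [unw_ls as List, bl.collect { item -> seconds_to_day(item as long) }].transpose()

stacking_columns.each { row -> println row }
```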
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:48
ah yes, in this case you are right, I was thinking unw_ls.transpose()
Simone Baffelli
@baffelli
Jun 01 2017 13:49
no no, I can't just build a list out of different types of lists I guess
I suppose because the one list is "type" (excuse my imprecision) and the other one is a regular groovy list
*typed
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:54
actually it's a bit more complicated: the files are collected by an object that does not implement the List interface, though it implements the methods of a List
this is a hack needed to provide custom formatting for the list items, something that I would like to solve but it's a very low-level Groovy mess
Simone Baffelli
@baffelli
Jun 01 2017 13:54
And it supports being cast to a List
so that filelist is not a Collection
though it supports the same methods?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 13:56
yes
Simone Baffelli
@baffelli
Jun 01 2017 14:39
cool ;)
Simone Baffelli
@baffelli
Jun 01 2017 14:48
I think the problem is that some function I'm using to combine maps in the pipeline sometimes fails
And then the list of collected files changes
and that causes a whole series of changes downstream
:laughing:
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:50
even without renaming those files ?
Simone Baffelli
@baffelli
Jun 01 2017 14:50
I did not try that yet
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:50
ah
Simone Baffelli
@baffelli
Jun 01 2017 14:50
those were crazy days ;)
I'm trying to get a final figure done for my conference presentation :sweat_smile:
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:51
like any other day :sunglasses:
Simone Baffelli
@baffelli
Jun 01 2017 14:51
on an easy day I manage to go home at 7 :sunglasses:
but really, you deserve a medal for nextflow
It made my life much easier
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:52
because you stop working when you are at home? ! :grin:
Simone Baffelli
@baffelli
Jun 01 2017 14:52
technically yes, in my head it never stops
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:52
ahahah
instead, when you go home, users from the other side of the ocean arrive here .. :D
Simone Baffelli
@baffelli
Jun 01 2017 14:54
so it never stops for you
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:54
fucking globalisation !
Simone Baffelli
@baffelli
Jun 01 2017 14:54
do you ever sleep?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:54
sometimes
Maxime Garcia
@MaxUlysse
Jun 01 2017 14:58
I don't believe you
Félix C. Morency
@fmorency
Jun 01 2017 14:58
^
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:58
:sleeping:
Simone Baffelli
@baffelli
Jun 01 2017 14:59
It would be rather cool not having to sleep
(especially if employers would not be aware of that)
Paolo Di Tommaso
@pditommaso
Jun 01 2017 14:59
is not everybody in the Singularity channel for the new release party !? :D
Félix C. Morency
@fmorency
Jun 01 2017 15:00
new release?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:00
singularity 2.3
Félix C. Morency
@fmorency
Jun 01 2017 15:01
oh yeah! woohoo!
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:01
you see, you were sleeping :D
already up and running here
Félix C. Morency
@fmorency
Jun 01 2017 15:03
@pditommaso do you (or anyone here) have experience with kernel tunables (or other tunables) for network I/O, heavy load, or huge file transfers?
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:05
I'm not a kernel specialist, when I was interviewed by Google I didn't even know what an inode is :)
Félix C. Morency
@fmorency
Jun 01 2017 15:10
:D
Simone Baffelli
@baffelli
Jun 01 2017 15:24
@pditommaso is it ok if I use the nextflow logo in a presentation?
as an advertisement
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:25
you are welcome
academic ?
Simone Baffelli
@baffelli
Jun 01 2017 15:25
sure
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:26
more than happy, if you can also share it here it would be interesting for the community
Simone Baffelli
@baffelli
Jun 01 2017 15:26
I will!
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:27
great, so NF is going into space ;)
Simone Baffelli
@baffelli
Jun 01 2017 15:27
not really :)
:rocket:
what I'm working on is a bit at the edge of the space/satellite community, because i use a device similar to those on satellites, only mine is on the ground
Paolo Di Tommaso
@pditommaso
Jun 01 2017 15:28
sounds cool all the same
Simone Baffelli
@baffelli
Jun 01 2017 15:28
and my datasets are a bit smaller :sweat_smile: but much longer in time
in the order of 200/500 images/day, each a few MB in size
chdem
@chdem
Jun 01 2017 16:24
Hi there ! I'm trying to make a nextflow script for multiqc and I need to use conditional inputs in the final process : MultiQC can analyze data from many sources (each produced by a different process). Depending on the available data sources, the user can launch the NF script with up to 3 parameters (--fasta_path and/or --bam_path and/or --vcf_path).
Of course, I need to be sure that all the processes are finished before launching the final MultiQC process
I can do that by putting the input channels from the corresponding processes in the final MultiQC process BUT
the inputs depend on the combination of params used
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:27
then what exactly is the problem ?
chdem
@chdem
Jun 01 2017 16:27
because conditional inputs are not available, with 3 parameters, I have to make 5 tests
and to write the final MultiQC process 5 times
with different inputs
is there another way to do that ?
I'm sorry if i'm not clear
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:31
I have some problem with this
because conditional inputs are not available, with 3 parameters, I have to make 5 tests
and to write the final MultiQC process 5 times
chdem
@chdem
Jun 01 2017 16:31
actually, I have to do 7 tests and 7 versions of my final process
test 1 : (!fasta && !bam && vcf) ; test 2 : (!fasta && bam && !vcf) ; test 3 : (fasta && !bam && !vcf) ; test 4 : (!fasta && bam && vcf) ; test 5 (fasta && !bam && vcf) ; test 6 (fasta && bam && !vcf) ; test7 (fasta && bam && vcf)
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:32
is it supposed to be NF code ?
chdem
@chdem
Jun 01 2017 16:35
if (!params.fasta && !params.bam && params.vcf) {
process final_multiqc {
    input:
    file('') from channel_process_vcf_1.collect()
    file('') from channel_process_vcf_2.collect()
    file('') from channel_process_vcf_3.collect()

   output:
   file "*multiqc_report.html"
   file "*multiqc_data"

   script:
   """
   multiqc_command
   """
}
}
if (!params.fasta && params.bam && !params.vcf) {
process final_multiqc {
    input:
    file('') from channel_process_bam_1.collect()
    file('') from channel_process_bam_2.collect()
    file('') from channel_process_bam_3.collect()

   output:
   file "*multiqc_report.html"
   file "*multiqc_data"

   script:
   """
   multiqc_command
   """
}
}
etc....
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:36
ummm too bad
chdem
@chdem
Jun 01 2017 16:36
this is exactly the same process, only inputs change
:(
ok, so no other way....
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:37
use just one multiqc step and create a single channel that collects all the other channels
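One way this could look, as a rough sketch in the same pre-DSL2 syntax used in this chat. The parameter names follow the question; the defaults, the stand-in processes `bam_stats` and `vcf_stats`, the channel name `multiqc_inputs` and the `multiqc .` call are assumptions for illustration. The `mix` operator merges whichever channels exist into one:

```groovy
// Defaults are assumptions; override with --bam_path / --vcf_path.
params.bam_path = false
params.vcf_path = false

// Start empty and mix in the outputs of whichever branches are enabled.
multiqc_inputs = Channel.empty()

if( params.bam_path ) {
    process bam_stats {      // hypothetical stand-in for the real BAM step
        input:  file bam from Channel.fromPath("${params.bam_path}/*.bam")
        output: file "${bam}.stats" into channel_process_bam_1

        script:
        """
        echo ${bam} > ${bam}.stats
        """
    }
    multiqc_inputs = multiqc_inputs.mix(channel_process_bam_1)
}

if( params.vcf_path ) {
    process vcf_stats {      // hypothetical stand-in for the real VCF step
        input:  file vcf from Channel.fromPath("${params.vcf_path}/*.vcf")
        output: file "${vcf}.stats" into channel_process_vcf_1

        script:
        """
        echo ${vcf} > ${vcf}.stats
        """
    }
    multiqc_inputs = multiqc_inputs.mix(channel_process_vcf_1)
}

// A single final process, whatever combination of params was given.
process final_multiqc {
    input:
    file reports from multiqc_inputs.collect()

    output:
    file "*multiqc_report.html"
    file "*multiqc_data"

    script:
    """
    multiqc .
    """
}
```

With this layout the final process is written once, and adding a fasta branch would only mean one more `if` block that mixes its output into `multiqc_inputs`.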
chdem
@chdem
Jun 01 2017 16:38
Great ! Thank you @pditommaso !
I'm going to test this !
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:39
or even easier
if you have two or more different branches creating the same channel
you can have a single downstream process using it eg
chdem
@chdem
Jun 01 2017 16:40
that sounds great....
but I don't have the same number of channels in each branch
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:41
if( condition ) {
  process foo {
    output: file x into channel_process_bam_1
   : 
  } 
}
else {
    process bar {
    output: file x into channel_process_bam_1
   : 
  } 
}

process multiqc {
  input: file x from channel_process_bam_1.collect()
  :

}
chdem
@chdem
Jun 01 2017 16:42
understood
what if :
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:42
or even (depending on whether it's possible in your case)
chdem
@chdem
Jun 01 2017 16:43
if( condition ) {
  process foo {
    output: file x into channel_process_bam_1
   : 
  } 
process foo2 {
    output: file x into channel_process_bam_2
   : 
  } 
}
else {
    process bar {
    output: file x into channel_process_bam_1
   : 
  } 
}
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:43
process foo {
  output: file x into channel_process_bam_1

  script: 
  if( condition ) 
    '''
    to_this
    '''
 else 
   '''
   to_that
   '''
}

process multiqc {
  input: file x from channel_process_bam_1.collect()
  :

}
chdem
@chdem
Jun 01 2017 16:44
ok, this is the example that you gave in the Google group
I now understand your point well
I'm going to do some tests
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:45
regarding your last example, then have the last process produce the same output channel, e.g.
if( condition ) {
  process foo {
    output: file x into channel_process_bam_1
   : 
  } 
process foo2 {
    output: file x into channel_process_bam_2
   : 
  } 
}
else {
    process bar {
    output: file x into channel_process_bam_2
   : 
  } 
}
ie channel_process_bam_2 in both branches
does it make sense ?
chdem
@chdem
Jun 01 2017 16:46
yes, absolutely
thank you @pditommaso , you're very helpful !
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:46
:+1:
chdem
@chdem
Jun 01 2017 16:46
good evening !
:smile:
Paolo Di Tommaso
@pditommaso
Jun 01 2017 16:47
same