These are chat archives for nextflow-io/nextflow

24th
Aug 2018
Shellfishgene
@Shellfishgene
Aug 24 2018 07:56
I seem to remember it's possible to make process queue submission parameters (such as time) dependent on input file size, but can't find an example. Could someone point me to one if that can be done?
Alexander Peltzer
@apeltzer
Aug 24 2018 07:59
There is a way to do similar things, e.g. you can resubmit jobs with increasing memory/cpu/time constraints
But not directly dependent on file input size or other metainformation about your inputs (yet!)
Shellfishgene
@Shellfishgene
Aug 24 2018 08:01
Yes, I already do the resubmission with a multiplier
kind of a workaround
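For reference, the resubmission workaround mentioned here can be sketched roughly as below. The process name, the `reads_ch` channel and the base memory/time values are made up for illustration; only the `errorStrategy`, `maxRetries` and `task.attempt` mechanics are standard Nextflow.

```groovy
// Hypothetical process: name, input channel and base values are assumptions.
process alignReads {
    // Resubmit the task when it fails, up to three times.
    errorStrategy 'retry'
    maxRetries 3

    // Dynamic directives: each closure is re-evaluated for every attempt,
    // and task.attempt starts at 1, so the request grows on each retry.
    memory { 4.GB * task.attempt }
    time   { 2.h * task.attempt }

    input:
    file reads from reads_ch

    script:
    """
    echo "processing $reads"
    """
}
```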
Shellfishgene
@Shellfishgene
Aug 24 2018 13:46
When I have a channel that emits a groovy map such as ['color':'Blue', 'shape':'Circle'] how do I read that into variables in the process? Also using set like for a tuple? Or can I use the map as is in the process?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 13:51
  input: 
  val record from foo 

  script:
  """
  $record.color 
  """
Shellfishgene
@Shellfishgene
Aug 24 2018 13:55
Thanks, that's easy. I tried $record['color'] after googling groovy maps...
Paolo Di Tommaso
@pditommaso
Aug 24 2018 13:56
as well, but in that case you need curly brackets
${record['color']}
Shellfishgene
@Shellfishgene
Aug 24 2018 13:56
Great, thanks
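Putting the two access styles together, a minimal sketch might look like this. The `shapes_ch` channel and the `describeShape` process are invented for the example; the map keys are the ones from the question.

```groovy
// Hypothetical channel emitting one Groovy map per item.
shapes_ch = Channel.from(
    [color: 'Blue', shape: 'Circle'],
    [color: 'Red',  shape: 'Square']
)

process describeShape {
    echo true   // print the task's stdout so the result is visible

    input:
    val record from shapes_ch

    script:
    // Property access works with or without curly brackets,
    // subscript access needs the curly brackets.
    """
    echo "${record.color} ${record['shape']}"
    """
}
```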
Winni Kretzschmar
@winni2k
Aug 24 2018 15:16
Hi
Quick question: I'm debugging my nextflow pipeline, and I'm having difficulty pinpointing why exactly things are not behaving the way I expect.
Is there something like a "tee" operator?
It seems like println, print, and view all do not pass through the elements that they print?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:19
view does
dump is even better
Winni Kretzschmar
@winni2k
Aug 24 2018 15:20
Oh! Am I confused, or does the documentation mention that? https://www.nextflow.io/docs/latest/operator.html#view
why is dump better?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:21
Both the view and print (or println) operators consume the items emitted by the source channel to which they are applied. However, the main difference between them is that the former returns a newly created channel whose content is identical to the source one. This allows the view operator to be chained like other operators.
because dump allows you to enable/disable it by using a command line option
Winni Kretzschmar
@winni2k
Aug 24 2018 15:22
lol, I saw that note, and skipped it because it did not seem relevant ;)
excellent! Thanks so much
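A small sketch of the difference, assuming a toy channel of numbers and a made-up tag name `doubled`:

```groovy
Channel
    .from(1, 2, 3)
    .view { "seen: $it" }     // always printed; the original items are passed on
    .map { it * 2 }
    .dump(tag: 'doubled')     // printed only when dumping is enabled for this tag
    .set { doubled_ch }
```

With this in a script, the `dump` output should only show up when the run is started with the `-dump-channels` option, for example `nextflow run main.nf -dump-channels doubled` (the script name is an assumption).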
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:23
:ok_hand:
Winni Kretzschmar
@winni2k
Aug 24 2018 15:25
Second question: It's about collectFile
The documentation says: "When the items emitted by the source channel are files, the grouping criteria can be omitted. In this case the items content will be grouped in file(s) having the same name as the source items."
I'm not sure I understand what that means?
I am trying to concatenate all the files in a channel, and I'm trying to use collectFile, but that's not working. Is this relevant?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:27
do you want to concat all files to a single file ?
Winni Kretzschmar
@winni2k
Aug 24 2018 15:27
Yes!
And then ideally publish the resulting file to a dir
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:27
and all of them have the same name ?
or different names ?
Winni Kretzschmar
@winni2k
Aug 24 2018 15:28
No, all the files have a different name
yes!
Huh, after I added the "name" attribute, it seems to have worked?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:28
was mentioning that
Winni Kretzschmar
@winni2k
Aug 24 2018 15:28
combine_before_kallisto_ch = combine_before_kallisto1_ch
    .mix(combine_before_kallisto2_ch)
    .map{ g, f -> f }
    .collectFile(name: 'merged_candidate_transcripts.fa.gz', sort: false)
    .view()

process combineBeforeKallisto {
    publishDir 'merged_candidate_transcripts'

    input:
    file 'merged_candidate_transcripts.fa.gz' from combine_before_kallisto_ch

    output:
    set val(0), file('merged_candidate_transcripts.fa.gz') into merged_candidate_transcripts_ch

    "echo hi world"
}
I assume I need to make the process call in order to publish the merged_candidate_transcripts.fa.gz file?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:30
not sure I understand the last question?
Winni Kretzschmar
@winni2k
Aug 24 2018 15:32
The process "combineBeforeKallisto" does two things right now: It publishes "merged_candidate_transcripts.fa.gz" into the directory "merged_candidate_transcripts", and it adds a dummy value "0" before pushing the file into "merged_candidate_transcripts_ch"
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:33
> it adds a dummy value "0"

how is this value added if the file is compressed?
Winni Kretzschmar
@winni2k
Aug 24 2018 15:33
Could I rewrite this code like this?
combine_before_kallisto_ch = combine_before_kallisto1_ch
    .mix(combine_before_kallisto2_ch)
    .map{ g, f -> f }
    .collectFile(name: 'merged_candidate_transcripts.fa.gz', sort: false, publishDir: 'merged_candidate_transcripts' )
    .map{ f -> tuple(val(0), f)}
    .view()
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:34
ahh, not to the file
Winni Kretzschmar
@winni2k
Aug 24 2018 15:34
yes, val(0) is an id that I pass around the pipeline
Karin Lagesen
@karinlag
Aug 24 2018 15:35
... I am trying to figure out withName and withLabel
Winni Kretzschmar
@winni2k
Aug 24 2018 15:35
And the gzipped fastas can of course be concatenated together
Karin Lagesen
@karinlag
Aug 24 2018 15:35
I get what to write in the config file, but how do I name/label a process in the main script?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:35
the .map{ f -> tuple(val(0), f)} should be .map{ f -> tuple(0, f)}
Winni Kretzschmar
@winni2k
Aug 24 2018 15:35
I think withName just works on the process names
Karin Lagesen
@karinlag
Aug 24 2018 15:36
that makes sense
but how do I label one?
Karin Lagesen
@karinlag
Aug 24 2018 15:36
awesome, thanks :)
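For anyone reading along, labelling a process in the main script is done with the `label` directive. A minimal sketch, where the process name `bigTask` and the label `big_mem` are invented for illustration:

```groovy
// Hypothetical process; only the `label` directive is the point here.
process bigTask {
    // Matched by a `withLabel: big_mem { ... }` selector in the config file;
    // `withName: bigTask { ... }` would match the process name instead.
    label 'big_mem'

    script:
    """
    echo "running with the big_mem settings"
    """
}
```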
Winni Kretzschmar
@winni2k
Aug 24 2018 15:37
Thanks @pditommaso
@pditommaso , does collectFile work with publishDir?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:37
:ok_hand:
nope
Winni Kretzschmar
@winni2k
Aug 24 2018 15:38
:(
And I assume storeDir works just like the directive?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:38
what do you mean "does collectFile work with publishDir" ?
Winni Kretzschmar
@winni2k
Aug 24 2018 15:38
That is, collectFile(storeDir:...)
collectFile has the option to store the resulting file in a directory using the "storeDir" parameter. From the naming of the parameter, I assume that the behavior of the storeDir parameter is similar to the "storeDir" directive, in that the file is copied to the new directory and not symlinked?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:43
they are not directly related
Winni Kretzschmar
@winni2k
Aug 24 2018 15:43
Ah! I'll give it a try then and see what the behavior ends up being.
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:43
the collectFile storeDir is just used to define the dir where those files are created
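Applied to the snippet above, that would look roughly like the following. The input glob is a placeholder, the file and directory names are the ones from the earlier messages, and the `tuple(0, f)` form is the correction suggested earlier.

```groovy
merged_candidate_transcripts_ch = Channel
    .fromPath('candidates/*.fa.gz')      // hypothetical input location
    .collectFile(
        name: 'merged_candidate_transcripts.fa.gz',
        storeDir: 'merged_candidate_transcripts',   // dir where the merged file is created
        sort: false)
    .map { f -> tuple(0, f) }            // plain value 0 as the id, not val(0)
```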
Winni Kretzschmar
@winni2k
Aug 24 2018 15:45
Conceptually, is that identical to a publishDir process directive?
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:46
somehow ..
Winni Kretzschmar
@winni2k
Aug 24 2018 15:46
hehe ;)
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:46
:sweat_smile:
Winni Kretzschmar
@winni2k
Aug 24 2018 15:48
well, my tests pass with your suggested re-write, so... success!
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:49
cool
Winni Kretzschmar
@winni2k
Aug 24 2018 15:49
thanks a bunch @pditommaso !
Paolo Di Tommaso
@pditommaso
Aug 24 2018 15:50
:ok_hand:
Alexander Peltzer
@apeltzer
Aug 24 2018 16:38
Let's say I have a process that should differentiate between singleEnd or pairedEnd data and output different files in both cases. I suppose I can't just have a groovy-style one-liner statement in the output: section?
I could also filter the channel afterwards and drop (depending on params.singleEnd) which file I want to get rid of ;-)
Paolo Di Tommaso
@pditommaso
Aug 24 2018 16:40
something like that ?
Winni Kretzschmar
@winni2k
Aug 24 2018 16:41
pretty!
Alexander Peltzer
@apeltzer
Aug 24 2018 16:41
I need vacation
Thanks @pditommaso
I thought too complicated
Paolo Di Tommaso
@pditommaso
Aug 24 2018 16:41
:joy:
I need to include it in the patterns list
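The snippet being discussed is not preserved in this archive. One way to get a similar effect, not necessarily the one suggested, is to declare both outputs as optional and switch the script on `params.singleEnd`. Everything below (process name, file names, channel names, the `touch` commands standing in for a real tool) is a made-up sketch.

```groovy
params.singleEnd = false

process trimReads {
    input:
    val name from Channel.from('sampleA')   // hypothetical sample id

    output:
    // Only one of the two patterns matches in a given run;
    // `optional true` keeps the task from failing on the missing one.
    file '*_trimmed.fq.gz' optional true into single_trimmed_ch
    file '*_val_{1,2}.fq.gz' optional true into paired_trimmed_ch

    script:
    if( params.singleEnd )
        """
        touch ${name}_trimmed.fq.gz
        """
    else
        """
        touch ${name}_val_1.fq.gz ${name}_val_2.fq.gz
        """
}
```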
Sven F.
@sven1103
Aug 24 2018 16:42

> I need vacation

you have

:joy:
Alexander Peltzer
@apeltzer
Aug 24 2018 16:43
Actually true, but taking the flight on Sunday ;-)