These are chat archives for nextflow-io/nextflow

31st
Aug 2018
Pierre Lindenbaum
@lindenb
Aug 31 2018 07:31
Hi all, is there a way to tell whether a process is cached or not? I mean, I have a process that downloads some BAMs from the web, but there is a bug in the subsequent process that I'm trying to fix. I don't change the 'download' process; however, every time ... wait ... I forgot the -resume parameter ... THANK YOU EVERYONE!
Maxime Garcia
@MaxUlysse
Aug 31 2018 07:35
Glad to be a good rubber duck ;-)
Paolo Di Tommaso
@pditommaso
Aug 31 2018 07:48
@lindenb not sure I understand what you are asking
Pierre Lindenbaum
@lindenb
Aug 31 2018 07:54
@pditommaso there is no question, I fixed my problem while I was typing :-)
Paolo Di Tommaso
@pditommaso
Aug 31 2018 07:55
it was just an imprecation :satisfied:
anyhow I've just replied on twitter as well
Paolo Di Tommaso
@pditommaso
Aug 31 2018 08:07
@ypriverol the best web UI to my knowledge is the Flowcraft project by @ODiogoSilva https://twitter.com/bioinformAnt/status/1035420910210220032
however my understanding is that it's not a generic NF front-end; it works with their own framework based on NF
also you may be interested to know that the latest version includes an API to track workflow execution metrics
Pierre Lindenbaum
@lindenb
Aug 31 2018 09:04
when there is an error in one process, is there a way to tell NF to let the other ongoing processes continue until they complete? I cannot find this in the manual.
Maybe that?
Pierre Lindenbaum
@lindenb
Aug 31 2018 09:09
@MaxUlysse thanks !
Maxime Garcia
@MaxUlysse
Aug 31 2018 09:10
Glad to help ;-)
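What was actually suggested in the reply above is not preserved in this archive. One documented Nextflow directive that fits the question is `errorStrategy` with the value `finish`; treating it as the answer given here is an assumption. A minimal config sketch:

```groovy
// nextflow.config (sketch; assumes the setting should apply to every process)
// 'finish' triggers an orderly shutdown after a task fails:
// tasks that were already submitted are allowed to run to completion.
process {
    errorStrategy = 'finish'
}
```

The same directive can also be set inside a single process definition if only that process should behave this way.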
micans
@micans
Aug 31 2018 09:35
Thanks for groupKey() update Paolo.
Paolo Di Tommaso
@pditommaso
Aug 31 2018 09:36
:+1:
micans
@micans
Aug 31 2018 09:38
A new problem on k8s. When we resume the pipeline NF lists cached processes in the console and then it seems to hang, but we checked the pods and processes have been created by NF and are running, they are just not mentioned in the console. (Vlad is sitting next to me, brought to you by both of us). Has this been seen before?
Paolo Di Tommaso
@pditommaso
Aug 31 2018 09:39
nope
micans
@micans
Aug 31 2018 09:39
We expected you to say just 'no' :-)
Paolo Di Tommaso
@pditommaso
Aug 31 2018 09:39
ahaha
without log, etc, I'm just blind
micans
@micans
Aug 31 2018 09:40
yes of course
The tail of the log is this:
~> TaskHandler[id: 83; name: crams_to_fastq (iPSCard7082621); status: SUBMITTED; exit: -; error: -; workDir: /mnt/gluster/svd/work/3d/5c9ceb2569f024c02a565e691d6e73]
~> TaskHandler[id: 85; name: crams_to_fastq (iPSCard7082643); status: SUBMITTED; exit: -; error: -; workDir: /mnt/gluster/svd/work/15/a112c8e4de4076ee14b47174472727]
Aug-31 09:38:19.196 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor k8s > tasks to be completed: 3 -- pending tasks are shown below
~> TaskHandler[id: 34; name: crams_to_fastq (iPSCard7082620); status: SUBMITTED; exit: -; error: -; workDir: /mnt/gluster/svd/work/be/c57182075df35af2fa6977f5fbf6d2]
~> TaskHandler[id: 83; name: crams_to_fastq (iPSCard7082621); status: SUBMITTED; exit: -; error: -; workDir: /mnt/gluster/svd/work/3d/5c9ceb2569f024c02a565e691d6e73]
~> TaskHandler[id: 85; name: crams_to_fastq (iPSCard7082643); status: SUBMITTED; exit: -; error: -; workDir: /mnt/gluster/svd/work/15/a112c8e4de4076ee14b47174472727]
Tail of console output is:
[87/6987b4] Cached process > featureCounts (iPSCard7082624)
[f7/2fed67] Cached process > star (iPSCard7082628)
          Passed alignment > star (iPSCard7082628.)   >> 92.14% <<
[8a/176cdf] Cached process > featureCounts (iPSCard7082628)
micans
@micans
Aug 31 2018 09:45
The processes that are running but not shown seem to be the ones that were aborted yesterday.
Paolo Di Tommaso
@pditommaso
Aug 31 2018 09:46
you are saying pods are running but do not show in the log
micans
@micans
Aug 31 2018 09:47
OK, we found it, our mistake: the submitted processes are in the middle of the log, not in the tail.
[4d/c3fd42] Cached process > star (iPSCard7082629)
          Passed alignment > star (iPSCard7082629.)   >> 92.30% <<
[15/a112c8] Submitted process > crams_to_fastq (iPSCard7082643)
[7a/5d1852] Cached process > star (iPSCard7082631)
[30/e11de2] Cached process > featureCounts (iPSCard7082649)
Karin Lagesen
@karinlag
Aug 31 2018 12:25
Hi!
I have files that conform to this pattern:
60-2015-01-5018-1_CGAGGCTG-ACTGCATA_L001_R1_001.fastq.gz
60-2015-01-5018-1_CGAGGCTG-ACTGCATA_L002_R1_001.fastq.gz
61-2015-01-5020-1_CGAGGCTG-AAGGAGTA_L001_R1_001.fastq.gz
61-2015-01-5020-1_CGAGGCTG-AAGGAGTA_L002_R1_001.fastq.gz
and I have 4 of each, i.e. four for 61 and four for 60 etc
how do I write an input channel pattern that will leave me with only the 61-2015-01-5020-1, and not the indices?
Paolo Di Tommaso
@pditommaso
Aug 31 2018 12:32
how are the indices named?
Karin Lagesen
@karinlag
Aug 31 2018 12:48
the indices are the CGAGGCTG-AAGGAGTA strings
we want to group things as 61-2015-01-5020-1
my normal pattern would be *_L00{1,2}_R{1,2}_001.fastq.gz
but that includes the indices in the filename
in the prefix, I meant
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:02
uhh, so you want to exclude all the ones having CGAGGCTG-AAGGAGTA in the file name ?
micans
@micans
Aug 31 2018 13:10
Do you want to manipulate the file names? You can do things like this:
Channel
   .from('Ta-s1-c1,c2,c3::Ta-s2-d1,d2::Tb-s3-e1,e2,e3::Tb-s4-f1,f2,f3,f4')
   .flatMap { it -> it.split('::') }    //  [Ta-s1-c1,c2,c3] etc
   .map { it -> it.split('-') }         //  [Ta, s1, c1,c2,c3 ] etc
   .map { it -> tuple(it[0], it[1], it[2]) }
   .map { tg, sn, slist -> [sn, tg, slist.split('\\,')] }
   .println()
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:11
oh !
what's this ! :D
[s1, Ta, [c1, c2, c3]]
[s2, Ta, [d1, d2]]
[s3, Tb, [e1, e2, e3]]
[s4, Tb, [f1, f2, f3, f4]]
micans
@micans
Aug 31 2018 13:12
Something I produced while working on nested groupKey() use
I still need to finish that; it was so tricky (for me) that I just started with a single level of groupKey(), and that has been working :+1:
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:14
likely the 3 maps can be merged into a single one
likely=surely
micans
@micans
Aug 31 2018 13:15
yes, appreciate that. I was experimenting a lot, and then having them apart can be handy, especially if I want to omit steps and move println() around ... I'm still a novice
and there is a lot of power in NF + Groovy
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:15
tend to agree :smile:
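The merge suggested above (the three map steps folded into one) could look roughly like the sketch below. It is written as plain Groovy so it can be tried outside a pipeline; `parseRecord` and `raw` are hypothetical names, not from the chat, and `tokenize` is swapped in for `split` so that each step yields a list.

```groovy
// Hypothetical helper: parse one 'Tag-sample-a,b,c' record
// into [sample, tag, [items]] in a single step.
def parseRecord = { String record ->
    def (tag, sample, items) = record.tokenize('-')
    [sample, tag, items.tokenize(',')]
}

// Same input string as in the snippet above
def raw = 'Ta-s1-c1,c2,c3::Ta-s2-d1,d2::Tb-s3-e1,e2,e3::Tb-s4-f1,f2,f3,f4'

// Plain-Groovy stand-in for the channel: split into records, then parse each one
def rows = raw.split('::').collect(parseRecord)
rows.each { println it }
```

In the channel version, the body of `parseRecord` would go inside a single `.map { ... }` after the `flatMap` step. Keeping the steps apart, as micans notes, is still handy while experimenting.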
Karin Lagesen
@karinlag
Aug 31 2018 13:56
@pditommaso no, I don't want to exclude them. I just want to not have that part included in the prefix that it groups on
if that made sense
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:56
ahhh
also because otherwise it won't group the 4 files, no?
Karin Lagesen
@karinlag
Aug 31 2018 13:58
my normal pattern *_L00{1,2}_R{1,2}_001.fastq.gz does group the four
but as far as I've understood anything "below" that star becomes the prefix
Paolo Di Tommaso
@pditommaso
Aug 31 2018 13:59
let me try one thing
Karin Lagesen
@karinlag
Aug 31 2018 13:59
thus, I need to include the indices, i.e. 6XATCG-6XATGC, in the pattern
Paolo Di Tommaso
@pditommaso
Aug 31 2018 14:03
have you tried this ?
pattern = '*_{CGAGGCTG-ACTGCATA,CGAGGCTG-AAGGAGTA}_L00{1,2}_R{1,2}_001.fastq.gz'
Channel
   .fromFilePairs(pattern, size: 4)
   .println()
Karin Lagesen
@karinlag
Aug 31 2018 14:04
no
however, I have several of those indices, they are different per read set
thus, any way I could create a regex that could cover that? (regexes are one of my weak spots)
Paolo Di Tommaso
@pditommaso
Aug 31 2018 14:06
it depends how different they are
it may be easier to create a CSV file that lists them properly
and then parse the csv file
Karin Lagesen
@karinlag
Aug 31 2018 14:07
hmmm
so, it is always 6[ATCG]-6[ATCG]
6 times [ATCG] should that have been
Paolo Di Tommaso
@pditommaso
Aug 31 2018 14:08
it should even work with multiple * and it only captures the first
pattern = '*_*_L00{1,2}_R{1,2}_001.fastq.gz'
Channel
   .fromFilePairs(pattern, size: 4)
   .println()
ah no
I think you need custom grouping as shown here
you need to provide a closure that, given the file name, returns only the part you want to group on
Karin Lagesen
@karinlag
Aug 31 2018 14:15
ok thanks, will have a look at that :)
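The custom grouping idea above comes down to a closure that maps a file name to the key to group on. A minimal sketch in plain Groovy, assuming the sample ID is everything before the first underscore (true for the file names quoted in this conversation, but an assumption in general); `sampleKey` is a hypothetical name:

```groovy
// Hypothetical helper: the grouping key is everything before the first
// underscore, i.e. the sample ID without index, lane or read number.
def sampleKey = { String fileName -> fileName.tokenize('_')[0] }

// File names quoted earlier in the conversation
def names = [
    '60-2015-01-5018-1_CGAGGCTG-ACTGCATA_L001_R1_001.fastq.gz',
    '60-2015-01-5018-1_CGAGGCTG-ACTGCATA_L002_R1_001.fastq.gz',
    '61-2015-01-5020-1_CGAGGCTG-AAGGAGTA_L001_R1_001.fastq.gz',
    '61-2015-01-5020-1_CGAGGCTG-AAGGAGTA_L002_R1_001.fastq.gz'
]

// Plain-Groovy stand-in for the grouping step
def groups = names.groupBy(sampleKey)
groups.each { key, files -> println "${key}: ${files.size()} files" }
```

In a Nextflow script the same logic would go in the closure that `Channel.fromFilePairs` accepts as its last argument, for example `Channel.fromFilePairs('*_L00{1,2}_R{1,2}_001.fastq.gz', size: 4) { file -> file.name.tokenize('_')[0] }`. Here `size: 4` is carried over from the earlier snippet and no regex for the index part is needed, because the key ignores it.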