These are chat archives for nextflow-io/nextflow

23rd
Jan 2018
Martin Šošić
@Martinsos
Jan 23 2018 14:38
I have a situation where I call groupTuple() on a channel that is the output of a process, and Nextflow just gets stuck there. I tried adding view() before it and that works fine, it prints the items; however, if I add one after it, nothing is printed. So I am sure that all the processes execute and that it reaches the point where it should call groupTuple(), but then it just hangs. Any idea what the problem could be? I tried replacing groupTuple() with toListCollection() just to see how some other operator that works on the whole channel would behave, and I got the same problem. Is it possible that Nextflow can't detect when the channel is done receiving new elements, so it does not know when to run groupTuple()?
Félix C. Morency
@fmorency
Jan 23 2018 14:40
@Martinsos Are there running processes on which groupTuple() depends?
Martin Šošić
@Martinsos
Jan 23 2018 14:42
@fmorency thank you for the quick answer! There is one process, "aligner", that outputs into the channel "results". And then I have this code for handling the results:
results
  .map{ [it[0].baseName, it[1].baseName, it[2], it[3], it[4], it[5], *(it[6].split(" "))] }
  .map{ [it.take(5).join('-'), *it] }
  .groupTuple()
  .map{ it.drop(1) }
  .toSortedList()
  .view()
I have 5 items that go into process "aligner", they all get executed, and then it just hangs.
Félix C. Morency
@fmorency
Jan 23 2018 14:45
And what's the output of the map() just before the groupTuple()?
Martin Šošić
@Martinsos
Jan 23 2018 14:45
I also tried modifying code to
results
  .view()
  .map{ [it[0].baseName, it[1].baseName, it[2], it[3], it[4], it[5], *(it[6].split(" "))] }
  .view()
  .map{ [it.take(5).join('-'), *it] }
  .view()
  .groupTuple()
  .view()
  .map{ it.drop(1) }
  .toSortedList()
  .view()
Oops, wait for the edit :)!
Ok, so I edited the code as above and the first three views are printed as expected, but the one after groupTuple() is not printed and Nextflow hangs.
Félix C. Morency
@fmorency
Jan 23 2018 14:48
What's the output of the 3rd view()?
Martin Šošić
@Martinsos
Jan 23 2018 14:48
Sorry, took me some time to get it:
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-100-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 100, 0, edlib, 0.577511, 87
]
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-1000-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 1000, 0, edlib, 2.490245, 87
]
[mutated_94_perc-e_coli_DH1-HW-1000-0, mutated_94_perc, e_coli_DH1, HW, 1000, 0, edlib, 3.312938, 605
]
[mutated_97_perc-e_coli_DH1-HW-1000-0, mutated_97_perc, e_coli_DH1, HW, 1000, 0, edlib, 2.757531, 298
]
[mutated_90_perc-e_coli_DH1-HW-1000-0, mutated_90_perc, e_coli_DH1, HW, 1000, 0, edlib, 3.723918
]
Félix C. Morency
@fmorency
Jan 23 2018 14:50
there is nothing to group on that output.
Martin Šošić
@Martinsos
Jan 23 2018 14:50
I have a tuple that describes an experiment and its results, so I want to group them by experiment, and later I will aggregate the results.
That is why I join the first few elements, those that describe the experiment, to create a key which I put as the first element.
Well yes in this example there is nothing to group hm, but I would expect it to return groups of 1 then?
Truth is I simplified input to make it faster for testing and forgot that there will be nothing to group! Ok let me try with something to group, sec
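The key-building step described above can be sketched in plain Groovy, outside any channel. The sample row is copied from the view() output earlier in this conversation (with the trailing newline dropped); the variable names row and keyed are made up for illustration.

```groovy
// One row as it looks after the first map(): experiment fields first,
// then the aligner name and its results.
// (Values copied from the view() output above; variable names are hypothetical.)
def row = ['e_coli_DH1_illumina_1x10000', 'e_coli_DH1', 'HW', '100', '0', 'edlib', '0.577511', '87']

// Join the first five fields (the ones that identify the experiment) into a
// single key and prepend it, so that groupTuple() can group on element 0.
def keyed = [row.take(5).join('-'), *row]

assert keyed[0] == 'e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-100-0'
assert keyed.size() == row.size() + 1
```

Rows for the same experiment but a different aligner (edlib, myers) get the same key, so they should end up in the same group.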
Félix C. Morency
@fmorency
Jan 23 2018 14:51
Check the remainder and size parameters of the groupTuple operator
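As a rough sketch of what those two options do, with made-up keys and values and the same Channel.from factory used elsewhere in this conversation: with size, a group is emitted as soon as it holds that many items, and with remainder: true the incomplete groups are emitted once the source channel closes. Without size, groupTuple() has to wait for the source channel to close before it emits anything.

```groovy
// Hypothetical keys and values, only to illustrate the two options.
// size: 2          -> emit a group as soon as it holds two items
// remainder: true  -> when the source closes, also emit groups with fewer than two items
Channel
    .from( ['k1', 'a'], ['k2', 'b'], ['k1', 'c'], ['k1', 'd'] )
    .groupTuple(size: 2, remainder: true)
    .view()
```

This only helps when the group size is known up front; otherwise everything depends on the source channel actually closing.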
Martin Šošić
@Martinsos
Jan 23 2018 14:52
So now I did this, and there is stuff to group now, but still it hangs:
[4b/f9cf48] Submitted process > align (2)
[81/2683f7] Submitted process > align (5)
[ac/b8d7d7] Submitted process > align (6)
[78/889020] Submitted process > align (3)
[19/c0eec7] Submitted process > align (4)
[d4/975df4] Submitted process > align (7)
[10/5f4942] Submitted process > align (1)
[4f/81f85d] Submitted process > align (8)
[c1/729145] Submitted process > align (10)
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-100-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 100, 0, myers, 0.19, ?
]
[da/68a7ed] Submitted process > align (9)
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-1000-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 1000, 0, myers, 1.48, ?
]
[mutated_97_perc-e_coli_DH1-HW-1000-0, mutated_97_perc, e_coli_DH1, HW, 1000, 0, myers, 1.55, ?
]
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-100-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 100, 0, edlib, 0.576868, 87
]
[e_coli_DH1_illumina_1x10000-e_coli_DH1-HW-1000-0, e_coli_DH1_illumina_1x10000, e_coli_DH1, HW, 1000, 0, edlib, 2.398807, 87
]
[mutated_94_perc-e_coli_DH1-HW-1000-0, mutated_94_perc, e_coli_DH1, HW, 1000, 0, myers, 1.56, ?
]
[mutated_97_perc-e_coli_DH1-HW-1000-0, mutated_97_perc, e_coli_DH1, HW, 1000, 0, edlib, 2.758755, 298
]
[mutated_94_perc-e_coli_DH1-HW-1000-0, mutated_94_perc, e_coli_DH1, HW, 1000, 0, edlib, 3.158158, 605
]
[mutated_90_perc-e_coli_DH1-HW-1000-0, mutated_90_perc, e_coli_DH1, HW, 1000, 0, myers, 1.26, ?
]
[mutated_90_perc-e_coli_DH1-HW-1000-0, mutated_90_perc, e_coli_DH1, HW, 1000, 0, edlib, 3.378452
]
All right I will check them, how do you think they affect this?
Félix C. Morency
@fmorency
Jan 23 2018 14:53
All execution finished? There's nothing else to execute and NF just hangs?
Martin Šošić
@Martinsos
Jan 23 2018 14:54
Yes -> there is exactly these 10 experiments in the input channel, and I see here that it processed them all
Félix C. Morency
@fmorency
Jan 23 2018 14:56
Do you have a small reproducible example I could play with?
Martin Šošić
@Martinsos
Jan 23 2018 14:56
I checked out size and remainder -> however, I don't know the exact size of the groups, and it is not going to be the same for every group.
Right, I should make one, I just didn't get to it yet. Give me a few minutes, I will try to make one quickly.
Martin Šošić
@Martinsos
Jan 23 2018 15:37
#!/usr/bin/env nextflow

e1 = Channel.from([['q']]).combine(['a', 'b'])
e2 = Channel.from([['q']]).combine(['a', 'c'])

experiments = Channel.create().mix(
  e1, e2
)

process align {
  input:
  set q, a from experiments

  output:
  set q, a, stdout into results

  shell:
  '''
  echo 0.2 5
  '''
}

results
  .view()
  .groupTuple()
  .view()
Here is a minimal example, and I actually figured out that the problem is the empty channel! If I remove Channel.create() and do e1.mix(e2), it works fine, or if I just do experiments = e1, for example. Is that a bug?
Félix C. Morency
@fmorency
Jan 23 2018 15:46
Not sure... @pditommaso ?
Paolo Di Tommaso
@pditommaso
Jan 23 2018 15:57
replace
experiments = Channel.create().mix(
  e1, e2
)
with
experiments = e1.mix(e2)
Martin Šošić
@Martinsos
Jan 23 2018 15:58
Yes, that helped! However, is that a bug? Channel.create().mix(e1, e2) sounds like it should be working
Paolo Di Tommaso
@pditommaso
Jan 23 2018 15:58
no, it's not a bug
Channel.create() is undefined hence it will do nothing
note that Channel.create() != Channel.empty()
like an undefined var is different from null
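A sketch of the difference, reusing e1 and e2 from the minimal example above and the same scripting style: a channel made with Channel.empty() emits nothing and then closes, so it can be mixed with other channels without keeping the result open. That a Channel.create() channel stays open until something closes it explicitly is an assumption used here to explain the hang, not something spelled out in this conversation.

```groovy
#!/usr/bin/env nextflow

e1 = Channel.from([['q']]).combine(['a', 'b'])
e2 = Channel.from([['q']]).combine(['a', 'c'])

// Channel.empty() is a closed channel with no items, so the mixed channel
// closes once e1 and e2 are done and groupTuple() can emit its groups.
experiments = Channel.empty().mix(e1, e2)

experiments
  .groupTuple()
  .view()
```

If no placeholder channel is needed at all, e1.mix(e2) is the shorter form.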
Martin Šošić
@Martinsos
Jan 23 2018 15:59
Got it -> I always thought that .create() creates an empty channel, but I should have used empty() instead.
Maybe it would be worth mentioning this in documentation? Or I missed this?
Paolo Di Tommaso
@pditommaso
Jan 23 2018 16:02
maybe you are right
Martin Šošić
@Martinsos
Jan 23 2018 16:02
Anyway, thanks a lot, both @fmorency and @pditommaso! This solves my problem.
Paolo Di Tommaso
@pditommaso
Jan 23 2018 16:02
very good
Félix C. Morency
@fmorency
Jan 23 2018 16:03
Np