These are chat archives for nextflow-io/nextflow

21st
Jun 2017
Matthieu Foll
@mfoll
Jun 21 2017 07:24
Hi all, we are looking for an HPC sysadmin + developer at IARC/WHO to help us with our bioinformatics pipelines, including nextflow development. Please share! https://tl-int.vcdp.who.int/careersection/in/jobdetail.ftl?job=1701799&tz=GMT%2B02%3A00
Maxime Garcia
@MaxUlysse
Jun 21 2017 07:25
@mfoll Do you have a link or a tweet to share? It'll be easier, I think.
Never mind, I found it ;-)
Paolo Di Tommaso
@pditommaso
Jun 21 2017 07:27
Happy to hear you are expanding your team. You may want to post in the NF Google group.
Matthieu Foll
@mfoll
Jun 21 2017 07:31
Thanks @MaxUlysse, just forgot it!
Maxime Garcia
@MaxUlysse
Jun 21 2017 07:31
np
By the way, the link to the job offer doesn't seem to be working for me
Paolo Di Tommaso
@pditommaso
Jun 21 2017 07:33
too bad ;)
Maxime Garcia
@MaxUlysse
Jun 21 2017 07:40
Thanks a lot
Evan Floden
@evanfloden
Jun 21 2017 08:12
Does anyone else use nextflow console on Mac and have issues trying to close it? Not sure if it's just me.
Paolo Di Tommaso
@pditommaso
Jun 21 2017 08:13
yes me :/
never had time to debug properly
Emilio Palumbo
@emi80
Jun 21 2017 08:14
same here :worried:
Evan Floden
@evanfloden
Jun 21 2017 08:14
Good to know. I can’t even force quit sometimes…
Paolo Di Tommaso
@pditommaso
Jun 21 2017 08:15
force quit works :)
LukeGoodsell
@LukeGoodsell
Jun 21 2017 10:17
Is there a way to monitor a channel without affecting its contents? I’d like to add optionally-active debugging code that reports items being emitted to a channel without affecting the channel. .subscribe seems to prevent processes using the channel as input. Adding another process that passes items from one channel to another seems overkill. Are there any other solutions I’ve overlooked?
Paolo Di Tommaso
@pditommaso
Jun 21 2017 10:35
Use the view operator for that
LukeGoodsell
@LukeGoodsell
Jun 21 2017 10:36
Perfect. Thanks!

I still get

ERROR ~ Channel `item` has been used twice as an input by process `afterProcess` and another operator

What am I doing wrong?

LukeGoodsell
@LukeGoodsell
Jun 21 2017 10:41
Demo code:
debug = true
itemListChannel = Channel.from(1, 2, 3, 4)
if(debug) itemListChannel.view()

process afterProcess {
    input:
    val item from itemListChannel

    exec:
    println("afterprocess: ${item}")
}
Maxime Garcia
@MaxUlysse
Jun 21 2017 10:45
Try 'itemListChannel=itemListChannel.view()'
Paolo Di Tommaso
@pditommaso
Jun 21 2017 10:45
view still consumes the channel, but it returns an identical one
Yes, Max's trick should work
Maxime Garcia
@MaxUlysse
Jun 21 2017 10:46
B-)
LukeGoodsell
@LukeGoodsell
Jun 21 2017 10:46
There we go! Thanks @pditommaso and @MaxUlysse
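
For reference, a minimal sketch of the demo with the reassignment applied. It assumes the same syntax as the demo code above (`from` in the input block) and has not been run here. The only change is that the channel returned by view is assigned back to itemListChannel, so the process reads the returned channel rather than the one view consumed.

```groovy
debug = true
itemListChannel = Channel.from(1, 2, 3, 4)

// view consumes the original channel and returns an identical one,
// so keep the returned channel for the process below
if (debug) itemListChannel = itemListChannel.view()

process afterProcess {
    input:
    val item from itemListChannel

    exec:
    println("afterprocess: ${item}")
}
```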
Maxime Garcia
@MaxUlysse
Jun 21 2017 10:49
You're welcome
LukeGoodsell
@LukeGoodsell
Jun 21 2017 11:54
Is there a way to have the view output written to STDERR, or must I print the content myself and return null?
LukeGoodsell
@LukeGoodsell
Jun 21 2017 12:06
Actually, return null makes it print “null”, so that doesn’t quite do what I want either.
LukeGoodsell
@LukeGoodsell
Jun 21 2017 12:13
I guess, more generally, what I’d like is a channel method that provides a per-item closure call and then re-emits the same item without doing anything else. That would allow me to powerfully log item information using (e.g.) log4j, or collect progress metrics etc.
Paolo Di Tommaso
@pditommaso
Jun 21 2017 12:13
Nope. In this case you can use map: print the item, then return the same value.
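
A minimal sketch of that map approach, assuming the channel from the demo code earlier in this log. The "emitted:" message text is illustrative only. System.err is the standard Java/Groovy error stream, so the output goes to STDERR.

```groovy
itemListChannel = Channel.from(1, 2, 3, 4)
    .map { item ->
        // log the item to STDERR (any logging call could go here)
        System.err.println("emitted: ${item}")
        // return the item unchanged so downstream consumers see the same values
        item
    }
```

The last expression of the closure is its return value, so ending with item re-emits each value as it arrived.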
LukeGoodsell
@LukeGoodsell
Jun 21 2017 12:14
Excellent, that’s what I want
Thanks again.
Paolo Di Tommaso
@pditommaso
Jun 21 2017 12:15
It could make sense to add a debug operator
LukeGoodsell
@LukeGoodsell
Jun 21 2017 12:18
Like map but with automatic re-emitting of the item? Or like view but to STDERR? Both would be nice ;-)
Once I get good at Nextflow, I’ll try to actually implement some features and make pull requests - but don’t hold your breath!
Paolo Di Tommaso
@pditommaso
Jun 21 2017 12:20
Any contribution is welcome :)
LukeGoodsell
@LukeGoodsell
Jun 21 2017 13:24
Today I’ve started repeatedly getting errors of the form:
Command output:
      (empty)

    Command wrapper:
      .command.run: fork: Resource temporarily unavailable
      .command.run: fork: Resource temporarily unavailable
but they keep appearing in different processes so I can’t reproduce and diagnose them.
I’ve restarted my computer, but it hasn’t helped.
Any suggestions?
I’ve followed the advice here and set maxproc and maxprocperuid to 2048 each: http://blog.ghostinthemachines.com/2010/01/19/mac-os-x-fork-resource-temporarily-unavailable/
LukeGoodsell
@LukeGoodsell
Jun 21 2017 13:52
So it turns out that that link is obsolete; the instructions here worked for me: https://support.code42.com/CrashPlan/4/Troubleshooting/Backups_stall_due_to_too_many_open_files
Paolo Di Tommaso
@pditommaso
Jun 21 2017 14:23
I'm a bit lost here, it's a Mac issue
LukeGoodsell
@LukeGoodsell
Jun 21 2017 14:38
I have a nextflow script that spawns a lot of subprocesses - more than OSX 10.12 (Sierra) allows by default. After increasing the limits, it works.
The problem was that around 10.9, Apple changed how the process limits are set
Mike Smoot
@mes5k
Jun 21 2017 15:26

Hi @pditommaso I've got a pipeline that's hanging and I'm seeing this message in the logs:

Jun-21 15:20:42.842 [Thread-3] DEBUG n.processor.TaskPollingMonitor - !! executor slurm > tasks to be completed: 1 -- first: TaskHandler[jobId: 9722; id: 4867; name: blast_clusters_parse (189); status: SUBMITTED; exit: -; workDir: /mnt/efs/nextflow/run.8adf7107-915e-42f4-980e-30ff84782a4f/work/e4/4e10523b6b7b4b6840ce0f9ba5ebb5 started: -; exited: -;

Does that imply that the process blast_clusters_parse is waiting on its input channel? Looking in the work directory specified I see that it exists, the .command.sh and .command.run files exist, but the files from the input channel have not been symlinked in yet.

Félix C. Morency
@fmorency
Jun 21 2017 15:32
@mes5k is it in the slurm queue?
Mike Smoot
@mes5k
Jun 21 2017 15:33
no, slurm doesn't have it
Mike Smoot
@mes5k
Jun 21 2017 15:43
A little more info. The nextflow process is still running. Looking at my trace file I see that all of the processes preceding blast_clusters_parse (i.e. blast_clusters, which creates the input channel to blast_clusters_parse) completed successfully. Somehow it looks like the last output of blast_clusters didn't get added to its output channel.
FWIW, I've been seeing very similar behavior with this particular pipeline and the dataset I've been testing with, although in different locations in the pipeline. My sense is that occasionally a channel isn't getting populated correctly, which causes everything to eventually grind to a halt.
Félix C. Morency
@fmorency
Jun 21 2017 16:00
Maybe a bug in the pipeline itself?
Mike Smoot
@mes5k
Jun 21 2017 16:04
Very possibly, but the pipeline runs cleanly on smaller datasets and the behavior I see isn't consistently related to any one process in the pipeline. The pipeline also runs cleanly on this same dataset using the local executor and nextflow version 0.25.0-RC4 (the slurm cluster has 0.24.4).
Félix C. Morency
@fmorency
Jun 21 2017 16:06
Do you have issues with other pipelines on the slurm cluster?
Mike Smoot
@mes5k
Jun 21 2017 16:08
I haven't noticed anything like this, but I've only tried a couple other, simpler pipelines. I'll make an effort to try some other more complicated ones.
Mike Smoot
@mes5k
Jun 21 2017 16:16
Ooooh, more information: blast_clusters_parse with tag 189 failed once with a docker error in a different work directory and was restarted in work/e4/4e1052... seen above. So maybe this is related to retry?
Paolo Di Tommaso
@pditommaso
Jun 21 2017 16:29
Version 0.25.0-RC4 does not help here?
Mike Smoot
@mes5k
Jun 21 2017 16:31
Haven't tried yet since getting a new version of nextflow up on the cluster is a bit more involved than locally. I'll see what I can do.
I'm wondering if this is a problem specific to slurm and resubmission.
Paolo Di Tommaso
@pditommaso
Jun 21 2017 16:32
well, if re-submission fails there should be an error message
Mike Smoot
@mes5k
Jun 21 2017 16:36
Sorry, just started some meetings...
Paolo Di Tommaso
@pditommaso
Jun 21 2017 16:36
a bit busy here as well
Mike Smoot
@mes5k
Jun 21 2017 16:37
:)
Paolo Di Tommaso
@pditommaso
Jun 21 2017 16:37
The best place for these problems is always GitHub. I think you have already opened an issue for this, right?
Mike Smoot
@mes5k
Jun 21 2017 16:38
It seems slightly different from the two I'm paying attention to at the moment, but I can open a GitHub ticket if 0.25.0-RC4 doesn't fix things
Paolo Di Tommaso
@pditommaso
Jun 21 2017 16:39
:ok_hand: