These are chat archives for nextflow-io/nextflow

28th
Jul 2017
Sergey Venev
@sergpolly
Jul 28 2017 04:57
Morning @pditommaso, I think I understand the issue a bit better now, see nextflow-io/nextflow#412
Same thing in a real pipeline:
Jul-28 00:49:08.853 [Task monitor] TRACE n.executor.AbstractGridExecutor - Queue status map does not contain jobId: `4728323`
Jul-28 00:49:13.838 [Task monitor] TRACE n.executor.AbstractGridExecutor - Queue status:
  job: 4727744: RUNNING

Jul-28 00:49:13.838 [Task monitor] TRACE n.executor.AbstractGridExecutor - JobId `4727744` active status: true
Jul-28 00:49:13.854 [Task monitor] TRACE n.executor.AbstractGridExecutor - Queue status:
  job: 4727744: RUNNING

Jul-28 00:49:13.854 [Task monitor] TRACE n.executor.AbstractGridExecutor - Queue status map does not contain jobId: `4728323`
Jul-28 00:49:13.856 [Task monitor] DEBUG nextflow.executor.GridTaskHandler - Failed to get exist status for process TaskHandler[jobId: 4728323; id: 151; name: merge_pairsam_into_runs (library:HeLa1 run:lane1); status: RUNNING; exit: -; error: -; workDir: /farline/umw_job_dekker/HPCC/sv49w/distiller-nf/work/81/e67006b79d110316327d1d4be71a71 started: 1501216878802; exited: -; ] -- exitStatusReadTimeoutMillis: 270000; delta: 274965
Current queue status:
>   job: 4727744: RUNNING

Content of workDir: /farline/umw_job_dekker/HPCC/sv49w/distiller-nf/work/81/e67006b79d110316327d1d4be71a71
> total 133
> drwxrwxr-x 2 sv49w umw_job_dekker  183 Jul 28 00:41 .
> drwxrwxr-x 3 sv49w umw_job_dekker   48 Jul 28 00:40 ..
> -rw-rw-r-- 1 sv49w umw_job_dekker    0 Jul 28 00:41 .command.begin
> -rw-rw-r-- 1 sv49w umw_job_dekker   21 Jul 28 00:40 .command.env
> -rw-rw-r-- 1 sv49w umw_job_dekker   43 Jul 28 00:41 .command.log
> -rw-rw-r-- 1 sv49w umw_job_dekker 3486 Jul 28 00:40 .command.run
> -rw-rw-r-- 1 sv49w umw_job_dekker 2672 Jul 28 00:40 .command.run.1
> -rw-rw-r-- 1 sv49w umw_job_dekker  232 Jul 28 00:40 .command.sh

Jul-28 00:49:13.856 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 4728323; id: 151; name: merge_pairsam_into_runs (library:HeLa1 run:lane1); status: COMPLETED; exit: -; error: -; workDir: /farline/umw_job_dekker/HPCC/sv49w/distiller-nf/work/81/e67006b79d110316327d1d4be71a71 started: 1501216878802; exited: -; ]
Jul-28 00:49:13.858 [Task monitor] WARN  nextflow.processor.TaskProcessor - Process `merge_pairsam_into_runs (library:HeLa1 run:lane1)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (2)
Jul-28 00:49:14.060 [Task submitter] DEBUG nextflow.executor.GridTaskHandler - Submitted process merge_pairsam_into_runs (library:HeLa1 run:lane1) > lsf jobId: 4728325; workDir: /farline/umw_job_dekker/HPCC/sv49w/distiller-nf/work/3b/37ed2fa2ba160f54196e762e24c83b
Jul-28 00:49:14.060 [Task submitter] INFO  nextflow.Session - [3b/37ed2f] Re-submitted process > merge_pairsam_into_runs (library:HeLa1 run:lane1)
Jul-28 00:49:18.839 [Task monitor] TRACE n.executor.AbstractGridExecutor - Queue status:
  job: 4727744: RUNNING
Job 4728323 is not visible to nextflow because it is a job from the long queue, and the queue status above only lists 4727744.
So when nextflow checks the exit status of 4728323, it cannot find one, because the job is still running.
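A rough way to picture the failure mode described above is the toy Python model below. It is a sketch under stated assumptions, not Nextflow's actual code: the function names, the `short` queue name, and the way the "missing" time is measured are all hypothetical. Only the job IDs, the `long` queue, and the 270000 ms timeout come from the log.

```python
# Toy model of the behaviour discussed above; NOT Nextflow's implementation.
# poll_queue, classify_job and the "short" queue name are hypothetical.

EXIT_READ_TIMEOUT_MS = 270_000  # exitStatusReadTimeoutMillis from the log

# Jobs as the scheduler knows them: job id -> (queue, state)
SCHEDULER_JOBS = {
    "4727744": ("short", "RUNNING"),  # queue name assumed for illustration
    "4728323": ("long", "RUNNING"),
}


def poll_queue(queues):
    """Return a status map that only covers jobs in the given queues."""
    return {
        job_id: state
        for job_id, (queue, state) in SCHEDULER_JOBS.items()
        if queue in queues
    }


def classify_job(job_id, status_map, has_exit_file, missing_for_ms):
    """Decide what the monitor believes about a job it submitted."""
    if has_exit_file:
        return "COMPLETED"
    if status_map.get(job_id) == "RUNNING":
        return "RUNNING"
    # No exit file and not in the status map: give up after the timeout.
    if missing_for_ms > EXIT_READ_TIMEOUT_MS:
        return "ASSUMED_TERMINATED"
    return "WAITING"


# Polling only one queue: the long-queue job is wrongly given up on.
print(classify_job("4728323", poll_queue({"short"}), False, 274_965))
# Polling both queues: the same job is correctly seen as running.
print(classify_job("4728323", poll_queue({"short", "long"}), False, 274_965))
```

In this model the first call prints `ASSUMED_TERMINATED` and the second prints `RUNNING`, which mirrors the retry seen in the log once the status map stops covering the queue the job was submitted to.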
Sergey Venev
@sergpolly
Jul 28 2017 05:05
eventually nextflow starts checking the long queue as well (or instead), so the third re-submission runs OK
So, I'm wondering whether nextflow handles such mixed-queue pipelines properly?
Sergey Venev
@sergpolly
Jul 28 2017 05:34
Here is the part of the log file starting 10 minutes before the error, which occurs around 00:49 (log time):
https://pastebin.com/krv21au3
Sergey Venev
@sergpolly
Jul 28 2017 05:44
will get some sleep - will check back sometime in the afternoon (Spanish time)
Simone Baffelli
@baffelli
Jul 28 2017 06:52
Good morning. Is there a recommended naming convention for channels and processes in nextflow scripts?
I'm starting to have too many channels and variables
Paolo Di Tommaso
@pditommaso
Jul 28 2017 09:29
I tend to name channels something like foo_ch
Oskar Vidarsson
@oskarvid
Jul 28 2017 10:01
I have to say, coming from WDL to NF, setting up scatter-gather processing is an absolute pain
the pairing of files that go together between the steps is done automatically in WDL
doing that manually, while keeping track of all the details that affect the different scenarios, is tedious in comparison, since with WDL you basically don't have to think about it
Paolo Di Tommaso
@pditommaso
Jul 28 2017 10:04
I would like to see both of them, do you have them somewhere?
Oskar Vidarsson
@oskarvid
Jul 28 2017 10:05
but NF is more feature complete when it comes to managing resource usage, and it's also possible to run a loop easily, while WDL will spawn as many processes as the universal config file has defined
both of my scripts?
Paolo Di Tommaso
@pditommaso
Jul 28 2017 10:06
just in case they are public
Oskar Vidarsson
@oskarvid
Jul 28 2017 10:07
the NF script obviously isn't functional but I'll upload it, and I'll upload the latest WDL script too, give me a few minutes
Paolo Di Tommaso
@pditommaso
Jul 28 2017 10:07
no hurry
Oskar Vidarsson
@oskarvid
Jul 28 2017 10:54
here's the wdl pipeline, it should work with a few edits to fit your machine https://github.com/oskarvid/wdl_pipeline/tree/temp
here's the NF pipeline, use bwamem.nf for the full version, or just baserecal to start at that step. https://github.com/oskarvid/nextflow-GermlineVarCall
Sergey Venev
@sergpolly
Jul 28 2017 11:45
Morning @pditommaso , did you get a chance to look at the second log file (bigger log file)?
around 00:49 time point
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:09
Morning, I've uploaded a new patch, please see the last comment on #412
Tobias Neumann
@t-neumann
Jul 28 2017 12:11
I have not tried it yet, but maybe this can be answered up front: is it possible to specify an entire folder as output file in nextflow?
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:11
yes
Tobias Neumann
@t-neumann
Jul 28 2017 12:12
speedy and crisp reply :) thanks
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:12
;)
these are the best answers, aren't they?
Tobias Neumann
@t-neumann
Jul 28 2017 12:13
yeah - I wish most scientific answers could be as simple ;)
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:13
ahah
@oskarvid thanks for sharing, I agree that NF can be a bit more tricky, especially at the beginning, due to its dataflow-oriented programming model, but once you get it, it's very flexible and powerful
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:18
anyhow I agree that the pairing should be simplified a bit
Félix C. Morency
@fmorency
Jul 28 2017 12:28
Morning!
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:29
almost time to leave here ;)
Félix C. Morency
@fmorency
Jul 28 2017 12:31
I thought you were never leaving the nextflow gitter haven
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:32
that's the perfect world :satisfied:
Sergey Venev
@sergpolly
Jul 28 2017 12:34
thank you @pditommaso, I'll relaunch and we'll see
Paolo Di Tommaso
@pditommaso
Jul 28 2017 12:35
:ok_hand:
Simone Baffelli
@baffelli
Jul 28 2017 13:50
@pditommaso :+1:
Michael Halagan
@mhalagan-nmdp
Jul 28 2017 18:22

I'm trying to run a job across a nextflow cloud cluster and it doesn't seem to be utilizing the worker nodes. The master node can access the nodes, because it's modifying the .node-nextflow.log file but it's not running anything on them. I copied the nextflow AMI (ami-43f49030) from the EU Ireland region to us-east-1b and am using that. Here's the tail of the .node-nextflow.log file that keeps repeating:

Jul-28 18:04:58.772 [scheduler-agent] DEBUG nextflow.scheduler.SchedulerAgent - === Waiting for master node to join..
Jul-28 18:09:58.632 [grid-timeout-worker-#33%nextflow%] INFO  o.a.i.internal.IgniteKernal%nextflow - 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=238c40c8, name=nextflow, uptime=00:30:00:050]
    ^-- H/N/C [hosts=1, nodes=1, CPUs=4]
    ^-- CPU [cur=0%, avg=0.11%, GC=0%]
    ^-- Heap [used=225MB, free=93.69%, comm=429MB]
    ^-- Non heap [used=44MB, free=79.42%, comm=44MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
Jul-28 18:09:58.784 [scheduler-agent] DEBUG nextflow.scheduler.SchedulerAgent - === Waiting for master node to join..

Any thoughts on what might be going on?