These are chat archives for nextflow-io/nextflow

3rd
Jun 2016
Anthony Underwood
@aunderwo
Jun 03 2016 10:36
@pditommaso how easy would it be to implement a new executor? Since cloudK is no longer a platform - would it be possible to have an executor that spins up a VM on EC2?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:09
@aunderwo Actually that is the idea on EC2. I think with a proper configuration of AWS autoscaling service it should be already possible.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:15
Alternatively, @jbyars suggested using Nextflow along with CfnCluster, which installs an SGE cluster manager and integrates with the AWS stack to spin up instances on demand
never tried personally though
Anthony Underwood
@aunderwo
Jun 03 2016 11:17
Thanks. How do you run Nextflow on EC2 currently? Would you have an instance already instantiated?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:18
We are using these scripts
Basically what they do is:
1) launch a bunch of EC2 instances
2) using boot-init, download, install and launch a nextflow daemon
3) download the required docker image(s) for the pipeline deps
4) download the nextflow pipeline project from a git repository
Anthony Underwood
@aunderwo
Jun 03 2016 11:22
Great - thanks looks good. Might also be able to adapt that to run in our openstack env
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:22
5) nextflow, using the embedded Ignite engine, sets up a cluster that will run the jobs
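The steps listed above could be sketched as a boot script along these lines. This is only an illustration, not the scripts referred to in the conversation: the image name `example/pipeline-deps`, the repository `example/pipeline`, the `run` helper and the `DRY_RUN` switch are all hypothetical. Step 1 (launching the instances) and the cluster discovery settings for step 5 are left out. By default the script only prints the commands it would execute.

```shell
# Hypothetical values; replace with your own image and pipeline repository.
DOCKER_IMAGE="${DOCKER_IMAGE:-example/pipeline-deps}"
PIPELINE_REPO="${PIPELINE_REPO:-example/pipeline}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bootstrap_node() {
    # Step 2: install Nextflow into the current directory and start a daemon
    run sh -c 'curl -fsSL https://get.nextflow.io | bash'
    run ./nextflow node -bg
    # Step 3: fetch the Docker image holding the pipeline dependencies
    run docker pull "$DOCKER_IMAGE"
    # Step 4: fetch the pipeline project from its git repository
    run ./nextflow pull "$PIPELINE_REPO"
}

bootstrap_node
```

Setting `DRY_RUN=0` would execute the commands for real, which requires Java, Docker and network access on the instance.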
Jason Byars
@jbyars
Jun 03 2016 11:22
cfncluster+nextflow mostly works with the auto scaling on ec2. I've had no problems on a torque cluster on openstack
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:23
@jbyars great news
Jason Byars
@jbyars
Jun 03 2016 11:24
if you just do a static size cluster with cfncluster you'll avoid most of the oddities I'm working through.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:24
@aunderwo I think it's definitely possible. If you look, the scripts are really easy. You can create a version for OpenStack or even submit a pull request if you want
Anthony Underwood
@aunderwo
Jun 03 2016 11:25
@jbyars Sounds cool. So you've created a Torque cluster using OpenStack VMs rather than bare metal? Do you then use the Docker functionality of Nextflow in conjunction with the Torque scheduler to manage the tasks from Nextflow workflows?
Jason Byars
@jbyars
Jun 03 2016 11:26
yes, openstack + jenkins + ansible allows me to automate worker image builds and recreate the cluster more easily.
Anthony Underwood
@aunderwo
Jun 03 2016 11:27
Nice - and docker on top of that?
Jason Byars
@jbyars
Jun 03 2016 11:27
At present I don't use docker in production, but I want to move in that direction. Keep the worker image minimal, just enough to glue the cluster together and troubleshoot. All pipeline tools go in the docker containers.
Anthony Underwood
@aunderwo
Jun 03 2016 11:27
what is the role of jenkins? I've used that for CI before
Jason Byars
@jbyars
Jun 03 2016 11:28
housekeeping. Every integration imaginable is available in some form as a plugin
Anthony Underwood
@aunderwo
Jun 03 2016 11:29
OK - thanks for the tip
Jason Byars
@jbyars
Jun 03 2016 11:29
so image build, software builds, container builds, data management, pipeline queueing, you name it
Essentially, the features I find useful, but that I believe don't make sense as part of nextflow. Just a disclaimer: I'm working with IonTorrent sequencers, so the data is more of a continuous trickle for me, versus labs doing HiSeq or other big runs that logically look more like batch jobs. So I have a slightly different perspective.
Anthony Underwood
@aunderwo
Jun 03 2016 11:36
Got it. We process large batches of pathogen genomes coming off HiSeq 2500s. We are currently processing workflows as sequential code (qsub jobs) submitted to an SGE cluster that is mostly physical blades but can be scaled by adding virtual compute nodes using our OpenStack cloud. However, I like the features of Nextflow, including its automatic parallelisation and its ability to run on a range of infrastructures. Just trying to gauge the best way to implement Nextflow given the large OpenStack resource we have available.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:39
As far as I know, Univa provides an integration with OpenStack to auto-scale your cluster. Have you tried investigating in this direction?
Anthony Underwood
@aunderwo
Jun 03 2016 11:39
No. That would be worth exploring. Thanks for the info.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:39
That would allow you to use nextflow to submit jobs to an SGE queue, which would trigger new instances on demand
Anthony Underwood
@aunderwo
Jun 03 2016 11:40
That sounds excellent :)
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 11:49
Hi in here! I just signed up to gitter..
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:49
Welcome, I'm just starting to look at your issue
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 11:49
Regarding my last post at https://groups.google.com/forum/#!topic/nextflow/H8PYrtuYMe4
I have tried to write a concise reproducible example..
But so far I cannot reproduce the error!
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:50
Ah
Let me try
but this works
Paolo Di Tommaso
@pditommaso
Jun 03 2016 11:56
Can you provide the log of the failing one ?
as a side note, you don't need to declare a class to define a helper method, you can just write in the script:
    def whois(def number) {
        def subdir = ['one', 'two', 'three']
        return subdir[number-1]
    }
@huguesfontenelle Wait, in your comment I've noticed you wrote
publishDir "$cnv_storage_path/${Util.whois(sample_name, analysis)}", mode 'copy'
is that the real code? because the : is missing here, it should be mode: 'copy'
yes I guess that is the issue
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:01
(where should I send the log? is there attachment support in gitter)
well the class resides in another file, in the real code
easier for packaging
Paolo Di Tommaso
@pditommaso
Jun 03 2016 12:01
With pastebin.com for example, but I think I don't need it any more .. :)
> well the class resides in another file, in the real code
ok, makes sense
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:02
omg
:
will try that :)
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:12

so obv that one is solved, thanks!
In general, when I read a log like:

Jun-03 12:05:20.556 [Actor Thread 4] INFO  nextflow.processor.TaskProcessor - [skipping] Stored process > trio_mapping_parallel (2)
Jun-03 12:05:21.520 [Actor Thread 3] DEBUG nextflow.processor.TaskProcessor - <trio_mapping_parallel> Sending poison pills and terminating process
Jun-03 12:05:21.522 [main] ERROR nextflow.cli.Launcher - @unknown
java.lang.NullPointerException: Cannot get property 'type' on null object
    at org.codehaus.groovy.runtime.NullObject.getProperty(NullObject.java:60)
    at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:172)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:302)
    at nextflow.dag.MultipleOutputChannelException.getMessage(MultipleOutputChannelException.groovy:57)
    at java.lang.Throwable.getLocalizedMessage(Throwable.java:391)
    at java.lang.Throwable.toString(Throwable.java:480)
    at org.codehaus.groovy.runtime.InvokerHelper.format(InvokerHelper.java:634)
    at org.codehaus.groovy.runtime.InvokerHelper.format(InvokerHelper.java:575)
    at org.codehaus.groovy.runtime.InvokerHelper.toString(InvokerHelper.java:130)
    at org.codehaus.groovy.runtime.InvokerHelper.write(InvokerHelper.java:527)
    at groovy.lang.GString.writeTo(GString.java:183)
    at groovy.lang.GString.toString(GString.java:155)
    at org.codehaus.groovy.runtime.typehandling.ShortTypeHandling.castToString(ShortTypeHandling.java:45)
    at nextflow.Session.abort(Session.groovy:474)
    at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:161)
    at nextflow.cli.CmdRun.run(CmdRun.groovy:198)
    at nextflow.cli.Launcher.run(Launcher.groovy:385)
    at nextflow.cli.Launcher.main(Launcher.groovy:534)
Jun-03 12:05:21.524 [Actor Thread 2] DEBUG n.processor.ParallelTaskProcessor - <trio_variantcalling_parallel> Poison pill arrived
Jun-03 12:05:21.525 [Actor Thread 13] DEBUG n.processor.ParallelTaskProcessor - <cnv_parallel> Poison pill arrived

how do I know where in the script, or in which variable, that null object is?

Paolo Di Tommaso
@pditommaso
Jun 03 2016 12:15
actually this is a bug I'm aware of
here the problem is that there's a conflict in some output channel declaration in your pipeline
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:16
the fact that I don't know where the NullPointerException occurred?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 12:17
this is supposed to be the validation feature that I was mentioning in my presentation, but it turns out there's a small problem
you can either roll back to a version prior to 0.19.x
or use the 0.19.4-SNAPSHOT
in both cases you can do that by declaring the variable NXF_VER in your environment, specifying the version you want to use
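As a sketch of the NXF_VER approach: the variable can be exported for the whole session, or set for a single command only. The `run_pinned` helper and the script name `main.nf` below are hypothetical names used for illustration, and the demo line uses `sh -c` instead of `nextflow` so that it runs anywhere.

```shell
# Run one command with NXF_VER set only for that command.
# Usage: run_pinned <version> <command> [args...]
run_pinned() {
    version="$1"
    shift
    NXF_VER="$version" "$@"
}

# Demo that does not need Nextflow: show what the child process sees.
run_pinned 0.18.3 sh -c 'echo "NXF_VER=$NXF_VER"'

# With Nextflow installed (main.nf is a placeholder script name):
#   run_pinned 0.18.3 nextflow run main.nf
```

Setting the variable per command keeps the rest of the session on the default version, which can be handy when comparing a rollback against the snapshot release.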
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:21
ok
Hugues Fontenelle
@huguesfontenelle
Jun 03 2016 12:29
Another issue: I have a process failing (the bioinformatics tool did not get the right arguments) and the pipeline stops, telling me which dir to cd into. Fine. Except that some other programs launched by parallel processes keep running ..
(here version 0.17.1)
Rickard Hammarén
@Hammarn
Jun 03 2016 13:20
Hi! I'm having the same issue as Hugues above and was wondering how to roll back. Where can I get hold of an older version of nextflow?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 13:54
Just declare the following variable in your env
export NXF_VER=0.18.3
However I'm uploading a new release fixing that issue, it should be available in half an hour
I will let you know
> Another issue: I have a process failing (the bioinformatics tool did not get the right arguments) and the pipeline stops telling me in which dir to cd. Fine. Except that some other programs launched by parallel processes keep running ..
@huguesfontenelle Could you share the log file
Paolo Di Tommaso
@pditommaso
Jun 03 2016 17:39
Late Friday minor update just released
Mike Smoot
@mes5k
Jun 03 2016 21:00
I'm trying to get a couple servers running ignite and to get nextflow distributing tasks to them, but I'm not having any luck. I'm following this: http://www.nextflow.io/docs/latest/ignite.html When I run nextflow node -bg on a worker server and then execute my pipeline with -process.executor ignite I see that the pipeline does actually execute with ignite, but I don't see any activity on the worker node. The nextflow.log on the head node seems to indicate that it sees the worker. I've tried this with both multicast (which is working according to the iperf test) and shared filesystem. Any ideas on how I can debug this?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:12
is it a cloud or a local cluster ?
Mike Smoot
@mes5k
Jun 03 2016 21:14
All local machines
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:15
in the nextflow log you should see the ignite info reporting the nodes, cpus etc
Mike Smoot
@mes5k
Jun 03 2016 21:16
Looking at the logs on the head node I see it reporting its own cpus, but I don't see anything reported for the worker.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:17
something like this
Apr-26 12:16:56.681 [grid-timeout-worker-#65%nextflow%] INFO  o.a.i.internal.IgniteKernal%nextflow - 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=2b52cc19, name=nextflow]
    ^-- H/N/C [hosts=5, nodes=5, CPUs=80]
    ^-- CPU [cur=0.13%, avg=0.67%, GC=0%]
    ^-- Heap [used=650MB, free=91.06%, comm=1549MB]
    ^-- Public thread pool [active=2, idle=30, qSize=0]
    ^-- System thread pool [active=0, idle=32, qSize=0]
    ^-- Outbound messages queue [size=0]
the fourth row [hosts=5, nodes=5, CPUs=80]
do you see that in your log?
Mike Smoot
@mes5k
Jun 03 2016 21:19

This is what I see for my worker:

Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=646bd35f, name=nextflow]
    ^-- H/N/C [hosts=1, nodes=1, CPUs=8]
    ^-- CPU [cur=0.1%, avg=0.11%, GC=0%]
    ^-- Heap [used=180MB, free=98.72%, comm=723MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]

and this is what I've got for my head node:

Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=bb8687cd, name=nextflow]
    ^-- H/N/C [hosts=1, nodes=1, CPUs=24]
    ^-- CPU [cur=0.07%, avg=0.12%, GC=0%]
    ^-- Heap [used=550MB, free=97.98%, comm=2176MB]
    ^-- Public thread pool [active=5, idle=43, qSize=0]
    ^-- System thread pool [active=0, idle=48, qSize=0]
    ^-- Outbound messages queue [size=0]
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:20
it looks like they do not discover each other
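One quick way to check discovery is to pull the node count out of the latest `H/N/C` metrics line in the log. A minimal sketch, assuming the metrics format shown above; `ignite_node_count` is a hypothetical helper name, and the sample file is built from the head node output pasted in this conversation.

```shell
# Print the "nodes=" value from the most recent Ignite H/N/C metrics line.
# Usage: ignite_node_count <log-file>
ignite_node_count() {
    grep 'H/N/C' "$1" | tail -n 1 | sed -n 's/.*nodes=\([0-9][0-9]*\).*/\1/p'
}

# Demo on a sample copied from the head node log above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=bb8687cd, name=nextflow]
    ^-- H/N/C [hosts=1, nodes=1, CPUs=24]
EOF
ignite_node_count "$sample"   # prints 1: the head node only sees itself
rm -f "$sample"
```

On a real run it could be pointed at the Nextflow log file, for example `ignite_node_count .nextflow.log`. A value equal to the number of machines started would suggest that discovery worked.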
Mike Smoot
@mes5k
Jun 03 2016 21:20
Agreed.
When I tried the shared filesystem approach, I could see that files were added by both servers.
Added to the shared directory, that is
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:21
I would need to see the log
Mike Smoot
@mes5k
Jun 03 2016 21:24
Can I attach it here somehow?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:24
pastebin.com is great
Mike Smoot
@mes5k
Jun 03 2016 21:28
Here's the worker http://pastebin.com/Pm9QEA1a
And here's the executor: http://pastebin.com/tZJgBmRd
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:30
do you have a docker daemon installed in these machines?
Mike Smoot
@mes5k
Jun 03 2016 21:30
Yes
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:31
it messes things up
because it installs a bridge interface
Mike Smoot
@mes5k
Jun 03 2016 21:31
Ok, any way around this?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:32
you should specify which interface to use
you can see with ifconfig
Mike Smoot
@mes5k
Jun 03 2016 21:32
presumably not docker0
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:33
exactly
I guess eth0 or eth1
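To narrow down the choice, the interface list can be filtered to drop loopback and Docker-created devices. A minimal sketch: `candidate_interfaces` is a hypothetical helper, the name patterns (`docker*`, `veth*`, `br-*`) follow the usual Docker naming on Linux and may not cover every setup, and the demo uses a made-up interface list.

```shell
# Read interface names on stdin, one per line, and drop loopback
# and Docker-created devices.
candidate_interfaces() {
    grep -v -e '^lo$' -e '^docker' -e '^veth' -e '^br-'
}

# Demo with a made-up list of interface names.
printf '%s\n' lo docker0 veth1a2b3c eth0 eth1 | candidate_interfaces

# On a Linux host the real list could come from sysfs:
#   ls /sys/class/net | candidate_interfaces
```

The chosen name could then be passed to the daemon, for example `nextflow node -bg -cluster.interface eth0`, assuming the `cluster.interface` option described in the Nextflow Ignite documentation is available in your version.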
Mike Smoot
@mes5k
Jun 03 2016 21:34
ok, perfect, let me try that
Mike Smoot
@mes5k
Jun 03 2016 21:41
Still no luck, but then I see 18 different network interfaces on these machines, so it's possible I've chosen the wrong one. Perhaps it'll be easier if I spin up a few nodes in AWS to experiment?
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:42
!
I guess so
on AWS you will need to use an S3 bucket to share the IPs of the nodes
Mike Smoot
@mes5k
Jun 03 2016 21:43
Yup, I understand. I think I'll talk to our network admins to see if they have any ideas. Thanks for your help!
Paolo Di Tommaso
@pditommaso
Jun 03 2016 21:43
you can always refer to the command I linked before
great
Mike Smoot
@mes5k
Jun 03 2016 22:00
Turns out it was a firewall rule, because who doesn't love firewalls between internal servers? Sigh. Everything working as expected!
Paolo Di Tommaso
@pditommaso
Jun 03 2016 22:01
ah!
:)
let me know how it works
there is a lot of room for improvement here
Mike Smoot
@mes5k
Jun 03 2016 22:04
Will do. Today was just a first experiment distributing work, so I'm glad I got this much working.
Paolo Di Tommaso
@pditommaso
Jun 03 2016 22:05
:+1: