These are chat archives for nextflow-io/nextflow

6th
Aug 2016
Sandeep Shantharam
@machbio
Aug 06 2016 07:09
@pditommaso you talk about nextflow supporting kubernetes here - https://groups.google.com/d/msg/nextflow/NHYylOjOM9s/C_15NjuvAgAJ - any documentation and examples of how to go about this ?
Sandeep Shantharam
@machbio
Aug 06 2016 07:29
found an easter egg - "nextflow node" command is amazing though
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:05
@machbio yep, there's a prototype in this branch
Sandeep Shantharam
@machbio
Aug 06 2016 08:06
so kubernetes will be integrated like ignite ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:06
it's mostly working but I have no resources to stress test it at scale; if you are willing to give it a try and report your experience, that would be great
no, kubernetes works much more like a batch scheduler
it doesn't require remote nextflow daemons
Sandeep Shantharam
@machbio
Aug 06 2016 08:09
I am not completely sure about the configuration parameters for kube, do you know which file I should be looking at ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:09
"nextflow node" (i.e. the ignite executor), on the other hand, works pretty well, though I'm working on a revamped scheduler right now
well, the configuration is
process.executor = 'kube'
in your nextflow.config file, that's all
:)
I like easy things .. you know
Sandeep Shantharam
@machbio
Aug 06 2016 08:11
I get it, I am still learning kube - so you basically just run the pipelines from the kube-master
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:12
yes
it's a bit boring to set up the kubernetes cluster though
well, actually it does not need to be the kubernetes master, the important thing is to run nextflow on a node where you can run the kubectl command
Sandeep Shantharam
@machbio
Aug 06 2016 08:13
they have come up with some good tools now - https://github.com/kubernetes/kops to set up a kubernetes cluster
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:14
I've seen this but didn't try it
which cloud are u using? AWS or GCE?
Sandeep Shantharam
@machbio
Aug 06 2016 08:16
I will try the kube branch and update you in a few days.. gitter is slow today
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:17
ok, let me know if u need help to compile it.
Sandeep Shantharam
@machbio
Aug 06 2016 08:18
I was a huge fan of Starcluster (AWS) - unfortunately the maintainer seems to have lost interest and I was trying to find an alternative.. Kubernetes support for Spot Instances is still not production ready.. so I would use AWS now..
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:19
same story here, which is why I'm working on a better scheduler for the nextflow Ignite executor
Sandeep Shantharam
@machbio
Aug 06 2016 08:22
sorry, I did not understand what you meant by a better scheduler for the ignite executor - why would nextflow handle the scheduling ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:24
because Ignite is a Java-oriented computing engine for fine-grained tasks, i.e. it doesn't provide a way to allocate memory, cpus, etc. as needed for large-grained jobs
Sandeep Shantharam
@machbio
Aug 06 2016 08:29
Ok.. I get it.. But is it really worth spending your precious time on Ignite ? I felt you were trying to position nextflow as a client side executor - and did not care about the server side executors - and it also seems like ignite has a special place, with 'nextflow node' using only apache ignite.. Is this how you see Ignite and Nextflow together ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:32
well, the idea with nextflow is to enable portability targeting multiple computing environments in a transparent manner
in some contexts the only way to do this is by deploying a remote daemon, and Ignite is perfect for this
for example if you want to run nextflow in an HPC cluster the ignite executor is the only way to do it
have a look at this post
Sandeep Shantharam
@machbio
Aug 06 2016 08:39
Now I understand, I am sorry - I got judgemental about your process.. I have not been exposed to use cases that Ignite is handling..
Paolo Di Tommaso
@pditommaso
Aug 06 2016 08:40
you are welcome
Sandeep Shantharam
@machbio
Aug 06 2016 09:03
so in the context of 'nextflow node' - is the node a client node or a server node when nextflow is run in daemon mode ? and also, if I run nextflow node on all of my SGE cluster nodes - how does the SGE allocation compete with the Nextflow Ignite allocation ? or are they separate, so that the resource allocations for jobs that run on SGE and on Nextflow overlap ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 09:04
nextflow node launches a daemon that will carry out the process executions
you can run it with or w/o SGE. When using it, SGE is only used to allocate the nodes that you will use for your pipeline execution, then the individual tasks are managed by the ignite/nextflow scheduler
Sandeep Shantharam
@machbio
Aug 06 2016 09:14
Ok, I got it.. so even though kube and ignite are treated as executors - I can start a kube cluster and start ignite on all the nodes and have both executors available at my disposal ?
But there can be resource allocation problems - if I run both the executors at the same time.
Paolo Di Tommaso
@pditommaso
Aug 06 2016 09:15
um, kube + ignite becomes more complex but it should be possible
let's put it this way, nextflow can run over a kube cluster without deploying a node daemon (i.e. ignite)
that may be necessary only if you need to run many (> millions) fine-grained tasks
in that case kube should be used to spin up the nextflow/ignite cluster, somewhat similar to what is done when running nextflow/ignite over SGE/MPI
Sandeep Shantharam
@machbio
Aug 06 2016 09:41
that's very helpful - since nextflow allows process level executor definition - I was thinking about use cases for processes that might use ignite and others using kube.. but it's evident that there will be conflicts if both are used together
so coming back to the kube branch - nextflow-io/nextflow@8792186 - only cpu and mem definitions are allowed, right ?
Paolo Di Tommaso
@pditommaso
Aug 06 2016 09:43
um yes, but kubernetes doesn't allow much more
(need to leave now)
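For reference, the settings discussed in this exchange could be put together in a nextflow.config along these lines. This is only a sketch: the executor name 'kube' is the one quoted earlier in this chat for the prototype branch (released Nextflow versions may use a different name), and the cpus and memory values are made-up placeholders, not recommendations.

```groovy
// nextflow.config (sketch; all values are illustrative assumptions)
process {
    // executor name as quoted in this chat for the prototype branch
    executor = 'kube'

    // the only resource requests discussed here: cpus and memory
    cpus = 2          // hypothetical value
    memory = '4 GB'   // hypothetical value
}
```

With defaults set in the process scope like this, every process in the pipeline would request the same resources unless it overrides them with its own directives.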
Sandeep Shantharam
@machbio
Aug 06 2016 09:45
Sure.. thanks will report back with any findings with Kube Cluster and Nextflow
Paolo Di Tommaso
@pditommaso
Aug 06 2016 15:20
@mes5k Sorry for the late reply. What you are proposing makes sense. It could be implemented with a pair of new splitYaml and splitJson operators, though I'm still not able to figure out how to define a custom splitting rule