Loving what I see about nextflow. We are looking for a platform that can scale to running 10,000s of workflows for pathogen genomics. Ideally we'd like to use something that is OpenStack-aware, since we have a very large OpenStack infrastructure. Can nextflow use Docker with OpenStack? Are there any examples of groups using this kind of setup?
@pditommaso we currently have a UGE cluster that connects to a Lustre shared parallel file system. However, with an eye to the future, we are building a large OpenStack virtual environment. We have already proved that we can build an OpenStack image for a compute node and add it to the UGE cluster. In this environment I guess we can add nodes manually on demand and run nextflow docker instances on either bare-metal or virtual nodes. I was wondering if anybody has used nextflow with a virtual environment such as OpenStack to instantiate an instance and run a nextflow workflow on it. As you say, it's all about scheduling. In a totally virtual environment there needs to be some process to manage instances. In your opinion, would instantiating a separate VM for each nextflow workflow be too costly?
@pditommaso would it be more efficient to have VMs running all the time and then use nextflow to submit docker jobs to these VMs? I've not used docker before and don't know how you would manage a queue of nextflow 'submissions' when only a finite infrastructure is available, e.g. 500 nextflow workflows to run but only enough resources to process 100 at a time. Any advice you can give would be greatly appreciated.
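To make the queueing question concrete, here is a rough sketch of the constraint I mean, in plain Python rather than anything Nextflow or UGE provides. `run_workflow` is a hypothetical placeholder for whatever would actually launch one workflow, and the figures of 500 and 100 are just the example numbers above. It only illustrates the idea of holding excess submissions in a queue until capacity frees up; it is not a suggestion of how nextflow does it.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_workflow(workflow_id):
    """Hypothetical placeholder: a real version would launch one workflow run."""
    time.sleep(0.001)  # stand-in for the actual work
    return workflow_id


def run_bounded(workflow_ids, max_concurrent):
    """Run every workflow, never more than max_concurrent at once.

    Returns (results in submission order, peak number running together).
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def tracked(workflow_id):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            return run_workflow(workflow_id)
        finally:
            with lock:
                running -= 1

    # The pool keeps excess submissions in an internal queue
    # until one of the max_concurrent workers becomes free.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        results = list(pool.map(tracked, workflow_ids))
    return results, peak


if __name__ == "__main__":
    # Assumed example numbers: 500 submissions, capacity for 100 at a time.
    results, peak = run_bounded(range(500), max_concurrent=100)
    print(len(results), "workflows finished; peak concurrency:", peak)
```

In practice I would expect a batch scheduler such as UGE to play the role of the pool here, which is really what I am asking about.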
Thanks for the info. Will have a good read of the white paper. We're looking for something that will help address our future software and infrastructure architecture needs. Nextflow looks like it may solve some of our future use cases.