These are chat archives for nextflow-io/nextflow

10th
Jul 2018
Shawn Rynearson
@srynobio
Jul 10 2018 02:48
Is there a configuration or way to impose a runtime limit on a process?
Radoslaw Suchecki
@bioinforad_twitter
Jul 10 2018 04:25
time directive @srynobio
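For reference, a minimal sketch of the time directive mentioned above. The process name, the one-hour limit and the command are hypothetical, chosen only to show where the directive goes.

```groovy
// Hypothetical process: name, limit and command are illustrative only.
process longTask {
    // Maximum run time allowed for each task of this process.
    time '1h'

    """
    sleep 10
    """
}
```

The same limit can also be set for all processes in nextflow.config with process.time = '1h'.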
Shellfishgene
@Shellfishgene
Jul 10 2018 09:26
Using collect on an input channel like file '*.fasta' from fasta.collect() will rename the fasta files with numbers, i.e. foo.fasta bar.fasta will become 1.fasta 2.fasta. Is there a way to prevent this renaming?
Luca Cozzuto
@lucacozzuto
Jul 10 2018 09:29
use '*'
file '*' from fasta.collect()
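A minimal sketch of that suggestion in context, using the channel syntax of the Nextflow version discussed in this chat (newer releases use a different, DSL2 syntax). The input path and process name are assumptions made for illustration.

```groovy
// Hypothetical input location; any channel of FASTA files would do.
fasta = Channel.fromPath('data/*.fasta')

process listFasta {
    input:
    // '*' as the name pattern, as suggested above, instead of '*.fasta'.
    file '*' from fasta.collect()

    """
    ls *.fasta
    """
}
```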
Shellfishgene
@Shellfishgene
Jul 10 2018 09:35
@lucacozzuto Thanks!
Alan B. Christie
@alanbchristie
Jul 10 2018 10:37
Hi, I'm past the hello world example on my AWS ECS instance. I've created EFS storage and set up everything according to https://www.nextflow.io/blog/2016/deploy-in-the-cloud-at-snap-of-a-finger.html. The nextflow cloud create appears to be successful ... except that the nextflow file on the master is zero length and the cloud nodes appear to be dysfunctional. I wondered if anyone can point out my mistake? I'm using my own AMI (with docker, cloud-init, nfs-utils, git etc.) but I get the same results if I use the AMI in the aforementioned guide. It's all orchestrated with Packer, Terraform and Ansible. Maybe it's a security group issue (although the launch/bastion node has no problems accessing the outside world)? My config is:
cloud {
    imageId = 'ami-02ffefe8'
    userName = 'ec2-user'
    instanceType = 'c5.18xlarge'
    subnetId = 'subnet-a1bc02e9'
    securityGroup = 'sg-8b8141f7'
    bootStorageSize = '8GB'

    sharedStorageId = 'fs-ae403567'
    sharedStorageMount = '/mnt/efs'

    keyName = 'abc-im'
    keyFile = '~/.ssh/abc-im.pem'

    spotPrice = '1.5'
}
My launch/bastion NF is 0.30.2 build 4867
Paolo Di Tommaso
@pditommaso
Jul 10 2018 10:44
make sure you have opened all ports in the nodes belonging to the security context you are using
Alan B. Christie
@alanbchristie
Jul 10 2018 10:46
OK. I believe I tried that (ingress/all/from-anywhere) but I'll try again on a new cluster.
Alan B. Christie
@alanbchristie
Jul 10 2018 13:27
I think it was the subnet's Auto-assign public IPv4 address (which was not set - thanks Tim). My orchestrator does this on a node-by-node basis but I overlooked the subnet setting. But the installation was basically dead - could the setup problem be detected and reported rather than saying ready? That might save a lot of user time.
micans
@micans
Jul 10 2018 13:28
I'm using this line: env.PATH = "$baseDir/utils:$PATH", which leads me to wonder; is there a class of variables such as baseDir and why is it $PATH on the right, rather than ${env.PATH}? Is PATH what comes into Nextflow, and is env.PATH what it exports? More generally is there a resource (or course) describing the Nextflow execution model? I assume it parses the main NF file, builds the graph of channels and processes, loads config files. It's possible to stick process definitions in conditional branches and change the process DAG that way. Perhaps I'm looking for something more in depth than the documentation and less in depth than the source code. But I appreciate the fact that NF is a great product and time is precious ... this is just a data point, not that important. And I may have missed the salient part (although I do use the NF documentation a lot).
Alan B. Christie
@alanbchristie
Jul 10 2018 13:29
My next problem is that the EFS volume is not mounted.
There's a mount point but clearly no data. Am I missing something from my config?
sharedStorageId and sharedStorageMount are set; they're available on the launch/bastion node but not in the NF cluster.
Alan B. Christie
@alanbchristie
Jul 10 2018 14:45
My config is as above, with shared/EFS volumes defined as documented. Nextflow creates a cluster without error, the mount directory is present but my EFS volume is not there. It's available on my other nodes. Could the fact it's mounted elsewhere be a conflict? What is the Nextflow mount command? I've successfully mounted the EFS volume using examples from the Mount the Amazon EFS file system section on https://docs.aws.amazon.com/efs/latest/ug/wt1-test.html#wt1-mount-fs-and-test
Paolo Di Tommaso
@pditommaso
Jul 10 2018 14:45
@micans try to summarize your question in no more than two lines, please :smile:
@alanbchristie are you using your own AMI or the one provided in the blog post?
micans
@micans
Jul 10 2018 14:47
Please open your brain .... :-)
Alan B. Christie
@alanbchristie
Jul 10 2018 14:48
@pditommaso My own AMI based on Amazon Linux 2018.03.0 HVM with...
  - sudo yum update -y
  # Cloud-Init and Docker
  # and nfs-utils for EFS mounting and git...
  - sudo yum install -y cloud-init docker nfs-utils git
  - sudo service docker start
  - sudo usermod -a -G docker ec2-user
  # Now move to Java 8...
  - sudo yum install -y java-1.8.0
  - sudo yum remove -y java-1.7.0-openjdk
  # And install Nextflow...
  - sudo wget -qO- https://get.nextflow.io | bash
Paolo Di Tommaso
@pditommaso
Jul 10 2018 14:49
I would suggest you first make a test with the one specified in the blog post; if that is OK, focus on why yours is not working
Alan B. Christie
@alanbchristie
Jul 10 2018 14:52
Understand. Out of curiosity, do you have the instruction steps for your base AMI?
And the mount command that Nextflow would use?
Paolo Di Tommaso
@pditommaso
Jul 10 2018 14:53
Out of curiosity, do you have the instruction steps for your base AMI
Just java + docker engine and nfs tools
Alan B. Christie
@alanbchristie
Jul 10 2018 14:56
Thanks. I do like just java ;-) It's not as if there's more than one version is there? :-) But thanks. That's helpful.
Alan B. Christie
@alanbchristie
Jul 10 2018 15:02
In case it's of interest, and if you're not already automating/scripting your AMI builds, then https://www.packer.io is a neat IaC tool.
Paolo Di Tommaso
@pditommaso
Jul 10 2018 15:07
the goal of NF is to not require AMI customisation, other than a basic toolset, i.e. Java + docker
Alan B. Christie
@alanbchristie
Jul 10 2018 15:11
Yep - but you need an AMI and I guess yours is not off the shelf so it is useful to have a machine record of precisely how it was built. Packer is a tool for building images. I just pointed out the tool.
Francesco Strozzi
@fstrozzi
Jul 10 2018 17:49
@alanbchristie packer looks very interesting, I'll give it a try with our AMIs. Thanks for the heads up
Shawn Rynearson
@srynobio
Jul 10 2018 17:50
@bioinforad_twitter yes! I knew it was available, but I couldn't remember the name of it to save my life.
Alex Cerjanic
@acerj_twitter
Jul 10 2018 18:26

Hi, I'm trying to set up some neuroimaging (specifically iterative MRI reconstructions) pipelines in Nextflow, which has gone pretty well so far. I love the ability to run steps in various environments (in a Docker container, or in the host environment with environment modules).

However, I ran into a problem where Docker doesn't like our NFS share security (root squash, and a horribly complicated group GUID scheme, which is not Docker's or Nextflow's fault). For us the easiest way to get around this would be to make the work directory on a local disk (like /tmp or a local scratch disk). I can't find a way in the documentation to specify that via nextflow.config or something similar. Am I just missing something obvious? I've seen there is a $NXF_WORK environment variable, but is that just to find the work directory or can I specify it to make Nextflow use that as the work directory?
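For reference, a minimal sketch of one way to place the work directory on a local disk, using the workDir setting in nextflow.config. The path is hypothetical, and this assumes every task can reach that disk (for example, all tasks run on the same host).

```groovy
// nextflow.config
// Hypothetical local path; pick a disk that all tasks can access.
workDir = '/tmp/nextflow-work'
```

The same location can also be given on the command line with the -w (-work-dir) option of nextflow run, or through the NXF_WORK environment variable.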

Alan B. Christie
@alanbchristie
Jul 10 2018 19:05
@fstrozzi Glad to offer something. Sadly Packer's file format is JSON. Sorry, but any file that is supposed to be read by a human that doesn't accommodate comments is pretty poor in my opinion. So I author the code in YAML and then just use a python module that does the conversion (which is sys.stdout.write(json.dumps(yaml.safe_load(sys.stdin), indent=2))).
Francesco Strozzi
@fstrozzi
Jul 10 2018 20:16
@alanbchristie I can live with that (JSON) :smile:
Alan B. Christie
@alanbchristie
Jul 10 2018 22:09
@fstrozzi :-O