These are chat archives for nextflow-io/nextflow

30th Jul 2018
Paolo Di Tommaso
@pditommaso
Jul 30 2018 10:17
@apeltzer oops, I was missing this. I guess you have solved with -resume
Anthony Underwood
@aunderwo
Jul 30 2018 11:39
This is probably a dumb question - any reason why all my tasks are failing with
Command error:
  /bin/bash: .command.sh: Permission denied
Here's the permissions for the shell
ls -l  /root/work/99/e92a72226c4e494b230345f59a8ffb/.command.sh
-rw-r--r-- 1 root root 81 Jul 30 11:34 /root/work/99/e92a72226c4e494b230345f59a8ffb/.command.sh
This is on a digital ocean droplet where the default user is root.
The same Nextflow pipeline ran just fine on my local machine
Paolo Di Tommaso
@pditommaso
Jul 30 2018 11:44
using docker over a shared file system ?
Anthony Underwood
@aunderwo
Jul 30 2018 11:46
yes docker - not sure about the shared file system. Running it in /root
df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G     0  7.9G   0% /dev
tmpfs           1.6G  8.7M  1.6G   1% /run
/dev/vda1       310G   14G  297G   5% /
tmpfs           7.9G     0  7.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/vda15      105M  3.4M  102M   4% /boot/efi
tmpfs           1.6G     0  1.6G   0% /run/user/0
can run docker run -it IMAGE_NAME ok
Paolo Di Tommaso
@pditommaso
Jul 30 2018 11:50
try adding the following in the nextflow.config file
docker.runOptions='-u $(id -u):$(id -g)'
note the single quote '
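For reference, a minimal nextflow.config sketch with that setting could look like this (the docker scope shown here is an assumption about the rest of the config; only runOptions comes from the chat):

```groovy
// nextflow.config -- run task containers as the current host user
docker {
    enabled    = true
    // single quotes matter: $(id -u) and $(id -g) must be expanded
    // by the shell at task launch time, not by Nextflow when parsing the config
    runOptions = '-u $(id -u):$(id -g)'
}
```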
Anthony Underwood
@aunderwo
Jul 30 2018 11:51
yes - thanks that works - what does that do?
makes sure it runs as the user I'm logged in as ?
amazing support as always!
Paolo Di Tommaso
@pditommaso
Jul 30 2018 11:53
tell docker to run using the current host user and group ..
amazing support as always!
thanks :sunglasses:
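In other words, with that option each task container is launched with the equivalent of docker run -u $(id -u):$(id -g) ... IMAGE, so files created in the work directory belong to the host user instead of root. The two substituted commands are just:

```shell
# numeric user and group id of the current host user;
# these values are substituted into the docker run command at launch time
id -u
id -g
```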
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:31
hi, is there a way to have a params from the command line that accepts a list and then in the when directive to check if a specific term is present in the list ? I’ve checked the documentation but I can’t find an example of this use case...
Paolo Di Tommaso
@pditommaso
Jul 30 2018 14:32
myList = params.myList?.tokenize(',') ?: []

process foo {
  when:
  'something' in myList
}
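The list would then be passed as a comma-separated value on the command line, something like this (main.nf is a placeholder for whatever the script is called):

```shell
# the comma-separated value ends up in params.myList
# and is split into a list by tokenize(',')
nextflow run main.nf --myList something,other
```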
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:33
nice
thanks!
Paolo Di Tommaso
@pditommaso
Jul 30 2018 14:38
@fstrozzi @apeltzer is struggling to configure AWS Batch, can you give some pro tips ?
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:39
sure
Alexander Peltzer
@apeltzer
Jul 30 2018 14:40
@fstrozzi I'll write you directly
Paolo Di Tommaso
@pditommaso
Jul 30 2018 14:41
I'd suggest to use the public channel for the sake of the community
Alexander Peltzer
@apeltzer
Jul 30 2018 14:41
Oh okay
(just don't want to annoy everyone)
Okay: submitting jobs via the Batch dashboard and via NXF works, and jobs are in state "Submitted". Spot requests are fulfilled, machines spin up, but the job never moves from "Runnable" to "Starting" or "Running"
The requested machines can be SSHed into, however
Any idea?
Paolo Di Tommaso
@pditommaso
Jul 30 2018 14:42
who feels annoyed can opt-out, the value of these (open) channels is to share knowledge
Alexander Peltzer
@apeltzer
Jul 30 2018 14:43
Good :-)
Agree on that
The Nextflow trace (nextflow -trace nextflow run ... ), however, shows Scheduler queue size: 0 in the log file
Paolo Di Tommaso
@pditommaso
Jul 30 2018 14:45
the Ec2 instance running NF has S3 and Batch full permission?
Alexander Peltzer
@apeltzer
Jul 30 2018 14:49
I'll check
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:49

as I wrote to @apeltzer :

mmmm either that or some problem with the requested RAM… it may happen (we’ve seen it a few times already) that you request, let’s say, 16GB of RAM in your NF process, and then AWS Batch spins up a machine with 16GB of RAM, but the 16GB specified by NF gets converted into slightly more than 16GB in the job submission, and the job never starts.

I’m starting with this simple example since it has already happened more than once to us, and the first few times it took a while to find out where the problem was.
it’s simple to check: just go into the dashboard, see how much RAM the job in RUNNABLE state got assigned, and then check how much RAM the machine actually has. Sometimes it’s a small difference of a few hundred MBs, but enough to get the job stuck
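One way to do that check from the command line (assuming the AWS CLI is configured; the job id is a placeholder) is to inspect the memory actually requested in the Batch job submission and compare it with the instance's RAM:

```shell
# memory (in MiB) requested in the Batch job submission;
# compare this against the RAM of the instance the compute environment started
aws batch describe-jobs --jobs <job-id> \
    --query 'jobs[0].container.memory'
```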

Alexander Peltzer
@apeltzer
Jul 30 2018 14:52
So the RAM is in the limits requested properly
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:54
ok, so it’s either an IAM permissions problem or the configuration of the AMI you are using in the compute environment. I have to admit the lack of proper logging from AWS Batch can be annoying sometimes, because you get no information on why a job is stuck in the RUNNABLE state
Alexander Peltzer
@apeltzer
Jul 30 2018 14:54
The AMI basically just has Docker
(similar to what the Nextflow Docs say)
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:55
is the ECS one ?
Alexander Peltzer
@apeltzer
Jul 30 2018 14:56
Amazon Linux 2 AMI (HVM), SSD Volume Type - ami-b70554c8 I think
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:56
in which region are you operating ?
Alexander Peltzer
@apeltzer
Jul 30 2018 14:57
us-east-1
As I need access to some cancer data on ICGC, but in theory I could also run somewhere else for other things
Francesco Strozzi
@fstrozzi
Jul 30 2018 14:59
ok, the best thing to do for AWS Batch is to use the ECS image from the Marketplace. It has Docker pre-installed. But before thinking about the AMI, let’s make sure it’s not an IAM problem
did you create the computing environment from the dashboard ?
Alexander Peltzer
@apeltzer
Jul 30 2018 14:59
Yup
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:01
ok, so if I remember correctly, the wizard during creation should also prompt you to create all the needed IAM roles. Usually you need at least 3: one for the instance, one for the task (i.e. the job) and one for the SpotFleet.
considering you said that the spot requests are correctly fulfilled, I’d rule out a permission problem on the SpotFleet requests
Alexander Peltzer
@apeltzer
Jul 30 2018 15:01
I assume they work yes
As I can SSH into etc
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:04
so if it’s an IAM problem, it’s either the instance role or the task role; could you check from the dashboard whether those roles are correctly assigned to your computing environment ?
Alexander Peltzer
@apeltzer
Jul 30 2018 15:05
I'm checking
I'm now kicking out all jobs and queues, removed the roles and will set things up from scratch
Maybe that helps
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:08
yes, try that… otherwise it’s likely an AMI issue; as I was mentioning, the best is to use the Amazon ECS image, not the Amazon Linux one.
Alexander Peltzer
@apeltzer
Jul 30 2018 15:09
and add required things such as aws-cli to it?
Ok
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:09
in case, try to prepare a Batch AMI using the ECS Image as a template (you can find that from the EC2 MarketPlace)
Alexander Peltzer
@apeltzer
Jul 30 2018 15:09
Found it :-)
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:10

and add things required such aws-cli to it?

yes exactly, like it’s explained here https://www.nextflow.io/docs/latest/awscloud.html#custom-ami

if you still have troubles let us know…we’ll try to help :)
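The linked docs of that period essentially boil down to installing a self-contained AWS CLI on the ECS-based image, along these lines (URL and paths as in the docs; worth double-checking against the current version):

```shell
# install a self-contained aws-cli via Miniconda, so the tool does not
# depend on a Python interpreter inside the task containers
wget -q https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -f -p $HOME/miniconda
$HOME/miniconda/bin/conda install -c conda-forge -y awscli
$HOME/miniconda/bin/aws --version
```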
Alexander Peltzer
@apeltzer
Jul 30 2018 15:18
Thanks @fstrozzi and @pditommaso - we're going to get this up and running :-)
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:18
:+1:
Paolo Di Tommaso
@pditommaso
Jul 30 2018 15:18
of course
Anthony Underwood
@aunderwo
Jul 30 2018 15:19
I'm following this conversation closely because I will be trying AWS batch shortly. Thanks for keeping the conversation public @fstrozzi and @apeltzer
Francesco Strozzi
@fstrozzi
Jul 30 2018 15:20
you are welcome, but as Paolo said, that’s the channel purpose :smile:
Mike Smoot
@mes5k
Jul 30 2018 15:22
FWIW, I've had good luck bringing up the necessary AWS Batch infrastructure using terraform. Highly recommended!
Anthony Underwood
@aunderwo
Jul 30 2018 15:22
@mes5k - cool. Any tips or advice?
Mike Smoot
@mes5k
Jul 30 2018 15:24
this provides a basic template. I'd start with that and see where it gets you.
Paolo Di Tommaso
@pditommaso
Jul 30 2018 16:20
@mes5k that sounds like a nice community contribution, provided you are willing to share it :wink:
Francesco Strozzi
@fstrozzi
Jul 30 2018 17:22
@mes5k nice! Trying Packer is also on my todo list; it was already discussed on the channel in the past… any experience with it ?
Mike Smoot
@mes5k
Jul 30 2018 17:34
Yup, we use Packer to build the AMIs and the Terraform to bring up the infrastructure. Very happy with both. Unfortunately, I don't think I can share what I've developed internally. Lawyers. Sigh. Happy to help others debug their configurations, though...
tbugfinder
@tbugfinder
Jul 30 2018 17:53
Would there be a GitHub folder available to commit NF Terraform and packer examples?
Are multiline comments supported within aws.config file?
Paolo Di Tommaso
@pditommaso
Jul 30 2018 18:21
you can choose any repository of your choice and then I’ll be happy to link it in the docs or make a blog post of it
Jemma Nelson
@fwip
Jul 30 2018 18:39
I'd love to see terraform examples for AWS batch & nextflow, I had a heck of a time getting AWS batch to accept jobs last I tried.
Alexander Peltzer
@apeltzer
Jul 30 2018 18:50
Same here 😅
Mike Smoot
@mes5k
Jul 30 2018 23:15

Can anyone think of a different way to do this without using a process to rename the file?

Channel
    .from( [[location: "s3://asdf/upload/5708840654765763362",
             name: "XYZ-00001_S1_L001_R1_001.fastq.gz"],
            [location: "s3://asdf/upload/3275234543098368798",
             name: "XYZ-00001_S1_L001_R2_001.fastq.gz"]])
    .map{ it -> [it.name, file(it.location)] }
    .set{ s3files }

process rename {
    input:
    set val(name), file('some.fastq.gz') from s3files

    output:
    file("${name}") into fqfiles

    script:
    """
    mv some.fastq.gz ${name}
    """

}

// properly named files
fqfiles.view()

I'm hoping for something like this:

.map{ it -> file(it.location).renameTo(it.name) }

But this doesn't work because it renames the file before downloading.
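Not quite process-free, but one way to drop the explicit mv (a sketch, relying on Nextflow's dynamic input file names and the same s3files channel as above) is to stage each file directly under the target name:

```groovy
process rename {
    input:
    // the input file is staged (linked) under the value of 'name',
    // so no mv is needed inside the script
    set val(name), file("${name}") from s3files

    output:
    file("${name}") into fqfiles

    script:
    """
    true
    """
}
```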