These are chat archives for nextflow-io/nextflow

29th
Jun 2018
Radoslaw Suchecki
@bioinforad_twitter
Jun 29 2018 05:04

Hi all, I often pass sets of values and files through channels connecting processes. For example

input: 
  set(val(meta), file(data)) from someChannel

Occasionally, I want to apply one or more operators to the channel, which may result in what I believe is a set of sets (or tuple of tuples), but I don't think it is possible to then consume these like this:

input: 
  set( set(val(meta1), file(data1)),set(val(meta2), file(data2))) from someChannel.anOperator().anotherOperator()

Am I missing some syntactic trick?

On a somewhat related note, it would aid code readability if the set notation could be used with input repeaters, for example:

input: 
  each set(val(meta), file(data)) from someChannel
Lorenz Gerber
@lorenzgerber
Jun 29 2018 07:18
Hi there, we wrote a small addition to the clean command that makes it possible to reclaim space in the work directory while keeping all the (dot.command) logs: nextflow-io/nextflow#775. This is also partly related to an earlier request, nextflow-io/nextflow#668.
Dave Istanto
@DaveIstanto
Jun 29 2018 15:03
Hi all, in the documentation, templates are meant to be shell scripts (.sh files). I tried using one with Nextflow scripts and it was not able to output files to the output channel. Is using templates with native scripts not supported yet?
Paolo Di Tommaso
@pditommaso
Jun 29 2018 16:53
@lorenzgerber thanks, I will review asap
@ShawnConecone a template is supposed to create the output file like a normal script; the output channel declaration still needs to be defined in the process as usual, so the template and the output channel are unrelated
@bioinforad_twitter set definitions cannot be nested, but this does not limit the ability to handle any n-tuple structure
Félix C. Morency
@fmorency
Jun 29 2018 17:09
Is anyone using NF + Singularity + SLURM + GPU (cuda)? It would seem the CUDA_VISIBLE_DEVICES is not passed from SLURM to NF to Singularity. We've tried a bunch of things to pinpoint this and are preparing an NF issue atm
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:14
I think there was a conversation in the past, don't remember exactly the outcome
basically the problem is that slurm defines CUDA_VISIBLE_DEVICES and you need to access it, right?
Félix C. Morency
@fmorency
Jun 29 2018 17:16
Correct. SLURM defines the correct value to set CUDA_VISIBLE_DEVICES depending on the available resources. It works via NF if we don't use Singularity. However, the environment variable is not set in the container itself.
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:17
try to add in the config file
env.CUDA_VISIBLE_DEVICES='$CUDA_VISIBLE_DEVICES'
note the single quotes
Félix C. Morency
@fmorency
Jun 29 2018 17:19
It doesn't work. The variable is empty in the container.
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:20
what does the singularity command line look like in the .command.run?
Clément ZOTTI
@czotti
Jun 29 2018 17:22
@pditommaso here is the pastebin with the .command.run.
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:24
make this test:
change into work/3f/a303188b4d73d7132d770303d1f04b/
edit the .command.sh so that it runs env | sort
then run
CUDA_VISIBLE_DEVICES=xxx bash .command.run
what's the output ?
Clément ZOTTI
@czotti
Jun 29 2018 17:27
I don't get the env | sort part. For .command.sh, the content of this file is:
#!/bin/bash -ue
echo $CUDA_VISIBLE_DEVICES
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:27
ok, same
Clément ZOTTI
@czotti
Jun 29 2018 17:29
The output is blank: CUDA_VISIBLE_DEVICES=1 bash .command.run prints nothing.
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:30
you have prefixed it with CUDA_VISIBLE_DEVICES=xxx, right?
Clément ZOTTI
@czotti
Jun 29 2018 17:30
yes I have set xxx to a gpu id (0 or 1).
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:31
why ?
Clément ZOTTI
@czotti
Jun 29 2018 17:31
Even with xxx it does not echo xxx.
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:32
ok, if so run this and copy and paste the output
CUDA_VISIBLE_DEVICES=xxx bash -x .command.run
ok, I think I've understood, this trick cannot work
it requires a patch on the NF side. Is that the only var needed?
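A minimal sketch of how a variable set by the scheduler can go missing inside a task, assuming (the chat does not confirm this) that the wrapper launches the container from a cleared environment. The function names run_cleared and run_forwarded are hypothetical, and plain bash stands in for the container runtime.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a container launch: `env -i` starts the child
# with an empty environment, passing only the variables listed explicitly.
run_cleared() {
  env -i PATH="$PATH" bash -c 'echo "value=${CUDA_VISIBLE_DEVICES:-}"'
}

# Same launch, but the variable is forwarded by naming it on the command line.
run_forwarded() {
  env -i PATH="$PATH" CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-}" \
    bash -c 'echo "value=${CUDA_VISIBLE_DEVICES:-}"'
}

CUDA_VISIBLE_DEVICES=xxx run_cleared     # prints "value=" (variable dropped)
CUDA_VISIBLE_DEVICES=xxx run_forwarded   # prints "value=xxx"
```

If the launch really does clear the environment, prefixing the variable when calling .command.run cannot help, because the value is discarded one level further down. That would be consistent with the fix having to be made in the wrapper itself.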
Clément ZOTTI
@czotti
Jun 29 2018 17:36
Here is the output if you still need it.
Félix C. Morency
@fmorency
Jun 29 2018 17:36
As far as we know, it's the only one
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:37
Could you please open an issue for that ?
Clément ZOTTI
@czotti
Jun 29 2018 17:37
Sure
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:37
:+1:
Félix C. Morency
@fmorency
Jun 29 2018 17:52
Thanks @pditommaso
Paolo Di Tommaso
@pditommaso
Jun 29 2018 17:53
you are welcome
Clément ZOTTI
@czotti
Jun 29 2018 18:00
The issue is filed, thanks a lot for your time @pditommaso.
vkaimal
@vkaimal
Jun 29 2018 18:30
@pditommaso I'm new to Nextflow and I'm trying to test a NF pipeline for RNA-Seq (https://github.com/nextflow-io/rnaseq-encode-nf) on AWS Batch. I have the Batch environment working well (tested using simple Fastqc job definition that I created). However, when I try the NF pipeline, I'm running into space issues (From CloudWatch log: download failed: s3://myBucket/test_data/data-raw/SRR493369_1.fastq to ./SRR493369_1.fastq [Errno 28] No space left on device). I've attached a 1TB '/docker_scratch' volume to the AMI, and tried process.scratch = '/docker_scratch', but it still does not work. Is there something else I should be doing to have the container utilize the '/docker_scratch' space? Thanks
Paolo Di Tommaso
@pditommaso
Jun 29 2018 18:32
the storage configuration is a bit tricky
follow the instructions here
vkaimal
@vkaimal
Jun 29 2018 18:45
Ah yes, I tried that too. I followed the steps here (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-storage-config.html) to configure docker storage.
$ docker info | grep -i data
Data Space Used: 4.377GB
Data Space Total: 1.18TB
Data Space Available: 1.175TB
Metadata Space Used: 1.315MB
Metadata Space Total: 109.1MB
Metadata Space Available: 107.7MB
Paolo Di Tommaso
@pditommaso
Jun 29 2018 18:48
WARNING: The maximum storage size of a single Docker container is by default 10GB, independently of the amount of data space available in the underlying volume (see Base device size for more details).
have you checked the Base Device Size?
vkaimal
@vkaimal
Jun 29 2018 18:52
Aha - Base Device Size: 10.74GB
I'll try dm.basesize and test it again. Thanks!
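A small sketch for checking the per-container limit discussed above. The helper name base_device_size is hypothetical and only filters docker info-style text. The 50G value in the comment is an illustrative assumption, and where the daemon option is set depends on the AMI.

```shell
#!/usr/bin/env bash
# Extract the value of the "Base Device Size" line from `docker info`-style
# text read on stdin (the line is reported by the devicemapper storage driver).
base_device_size() {
  awk -F': *' 'tolower($1) ~ /base device size/ { print $2 }'
}

# Query the local daemon only if the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  docker info 2>/dev/null | base_device_size
fi

# Raising the limit is a daemon-level storage option, for example (value assumed):
#   dockerd --storage-opt dm.basesize=50G
```

The Data Space Total figure describes the whole storage pool, while Base Device Size caps each container, which is why 1.18TB of pool space can still produce "No space left on device" inside a single container.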
Francesco Strozzi
@fstrozzi
Jun 29 2018 21:18
@vkaimal you can also define the volume mounts for docker in the AWS Batch job definition. We usually attach a medium / large size EBS volume to the ECS instances used by Batch, and then in the job definition you can simply mount /docker_scratch to /tmp inside the container
NF runs in /tmp by default, so it'll use it automatically
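A sketch of what such a job definition could look like, assuming the host volume is mounted at /docker_scratch as described earlier. The function name container_properties, the image name, the vCPU and memory values, and the job definition name are all placeholders, not taken from the chat.

```shell
#!/usr/bin/env bash
# Print AWS Batch containerProperties JSON that bind-mounts a host directory
# at /tmp inside the container. Image, vcpus and memory are placeholders.
container_properties() {
  local host_path="$1"
  cat <<EOF
{
  "image": "example/rnaseq:latest",
  "vcpus": 2,
  "memory": 4096,
  "volumes": [
    { "name": "scratch", "host": { "sourcePath": "${host_path}" } }
  ],
  "mountPoints": [
    { "sourceVolume": "scratch", "containerPath": "/tmp", "readOnly": false }
  ]
}
EOF
}

container_properties /docker_scratch

# Registering it needs AWS credentials, so the call is left commented out:
# container_properties /docker_scratch > container-properties.json
# aws batch register-job-definition --job-definition-name rnaseq-scratch \
#   --type container --container-properties file://container-properties.json
```

With this kind of mount, anything the task writes under /tmp lands on the large host volume rather than in the container's own base device.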