These are chat archives for nextflow-io/nextflow

28th
Sep 2017
Simone Baffelli
@baffelli
Sep 28 2017 06:35

@pditommaso

mmm, .flatMap { it }
Thanks, that is what I did

Paolo Di Tommaso
@pditommaso
Sep 28 2017 06:35
:+1:
maybe that should be supported by default i.e. as .flatMap()
Simone Baffelli
@baffelli
Sep 28 2017 06:37
that could make sense
Francesco Strozzi
@fstrozzi
Sep 28 2017 07:09
Hello, simple curiosity: is creating the DAG image possible only at run-time because Nextflow dynamically resolves the workflow graph while running?
Paolo Di Tommaso
@pditommaso
Sep 28 2017 07:10
yes
Francesco Strozzi
@fstrozzi
Sep 28 2017 07:15
:+1:
thanks
Paolo Di Tommaso
@pditommaso
Sep 28 2017 07:17
sorry for the short answer, but it's exactly how it works :)
Paolo Di Tommaso
@pditommaso
Sep 28 2017 07:51
@baffelli it turns out that flatMap already works like that (!)
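For reference, a minimal sketch of the closure-less form mentioned above. The input values are made up for illustration, and Channel.from is the channel factory of the Nextflow versions current at the time of this chat:

```groovy
// Illustrative values only, not taken from the conversation.
Channel
    .from( [1, 2], [3, 4] )   // two list items
    .flatMap()                // no closure: behaves like .flatMap { it }
    .view()                   // print each emitted item
```

Each element of the two lists should come out as a separate item.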
Francesco Strozzi
@fstrozzi
Sep 28 2017 07:55
@pditommaso no problem :)
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:06
another simple question. When using something like splitFasta on a Channel inside a process, the “split” gets executed outside any job. For instance, when using the aws-batch executor the split of the Fasta file into the chunks happens locally and then each chunk is sent to a job that is submitted to Batch. Is that correct ? (Y/N) :)
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:07
[Y]
:)
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:08
:+1: thanks!
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:08
I think you need to set splitFasta(file: true)
otherwise it may not work in the aws-batch branch
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:10
well, it seems to work without (file: true)
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:10
all the better!
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:12
the point was more to have a confirmation of the behaviour I was seeing. For small Fasta files it’s not a problem, but for large files I think I should move the splitFasta inside a job, since it’s not very practical to download the file from S3 locally only to do the split and then save the chunks back to S3. And pulling data out of AWS has a non-negligible cost as well :)
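A minimal sketch of the channel-level split being discussed. The params.query name, the file path and the chunk size are assumptions for illustration. Because splitFasta is used as a channel operator here, the split runs wherever the Nextflow driver itself runs, not inside a submitted job:

```groovy
// Hypothetical input; replace with a real FASTA file.
params.query = 'data/sample.fa'

Channel
    .fromPath( params.query )
    .splitFasta( by: 100, file: true )   // 100 sequences per chunk, chunks saved as files
    .set { fasta_chunks }                // hypothetical channel name, to be consumed by a process
```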
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:13
are u running NF locally?
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:14
yes
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:14
what about having the NF driver in an EC2 instance for big deployments?
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:15
of course the other option would be to have NF running inside an EC2 instance
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:15
:+1:
Francesco Strozzi
@fstrozzi
Sep 28 2017 08:15
but if possible I would prefer to keep it local
of course it depends on the workflow
Paolo Di Tommaso
@pditommaso
Sep 28 2017 08:15
I see, makes sense
Anthony Underwood
@aunderwo
Sep 28 2017 10:43
Hi - if you don't specify scratch true where are files staged to?
Paolo Di Tommaso
@pditommaso
Sep 28 2017 11:37
it depends, which executor are you using?
with the NF cloud deployment, scratch is implicitly true
Anthony Underwood
@aunderwo
Sep 28 2017 11:41

Hi @pditommaso

when running on AWS we are getting this error

Sep-28 11:33:37.005 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'GenerateMaskReference (AF2122.fasta)'

Caused by:
  Process `GenerateMaskReference (AF2122.fasta)` terminated with an error exit status (126)

Command executed:

  python /home/compass/PIPELINE/compass/nf_ref_index.py -r AF2122.fasta

Command exit status:
  126

Command output:
  (empty)

Command error:
  /bin/bash: /tmp/nxf-7154476645470675566/.command.env: Permission denied
  /bin/bash: /tmp/nxf-7154476645470675566/.command.env: Permission denied
  /bin/bash: /tmp/nxf-7154476645470675566/.command.run.1: Permission denied

Work dir:
  /nxftestpipeline/bwa/work/40/2076d3a9c30b342f532f58614324b8
Don't understand why we're seeing permission denied
Paolo Di Tommaso
@pditommaso
Sep 28 2017 11:42
are you using EFS ?
Anthony Underwood
@aunderwo
Sep 28 2017 11:42
yes but this failed so we specified s3 via -w
Paolo Di Tommaso
@pditommaso
Sep 28 2017 11:44
and the above is the output when using S3 ?
however there's something with the permissions
Paolo Di Tommaso
@pditommaso
Sep 28 2017 11:49
I would first make sure that it runs in a single EC2 instance using the local executor
then I would try to troubleshoot the above problem by changing into the temp folder /tmp/nxf-7154476645470675566/
and checking the permissions, and trying to run it as bash -x .command.run
Anthony Underwood
@aunderwo
Sep 28 2017 11:52
ok thanks. will give it a go
> and the above is the output when using S3 ?
all files specified in s3
in cloud how do you force it to use local, not ignite?
Paolo Di Tommaso
@pditommaso
Sep 28 2017 12:15
nextflow run .. -process.executor local
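The same override can also live in the configuration file. A sketch, assuming a nextflow.config in the launch directory:

```groovy
// nextflow.config: run every process with the local executor
process.executor = 'local'
```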
Anthony Underwood
@aunderwo
Sep 28 2017 15:21
Can you explain the relationship between scratch and work. I can see there's some lines in the log saying "staging to ......... /tmp/XXXXXXX"
Félix C. Morency
@fmorency
Sep 28 2017 15:23
With scratch, the task uses local storage (by default under /tmp/XXXXXX) for input, temporary and output files. The files listed in output: are then moved back to work
Anthony Underwood
@aunderwo
Sep 28 2017 15:24
does work need to be available to all nodes but then files are staged locally to run the processes?
Félix C. Morency
@fmorency
Sep 28 2017 15:24
yes and yes
Anthony Underwood
@aunderwo
Sep 28 2017 15:25
@fmorency Thanks. So if scratch is not set to true where are the files staged to?
Félix C. Morency
@fmorency
Sep 28 2017 15:25
work directly
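As a sketch, scratch can be switched on for all processes in the configuration file; the directive can also be set per process. This assumes a shared work directory, as described above:

```groovy
// nextflow.config
// Run each task in a node-local temporary directory;
// files declared in output: are copied back to the work directory afterwards.
process.scratch = true
```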
Anthony Underwood
@aunderwo
Sep 28 2017 15:28
@fmorency thanks. Have you run any workflows in AWS?
Félix C. Morency
@fmorency
Sep 28 2017 15:29
no, we are running on-premise for now. our processing requires too many resources and the cost is too high to run it in AWS
we have a freenas/slurm backend
Anthony Underwood
@aunderwo
Sep 28 2017 15:30
makes sense :) We're mostly running on SGE and experimenting with AWS
btw did you consider using CWL and Toil when thinking about workflow managers?
Félix C. Morency
@fmorency
Sep 28 2017 15:31
No. Our previous codebase was based on Luigi (spotify) which was a PITA to use. One night I decided it was time to switch to something else and stumbled on NF. It was a perfect match
Paolo Di Tommaso
@pditommaso
Sep 28 2017 15:31
I need to post this
Anthony Underwood
@aunderwo
Sep 28 2017 15:32

:)

> No. Our previous codebase was based on Luigi (spotify) which was a PITA to use. One night I decided it was time to switch to something else and stumbled on NF. It was a perfect match

Félix C. Morency
@fmorency
Sep 28 2017 15:32
We're using NF+Singularity exclusively
Those two techs were a game changer for us
Anthony Underwood
@aunderwo
Sep 28 2017 15:33
Yeah containers + a nice DSL for workflows - no brainer!!
Félix C. Morency
@fmorency
Sep 28 2017 15:33
The cluster has nothing other than slurm and singularity on all nodes, and NF on the service node
that's it
Anthony Underwood
@aunderwo
Sep 28 2017 15:34
That's so simple and awesome!
Félix C. Morency
@fmorency
Sep 28 2017 15:35
thanks :)
Paolo Di Tommaso
@pditommaso
Sep 28 2017 15:35
this reminded me also
I think of it since a moment and I never takes time to tell it to you but NextFlow completely changed our way of work here, at the hospital ! We are better in traceability, better in script organisation, better in development cycles...with Singularity usage and Slurm, It really revolutionized our way of working (while we work with Nextflow since hardly a few months)...We have some more of work but we progress faster than before!
Christophe DEMAY, Public hospital in North of France: CHRU de Lille
:point_up: June 2, 2017 11:12 AM
thanks @fmorency for the endorsement
Félix C. Morency
@fmorency
Sep 28 2017 15:36
Anytime
Francesco Strozzi
@fstrozzi
Sep 28 2017 16:00

> No. Our previous codebase was based on Luigi (spotify) which was a PITA to use. One night I decided it was time to switch to something else and stumbled on NF. It was a perfect match

@fmorency it was the same for us. For years I developed and used a simple pipeline engine that was about right for our set of use cases. Then, when faced with moving all our pipelines to a different environment and also to the cloud, I started to re-code everything and had that feeling of “reinventing the wheel”, since I already knew Paolo and NF. So at that moment it was a no-brainer to test NF and then progressively switch to it

Venkat Malladi
@vsmalladi
Sep 28 2017 16:08
Question: I am using the collectFile method and then saving my output to a file and then using the set to name the file. How do I reference the path to the file in a process?
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:11
> then using the set to name the file

do you have a snippet showing that?
Venkat Malladi
@vsmalladi
Sep 28 2017 16:11
Channel
    .fromPath( params.reads )
    .flatten()
    .map { file -> [ file.toString(), file.getFileName().toString() ].join("\t") }
    .collectFile( name: 'fileList.tsv', newLine: true )
    .set { readsList }
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:11
ok
process foo {
    input:
    file x from readsList

    """
    cat $x
    """
}
Venkat Malladi
@vsmalladi
Sep 28 2017 16:13
what if I want to use the path to 'fileList.tsv'
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:14
in a process or outside ?
Venkat Malladi
@vsmalladi
Sep 28 2017 16:14
in a process
I need to pass the path to ‘fileList.tsv’ to a python function
python $baseDir/scripts/check_design.py -f $readsList
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:16
the $x reference does not work ?
Anthony Underwood
@aunderwo
Sep 28 2017 16:17
If I want to run NF on cloud without scratch , setting scratch to false doesn't seem to make a difference
Venkat Malladi
@vsmalladi
Sep 28 2017 16:18
@pditommaso the code that is attempted to run is python /Users/venkatmalladi/BICF/chipseq_analysis/workflow/scripts/check_design.py -d /Users/venkatmalladi/BICF/chipseq_analysis/workflow/../test_data/design_ENCSR238SGC_SE.txt -f DataflowQueue(queue=[DataflowVariable(value=/Users/venkatmalladi/BICF/chipseq_analysis/workflow/work/tmp/6f/8adf2e934a29820f2724678a741b51/fileList.tsv), DataflowVariable(value=groovyx.gpars.dataflow.operator.PoisonPill@4b60969c)])
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:20
it looks like you are not using the approach I suggested
@aunderwo if you are using S3 as work, it is ignored
Venkat Malladi
@vsmalladi
Sep 28 2017 16:21
@pditommaso let me modify and give that a try
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:21
if you are using EFS it is actually false by default
Anthony Underwood
@aunderwo
Sep 28 2017 16:21
OK - thanks
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:24
solved the permission problem ?
Anthony Underwood
@aunderwo
Sep 28 2017 16:24
@pditommaso no not yet
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:25
weird, have you tried with the local exec?
Anthony Underwood
@aunderwo
Sep 28 2017 16:25
yeah - it's super weird - a bwa index process is just hanging
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:26
umm, focus on the task w/o running directly the bash wrapper
Anthony Underwood
@aunderwo
Sep 28 2017 16:27
You mentioned trying to run .command.sh - the thing is within the work directory, there were no files at all
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:28
.command.run
Anthony Underwood
@aunderwo
Sep 28 2017 16:28
my bad. but there were 0 files
no .command.*
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:29
umm, it could be that NF does not have permission to write to the /tmp folder ..
(read "NF" as "your user")
Anthony Underwood
@aunderwo
Sep 28 2017 16:29
yes - that's what I think was happening.
what user does it run as - 'NF'?
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:31
well, the user you ssh into the instance as
Anthony Underwood
@aunderwo
Sep 28 2017 16:32
yeah weird - that user has permissions to write to /tmp
Paolo Di Tommaso
@pditommaso
Sep 28 2017 16:40
Don't forget docker runs with a different user..
Anthony Underwood
@aunderwo
Sep 28 2017 16:53
So I lied, we do see .command.err, .command.log, .command.out and .exitcode, but the error is permission denied for .command.env
there is no .command.sh or .command.env file
Anthony Underwood
@aunderwo
Sep 28 2017 17:12
could it be that the user in the docker image is non-standard???
#Switch user compass
USER compass
Paolo Di Tommaso
@pditommaso
Sep 28 2017 18:21
uummm, suspicious ..
Venkat Malladi
@vsmalladi
Sep 28 2017 18:25
@pditommaso thanks for the help, the $x reference worked, just an error earlier on in the file
Paolo Di Tommaso
@pditommaso
Sep 28 2017 18:26
:+1:
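To summarise the fix, a sketch in the process syntax used in this chat. The process name and the reads_tsv variable are hypothetical, and the -d option from the real command is left out. Referencing the channel variable $readsList directly in the script interpolates the channel object itself, which is where the DataflowQueue(...) text came from; declaring it as an input gives the script a staged file instead:

```groovy
process check_design {
    input:
    file reads_tsv from readsList   // readsList is the channel created with collectFile above

    """
    python $baseDir/scripts/check_design.py -f $reads_tsv
    """
}
```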