These are chat archives for nextflow-io/nextflow

29th Oct 2018
Tobias Neumann
@t-neumann
Oct 29 2018 09:12
hi. I have this Nextflow pipeline which processes a list of BAM files supplied with a wildcard and converts them to FASTQ files. Now I want to port this to an AWS Batch profile. I guess this does not work, since wildcards are probably not supported on S3 storage?
Tobias Neumann
@t-neumann
Oct 29 2018 09:22
In general I'm not sure about the file locations - it won't run without supplying an S3 workdir. I thought the work dir would reside on the EBS storage and only what is produced for the publishDir would be put into S3? Furthermore, I would like to run only about 20 processes in parallel (maxForks), but I guess the entire dataset still has to reside in S3 before the pipeline starts up - no copying from local to S3 is done on demand
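(For context, a minimal sketch of the maxForks directive mentioned above; the process body and the samtools command are illustrative assumptions, not the actual pipeline:)
// hypothetical bam2fastq process capped at 20 parallel tasks
process bamToFastq {
    maxForks 20

    input:
    set name, file(bam) from rawBamFiles

    output:
    file("${name}*.fq.gz") into fastqFiles

    script:
    """
    samtools fastq -1 ${name}_1.fq.gz -2 ${name}_2.fq.gz ${bam}
    """
}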
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:29
I guess this does not work since wildcards are probably not supported on s3 storage?
yes, it does
Tobias Neumann
@t-neumann
Oct 29 2018 09:33
so something like this would work for all *.bam files under s3://obenauflab/test, with the following channel definition?
Channel
    .fromPath( "${params.inputDir}/*/*.bam" )
    .map { file -> tuple( file.baseName, file ) }
    .set { rawBamFiles }
nextflow run obenauflab/virus-detection-nf -r TCGAconversion --inputDir s3://obenauflab/test -profile aws -with-report -bucket-dir s3://obenauflab/tmp
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:34
yes, NOTE: -bucket-dir only works with the latest version
tip, make a test with only only
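(For reference, a sketch of the two launch styles - -bucket-dir on a recent Nextflow versus pointing the work dir at S3 with -w on older versions; bucket names reused from the command above:)
# recent Nextflow: keep intermediates under a dedicated bucket path
nextflow run obenauflab/virus-detection-nf -profile aws -bucket-dir s3://obenauflab/tmp
# older versions: set the work dir itself to an S3 path
nextflow run obenauflab/virus-detection-nf -profile aws -w s3://obenauflab/tmp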
Tobias Neumann
@t-neumann
Oct 29 2018 09:35
ah yes - that's -w with older ones, right?
only only?
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:35
*only one bam :)
Tobias Neumann
@t-neumann
Oct 29 2018 09:36
ah yeah sure ;)
and with regard to work dir vs publishDir - how does that work? Or rather - where will I find the publishDir?
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:36
where will I find the publishDir ?
in the path you have specified as publishDir ..
Tobias Neumann
@t-neumann
Oct 29 2018 09:37
so this has to be an s3 path?
because currently it's a relative dir
publishDir = [
        [path: './results', mode: 'copy', overwrite: 'true', pattern: "*fq.gz"],
      ]
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:38
yes, you need to specify s3://something if you want the result in the s3 bucket
otherwise relative paths are resolved against the instance where NF is running
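(So the fix is to point the same setting at the bucket; a sketch reusing the pattern from above:)
publishDir = [
        [path: 's3://obenauflab/results', mode: 'copy', overwrite: 'true', pattern: "*fq.gz"],
      ]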
Tobias Neumann
@t-neumann
Oct 29 2018 09:40
I see. Is there a way to have it actually the other way around? Because usually one does not care so much about having the work dir in the S3 bucket and the publishDir on the EBS - you actually want the opposite
because having everything twice also means twice the storage cost of course
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:41
"having everything twice" what ?
Tobias Neumann
@t-neumann
Oct 29 2018 09:41
having the work dir in S3 and then the publishDir with copied results (at least with my settings) also in S3
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:41
and what is the way you are suggesting?
Tobias Neumann
@t-neumann
Oct 29 2018 09:43
maybe I got this totally wrong, but the way I understand it, the general flow is to copy input data from S3 storage to EBS, process everything, and then copy the work dir back to S3. So if one also specifies a publishDir in S3 in copy mode, you end up with the same files twice in S3. Keeping only the work dir on EBS, cleaning it after the pipeline has run, and copying only the publishDir to S3 would save half the space
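(As an aside, for a local work dir the nextflow clean subcommand can drop intermediates after a run; a hedged sketch - check nextflow clean -h for the exact options:)
# force-remove the work files of the most recent run (assumes a local work dir)
nextflow clean -f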
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:44
AFAIK there's no way to share the same EBS across multiple instances, therefore you need to download the data from somewhere
Tobias Neumann
@t-neumann
Oct 29 2018 09:45
no, the input part is clear. It's the output in one work dir plus copying the same output to a publishDir that I wanted to ask about
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:47
yes, but each compute node has its own EBS volume; how can you use it to store the final (published) output?
Tobias Neumann
@t-neumann
Oct 29 2018 09:48
ah sorry I was not clear about the pipeline setup: It's just a single process doing the bam2fastq conversion
so the output of the process == results in publishDir
Paolo Di Tommaso
@pditommaso
Oct 29 2018 09:50
again, even a single process can spawn many instances, each with its own EBS volume. How would you consolidate the task results in the same EBS?
Tobias Neumann
@t-neumann
Oct 29 2018 09:53
So I have one process per input file, leading to one set of FASTQ files, totally independent of other instances for other input files, whether they are on the same or a different EBS. Where exactly do I need to consolidate task results?
I have the feeling I am missing something fundamental or am not communicating clearly, sorry
Tobias Neumann
@t-neumann
Oct 29 2018 10:26
So the submission works, but all that happens is that the job is stuck at "Runnable" and an EC2 instance got fired up
apparently nextflow cannot read the exit status
Oct-29 11:20:20.482 [Task submitter] DEBUG n.executor.AwsBatchTaskHandler - [AWS BATCH] submitted > job=2b5a7e77-928f-434d-9400-8234b24aca65; work-dir=s3://obenauflab/work/eb/083ad4eddf44d8cca40fd215d57b20
Oct-29 11:20:20.482 [Task submitter] INFO  nextflow.Session - [eb/083ad4] Submitted process > bamToFastq (2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead)
Oct-29 11:21:39.777 [Task monitor] DEBUG n.executor.AwsBatchTaskHandler - [AWS BATCH] Cannot read exitstatus for task: `bamToFastq (2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead)`
java.nio.file.NoSuchFileException: /obenauflab/work/eb/083ad4eddf44d8cca40fd215d57b20/.exitcode
           at com.upplication.s3fs.S3FileSystemProvider.newInputStream(S3FileSystemProvider.java:275)
           at java.nio.file.Files.newInputStream(Files.java:152)
           at java.nio.file.Files.newBufferedReader(Files.java:2784)
           at org.codehaus.groovy.runtime.NioGroovyMethods.newReader(NioGroovyMethods.java:1311)
           at org.codehaus.groovy.runtime.NioGroovyMethods.getText(NioGroovyMethods.java:422)
           at nextflow.executor.AwsBatchTaskHandler.readExitFile(AwsBatchExecutor.groovy:325)
           at nextflow.executor.AwsBatchTaskHandler.checkIfCompleted(AwsBatchExecutor.groovy:314)
           at nextflow.processor.TaskPollingMonitor.checkTaskStatus(TaskPollingMonitor.groovy:588)
           at nextflow.processor.TaskPollingMonitor.checkAllTasks(TaskPollingMonitor.groovy:514)
           at nextflow.processor.TaskPollingMonitor.pollLoop(TaskPollingMonitor.groovy:395)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
           at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
           at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
           at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
           at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:947)
           at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:930)
           at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:92)
           at nextflow.processor.TaskPollingMonitor$_start_closure4.doCall(TaskPollingMonitor.groovy:296)
           at nextflow.processor.TaskPollingMonitor$_start_closure4.call(TaskPollingMonitor.groovy)
           at groovy.lang.Closure.run(Closure.java:499)
           at java.lang.Thread.run(Thread.java:745)
Oct-29 11:21:39.780 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: bamToFastq (2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead); status: COMPLETED; exit: -; error: -; workDir: s3://obenauflab/work/eb/083ad4eddf44d8cca40fd215d57b20]
Oct-29 11:21:39.793 [Task monitor] INFO  nextflow.processor.TaskProcessor - [eb/083ad4] NOTE: Process `bamToFastq (2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead)` terminated for an unknown reason -- Likely it has been terminated by the external system -- Execution is retried (1)
Tobias Neumann
@t-neumann
Oct 29 2018 10:31
maybe it would not start because of this
CannotPullContainerError: API error (400): invalid reference format
Paolo Di Tommaso
@pditommaso
Oct 29 2018 10:33
you need to properly configure IAM permissions
Tobias Neumann
@t-neumann
Oct 29 2018 11:16
hm, now I redid it from scratch and it's still stuck as a runnable job, except now it does not even fire up EC2 instances any longer
Tobias Neumann
@t-neumann
Oct 29 2018 11:27
I think the previous error was because I was running it with Singularity in the previous version and still had the docker:// prefix on the container image. But now I broke the instance fire-up :(
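(A sketch of that fix, with a hypothetical image name: Singularity resolves docker:// URIs, while the Docker executor used by AWS Batch expects a plain image reference:)
// hypothetical container image, for illustration only
// Singularity profile:
process.container = 'docker://quay.io/biocontainers/samtools:1.9'
// AWS Batch / Docker profile - no scheme prefix:
process.container = 'quay.io/biocontainers/samtools:1.9'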
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 11:59

Hi guys, I'm trying to understand how to use the integrated Apache Ignite tool. I only have one node for the moment, and my previous user experience is with torque/PBS.
I understand from here: https://www.nextflow.io/docs/latest/ignite.html that I need to run nextflow node -bg on my node, then configure Nextflow by adding the following code to my $HOME/.nextflow/config:

cluster {
    join = 'ip:192.168.1.104'
    interface = 'eth0'
}

This is probably a stupid set of questions, please be patient!
1# I do not understand exactly the "interface" value. Why is it set to eth0 in the example? What is it? The description says "Network interfaces that Ignite will use. It can be the interface IP address or name". How can I find the interface IP address, or do I just have to invent one?

2# Can the following environment variables be exported, or do they have to be? And where - in the config file, or in the node's ~/.bash_profile?

export NXF_CLUSTER_JOIN='ip:192.168.1.104'
export NXF_CLUSTER_INTERFACE='eth0'

3# Once Ignite is configured, how can I monitor the queue? Probably using the interface cited above? Is there something similar to the qstat -a I'm used to?

arontommi
@arontommi
Oct 29 2018 12:56
how do I set where .nextflow.log ends up?
Edgar
@edgano
Oct 29 2018 12:58
if I'm not wrong, it's always in the base directory
arontommi
@arontommi
Oct 29 2018 12:59
ok thanks
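(For reference, the top-level -log option sets the log location; a minimal sketch with a hypothetical path and script name:)
nextflow -log /data/logs/my-run.log run main.nf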
Paolo Di Tommaso
@pditommaso
Oct 29 2018 13:31
@giannicorik_twitter nope, you need to use the MPI-like execution, see here
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 13:42
@pditommaso thanks, but aren't those the instructions for executing a Nextflow pipeline using MPI? I think I still need to configure Ignite somehow before that! Do you mean I need to configure Ignite using MPI? Is there some guide for it?
Paolo Di Tommaso
@pditommaso
Oct 29 2018 13:42
no, Ignite configures itself
you need to launch the workflow using a wrapper similar to this:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -o <output file name>
#$ -l virtual_free=10G
#$ -q <queue name>
#$ -N <job name>
#$ -pe ompi 5
export NXF_CLUSTER_SEED=$(shuf -i 0-16777216 -n 1)
mpirun --pernode nextflow run <your-project-name> -with-mpi [pipeline parameters]
it may vary depending on your cluster
make a test with a dummy script
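(Such a dummy script could be as small as this hypothetical hello-style pipeline, which just reports which node each task lands on:)
// dummy.nf - trivially parallel test pipeline (hypothetical)
process sayHello {
    echo true

    input:
    val x from Channel.from(1, 2, 3, 4, 5)

    script:
    """
    echo "task ${x} running on \$(hostname)"
    """
}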
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 13:46
ah.. but do I still need to run nextflow node -bg on my node and set the "join" and "interface" values in my $HOME/.nextflow/config file before launching Nextflow with the wrapper?
Paolo Di Tommaso
@pditommaso
Oct 29 2018 13:50
nope, you need to use the script template above :point_up::point_up::point_up:
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 13:56
ok, so I need to install MPI first..
Paolo Di Tommaso
@pditommaso
Oct 29 2018 13:57
ah, but why do you want to use that way to run NF?
generally it's not needed
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 14:00
I do not "want" to install mpi, I just want to use a queue manager, I could understand apace ignite worked as a queue manager, but if you say I need a wrapper calling nextflow under "mpirun" .. it means I need to install openmpi. Am I wrong?
Paolo Di Tommaso
@pditommaso
Oct 29 2018 14:02
wait, but do you already have a batch scheduler like PBS or SLURM, or not?
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 14:03
no I don't! that's the point!
Paolo Di Tommaso
@pditommaso
Oct 29 2018 14:03
:joy:
now I got it
ok, ignite can work as a queue manager but it's not a replacement for a batch scheduler
what I want to say is that if you are planning on stable usage of NF, I would suggest installing something like SLURM
Riccardo Giannico
@giannicorik_twitter
Oct 29 2018 14:09
Ah ok, I thought I could use Ignite without anything else.
My problem was that I was trying to avoid learning how to install torque or slurm ;P But I realize now I cannot avoid it
Tobias Neumann
@t-neumann
Oct 29 2018 14:16
Got this message from AWS on a 6 GB file, but it's run on a 500 GB EBS image - any ideas?
download failed: s3://obenauflab/test/43543b8a-03c4-47ef-b02c-c34435cdbae4/2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead.bam to ./2faea127-0ba5-4f7f-9502-8eaf8a4f5adf_gdc_realn_rehead.bam [Errno 28] No space left on device
Paolo Di Tommaso
@pditommaso
Oct 29 2018 14:17
read the custom AMI docs carefully
Tobias Neumann
@t-neumann
Oct 29 2018 14:21
But shouldn't 10 GB be enough to download a 6 GB file? Especially if the docker image is 20 MB?
ok sorry. I uploaded the wrong file
my bad
Will it always allocate the maximum space for each task, or will this just push the upper bound? So basically, can I set this to some high value without impacting other tasks?
micans
@micans
Oct 29 2018 15:36
Mmmmm. Maybe I have another bug. Hold on.
Never mind .... Monday zombie brain
I just deleted some idiotic comments I made to reduce clutter here.
Paolo Di Tommaso
@pditommaso
Oct 29 2018 16:06
LOL
Tobias Neumann
@t-neumann
Oct 29 2018 19:50
Is there a way to unset the publishDir? Given that on AWS all files are present in the workDir anyway
awsbatch {
            aws.region = 'eu-central-1'
            aws.client.storageEncryption = 'AES256'
            process.queue = 'awsconvert'
            executor.name = 'awsbatch'
            executor.awscli = '/home/ec2-user/miniconda/bin/aws'
            singularity.enabled = false
            docker.enabled = true
            process.publishDir = [
                [path: 's3://obenauflab/results', mode: 'copy', overwrite: 'true', pattern: "*fq.gz"],
              ]
        }
process.publishDir = null doesn't work
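(One hedged workaround: newer Nextflow versions support an enabled option in publishDir, so publishing can be switched off per profile through a flag - the params.publish name here is hypothetical:)
// hypothetical params.publish flag; enabled: is a publishDir option in newer Nextflow
params.publish = true

process.publishDir = [
    [path: 's3://obenauflab/results', mode: 'copy', overwrite: 'true',
     pattern: "*fq.gz", enabled: params.publish],
]

// then, in the awsbatch profile:
// params.publish = false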