These are chat archives for nextflow-io/nextflow

20th
Jul 2018
Tiffany Delhomme
@tdelhomme
Jul 20 2018 07:33
Hi @pditommaso, I'm not sure whether this is expected, so I prefer to ask you directly:
I have a particular parameter in my pipeline: when I pass --options "badReadsThreshold 0", the parameter appears correctly in my command line, but if I pass --options "--badReadsThreshold 0" it gets transformed into true in my command line... it seems that -- is not supported in string inputs. Am I right? Is this expected?
Paolo Di Tommaso
@pditommaso
Jul 20 2018 07:44
oh, weird
but it's a weird parameter value as well .. :grin:
you may want to open an issue for that
I would use --badReadsThreshold <value>
then
badReadsThresholdOption = params.badReadsThreshold ? "--badReadsThreshold $params.badReadsThreshold" : ''
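for example, a minimal sketch (the process name, tool invocation, and input channel are illustrative, not from your pipeline):
params.badReadsThreshold = null

process callVariants {
    input:
    file bam from bam_ch

    script:
    // build the option only when the user actually supplied a value
    def badReadsThresholdOption = params.badReadsThreshold ? "--badReadsThreshold $params.badReadsThreshold" : ''
    """
    platypus callVariants --bamFiles $bam $badReadsThresholdOption
    """
}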
Tiffany Delhomme
@tdelhomme
Jul 20 2018 07:49
yes, but actually I have many possible options like this, so my idea was to let the user enter all of the chosen parameters as a single string, like --badReadsThreshold=0 --qdThreshold=0 --rmsmqThreshold=0, instead of initializing them one by one...
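i.e. the script would just interpolate the whole string in one go, something like this (same illustrative sketch as above):
params.options = ''

process callVariants {
    input:
    file bam from bam_ch

    script:
    // pass the user's option string through verbatim
    """
    platypus callVariants --bamFiles $bam ${params.options}
    """
}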
Paolo Di Tommaso
@pditommaso
Jul 20 2018 07:49
I understand, it looks like a bug
Tiffany Delhomme
@tdelhomme
Jul 20 2018 07:51
(I know this could sound a little weird/dirty) so yes, I will open an issue for that!
Paolo Di Tommaso
@pditommaso
Jul 20 2018 07:51
:ok_hand:
Tiffany Delhomme
@tdelhomme
Jul 20 2018 07:52
:grin:
Clément ZOTTI
@czotti
Jul 20 2018 11:53
@mes5k Thanks for the information, but unfortunately the work directory is not populated with the intermediate results in my case. Could it be because I use scratch (which I need because the data is stored on a NAS)?
Paolo Di Tommaso
@pditommaso
Jul 20 2018 11:54
Is there any directive to save my directory into the publishDir if a task is manually killed or crashes?
you want the checkpoints only if it crashes?
Clément ZOTTI
@czotti
Jul 20 2018 11:55
Or if I kill the task by hand (Ctrl+c)
Paolo Di Tommaso
@pditommaso
Jul 20 2018 11:56
umm, it's designed to work the other way around
Clément ZOTTI
@czotti
Jul 20 2018 11:56
ok
Paolo Di Tommaso
@pditommaso
Jul 20 2018 11:56
why don't you just copy those files from the task work dir when you stop it?
Clément ZOTTI
@czotti
Jul 20 2018 11:57
Is the work directory populated when I kill the task?
Paolo Di Tommaso
@pditommaso
Jul 20 2018 11:58
given that you are killing it .. it's not guaranteed, but if you are lucky enough, yes
Clément ZOTTI
@czotti
Jul 20 2018 11:58
haha ok
thanks
I hope I will be lucky enough
Paolo Di Tommaso
@pditommaso
Jul 20 2018 11:58
:smile:
Clément ZOTTI
@czotti
Jul 20 2018 11:59
When I use scratch, the results should be in /tmp/something, no? Could I copy them from there?
Paolo Di Tommaso
@pditommaso
Jul 20 2018 12:00
you should declare that folder/file as yet another output
Clément ZOTTI
@czotti
Jul 20 2018 12:00
I need to split the output like this?
output:
val(sid) into output_patient_id
file("checkpoints") into output_checkpoints
Mike Smoot
@mes5k
Jul 20 2018 16:30

Hi @pditommaso I'm finally getting around to running pipelines in AWS Batch and I think I'm getting close. Nextflow starts and I see jobs starting in the Batch console, but then they all fail. Here is the exception in .nextflow.log:

Jul-20 09:16:32.425 [Task monitor] DEBUG n.executor.AwsBatchTaskHandler - [AWS BATCH] Cannot read exitstatus for task: `create_files (1)`
java.nio.file.NoSuchFileException: /sgi-pipeline-dev/stress_batch_work/c0/19348edf16ed70a64a974234c76a64/.exitcode
        at com.upplication.s3fs.S3FileSystemProvider.newInputStream(S3FileSystemProvider.java:275)
        at java.nio.file.Files.newInputStream(Files.java:152)
        at java.nio.file.Files.newBufferedReader(Files.java:2784)
        at org.codehaus.groovy.runtime.NioGroovyMethods.newReader(NioGroovyMethods.java:1311)
        at org.codehaus.groovy.runtime.NioGroovyMethods.getText(NioGroovyMethods.java:422)
        at nextflow.executor.AwsBatchTaskHandler.readExitFile(AwsBatchExecutor.groovy:325)
        at nextflow.executor.AwsBatchTaskHandler.checkIfCompleted(AwsBatchExecutor.groovy:314)
        at nextflow.processor.TaskPollingMonitor.checkTaskStatus(TaskPollingMonitor.groovy:588)
        at nextflow.processor.TaskPollingMonitor.checkAllTasks(TaskPollingMonitor.groovy:514)
        at nextflow.processor.TaskPollingMonitor.pollLoop(TaskPollingMonitor.groovy:395)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
        at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:947)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:930)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:92)
        at nextflow.processor.TaskPollingMonitor$_start_closure4.doCall(TaskPollingMonitor.groovy:296)
        at nextflow.processor.TaskPollingMonitor$_start_closure4.call(TaskPollingMonitor.groovy)
        at groovy.lang.Closure.run(Closure.java:499)
        at java.lang.Thread.run(Thread.java:748)
Jul-20 09:16:32.428 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: create_files (1); status: COMPLETED; exit: -; error: -; workDir: s3://sgi-pipeline-dev/stress_batch_work/c0/19348edf16ed70a64a974234c76a64]
Jul-20 09:16:32.510 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'null' -- Cause: java.nio.file.NoSuchFileException: /tmp/temp-s3-2525603517340061424/.command.out

In the Batch console, I see that the job failed because of "Essential container in task exited", but I'm not sure what that means.

Indeed, the .exitcode file doesn't exist.
Does NXF_TMP need to be an S3 bucket too perhaps?
Paolo Di Tommaso
@pditommaso
Jul 20 2018 18:02
classic IAM permission error ..
I think the instance where NF is running cannot access the S3 bucket
Mike Smoot
@mes5k
Jul 20 2018 18:03
Ok, I'll look into that.
Paolo Di Tommaso
@pditommaso
Jul 20 2018 18:04
also try the latest RC1, the support for Batch is much more performant
tbugfinder
@tbugfinder
Jul 20 2018 18:05
Did you check CloudWatch?
Mike Smoot
@mes5k
Jul 20 2018 18:06
I'll try that too. The instance where NF is running does have S3 access. I can run aws s3 ls s3://sgi-pipeline-dev/stress_batch_work/
tbugfinder
@tbugfinder
Jul 20 2018 18:09
Is it the instance role or the IAM user?
Mike Smoot
@mes5k
Jul 20 2018 18:11
@tbugfinder good idea! I see bash: aws: command not found, which I'm guessing is the problem. I checked that it's using the right AMI, but maybe I haven't created the AMI correctly.
Paolo Di Tommaso
@pditommaso
Jul 20 2018 18:16
there's a good tutorial in the docs
Mike Smoot
@mes5k
Jul 20 2018 18:19
Ah yes, I missed the executor.awscli parameter!
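For the record, that goes in nextflow.config — a minimal sketch, assuming the CLI lives under a miniconda install on the custom AMI (the queue name and aws path are illustrative):
// nextflow.config
process.executor = 'awsbatch'
process.queue = 'my-batch-queue'
workDir = 's3://sgi-pipeline-dev/stress_batch_work'
// the aws tool is mounted from the host AMI into each task container,
// so it must be a self-contained install (e.g. via miniconda) rather than
// one relying on a system python that won't exist inside the container
executor.awscli = '/home/ec2-user/miniconda/bin/aws'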
Mike Smoot
@mes5k
Jul 20 2018 18:57
So now I'm seeing bash: /usr/local/bin/aws: /usr/bin/python3.4: bad interpreter: No such file or directory in CloudWatch. /usr/local/bin/aws is where the aws CLI tool lives and /usr/bin/python3.4 exists in the AMI, but I'm guessing that you're running aws from within the container (where /usr/bin/python3.4 does not exist). Does that sound right?
tbugfinder
@tbugfinder
Jul 20 2018 19:07
Which container do you use? Did you build it from scratch?
Mike Smoot
@mes5k
Jul 20 2018 19:09
Yeah, one of my own
And I'm sure it doesn't have python 3.4 installed
tbugfinder
@tbugfinder
Jul 20 2018 19:10
Sounds like an easy fix
Mike Smoot
@mes5k
Jul 20 2018 19:10
do you mean add python 3.4 to my image?
tbugfinder
@tbugfinder
Jul 20 2018 19:12
Yes, or for now use a command which is already available.
Mike Smoot
@mes5k
Jul 20 2018 19:13
would really prefer not to rebuild my 100+ images...
tbugfinder
@tbugfinder
Jul 20 2018 19:19
Or mount a shared EFS with the applications on it.
Or mount AMI directories into the container.
Mike Smoot
@mes5k
Jul 20 2018 20:15
After thinking about this for a while, I think the problem is that I didn't use miniconda to install awscli; I just did a pip install. It wasn't clear to me from the docs that the virtual environment conda creates is actually necessary for this to work. I'll rebuild my AMI using miniconda and see where that gets me.
Mike Smoot
@mes5k
Jul 20 2018 21:04
Got AWS Batch working on a test pipeline, with all the infrastructure spun up with Terraform! Thanks for the help @pditommaso and @tbugfinder!