These are chat archives for nextflow-io/nextflow

12th Nov 2018
Anthony Underwood
@aunderwo
Nov 12 2018 06:59

@tbugfinder Thanks, I didn't know about the -trace option, only -with-trace, which I think is different. I do get more output now

see this snippet of the log file https://gist.github.com/aunderwo/07126a0b5f59314358db1dca78545bf8

The lines

Nov-11 22:36:42.566 [Task monitor] TRACE n.processor.TaskPollingMonitor - Scheduler queue size: 0 (iteration: 14)

keep on repeating with the queue never growing beyond 0

Alexander Peltzer
@apeltzer
Nov 12 2018 07:04
Definitely some permissions missing
I had something similar until I gave the sub account more rights
Paolo Di Tommaso
@pditommaso
Nov 12 2018 08:44
That message is expected; it means it's waiting for workflow completion
Are the jobs properly submitted and executed?
KochTobi
@KochTobi
Nov 12 2018 08:46
you also need full EC2 access
at least we got it working that way: https://apeltzer.github.io/post/01-aws-nfcore/
Paolo Di Tommaso
@pditommaso
Nov 12 2018 08:47
Actually no, but it's suggested as a first setup
Anthony Underwood
@aunderwo
Nov 12 2018 09:02
@apeltzer do you remember off the top of your head what permissions the sub accounts needed?
I've added full EC2 and it doesn't change anything :( Thanks for the suggestion @KochTobi, though
There's definitely something odd about the sub accounts. I've followed my blog post and others to the letter. It works fine on the root account but not on the sub accounts
Alexander Peltzer
@apeltzer
Nov 12 2018 09:13
Had similar issues; the root account is pretty "easy", but the sub accounts have some weird behaviour
So adding full EC2 didn't help either?
Anthony Underwood
@aunderwo
Nov 12 2018 09:59
@apeltzer No unfortunately not :(
Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 14:54
Hi, is it possible for a process to output a file into a value channel?
Paolo Di Tommaso
@pditommaso
Nov 12 2018 14:56
in principle yes, but my detector signals a bad hack ..
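if you really need it, the usual trick is first(), which turns the process output into a value channel that any number of downstream tasks can read. A minimal sketch (process and channel names made up):

process make_classes {
    output:
    file 'classes.txt' into classes_ch

    """
    touch classes.txt
    """
}

// first() returns a value channel, so the single file can be
// consumed by every downstream task instead of just one
classes_value = classes_ch.first()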
Anthony Underwood
@aunderwo
Nov 12 2018 15:01

Some more tracing of my AWS subaccount woes

[determine_min_read_length (ERR232542)] Exit file can't be read > /ghru-torok-2014/workdir/df/a4503561a141263bf3f2e051e4b0a4/.exitcode -- return false -- Cause:
 /ghru-torok-2014/workdir/df/a4503561a141263bf3f2e051e4b0a4/.exitcode

the workdir exists but is empty.

I'm also seeing

Nov-12 14:56:34.492 [Actor Thread 5] TRACE nextflow.processor.TaskProcessor - [qc_pre_trimming (ERR232551)] Store dir not set -- return false

should I be setting a store dir?
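From the docs, storeDir looks like an optional output-caching directive rather than something required, so I guess that TRACE line just means it isn't set. A sketch of what setting it would look like (path and names made up):

process qc_pre_trimming {
    storeDir '/data/qc_cache'   // outputs are kept here and reused on re-runs

    output:
    file 'qc_report.txt' into qc_report_ch

    """
    touch qc_report.txt
    """
}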

Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 15:01

Haha probably! I have a process that outputs a reference file (classes) that needs to be input like so:

process do_low {
    input:
    file classes
    set val(level), file(valid) from ch_1

    when:
    level == 'low'

    """
    # do something
    """
}

This doesn't work because there are four levels and only one classes file. I think I want four processes (one per level) with four when statements, but there's probably a better way!

Paolo Di Tommaso
@pditommaso
Nov 12 2018 15:05
@aunderwo that looks like the app cannot write to the bucket; make sure both the instance running NF and the Batch instances have full S3 permission, at least on that bucket
@nebfold_gitlab use a conditional script instead
Anthony Underwood
@aunderwo
Nov 12 2018 15:10

@pditommaso This is why I am baffled

The batch compute environment has this instance role

This role has full S3 access
However, the real issue is that jobs are not even entering the queue; Spot instances are not being requested
Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 15:12
Thanks @pditommaso. All four processes will always run and I always need to collect their outputs. My goal is to split my channel of tuples (ch_1 above): ([none, files], [low, files], ..., [high, files]) into separate outputs ([none, files]) to do some work on them.
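I was imagining something like this, assuming filter can be combined with into to duplicate the channel first (channel names made up):

// DSL1 channels have a single consumer, so duplicate ch_1 before filtering
ch_1.into { ch_for_low; ch_for_high }

ch_low  = ch_for_low.filter  { level, files -> level == 'low' }
ch_high = ch_for_high.filter { level, files -> level == 'high' }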
Anthony Underwood
@aunderwo
Nov 12 2018 15:12
I do not see any jobs being submitted when monitoring in AWS. I can change the queue name to something garbled and it hangs at the same point
Paolo Di Tommaso
@pditommaso
Nov 12 2018 15:13
therefore the problem is the main instance; what role are you using for that?
Anthony Underwood
@aunderwo
Nov 12 2018 15:13
When you say main instance what do you mean?
Paolo Di Tommaso
@pditommaso
Nov 12 2018 15:13
where is NF running?
Anthony Underwood
@aunderwo
Nov 12 2018 15:15
It's running on an EC2 t2.micro, but previously I have submitted from my local machine. I thought it uses AWS credentials to submit the jobs to Batch
I can see these credentials being picked up in the logs, and if I remove S3 privileges it errors
my AWS creds for the IAM user have full AWSBatch and full S3 privileges
That was all I needed with an IAM user on the root account.
I have tried adding full EC2 but still no joy
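For reference, my config is basically the standard Batch setup; the queue name and region below are placeholders:

// nextflow.config -- queue and region are placeholders
process.executor = 'awsbatch'
process.queue    = 'my-batch-queue'
workDir          = 's3://ghru-torok-2014/workdir'
aws.region       = 'eu-west-1'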
Paolo Di Tommaso
@pditommaso
Nov 12 2018 15:17
it looks to me like a problem with S3 perms for the user or role running NF; focus on that
Anthony Underwood
@aunderwo
Nov 12 2018 15:18
OK. Will do. Which Nextflow class submits a job to the queue?
Paolo Di Tommaso
@pditommaso
Nov 12 2018 15:18
AwsBatchExecutor
Anthony Underwood
@aunderwo
Nov 12 2018 15:48

I'm afraid I am at a loss.
I have tried

  • submitting jobs to the same queue as the same IAM user via the web console => √
  • writing to the workdir (where it could not find .exitcode) as the same IAM user => √

The fact that I do not see the jobs entering the queue via the web interface suggests to me it's earlier in the process.
I think it's a sub account issue. Maybe I can pick the brains of the AWS person at the workshop later this month :)

Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 16:00
I'm getting `template: command not found` with conditional execution. I think I'm getting the syntax wrong:
process organise_valid {
    input:
    file classes_valid
    set val(level), file(valid) from vp

    script:
    if( level == 'none' )
    """
    template 'organise_valid.sh'
    """
    else if( level == 'high' )
    """
    echo "high!"
    """
    else
        error "Invalid level: ${level}"
}
Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 16:16
I just realised you don't triple quote the template command. Oops!
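For anyone searching later, the working version of the first branch is just the bare template statement (same process as above):

script:
if( level == 'none' )
    template 'organise_valid.sh'
else if( level == 'high' )
    """
    echo "high!"
    """
else
    error "Invalid level: ${level}"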
Paolo Di Tommaso
@pditommaso
Nov 12 2018 16:16
very good, achievement unlocked ;)
Benjamin Wingfield
@nebfold_gitlab
Nov 12 2018 16:17
I got my pipeline working with the conditional execution. Thanks again @pditommaso !
Paolo Di Tommaso
@pditommaso
Nov 12 2018 16:17
:+1:
tbugfinder
@tbugfinder
Nov 12 2018 18:33
@aunderwo Could you check within S3 if all "command files" are uploaded? Do you have a specific bucket policy in place so that your IAM user can read/write to it? Could you try to write to the bucket using aws-cli?
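e.g. something like this, with the same credentials NF is using (file name made up):

echo test > /tmp/nf-test.txt
aws s3 cp /tmp/nf-test.txt s3://ghru-torok-2014/workdir/nf-test.txt
aws s3 ls s3://ghru-torok-2014/workdir/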
Tobias "Tobi" Schraink
@tobsecret
Nov 12 2018 18:35

Hey folks, I got an error like the following:

ERROR ~ Error executing process > 'gatk_haplotype_caller_bp_resolution (SampleID ERS193625)'

Caused by:
  Process `gatk_haplotype_caller_bp_resolution` input file name collision -- There are multiple input files for each of the following file names: ERR015432.bam.bai, ERR015335.bam.bai, ERR015428.bam.bai, ERR404190.bam, ERR015432.bam, ERR404190.bam.bai, ERR015406.bam, ERR015428.bam, ERR015335.bam, ERR015406.bam.bai


Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details
[02/be99d8] Submitted process > gatk_haplotype_caller_gvcf (SampleID ERS311728)
WARN: Killing pending tasks (1)

How do I find the actual working directory? In .nextflow.log there is only one occurrence of gatk_haplotype_caller_bp_resolution (SampleID ERS193625) and it's that error message. Also, for some reason, using find to look for the working directory with the duplicated files yields an insane number of working directories (over 2000).
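Is nextflow log what I want here? Something like this, assuming the failed run is the most recent one:

nextflow log last -f name,workdir | grep bp_resolution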

Could this be related to me using storeDir for the raw data?