~/transfuse-0.5.0-linux-x86_64/bin$ ./salmon --help
Allowed Options:
-v [ --version ] print version string
--no-version-check don't check with the server to see if this is the
latest version
-h [ --help ] produce help message
Salmon v0.4.2
===============
Please invoke salmon with one of the following commands {index, quant, swim}.
For more inforation on the options for theses particular methods, use the -h
flag along with the method name. For example:
salmon index -h
will give you detailed help information about the index command.
transfuse
:(
...oops, I'm using the transfuse-0.5.0.tar.gz package version and it seems to be running fine. I just thought I'd share a few things I found while getting it running (huge caveat: I'm a biologist trying to be a bioinformatician!).
Firstly, I got a warning from SNAP: "FASTQ file doesn't end with a newline! Failing." I'm fairly sure this was because my left.fq and right.fq files contained unpaired reads (I was using input read files produced by the Trinity --trimmomatic option, which contain both paired and unpaired reads).
Second, SNAP failed because it "Ran out of scoring candidate pool entries". This was resolved by following a post by Richard Smith on the TransRate Google group: you need to edit the function build_paired_cmd and add one line. Find the line:
cmd << " -omax 10" # max alignments per pair/read
And add a new line after it:
cmd << " -mcp 10000000" # maximum candidate pool size
Finally, Salmon kept failing when given multiple threads, but seems to work fine when run on a single thread. Not sure if any of this is useful, but thought I'd share! Cheers, Drew
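The first problem above (a FASTQ file that does not end with a newline, possibly with unpaired reads mixed in) can be checked before launching a run. The following is a minimal sketch, assuming uncompressed FASTQ files with exactly four lines per record; the function names are hypothetical and are not part of transfuse, transrate, or SNAP.

```python
import os


def ends_with_newline(path):
    """Return True if the file is non-empty and its last byte is a newline."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return False
        handle.seek(-1, os.SEEK_END)
        return handle.read(1) == b"\n"


def count_fastq_records(path):
    """Count records, assuming exactly four lines per FASTQ record."""
    with open(path, "rb") as handle:
        lines = sum(1 for _ in handle)
    if lines % 4 != 0:
        raise ValueError(f"{path}: {lines} lines is not a multiple of 4")
    return lines // 4


def check_pair(left, right):
    """Return a list of problems found; an empty list means the pair looks sane."""
    problems = []
    for path in (left, right):
        if not ends_with_newline(path):
            problems.append(f"{path} does not end with a newline")
    n_left = count_fastq_records(left)
    n_right = count_fastq_records(right)
    if n_left != n_right:
        problems.append(f"record counts differ: {n_left} vs {n_right}")
    return problems
```

Calling `check_pair("left.fq", "right.fq")` returns a list of messages; differing record counts would be consistent with unpaired reads in one of the files, though this check does not verify that read names match.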
so ..
/transfuse/transfuse-0.5.0-linux-x86_64/transfuse -a transrate_R20_transfuse_cons/good.R20_transfuse_cons.fa,transrate_R35_transfuse_cons/good.R35_transfuse_cons.fa -l R20/trinity/insilico_read_normalization/left.norm.fq,R35/trinity/insilico_read_normalization/left.norm.fq -r R20/trinity/insilico_read_normalization/right.norm.fq,R35/trinity/insilico_read_normalization/right.norm.fq -o Eyestalk
/transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/gems/2.2.0/gems/bundler-1.7.12/lib/bundler/runtime.rb:222: warning: Insecure world writable dir /usr/local/appl in PATH, mode 040777
/transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:201:in `populate_contig_data': undefined method `p_seq_true=' for nil:NilClass (NoMethodError)
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:154:in `block in analyse_read_mappings'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1739:in `each'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1122:in `block in foreach'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1273:in `open'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1121:in `foreach'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:151:in `analyse_read_mappings'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:73:in `run'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/transrater.rb:86:in `read_metrics'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:152:in `block in transrate_consensus'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:148:in `chdir'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:148:in `transrate_consensus'
from /transfuse/transfuse-0.5.0-linux-x86_64/lib/bin/transfuse:89:in `<main>'
How can I pass two (or more) sets of normalized reads to transfuse? Many thanks.
The _cons.fa file contains the consensus sequences from the clustering. This is filtered using transrate again to give the final output. You may find that the two are not very different, but often the final output is a further improvement over the consensus sequences alone. Hope this helps :)
@cboursnell Sorry for posting yet another query. I compared the transrate scores of the _cons.fa file that transfuse automatically creates with a manual transrate analysis I ran on the final output file. I expected them to be fairly similar, since both analyses use the same reads for the mapping and both files should be very similar. However, the transrate results for _cons.fa are much better than what I get for the final output file. Any ideas what the problem could be?
>cat Transfuse_output_cons_stats.txt
fragments 9955695
fragments_mapped 9116349
p_fragments_mapped 0.9156918728426293
good_mappings 7881454
p_good_mapping 0.7916528178093041
bad_mappings 1234895
potential_bridges 14537
bases_uncovered 4923627
p_bases_uncovered 0.03910550080631024
contigs_uncovbase 51971
p_contigs_uncovbase 0.3711286464098261
contigs_uncovered 1846
p_contigs_uncovered 0.013182418681044025
contigs_lowcovered 110466
p_contigs_lowcovered 0.7888456457314242
contigs_segmented 8378
p_contigs_segmented 0.05982790016781519
assembly score: 0.44934868372400943
optimal score : 0.5368280402996235
cutoff : 0.4987261911265521
And this is the result of the manual transrate analysis of the final output file from transfuse:
[ INFO] 2017-01-27 12:34:47 : -----------------------------------
[ INFO] 2017-01-27 12:34:47 : fragments 9955695
[ INFO] 2017-01-27 12:34:47 : fragments mapped 4834496
[ INFO] 2017-01-27 12:34:47 : p fragments mapped 0.49
[ INFO] 2017-01-27 12:34:47 : good mappings 4124619
[ INFO] 2017-01-27 12:34:47 : p good mapping 0.41
[ INFO] 2017-01-27 12:34:47 : bad mappings 709877
[ INFO] 2017-01-27 12:34:47 : potential bridges 0
[ INFO] 2017-01-27 12:34:47 : bases uncovered 57553164
[ INFO] 2017-01-27 12:34:47 : p bases uncovered 0.47
[ INFO] 2017-01-27 12:34:47 : contigs uncovbase 67609
[ INFO] 2017-01-27 12:34:47 : p contigs uncovbase 0.49
[ INFO] 2017-01-27 12:34:47 : contigs uncovered 138052
[ INFO] 2017-01-27 12:34:47 : p contigs uncovered 1.0
[ INFO] 2017-01-27 12:34:47 : contigs lowcovered 138052
[ INFO] 2017-01-27 12:34:47 : p contigs lowcovered 1.0
[ INFO] 2017-01-27 12:34:47 : contigs segmented 13159
[ INFO] 2017-01-27 12:34:47 : p contigs segmented 0.1
[ INFO] 2017-01-27 12:34:47 : Read metrics done in 1545 seconds
[ INFO] 2017-01-27 12:34:47 : No reference provided, skipping comparative diagnostics
[ INFO] 2017-01-27 12:35:31 : TRANSRATE ASSEMBLY SCORE 0.1434
[ INFO] 2017-01-27 12:35:31 : -----------------------------------
[ INFO] 2017-01-27 12:35:31 : TRANSRATE OPTIMAL SCORE 0.2202
[ INFO] 2017-01-27 12:35:31 : TRANSRATE OPTIMAL CUTOFF 0.3812
[ INFO] 2017-01-27 12:35:32 : good contigs 89448
[ INFO] 2017-01-27 12:35:32 : p good contigs 0.65
It looks like it's a transrate problem. I'm not sure which version of transrate is bundled with transfuse, but after reading some of Matt MacManes's comments in the transrate thread, it looks like the bad mapping is limited to transrate 1.0.3.
I ran transrate 1.0.1 on the second dataset shown above and I get much better results than with 1.0.3, more comparable to the transrate output for the _cons.fa file produced by transfuse. Strangely, this only seems to be an issue when I run transrate on the output of transfuse, not when I run it on the Trinity output.
[ INFO] 2017-01-31 14:31:31 : -----------------------------------
[ INFO] 2017-01-31 14:31:31 : fragments 9955695
[ INFO] 2017-01-31 14:31:31 : fragments mapped 9119736
[ INFO] 2017-01-31 14:31:31 : p fragments mapped 0.92
[ INFO] 2017-01-31 14:31:31 : good mappings 7883571
[ INFO] 2017-01-31 14:31:31 : p good mapping 0.79
[ INFO] 2017-01-31 14:31:31 : bad mappings 1236165
[ INFO] 2017-01-31 14:31:31 : potential bridges 14300
[ INFO] 2017-01-31 14:31:31 : bases uncovered 3231278
[ INFO] 2017-01-31 14:31:31 : p bases uncovered 0.03
[ INFO] 2017-01-31 14:31:31 : contigs uncovbase 49880
[ INFO] 2017-01-31 14:31:31 : p contigs uncovbase 0.36
[ INFO] 2017-01-31 14:31:31 : contigs uncovered 287
[ INFO] 2017-01-31 14:31:31 : p contigs uncovered 0.0
[ INFO] 2017-01-31 14:31:31 : contigs lowcovered 108491
[ INFO] 2017-01-31 14:31:31 : p contigs lowcovered 0.79
[ INFO] 2017-01-31 14:31:31 : contigs segmented 8238
[ INFO] 2017-01-31 14:31:31 : p contigs segmented 0.06
[ INFO] 2017-01-31 14:31:31 : Read metrics done in 4335 seconds
[ INFO] 2017-01-31 14:31:31 : No reference provided, skipping comparative diagnostics
[ INFO] 2017-01-31 14:31:31 : TRANSRATE ASSEMBLY SCORE 0.4622
[ INFO] 2017-01-31 14:31:31 : -----------------------------------
[ INFO] 2017-01-31 14:31:31 : TRANSRATE OPTIMAL SCORE 0.5372
[ INFO] 2017-01-31 14:31:31 : TRANSRATE OPTIMAL CUTOFF 0.5038
[ INFO] 2017-01-31 14:31:31 : good contigs 102954
[ INFO] 2017-01-31 14:31:31 : p good contigs 0.75
For completeness, this is the output when I run the _cons.fa file through transrate 1.0.3:
[ INFO] 2017-01-31 14:33:07 : -----------------------------------
[ INFO] 2017-01-31 14:33:07 : fragments 9955695
[ INFO] 2017-01-31 14:33:07 : fragments mapped 4729856
[ INFO] 2017-01-31 14:33:07 : p fragments mapped 0.48
[ INFO] 2017-01-31 14:33:07 : good mappings 4034657
[ INFO] 2017-01-31 14:33:07 : p good mapping 0.41
[ INFO] 2017-01-31 14:33:07 : bad mappings 695199
[ INFO] 2017-01-31 14:33:07 : potential bridges 0
[ INFO] 2017-01-31 14:33:07 : bases uncovered 60814931
[ INFO] 2017-01-31 14:33:07 : p bases uncovered 0.48
[ INFO] 2017-01-31 14:33:07 : contigs uncovbase 69981
[ INFO] 2017-01-31 14:33:07 : p contigs uncovbase 0.5
[ INFO] 2017-01-31 14:33:07 : contigs uncovered 140035
[ INFO] 2017-01-31 14:33:07 : p contigs uncovered 1.0
[ INFO] 2017-01-31 14:33:07 : contigs lowcovered 140035
[ INFO] 2017-01-31 14:33:07 : p contigs lowcovered 1.0
[ INFO] 2017-01-31 14:33:07 : contigs segmented 13130
[ INFO] 2017-01-31 14:33:07 : p contigs segmented 0.09
[ INFO] 2017-01-31 14:33:07 : Read metrics done in 4308 seconds
[ INFO] 2017-01-31 14:33:07 : No reference provided, skipping comparative diagnostics
[ INFO] 2017-01-31 14:33:41 : TRANSRATE ASSEMBLY SCORE 0.132
[ INFO] 2017-01-31 14:33:41 : -----------------------------------
[ INFO] 2017-01-31 14:33:41 : TRANSRATE OPTIMAL SCORE 0.2154
[ INFO] 2017-01-31 14:33:41 : TRANSRATE OPTIMAL CUTOFF 0.3812
[ INFO] 2017-01-31 14:33:42 : good contigs 89018
[ INFO] 2017-01-31 14:33:42 : p good contigs 0.64
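To compare runs like the ones pasted above side by side, the metric lines can be parsed into a dictionary and diffed. This is a minimal sketch, assuming the log lines look exactly like those in this thread (an `[ INFO]` prefix, a date, a time, a colon, then a metric name followed by a single number); the function names are hypothetical and this is not part of transrate.

```python
import re

# Matches lines such as "[ INFO] 2017-01-31 14:33:07 : p fragments mapped 0.48".
# Lines whose last token is not a number (separators, timing notes) are skipped.
LINE_RE = re.compile(
    r"^\[\s*INFO\]\s+\S+\s+\S+\s+:\s+(?P<name>.+?)\s+(?P<value>-?\d+(?:\.\d+)?)\s*$"
)


def parse_transrate_log(text):
    """Return {metric name (lower-cased): value as float} for every metric line."""
    metrics = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            metrics[match.group("name").lower()] = float(match.group("value"))
    return metrics


def compare_runs(first, second):
    """Return {metric: (first, second, second - first)} for metrics in both runs."""
    return {
        name: (first[name], second[name], second[name] - first[name])
        for name in first
        if name in second
    }


if __name__ == "__main__":
    # Two lines copied from the 1.0.1 and 1.0.3 runs quoted in this thread.
    run_101 = "[ INFO] 2017-01-31 14:31:31 : p fragments mapped 0.92"
    run_103 = "[ INFO] 2017-01-31 14:33:07 : p fragments mapped 0.48"
    diff = compare_runs(parse_transrate_log(run_101), parse_transrate_log(run_103))
    for name, (a, b, delta) in diff.items():
        print(f"{name}: {a} -> {b} ({delta:+.2f})")
```

Pasting the full log of each run into the two strings lists every shared metric, which makes the roughly halved mapping rate between the two versions easy to spot.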
@cboursnell do you know of a reason offhand why transfuse would fail on certain (Transabyss) assemblies but not others (Binpacker, Trinity) because of what seems like a memory leak during the salmon stage? e.g.
processed 8000000 reads in current round
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
/var/lib/gems/2.3.0/gems/transrate-1.0.1/lib/transrate/salmon.rb:27:in `run': Salmon failed (Transrate::SalmonError)
If I watch htop, RAM usage climbs to several times that of a normal run (>75 GB) and then the process dies. The same assemblies also fail at the same stage when run through transrate 1.0.1 separately. I previously had this happen with certain Transabyss assemblies from a different dataset; I was able to remake the assembly, but never identified the specific error.
Hi @cboursnell, any thoughts on why transfuse only uses the first read-pair name when making the BAM alignments within the transrate steps? I'm trying to debug a failed run that seems to crash with a SNAP error, as follows:
Loading index from directory... 2s. 832500803 bases, seed size 23
Aligning.
Welcome to SNAP version 1.0beta.18.
sched_setaffinity: Invalid argument
sched_setaffinity: Invalid argument
Ran out of scoring candidate pool entries. Perhaps trying with a larger value of **-mcp** will help.
SNAP exited with exit code 1 from line 489 of file SNAPLib/IntersectingPairedEndAligner.cpp
Does transfuse allow tweaking of underlying snap options?
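The failure messages quoted in this thread can be spotted automatically in captured output, which may help when triaging a failed run. The sketch below is a hypothetical helper, not part of transfuse: the substrings are copied from the error output posted above, and the hints only repeat what posters here reported or suspected.

```python
# Substrings copied from error output quoted in this thread, each paired with
# the remedy or suspicion that posters reported. These are not official advice.
KNOWN_FAILURES = [
    ("Ran out of scoring candidate pool entries",
     "SNAP: try a larger -mcp value"),
    ("FASTQ file doesn't end with a newline",
     "SNAP: check for truncated files or unpaired reads"),
    ("std::bad_alloc",
     "Salmon: memory allocation failed; watch RAM usage"),
]


def triage(log_text):
    """Return the hints whose failure message appears in the captured output."""
    return [hint for needle, hint in KNOWN_FAILURES if needle in log_text]


if __name__ == "__main__":
    sample = (
        "Aligning.\n"
        "Ran out of scoring candidate pool entries. "
        "Perhaps trying with a larger value of -mcp will help.\n"
    )
    for hint in triage(sample):
        print(hint)
```

Plain substring matching is used on purpose: the messages are fixed strings, so a regular expression would add nothing.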