    Matt MacManes
    @macmanes
    where did you get your Salmon from, I’ll try and reproduce that on my end - or is this the Salmon bundled with transfuse?
    bundled salmon works fine
    ~/transfuse-0.5.0-linux-x86_64/bin$ ./salmon --help
    Allowed Options:
      -v [ --version ]      print version string
      --no-version-check    don't check with the server to see if this is the
                            latest version
      -h [ --help ]         produce help message
    
        Salmon v0.4.2
        ===============
    
        Please invoke salmon with one of the following commands {index, quant, swim}.
        For more inforation on the options for theses particular methods, use the -h
        flag along with the method name.  For example:
    
        salmon index -h
    
        will give you detailed help information about the index command.
    Chris Boursnell
    @cboursnell
    It was the salmon from here:
    https://github.com/COMBINE-lab/salmon/releases/download/v0.4.0/SalmonBeta-0.4.0_DebianSqueeze.tar.gz
    that gave the error. I don't know why i used that version. lemme check 0.4.2
    But the fact that it gave the same error implies there's a larger issue than something specific to transfuse :(
    Matt MacManes
    @macmanes
    yep, I get that same error with that version of Salmon
    Chris Boursnell
    @cboursnell
    Yup, same. weird...
    Matt MacManes
    @macmanes
    What you think, @rob-p?
    yep.
    Chris Boursnell
    @cboursnell
    That should be the exact same one that was bundled with transfuse though
    Matt MacManes
    @macmanes
    right. which brings me to my next question, mostly for another day.. What are the chances of getting an updated transfuse with recent versions of Salmon and TransRate?
    Chris Boursnell
    @cboursnell
    Yup, I will get on that. I've been writing up the paper so haven't done any work on the software recently.
    Matt MacManes
    @macmanes
    great news about the paper! I’ll stay tuned for updates
    Rob Patro
    @rob-p
    Hey guys, I think that the issue with the old version of salmon is that it's from before we had the "robust" binary building process. There were some strange libc issues in certain places. Those have mostly gone away (probably completely, but I don't want to jinx it) since I switched to Holy Build Box for making the binaries.
    Matt MacManes
    @macmanes
    so @cboursnell, how long does it take once 0.5 is pushed to rubygems for it to show up? Seems like this might be easiest workaround for me, at the moment.
    Lavinia Gordon
    @MrsLaviniaG_twitter
    Ran transfuse on 13 transcriptomes, very nice!, big fan, getting great metrics. Two queries, one of the output files, *_transfuse.data is empty, should I be concerned? Also can I confirm what the columns in _transfuse_cons_scores.csv are? (five columns - contig name, then four numerical columns, no header). Many thanks!
    Chris Boursnell
    @cboursnell
    @macmanes transfuse v0.5.0 is up on rubygems now. sorry for the delay.
    I have a feature that's nearly ready that was requested that will allow you to merge transcriptomes that were built with different read datasets. it'll need some testing as well though
    Maybe if I build the travelling-ruby package on a ubuntu16.04 machine then it'll work there. I'll try that next
    Richard Smith-Unna
    @blahah
    @cboursnell I think 16.04 was released after the version of travelling Ruby we build with
    and it might have new c abis included
    so updating travelling Ruby to the latest version should solve it
    Chris Boursnell
    @cboursnell
    Ah ok. Thanks. The version I used was 20150210, and there is a 20150715 version out that I will try
    Matt MacManes
    @macmanes
    AWS updated their default machines to 16.04, so transrate and transfuse are completely borked, basically because of the travelling Ruby issue. It’s especially hard when 1.0.3 requires certain versions of software, which are not the current versions, and which are not the same as the ones transfuse 0.5.0 uses. It’s a challenge when you need 3 versions of, for instance, Salmon installed. Salmon 0.4, 0.6, 0.72 all need to be available. @cboursnell @blahah
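Juggling several Salmon installs is easier if a script can confirm which one it is about to run. A minimal sketch, assuming the binary prints a banner like the `Salmon v0.4.2` line in the help output earlier in this thread; `salmon_version` and `version_ok?` are hypothetical helper names, not part of transfuse or transrate:

```ruby
require "rubygems" # provides Gem::Version; loaded by default in modern Ruby

# Extract the version string from Salmon's help banner, e.g. "Salmon v0.4.2".
# Returns nil when no banner is found.
def salmon_version(help_text)
  match = help_text.match(/Salmon v(\d+(?:\.\d+)*)/)
  match && match[1]
end

# True when the version that was found equals the version a tool expects.
def version_ok?(found, required)
  !found.nil? && Gem::Version.new(found) == Gem::Version.new(required)
end

banner = "    Salmon v0.4.2\n    ===============\n"
found = salmon_version(banner)
puts "found Salmon #{found}; matches 0.4.2? #{version_ok?(found, '0.4.2')}"
```

In practice the banner text would come from running each installed binary with `--help` and capturing its output; which version each tool actually needs is not settled in this thread.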
    Richard Smith-Unna
    @blahah
    gah sounds like a PITA
    we'll bring things in line
    Chris Boursnell
    @cboursnell
    @MrsLaviniaG_twitter I've cleaned up the output files and given them headers. We're working on doing a new release at the moment that works on 16.04 . Those four columns are score, p_good, p_bases_covered, coverage. They're taken directly from the transrate output.
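For older output files that still lack a header row, the column names given above can be attached while reading. A minimal sketch, assuming the file is comma-separated and the columns appear in the order listed (contig name first); `parse_cons_scores` is a hypothetical helper and the sample rows are made up:

```ruby
require "csv"

# Column names as described above; the file itself has no header row.
COLUMNS = %w[contig score p_good p_bases_covered coverage].freeze

# Parse headerless CSV text into an array of hashes keyed by column name.
def parse_cons_scores(text)
  CSV.parse(text).map do |row|
    record = COLUMNS.zip(row).to_h
    # Every column except the contig name is numeric.
    COLUMNS.drop(1).each { |key| record[key] = Float(record[key]) }
    record
  end
end

# Made-up rows for illustration only.
sample = "contig_1,0.42,0.80,0.95,12.5\ncontig_2,0.10,0.30,0.60,2.0\n"
rows = parse_cons_scores(sample)
rows.each { |r| puts "#{r['contig']}: score=#{r['score']}" }
```

For a real file, `File.read` on the `_transfuse_cons_scores.csv` path would supply the text.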
    ulrichkudahl
    @ulrichkudahl
    Hi Chris. I am currently writing up my thesis and was wondering if there was any particular way you would like to have Transfuse cited?
    Chris Boursnell
    @cboursnell
    Thanks for asking. I hope to have a preprint of the paper up on bioRxiv soon. When is your thesis going to be finished? Until a preprint is available I guess just cite the github page
    ulrichkudahl
    @ulrichkudahl
    I will use the GitHub page for now, but look forward to reading the preprint once it is ready. Let me know :)
    Drew-Oli
    @Drew-Oli
    Hi,
    Drew-Oli
    @Drew-Oli

    ...oops, I'm using the transfuse-0.5.0.tar.gz package version and it seems to be running fine. I just thought I'd share a few things I've found trying to get it running (huge caveat - I'm a biologist trying to be a bioinformatician!). Firstly, I got a warning from SNAP "FASTQ file doesn't end with a newline! Failing." - I'm fairly sure this was because my left.fq and right.fq files contained unpaired reads (I was using input read files produced by Trinity --trimmomatic option, which contain paired and unpaired reads). Second, SNAP failed because it "Ran out of scoring candidate pool entries". This was resolved by following a post by Richard Smith on a TransRate google group: You'll need to edit the function build_paired_cmd - specifically you want to add a line. Find the line:

    cmd << " -omax 10" # max alignments per pair/read

    And add a new line after it:

    cmd << " -mcp 10000000" # maximum candidate pool size
    ...Finally, Salmon kept failing when given multiple threads to run on, but seems to be working fine when run on only a single thread. Not sure if any of this is useful, but thought I'd share! Cheers, Drew
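The two-line edit described above can be sketched in isolation. This is a hypothetical stand-in, not the real `build_paired_cmd` from transrate, which sets more options than shown; `append_snap_limits` and the placeholder command prefix are made up for illustration:

```ruby
# Hypothetical stand-in for the part of build_paired_cmd quoted above.
# Appends the alignment limit and the enlarged candidate pool to a SNAP
# command string and returns the string.
def append_snap_limits(cmd, max_candidates: 10_000_000)
  cmd << " -omax 10" # max alignments per pair/read
  cmd << " -mcp #{max_candidates}" # maximum candidate pool size
  cmd
end

cmd = String.new("<snap command so far>") # placeholder, not a real command
puts append_snap_limits(cmd)
```

Exposing the pool size as a parameter rather than hard-coding it would let a caller raise it again if SNAP still reports that it ran out of candidate pool entries.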

    Lavinia Gordon
    @MrsLaviniaG_twitter
    Thanks @cboursnell for the column headers, appreciated. Ran transfuse on a single library with multiple assemblers - beautiful! Now I would like to combine several libraries to get a single transcriptome for a single tissue, but the simple approach does not work:
    whoops, sorry, always forget to switch modes:

    so ..

    /transfuse/transfuse-0.5.0-linux-x86_64/transfuse -a transrate_R20_transfuse_cons/good.R20_transfuse_cons.fa,transrate_R35_transfuse_cons/good.R35_transfuse_cons.fa -l R20/trinity/insilico_read_normalization/left.norm.fq,R35/trinity/insilico_read_normalization/left.norm.fq -r R20/trinity/insilico_read_normalization/right.norm.fq,R35/trinity/insilico_read_normalization/right.norm.fq -o Eyestalk

    /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/gems/2.2.0/gems/bundler-1.7.12/lib/bundler/runtime.rb:222: warning: Insecure world writable dir /usr/local/appl in PATH, mode 040777
    /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:201:in `populate_contig_data': undefined method `p_seq_true=' for nil:NilClass (NoMethodError)
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:154:in `block in analyse_read_mappings'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1739:in `each'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1122:in `block in foreach'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1273:in `open'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/lib/ruby/2.2.0/csv.rb:1121:in `foreach'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:151:in `analyse_read_mappings'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/read_metrics.rb:73:in `run'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/ruby/2.2.0/gems/transrate-1.0.1/lib/transrate/transrater.rb:86:in `read_metrics'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:152:in `block in transrate_consensus'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:148:in `chdir'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/lib/transfuse/transfuse.rb:148:in `transrate_consensus'
    from /transfuse/transfuse-0.5.0-linux-x86_64/lib/bin/transfuse:89:in `<main>'

    How can I pass two (or more) sets of normalized reads to transfuse? Many thanks.

    Lavinia Gordon
    @MrsLaviniaG_twitter
    Apologies all, ignore my previous post - the tried and tested method of removing everything and starting again appears to have worked, an intermediate file was causing problems, thanks.
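Long comma-separated argument lists like the ones in the command above are easy to get out of step. A minimal sketch of building them from Ruby arrays, using only the `-a`, `-l`, `-r` and `-o` flags that appear in that command; `transfuse_command` and the file names are hypothetical, and the thread does not confirm how transfuse pairs read files with assemblies:

```ruby
require "shellwords"

# Build a transfuse command line from lists of assemblies and read files.
# Raises if the left and right read lists are not the same length.
def transfuse_command(binary, assemblies:, left:, right:, output:)
  unless left.size == right.size
    raise ArgumentError, "left and right read lists differ in length"
  end
  [
    binary,
    "-a", assemblies.join(","),
    "-l", left.join(","),
    "-r", right.join(","),
    "-o", output
  ].shelljoin
end

puts transfuse_command(
  "transfuse",
  assemblies: ["R20.fa", "R35.fa"], # placeholder file names
  left: ["R20_left.fq", "R35_left.fq"],
  right: ["R20_right.fq", "R35_right.fq"],
  output: "Eyestalk"
)
```

Given the fix reported above, it may also be worth clearing leftover intermediate files from earlier runs before starting a new one.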
    Chris Boursnell
    @cboursnell
    I'm working on an update to both transfuse and transrate at the moment and one of the things i hope to include is better handling of some of the errors that people have had here. Glad it's working ok now @MrsLaviniaG_twitter
    ulrichkudahl
    @ulrichkudahl
    Chris, are you aware of any published studies where Transfuse has been used?
    Chris Boursnell
    @cboursnell
    I should check!
    Roger Huerlimann
    @RogerHuerlimann_twitter
    I'm quite excited to use transfuse on my assemblies and I was wondering what the difference between the two output files .fa and _cons.fa is?
    Chris Boursnell
    @cboursnell
    The _cons.fa file is consensus sequences from the clustering. This is filtered using transrate again to give the final output. You might find that they might not be very different, but often the final output is another improvement over just the consensus sequences. Hope this helps :)
    Roger Huerlimann
    @RogerHuerlimann_twitter
    Ah, I see! :smile: thanks! Are the "good" reads from transrate taken into account? My consensus file has 73,073 contigs, the good.* file has 45,420 contigs, and the final output has 72,665 contigs. It looks like the filtering from consensus to final is less stringent than the filtering in transrate?
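Contig counts like these can be checked by counting FASTA header lines. A minimal sketch, with made-up sequences and the hypothetical helpers `count_contigs` and `fraction_kept`; the counts passed at the end are the ones quoted in the message above:

```ruby
require "stringio"

# Count FASTA records by counting header lines (those starting with ">").
def count_contigs(io)
  io.each_line.count { |line| line.start_with?(">") }
end

# Fraction of contigs kept when going from one file to another.
def fraction_kept(before, after)
  after.to_f / before
end

fasta = StringIO.new(">a\nACGT\n>b\nGGCC\nTTAA\n") # made-up sequences
puts count_contigs(fasta)

# Counts quoted in the message above.
puts fraction_kept(73_073, 72_665).round(4) # consensus vs final output
puts fraction_kept(73_073, 45_420).round(4) # consensus vs good.* file
```

With real files, `File.open(path) { |f| count_contigs(f) }` would replace the `StringIO` stand-in.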
    Roger Huerlimann
    @RogerHuerlimann_twitter

    @cboursnell Sorry for posting yet another query. I compared the transrate scores of the _cons.fa file that transfuse automatically creates with a manual transrate analysis I ran on the final output file. I expected them to be fairly similar since both analyses use the same reads for the mapping and both files should be very similar. However, the transrate results for _cons.fa are much better than what I get for the final output file. Any ideas what the problem could be?

    >cat Transfuse_output_cons_stats.txt
    fragments         9955695
    fragments_mapped        9116349
    p_fragments_mapped      0.9156918728426293
    good_mappings   7881454
    p_good_mapping  0.7916528178093041
    bad_mappings    1234895
    potential_bridges       14537
    bases_uncovered 4923627
    p_bases_uncovered       0.03910550080631024
    contigs_uncovbase       51971
    p_contigs_uncovbase     0.3711286464098261
    contigs_uncovered       1846
    p_contigs_uncovered     0.013182418681044025
    contigs_lowcovered      110466
    p_contigs_lowcovered    0.7888456457314242
    contigs_segmented       8378
    p_contigs_segmented     0.05982790016781519
    assembly score: 0.44934868372400943
    optimal score : 0.5368280402996235
    cutoff        : 0.4987261911265521

    And this is result of manual transrate analysis of the final output file from transfuse:

    [ INFO] 2017-01-27 12:34:47 : -----------------------------------
    [ INFO] 2017-01-27 12:34:47 : fragments                   9955695
    [ INFO] 2017-01-27 12:34:47 : fragments mapped            4834496
    [ INFO] 2017-01-27 12:34:47 : p fragments mapped             0.49
    [ INFO] 2017-01-27 12:34:47 : good mappings               4124619
    [ INFO] 2017-01-27 12:34:47 : p good mapping                 0.41
    [ INFO] 2017-01-27 12:34:47 : bad mappings                 709877
    [ INFO] 2017-01-27 12:34:47 : potential bridges                 0
    [ INFO] 2017-01-27 12:34:47 : bases uncovered            57553164
    [ INFO] 2017-01-27 12:34:47 : p bases uncovered              0.47
    [ INFO] 2017-01-27 12:34:47 : contigs uncovbase             67609
    [ INFO] 2017-01-27 12:34:47 : p contigs uncovbase            0.49
    [ INFO] 2017-01-27 12:34:47 : contigs uncovered            138052
    [ INFO] 2017-01-27 12:34:47 : p contigs uncovered             1.0
    [ INFO] 2017-01-27 12:34:47 : contigs lowcovered           138052
    [ INFO] 2017-01-27 12:34:47 : p contigs lowcovered            1.0
    [ INFO] 2017-01-27 12:34:47 : contigs segmented             13159
    [ INFO] 2017-01-27 12:34:47 : p contigs segmented             0.1
    [ INFO] 2017-01-27 12:34:47 : Read metrics done in 1545 seconds
    [ INFO] 2017-01-27 12:34:47 : No reference provided, skipping comparative diagnostics
    [ INFO] 2017-01-27 12:35:31 : TRANSRATE ASSEMBLY SCORE     0.1434
    [ INFO] 2017-01-27 12:35:31 : -----------------------------------
    [ INFO] 2017-01-27 12:35:31 : TRANSRATE OPTIMAL SCORE      0.2202
    [ INFO] 2017-01-27 12:35:31 : TRANSRATE OPTIMAL CUTOFF     0.3812
    [ INFO] 2017-01-27 12:35:32 : good contigs                  89448
    [ INFO] 2017-01-27 12:35:32 : p good contigs                 0.65
    Roger Huerlimann
    @RogerHuerlimann_twitter

    It looks like it's a transrate problem. Not sure which version of transrate is bundled with transfuse, but after reading some of the comments of Matt MacManes in the transrate thread it looks like the bad mapping is limited to transrate 1.0.3.
    I ran transrate 1.0.1 on the second dataset shown above and I get much better results than with 1.0.3, and more comparable to the transrate output of _cons.fa produced by transfuse. Strangely, this only seems to be an issue when I run transrate on the output of transfuse, but not when I run it on the trinity output.

    [ INFO] 2017-01-31 14:31:31 : -----------------------------------
    [ INFO] 2017-01-31 14:31:31 : fragments                   9955695
    [ INFO] 2017-01-31 14:31:31 : fragments mapped            9119736
    [ INFO] 2017-01-31 14:31:31 : p fragments mapped             0.92
    [ INFO] 2017-01-31 14:31:31 : good mappings               7883571
    [ INFO] 2017-01-31 14:31:31 : p good mapping                 0.79
    [ INFO] 2017-01-31 14:31:31 : bad mappings                1236165
    [ INFO] 2017-01-31 14:31:31 : potential bridges             14300
    [ INFO] 2017-01-31 14:31:31 : bases uncovered             3231278
    [ INFO] 2017-01-31 14:31:31 : p bases uncovered              0.03
    [ INFO] 2017-01-31 14:31:31 : contigs uncovbase             49880
    [ INFO] 2017-01-31 14:31:31 : p contigs uncovbase            0.36
    [ INFO] 2017-01-31 14:31:31 : contigs uncovered               287
    [ INFO] 2017-01-31 14:31:31 : p contigs uncovered             0.0
    [ INFO] 2017-01-31 14:31:31 : contigs lowcovered           108491
    [ INFO] 2017-01-31 14:31:31 : p contigs lowcovered           0.79
    [ INFO] 2017-01-31 14:31:31 : contigs segmented              8238
    [ INFO] 2017-01-31 14:31:31 : p contigs segmented            0.06
    [ INFO] 2017-01-31 14:31:31 : Read metrics done in 4335 seconds
    [ INFO] 2017-01-31 14:31:31 : No reference provided, skipping comparative diagnostics
    [ INFO] 2017-01-31 14:31:31 : TRANSRATE ASSEMBLY SCORE     0.4622
    [ INFO] 2017-01-31 14:31:31 : -----------------------------------
    [ INFO] 2017-01-31 14:31:31 : TRANSRATE OPTIMAL SCORE      0.5372
    [ INFO] 2017-01-31 14:31:31 : TRANSRATE OPTIMAL CUTOFF     0.5038
    [ INFO] 2017-01-31 14:31:31 : good contigs                 102954
    [ INFO] 2017-01-31 14:31:31 : p good contigs                 0.75

    For completeness, this is the output when I run the _cons.fa file through transrate 1.0.3

    [ INFO] 2017-01-31 14:33:07 : -----------------------------------
    [ INFO] 2017-01-31 14:33:07 : fragments                   9955695
    [ INFO] 2017-01-31 14:33:07 : fragments mapped            4729856
    [ INFO] 2017-01-31 14:33:07 : p fragments mapped             0.48
    [ INFO] 2017-01-31 14:33:07 : good mappings               4034657
    [ INFO] 2017-01-31 14:33:07 : p good mapping                 0.41
    [ INFO] 2017-01-31 14:33:07 : bad mappings                 695199
    [ INFO] 2017-01-31 14:33:07 : potential bridges                 0
    [ INFO] 2017-01-31 14:33:07 : bases uncovered            60814931
    [ INFO] 2017-01-31 14:33:07 : p bases uncovered              0.48
    [ INFO] 2017-01-31 14:33:07 : contigs uncovbase             69981
    [ INFO] 2017-01-31 14:33:07 : p contigs uncovbase             0.5
    [ INFO] 2017-01-31 14:33:07 : contigs uncovered            140035
    [ INFO] 2017-01-31 14:33:07 : p contigs uncovered             1.0
    [ INFO] 2017-01-31 14:33:07 : contigs lowcovered           140035
    [ INFO] 2017-01-31 14:33:07 : p contigs lowcovered            1.0
    [ INFO] 2017-01-31 14:33:07 : contigs segmented             13130
    [ INFO] 2017-01-31 14:33:07 : p contigs segmented            0.09
    [ INFO] 2017-01-31 14:33:07 : Read metrics done in 4308 seconds
    [ INFO] 2017-01-31 14:33:07 : No reference provided, skipping comparative diagnostics
    [ INFO] 2017-01-31 14:33:41 : TRANSRATE ASSEMBLY SCORE      0.132
    [ INFO] 2017-01-31 14:33:41 : -----------------------------------
    [ INFO] 2017-01-31 14:33:41 : TRANSRATE OPTIMAL SCORE      0.2154
    [ INFO] 2017-01-31 14:33:41 : TRANSRATE OPTIMAL CUTOFF     0.3812
    [ INFO] 2017-01-31 14:33:42 : good contigs                  89018
    [ INFO] 2017-01-31 14:33:42 : p good contigs                 0.64
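Comparing logs like these by eye is error-prone, so a small parser can flag the metrics that moved. A minimal sketch, assuming every metric line has the `[ INFO] date time : name value` shape shown above; `parse_transrate_log` and `differing_metrics` are hypothetical helpers, and the sample lines are copied from the 1.0.1 and 1.0.3 logs in this message:

```ruby
# Matches lines such as
#   [ INFO] 2017-01-31 14:31:31 : p fragments mapped             0.92
# capturing the metric name and its numeric value.
LOG_LINE = /\A\s*\[\s*INFO\]\s+\S+\s+\S+\s+:\s+(.+?)\s+(-?\d+(?:\.\d+)?)\s*\z/

# Parse log text into a hash of metric name => value. Other lines are skipped.
def parse_transrate_log(text)
  text.each_line.with_object({}) do |line, metrics|
    match = LOG_LINE.match(line.chomp)
    metrics[match[1]] = match[2].to_f if match
  end
end

# Names of metrics whose values differ by more than `tolerance`.
def differing_metrics(run_a, run_b, tolerance: 0.05)
  (run_a.keys & run_b.keys).select do |key|
    (run_a[key] - run_b[key]).abs > tolerance
  end
end

v101 = parse_transrate_log(<<~LOG)
  [ INFO] 2017-01-31 14:31:31 : p fragments mapped             0.92
  [ INFO] 2017-01-31 14:31:31 : p good mapping                 0.79
  [ INFO] 2017-01-31 14:31:31 : p contigs segmented            0.06
LOG
v103 = parse_transrate_log(<<~LOG)
  [ INFO] 2017-01-31 14:33:07 : p fragments mapped             0.48
  [ INFO] 2017-01-31 14:33:07 : p good mapping                 0.41
  [ INFO] 2017-01-31 14:33:07 : p contigs segmented            0.09
LOG
puts differing_metrics(v101, v103).inspect
```

The tolerance only makes sense for proportions; raw counts such as `fragments mapped` would need a relative comparison instead.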
    Matt MacManes
    @macmanes
    Hey @cboursnell, how bad will transfuse break if I swap in binaries for updated salmon and transrate? Is the different column format in quant.sf in Salmon going to be a problem?
    cd-ccmb
    @cd-ccmb
    I intend to run transfuse on 15 assemblies. My first question is whether I should give it the normalized reads used for assembly or the clean reads without normalization.
    My second is what the time and space requirements are for such a run.
    cd-ccmb
    @cd-ccmb
    Additionally, how do we pass two different libraries used for assembly?
    Stuart Willis
    @stuartwillis

    @cboursnell do you know of a reason offhand why transfuse would fail on certain (Transabyss) assemblies but not others (Binpacker, Trinity) because of what seems like a memory leak during the salmon stage? e.g.

    processed 8000000 reads in current roundterminate called after throwing an instance of 'std::bad_alloc'
      what():  std::bad_alloc
    
    /var/lib/gems/2.3.0/gems/transrate-1.0.1/lib/transrate/salmon.rb:27:in `run': Salmon failed (Transrate::SalmonError)

    If I watch htop the RAM usage climbs to several times a normal run (>75Gb) and then dies. Same assemblies fail with transrate 1.0.1 separately at the same stage too. I previously had this happen on certain Transabyss assemblies from a different dataset but was able to remake the assembly but never identified the specific error.

    Stuart Willis
    @stuartwillis
    I can reproduce the error running salmon 0.4.2 directly on the bam created by snap, and transrate 1.0.3/salmon 0.6.0 work fine on the same assemblies.
    MD Sharma
    @MDSharma

    Hi @cboursnell any thoughts on why transfuse only uses the first read-pair name in making the bam alignments whilst within the transrate steps? Just trying to debug a failed run that seems to crash with snap error as follows:

    Loading index from directory... 2s.  832500803 bases, seed size 23
    Aligning.
    Welcome to SNAP version 1.0beta.18.
    
    sched_setaffinity: Invalid argument
    sched_setaffinity: Invalid argument
    Ran out of scoring candidate pool entries.  Perhaps trying with a larger value of **-mcp** will help.
    SNAP exited with exit code 1 from line 489 of file SNAPLib/IntersectingPairedEndAligner.cpp

    Does transfuse allow tweaking of underlying snap options?