These are chat archives for nellore/rail

22nd
Nov 2016
Bernt Popp
@berntpopp_twitter
Nov 22 2016 00:04
Magic-BLAST just soft-clips the supporting reads. With BLAT (pipeline adapted from the Trinity GitHub repository), the alignment is correct but does not get converted to a splice junction by Inchworm. Rail-RNA aligns some reads to the canonical splice site and adds an "insertion", while other reads are being soft-clipped. Any ideas on parameters to improve this output? How do you handle such complex splicing events in Rail-RNA and the accompanying databases (intropolis/recount)?
buci
@buci
Nov 22 2016 00:56
yeah, rail's also missing flags and mate info in each bam record
so if IGV uses that, it's a problem
so what are you looking for in improved output?
rail is attempting to maximize the number of correctly aligned bases per read
there are soft-clipped reads typically if we're sure we've successfully aligned one side of a splice junction and are confident about one splice site, but we're unsure where to place the other
if you could clarify your objective i can probably help more
i'll tell you if you want the number of mates that map across a given junction, you should use one of the junction output files in cross_sample_outputs
buci
@buci
Nov 22 2016 01:02
all we're doing in recount and intropolis is counting the number of mates (for paired-end samples) or reads (for single-end samples) that overlap a junction; if there are two junctions overlapped by a mate or read, recount/intropolis increment the coverages of both junctions
so recount and intropolis don't care about complex splicing events
they're annotation-agnostic
well ... ok, true of intropolis, not of recount
recount uses ucsc genes to define exons
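A minimal sketch of the counting rule described above, assuming each mate or single-end read is given as a (chromosome, 1-based leftmost position, CIGAR) triple and that a junction is the reference span of an `N` operation. The function names and the example alignments are made up for illustration; this is not the actual recount/intropolis code.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions_from_cigar(chrom, pos, cigar):
    """Return (chrom, intron_start, intron_end) for every N operation.

    pos is the 1-based leftmost reference position, as in SAM.
    Intron coordinates are 1-based and inclusive.
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((chrom, ref, ref + length - 1))
        if op in "MDN=X":  # operations that consume reference bases
            ref += length
    return junctions

def count_junctions(alignments):
    """Count mates/reads per junction; a mate spanning two junctions
    increments the coverage of both."""
    counts = Counter()
    for chrom, pos, cigar in alignments:
        # set(): each mate/read contributes at most once per junction
        for junction in set(junctions_from_cigar(chrom, pos, cigar)):
            counts[junction] += 1
    return counts

# Hypothetical alignments, for illustration only.
alignments = [
    ("chr1", 100, "50M200N50M"),         # overlaps one junction
    ("chr1", 120, "30M200N40M300N30M"),  # overlaps two junctions
    ("chr1", 140, "10M5S"),              # soft-clipped, no junction
]
counts = count_junctions(alignments)
```

Note that the soft-clipped read adds nothing to any junction count, which is why a junction supported only by soft-clipped reads never shows up in this kind of output.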
Bernt Popp
@berntpopp_twitter
Nov 22 2016 17:54
Hey, what I am currently looking for is a "natural" representation of non-canonical splicing events. The problem seems to be that most spliced alignment pipelines use some internal prior knowledge for assigning splice junctions. This is either a BED transcript annotation file or the presence of a canonical AG/GT acceptor/donor sequence in the reference. This leads to wrong or missing alignments in the resulting BAM files for splice-site mutations which introduce a new splice site, as the aligner does not have any information about the change in the DNA sequence. In the example above, a new splice acceptor is generated by a germline mutation at position -12, which leads to the inclusion of 10 intronic base pairs in the new transcript. No software has yet aligned this splicing event correctly and called this new junction.
I would very much like to see this splice junction in the alignment and be able to call it, so one could filter it against controls and thus automatically detect pathogenic splicing events without the need for manual review of the alignments.
The only software dealing with this particular problem that I have found has not been updated for two years (http://pvaas.sourceforge.net/).
Bernt Popp
@berntpopp_twitter
Nov 22 2016 18:02
This problem will become more important in the future, as we will see higher-coverage RNA-Seq and comparative calling against exome/genome data (like in cancer genetics tumor/normal, for LOH, monoallelic expression ...).
buci
@buci
Nov 22 2016 18:58
so you might be interested in looking at v1 of mapsplice
it's one of the few tools i know of that i believe doesn't look for splice-site motifs (GT-AG and so on) when identifying junctions
this probably increases its FP rate significantly
"MapSplice is not dependent on splice site features or intron length, consequently it can detect novel canonical as well as non-canonical splices. "
i'm pretty sure the most recent version of mapsplice does use splice-site features to identify junctions though
also ask that question at https://www.biostars.org/
there may be better answers
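To make the splice-site motif check mentioned above concrete, here is a rough sketch, assuming 0-based half-open intron coordinates on a plain reference string. The motif set (GT-AG plus the rarer GC-AG and AT-AC), the function names, and the toy sequence are assumptions for illustration, not how any particular aligner implements it.

```python
# Donor/acceptor dinucleotide pairs commonly treated as known motifs.
KNOWN_MOTIFS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def intron_motif(ref_seq, intron_start, intron_end, strand="+"):
    """Return (donor, acceptor) dinucleotides of an intron.

    Coordinates are 0-based, half-open, on the forward reference strand.
    """
    intron = ref_seq[intron_start:intron_end].upper()
    if strand == "-":
        intron = revcomp(intron)
    return intron[:2], intron[-2:]

def has_known_motif(ref_seq, intron_start, intron_end, strand="+"):
    return intron_motif(ref_seq, intron_start, intron_end, strand) in KNOWN_MOTIFS

# Toy reference: 4 exonic bases, an 18-base intron, 4 exonic bases.
ref = "AAAC" + "GTAAGTTTTCCCTTTCAG" + "GGTT"

annotated = has_known_motif(ref, 4, 22)  # GT...AG intron
# An acceptor used 10 bases upstream (as after a splice-site mutation)
# ends in "TT" on the unmutated reference, so a motif filter rejects it.
shifted = has_known_motif(ref, 4, 12)
```

The `shifted` case shows why a motif-based filter cannot recover a junction created by a mutation: the reference sequence it checks does not contain the new acceptor.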
buci
@buci
Nov 22 2016 19:04
another option is to "hack it" yourself
create a reference transcript sequence with exactly the variant you're looking for
and add it to either hg38 or a reference transcriptome
then create the fm index for whatever aligner you're using and realign
rail probably isn't the tool you're looking for though
i'm sorry
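The first step of the "hack it" approach above could look roughly like this, assuming a single-base substitution and a reference sequence already loaded as a string. The sequence, position, and contig name below are hypothetical; the resulting FASTA record would then be appended to the reference and the index rebuilt with the aligner's own indexing tool.

```python
def apply_substitution(seq, pos, ref_base, alt_base):
    """Return seq with the base at 1-based position pos replaced.

    Raises ValueError if the reference base does not match, which
    catches off-by-one and wrong-strand mistakes early.
    """
    found = seq[pos - 1]
    if found.upper() != ref_base.upper():
        raise ValueError(
            f"expected {ref_base} at position {pos}, found {found}"
        )
    return seq[:pos - 1] + alt_base + seq[pos:]

def fasta_record(name, seq, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"

# Hypothetical example: a T>A substitution at position 5 of a toy sequence.
wild_type = "ACGTTTCCAGGT"
mutated = apply_substitution(wild_type, 5, "T", "A")
record = fasta_record("variant_contig", mutated)
```

Giving the variant sequence its own contig name keeps alignments to it easy to separate from alignments to the unmodified reference.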
Bernt Popp
@berntpopp_twitter
Nov 22 2016 19:37
Thanks a lot for your help! The idea of introducing the mutation into the reference is what I am currently testing :)
Anyway, rail, intropolis and recount are fantastic resources. Thanks for these!
buci
@buci
Nov 22 2016 21:05
thanks for checking them out, and good luck! feel free to come back if you use them again and need help