These are chat archives for dereneaton/ipyrad

Feb 2017
Feb 06 2017 16:55
Hi @isaacovercast and @dereneaton, I am running the denovo+reference pipeline and I am getting some conflicting results. When I look at the s7 stats, I see that for some of my samples the number of refseq_mapped_reads is higher than the number of reads_raw. Also, when I add the number of refseq_mapped_reads and the number of refseq_unmapped_reads, the sum is not equal to the number of raw reads (it is almost twice as large). Am I interpreting the results wrong? Also, to my surprise, I am recovering more loci with the denovo pipeline than with denovo+reference. Any thoughts on why the results I am getting are totally different from what I was expecting? THANKS!
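The sanity check described in this message can be sketched in Python. This is only an illustration: the stat names come from the message, the numbers in the usage line are made up, and it assumes every raw read is counted at most once as either mapped or unmapped, which may not hold for paired-end data or after filtering.

```python
def check_read_counts(reads_raw, mapped, unmapped):
    """Return a list of inconsistencies in one sample's read counts.

    Assumption (not confirmed by ipyrad): each raw read is counted at most
    once, as either mapped or unmapped.
    """
    problems = []
    if mapped > reads_raw:
        problems.append("refseq_mapped_reads exceeds reads_raw")
    if mapped + unmapped > reads_raw:
        problems.append("mapped + unmapped exceeds reads_raw")
    return problems


# Made-up numbers, for illustration only.
print(check_read_counts(reads_raw=1000, mapped=1200, unmapped=700))
```

An empty list means the counts are at least internally consistent under that assumption; it says nothing about whether they are correct.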
Emily Warschefsky
Feb 06 2017 19:02
@isaacovercast - thanks, I bet that will fix the problem!
Isaac Overcast
Feb 06 2017 20:58
@LinaValencia85 You are correct that the mapped and unmapped counts are goofy. This is a known bug and I just haven't fixed it yet (#201).
As for the difference between denovo and denovo+reference, it does seem counterintuitive, but I can imagine a situation where this happens. For instance, if the clustering_threshold parameter is set too high, then denovo might oversplit loci. I guess it depends on how closely related your samples are, what your clustering threshold is, and the distance from your samples to the reference genome.
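To make the oversplitting idea concrete, here is a toy Python sketch. It is not how ipyrad actually clusters reads; the `identity` and `same_cluster` helpers and both sequences are hypothetical, and the sequences are assumed to be already aligned and of equal length.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def same_cluster(a, b, threshold):
    """True if the two sequences are similar enough to be grouped together."""
    return identity(a, b) >= threshold


# Two made-up 20 bp sequences that differ at the last two positions.
seq1 = "ACGTACGTACGTACGTACGT"
seq2 = "ACGTACGTACGTACGTACAA"

for threshold in (0.85, 0.95):
    print(threshold, same_cluster(seq1, seq2, threshold))
```

With the lower threshold the two sequences fall into one cluster; with the higher one they are split, so the same data would be counted as two loci instead of one.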
Feb 06 2017 21:22
@isaacovercast Thanks for the info! Is either of the estimates (mapped or unmapped reads) accurate, or are both wrong? And regarding denovo vs. reference, I am still trying to find an explanation. What puzzles me is that I used a clustering threshold of 0.85, and when using the denovo+reference pipeline for some samples for which a reference genome exists, I still get significantly fewer loci.
Amely Martins
Feb 06 2017 21:41
Hi @isaacovercast, I'm working with @LinaValencia85 on the denovo+reference assembly method. I've run samtools flagstat on the .bam files that are saved in the _refmapping folder. The stats don't match the stats from s7, and they are very different from what I got when running BWA directly. Any thoughts?
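When comparing counts like these, it can help to remember that samtools flagstat counts alignment records rather than unique reads, so secondary and supplementary alignments can inflate the totals. Whether that explains the mismatch here is not confirmed in this thread. A minimal Python sketch, assuming SAM-formatted text lines as input (for example from `samtools view -h`), separates primary reads from the extra records using the standard SAM flag bits; the function name `count_primary_reads` is made up for this example.

```python
# Standard SAM flag bits.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_primary_reads(sam_lines):
    """Count primary mapped/unmapped records and extra alignment records.

    sam_lines: iterable of SAM text lines (header lines starting with '@'
    are skipped).
    """
    counts = {"mapped": 0, "unmapped": 0, "secondary_or_supplementary": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & (SECONDARY | SUPPLEMENTARY):
            counts["secondary_or_supplementary"] += 1
        elif flag & UNMAPPED:
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
    return counts
```

For paired-end data each mate is its own record, so the primary counts here are per mate, not per pair.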