These are chat archives for dereneaton/ipyrad

May 2017
Gemma Clucas
May 26 2017 16:22
Hi, I'm new to ipyrad so sorry if this is a dumb question and/or this is the wrong place to ask. I've done a reference-aligned analysis and I'm a little unsure of the results I'm getting for the mapping, from the assemblyname_stats.txt file. I was expecting refseq_mapped_reads + refseq_unmapped_reads == reads_passed_filter, but there's a large deficit. Is this due to dereplication, i.e. are refseq_mapped_reads and refseq_unmapped_reads counted after dereplication, so the numbers reported are lower? Or do I have tons of reads being thrown out because they multi-mapped or have been filtered some other way? I just want to figure out my mapping success rate, and at the moment refseq_mapped_reads is about 10% of reads_passed_filter, which is a little worrying! Thanks for creating such an easy-to-use pipeline :)
Isaac Overcast
May 26 2017 18:26
@DrGemClucas_twitter Glad you like it! This question is not dumb at all, and this is exactly the right place to ask. In fact, you are correct that the difference between reads passing the filter and mapped+unmapped reads is due to dereplication. We dereplicate identical reads before the mapping/clustering step, which typically reduces the number of reads by about 90%, give or take.
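To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the three counts have been copied by hand from the stats file; the function name `mapping_summary` and the example numbers are hypothetical, not ipyrad output or part of the ipyrad API. Because the mapped and unmapped counts are taken after dereplication, the mapping rate is computed against their sum rather than against reads_passed_filter.

```python
def mapping_summary(reads_passed_filter, refseq_mapped_reads, refseq_unmapped_reads):
    """Summarize mapping success from three per-sample counts.

    Hypothetical helper: the argument names mirror the stats-file columns
    mentioned in the chat, but the values must be supplied by the caller.
    """
    # Reads that remained after dereplication and went into mapping.
    dereplicated_total = refseq_mapped_reads + refseq_unmapped_reads
    if reads_passed_filter <= 0 or dereplicated_total <= 0:
        raise ValueError("counts must be positive")

    # Fraction of filtered reads removed as exact duplicates.
    derep_reduction = 1 - dereplicated_total / reads_passed_filter

    # Mapping rate among dereplicated (unique) reads.
    mapping_rate = refseq_mapped_reads / dereplicated_total

    return {
        "dereplicated_total": dereplicated_total,
        "derep_reduction": derep_reduction,
        "mapping_rate": mapping_rate,
    }


# Made-up example numbers, for illustration only.
summary = mapping_summary(
    reads_passed_filter=1_000_000,
    refseq_mapped_reads=80_000,
    refseq_unmapped_reads=20_000,
)
print(summary)
```

With these made-up numbers, the mapped count is only 8% of the filtered reads, yet 80% of the dereplicated reads mapped. Note that this is a rate over unique sequences; each dereplicated read can stand for a different number of original reads, so it is only an approximation of the per-read mapping rate.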