Feb 2018
Jean-Rémi Trotta
Feb 19 2018 11:04
Hi @eaton-lab, thank you! I look forward to seeing whether you can fix this potential bug.
Glib Mazepa
Feb 19 2018 13:05
Hi @isaacovercast @eaton-lab, I have a question about step #4. I want to include haploid samples alongside the diploid ones in order to filter out paralogs, i.e. loci that are heterozygous in the haploids. What would be the most appropriate way to do this? Should I just pool the 2n and n samples and run step #4 with the default settings (I gave it a try, and from the s4 log file the heterozygosity of the haploids seems to be within the variance for the diploids...), or is it possible to run the haploids separately with max_alleles_consens set to 1 and then merge them with the diploids at a later stage?
Isaac Overcast
Feb 19 2018 16:53
@mazepago_twitter You could run them separately and then merge the output files by hand, but this would be annoying. Another option is to run the haploids by themselves through step 7 and then look at the stats file in the output to see how many loci are being filtered for ploidy. This will give you an idea of how much of a problem you have with paralogs, and it would also tell you which loci are paralogous. You could then run the 2n and n samples together and remove the paralogous haploid loci at the end (again, by hand). There isn't a really straightforward way to do this because mixing ploidy is an edge case.
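
For the "remove them by hand" part, a small script can do the filtering on a combined run. The following is only a minimal sketch, not an ipyrad feature. It assumes the output is a .loci-style text file in which each locus is a block of "name sequence" rows closed by a line starting with "//", and that a heterozygous call shows up as a two-base IUPAC ambiguity code (R, Y, S, W, K, M). The function name and the haploid sample names are hypothetical.

```python
# Two-base IUPAC ambiguity codes, taken here to mark heterozygous sites.
HET_CODES = set("RYSWKM")


def drop_loci_het_in_haploids(loci_text, haploid_names):
    """Return loci_text without the loci where any haploid sample is heterozygous.

    Assumed layout (not checked against any ipyrad version): one
    'name sequence' row per sample, and each locus ends with a line
    that starts with '//'.
    """
    haploid_names = set(haploid_names)
    kept = []
    block = []
    for line in loci_text.splitlines():
        block.append(line)
        if not line.startswith("//"):
            continue
        # End of a locus block: inspect the sample rows collected so far.
        is_paralog = False
        for row in block[:-1]:
            parts = row.split()
            if len(parts) < 2:
                continue
            name = parts[0].lstrip(">")  # tolerate a leading '>' on names
            seq = parts[1].upper()
            if name in haploid_names and HET_CODES & set(seq):
                is_paralog = True
                break
        if not is_paralog:
            kept.extend(block)
        block = []
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    example = (
        "dip1   ACGTRA\n"
        "hap1   ACGTAA\n"
        "//         -   |0|\n"
        "dip1   ACGTAA\n"
        "hap1   ACYTAA\n"
        "//       -     |1|\n"
    )
    # Locus 0 is kept (only the diploid is heterozygous); locus 1 is dropped.
    print(drop_loci_het_in_haploids(example, {"hap1"}), end="")
```

Filtering on ambiguity codes in the haploids is a stand-in for the ploidy filter count in the step 7 stats file, so the two numbers will not necessarily agree.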