These are chat archives for dereneaton/ipyrad

18th
Dec 2016
ViviSette
@ViviSette
Dec 18 2016 11:49
@dereneaton @isaacovercast
Hi all, I have a question about whether it is possible to keep a locus only if it is present in a minimum number of individuals per population.
In the old pyRAD it was possible to specify that a locus/allele should be reported in the final output only if it was represented in a minimum number of individuals per population (the populations being defined in the hierarchical clustering section at the bottom of the params file).
In ipyrad there is option 21, min_samples_locus, which is the minimum number of samples that must have data at a given locus for it to be retained in the final data set, but this applies across all samples. For example, in our case we have 11 populations with 10 individuals per population, and we want loci that are present in at least 5 individuals per population. If we set min_samples_locus to 5, we get loci that are represented in an arbitrary selection of the 110 individuals, with most of the populations not represented. If we set min_samples_locus to 55, the final loci can still include individuals from some populations but not from others.
Is there any option in ipyrad to specify a minimum number of individuals per population? I can't find any mention of it in the manual.
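To make the difference between the two filters concrete, here is a minimal Python sketch. It is not ipyrad code: the function names, the sample naming scheme (`pop3_ind7`), and the example loci are all hypothetical and only illustrate the logic described above, using the 11 x 10 design and the thresholds of 55 (global) and 5 (per population).

```python
from collections import Counter

def passes_global_min(samples_with_data, min_samples_locus):
    """Global filter: count samples with data, ignoring population membership."""
    return len(samples_with_data) >= min_samples_locus

def passes_per_pop_min(samples_with_data, sample_to_pop, min_per_pop):
    """Per-population filter: every population must reach its own minimum."""
    counts = Counter(sample_to_pop[s] for s in samples_with_data)
    # Counter returns 0 for populations with no samples at this locus.
    return all(counts[pop] >= n for pop, n in min_per_pop.items())

# Hypothetical design: 11 populations with 10 individuals each (110 samples).
sample_to_pop = {
    f"pop{p}_ind{i}": f"pop{p}" for p in range(1, 12) for i in range(1, 11)
}
min_per_pop = {f"pop{p}": 5 for p in range(1, 12)}

# Locus A: all 10 individuals of pop1..pop5, plus 1 individual of pop6..pop11.
locus_a = [f"pop{p}_ind{i}" for p in range(1, 6) for i in range(1, 11)]
locus_a += [f"pop{p}_ind1" for p in range(6, 12)]

# Locus B: exactly 5 individuals from every population.
locus_b = [f"pop{p}_ind{i}" for p in range(1, 12) for i in range(1, 6)]

for name, locus in [("A", locus_a), ("B", locus_b)]:
    print(
        name,
        len(locus),
        passes_global_min(locus, 55),
        passes_per_pop_min(locus, sample_to_pop, min_per_pop),
    )
```

Locus A has data in 56 samples, so it passes a global threshold of 55 even though six populations contribute only one individual each. Locus B has data in 55 samples and passes both filters. Only the per-population check separates the two cases.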
Also, what is the popfile in ipyrad used for? The manual describes how the file should be formatted, but I don't understand its role in step 7. Could it be used at this stage to specify a minimum value per population?
Thanks a lot in advance :)