These are chat archives for dereneaton/ipyrad

Aug 2017
Aaron M. Duffy
Aug 07 2017 14:21 UTC
Looking at the output from step 4 in s4_joint_estimate.txt, the estimated heterozygosity and estimated error are correlated. I see this pattern in 3 different datasets from different organisms, analyzed at different cluster settings. In all cases the slope of the relationship is between 0.28 and 0.38, with R-squared around 0.9. I don't see any reason why the "true" heterozygosity and error rates would be correlated, so why are the estimated rates so consistently correlated? Is this just an expected artifact of the joint estimation process, or is it something we should be concerned about? If it is an artifact, does that mean the estimated heterozygosity rates are only intended to be used within the joint estimation process and should not be used to infer anything else?
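For anyone wanting to repeat this check on their own data, here is a rough sketch of how the slope and R-squared could be computed. It makes several assumptions that are not stated above: that the stats file is whitespace-delimited with a header row containing columns named `hetero_est` and `error_est`, that the sample-name column may or may not have a header label, and that error is regressed on heterozygosity (the message does not say which way round the fit was done). The function names are hypothetical helpers, not part of ipyrad.

```python
import numpy as np


def read_joint_estimates(text):
    """Parse the contents of a step 4 stats file into two arrays.

    Assumed layout (not confirmed by the source): whitespace-delimited,
    one header row naming 'hetero_est' and 'error_est', one row per sample
    with the sample name in the first field.
    """
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # If the sample-name column has no header label, data rows have one
    # more field than the header, so shift the column indices accordingly.
    offset = len(rows[0]) - len(header)
    het_col = header.index("hetero_est") + offset
    err_col = header.index("error_est") + offset
    het = np.array([float(r[het_col]) for r in rows])
    err = np.array([float(r[err_col]) for r in rows])
    return het, err


def slope_and_r2(x, y):
    """Least-squares line y = slope * x + intercept, plus R-squared."""
    slope, intercept = np.polyfit(x, y, 1)
    # For a simple one-predictor linear fit, R-squared equals the squared
    # Pearson correlation coefficient.
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r ** 2
```

Usage would be along the lines of `het, err = read_joint_estimates(open("s4_joint_estimate.txt").read())` followed by `slope_and_r2(het, err)`. Swapping the arguments gives the fit in the other direction, which changes the slope but not the R-squared.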