@dereneaton Do you know whether "remove duplicates" uses the clustering similarity threshold to determine what counts as a duplicate, and could you please point me to where I can read about the "remove duplicates" algorithm? I ran a branching assembly in which I varied only the clustering similarity threshold, from 76% to 98%, hoping to pick a good parameter value for my data by minimizing the percentage of loci lost to "remove duplicates", which I was treating as a sign of over-splitting my loci. However, I saw an unexpected pattern: the percentage of loci lost to "remove duplicates" increased with clustering similarity but dropped off after 95%. I expected it to only increase with clustering similarity. Thoughts? Thanks!