Aug 2017
Aug 26 2017 00:28
@dereneaton I'm looking at my .vcf file from ipyrad in vcftools. In ipyrad, I set min_samples_locus to 237 (out of 315 samples), but I have SNPs in my .vcf file that occur in fewer than 237 individuals. This is the case for about 700 of the 14000 total SNPs in the .vcf file. I used de novo assembly, all reads were 90bp, single end, I filtered out raw reads with any Ns before ipyrad, and I set max_Ns_consens to 0. I expected all SNPs to be in at least 75% of my individuals. Why is this not the case? Thanks!!
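For anyone wanting to reproduce this kind of check outside vcftools, here is a minimal sketch of counting how many samples have a called genotype at each SNP. It assumes a plain-text, tab-delimited VCF with a GT entry in the FORMAT column; the function names `count_called_samples` and `sites_below_threshold` are hypothetical helpers made up for illustration, not part of ipyrad or vcftools.

```python
def count_called_samples(vcf_lines):
    """Yield (chrom, pos, n_called) for each variant record in a VCF.

    A genotype counts as called only if none of its alleles is '.'.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta lines (##) and the column header (#CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Column 9 (index 8) is FORMAT; find where GT sits within it.
        gt_idx = fields[8].split(":").index("GT")
        n_called = 0
        for sample in fields[9:]:
            parts = sample.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            alleles = gt.replace("|", "/").split("/")
            if all(a != "." for a in alleles):
                n_called += 1
        yield fields[0], int(fields[1]), n_called


def sites_below_threshold(vcf_lines, min_samples):
    """Return the records whose called-sample count is below min_samples."""
    return [rec for rec in count_called_samples(vcf_lines) if rec[2] < min_samples]


if __name__ == "__main__":
    # Tiny made-up VCF with three samples, for illustration only.
    example = [
        "##fileformat=VCFv4.0",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3",
        "locus_1\t5\t.\tA\tG\t13\tPASS\tNS=3\tGT\t0/0\t0/1\t1/1",
        "locus_2\t9\t.\tC\tT\t13\tPASS\tNS=1\tGT\t./.\t0/1\t./.",
    ]
    for chrom, pos, n in sites_below_threshold(example, min_samples=2):
        print(chrom, pos, n)
```

With a real file, passing an open file handle (for example `open("my_assembly.vcf")`, a placeholder name) as `vcf_lines` and 237 as `min_samples` would list the SNPs in question. vcftools can report similar per-site missingness with its `--missing-site` option.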
Zac Forsman
Aug 26 2017 07:46
@dereneaton Awesome Deren,
I may have just run a complete massive dataset in just a few days... I'll let you know what the results look like. Thanks much for your help. -Zac
Isaac Overcast
Aug 26 2017 19:12
@mazepago_twitter That is a pretty normal error that crops up occasionally in reference assemblies. It shouldn't impact the outcome of step 3 or any later steps, and it's not a problem with the index of the reference sequence. The weirdness of the results could be due to some issues with reference-based assembly that were fixed at some point in the 0.7.x releases. It might be worth upgrading and trying it again.