Beatriz Otero Jiménez
Feb 08 2017 02:02
@dereneaton Hi, I am trying to use ipyrad to demultiplex my data. I have 3RAD PE reads in which each individual has a unique combination of barcode and index. The sequencing output file was not split by index (which I have read is the typical case), and I was wondering what the format of the barcodes file would be in this case. Would two barcode columns per individual work?
Isaac Overcast
Feb 08 2017 17:47
@beatrizotero Yes exactly, the 3RAD barcodes file should look like this:
wist222-7_Pippin    CCGAATG       CTAACGT
wist225-1_Pippin    TTAGGCAG      CTAACGT
wist230-1_Pippin    AACTCGTCG     CTAACGT
wist234-4_Pippin    GGTCTACGTG    CTAACGT
wist241-3_Pippin    GATACCG       CTAACGT
wist246-1_Pippin    AGCGTTGG      CTAACGT
wist246-2_Pippin    CTGCAACTG     CTAACGT
wist276-2_Pippin    TCATGGTCAG    CTAACGT
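A quick way to sanity-check a file in this layout before demultiplexing is sketched below. It assumes only what the example above shows: whitespace-delimited lines with a sample name followed by two barcodes. The helper names `parse_barcodes` and `duplicated_combinations` are hypothetical and are not part of ipyrad; the check simply confirms that every line has three columns and that no two samples share the same barcode pair.

```python
from collections import Counter


def parse_barcodes(text):
    """Parse lines of 'sample_name barcode1 barcode2' (whitespace-delimited).

    Hypothetical helper, not an ipyrad function.
    """
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 columns, got {len(fields)}"
            )
        name, bc1, bc2 = fields
        for bc in (bc1, bc2):
            # Barcodes are assumed to contain only A, C, G, T.
            if set(bc.upper()) - set("ACGT"):
                raise ValueError(f"line {lineno}: unexpected characters in {bc!r}")
        rows.append((name, bc1.upper(), bc2.upper()))
    return rows


def duplicated_combinations(rows):
    """Return the (barcode1, barcode2) pairs assigned to more than one sample."""
    counts = Counter((bc1, bc2) for _, bc1, bc2 in rows)
    return [combo for combo, n in counts.items() if n > 1]


# Two lines taken from the example above.
example = """\
wist222-7_Pippin    CCGAATG     CTAACGT
wist225-1_Pippin    TTAGGCAG    CTAACGT
"""

rows = parse_barcodes(example)
dups = duplicated_combinations(rows)
print(f"{len(rows)} samples, {len(dups)} duplicated barcode pairs")
```

If `duplicated_combinations` returns anything, two samples share a barcode pair and could not be told apart from the barcodes alone.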
Beatriz Otero Jiménez
Feb 08 2017 18:51
@isaacovercast Thanks!
Feb 08 2017 21:20
@isaacovercast Thanks for the clarification regarding the number of reads. @amelymartins and I double-checked the flagstat output and it is correct; apologies for that. However, as I said before, we are puzzled by the fact that, for those species for which we have a reference genome, we are getting far fewer (90% fewer) loci for denovo+ref than for denovo. We were wondering whether this is happening because we used a reference genome that included all chromosomes plus several other contigs, so we might be losing the reads that align to those contigs at lower depth. Our clustering threshold is 0.85, which should not be an issue. Any thoughts? THANKS!
Isaac Overcast
Feb 08 2017 22:18
@LinaValencia85 Hm, well yeah, I'm just not sure what's happening. If you want to dropbox me a couple of the sample fastq files and the reference genome, I'll take a look at it.