These are chat archives for dereneaton/ipyrad

15th
Feb 2019
Jean-Rémi Trotta
@jrtrottablanc
Feb 15 11:57
Hi @isaacovercast and @dereneaton, I'm currently analysing paired-end GBS data with ipyrad 0.7.28 and would need your expertise to understand my denovo-reference results (it looks like a hot topic lately). Since I have a chloroplast sequence, this method seems the most appropriate one. I was able to complete the assembly, but something looks weird in the variants. What drew my attention is that every variant has been called as homozygous (ref or alt). There are absolutely 0 heterozygous positions in any of the samples. I guess this is related to the heterozygosity estimation done at step 4. I repeated the analysis using the denovo method and got completely different results. I was expecting some variation, but not such discrepancies (only a small number of reads are discarded for aligning against the chloroplast sequence). Below are the final sample stats summaries of both analyses. Thanks for your help!
## Final Sample stats summary - denovo-reference
    state  reads_raw  reads_passed_filter  refseq_mapped_reads  refseq_unmapped_reads  clusters_total  clusters_hidepth  hetero_est  error_est  reads_consens  loci_in_assembly
S1      7     788262               784106                 9832                 391773          128923              6860    0.001618   0.000014           6860              6331
S2      7    1153048              1147190                11743                 518052          134790              5401    0.010000   0.001000           5401              4679
S3      7    1034107              1028704                12126                 450707          115842              8922    0.001618   0.000014           8922              8292
S4      7    1059189              1053013                12596                 478446          124538              5091    0.000591   0.000093           5091              4588
S5      7    1050168              1044101                11600                 473436          125932              3441    0.010000   0.001000           3441              3016
S6      7    1625342              1615927                15727                 628870          147070              8873    0.001618   0.000014           8873              7880
S7      7    1321467              1314330                14037                 539374          132721              7747    0.001618   0.000014           7747              7012
S8      7    1514239              1506182                17316                 608115          153577              3964    0.010000   0.001000           3964              2977
## Final Sample stats summary - denovo
    state  reads_raw  reads_passed_filter  clusters_total  clusters_hidepth  hetero_est  error_est  reads_consens  loci_in_assembly
S1      7     788262               784106          144685             44621    0.017562   0.003454          42675             22460
S2      7    1153048              1147190          171441             57563    0.016808   0.003309          55084             30821
S3      7    1034107              1028704          143682             52089    0.016868   0.003305          49878             27304
S4      7    1059189              1053013          158420             52552    0.016949   0.003423          50220             27393
S5      7    1050168              1044101          160846             52440    0.017014   0.003414          50165             27449
S6      7    1625342              1615927          188721             62377    0.016674   0.003128          59422             34038
S7      7    1321467              1314330          168323             57133    0.016722   0.003125          54557             30650
S8      7    1514239              1506182          196495             63192    0.016356   0.003106          60481             32863
gscabanne
@gscabanne
Feb 15 12:26
Hi Isaac, I am using ipyrad 0.7.28 to process single-end (R1) ddRAD data. At which step does the program cut the restriction sites? Is it at step 2? I remember seeing a parameter called “cut-sites” mentioned somewhere, which turns on cutting of these sites, but it is not present in the current params files.
I do not see the restriction site in the final sequences, and I wonder at which step it is removed.
Thank you.