These are chat archives for dereneaton/ipyrad

5th
Oct 2018
Edgardo M. Ortiz
@edgardomortiz
Oct 05 2018 09:48
Hi @eaton-lab and @isaacovercast, I am running some pairddrad data with the restriction overhangs set as CATGC,AATTC. In the final .loci file the first overhang is trimmed from the clusters but the second is not. I guess I can force the trimming with the trim_loci parameter, but is there a way to recover the full locus untrimmed? All the alignments look like these, with the second cutsite left untouched:
EO2243_1-P02     ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTTGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2456-P05       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2458-P05       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATAKTAGAATT
EO2495-P02       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2575-P11       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGYAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2607-P11       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2618-P10       ATATATGAGTGACTTATGACTAAAATCTAGTCTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATATGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGGAAAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2632-P10       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTYGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATAMAAATGTGCATGACATATTAGAATT
EO2652-P10       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATA-nnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2686-P07       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2695-P08       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2708-P01       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATA-nnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2725-P01       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTAGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2734-P02       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTWGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
EO2745-P02       ATATATGAGTGACTTATGACTAAAATCTAG-CTTTTTAAAAATGAAATATTGATTTATATTWGTTTGTTCGCTTATTCTCAATACGTTTATAAnnnnTTTATTAATAATTTAAATACAAGTCATAAAAATGATTGCGGTAGG-AAGAAAAGGGGAATTTGTGTCATAAAGTATACAAATGTGCATGACATATTAGAATT
                                                                              *       -              -                                                     -                                   -                -       |9775|
EO2243_1-P02     TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGAGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTTAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2456-P05       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2458-P05       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2575-P11       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCATAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2607-P11       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2618-P10       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2652-P10       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAACGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2686-P07       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2695-P08       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2708-P01       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2725-P01       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2734-P02       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
EO2745-P02       TCCTTTTGGTAGAATTTTATTAACTTGGGACCCAAATCTTTGGTCCATCACAGTGTTACATGAGTCCAATCAACTCATTCACTGTGATGTCnnnnAGCGAAAGGCCCTTGGTGGTCCTTGGTGACTTTAATGTAGTAAGAACACCCAAAGAAAAAGTTGGGGGGCATAACCATTGGGCTCACCAATTGGATGAATT
                                                                   -                                              -                                -     -                                                           |9783|
EO2243_1-P02     AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2456-P05       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2458-P05       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2495-P02       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACACAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2575-P11       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTATGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2607-P11       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTATGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2618-P10       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2632-P10       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2652-P10       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2686-P07       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTANGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2695-P08       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCRTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGWGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2708-P01       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTATATGCCCTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2725-P01       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCYTTGCAAATGTTATCGTACGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2734-P02       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTANGCGGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
EO2745-P02       AACATTGAACTCCGTGGAGATGACCAATCGAAATACATGAAACATAGAAAAGATTCTATGCCCTTGCAAATGTTATCGTACGCRGATATCAnnnnGTTGAATGACAGTTCTGCCAAGTGATTATTTTTGGAGGATTGTTTGACCTTGATTTCAAATCCCTAAGATGTGACACTCTGTGATTTGCTTCATTTGAATT
                                                             -          -      -              -  *  -                                 -                                                                              |9809|
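For reference, the GAATT at the end of every row above is the reverse complement of the second overhang, AATTC, which is how the R2 cutsite would appear at the distal end of a locus. A minimal Python sketch of that check follows; the helper `revcomp` is hypothetical (not part of ipyrad), and the tails are copied from the last bases of the three loci pasted above.

```python
# Hypothetical helper for illustration; not ipyrad code.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

overhang_r2 = "AATTC"  # second restriction overhang from the message above

# Trailing bases copied from the three loci pasted above.
tails = ["TAGAATT", "GATGAATT", "TTTGAATT"]

site = revcomp(overhang_r2)
all_end_with_site = all(tail.endswith(site) for tail in tails)
print(site, all_end_with_site)
```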
Isaac Overcast
@isaacovercast
Oct 05 2018 21:35
Hey Edgar, yes, I can see this is totally happening. In the code for step 7 there's a function edgetrim_numba() which trims the cutsite off R1 but doesn't do the same for the cutsite on R2. I didn't write this part of the code, but thinking about it now, I bet it's complicated to do right. We know that the proximal end of R1 will always have the cutsite. But depending on the protocol and the size selection, the distal end of R2 may or may not have the cutsite, and in some datasets it could be that some loci have the cutsite and some don't.
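The case distinction described here can be sketched in Python. This is not ipyrad's edgetrim_numba(): the function `trim_r2_cutsite`, the `min_frac` threshold, and the dict-of-sequences input are all assumptions made for illustration. The idea is to trim the distal end of a locus only when enough samples actually end with the reverse complement of the R2 overhang, and to leave the locus untouched otherwise.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def trim_r2_cutsite(locus, overhang, min_frac=1.0):
    """Conditionally trim the R2 cutsite from the right edge of a locus.

    locus:    dict mapping sample name -> aligned sequence (equal lengths)
    overhang: R2 restriction overhang as given in the params, e.g. "AATTC"
    min_frac: hypothetical threshold; fraction of samples that must end
              with the cutsite before anything is trimmed
    """
    site = revcomp(overhang)
    hits = sum(seq.upper().endswith(site) for seq in locus.values())
    # Leave the locus unchanged if it is empty or too few samples match.
    if not locus or hits / len(locus) < min_frac:
        return dict(locus)
    return {name: seq[:-len(site)] for name, seq in locus.items()}

# Toy loci (made-up sequences): one ends in the cutsite, one does not.
with_site = {"A": "ACGTTGAATT", "B": "ACGATGAATT"}
without_site = {"A": "ACGTTGCCTA", "B": "ACGATGCCTA"}

print(trim_r2_cutsite(with_site, "AATTC"))     # {'A': 'ACGTT', 'B': 'ACGAT'}
print(trim_r2_cutsite(without_site, "AATTC"))  # returned unchanged
```

A per-locus test like this avoids trimming real sequence from loci where R2 never reached the cutsite, at the cost of having to pick a threshold for loci where only some samples carry it.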