These are chat archives for dereneaton/ipyrad

15th
Jun 2018
Saritonia
@Saritonia
Jun 15 2018 08:57
Sorry, I have found the problem. It was caused by the params file. As for my other question, as I already mentioned, Pyrad has no file with the partition information for the loci in *.phy.partition format. How can I create this file or extract this information? Thank you very much in advance for your help!!!
J. Mark Porter
@jmark_porter_twitter
Jun 15 2018 15:17

Hi @isaacovercast, (#300) I think that I see where the problem is coming from. I noticed that the two recalcitrant files have sequences that are 145 bp, rather than the expected 95 bp. Both 15421_I and 15430_C were part of our test lane. It appears that the lab used a different read length for the first lane. This could easily cause the problem in clustering!

head /Users/e.l.greene/Desktop/opuntia_TEST_fastqs/15421_I.fastq
@K00337:20:HF2H3BBXX:5:1101:18416:27496 1:N:0:2
TGCAGATGTTCTGTGCTTCTTTTCTTGTGTCTTTGGTAGAGAAGTTATGCCTCATTGTTACCGATCATTTTGTTATTCAATTATTCAGGCTGCTGCATTGAAGGGCTCAGACCACCGCCGTGCATCCAATGTTAGTGCTAGACTT
+
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJFJJF
@K00337:20:HF2H3BBXX:5:1101:18822:27496 1:N:0:2
TGCAGCAGATTGTCGTTTTCTTCAACTATTGTTCCTTTTTAGGTGTCATTTCTGTTAGCAGTAATGCTCCTCCATTGCATAAAACAGCCAGGCATAATTACCTCAAGCGTACATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA
+
FJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFAJFFAFJF-JJJJFFFFJJJAJA<J<AJFAJ<-AFFF<FA
@K00337:20:HF2H3BBXX:5:1101:19532:27496 1:N:0:2
TGCAGTTGTTACCGGAAAACCTGTTGTAAGTTTTCTGACTGCCTGATTATGAGTATGAAGCTGTGCTTAGTATCTCTGACTACCTGATTGTGAGTATGCGTACATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAAA

head /Users/e.l.greene/Desktop/opuntia_TEST_fastqs/15430_C.fastq
@K00337:20:HF2H3BBXX:5:1101:17279:27496 1:N:0:2
TGCAGGAGGTGGCATATCATCCCTATGGAAAACAAGAAAAGCGCTCAACTAAGGCAAGCCTAGTCAAGACAAGGTTAGTTTTGTTATGGCCCGAGTGCCCGACTGAAAAGAATGACACCATATGAAGCAAGCATAAGCCCAAGCT
+
A-7FFAJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJFFJJJJJJJJJJJFFJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJ<
@K00337:20:HF2H3BBXX:5:1101:17320:27496 1:N:0:2
TGCAGTCGCTGGTTAGCCATCGCAACTTTCCTCCCATTCCTCCGTGCTTGTTCTAGTGAGGTTGAAAGGTGAGCCCTAACTTTTTACTCTTTCAATTTCTTAATTTGATCCACTGATATGTTCTTTAGCGTACATCTCGTATGCC
+
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJAFJJFFJFFJJJJJJJJJ
@K00337:20:HF2H3BBXX:5:1101:17746:27496 1:N:0:2
TGCAGAGACAATGGGAGGATTCAAACTCAAAGAAGAATGCCCAAAACTTGTGGCATGGGCTAAGAGGTGTATGCAGAGGGAAAGTGTCTTGCGTTCTCTTCCTGATCCACATAAAGTGCTTGACTTCTGCTTACAGCGTACATCT

head /Users/e.l.greene/Desktop/opuntia_TEST_fastqs/15220_D.fastq
@K00337:98:HLJJTBBXX:2:1107:18822:31576 1:N:0:2
TGCAGCAGCTGCCAATAAGTGATTCTCAGAAACAGTAACTCTAGGAAGAGTCTTGCCAATCTTCATCTCCCAAAGAGAGGCAAGCTCATTCATGT
+
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@K00337:98:HLJJTBBXX:2:1107:21440:31576 1:N:0:2
TGCAGACATCTTTGACTGAGAGCATTGATCTTCAGGTGTGTTTTACTGACCTCGCCTACGAAACAAAGATAAAAATCGGTCTTCCAACCATCAGT
+
JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJF
@K00337:98:HLJJTBBXX:2:1107:25337:31576 1:N:0:2
TGCAGAAGGTTATCTTGTGTCGTGGTTCTCCGTTGTTAGCCAGGTTGTTGGTATCAGACTCCCCCTTCAGATGGCTGTATCACATCAGGCGTACA
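
To check the read-length difference across whole files rather than from the first few `head` lines, a small tally like the one below could help. It is only a sketch: the function name `read_length_counts` and the `max_reads` cap are made up for illustration, and it assumes plain four-line FASTQ records (optionally gzipped), which is what the excerpts above look like.

```python
from collections import Counter
import gzip


def read_length_counts(path, max_reads=10000):
    """Tally sequence lengths for the first `max_reads` records of a FASTQ file.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    """
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 1:  # second line of each record is the sequence
                counts[len(line.strip())] += 1
    return counts
```

Calling `read_length_counts("15421_I.fastq")` returns a `Counter` mapping read length to number of reads. Going by the message above, the two test-lane files should be dominated by 145 and the other files by 95.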

I presume that the solution is to rerun steps 1 and 2, trimming 50 bases from the 3' end of the reads in those two files (145 bp down to 95 bp).
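
For reference, the trimming itself can also be done outside ipyrad before rerunning. The sketch below is a standalone illustration, not ipyrad's own trimming: `trim_fastq` and its `keep` parameter are hypothetical names, and it assumes uncompressed four-line FASTQ records. Keeping the first 95 bases of a 145 bp read is the same as dropping 50 from the 3' end.

```python
def trim_fastq(in_path, out_path, keep=95):
    """Write a copy of `in_path` with every read cut to its first `keep` bases.

    The quality string is cut to the same length so the two stay in sync.
    A trailing record with fewer than four lines is dropped.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[3]:  # end of file, or an incomplete final record
                break
            header, seq, plus, qual = (x.rstrip("\n") for x in record)
            dst.write(f"{header}\n{seq[:keep]}\n{plus}\n{qual[:keep]}\n")
```

For example, `trim_fastq("15421_I.fastq", "15421_I.trimmed.fastq")` would write 95 bp reads to a new file and leave the original untouched. Reads already at or below 95 bp pass through unchanged.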

Isaac Overcast
@isaacovercast
Jun 15 2018 17:40
@jmark_porter_twitter That would certainly make clustering take longer. It wouldn't explain the fishy "both files are the same size" behavior, though. Maybe that's a coincidence?
I'm running your data now, so we'll see how it goes.