These are chat archives for dereneaton/ipyrad

Jan 2018
Jan 23 2018 14:06
@dereneaton @isaacovercast Hi Isaac and Deren. I'm trying the ip.merge function in a Jupyter notebook, but I run into an error on step 2. The log files say: cutadapt: error: gzip: /export/mergedPyradOpt_edits/PHE088_R1_concat.fq.gz: not in gzip format
And when I look at the files they seem to be in plain (uncompressed) .fq format: /mergedPyradOpt_edits$ head PHE080_R1_concat.fq.gz @1_1101_14797_1493_1
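A quick way to tell whether a file really is gzip-compressed, regardless of its extension, is to check for the two-byte gzip magic number (0x1f 0x8b) at the start of the file. This is a minimal sketch (the helper name and paths are hypothetical, not part of ipyrad):

```python
import gzip  # only needed by callers that want to read the file afterwards


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"
```

A file that fails this check but carries a .fq.gz name would trigger exactly the "not in gzip format" error cutadapt reports above.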
Jan 23 2018 15:38

Hi @dereneaton and @isaacovercast, I have a question regarding the ABBA-BABA tests. I used the generate_tests_from_tree function and fixed p3 and p4. Our dataset contains 13 ingroup species and two outgroup species: one is relatively close to the ingroup and the other is further out. I noticed that the outcome of the tests is quite different depending on which outgroup is selected.
Could you please describe how the loci are selected for the tests?
And what about when individuals of the same clade were pooled together for a test?


Isaac Overcast
Jan 23 2018 18:25
@alexjvr1 Yep, well that sure does look like the file isn't gzipped. I'm not sure how it got in this condition, but it's probably easiest to just gzip all these files, and rerun step 2.
Jan 23 2018 19:54
@isaacovercast Thanks for the quick answer. The problem is that step 2 writes the .gz files as its first part. So once I've gzipped the files, how do I get the rest of the step to run?
(If I gzip the files and rerun step 2 they just get replaced with un-gzipped files again)
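The workaround described above (compressing the mis-named files before rerunning) can be sketched as a small in-place gzip step. This is an illustration only; the helper name and file path are hypothetical, not part of ipyrad:

```python
import gzip
import os
import shutil


def gzip_in_place(path):
    """Compress a plain-text file so it matches its .gz name.

    Writes to a temporary file first, then atomically replaces the
    original, so a failure mid-way never leaves a half-written file.
    """
    tmp = path + ".tmp"
    with open(path, "rb") as src, gzip.open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.replace(tmp, path)
```

After running this on each offending *_concat.fq.gz file, tools that expect gzip input (like cutadapt in step 2) can read them normally.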
Jan 23 2018 19:55
@isaacovercast Hi, when running with paired-end reads, does ipyrad require identifying barcodes to be present on both forward and reverse reads?
Isaac Overcast
Jan 23 2018 21:00
@cb1579 Nope, just R1. R2 reads are paired, so they aren't normally barcoded.
Isaac Overcast
Jan 23 2018 21:31
@alexjvr1 I see. Ok, I've found the problem. Importing demux'd fastq files that aren't gzipped was broken. I'm pushing a fix now (v0.7.21): conda install -c ipyrad ipyrad. Give it five or ten minutes to finish building...
@alexjvr1 Conda's being weird right now, so the package may not be available yet. The fix is in the git repo if you're able to clone it and build it. Otherwise the other solution is to gzip your fastq files and rerun from step 1.
richie hodel
Jan 23 2018 21:57
If we want to combine raw reads from several sequencing runs that have different read lengths (some 1x50, some 1x100, some 1x150), will all loci that include reads from the 50 bp run automatically be at most 50 bp (i.e., if reads of 100 and/or 150 bp are clustered with 50 bp reads, will the 3' ends of the 100 and/or 150 bp reads be automatically trimmed off)?
richie hodel
Jan 23 2018 22:15
@nitishnarula How long did it end up taking to run steps 1-6 for your ~1200 sample dataset? How much RAM did you use? Thanks!
Jan 23 2018 22:17
@isaacovercast Thanks! I've just gzipped the fastq files and that works fine. As soon as conda's working I'll try the new version. Thanks for helping so quickly.