These are chat archives for dereneaton/ipyrad

Sep 2017
Sep 16 2017 02:43 UTC
Hi all, is there any way to skip the cleaning step? Jenarch helped me a while back, saying that I "need to start from step 1, even if you have already demultiplexed. You won't need param 2, but instead will put the path to your data under param [4] [sorted_fastq_path]." However, I can't figure out how to get around the cleaning step. My data are already cleaned and demultiplexed. Thanks - Brit
Deren Eaton
Sep 16 2017 16:38 UTC
Hi @bbarker505, yes, you need to start from step 1. If your data are already demultiplexed (i.e., you set 'sorted_fastq_path' instead of 'raw_fastq_path'), then step 1 simply checks that your data are properly formatted and counts the number of reads for each sample. Step 2 must also be run, but if you set the filter parameter to 0 it does very little to your data. Again, it mostly just checks that the data are properly formatted for the next step. Both steps will run very quickly.
As an example of why step 2 needs to be run: maybe you used a different trimming/filtering method than we use, and it left behind empty sequences wherever it trimmed an entire read. This would crash step 3. So we do a quick check, even if your data are already trimmed, to ensure there are no empty sequences, sequences that are all Ns, etc.
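To make the kind of check described above concrete, here is a minimal Python sketch that scans a FASTQ file, counts the reads, and flags empty or all-N sequences. This is not ipyrad's actual step 2 code; the function names `open_fastq` and `scan_fastq` are hypothetical, and it assumes plain four-line FASTQ records (no wrapped sequence lines).

```python
import gzip
import io


def open_fastq(path):
    """Open a FASTQ file as text, handling gzip-compressed files by extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def scan_fastq(handle):
    """Count reads and flag empty or all-N sequences.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    """
    counts = {"reads": 0, "empty": 0, "all_n": 0}
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records or at the end
        if not header.startswith("@"):
            raise ValueError("malformed FASTQ header: %r" % header.strip())
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        counts["reads"] += 1
        if not seq:
            counts["empty"] += 1
        elif set(seq.upper()) == {"N"}:
            counts["all_n"] += 1
    return counts


# Small in-memory example: one good read, one empty read, one all-N read.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\n\n+\n\n"
    "@read3\nNNNN\n+\n!!!!\n"
)
print(scan_fastq(example))  # {'reads': 3, 'empty': 1, 'all_n': 1}
```

For a real file you could call `scan_fastq(open_fastq(path))` on each sample and look for non-zero `empty` or `all_n` counts before moving on to step 3.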