@dereneaton. Many thanks.
(1) Last year I ran a series of analyses on the same RAD data using pyrad, setting max_low_quality_bases to 8, 6 and 4 for the 85%, 90% and 95% clustering thresholds respectively. Would an approach like this make sense here?
(2) For filter_adapters in my ipyrad trial I used setting 2 (strict). I had a quick look in FastQC at some of the data before and after ipyrad steps 1-2. Samples that showed some adapter content before step 1 had none after the strict filtering, so I thought it would be good to stick with the strict filter. For filter_min_trim_len I kept the default of 35 bp, because I wanted to retain as much potentially useful data as possible.
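To make the scheme in (1) concrete, here is a minimal sketch of the pairing I used. The dictionary and function names are purely illustrative and are not part of pyrad or ipyrad; only the three threshold/value pairs come from my earlier runs.

```python
# Hypothetical helper: not part of pyrad or ipyrad.
# Pairing from my earlier pyrad runs: clustering threshold -> max_low_quality_bases.
LOW_QUALITY_BY_THRESHOLD = {0.85: 8, 0.90: 6, 0.95: 4}


def low_quality_limit(clust_threshold):
    """Return the max_low_quality_bases value paired with a clustering threshold."""
    key = round(clust_threshold, 2)
    if key not in LOW_QUALITY_BY_THRESHOLD:
        raise ValueError(f"no setting recorded for threshold {clust_threshold}")
    return LOW_QUALITY_BY_THRESHOLD[key]


# Print the pairing, lowest threshold first.
for threshold in sorted(LOW_QUALITY_BY_THRESHOLD):
    print(threshold, low_quality_limit(threshold))
```

The idea is simply that a looser clustering threshold tolerates more low-quality bases per read; whether that is the right trade-off in ipyrad is what I am asking.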
(3) I did go through the ipyrad documentation, but I am uncertain what all the column headings in the step 3 stats file mean. I have been assuming that ave_depth_total is the average depth across all clusters at the end of step 3, and that av_depth_stat is the average depth of the clusters meeting the specified minimum depth criterion (= 7 in my trial run). For my trial (85% clustering threshold, minimum depth of 7), ave_depth_total ranges from about 10-35 (most values fall in the 15-20 range), av_depth_stat (= av_depth_maj) from about 25-50, and clusters_hidepth from about 7000-42000 (mostly between 15000-25000). The number of clusters seems low, and it varies a lot between samples. I think my data are low-depth. In my pyrad analyses I varied the minimum depth from 2 to 13 (i.e. I used 2, 5, 7, 9, 11, 13) at each of the 85%, 90% and 95% clustering thresholds. For all three clustering thresholds, the length of the final assembly (i.e. number of positions) at a minimum_samples_in_a_locus of 4 declined by 70% going from a minimum depth of 2 to 13 (i.e. from 5 million bp to 1.5 million bp). So it looks like the optimal minimum depth for my dataset is in the 7-9 range.
(4) Should max_Hs_consens be varied with the clustering threshold, i.e. should it be set higher for a low clustering threshold (85%) and lower for a high clustering threshold (90%)?
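For reference, the pyrad sweep described in (3) and the 70% figure can be written out as a short sketch. The variable and function names are my own for illustration, and the two assembly lengths are the rounded values quoted above, not exact counts.

```python
from itertools import product

# Settings from my pyrad sweep (names here are illustrative only).
CLUST_THRESHOLDS = [0.85, 0.90, 0.95]
MIN_DEPTHS = [2, 5, 7, 9, 11, 13]

# Every (clustering threshold, minimum depth) combination that was run.
grid = list(product(CLUST_THRESHOLDS, MIN_DEPTHS))


def percent_decline(before, after):
    """Percentage drop from `before` to `after`."""
    return 100.0 * (before - after) / before


# Rounded assembly lengths (bp) at minimum depth 2 and 13.
decline = percent_decline(5_000_000, 1_500_000)
print(len(grid), "runs; decline in assembly length:", decline, "%")
```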