These are chat archives for dereneaton/ipyrad

14th
Aug 2017
Deren Eaton
@dereneaton
Aug 14 2017 16:55
New release v.0.7.11 is now up on conda for Linux (OSX coming soon).
  • Bug fix for VCF alt allele missing when >2 alts @markb729
  • Bug fix for BUCKy analysis tool import error @fangbohao_twitter
  • Minor design changes and bug fixes to other ipa analysis tools.
@tommydevitt, once you've finished assembling your data set, you can create the bpp input files with the ipyrad Python API inside an interactive IPython session or Jupyter notebook. Instructions are in the documentation: http://ipyrad.readthedocs.io/analysis.html.
toczydlowski
@toczydlowski
Aug 14 2017 18:47
Hi @dereneaton - I think this may have come up before, but I can't find it. Where can I find the average final depth per locus (across individuals) and/or per individual (across loci)? I use statistical calls (instead of majority). I know there are a bunch of stats in the s3 stats file, but it is unclear to me what these mean. Thanks!!!
tommydevitt
@tommydevitt
Aug 14 2017 18:53
Thanks, @dereneaton
Deren Eaton
@dereneaton
Aug 14 2017 19:01

@toczydlowski You can find the location of all stats files by passing the -r flag on the ipyrad command line, together with your params file. For example:

ipyrad -p params-cli.txt -r

...

Full stats files
------------------------------------------------
step 1: ./cli_fastqs/s1_demultiplex_stats.txt
step 2: ./cli_edits/s2_rawedit_stats.txt
step 3: ./cli_clust_0.85/s3_cluster_stats.txt
step 4: ./cli_clust_0.85/s4_joint_estimate.txt
step 5: ./cli_consens/s5_consens_stats.txt
step 6: ./cli_across/s6_cluster_stats.txt
step 7: ./cli_outfiles/cli_stats.txt

The list of available stats file locations is printed at the bottom of the output. Look in the step 3 stats file to see the coverage for clusters within each sample, and see the step 7 stats file for coverage across samples.

toczydlowski
@toczydlowski
Aug 14 2017 20:17
@dereneaton Thanks! I was aware of the stats files and was looking in the s3 stats, so I was on the right track there. The average depths for stat and maj are the same in my case, which I expect because my min_depth for stat is >= maj. What is avg_depth_total, though, and why is it so much lower than avg_depth_stat (5 vs. 35)?
Deren Eaton
@dereneaton
Aug 14 2017 21:34
Hi @toczydlowski, avg_depth_total is lower because it is calculated as an average over all clusters, including those that had a depth of coverage below your mindepth setting (i.e., it includes tons of singleton clusters). avg_depth_stat is calculated for a subset of all clusters (just the ones we're keeping because they have a high enough depth), and thus its average depth is much higher. Does that make sense?
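
To illustrate why the two averages differ so much, here is a minimal sketch in plain Python. It does not use ipyrad itself: the function name depth_summary, the made-up depths, and the threshold of 6 are all assumptions for illustration, and it assumes clusters at or above the threshold are the ones kept.

```python
from statistics import mean


def depth_summary(cluster_depths, mindepth):
    """Return (average over all clusters, average over kept clusters).

    Hypothetical helper, not part of ipyrad. A cluster counts as kept
    when its depth is at least `mindepth` (an assumption).
    """
    kept = [d for d in cluster_depths if d >= mindepth]
    avg_total = mean(cluster_depths)
    # If no cluster passes the threshold there is nothing to average.
    avg_kept = mean(kept) if kept else float("nan")
    return avg_total, avg_kept


# Made-up sample: 90 singleton clusters plus 10 clusters at depth 40.
depths = [1] * 90 + [40] * 10
avg_total, avg_kept = depth_summary(depths, mindepth=6)
print(avg_total, avg_kept)
```

In this toy sample the many singletons drag the all-cluster average down, while the average over the kept clusters reflects only the high-depth ones, which is the same pattern as avg_depth_total versus avg_depth_stat.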