These are chat archives for dereneaton/ipyrad

Sep 2016
Shea Lambert
Sep 08 2016 03:17
Hi @dereneaton and @isaacovercast , I'm having trouble trying to adjust the mindepth_statistical parameter. I'm getting the error "mindepth_statistical cannot be less than mindepth_majrule", even though my params file has mindepth_majrule set to, say, 2 or 3, and mindepth_statistical set to 5. The error message returned if mindepth_statistical is < 5 indicates that 5 should be the minimum value. If I set mindepth_statistical back to 6 though, ipyrad starts running fine.
Ivan Prates
Sep 08 2016 16:50

Hey peeps, I need some clarification about a few parameters in the params file when using just one enzyme -

  1. For restriction_overhang, do I need a comma after the overhang if I'm using just one enzyme (i.e., is it "TGCAT" or "TGCAT,")?
  2. For max_Ns_consens, max_Hs_consens, max_SNPs_locus, max_Indels_locus, max_shared_Hs_locus, and edit_cutsites, do I need one or two values (i.e., is it "0.25" or "0.25, 0.25")? If I need two, and since I have only one cutter, what is the second value doing?
  3. Would you please explain what trim_overhang does? I notice that using "4, 4, 4, 4" instead of "0, 0, 0, 0" is the difference between success and failure, but I have no clue about what this parameter is doing.

Thank you so much Isaac and Deren, you're heroes to me.

Isaac Overcast
Sep 08 2016 17:39
@SheaML I'm guessing this is on a new assembly you are trying to run from step 1. There's a weird artifact of how we set parameters at the very beginning of an assembly. If you leave mindepth_statistical at 6, run step 1, then edit the params file you can change this param back to 5 and continue with steps 2-7.
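
One way an order-dependent error like this could arise, sketched in Python. This is a guess for illustration only, not ipyrad's actual code: the defaults of 6, the `set_param` and `load_in_order` helpers, and the one-at-a-time checking are all assumptions. If each value is validated against whatever the other parameter currently holds, then setting mindepth_statistical to 5 while mindepth_majrule is still at an assumed default of 6 fails, even though the params file goes on to set mindepth_majrule to 3.

```python
# Hypothetical illustration; NOT ipyrad's real implementation.
# Assumed defaults: both depth parameters start at 6.
DEFAULTS = {"mindepth_statistical": 6, "mindepth_majrule": 6}


def set_param(params, name, value):
    """Set one parameter, validating against the values currently stored."""
    trial = dict(params, **{name: value})
    if trial["mindepth_statistical"] < trial["mindepth_majrule"]:
        raise ValueError(
            "mindepth_statistical cannot be less than mindepth_majrule"
        )
    params[name] = value


def load_in_order(pairs):
    """Apply (name, value) pairs one at a time, starting from the defaults."""
    params = dict(DEFAULTS)
    for name, value in pairs:
        set_param(params, name, value)
    return params


# Statistical is checked first, while majrule is still the default 6: fails.
try:
    load_in_order([("mindepth_statistical", 5), ("mindepth_majrule", 3)])
except ValueError as err:
    print("error:", err)

# Same values, majrule applied first: passes.
print(load_in_order([("mindepth_majrule", 3), ("mindepth_statistical", 5)]))
```

Under these assumptions the same two values pass or fail depending only on the order in which they are applied, which would be consistent with the workaround of running step 1 first and lowering mindepth_statistical afterwards.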
Isaac Overcast
Sep 08 2016 18:01
@ivanprates Hey Ivan!
  1. You do not need the comma if you are just using 1 enzyme
  2. For the parameters you mention you only need to put in 2 values if your data is paired-end. It makes it so you can apply different filters to R1 vs R2.
  3. trim_overhang specifies how to deal with non-overlapping edges. If you want to trim all your reads to be the same length, this parameter sets the minimum number of samples of the shortest read per locus in order to trim to that length. Say you have 20 reads at a locus: 19 of them are 100bp and one is 75bp. Do you really want to trim that locus to 75bp? Setting this to 0,0,0,0 should prevent any edge-trimming. When you say "difference between success and failure" did you mean setting to all zeros failed? Did it error, or did it just not return any sequences?
Isaac Overcast
Sep 08 2016 18:33
@Cycadales Hi James. The progress bar stopping at 90% probably indicates some of your samples aren't successfully clustering. We have step 6 set up so that even if some samples fail (for whatever reason) it still will complete with the samples that completed successfully. It could do a better job of communicating this. What do you see in the ipyrad_log.txt?
Ivan Prates
Sep 08 2016 18:34
Hi Isaac, thanks for the quick reply and clarifications. My data is single-end, but I guess the 2-4 values separated by comma(s) are needed nevertheless, at least for some of the parameters. For instance, for the edit_cutsites parameter, I get an error if I try using "TGCAT" instead of "TGCAT, 0": "Error setting parameter 'edit_cutsites'; list index out of range; You entered: TGCAT". Runs fine if I add ", 0". But the data are single-end, so not sure what ipyrad is doing with the parameter stated like that. As for the trim_overhang, I was getting an error - I'll try it again and see if it persists. Thanks!
Isaac Overcast
Sep 08 2016 18:41
Looks like a bug in the way we're reading the edit_cutsites param, you shouldn't need to specify the value for R2 if you have SE data. I'll fix it.
Sep 08 2016 18:59
@isaacovercast ahh ok I thought that could be the thing...would it be worth running it in debug mode? I will try and get the log.txt file.
Isaac Overcast
Sep 08 2016 19:13
debug mode can't hurt, it'll give us more of an idea why those samples are failing.
Ivan Prates
Sep 08 2016 21:09
@isaacovercast, the comma after the overhang sequence for param 8 also seems to be mandatory for now; if I get rid of it, step 2 fails with the error "Error: %s AttributeError('tuple' object has no attribute 'replace')".
Deren Eaton
Sep 08 2016 21:15
@ivanprates Thanks, we'll work on making the inputs more robust to commas, etc.
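
A minimal sketch of what comma-tolerant parsing could look like, in Python. It is not taken from ipyrad; the `parse_values` helper, its arguments, and the padding behaviour are hypothetical. The idea is to always return a fixed-length tuple of strings, so that "TGCAT", "TGCAT," and "TGCAT, 0" are all accepted and later code never indexes past the end of a list or calls a string method on a tuple.

```python
# Hypothetical helper for illustration; not part of ipyrad.
def parse_values(raw, n_expected=2, pad=""):
    """Split a comma-separated params entry into a tuple of n_expected strings.

    Trailing commas are ignored and missing trailing values are filled
    with `pad`, so single-end users can omit the R2 value.
    """
    parts = [p.strip() for p in raw.split(",")]
    # Drop empty fields produced by a trailing comma, e.g. "TGCAT,".
    while parts and parts[-1] == "":
        parts.pop()
    # Pad out missing values, then cut to the expected length.
    parts += [pad] * (n_expected - len(parts))
    return tuple(parts[:n_expected])


print(parse_values("TGCAT"))             # one enzyme, no comma
print(parse_values("TGCAT,"))            # one enzyme, trailing comma
print(parse_values("TGCAT, 0", pad="0")) # explicit second value
print(parse_values("TGCAT", pad="0"))    # second value filled in
```

With this shape, downstream code can apply string operations to each element of the tuple individually, whether or not the user typed the comma.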