These are chat archives for dereneaton/ipyrad

24th
Feb 2016
Isaac Overcast
@isaacovercast
Feb 24 2016 00:00
trying to figure out what condition would cause this and what's the best way to handle it. If you have ideas lemme know.
Deren Eaton
@dereneaton
Feb 24 2016 00:06
hmm. Damn empty GBS strings keep getting us.
I think this is removing sites that look like 'fake indels', meaning there was a sequence repeat and the site column looked something like '-------A------'. Because the base shows up at very low frequency in that column, we remove it.
but in this case maybe it removed all the columns... You should definitely use LOGGER to spit out the cluster to see what is going on.
I'm doing a bunch of testing on a mac in the lab since that's what I'll be using in Idaho for the workshop it looks like. Everything seems to be running smooth.
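A minimal sketch of the 'fake indel' column filter described above, and of the edge case where it removes every column. This is not ipyrad's actual code: the function name `drop_fake_indel_columns` and the `min_bases` threshold are hypothetical, chosen only to illustrate the idea.

```python
def drop_fake_indel_columns(seqs, min_bases=2):
    """Drop alignment columns in which fewer than `min_bases` reads carry a base.

    `seqs` is a list of equal-length aligned strings that use '-' for gaps.
    The name and the `min_bases` threshold are hypothetical, for illustration.
    """
    if not seqs:
        return []
    # Indices of columns where enough reads have a real base (not a gap).
    keep = [
        idx for idx, column in enumerate(zip(*seqs))
        if sum(base != "-" for base in column) >= min_bases
    ]
    return ["".join(seq[idx] for idx in keep) for seq in seqs]


# One read has an extra 'A' that opens a gap in every other read.
aligned = [
    "AC-GT",
    "AC-GT",
    "ACAGT",
    "AC-GT",
]
cleaned = drop_fake_indel_columns(aligned)
print(cleaned)

# The edge case: every column is sparse, so every column is removed
# and each read collapses to an empty string.
sparse = ["A---", "-C--", "--G-", "---T"]
emptied = drop_fake_indel_columns(sparse)
print(emptied)
```

Anything downstream of a filter like this has to cope with zero-length sequences, which is one way an empty GBS string could appear.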
Deren Eaton
@dereneaton
Feb 24 2016 00:11
I think I'm gonna reduce the default preview_step2 value down to 100K. I'm trying to optimize it for running an empirical data set on 4 cores and having it finish in about an hour.
Isaac Overcast
@isaacovercast
Feb 24 2016 00:12
sounds good.
I'll let you know when i figure out more about this empty gbs bug
Deren Eaton
@dereneaton
Feb 24 2016 05:12
I've got the makings of a progress bar. Currently implemented in svd4tet, but I can eventually expand it to all the steps I think.
In [6]: ipa.svd4tet.main(data, ipyclient)
  loading array
  loading quartets
  populating array with 495 quartets
  [####################] 100.00%  done
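A rough sketch of how a text progress bar with this output format could be written. It is not the svd4tet implementation: `render_bar` and `progress` are hypothetical names, and the 20-character width is only inferred from the output shown above.

```python
import sys


def render_bar(done, total, width=20):
    """Return a bar such as '[##########          ]  50.00%'."""
    fraction = float(done) / total if total else 1.0
    filled = int(width * fraction)
    bar = "#" * filled + " " * (width - filled)
    return "[{}] {:6.2f}%".format(bar, 100.0 * fraction)


def progress(done, total, stream=sys.stdout):
    """Redraw the bar in place; finish the line once all work is done."""
    # '\r' moves the cursor back to the start of the line so the bar overwrites itself.
    stream.write("\r  " + render_bar(done, total))
    if done >= total:
        stream.write("  done\n")
    stream.flush()


# Example: 495 work units, matching the quartet count in the output above.
for finished in range(1, 496):
    progress(finished, 495)
```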
Isaac Overcast
@isaacovercast
Feb 24 2016 16:05
Sweet!
Isaac Overcast
@isaacovercast
Feb 24 2016 22:12
So i verified this problem i've been having in step 5, calling removerepeats on a consensus sequence that's all N's:
removerepeats(consens['N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'
 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N' 'N'])
Thinking I'll wrap the call to removerepeats in a try/except, and then just continue the while loop if it catches an exception. Think this is kosher?
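A sketch of the try/except-and-continue pattern proposed above. The real `removerepeats` signature and the exception it raises are not shown in this chat, so `removerepeats_stub` is a hypothetical stand-in that simply fails on an all-N consensus, and `process_clusters` is an illustrative loop rather than the step 5 code.

```python
import logging

LOGGER = logging.getLogger(__name__)


def removerepeats_stub(consens):
    """Hypothetical stand-in: raises when no called base is left."""
    kept = [base for base in consens if base != "N"]
    if not kept:
        raise ValueError("consensus is all N")
    return "".join(kept)


def process_clusters(clusters):
    """Clean each consensus, skipping the ones the cleaner cannot handle."""
    results = []
    skipped = 0
    idx = 0
    while idx < len(clusters):
        consens = clusters[idx]
        idx += 1
        try:
            cleaned = removerepeats_stub(consens)
        except ValueError as err:
            # Log the offending cluster, then move on to the next one.
            LOGGER.debug("skipping cluster %r: %s", consens, err)
            skipped += 1
            continue
        results.append(cleaned)
    return results, skipped


results, skipped = process_clusters(["ACGT", "NNNN", "ANNT"])
print(results, skipped)
```

Catching a specific exception type and logging the cluster keeps other bugs visible, which a bare `except` would hide.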
Deren Eaton
@dereneaton
Feb 24 2016 22:14
Nice. That sounds right. Are they all Ns because of low depth? If so it seems like the locus should be excluded before it gets to that step.
Isaac Overcast
@isaacovercast
Feb 24 2016 22:19
I'll have to look, i didn't print out depth info, but i can, should be quick.
I think this is the offending chunk:
>cra-ant-72854-Amazonas-Duraznopampa_1380853_r1;size=1;*0
TGCAGAGGGCTGGCTCACATGCTGTTTCCATGCTAGGATCCTTCTGGCAGTGTTGGCCCAGGGGAGCTCCTGTCTTCCTGCATCTGCTTCTACAG-------------------------------------------------------------
>cra-ant-72854-Amazonas-Duraznopampa_1585588_r1;size=1;+1
TGCAGAGGGCTGGCTCACATGCTGTTTCCATGCTAGGATCCTTCTGGCAGTGTTGGCCCAGGGGNNCTCCTGTCTTCCTGCATCTGCTTCTACAN-------------------------------------------------------------
>cra-ant-72854-Amazonas-Duraznopampa_1756629_r1;size=1;-2
--------------------------------------------------------------GGAGCTCCTGTCTTCCTGCATCTGCTTCTACAGCAAGGGCAGGTTTTAATGTTCTGCTCTGCTCTTACTCCTCTATCCCCTCCAGGGATGCTGC
>cra-ant-72854-Amazonas-Duraznopampa_2288986_r1;size=1;+3
TGCAGAGGGCTGGCTCACATGCTGTTTCCATGCTAGGATCCTTC----------------------------------------------------------------------------------------------------------------
>cra-ant-72854-Amazonas-Duraznopampa_2449243_r1;size=1;+4
TGCAGAGGGCTGGCTCACATGCTGTTTCCATGGTTGGATCCTTCTGGCAGTGTTGGCCCAGGGGAGCTCCTGTCTTCCTGCATCTGCTTCTACAG-------------------------------------------------------------
>cra-ant-72854-Amazonas-Duraznopampa_552425_r1;size=1;+5
TGCAGAGGGCTGGCTCACATGCTGTTTCCATGGTTGGATCCTCCTGGCAGTGTTGGCCCAGGGGAGCTCCTGTCTTCCTGCATCTGCTTCTACAG-------------------------------------------------------------
That's def the offending chunk. (Good name for a teen garage band from the 90's: "The offending chunk")
Deren Eaton
@dereneaton
Feb 24 2016 22:23
Yeah, there are no sites with depth > 5.
Lol
It passes the earlier filter because the read depth of the cluster is > 5, but no individual site has depth > 5. An odd GBS thing that can happen.
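The difference between read depth and site depth can be shown with a toy cluster. This is a sketch with made-up reads, not the cluster above; `site_depths` and `MIN_DEPTH` are hypothetical names, and the threshold of 5 is taken from the "> 5" in the discussion.

```python
MIN_DEPTH = 5  # hypothetical threshold: a depth must be > 5 to pass


def site_depths(seqs):
    """Count the reads with a called base (not '-' or 'N') in each column."""
    return [sum(base not in "-N" for base in column) for column in zip(*seqs)]


# Six staggered reads: the cluster has 6 reads, but no column is covered by all 6.
reads = [
    "ACGTAC----",
    "ACGTAN----",
    "----ACGTTA",
    "ACG-------",
    "ACGTAC----",
    "ACCTAC----",
]

read_depth = len(reads)
depths = site_depths(reads)

passes_read_filter = read_depth > MIN_DEPTH           # cluster-level check
callable_sites = [d for d in depths if d > MIN_DEPTH]  # site-level check

print(read_depth, depths)
print(passes_read_filter, callable_sites)
```

The cluster passes the read-depth check, yet no column has enough coverage for a base call, so the consensus would come out as all N's.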
Isaac Overcast
@isaacovercast
Feb 24 2016 22:25
yeah, totally. I'll just handle it the way i said earlier, seems appropriate.
Deren Eaton
@dereneaton
Feb 24 2016 22:25
Yeah