These are chat archives for dereneaton/ipyrad

12th
Dec 2015
Isaac Overcast
@isaacovercast
Dec 12 2015 20:10
question: mindepth_majrule vs mindepth_statistical. I'm considering throwing out all pileups that do not satisfy the mindepth requirement. Seems consistent with expectations from denovo, and probably will save us a little time. Which of these would be more appropriate to use in the refmap pipeline?
Deren Eaton
@dereneaton
Dec 12 2015 22:43
Hmm, in the case of the denovo clusters it does not exclude any low depth clusters in step3, and so the header "clusters_kept" is not really fitting. It is simply reporting how many fall below the "mindepth_majrule". The low depth clusters are then ignored in step4, and removed in step5. I guess I see it as a good thing to leave them in the data set in case users change their mind and want to 'branch' their assembly at step5 to make both low depth and high depth base calls. If they got to that point and then found out their low depth calls had been removed at step3 it might be confusing... But then again we do report "clusters_kept" for that reason. It lets them know a bunch of their data at the current settings isn't going to be used...
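The report-but-do-not-remove idea described above could be sketched roughly as follows. This is only an illustration: the function name `count_by_depth`, the example depths, and the choice of an inclusive threshold are assumptions, not ipyrad's actual step3 code.

```python
def count_by_depth(depths, mindepth_majrule):
    """Return (n_below, n_at_or_above) without dropping any cluster.

    Assumption: a cluster "passes" when its depth is >= the threshold.
    """
    n_below = sum(1 for d in depths if d < mindepth_majrule)
    return n_below, len(depths) - n_below


# Hypothetical per-cluster read depths for one sample.
depths = [1, 2, 6, 10, 3, 25]
n_below, n_pass = count_by_depth(depths, mindepth_majrule=6)

# Every cluster is still present; only the counts are reported.
print("clusters_total", len(depths))
print("below_mindepth", n_below)
print("at_or_above_mindepth", n_pass)
```

Because nothing is filtered here, a later step could still be rerun with a lower threshold on the same data.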
Deren Eaton
@dereneaton
Dec 12 2015 22:51
It seems to me the easiest thing to do would be to leave all the low depth clusters in and just drop the refmapped clusters in with the denovo clusters before the muscle alignments. If the only issue is speed we can test it later and see if it's worth excluding lowdepth clusters earlier.
Isaac Overcast
@isaacovercast
Dec 12 2015 23:27
That's totally doable. I was thinking something similar myself actually, re: dropping refmapped back in before muscle. It will be an easy change.
I got the full stack refmap working yesterday, and have been working today on getting it to parallelize the post-processing (identifying stacks, making pileups, decompiling pileups to fasta, etc).
I copied the structure of multi_muscle_align, in terms of passing in the ipyclient and then having the function chunk up and deal out the tasks to the clients. I'm having a weird problem and I'm hoping your experience with ipp will help. I have a bunch of methods inside refmap to do subtasks associated with the post-flight processing. When I had it running serially it all worked fine, but now that I'm trying to push it to ipclients the clients can't see these methods.
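The "chunk up and deal out" pattern mentioned above might look something like this minimal sketch. The names `chunk`, `deal_out`, `local_submit`, and `count_items` are hypothetical and not taken from ipyrad. The `submit` argument stands in for whatever dispatches work; with ipyparallel it could plausibly be the `apply` method of a load-balanced view, in which case the returned items would be async results rather than plain values.

```python
def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def deal_out(items, size, submit, work):
    """Call submit(work, part) once per chunk and collect what it returns."""
    return [submit(work, part) for part in chunk(items, size)]


def local_submit(func, *args):
    """Stand-in dispatcher that runs the task in the current process."""
    return func(*args)


def count_items(part):
    """Hypothetical worker: report how many items the chunk holds."""
    return len(part)


results = deal_out(list(range(10)), 4, local_submit, count_items)
print(results)
```

Keeping the dispatcher as a parameter makes it easy to run the same code serially for debugging before handing it to remote engines.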
Isaac Overcast
@isaacovercast
Dec 12 2015 23:34
I gather ipyrad.assemble.refmap hasn't been imported into the client view correctly or at all, but I can't figure out the best way to do that. I want to do something similar to how clustall() works, calling other methods to break up the functionality. Got any insight based on your ipp experience? What am I doing wrong?
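One possible explanation, offered as an assumption rather than a diagnosis: functions defined in an importable module are typically serialized by reference (module name plus function name), so the receiving process has to be able to import that module itself. The standard-library sketch below shows this behaviour with plain `pickle`; it does not use ipyparallel, whose own serialization may differ in detail.

```python
import os.path
import pickle

# Pickling a module-level function stores a reference, not the code.
payload = pickle.dumps(os.path.join)

# The function's name travels in the payload...
print(b"join" in payload)

# ...and unpickling simply re-imports the module and looks the name up,
# which only works if the module is importable where loads() runs.
restored = pickle.loads(payload)
print(restored is os.path.join)
```

If that is what is happening here, helper methods in refmap would only be visible on the engines once the package can be imported there.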