These are chat archives for dereneaton/ipyrad

6th
Jan 2016
Deren Eaton
@dereneaton
Jan 06 2016 15:55
Well, two things I can think of. First, it ensures get_params() will print in the same order all the time, but I suppose we could alternatively enforce that by sorting the keys when printing. But more importantly, their order in the ordereddict is currently the thing that determines the index of params, and thus our ability to call set_params(1, "./"). We could work around that as well by having a simple translation dict, from numbers to params, in set_params. So, I guess, no, there is nothing requiring us to stick with an ordereddict, we just need to make some fixes to allow it.
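A minimal sketch of the translation-dict idea, in case it helps to see it. The class name `ToyAssembly`, the `PARAM_INDEX` mapping, and the parameter names are all made up for illustration and are not ipyrad's actual code; only the `set_params(1, "./")` call shape comes from the discussion above.

```python
# Hypothetical mapping from param numbers to param names (illustrative only).
PARAM_INDEX = {
    1: "working_directory",
    2: "raw_fastq_path",
    3: "barcodes_path",
}


class ToyAssembly:
    """Stand-in for an Assembly that stores params in a plain dict."""

    def __init__(self):
        # Plain dict: insertion order is deliberately unrelated to the numbering.
        self.paramsdict = {
            "barcodes_path": "",
            "working_directory": "",
            "raw_fastq_path": "",
        }

    def set_params(self, param, value):
        # Accept either a number or a name; numbers go through the translation dict.
        key = PARAM_INDEX[param] if isinstance(param, int) else param
        if key not in self.paramsdict:
            raise KeyError("unknown parameter: {!r}".format(param))
        self.paramsdict[key] = value

    def get_params(self):
        # Sort by number so the printed order is stable regardless of dict order.
        return [
            (idx, PARAM_INDEX[idx], self.paramsdict[PARAM_INDEX[idx]])
            for idx in sorted(PARAM_INDEX)
        ]


data = ToyAssembly()
data.set_params(1, "./")
for idx, name, value in data.get_params():
    print(idx, name, repr(value))
```

With this, the numbering lives in one place and no longer depends on how the params are stored.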
I'm trying to think of where this would really matter... but do you think maybe population should be a Sample level attribute, instead of an Assembly attribute? That way we could make it so that an Assembly object only holds pop info for the Samples that are currently linked to it.
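Roughly what I have in mind, as a sketch only. `ToySample`, `PopAssembly`, `link()` and the `populations` property are hypothetical names, not the current ipyrad API; the point is just that the Assembly derives its pop info from whichever Samples are linked to it.

```python
from collections import defaultdict


class ToySample:
    """Hypothetical Sample that carries its own population label."""

    def __init__(self, name, population=None):
        self.name = name
        self.population = population


class PopAssembly:
    """Hypothetical Assembly that only knows about its linked Samples."""

    def __init__(self):
        self.samples = {}

    def link(self, sample):
        self.samples[sample.name] = sample

    @property
    def populations(self):
        # Built on demand, so it never lists Samples that are not linked.
        pops = defaultdict(list)
        for sample in self.samples.values():
            if sample.population is not None:
                pops[sample.population].append(sample.name)
        return {pop: sorted(names) for pop, names in pops.items()}


asm = PopAssembly()
asm.link(ToySample("1A_0", population="north"))
asm.link(ToySample("1B_0", population="north"))
asm.link(ToySample("2E_0", population="south"))
print(asm.populations)
```

The output-format code could still read `populations` from the Assembly, so the convenience of having it at the Assembly level would not be lost.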
Deren Eaton
@dereneaton
Jan 06 2016 16:13
Should we write samples to state=6 after step6? I think so. Even though it is technically creating new Assembly object features by finding which samples cluster together, and not really doing anything new to individual Samples. But still, if the point is just to indicate whether or not step6 has been run already for those samples then setting the state to 6 makes sense.
_get_samples() really cleans things up nicely.
I'm gonna work on finishing the code to build the supercatg array today.
Isaac Overcast
@isaacovercast
Jan 06 2016 17:19
re: paramsdict, not a big deal, doesn't really gain us anything to make the switch so cancel that idea.
re: population as sample attribute rather than assembly attribute? Oh man, that's probably a good idea. i literally just finished the assembly.populations config, and it's very convenient to have it as an assembly param from the perspective of generating the different output formats... I will think about sample.population and whether it gains us enough to warrant refactoring.
Isaac Overcast
@isaacovercast
Jan 06 2016 18:06
I did a little clever hacking and rewrote all loci2*.make() functions to accept just two args, the assembly and a list of samples. The only param i'm having trouble with is 'seed' for loci2SNP. Looks like seed used to be a parameter in V3. You think we should incorporate 'seed' as a param in hackers only dict? Is it ever useful to use the same seed over and over? My instinct would be to set this randomly, but if it's useful to seed the same i can add it to hackers dict.
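For reference, a sketch of how an optional seed could work, assuming the seed only drives some random sampling step (here, picking one SNP per locus). `pick_one_per_locus` and its arguments are invented for illustration and are not the loci2SNP code. Leaving `seed=None` gives a random draw each run, while passing a fixed value makes the draw repeatable.

```python
import random


def pick_one_per_locus(loci, seed=None):
    """Pick one item at random from each non-empty locus.

    loci: list of lists (hypothetical stand-in for the SNPs of each locus).
    seed: None for a fresh random draw, or any fixed value for a repeatable one.
    """
    # A private generator avoids touching the global random state.
    rng = random.Random(seed)
    return [rng.choice(snps) for snps in loci if snps]


loci = [["a", "b", "c"], ["d", "e"], [], ["f"]]
print(pick_one_per_locus(loci))            # different from run to run
print(pick_one_per_locus(loci, seed=123))  # same result every run
```

The repeatable case is mostly useful for tests and for reproducing a specific output file exactly.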
Isaac Overcast
@isaacovercast
Jan 06 2016 20:26
Will you check this out and let me know if you think this is not a good way to do it: #44
Isaac Overcast
@isaacovercast
Jan 06 2016 20:40
I merged refmap into master, i tested and fixed conflicts before merging origin/master. It's a big merge, but seems to be pretty solid.
Isaac Overcast
@isaacovercast
Jan 06 2016 21:22
Step2 for pair* datatype is failing with this message "other error: mismatched number of paired read files", for me, through both api and cli. Are you seeing this?
Isaac Overcast
@isaacovercast
Jan 06 2016 22:48
re: step2() mismatch, nvm i fixed it 9b605bd
Deren Eaton
@dereneaton
Jan 06 2016 23:38
cool. I'm making progress on step6. I'll wait to pull for now.