These are chat archives for dereneaton/ipyrad

22nd
Apr 2016
Isaac Overcast
@isaacovercast
Apr 22 2016 00:00
Are you seeing simulated SE RAD data recovering < 1000 clusters after step3? This is in 'denovo'. Wondering if it's an artifact of some of the changes you're working on in step 2
Deren Eaton
@dereneaton
Apr 22 2016 00:00
yeah
Isaac Overcast
@isaacovercast
Apr 22 2016 00:00
ok cool
Deren Eaton
@dereneaton
Apr 22 2016 00:00
it's a problem in step1 or 2, tho
working on it
I made step1 about 10X faster
but not quite working yet
Isaac Overcast
@isaacovercast
Apr 22 2016 00:01
lol, we could just call it a non-deterministic algorithm
Deren Eaton
@dereneaton
Apr 22 2016 00:01
ha
Isaac Overcast
@isaacovercast
Apr 22 2016 00:01
:p nice work on the speedup tho
Deren Eaton
@dereneaton
Apr 22 2016 00:02
thanks, finally paid attention to some super newb old parts of the code
Isaac Overcast
@isaacovercast
Apr 22 2016 01:35
It seems like the new apply style for ipyclients swallows exceptions: they fail silently, and the only way I can see them is by turning on debug logging for ipcluster and then looking inside the .ipython/<profile>/log files. Are you seeing this too? There also appears to be no indication in the async result that something went wrong: no metadata.error, and metadata.status always says 'ok'
still having some super newb growing pains with the new step 3 mindset, knuckling it out tho
Isaac Overcast
@isaacovercast
Apr 22 2016 02:37
Oop, if you call get() then it'll return the exceptions
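A rough sketch of the pattern described above, for anyone hitting the same thing. It uses the standard library's concurrent.futures as a stand-in rather than ipyparallel itself, so the names here (failing_task, run_and_collect) are hypothetical and not ipyrad code. The point is only that an exception raised inside a worker stays hidden until the result is explicitly requested.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_task(x):
    # Hypothetical worker function that always fails.
    raise ValueError(f"bad input: {x}")


def run_and_collect(func, arg):
    """Submit func(arg) to a worker and report what happened."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(func, arg)
        # Nothing is raised at submit time; the error is stored on the future.
        try:
            # Asking for the result is what re-raises the worker's exception.
            return ("ok", future.result())
        except Exception as err:
            return ("error", repr(err))


status, detail = run_and_collect(failing_task, 3)
print(status, detail)
```

With ipyparallel the analogous step would be calling get() on the async result, as noted in the message above.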
Deren Eaton
@dereneaton
Apr 22 2016 15:26
that's right. I left some notes in the code, but not much.
I can try to put together a cheat sheet of some tricks
Deren Eaton
@dereneaton
Apr 22 2016 16:34
Looks like we can demultiplex a full lane of data in about 15 mins now.
using 8 cores
Deren Eaton
@dereneaton
Apr 22 2016 19:00
whoa, did you know about 'memoize'?
It's a decorator you can put above a function, and it caches the results like dictionary values keyed by the set of arguments to the function. So if you call the same function a bunch of times with the same arguments, it just returns the cached value instead of running the function again. Can't use it for things that would store way too much in memory, but I'm testing it in the basecaller for step5 and it's way faster without much of a memory bump.
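A minimal sketch of what a memoize decorator like that could look like, assuming the arguments are hashable. This is not the ipyrad implementation; slow_square is a made-up stand-in for an expensive function such as the basecaller.

```python
import functools


def memoize(func):
    """Cache results of func in a dict keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Only run the real function the first time these arguments are seen.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed so the cache size can be inspected
    return wrapper


calls = {"n": 0}  # counts how often the real function body runs


@memoize
def slow_square(x):
    # Hypothetical stand-in for an expensive computation.
    calls["n"] += 1
    return x * x


results = [slow_square(4) for _ in range(1000)]
print(results[0], calls["n"], len(slow_square.cache))
```

The cache grows by one entry per distinct argument tuple, which is why it only suits functions called with a limited set of inputs. On Python 3, functools.lru_cache offers the same idea with an optional size cap.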
Deren Eaton
@dereneaton
Apr 22 2016 20:06
BTW, for the sim_rad_test, step1 will currently raise a warning unless you set the max number of mismatches between barcodes to 0. I didn't realize the barcodes I made up were within 1 base difference of each other. It doesn't stop the run but just prints a warning.
I'll work on replacing the sim data soon.
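A small sketch of how barcodes that are too similar could be spotted before a run. The barcode names and sequences are invented for illustration, and the rule used here (flag any pair that differs by no more than the allowed mismatches) is an assumption, not necessarily the exact check step1 performs.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def close_pairs(barcodes, max_mismatches):
    """Return (name1, name2, distance) for pairs within max_mismatches."""
    flagged = []
    for (n1, s1), (n2, s2) in combinations(sorted(barcodes.items()), 2):
        dist = hamming(s1, s2)
        if dist <= max_mismatches:
            flagged.append((n1, n2, dist))
    return flagged


# Hypothetical barcodes; the first two differ at a single base.
barcodes = {"1A_0": "CATCAT", "1B_0": "CATCAA", "1C_0": "GGGTTT"}

with_one = close_pairs(barcodes, max_mismatches=1)
with_zero = close_pairs(barcodes, max_mismatches=0)
print(with_one, with_zero)
```

Allowing one mismatch flags the first pair, while allowing zero flags nothing, which matches the workaround described above.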