These are chat archives for dereneaton/ipyrad

29th
Feb 2016
Deren Eaton
@dereneaton
Feb 29 2016 17:52
Hmm, I have a job running step3 on HPC that was killed for some reason, and it didn't checkpoint the finished clusters at state=2.5. Maybe it has to do with how it was killed. It also left behind the tmp-aligns/ dir. Seems the try/except cleanup effort doesn't work like we had hoped when a node is shut down without warning...
One way around this would be to do multiple saves while the step is running.
Deren Eaton
@dereneaton
Feb 29 2016 19:51
For example: a save could happen after each sample is clustered, and again after each finishes aligning. This will take some restructuring. I can work on it while doing the major parallel restructuring and progress bars. Might be a lot of work.
Did you get Glenn's GBS to go all the way through step7 yet? That will be a milestone.
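The per-chunk checkpointing idea above could look roughly like this. This is a hypothetical sketch, not ipyrad's actual API: `save_checkpoint`, `run_step3`, and the JSON checkpoint format are all assumptions for illustration. The key points are saving after every finished sample and writing the file atomically so a hard kill can't leave a half-written checkpoint:

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write the checkpoint atomically: dump to a temp file, then rename.
    A kill mid-write leaves the old checkpoint intact."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic rename on POSIX

def run_step3(samples, ckpt="assembly.ckpt.json"):
    """Hypothetical step-3 loop that checkpoints after each sample."""
    done = []
    if os.path.exists(ckpt):
        with open(ckpt) as f:
            done = json.load(f)["done"]
    for name in samples:
        if name in done:
            continue  # finished before the previous job was killed; skip
        # ... cluster and align `name` here ...
        done.append(name)
        save_checkpoint(ckpt, {"state": 2.5, "done": done})
    return done
```

On restart the loop skips samples already recorded in the checkpoint, so a killed job resumes where it left off instead of redoing the whole step.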
Isaac Overcast
@isaacovercast
Feb 29 2016 20:55
Glenn's data is still cranking through step 6. I actually had a different experience with the HPC killing jobs: the LSU cluster has a 72hr wall-clock limit, so it kills long jobs (like step 3) before they're done, but it killed them in a nice way where state 2.5 was preserved and the tmp files were cleaned up. Maybe different cluster schedulers have different strategies for killing tasks.
Deren Eaton
@dereneaton
Feb 29 2016 21:05
Oh, that's good to know. I'll try to find out more details.