@dereneaton Deren, thanks for your reply. I reran step 67 and got this message:
2016-10-01 14:46:20,738 pid=14025 [assembly.py] INFO Unable to create dataset (Chunk size must be < 4gb)
2016-10-01 14:46:20,764 pid=14025 [assembly.py] INFO shutting down engines
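For context, that "Chunk size must be < 4gb" message is HDF5's hard limit: a single dataset chunk must be smaller than 4 GiB. A minimal sketch of working around it, assuming h5py and a hypothetical helper (`capped_chunks` and its `start_rows` default are illustrative, not ipyrad's actual code):

```python
import numpy as np

# HDF5 refuses any chunk of 4 GiB or more, hence the
# "Chunk size must be < 4gb" error in the log above.
MAX_CHUNK_BYTES = 2**32 - 1

def capped_chunks(shape, dtype, start_rows=100_000):
    """Halve the row dimension of a candidate chunk shape until the
    chunk fits under HDF5's 4 GiB limit. `start_rows` is a hypothetical
    starting chunk height, not what ipyrad actually uses."""
    row_bytes = np.dtype(dtype).itemsize * int(np.prod(shape[1:], dtype=np.int64))
    rows = min(start_rows, shape[0])
    while rows > 1 and rows * row_bytes >= MAX_CHUNK_BYTES:
        rows //= 2
    return (rows,) + tuple(shape[1:])

# usage with h5py would then look like:
#   f.create_dataset("catgs", shape=shape, dtype="uint8",
#                    chunks=capped_chunks(shape, "uint8"))
```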
@dereneaton @isaacovercast I have that big dataset that I decided to split into 12 subsets; on each subset I ran steps 1, 2 and 3 successfully. I then merged the 12 before going to step 4, but it didn't work because the merged .json has every path written multiple times. I hope it's not a problem that I attach them here.
@edgardomortiz I see, yeah we haven't supported merging at every step yet.
I think just at step 2 currently. We should support merging at steps 4 and 5; I think it was just an oversight. But yeah, right now it's concatenating the string paths instead of making them into a list, which is obviously a problem.
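The symptom above can be sketched generically: when merging per-subset .json records, joining path strings with string concatenation fuses them into one garbled path, whereas the merge should collect them into a list. A minimal illustration (the `fastq_path` field name is hypothetical, not ipyrad's actual JSON schema):

```python
def merge_paths(assemblies, key="fastq_path"):
    """Collect per-assembly path values into one flat list.
    `assemblies` is a list of dicts loaded from the per-subset .json
    files; `fastq_path` is a hypothetical field name for illustration."""
    merged = []
    for asm in assemblies:
        val = asm[key]
        # A naive `merged_path += val` on strings concatenates the
        # paths into one; wrap scalars so each path stays distinct.
        merged.extend(val if isinstance(val, list) else [val])
    return merged
```

So merging two subsets whose records hold `"a/R1.fastq"` and `"b/R1.fastq"` yields the two-element list rather than one fused string.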
I tried running the paired-end data for all 666 inds in a single set, but it timed out. When I split it into groups of 55 or 56 inds, most finished in less than 48 hrs; however, I still had some subsets that timed out at the 48-hr limit. The files I posted come from an analysis of just R1, which was super fast for step 3.