These are chat archives for dereneaton/ipyrad

3rd
Oct 2016
jianlizhao
@jianlizhao
Oct 03 2016 07:12
@dereneaton Deren, thanks for your reply. I reran steps 6 and 7 and got this message: 2016-10-01 14:46:20,738 pid=14025 [assembly.py] INFO Unable to create dataset (Chunk size must be < 4gb)
2016-10-01 14:46:20,764 pid=14025 [assembly.py] INFO shutting down engines
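For context, the "Chunk size must be < 4gb" error comes from HDF5, which enforces a hard 4 GiB limit on any single chunk; ipyrad writes its step-6/7 databases as HDF5 arrays. A minimal sketch of capping a chunk shape to stay under that limit (the helper and the example array shape are hypothetical, not ipyrad's actual fix):

```python
import numpy as np

# HDF5 rejects any chunk larger than 4 GiB with the error
# "Chunk size must be < 4gb". Shrink the leading chunk dimension
# so rows * row_bytes stays under the limit.
CHUNK_LIMIT = 2**32 - 1  # bytes, just under 4 GiB

def capped_chunks(shape, dtype):
    """Return a chunk shape whose byte size fits under the HDF5 limit."""
    row_bytes = np.dtype(dtype).itemsize * int(np.prod(shape[1:], dtype=np.int64))
    max_rows = max(1, CHUNK_LIMIT // row_bytes)
    return (min(shape[0], max_rows),) + tuple(shape[1:])

# e.g. a large sequence array: loci x read-length x 4 one-hot bases
print(capped_chunks((5_000_000, 300, 4), "u1"))
```

A chunk shape like this can then be passed as the `chunks` argument to `h5py`'s `create_dataset` instead of letting one chunk span the whole array.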
R2C2.lab
@R2C2_Lab_twitter
Oct 03 2016 09:39
STEP6.tiff
Hi @dereneaton @isaacovercast, I am getting the attached message in step 6. My clustering threshold is 0.85, PE 300 bp reads. Any idea?
Edgardo M. Ortiz
@edgardomortiz
Oct 03 2016 14:33
You need to update to ipyrad 0.4.1 with conda update -c ipyrad ipyrad; that bug was fixed after 0.3.42
R2C2.lab
@R2C2_Lab_twitter
Oct 03 2016 14:59
@edgardomortiz thanks, my version is 0.3.42
Edgardo M. Ortiz
@edgardomortiz
Oct 03 2016 15:04
Unfortunately I think that you will need to re-run the pipeline from step 1
Edgardo M. Ortiz
@edgardomortiz
Oct 03 2016 15:24
@dereneaton @isaacovercast I have that big dataset that I decided to split into 12 subsets; on each subset I ran steps 1, 2, and 3 successfully. I then merged the 12 before going to step 4, but it didn't work because the merged .json has every path written multiple times. I hope it's not a problem that I attach them here.
Deren Eaton
@dereneaton
Oct 03 2016 15:35
@jianlizhao ah, thanks. This should be easy to fix. I'll work on it.
@R2C2_Lab_twitter Yes, I think you'll have to restart the assembly due to some major new changes. Sorry.
Deren Eaton
@dereneaton
Oct 03 2016 15:53
@edgardomortiz I see, yeah we haven't supported merging at every step yet.
I think just at step 2 currently. We should support merging at 4 and 5, I think it was just an oversight. But yeah, right now it's concatenating the string paths instead of making them into a list, which is obviously a problem.
Edgardo M. Ortiz
@edgardomortiz
Oct 03 2016 16:05
I tried running the paired-end data for all 666 inds in a single set, but it timed out. When I split into groups of 55 or 56 inds, most finished in less than 48 hrs; however, I still had some subsets that timed out after the 48 hr limit. The files I posted come from an analysis of just R1, which was super fast for step 3.
R2C2.lab
@R2C2_Lab_twitter
Oct 03 2016 17:09
@dereneaton ok, no problem. Only step 3 takes long (~3 days); steps 1 and 2 are quite fast
Deren Eaton
@dereneaton
Oct 03 2016 17:10
Cool. And I would recommend trying out a branch with filter_adapters set to 2. I'm seeing some substantial improvements to my assemblies with the new filtering options applied.
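From the command line, branching and re-running with the new filter setting looks roughly like this; the params filename and branch name are placeholders (a sketch assuming the ipyrad CLI's -b and -s flags):

```shell
# create a new branch of the assembly (names are placeholders)
ipyrad -p params-data.txt -b data_filt2

# edit params-data_filt2.txt and set filter_adapters to 2,
# then re-run the branch from step 2 onward
ipyrad -p params-data_filt2.txt -s 234567
```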
R2C2.lab
@R2C2_Lab_twitter
Oct 03 2016 17:12
@dereneaton ok, I’ll do that. Thanks