May 24 2016 10:12
@isaacovercast @dereneaton OK, so I think I may have a problem with step three too. I have already had an email conversation with Isaac. Basically, I am testing 13 samples on a 24-core Red Hat system with 96 GB of RAM, and step three seems to be taking forever... As of this morning it has been three days 20 hrs and clustering is still at 0%. I have checked what is running in the system background, and vsearch is it. Also, I have used the -c 24 option to force it to use all cores. I am not sure if this is the same problem, but I have not got past this step yet.
This is the total number of reads I am running through it:
                        reads_raw  filtered_by_qscore  filtered_by_adapter  reads_passed
Bspe-15-39_S1_001         3217808                3403                 9873       3204532
Carm15-24-9-R_S11_001     3210918                2321                 7172       3201425
Ccal-16-18-8-R_S16_001    3123683                2416                 6580       3114687
Ckue-15-40_S86_001        8558022                9132                19249       8529641
Cmac-15-3-5-R_S51_001     3501176                2658                 7407       3491111
Ctai-15-75_S88_001        2905407                3282                 6885       2895240
Dmej-15-41-R_S12_001      2795366                1949                 8912       2784505
Eleb-15-42_S90_001        3075899                3428                 9240       3063231
Lper-15-43_S91_001        5519262                5813                19053       5494396
Mcal-15-45_S92_001        3622719                4287                11642       3606790
Mjoh-15-44_S93_001        9895138               10610                29673       9854855
Seri-15-46_S94_001        3058563                3212                10873       3044478
Zint-15-47_S95_001        3653987                3744                 7900       3642343
Isaac Overcast
May 24 2016 16:14
@roneytan Hey Ron, will you DL 0.2.9 and try it out? I have run step 3 on those example files several times and haven't seen an issue.
@cycadeles Yeah, I'm seeing something similar. I have had step 3 running on your samples for many, many days; it should have been done long ago. I'll check it out and email you.
May 24 2016 19:52
@isaacovercast Hi Isaac. I pulled 0.2.9 and ran a couple of datasets. They both worked and made it through step 3. One of them had the phredq score changed, so it isn't hanging on that, for sure. So, everything looks good. I'll let you know if any other troubles surface.
Isaac Overcast
May 24 2016 20:57
@roneytan Awesome. Good news. Thanks Ron.
@dereneaton I'm only half understanding the difference between query_cov and clust_threshold. My suspicion is that something funny is going on w/ one of those during clustering w/ @Cycadales' 150bp PE ezRAD data, bcz the number of unmatched reads (htemp) is huge compared, at least, to Glenn's GBS data.