These are chat archives for dereneaton/ipyrad

17th
Oct 2018
Tom Barbette
@tbarbette
Oct 17 2018 07:08
So, any idea why the assembly does not use the available cores? I mean, the ipyrad processes are idle most of the time. I think that if I gave "-c 1" it would take the same time. Only the threading works. Is this expected?
Isaac Overcast
@isaacovercast
Oct 17 2018 15:09
@tbarbette Which step are we talking about here?
The -c flag does allocate cores accurately and distributes the workload across these. Depending on which step is running there can be more or less parallel workload to distribute. For example, there is one part of step 6 that is not parallelized, so it can appear at this stage that the -c flag isn't doing anything, but in fact this is not the case.
Tom Barbette
@tbarbette
Oct 17 2018 16:23
Well, step 3 is by far the longest for me. For most of step 3, if I use -c 16 -t 2 I never use more than 3 CPUs: most ipyrad (ipcluster) processes are idle except one or two that actually run vsearch (and those vsearch processes do follow the -t threading I ask for). So for now I use -c 16 -t 16, but I could just as well use -c 2 -t 16; it would give the same results.
AliceLedent
@AliceLedent
Oct 17 2018 21:48
@isaacovercast, OK, thank you again! Then do you think that going from a mindepth_majrule of 3 to a mindepth_majrule of 2 could cause the job to crash in step 3 (building_clusters)? Is there any possible reason? Thanks!
AliceLedent
@AliceLedent
Oct 17 2018 22:04
@isaacovercast Maybe knowing that I'm using paired-end data could help find the problem? Maybe vsearch doesn't support a mindepth_majrule of 2? Are the reads first filtered with mindepth_majrule and then passed to vsearch, or the other way around? Thanks
Isaac Overcast
@isaacovercast
Oct 17 2018 22:53
@AliceLedent The minimum value for mindepth_majrule is 3, I think, but step 3 doesn't pay attention to the mindepth* values. The mindepth settings are used in step 5.
What is the error that you're getting? Can you rerun step 3 with the -d flag to switch on debug and show me the last 20-30 lines of the ipyrad_log.txt file?
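One way to pull out those last lines is sketched below. This is only an illustration: the helper name `last_lines` is hypothetical, and it assumes the log is the ipyrad_log.txt file in the current working directory.

```python
import os
from collections import deque


def last_lines(path, n=30):
    """Return the last `n` lines of a text file as a list of strings."""
    with open(path, "r", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    # Assumed location of the log; adjust the path if yours differs.
    log_path = "ipyrad_log.txt"
    if os.path.exists(log_path):
        print("\n".join(last_lines(log_path, 30)))
```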
Isaac Overcast
@isaacovercast
Oct 17 2018 22:58
@tbarbette The -c flag indicates how many cores to allocate total. The -t flag indicates how many of these cores to allocate per vsearch process, so for the best results -t should evenly divide -c. Performance is different between different systems, so you may have to tune these to your hardware and your data.
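The relationship between the two flags can be sketched as simple arithmetic. This is an illustration only, not ipyrad's actual scheduling code: the function name `vsearch_layout` is hypothetical, and it assumes each vsearch process takes exactly -t cores out of the -c total, as described above.

```python
def vsearch_layout(cores, threads):
    """Return (concurrent_jobs, leftover_cores) for a given -c / -t pair.

    Hypothetical helper: assumes each vsearch process is given exactly
    `threads` cores out of `cores` total.
    """
    if cores < 1 or threads < 1:
        raise ValueError("cores and threads must be positive integers")
    if threads > cores:
        raise ValueError("threads per process cannot exceed total cores")
    concurrent_jobs = cores // threads  # vsearch processes that fit at once
    leftover_cores = cores % threads    # nonzero when -t does not divide -c
    return concurrent_jobs, leftover_cores


for c, t in [(16, 2), (16, 16), (16, 3)]:
    jobs, leftover = vsearch_layout(c, t)
    print(f"-c {c} -t {t}: {jobs} concurrent vsearch processes, "
          f"{leftover} cores left over")
```

Under that assumption, -c 16 -t 2 leaves room for 8 vsearch processes at once, -c 16 -t 16 for only one, and -c 16 -t 3 leaves one core unassigned, which is why -t should evenly divide -c.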