These are chat archives for dereneaton/ipyrad
Hello, I'm Jeronymo and I have a question about an error message in steps 6 and 7.
I'm using 238 samples and step 6 was taking too long to run. I was able to parallelize this step on my university cluster (HPC/Flux). However, when the step reached 100%, after 2 days of running (96 GB across 24 cores, the maximum for me as a student), the job didn't stop and kept running until the time limit I set, 4 days. I received a notification that the job was aborted due to lack of time. I previously ran a test analysis with a few samples to get to know the program and the command lines, and everything went well. I compared the outputs from those runs (the test and this 100% aborted run) and apparently they are OK. Some co-workers told me that sometimes one of the nodes can get stuck and fail to finish the job. They have already had this problem before.
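For later readers: a job submission for a step like this on a PBS-style cluster such as Flux might look roughly like the sketch below. The job name, params filename, account, and walltime are placeholders for illustration, not the poster's actual values; only the core/memory counts come from the message above.

```shell
#PBS -N ipyrad_step6
#PBS -l nodes=1:ppn=24,mem=96gb,walltime=96:00:00
cd $PBS_O_WORKDIR

# -p points at the params file, -s 6 runs only step 6,
# and -c 24 tells ipyrad how many cores it may use.
ipyrad -p params-rodents.txt -s 6 -c 24
```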
I thought everything was OK, so I ran step 7, also parallelized (same cores and memory), but at the end of "100% filtering loci" the following error occurred:
[####################] 100% filtering loci | 2:06:49 ERROR:ipyrad.assemble.write_outfiles:error in filter_stacks on chunk 0: EngineError(Engine '87ce68b4-541eb86efaa3078ad3c9b103' died while running task u'7cc66d3a-ae75c472ed85ca7ab9176dc3')
ERROR:ipyrad.core.assembly:IPyradWarningExit: error in filter_stacks on chunk 0: EngineError(Engine '87ce68b4-541eb86efaa3078ad3c9b103' died while running task u'7cc66d3a-ae75c472ed85ca7ab9176dc3')
Encountered an error (see details in ./ipyrad_log.txt)
Error summary is below -------------------------------
error in filter_stacks on chunk 0: EngineError(Engine '87ce68b4-541eb86efaa3078ad3c9b103' died while running task u'7cc66d3a-ae75c472ed85ca7ab9176dc3')
On the internet this error, "error in filter_stacks on chunk 0", is associated with a popfile, but I am not using any popfile. I reran step 7 without parallelizing and got a similar error, "error in filter_stacks on chunk 5386", at the "0% writing VCF" step with 12 GB. Maybe the memory wasn't enough.
I didn't have problems running step 7 in the test. I think there's a chance the files generated in step 6 are corrupted. Should I run step 6 again? Or is this message a different problem, like the parallelization being set up wrongly or a lack of memory? Step 6 takes 2 days to run, so before I rerun it I'd like to know if someone could help me. Thanks for your time.
There is _hackersonly, which is a "dictionary" of hidden parameters, one of which is bwa_args. You can update this parameter with whatever arguments you'd like to specify for bwa, and ipyrad will pass them through. Is that kind of what you were looking for?
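For anyone finding this thread later, a minimal sketch of what updating that hidden parameter could look like. The Assembly name and the bwa flag below are made up for illustration; the real ipyrad API calls appear only in comments, since they require ipyrad to be installed, while the runnable part just demonstrates the dict-update pattern that _hackersonly follows.

```python
# Hypothetical usage via the ipyrad Python API (commented out; needs ipyrad):
#
#   import ipyrad as ip
#   data = ip.Assembly("rodents")             # assembly name is made up
#   data._hackersonly["bwa_args"] = "-B 8"    # "-B 8" is just an example flag
#
# _hackersonly behaves like a plain dict of advanced settings, so the
# update itself is an ordinary dict assignment:
hackersonly = {"bwa_args": ""}
hackersonly["bwa_args"] = "-B 8"
print(hackersonly["bwa_args"])
```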
You can use the -d flag to generate debug output in the ipyrad_log.txt file. Also, what substep of step 6 reached 100% and never completed?
@isaacovercast Thanks for the reply. The substep in step 6 is "building database", which I think is the last one in this step, following the tutorial.
I thought I found the solution. I was setting the "min_samples_locus" parameter to a number above 200 to avoid too much missing data (I have 238 samples). But my samples are from a rodent genus with great interspecific divergence in Cyt b. I set it to 150 samples, did not parallelize, and it worked: step 7 ran without errors. However, the number of loci was low, around 5,000. I think step 7 did not find loci shared across all samples when the threshold was 200. I will subsample again from step 2 and run step 6 with the -d flag as you suggested, removing more samples with a low number of reads (I allowed 100,000 reads per sample; I read in the forum that the ideal was 300,000 or more). But I guess the problem was not step 6 but my "min_samples_locus" parameter. Thanks again for your time!
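For later readers: min_samples_locus is set in the ipyrad params file. The line below is a sketch of what the relevant entry looks like; the parameter index shown in brackets may differ between ipyrad versions, so check your own params file rather than copying the numbering.

```
150                            ## [21] [min_samples_locus]: Min # samples per locus for output
```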