These are chat archives for dereneaton/ipyrad

Oct 2017
Ollie White
Oct 27 2017 08:42
Hi @Mattalac_twitter, not sure if there is a flag for minimum allele frequency in ipyrad, but it's possible to filter SNPs using vcftools.
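For anyone following along: vcftools has a `--maf` option that applies this kind of cutoff. What such a filter does can be sketched in plain Python. This is only an illustration, assuming biallelic sites with GT as the first FORMAT entry; the function names are made up for the example and are not part of ipyrad or vcftools.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic site.

    `genotypes` is a list of VCF-style GT strings such as "0/1" or "1|1".
    Missing calls (".") and any allele other than 0 or 1 are ignored,
    which is an assumption of this sketch.
    """
    ref = alt = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
    total = ref + alt
    if total == 0:
        return 0.0
    return min(ref, alt) / total


def filter_by_maf(vcf_lines, min_maf):
    """Keep header lines and records whose MAF is at least `min_maf`."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # Columns 0-8 are the fixed VCF fields; samples start at index 9.
        # The genotype is taken as the first colon-separated entry.
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        if minor_allele_frequency(genotypes) >= min_maf:
            kept.append(line)
    return kept
```

Calling `filter_by_maf(open("my.vcf"), 0.05)` on a hypothetical file would return the header plus the records that pass a 5% cutoff.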
Oct 27 2017 14:13
Oct 27 2017 14:19

@ChaoShenzjs try this

ipcluster start --n 20 --daemonize

n stands for the number of cores you want to use
and then run your command

Deren Eaton
Oct 27 2017 16:49
hey @Ollie_W_White_twitter, it certainly shouldn't take that long. Try restarting your ipcluster instance.
Deren Eaton
Oct 27 2017 16:55
Thanks @JeanMichMuch. This appears to be a problem associated with an update to HDF5 that now raises an error on certain systems when file locking is disabled. Glad to know you can fix it. We will work a fix into the ipyrad code soon so it won't be necessary to set the env variable by hand.
Deren Eaton
Oct 27 2017 17:04
Hi @tommydevitt. The command job.ready() returns whether the job has finished running, but does not tell you whether it was successful. If the job finished with an error, you can check for the error using job.result(). One reason an error might be raised is that it cannot find the bpp binary.
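The ready-then-check pattern can be sketched with the standard library alone. This is a stand-in under stated assumptions, not the ipyrad or ipyparallel API: it uses `multiprocessing.pool.ThreadPool`, whose async result object has `ready()`, and `get()` plays the role that `job.result()` has in the message above. The task and the path are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def find_binary(path):
    """Hypothetical task that fails the way a missing binary might."""
    raise FileNotFoundError(f"binary not found: {path}")


def check_job(job):
    """Report a job's outcome instead of trusting ready() alone."""
    if not job.ready():
        return "still running"
    try:
        job.get()  # re-raises the exception from the worker, if any
    except Exception as err:
        return f"finished with error: {err}"
    return "finished successfully"


with ThreadPool(1) as pool:
    job = pool.apply_async(find_binary, ("/hypothetical/path/to/bpp",))
    job.wait()  # block until the task has finished
    status = check_job(job)
print(status)
```

Here `ready()` is true once the task ends, yet only asking for the result reveals that it ended with an exception.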
@nspope :thumbsup: for awk tips.
Hi @Wind-ant, yes we need to add more to the documentation still. The horizontal colored bars under the tree indicate which individuals are being used for each analysis. If a bar spans several individuals then those individuals are being pooled to represent a single taxon. For example, in the figure you attached the outgroup is always the same two samples pooled, shown by a grey bar. The black bar shows the P3 taxon. The green and orange bars show the P1 and P2 taxa. The Z-score is the number of standard deviations that the D-statistic deviates from zero, based on the s.d. measured from bootstrap replicates. To the left of that is the distribution of bootstrap D values, shown in grey if not significant, or colored if significant. You can vary the significance cutoff in the code for the plot.
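The Z-score described above can be written out as a short calculation. This is a minimal sketch and not ipyrad's own code: the function name is hypothetical, and the use of the sample standard deviation and of the absolute value of D are assumptions.

```python
import statistics


def d_stat_zscore(d_observed, bootstrap_ds):
    """Z = |D| / s.d. of the bootstrap D values.

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption of this sketch.
    """
    sd = statistics.stdev(bootstrap_ds)
    if sd == 0:
        raise ValueError("bootstrap replicates have zero variance")
    return abs(d_observed) / sd
```

A test would then be called significant when the returned value exceeds whatever cutoff is chosen for the plot.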
Deren Eaton
Oct 27 2017 17:13
Hi @tommydevitt Oh, I think I see a problem. It seems to be recording a file name for a file that doesn't exist. For example, the .files.mcmcfiles attribute of my bpp object says that 3 files exist, ...test.mcmc.txt, ...test_r0.mcmc.txt, and ...test_r1.mcmc.txt, but actually only the latter two exist (the ones with _r counters for the replicates), while the first one without the _r should not be listed. I will make a note to fix this.
Matt McElroy
Oct 27 2017 17:59
@Ollie_W_White_twitter okay thanks!
Oct 27 2017 21:07
@isaacovercast , I sent you some files via Dropbox. Let me know if you have any problems accessing them. I have an update regarding the issue. I tried a denovo assembly on the same RADseq dataset for which the number of loci filtered_by_rm_duplicates differs between ipyrad versions. With the denovo assembly, versions 0.5.15 and 0.7.15 give comparable results: in both, loci filtered_by_rm_duplicates is 3%, so it definitely seems like there is a reference assembly issue in 0.7.15.
Oct 27 2017 21:35
Thanks @dereneaton .
@dereneaton somewhat unrelated, but how can I update bpp to v4.0 so that it plays nice with ipyrad? I assume the version installed via conda install -c ipyrad bpp is version 3.3?