These are chat archives for dereneaton/ipyrad

28th
Jun 2016
danielyao12
@danielyao12
Jun 28 2016 08:54
@dereneaton Thanks very much for your help! I finally ran the whole procedure on my data, following your suggestions. Now I wonder how I can evaluate my assembly: what values or tests can I use, even for a preliminary estimate? I googled this question but didn't find useful information.
danielyao12
@danielyao12
Jun 28 2016 09:11
Besides, in your opinion, what should I do to reduce assembly time if I have a relatively bigger dataset containing 100 species? Should I run the assembly of all 100 species as one dataset, or assemble a 20-species subset at a time and then merge the 5 assemblies? Is it possible to merge the assemblies of the subsets to get the results for the whole dataset?
Isaac Overcast
@isaacovercast
Jun 28 2016 15:47
@danielyao12 How many lanes of data do you have? I would run all the samples together as one dataset; it's not possible to assemble subsets and then merge them at this time.
Isaac Overcast
@isaacovercast
Jun 28 2016 16:03
New Version: 0.3.13 fixes broken step 6 for users w/o vsearch installed locally.
Deren Eaton
@dereneaton
Jun 28 2016 17:49

I can see how branching and then merging data sets could be useful. For example, if you had a mix of haploid and diploid samples that you wanted to treat differently. We can work on making this possible, but we've discussed it before and it's a bit complicated if we allow data sets to be merged at any step. It is currently easy to apply step functions to a subset of samples in an Assembly when using the API, but that is not well documented yet.

As for speed, I don't think branching and merging would typically help. On an HPC cluster you should be able to connect to as many CPUs as you can get, and the speed improvement across samples should be nearly linear. Just make sure you use the --MPI flag when connecting to multiple nodes on HPC.
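
To illustrate why splitting 100 samples into five 20-sample runs is unlikely to save time, here is a rough back-of-the-envelope sketch. It is a toy model, not how ipyrad actually schedules work: it assumes every sample takes the same time, each core handles one sample at a time, and the subset runs happen one after another on the same cores. The function names and the numbers (40 cores, 10 minutes per sample) are made up for illustration.

```python
import math


def rounds_needed(n_samples: int, n_cores: int) -> int:
    """Sequential 'rounds' of work if each core processes one sample at a time."""
    if n_samples < 0 or n_cores < 1:
        raise ValueError("need n_samples >= 0 and n_cores >= 1")
    return math.ceil(n_samples / n_cores)


def wall_time(n_samples: int, n_cores: int, minutes_per_sample: float) -> float:
    """Idealized wall-clock minutes for one run (hypothetical model)."""
    return rounds_needed(n_samples, n_cores) * minutes_per_sample


# Hypothetical numbers: 100 samples, 40 cores, 10 minutes per sample.
whole = wall_time(100, 40, 10.0)

# Five 20-sample subsets run back to back on the same 40 cores:
# each run leaves 20 cores idle, so the total is longer, not shorter.
split = sum(wall_time(20, 40, 10.0) for _ in range(5))

print(f"one run of 100 samples:    {whole:.0f} min")
print(f"five runs of 20 samples:   {split:.0f} min")
```

Under these assumptions the single run finishes in 3 rounds while the five subset runs take 5, so in this model the way to go faster is to add cores rather than to split the data.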

Deren Eaton
@dereneaton
Jun 28 2016 18:53
New Version: 0.3.14 -- Huge speed improvements to steps 4 and 6.
Deren Eaton
@dereneaton
Jun 28 2016 23:54
We're working on fixing a bug in step 6 for Mac. Working fine in Linux, though.