These are chat archives for dereneaton/ipyrad

17th
Nov 2017
siriusb
@siriusb-nox
Nov 17 2017 01:39
Hi @dereneaton, I am trying to analyse GBS data of ~20 species (33 individuals) using ipyrad. My goal is to clarify phylogenetic relationships in a species complex that we were not able to resolve using a few loci. I have managed to run the pipeline successfully under different parameter values (e.g. a clustering threshold of 0.85, and now 0.95), but sadly, the phylogenies derived from these analyses have very poor LBS support (although the topology makes a lot of sense to us in terms of morphology and distribution, especially the one derived from the 0.95 analysis). I also tried to infer a species tree from individual RAxML trees inferred for each locus, but this phylogeny is even worse than the RAxML tree derived from a concatenated-loci supermatrix. I would be extremely grateful for any advice or information you could provide on how to cope with this! My guess is that we might have a lot of incongruence between the loci, so I have tried to do a RAD partitioned analysis in RADami, which I cannot complete because of the error message below (I am following the script you provide in http://nbviewer.jupyter.org/gist/dereneaton/32382a28db11b83f6da5/Carex.ipynb). Do you know what might be the problem here, and more generally why my analyses are resulting in poorly supported ML trees? Thanks a million in advance for any help!
    gen.RAD.loci.datasets(loci.pyrad, scop.nni.c95, minTaxa = 2,
      taxa = "all", onlyVariable = T,
      fileBase = "c95_ipyrad_comb_v3", cores = 6)
    ... making rad.mat for locus.1
    ... making rad.mat for locus.1001
    ... making rad.mat for locus.2001
    ... making rad.mat for locus.3001
    ... making rad.mat for locus.4001
    ... making rad.mat for locus.5001
    ... making rad.mat for locus.6001
    ... making rad.mat for locus.7001
    ... making rad.mat for locus.8001
    ... making rad.mat for locus.9001
    ... making rad.mat for locus.10001
    ... making rad.mat for locus.11001
    ... making rad.mat for locus.12001
    ... making rad.mat for locus.13001
    Error in colSums(tax.thresh.mat) : 'x' must be numeric
    In addition: Warning message:
    In open.connection(outfile) : connection is already open
tommydevitt
@tommydevitt
Nov 17 2017 03:43
@dereneaton @isaacovercast Is there a way to tailor output files to contain only some subset of loci or taxa?
Deren Eaton
@dereneaton
Nov 17 2017 14:01
Hi @siriusb-nox, the .loci file format for ipyrad is slightly different from what it was in pyrad, and I'm not sure whether RADami has been updated to deal with it.
You'll probably want to contact Andrew Hipp from the Morton Arboretum; he is the maintainer of the RADami package.
@tommydevitt not currently, other than by toggling parameter settings. Are you thinking something like being able to write a list of locus IDs that you want to be used for the output files?
tommydevitt
@tommydevitt
Nov 17 2017 15:40
@dereneaton
Yeah, or exporting only variable loci, or only loci that have some minimum number of SNPs. Can you do that currently just in the parameter settings?
Deren Eaton
@dereneaton
Nov 17 2017 16:14
@tommydevitt, we don't have a minimum SNP parameter, but I imagine it could be useful. One thing we've done to minimize the need to assemble too many datasets under different settings is to incorporate additional filters into the ipyrad.analysis tools instead. For example, the bucky and bpp tools have methods for you to assign individuals to populations and filter based on coverage within populations (which you can also do during assembly), but they also have additional filters such as minSNPs. So to some extent you can implement some of these filters after assembly.
It's worth noting, though, that there are some arguments against filtering out invariant loci, particularly in coalescent-based methods like bpp, since they rely on priors that inherently model a distribution of gene trees that is expected to contain some invariant loci.
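The kind of post-assembly filtering discussed here can also be sketched outside of ipyrad. The following is a minimal sketch, not an ipyrad API: the function names (`split_loci`, `count_snps`, `filter_loci`) are hypothetical, and it assumes a .loci layout in which each locus ends with a `//` line that marks variable sites with `-` or `*` and carries the locus ID between `|` characters. Check your own .loci file before relying on that assumption.

```python
def split_loci(text):
    """Split .loci text into (sequence_lines, separator_line) pairs.

    Assumption: every locus is terminated by a line starting with '//'.
    """
    loci, seqs = [], []
    for line in text.splitlines():
        if line.startswith("//"):
            loci.append((seqs, line))
            seqs = []
        elif line.strip():
            seqs.append(line)
    return loci


def count_snps(separator_line):
    """Count variable sites flagged on a separator line.

    Assumption: '-' and '*' mark variable sites, and everything after
    the first '|' is the locus ID rather than site annotation.
    """
    body = separator_line[2:].split("|")[0]
    return body.count("-") + body.count("*")


def filter_loci(text, min_snps=1, min_taxa=1):
    """Keep only loci with at least min_snps SNPs and min_taxa samples."""
    kept = []
    for seqs, sep in split_loci(text):
        if len(seqs) >= min_taxa and count_snps(sep) >= min_snps:
            kept.append("\n".join(seqs + [sep]))
    return "\n".join(kept) + ("\n" if kept else "")


# Tiny made-up example: locus 0 is invariant, locus 1 has two flagged sites.
example = (
    "sampleA     ACGTACGTAC\n"
    "sampleB     ACGTACGTAC\n"
    "//                    |0|\n"
    "sampleA     ACGTACGTAC\n"
    "sampleB     ATGTACGAAC\n"
    "sampleC     ATGTACGTAC\n"
    "//           *     -  |1|\n"
)
filtered = filter_loci(example, min_snps=2)
print(filtered)
```

Writing `filtered` to a new file gives a reduced .loci file for downstream tools. As noted above, dropping invariant loci may not be appropriate for coalescent-based methods such as bpp.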
joqb
@joqb
Nov 17 2017 18:04

@dereneaton I managed to run multiple ABBA-BABA tests, and now I want to run the 5-taxon test for confirmation. I got it working as described in the cookbook, but I was wondering if there is a way to perform multiple tests simultaneously, as with the regular ABBA-BABA test. I tried this:
    from ipyrad.analysis.baba import _loci_to_arr, _get_signif_5
    with open(ee.data, 'r') as infile:
        loci = infile.read().strip().split("|\n")

    arr, _ = _loci_to_arr(loci, ee.tests, ee.params.mincov)
    arr
but it returned an error: TypeError: unhashable type: 'dict'
Any suggestions?

Deren Eaton
@dereneaton
Nov 17 2017 18:10
hi @joqb, which notebook are you using? The 5-taxon test shouldn't be listed b/c it is not officially released in the baba tool yet, I haven't had time to finish writing and testing it.
the four-taxon test should work though
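For reference, here is a minimal sketch of what a batch of four-taxon tests computes. It is plain Python, not the ipyrad baba implementation: the helper names (`d_statistic`, `run_tests`) are hypothetical, each tip is assumed to be a single aligned sequence (ipyrad allows lists of samples per tip), `p4` is treated as the outgroup, and no bootstrap or significance testing is done. It only computes D = (ABBA - BABA) / (ABBA + BABA) for each test in a list.

```python
def d_statistic(p1, p2, p3, out):
    """Return (D, n_abba, n_baba) for four aligned sequences."""
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, out):
        # Skip sites with missing or ambiguous bases.
        if any(x not in "ACGT" for x in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    d = (abba - baba) / total if total else float("nan")
    return d, abba, baba


def run_tests(seqs, tests):
    """Run several four-taxon tests; each test maps p1..p4 to a sample name."""
    results = []
    for test in tests:
        tips = [seqs[test[key]] for key in ("p1", "p2", "p3", "p4")]
        results.append(d_statistic(*tips))
    return results


# Made-up toy alignment: two ABBA sites, one BABA site, one invariant site.
seqs = {
    "A": "AACA",
    "B": "CCAA",
    "C": "CCCA",
    "O": "AAAA",
}
tests = [
    {"p1": "A", "p2": "B", "p3": "C", "p4": "O"},
    {"p1": "B", "p2": "A", "p3": "C", "p4": "O"},
]
for test, (d, abba, baba) in zip(tests, run_tests(seqs, tests)):
    print(test, "D=%.3f ABBA=%d BABA=%d" % (d, abba, baba))
```

Swapping `p1` and `p2` flips the sign of D, which is a quick sanity check when setting up many tests at once.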