These are chat archives for dereneaton/ipyrad

Apr 2017
Edgardo M. Ortiz
Apr 10 2017 01:02
Thanks a lot! That fixed it...
Anna Mason
Apr 10 2017 12:40
Hi there, I was wondering how I can match the locus numbers in the .loci file to the locus numbers in the .vcf. They do not seem to have the same IDs, but are they in the same order? Also, is there a way to match the locus numbers in the *.gphocs output file too? They seem to be different again. Thank you
Apr 10 2017 14:11
Hi, I am working with PE ddRAD data generated using 2 different enzymes. My question is: how does ipyrad handle PE reads that do not merge together? I think these cluster together based on the first read, but I am uncertain how the second read is treated. Is it unlinked, or is there an association? How is this handled when using a reference? Thank you
Isaac Overcast
Apr 10 2017 16:31
@AnnaMaryMason_twitter The only difference in numbering between loci and vcf files is that loci files start counting at 0 (zero) and vcf files start counting at 1. I fixed this in the most recent version which I'm pushing right now (0.6.12), but you should be able to do a simple shift-by-one to figure it out.
@AnnaMaryMason_twitter In what way are the .gphocs loci numbered differently? Numbering should be identical to the .loci file. If it isn't, can you please give me an example of how it differs?
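The shift-by-one described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming a version before 0.6.12 where .loci numbering starts at 0 and .vcf numbering starts at 1. The function names are hypothetical and are not part of ipyrad.

```python
def loci_to_vcf_number(loci_number: int) -> int:
    """Convert a 0-based locus number (.loci) to the 1-based number (.vcf)."""
    if loci_number < 0:
        raise ValueError("locus numbers in the .loci file start at 0")
    return loci_number + 1


def vcf_to_loci_number(vcf_number: int) -> int:
    """Convert a 1-based locus number (.vcf) back to the 0-based number (.loci)."""
    if vcf_number < 1:
        raise ValueError("locus numbers in the .vcf file start at 1")
    return vcf_number - 1
```

So the first locus in the .loci file (number 0) would correspond to locus 1 in the .vcf, and so on in the same order.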
Isaac Overcast
Apr 10 2017 17:09
@rfolkert If PE reads don't merge then we concatenate them with a short internal spacer, so we do treat them as "linked". When using a reference it's handled exactly the same way. R1 and R2 are mapped jointly. We test to make sure the mapping is "sane" (on the same chromosome, the orientation is right, etc.), then we merge them. If they do not merge we concatenate them with a spacer and continue the assembly.
In ipyrad R1 and R2 are always associated; they are always treated as coming from a contiguous genomic region... unlike in STACKS, lol ;p
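The idea of keeping unmerged pairs "linked" by concatenation can be sketched as follows. This is a toy illustration, not ipyrad's actual code: the function names are hypothetical, the spacer value "nnnn" is an assumption, and any handling of R2 orientation is left out.

```python
SPACER = "nnnn"  # assumed spacer string, for illustration only


def join_unmerged_pair(r1: str, r2: str, spacer: str = SPACER) -> str:
    """Concatenate an unmerged R1/R2 pair into one sequence with a spacer."""
    return r1 + spacer + r2


def split_joined_pair(seq: str, spacer: str = SPACER) -> tuple[str, str]:
    """Recover R1 and R2 from a joined sequence (R2 is empty if no spacer)."""
    r1, _, r2 = seq.partition(spacer)
    return r1, r2
```

Because both reads travel as a single sequence, downstream steps see them as one locus rather than as two independent ones.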
Apr 10 2017 19:01
@isaacovercast Thanks for the explanation... "unlike in STACKS" is the exact reason why I use ipyrad! Is there a maximum distance between R1 and R2 when mapping? I am working with a rather rough assembly (N50=364,535, #scaffolds=367,242) of a mammalian genome, and so far it is still stuck at the Clustering/Mapping stage after a couple of days. Might this be due to the reference?