These are chat archives for dereneaton/ipyrad

13th
Apr 2017
Isaac Overcast
@isaacovercast
Apr 13 2017 06:25
@rfolkert The max inner mate distance is inferred from the data (mean + 3 SD, I believe). Reference sequence mapping can sometimes be slow (especially the finalize mapping stage). It also depends on your data (how many samples, etc.).
@AnnaMaryMason_twitter Yeah, paste the first few lines of the .loci and .gphocs files so I can see what you're talking about.
@fangbohao_twitter What do you see for the CHROM values for reference mapped loci? What do the CHROM names look like in the reference sequence?
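For readers following along, the "mean + 3 SD" rule mentioned above for the max inner mate distance can be sketched as follows. This is only an illustration of the arithmetic, not ipyrad's actual code: the function name, the sample values, and the use of the population standard deviation are all assumptions.

```python
import statistics


def max_inner_mate_distance(insert_sizes, n_sd=3):
    """Return mean + n_sd * SD of the observed distances, rounded to an int.

    Hypothetical helper: whether the real pipeline uses the population or
    the sample standard deviation is not stated in the chat.
    """
    mean = statistics.mean(insert_sizes)
    sd = statistics.pstdev(insert_sizes)  # population SD (an assumption)
    return int(round(mean + n_sd * sd))


# Made-up inner mate distances, for illustration only.
observed = [180, 200, 220, 190, 210]
print(max_inner_mate_distance(observed))
```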
Isaac Overcast
@isaacovercast
Apr 13 2017 07:49
@edgardomortiz @AnnaMaryMason_twitter I fixed the gphocs output format. It's available in the newest version, v.0.6.13.
Anna Mason
@AnnaMaryMason_twitter
Apr 13 2017 07:53

@isaacovercast thanks Isaac, here are the first few lines of the .loci and the .gphocs, .loci first:

LB00002 AAAGTACATACCCAAATAGTACCGTAGAAGACATTAATCCACTTTATCAAAGGAACTTGCCTGnnnnTTTGAAAAGATAGAAACGAAGAGATTGATATTAAACGCAAGTCACCATCCCCATGGAGTTCACCTGCA
LB00012 AAAGTACATACCCAAATAGTACCGTAGAAGACATTAATCCACTTTATCAAAGGAACTTGCCTGnnnnTTTGAAAAGATAGAAACGAAGAGATTGATATTAAACGCAAGTCACCATCCCCATGGAGTTCACCTGCA
LB00052 AAAGTACATACCCAAATAGTACCGTAGAAGACATTAATCCACTTTATCAAAGGAACTTGCCTGnnnnTTTGAAAAGATAGAAACGAAGAGATTGATATTAAACGCAAGTCACCATCCCCATGGAGTTCACCTGCA
// |3|
LB00002 TCTGACATATGACATCTGATTCCATAGAAACGATGACATACATTGTTTGCAGACTTTTGGAAAnnnnAATGGAATAATATATCCAAAATAGTCATCACAGAATA-TTTTTTCTCCCATTCCTAACAAGCTCCTGCA

and now .gphocs:

10594

locus0 1 131
B00002 AAAGTACATACCCAAATAGTACCGTAGAAGACATTAATCCACTTTATCAAAGGAACTTGCCTGnnnnTTTGAAAAGATAGAAACGAAGAGATTGATATTAAACGCAAGTCACCATCCCCATGGAGTTCACCTGCA

locus1 1 131
3| AAAGTACATACCCAAATAGTACCGTAGAAGACATTAATCCACTTTATCAAAGGAACTTGCCTGnnnnTTTGAAAAGATAGAAACGAAGAGATTGATATTAAACGCAAGTCACCATCCCCATGGAGTTCACCTGCA

locus2 1 130
TATAACTNACATCACCCTCMTYRGCACCTAARGATRGGAGAGGGGATATACARANNGCAR--nnnnANMTWGNAGTATAACTTACATCACCCKCMYTTGCACCTAAAGATAGGAGAGGGGATATACAGACTGCA

locus3 6 131
GATANGTGGGGTGACTCGTACAAAAATCATTTGATAACNAACGCGCAAAGCAGAGACAGATTTnnnnCGCAGNGGTTACACTCGCCCTCGCAAGGGAAAGGAATYCCACACTTGTCAGGATGGTGGTCGCCTGCA
B00002 GATASGTGGGGTGACTCGTACAAAAATCATTTGATAACAAACGCGCAAAGCAGAGACAGATTTnnnnCGNNGAGGTNACACTCGCCCTCGCAAGGGAAAGGAATTCCACACNTGTCAGGATGGTGGTCGCCTGCA
B00007 GATANGTGGGGTGACTCGTACAAAANTCATTTGATAACAANCGCNCAAAGCAGAGACAGATTTnnnnCGCAGAGGTTACACTCGCCCTCGCAAGGGAAAGGANTTCCACACTTGTCAGGATGGTGGTCGCCTGCA
B00036 GATASGTGGGGTGACTCGTACAAAAATCATTTGATAACAAACGCGCAAAGCAGAGACAGATTTnnnnCGCAGAGGTTACACTCNCCCTCGCAAGGGAAAGGAATTCCACACTTGTCAGGATGGTGGTCGCCTGCA
B00037 GATASGTGGGGTGACTCGTACAAAAATCATTTGATAACAAACGCGCAAAGCAGAGACAGATTTnnnnCGCAGAGGTTACACTCGCCCTCGCAAGGGAAAGGAATTCCACACTTGTCAGGATGGTGGTCGCCTGCA
B00038 GATASGTGGGGTGACTCGTACAAAAATCATTTGATAACAAACGCGCAAAGCAGAGACAGATTTnnnn-GCAGAGGTTACACTCGCCCTCGCNAGGGAAAGGAATTCCACACTTGTCAGGATGGTGGTCGCCTGCA

locus4 10 136

@isaacovercast thanks for the fix, I will check v.0.6.13! Any idea why I have a different number of loci between the .vcf and the .loci files? Has anyone else had this issue? Cheers
Edgardo M. Ortiz
@edgardomortiz
Apr 13 2017 08:43
@isaacovercast @dereneaton just a minor issue with tetrad now: it seems the toyplot package is not being installed with ipyrad. I got tetrad running again after using pip install toyplot
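A quick way to confirm whether a missing dependency is the problem before running tetrad is to check that the module can be found for import. The sketch below uses only the standard library; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("toyplot"):
    # Same workaround as reported in the message above.
    print("toyplot not found; try: pip install toyplot")
```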
Isaac Overcast
@isaacovercast
Apr 13 2017 08:47
@AnnaMaryMason_twitter The vcf file only reports variable sites, whereas the loci file prints out all sequences for all loci (including monomorphic ones).
If you have a monomorphic locus it just won't show up in the vcf file. Does that make sense?
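To see how many loci would be expected to drop out of the vcf for this reason, one can count the loci in a .loci file that have no variable site. The sketch below is a rough approximation under stated assumptions: loci are assumed to end with a line starting with `//` (as in the paste above), each sequence line is assumed to be `name sequence`, and a site is called variable only when two different characters remain after ignoring `N`, `n`, and `-`. Ambiguity codes are treated as ordinary characters, which is a simplification, and the function names are hypothetical.

```python
# Characters treated as missing data or padding rather than as bases.
IGNORED = set("Nn-")


def split_loci(text):
    """Split .loci-style text into loci; each locus is a list of (name, seq)."""
    loci, current = [], []
    for line in text.splitlines():
        if line.startswith("//"):
            # Separator line closes the current locus.
            if current:
                loci.append(current)
            current = []
        else:
            parts = line.split()
            if len(parts) >= 2:
                current.append((parts[0], parts[1]))
    if current:
        loci.append(current)
    return loci


def is_variable(locus):
    """Return True if any alignment column holds more than one character."""
    for column in zip(*(seq for _, seq in locus)):
        if len(set(column) - IGNORED) > 1:
            return True
    return False


# Toy input with made-up sample names and sequences.
toy = """\
s1 AAGTnnnnCCA
s2 AAGTnnnnCCA
// |0|
s1 ACGTnnnnTTA
s2 ACCTnnnnTTA
// |1|
"""
loci = split_loci(toy)
n_variable = sum(is_variable(locus) for locus in loci)
print(len(loci), "loci in total,", n_variable, "with at least one variable site")
```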
Bohao Fang
@fangbohao_twitter
Apr 13 2017 08:55
[image]
[image]
@isaacovercast
[image]
Thanks!
Isaac Overcast
@isaacovercast
Apr 13 2017 08:58
@fangbohao_twitter What assembly method did you use? denovo+reference or just reference?
That looks like the output you'd get from denovo or denovo+reference. Did you look at the end of the vcf file? If you did denovo+reference the reference mapped reads might come at the end of the vcf.
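For anyone wanting to check this without scrolling through the whole file, the sketch below tallies the distinct CHROM values in a VCF and shows the last few records. It relies only on the general VCF layout (header lines start with `#`, and CHROM and POS are the first two tab-separated columns). The function names and the toy CHROM names `locus_1` and `chr1` are invented for illustration and are not taken from ipyrad output.

```python
from collections import Counter


def chrom_counts(vcf_text):
    """Count records per CHROM value, skipping header lines."""
    counts = Counter()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        chrom = line.split("\t", 1)[0]
        counts[chrom] += 1
    return counts


def last_records(vcf_text, n=5):
    """Return (CHROM, POS) for the last n records in the file."""
    records = [l for l in vcf_text.splitlines() if l and not l.startswith("#")]
    return [tuple(r.split("\t")[:2]) for r in records[-n:]]


# Toy VCF body with hypothetical CHROM names.
toy_vcf = (
    "##fileformat=VCFv4.0\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "locus_1\t5\t.\tA\tG\n"
    "locus_1\t9\t.\tC\tT\n"
    "chr1\t1042\t.\tG\tA\n"
)
print(dict(chrom_counts(toy_vcf)))
print(last_records(toy_vcf, n=2))
```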
Bohao Fang
@fangbohao_twitter
Apr 13 2017 09:57
@isaacovercast
[image]
[image]
Isaac Overcast
@isaacovercast
Apr 13 2017 10:01
@fangbohao_twitter Is this an assembly that you've been working on for a while? Did you by chance use an older version to run some of the steps?
Isaac Overcast
@isaacovercast
Apr 13 2017 11:40
@fangbohao_twitter Hm, well it looks like the CHROM/POS information in the output vcf is broken. I'll work on it.
Bohao Fang
@fangbohao_twitter
Apr 13 2017 11:44
@isaacovercast
Thank you for this!
This is a trial assembly with 3 individuals, but a previous run with a large dataset also failed to provide CHROM/POS information. BTW, I'm using ipyrad 0.6.10
Isaac Overcast
@isaacovercast
Apr 13 2017 12:53
@fangbohao_twitter Fixed. v.0.6.14 fixes a bug that prevented CHROM/POS info from writing out to the final VCF.
Bohao Fang
@fangbohao_twitter
Apr 13 2017 13:24
@isaacovercast I re-ran step 7 with v.0.6.14, but there is still no CHROM/POS info in the vcf. Should I rerun it from an earlier step?
[image]
[image]
Isaac Overcast
@isaacovercast
Apr 13 2017 14:31
@fangbohao_twitter Oh, sorry, I should have been more specific: you need to go back and re-run from step 5.
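As a rough sketch of what re-running from step 5 looks like on the command line, the helper below builds an ipyrad invocation using the `-p` (params file), `-s` (steps), and `-f` (force overwrite) options. The helper name and the params file name `params-mydata.txt` are placeholders, and the command is only printed, not executed.

```python
def rerun_command(params_file, first_step=5, last_step=7):
    """Build an ipyrad command that forces steps first_step..last_step to re-run."""
    steps = "".join(str(s) for s in range(first_step, last_step + 1))
    return ["ipyrad", "-p", params_file, "-s", steps, "-f"]


# "params-mydata.txt" is a hypothetical params file name.
print(" ".join(rerun_command("params-mydata.txt")))
```

The force flag matters here because steps 5 to 7 were already completed with the older version, so their existing results have to be overwritten.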
Bohao Fang
@fangbohao_twitter
Apr 13 2017 14:35
@isaacovercast Thank you! It works when re-run from step 5!
Deren Eaton
@dereneaton
Apr 13 2017 17:12
@edgardomortiz Hey Edgar, yes tetrad is going through a transition where I'm working on removing some dependencies. It should be fully built with the conda installation soon.