    murraycadzow
    @murraycadzow
    Hi @stevanspringer, could you provide the command you used to get the 19.20959639-21159638 region, please?
    thanks
    Murray
    stevanspringer
    @stevanspringer

    Absolutely,

    http://browser.1000genomes.org/Homo_sapiens/Location/View?r=19:20959639-21159638
    click Get VCF data, confirm coordinates in text entry box, click next, right click to save and extract.

    snips from my actual data file:

    fileformat=VCFv4.1

    FILTER=<ID=PASS,Description="All filters passed">

    fileDate=20150218

    reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz

    source=1000GenomesPhase3Pipeline

    snip ~250 lines. The following lines intentionally truncated to fit in the chat.

    CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG00096 HG00097

    19 20862396 esv3643896;esv3643897 A <CN0>,<CN2> 100 PASS AC=2,2;AF=0.000399361,0.000399361;AN=5008;CS=DUP_gs;END=20974015;NS=2504;SVTYPE=CNV;DP=13403;EAS_AF=0,0;AMR_AF=0,0.0029;AFR_AF=0,0;EUR_AF=0,0;SAS_AF=0.002,0;VT=SV GT 0|0 0|0
    19 20959646 rs73543372 C T 100 PASS AC=61;AF=0.0121805;AN=5008;NS=2504;DP=12049;EAS_AF=0;AMR_AF=0.0029;AFR_AF=0.0446;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP GT 0|0 0|0
    19 20959677 rs541375182 T G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=14387;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP GT 0|0 0|0 0|0
    19 20959686 rs554447352 C G 100 PASS AC=1;AF=0.000199681;AN=5008;NS=2504;DP=14414;EAS_AF=0;AMR_AF=0;AFR_AF=0.0008;EUR_AF=0;SAS_AF=0;AA=.|||;VT=SNP GT 0|0 0|0 0|0

    Thanks very much!
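One detail in the snippet above is worth a note: the first record has POS 20862396, which is before the requested region start (20959639), yet it still appears in the export. A plausible reading is that structural-variant records carry an END value in INFO, and the record is kept because its span overlaps the region. The following is a minimal sketch of that check, assuming closed intervals and treating a record without END as a single position; the helper names are illustrative and not part of any 1000 Genomes tool or of the pipeline.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def record_span(pos, info):
    """Return (start, end) of a record, using INFO END when it is present."""
    end = int(info["END"]) if "END" in info else pos
    return pos, end


def overlaps(span, region):
    """True if two closed intervals share at least one position."""
    return span[0] <= region[1] and region[0] <= span[1]


# Region requested from the browser (chromosome 19).
region = (20959639, 21159638)

# INFO strings abbreviated from the two first records quoted in the chat.
sv_info = parse_info("AC=2,2;AN=5008;CS=DUP_gs;END=20974015;NS=2504;SVTYPE=CNV;VT=SV")
snp_info = parse_info("AC=61;AF=0.0121805;AN=5008;NS=2504;VT=SNP")

sv_span = record_span(20862396, sv_info)
snp_span = record_span(20959646, snp_info)

print(sv_span, overlaps(sv_span, region))
print(snp_span, overlaps(snp_span, region))
```

Both records overlap the region under these assumptions, which would explain why the CNV shows up even though it starts upstream.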

    murraycadzow
    @murraycadzow
    thanks, looking into what the problem might be
    stevanspringer
    @stevanspringer
    Thank you! Great pipeline too, it's sure to be very useful to lots of folks.
    murraycadzow
    @murraycadzow
    I can't replicate the error you are getting. Could you tell me the command you are running selection tools with please?
    James Boocock
    @theboocock

    Hi Stevan,

    Could you also possibly upload the VCF to Dropbox? I would really like to try to duplicate this error.

    Thanks for your patience.
    James

    stevanspringer
    @stevanspringer
    Thanks so much for your help. It's definitely something I've done wrong on this end so I'll run through everything again on a fresh install. The only thing I'm doing off-book is mounting a directory referencefiles with subfolders: analysis (containing defaults.cfg), ancestral_ref, impute_ref, and so on. This overwrites everything in there on docker, but I think that should be ok. VCFs are straight from 1000 genomes. I've placed two in this folder https://www.dropbox.com/sh/8p02zxomq76x4n6/AAD5Ln9xNsVsYhIJayPUnHw5a?dl=0 (X fails for me, 6 works fine). I put files specifying populations in their own folder. Trying different pops doesn't change the outcome. Sorry to trouble you. I'm sure if I reinstall it'll probably work fine. I'll do that now on a different machine and report back.
    stevanspringer
    @stevanspringer
    Hi guys,
    stevanspringer
    @stevanspringer
    Unfortunately I've replicated the problem on another machine here. Complete fresh install, different physical machine but still running OSX. It affects only chromosome 19 and X, so I think it has something to do with the shapeit genetic map files. In the link you provide, 19 has two versions, one .txt and one .txt~, and X is missing. That's got to be the common factor. I tried both the .txt~ and .txt versions of 19 with no luck. I ran through the install exactly as described, though I did mount the local referencefiles folder to docker as before. Commands are the same as the manual except that I'm putting populations in their own folder and running from a folder called analysis where I've copied the defaults.cfg file. I'll put folders containing all the log files from the two failed runs, plus the logs of a successful run on CHR1, into the above dropbox folder in case that is helpful to you. I noticed a few things as I was installing it on OSX that I'll mention below in case they are helpful.
    Notes on OSX install: 1) have to run all docker commands without sudo on OSX. 2) link to ancestral files in documentation pdf is broken. 3) "--impute" only runs on mine with the command "--impute-split-size 5000", which I think you mention in the manual, but having the exact syntax would be helpful to some. 4) in one of the multipop commands the -a is written --a. Otherwise a very smooth install, even for a newb like me. Hope that helps, and thanks!
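The suspicion above (a stray .txt~ copy for chromosome 19 and no map for X) is the kind of thing a quick directory audit can confirm before a run. Below is a rough sketch of such a check. The file-name pattern `genetic_map_chr{chrom}.txt` is a placeholder assumption, since the chat does not give the real names, and the demonstration runs on a temporary directory rather than the actual genetic_map folder.

```python
from pathlib import Path
import tempfile

# Autosomes plus X, the chromosomes discussed in the chat.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X"]


def audit_maps(map_dir, pattern="genetic_map_chr{chrom}.txt"):
    """Report chromosomes with no map file, and editor backups ending in '~'.

    The default pattern is hypothetical; pass the real naming scheme.
    """
    map_dir = Path(map_dir)
    missing = [c for c in CHROMOSOMES
               if not (map_dir / pattern.format(chrom=c)).is_file()]
    backups = sorted(p.name for p in map_dir.iterdir() if p.name.endswith("~"))
    return missing, backups


# Demonstration on a throwaway directory that mimics the situation described:
# every chromosome has a map except X, and 19 also has a '~' backup copy.
with tempfile.TemporaryDirectory() as tmp:
    for chrom in CHROMOSOMES:
        if chrom != "X":
            (Path(tmp) / f"genetic_map_chr{chrom}.txt").touch()
    (Path(tmp) / "genetic_map_chr19.txt~").touch()
    missing, backups = audit_maps(tmp)

print("missing:", missing)
print("backups:", backups)
```

Pointing `audit_maps` at the mounted referencefiles/genetic_map folder, with the pattern adjusted to the real file names, would show at a glance whether 19 and X differ from the chromosomes that run cleanly.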
    stevanspringer
    @stevanspringer
    I placed a file called referencefiles.zip on dropbox. Please excuse the generic name, just so it expands as it would be named on my computer. vcfs and terminal history included, see troubleshooting.txt. I've left the files in genetic_map untouched since I think the problem could be there. I've removed the ancestral_ref and impute_ref folders for the purposes of size. Thanks very much for your help with this, I hope this info will help you isolate what I'm doing wrong. I'm more than happy to help with anything else, please just let me know.
    murraycadzow
    @murraycadzow
    I know that X at the moment isn't going to behave nicely, especially if you want to impute. Both impute and shapeit have special flags that need to be set, and we haven't set that up yet. For now that has to be done manually outside the pipeline, and the resulting files then passed in using the --haps and --sample options.
    murraycadzow
    @murraycadzow
    @stevanspringer,
    murraycadzow
    @murraycadzow
    I ran your chr19 command and found the cause of the failure. It is failing at the iHS step and I'll need to look further into why, but in the meantime you can skip that step with the --no-ihs flag, as in this command, which ran successfully for me on your data:
    multipop_selection_pipeline -p ../populations/CEU_ids.txt -p ../populations/YRI_ids.txt -i ../data/19.51637718-51837717.ALL.chr19.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf --config-file ../analysis/defaults.cfg -a "--phased-vcf --no-ihs" -c 19
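For anyone scripting runs over several regions, the same invocation can be assembled programmatically. The sketch below uses only the options that appear in the command above (-p, -i, --config-file, -a, -c) and makes no claim about other flags. `build_command` is an illustrative helper, not part of selectionTools. Note that the value for -a is kept as one argument, matching the quoting in the original command.

```python
import shlex


def build_command(population_files, vcf, config, chromosome, extra_args):
    """Assemble the argument list using only options shown in the chat."""
    cmd = ["multipop_selection_pipeline"]
    for pop in population_files:
        cmd += ["-p", pop]                 # one -p per population file
    cmd += ["-i", vcf, "--config-file", config]
    cmd += ["-a", " ".join(extra_args)]    # passed as ONE quoted argument
    cmd += ["-c", str(chromosome)]
    return cmd


cmd = build_command(
    population_files=["../populations/CEU_ids.txt", "../populations/YRI_ids.txt"],
    vcf="../data/19.51637718-51837717.ALL.chr19.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf",
    config="../analysis/defaults.cfg",
    chromosome=19,
    extra_args=["--phased-vcf", "--no-ihs"],
)

# Print a shell-safe rendering for inspection; nothing is executed here.
print(shlex.join(cmd))
```

To actually run it, the list could be handed to `subprocess.run(cmd, check=True)` from the analysis folder, assuming the pipeline is installed and on the PATH.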
    stevanspringer
    @stevanspringer
    Thanks so much for looking into this. I was playing around with perhaps using PLINK formatted genetic maps, but ran into trouble. Just curious, does this command go through on other docker installations? I could for example set up an install on Google's container engine to see if the trouble is specific to docker on OSX. I can always run the iHS locally as well, I was just using docker for Fay & Wu's H. I really appreciate the help, thanks again.
    murraycadzow
    @murraycadzow
    I haven't played with docker enough or gotten around to making an actual docker image, hence the need to use the dockerfile to make one. As far as I could tell, the problem, at least in what you sent through, was a different error from the one you originally posted about. The iHS problem is one that I have encountered, and it is entirely in the script and not an OS issue. I'm currently testing the 1.1 branch to see whether it runs fine, since there is a slightly different implementation in there.
    murraycadzow
    @murraycadzow
    I have just checked and the pipeline gets all the way through on the selectionTools1.1 branch due to the different way things are done in the iHS script
    stevanspringer
    @stevanspringer
    Yes! 1.1 made it through here as well. Thanks so much for your help with this, very much appreciated.
    Looks like the X still doesn't make it, but it seems like you're on the trail. I'll put the output on dropbox in case it might be helpful.
    James Boocock
    @theboocock

    If you find issues like these, feel free to open an issue on the GitHub repo.

    These can then be tracked and marked when they seem like they are fixed. I will add the chromosome X one.

    Thanks for being a great user!

    Ageldinov
    @Ageldinov
    Hello, I'm trying to run the tutorial of the pipeline. All goes well until the data visualization. My charts do not coincide with yours. What could be the problem?
    Ageldinov
    @Ageldinov
    Hope you can help me, and thanks!
    samantha kohli
    @samantha88k_twitter
    Hi Error in scan_hh(d, big_gap = opt$big_gap, small_gap = opt$small_gap, :
    unused arguments (big_gap = opt$big_gap, small_gap = opt$small_gap, small_gap_penalty = opt$small_gap_penalty, physical_positions = physical_positions)
    I am getting this error
    what can be wrong?
    samantha kohli
    @samantha88k_twitter
    @murraycadzow
    Error in scan_hh(d, big_gap = opt$big_gap, small_gap = opt$small_gap, :
    unused arguments (big_gap = opt$big_gap, small_gap = opt$small_gap, small_gap_penalty = opt$small_gap_penalty, physical_positions = physical_positions)