Bastian Greshake Tzovaras
@gedankenstuecke
whops :D
i think our automated cleanup of older zip archives doesn’t work, that’s why it just keeps adding them :D
Paweł Olszewski
@olszewskip
Hi, hello!
I'm interested in doing some genome-wide, "population-long" association studies between genotypes and phenotypes, with the motivation being just self-education and training in this sort of analysis, trying out different statistics and algorithms etc. First I was planning on simulating data with HAPGEN2, but then I stumbled upon openSNP. I was wondering if you know of any existing projects that use openSNP data for discovering phenotype associations? Also, maybe you can recommend any software tools for converting the genotypes from openSNP into a uniform file format, like VCF?
Bastian Greshake Tzovaras
@gedankenstuecke

hey @olszewskip! There was one crowdsourced AI challenge done with the data that would fit your description: https://www.crowdai.org/challenges/opensnp-height-prediction

They also created an open source pipeline to prepare the data from openSNP for the challenge: https://github.com/onaret/opensnp-cohort-maker

Paweł Olszewski
@olszewskip
Awesome! Many thanks.
Bastian Greshake Tzovaras
@gedankenstuecke
And if you want to use plink for the GWAS: They actually have a function to convert 23andMe files to the input plink needs
Paweł Olszewski
@olszewskip
Nice, I was vaguely aware of that, but I haven't yet had a chance to work with any 23andMe files. Does it make sense to use plink not only for GWASes but also simply for 23andMe -> VCF conversion?
Bastian Greshake Tzovaras
@gedankenstuecke
iirc plink doesn’t use VCF files but their own, simpler format. I think @philippbayer knows more about that though
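For anyone reading along later: as far as I know, PLINK 1.9 documents both a `--23file` input flag and a `--recode vcf` output mode, so a 23andMe -> VCF conversion may be possible in one call. A minimal sketch that only builds the command line; the function name, file name and output prefix are hypothetical, and `--23file` takes one single-sample file at a time:

```python
import shlex
from typing import List


def plink_23andme_to_vcf_command(genotype_file: str, out_prefix: str) -> List[str]:
    """Build a PLINK 1.9 command that reads one 23andMe file and writes a VCF.

    Assumes a `plink` 1.9 binary is on PATH; this function does not run it.
    """
    return [
        "plink",
        "--23file", genotype_file,  # 23andMe-style raw data, one sample
        "--recode", "vcf",          # ask PLINK to write VCF output
        "--out", out_prefix,        # prefix for the output files
    ]


# Hypothetical file name, for illustration only.
cmd = plink_23andme_to_vcf_command("genome_example.23andme.txt", "genome_example")
print(shlex.join(cmd))
```

To actually run it you could pass `cmd` to `subprocess.run(cmd, check=True)`. Whether the resulting VCF is good enough for downstream tools (reference alleles, indel calls) is something to check on your own data.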
Paweł Olszewski
@olszewskip
Sadly I cannot access the data prepared by crowdai without an account, and I cannot sign up because apparently the site is shutting down and not allowing new sign-ups :( If anyone knows of any way that I could still access that AI challenge data, please let me know.
Bastian Greshake Tzovaras
@gedankenstuecke
I’d try emailing them if there’s a contact email
@olszewskip if there’s no email around let me know and i’ll look through my archives and find it
Paweł Olszewski
@olszewskip
Sylvain Bernard from EPFL has kindly provided me with the data. Many thanks, @gedankenstuecke
Bastian Greshake Tzovaras
@gedankenstuecke
yay, that’s great! :)
alcogni
@alcogni
@gedankenstuecke I downloaded some phenotypes and genotypes. Some of the included genotypes belong to users who are either no longer on the openSNP website, or whose user number on the website differs from the user number in the genotype file. Could you please help me understand this?
JialinKang
@JialinKang
hi, the website opensnp.org is broken.
Helge Rausch
@tsujigiri
Oops... You are right! 🔎
Bastian Greshake Tzovaras
@gedankenstuecke
@JialinKang thanks to @tsujigiri the site is back up 💖
marcushdawson
@marcushdawson
Hi, I am getting a "404: Whatever you tried to access is not here." message when trying to download all data from openSNP
Bastian Greshake Tzovaras
@gedankenstuecke
hey @marcushdawson, thanks for letting us know! I think there was an issue with creating the latest version of the archive and I’ve already kicked the machine and restarted the job. So hopefully at the end of the day there should be a new one :)
marcushdawson
@marcushdawson
Ok awesome, thanks for the quick response @gedankenstuecke. I hope the kicking works!
Bastian Greshake Tzovaras
@gedankenstuecke
yeah, the task is already 3 hours in, but it usually takes around half a day to create the archive i think
Sina Rüeger
@sinarueeger
Hi! The SSL certificate for opensnp.org seems to have expired recently (5/20/2020). Could you have a look?
Bastian Greshake Tzovaras
@gedankenstuecke
thanks @sinarueeger, doing the update right now :)
and fixed :)
Sina Rüeger
@sinarueeger
Thanks @gedankenstuecke for the quick fix!
Amborella
@Amborella
Hi! I'm new here and not an expert on genetics but if you look at the page for SNP rs187238 it says the genotype frequency for CC is 54%, CG is 39% and GG is 7%. However the article this info is based on here seems to say that GG has frequency 54% and CC has frequency 7%. Am I misinterpreting something or is it the wrong way around on the website? Thanks!
Bastian Greshake Tzovaras
@gedankenstuecke

hey @Amborella! That’s well spotted. This is most likely the result of how 23andMe reports their SNPs. 23andMe and most other genotyping companies report the variations all based on the +/forward strand of the DNA. But dbSNP and many publications etc. give variants based on the orientation of the gene.

In the case you highlight the IL18 gene is on the -/reverse strand of the DNA (see https://www.snpedia.com/index.php/Rs187238), so one needs to take the complement of the reported 23andMe allele to be in line with that notation, which flips the Gs & Cs around in this case.

Hope that makes sense! :)

Amborella
@Amborella
Hi Bastian! Thanks for the quick reply. I sort of understand what you are talking about. Just to confirm if I took a genetic test and got CC as my genotype what would it correspond to in openSNP? Thanks!
Also according to what it says on openSNP 7% of the population (those with genotype GG by the notation on the page) have increased risk for SCD due to hypertension while according to the original paper 54% of people had this risk (those with genotype GG going by the notation in the original paper). Should the complement be taken somewhere here as well?
Bastian Greshake Tzovaras
@gedankenstuecke
in openSNP it will also be noted down as CC. But if you look at dbSNP, SNPedia or any other source that takes orientation into account you need to check if the variant of interest is on the minus strand
So if you are CC in 23andMe (and by extension in openSNP), then this translates to being GG in the paper you are looking at
and then i think all the numbers make sense, no? 54% CC in openSNP means we have 54% GG in the notation of the paper?
Amborella
@Amborella
Thanks for the reply. If you look at the Links to SNPedia bit on the page it says: "rs187238 G/G Hypertension increases risk 3.75x for sudden cardiac death". On openSNP it says 7% of people have this genotype while the paper says 54% do.
From your reply I get the impression that it is not possible to just go from the genotype frequency on openSNP to the Links to SNPedia to work out what percentage of population is susceptible to the increased risk without checking if the variant is on the negative strand. Can you confirm if that is correct?
Bastian Greshake Tzovaras
@gedankenstuecke
yes, and the SNPedia page gives that information
so for rs187238 SNPedia says that GG on the minus strand gives the elevated risk, which the paper says is found in 54% of the population. As it’s on the minus strand we have to look at the CC in openSNP and can see that 54% of openSNP members have that variant too, no?
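The strand flip above can be sketched in a few lines. This is only an illustration, assuming plain A/C/G/T genotype calls (no indels or no-calls); the function name is made up, and the frequencies are the ones quoted from the rs187238 page earlier in this conversation:

```python
# Base pairing used to move between the plus and minus strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_genotype(genotype: str) -> str:
    """Return the genotype as read on the opposite strand.

    Alleles are sorted so heterozygous calls come out in one fixed order
    (e.g. "CG" stays "CG" instead of becoming "GC").
    Raises KeyError for anything other than A/C/G/T.
    """
    return "".join(sorted(COMPLEMENT[base] for base in genotype.upper()))


# Plus-strand genotype frequencies (percent) as shown on openSNP for rs187238.
opensnp_freqs = {"CC": 54, "CG": 39, "GG": 7}

# The same frequencies in minus-strand notation, as used by SNPedia and the paper.
minus_strand_freqs = {complement_genotype(g): f for g, f in opensnp_freqs.items()}

print(minus_strand_freqs)
```

So a CC call from 23andMe/openSNP corresponds to GG in the minus-strand notation, and the 54% carries over with it.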
Amborella
@Amborella
Great, thanks a lot. That makes it much clearer!
Also as an aside how do you tell if a SNP is on the negative strand from SNPedia? Is it the - in -137G or something else?
Bastian Greshake Tzovaras
@gedankenstuecke
in the box on the right hand side the first item is “orientation minus”
Amborella
@Amborella
Got it. Thanks!
Philipp Bayer
@philippbayer
nice, i just resubmitted a paper on Amborella pangenomics
Amborella
@Amborella
Yeah, I think Amborella is amazing. Strongly considering taking a holiday in New Caledonia sometime in the next few years just so I can get myself a pot of the stuff and bring it back. Although I'm more interested in it from the rare plant side of things than the genetics side.
Bastian Greshake Tzovaras
@gedankenstuecke
All the best for the resubmission!
Sina Rüeger
@sinarueeger
Hi, for some reason the SSL certificate expired again, could you update that?
Philipp Bayer
@philippbayer
sorry about that, fixing it right now
done!
Sina Rüeger
@sinarueeger
many thanks!
Philipp Bayer
@philippbayer
@gedankenstuecke @tsujigiri i've just added a bash script to /root/run-letsencrypt.sh and added that to crontab, let's see whether that works :PP
the current cert expires Sunday, 22 November 2020. if it works right then next Thursday (tomorrow) we'll have a later cert.
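A small sketch for checking how long a cert has left, in case it helps with keeping an eye on the renewal. It assumes the expiry is given in the `notAfter` format that Python's `ssl` module reports; the function name and the midnight time of day are assumptions, only the date comes from the message above:

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the "Nov 22 00:00:00 2020 GMT" style returned by
    SSLSocket.getpeercert(); `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


# Expiry date from the chat; the 00:00:00 time is an assumption.
remaining = days_until_expiry(
    "Nov 22 00:00:00 2020 GMT",
    datetime(2020, 11, 1, tzinfo=timezone.utc),
)
print(remaining)
```

A negative result means the cert has already expired; a cron job could alert once the value drops below some threshold.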
Helge Rausch
@tsujigiri
I already added a cron job earlier. Doesn't seem to work... 😆