philiprhoades
@philiprhoades
Yes, I just tested that as an exercise with MyHeritage - it took them a few days to process the 23andMe file and I got a bit of info back but to get all of the info they want to charge another AUS$48 . .
It looks like we might stick with 23andMe just for the convenience of most of the family already being there . .
Bastian Greshake Tzovaras
@gedankenstuecke
yeah, that certainly makes things easier if you already have many people on 23andme
Philipp Bayer
@philippbayer
makes sense to me!
Bastian Greshake Tzovaras
@gedankenstuecke
and i just uploaded my own 23andme data to myheritage just for fun, to see what they offer :)
philiprhoades
@philiprhoades

I know what you mean about scaling and processing but for some years I have been following:

https://safenetwork.org

which could potentially offer a distributed store of data and possibly processing . . it would be great to have an inexpensive store and comparison of nearly ALL the data from the commercials . .

Philipp Bayer
@philippbayer
currently we're still ok (thanks to patreon!!) but it could be a thing for the future!
philiprhoades
@philiprhoades
There is a bunch of stuff I want to put there once they go live - I will keep this stuff in mind and bring it up again in the future . .
Bastian Greshake Tzovaras
@gedankenstuecke
thanks!
Philipp Bayer
@philippbayer
Bastian Greshake Tzovaras
@gedankenstuecke
wow!
Bastian Greshake Tzovaras
@gedankenstuecke
420 and me! :D
Philipp Bayer
@philippbayer
I didn't see that dna.land is closing down/relaunching as a commercial service? https://medium.com/@dl1dl1/dna-land-is-relaunching-34f5a505504f
Philipp Bayer
@philippbayer
I like this: 'Please note that because DNA.Land is closing as a research project, all accounts and data will be permanently deleted and erased from the DNA.Land servers on September 30th, 2019. You will be able to recreate an account on DNA.Land 2.0 and upload your data just as you did when you signed up to join DNA.Land.'
Bastian Greshake Tzovaras
@gedankenstuecke
@philippbayer yeah, that’s cool. I had missed that somehow!
Philipp Bayer
@philippbayer
updating ssl certificate right now, server seems to have trouble coming back
oh. the harddrive is full
Philipp Bayer
@philippbayer
ok fixed, for now
Bastian Greshake Tzovaras
@gedankenstuecke
whoops :D
i think our automated cleanup of older zip archives doesn’t work, that’s why it just keeps adding them :D
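A cleanup like the one mentioned above could be sketched roughly as follows. This is only a minimal illustration, not openSNP's actual job: the function name `prune_old_archives`, the `*.zip` pattern, and the default of keeping two archives are all assumptions.

```python
from pathlib import Path


def prune_old_archives(directory, keep=2, pattern="*.zip"):
    """Delete all but the `keep` newest files matching `pattern`.

    Hypothetical helper: the directory layout and the retention count
    are assumptions, not taken from the openSNP code base.
    Returns the list of paths that were deleted.
    """
    if keep < 0:
        raise ValueError("keep must be >= 0")
    # Sort newest first by modification time.
    archives = sorted(
        Path(directory).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = []
    for old in archives[keep:]:
        old.unlink()
        removed.append(old)
    return removed
```

Running something like this right after each new archive is written would keep disk usage bounded by roughly `keep` archives plus the one being built.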
Paweł Olszewski
@olszewskip
Hi, hello!
I'm interested in doing some genome-wide, "population-long" association studies between genotypes and phenotypes, with the motivation being just self-education and training in this sort of analysis, trying out different statistics and algorithms etc. First I was planning on simulating data with HAPGEN2, but then I stumbled upon openSNP. I was wondering if you know of any existing projects that use openSNP data for discovering phenotype associations? Also, maybe you can recommend any software tools for converting the genotypes from openSNP into a uniform file format, like VCF?
Bastian Greshake Tzovaras
@gedankenstuecke

hey @olszewskip! There was one crowdsourced AI challenge done with the data that would fit your description: https://www.crowdai.org/challenges/opensnp-height-prediction

They also created an open source pipeline to prepare the data from openSNP for the challenge: https://github.com/onaret/opensnp-cohort-maker

Paweł Olszewski
@olszewskip
Awesome! Many thanks.
Bastian Greshake Tzovaras
@gedankenstuecke
And if you want to use plink for the GWAS: They actually have a function to convert 23andMe files to the input plink needs
Paweł Olszewski
@olszewskip
Nice, I was vaguely aware of that, but I haven't yet had a chance to work with any 23andMe files. Does it make sense to use plink not only for GWASes but also simply for 23andMe -> VCF conversion?
Bastian Greshake Tzovaras
@gedankenstuecke
iirc plink doesn’t use VCF files but their own, simpler format. I think @philippbayer knows more about that though
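For reference, PLINK 1.9 documents a `--23file` option for importing 23andMe raw data and a `--recode vcf` option for VCF export, so it is worth checking the current docs before writing a converter by hand. The sketch below only illustrates what such a conversion involves, assuming the 23andMe raw-data layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines) and PLINK's text `.map`/`.ped` layout. The function names and the `sample_id` default are made up for illustration, and anything that is not a plain two-letter SNP call (no-calls, indels, haploid calls) is simply written as missing.

```python
def parse_23andme(text):
    """Yield (rsid, chrom, pos, genotype) from 23andMe raw-data text."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        rsid, chrom, pos, genotype = line.split("\t")[:4]
        yield rsid, chrom, int(pos), genotype


def to_plink_text(records, sample_id="sample1"):
    """Return (.map lines, one .ped line) for a single individual.

    Assumption: only two-letter A/C/G/T calls are kept; everything
    else is encoded as missing ("0 0").
    """
    map_lines, alleles = [], []
    for rsid, chrom, pos, genotype in records:
        if len(genotype) == 2 and set(genotype) <= set("ACGT"):
            a1, a2 = genotype
        else:
            a1 = a2 = "0"
        # .map columns: chromosome, variant ID, genetic distance, bp position
        map_lines.append(f"{chrom}\t{rsid}\t0\t{pos}")
        alleles.append(f"{a1} {a2}")
    # .ped columns: FID, IID, father, mother, sex (0 = unknown),
    # phenotype (-9 = missing), then one allele pair per variant
    ped_line = " ".join([sample_id, sample_id, "0", "0", "0", "-9"] + alleles)
    return map_lines, ped_line
```

Files from other genotyping companies on openSNP use different layouts, so a real pipeline would need one parser per source format.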
Paweł Olszewski
@olszewskip
Sadly I cannot access the data prepared by crowdai without an account, and I cannot sign up because apparently the site is shutting down and not allowing new sign-ups :( If anyone knows of any way that I could still access that AI challenge data, please let me know.
Bastian Greshake Tzovaras
@gedankenstuecke
I’d try emailing them if there’s a contact email
@olszewskip if there’s no email around let me know and i’ll look through my archives and find it
Paweł Olszewski
@olszewskip
Sylvain Bernard from EPFL has kindly provided me with the data. Many thanks, @gedankenstuecke
Bastian Greshake Tzovaras
@gedankenstuecke
yay, that’s great! :)
alcogni
@alcogni
@gedankenstuecke I downloaded some phenotypes, and the genotypes included sometimes belong to users who are either no longer on the openSNP website or whose user number on the website differs from the user number in the genotype file. Could you please help me understand?
JialinKang
@JialinKang
hi, the website opensnp.org is broken.
Helge Rausch
@tsujigiri
Oops... You are right! 🔎
Bastian Greshake Tzovaras
@gedankenstuecke
@JialinKang thanks to @tsujigiri the site is back up 💖
marcushdawson
@marcushdawson
Hi, I am getting a "404: Whatever you tried to access is not here." message when trying to download all data from openSNP
Bastian Greshake Tzovaras
@gedankenstuecke
hey @marcushdawson, thanks for letting us know! I think there was an issue with creating the latest version of the archive and I’ve already kicked the machine and restarted the job. So hopefully at the end of the day there should be a new one :)
marcushdawson
@marcushdawson
Ok awesome, thanks for the quick response @gedankenstuecke. I hope the kicking works!
Bastian Greshake Tzovaras
@gedankenstuecke
yeah, the task is already 3 hours in, but it usually takes around half a day to create the archive i think
Sina Rüeger
@sinarueeger
Hi! The SSL certificate for opensnp.org seems to have expired recently (5/20/2020). Could you have a look?
Bastian Greshake Tzovaras
@gedankenstuecke
thanks @sinarueeger, doing the update right now :)
and fixed :)
Sina Rüeger
@sinarueeger
Thanks @gedankenstuecke for the quick fix!
Amborella
@Amborella
Hi! I'm new here and not an expert on genetics but if you look at the page for SNP rs187238 it says the genotype frequency for CC is 54%, CG is 39% and GG is 7%. However the article this info is based on here seems to say that GG has frequency 54% and CC has frequency 7%. Am I misinterpreting something or is it the wrong way around on the website? Thanks!
Bastian Greshake Tzovaras
@gedankenstuecke

hey @Amborella! That’s well spotted. This is most likely the result of how 23andMe reports their SNPs. 23andMe and most other genotyping companies report the variations all based on the +/forward strand of the DNA. But dbSNP and many publications etc. give variants based on the orientation of the gene.

In the case you highlight the IL18 gene is on the -/reverse strand of the DNA (see https://www.snpedia.com/index.php/Rs187238), so one needs to take the complement of the reported 23andMe allele to be in line with that notation, which flips the Gs & Cs around in this case.

Hope that makes sense! :)

Amborella
@Amborella
Hi Bastian! Thanks for the quick reply. I sort of understand what you are talking about. Just to confirm: if I took a genetic test and got CC as my genotype, what would it correspond to in openSNP? Thanks!
Also according to what it says on openSNP 7% of the population (those with genotype GG by the notation on the page) have increased risk for SCD due to hypertension while according to the original paper 54% of people had this risk (those with genotype GG going by the notation in the original paper). Should the complement be taken somewhere here as well?
Bastian Greshake Tzovaras
@gedankenstuecke
in openSNP it will also be noted down as CC. But if you look at dbSNP, SNPedia or any other source that takes orientation into account you need to check if the variant of interest is on the minus strand
So if you are CC in 23andMe (and by extension in openSNP), then this translates to being GG in the paper you are looking at
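The strand flip discussed above can be sketched in a few lines. This is a minimal illustration with made-up function names. It assumes you have already looked up (for example in dbSNP or SNPedia) that the variant is reported on the minus strand elsewhere, and it uses the genotype frequencies quoted in the conversation for rs187238.

```python
# Map each base to its complement on the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_minus_strand(genotype):
    """Complement each allele of a plus-strand genotype, e.g. 'CC' -> 'GG'."""
    return genotype.upper().translate(COMPLEMENT)


def flip_frequencies(freqs):
    """Re-key plus-strand genotype frequencies in minus-strand notation.

    Alleles within a genotype are sorted so that heterozygotes keep a
    consistent label (the order of the two alleles carries no meaning).
    """
    return {"".join(sorted(to_minus_strand(g))): f for g, f in freqs.items()}


# Frequencies as shown on the openSNP page (plus-strand notation).
opensnp = {"CC": 0.54, "CG": 0.39, "GG": 0.07}
print(to_minus_strand("CC"))      # GG
print(flip_frequencies(opensnp))  # {'GG': 0.54, 'CG': 0.39, 'CC': 0.07}
```

Note that for C/G (and A/T) SNPs like this one, the two alleles are each other's complement, so the genotype alone cannot tell you which strand a report uses. You have to know the convention of the source.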