Sunday, February 27, 2011

My 23andMe Results: Getting a (Free) Second Opinion

NOTE: Getting Advice About Genetic Testing

To get an idea of how well the 23andMe risk calculator agrees with other algorithms (when using the exact same SNP data), I searched for other tools that I could use to analyze my genetic data.

For this post, I have compared my risk assessments from 23andMe to those provided by Promethease (which uses the information available in SNPedia).  I also played around with the free version of Enlis Genome (Personal Edition), but I found the GUI to be a little buggy, and it didn’t automatically prioritize risk assessments (unlike 23andMe and Promethease).  So, this post will focus only on comparing my 23andMe assessment with my Promethease assessment.

To be fair, I should point out that I would not necessarily expect 100% concordance between my 23andMe and Promethease results, for various reasons.  For example, the “magnitude” score from Promethease is a subjective measure, and the curation methods are different for these two tools.  However, I think such a comparison is still useful: it is encouraging to see any predictions that are shared by both tools, and I think both tools provide useful information, since there is no “standard” way to combine all of the SNPs associated with a particular disease.

I will focus on my increased disease risks, but the same principles could be applied to decreased disease risk, drug response, or any other trait.

All Diseases with Increased Risk
NOTE: Percentages refer to the percent of individuals with my genotype that have a particular disease, and the relative risk compared to the average percentage is given in parentheses.  Percentages are not given for Promethease results because the percentage provided in the summary report refers to the population frequency of the SNP and not the percent of individuals with that SNP that will have a particular disease.  Only Promethease results with clearly defined disease names and relative risk values were considered.  The Promethease cutoff was chosen based upon the change in color from pink to red in the summary report (and also the number of associations listed at this threshold).  The 23andMe risk assessment was recorded on 2/26/2011.  The Promethease report was generated on 1/27/2011.  Multiple values are provided for Promethease but not 23andMe because Promethease provides risk assessments for individual SNPs, whereas 23andMe provides a single risk value for each disease.

Overall, I thought that there was pretty good agreement between the two methods.  This may not be apparent from the table above, but my list of “higher importance” SNPs is considerably smaller and shows greater overlap.  For example, I would have ideally preferred to look at SNPs with a 1.5x increase in risk and an absolute risk greater than 50%.  The absolute cut-off of 50% is because I would prefer to look at SNPs where I am more likely to get the disease than not get the disease.  The 1.5x (or 50% increase in risk) is a somewhat arbitrary cutoff that is loosely based upon my microarray data analysis experience.  Since no SNPs meet both of these criteria, I chose to look at those with a greater than 1.5x relative risk and greater than 5% absolute risk (which, in my opinion, is still quite low).  Now, take a look at my more subjective SNP list.
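The filtering rule described above can be sketched in a few lines of Python.  This is only an illustration of the two cutoffs (relative risk greater than 1.5x and absolute risk greater than 5%).  The `RiskEstimate` class, the `higher_priority` function, and the "Disease A/B/C" numbers are all hypothetical placeholders, not values from my report or anything provided by 23andMe or Promethease.

```python
from dataclasses import dataclass


@dataclass
class RiskEstimate:
    """One disease risk estimate (hypothetical structure for illustration)."""
    disease: str
    my_risk_pct: float       # percent of people with my genotype who get the disease
    average_risk_pct: float  # percent of the general population who get the disease

    @property
    def relative_risk(self) -> float:
        # Relative risk is the genotype-specific risk divided by the average risk.
        return self.my_risk_pct / self.average_risk_pct


def higher_priority(estimates, min_relative_risk=1.5, min_absolute_pct=5.0):
    """Keep estimates that exceed BOTH the relative and the absolute cutoff."""
    return [
        e for e in estimates
        if e.relative_risk > min_relative_risk and e.my_risk_pct > min_absolute_pct
    ]


# Made-up placeholder values, NOT taken from any real report.
examples = [
    RiskEstimate("Disease A", my_risk_pct=12.0, average_risk_pct=6.0),   # passes both cutoffs
    RiskEstimate("Disease B", my_risk_pct=0.9, average_risk_pct=0.3),    # absolute risk too low
    RiskEstimate("Disease C", my_risk_pct=30.0, average_risk_pct=25.0),  # relative risk too low
]

selected = [e.disease for e in higher_priority(examples)]
print(selected)

# The stricter cutoff I would have preferred (absolute risk above 50%).
strict = [e.disease for e in higher_priority(examples, min_absolute_pct=50.0)]
print(strict)
```

With these placeholder numbers, only "Disease A" survives the relaxed cutoffs and nothing survives the 50% cutoff, which mirrors why I had to relax the absolute-risk threshold.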

 “Higher Priority” SNPs

Now, 2 out of the 3 SNPs have similar predictions.  Although there wasn’t a high magnitude SNP in Promethease for venous thromboembolism, this could be because I subjectively considered this disease to be less well known than arthritis or diabetes, so I figured less popular diseases may have lower magnitude scores.  For this reason, I decided to look into which SNPs are used by 23andMe and Promethease to determine venous thromboembolism risk.  I also checked the Genome-Wide Association Study (GWAS) Catalog to try to get an idea of which SNPs are the best established (according to the US National Human Genome Research Institute).

SNPs Associated with Venous Thromboembolism Risk

GWAS Catalog
NOTE: Promethease lists 19 SNPs associated with venous thromboembolism.  In order to simplify the table (and avoid listing some potentially inaccurate and/or low-confidence associations), I have only listed SNPs listed by 23andMe or the GWAS Catalog.

Now there is agreement between the 23andMe and Promethease results because both tools indicate that I have a mutation in rs6025, which results in an increased risk of developing venous thromboembolism.  However, I think it is worth pointing out that the results are not quite as clean as they could be.  For example, this SNP was not listed in the GWAS Catalog, and I couldn’t determine the dbSNP annotation for i3002432 (so it was relatively hard for me to cross-reference this result with other databases).
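This kind of cross-referencing can also be done with a short script.  The sketch below assumes each source's SNP list has already been copied into a Python set by hand.  The `sources_by_snp` helper is hypothetical, and the sets are deliberately partial: they contain only the two identifiers discussed in this post, and the GWAS Catalog set is left empty as a placeholder rather than reproducing real catalog entries.

```python
def sources_by_snp(snp_lists):
    """Map each SNP identifier to the sorted list of sources that report it."""
    result = {}
    for source, snps in snp_lists.items():
        for snp in snps:
            result.setdefault(snp, []).append(source)
    return {snp: sorted(sources) for snp, sources in result.items()}


# Partial, illustrative lists: only identifiers mentioned in this post.
snp_lists = {
    "23andMe": {"rs6025", "i3002432"},
    "Promethease": {"rs6025"},
    "GWAS Catalog": set(),  # placeholder; real catalog entries are not reproduced here
}

table = sources_by_snp(snp_lists)

# SNPs reported by more than one source are the ones the tools agree on.
shared = sorted(snp for snp, sources in table.items() if len(sources) > 1)

for snp in sorted(table):
    print(snp, "->", ", ".join(table[snp]))
print("Shared:", shared)
```

Identifiers that appear under only one source (such as a company-internal probe ID) stand out immediately, which is a cue to look for a public dbSNP identifier before comparing further.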

Another topic that is worth considering is family history.  Before I saw my results, there were 3 diseases that I wanted to check due to family history: type I diabetes, type II diabetes, and macular degeneration.  Thus, it was interesting to see type I diabetes come up in both reports.  Although I doubt that I will get type I diabetes (since the absolute risk is low and this disease usually appears during childhood), this information may still be useful if these mutations have other effects and/or increase the likelihood of children inheriting type I diabetes.

On the other hand, I didn’t see results indicating an increase in risk for type II diabetes, and there were some conflicting results about macular degeneration.  Of course, family history is not a gold standard, and I may very well never develop type II diabetes or macular degeneration.  However, I think it is important to think carefully about ambiguous or uncertain results.  For example, this could be done by comparing SNP associations to family history as well as considering both genetic and non-genetic risk factors for disease (the latter is the topic of my second post).


  1. Regarding: "Now there is agreement between the 23andMe and Promethease results because both tools indicate that I have a mutation in rs6025, which results in an increased risk of developing venous thromboembolism. However, I think it is worth pointing out that the results are not quite as clean as they could be. For example, this SNP was not listed in the GWAS Catalog, ..."

    It is good to keep in mind that the GWAS catalog sometimes doesn't include variants of large effect (since one doesn't need GWAS to find variants of large effect). The rs6025 variant (also known as Factor V Leiden, or F5) is a well known genetic contributor to risk for venous thromboembolism. Note also that the 23andMe chip includes thousands of genetic variants such as Factor V Leiden that are understood to be associated with disease but are not on the standard chips used for GWAS.

  2. Thanks for finding that. Promethease now checks all 3.

  3. SNPedia also submits SNPs to NIH's dbSNP database in order to get (public) Rs identifiers for SNPs of importance; an example is rs113993960, aka delta508, the most common SNP leading to cystic fibrosis. GWAS studies are unlikely to include SNPs like the "private" i3002432 you discuss since that's a SNP (and a name) used by one private company, rather than one from a public source like dbSNP.

    We feel it's very important to have stable public identifiers for the ever-growing collection of variants (and medically associated consequences) that are coming along from chip and both exome and full genome sequencing studies. If a SNP is important enough to publish about and to discuss with a doctor or family member, it should have a public identifier.

    So for SNPs of interest, we are happy to help any individual or company unwilling or unable to on their own go through the process of obtaining dbSNP Rs identifiers. Just contact us at and we'll work it out.

  4. Thanks for the helpful feedback!

    FYI, I also received some helpful comments from @23science on Twitter: (you won't find i3002432 in dbgap b/c it's an internal id for probes we designed. try rs1799963) Our white paper on risk calc: Tom's ASHG abstract:

  5. False positives in GWAS studies make a fair number of associations suspect. Calculating disease risks from false SNP associations is somewhat of a parlour game.

  6. I was quite enlightened by my results. On T2D, I am homozygous for 2 SNPs that correlate at 92% with occurrence of the disease. I knew I had a family history of diabetes on both sides of the family, but this was a bit of a cold shower. I also have several SNPs for T1D, also in my family, so I'm glad to know I didn't develop it... but I do have other AI issues that some of those also seem to be related to. All in all, I am glad to have the info.


Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.