Sunday, February 27, 2011

My 23andMe Results: Getting a (Free) Second Opinion

NOTE: Getting Advice About Genetic Testing

To get an idea of how well the 23andMe risk calculator agrees with other algorithms (when using the exact same SNP data), I searched for other tools that I could use to analyze my genetic data.

For this post, I have compared my risk assessments from 23andMe to those provided by Promethease (which uses the information available in SNPedia). I also played around with the free version of Enlis Genome (Personal Edition), but I found the GUI to be a little buggy and it didn’t automatically prioritize risk assessments (unlike 23andMe and Promethease). So, this post will focus only on comparing my 23andMe assessment with my Promethease assessment.

To be fair, I should point out that I would not necessarily expect 100% concordance between my 23andMe and Promethease results, for various reasons. For example, the “magnitude” score from Promethease is a subjective measure, and the curation methods are different for these two tools. However, I think such a comparison is still useful: it is encouraging to see any predictions that are shared by both tools, and I think both tools provide useful information, since there is no “standard” way to combine all of the SNPs associated with a particular disease.

I will focus on my increased disease risks, but the same principles could be applied to decreased disease risk, drug response, or any other trait.

All Diseases with Increased Risk
NOTE: Percentages refer to the percent of individuals with my genotype that have a particular disease, and the relative risk compared to the average percentage is given in parentheses. Percentages are not given for Promethease results because the percentage provided in the summary report refers to the population frequency of the SNP and not the percent of individuals with that SNP that will have a particular disease. Only Promethease results with clearly defined disease names and relative risk values were considered. The Promethease cutoff was chosen based upon the change in color from pink to red in the summary report (and also the number of associations listed at this threshold). The 23andMe risk assessment was recorded on 2/26/2011. The Promethease report was generated on 1/27/2011. Multiple values are provided for Promethease but not 23andMe because Promethease provides risk assessments for individual SNPs, whereas 23andMe provides a single risk value for each disease.


Overall, I thought that there was pretty good agreement between the two methods. This may not be apparent from the table above, because the agreement is clearest in my list of “higher importance” SNPs, which is considerably smaller but has greater overlap. Ideally, I would have preferred to look at SNPs with a 1.5x increase in risk and an absolute risk greater than 50%. The absolute cut-off of 50% is because I would prefer to look at SNPs where I am more likely to get the disease than not get the disease. The 1.5x (or 50% increase in risk) is a somewhat arbitrary cutoff that is loosely based upon my microarray data analysis experience. Since no SNPs meet both of these criteria, I chose to look at those with a greater than 1.5x relative risk and greater than 5% absolute risk (which, in my opinion, is still quite low). Now, take a look at my more subjective SNP list.
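The two cutoffs described above (relative risk greater than 1.5x and absolute risk greater than 5%) can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the way 23andMe or Promethease compute anything. The class name, the function name, and the disease values below are hypothetical placeholders, not my actual results.

```python
from dataclasses import dataclass


@dataclass
class RiskEstimate:
    """One disease-level risk estimate (all field names are hypothetical)."""
    disease: str
    my_risk_pct: float        # percent of people with my genotype who get the disease
    average_risk_pct: float   # percent of the general population who get the disease

    @property
    def relative_risk(self) -> float:
        # Relative risk = risk for my genotype divided by the average risk.
        return self.my_risk_pct / self.average_risk_pct


def higher_priority(estimates, min_relative_risk=1.5, min_absolute_pct=5.0):
    """Keep estimates that exceed BOTH the relative and the absolute cutoff."""
    return [
        e for e in estimates
        if e.relative_risk > min_relative_risk and e.my_risk_pct > min_absolute_pct
    ]


# Placeholder values for illustration only (NOT real results).
examples = [
    RiskEstimate("disease A", 12.0, 6.0),   # 2.0x and 12%: passes both cutoffs
    RiskEstimate("disease B", 0.4, 0.1),    # about 4x but only 0.4%: absolute risk too low
    RiskEstimate("disease C", 22.0, 20.0),  # 22% but only 1.1x: relative risk too low
]

selected = [e.disease for e in higher_priority(examples)]
print(selected)  # ['disease A']
```

Raising `min_absolute_pct` to 50.0 reproduces the stricter rule I would have preferred; with these placeholder values (as with my real data) nothing passes it.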

“Higher Priority” SNPs



Now, 2 out of the 3 SNPs have similar predictions. Although there wasn’t a high-magnitude SNP in Promethease for venous thromboembolism, this could be because this disease is (in my subjective opinion) less well known than arthritis or diabetes, and less popular diseases may have lower magnitude scores. For this reason, I decided to look into which SNPs are used by 23andMe and Promethease to determine venous thromboembolism risk. I also checked the Genome-Wide Association Study (GWAS) Catalog to try to get an idea of which SNPs are the best established (according to the US National Human Genome Research Institute).

SNPs Associated with Venous Thromboembolism Risk

SNP         23andMe   Promethease   GWAS Catalog
rs6025      Yes       Yes           No
i3002432    Yes       No            No
rs505922    No        Yes           Yes
NOTE: Promethease lists 19 SNPs associated with venous thromboembolism. In order to simplify the table (and avoid listing some potentially inaccurate and/or low-confidence associations), I have only included SNPs listed by 23andMe or the GWAS Catalog.
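The same cross-referencing can be done with simple set operations. The sketch below uses only the three SNPs from the table above (so the Promethease set is deliberately incomplete; it omits the other SNPs that Promethease lists). The variable names are my own and are not part of any of these tools.

```python
# SNPs listed for venous thromboembolism by each source, taken from the table.
# The Promethease set only contains the SNPs shown in the table, not all 19.
sources = {
    "23andMe": {"rs6025", "i3002432"},
    "Promethease": {"rs6025", "rs505922"},
    "GWAS Catalog": {"rs505922"},
}

# SNPs reported by both 23andMe and Promethease.
shared = sources["23andMe"] & sources["Promethease"]

# For every SNP, record which sources list it.
all_snps = sorted(set().union(*sources.values()))
support = {
    snp: [name for name, snps in sources.items() if snp in snps]
    for snp in all_snps
}

print("Shared by 23andMe and Promethease:", sorted(shared))
for snp, listed_by in support.items():
    print(f"{snp}: {', '.join(listed_by)}")
```

An SNP supported by only one source (such as i3002432 here) is a natural candidate for a closer look before drawing any conclusions.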


Now there is agreement between the 23andMe and Promethease results because both tools indicate that I have a mutation in rs6025, which results in an increased risk of developing venous thromboembolism.  However, I think it is worth pointing out that the results are not quite as clean as they could be.  For example, this SNP was not listed in the GWAS Catalog, and I couldn’t determine the dbSNP annotation for i3002432 (so it was relatively hard for me to cross-reference this result with other databases).

Another topic that is worth considering is family history. Before I saw my results, there were 3 diseases that I wanted to check due to family history: type I diabetes, type II diabetes, and macular degeneration. Thus, it was interesting to see type I diabetes come up in both reports. Although I doubt that I will get type I diabetes (since the absolute risk is low and this disease usually appears during childhood), this information may still be useful if these mutations have other effects and/or increase the likelihood of children inheriting type I diabetes.

On the other hand, I didn’t see results indicating an increase in risk for type II diabetes, and there were some conflicting results about macular degeneration. Of course, family history is not a gold standard, and I may very well never develop type II diabetes or macular degeneration. However, I think it is important to think carefully about ambiguous or uncertain results. For example, this could be done by comparing SNP associations to family history as well as considering both genetic and non-genetic risk factors for disease (the latter is the topic of my second post).
 

Thursday, December 2, 2010

Article Recommendation: Glioblastoma Subtypes Defined Using Data from TCGA

After reading Verhaak et al. 2010 in Cancer Cell, I was impressed by this very good study analyzing data from an important resource for genomics research.

The authors were able to define gene signatures that distinguish 4 subtypes of glioblastoma. The experimental design was pretty straightforward, and the results were quite clear. Most importantly, their predictive model was trained on a relatively large set of 173 patient samples and validated on an even larger set of 260 patient samples (from 5 independent studies).

The study focused mostly on data provided by The Cancer Genome Atlas (TCGA). TCGA is a database that contains various types of genomic data (gene/miRNA expression, gene/miRNA copy number, DNA sequence/polymorphism, and DNA methylation), and most or all types of genomic data are available for each patient in the database. This provides a unique opportunity to integrate many different types of data, usually for a large number of clinical samples. For anyone not aware of this resource, I would strongly recommend checking out the links provided above as well as the original TCGA paper (also on glioblastoma) published in Nature.

Sunday, November 21, 2010

The Personal Benefits of Self-Regulation

Although dishonest individuals do not always experience immediate repercussions for their unethical behavior, there are a number of benefits to having the self-discipline and courage to search for an honest career that truly helps other people. In particular, the development of ethical habits early in one's professional career is likely to pay off later in life.

There can be significant long-term consequences to dishonest behavior, as shown by an increasing number of retractions of papers from scientific journals and a noticeable presence of scientific misconduct in the news. For example, Anil Potti resigned from Duke University after it was revealed that he published forged results and included inaccurate information in his CV (such as falsely claiming that he was a Rhodes Scholar). However, there are also a number of less drastic consequences that do not involve formal punishment for bad behavior.

Embellishing results early in one's career can make downstream research more difficult.  For example, inaccurate predictions will make it difficult to get positive results from follow-up experiments to validate a preliminary hypothesis.  Also, many scientific disciplines offer rotations for graduate students, and it will be more difficult to recruit top-notch graduate students if other labs can offer more interesting projects with a better experimental design.  It is difficult to maintain a steady stream of publications in a lab with little or no competent personnel.

Although scientists who forge results on a regular basis may not necessarily have to worry about downstream analysis (because all of their results are false anyway), individuals who make false claims on a regular basis are more likely to be caught by others attempting to verify important results.

Networking and social interactions will also be more difficult for individuals with a reputation for behaving dishonestly. Even if the general public is not aware of a person's reputation, individuals who behave unethically on a regular basis will probably have difficulties developing a close network of friends.

As my grandmother used to say, the common saying shouldn't be "practice makes perfect" but rather "practice makes permanent" because all habits (good or bad) are difficult to break.  If aspiring scientists start engaging in unethical conduct, then it will become increasingly hard to break those habits at a later stage in their career.
 
Creative Commons License
Charles Warden's Science Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.