Monday, March 14, 2011

Article Review: Epigenetic suppression of the TGF-beta pathway revealed by transcriptome profiling in ovarian cancer

In this paper, Matsumura et al. develop a method to identify methylated genes in ovarian cancer patients using gene expression data from roughly 40 ovarian cancer cell lines and 20 cultured primary tumor samples.  The authors posit that this method provides a unique opportunity to study pathways affected by methylation because it directly examines gene expression.

My overall thoughts on this paper:

  • The study produced a relatively large amount of data, which is now available in GEO.
  • The study utilized a large amount of publicly available data, providing a very useful list of citations for anyone interested in doing bioinformatics analysis on ovarian cancer (especially those interested in methylation).
  • The authors utilize useful open-source tools for pathway analysis (namely GATHER and the specialized binary regression method)

  • I think it is more likely that methylation directly suppresses EMT-related genes (such as those involved with cell adhesion) rather than repressing the TGF-beta pathway (which then regulates EMT genes).
  • Unlike in other cancers, patients with methylated genes do not show a worse prognosis.  In fact, I wouldn't be surprised if patients with methylated genes had a slightly better prognosis because methylation suppresses genes associated with the epithelial-mesenchymal transition (which is associated with progression to a more aggressive cancer).  This hypothesis is also supported by the stromal response data shown in Figure S9.

I think one of the most useful tools discussed in this paper is GATHER, which is very fast and has a simple user interface.  GATHER provides enrichment analysis for information from various databases, such as Gene Ontology, KEGG Pathways, TRANSFAC, and MEDLINE.  More detailed information about the data mined in GATHER can be found in the associated paper by Chang and Nevins.

In fact, GATHER was immediately useful in helping interpret the results of this study.  For example, I used GATHER to check the enrichment for the list of 378 methylated genes described in this paper.  This revealed that the TGF-beta signaling pathway was not the most significantly enriched pathway in the gene list, and that it actually had the smallest number of representative genes in the methylated gene list (out of the significantly enriched pathways).  GATHER was also useful for studying the enrichment of pathways in the more conservative "methyl cluster" gene list (which showed a weaker association with the TGF-beta pathway and a stronger association with other pathways, such as the focal adhesion genes).  These are some of the reasons I believe that methylation directly suppresses EMT-related genes in these ovarian cancer patients (rather than acting through the TGF-beta pathway).
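
For readers who have not run an enrichment analysis before, here is a rough Python sketch of the general idea.  It uses a generic hypergeometric over-representation test, which is my own stand-in for illustration; GATHER reports its own statistics, and I am not claiming they are calculated this way.  The only number taken from the paper is the list size of 378.  The background size of 20,000 genes, the pathway names, and the pathway counts are all made-up placeholders.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population   -- number of genes in the background (assumed value)
    pathway_size -- number of background genes annotated to the pathway
    list_size    -- number of genes in the list being tested
    overlap      -- number of list genes annotated to the pathway
    """
    max_overlap = min(pathway_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# ASSUMPTION: background size is a placeholder, not a value from the paper.
BACKGROUND_GENES = 20000
# From the paper: 378 genes in the methylated gene list.
METHYLATED_LIST_SIZE = 378

# HYPOTHETICAL counts (pathway size, overlap with the gene list).
# These are illustrative only and are NOT results from GATHER or the paper.
hypothetical_pathways = {
    "pathway_A": (200, 14),
    "pathway_B": (85, 6),
    "pathway_C": (300, 9),
}

# Rank the placeholder pathways from most to least enriched.
ranked = sorted(
    (
        (hypergeom_enrichment_p(BACKGROUND_GENES, size, METHYLATED_LIST_SIZE, hits), name)
        for name, (size, hits) in hypothetical_pathways.items()
    )
)

for p_value, name in ranked:
    print(f"{name}: p = {p_value:.3g}")
```

The point of ranking several pathways side by side is the same one I make above: a pathway can pass a significance threshold without being the strongest hit, so it is worth looking at the whole list rather than a single highlighted result.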

Another useful open-source tool described in the paper is the binary regression method used to define the TGF-beta gene signature.  The binary regression method is especially useful for biologists without a lot of coding experience because it has a MATLAB GUI with a simple, user-friendly interface (and version 2.0 is even better than the original code).  In addition to defining gene and pathway signatures, the Bild lab is also currently using this binary regression algorithm to predict drug sensitivity from patient samples.

That said, there are probably a few things I should warn potential users about before giving this product my complete stamp of approval.  Although I have played around with this tool a little bit (with encouraging results), I haven't had a chance to use it as much as the relatively common R packages for SVMs (in the e1071 package) and classification trees (in the tree package).  Therefore, I can't really comment on the practical limitations of this algorithm.

I was also a little bit nervous when I saw that Anil Potti (who I mentioned in my previous blog post) was one of the authors on the original Nature paper by Bild et al. for the binary regression method.  However, Potti wasn't involved with the early framework for this method (described by West et al.), and a retraction request for one of the retracted Potti papers states "although we believe that the underlying approach to developing predictive signatures is valid, a corruption of several validation data sets precludes conclusions regarding these signatures."  Therefore, I don't think Anil Potti had any negative influence on the binary regression method.

Overall, I found this paper to be useful and informative, and I would recommend it for anyone interested in microarray analysis.


  1. FYI, my subsequent experiences with following through on validation experiments have led me to become more lax about picking the top GO enrichment result (in general - I believe there were some aspects of these specific validation experiments that I wasn't crazy about).

    I briefly mention this in the following comment:



My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.