Wednesday, March 13, 2013

Bioinformatics 101: Gene Expression Analysis

Differential Expression Tools:

  • R - statistical programming language
    • most common statistical functions (t-test, ANOVA, etc.) are built in
    • Bioconductor - suite of R packages used for bioinformatic analysis
      • limma - most commonly used differential expression tool for microarray analysis
      • edgeR - R package for RNA-Seq differential expression analysis
      • DESeq - R package for RNA-Seq differential expression analysis
  • cuffdiff
    • differential expression package within cufflinks
    • cufflinks provides transcript abundance calculations
    • strictly speaking, the developers recommend using cuffdiff for differential expression; in practice, it is also relatively common to use edgeR, DESeq, etc. following mRNA quantification via cufflinks
  • Java TreeView
    • free tool for clustering microarray data
  • OCplus - R package for statistical power calculations (and differential expression) for microarray studies
  • Scotty - web-based tool for statistical power calculations for RNA-Seq data
  • Partek Genomics Suite
    • Commercial program that includes a number of workflows, such as microarray gene expression and RNA-Seq analysis
    • Includes statistics for differential expression analysis as well as tools for downstream functional analysis and upstream quality control assessment
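
As a rough illustration of the simplest kind of test mentioned above (a per-gene t-test), here is a minimal Python sketch. It is not how limma, edgeR, DESeq, or cuffdiff work internally; those packages use more elaborate models. The gene names, the expression values, and the helper `welch_t` are all hypothetical, and the input is assumed to already be log2-scale expression.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances
    se2 = va / na + vb / nb                        # squared standard error
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical log2 expression values: gene -> (control, treated)
expression = {
    "geneA": ([8.1, 8.3, 7.9], [10.2, 10.0, 10.4]),
    "geneB": ([5.0, 5.2, 4.9], [5.1, 5.0, 5.05]),
}

results = {}
for gene, (control, treated) in expression.items():
    t, df = welch_t(treated, control)
    log2_fc = mean(treated) - mean(control)  # difference of log2 means
    results[gene] = (log2_fc, t, df)
    print(f"{gene}: log2FC={log2_fc:.2f} t={t:.2f} df={df:.1f}")
```

Turning the t statistic into a p-value requires the t distribution, which is not in the Python standard library; in R, the built-in `t.test` function handles this directly.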

lncRNA Resources:

  • MiTranscriptome - known and novel lncRNAs with cancer-associated profiles
  • TANRIC - TCGA and CCLE expression analysis for lncRNAs (including correlations with protein-coding genes and miRNAs)
  • Expression Atlas - gene expression profiles for known genes across various datasets
  • lncRNAdb - includes additional annotations for known lncRNAs
  • lncATLAS - contains subcellular location information for ENSEMBL-format lncRNAs for some cancer cell lines

Transcription Factor Motif Analysis:

  • IPA Upstream Regulator Analysis
    • Commercial tool that searches for enrichment of known targets for regulatory genes and molecules (such as transcription factors)
    • Can also detect if targets are consistent with activation or inhibition of the regulator
    • free tool that identifies upstream motifs enriched for gene lists
    • works on a wide variety of species, so it is useful for motif finding in less commonly studied organisms
  • Whole Genome rVISTA - calculate enrichment of transcription factor motifs predicted based upon evolutionary conservation
  • TRED (Transcriptional Regulatory Element Database) - database from CSHL for transcription factors.  Includes target gene lists for transcription factors in human, mouse, and rat
  • TRANSFAC - database of transcription factor motif sequences.  There are commercial and open-source versions of the database
  • JASPAR - open-source database of transcription factor motif sequences
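
Several of the tools above report "enrichment" of motifs or known targets in a gene list. One generic way to score that kind of over-representation is a one-sided hypergeometric test, sketched below in Python. This is only an illustration of the general idea, not the method used by any specific tool listed here; the function name and all of the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_genome, n_with_motif, n_list, n_overlap):
    """Probability of seeing at least n_overlap motif-containing genes
    when n_list genes are drawn at random (without replacement) from
    n_genome genes, of which n_with_motif contain the motif."""
    total = comb(n_genome, n_list)
    upper = min(n_with_motif, n_list)
    tail = sum(
        comb(n_with_motif, k) * comb(n_genome - n_with_motif, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes in the background, 500 with the motif,
# and a list of 100 differentially expressed genes, 10 of which have it.
p = hypergeom_enrichment_p(20000, 500, 100, 10)
print(f"enrichment p-value: {p:.3g}")
```

With these made-up numbers, only about 2.5 motif-containing genes would be expected in the list by chance, so an overlap of 10 yields a small p-value. When many motifs are tested at once, the resulting p-values would also need a multiple-testing correction.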

General RNA-Seq Information:

Microarray Annotation Resources:
  • NetAffx
    • Affymetrix resource for probe design information
    • registration is free but required
  • GeneAnnot
    • an alternative resource for Affymetrix probe annotations

My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.