Wednesday, March 13, 2013

Bioinformatics 101: Genomic Databases

Genomic Annotations:

Systems Biology Databases:
  • Gene Ontology (GO)
    • Database of functional annotations for protein-coding genes
  • KEGG - Kyoto Encyclopedia of Genes and Genomes
    • primarily used as a pathway database
  • IntAct - database for protein-protein interactions
  • Reactome - curated database of biological pathways
  • Regulome Explorer - software to visualize integrative genomic data from the TCGA project
  • BioGRID - database of genetic and protein interactions
  • MINT - protein-protein interaction database
  • STRING - database for known and predicted protein-protein interactions
  • STITCH - database of known and predicted chemical-protein interactions (including drugs)

Microarray / Sequencing Databases:

  • GEO - Gene Expression Omnibus (NCBI); microarray database
  • ArrayExpress - microarray database (EMBL-EBI)
  • SRA - Sequence Read Archive (NCBI); entries are often also indexed in GEO
  • BioGPS - similar to NCBI Gene, but also includes normal tissue expression levels (from microarray data)
  • TiGER - tissue-specific gene expression database
  • CellMiner - query NCI-60 cell line data
  • TCGA Data Portal - integrative genomic data for large cancer datasets

Genomic Variation Databases:


Disease-Centric Databases:

  • General
    • OMIM - Online Mendelian Inheritance in Man
      • database of human genes and genetic disorders
    • SIDER - EMBL database of drug side effects
  • Cancer
    • cBioPortal
      • User-friendly interface for querying cancer datasets (including TCGA data)
    • TCGA - The Cancer Genome Atlas
      • includes microarray and sequencing data
    • Oncomine
      • database of gene expression and copy number data from patients
      • basic access is free, but license is required for premium access
    • caArray - NCI repository for cancer microarray data

Protein Databases:

Creative Commons License
Charles Warden's Science Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.