Thursday, June 14, 2012

Reformat 23andMe Data for SeattleSNP

Step #1: Download Raw Data from 23andMe

  • After signing in to 23andMe, first go to "Account" (in the top right-hand corner of the screen) and then "Browse Raw Data"
  • Click the link near the top of the page to "download raw data"
  • Choose "All DNA" for your data set, and then click "Download Data"

Step #2: Reformat Raw Data

  • Download the Perl script 23andMe_to_SeattleSNP.pl
  • There is one parameter that you need to enter:
    • genome = raw data file from 23andMe
  • PC Users
    • Open a terminal window (type "cmd" in Run, for example)
    • Move to the folder where your 23andMe data is saved.
      • Basic commands:
        • cd = change folder
          • For example, if the data is on your D:\ drive, you can type "cd /d D:"
        • .. = move up one folder
    • Type in "perl 23andMe_to_SeattleSNP.pl" and enter the required genome parameter. See the example below.

  • Mac Users
    • Open Terminal (in Applications/Utilities, for example)
    • Basic commands:
      • cd = change folder
      • .. = move up one folder
    • Type in "perl 23andMe_to_SeattleSNP.pl" and enter the required genome parameter. See the example below.


Step #3: Upload Data to SeattleSNP

The 23andMe SNP data currently uses NCBI 36 / hg18. You can confirm whether this is still the case by opening the raw data in a text editor such as Notepad++.
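
The check described above can also be scripted. The following is a minimal Python sketch, not part of the original workflow. It assumes that the raw data file begins with "#" comment lines and that one of them mentions the word "build" followed by a number. The function name find_build and the sample header text are made up for illustration, so compare them against your own file before relying on the result.

```python
import re


def find_build(lines):
    """Return the build number mentioned in the '#' header, or None."""
    for line in lines:
        if not line.startswith("#"):
            break  # the comment header is over; data rows follow
        # Assumption: the header names the build as "build <number>".
        match = re.search(r"build\s+(\d+)", line, flags=re.IGNORECASE)
        if match:
            return int(match.group(1))
    return None


# Made-up header lines for illustration; a real file may be worded differently.
header = [
    "# This data file generated by 23andMe\n",
    "# We are using reference human assembly build 36.\n",
    "rs0000001\t1\t1000\tAA\n",
]
print(find_build(header))
```

To use it on a real file, pass an open file handle instead of the list, for example find_build(open("genome.txt")), where genome.txt stands for your downloaded raw data file.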

There are a few different portals for accessing SeattleSNP annotations, but you will need to use this link as long as the 23andMe data uses NCBI 36 (as of today, NCBI 37 / hg19 is the latest genome build): http://snp.gs.washington.edu/SeattleSeqAnnotation/

  • Enter your e-mail address
  • Select the file created by the Perl script. Its name should be almost identical to that of the genome file, but it will end in _SeattleSNP.txt
  • This file conforms to the "custom" format, so please select "custom" under "input file format" and enter the following information
    • Chromosome: 2
    • Location: 3
    • Reference Allele: 0
    • First Allele: 4
    • Second Allele: 5
  • Click the green submit button
  • It may take several hours to annotate your 23andMe SNPs. You will receive an e-mail message when the annotated file is ready to download.
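
The column numbers entered above suggest a layout in which column 1 holds the SNP ID, columns 2 and 3 hold the chromosome and position, and columns 4 and 5 hold the two alleles (with 0 presumably meaning that there is no reference allele column). The following is a minimal Python sketch of a conversion that would produce such a file. It is not the Perl script itself. The tab-delimited input layout (rsid, chromosome, position, genotype), the function name convert_23andme, and the sample rows are all assumptions made for illustration.

```python
import io


def convert_23andme(lines):
    """Yield tab-separated rows: rsid, chromosome, position, allele1, allele2."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and the '#' comment header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # not a recognizable data row
        rsid, chrom, pos, genotype = fields
        # Assumption: keep only two-base calls; no-calls ("--"), indel codes,
        # and single-letter genotypes are skipped in this sketch.
        if len(genotype) != 2 or any(base not in "ACGT" for base in genotype):
            continue
        yield "\t".join([rsid, chrom, pos, genotype[0], genotype[1]])


# Made-up rows for illustration only.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\tAG\n"
    "rs0000003\t1\t3000\t--\n"
)
rows = list(convert_23andme(sample))
for row in rows:
    print(row)
```

Splitting the genotype into two columns is what lets "First Allele" and "Second Allele" point at separate column numbers in the custom format.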

I have tested my Perl scripts on a PC and a Mac, but I cannot guarantee that they will work on every possible platform. Also, these scripts may need modifications as file formats change, but I have confirmed that they currently work with v2 and v3 arrays using genomes from Genomes Unzipped. If you have any questions or comments, please post them below and I will do my best to help troubleshoot.

Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.