Thursday, June 14, 2012

Filter Combined Annotations for 23andMe SNPs

Step #1: Prepare Input File
  • List of 23andMe SNPs with both SeattleSNP and GWAS Catalog annotations (click here for details)
Step #2: Filter List of SNPs

  • Download the perl script 23andMe_filter.pl
  • There is one parameter that you need to enter:
    • input = 23andMe SNP file annotated with both SeattleSNP and GWAS Catalog information (see here for more details)
  • There are 5 optional parameters that you can enter:
    • output = output file containing the filtered SNP list. By default, _filter.txt is appended to the end of the input file name
    • OR = odds ratio cutoff (filter for scores greater than cutoff) [default = 2]
    • PAM = PAM score cutoff (filter for scores less than cutoff) [default = 0]
    • risk_status = status for the GWAS Catalog risk allele. Either "Homozygous", "Heterozygous" (which actually filters for both homozygous and heterozygous risk alleles), or "none" [default = "Heterozygous"]
    • allele_freq = set of parameters describing the allele frequency cutoff. If provided, the parameter must use the following format: [genetic background]_[comparison type]_[threshold]. For example, European_gt_0.25. [default = "none"]
      • Genetic background can be "European", "African", or "Asian"
      • Comparison type can be "gt" for greater than or "lt" for less than
      • Threshold corresponds to the population frequency.  Must be between 0 and 1.
  • PC Users
    • Open a terminal window (type "cmd" in Run, for example)
    • Move to the folder where your 23andMe data is saved.
      • Basic commands:
        • cd = change folder
          • If the data is not on your C: drive, you can type "cd /d D:" to switch drives
        • .. = move up one folder
    • Type in "perl 23andMe_filter.pl" and enter the required input parameter.
    • You can also enter optional parameters (OR, PAM, risk_status, and/or allele_freq).

  • Mac Users
    • Open Terminal (in Applications/Utilities, for example)
    • Basic commands:
      • cd = change folder
      • .. = move up one folder
    • Type in "perl 23andMe_filter.pl" and enter the required input parameter.
    • You can also enter optional parameters (OR, PAM, risk_status, and/or allele_freq).
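
To make the allele_freq format concrete, here is a minimal sketch in Python of how a value such as European_gt_0.25 could be split into its three parts and applied to a population frequency. This is not the code from 23andMe_filter.pl (which is written in Perl). The names parse_allele_freq and AlleleFreqFilter are hypothetical, and I am assuming that the threshold bounds of 0 and 1 are inclusive and that "gt" and "lt" are strict comparisons.

```python
import operator
from dataclasses import dataclass

# Values allowed by the parameter description above.
BACKGROUNDS = {"European", "African", "Asian"}
COMPARISONS = {"gt": operator.gt, "lt": operator.lt}


@dataclass(frozen=True)
class AlleleFreqFilter:
    """Hypothetical container for one parsed allele_freq setting."""
    background: str
    comparison: str
    threshold: float

    def passes(self, frequency):
        # Strict comparison: "gt" keeps frequency > threshold,
        # "lt" keeps frequency < threshold (assumption).
        return COMPARISONS[self.comparison](frequency, self.threshold)


def parse_allele_freq(spec):
    """Parse '[genetic background]_[comparison type]_[threshold]'.

    Returns None for the default value "none".
    """
    if spec == "none":
        return None
    parts = spec.split("_")
    if len(parts) != 3:
        raise ValueError("expected background_comparison_threshold: " + spec)
    background, comparison, raw_threshold = parts
    if background not in BACKGROUNDS:
        raise ValueError("unknown genetic background: " + background)
    if comparison not in COMPARISONS:
        raise ValueError("comparison type must be 'gt' or 'lt': " + comparison)
    threshold = float(raw_threshold)  # raises ValueError if not a number
    # Assumption: the endpoints 0 and 1 are accepted.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1: " + raw_threshold)
    return AlleleFreqFilter(background, comparison, threshold)


rule = parse_allele_freq("European_gt_0.25")
print(rule.passes(0.40))  # True: 0.40 is greater than 0.25
print(rule.passes(0.10))  # False: 0.10 is not greater than 0.25
```

With this layout, a malformed value (for example, a threshold of 1.5 or a comparison type other than "gt" or "lt") is rejected before any SNPs are filtered, rather than silently producing an empty list.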


I have tested my perl scripts on a PC and a Mac, but I cannot guarantee that they will work on every possible platform.  Also, these scripts may need modifications as file formats change, but I have confirmed that they currently work with v2 and v3 arrays using genomes from Genomes Unzipped. If you have any questions or comments, please post them below and I will do my best to help troubleshoot.

My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.