Digging Deeper into my Cystic Fibrosis Carrier Status

One overall goal from the various subfolders on the DTC_Scripts repository was to get an idea about how much the data / results could vary between vendors.

I suppose some people might consider it surprising that the "raw" genotypes/variants could vary, but I previously discussed that in a post about re-processing raw data to get more concordant genotypes (and I also have a post about tools to make HLA assignments among this collection of posts).

Some things, like ancestry, may arguably fall under what I would call "hypothesis generation" results, in that some results may be more robust than others (and limitations to the accuracy of specific ancestry assignments are described in another post).

In contrast, this post focuses on something that I think can be utilized with relatively greater confidence (that I am a cystic fibrosis carrier).

That said, in a sense, making sure you get single-gene, rare-disease genomics analysis consistently correct is more complicated than you might expect.  However, in terms of being confident about any genomics result, I think rare variants associated with Mendelian diseases should be a strong point for genomics benefiting society.

So, here is the outline of what happened:

  • In 2011, I was genotyped (with the V3 chip) by 23andMe 
    • This indicated that I was a cystic fibrosis carrier
  • While some carrier status results have been removed (and added back in), I knew my carrier status before there were any issues with the FDA.
  • In 2016, I got Veritas Whole Genome Sequencing raw data (and a GET-Evidence and ClinVar report from the Personal Genome Project)
  • In 2017, I got Genos Exome raw data with an automated report
    • Update (3/17/2020): When I currently sign into the Genos browser, I see my pathogenic variant annotation in the CFTR gene.  I am not sure when/if this was changed, but the report does now successfully show multiple references that are correct for my cystic fibrosis carrier status.
  • In 2019, I ordered a bunch of extra tests (primarily emphasizing the interpretation over the raw data), but this included Helix Exome+ data from the Mayo GeneGuide (the raw data cost extra, and was a gVCF).
  • So, I had 3 high-throughput sequencing results that covered my cystic fibrosis variant.  However, none of them indicated that I was a cystic fibrosis carrier in a way that was immediately obvious, and I think at least one (Mayo GeneGuide) failed to report my cystic fibrosis status (even when covering a smaller number of diseases).
    • You can see my FDA MedWatch / MAUDE report for Mayo GeneGuide in MW5093889.
    • Helix sent me an e-mail that Mayo GeneGuide was discontinued on 4/30/2020, which you can also see on this website.
    • There are some extra formatting changes that I wasn't expecting, but you can also see my FDA MedWatch / MAUDE report for Veritas Genetics in MW5093888.  That said, I was describing my Personal Genome Project report (since I ordered the sequencing through the PGP) and I don't think Veritas specifically marketed annotating my cystic fibrosis status.  So, it might be OK if it is harder to find this report for Veritas Genetics through the search function.
    • I was particularly surprised by this for GeneGuide, since they limited the number of diseases they officially tested for (which I think was a good idea).  However, their guidelines for defining a pathogenic variant didn't include the variant covered by the 23andMe array.
    • It might also be worth mentioning that an on-line physician signed off on these other 3 results, but that didn't improve the accuracy of my cystic fibrosis carrier status.
  • With the 23andMe result, I could check the details of the variant they used to define me as a carrier.  Namely, I could verify my carrier status for rs121908769 in ClinVar.
  • I might be forgetting the exact order of events after that.  However, the following gave me extra confidence that my earliest 23andMe result was in fact the "correct" one.
    • I could visualize my alignment in IGV (for my Veritas WGS and Genos Exome data) to see that I did in fact carry the variant (see below).
    • While a lot less intuitive to visualize, the Helix Exome+ data (which I had to pay extra for, beyond my GeneGuide results) also indicated that I had the variant in question, and IGV does accept a gVCF as an input file (see further below, under the .bam visualization).
    • I used the above data in response to a question on Biostars, and I was particularly pleased to get feedback that helped me gain confidence in my own result.
      • For example, I learned about a website called CFTR2, which provides information unique to cystic fibrosis and the CFTR gene.
      • Specifically, this specialized website indicated that my 394delTT variant should be considered pathogenic for cystic fibrosis (if you have two pathogenic alleles).  Please note that you have to check usage agreement to view the specific result linked above.
      • I also discovered some formatting issues that I believe were responsible for at least one false negative.
    • In other words, all 4 results correctly indicated that I had the variant.  The only issue was with interpretation of that variant (which was "correct" for 1 out of 4 results). 
    • I thought I had talked to multiple genetic counselors.  However, my GeneGuide notes indicate that the genetic counselor from PWNhealth agreed that the above information indicated that I was a cystic fibrosis carrier (even though I believe they were providing guidance for a report that formally, and incorrectly, indicated that I was not a carrier).

Veritas WGS / Genos Exome BAM (Provided + BWA-MEM Re-Alignment)

Helix Exome+ / Mayo GeneGuide (gVCF)
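
Part of the reason a gVCF is less intuitive than a BAM or an ordinary VCF is that most of its rows are "reference blocks" covering a range of positions, with only occasional rows that describe an actual variant.  As a rough illustration (a minimal Python sketch, not the process used for the screenshots), the code below reports which rows of a gVCF cover a position of interest and whether each one is a variant call or a reference block.  The function names, the contig name "chrTest", and the example records are all made up for illustration, and the sketch assumes the common gVCF conventions of an END value in the INFO column and symbolic ALT alleles such as <NON_REF>.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'END=199;DP=30' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def records_overlapping(vcf_lines, chrom, pos):
    """Yield (kind, columns) for each record covering a 1-based position.

    kind is 'variant' when the record lists a real ALT allele, and
    'reference_block' when ALT is only '.' or a symbolic allele like <NON_REF>.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if cols[0] != chrom:
            continue
        start = int(cols[1])
        info = parse_info(cols[7])
        # Reference blocks store their last base in INFO/END (assumed convention);
        # otherwise the record spans the length of the REF allele.
        end = int(info["END"]) if "END" in info else start + len(cols[3]) - 1
        if not (start <= pos <= end):
            continue
        real_alts = [a for a in cols[4].split(",")
                     if a != "." and not a.startswith("<")]
        yield ("variant" if real_alts else "reference_block"), cols


# Made-up records on a placeholder contig (NOT real CFTR coordinates).
EXAMPLE_GVCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "chrTest\t100\t.\tA\t<NON_REF>\t.\t.\tEND=199\tGT:DP\t0/0:30",
    "chrTest\t200\t.\tCTT\tC,<NON_REF>\t50\t.\t.\tGT:DP\t0/1:28",
    "chrTest\t203\t.\tG\t<NON_REF>\t.\t.\tEND=300\tGT:DP\t0/0:31",
]

if __name__ == "__main__":
    for position in (150, 201, 250):
        for kind, cols in records_overlapping(EXAMPLE_GVCF, "chrTest", position):
            print(position, kind, cols[3], cols[4], cols[9])
```

With a real file, you would pass the lines of the (decompressed) gVCF along with the contig name and coordinate for your own variant, taken from a source such as ClinVar for the matching genome build.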

In many ways, I still consider this a positive experience.  For example, note the following:

  • Having access to raw data allowed me to determine something that was incorrect / missing in my original report (and I think this should essentially be required)
    • That said, I hope the screenshots above show that FASTQ+BAM+VCF is probably a better set of formats to require than gVCF
  • Notice that I got free feedback in a public community forum (Biostars) that gave me information that I didn't obtain from any of the companies that I paid for genotyping / sequencing.  This emphasizes the value in having free options for re-analysis / re-processing of your data.
  • While it might require some additional training, sometimes simply viewing your data in IGV (a free genome browser) may be helpful for genetic counselors to assess the accuracy of individual genotypes.
    • While it makes life more difficult, the majority vote (3/4 companies, if you count as I did above) would actually be the wrong answer (falsely indicating that I was not a cystic fibrosis carrier).  So, kind of like I can tell that I need to work on fewer projects more in-depth, I think it probably helps to have specialization for genetic counselors (so, they can have an idea about what questions to ask, beyond what is provided in a short report).
  • I successfully learned (somewhat) more in-depth about a carrier status that could impact offspring (if my partner was also a carrier).  If planning to have a child should be decided on the scale of years (or you are assessing life-time risk for diseases with onset later in life), then taking some time to understand your genome on the scale of years may be OK (although, if you use IVF+PGT, you do need to make sure that the pre-defined variants are missing with high accuracy, on a shorter time-scale)
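
To make the "majority vote" point concrete, here is a small Python sketch that tallies the four results as summarized in this post.  The True/False values are my reading of the descriptions above (as originally reported, before the later Genos update), not output from any vendor, and the field and function names are hypothetical.

```python
from collections import Counter

# Summary of the four results as described in this post (not vendor output).
# "report_called_carrier" reflects the original reports, before the later
# update noted for the Genos browser.
RESULTS = [
    {"source": "23andMe V3 array",
     "variant_in_raw_data": True, "report_called_carrier": True},
    {"source": "Veritas WGS (PGP report)",
     "variant_in_raw_data": True, "report_called_carrier": False},
    {"source": "Genos Exome",
     "variant_in_raw_data": True, "report_called_carrier": False},
    {"source": "Helix Exome+ (Mayo GeneGuide)",
     "variant_in_raw_data": True, "report_called_carrier": False},
]


def majority(values):
    """Return the most common value in a list."""
    return Counter(values).most_common(1)[0][0]


raw_vote = majority([r["variant_in_raw_data"] for r in RESULTS])
report_vote = majority([r["report_called_carrier"] for r in RESULTS])

if __name__ == "__main__":
    print("Majority of raw data shows the variant:", raw_vote)
    print("Majority of reports called me a carrier:", report_vote)
```

The two votes disagree: the raw data are unanimous about the genotype, while a vote over the reports gives the wrong interpretation.  That is why checking the raw data mattered more than counting reports.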

That said, I do think it is important to have realistic expectations about what can be done in genomics, and the need to spend a non-trivial amount of time sorting out the details for your area of expertise.

