Monday, November 18, 2013

My American Gut Individual Report

I recently received my American Gut Individual Report for my fecal sample in the mail.  Click here to see this report as a PDF.

The first thing that I noticed was that the phylum distributions were very different from what I calculated from my raw data (click here to see those results).  Namely, ~75% of my reads aligned to Proteobacteria 16S rRNAs, but my individual report said that ~10% of my sample was from Proteobacteria.  So, I contacted the American Gut team to ask why the results were so different.  They said that the shipping process appears to allow differential growth of certain bacteria (especially Gammaproteobacteria) that would not normally appear in fresh samples.  So, they filter out likely contaminants for the report.

This is something that I would like to learn more about, and the American Gut team also said that they are actively investigating this.  The filtering code is available on the American Gut GitHub website, but I am most interested in seeing population-level metrics about this differential growth.  Namely, I would like to see how the reduction in false positives is affecting the true positive rate.  At least in my case, ~70% of my reads are being filtered out as potential contaminants, which seems like a loss of a lot of information.  Additionally, my filtered sample seems to cluster with samples that have relatively low Firmicutes counts (much closer to the original value of ~15%; see the PCA plot in the lower right-hand corner), even though the report's bar plot says the sample should be ~65% Firmicutes.  In other words, my guess is that the "best" interpretation of my results may lie somewhere between these two reports (where the true Firmicutes abundance may be lower and the true Proteobacteria abundance may be higher).
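
To make the arithmetic behind this concern concrete, here is a minimal Python sketch of how removing reads from one phylum inflates the relative abundance of everything that remains.  It is not the American Gut filtering code.  The ~75% Proteobacteria and ~15% Firmicutes values are the raw-data figures quoted above, but the "Other" category, the assumption that all ~70% of the filtered reads were Proteobacteria, and the function name are hypothetical choices made only for illustration.

```python
def renormalize_after_filter(percent_by_phylum, removed_by_phylum):
    """Subtract filtered reads and rescale what is left to 100%.

    Both arguments are in percentage points of the ORIGINAL read total.
    """
    kept = {
        phylum: pct - removed_by_phylum.get(phylum, 0.0)
        for phylum, pct in percent_by_phylum.items()
    }
    if any(pct < 0 for pct in kept.values()):
        raise ValueError("cannot remove more reads than a phylum has")
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no reads left after filtering")
    return {phylum: 100.0 * pct / total for phylum, pct in kept.items()}


# Raw percentages from my own alignment; "Other" is an assumed remainder.
raw = {"Proteobacteria": 75.0, "Firmicutes": 15.0, "Other": 10.0}

# ASSUMPTION: every filtered read (~70% of the total) was Proteobacteria.
removed = {"Proteobacteria": 70.0}

filtered = renormalize_after_filter(raw, removed)
for phylum, pct in filtered.items():
    print(f"{phylum}: {pct:.1f}%")
```

Under these assumptions, Firmicutes rises from 15% to 50% of the sample even though the number of Firmicutes reads has not changed at all.  This toy example does not exactly reproduce the ~65% Firmicutes and ~10% Proteobacteria in my report, so the real filter presumably removes a different mix of reads, but it shows why the filtered percentages can look so different from the raw ones.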

That said, I know the American Gut analysis is ongoing, and there will likely be additional reports in the future.  For example, I know for certain that there will eventually be a report for my oral sample.  It will be interesting to see if the results of future reports change as new scientific findings emerge.

Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.