Tuesday, December 20, 2022

Metagenomics Classifications Across Human And Cat Samples

As previously described, part of the reason why I collected a 2nd Whole Genome Sequencing (WGS) sample from Basepaws was to have raw data to re-analyze for 2 time points (which include Dental Health Test reports).

One of the points raised in that blog post was to improve coverage for germline variants.  I do see some evidence for improvement in Bastu's heterozygous dilute variant.  However, I primarily wanted to better understand and assess the metagenomics classifications (related to the Oral/Dental Health Test), and those are the primary focus for this blog post.

The exact format of the FASTQ files for the Basepaws raw data was different for the 2nd WGS sample than for the 1st WGS sample.  The 1st WGS sample had a format more similar to what was provided by other companies, so I wrote a script to confirm the file format for the 2nd WGS sample.  This requires some familiarity with programming (even if mostly to comment or uncomment lines to run), so I am not sure how much it will help with reformatting for all customers.  Nevertheless, you can see the code and details for the file reformatting and the analysis here.  A summary of data that I am sharing (and I would be interested to see from others) is provided here.
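As a rough illustration of the kind of reformatting involved, here is a minimal Python sketch that splits an interleaved FASTQ file (R1 and R2 records alternating) into separate paired-end files.  This is a simplified, hypothetical version for illustration, not the exact script used for the Basepaws data (for example, it assumes well-formed gzipped input and does not verify read-name pairing):

```python
import gzip

def deinterleave_fastq(interleaved_path, r1_path, r2_path):
    """Split an interleaved gzipped FASTQ into separate R1/R2 gzipped files."""
    with gzip.open(interleaved_path, "rt") as handle, \
         gzip.open(r1_path, "wt") as r1_out, \
         gzip.open(r2_path, "wt") as r2_out:
        while True:
            # Each FASTQ record is exactly 4 lines (name, sequence, '+', quality)
            record1 = [handle.readline() for _ in range(4)]
            if not record1[0]:
                break  # end of file
            record2 = [handle.readline() for _ in range(4)]
            r1_out.writelines(record1)
            r2_out.writelines(record2)
```

In practice, adding a check that the two read names in each pair match (before the "/1" or " 1" suffix) would help catch truncated or mis-sorted input.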

There are different strategies for metagenomics analysis.  For Basepaws, I believe that either regular coverage WGS or low-coverage WGS (lcWGS) is being used for oral microbiome classifications.  For the human fecal samples, I describe the details of the sequencing design for samples collected from multiple companies here.  The human oral samples are also a mix of different companies and different sequencing designs (such as WGS/lcWGS versus 16S Amplicon-Seq).  All cat fecal samples come from PetQCheck.

While I have not looked in as much depth at the sequencing type and re-analysis details for the other raw data, I have at least one oral and at least one fecal sample for myself and my cat Bastu.

The oral samples for myself are from uBiome, Bristle Health, and any WGS data collected for any purpose (Veritas, Sequencing.com, Nebula, etc.).  The oral samples for my cat Bastu only come from Basepaws.

The fecal samples with raw data for myself come from uBiome, Psomagen/Kean, and thryve/Ombre.  The fecal samples for my cat Bastu only come from PetQCheck.

For previous human comparisons, the company / sequencing design had a noticeable effect on the results.

However, if I go back and use a method that is at least capable of providing some classification for all of the sequencing types (Kraken2/Bracken), then I can re-analyze all of those data types together and visualize how they cluster:

I have added annotations for the species (human or cat) and the collection site (oral or fecal).  If Pearson Dissimilarity (1 - Pearson Correlation Coefficient) is used as the distance metric for clustering, then samples cluster primarily by sample site (oral versus fecal).  For example, all fecal samples intended for microbiome analysis do tend to have relatively high Bacteroides classification rates.

Additionally, Veritas only provided reads aligned to human chromosomes in a BAM (Binary Alignment Map) alignment file that I converted to paired-end FASTQ files (the reads used for the classifications above), so I expect all accurate non-human assignments should be relatively close to 0% for that sample.  So, this may be one reason to be skeptical about differences between the Basepaws WGS samples with this classification method: they show differences from each other and vary at a level similar to the Veritas sample, which should really be close to 100% true human reads.

I also created another heatmap, only looking at the Top 20 genera assignments with the highest average read fractions (calculated as percentages after removing Eukaryotic reads, which is also how the values for the first heatmap were calculated):
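Selecting the top genera for this kind of view can be sketched as follows (assuming a pandas DataFrame of genus-by-sample percentages, already computed after excluding Eukaryotic reads; the names here are hypothetical):

```python
import pandas as pd

def top_genera_by_mean_fraction(percent_table, n=20):
    """Return the n genera with the highest average percentage across samples.

    `percent_table` has one row per genus and one column per sample, with
    values already expressed as percentages of non-Eukaryotic reads.
    """
    mean_fraction = percent_table.mean(axis=1)
    top = mean_fraction.sort_values(ascending=False).head(n).index
    return percent_table.loc[top]
```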

In that view, it looks like there might be additional Prevotella classifications for the cat fecal samples.  However, I have not looked into the details of the PetQCheck sequencing design, to make sure it matches at least one set of human samples.

Additionally, independent of overall clustering, I think it is encouraging that there is greater detection of Faecalibacterium in the fecal samples than the oral samples (for both species).

Also, unlike the human oral sample that should have only contained human (hg19) reads (Veritas), the other human oral samples showed some similarities that were noticeable beyond the differences in sequencing design.

For advanced readers, you can see some additional testing for alternate normalizations here.  However, I think the main conclusions are the same, and (if anything) there is possibly a less clear distinction for the negative control approximation (and the Basepaws samples) with the other normalizations.

Also more for the advanced readers, but you can also see some troubleshooting where I looked for even coverage in representative sequence alignments here.  If a classification is made for a large fraction of reads but coverage is uneven, then I would say those classifications are more likely to be false positives.  For example, in the above images (or here), you can see a higher fraction of reads given Staphylococcus and Pseudomonas assignments in the negative control approximation.  Those representative genomes also have uneven coverage in these Bowtie2 alignments or these BWA-MEM alignments.
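One simple way to put a number on "uneven coverage" is the coefficient of variation of per-base depth (for example, computed from the depth column of samtools depth output).  This is just one possible metric and not what was necessarily used above:

```python
import statistics

def coverage_evenness(depths):
    """Coefficient of variation (stdev / mean) of per-base depths.

    Higher values suggest more uneven coverage, which (for a metagenomic
    classification) may hint at false-positive assignments.
    """
    mean_depth = statistics.mean(depths)
    if mean_depth == 0:
        return float("inf")
    return statistics.stdev(depths) / mean_depth
```

A genome covered at a fairly constant depth would score near 0, while a genome with a few densely covered islands and long empty stretches would score much higher.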

Finally, as yet another option for more advanced readers, you can see some alternative methods of running Kraken2/Bracken.  Perhaps of note, there is supposed to be an implementation of KrakenUniq within later versions of Kraken2 (referred to as "Kraken2Uniq", where I added the parameters "--report-minimizer-data --minimum-hit-groups 3" based upon this protocol):

Without adding the minimizer parameter for the KrakenUniq implementation within Kraken2, increasing the minimum number of hit groups to 10 ("--minimum-hit-groups 10") also reduces similarity between the approximate human negative control and the Basepaws oral Whole Genome Sequencing samples (at least with the latest versions of Kraken2 and Bracken):

Using a larger database can substantially increase the run-time.  However, there is some additional information uploaded on GitHub for that option, if using a reduced set of total reads for some samples.  You can see a large difference in the Komagataeibacter classification rate for the larger database versus the more stringent requirements for the smaller "MiniKraken2" database.

In other words, I think these are important take-home messages:

  • Human and cat fecal samples form a cluster for that collection site, and they separate by species within that cluster (although PetQCheck is the only source for any cat fecal samples).
  • The Veritas sample was provided to me as chromosome alignments, rather than reads.  By definition, this means that the reads were aligned to the human genome, so I consider "Veritas_WGS" to be a negative control approximation.
  • Both the human fecal and oral samples contain a mix of Whole Genome Sequencing (WGS) and targeted sequencing for the 16S bacterial gene (16S Amplicon-Seq).  So, when these samples cluster together, I consider that a relatively robust result (even if there are also additional factors contributing to the exact abundance values, such as the company and the sequencing design).
  • On the contrary, the 2 Basepaws samples are less similar to each other than the different companies / libraries for my human oral samples.  Additionally, if there is no additional troubleshooting, then there is relative clustering between the Basepaws oral cat samples (particularly Basepaws_WGS2) and the human negative control approximation.

In terms of trying to understand the Basepaws cat oral samples, there is a noticeable negative shift in the fragment size estimated by Picard (particularly for the first sample):

If the classifications can be reproducibly run in the same way, then I hope sharing reprocessed data might help others develop more of an opinion of the cat Oral/Dental Health Test from Basepaws.  I have started a discussion thread here.  On the Basepaws oral health test preprint, I also have additional comments, and I would be interested to hear feedback from others.  My understanding is that Kraken2/Bracken was also used in the preprint, but I think something about the exact way that the program was run was different.  Also, if I understand correctly, then I don't think the program can take the paired-end read information into consideration if the file is in an interleaved format (as provided for my cat's 2nd Basepaws WGS sample); however, I am not sure how important that is.  Within the GitHub discussion, I also provide some information for my human oral microbiome report from Bristle Health.

I would say the Basepaws Oral/Dental Health Test risk assessments were not a great match for my cat's individual dental health assessed by the vet.  However, I understand that the risk assessment is not meant to provide a diagnosis, and I am therefore still not completely sure what I think about the results.

That said, I feel comfortable saying that I wish raw data was provided for all of the Oral/Dental Health Test results from Basepaws (not only for the Whole Genome Test).

So, I hope this contributes some interest and encourages discussion on the topic!  I have some notes for the Basepaws Oral/Dental Health Test here, but I would be interested in seeing re-processed raw data with similar classifications (if possible).

Change Log:
12/20/2022 - public post date.
1/1/2023 - add content related to Kraken2/Bracken troubleshooting + revised take-home messages.
1/4/2023 - add note for Komagataeibacter assignments in Basepaws WGS1.

Saturday, July 2, 2022

Selected Comparisons for 2 Basepaws Whole Genome Sequencing Results

I already had an earlier Whole Genome Sequencing test from Basepaws, along with other results that can be viewed here or here.

However, I decided to order another test for the following reasons:

  • I would like to have some more raw data in order to help me better evaluate what I think of the Dental Health Results.
  • 15x is a bit low, even though I think the alignments for what I checked looked OK.  So, if I am correctly understanding that the goal is to sequence ~20x (with small fragments such that the genomic coverage can be closer to 15x), then a 2nd set of raw data would get me closer to the 20x-30x that I think is more commonly considered "regular" coverage for inherited/germline variants.

When I received the PDF reports, I noticed something important that I brought to the attention of Basepaws.  After a lengthy (but timely) discussion, it was agreed that something was incorrect in the report.  My understanding is that this has been reported by at least one other customer, but Basepaws had previously not realized the problem with the variant calling strategy (where I believe the earlier emphasis was on trying to understand unknown biology, instead of the discussion's shifted emphasis on troubleshooting the presentation of earlier findings).

So, due to the fact that this currently affects at least some other customers, I thought I would try to put a blog post together relatively earlier than originally planned (with a shift in the intended emphasis, for now).

Reports / Data

1st Sample (Collected 3/14/2019):

Report (as of 6/15/2022): PDF

Report (as of 8/28/2022): PDF (includes corrected "C>T" variant)

FASTQ Read #1: HCWGS0003.23.HCWGS0003_1.fastq.gz

FASTQ Read #2: HCWGS0003.23.HCWGS0003_2.fastq.gz

2nd Sample (Collected 4/21/2022):

Report (as of 6/13/2022): PDF

Report (as of 8/28/2022): PDF (includes corrected "C>T" variant)

Interleaved FASTQ #1 (Run 186, Lane 1): AB.CN.45.31211051000777.LP.858.D9.L1.R186.WGS.fastq.gz

Interleaved FASTQ #2 (Run 186, Lane 2): AB.CN.45.31211051000777.SP.319.D1.L2.R186.WGS.fastq.gz

Interleaved FASTQ #3 (Run 195, Lane 2): AB.CN.45.31211051000777.SP.329.E1.L2.R195.WGS.fastq.gz

As mentioned on GitHub, I will most likely work on writing custom code to create reads in a similar file format as for the 1st WGS sample (which I prefer).  At that point, I will add additional links, but the originally provided data is still important.

Incorrect Colorpoint Mutation in Certain v4.0 Reports

(Basepaws Has Corrected Bastu's Reports, as of 8/28/2022)

    This is what I thought was important to communicate sooner rather than later.

    I was not expecting this to be something to pop up, and I will try to explain some of the details in a bit.  The difference between the report and the data is important, but the issue can also be identified without that.

First, my cat was (incorrectly) listed as being likely to have the colorpoint trait in both reports:

As of 8/28/2022, telling whether a report includes the corrected variant was more difficult than I expected, because both sets of screenshots below are labeled as being version 4.0 reports:

Nevertheless, even though looking for the presence of the "black" coat color result is not sufficient to tell if the "C>T" colorpoint variant has been corrected, I hope the content of this blog post can help you determine if a given v4.0 report is likely to already have the correction or not.

I contacted Basepaws for the following reasons:

  • Even though you can have a Tortie Point cat, my cat does not have the colorpoint trait.
  • I have submitted samples to essentially all cat genomics organizations / companies that I was aware of.  While the markers could theoretically be different, the other results indicated that she has 0 copies of colorpoint mutations.  So, I wanted to check if the same variant was being tested.
First, here is a photo of my cat (Bastu):

So, I think it is reasonable to say she does not have the colorpoint trait.  There is also an additional illustration here (from this page) that I also think helps explain the colorpoint inheritance pattern.

There is also some discussion regarding traits in this Basepaws blog post, although I prefer the explanation on this page from UC-Davis (even though the results that I will show below are for the "Cat Ancestry" test rather than the separate colorpoint test).

In terms of the UC-Davis VGL "Cat Ancestry" results for Bastu, you can see the trait marker results below:

The "C/C" result indicates that Bastu should neither have the colorpoint trait nor be a colorpoint carrier.

In terms of Wisdom Panel, you can see the results for some colorpoint markers below:

Bastu's Wisdom Panel results are split among multiple PDF exports of webpages.  However, the other traits that can be viewed in the same part of the interface can be seen here.

After discussion with Basepaws, my understanding is that the specific C>T mutation is a well-characterized variant, also called the "Siamese" colorpoint variant (with Basepaws staff passing along the Lyons et al. 2005 publication via e-mail).

There are some additional details from using raw data here, including selection of felCat9 chromosomal location to use.

However, after those additional troubleshooting discussions, the alignment to the best matched region of the felCat9 reference genome (with data from my cat's first WGS sample) is shown below:

As a matter of personal preference, I think alignments similar to the one shown above most clearly show the lack of colorpoint variants (at this position).  However, you can also see that from the Amplicon-Seq data that was used for the report (even though these are Whole Genome Sequencing kits).  As a special exception, Basepaws provided a file that they also used to conclude that future reports should report 0 copies instead of 2 copies.

I have a version of that file imported into Excel with highlighted 100% sequence matches from the Amplicon-Seq data that I uploaded here.

I later realized that there is a 1 bp difference between felCat9 and the amplicon (the "T" at the end should have been a "C").  The strand of the amplicon is the opposite of the reported nucleotide, but the amplicon aligns to felCat9 on the positive strand.  In contrast, the amplicon aligns on the negative strand with respect to felCat8, but the difference is still at the end of the amplicon (which appears at the beginning of the alignment).

So, now, using the raw data, you can tell that the results in future Basepaws reports will need to be changed to reflect that my cat has 0 copies (instead of 2 copies) of the Colorpoint mutation.  The lack of mutations now matches her fur color pattern!

Of course, if you also have discrepancies between results from different sources, then you should also contact everybody who is relevant.

If this is a health marker (instead of a trait marker) and you can confirm the cause of the discordance in either direction for the same variant (either a false positive or a false negative), then I would also recommend that you send an e-mail to the Center for Veterinary Medicine at the FDA.  I believe that some technical knowledge is needed to upload data and it is most commonly used for human data, but I have confirmed that veterinary data for pets is allowed on precisionFDA.  Additional effort would be needed to configure an app for cat re-analysis, but all re-analysis of raw data is free through the precisionFDA interface.  So, some parts are arguably not ideal for the average consumer, but my point is that a framework to share such data exists, and having access to raw Whole Genome Sequencing data was important for resolving the troubleshooting described above.

Health and Trait Variant Calls

(for current Whole Genome Sequencing Samples)

    In my opinion, I believe that the availability of raw data is a unique advantage to Basepaws (at least if you order the Whole Genome Sequencing kit).  The process of returning that raw data took some follow-up on my part, but I don't know how many customers intentionally order the kit to be able to receive raw data.  I also preferred the fastq.gz formatting for the 1st WGS sample over the 2nd WGS sample, but I am not sure if that might further change in the future.

    So, to my surprise, the Whole Genome Sequencing data was currently not being used to call the colorpoint variant for my cat.

    My understanding is that this is also true for the other health and trait markers currently provided.  As far as I know, the Amplicon-Seq results are usually OK.  However, I hope that either changes or the use of different data types is made more clear in the report (even for the "Whole Genome Sequencing" kit), because I was expecting variant calls from the Whole Genome Sequencing data.

    There was a problem with using the Amplicon-Seq data for the colorpoint mutation, but my understanding is that programming problems to interpret the raw data were a factor.

    Even though Basepaws went back and generated a new Amplicon-Seq library from leftover genomic DNA for my 1st sample (before the Trait markers were offered), I will not be provided the raw Amplicon-Seq data for either sample.  My understanding is that this is due to the multiplex design that could be determined from the FASTQ files.  I don't think this is ideal.  However, as long as the data types are not mixed in one library, I am OK as long as I do receive the raw WGS data.  Again, if anything, I would argue the Whole Genome Sequencing (WGS) data appears to be the preferable way to at least evaluate the C>T "Siamese" colorpoint mutation.

Ancestry Results

(uses Whole Genome Sequencing Data)

You can see the results from the 2 Whole Genome Sequencing samples below:

There are some differences, but I think this is starting to get relatively closer to the limitations of my human results.  For example, I think my cat mostly has no Exotic purebred relatives (and 1 of 2 reports for the same cat have a 0% Exotic estimate). 

However, you can definitely tell that the higher coverage data helps.

Here are earlier plots comparing the low coverage Whole Genome Sequencing (lcWGS) results to my 1st Whole Genome Sequencing result:



You can clearly see additional problems in the ancestry results for the less expensive kits, with much lower coverage data used to impute genotypes.

I am not sure how much the relative revision of genomic reference sequence for the cat versus human may also matter, but I would guess that might have some impact on making a fraction of the genotypes harder to impute (and, therefore, the ancestry harder to predict).

This also gives me the impression that the new reports may only be providing the "possible" results (without the option to view the "confident" results that I prefer).  However, I will follow-up with Basepaws on that.  Either way, I would say the higher coverage data helped, but I think that is most clear with the "possible" confidence results.

Nevertheless, even at whatever is used as the current setting, you can see my cat has mostly Western and Polycat ancestry.  That is consistent with what I see from other companies and organizations for the same cat.

Eukaryotic Metagenomic Results

(can vary over time)

You can see the results from the 2 Whole Genome Sequencing samples below:

I am not currently sure what to think about this.

Dental Health Scores

(can vary over time)

You can see the results from the 2 Whole Genome Sequencing samples below:

This is something that I hope that I can develop more of an opinion about over the next year or so.  For example, I have started a GitHub discussion where I list some things that I would like to look into and I certainly welcome additional feedback.

Similarly, I have already started a subfolder on this topic on GitHub that contains data/results and my thoughts so far.

On one hand, Basepaws has increased my general awareness of dental health in my cat.  For example, I have started discussions with my vet, and I think that is good.

However, I do not yet have a complete opinion regarding the Dental Health Test results.  For example, I would not actively recommend the Dental Health Test, but I am also not currently confident that there is a problem with the results that I can clearly explain.

Nevertheless, there are a few things that I would like to point out or comment about:

1) As I understand it, I think my cat has some of the highest "High Risk" scores that are plotted.  However, after multiple discussions with the vet, my understanding is that I should describe my cat as being in very good dental health for a cat her age.

My cat has been started on a prescription dental diet, although my understanding is that is primarily a preventative measure given her age (and possibly my questions about dental health).  I will also take my cat in for the recommended yearly cleanings, and I believe that the vet confirmed that she doesn't currently need to come in more often.

Also, I believe that Basepaws has confirmed in an e-mail that the goal is to estimate risk and not to provide a diagnosis.  In other words, you can have a cat with a serious problem and low risk in the Basepaws Dental Health Test, and you can have a cat without a serious problem and high estimated risk.  My understanding is that my cat falls in the second category.  The point is that something else needs to be performed by the vet to determine if your cat has a problem.

2) I saw an e-mail from 6/8/2022 titled "See what saved this kitty's life..." with a link to a blog post (which I believe was posted on 7/12/2021).

My initial reading and what I understood upon re-reading were different.  Because the test was collected considerably before any problems were observed, I think that is less of a concern.

However, what could have concerned me is if somebody noticed their cat had bad breath, ordered a Basepaws Dental Health Test, and then went to see the vet after receiving results from Basepaws 6 weeks later.

In other words, theoretically, I think this could have increased the risk of harm if there was an urgent problem.  The reason is that the Basepaws Dental Health Test can't tell you if your cat has a problem or not, and I would say you should start the process of trying to see your vet as soon as you notice a problem (not after waiting for results that may take 6 weeks or more).  I could imagine a vet wanting to see if a symptom like bad breath persisted for some time before suggesting scheduling an appointment to bring the cat into the vet.  However, I would guess this waiting time might be more on the scale of days instead of more than a month.

For example, my understanding is the other cat was at "medium" risk in all 3 categories, which I would interpret as meaning that cat was at lower risk than my cat (2 "high risk" scores and 1 "low risk" score for bad breath).  However, my cat had no serious dental health problems, and this cat that I would consider to be at lower risk had a serious dental health problem.

I also had a question of "medium" risk versus "average" risk, but I have moved that to a footnote [1].

So, I would at least consider bad breath a reason to call the vet and ask if the cat should come in.  I would not order a Basepaws Dental Health Test because my cat has bad breath (or any other novel symptom causing me concerns), and then wait for the results before scheduling an appointment with the vet.  I have also confirmed that Basepaws agrees you should not delay seeing your vet in order to obtain a Basepaws Dental Health Test result when you see a worrying symptom.

To be fair, I believe that I have seen at least one cat diagnosed to have a dental health problem with 3 "high risk" scores on the Facebook Basepaws Cat Club.  So, I hope that is helpful in getting an overall sense of the results.  However, I think the above examples are also important to take into consideration.

3) I think a well-characterized disease variant that has a second validation from the clinic could be different.  However, for the reasons discussed immediately above, I would have preferred that the Dental Health Risk scores were not added to my cat's medical record.

However, I think the local vets have started to get to know my cat as an individual patient, so I would guess this is probably not an issue unless I had to see another vet (and that vet gave too much emphasis to those scores).

4) I remember previously reading about critiques from professors of veterinary medicine regarding the Basepaws Dental Health test in this Los Angeles Times article.  I have also added some comments on the preprint for the Basepaws Dental Health test, and I hope that others with relevant experience will do so as well (and I hope that there can be responses to comments from others).

However, again, I would certainly like to learn more!  You can either provide feedback as a comment, or there is a "discussion" enabled on the GitHub page.


[1] When I asked about what represented "average" risk, Basepaws provided a link to the Dental Health Test whitepaper.  This is similar but not identical to the preprint.  So, I don't have a direct answer.  However, I can say that the preprint and the tables here were helpful in having discussions with my vet.

For example, I added an additional comment to the preprint regarding the positive predictive value estimation after talking to the vet.

The numbers in the whitepaper are slightly different (for example, there are 441 periodontal disease cats in the whitepaper and 570 periodontal disease cats in the preprint).  Nevertheless, if I use the preprint numbers for the combined high/medium group, then I estimated that the periodontal disease score might have a positive predictive value of roughly 50% and the tooth resorption score might have a positive predictive value of roughly 20%.
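The positive predictive value calculation itself is straightforward once a confusion matrix is in hand.  The counts below are made-up placeholders for illustration, not the actual numbers from the preprint or whitepaper:

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): of the cats flagged as high/medium risk,
    the fraction that actually have the condition."""
    return true_positives / (true_positives + false_positives)

# Hypothetical illustration (NOT the actual preprint counts):
# if 50 flagged cats truly had periodontal disease and 50 did not,
# the PPV would be 0.5
```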

In every calculation that I attempt, the positive predictive value is noticeably lower for tooth resorption than periodontal disease.

For periodontal disease, I think the prevalence estimates and stage of the disease are important, but I wonder if it might help to pose the question of whether something different than the standard recommended care should take place.  My current thought is to favor the vet's individual assessment (as independent from any estimate as possible) over the risk estimate, but I think conversations with your vet are worthwhile.  Basepaws clearly indicates the risk estimates should not be used as a diagnostic for dental health.

Change Log:
7/2/2022 - public post date.
7/3/2022 - add link to Dental Health Test GitHub discussion; also, as I went back to the possible confusion matrices that I created from the preprint to attempt to estimate the positive predictive value, I decided to change wording for describing "medium" risk versus "average" risk as I look into the topic more; minor changes.
7/4/2022 - minor corrections; revise description of timing for other cat's dental health test.
7/7/2022 - update post after receiving e-mail response from Basepaws
7/10/2022 - add date to separate Basepaws e-mail from Basepaws blog post
7/16/2022 - formatting changes
7/17/2022 - after looking at 59 posted Dental Health Test reports, switch to only using the higher positive predictive value estimate.  The score distribution is different than the true prevalence, but that made me feel a little better about the possible positive predictive value.
7/20/2022 - minor changes + change footnote content
7/23/2022 - add additional UC-Davis VGL colorpoint link + minor formatting changes; add description of BLAT result with a minor difference from what I originally thought (mismatch at end, but not within BLAT hit); add link to product towards the beginning of the blog post.
7/24/2022 - re-arrange colorpoint background and provide photo of Bastu.
7/31/2022 - minor change to make Amplicon-Seq link earlier to find.
8/8/2022 - add link to Basepaws blog post acknowledging the fix.
8/14/2022 - add links to raw data for 2nd WGS sample (in format provided by Basepaws); update/modify section related to raw WGS data return and use of WGS data in reports.
8/16/2022 - formatting changes to make 2 hyperlinks easier to find.
8/23/2022 - upload alternative screenshot for Basepaws Siamese Coat Color result that is incorrect in the v4.0 report.
8/27/2022 - minor formatting changes
8/28/2022 - change tense for Basepaws correction, but also add content to explain how I have different v4.0 reports downloaded at different dates that are both before and after the "C>T" colorpoint variant correction; also, add links to reports that are downloaded more recently (with the corrected variant result).

Sunday, December 12, 2021

Human Metagenomics Comp 2021/2022: Overall Summary

Inspired by an earlier post that I don't believe is currently accessible (from several years ago), I tested collecting stool samples at the same time for multiple companies.  This sometimes includes the same sample submitted to the same company at the same time.

The sample collection can be summarized as follows:

  • Stool 1: 1 Gene & GutBiome, 1 GutBiome+, 2 samples
  • Stool 2: 1 GutBiome+, 1 sample, 1 sample
  • Stool 3: 1 GutBiome+, 2 samples, 1 sample
  • Stool 4: 1 GutBiome+, 1 sample, 1 sample
  • Stool 5: 1 Kean Gut, 1 Ombre
  • Stool 6: 1 Ombre, 1 sample

There are additional details (including the reports and data) on GitHub.

The sequencing performed can be described as follows:

Psomagen GutBiome (within combined "Gene & GutBiome"): 16S (V3+V4 region, PE300 reads)

Kean Gut: 16S (V3+V4 region, PE300 reads)

thryve/Ombre: 16S (V4 region, PE150 reads)

Psomagen GutBiome+: "Shotgun" Metagenomics (DNA-Seq, PE150 reads)

Kean GutBiome+: "Shotgun" Metagenomics (DNA-Seq, PE150 reads)

Viome: Metatranscriptomics (RNA-Seq, unknown read length)

As verified by Ombre technical support, you can view an alignment from my 5th paired sample here.  The other V4 sequences were downloaded from the GitHub repository for code associated with Johnson et al. 2019.

I have a couple of paired posts (here and here), but I hope this can provide a fairly good summary of my overall experiences:

Individual assignments are not provided within the web interface for Kean Gut, but assignments can be made by re-analyzing the raw data, and you can see some of that analysis with mothur here and Kraken2/Bracken here.
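The Kraken2/Bracken re-analysis follows the standard two-step pattern: classify reads against a database with Kraken2, then re-estimate abundances at a chosen taxonomic level with Bracken.  A sketch of the command lines as Python lists (the database path and output naming convention here are hypothetical; the exact parameters I used are in the linked analysis):

```python
def kraken2_cmd(r1, r2, db, sample):
    """Build a Kraken2 command to classify paired-end reads, writing a
    per-read output file and a summary report (Bracken consumes the report)."""
    return ["kraken2", "--db", db, "--paired", r1, r2,
            "--report", f"{sample}.k2report",
            "--output", f"{sample}.k2out"]

def bracken_cmd(db, sample, level="G"):
    """Build a Bracken command to re-estimate abundances from the Kraken2
    report at one taxonomic level (here "G" for genus)."""
    return ["bracken", "-d", db, "-i", f"{sample}.k2report",
            "-o", f"{sample}.bracken", "-l", level]
```

These lists can be passed directly to subprocess.run(..., check=True), or joined into shell commands for a batch script.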

Raw Data Return:

1) thryve/Ombre - automatically provides raw FASTQ files as well as table with classifications at various levels

2) Psomagen/Kean - provides FASTQ files if e-mailed (but not automatically)

3) Viome - does not currently provide raw data (even if e-mailed) and classifications have discrete assignments (not percentages of reads)

When raw data was available (either automatically or by request), I uploaded the data to public links on Google Cloud.  You can see a table of files to download here.

The samples and companies / organizations are different.  However, if it might help to see metagenomics samples that were collected before any of the samples described in this blog post, then you can see links to download raw data here.

Post-Collection Bacterial Growth Suppression:

1) thryve/Ombre - A liquid is included, but I have not yet verified the contents of that liquid.

2) Viome - liquid with preservative ("[bacteria] are not being killed nor growing").  I am not 100% sure about the implications for using RNA to study "active" bacteria, but I think it should help over adding nothing.

3) Psomagen/Kean - no liquid in collection tube to prevent / suppress bacterial growth

I only collected 1 BIOHM sample (10/6/2021), but there was no liquid (and therefore nothing to prevent / suppress post-collection bacterial growth).

Sample Collection Options:

1) Viome - originally provided 2 sizes of stool collector, but I think this is now reduced to 1 stool collector.

2a) Psomagen - 1 stool collector

2b) Kean -  tissue paper (0 stool collectors, if collected by itself)

3) thryve/Ombre - tissue paper (0 stool collectors, if collected by itself)


Cost:

1) Psomagen - $149 (with discounts - I paid $83.49, including $8.99 shipping, for 1 of my samples)

Kean splits options into ability to purchase separate Gut Health (for $99) and Gut+ Health (for $169)

2) thryve/Ombre - $199 (with discounts - I paid $99 before taxes, with free shipping, for at least 1 of my samples)

If you count the re-test discount (and that is still offered through Ombre), then I paid as low as $74.32 for 1 kit (with taxes).

3) Viome -  $299 (with discounts - I paid $129 before taxes, with free shipping, for at least 1 of my samples)

Result Turn-Around Time:

I think there is a limitation or lowest ranking for each company, depending upon whether you define "turn-around time" for the kit, the results, the raw data, or for answering questions.

If it takes weeks to receive results that I think should mostly be considered hypotheses, then I don't think the slightly faster arrival of materials is really helping very much.  However, I consider returning raw data for re-analysis and answering questions from consumers to be very important.

Robustness of Results:

1) thryve/Ombre

2) Psomagen/Kean

3) Viome - noticeable variability in results for samples collected at the same time

The explanation here is somewhat complicated.

As explained in a paired post, the signature/scores for collections from the same sample were better for thryve than Viome.  There are also some extra examples related to variability in the Viome results in this other paired post, but that is only for Viome.

Viome does not provide raw data and the data collected is different. So, only 2 signature/scores were available for thryve.  The Psomagen Gene & GutBiome kit used a different library design than the Psomagen GutBiome+ kit, so I don't have the replicates from the same sample that I intended.

There is also some information that I manually extracted from the 3 reports here (as well as on the highest level subfolder).  As mentioned earlier, you can also see some re-analysis of raw data (for Psomagen/Kean and thryve/Ombre, here and here).

Overall, I submitted an FDA MedWatch report for Viome (for the currently commercially available tests described in this blog post), where the full draft is available to view here.  You can see the de-identified version in the MAUDE database under MW5106218.

To be fair, thryve/Ombre also have some food predictions that I would not place too much emphasis on.  However, Viome clearly has less consistency than the thryve/Ombre (positive and negative) recommendations.  If you download the thryve/Ombre PDF summary tables, then you should be able to access the links to the reports on Google Cloud (because they were too large for GitHub).

If you specifically purchase Kean Gut+, then there are some food recommendations.  I have not made changes based upon the results from any company, and I am not making any changes based upon any specific feedback from Kean either.  However, there were at least no recommendations to avoid eating food that I already find helpful or favorable.  I think the food recommendations also seem like mostly good ideas, regardless of any particular metagenomic result.  On the other hand, I am not certain what to say about the supplement recommendations for either Kean Gut or Kean Gut+.

A Note About Probiotic Detection:

The reason that I ordered a Viome sample for my 6th (but not 5th) paired sample is that I better understood the difference between the Kean Gut and Kean Gut+ kits, and I wondered if it was possibly useful to have a paired sample with untargeted DNA-Seq (for Kean Gut+) along with a Viome sample.

Viome still does not provide raw data, and I believe they only provide discrete assignments.  The Viome interface has changed recently, but I was still able to see those in a PDF file when I used "Share My Results" to e-mail the report to myself (under "Scores," and then at the bottom of the details for one of the scores).

Nevertheless, I could compare Kean Gut+ assignments against the "Active" status for Viome, and most assignments matched in that sense (everything above 1% listed in the table on this page matches).
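That comparison boils down to a simple presence check: for each taxon above an abundance threshold in the shotgun (Kean Gut+) results, see whether the discrete panel (the Viome "Active" list) also reported it.  A minimal sketch (the taxon names in the usage example are illustrative, not my actual results):

```python
def compare_calls(shotgun_fractions, active_taxa, threshold=0.01):
    """Return {taxon: True/False} for every taxon whose shotgun read
    fraction exceeds the threshold, where True means the discrete
    panel also reported that taxon."""
    active = set(active_taxa)
    return {taxon: (taxon in active)
            for taxon, fraction in shotgun_fractions.items()
            if fraction > threshold}
```

For example, compare_calls({"Bacteroides": 0.30, "RareGenus": 0.001}, ["Bacteroides"]) only tests Bacteroides, since RareGenus falls below the 1% threshold.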

The one exception where I have both Kean Gut and Kean Gut+ information for individual bacteria is for probiotics.  The Lactobacillus entry was copied over from earlier collections, but I thought it was interesting that Ombre and Viome reported detection when both Kean Gut and Kean Gut+ reported a lack of detection.

I thought this was somewhat interesting because I have Activia with lunch Monday through Friday.  The Wikipedia page mentions that other common probiotics are also present, but Bifidobacterium is mentioned as something specific to Activia.  So, I went back to check the original reports and update the GitHub table for the 6th paired collection, and all 3 companies report Bifidobacterium detection.

There are non-zero mothur read fraction assignments for Bifidobacterium for all samples except Psomagen for the 3rd paired collection (Psomagen3), and I can tell that Kraken2/Bracken is also capable of making those assignments at the genus level.  I don't think this indicates that what is present is specifically the proprietary strain, but I might guess that eating yogurt on a regular basis has some level of contribution.  I also don't think the bacteria has to be that proprietary strain in order to be helpful.
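Those genus-level read fractions can be pulled out of the tab-delimited Bracken output tables with a short parser.  This sketch assumes the standard Bracken column names (including "name" and "fraction_total_reads"), and the file name in the usage note is hypothetical:

```python
import csv

def genus_fractions(bracken_path):
    """Parse a genus-level Bracken table into {genus: fraction of total reads}."""
    fractions = {}
    with open(bracken_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            fractions[row["name"]] = float(row["fraction_total_reads"])
    return fractions
```

For example, genus_fractions("Psomagen3.bracken").get("Bifidobacterium", 0.0) would give the genus-level read fraction, returning 0.0 when the genus received no assignments.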

Closing Thoughts:

In general, my opinion is that it is better to focus on a smaller number of things that are truly predictive than to attempt to make a large number of claims (and have a lot of them not be valid).  I also think having access to raw data for re-analysis is very important.

I selected Psomagen because it was the company that purchased assets from uBiome.  However, there are notable differences in what Psomagen provides (such as no longer using a liquid to prevent post-collection bacterial growth).  Even though there were other ethical problems for uBiome, I think stopping post-collection growth was a good idea.  I also previously had the ability to sample multiple sites with uBiome, but that is not currently an option.  So, I don't think these company purchases/acquisitions necessarily mean that you can expect the same product.

I think I might have purchased a Viome kit because of an advertisement, but I believe that I am less likely to make future Viome purchases (if similar to what is currently provided).  I also do not plan to make additional BIOHM purchases.  My overall impression for Psomagen/Kean was somewhere in the middle, but I would guess that I am most likely to purchase Ombre in the future.  For example, I posted Trustpilot reviews for all 3 companies: 4 stars for Ombre, 3 stars for Kean, and 2 stars for Viome.  However, to be clear, that is largely because of the automatic return of the raw data and the presence of some sort of liquid that I assume helps reduce post-collection growth.  That does not mean that I like or approve of everything within the Ombre results.

For example, I am not sure if I support the Ombre supplement recommendations, and I don't remember seeing anything that I considered particularly helpful among the food recommendations.

From a technical standpoint, my critiques were mostly for Viome.  Of course, anybody can submit FDA MedWatch reports for a technical issue with a diagnostic (as I did).  However, if other consumers do plan on taking supplements recommended by the company, then I hope that they use FDA MedWatch if they encounter any adverse events.  I did this for an earlier human genomic test, but I am hesitant to try recommendations from the multiple other companies.  So, I hope that customers for all 3 companies are aware of resources like this and take the time to provide important feedback as relevant.

Unfortunately, I don't think I have enough specialized background to gauge the relative importance of various options for specific diseases (such as those offered by a given company, versus other options that might even be free).  So, if you are well aware of the common best practices (as either a physician or possibly as a patient with a chronic condition), then please provide feedback if you notice any claims that either over-emphasize the benefit of a supplement from a given company or do not adequately describe other available/established options.  I think there are reporting systems for advertising, but I am not sure if those can be used by somebody reporting on another person's behalf.  I think critical evaluation of claims is important, and I think e-mails to the company and discussion with your general or specialized physician are probably a good starting place.

Again, I did not make any changes based upon any results from any of the companies.  I only used this for research purposes to gain a better appreciation for the bacterial metagenomic analysis, which would allow me to do things like study microbiome changes over time.  In the future, I might also make changes based upon independent physician advice and see what happens in my metagenome (mostly as a matter of curiosity), but I consider that different than starting from recommendations based upon the metagenomic data.

I hope others find this helpful, and I would certainly encourage feedback! I think the blog post comments section has been OK in the past.  However, if there is any possible benefit to using the GitHub discussion (which can include images and code), then that is also an option.

Change Log:
12/12/2021 - public post date
12/13/2021 - various wording revisions/corrections
12/14/2021 - change wording for last change log entry; additional revisions
12/19/2021 - add Viome response + additional links
12/29/2021 - thryve/Ombre food recommendations
1/22/2022 - add FDA MedWatch report
5/2/2022 - change title and add additional sample information
6/18/2022 - after receiving results from all companies, add collection date for 6th sample.
6/19/2022 - add links for raw data
6/27/2022 - revise closing thoughts and add some additional details (such as re-analysis and Bifidobacterium detection)
6/28/2022 - add link to GitHub discussion
7/16/2022 - add Trustpilot review links, and re-arrange content.

For example, I have moved the content below out of the main post for brevity, but I am keeping the comments below for reference:

When I asked Viome for feedback regarding the extra discordance that I believe I saw in my samples (including those collected at the same time), I was directed to this page.  I am not sure if that reflects the assessments currently being provided to consumers.  However, if that represents the response, then I think it is important to emphasize that the "FDA Breakthrough Device Designation" is for a different application that I believe is not available to consumers, and it does not mean the test is FDA approved (matching the bar chart in the provided link, if you look carefully).  This was an issue that I separately reported for regulatory misconduct, with a draft available here as a follow-up from discussion with other FDA staff who informed me about the regulatory misconduct reporting system.  That said, I wish to emphasize that it is important that each claim be evaluated independently (within and between tests).


I believe this article references low consumer reviews for Viome.  As of 12/19/2021, I see 1.28 out of 5 from the BBB, and 3.2 out of 5 from Trustpilot.
Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.