Charles Warden's Science Blog

Sunday, June 26, 2011

Review of Biopunk

Biopunk is a book discussing biological research that isn't conducted in a traditional research setting (like an academic lab or a pharmaceutical company). The book covers a wide variety of topics such as a philosophical discussion about what motivates good scientists, how legal and political decisions affect scientific progress, and recent developments in the field of "DIY bio" (where the book mostly focuses on personalized medicine and synthetic biology). Throughout the book, Wohlsen also provides several cool factoids, like the Bridges of Cherrapunji that are engineered from living tree roots.

One chapter focuses on DTC genetic testing, where Wohlsen provides both an overview of this industry and accounts of individuals who have utilized DTC testing. For example, Raymond McCauley conducted his own DIY bio research on metabolites in his own blood in order to try and better understand his 23andMe result indicating an increased risk for macular degeneration. Although Wohlsen acknowledges "McCauley did not hesitate to concede that the results do not show anything conclusive," I think this is a very cool example of how DIY Bio can help inquisitive scientists try to learn more about themselves outside a formal research setting.

My subsequent research on Raymond McCauley also led me to learn more about DIYgenomics.org, which provides tools to help users further analyze their 23andMe data for health risk, drug response, and athletic performance for individual SNPs. In some ways, this reminded me of the new, free Interpretome tool, but Interpretome can load my 23andMe data more quickly and with a more streamlined interface. Nevertheless, I think it's good to know that this option is out there.

There were also a few aspects of the book that disappointed me. For example, many accounts of biopunk research seem to focus more on buying used lab equipment off craigslist or eBay than on new technological developments that can help democratize research. It also seemed like a lot of the "biopunks" were pretty well-educated and not necessarily good examples of what I would consider amateur scientific research. Also, I was somewhat disappointed at how difficult it was to find additional information on some of the start-ups / organizations that were mentioned in the book (which has only been out for a few months).

For example, the chapter "Cancer Kitchen" discusses how John Schloendorn and Eri Gentry studied the role that the immune system played in cancer using Schloendorn's own cancer cells, which led to the creation of a DIY nonprofit called Livly to develop cancer immunotherapies (and Gentry later co-founded BioCurious, another DIY nonprofit). However, the Livly website described in the book is no longer hosted on the internet (the old url, provided on the Livly facebook page, now links to an unrelated website). Likewise, BioCurious only seems to have a facebook page with limited information. Even with limited funding, the company can at least create a free Google Sites website (like my personal website) in order to more effectively convey information about the company.

I was also very interested in learning more about the Pink Army Cooperative (a DIY drug company attempting to deliver personalized treatments for breast cancer). This time, I was able to find a generally well-designed and informative website, but I couldn't find much information about concrete research accomplishments (to be fair though, Wohlsen does warn readers that "so far, Pink Army is more a concept than an actual co-op").

Although it was frustrating that I couldn't learn much more about these specific non-profits, Biopunk has successfully encouraged me to learn more about the DIY bio movement. Who knows, maybe I'll even stop by a meeting for my local DIYbio chapter!

Monday, May 16, 2011

Modeling Bimodal Gene Expression

Since it is often challenging to estimate parameters for mixture models (such as those used to model bimodal gene expression), I thought it might be useful to discuss some of my successes using non-linear least squares (NLS) regression to model bimodal gene expression.

Many scientists use maximum likelihood estimation (MLE) to model bimodal gene expression (such as Lim et al. 2002, Fan et al. 2005, Mason et al. 2011, etc.). My MLE model is based on the code provided in this discussion thread, so I used the mle function from the stats4 package (which is a wrapper for the standard optim function).
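As a rough illustration of fitting a two-component mixture by maximum likelihood, here is a minimal Python sketch. It is not the R code used for this post: instead of calling mle/optim, it uses the EM algorithm, which is a different route to maximizing the same mixture likelihood. The function names, the simulated means (5 and 9), the shared standard deviation of 1, and the starting values are all assumptions made for the example.

```python
import math
import random


def simulate_bimodal(n_low, n_high, seed=0):
    """Simulate expression values from two normal components (assumed parameters)."""
    rng = random.Random(seed)
    low = [rng.gauss(5.0, 1.0) for _ in range(n_low)]    # baseline expression
    high = [rng.gauss(9.0, 1.0) for _ in range(n_high)]  # over-expression
    return low + high


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture_em(values, n_iter=200):
    """Maximum-likelihood fit of a two-component normal mixture with a shared SD."""
    lo, hi = min(values), max(values)
    # Crude starting values (an assumption): means at the quarter points of the range.
    mu_low = lo + 0.25 * (hi - lo)
    mu_high = lo + 0.75 * (hi - lo)
    sd = (hi - lo) / 4.0
    prop_high = 0.5
    n = len(values)
    for _ in range(n_iter):
        # E-step: probability that each value came from the higher-mean component.
        resp = []
        for x in values:
            a = (1.0 - prop_high) * normal_pdf(x, mu_low, sd)
            b = prop_high * normal_pdf(x, mu_high, sd)
            resp.append(b / (a + b))
        # M-step: weighted means, pooled variance, and mixing proportion.
        w_high = sum(resp)
        w_low = n - w_high
        mu_low = sum((1.0 - r) * x for r, x in zip(resp, values)) / w_low
        mu_high = sum(r * x for r, x in zip(resp, values)) / w_high
        var = sum((1.0 - r) * (x - mu_low) ** 2 + r * (x - mu_high) ** 2
                  for r, x in zip(resp, values)) / n
        sd = math.sqrt(var)
        prop_high = w_high / n
    return {"mu_low": mu_low, "mu_high": mu_high, "sd": sd, "prop_high": prop_high}


if __name__ == "__main__":
    data = simulate_bimodal(640, 360, seed=1)  # 36% over-expressed by construction
    print(fit_mixture_em(data))
```

The value of interest is prop_high, the estimated fraction of samples drawn from the higher-mean component. Like any iterative fit, this can behave poorly when the two modes overlap heavily or the starting values are far off.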

On simulated data, both the MLE and NLS models estimate 37% of samples show over-expression (i.e. come from the distribution with the higher mean), which is very close to the true value of 36%:

Simulated Data

However, I have found that the mle function often returns error messages when working with real data, and models built using the nls function tend to fit the data better than the MLE estimates. Even on the simulated data above, the NLS model appears to provide an ever so slightly better fit for the data (as might be expected because it directly models the density function).

In order to illustrate my point, I have also analyzed some genes exhibiting bimodal gene expression (as identified by Mason et al. 2011) in the GEO dataseries GSE13070. For example, here is a gene that I could model relatively well with NLS regression whereas I simply couldn't produce an MLE model:

Of course, both of these tools have their limitations. For example, the data has to have a pretty clean bimodal distribution (here are 3 examples of distributions that couldn't be modeled using either method: ACTIN3, ERAP2, and MAOA (different probe)). For the NLS model, I also had to set the variance to be equal for the two samples in order to produce a reasonable estimate of over-expression, but I believe this is usually a safe assumption.
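To make the least-squares idea concrete, here is a minimal Python sketch that fits a two-component normal mixture with one shared standard deviation directly to an estimated density. It is only an illustration under stated assumptions, not the R code behind this post: the density is estimated with a simple 30-bin histogram (the post does not say how the density was estimated), and a basic coordinate pattern search stands in for the Gauss-Newton optimizer inside R's nls function. All function names and simulation parameters are hypothetical.

```python
import math
import random


def simulate_bimodal(n_low, n_high, seed=0):
    """Simulate expression values from two normal components (assumed parameters)."""
    rng = random.Random(seed)
    return ([rng.gauss(5.0, 1.0) for _ in range(n_low)]
            + [rng.gauss(9.0, 1.0) for _ in range(n_high)])


def mixture_density(x, mu_low, mu_high, sd, prop_high):
    """Two-component normal mixture density with one shared SD."""
    def pdf(mean):
        z = (x - mean) / sd
        return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    return (1.0 - prop_high) * pdf(mu_low) + prop_high * pdf(mu_high)


def histogram_density(values, n_bins=30):
    """Return (bin centers, density heights) for an equal-width histogram."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    heights = [c / (len(values) * width) for c in counts]
    return centers, heights


def sum_sq(params, centers, heights):
    """Sum of squared residuals between histogram heights and the mixture curve."""
    mu_low, mu_high, sd, prop_high = params
    if sd <= 0 or not 0.0 < prop_high < 1.0 or mu_low >= mu_high:
        return float("inf")  # reject invalid parameter combinations
    return sum((h - mixture_density(c, mu_low, mu_high, sd, prop_high)) ** 2
               for c, h in zip(centers, heights))


def fit_density_least_squares(values, n_bins=30):
    centers, heights = histogram_density(values, n_bins)
    lo, hi = min(values), max(values)
    span = hi - lo
    # Starting values and step sizes are assumptions chosen for the example.
    params = [lo + 0.25 * span, lo + 0.75 * span, span / 6.0, 0.5]
    steps = [span / 10.0, span / 10.0, span / 20.0, 0.1]
    best = sum_sq(params, centers, heights)
    while max(steps) > 1e-6:
        improved = False
        for i in range(4):
            for direction in (1.0, -1.0):
                trial = list(params)
                trial[i] += direction * steps[i]
                score = sum_sq(trial, centers, heights)
                if score < best:
                    params, best, improved = trial, score, True
        if not improved:
            steps = [s / 2.0 for s in steps]  # shrink the search when stuck
    return dict(zip(("mu_low", "mu_high", "sd", "prop_high"), params))


if __name__ == "__main__":
    data = simulate_bimodal(1280, 720, seed=2)  # 36% over-expressed by construction
    print(fit_density_least_squares(data))
```

Sharing a single sd between the two components mirrors the equal-variance constraint described above: with one fewer free parameter, the fit is less likely to let one wide component absorb both modes.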

Although I do not present the data in this blog post (because it contains unpublished results), I have also found NLS regression to be useful on other genes in several other datasets, so I know NLS regression works well with more than just the one gene that I show above.

Also, for those that are interested, here is the source code that I used to produce all of the above figures.

Sunday, April 17, 2011

How and Why the FDA Should Allow DTC Genetic Testing

At the beginning of this month, the FDA extended the period to submit public comments about Direct-To-Consumer (DTC) genetic testing to the Molecular and Clinical Genetics Panel of the Medical Devices Advisory Committee (referencing docket ID FDA-2011-N-0066 at http://www.regulations.gov). For more information on this topic, please check out this post from The Spittoon (the official blog for 23andMe).

I just submitted a comment to the FDA (which is essentially a shortened version of this blog post). You can currently view my comment here using Google Docs, but I do not currently see the posting on http://www.regulations.gov (I will work on verifying that the comment was successfully uploaded). In fact, there were only a few comments posted after the original 3/1/2011 deadline, and I do not see any new comments posted after the extension of the comment period that occurred on 4/1/2011. If you have not already done so, please submit a comment before the new deadline on May 1st!

In many ways, I think DTC genetic testing companies are similar to medical websites like WebMD (which is an idea I first remember seeing in this blog post comment). I personally think it would be a great disservice to society if websites like WebMD, Mayo Clinic, and MedlinePlus were banned because they provide medical advice to the public without consultation with a physician. Likewise, I also think it is very important that people be able to learn about their own genetic information without having to consult a physician (although I would certainly encourage people to seek advice from medical experts if they feel the need to do so). Although not all doctors agree that patients should have access to DTC genetic tests, there are also some doctors that dislike WebMD. I do not believe this is a valid reason to ban either type of medical information.

I want to emphasize that I do not oppose any sort of FDA regulation. For example, companies that intentionally mislead people should be penalized (as one example, check out this post on My Gene Profile by Daniel MacArthur). However, I do not think the FDA should ban companies that are transparent in their actions and are basing their analysis on published, peer-reviewed scientific research.

I think there is sufficient evidence to show that most people will have reasonable reactions to their results (for example, check out this research article in New England Journal of Medicine). However, I think it might be helpful for the FDA to help classify which test results clearly require medical action and which ones are "research" grade tools that connect people with findings in the medical literature. In an earlier post, I discussed how a "3-tier system" might be able to help accomplish this. Essentially, we currently do have "clinical" tests and "research" tests, but I think formalizing some sort of system to distinguish between such tests could be useful (especially if it helps provide a way to maintain DTC genetic testing without the need to require physicians to act as gatekeepers for this information, or if it prevents these tests from being outright banned).

In general, I think it is important for individuals to have access to a variety of opinions in order to think critically when making medical decisions. It is not good to blindly trust any source of information - whether that information comes from a doctor, a DTC genetic testing agency, a government regulatory agency, or a scientist (like myself). I strongly believe that people should have access to second opinions about their genetic tests (through tools like Promethease).

I think the FDA could also potentially help improve genetic testing (for both DTC and non-DTC tests) by helping provide people access to secondary sources of information. For example, I think it would be fine for the FDA to force companies to allow users to export their data in a standard format in order to allow people to easily get second opinions about their genetic testing results. Strictly speaking, I don't think this is necessary - for example, there are 3rd party web apps that help users learn more about their 23andMe results (such as this Firefox app), and Promethease already helps users search for annotations from SNPedia (although there is a $2 fee if you want your results quickly). However, I don't think it would hurt to have a standard format that applies to all genetic testing companies.

In fact, a standard format for sequence data from genetic tests could provide a useful framework for a collaboration between the FDA and NIH to fund development of a free tool for people to analyze their genetic information. For example, MedlinePlus is an excellent resource provided by the NIH, and I think it could be really cool if the FDA would work with the NIH to help people analyze their genetic information similar to the way MedlinePlus provides traditional medical advice. If such a collaboration were to take place, then I think it would also be fair to require genetic testing companies to provide links to this 3rd party tool (as well as other tools, if they choose to do so). This could help people think more critically about their results, and I think it could also be a good way to fund research on how to best convey genomics research to the general public and incorporate publicly available data into a single risk assessment provided by this free, 3rd party tool.

Update (6/20/2020): I wrote this post before I started adding change log entries. However, I added a note because my opinions have shifted somewhat since I originally wrote this blog post. For example, you can see several FDA MedWatch reports that I have submitted within the collection of posts linked here.

Essentially, I think I have a better appreciation for the harm that can be caused if a result is rushed to the public, especially if information is distributed to a large number of people (such as 10,000s or 100,000s of customers). I still believe that situations where something is partially effective but still works better than a placebo (or has non-trivial predictive power) should be thought of differently than highly effective solutions or completely ineffective solutions. Indeed, you can see some non-genomic reports in my PatientsLikeMe post, at least one of which I also submitted as an FDA MedWatch report for side effects.

I think setting the right expectations can help, but I thought the problems that I didn’t notice before were sufficiently important that I needed to add something to this post.

I also think it is important that genomic risk calculations can be validated in independent cohorts, which makes transparency and on-going quality assessment important. For example, I still believe that publicly available information is important, which means that you can have access to it with or without a physician. Even if there are consent limitations that require controlled access (or prohibit carrying out the experiment in the first place), I think maximizing specialist access to raw data (with accurate documentation of data sharing) is still important. If you look at the cystic fibrosis post, you can see that free and open feedback from a Biostars discussion helped with re-analysis of my raw data.
