Tuesday, October 15, 2019

What are the Expectations for Individuals with an MS in Bioinformatics versus a PhD in Genomics?

I currently have a Master of Arts (MA) in Molecular Biology from Princeton, which is a weird degree that you can only get when you drop out of a PhD program.  When people ask about my background, I explain that I took the coursework for the Quantitative and Computational Biology track (before it was a separate degree), and I have been working as a Bioinformatics Specialist for a combined total of ~8.5 years at City of Hope (as well as other genomics experience in training programs).

I took extra math and computer science classes when I was an undergrad at Georgia Tech (where I graduated with a BS in Applied Biology), as well as extra hours beyond my degree requirements.  As I mention towards the end of my fRNA comment, my previous coursework had decent overlap with the current BS/MS Bioinformatics degree.  When I was enrolled in the Bioinformatics PhD program at Michigan, I believe that I was 1 semester away from being able to earn an MS in Bioinformatics degree (although it was decided I couldn't use my PhD fellowship to cover that coursework, and I didn't think it was worth staying for the MS degree since I already have a Master's degree).

Even if it is decided that I need to go back to school, I am not planning on going back to Princeton (even though it would match the overall question of whether an MS in Bioinformatics or a PhD in Genomics / Molecular Biology is more appropriate for me).  One reason is that I don't think a PhD from Princeton is the best way to represent my ability to handle new topics of research.  The weights of the factors for leaving the Bioinformatics PhD program at Michigan were a little different than for leaving the Princeton PhD program: I didn't think the responsibilities of a Bioinformatics PhD could be handled with an appropriate balance in work and personal life (although I think my Michigan GPA for the MS+PhD coursework so far was fairly good, closer to Georgia Tech than Princeton).

In other words, from Twitter conversations, I question my ability to consistently have the same frequent, precise insight as some popular scientists with PhDs (whom I assume used fewer drafts and/or less time per post).  I also have my Standardized Test Scores on LinkedIn (which I also mention in this Twitter response).  However, I believe that I sometimes have important experiences to communicate, and I am wondering if there are niches with a little less responsibility that still require a PhD over a Master's degree (and perhaps a friendly debate about whether that is right or wrong).

So, now with that background, these are the general questions that I thought might be of broader interest:

1) What are the differences in what can be accomplished by somebody with an MS in Bioinformatics or a PhD in Genomics (or "Biology" with a computational emphasis)?

Importantly, these are not mutually exclusive.  Indeed, in terms of giving advice to others, my own advice that I would give myself in retrospect is to first earn the Master of Science (in Bioinformatics) and then make an independent decision about a PhD.

That said, there is a difference between an MS in Bioinformatics, a PhD in Bioinformatics, and a PhD in Genomics.  For example, I had not previously appreciated the difference between having a PhD in Bioinformatics versus a PhD in Genetics/Genomics, but I did have some clues in terms of getting B grades in Mathematical Statistics and Differential Equations, as well as withdrawing with a "W" from my Graph Theory course.  So, sometimes problems can be hiding in plain sight, even from yourself.

For me, everything I do now is on the computer, and there were some experiments that I had some difficulty carrying out (although, if it would help to be able to carry out all experimentation and analysis myself, I used to be able to do PCR and maintain cell lines).

If I am working within 1 lab (and/or there are concerns that I need to have more of a leadership position in order to be able to work on ideas that I think are important), then my understanding is that my degree may be relatively more important for grant applications.  However, in a shared resource, I think that requirement is less stringent.

2) I don't believe science MS degrees are ranked (or at least not with a ranking that people commonly use).  However, if you consider the ranking of the discipline (presumably for the PhD program), is it better to get a terminal MS from a school with a higher ranking or a terminal PhD from a school with a lower ranking?

As you can start to see from my Google Sites page, I am experimenting with some on-line courses (which I believe are important for continuing education).  If others in their mid-30s are thinking about changing something and/or getting additional training, I think a full range of options should be considered (but there is a need to be extra careful, since I believe mistakes can be more costly to your career / employment at this point).

3) Under the scenario described for #2, what are the responsibilities / leadership potential for someone with an MS in Bioinformatics versus a PhD in Genomics? Perhaps the current answer and the future answer are different, and having positions with a more even mix of terminal MS and PhD degrees can help (if it emphasizes what you think about somebody's work, regardless of possible bias regarding degree and/or affiliation).  For example, I think there should be some paths to being able to work at a non-profit research institute without the typical degree path.

Nevertheless, if an additional degree would help show commitment to improving the quality of the work that I do in the future, perhaps this is still a valid discussion.

I think some individuals may be concerned if I am trying to give advice to (or questioning) scientists with PhDs (and/or MDs).  However, I think my experiences show that a bioinformatician with a Master's degree can provide support for individual labs with questions / guidance from a Biological Sciences PhD or MD.

Also, part of what I am saying is that the degree is supposed to represent your ability to process new information.  However, I think there are at least some jobs where enough experience with a Master's degree can substitute for a PhD for hiring qualifications, matching my expectation that individuals without PhDs can have valuable contributions.  I also have to be fairly independent for my current job.

4) I think people could have the question "How can someone with an MS (or MA, BS, etc.) have a novel contribution for problems that are studied by people with PhDs?"  In other words, assuming equal opportunity and interest in formal education (which will not always be valid, but I think is an OK assumption for my own background), how can somebody who is relatively less intelligent be right about something, and the more intelligent person be wrong?  This is a good and fair question.  I think part of the answer involves prior assumptions: if you don't question the validity of your previous assumptions (enough to be looking for what may not be correct), you can miss something that would be obvious if it had your focus.  I think this is kind of like the "gorilla experiment," where you can entirely miss a person in a gorilla costume (if you didn't know you were supposed to be looking for them).

Another thing that I didn't realize before I got started in research is that you often call scientists by their first name.  My theory is that this is because you can do lower-quality research if you make your claims from a position of authority, with less critical assessment.  However, to an outsider, it may seem weird that you may do better science if you don't refer to somebody with a PhD as "Doctor [Smith]".  I've also heard this is not necessarily true in other countries, but it does seem like a good match for my own experiences.

5) In general, perhaps this also gets at the question of "What are you supposed to be learning in a PhD?".  I think the answer is i) gain enough substantive experience in an area to understand it well, including caveats for troubleshooting that a novice may not know about, ii) learn to communicate in a clear and responsible manner, iii) learn to become more mindful of what you don't know (in order to decide when you feel comfortable enough with a conclusion to move to publication, where your claim can influence public policies and possibly even medical decisions), and iv) make objectivity and accuracy your primary goal and use that to guide the pacing (such as funding, number of projects, amount of responsibility, peer opinions / assumptions, etc.).

There are probably also other equally valid answers, and I believe most of these should also apply to an individual with an MS/MA.  In fact, I think some of these are of value for mindfulness and objectivity for the general public.

6) I noticed genetic counselor students in my human genetics class at the University of Michigan were better at memorizing rare diseases than I am, and I also noticed that the genetic counselors and physicians at the City of Hope Clinical Cancer Genomics conference could parse information from family trees quicker than I could.  However, I don't want to completely rule out the possibility that I could contribute to any human discussions.  For example, I try to describe my own personal strengths and weaknesses in this blog post.  However, for those that are interested, I think there is a difference between an MS in Bioinformatics and an MS in Genetic Counseling.

Finally, to be clear (again), I am not quite convinced that a genomics PhD is the best fit for me (particularly at this time), but I really do think I have some experiences that are worth sharing (and there may be at least a temporary value to me having a leadership role for a limited number of projects).

Update (9/26/2020): I believe a couple of family members agree that having a terminal Master's degree (either biology/biotechnology or bioinformatics) sounds like the best fit for me.  This shifts the discussion somewhat, away from a genomics PhD and toward a Master's-level option (such as a genomics MS, if that existed).

I would still like there to be ways for me to provide feedback (such as this F1000 Research article, where the ORCiD link lists my work experience as well as degrees), and I think it would be nice if I could have some partial leadership roles for some papers (such as pre-prints or perhaps some brief peer reviewed articles with external links for more information that can be updated).  However, if that is the case, I am leaning towards agreeing that a terminal Master's degree is the best solution.

You can also see my contributions through the Disqus comment system, even though my degree (or work experience) is not directly mentioned.  I think this is important for helping improve post-publication review.

For example, I think an MS/MA in Biology and a Certificate in Bioinformatics or Genomic Data Science could also be a fair way to represent my abilities.  I would say this is pretty much what I currently have.

Change Log:

9/11/2019 - add note from colleague discussion (from previous week) that "first name" basis is not necessarily true for all countries (a response from a colleague with initials "AP")
10/15/2019 - public post date
10/19/2019 - update (I see that I left "public post date", but I think this was a minor update, and I caught the duplicated change log description on 11/14/2019)
11/14/2019 - other minor changes (and also emphasize importance of more absolute objective truth in the PhD, as well as mention on-line courses and post draft about personal strengths and weaknesses)
11/17/2019 - add link to Twitter discussions + add own advice of getting MS first, then PhD (from responses/discussion from @Sternarchella, @tatsvelez, and @eco_andrew); change "hesitant" to "not quite convinced"
12/5/2019 - add initials to entry for 1st change log
12/6/2019 - trim down and re-order post + add on-line blog link
7/4/2020 - add concern about leading without sufficient understanding
9/26/2020 - add updated thoughts + update duration of being a Bioinformatics Specialist (7.5 to 8.5 years) + minor changes
9/27/2020 - minor changes; add note (here) to acknowledge guidance from colleagues for trying to work on shortening content for communication, and this is the basis for mentioning a brief peer reviewed article (I have peer reviewed articles with similar initiative in journals like PLOS ONE and PeerJ, but I do not currently have a published example for the brief part with greater emphasis on external post-publication support to be transparent but reduce chance of needing a correction, with or without developing the idea for the project/paper).

Overall Notes on my 2019 Blog Posts

This year, I have focused on a mix of topics that relate to my own work as well as testing my genomic sequence from various companies.  For example, I think this collection of blog posts most directly relates to my work experiences and this collection of blog posts most directly relates to my own genomics data (but not directly related to what I have produced at work).  There were also many other posts, so those should not be thought of as exhaustive collections (for example, comparison of my data using precisionFDA was not among the other collection of blog posts, since it had an earlier public post date, and my set of post-publication review notes was also separate).

As implied from my most recent blog post, I also haven't determined if there may exist a point where I may need to either find a different support mechanism or gradually transition out of my current job, but that is the sort of thing that I am trying to figure out (and I could very much use some help).

The posts about other genomics results (not related to work) are meant to emphasize the possible broader impacts (they are less relevant to my individual future path, but I think they have equal or greater importance for other people).

In other words, I am trying to understand and contribute to discussions about some difficult problems, but I apologize that these public discussions may leave some people with an impression that is not completely positive.  So, for the work-related side, I want to make clear that I very much support City of Hope as a community and organization, and I am very grateful for their help.  Accordingly, I expect that internal and external discussions can help with further improvements.

Change Log:

10/15/2019 - public post date

Lessons from an Antibody Amplicon-Seq Project

I thought the Antibody Amplicon-Seq project was very interesting (as part of the broader study in Mutsvunguma et al. 2019, also described in the earlier pre-print), so I thought there was value in re-emphasizing some points here:


  • From this project, I learned about the antibody 72A1 (which I believe was purified from the HB-168 hybridoma cell line for this study).  72A1 consists of 2 previously described heavy-chain sequences and 2 previously described light-chain sequences, so this provides a very nice positive control.
  • The 72A1 L2 sequence did not appear in any MiSeq reads, but it could clearly be amplified and Sanger sequenced using a different primer pair (this is what is meant by the "specific primers for the lambda light chain").  However, importantly, the trace file had to be visually inspected.  For example, the automated EMBOSS-merged sequence incorrectly predicted a premature stop codon.  Upon visual inspection of the Sanger trace, we could tell that the public 72A1 L2 sequence was correct (with the right trimming and editing of this sequence).  So, if you use Sanger sequencing (or even next-generation sequencing) and you get a potentially pathogenic result (such as a frameshift), it is extremely important that you visually inspect your data!
  • It is not safe to assume the automated sequence / analysis is always right (and I felt a certain obligation to bring this up, in terms of the broader implications for genetic testing / counseling).
  • In addition to sequences being completely unable to be amplified, there was a non-coding sequence that was usually the sequence with the highest frequency in the light-chain samples.  This is important for other people to realize if they use a similar primer set.
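
As a rough illustration of the premature stop codon point above, the following is a minimal Python sketch that translates a merged consensus sequence and flags any in-frame stop codon before the final codon.  This is not the pipeline used for the study (the source only mentions EMBOSS for merging); the function names `translate` and `find_premature_stops` and the toy sequences are hypothetical, and the standard genetic code is assumed.  A flagged position is only a prompt to go back and look at the trace, not evidence that the stop codon is real.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA sequence in one reading frame ('*' = stop, 'X' = unknown codon)."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_premature_stops(seq, frame=0):
    """Return 0-based nucleotide positions of stop codons before the last codon."""
    protein = translate(seq, frame)
    return [frame + 3 * i for i, aa in enumerate(protein[:-1]) if aa == "*"]


# Toy sequences (hypothetical, not the 72A1 data): one base differs between them.
expected = "ATGGCCAAAGGCTGA"   # M A K G stop
merged = "ATGGCCTAAGGCTGA"     # A->T change turns AAA (K) into TAA (stop)

print(find_premature_stops(expected))  # no internal stop codons
print(find_premature_stops(merged))    # position of the internal stop codon
```

If a merged sequence is flagged like the second toy example, the trace at that position is the place to check for a miscalled or poorly trimmed base before accepting the automated result.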


Finally, I want to congratulate my colleagues / collaborators for all of their hard work.


Change Log:

10/15/2019 - public post date

10/19/2019 - minor change, shortening final paragraph
 
Charles Warden's Science Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.