Thursday, November 14, 2019

What do I need to change as an individual?

I can tell that I need to work on fewer projects in more depth.

I am not sure if additional training is necessary to accomplish this, but that is the focus of my post on "What are the expectations for Individuals with an MS in Bioinformatics versus a PhD in Genomics?"

Technique-Wise, these are the sort of things that I think I could be comfortable with:


  • I am very comfortable coding in R / Python / Perl
  • If I needed to get back into the lab, I could previously do a PCR and maintain a cell line
    • At least previously, I had some difficulties being able to perform my own microarray experiments
    • However, to be honest, I think the best fit for me would be to keep doing mostly or entirely computational work (as a Bioinformatics Specialist, Bioinformatician, etc.).
  • I think I likely need to reduce the number of new patient samples that I encounter, but I am comfortable working with my own genomics data and I think I have had useful contributions using re-analysis of data deposited by other labs.
    • For example, I thought this was a relatively successful story of my feedback on a pre-print being helpful
    • I also have notes on my human genomics results here
    • I also have on-going analysis of public cell line perturbations to demonstrate method limits for RNA-Seq analysis


Research-Wise, there are topics that I am interested in and/or have some prior experience with:


  • Investigate whether a relatively simple strategy is more robust than a more complicated strategy (for example, try to identify problems with over-fitting)
  • Probably a good idea to either limit the number of samples I work on at a given time and/or make sure that I have enough time for several rounds of analysis / discussion of the same dataset (which is always good for critical assessment of results)
  • Perhaps place more focus on non-human genomics (and I started out doing evolutionary genomics research)?
    • I have some previous virology experience, so perhaps I could study the genetics / genomics of viruses that currently infect other animals (but have not yet evolved the ability to infect people)?  This could even be part of DNA-Seq or RNA-Seq for the host.
    • While I haven't done any such research from a professional standpoint, I have been comparing genomics results for my cat Bastu, and I hope to have a blog post summarizing that soon.
  • If I can get agreement about some things, perhaps there is some value in shared support guidelines?
    • I also truly like the idea of supporting labs with less popular research topics and/or smaller budgets (which may have a relatively greater need for shared support)
    • Essentially, I am emphasizing training (indirect analysis support) and limits on projects for shared staff
    • However, I don't particularly like telling other people what to do (the degree discussion is really more about having the right amount of autonomy and peer respect to perform my own analysis), even though I realize that we sometimes have obligations to society to do things that we may not find enjoyable.
    • So, I think finding a solution for myself is what is currently most important, but I think I have some experiences that may be useful to others.
    • In general, if I were to assist with training support, I think I may be able to help with common public datasets, but I think there may need to be a rule that I couldn't help with providing analysis for a new dataset (unless a greater commitment to the project is made, where the limits on shared support would then be important)
    • While helping me stay up-to-date and refreshed on details, perhaps providing local guidance (face-to-face) for a subset of content from selected on-line courses (like Coursera) may be an appropriate way for me to help, but it would be crucial that I not complete exercises for any students (which would violate the honor code, requiring students to complete their own work and demonstrate independent competence).  For example, if I did this before they started the course, perhaps I could then recommend where to learn more and get certification (as well as setting realistic expectations on the likelihood of passing the course).
  • I would need a lot of practice, but outreach / optional education for the general public (such as a book club discussion that I led) can be rewarding 
  • While I am less certain about my role from a professional standpoint, you can see my "speculative opinion" posts about some things that I think could be interesting


Personality-Wise, these are what I believe are my strengths and weaknesses:


  • I have to be fairly independent for my current job, but I do provide a supportive role (where the biological / clinical idea usually comes from the PI)
  • While the difference between 1 day and 2 weeks turnaround time would be an order of magnitude (for each iteration of analysis/discussion), I have received the good suggestion that I should wait at least an extra day before returning each round of results (to see if I can catch more errors by reviewing the results again the next day).
  • I like the idea of helping provide a "public good," so I think I would prefer to continue working at non-profits
  • Continue becoming more mindful of when I have a prior assumption (which may or may not be true) and when I may not sufficiently understand other perspectives.
    • However, I appreciate those who value the need to take time to be objective and fair
  • While I believe it is important to continue to make future progress, I have some concerns about responsibilities that require excellent communication (and would frequently involve relatively short interactions with individuals where I may not have the chance to correct myself)
    • For example, part of the reason I work on the computer is that bugs will stay corrected in the code (once I find them).
    • In contrast, if you knew that there was an experimental protocol that required X steps and I was highly likely to mess up at least 1 of those X steps, then that is sufficient for me to not be able to get a protocol to work (for example, I think this is why I had previous difficulty with performing my own microarray experiment).
  • I think I may need to better recognize what I can fix (for myself), versus a concern about the actions of others (which may be solvable if properly communicated, or may be harder to resolve without common agreement)
    • For example, there may be some room for improvement in terms of communicating myself in sensitive situations (such as disagreeing with a policy and/or a superior).  However, this is something that I am actively working on.
  • Most of my family lives on the east coast of the United States (and I currently live on the west coast, in California).  As we get older, this may be something worth taking into consideration.



Change Log:

11/14/2019 - public post
11/15/2019 - public post
11/21/2019 - fix typos + add waiting a little longer to return results
1/7/2020 - add link for on-line course notes
4/30/2020 - update cat link for blog post versus GitHub
10/7/2020 - add note about computational emphasis

Wednesday, November 6, 2019

Requiring (At Least Some) Methods Testing for Every Project

It may currently be a little hard to find, but I wanted to point out a couple links relevant to showing the value in testing different RNA-Seq methods for every project:


  • SourceForge repository for public data analysis
    • I still have a ways to go before being able to start working on a paper, but you can see how I am progressing here
    • I think the Target_Recovery_Status.xlsx file (for checking recovery of the known genetic perturbation in an experiment) is the most relevant for showing that you could not choose 1 method out of edgeR, DESeq2, and limma-voom to maximally recover the known gene knock-down or over-expression
    • I am also experimenting with having a completely public log for notes and analysis
  • Acknowledgement for GitHub RNA-Seq gene expression template
    • Includes some papers with modified methods
  • While the newer analysis tends to have smaller sample sizes, you can see noticeable differences between methods in a much larger cohort in this post
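
As a rough illustration of the kind of check behind a target recovery table, here is a minimal Python sketch. It is not the code from the repository: the function names, the dictionary layout, the 0.05 cutoff, and the placeholder method names are all assumptions, and the numbers are made up purely to show the logic (they are not results from any dataset). The idea is that a known knock-down or over-expression counts as "recovered" by a method only if that method reports an adjusted p-value below the cutoff with the fold-change in the expected direction.

```python
import math


def target_recovered(result, expected_direction, fdr_cutoff=0.05):
    """Return True if the known target is significant in the expected direction.

    result: dict with 'log2fc' and 'padj' for the perturbed gene
            (hypothetical layout; 'padj' may be None or NaN if not reported).
    expected_direction: 'down' for a knock-down, 'up' for over-expression.
    """
    if expected_direction not in ("up", "down"):
        raise ValueError("expected_direction must be 'up' or 'down'")
    padj = result.get("padj")
    # A missing adjusted p-value is treated as "not recovered"
    if padj is None or math.isnan(padj):
        return False
    if padj >= fdr_cutoff:
        return False
    if expected_direction == "down":
        return result["log2fc"] < 0
    return result["log2fc"] > 0


def recovery_table(per_method_results, expected_direction, fdr_cutoff=0.05):
    """Map each method name to whether it recovered the known perturbation."""
    return {
        method: target_recovered(result, expected_direction, fdr_cutoff)
        for method, result in per_method_results.items()
    }


# Made-up values for one knock-down experiment (illustration only).
# The placeholder names stand in for whichever methods are being compared.
toy_results = {
    "method_A": {"log2fc": -1.8, "padj": 0.01},
    "method_B": {"log2fc": -1.7, "padj": float("nan")},
    "method_C": {"log2fc": -1.5, "padj": 0.20},
}

for method, recovered in recovery_table(toy_results, "down").items():
    print(method, "recovered" if recovered else "not recovered")
```

Repeating this for several experiments (one row per experiment, one column per method) gives a simple way to see whether any single method recovers the known target everywhere.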


On Biostars (which you can see in a variety of responses, including but not limited to this one), I would generally give the following recommendations:


  • If possible, test calculating p-values with edgeR, DESeq2, and limma-voom
  • I would recommend having an independently calculated expression method (like FPKM, Fragments Per Kilobase of transcript per Million mapped reads), in order to help assess method selection
    • For example, you might see an extremely obvious change in expression for a gene (such as the one that you altered), but it might not have a significant p-value (or have a missing p-value) for one of the methods.
    • While the optimal strategy for discovery may not necessarily be the one that most stringently recovers previous results, you may be able to tell some strategies clearly don't work well on your data.
    • I would also recommend using this gene expression measurement to create heatmaps to compare clustering of replicates
      • I would typically use this instead of exporting normalized counts from the method to calculate the p-value, but testing clustering of replicates (without defining the groups in the normalization) is another possible way to compare strategies.
      • Sometimes this can be a bit qualitative.  However, if you define your gene lists / enrichment as a "hypothesis," then I think this is made up for by having independent validation for your claim.
      • I do realize this treads the line between p-hacking and needing to test methods due to limits in precision (which I mention a little bit in this comment and this Twitter discussion).  However, as scientists, I think this is part of why it is extremely important to be transparent and admit errors as soon as we discover them (in the interests of training ourselves to be as objective as possible).
  • Robustness of identifying a result with different methods may also give you some extra confidence in the results (unless the methods are not really independent, for example)
  • If you test alternative normalization, make sure you have a visualization before and after applying that normalization (to try and assess the likelihood of over-fitting in your adjustment)
  • I also think it is important that these are open-source, freely available programs (so that you can have the ability to determine what works best for your individual project)
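
To make the independent expression measurement a little more concrete, here is a minimal Python sketch under stated assumptions. The FPKM formula is the standard definition, but the function names, the 0.1 pseudocount, and the use of a plain Pearson correlation to compare replicates are my own choices for illustration (not a description of any particular pipeline). In practice, the counts and gene lengths would come from your own quantification step.

```python
import math


def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_ratio(treated_fpkm, control_fpkm, pseudocount=0.1):
    """log2 ratio of mean FPKM between two groups.

    The pseudocount (an assumed value) avoids taking the log of zero.
    """
    treated_mean = sum(treated_fpkm) / len(treated_fpkm)
    control_mean = sum(control_fpkm) / len(control_fpkm)
    return math.log2((treated_mean + pseudocount) / (control_mean + pseudocount))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of values.

    For example, log2(FPKM + pseudocount) values for two replicates.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

If the altered gene shows a large log2 ratio here but has a non-significant or missing p-value from one of the methods, that is a hint that the method may not be a good fit for that dataset. Likewise, pairwise correlations between samples (without defining the groups beforehand) give a rough numeric companion to the heatmap for checking whether replicates cluster together.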


In general, these posts may also be relevant to the discussion of limits to precision in the genomics methods:



Again, it is going to be a while, but I do hope to eventually have a preprint to cover the above points (as well as some other observations that I have had from working on a variety of projects for RNA-Seq gene expression analysis).

Change Log:

11/6/2019 - public post
6/3/2020 - add link for earlier (larger) RNA-Seq benchmark
6/7/2022 - minor formatting change

Sunday, November 3, 2019

How much does my personal mental health experience matter?

When I was talking to a friend the other day, I came to the realization that there are 2 things where I have ~10 years of experience:

1) Genomics research (and programming in R/Perl/Python)
2) My personal path in treating and managing my mental health problems

As you can see from the "Change Log," I shortened this post to avoid getting off topic while thinking through various options.  I am leaning towards not emphasizing the mental health part (from a professional standpoint).

However, what may still be of interest (or at least represents something that I haven't written in a previous blog post) is that I thought the combination of working with social workers (LCSW, MSW, etc.) and psychiatrists (with MDs) is what was most effective for helping me with my problems.  In fact, the Ann Arbor Consultation Services that treated me when I was at the University of Michigan had a philosophy that treatment should start with medication, but then shift to more cognitive therapy with social workers.  At least for me, understanding if I am putting myself in a situation where I am likely to have mental health problems (so, increasing understanding for prevention) and being mindful of unnecessary or harmful behaviors (which I believe is important in affecting the amount of medication that you need to take and/or maximizing the benefit from a given dose of medication) are very important.  So, I think there was something important about this, and I was very disappointed to see the place that helped me is permanently closed.  Nevertheless, I want to thank all of the mental health care professionals that have helped me, who have made a huge difference in my quality of life.

I briefly share some of the specifics for my medications in this post on PatientsLikeMe.  I also found the Headspace app to be helpful, even though there are still times where I also need medication.

Finally, I guess this doesn't leave a specific question where I would appreciate feedback.  However, if you think any of this resonates with you (or if you think it may be helpful), please feel free to comment with your own experiences and/or thoughts.

Change Log:

11/3/2019 - public post
11/4/2019 - add note about project management; additional changes
11/5/2019 - better reflect the likelihood of working with new patient data in long-term (that isn't my own)
11/6/2019 - minor/moderate changes in organization of content
11/7/2019 - minor change + shorten and re-organize post (removing portion where I mention that I think a Licensed Clinical Social Worker can have an MA degree, but I don't think that is actually best fit for me, namely for reasons of communication and remembering details about people+removing 2 paragraphs to possibly have in another post as well as a 3rd paragraph that made less sense with the other stuff removed)
 
Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.