Saturday, June 15, 2019

What About Bioinformatics Companies?

A product starting under a grant (such as in academia or a non-profit) but later becoming a start-up (as a for-profit) is one possibility. However, my post on providing generics through non-profits raises the question of whether such a product could instead continue under an independent non-profit (and/or a government entity/contract).

While I think I need to learn more before being able to say that something strictly can't be provided by a for-profit organization, I think there are some things that may need to be taken into consideration within the current framework of options:

  • Perhaps require a free command-line version of any commercial software that has a user interface (and emphasize the education component, so that users learn more coding)?  Otherwise, it isn't really reproducible for most people, and it probably isn't appropriate to only use one program in all situations.
  • In terms of compromises to help with reproducibility, Novoalign has a free version that uses fewer threads, and MATLAB allows people to run programs developed in MATLAB without a license.
  • precisionFDA is designed/supported by a private company (DNAnexus), through what I assume is a government contract.  So, if you have genomics data, it is free to re-analyze / compare your own genomics data (since the FDA is paying for the costs).  I think this is a very good thing for citizens that can help them become more involved (and, hopefully, understand the difficulties of the regulatory process a little better).  I have some notes on my own experiences with precisionFDA here.  So, I support this strategy (although perhaps there can be discussion about the fact that DNAnexus is currently a for-profit company).

Also, the idea of encouraging the testing of multiple free RNA-Seq methods (something else that I would like to be able to show, at some point) may seem at odds with having commercial bioinformatics software.  While there is some truth to this, I would have the following response to such a critique:

1) I think the time frame matters when discussing software recommendations.  If the goal is to increase coding abilities in 5-10 years, then papers that need to be published sooner may need some alternative solution.  In other words, if somebody doesn't know how to code and there is a program with a graphical user interface that helps them do some analysis on their own, I think that can be good.  However, if they get a weird result (and/or a negative result) with that program (which could have a commercial license), I strongly suggest they test other programs before preparing for publication (and those other programs may need to be open-source command-line programs).

2) Having extra options (which includes commercial software) gives labs more programs that they can test for their project.  So, if you get a weird result with the open-source software, having extra commercial options may help.  My only concern is that the fees may be a barrier to entry for some labs, and I don't want to encourage excessive use of free trials if licenses are not often eventually purchased.

So, I think there is still value in giving suggestions of how to make the most out of available open-source options, even if you use some commercial programs.  This is similar to what I do: the majority of the bioinformatics programs that I use are open-source, but I sometimes also use commercial programs (like IPA).  However, even with IPA, I would also recommend comparing results with free programs (like Enrichr).  Sometimes the free open-source programs end up being a better fit for the individual project than the commercial ones, but that often varies by project.
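As a rough illustration of what comparing results between two programs could look like, here is a minimal Python sketch.  It assumes that each program's output has already been exported by hand to a plain list of names (for example, enriched gene symbols or pathway names).  The function names and the example lists are hypothetical, and nothing here calls IPA, Enrichr, or any other real tool.  Upper-casing the names is also an assumption that may not suit every identifier type.

```python
def normalize(terms):
    """Strip whitespace and upper-case names so trivial formatting
    differences are not counted as disagreements (an assumption that
    may not be appropriate for every kind of identifier)."""
    return {t.strip().upper() for t in terms if t.strip()}


def compare_results(results_a, results_b):
    """Summarize the overlap between two lists of reported names."""
    a = normalize(results_a)
    b = normalize(results_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared names divided by all names seen in either list.
    jaccard = len(shared) / len(union) if union else 0.0
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": jaccard,
    }


if __name__ == "__main__":
    # Made-up example lists, standing in for manually exported results.
    commercial_hits = ["TP53", "MYC", "EGFR ", "stat3"]
    open_source_hits = ["tp53", "MYC", "KRAS"]

    summary = compare_results(commercial_hits, open_source_hits)
    print("Shared:", summary["shared"])
    print("Only in first program:", summary["only_a"])
    print("Only in second program:", summary["only_b"])
    print("Jaccard index:", round(summary["jaccard"], 2))
```

A low overlap does not say which program is right.  It is only a prompt to look more closely at the names that appear in one list but not the other.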

However, because you can't lock down one particular program to use in all situations (which has definitely been my experience), I am somewhat concerned about the barriers to entry that could be caused by having to purchase licenses (and, if you don't already have a license, I would usually recommend trying out open-source options first).  I think this also helps with your ability to provide support during transitions.  For example, I code in R rather than MATLAB, so I don't have to worry about keeping a MATLAB license (and a lot of genomics packages are developed in R/Bioconductor, in addition to it being free).

Change Log:
6/15/2019 - public post date

My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.