Bioinformatics for NGS: Open Source or Proprietary?

last updated: April 2, 2020



As anyone who has worked with sequence data can tell you, the biggest bottleneck in publishing papers based on that data is the analysis step. Most researchers face a dilemma when they receive sequence data for the first time: “Do I try to use these free open source programs, or do I shell out for the proprietary bioinformatics programs?”

Open Source Solutions

The best part about open source tools is that they’re… well… ‘open’, insofar as you can read exactly what the program does. If you are so inclined, you can view every part of the program and follow the logical flow of the pipeline. This makes them an excellent learning tool for any beginning bioinformatician. Additionally, open source programs are typically made available to the public for free. This means that you can download the source, install the program and run it on your data without paying huge fees. Finally, you can modify existing open source programs to deal with cutting-edge problems or to customize your pipeline.
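To make "readable and modifiable" a little more concrete, here is a minimal sketch of the kind of step you might find (or add) in an open source pipeline: a filter that keeps only reads whose mean base quality passes a threshold. Everything in it is an assumption made for illustration and is not taken from any particular tool: the function names, the threshold of 20, the Phred+33 quality encoding, and the simple four-lines-per-record FASTQ layout.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (header, sequence, quality) tuples.

    Assumes the simple FASTQ layout of exactly four lines per record.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(handle, min_mean_quality=20.0):
    """Yield only the reads whose mean quality meets the threshold."""
    for header, sequence, quality in parse_fastq(handle):
        if quality and mean_phred(quality) >= min_mean_quality:
            yield header, sequence, quality


# Two made-up reads: 'I' encodes Phred 40, '!' encodes Phred 0.
example = StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n!!!!\n"
)
kept = [header for header, _, _ in filter_reads(example)]
print(kept)  # ['@read1']
```

Because the whole step is visible, changing its behavior (say, filtering on the minimum quality instead of the mean) means editing a single function rather than waiting for a vendor to add an option.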

All this code gives me a headache

This brings us to the biggest downside of open source programs: you might need some programming skills in order to fit the program into your pipeline. Not all open source code is created equal, and I have personally scoured through source code that almost gave me a migraine. Open source programs usually lack dedicated service and support teams (often because they were the product of an overworked postdoc!), so you are responsible for troubleshooting your own errors most of the time. That said, there are some excellent programmers who do support their products, so do not think that all open source projects are left to the winds!

Proprietary software

Most proprietary programs are developed by large teams of programmers and support staff in a company that specializes in bioinformatics solutions. This means that if you hit a ‘snag’ with your data, help is likely only a phone call away! These companies price their products competitively against the cost of a dedicated bioinformatician, so you may be able to afford the program even if you cannot afford the additional staff! Additionally, most of the functionality that you need in your analysis is already coded into the program. Need to plot a graph? Just click this button right here. It is that easy.

A cautionary tale for the budding bioinformatician

Now for a cautionary tale: ease of use can lead you to false results! Using proprietary tools does not absolve you of the need to actually read up on and understand the type of analysis that you are doing. This is particularly true in the case of genome assembly and annotation. In addition, proprietary programs are quite often not entirely suitable for cutting-edge research such as structural variant detection in genomes. So consider this before sacking your current bioinformatics team.

Which to choose?

In most cases it comes down to your comfort level with programming. Can you program? Open source solutions may allow you to do more cutting-edge analysis than the proprietary tools. Are you in a clinical lab that cannot afford the time to train yourself (and/or others) to program? Proprietary tools may be more appropriate. The choice is yours, my friend! Would you like us to help you make the choice? Perhaps you have already gone down one of these routes; if so, please do share your experience with fellow Bitesizers.

Derek is a US postdoctoral fellow working with next-generation sequencing data derived from many livestock and domestic animal species.