Interested in a career in bioinformatics? We’ve got the lowdown on the training you’ll need to pursue this career path, and a handy list of resources to get you started on your learning.
What is Bioinformatics?
Bioinformatics is an interdisciplinary field that combines mathematics, computer science, physics, and biology to help answer key questions in modern biological sciences research. Bioinformaticians generally work in multidisciplinary groups comprising people from different research backgrounds.
What Does a Bioinformatician Do?
A bioinformatician uses tools to understand or solve biological problems and also helps to develop tools for research. There are two general categories of bioinformaticians [1]:
- The first category includes developers who implement algorithms and develop tools for bioinformatics.
- The second category includes curators who are responsible for all the work relating to data resources and data integration.
Most bioinformaticians work within different medical science and health fields, including biology, genetics, proteomics, and pharmaceuticals. Some professionals come from a biomedical research background while others specialize in computational tools.
Skills Required for a Career in Bioinformatics
You’ll need at least a Master’s degree, the ability to program, and the capacity to learn and use complex technology. A number of universities offer bioinformatics degrees.
Here we’ve outlined some of the skills you’re likely to have to master if you decide to pursue a career in bioinformatics.
1. Bioinformatics Skills
You need to learn how to use [2, 3]:
- sequence alignment tools such as BLAST or Bowtie;
- the Genome Analysis Toolkit (GATK);
- software for analyzing next-generation sequencing, microarray, and qPCR data (e.g. Partek);
- tools for handling high-throughput sequencing data (e.g. samtools);
- tools such as Ensembl to gather gene data sets;
- tools for database search systems (e.g. Entrez).
2. Statistical Skills
You need to learn:
- how to use statistical software systems such as SPSS and SAS;
- how to perform statistical analyses with Python or R.
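As a small illustration of the second point, the sketch below computes Welch's t-statistic for one gene measured in two groups of samples, using only Python's standard library. The expression values and the function name welch_t are made up for this example; in practice you would more likely use a statistics package in Python or R, which also reports a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t-statistic for two independent samples."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance() is the sample variance (n - 1 denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical expression levels of one gene in two groups of samples.
control = [2.0, 4.0, 6.0]
treated = [5.0, 7.0, 9.0]

t_stat = welch_t(control, treated)
print(f"Welch t-statistic: {t_stat:.3f}")
```

A negative value here simply means the first group has the lower mean; judging significance would additionally require the degrees of freedom and a t-distribution, which this sketch leaves out.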
3. Programming Skills
You should be familiar with:
- one or more of the following programming languages: R, Perl, Python, Java, and MATLAB;
- machine-learning tools and libraries such as MLlib (part of Apache Spark) and scikit-learn (Python).
4. General Biology Knowledge
This requirement will vary according to your area of study or the particular job you are applying for. You will most likely need knowledge of molecular biology, genetics, and cancer biology.
5. Knowledge of Genomics and Genetics
This knowledge is the core of bioinformatics. Some of the most important skills are the analysis of high-throughput (next-generation) sequencing data and computational genomics.
6. Database Management
This requirement includes traditional relational databases, which are queried with SQL (e.g. SQL Server and Oracle). You should also have an awareness of NoSQL databases, which are non-relational, often open-source, and typically distributed and horizontally scalable (e.g. MongoDB).
You should also have some knowledge of large public data resources (e.g. The Cancer Genome Atlas, TCGA) and big data analytics databases (e.g. Vertica).
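To make the relational side concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. The table names, columns, and rows (genes, variants, GENE_A, and so on) are invented for illustration and do not come from any real resource; a production system such as SQL Server or Oracle would differ in setup but use broadly similar SQL.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE genes (
        gene_id INTEGER PRIMARY KEY,
        symbol  TEXT NOT NULL
    );
    CREATE TABLE variants (
        variant_id INTEGER PRIMARY KEY,
        gene_id    INTEGER NOT NULL REFERENCES genes(gene_id),
        effect     TEXT NOT NULL
    );
    """
)

# Illustrative rows only; not real annotation data.
conn.executemany("INSERT INTO genes VALUES (?, ?)", [(1, "GENE_A"), (2, "GENE_B")])
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?)",
    [(1, 1, "missense"), (2, 1, "synonymous"), (3, 2, "missense")],
)

# Join the two tables and count variants per gene.
rows = conn.execute(
    """
    SELECT g.symbol, COUNT(v.variant_id) AS n_variants
    FROM genes AS g
    JOIN variants AS v ON v.gene_id = g.gene_id
    GROUP BY g.symbol
    ORDER BY g.symbol
    """
).fetchall()

for symbol, n_variants in rows:
    print(symbol, n_variants)

conn.close()
```

The join across two tables linked by a key is the core idea of the relational model; a document store such as MongoDB would typically nest the variants inside each gene record instead.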
7. Data Mining and Machine Learning
Knowledge of techniques such as hierarchical clustering and decision trees would be useful in any bioinformatics role.
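To give a feel for hierarchical clustering, the toy sketch below implements single-linkage agglomerative clustering in plain Python on one-dimensional data. The function name single_linkage and the expression values are hypothetical; real analyses would normally use an established library (for example scipy.cluster.hierarchy or scikit-learn) and multi-dimensional data.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    # Start with every point in its own cluster (stored as lists of indices).
    clusters = [[i] for i in range(len(points))]

    def distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(abs(points[i] - points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        pairs = [
            (distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        # Merge cluster j into cluster i (i < j, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return [sorted(points[k] for k in cluster) for cluster in clusters]


# Hypothetical one-dimensional expression values for six samples.
expression = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8]
groups = single_linkage(expression, n_clusters=3)
print(groups)
```

Stopping at a fixed number of clusters is one simple choice; library implementations usually build the full merge tree (a dendrogram) and let you cut it at any height.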
8. General Skills
In addition to the technical skills mentioned above, you’ll need a range of transferable skills, including the ability to multitask and to work independently, good communication skills, curiosity, analytical reasoning skills, and managerial skills.
Free Learning Resources For a Career in Bioinformatics
Below are some free resources to start learning the skills you will need to pursue a career in bioinformatics.
SPSS
SPSS-Tutorials has a range of tutorials on data analysis and various statistical tests.
SAS
SASCrunch provides a list of free resources to help you learn SAS.
Python
If you’re just starting out with Python, Bioinformatics Programming Using Python: Practical Programming for Biological Data by Mitchell L. Model is a good starting point. After that, you should get familiar with NumPy for vectorized array computation. SciPy is also very useful for special functions and linear algebra.
If you want to process large data sets, you will need to understand some of the tools for binding Python to C (e.g. SWIG, ctypes, Cython), so that the heavy data processing runs in C while you manipulate the results in Python.
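As a quick taste of what "vectorized" means, the sketch below computes the GC fraction of a short DNA string with NumPy, comparing the whole array at once instead of looping over bases in Python. The sequence is made up for illustration, and the example assumes NumPy is installed.

```python
import numpy as np

# Hypothetical DNA sequence, used only to illustrate vectorized operations.
sequence = "ATGCGCGATTACGCTA"

# Store the sequence as an array of single characters.
bases = np.array(list(sequence))

# Element-wise comparison yields boolean arrays; no explicit Python loop.
is_gc = (bases == "G") | (bases == "C")

# The mean of a boolean array is the fraction of True values.
gc_fraction = is_gc.mean()

print(f"GC fraction: {gc_fraction:.3f}")
```

For a sequence this short a plain loop would be just as fast; the benefit of vectorization shows up on arrays with millions of elements.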
R
David Romney provides a list of online resources for learning R.
Perl
Using the Perl online library, you can access either Beginning Perl or other advanced documents depending on your level of expertise.
Java
There is a very good free online textbook for Java, and the LearnJavaOnline.org Interactive Java Tutorial is also good.
MATLAB
Coursera runs a great course that teaches the basics of MATLAB.
Molecular Biology
UCLA has an interactive online tutorial in molecular biology.
Cancer Biology
Several cancer biology animations and videos are available on CancerQuest.
Genomics and Genetics
EMBL-EBI provides a free practical course on the analysis of high-throughput sequencing data as well as another course on functional genomics.
Becoming a bioinformatician takes a lot of hard work, but it’s definitely worth the effort. Check out our article on some of the ways in which bioinformatics can be used. Are there any bioinformaticians out there who can share their experiences? We’d love to hear how you got into your career in bioinformatics in the comments.
References
1. Vincent AT, Charette SJ. (2015). Who qualifies to be a bioinformatician? Frontiers in Genetics 6:164.
2. Wu H, Palani A. (2015). Bioinformatics Curriculum Development and Skill Sets for Bioinformaticians. In: 2015 IEEE Frontiers in Education Conference (FIE). IEEE: El Paso, TX.
3. Welch L, Lewitter F, Schwartz R, Brooksbank C, Radivojac P, Gaeta B, Schneider MV. (2014). Bioinformatics Curriculum Guidelines: Toward a Definition of Core Competencies. PLOS Computational Biology 10(3): e1003496.
Originally published January 23, 2018. Reviewed and republished February 2021.