Resources for Becoming a Programming Biologist

Written by: Dr Nick Oswald

last updated: October 7, 2024

Have you ever entertained the idea of learning to program? Have you tried but felt discouraged by the overwhelming amount of information out there? If you answered yes to both of those questions, I encourage you to try again with the following resources. Computer science is one of the best subjects to teach yourself. All you need is motivation and a computer connected to the internet!

Check out these resources and soon you’ll become a programming biologist.

Learning the Basics

You may already have a programming language in mind. However, I strongly suggest taking a step back to learn the fundamentals of computer science first. Before you learn the syntax of your language of choice, you need to understand basic concepts such as data types, variables, conditionals, loops, arrays, and functions. One of the best ways to do this is to take Harvard’s free online Introduction to Computer Science course. You’ll be introduced to a number of languages including C, PHP, JavaScript, SQL, CSS, and HTML. You will also learn how to write and run programs in a simple web-based IDE. This course is a challenging first step, but if you’re serious about becoming a programming biologist, it is worth your time and effort.
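To make those basic concepts concrete, here is a minimal sketch of how they look in practice. It is written in Python purely for illustration (the course itself uses other languages), and the function name `gc_content` and the example sequences are made up for this sketch.

```python
# Illustrative sketch only: gc_content and the sequences below are
# hypothetical examples, not taken from any course or library.

def gc_content(sequence):
    """Function: return the fraction of bases in `sequence` that are G or C."""
    if len(sequence) == 0:            # conditional: guard against empty input
        return 0.0
    gc_count = 0                      # variable holding an integer
    for base in sequence.upper():     # loop over each character of a string
        if base in ("G", "C"):        # conditional inside the loop
            gc_count += 1
    return gc_count / len(sequence)   # the result is a float


sequences = ["ATGC", "GGCC", "ATAT"]  # a list, Python's array-like data type

for seq in sequences:
    print(seq, gc_content(seq))
```

Every language you meet later will express these same ideas with slightly different syntax, which is why learning the concepts first pays off.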

Choosing a Language

After taking the course, you will have a strong grasp of computer science fundamentals. Now comes the fun part: choosing a primary language. Which language you pick will depend on what you plan to use your new skill for. For scientific data analysis or simple scripts that speed up repetitive computing tasks, I strongly recommend Python, Ruby, Perl, Julia, or R. If you’re unsure which language is right for you, answer a few questions at Best Programming Language for Me to find one. Spend some time reading up on the pros and cons of each language, as well as how steep its learning curve is. Also, check out the Lord of the Rings Analogy to Programming Languages.

My recommendation is Python. It is one of the easiest languages to learn, has extensive capabilities, and offers a well-developed set of tools for biologists, such as Biopython, Galaxy, and Pygr. With Python, you’ll be able to implement powerful programs in a relatively short amount of time, and with a significantly smoother experience than in C++ or Java.

Note: You can always learn multiple languages. However, I recommend mastering the intricacies of your first language before setting out to learn a second.

Practice, Practice, Practice

Once you’ve chosen a language, what comes next is relentless practice. If you choose a language you’ve never used before, start with a free interactive tutorial such as Codecademy to learn the ropes. They have tutorials on Python, Ruby, Rails, Java, SQL, Git, and many more. If you choose Python and are also interested in bioinformatics, check out the course Biology Meets Programming: Bioinformatics for Beginners.

The only way to become a better programming biologist is to constantly write programs. In doing so, you’ll pick up small techniques, learn to think programmatically, and eventually feel confident enough to apply your knowledge to real-world problems. The good news is that there are many resources dedicated to helping you practice your skills.

Here are some of my favorites:

  • Rosalind – Bioinformatics problems and algorithms
  • Project Euler – Math-related problems
  • HackerRank – Language-specific problems as well as general algorithms
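To give a feel for the kind of small exercise these sites offer, here is a sketch of two classic warm-up tasks in bioinformatics: counting the nucleotides in a DNA string and computing its reverse complement. The function names and the example sequence are my own choices for illustration, and this is one possible solution among many.

```python
from collections import Counter


def count_nucleotides(dna):
    """Return how many times each of A, C, G, and T occurs in `dna`."""
    counts = Counter(dna.upper())
    # Report all four bases, even those that never appear.
    return {base: counts.get(base, 0) for base in "ACGT"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    # Walk the sequence backwards, swapping each base for its partner.
    return "".join(complement[base] for base in reversed(dna.upper()))


example = "AAAACCCGGT"  # made-up sequence for illustration
print(count_nucleotides(example))   # {'A': 4, 'C': 3, 'G': 2, 'T': 1}
print(reverse_complement(example))  # ACCGGGTTTT
```

Once a solution like this works, try rewriting it a different way, for instance with a plain loop instead of `Counter`. Solving the same problem twice is a good way to pick up the small techniques that practice is meant to teach.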

Specialization in Bioinformatics

Now that you’ve learned the basics, chosen a language, and worked hard on practice problems, try the free, online, seven-course specialization in bioinformatics created by Drs. Pavel Pevzner and Phillip Compeau. This is the resource I recommend most to anyone interested in bioinformatics and computational biology. You’ll learn hundreds of bioinformatics algorithms covering sequence alignment, motif searching, genome assembly, evolutionary tree reconstruction, hidden Markov models, peptide sequencing, and many more topics. You’ll also be introduced to next-generation sequencing tools and other methods in computational biology. The specialization takes around 3 to 6 months to complete, but you’ll be left feeling quite accomplished and ready to tackle your own projects.

Start Your Journey as a Programming Biologist

Not all of this advice will suit everyone equally well, but it’s a good direction to head in regardless. Along the way, you’ll adjust it to fit your learning style. With persistence, you’ll eventually reach a level of comfort writing programs and become a programming biologist. It won’t be easy, but it’ll be well worth the effort. Happy coding!

Nick has a PhD from the University of Dundee and is the Founder and Director of Bitesize Bio, Science Squared Ltd and The Life Science Marketing Society.
