The future of personalized medicine depends on affordable DNA sequencing. In the race for the $1,000 genome, several sequencer manufacturers are working on making equipment that can sequence DNA and RNA faster and more accurately. But so far, only one company – San Diego, California-based Illumina – has US FDA regulatory approval to use its sequencer in the clinic.
The Sequencers Available
The Accurate Illumina Sequencer
The MiSeq sequencer works similarly to other Illumina sequencers. Its technology allows accurate sequencing of genomes, including homopolymer regions and repetitive sequences, both of which are prone to errors. According to some reviews, Illumina’s error rate rises as the number of nucleotide addition steps increases, but it tops out at 0.5 percent (one error for every 200 bases).
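To put the 0.5 percent figure in perspective, here is a small Python sketch that converts a per-base error rate into the Phred quality score commonly used to report sequencing accuracy, and estimates how many errors to expect in a single read. The function names and the 150-base read length are illustrative assumptions, not figures from Illumina.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability into a Phred quality score (Q = -10 * log10(p))."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of miscalled bases in a read of the given length."""
    return read_length * error_rate


max_error_rate = 0.005          # 0.5 percent, the maximum quoted above
assumed_read_length = 150       # hypothetical read length, for illustration only

print("Bases per error:", round(1 / max_error_rate))
print("Phred quality:", round(phred_quality(max_error_rate), 1))
print("Expected errors per read:", expected_errors(assumed_read_length, max_error_rate))
```

In other words, a 0.5 percent worst-case error rate corresponds to a Phred score of roughly Q23, which is why deep coverage is still needed to call variants with confidence.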
The $1K Illumina Sequencer
At the other end of the Illumina spectrum, another system, the HiSeq X Ten, was released last year and is purported to sequence 45 human genomes in one day for “about” $1,000 per genome, the company says. The $1,000 threshold has long been held up as the benchmark at which sequencing can enter the clinic for routine genetic testing.
Ion Torrent & Pacific Biosciences Sequencers
But Illumina has plenty of competition in the contest for accurate, fast and cheap sequencing. Ion Torrent, a division of Thermo Fisher Scientific, has several sequencers, including the PGM and Proton, which are based on detecting electrical signals from DNA samples on a semiconductor chip. The Ion Torrent Proton 1 produces reads of comparable length to those of the Illumina MiSeq.
Meanwhile, Pacific Biosciences has several machines, including the RS II, which claims the longest average read length (14,000 base pairs) of any existing machine. The Roche 454, the original next-generation sequencing (NGS) machine, is still around and is useful for sequencing small genomes. In this article, however, I am going to concentrate on Illumina sequencing because it is the dominant method.
How Illumina Sequencing Works
Illumina sequencing technology has become the most accurate form of NGS available, but it started with some pretty basic scientific inquiries into how polymerases work. In the 1990s, two Cambridge University scientists, Shankar Balasubramanian and David Klenerman, were using fluorescent labeling to see how polymerases acted on surface-bound DNA during DNA synthesis. It occurred to them that this technique could be useful for sequencing, and, a few meetings in the lab and local pubs later, they presented the idea to a venture capitalist, who provided the seed money to found what became Solexa. Solexa developed the technology further and was, in turn, acquired by Illumina in 2007.
The first steps toward Illumina sequencing are very similar to those of traditional Sanger sequencing: DNA or cDNA samples are randomly fragmented, usually into segments of 200 to 600 base pairs. These fragments are then ligated to adaptors and made single-stranded.
For Illumina systems, the single-stranded fragments are loaded onto the company’s proprietary flow cell, where they bind to the inside surface of the flow cell channels. An Illumina flow cell can have up to eight lanes, or channels, for simultaneous analysis. Each channel is lined with oligos that are complementary to the library adaptors. DNA that doesn’t attach is washed away.
Each bound fragment is then amplified on the flow cell in a process called “bridge amplification.” Unlabeled nucleotides and polymerase are added, and the free end of each bound strand bends over to pair with a neighboring complementary oligo on the surface, forming a bridge. The polymerase incorporates nucleotides along the strand, building a double-stranded bridge. Denaturing the double-stranded molecules leaves single-stranded templates anchored to the flow cell channel. Repeating this cycle generates several million dense clusters of DNA, each made up of copies of a single fragment, which are then ready for sequencing.
Illumina’s “sequencing by synthesis” involves a proprietary method in which four fluorescently labeled reversible dNTP terminators, primers and DNA polymerase are added to the templates on the flow cell. Because each terminator blocks further extension, only one base is incorporated per strand in each cycle. When excited by a laser, the fluorescence from each cluster is detected, which identifies the first base. The label and the terminating group are then removed, labeled reversible terminators and DNA polymerase are added again, and laser excitation reveals the second base. The cycle is repeated for every base in the fragment, one base at a time (but very, very quickly). Since all four reversible dNTPs are present in each cycle, incorporation bias is reduced. Base calls are made from signal intensity measurements during each cycle, which reduces error rates further.
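The idea of calling one base per cycle from signal intensities can be illustrated with a toy Python sketch. It assumes each cycle yields four intensity values for a cluster, in the order A, C, G, T, and simply picks the brightest one. Production base callers also correct for effects such as channel crosstalk and clusters falling out of step, so treat the function name, the input layout and the "purity" measure here as illustrative assumptions only.

```python
BASES = "ACGT"


def call_bases(intensities):
    """Call one base per cycle for a single cluster.

    intensities: list of per-cycle (A, C, G, T) signal tuples (hypothetical layout).
    Returns the called read and, per cycle, the share of total signal
    carried by the brightest channel (a crude confidence measure).
    """
    read = []
    purities = []
    for cycle in intensities:
        total = sum(cycle)
        if total <= 0:
            # No usable signal in this cycle: emit an ambiguous base.
            read.append("N")
            purities.append(0.0)
            continue
        best = max(range(4), key=lambda i: cycle[i])
        read.append(BASES[best])
        purities.append(cycle[best] / total)
    return "".join(read), purities


# Three sequencing cycles for one cluster (made-up numbers).
example = [
    (0.90, 0.05, 0.03, 0.02),
    (0.10, 0.10, 0.70, 0.10),
    (0.05, 0.05, 0.10, 0.80),
]
read, purities = call_bases(example)
print(read)  # AGT
```

A cycle in which the brightest channel barely stands out from the others would get a low purity value, which is the kind of signal a real pipeline would turn into a lower quality score for that base.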
Sequencing depends on massively parallel sequence reads. Deep sampling, in which each position is read many times, uses weighted majority voting and statistical analysis to distinguish homozygotes from heterozygotes and to flag errors. Analysis software then aligns the sequences to a reference, and variations (where things get interesting) are identified.
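As a rough illustration of weighted majority voting, the Python sketch below takes all the bases observed at one genome position, weights each by its Phred quality score, and decides whether the position looks homozygous or heterozygous. This is a deliberately simplified stand-in rather than the statistical model any real variant caller uses; the function name, the input layout and the 20 percent heterozygote cutoff are assumptions made for the example.

```python
from collections import defaultdict


def call_genotype(pileup, het_min_fraction=0.2):
    """Call a diploid genotype at one position by quality-weighted voting.

    pileup: list of (base, phred_quality) pairs from reads covering the position.
    het_min_fraction: assumed minimum weighted share the second allele needs
                      before the site is called heterozygous.
    Returns a tuple of two alleles, or None if there is no usable evidence.
    """
    weights = defaultdict(float)
    for base, quality in pileup:
        # Weight each vote by the probability that the base call is correct.
        weights[base] += 1 - 10 ** (-quality / 10)

    total = sum(weights.values())
    if total == 0:
        return None

    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    top_base = ranked[0][0]
    if len(ranked) > 1 and ranked[1][1] / total >= het_min_fraction:
        return tuple(sorted((top_base, ranked[1][0])))
    return (top_base, top_base)


# Nine confident A reads and one low-quality G read: the G is treated as an error.
print(call_genotype([("A", 30)] * 9 + [("G", 10)]))   # ('A', 'A')

# An even split of confident A and G reads: called heterozygous.
print(call_genotype([("A", 30)] * 5 + [("G", 30)] * 5))  # ('A', 'G')
```

The deeper the coverage at a position, the easier it becomes to tell a genuine second allele from a stray sequencing error, which is why deep sampling matters.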
Advances in sequencing, such as those Illumina has achieved, open the door to increasingly ambitious basic research studies and clinical applications. But before we can expect widespread clinical use of genome sequences, much basic science research needs to be done. We need to invest time and resources into annotating the genome. After all, what good is a genome sequence if you do not know what it means?
History of Illumina Sequencing (2015). Illumina Corporation. Website.
Mardis, E.R. (2013). Next-generation sequencing platforms. Annual Review of Analytical Chemistry, 6; 287-303. DOI: 10.1146/annurev-anchem-062012-092628
Sequencing technology (2015). Illumina Corporation. Website.
Your Genome.org (2015). What is the Illumina method of DNA sequencing?
GenoHub (2015). Choosing the Right NGS Sequencing Instrument for Your Study.