Of all the sample prep steps necessary for next-generation sequencing, DNA size selection may have the greatest impact on the quality of results. Ineffective sizing can waste sequencing capacity on low-molecular-weight material such as adapter-dimers or primer-dimers, while imprecise sizing can prevent bioinformaticians from producing accurate assemblies. High-quality size selection can boost sequencing efficiency, save money, improve assemblies, and even allow sequencing of low-input samples.
Short-read sequencers, such as those from Illumina and Ion Torrent, operate best when fed DNA libraries whose fragments are similar in size, often within a very specific range recommended by the manufacturer. When libraries have not been properly size-selected, these sequencers become significantly less efficient: it might take two lanes of sequencing, for example, to accomplish what could have been done in a single lane with a well-sized library.
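The two-lane example above can be put in rough numbers. The following is a minimal sketch, not a vendor calculation: the function name, the read counts, and the usable fractions are all hypothetical values chosen for illustration. It assumes that a poorly sized library simply reduces the fraction of reads in a lane that are useful.

```python
import math


def lanes_needed(target_usable_reads: int, reads_per_lane: int, usable_fraction: float) -> int:
    """Lanes required when only `usable_fraction` of each lane's reads are useful."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_per_lane = reads_per_lane * usable_fraction
    return math.ceil(target_usable_reads / usable_per_lane)


# Hypothetical numbers, for illustration only.
TARGET = 300_000_000    # usable reads the experiment needs
PER_LANE = 400_000_000  # total reads one lane delivers

# Well-sized library: assume 90% of reads are usable.
well_sized = lanes_needed(TARGET, PER_LANE, usable_fraction=0.9)

# Poorly sized library: assume half the reads are lost to dimers or off-size fragments.
poorly_sized = lanes_needed(TARGET, PER_LANE, usable_fraction=0.5)

print(well_sized, poorly_sized)
```

Under these assumed figures the well-sized library fits in one lane, while the poorly sized one needs a second.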
Long-read sequencers, like those from Pacific Biosciences or Oxford Nanopore Technologies, see a different benefit from DNA sizing. These sequencers produce their longest reads by sequencing a DNA fragment from end to end, so a read can be no longer than the fragment it came from. For these systems, scientists have demonstrated marked improvements in read lengths by using size selection to remove smaller fragments, which lets the sequencers focus on the DNA fragments most amenable to producing long reads.
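As a rough illustration of why removing short fragments helps, the sketch below applies a minimum-length cutoff to a made-up list of fragment lengths and compares the mean length before and after. The fragment lengths, the cutoff, and the function name are hypothetical, and real read lengths depend on much more than fragment length.

```python
from statistics import mean


def remove_short_fragments(lengths_bp, min_length_bp):
    """Keep only fragments at least `min_length_bp` long."""
    return [n for n in lengths_bp if n >= min_length_bp]


# Hypothetical fragment lengths in base pairs.
fragments = [800, 1200, 2500, 4000, 12000, 18000, 25000]

# Hypothetical cutoff: discard anything shorter than 10 kb.
selected = remove_short_fragments(fragments, min_length_bp=10_000)

mean_before = mean(fragments)
mean_after = mean(selected)

print(f"mean fragment length before: {mean_before:.0f} bp, after: {mean_after:.0f} bp")
```

Because each read is capped by its fragment, the upper bound on average read length rises once the short fragments are gone.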
Size selection can also have a major impact on certain sequencing applications. For paired-end libraries, highly precise sizing gives bioinformaticians the most accurate information about the distance between the two reads of a pair, which is essential for properly mapping those reads after sequencing. In mate-pair sequencing, scientists have found that accurate size selection can improve alignment, reduce chimeras, and boost library complexity. Another technique known as ddRAD-seq, used for massively parallel genotyping, relies on near-perfect size selection to produce useful results.
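The paired-end point can be sketched with a simple model. The code below assumes, purely for illustration, that a read pair is judged plausible when its observed distance falls within a few standard deviations of the library's mean fragment length; the libraries, the threshold `k`, and the function name are hypothetical and do not reflect any particular aligner.

```python
from statistics import mean, stdev


def within_expected(distance_bp, fragment_lengths_bp, k=3.0):
    """True if `distance_bp` lies within k standard deviations of the library mean."""
    centre = mean(fragment_lengths_bp)
    spread = stdev(fragment_lengths_bp)
    return abs(distance_bp - centre) <= k * spread


# Two hypothetical libraries with the same mean fragment length (400 bp).
tight_library = [395, 400, 405, 398, 402]
broad_library = [250, 400, 550, 320, 480]

tight_sd = stdev(tight_library)
broad_sd = stdev(broad_library)

# A pair mapped 450 bp apart stands out against the tight library
# but is indistinguishable from normal variation in the broad one.
flagged_in_tight = not within_expected(450, tight_library)
flagged_in_broad = not within_expected(450, broad_library)

print(tight_sd < broad_sd, flagged_in_tight, flagged_in_broad)
```

The narrower the size distribution, the more confidently an unexpected distance between mates can be treated as a mapping error or a real structural difference.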
There are several approaches to DNA size selection. For low-throughput pipelines, scientists often use manual gels, cutting out bands of the desired size by hand. Higher-throughput pipelines are more likely to use automated methods, such as bead-based systems or platforms with disposable gel cassettes. Several studies, including one from the Sanger Institute and one from the University of Arizona, have demonstrated that automated DNA sizing produces more accurate and reproducible results than manual gels. Those studies also found that bead-based systems are less precise than gel-based systems, making them a better fit for cleanup than for size selection. Automated gel-based platforms can also eliminate the risk of contamination between samples, drastically reduce the time needed for the size-selection step, and improve yield by recovering more DNA than manual gels do.
To learn more, check out these publications and profiles of scientists using automated DNA size selection in their workflows.