Sage Science develops sample prep technologies for life science research. We focus on electrophoretic approaches that improve and automate high-value steps in Next Gen sequencing workflows.
Sage sells the Pippin™ line of DNA size selection instruments, which are widely used for DNA, RNA, and ChIP-seq library construction for short-read sequencing. Our systems are also used for preparing high molecular weight DNA for 3rd generation, long-range genomics platforms.
Our products are manufactured at our headquarters in Beverly, Massachusetts, USA.
Size Analysis of High-Molecular-Weight DNA for Long-Read Sequencing
Discover how to check DNA quality for long-read sequencing using electrophoresis and why pipetting carefully is so important.
Genomes are too large to be sequenced in one piece; they must first be chopped up into overlapping fragments, which are then reassembled based on their overlapping sequences. Sequencing fewer, larger fragments rather than many smaller ones makes genome assembly easier and more reliable, since each piece contains more distinctive sequences. [1] Long reads can also help find large, complicated genetic variants and can be invaluable in epidemiological studies that rely on microbial DNA fingerprints.
The two primary producers of long-read DNA sequencing technologies, Pacific Biosciences (PacBio®) and Oxford Nanopore Technologies, can routinely generate single-molecule reads hundreds of kilobases in length. [1] However, these advances have created new challenges in DNA handling and preparation. The high-molecular-weight (HMW) DNA used in long-read sequencing is more fragile than DNA used in short-read sequencing and requires unique methods for extraction and purification that minimize shearing. It is essential to ensure the integrity of the starting sample by assessing DNA quality, shearing profiles, and library size. One way to do this is to analyze HMW DNA sequencing libraries via electrophoresis. This article reviews several electrophoresis-based options to resolve and determine the size of HMW DNA.
The Importance of Quality Control in Long-Read DNA Sequencing
The small DNA molecules used in short-read sequencing (75-300 bp) are quite robust and can easily withstand extraction procedures, bead purifications, and shearing protocols. Assessing DNA quality via size analysis is generally quick and painless using a Bioanalyzer chip or TapeStation (Agilent) or merely running a midi slab gel.
In contrast, HMW DNA (fragments over 10 kb in length) can break at any one of several library construction steps. Every pipetting step can break the DNA into small fragments that can affect your sequencing results, so pipette slowly and use wide-bore tips. During DNA sequencing, the presence of smaller molecules in a library reduces the average read length, eroding the primary benefit of long-read sequencing: the ability to more easily and correctly reassemble the fragments. [3,4] Shorter DNA can be removed from libraries with methods like Sage Science’s BluePippin High-Pass DNA size selection. However, this can reduce DNA yield, depending on the fragment size cut-off and the fragment size distribution of the pre-selection library. [5]
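To make the trade-off between read length and yield concrete, here is a minimal sketch in Python. It assumes an idealized, perfectly sharp size cut-off, which real size selection does not deliver, and the function name and fragment lengths are hypothetical, chosen only for illustration.

```python
def high_pass_summary(lengths_bp, cutoff_bp):
    """Summarize an idealized high-pass size selection.

    lengths_bp: fragment lengths in base pairs (made-up example data).
    cutoff_bp:  fragments shorter than this are discarded (assumed sharp cut-off).
    """
    if not lengths_bp:
        raise ValueError("no fragments supplied")
    kept = [n for n in lengths_bp if n >= cutoff_bp]
    if not kept:
        raise ValueError("cut-off removes every fragment")
    return {
        # Mean fragment length before and after selection
        "mean_before_bp": sum(lengths_bp) / len(lengths_bp),
        "mean_after_bp": sum(kept) / len(kept),
        # Fraction of total DNA mass retained (mass scales with length)
        "mass_yield": sum(kept) / sum(lengths_bp),
    }


# Hypothetical library: three short fragments and three long ones
library = [2000, 3000, 5000, 20000, 30000, 40000]
print(high_pass_summary(library, cutoff_bp=10000))
```

In this toy library, half of the molecules are short but they carry only a tenth of the DNA mass, so removing them raises the mean fragment length considerably at a modest cost in yield. A library with more of its mass near the cut-off would lose far more.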
With or without size selection, it is important to evaluate the starting DNA quality when working with HMW DNA by assessing the post-shear fragment distribution and the final library size. The following methods allow you to do just that so that you can be sure that your long-read sequencing libraries do, in fact, begin with long pieces of DNA [5,6].
Electrophoresis-Based Methods for Analyzing HMW DNA Size
As any molecular biologist can tell you, gel electrophoresis works by using an electric field to move DNA through a molecular sieve. Small molecules travel through the sieve more easily and quickly, while larger ones get tangled and move more slowly. You can determine the size of your DNA sample by comparing its position in the gel to that of a known standard, usually in the form of a size ladder. Resolving HMW fragments that exceed the size of the pores is problematic, though, as larger molecules (15-20 kb) may not move through the gel at all. Several electrophoretic methods have been developed specifically to resolve these larger DNA molecules.
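Sizing a band against a ladder can be sketched as a simple interpolation. The example below assumes that, between two neighboring ladder bands, the logarithm of fragment size varies roughly linearly with migration distance. That is a common approximation for conventional gels and is only a rough guide for pulsed-field runs, where migration depends on the pulse settings. The function name and the ladder values are hypothetical.

```python
import math


def estimate_size_bp(ladder, distance_mm):
    """Estimate fragment size from migration distance.

    ladder: list of (size_bp, distance_mm) pairs for the known standard,
            with distinct distances (made-up values in the example below).
    distance_mm: migration distance of the unknown band.
    Interpolates log10(size) linearly between the two flanking ladder bands.
    """
    points = sorted(ladder, key=lambda p: p[1])  # order by distance
    if not points[0][1] <= distance_mm <= points[-1][1]:
        raise ValueError("band lies outside the ladder; extrapolation is unreliable")
    for (size0, dist0), (size1, dist1) in zip(points, points[1:]):
        if dist0 <= distance_mm <= dist1:
            frac = (distance_mm - dist0) / (dist1 - dist0)
            log_size = math.log10(size0) + frac * (math.log10(size1) - math.log10(size0))
            return 10 ** log_size


# Hypothetical ladder: larger fragments migrate a shorter distance
ladder = [(10000, 10.0), (1000, 20.0), (100, 30.0)]
print(round(estimate_size_bp(ladder, 15.0)))  # band halfway between 10 kb and 1 kb
```

The function refuses to extrapolate beyond the ladder, which mirrors the practical problem described above: once fragments exceed the resolving range of the gel, position no longer tells you much about size.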
Pulsed-Field Gel Electrophoresis
In the early 1980s, groundbreaking work by Schwartz and Cantor showed that using an alternating, pulsed electrical field as opposed to a direct current can resolve HMW DNA up to 2000 kb. [2] In pulsed-field electrophoresis, the voltage is switched periodically among three directions instead of constantly running in one. [3] DNA molecules respond to the voltage changes by realigning at different rates based on their size, with smaller pieces adjusting more quickly. Over time, even long DNA strands are propelled forward.
This “two steps forward, one step back” approach is an effective way of separating large pieces of DNA. However, it can be complicated and time-consuming, and it requires specialized equipment. Field reversals are usually short (milliseconds to seconds) and are often modulated incrementally to achieve the desired results. For instance, users may require high separation within one fragment size range while retaining high compression within another. Figure 1 provides an example of a few seconds of a pulsed-field pattern.
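The idea of incrementally modulated pulses can be sketched as a schedule of forward and reverse pulse durations. The snippet below is an illustration only, not any instrument's actual protocol format: the function name, the linear ramp, and the 2:1 forward-to-reverse ratio (chosen to echo “two steps forward, one step back”) are all assumptions.

```python
def field_inversion_schedule(initial_fwd_s, final_fwd_s, cycles, fwd_to_rev_ratio=2.0):
    """Build a list of (forward_s, reverse_s) pulse pairs.

    The forward pulse duration ramps linearly from initial_fwd_s to
    final_fwd_s over the given number of cycles. Each reverse pulse is
    the forward pulse divided by fwd_to_rev_ratio (assumed value).
    """
    if cycles < 2:
        raise ValueError("need at least two cycles to define a ramp")
    if fwd_to_rev_ratio <= 1.0:
        raise ValueError("ratio must exceed 1 so the DNA makes net forward progress")
    step = (final_fwd_s - initial_fwd_s) / (cycles - 1)
    schedule = []
    for i in range(cycles):
        forward = initial_fwd_s + i * step
        schedule.append((forward, forward / fwd_to_rev_ratio))
    return schedule


# Hypothetical three-cycle ramp from 1 s to 2 s forward pulses
schedule = field_inversion_schedule(1.0, 2.0, cycles=3)
print(schedule)

# Net time spent driving DNA forward across the whole schedule
net_forward_s = sum(fwd - rev for fwd, rev in schedule)
print(net_forward_s)
```

Because each forward pulse outlasts its reverse pulse, the net field time is always forward. Lengthening the pulses over the run gives progressively larger molecules time to reorient.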
Femtopulse (Pulsed-Field Capillary)
Agilent’s Femtopulse is a capillary-based system that can resolve DNA from 1300 bp to 165 kb. This size range comfortably covers the distribution of PacBio’s single-molecule sequencing libraries, which are typically between 15 kb and 100 kb. However, the 165 kb upper limit may not accurately depict larger fragment distributions, such as those of Oxford Nanopore libraries.
The Femtopulse is a great way to measure library size distributions and to quality-check starting DNA (degraded DNA contains fragments that smear below 50 kb). Its major advantage over slab gels is its user-friendly analysis software, which quickly provides accurate sizing and quantification. It is also preferable when sample amounts are limited. The Femtopulse is expensive, so it is probably best suited to dedicated high-throughput labs where its cost can be justified (e.g., those labs using PacBio Sequel II or Oxford Nanopore PromethION).
CHEF Mapper (Pulsed-Field Gel)
The Bio-Rad CHEF Mapper XA System is based on work by Schwartz and Cantor, [2] but allows more flexibility over the angle of the pulsed field. It can resolve ultra-high-molecular-weight DNA fragments, from 100 bp up to 10 Mb in length. Nonlinear switch-time ramps change the switch time increments during the run and separate fragments from 50 to 700 kb. Secondary pulses can improve separation and resolution of larger molecules by releasing large DNA molecules stuck in the gel matrix. The CHEF Mapper is more affordable than the Femtopulse but requires third-party gel imaging software. Its drawbacks are that it is a large apparatus compared to typical agarose gel setups and that it can require water recirculation.
Pippin Pulse (Field Inversion Gel)
The Sage Science Pippin Pulse is a simple option at a fraction of the cost of the Femtopulse or CHEF Mapper. It uses field inversion (back-and-forth field reversal) in a typical midi-gel-sized box with platinum electrodes reinforced to withstand the pulse regimens. The Pippin Pulse can resolve DNA up to 400 kb, can handle most of the chores in a PacBio or Oxford Nanopore workflow, and takes up very little bench space. Gels are cast like any midi-gel, and the system uses a standard power supply. When running the gel, simple software applies either preset or user-defined pulsed-field or direct-current protocols. As with the CHEF Mapper, third-party imaging software is required for sizing and quantification. Figure 2 shows an example of a Pippin Pulse protocol.
Summary of Electrophoresis-Based Methods for Analyzing HMW DNA Size
The table below provides a summary of the different electrophoresis-based methods for analyzing HMW DNA size, based on cost, size, resolution, and any need for third-party software.
[table id=141 /]
You can only reap the benefits of long-read sequencing if you start with intact fragments of long DNA. These tools can help confirm that your HMW DNA has not been sheared during isolation and purification, enabling you to get the cleanest reads possible.