The Next Big Thing: Alternative Polyadenylation

What Is Alternative Polyadenylation?

Processing of mRNA and its regulation play a fundamental role in gene expression. As research progresses, alternative polyadenylation is increasingly recognized as a central layer of that regulation. ^1,2

Polyadenylation is part of the pre-mRNA maturation process and involves adding a poly(A) tail to the 3’ end of the emerging RNA. This process happens to all eukaryotic mRNAs, except replication-dependent histone transcripts, and is a two-step reaction: cleavage of the pre-mRNA, followed by synthesis of a polyadenylate tail (about 250 nucleotides in mammalian cells and about 50 nucleotides in yeast). Much like alternative splicing, alternative polyadenylation seems to play a big role in gene expression. Indeed, 50% or more of human genes appear to give rise to different transcripts through different patterns of polyadenylation. ^2,3

Polyadenylation can occur at different sites. The site can lie within introns or exons (coding-region alternative polyadenylation, or CR-APA), producing different isoforms of the same protein and therefore affecting gene expression qualitatively. Alternatively, the site can lie in the 3’ untranslated region (3’UTR). This UTR-APA won’t change the protein itself; however, it gives rise to transcripts with 3’UTRs of different lengths, which in turn can affect protein expression quantitatively. Since the 3’UTR carries regulatory sequences for gene expression, a longer 3’UTR generally exposes the transcript to more negative regulation. Therefore, generally, shorter transcripts equal more protein, while longer transcripts equal less protein. Besides mRNA stability, this process impacts many features of mRNA metabolism, such as mRNA localization, translation efficiency, and nuclear export.^1,2

So, it isn’t surprising that polyadenylation has been linked to physiological and pathological processes. In fact, proliferating cells seem to have shorter 3’UTRs because they need to produce large quantities of protein very fast, while quiescent cells have longer 3’UTRs. Mutations affecting the polyadenylation process have been linked to various diseases, such as α- and β-thalassemia, immunodysregulation polyendocrinopathy enteropathy X-linked (IPEX) syndrome, and cyclin D1-related cancers, among others. ^1,2

However, alternative polyadenylation is still a mystery. Hence, we need to understand how it affects gene expression of a normal cell, how deviations can lead to disease, and finally, how it can be used in therapeutic or preemptive medicine.

So, the question stands:

How Can We Study Alternative Polyadenylation?

Alternative polyadenylation research is not an easy task, because it is tricky to study a repetitive sequence, especially when it sits at the end of the transcript, as in the case of UTR-APA. For example, RNA-seq is a great technique for many different purposes, but it has a major drawback when studying polyadenylation sites: insufficient coverage of the regions of interest. Consequently, it cannot identify polyadenylation sites precisely. ¹

However, there are some great NGS-based techniques developed specifically for polyadenylation site identification, using diverse approaches. The goal of these techniques is to enrich the fragments containing poly(A) and sequence them on an NGS platform; these approaches are known as 3’-enriched RNA-seq. 3’-enriched RNA-seq approaches have advantages over RNA-seq in studying polyadenylation sites and quantifying the different isoforms created by alternative polyadenylation. They can be categorized into two groups, based on how they enrich the 3’ termini of transcripts: oligo(dT) priming-based methods and RNA manipulation-based methods. ^1,2
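To make "quantifying different isoforms" concrete, here is a minimal Python sketch that turns per-site read counts into relative usage within each gene. The function name, the input layout (a dictionary keyed by gene and site position), and the example counts are all illustrative assumptions, not part of any published pipeline.

```python
from collections import defaultdict


def relative_site_usage(site_counts):
    """Return each site's share of its gene's 3'-end reads.

    site_counts maps (gene, site_position) -> read count.
    Both the layout and the names are hypothetical, for illustration only.
    """
    # Sum the reads over all polyadenylation sites of each gene.
    gene_totals = defaultdict(int)
    for (gene, _position), count in site_counts.items():
        gene_totals[gene] += count

    # Divide each site's count by its gene total.
    usage = {}
    for (gene, position), count in site_counts.items():
        total = gene_totals[gene]
        usage[(gene, position)] = count / total if total else 0.0
    return usage


# Made-up counts: GENE_A has a proximal and a distal site, GENE_B has one site.
counts = {("GENE_A", 1200): 30, ("GENE_A", 2500): 10, ("GENE_B", 800): 5}
usage = relative_site_usage(counts)
for site, fraction in sorted(usage.items()):
    print(site, fraction)
```

A shift in these fractions between two conditions, for example toward the proximal site in proliferating cells, is the kind of signal such data are used to detect; a real analysis would also need replicates and a statistical test.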

Oligo(dT) priming-based methods

The oligo(dT) priming-based approach uses the poly(A) sequence of the mRNA to capture it by its 3’ terminus. The mRNA is reverse-transcribed using an oligo(dT) primer, producing complementary DNA (cDNA). Since oligo(dT) primers are single-stranded stretches of deoxythymidine (dT), they anneal to the poly(A) through base complementarity.

Several techniques fall into this category; they differ in features such as fragmentation, 5’ adapting and second-strand synthesis. Here we will briefly discuss some of these techniques.¹

Poly(A) site sequencing (PAS-seq)

This is a commonly used technique to sequence polyadenylation sites. The first step is to fragment poly(A)+ RNAs to 60-200 nucleotides. The fragments are then reverse-transcribed into cDNA. The reverse transcriptase adds a few untemplated deoxycytidines to the 3’ end of the cDNA; an adaptor present in the reaction then anneals to these deoxycytidines. At this point, the reverse transcriptase switches template and extends the cDNA to the 5’ end of the adaptor. The reverse transcription reaction therefore produces cDNA molecules with adaptors on both ends, eliminating the need for extra steps such as end-repair, A-tailing and adaptor ligation. After synthesis of the second strand, cDNAs are selected by size and amplified by PCR, and the resulting library is ready for sequencing. This technique has several advantages: it is fairly easy to implement, and it cuts out bothersome steps by introducing the sequencing primers into the first strand of cDNA. ¹

This type of oligo(dT) priming-based method has some disadvantages that are worth mentioning. Because of base complementarity, the oligo(dT) primers may anneal to internal A-rich sequences, a process known as internal priming, and the resulting reads may be falsely recognized as polyadenylation peaks. Even though most of these false results can be removed computationally, the filtering may also discard real polyadenylation sites, especially those flanking A-rich sequences.

Besides, the T-stretch of the oligo(dT) primers is maintained throughout the experimental procedure. This preserves strand information, but it can be problematic during sequencing. In fact, homopolymers (simple repetitive sequences) may decrease the base-calling quality. The decrease may be caused by sequencing desynchronization: molecules in the same cluster have different sequencing starting points, leading to abnormal sequencing peaks. One way to minimize this problem is to use oligo(dT) primers with fewer Ts; however, this may decrease specificity and therefore increase internal priming, which may lead to false positives. Another way would be to sequence the strand starting at the 5’ end of the mRNA, which would increase base-calling quality because the homopolymer would only be read at the end of the sequencing process. However, this would decrease sequencing depth at the 3’ end, which in turn could mask real polyadenylation sites.¹
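As an illustration of how internal priming events can be flagged computationally, the Python sketch below marks a candidate site as suspicious when the genomic sequence immediately downstream of it is A-rich. The window size and both thresholds are assumptions chosen for the example, not values taken from the methods discussed here, and the function name is hypothetical.

```python
def is_internal_priming_candidate(downstream_seq, window=10,
                                  min_a_in_window=7, min_a_run=6):
    """Flag a site whose downstream genomic sequence is A-rich.

    downstream_seq is the genomic sequence just 3' of the candidate
    cleavage site. The thresholds are illustrative assumptions.
    """
    seq = downstream_seq.upper()[:window]
    has_long_run = "A" * min_a_run in seq          # e.g. AAAAAA anywhere in the window
    is_a_rich = seq.count("A") >= min_a_in_window  # e.g. at least 7 As out of 10
    return has_long_run or is_a_rich


# Made-up downstream sequences for two candidate sites.
candidates = {
    "site_1": "AAAAAAAGCT",  # genomically encoded A-stretch: likely internal priming
    "site_2": "GTCTGTTTCA",  # no A-rich stretch: kept as a putative real site
}
kept = [name for name, seq in candidates.items()
        if not is_internal_priming_candidate(seq)]
print(kept)
```

The sketch also shows why this kind of filter can discard real sites: a genuine polyadenylation site that happens to lie next to an A-rich genomic stretch looks identical to an internal priming event.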

Polyadenylation sequencing (PA-seq)

In this method, the oligo(dT) primers are altered to try to minimize the above problem: a deoxythymidine (dT) in the primer is replaced by a deoxyuridine (dU), which can then be cleaved by a uracil-specific excision reagent (USER). This removes most of the Ts and therefore avoids the homopolymer-sequencing problem.¹

Other Oligo(dT) Priming-Based Methods

The A-seq method also tackles this problem, with a different solution: a split primer. This split primer is an oligo(dT) primer with a hairpin structure (the strand folds onto itself and pairs with itself through base complementarity) that contains the 3’ adaptor sequence. The PCR primers bind to the 3’ adaptor within the oligo(dT) primer, removing the T-stretch from the final library.¹

Another technique, 3’T-fill, adds unlabeled dTTPs to the poly(A), ‘muting’ this repetitive sequence during the sequencing step and circumventing the homopolymer-sequencing problem.¹

Another method, known as whole transcriptome termini site sequencing (WTTS-seq), is very promising as it reduces the risk of internal priming. It consists of four steps: fragmentation, poly(A)+ RNA enrichment, reverse transcription to produce the first strand of cDNA, and synthesis of the second cDNA strand by PCR. After synthesis of the first cDNA strand, single-stranded RNAs and RNA-DNA hybrids are removed by RNases. The second cDNA strand is then synthesized by PCR using a poly(A)-anchored primer (PAAP). ^1,4

Collibri RNA Library Prep

Invitrogen (Thermo Fisher Scientific) has developed a kit for the preparation of cDNA libraries for strand-specific RNA sequencing called Invitrogen™ Collibri™ Stranded RNA Library Prep Kit for Illumina™ with Collibri™ Human/Mouse/Rat rRNA Depletion Kit.

After fragmentation of RNA by RNase III, the sample is hybridized with helper adaptors, leading to their ligation at the 5’ and 3’ ends. After hybridization, the RNA is converted into cDNA. The cDNA is then amplified by PCR, and full-length Illumina™-compatible sequencing adaptors are introduced. The sample is now ready to be sequenced!

This kit permits fast library construction (within 6 hours) for whole transcriptome sequencing, and allows multiplexing of libraries by using up to 96 single-indexed primers. It can be safely used on RNA samples of varying quality, including those extracted from Formalin-Fixed Paraffin-Embedded (FFPE) tissues.

This technique successfully removes ribosomal RNA (rRNA), and offers uniform coverage and high transcript sensitivity. Besides, it is effective in the detection of non-coding RNA and preserves the 3’ end sequence information! In libraries prepared from polyadenylated mRNA, approximately 10% of reads contain poly(A) sequence. Sequences adjacent to poly(A) stretches will indicate alternative polyadenylation positions with high confidence.
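As a rough illustration of how reads containing poly(A) can point to a polyadenylation position, the Python sketch below splits a read at its terminal run of As and returns the adjacent upstream sequence. It assumes the read is in the sense orientation and that the tail contains no sequencing errors; the function name and the minimum tail length are hypothetical choices for the example, not part of the kit's workflow.

```python
import re


def split_at_poly_a(read, min_tail=10):
    """Split a read at a terminal poly(A) run.

    Returns (upstream_sequence, tail_length), or None when the read does
    not end in at least min_tail consecutive As. min_tail is an
    illustrative assumption.
    """
    match = re.search(r"A+$", read.upper())
    if match is None or len(match.group()) < min_tail:
        return None
    # The last base of the upstream sequence marks the candidate cleavage position.
    return read[:match.start()], len(match.group())


# Made-up reads: the first ends in a 12-nt tail, the second has no tail.
print(split_at_poly_a("GATTACAGT" + "A" * 12))
print(split_at_poly_a("GATTACAGTCCGT"))
```

Mapping the upstream part back to the genome would then place the end of the transcript. Note that when the genome itself encodes As at the junction, the exact cleavage position is ambiguous by a few nucleotides.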

In addition, you can track the library preparation progress visually: key components are color-coded, and therefore the color of your sample changes based on the reagents you have added, allowing you to know exactly which step you are at!

You can learn more by visiting the Invitrogen Collibri page.

RNA manipulation-based methods

These methods were established as a response to the internal priming problem of the oligo(dT) priming-based methods. Generally, in RNA manipulation-based methods, RNA fragments with polyadenylation sites are enriched, and adaptors are added to the 3’ end of these fragments; the adaptors then serve as the annealing site for the reverse transcription primers.

Poly(A)-position profiling (3P-seq)

This method uses splint ligation to attach a biotinylated primer-binding site to the poly(A). Splint ligation is very convenient because it allows the introduction of modified molecules into the RNA strand, or even the assembly of smaller RNAs into one bigger RNA molecule. After being splint-ligated, the RNA-primer complex is partly digested by an RNase. The biotinylated site added by the splint ligation permits the polyadenylated fragments to be captured with streptavidin. The fragments then undergo reverse transcription, with dTTP as the only deoxynucleoside triphosphate. Another digestion step releases the fragments. At this point, 3’ and 5’ adaptors are added, and the library is ready to be sequenced. 3P-seq successfully avoids internal priming. However, it is technically challenging and involves complex steps, making it time-consuming and demanding. ^1,5,6

3′ region extraction and deep sequencing (3′ READS)

This is another RNA manipulation-based method. After fragmentation, RNA fragments containing poly(A) are captured by magnetic beads coated with CU₅T₄₅ (chimeric oligos with 45 thymidines and 5 uridines). Then, a digestion step using RNase releases the fragments from the magnetic beads and removes most of the As from the poly(A) tail, thus resolving the internal priming problem. However, much like 3P-seq, it is a technically difficult procedure.^1,7

Which Technique to Choose?

As we mentioned throughout this article, all approaches have pros and cons that should be considered before making a decision. The table below compares the techniques briefly discussed here:

Method	Fragmentation	Adapting	Sequencing desynchronization	Internal priming
3SEQ	Heat shearing	DNA ligation	No	Yes
PAS-seq	Heat shearing	Reverse transcription with template switching	Yes	Yes
PA-seq	Heat shearing	DNA ligation	No	Yes
A-seq	RNase I	RNA ligation	No	Yes
3′T-fill	Heat shearing	DNA ligation	No	Yes
WTTS-seq	Heat shearing	Reverse transcription and second strand synthesis	Yes	Rare
3P-seq	–	RNA ligation	No	No
3′ READS	Heat shearing	RNA ligation	Yes	No
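If you prefer to screen the options programmatically, the following Python sketch encodes the last two columns of the table and shortlists methods by how much internal priming and sequencing desynchronization you are willing to accept. The dictionary layout, the ranking of "No" below "Rare" below "Yes", and the function name are assumptions made for illustration; the values themselves are copied from the table.

```python
# Last two columns of the comparison table, keyed by method name.
METHODS = {
    "3SEQ": {"desynchronization": "No", "internal_priming": "Yes"},
    "PAS-seq": {"desynchronization": "Yes", "internal_priming": "Yes"},
    "PA-seq": {"desynchronization": "No", "internal_priming": "Yes"},
    "A-seq": {"desynchronization": "No", "internal_priming": "Yes"},
    "3'T-fill": {"desynchronization": "No", "internal_priming": "Yes"},
    "WTTS-seq": {"desynchronization": "Yes", "internal_priming": "Rare"},
    "3P-seq": {"desynchronization": "No", "internal_priming": "No"},
    "3' READS": {"desynchronization": "Yes", "internal_priming": "No"},
}

# Assumed ordering of the internal priming labels, from best to worst.
PRIMING_RANK = {"No": 0, "Rare": 1, "Yes": 2}


def shortlist(max_internal_priming="Rare", allow_desynchronization=True):
    """Return the methods that meet both tolerance settings, in table order."""
    limit = PRIMING_RANK[max_internal_priming]
    return [
        name
        for name, props in METHODS.items()
        if PRIMING_RANK[props["internal_priming"]] <= limit
        and (allow_desynchronization or props["desynchronization"] == "No")
    ]


print(shortlist())
print(shortlist(max_internal_priming="No", allow_desynchronization=False))
```

The table does not capture technical difficulty, which the text flags as the main cost of 3P-seq and 3′ READS, so a shortlist like this is only a starting point.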

What technique have you used? For more information, download our free infographic.


References:

1. Chen, Wei, et al. “Alternative polyadenylation: methods, findings, and impacts.” Genomics, Proteomics & Bioinformatics (2017).
2. Tian, Bin, and James L. Manley. “Alternative polyadenylation of mRNA precursors.” Nature Reviews Molecular Cell Biology 18.1 (2017): 18.
3. Di Giammartino, Dafne Campigli, Kensei Nishida, and James L. Manley. “Mechanisms and consequences of alternative polyadenylation.” Molecular Cell 43.6 (2011): 853-866.
4. Zhou, Xiang, et al. “Accurate profiling of gene expression and alternative polyadenylation with Whole Transcriptome Termini Site Sequencing (WTTS-Seq).” Genetics (2016): genetics-116.
5. Jan, Calvin H., et al. “Formation, regulation and evolution of Caenorhabditis elegans 3′ UTRs.” Nature 469.7328 (2011): 97.
6. Kershaw, Christopher J., and Raymond T. O’Keefe. “Splint ligation of RNA with T4 DNA ligase.” Recombinant and In Vitro RNA Synthesis. Humana Press, Totowa, NJ, 2013. 257-269.
7. Hoque, Mainul, et al. “Analysis of alternative cleavage and polyadenylation by 3′ region extraction and deep sequencing.” Nature Methods 10.2 (2012): 133.
