Viruses are a vastly diverse group of infectious particles with many different structures, mechanisms of function and ingenious strategies for invading host organisms for their own proliferation. One of the key features that ties viruses together as an inclusive group is their reliance on living cells for replication and propagation. On their own, viruses lack the cellular machinery necessary for many life-sustaining functions, including protein translation and metabolism. Regardless of the organization of a viral genome or the type of nucleic acid, infection of a host cell and viral propagation depend on the transcription of viral mRNA and, in turn, on the translation of viral proteins as well as genome replication. Because viruses depend on host cell machinery for most of these processes, they have driven an outstanding virus-host co-evolution. Viruses that rely on the replication machinery of the host cell become cell-cycle dependent in their own replication. Furthermore, just as viruses have evolved ways to hijack necessary cellular proteins, cells have evolved complex mechanisms for fighting infection by detection and degradation of foreign mRNA. In order for viral mRNA to utilize host cell machinery, begin translation and remain both stable and undetected in the cytoplasm, it must contain the post-transcriptional modifications of a host cell mRNA including, but not limited to, a 5’ cap structure. By disguising viral mRNA with the same structural elements found in host mRNA, the cellular defense mechanism can be evaded and protein translation may occur. The significance of the cap structure can be seen through the diversity of cap-synthesis pathways across vastly different viral families that all lead to the formation of a ubiquitous RNA 5’ cap. The 5’→3’ direction of nucleoside triphosphate (NTP) polymerization during RNA synthesis creates a nascent mRNA molecule with a 5’-triphosphate moiety resulting from the initial NTP on the 5’-end. Through the processes involved in cap synthesis, this pppRNA structure is transformed into a basic cap-0 RNA structure (m7GpppN). Further 2’-O-methylations of the first and second nucleotides of the RNA may occur.
In this chapter, a number of processes used by viruses to synthesize, acquire or mimic a 5’ cap are explored to highlight the similarities and differences in the enzymatic mechanisms that lead to the maturation of a 5’ cap on viral RNA, and the importance of this cap in viral genome replication within a host cell.
2. Description of the RNA cap structure
To understand the importance of an RNA cap structure for viruses, it is crucial to first understand why this structure is essential to their eukaryotic hosts. Prokaryotic RNA transcription and protein translation are coupled due to the spatial proximity between DNA and ribosomes. In eukaryotic cells, however, newly synthesized RNA transcripts undergo several nuclear post-transcriptional modifications, known as RNA processing, before they are exported and translated in the cytoplasm. These eukaryotic pre-mRNA modifications include the addition of a cap structure at the 5’-end, the splicing out of introns, the editing of nucleobases and the addition of a poly(A) tail at the 3’-end. RNA capping is a co-transcriptional process that occurs when an RNA molecule is 20-30 nucleotides in length. The cap structure consists of a guanosine residue, methylated at the N7 position, which is bound to the terminal 5’-end nucleotide through a peculiar 5’-5’ triphosphate bridge (Fig. 1). This inverted link between the two nucleotides prevents RNA degradation by 5’-3’ exonucleases. The second important feature of the cap structure is the presence of the methyl group on the guanosine, which confers a positive charge that plays an important role in its specific recognition by specialized proteins. The cap structure fulfills many roles which ultimately lead to mRNA translation. In the nucleus, for instance, the cap structure of pre-mRNAs is recognized by the cap binding proteins (CBP20 and CBP80). This cap binding complex (CBC) protects mRNA from degradation and assists RNA transport from the nucleus to the cytoplasm. Once in the cytoplasm, ribosomes and translation factors must be recruited for translation of mRNAs into proteins. The eukaryotic translation initiation factor 4E (eIF4E) specifically binds to the RNA cap structure. This association is mediated through stacking of the methylated guanine between two aromatic residues of the eIF4E protein; the mRNA binding is further stabilized by specific hydrogen bonds between the positively charged 7-methylguanosine and an acidic residue. Upon cap binding, eIF4E assembles with eIF4G (a scaffold protein) and eIF4A (an RNA helicase) into the eIF4F complex. The scaffolding protein eIF4G recruits the small 40S ribosomal subunit through the eIF3 complex. The translation initiation complex then scans the mRNA for the start codon before recruiting the larger subunit of the ribosome, and translation of the open reading frame (ORF) takes place. Taken together, the roles fulfilled by the RNA cap structure are crucial for RNA stability and translation. Because of this, many eukaryotic viruses require strategies, such as RNA cap synthesis, in order to protect, replicate and translate their genomes in eukaryotic hosts.
3. Conventional and unconventional 5’ RNA cap synthesis mechanism
3.1. Canonical cap synthesis by different viruses
The importance of the cap structure in eukaryote metabolism has resulted in an evolutionary pressure for viruses to adopt a similar cap structure. A series of enzymatic reactions is required to synthesize a cap structure at the 5’-end of RNA. The most pervasive enzymatic pathway, also termed “conventional capping”, consists of three sequential enzymatic activities that are required to generate a functional 7-methylguanosine 5’-5’-triphosphate bridged cap structure. As a result of the directional 5’ to 3’ polymerization of nucleoside triphosphates (NTPs) during RNA synthesis, nascent RNAs bear a triphosphate moiety at their 5’-end (originating from the initial NTP). This 5’-triphosphate end of the RNA is first converted into a 5’-diphosphate end by hydrolysis of the terminal phosphate, or γ-phosphate, by an RNA triphosphatase (RTPase). This is followed by a two-step reaction catalyzed by an RNA guanylyltransferase (GTase). The enzyme first specifically binds and hydrolyzes a GTP molecule to form a covalent enzyme-GMP intermediate; in the second step of the GTase reaction, the GMP moiety is transferred onto the 5’-end of a diphosphorylated acceptor RNA (ppRNA). Lastly, an RNA (guanine-N-7)-methyltransferase (N7MTase) uses S-adenosyl methionine (SAM) as a methyl group donor in order to methylate the guanosine residue of the cap structure at the N7 position. This sequence of enzymatic modifications yields the minimal RNA cap-0 structure (m7GpppN). Subsequent methylation of the 2’-hydroxyl group of the first few nucleotides of the RNA can be catalyzed by a (nucleoside-2’-O)-methyltransferase (2’OMTase), again using a SAM molecule as the methyl donor (Fig. 2). These further methylations of the cap-proximal nucleotides convert a cap-0 structure into a cap-1 (m7GpppNm) or cap-2 (m7GpppNmNm) structure.
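The succession of 5’-end states described above can be summarized as a small bookkeeping model. The following Python sketch is only an illustration: the state names (pppN, ppN, GpppN, m7GpppN, m7GpppNm) are taken from the text, whereas the function names apply_activity and cap are hypothetical, and no kinetics or structural detail is represented.

```python
# Toy bookkeeping model of the conventional capping pathway.
# State names follow the text; the function names are illustrative only.

CANONICAL_STEPS = [
    # (activity, required 5'-end, resulting 5'-end)
    ("RTPase", "pppN", "ppN"),            # gamma-phosphate removed
    ("GTase", "ppN", "GpppN"),            # GMP transferred from GTP
    ("N7MTase", "GpppN", "m7GpppN"),      # cap-0 (methyl donor: SAM)
    ("2'OMTase", "m7GpppN", "m7GpppNm"),  # cap-1 (methyl donor: SAM)
]


def apply_activity(five_prime_end, activity):
    """Return the 5'-end produced by one activity, or raise if the substrate is wrong."""
    for name, substrate, product in CANONICAL_STEPS:
        if name == activity:
            if five_prime_end != substrate:
                raise ValueError(f"{activity} cannot act on {five_prime_end}")
            return product
    raise KeyError(activity)


def cap(five_prime_end="pppN", up_to="N7MTase"):
    """Run the activities in order and return every intermediate 5'-end."""
    history = [five_prime_end]
    for name, _, _ in CANONICAL_STEPS:
        five_prime_end = apply_activity(five_prime_end, name)
        history.append(five_prime_end)
        if name == up_to:
            break
    return history


print(cap())                  # states up to cap-0
print(cap(up_to="2'OMTase"))  # states up to cap-1
```

Because each activity only accepts the product of the previous one, calling apply_activity("pppN", "GTase") raises an error, which mirrors the requirement that the γ-phosphate be removed before the GMP transfer.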
The conventional RNA 5’ cap synthesis mechanism is used by a majority of viruses in order to acquire a cap structure. Most DNA viruses, together with the RNA viruses from the Bornaviridae and Retroviridae families, use the host RNA polymerase II (RNA Pol II) to transcribe their mRNAs. As a result, the majority of DNA virus transcripts are co-transcriptionally capped using the cellular capping apparatus. In contrast, many RNA viruses with a cytoplasmic replication cycle do not have access to the host RNA Pol II and have therefore evolved their own capping machinery. Over time, a wide diversity of enzyme structures and mechanisms of action have evolved to generate the same highly conserved RNA cap structure (Fig. 3). The following sections describe the enzymes supporting the RTPase, GTase, N7MTase and 2’OMTase activities.
3.2. RNA triphosphatases
The RTPase activity is the first of the three enzymatic reactions required to synthesize a cap structure. The RTPase hydrolyzes the γ-β-phosphoanhydride bond at the 5’-end of an RNA to yield an RNA 5’-diphosphate and inorganic phosphate (Pi). Viruses have evolved a wide variety of enzyme structures and mechanisms of action to fulfill the RTPase activity, a greater diversity than is seen with any other enzymatic capping activity. RTPases are classified as belonging to either the metal-dependent family or the metal-independent family based on their cofactor requirements. As indicated by its name, the first family requires a divalent cation cofactor for its activity. This metal requirement is usually satisfied by Mg2+, although Mn2+ is also able to support the RTPase activity. This family of enzymes also shares the ability to hydrolyze free NTPs, again in the presence of a metal cofactor [5, 6]. The lack of substrate specificity is speculated to be a result of the chemical similarity between an NTP and the RNA 5’-triphosphate end. The metal-dependent RTPase family is further subdivided into three distinct structural groups, namely the triphosphate tunnel metalloenzyme (TTM), histidine triad-like (HIT-like) and helicase-like RTPases (Fig. 4).
The TTM enzymes are found in chlorella virus, poxviruses, baculoviruses, mimiviruses and lower eukaryotes. All TTM RTPases adopt a characteristic fold in which eight antiparallel β-strands assemble into a tunnel scaffold surrounding the active site (Fig. 4). The interior of the tunnel is dominated by hydrophilic amino acid side chains oriented toward the center of the tunnel, creating a network of interactions for the triphosphate moiety of the substrate. Glutamate residues within this amino acid network are also responsible for the coordination of the crucial cation cofactor. The recognition of the RNA substrate, primarily through its triphosphate moiety, could explain the activity of the TTM RTPase against NTP substrates. Interestingly, this NTP hydrolysis is not supported by Mg2+, but is rather dependent on Mn2+ or Co2+. The coordinated metal ion, in conjunction with basic lysine and arginine residues, activates the γ-phosphate and stabilizes the pentacoordinate phosphorane transition state. A glutamate serves as a general base catalyst to activate the nucleophilic water for the attack on the γ-phosphorus according to a one-step in-line mechanism. TTM RTPases have been acquired by large DNA viruses from their hosts. Interestingly, modern Poxviridae infect higher eukaryotes that lack a TTM RTPase, which points to their evolution from viral ancestors that replicated in unicellular eukarya, from which they likely acquired a TTM RTPase.
The HIT-like RTPase is so far only represented by the NSP2 enzyme of rotaviruses (dsRNA viruses). The name of this family is based on the structural resemblance between the NSP2 C-terminal domain (CTD) and the ubiquitous cellular histidine triad nucleotidyl hydrolases (HIT). The NSP2 protein associates into an octamer to form a doughnut-shaped quaternary structure (Fig. 4) [9, 10]. RNA binding grooves are found at the surface of the doughnut while the active site is buried deep in an electropositive cleft on each monomer. Despite its structural similarity with HIT, NSP2 appears to be catalytically distinct. The catalytic histidine triad requires a Mg2+ cofactor to hydrolyze the γ-β-phosphoanhydride bond and form a covalent phosphate-histidine intermediate. The enzyme harbours similar catalytic rates toward both NTP and pppRNA substrates. Increased affinity for RNA, conferred by the RNA binding grooves, is speculated to stimulate RTPase activity over NTPase activity in vivo. Despite the structural similarity with HIT, there is currently no evidence that HIT-like RTPases evolved from their cellular counterparts; convergent evolution appears more probable.
The helicase-like RTPases are found in a variety of ss(+) RNA viruses of the flavivirus, coronavirus, potexvirus and alphavirus genera and in the dsRNA viruses of the Reoviridae family. These enzymes are active NTPase-helicases and belong to the large helicase superfamilies SF1 and SF2. The NTPase activity fuels the energy-consuming strand displacement of the helicase activity. The common NTPase-RTPase catalytic site is located in a cleft formed at the junction of two RecA-like subdomains (Fig. 4). As with many nucleotide-binding proteins, the active site of helicase-like RTPases harbours both a Walker A and a Walker B motif [12, 13]. The Walker A motif (GxxxxGK(T/S)), or phosphate-binding loop (P-loop), is responsible for contacting the γ-phosphate through its highly conserved lysine. The aspartate of the Walker B motif (DExD) coordinates the crucial Mg2+, which stabilizes the β- and γ-phosphates, while the glutamate activates the water molecule for the hydrolysis reaction. The addition of the RTPase activity to an NTPase-helicase ancestor appears to result from only a minor evolutionary progression, as the ancestor enzyme already displayed the key RTPase features, namely a nucleic acid binding domain, a triphosphate binding active site and a terminal phosphate hydrolysis activity.
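Sequence motifs written in this consensus notation translate directly into regular expressions, which is how candidate Walker A and Walker B sites are often located in a first pass. The sketch below uses Python's standard re module and assumes that "x" stands for any residue; the function name find_motifs and the short test sequence are invented for illustration and do not correspond to a real viral helicase. A pattern match alone does not demonstrate catalytic function.

```python
import re

# Consensus motifs quoted in the text, with 'x' read as "any residue".
MOTIFS = {
    "Walker A": re.compile(r"G.{4}GK[TS]"),  # GxxxxGK(T/S), the P-loop
    "Walker B": re.compile(r"DE.D"),         # DExD
}


def find_motifs(protein_sequence):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    sequence = protein_sequence.upper()
    return {
        name: [(match.start() + 1, match.group()) for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Made-up 20-residue sequence, used only to exercise the patterns.
toy_sequence = "MALGPSGSGKTLLADEADRV"
for motif, hits in find_motifs(toy_sequence).items():
    print(motif, hits)
```

Note that finditer reports non-overlapping matches only, which is sufficient for a quick scan but may miss overlapping candidates in repetitive sequences.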
The second family of RTPases is the metal-independent group. Viruses of higher eukaryotes that rely on the capping apparatus of the cell use the host metal-independent RTPase. Moreover, baculovirus also expresses such a metal-independent RTPase. Two striking differences between this enzyme family and the metal-dependent family are its cation-independent mechanism of action and its inability to hydrolyze free NTPs. Metal-independent RTPases are members of the cysteine phosphatase superfamily, sharing their signature HCxxxxxR(S/T) P-loop motif located in a deep, positively charged pocket. The catalytic cysteine is located at the bottom of the triphosphate binding cleft formed by the characteristic α/β-fold tertiary structure (Fig. 4) [15, 16]. The catalytic cycle fits a two-step phosphoryl-transfer reaction. First, the pppRNA γ-phosphate is attacked by the catalytic cysteine to form a covalent protein-cysteinyl-S-phosphate intermediate, which results in the release of the ppRNA product. Next, a water molecule attacks the phosphocysteine to expel the inorganic phosphate and regenerate the enzyme. The metal-independent RTPase presumably evolved from the cysteine phosphatases ubiquitously found in higher eukaryotes and was later acquired by baculoviruses from their hosts. Interestingly, baculovirus also encodes a second RTPase, of the TTM type, fulfilling the same role. This unconventional carrying of two distinct enzymes having the same activity is speculated to be an evolutionary snapshot of an RTPase transition from the lower eukaryote TTM RTPase to the higher eukaryote metal-independent RTPase.
3.3. RNA guanylyltransferase
The second step of the capping sequence is the GTase activity. The GTase catalyzes the rate-limiting transfer of a GMP moiety from a GTP substrate to an acceptor ppRNA to yield an unmethylated cap structure (GpppN). GTases are members of the covalent nucleotidyltransferase superfamily, which also includes the ATP- and NAD+-dependent DNA ligases and the ATP-dependent RNA ligases. Members of this superfamily share a tertiary structure composed of an N-terminal nucleotidyltransferase (NT) domain fused to a C-terminal oligonucleotide/oligosaccharide-binding fold (OB-fold) domain. These flexible proteins are able to undergo large conformational changes during their catalytic cycle. GTases share highly conserved structures and motifs, of which the hallmark KxDG(I/L) motif is present in nearly all GTases. The catalytic cycle of the GTase is a complex two-step ping-pong reaction involving multiple conformational changes. First, a GTase in a conformation where the OB-fold domain is distant from the NT domain (open conformation) specifically binds a GTP molecule. This is followed by the closure of the OB-fold domain toward the NT domain (closed conformation), which is stabilized by interactions between the bound nucleotide and residues from both the NT and OB-fold domains. This conformational change also creates a Mg2+ cofactor binding site; thus, the closed conformation represents the catalytically active form of the enzyme [19, 20]. Upon Mg2+ binding, the α-phosphate of the GTP is sandwiched between the catalytic lysine (from the KxDG motif) and the metal cofactor. Deprotonation of the lysine leads to the attack on the α-phosphate of the GTP to form an enzyme-(lysyl-N)-GMP intermediate (EpG), concomitant with the formation of a pyrophosphate molecule. Following catalysis, interactions between the bound guanylate and the OB-fold domain are disrupted, leading to the reopening of the enzyme and the release of pyrophosphate. The reopening of the guanylylated enzyme allows for accommodation of the ppRNA, which is likely followed by the closure of the OB-fold domain. Closing of the OB-fold domain returns the enzyme to its catalytically active form, which promotes the transfer of the GMP to the acceptor RNA. A final reopening allows the unmethylated capped RNA to be released and the apo-protein to be regenerated (Fig. 5). The active sites of GTases are highly conserved, potentially due to their fairly complex catalytic cycle. Most viruses encode GTases that are, with respect to the active site, nearly identical to their eukaryotic host GTase, favouring the hypothesis of ancestral viral acquisition of the host GTase.
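The ping-pong character of the GTase cycle, in which the first product (pyrophosphate) leaves before the second substrate (ppRNA) is accommodated, can be made explicit with a two-function sketch. This is a schematic in Python and not a kinetic model: the labels E and EpG follow the text, while the function names guanylylate, transfer_gmp and gtase_cycle, and the list CONFORMATIONS, are illustrative assumptions.

```python
def guanylylate(enzyme, nucleotide):
    """Step 1: E + GTP -> EpG + PPi (covalent enzyme-(lysyl-N)-GMP intermediate)."""
    if (enzyme, nucleotide) != ("E", "GTP"):
        raise ValueError("step 1 needs the free enzyme and GTP")
    return "EpG", "PPi"


def transfer_gmp(enzyme, rna_end):
    """Step 2: EpG + ppRNA -> E + GpppRNA (unmethylated cap)."""
    if (enzyme, rna_end) != ("EpG", "ppRNA"):
        raise ValueError("step 2 needs the guanylylated enzyme and ppRNA")
    return "E", "GpppRNA"


# Conformations visited during one full cycle, in the order given in the text:
# GTP binding, catalysis, PPi release / ppRNA binding, GMP transfer, product release.
CONFORMATIONS = ["open", "closed", "open", "closed", "open"]


def gtase_cycle(rna_end="ppRNA"):
    """Run both half-reactions and return (enzyme state, first product, second product)."""
    enzyme, first_product = guanylylate("E", "GTP")
    enzyme, second_product = transfer_gmp(enzyme, rna_end)
    return enzyme, first_product, second_product


print(gtase_cycle())
```

The enzyme returns to its starting state at the end of the cycle, and an RNA that still carries its 5’-triphosphate (pppRNA) is rejected in the second step, consistent with the requirement for prior RTPase activity.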
While nearly all GTases are highly conserved, a few recently discovered viral GTases are different. Little is currently known about those atypical GTases lacking the catalytic KxDG motif. Some segmented dsRNA viruses of the Reoviridae family encode a large multiprotein capsid harbouring nucleic acid maturation functions, including GTase activity. The Reoviridae GTases are structurally different from the conventional GTases. While they lack the conserved KxDG motif, they still maintain the capacity to form an enzyme-(lysyl-N)-GMP intermediate. The flavivirus GTases are also atypical. Their activity is found in the N-terminal portion of the RDRP-MTase polypeptide. They are structurally distinct from both the conventional and the Reoviridae GTases, but they still mediate RNA guanylylation through a two-step mechanism involving an EpG intermediate [21, 22]. The precise amino acid involved in the formation of the guanylate-enzyme complex is speculated to be a lysine, but a histidine or an arginine residue may also play this role. Progress in the field of atypical viral capping enzymes should eventually resolve these uncertainties.
3.4. RNA methyltransferase
The third step of the RNA 5’-end cap synthesis is the methylation of the cap guanosine by an N7MTase. An N7MTase adds a methyl group to the guanine at the N7 position in order to convert the GpppN into a functional m7GpppN cap-0 structure. The conversion of S-adenosyl methionine (SAM) into S-adenosyl homocysteine (SAH) provides the methyl group. N7MTases are members of the large SAM-dependent MTase family, whose members share a low sequence identity but a structurally conserved SAM binding core. This SAM binding pocket, composed of a seven-stranded β-sheet flanked by six α-helices, ensures specific and proper positioning of the SAM molecule, while other structural determinants provide specificity for a range of methyl acceptors [23, 24]. For the N7MTase, those structural determinants are a positively charged RNA-accommodating groove and a GpppN binding pocket that forms extensive electrostatic interactions with the cap guanine, thereby ensuring specificity. Despite a broad network of interactions with both substrates (GpppN and SAM), no direct contact is made between the N7MTase and the reacting groups of its substrates: the guanine N7 nitrogen (methyl acceptor) and the SAM CH3 (methyl donor). The methyl transfer is instead mediated by a direct in-line nucleophilic attack on the SAM methyl moiety by the guanine N7 nitrogen. N7MTases are not directly involved in transition-state stabilization; rather, they optimize the proximity and spatial orientation of the reacting groups of both ligands. In addition, a favourable electrostatic environment further stimulates the catalysis. The degree of conservation among N7MTases is very high, and most viral and eukaryotic N7MTases only differ in their accessory domain. A rare exception is the poxvirus N7MTase, which appears to bind SAM in a slightly different conformation. Moreover, some poxviruses, such as vaccinia virus, have evolved a heterodimeric N7MTase. The vaccinia virus N7MTase D1, for example, relies on its association with the accessory protein D12 to be fully active. This high degree of conservation among N7MTases points toward a common eukaryotic ancestor acquired by viruses.
Lastly, some viruses infecting higher eukaryotes, such as flaviviruses, reoviruses and poxviruses, can further modify their RNA 5’-end through 2’-O-methylation in order to more accurately mimic their host mRNA modifications. This last modification is not required for viruses infecting lower eukaryotes, as their hosts harbour cap-0 mRNA. The 2’OMTase methylates the 2’-hydroxyl group of the first nucleotide of the RNA, allowing for the conversion of an m7GpppN (cap-0) into an m7GpppNm (cap-1). The 2’OMTases are also members of the large SAM-dependent MTase family. When compared to the N7MTase, the 2’OMTase harbours an additional, highly conserved catalytic lysine-aspartate-lysine-glutamate (K-D-K-E) tetrad. These amino acids are not consecutive in the primary sequence, but they cluster together once the protein adopts its three-dimensional structure. The exact catalytic pathway is still controversial, but relies on the conserved aspartate and arginine to lower the pKa of the catalytic lysine, which is responsible for the 2’-hydroxyl group activation. Two mechanisms are proposed for this substrate activation. The first involves the lysine deprotonating the 2’-OH to form a nucleophilic 2’-oxyanion. The second implicates the lysine in the formation of a non-deprotonating hydrogen bond with the 2’-hydroxyl proton, which freezes the 2’-OH rotation at an angle where the 2’-oxygen electron lone pair is steered toward the SAM methyl group. In both cases, the nucleophilic 2’-oxygen attacks the electrophilic SAM methyl group according to an in-line SN2 mechanism [28-31]. The pentavalent methyl intermediate of the transition state is stabilized by the aspartate. Despite the structural homology with the N7MTase, the 2’OMTase harbours a distinct mechanism of methyl transfer. Interestingly, some viruses, such as the flaviviruses, have evolved both N7MTase and 2’OMTase activities within the same enzyme [22, 32]. These dual MTases use the same SAM binding site and accessory domain for both activities, but not the same mechanism of methyl transfer. The classical N7MTase and 2’OMTase mechanisms are instead both present but independent. It is, for example, possible to abolish the 2’OMTase activity through disruption of the K-D-K-E tetrad while maintaining the N7MTase activity [22, 32]. It is important to note that the flavivirus dual MTase accomplishes a sequential methylation, starting with the N7 guanine methylation, followed by a repositioning of the cap structure and, finally, the 2’-hydroxyl methylation [22, 32]. This sequence is virus specific and can be inverted, as exemplified by the vesicular stomatitis virus (VSV), a member of the Rhabdoviridae family. VSV also encodes a dual MTase, but the 2’-O-methylation takes place first and is followed by the N7 methylation. These dual MTases have likely evolved their second MTase activity out of their initial MTase fold.
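The difference between the flavivirus and VSV dual MTases lies only in the order of the two methylations, which produces different intermediates but the same cap-1 end product. A minimal Python sketch of this ordering is given below; the dictionary METHYLATION_ORDER and the function methylate are hypothetical names introduced here, and the notation (GpppN, GpppNm, m7GpppN, m7GpppNm) follows the text.

```python
# Order of the two methylations for the two dual MTases discussed in the text.
METHYLATION_ORDER = {
    "flavivirus": ("N7", "2'-O"),
    "VSV": ("2'-O", "N7"),
}


def methylate(virus):
    """Return the successive cap states, starting from the unmethylated GpppN."""
    guanine, first_nucleotide = "G", "N"
    states = [f"{guanine}ppp{first_nucleotide}"]
    for activity in METHYLATION_ORDER[virus]:
        if activity == "N7":
            guanine = "m7G"            # guanine N7 methylation
        else:
            first_nucleotide = "Nm"    # 2'-O-methylation of the first nucleotide
        states.append(f"{guanine}ppp{first_nucleotide}")
    return states


for virus in METHYLATION_ORDER:
    print(virus, " -> ".join(methylate(virus)))
```

In this representation the flavivirus route passes through the cap-0 structure m7GpppN, whereas the VSV route passes through GpppNm; both end at m7GpppNm.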
3.5. Gene organization of viral capping enzymes
In order to support viral replication and fitness, both the catalytic activity of the viral enzymes involved in RNA capping and their localization within the cell are crucial. Viral capping enzymes have to be recruited to the site of RNA synthesis. Recruitment of the capping enzymes can be mediated by protein-protein interactions with either the RNA polymerase or a scaffold protein. While recruitment of three distinct enzymatic activities is required in order to synthesize a cap-0 structure, the available surface for protein interactions at the RNA synthesis site is limited. Viruses have evolved multiple solutions to overcome this problem, including the fusion of multiple enzymatic activities into the same polypeptide as well as protein-protein interactions between two capping enzymes to form a hetero-multimer (Fig. 6). A good example of protein-protein interaction is seen in Paramecium bursaria Chlorella virus, which encodes the RTPase, GTase and N7MTase activities on three different polypeptides [19, 34, 35]. The RTPase enzyme is likely to interact with the GTase, in a manner that is reminiscent of the lower eukaryotic capping machinery, in which Pol II co-transcriptionally recruits the RTPase-GTase heterodimer and the N7MTase separately [36, 37]. Alternatively, viruses such as baculovirus and Infectious Spleen and Kidney Necrosis virus benefit from the fusion of the RTPase and GTase activities in a single polypeptide, thus facilitating the recruitment of the capping apparatus to the viral RNA polymerase transcription site [38, 39]. In this instance, the organization of the viral capping enzymes is most analogous to that of higher eukaryotes, in which the RTPase and GTase enzymes are fused together. In that case, interaction with the GTase domain is solely responsible for RTPase-GTase recruitment to the RNA Pol II, while the N7MTase is recruited separately.
The fusion of sequential enzymatic activities into the same multi-domain protein appears to be more robust than heterodimer formation. Because of this, selective pressures have driven the fusion of capping genes in a wide variety of viruses. Alphaviruses, for example, encode a single protein that is able to add an N7-methylated guanosine to a ppRNA, while the RTPase activity is located on a different polypeptide. The flaviviruses represent an even more striking example of gene organization optimization. The RTPase in this group shares a catalytic site with the NTPase/helicase (also implicated in RNA synthesis) on one protein, while the GTase and the dual MTase (N7MTase and 2’OMTase) are fused to the RNA-dependent RNA polymerase (RDRP) on a second protein. In this example, flaviviruses have managed to pack, within two polypeptides, six different enzymatic activities, all of which are involved in RNA synthesis and maturation [21, 22, 32].
Some viruses have even evolved a highly efficient capping enzyme, fusing together all three or four enzymatic functions required for cap synthesis into what can be described as an RNA-capping assembly line. Mimivirus and African swine fever virus encode a large, single protein harbouring the RTPase, GTase and N7MTase activities. This allows these viruses to efficiently modify their RNA to generate a cap-0 structure [7, 42]. The conventional cap synthesis pathway is a directional succession of enzymatic activities: RTPase→GTase→N7MTase. Interestingly, the order of the catalytic domains within the primary sequence of these triple-activity capping enzymes follows the required capping activity sequence (NH2-RTPase-GTase-N7MTase-COOH). As a result, they not only co-localize all capping activities at the RNA 5’-end, but also optimize the progression of the RNA through the capping activity sequence. Poxviruses, typified by vaccinia virus (VV), also provide a good example of a multi-capping enzyme. The VV multi-capping enzyme, D1, possesses all three RTPase, GTase and N7MTase activities. The first two are constitutive, while the N7MTase requires association with the D12 stimulatory subunit. Together, this complex is able to modify an RNA 5’-end up to a cap-0 structure. It is also interesting to note that the structure of the D12 stimulatory subunit indicates that it was once a 2’OMTase, a function that is now inactive. Instead, the 2’OMTase activity is now taken over by the dedicated VP39 2’OMTase. This raises the possibility of an ancestral poxvirus RNA-capping assembly line composed of a D1-D12-like complex that could process a 5’-triphosphate RNA into a cap-1 RNA. Such an enzymatic conveyor can currently be found in mammalian reovirus and bluetongue virus. These two viruses are members of the segmented dsRNA Reoviridae family and transcribe their plus-strand messenger RNA within an internal capsid particle containing the RDRP and the capping apparatus. A single protein packs together all four enzymatic activities required to synthesize a cap-1 structure (RTPase, GTase, N7MTase and 2’OMTase), although the putative RTPase activity is yet to be confirmed [43-45]. Once again, these activities are arranged in a directional layout that channels the mRNA through successive enzymatic modifications, converting its 5’-triphosphate end into a cap-1 end. Moreover, this RNA-capping assembly line is in direct contact with the polymerase, ensuring optimal recruitment of the nascent mRNA to the capping apparatus. The λ2 and VP4 capping proteins from reovirus and bluetongue virus, respectively, are slightly different in regard to their quaternary structure. Reovirus λ2, which is overall linearly shaped, associates into a pentamer to form a hollow cylinder, or turret, with each active site facing the interior of the cavity. This barrel is perpendicular to the spherical internal capsid particle and creates a channel for the nascent mRNA to exit the internal capsid particle while undergoing complete type-1 mRNA capping. It is interesting that a diversity of viruses, ranging from dsDNA viruses such as mimivirus, African swine fever virus and poxviruses, to segmented dsRNA viruses including members of the Reoviridae family, have evolved such a complex but highly effective RNA-capping assembly line. The convergent evolution of these systems highlights the critical importance of proper RNA capping for viral genome replication and overall viral fitness.
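The gene organizations discussed in this section can be compared by counting how many separate polypeptides must be recruited to obtain a given cap structure. The Python sketch below encodes four of the examples from the text as a simple lookup table. The polypeptide labels are placeholders rather than gene names, the name polypeptides_needed is hypothetical, and the reovirus RTPase is listed although the text notes that this activity is still putative.

```python
# Capping activities carried by each polypeptide, transcribed from the text.
# "polypeptide 1", "polypeptide 2", ... are placeholder labels, not gene names.
CAPPING_LAYOUT = {
    "Chlorella virus": {
        "polypeptide 1": {"RTPase"},
        "polypeptide 2": {"GTase"},
        "polypeptide 3": {"N7MTase"},
    },
    "alphavirus": {
        "polypeptide 1": {"RTPase"},
        "polypeptide 2": {"N7MTase", "GTase"},
    },
    "mimivirus": {
        "polypeptide 1": {"RTPase", "GTase", "N7MTase"},
    },
    "reovirus": {
        # RTPase activity is putative according to the text.
        "polypeptide 1": {"RTPase", "GTase", "N7MTase", "2'OMTase"},
    },
}

CAP0_ACTIVITIES = {"RTPase", "GTase", "N7MTase"}


def polypeptides_needed(virus, activities=CAP0_ACTIVITIES):
    """Count the polypeptides that contribute at least one of the requested activities."""
    layout = CAPPING_LAYOUT[virus]
    return sum(1 for carried in layout.values() if carried & activities)


for virus in CAPPING_LAYOUT:
    print(virus, polypeptides_needed(virus))
```

Under these assumptions the count falls from three polypeptides in the Chlorella virus to one in mimivirus and reovirus, which is the trend toward fusion described above.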
3.6. Unconventional 5’ RNA cap synthesis mechanism evolved by different viruses
The capacity to properly cap RNA confers a distinct advantage to many eukaryotic viruses. Consequently, the selective pressure to maintain this structure is high, which is reflected by the degree of conservation among the viral capping proteins. Interestingly, this selective pressure is not directed toward the capping proteins themselves (RTPase, GTase and N7MTase), but rather toward their final product, the cap structure. Because of this, many viruses have evolved diverse biosynthetic strategies, divergent from the canonical RTPase→GTase→N7MTase pathway, allowing them to synthesize or acquire the final cap structure. This cap structure is in every aspect identical to the canonically synthesized one; only the enzymatic pathway varies. Many virus families include members that use an unconventional 5’ RNA cap synthesis pathway. To date, three unconventional 5’ RNA cap synthesis mechanisms have been described.
3.7. The m7GTP RNA capping pathway
The m7GTP RNA capping pathway, also termed the alphavirus-like pathway, is found in a number of (+)ssRNA viruses of the alphavirus (Semliki Forest virus and Sindbis virus), potexvirus (Bamboo mosaic virus), tobamovirus (Tobacco mosaic virus), Togaviridae (Rubella virus and Chikungunya virus) and Hepeviridae (Hepatitis E virus) families [5, 47]. These viruses encode unique capping machinery capable of synthesizing a cap-0 structure in three sequential enzymatic reactions. The initial step is quite similar to the conventional capping mechanism in which an RTPase (nsP2 protein of Semliki Forest virus for example) hydrolyzes the γ-β-phosphoanhydride bond at the 5’-end of the RNA yielding a ppRNA . Next a GTP molecule in methylated in position N7 by an atypical N7MTase (nsP1 protein of Semliki Forest virus for example). This m7GTP is then recognized as a substrate by an atypical GTase (also nsP1 of protein of Semliki Forest virus for example). The reaction results in the formation of a characteristic m7GMP-enzyme covalent complex upon the hydrolysis of a pyrophosphate group. This m7GMP group is finally transferred onto the 5’-end of the acceptor ppRNA, to yield a typical m7GpppN cap-0 structure [41, 49-52]. The overall capping reaction is then RTPase→atypical N7MTase→atypical GTase (Fig. 7). It is worth mentioning, however, that not only the order of chemical modifications differs, but also the protein mechanisms of action. The atypical N7MTase has fundamental similarities to the standard N7MTase, including the presence of a SAM binding domain, but its substrate recognition is vastly different. Atypical N7MTase proteins are unable to methylate GpppN as the canonical N7MTase does, and instead they specifically methylate GTP (and GDP to some extent) . The atypical GTases are mechanistically different from their GTase counterpart in that they lack the KxDG conserved motif and mediate their m7GMP-enzyme intermediate through a conserved histidine instead of a lysine . 
These proteins have no activity with GTP, but specifically require m7GTP to form a covalently bound enzyme complex. Therefore, the conversion of GTP to m7GTP is necessary prior to the N7-methyl-guanylyltransferase activity.
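The order of the three reactions described above can be sketched as a small Python model. This is only an illustrative toy, not a biochemical simulation: the string labels (`"pppN"`, `"ppN"`, `"m7GpppN"`) and the function names are hypothetical conventions chosen for this example. The sketch encodes two points made in the text: the atypical N7MTase acts on free GTP, and the atypical GTase accepts m7GTP only.

```python
# Toy model of the m7GTP (alphavirus-like) capping pathway.
# All names and string labels are illustrative assumptions.

def rtpase(rna_end: str) -> str:
    """RTPase step: pppRNA -> ppRNA (the gamma-phosphate is removed)."""
    if rna_end != "pppN":
        raise ValueError("RTPase expects a 5'-triphosphate RNA end")
    return "ppN"

def atypical_n7mtase(nucleotide: str) -> str:
    """Atypical N7MTase step: methylates free GTP, not GpppN."""
    if nucleotide != "GTP":
        raise ValueError("this sketch only models methylation of GTP")
    return "m7GTP"

def atypical_gtase(nucleotide: str, rna_end: str) -> str:
    """Atypical GTase step: m7GTP -> m7GMP-enzyme + PPi, then transfer to ppRNA."""
    if nucleotide != "m7GTP":
        raise ValueError("atypical GTase has no activity with unmethylated GTP")
    if rna_end != "ppN":
        raise ValueError("the acceptor must be a 5'-diphosphate RNA end")
    # One phosphate comes from m7GMP, two from the ppRNA acceptor.
    return "m7G" + "p" + rna_end

def m7gtp_pathway(rna_end: str = "pppN", nucleotide: str = "GTP") -> str:
    """RTPase -> atypical N7MTase -> atypical GTase."""
    return atypical_gtase(atypical_n7mtase(nucleotide), rtpase(rna_end))
```

Calling `m7gtp_pathway()` returns the cap-0 label, whereas passing unmethylated `"GTP"` directly to `atypical_gtase` raises an error, mirroring the requirement that methylation precede guanylyl transfer.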
Of all known eukaryotes and viruses, the m7GTP RNA capping pathway is only used by members of the (+)ssRNA viruses, which points toward a eukaryote-independent emergence of this unconventional cap synthesis mechanism. In addition, the conservation of this capping pathway throughout distantly related viruses harbouring a broad spectrum of hosts, ranging from plants to animals, suggests an evolution from a common (+)ssRNA virus ancestor.
3.8. The GDP RNA capping pathway
The GDP RNA capping pathway, also termed the Rhabdoviridae-like pathway, is found in representatives of many (-)ssRNA viruses of the Rhabdoviridae (vesicular stomatitis virus (VSV) and Rabies virus), Paramyxoviridae (Human respiratory syncytial virus and Measles virus), Bornaviridae (bornavirus), and Filoviridae (Ebola virus and Marburg virus) families [5, 47]. These viruses encode unconventional capping machinery that catalyzes the formation of a cap-1 structure. Exemplified by VSV, they encode a large L protein harbouring the RNA-dependent RNA polymerase (RDRP) activity as well as the RNA capping activity. The latter requires a sequence of four enzymatic activities that differ from the conventional pathway in order to generate a cap-1 structure. First, an NTPase activity hydrolyzes a GTP molecule into a GDP molecule. Then, an RNA GDP polyribonucleotidyl transferase (PRNTase) catalyzes a two-step reaction. The L protein hydrolyzes the α-β phosphoanhydride bond of the pppRNA triphosphate moiety, releasing a molecule of pyrophosphate and creating a covalent enzyme-pRNA intermediate. The pRNA moiety is then transferred onto the GDP to form a GpppN-blocked RNA. In this case, only the α-phosphate originates from the RNA, whereas both the β- and γ-phosphates are contributed by the GDP. Finally, synthesis of the cap-1 structure is completed by two successive methylations: the first is methylation of the 2’-OH of the first nucleotide, and the second is methylation of the guanine N7 nitrogen [33, 53-57]. When compared to the canonical capping reaction, this unconventional capping pathway reverses the phosphate contribution from the GTP and the RNA. The covalent enzyme-monophosphate-nucleotide intermediate is formed with the RNA instead of the GTP, giving an enzyme-pRNA complex instead of an enzyme-GMP complex.
As in the conventional capping pathway, the diphosphate cosubstrate is first generated by hydrolysis of its triphosphate precursor, but this time it is GDP instead of ppRNA that is generated. The PRNTase mechanism of action is also distinct from that of the GTase in that the KxDG motif is replaced by an HR motif, and the histidine, not the lysine, is responsible for the enzyme-pRNA phosphoamide bond [55, 56]. Both the N7 and 2’OMTase activities are also present on the L protein and share the same SAM binding site. The typical lysine-asparagine-lysine-glutamine tetrad is also predicted to be at the MTase active site. The 2’O position of the GpppN is methylated prior to the guanine N7 position, which is the opposite order when compared to most canonical cap-1 methylation events [33, 53]. The overall GDP RNA capping sequence can be summarized as NTPase→PRNTase→2’OMTase→N7MTase (Fig. 7). It is very likely that an ancestral (+)ssRNA virus polymerase has evolved a PRNTase activity independently from its eukaryotic host. Both N7 and 2’OMTase, however, have likely been acquired from a eukaryotic host.
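The reversal of the phosphate contribution can be checked with simple bookkeeping. The sketch below is a minimal illustration in which the dictionary names and pathway labels are assumptions made for this example. It records, for each pathway, which RNA species and which guanosine species are joined, and confirms that the triphosphate bridge of GpppN always contains three phosphates regardless of their origin.

```python
# Phosphate bookkeeping for the 5'-5' triphosphate bridge (illustrative only).

# Number of phosphates carried by each donor species.
PHOSPHATES = {"pRNA": 1, "ppRNA": 2, "GMP": 1, "m7GMP": 1, "GDP": 2}

# Species joined in the transfer step of each pathway, as described in the text.
PATHWAYS = {
    "canonical": {"rna_donor": "ppRNA", "guanosine_donor": "GMP"},
    "m7GTP": {"rna_donor": "ppRNA", "guanosine_donor": "m7GMP"},
    "GDP": {"rna_donor": "pRNA", "guanosine_donor": "GDP"},
}

def bridge_origin(pathway: str) -> tuple:
    """Return (phosphates from the RNA, phosphates from the guanosine donor)."""
    species = PATHWAYS[pathway]
    return (PHOSPHATES[species["rna_donor"]],
            PHOSPHATES[species["guanosine_donor"]])

for name in PATHWAYS:
    from_rna, from_g = bridge_origin(name)
    # Every pathway must end with exactly three bridging phosphates.
    assert from_rna + from_g == 3
    print(f"{name}: {from_rna} from RNA, {from_g} from guanosine donor")
```

In the GDP pathway the split is one phosphate from the RNA and two from GDP, the mirror image of the two-plus-one split of the canonical and m7GTP pathways.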
3.9. The RNA cap snatching
Some viruses, unable to synthesize their own cap structures, have evolved a clever way to acquire this important entity: stealing it from their host. This method of cap acquisition, termed RNA cap snatching, is used by representatives of the Orthomyxoviridae (e.g. Influenza virus, Thogoto virus), the Arenaviridae (e.g. Lassa virus, Machupo virus) and the Bunyaviridae (e.g. Hantaan virus, La Crosse virus, Tomato Spotted Wilt virus) families [5, 58]. These (-)ssRNA viruses acquire their cap structure from their host’s capped mRNA. They bind the cap structure, cleave the RNA a few nucleotides downstream and finally use this short capped RNA to prime their RDRP. The Arenaviridae and Bunyaviridae express a large monomeric polymerase, whereas the Orthomyxoviridae express a heterotrimeric polymerase (e.g. the PB1, PB2 and PA proteins of influenza virus) harbouring all the activities required for cap snatching. The PB2 protein of the Influenza virus, the most studied cap snatching virus, specifically binds the host mRNA cap structure. The specificity of the binding is crucial and is mediated by the aromatic stacking of the methylated guanine coupled to a base-specific interaction with a conserved acidic residue. While the mode of cap binding is similar between PB2 and other cap-binding proteins (e.g. eIF4E, nuclear cap binding complex, Vaccinia VP39), its overall fold is completely different. Once the host mRNA is bound by the cap-binding PB2, the viral PA subunit cleaves the mRNA a few nucleotides downstream from the cap structure. The length of the primer RNA generated is virus-dependent, and typically ranges from 10 to 13 nucleotides for Influenza virus, but can be as short as 1-2 nucleotides, as is seen in the Thogoto virus [59, 61, 62]. The PA endonuclease domain shares high homology with type II restriction enzymes, including the conserved active-site (P)Dxn(D/E)xK signature motif.
The PA active site coordinates two Mn2+ cations and is believed to catalyze endonucleolytic cleavage through a common two-metal dependent mechanism [61, 64]. The short capped oligomers are next used by the PB1 RDRP as primers to initiate the transcription of the viral mRNAs. PB1 also specifically binds the viral RNA (vRNA) 3’- and 5’-ends through a ribonucleoprotein 1-like motif ((R/K)G(F/Y)(G/A)(F/Y)Vx(F/Y)). The vRNA serves as a template for the 3’ elongation of the cellular 10-13 nucleotide capped primer. The overall cap snatching process results in the transcription of a chimeric full-length vRNA with a 5’-extension of 10-13 cellular nucleotides and a cap-2 structure (Fig. 8). Cap snatching enables viruses to acquire their host’s cap structure, which not only promotes viral replication but also impairs cellular mRNA translation, as translation of decapped cellular mRNA is impeded and the mRNA is targeted for degradation. Another consequence of cap snatching is the dependency on a pool of host mRNA molecules in order to support viral replication. (-)ssRNA viruses that utilize cap snatching have evolved ways to maintain the precious pool of eukaryotic mRNA. First, the cap binding and endonuclease activities of the trimeric polymerase are only activated upon vRNA binding, limiting the waste of mRNA (induced by the cleavage and downstream degradation) when no vRNA is loaded on the RDRP. Secondly, some nucleocapsid proteins, as first demonstrated for Hantavirus, are able to bind and protect capped mRNA from degradation in the processing bodies (P-bodies), thus converting the P-body function from mRNA decapping and decay into cellular cap storage foci. Cap snatching is only observed in segmented (-)ssRNA viruses; such a unique molecular mechanism supports the hypothesis of a common (-)ssRNA virus ancestor of today’s viruses, despite their tropism now ranging from plants to animals.
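The bind, cleave and prime sequence described above can be illustrated with a short sketch. It is a deliberately simplified model under stated assumptions: the host mRNA is reduced to its body sequence behind a cap label, the vRNA template is written 3’→5’ so that the transcript can be built base by base, and the function names and example sequences are hypothetical.

```python
# Toy model of cap snatching (all names and sequences are illustrative).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def snatch_primer(host_mrna_body: str, primer_len: int = 12) -> str:
    """Endonuclease step: keep the first `primer_len` nucleotides after the cap."""
    if not 1 <= primer_len <= len(host_mrna_body):
        raise ValueError("primer length must fit within the host mRNA")
    return host_mrna_body[:primer_len]

def transcribe_chimeric(host_mrna_body: str, vrna_template_3to5: str,
                        primer_len: int = 12, cap: str = "m7Gppp") -> tuple:
    """Prime the polymerase with the capped host fragment, then copy the vRNA.

    Returns (cap label, chimeric mRNA sequence written 5'->3').
    """
    primer = snatch_primer(host_mrna_body, primer_len)
    # The template is read 3'->5', so the transcript grows 5'->3' in string order.
    viral_part = "".join(COMPLEMENT[base] for base in vrna_template_3to5)
    return cap, primer + viral_part

# Hypothetical sequences, for illustration only.
cap, chimeric = transcribe_chimeric("GCAUUCGGAUCCAAGGUU", "UCGUUUUCG")
print(cap, chimeric)
```

The returned sequence starts with host-derived nucleotides and continues with virus-templated ones, which is the chimeric arrangement described in the text. The default primer length of 12 is simply a value inside the 10-13 nucleotide range quoted for Influenza virus.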
The incredible diversity of RNA capping pathways, protein folding and enzymatic mechanisms of action that have been evolved by viruses, all leading to the synthesis of the same ubiquitous structure, is a testimony to the importance of the cap structure for viral genome replication and global viral fitness.
4. Viral alternatives to cap structures
Most viruses harbour a cap structure at the 5’-end of their RNA. Mutations preventing the proper capping of their RNA result in infection- or replication-deficient viruses. This is strong evidence of the crucial importance of the cap structure for viral RNA stability and translation. Yet not all viruses harbour capped RNA, which raises the question of which mechanisms they have evolved to overcome this cap dependency. To answer this question, it is important to ask whether it is the cap structure itself or its function that is essential. In fact, the cap structure is important for a number of different cellular processes related to mRNA metabolism. For instance, the cap structure protects the RNA from 5’→3’ exonucleases, preventing its degradation. The RNA cap structure also represents a definite molecular structure that is specifically recognized by the eukaryotic initiation factor 4E (eIF4E), which, together with the scaffold protein eIF4G, the RNA helicase eIF4A and the ribosome binding protein eIF3, promotes RNA translation initiation. While most viruses use a cap structure to fulfill these important roles, some viruses have evolved cap-independent strategies to ensure the stability and translation of their RNA.
4.1. Viral proteins as substitutes for the cap structure
Viruses of the Picornaviridae (e.g. Poliovirus, Hepatitis A virus), Potyviridae and Caliciviridae (e.g. Norwalk virus, Feline calicivirus) families bear a special type of RNA 5’-end modification. The RNA 5’-end of these (+)ssRNA viruses is covalently linked to a viral protein. This viral genome-linked protein (VPg) is not added to the viral genome upon replication, as a regular cap structure is, but is instead directly used by the RDRP as a primer to initiate RNA polymerisation. VPg is a representative of the class II nucleic acid-protein complexes and does not catalyze its own covalent complex formation (as a GTase or PRNTase would). The VPg-RNA formation is instead catalyzed by a second protein, the viral RDRP, which synthesizes the primer in a template-dependent manner, resulting in a virus-specific initiating primer: VPg-pUpU for Picornaviridae and VPg-pGpU for Caliciviridae. VPg is covalently linked to the first RNA nucleotide via a phosphodiester bond between the RNA α-phosphate and the tyrosine hydroxyl group situated in the conserved motif (E/D)EYDE(Y/W/F). The VPg protein protects the vRNA 5’-end from the cellular 5’→3’ exonucleases, thus limiting vRNA degradation. Furthermore, the VPg is used to initiate the RNA polymerisation instead of being added once the RNA is synthesised. This prevents the formation of 5’-triphosphate vRNA and limits the cellular anti-viral response, which will be described later. In addition to their protective role against RNA degradation, some VPg can fulfill a second important role of the cap structure, promoting vRNA translation initiation. This is the case of the 15 kDa VPg of the Caliciviridae and Potyviridae, which is essential for vRNA translation initiation. This VPg directly interacts with eIF4E (the cap-binding protein) and the eIF3 complex (the 40S binding complex), which promotes the assembly of the translation initiation complex at the 5’-end of the vRNA (Fig. 9) [68, 72-75].
This allows VPg-vRNA to bypass the requirement for a direct eIF4E-cap interaction in order to initiate translation. This property is not conserved among all VPg: the Picornaviridae VPg is much smaller (2.5 kDa) and is not involved in vRNA translation initiation. These viruses instead rely on a highly structured RNA sequence called an internal ribosome entry site (IRES) to ensure their translation (this will also be described in more detail later on). All the (+)ssRNA viruses encoding a VPg benefit from its protective effect on the viral genome, but the Caliciviridae and Potyviridae VPg have evolved an additional function, promoting vRNA translation initiation. This VPg is a striking example of a cap substitute, as it fulfills two critical functions of the cap structure, namely ensuring vRNA stability and promoting translation initiation.
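Signature motifs written in the notation used in this chapter, such as the VPg motif (E/D)EYDE(Y/W/F) or the RNP1-like motif (R/K)G(F/Y)(G/A)(F/Y)Vx(F/Y), can be translated mechanically into regular expressions for scanning protein sequences. The helper below is a minimal sketch under stated assumptions: `motif_to_regex` is a hypothetical name, `x` is taken to mean any single residue, and variable-length spacers such as the `xn` of (P)Dxn(D/E)xK are not handled. The example peptide is invented for illustration and is not a real VPg sequence.

```python
import re

def motif_to_regex(motif: str) -> str:
    """Translate '(A/B)' alternatives and 'x' wildcards into a regex pattern."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":
            # '(E/D)' becomes the character class '[ED]'.
            close = motif.index(")", i)
            parts.append("[" + "".join(motif[i + 1:close].split("/")) + "]")
            i = close + 1
        elif ch == "x":
            # 'x' stands for any single residue.
            parts.append(".")
            i += 1
        else:
            # Any other letter is a fixed residue.
            parts.append(ch)
            i += 1
    return "".join(parts)

vpg_pattern = motif_to_regex("(E/D)EYDE(Y/W/F)")
rnp1_pattern = motif_to_regex("(R/K)G(F/Y)(G/A)(F/Y)Vx(F/Y)")

# Invented peptide, used only to show how a hit is located.
peptide = "MKGADEYDEWLT"
hit = re.search(vpg_pattern, peptide)
if hit:
    # The linking tyrosine is the third residue of the motif.
    print("motif at", hit.start(), "tyrosine at", hit.start() + 2)
```

A match only indicates that the sequence is compatible with the motif; it says nothing about whether the protein is actually a functional VPg.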
4.2. Highly structured 5’ RNA structure as an alternative to the cap structure
Ribonucleic acid (RNA) is a macromolecule which, according to the central dogma of molecular biology, is a transient messenger carrying the genetic information required to direct protein synthesis. In addition to this canonical role, RNA, given its high chemical complexity, can fulfill additional roles including genome support, ordered three-dimensional structure and even catalytic activity. Many viruses have exploited this capacity of RNA to form complex structures in order to promote viral replication. Some viruses, lacking the enzymatic activity to synthesize or acquire a cap structure at the 5’-end of their vRNA, have instead selected a high-order structural RNA element upstream of their coding region. This peculiar RNA sequence can fold precisely and reproducibly into a definite three-dimensional structure. This ordered structure has numerous functions, including binding to other macromolecular partners. These viruses use this cis-acting structure to bind directly or indirectly to ribosomal components in order to assemble the translation initiation complex at the beginning of their open reading frame (ORF). This promotes the cap-independent translation of viral genes. Such RNA structures, which bypass the cap dependency for translation initiation, are called internal ribosome entry sites (IRES). Many RNA virus families (e.g. Dicistroviridae, Picornaviridae and some Flaviviridae) use this structure to promote viral protein production. The distinct IRES structures that viruses have evolved can be divided into four categories that differ in their structure, length, mechanism of ribosome recruitment and robustness (Fig. 9). The first group of IRES, which is the smallest and simplest, is encoded in the Dicistroviridae (e.g. Cricket paralysis virus) genome.
This IRES consists of a 180 nt structure that is able to directly bind and recruit the 40S ribosomal subunit to the translation initiation site, and requires neither initiation factors nor methionyl-tRNA to initiate translation (Fig. 9) [77, 78]. The second group of IRES is similar to the first, but slightly larger at 330 nt. It includes the IRES of Flaviviridae of the Hepacivirus (e.g. Hepatitis C virus) and Pestivirus (e.g. Classical swine fever virus) genera. The second group of IRES is also able to directly bind the 40S ribosomal subunit, but requires the contribution of a limited number of initiation factors (eIF2 and eIF3) together with the methionyl-tRNA in order to initiate vRNA translation [77-79]. Of note, the RNA helicase eIF4A is not required for initiation on group 1 or 2 IRES, an advantage that comes at the expense of a limited RNA unwinding capacity. Therefore, the initial coding sequence of the ORF must be encoded by a non-structured RNA sequence, as an RNA structure will block translation initiation in the absence of helicase activity. The Picornaviridae family viruses harbour IRES from the third and fourth groups, which are similar in many regards. They are the largest (450 nt) and most complex IRES. They do not directly bind the 40S ribosomal subunit and require canonical eIFs (eIF2, eIF3, eIF4A, eIF4B, eIF4G) together with additional proteins called IRES trans-activating factors (ITAFs) in order to recruit the ribosome and initiate translation. The difference between these two groups lies in the positioning of the ribosome relative to the ORF. Group 3, found in the Aphthovirus (e.g. Foot-and-mouth disease virus) and Cardiovirus (e.g. Encephalomyocarditis virus) genera, recruits the ribosome at the initiating AUG codon. Group 4, found in the Enterovirus (e.g. poliovirus) and Hepatovirus (e.g. Hepatitis A virus) genera, recruits the ribosome upstream from the ORF and requires a scanning or shunting process to move along the RNA in order to reach the AUG codon and initiate translation [77, 78]. Of note, these viral IRES (with the exception of the Hepatitis A virus IRES) are able to bypass the requirement for eIF4E, one of the limiting components of the cap-dependent translation initiation complex, to initiate translation of their downstream ORF. Encoding an IRES in the viral genome is an efficient mechanism evolved by viruses to fulfill a critical role of the cap structure, namely translation initiation. The importance of this structure is exemplified by its remarkable degree of conservation. The Flaviviridae family presents an interesting example: the members of the Hepacivirus and Pestivirus genera share much closer homology between their IRES regions than between their coding regions, while members of the Flavivirus genus do not have any IRES at all and synthesize a cap structure through a conventional viral RNA capping mechanism. The emergence of viral alternatives to overcome the lack of a cap structure is a testimony to the crucial functions of this small structure for viral genome stability, replication and translation.
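The four IRES groups described in this section lend themselves to a compact tabular summary. The sketch below encodes only the properties stated in the text (approximate length, direct 40S binding, required initiation factors and ITAF dependence); the class and field names are hypothetical, and the lengths are the approximate values quoted above rather than exact measurements.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

@dataclass(frozen=True)
class IresGroup:
    group: int
    found_in: Tuple[str, ...]
    approx_length_nt: int
    binds_40s_directly: bool
    initiation_factors: FrozenSet[str]
    needs_itafs: bool

# Canonical factors listed in the text for groups 3 and 4.
# eIF4E is not listed: the text notes it is bypassed except by the
# Hepatitis A virus IRES, an exception this table does not capture.
PICORNAVIRUS_EIFS = frozenset({"eIF2", "eIF3", "eIF4A", "eIF4B", "eIF4G"})

IRES_GROUPS = (
    IresGroup(1, ("Dicistroviridae",), 180, True, frozenset(), False),
    IresGroup(2, ("Hepacivirus", "Pestivirus"), 330, True,
              frozenset({"eIF2", "eIF3"}), False),
    IresGroup(3, ("Aphthovirus", "Cardiovirus"), 450, False,
              PICORNAVIRUS_EIFS, True),
    IresGroup(4, ("Enterovirus", "Hepatovirus"), 450, False,
              PICORNAVIRUS_EIFS, True),
)

def groups_requiring(factor: str) -> List[int]:
    """Return the IRES groups that need the given initiation factor."""
    return [g.group for g in IRES_GROUPS if factor in g.initiation_factors]

def direct_40s_binders() -> List[int]:
    """Return the IRES groups that bind the 40S subunit without help."""
    return [g.group for g in IRES_GROUPS if g.binds_40s_directly]
```

For example, `groups_requiring("eIF4A")` returns groups 3 and 4 only, which reflects the helicase independence of the two smaller IRES groups.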
5. Recognition of the 5'-ends by the innate immune system
In humans, the RNA cap structure harbors additional methylations at the 2'-O site of the first and second transcribed nucleotides of the mRNAs. The addition of these supplementary ribose methylations occurs via enzymatic activities located in the nucleus and cytoplasm, respectively [83, 84]. Similarly, many different viruses possess RNA 2'-O-methyltransferases in order to modify their mRNAs. The role of these methylations, however, remained elusive until recently, when it was demonstrated that 2'-O methylation of viral mRNAs enhances virulence through evasion of intrinsic cellular defense mechanisms [85, 86].
5.1. Innate immune response
Viral infection normally results in the generation of immunological non-self RNA species. Pattern recognition receptors are a crucial component of innate immunity and are responsible for the detection of non-self RNAs. Toll-like receptors (TLRs), retinoic acid inducible gene-I (RIG-I)-like receptors (RLRs) and nucleotide oligomerization domain (Nod)-like receptors (NLRs) are important pattern recognition receptors that recognize non-self nucleic acids of pathogens [88-90]. For instance, many TLRs can detect viral nucleic acids that are found in endosomes following the release of nucleic acids from infected cells [91-95]. This eventually leads to the activation of subsequent immune reactions. In contrast, RLRs detect viral nucleic acids in the cytoplasm of the infected cells during the early phase of viral replication [96, 97]. This detection leads to the induction of interferons and inflammatory cytokines, which ultimately block viral replication and promote the activation of antigen-presenting cells in order to eliminate infected cells.
RIG-I, MDA5, and LGP2 are important RLRs that can detect cytoplasmic viral RNAs and induce the expression of cytokines in order to establish a host antiviral state through the expression of numerous interferon-stimulated genes (ISGs). These include the protein kinase PKR and stress-inducible proteins such as IFIT1 and IFIT2 that can inhibit the protein synthesis machinery of the host cell [99-101]. What is the exact molecular signature found on viral RNAs that is detected by RLRs? Previous experiments demonstrated that RIG-I specifically recognizes 5'-triphosphate groups that can be found on some viral RNAs [102-104]. Viruses must therefore hide or modify their RNA 5'-ends in order to evade innate immune recognition, either through the addition of an RNA cap structure or through the addition of alternative 5' elements, such as viral proteins linked to the 5'-end, in order to hide their uncapped ends. This last strategy is used, for instance, by poliovirus, which encodes a protein, VPg, that is covalently linked to the 5'-end of the plus-strand genomic RNA. Viruses that are unable to process their RNA 5’-end have instead evolved immune-evasion strategies to prevent ISG induction. For instance, the Hepacivirus protease inhibits the signal transduction resulting from RIG-I activation [106, 107].
5.2. Importance of the RNA cap 2'-O-methylation
Recent studies suggest that 2'-O-methylation of viral RNAs can enhance the replication of viruses through evasion of the innate immune response [85, 86]. For instance, coronaviruses that lack a functional 2'-O-methyltransferase activity induce a higher expression level of type I interferon. Moreover, these mutant viruses can replicate efficiently in the absence of some RLRs such as MDA5. Similarly, poxvirus and coronavirus mutants that lack 2'-O-methyltransferase activities show an enhanced sensitivity to IFIT proteins. Therefore, it appears that 2'-O-methylation of cellular mRNAs has evolved as a molecular signature to distinguish between self and non-self RNA during viral infection, and that ribose 2′-O-methylation in the cap structure of viral RNAs plays an important role in viral escape from innate immune recognition. Not surprisingly, it has been suggested that pharmacological strategies that inhibit viral 2'-O-methyltransferases could represent a novel therapy against viruses that replicate in the cytoplasm of infected cells. In fact, it was previously shown that mutations of the 2'-O-methyltransferase catalytic residues can block or attenuate replication [22, 32] and that inhibitors such as sinefungin can inhibit methylation and suppress the replication of certain viruses, such as West Nile virus, in cell culture.
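The relationship between the nature of the RNA 5'-end and its recognition by the innate immune system, as described in this section, can be summarized as a simple decision rule. The function below is a coarse illustrative sketch and not a predictive model: the name `predicted_sensing`, the end labels and the returned sensor lists are assumptions made for this example, and the rule ignores every other determinant of recognition (RNA structure, localization, viral antagonists).

```python
from typing import List

def predicted_sensing(five_prime_end: str,
                      ribose_2o_methylated: bool = False) -> List[str]:
    """Return host factors expected to respond to a given RNA 5'-end.

    five_prime_end: "ppp" (uncapped triphosphate), "cap" (m7GpppN)
    or "VPg" (protein-linked end).
    """
    if five_prime_end == "ppp":
        # RIG-I specifically recognizes 5'-triphosphate groups.
        return ["RIG-I"]
    if five_prime_end == "VPg":
        # The covalently linked protein hides the uncapped end.
        return []
    if five_prime_end == "cap":
        if ribose_2o_methylated:
            # Cap with 2'-O-methylation resembles cellular mRNA.
            return []
        # Viruses lacking 2'-O-MTase activity are restricted in an
        # MDA5-dependent manner and are more sensitive to IFIT proteins.
        return ["MDA5", "IFIT"]
    raise ValueError("unknown 5'-end label: " + five_prime_end)

for end, methylated in [("ppp", False), ("cap", False), ("cap", True), ("VPg", False)]:
    print(end, methylated, predicted_sensing(end, methylated))
```

Only the fully methylated cap and the VPg-linked end return an empty list, which corresponds to the two evasion strategies discussed above.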
This chapter explored the viral diversity of enzymatic activities and mechanistic pathways converging to the maturation of the 5’ cap on viral RNA. The cap structure provides tremendous advantages to eukaryotic viruses in terms of vRNA stability, gene translation and immune evasion. Some viruses have evolved enzymatic mechanisms of action unknown to the eukaryotic domain in order to synthesize this critical structure. Other viruses have developed novel cap synthesis mechanisms that generate a 5’ cap structure chemically identical to their hosts, yet formed by an entirely new process. Finally, particular viruses have also evolved unique mechanisms to steal or mimic the host cap structure. In conclusion, the incredible diversity and conservation of the mechanisms evolved by viruses to synthesize, acquire or mimic the 5’ cap structure is a testimony to the importance of viral RNA capping for viral replication, fitness and infectivity.
M.B. is a 'Chercheur Boursier' Senior from the Fonds de Recherche en Santé du Québec and a member of the Centre de Recherche Clinique Étienne-Lebel.