Viral Modulation of Host Translation and Implications for Vaccine Development

Translation of mRNAs into protein is an essential mechanism of regulating gene expression, and a step exploited by viruses for their own propagation. In this article, we review the mechanisms that govern translation, provide an overview of the translation machinery and some of its components, and discuss how viruses modulate host translational controls and the implications for vaccine design.


Introduction
The central dogma of molecular biology holds that genetic information flows from DNA to mRNA to protein: genes are transcribed into mRNA, and mRNA is translated into proteins, which execute cellular programs and are fundamental to the functioning of a cell. A vast body of literature has added to our understanding of the molecular interplay during translation; however, this understanding is far from comprehensive because (1) biological systems are complex, with little correlation between the size of an organism, its genome size, and the number of protein-coding/noncoding genes; (2) biological systems respond acutely to changes in the environment or to infection with a pathogen; (3) all biological systems are in a state of continuous evolution as they respond to new stimuli and adapt accordingly; (4) posttranslational modifications are normally required before proteins assemble into molecular complexes and elicit a function; and (5) many proteins are multifunctional across different pathways. We begin this article with a succinct overview of the main components involved in protein translation and of the translation process itself, and then consider the multiple roles transfer RNAs (tRNAs) have during translation in virus-infected cells and how viruses modify tRNA expression and function. We conclude with a discussion of how understanding the mechanisms by which viruses modulate the host translation pathway can aid effective vaccine design.
Protein synthesis is a multistep process involving various error-checking mechanisms. Genes are transcribed in the nucleus, and mature messenger RNAs (mRNAs) are exported into the cytoplasm as ribonucleoprotein particles, where they associate with ribosomes (either free in the cytoplasm or bound to the endoplasmic reticulum) for initiation of translation. In eukaryotes, ribosomes consist of two subunits, a small 40S (Svedberg) and a large 60S, which together form 80S macromolecular ribonucleoprotein complexes of ribosomal RNA and ribosomal proteins [1]. The 40S subunit scans the mRNA until it recognizes the start codon (triplet AUG), at which point the first amino acid (a.a.), methionine (Met), bound to its cognate transfer RNA (tRNA) with the anticodon 3′-UAC-5′, enters and binds to the AUG codon via sequence complementarity. The 60S subunit then joins this complex, forming two distinct pockets: the peptidyl (P) site containing the Met-tRNA and the aminoacyl (A) site, where the next aa-tRNA enters. The chain-initiating Met is transferred from the P site to the a.a. at the A site with the formation of a peptide bond, and the now-empty tRNA from the P site is released. The 80S ribosome then moves to the next codon so that the dipeptide-tRNA complex shifts to the P site, the next aa-tRNA is brought in, and peptide chain elongation continues until the ribosome reads a stop codon, which signals chain termination. When a stop codon is read, the peptide chain is released from the tRNA and the ribosome [2]. Typically, each mRNA is translated by multiple ribosomes simultaneously as polysome complexes [3]. Newly synthesized peptides may need substantial posttranslational modifications before they are transported to their cellular niche and become functional. Mistranslated peptides are degraded by a variety of proteolytic mechanisms, and their components are recycled.
Some mRNAs are long-lived in the host cytoplasm, while others are rapidly degraded following protein synthesis [4].
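The scanning, elongation, and termination steps described above can be illustrated with a minimal sketch. It assumes an uppercase RNA string, the standard genetic code, and initiation at the first AUG encountered; the function name `translate` and the example sequence are hypothetical, and the sketch deliberately ignores initiation factors, sequence context around the start codon, and every other regulatory layer discussed in this article.

```python
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Scan 5'->3' for the first AUG, then decode triplets until a stop codon."""
    start = mrna.find("AUG")          # simplified stand-in for 40S scanning
    if start == -1:
        return ""                     # no start codon, no peptide
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":         # stop codon: release the chain
            break
        peptide.append(amino_acid)    # elongation by one residue
    return "".join(peptide)


# Hypothetical toy transcript: 5' leader, AUG, three codons, stop, 3' tail.
print(translate("GGCAUGAAAGCUUGGUAAGG"))  # MKAW
```

The one-letter output (M, K, A, W) corresponds to Met-Lys-Ala-Trp; the leader before AUG and the bases after UAA are not decoded.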

Ribosomes
Both prokaryotic and eukaryotic ribosomes are macromolecular complexes consisting of ribosomal RNAs (rRNAs) and ribosomal proteins. Ribosomes are separated for structural and related studies using isopycnic ultracentrifugation [5]; eukaryotic ribosomes typically sediment at 80 Svedberg units and are referred to as 80S ribosomes, though they consist of the smaller 40S and the larger 60S subunit (sedimentation coefficients are not additive) [6][7][8][9][10][11]. The complete ribosome is 4.3 MDa: the larger 60S subunit contains 28S rRNA, 5S rRNA, 5.8S rRNA, and 47 distinct ribosomal proteins, while the smaller 40S subunit contains a single 18S rRNA and 33 distinct ribosomal proteins [12]. Mammalian ribosomes contain all the sites necessary for interaction with the components of the translation machinery such as eukaryotic initiation factor 1 [13]. Structural studies have identified conserved cores in mammalian ribosomes as well as proteins that are unique to the human ribosome [14]. The main features of the ribosome involved in translation include the aminoacyl (A) site where aa-tRNAs bind, the P site where peptide bond formation occurs, and the E site where uncharged tRNAs exit the ribosome (Figure 1). Ribosomal RNAs are also posttranscriptionally modified at multiple positions, and these modifications are essential for proper folding and function [15,16]. Typical rRNA modifications are guided by small nucleolar RNAs (snoRNAs) and include 2'-O ribose methylation and pseudouridylation, which yields pseudouridine, a very abundant posttranscriptionally modified nucleotide in various stable RNAs of all organisms. These modified nucleotides stabilize rRNA structure and function. Ribose methylations are guided by C/D box snoRNAs, while pseudouridylations are guided by H/ACA box snoRNAs [17][18][19][20][21][22][23][24].

Messenger RNA (mRNA)
The human genome is 3.4 billion base pairs long and encodes ~32,000 protein-coding genes with a median gene size of ~1 kb containing 7 exons [25]. Protein-coding genes are transcribed by RNA polymerase II, and primary transcripts are spliced to remove introns and generate mature mRNAs, which are polyadenylated by a poly(A) polymerase at the 3′ end, while the 5′ end carries a 7-methylguanosine (m7G) cap that stimulates canonical translation initiation [26]. Mature mRNAs associate with several RNA-binding proteins and exit the nucleus as ribonucleoprotein complexes, which then associate with ribosomes to initiate translation. Multiple factors, such as the number of transcripts and the half-life of the mRNA, determine the level to which a particular mRNA is translated. Housekeeping mRNAs have long half-lives, while mRNAs encoding transcription factors and inducible genes constitute the bulk of mRNAs with short half-lives, in concordance with their transient roles.

Transfer RNAs
The human genome encodes 610 tRNA genes [25] that are interspersed throughout the nuclear genome and can be classified into 51 anticodon families targeting the 64 codons. Significant intraspecies [26] and interspecies [25] copy number variation has been demonstrated and may extend to the tissue or cellular level. Approximately 50% of the nuclear tRNA genes are transcribed. The standard 20 a.a. are decoded by 597 different tRNAs, and 3 tRNAs carry selenocysteine, which is incorporated into the growing peptide chain by a unique suppressor tRNA that reads a stop codon. Moreover, 2 tRNAs have potential suppressor function, and 6 tRNAs carry an unknown a.a. Additionally, the mitochondrial genome encodes 22 mitochondrial tRNAs (mtRNAs) [27]. Nuclear tRNAs are encoded by intronic or intergenic tRNA genes that are transcribed by RNA polymerase III in conjunction with the transcription factors TFIIIB (comprising TBP, BDP1, and BRF1) and TFIIIC in a spatially distinct region of the nucleus termed the nucleolus.
The prototypical tRNA gene consists of a 5'-UTR and signature A and B box motifs [28,29], followed downstream by a stretch of U residues that signal transcript termination. tRNA genes can be located within introns of protein-coding genes, where they are cotranscribed with their host genes. For all intergenic tRNAs, transcription is a concerted process that begins with binding of transcription factor TFIIIC to the A and B box region, continues with recruitment of TFIIIB upstream, and culminates in recruitment of RNA Pol III. The primary transcript is then processed by RNase P- and RNase Z-mediated removal of the 5′ leader and the 3′ trailer sequence, respectively, after which tRNA nucleotidyl transferase adds the 3'-CCA trinucleotide [30][31][32]. Several posttranscriptional modifications of the tRNA are followed by coupling of the tRNA to its cognate a.a., a process mediated by aminoacyl-tRNA synthetases. tRNA charging involves recognition of several modifications on the tRNA body, especially N73 near the CCA motif at the 3′ end [33]. Aberrant primary tRNA transcripts are recycled through a nonsense-mediated decay pathway involving degradation of their 3′ ends. Additionally, mature tRNAs lacking modifications are degraded via 5′ exonucleolytic cleavage. Eukaryotic cells encode 20 distinct aminoacyl-tRNA synthetases, one for each of the 20 standard a.a. It remains unclear if aminoacylation is restricted to the nucleus or also occurs in the cytoplasm. Mitochondrial tRNAs (mtRNAs), which are encoded on the circular mitochondrial genome between the rRNA and mRNA genes [27], are transcribed by the mitochondrial RNA polymerase in conjunction with the transcription factors Tfam and mtTFB from bidirectional promoters on the mitochondrial genome.
Both cytosolic tRNAs and mtRNAs are posttranscriptionally modified [34], though nuclear tRNAs [35] can have additional modifications, presumably owing to the mechanisms of action of nuclear tRNAs and the bacterial origin of mitochondrial tRNAs [27,36,37]. These modifications have at least three important functions: (1) modifications in the anticodon loop alter translation efficiency; (2) modifications to the tRNA body affect tRNA secondary structure; and (3) modifications at other positions determine aminoacyl-tRNA synthetase recognition and amino acid loading on the CCA motif [38]. More than 100 diverse modifications have been reported for nuclear tRNAs, while mtRNAs exhibit about 16 conserved posttranscriptionally modified nucleosides [39]. The nature and role of tRNA modifications are beyond the scope of this review, but they are essential for both canonical and noncanonical tRNA functions. [41,42], and (3) class III mtRNAs (e.g., mtRNA Ser(AGY)) lack the D-loop and do not exhibit the classical cloverleaf structure [43,44].

Wobble hypothesis and its implications for translation
The specificity of the codon:anticodon interaction is crucial for incorporation of the correct amino acid into the growing peptide chain and determines the composition of the proteome [45][46][47], the rate of a.a. misincorporation [48][49][50][51][52], and ultimately protein folding [53,54]. However, the standard genetic code is degenerate (i.e., more than one codon can specify the same amino acid). For example, two different codons (AAA and AAG) specify the a.a. lysine (K), and a single tRNA Lys is able to bind to both codons for K in any given mRNA. This is because the ribosome can determine whether the interactions between the first two bases of the codon on the mRNA and the corresponding bases of the tRNA anticodon are of Watson-Crick type, but cannot distinguish whether the third base interaction is perfectly complementary. Nuclear magnetic resonance (NMR) studies with anticodon stem loops of E. coli tRNA Lys have shown three modifications in this region: an N6-threonylcarbamoyladenosine (t6A) modification at position 37, a 5-methylaminomethyl-2-thiouridine (mnm5s2U) modification at position 34, and a pseudouridine at position 39, which force the dynamic loop structure to assume an open U-turn structure that fits the ribosomal decoding center [55]. Ribosome profiling studies have shown that wobble base pairing slows the rate of protein translation [56]. Controlling the rate of translation via wobble base pairing has important implications: (1) it permits the use of infrequent tRNAs that are expressed only under particular stimuli, (2) it allows for stable and correct folding of the protein, and (3) it allows information for regulation of translation rate to be hard-coded in the mRNA [57,58].
Recent studies have shown that in cellular organelles that do not encode all the tRNAs necessary to read the genetic code, a single tRNA species containing a U in the wobble position of the anticodon can read all four codons of a fourfold degenerate codon family, a phenomenon described as superwobbling [58]. Superwobbling thus allows codons to be decoded not only by tRNAs with a perfectly complementary or wobble-pairing third base but also by a single superwobbling tRNA, which permits a smaller tRNA gene set and hence smaller genomes [58,59].
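The decoding rules in this section can be made concrete with a small sketch. It assumes the classical wobble pairing rules for the first anticodon base (position 34) and treats superwobbling as an unmodified U34 reading all four third-position bases; the function name `codons_read` and its `superwobble` flag are hypothetical, and the effects of nucleoside modifications (beyond inosine, written as I) are not modeled.

```python
# Strict Watson-Crick pairs, used for codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Classical wobble rules: anticodon base 34 -> codon third bases it can read.
WOBBLE = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"U", "C", "A"},   # inosine
}


def codons_read(anticodon: str, superwobble: bool = False) -> list[str]:
    """Return the codons decoded by an anticodon written 5'->3' (positions 34-36)."""
    n34, n35, n36 = anticodon
    first = WATSON_CRICK[n36]     # codon position 1 pairs with anticodon position 36
    second = WATSON_CRICK[n35]    # codon position 2 pairs with anticodon position 35
    thirds = WOBBLE[n34]          # codon position 3 pairs loosely with position 34
    if superwobble and n34 == "U":
        thirds = {"A", "C", "G", "U"}   # assumption: U34 reads the whole family
    return sorted(first + second + third for third in thirds)


print(codons_read("UUU"))                    # ['AAA', 'AAG']
print(codons_read("GAA"))                    # ['UUC', 'UUU']
print(codons_read("UCC", superwobble=True))  # ['GGA', 'GGC', 'GGG', 'GGU']
```

Under these assumptions an anticodon UUU reads both lysine codons, while a superwobbling UCC anticodon covers the entire glycine family with one tRNA species.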

Alternative functions of tRNAs
In addition to their normal function in protein synthesis, tRNAs respond acutely to cellular and environmental stresses. Cells with different proteomic profiles also exhibit diversity in tRNA isoacceptor types, i.e., tRNAs with different anticodons that carry the same a.a. tRNA expression, posttranscriptional modifications, and abundance (both gene copy number and expression) typically reflect the cellular state: tRNAs that decode the most abundant codons are found in high copy numbers. tRNA expression levels in a particular cell type reflect the codon bias of that cell and indicate its proliferation status, a feature that supports the proposition that tRNA gene expression is modulated in response to the needs of the host cell. The ribosomal tempo is thus regulated by the abundance and diversity of the tRNA pool available during translation.
tRNAs are cleaved during cellular stress [60] and in the immune response to infection, generating specific tRNA fragments (tRFs) that contain the 5′ (5'tRFs) or the 3′ (3'tRFs) ends of the parent tRNA molecule (Figure 2). Most known tRFs are nuclear in origin, though a few tRFs have been shown to originate from plastid genomes [61] or mitochondria [62]. tRFs have also been reported to originate from the pre-tRNA instead of the mature tRNA molecule, and these are labeled 3'-U tRFs since they match the 3′-trailer region of the precursor tRNA [63][64][65]. Many tRFs that result from cellular stress conditions consist of two 30-40 nt long fragments produced by cleavage in the anticodon loop and are referred to as tRNA-derived stress-induced RNAs (tiRNAs) [66][67][68]. tiRNAs are universal hallmarks of cellular stress across all kingdoms of life [69][70][71][72][73][74][75].
The mechanisms by which tRFs are produced are most likely stimulus- and species-specific. Similarly, the functional roles of tRFs are yet to be elucidated (reviewed previously [92]). In yeast, tRFs are associated with starvation-induced vacuoles, where they are degraded to provide phosphate and nitrogen [93]. tRFs also accumulate in plants during conditions of phosphate paucity [70].
Cleavage of the 3′-terminal CCA by angiogenin has been shown to reduce the rate of protein translation [94], as well as translation initiation, by competing with the eukaryotic initiation factor eIF4F.

How do tRNAs affect vaccine production and potentially efficacy?
Among the variety of stimuli host cells respond to, intracellular pathogens are a special case, as many pathogens themselves regulate host cell translation. Viruses in particular regulate multiple facets of the host translation process since inhibition of host protein synthesis (1) makes Immunization of a host with viral vaccine antigen can prevent viral modulation of the host translation machinery. Most viral antigens are considered "foreign" by the host cell, a feature tied to their codon usage, which differs from that of the host.
The standard vertebrate genetic code contains 64 codons (61 coding for an amino acid and 3 stop codons); however, most eukaryotic proteins contain only the 20 standard amino acids, and thus more than one codon can encode the same amino acid. Codons that specify the same amino acid are referred to as synonymous codons; those that do not are termed nonsynonymous codons. Most biological systems have evolved to preferentially utilize one or a few codons for each amino acid during translation, a feature referred to as codon usage bias [95][96][97][98]. Thus, in an infected cell, viral and host proteins may be translated from very different collections of codons. Accumulating data show that many viruses evolve to adapt their codon usage to the host [99], and this adaptation can be specific to each virus or viral gene to regulate the tempo and pattern of expression. This raises a challenge in commercial vaccine production because rare codon usage can lead to low yield of the immunogen and increase production costs [100]. Secondly, while most host protein synthesis begins with an initiator codon (AUG) coding for methionine, viral genomes utilize multiple mechanisms of noncanonical translation such as internal ribosomal entry sites (IRES), ribosome shunting, leaky scanning of the viral open reading frame, non-AUG initiation, reinitiation, frameshifting, read-through translation, and stop-carry on translation [101]. A detailed description of these mechanisms is beyond the scope of this review, but it is important to understand how they can be used to improve vaccine yield and/or efficacy.
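The codon arithmetic in this paragraph can be checked directly. The sketch below assumes the standard genetic code and groups the 64 codons by the amino acid they specify; the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons: every codon in one list specifies the same residue.
synonymous = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonymous[amino_acid].append(codon)

stop_codons = synonymous.pop("*")                  # "*" marks termination
sense_codons = sum(len(codons) for codons in synonymous.values())
degeneracy = {aa: len(codons) for aa, codons in synonymous.items()}

print(len(CODON_TABLE), sense_codons, sorted(stop_codons))
# 64 61 ['UAA', 'UAG', 'UGA']
print(len(degeneracy), degeneracy["M"], degeneracy["K"], degeneracy["L"])
# 20 1 2 6
```

Degeneracy is uneven: methionine has a single codon, lysine two, and leucine six, which is why codon usage bias can differ so strongly between amino acids.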
A commonly employed strategy to improve vaccine yield is to optimize the codon usage pattern of the antigen in question to overcome codon bias [57,102]. Codon usage is quantified by counting the number of times a particular codon is observed in a gene or set of genes. This count can be extended to the relative synonymous codon usage (RSCU), the observed frequency of a codon divided by the frequency expected if all synonymous codons for that amino acid were used equally, i.e., in the absence of codon usage bias. By tabulating the most frequently used codons in the host genome and comparing them with those used in the viral genome, it is possible to discern the codon usage bias (CUB) of the virus. Immunogens in vaccines can then be either expressed in cells that overexpress the rare tRNAs used by the viral protein to increase protein yield, or engineered through molecular tools (site-directed mutagenesis, cloning, etc.) to utilize the most common host codons. This codon optimization strategy has been employed for developing a variety of vaccines [57,. Yet codon optimization has also been reported to reduce vaccine efficacy by increasing antigenicity and changing the conformation of the native immunogen [141][142][143][144][145]. Codon optimization as a way to increase immunogen (vaccine) production rests on the assumptions that (1) rare codons limit the rate of translation, (2) synonymous codons have redundant function, (3) replacing rare codons with high-frequency codons improves protein yield, and (4) sites of posttranslational modification are preserved upon codon optimization. However, multiple studies have shown that these assumptions do not necessarily hold and that other factors, such as mRNA secondary structure [146] and posttranscriptional modifications of mRNAs [147], can alter rates of translation.
Conversely, incorporation of rare (nonpreferred) codons into viral genes used for antigen production can decrease production of viral antigens and thereby lead to attenuation. This codon deoptimization strategy has also been employed for a variety of viral vaccine candidates [148][149][150][151][152][153][154][155][156][157][158][159][160][161][162][163]. These studies have shown attenuation of viral replication and improved immune responses. Further, it was recently shown that deoptimized live attenuated viral vaccines, in the case of respiratory syncytial virus (RSV), remain genetically stable if the changes in the genome are distributed throughout and not restricted to one viral gene or antigen [149]. Codon deoptimization strategies are still being explored for viral vaccine design; however, as with codon optimization strategies, the rules for designing a safe and effective candidate are only partly understood. Both optimization and deoptimization require extensive computational analysis, which needs to be followed up with measures of attenuation, antigenicity, and structural analysis of the antigen, coupled with analysis of alternative peptides and proteins. An overview of codon optimization strategies currently used for viral antigens is shown in Figure 3.
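As an illustration of the codon counting described above, the following sketch computes relative synonymous codon usage (RSCU) under its commonly used definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu` and the toy coding sequence are hypothetical; a real analysis would use complete host and viral gene sets.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per sense codon for an in-frame coding sequence (RNA alphabet)."""
    usable = len(cds) - len(cds) % 3              # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)                  # amino acid -> synonymous codons
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":                     # stop codons are excluded
            families[amino_acid].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                              # amino acid absent from this gene
        expected = total / len(codons)            # count under uniform usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Hypothetical toy gene: AUG AAA AAA AAG CUG CUG CUG CUC UAA
result = rscu("AUGAAAAAAAAGCUGCUGCUGCUCUAA")
print(round(result["AAA"], 2), round(result["AAG"], 2))  # 1.33 0.67
print(round(result["CUG"], 2), round(result["CUA"], 2))  # 4.5 0.0
```

An RSCU of 1 means a codon is used exactly as often as expected without bias; values above 1 mark preferred codons and values below 1 mark avoided ones. Comparing such tables for host and virus is one simple way to see where their codon usage diverges.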
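Both recoding strategies can be sketched as the same operation run in opposite directions: each codon is swapped for a synonymous codon that is either the most used (optimization) or the least used (deoptimization) in the host. In the sketch below, the function names `recode` and `protein` and the `HOST_USAGE` table are hypothetical, and the usage fractions are illustrative values rather than measured data.

```python
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host usage fractions (illustrative only, not measured data).
HOST_USAGE = {
    "AAA": 0.43, "AAG": 0.57,                             # lysine
    "CUG": 0.40, "CUC": 0.19, "CUU": 0.13, "CUA": 0.07,   # leucine
    "UUG": 0.13, "UUA": 0.08,
}


def protein(cds: str) -> str:
    """Translate every complete in-frame codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def recode(cds: str, host_usage: dict[str, float], mode: str = "optimize") -> str:
    """Swap each codon for its most ('optimize') or least ('deoptimize') used synonym."""
    if mode not in ("optimize", "deoptimize"):
        raise ValueError("mode must be 'optimize' or 'deoptimize'")
    pick = max if mode == "optimize" else min
    usable = len(cds) - len(cds) % 3
    recoded = []
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        known = [c for c, aa in CODON_TABLE.items()
                 if aa == amino_acid and c in host_usage]
        # Codons with no usage data (here AUG and the stop codon) stay unchanged.
        recoded.append(pick(known, key=host_usage.get) if known else codon)
    return "".join(recoded)


gene = "AUGAAACUAUAA"                             # toy sequence encoding M, K, L, stop
print(recode(gene, HOST_USAGE, "optimize"))       # AUGAAGCUGUAA
print(recode(gene, HOST_USAGE, "deoptimize"))     # AUGAAACUAUAA
print(protein(gene) == protein(recode(gene, HOST_USAGE, "optimize")))  # True
```

The encoded protein is unchanged in both directions, which is the premise of synonymous recoding. The sketch does not capture the caveats raised in the text, such as effects on mRNA secondary structure, folding, or antigenicity, so any real design would still need the experimental follow-up described above.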

Future directions
tRNAs and other molecules involved in host translation are important targets for disease intervention, especially for intracellular viral pathogens, which are completely reliant on the host translation machinery for their successful replication and propagation in the host. However, the mechanisms by which viruses and their hosts regulate translation are still being elucidated, and this information is critical for the development of novel interventions for both infectious and noninfectious diseases. Several vaccine production platforms use codon optimization strategies so that vaccine candidates mimic host codon usage and can be produced more efficiently and at lower cost. This results in selective usage of certain tRNAs to carry particular amino acids and to be recognized by the host cells. It is important that viral proteins can be synthesized preferentially over host proteins, so that these viral antigens stimulate an immune response and educate host immunity to reduce or block damage from subsequent infections. Inherently, every vaccine is foreign to its host, which is what triggers an immune response. Prevalent vaccines used against infectious disease broadly fall into three categories: (1) those involving attenuated/killed pathogen, (2) subunit vaccines that contain one or more pathogen antigens (pathogen-derived or recombinant), and (3) recombinant plasmids that express one or more antigens as above. Additionally, vaccines are formulated considering delivery routes, speed of antigen release, need for adjuvants, and desired immune response. Irrespective of these criteria, the primary property that defines a vaccine is its antigenicity, and it is important to understand the mechanisms that regulate the antigenicity of vaccine candidates in order to retain efficacy in vivo.