Gene Duplication and the Origin of Translation Factors


Introduction
Charles Darwin is famous for his contribution to the development of evolutionary theory. Less commonly known is that Darwin was also an accomplished botanist who wrote several books devoted to flowering plants. Being an honest scientist, he did not conceal the inability of his theory of evolution to explain the sudden appearance and rapid spread of angiosperms, calling this phenomenon an "abominable mystery". One possible solution to the puzzle that troubled Darwin may be the several successive duplications of the ancient ancestral genome at the beginning of the divergence of angiosperms, which gave them the ability to rapidly accumulate changes (Cui et al., 2006). Speculation about the possible role of gene duplication in evolution began in the first half of the last century (Sturtevant, 1925; Haldane, 1932; Muller, 1936; Lewis, 1951), but only the later rapid development of molecular biology allowed the identification of numerous repeated sequences that revealed a high frequency of gene duplication in evolution. Based on this information, S. Ohno (Ohno, 1970) suggested that gene duplication was the only way new genes could emerge.

Types of duplications
Duplication of DNA can occur at several scales: (1) partial duplication of a gene (an internal duplication), (2) duplication of a single gene, (3) partial duplication of a chromosome, (4) duplication of an entire chromosome, and (5) genome duplication, or polyploidy. The first four types of duplication are sometimes combined under the term SSD (smaller-scale duplication) (Davis & Petrov, 2005). Other authors prefer the terms "paralogon" (derived from "paralog") for extended duplicated regions containing paralogs, and SGD (single gene duplication) for duplications of individual genes (Durand & Hoberman, 2006). Duplication of the entire genome is designated WGD (whole genome duplication) (Davis & Petrov, 2005). According to Ohno, duplication of the genome rather than its individual parts is more important for evolution, because the partial duplication of regulatory genes or other restricted elements of the genome may lead to regulatory imbalances (Ohno, 1970).

Whole genome duplications
Ancient polyploidizations of the genome have been identified in all four eukaryotic kingdoms: plants, animals, fungi and protists. In all cases, the proportion of genes in the form of duplicated copies ranges from 10 to 50% and often correlates with the time elapsed since duplication (Scannell et al., 2006). WGD is widespread in plants (Vision et al., 2000; Adams & Wendel, 2005). Estimates of the incidence of polyploidy in angiosperms vary from 30 to 80%, and about 3% of speciation events are explained by genome duplications (Otto & Whitton, 2000). Many, if not all, species of plants may thus have at least one polyploid ancestor. Most eudicots are assumed to have an ancient hexaploid ancestor, with subsequent tetraploidization in some taxa (Jaillon et al., 2007). Duplication of the entire genome in the yeast Saccharomyces cerevisiae led to an initial increase in the number of genes from 5000 to 10 000, but the subsequent loss of paralogs has left modern Saccharomyces with about 5500 protein-coding genes, of which 1102 form 551 paralogous pairs (Byrne & Wolfe, 2005). A special term, ohnologs, coined in honor of S. Ohno, was proposed for paralogs resulting from WGD (Wolfe, 2000).
Detection of natural polyploidy is a difficult task, especially for ancient events. Recent duplications can be detected by comparing closely related species, one of which underwent polyploidization and therefore contains twice as many chromosomes as species that did not undergo WGD. For example, a comparison of the genomes of Ashbya gossypii and S. cerevisiae revealed that both species evolved from a single ancestor that had seven or eight chromosomes (Dietrich et al., 2004). Changes in chromosome number due to mutations (in particular translocations) led to the ancestors of A. gossypii and S. cerevisiae. WGD in S. cerevisiae has provided this species with new opportunities for functional divergence that are absent in A. gossypii. A similar comparative analysis was also carried out for S. cerevisiae and its closest non-WGD relative, Kluyveromyces waltii (Kellis et al., 2004).
The older the duplication, the harder the analysis, because a period of diploidization often follows polyploidization and "transforms" the polyploid genome to the diploid state. Diploidization is achieved by an intensive loss of genes, rearrangements of the genome and the divergence of duplicated genes. Recent analyses have also shown that the duplication of individual genes in evolution has occurred much more frequently than was previously thought (Lynch & Conery, 2000; Lynch et al., 2001). Diploidization has been studied in many genomes, including those of plants (Chapman et al., 2006; Jaillon et al., 2007; Tuskan et al., 2006), bony fishes (Brunet et al., 2006), yeasts (Piskur, 2001; Kellis et al., 2004; Scannell et al., 2006; Scannell et al., 2007), Paramecium (Aury et al., 2006) and vertebrates (Blomme et al., 2006).
Plants have repeatedly undergone polyploidization during evolution, presumably aided by their ability to propagate vegetatively and by the existence of specific regulatory mechanisms in plant cells. In particular, model polyploids have been characterized by a rapid loss of some genes and the specific inactivation of others by methylation (Kashkush et al., 2002; Comai et al., 2000; Lee & Chen, 2001). Epigenetic silencing may protect the duplicated copies from pseudogenization, thus facilitating the acquisition of new functions (Rodin & Riggs, 2003).
Vertebrate genomes contain many families of genes that are not found in invertebrates, and many gene duplications apparently occurred early in the evolution of the chordates (Taylor & Raes, 2004). Ohno suggested that the complex genome of vertebrates arose as a result of two rounds (2R) of WGD (Ohno, 1970). This view was once supported by the belief that the human genome contained about 100 000 genes, four times the estimated number of genes in the genomes of invertebrates. Sequencing of the human genome has since reduced the estimate of the number of genes to 20 000-25 000 but has not yet answered the question of the number of duplications of the ancestral genome. Some authors continue to support the 2R hypothesis (Larhammar et al., 2002; Spring, 1997; Meyer & Schartl, 1999; Wang & Gu, 2000; Dehal & Boore, 2005), others find evidence of only one round of WGD (X. Gu et al., 2002; Guigo et al., 1996; McLysaght et al., 2002), while still others reject the possibility of WGD entirely and discuss only duplications of a limited number of segments (Friedman & Hughes, 2001, 2003).
Ohno (1970) argued that duplication of the genome rather than its individual parts is more important for evolution, because partial duplications can lead to regulatory imbalances. Nevertheless, partial and complete duplications of genes also play very important roles in evolution. WGDs have occurred several times during the evolutionary history of organisms, while SSDs arise continuously through multiple mechanisms.
Several mechanisms have been suggested for the improvement in function of existing proteins and for the creation of new functions. One such mechanism is the internal (partial) duplication of genes, which is important for increasing the functional complexity of genes in evolution (Li, 1997). Such duplications are believed to have played a key role in the emergence of complex genes. Many proteins of modern organisms contain internal repeats of amino acids, and these repeats often correspond to functional or structural domains of proteins. These data suggest that the genes encoding these proteins were formed by internal duplications (Lavorgna et al., 2001). Internal duplication provides the possibility of improving protein function by increasing the number of active sites. Internal duplications can also lead to the acquisition of new functions by the modification of duplicated regions or the reorganization of modules. Numerous data on the role of intragenic duplications in the early stages of evolution of proteins were obtained by comparative analyses of sequenced genomes (Marcotte et al., 1999; Lavorgna et al., 2001; Conant & Wagner, 2005; Chen et al., 2007). Duplicated regions can accumulate mutations that contribute to the divergence of the repeated fragments, which can then become fixed. Often, only traces of duplications in the form of imperfect repeats can be detected in contemporary amino acid sequences (Li, 1997). Eukaryotic proteins have more repeats than do prokaryotic proteins (Marcotte et al., 1999; Chen et al., 2007).
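The yeast gene counts quoted earlier in this section (Byrne & Wolfe, 2005) can be turned into retention figures with a few lines of arithmetic. The following is only an illustrative sketch: the function name and the dictionary keys are hypothetical, and the calculation assumes that the rounded counts of 5000 and 5500 genes are directly comparable, which ignores genes gained or lost by other routes.

```python
# Approximate gene counts quoted in the text for S. cerevisiae.
ANCESTRAL_GENES = 5000   # genes before the whole genome duplication (WGD)
CURRENT_GENES = 5500     # protein-coding genes in modern Saccharomyces
PARALOGOUS_PAIRS = 551   # pairs of ohnologs still present


def retention_summary(ancestral, current, pairs):
    """Return simple retention statistics after a WGD (illustrative only)."""
    genes_in_pairs = 2 * pairs
    return {
        "genes_in_pairs": genes_in_pairs,
        # Share of ancestral loci that are still present as two copies.
        "ancestral_loci_retained_in_duplicate": pairs / ancestral,
        # Share of present-day genes that belong to a paralogous pair.
        "current_genes_in_pairs": genes_in_pairs / current,
        # Copies lost relative to the 2 * ancestral present right after WGD.
        "copies_lost": 2 * ancestral - current,
    }


summary = retention_summary(ANCESTRAL_GENES, CURRENT_GENES, PARALOGOUS_PAIRS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions, roughly 11% of the ancestral loci survive as duplicates, which is in line with the figure of about 10% cited for yeast elsewhere in the text.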

The fate of duplicated genes
Tens of millions of years after WGD in Arabidopsis thaliana and S. cerevisiae, only about 30% and 10%, respectively, of the genes are preserved in the form of duplicated copies (Seoighe & Wolfe, 1999; Wong et al., 2002; Blanc et al., 2003). Preservation of duplicated copies in evolution can be achieved by one of three processes: (1) conservation, in which the copies are maintained in an unaltered state (Hahn, 2009); (2) subfunctionalization, in which both paralogs are necessary for performing the functions previously provided by the ancestral gene; and (3) neofunctionalization, in which one of the paralogs acquires a new function and the other preserves the old function. The terms subfunctionalization and neofunctionalization were both proposed by Force et al. (1999). Characteristically, in (2) and (3), the regulatory and/or structural parts of the gene may be changed (Figure 1).
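The three outcomes listed above can be made concrete with a small sketch that treats each gene as a set of function labels. This is a deliberately simplified model, not a method from the cited literature: the function name `classify_fate`, the labels and the extra categories "loss" and "other" are assumptions introduced here for illustration.

```python
def classify_fate(ancestral, copy_a, copy_b):
    """Classify the fate of a duplicated gene pair from sets of functions.

    Each argument is a collection of hypothetical function labels.
    """
    ancestral, copy_a, copy_b = set(ancestral), set(copy_a), set(copy_b)
    if not copy_a or not copy_b:
        # One copy has no function left (pseudogenized or deleted).
        return "loss"
    if (copy_a | copy_b) - ancestral:
        # At least one function is absent from the ancestral gene.
        return "neofunctionalization"
    if copy_a == ancestral and copy_b == ancestral:
        return "conservation"
    if (copy_a | copy_b) == ancestral and copy_a != ancestral and copy_b != ancestral:
        # Neither copy alone covers the ancestral functions; both are needed.
        return "subfunctionalization"
    return "other"


print(classify_fate({"f1", "f2"}, {"f1", "f2"}, {"f1", "f2"}))
print(classify_fate({"f1", "f2"}, {"f1"}, {"f2"}))
print(classify_fate({"f1"}, {"f1"}, {"f1", "f3"}))
```

Real cases are rarely this clean, because functions are quantitative and regulatory changes matter as much as changes in the coding sequence.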

Conservation of duplicated copies
Duplicated genes are retained unchanged in cases where the normal development of the organism needs many copies of genes with similar function, which allows the synthesis of a larger amount of specific RNA or protein (Ohno, 1970). An increase in the number of copies of these genes correlates with the increasing complexity of the organism (Chen et al., 2007). Amplification of genes in microorganisms leads to resistance to antibiotics and heavy metals, increased virulence and other adaptive properties (Romero & Palacios, 1997; Reams & Neidle, 2004; Andersson & Hughes, 2009). In plants, amplification of genes provides resistance to herbicides (Harms et al., 1992; Shyr et al., 1992). The best known examples of conservation of duplicated copies in various organisms are genes for rRNA, tRNA and histones, many of which are organized in tandem repeats, which allows the maintenance of homogeneity by unequal crossing over or gene conversion (Hurles, 2004).
One of the most interesting questions related to the preservation of duplicated copies of genes is whether the loss of genes is a random event or is subject to natural selection. Which duplicates are lost, and which persist after polyploidization? About 10% of yeast genes are preserved in the form of duplicated copies, and most are not needed for viability (Z. Gu et al., 2002). The most frequently duplicated genes encode cyclins, components of signal transduction pathways, and cytoplasmic (but not mitochondrial) ribosomal proteins. Most are characterized by high levels of expression. Selection for an increased level of expression was perhaps the major factor in the preservation of duplicated genes (Seoighe & Wolfe, 1999). Analysis of the most recent WGD in Arabidopsis showed a preferential retention of genes involved in transcription and signal transduction, whereas genes involved in DNA repair or encoding proteins of organelles were lost more frequently (Blanc & Wolfe, 2004). Interestingly, genes preserved as paralogs after duplication have a high probability of remaining duplicated after the next round of duplication (Seoighe & Gehring, 2004). Loss of duplicates is thus not a random process.

Subfunctionalization
This hypothesis is to some extent the opposite of Ohno's hypothesis of evolution, because it assumes the existence of both functions before the duplication (Figure 1). The first evidence for it was the discovery of the phenomenon of "gene sharing" (Piatigorsky et al., 1988; Piatigorsky, 2003). This model explains the emergence of new genes by the duplication of multifunctional genes. Such genes encode proteins that already perform different functions.
Gene sharing was discovered in crystallins, proteins found in the lens of the eye. Crystallins make up 70% of the contents of cells, but remain in soluble form without forming aggregates (formation of aggregates leads to cataracts). Under such conditions, a majority of proteins would form insoluble aggregates within seconds. Another feature of these proteins is their record longevity (equal to the lifetime of an individual, for example 80 years); most proteins last for only minutes or hours. The eyes of all vertebrates have a standard set of crystallins (α, β, γ), and additional species-specific crystallins are encoded by genes that in other tissues encode enzymes. In most cases, this double life is ensured not by duplications but by a "division of functions": enzyme and crystallin are encoded by the same gene, and the protein can perform additional functions without changing its amino acid sequence. This phenomenon was thus called gene sharing (Piatigorsky et al., 1988; Piatigorsky, 2003). In gene sharing, a gene acquires a second function without duplication and without loss of its primary function. A change in tissue specificity or regulation during development, however, may occur. Acquisition of a new function without duplication was first detected in crystallin ε in birds and crocodiles (up to 23% of the total protein of the lens). The amino acid sequence of crystallin ε was identical to that of lactate dehydrogenase B (LDH), and the protein had an activity similar to LDH. Subsequent work showed that both proteins were encoded by the same gene. Similarly, crystallin τ in lampreys, bony fishes, reptiles and birds is identical to and encoded by the same gene as α enolase. Zeta-crystallin is identical to quinone reductase. Crystallins δ, ε and τ thus illustrate examples of "division of functions", in which a gene has acquired additional functions without duplication.
Multifunctional genes are significantly limited in their capacity for adaptive change, since a mutation that improves one function may disturb another. Duplication could provide a possible resolution of this "adaptive conflict". The molecular mechanisms leading to subfunctionalization were not studied in detail until recently. Such analyses only became possible with the comparative analysis of genes in closely related species, for example genes involved in galactose utilization in S. cerevisiae and K. lactis (Hittinger & Carroll, 2007). Divergence in the expression of duplicated genes over long periods of time attracted the interest of scientists as an important stage in the emergence of a new gene by duplication (Ohno, 1970; Ferris & Whitt, 1979). Thus in some cases, duplicates may have identical coding sequences but different regulatory sequences (Figure 1). Some pairs of duplicated genes can diverge in concert, forming two groups that are expressed in different tissues or under different conditions (Blanc & Wolfe, 2004). This process, which explains the divergence of metabolic pathways, is called "concerted divergence".

Neofunctionalization
The stable maintenance of duplicated copies in the genome requires functional divergence. In Ohno's (1970) view, functional divergence is achieved when one copy of the gene retains the old function while other copies acquire new functions. An inevitable intermediate stage in this process would be the emergence of a pseudogene, as most mutations will disrupt or inactivate a gene rather than give rise to new functions. Because the acquisition of a new function by this route is considered extremely unlikely, an extended hypothesis of neofunctionalization (NF) has been proposed, which includes the following possibilities: (1) a new gene acquires a new function but keeps the old function (NF-I), (2) a new gene completely loses the old function (NF-II), or (3) a new gene retains part of the old function (NF-III) (He & Zhang, 2005). Many examples of neofunctionalization have been described in recent years (see Hahn, 2009), although distinguishing neofunctionalization from subfunctionalization is sometimes difficult, which has led to the creation of a "subneofunctionalization" model (He & Zhang, 2005).

Exon shuffling as a mechanism of neofunctionalization
One of the options for neofunctionalization is the formation of "chimeric" or fusion genes (Long, 2000). This phenomenon is possible due to the duplication of a gene or part of a gene, because only then can the original gene remain functional. After gene duplication, one of the copies can capture one or more exons from an unrelated adjacent gene. Another possibility is the addition of flanking non-coding DNA as an additional open reading frame. The model known as "exon shuffling" (Gilbert, 1978) suggests that recombination in introns can provide a mechanism for exchanging exon sequences between genes. However, the event will be evolutionarily significant only if it involves a structural or functional domain. Moreover, the shuffling of domains can occur without the involvement of introns (Doolittle, 1995). It is thus more accurate to speak of the shuffling of domains rather than of exons. Introns do not occur in prokaryotic genes, yet many cases of domain shuffling have been described in prokaryotes. The presence of introns, though, greatly facilitates the shuffling of domains, especially in vertebrates. In the 30 years since the discovery of introns, many examples of exon shuffling in a variety of organisms (vertebrates, invertebrates, plants) have been found. Only relatively recently have retrotransposition and illegitimate recombination been shown to be responsible for these phenomena (Long et al., 2003; van Rijk & Bloemendal, 2003).

The main stages of translation
In the process of protein synthesis, or translation, four phases are usually distinguished: initiation, elongation, termination and recycling (Figure 2). During initiation, the ribosome is assembled at the initiation codon of the mRNA, and the initiating methionyl-tRNA is bound in the peptidyl (P) site of the ribosome. The main objectives of the initiation of translation are identical in bacteria and eukaryotes, but initiation is much more complex in eukaryotes than in bacteria (Kapp & Lorsch, 2004). Three initiation factors occur in bacteria, but eukaryotes have at least 12, which contain about 23 different proteins (Sonenberg & Dever, 2003). Interestingly, the initiation of translation in archaea is intermediate in complexity between bacterial and eukaryotic translation.
During elongation, the aminoacyl-tRNA binds to the aminoacyl (A) site of the ribosome, where the information recorded on the mRNA is translated into the language of proteins. This process involves elongation factor eEF1A (EF-Tu in bacteria) in complex with GTP. The ribosome catalyzes the formation of peptide bonds when the anticodons of tRNAs correspond to the codons of the mRNA. After translocation of the mRNA with the help of eEF2 (EF-G in bacteria), the next codon arrives in the A-site, and the process repeats. In contrast to initiation, the main components involved in elongation are highly conserved in all three domains. For example, the human elongation factor eEF1A and EF-Tu of Escherichia coli are 33% identical along their entire length and exhibit a higher degree of similarity in the GTP-binding domains (Cavallius et al., 1993). The proteins a/eEF1A and a/eEF2 reveal significant structural similarities, both in the free state and in complex with the ribosome (Andersen et al., 2001; Stark et al., 2002; Valle et al., 2002; Jorgensen et al., 2003). The similarity of elongation factors in bacteria, archaea and eukaryotes suggests that the mechanisms of elongation in eukaryotes in many respects correspond to those in bacteria and archaea (Ramakrishnan, 2002).
Termination of translation begins when a stop codon (UAA, UAG or UGA) enters the A-site of the ribosome. As the result of this process, the newly synthesized polypeptide chain is released. The stop codon is recognized by a release factor (RF1/RF2 in bacteria and eRF1 in eukaryotes) that triggers release of the nascent peptide from the ribosome. The efficiency of termination is enhanced by a GTPase release factor, RF3 in bacteria and eRF3 in eukaryotes (Kisselev et al., 2003). At least some stages of the termination of translation, such as recognition of the stop codon and hydrolysis of peptidyl-tRNA, are assumed to be similar in archaea and eukaryotes. This hypothesis is based on the homology of aRF1 and eRF1 and on the finding that aRF1 of Methanococcus jannaschii is able to function in an in vitro system containing mammalian ribosomes (Dontsova et al., 2000). Archaea, however, do not have homologs of RF3 and eRF3, which does not necessarily mean the absence of proteins with similar functions. Alternatively, these proteins may be absent due to a reduction of the translation apparatus during the evolution of archaea (Lecompte et al., 2002).
During the final stage of translation, recycling, the ribosome dissociates and the mRNA and deacylated tRNAs are released. An essential feature of this stage is the preparation of a new round of initiation. The details of this process are known only for bacteria.
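The identity figure quoted above for eEF1A and EF-Tu (33%) is a ratio computed over an alignment of the two sequences. The sketch below shows one common way of computing such a value from two already aligned sequences. It is an assumption-laden illustration: the function name is hypothetical, the toy sequences are invented, and the denominator used here (columns without gaps) is only one of several conventions, so it will not necessarily reproduce published percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 6 ungapped columns, 4 of them identical.
print(round(percent_identity("MKT-AGE", "MRTLAGD"), 1))
```

Computing the value over a full-length alignment and over the GTP-binding domain alone would give the two kinds of figures contrasted in the text.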

Termination factors have arisen by the duplication of genes encoding elongation factors
Comparison of amino acid sequences in the family of elongation factors raised speculation that the progenitors of EF-G and EF-Tu arose as a result of duplication and subsequent divergence of a gene encoding an ancient GTPase, and that further duplications led to the emergence of modern elongation and termination factors (Nakamura & Ito, 1998; Inagaki & Doolittle, 2000) (Figure 3). RF1, RF2 and RF3, as well as eRF1 and elongation factor eEF2, are assumed to have been derived from the bacterial elongation factor EF-G (Nakamura & Ito, 1998), while eRF3 arose from the duplication of the gene encoding eukaryotic elongation factor eEF1A (Inagaki & Doolittle, 2000). The amino acid sequences of RF1 and RF2 are 36% identical, suggesting that the genes prfA and prfB arose from a common precursor by duplication (Craigen et al., 1990). Homologs of eRF1 are found in different species, and the eRF1 protein from different species is able to replace eRF1 of S. cerevisiae, indicating a high degree of functional conservation (Urbero et al., 1997).
An almost complete lack of similarity in the sequences of bacterial and eukaryotic termination factors probably indicates their independent origin (Kisselev et al., 2003). On the other hand, the first class factors (RF1, RF2, aRF1 and eRF1) could be so divergent that they have lost any resemblance, with the exception of the GGQ motif (Frolova et al., 1999; Lecompte et al., 2002; Seit-Nebi et al., 2001). The lack of homology between the amino acid sequences of bacterial and eukaryotic termination factors does not mean that these proteins lack similarity at other levels of the organization of protein molecules. Indeed, the spatial structures of many translation factors are characterized by a number of common features that fit the hypothesis of "molecular mimicry" (Nissen et al., 2000; Nakamura & Ito, 2003).
Figure 3. [...] (Hoshino et al., 1998; Jakobsen et al., 2001); ** duplication described only from Saccharomyces (Atkinson et al., 2008); *** duplication specific to several species of ciliates (Liang et al., 2001; Atkinson et al., 2008) and A. thaliana (Chapman & Brown, 2004). Branch lengths are not to scale. The progenitors of prokaryotic EF-G and EF-Tu were proposed to have first diverged from a common ancestral GTPase, and then each gave rise to two protein families corresponding to the elongation and termination factors (Nakamura & Ito, 1998; Inagaki & Doolittle, 2000; Atkinson et al., 2008). EF, elongation factor; RF, release factor; e, eukaryotic; a, archaeal.
In contrast to eRF1, eRF3 is a much less conserved protein, especially in its N-terminal domain, which can either be completely absent, as in the case of Giardia lamblia (Inagaki & Doolittle, 2000), or show species-specific differences in length (maximum length is 321 amino acids in Leishmania major (Atkinson et al., 2008)) and amino acid sequence. This lack of conservation may underlie species-specific regulation of the activity of this protein (Kodama et al., 2007). In some species of yeast, the N-terminus is enriched in Q and N residues and provides prionogenic properties to the protein (Kushnirov & Ter Avanesyan, 1998). The same amino acid composition is also detected in the N-terminal domains of eRF3 in the kinetoplastid protists L. major and Trypanosoma cruzi, but this similarity is unlikely to reflect homology (Atkinson et al., 2008).
For termination of translation and maintenance of viability, only the C-terminal domain of eRF3 (homologous to elongation factor eEF1A) is necessary. eRF3 may have arisen in the early stages of eukaryotic evolution, since neither bacterial nor archaeal genomes contain homologs of eRF3 (Inagaki & Doolittle, 2000). Recent studies have shown that the functions of eRF3 can be performed in archaea by aEF1A (Saito et al., 2010). The termination factor eRF3, while preserving the functions typical of elongation factors (GTPase activity and interaction with the A-site of the ribosome), lost the capacity to bind tRNA but acquired the capacity to interact with eRF1 (Table 1). From this standpoint, elongation factor EF1A of archaea is functionally intermediate between elongation and termination factors: it acquired the ability to stimulate aRF1 while maintaining all the properties of an elongation factor (Saito et al., 2010). Termination factor eRF1 is a striking example of neofunctionalization, because it has acquired a variety of functions absent in elongation factors, including the ability to decode stop signals and to catalyze the release of nascent peptides from eukaryotic ribosomes in response to stop codons.
Table 1. Functional homology between elongation and termination factors in Archaea, Bacteria and Eukaryota.

Additional paralogs of termination factors in several species
Additional duplications of genes encoding termination factors have been found in several species (Figure 3). For example, an additional copy of eRF1 is present in some lineages of ciliates (Liang et al., 2001; Atkinson et al., 2008). These organisms differ from most eukaryotes by their reassignment of one or two stop codons to encode amino acids (Lozupone et al., 2001). UGA, for instance, encodes cysteine in Euplotes (Meyer et al., 1991). The presence of two copies of eRF1 in Euplotes octocarinatus was proposed to be associated with a different specificity of the eRF1 proteins for UAA and UAG codons (Liang et al., 2001). Later studies, however, showed that both eRF1a and eRF1b recognize UAA and UAG as stop codons (Wang et al., 2010). The precise functions of each protein thus remain to be discovered. The plant A. thaliana has three paralogs of eRF1, all of which are able to rescue the sup45-2(ts) mutation in SUP45 (encoding eRF1) in S. cerevisiae (Chapman & Brown, 2004).
Another example of duplication found only in some taxonomic groups is the presence of two paralogous genes encoding eRF3 in mammals. In mammals, proteins homologous to eRF3 can be divided into two subfamilies based on the sequence of their N-termini. The first subfamily includes human hGSPT1 (or eRF3a) and mouse mGSPT1 (Hoshino et al., 1989; Hoshino et al., 1998; Jean-Jean et al., 1996), while the second subfamily includes human hGSPT2 (eRF3b) and mouse mGSPT2 (Hoshino et al., 1998; Jakobsen et al., 2001). Complementation experiments have shown that only mGSPT2 is able to complement a mutation in the SUP35 gene (encoding eRF3) (Le Goff et al., 2002). GSPT2 is a paralog of GSPT1 that has perhaps arisen as a result of retrotransposition of the GSPT1 transcript into the genome of the common ancestor of mouse and human. GSPT2 may thus be a functional retrogene (Zhouravleva et al., 2006). Both eRF3a and eRF3b are able to serve as termination factors in mammalian cells and interact with eRF1 (Chauvin et al., 2005). However, eRF3a is considered the main factor (Chauvin et al., 2005) and is expressed in all tissues, while eRF3b is detected only in the brain (Hoshino et al., 1998; Chauvin et al., 2005). This duplication event may not have led to the emergence of a new gene function but may have contributed to the complexity of regulatory processes through tissue-specific expression of these genes.

Subneofunctionalization in a family of termination factors gave rise to proteins participating in mRNA quality control
A necessary condition of protein synthesis is that it yields functionally active proteins, so accuracy is controlled at each stage of translation (Valente & Kinzy, 2003). The accuracy of initiation is achieved by proper identification of the start codon by a multifactorial initiation complex (Asano et al., 2001). Elongation requires the control of various events, including maintenance of the correct reading frame. Shifts in the reading frame occur at a frequency near 3 × 10⁻⁵ (Atkins et al., 1991) and may lead to the synthesis of non-functional products, because a shifted frame will often create a premature termination codon (PTC). Eukaryotic cells possess a mechanism known as nonsense-mediated mRNA decay (NMD) that recognizes and degrades mRNA molecules containing premature termination codons (Amrani et al., 2006) (Figure 4). NMD is mediated by the trans-acting factors Upf1, Upf2 and Upf3, all of which directly interact with eRF3; only Upf1 interacts with eRF1 (Czaplinski et al., 1998; Wang et al., 2001). In addition to NMD, eukaryotic cells contain two further mechanisms of mRNA quality control. No-go decay (NGD) releases ribosomes that are stalled on the mRNA (Doma & Parker, 2006). In yeast, NGD involves the proteins Hbs1 and Dom34 (Pelota in mammals). Another mechanism, non-stop decay (NSD), leads to the release of ribosomes that have read through the stop codon instead of terminating (Vasudevan et al., 2002). NSD has only been found in S. cerevisiae and involves the Ski7 protein (van Hoof et al., 2002). A common feature of these processes is that all involve the termination factors eRF1 and eRF3 (NMD) or their paralogs (Dom34/eRF1 and Hbs1/eRF3 in NGD; Ski7/eRF3 in NSD).

Hbs1 is a paralog of eEF1A and eRF3 (Wallrapp et al., 1998; Inagaki & Doolittle, 2000), while Dom34 is a paralog of eRF1 (Koonin et al., 1994; Davis & Engebrecht, 1998) (Figure 3). The C-terminus of Hbs1, homologous to that of eRF3, is sufficient to interact with Dom34, which suggests that the two protein pairs (Hbs1-Dom34 and eRF3-eRF1) form structurally similar complexes (Carr-Schmid et al., 2002). Indeed, Hbs1 forms a complex with Dom34 and GTP (Dom34-Hbs1-GTP), similar to that of eRF1-eRF3-GTP (Hauryliuk et al., 2006; Graille et al., 2008; Chen et al., 2010; Shoemaker et al., 2010; van den Elzen et al., 2010). The central event of NGD is mRNA cleavage. Dom34 has been reported to have RNase activity (Lee et al., 2007; Graille et al., 2008), although this proposed endonuclease activity is not required for mRNA cleavage in NGD (Passos et al., 2009). Dom34 of S. cerevisiae consists of three domains, two of which are homologous to the corresponding domains in eRF1, while the N-terminal domain of Dom34 differs from that of eRF1 and is probably necessary for the recognition of the mRNA stem (Graille et al., 2008). The lack of the Hbs1 protein in archaea is apparently compensated by its homolog aEF1A (Kobayashi et al., 2010), which also performs the functions of eRF3 in archaeal termination of translation (Saito et al., 2010).

Another pathway of mRNA degradation, non-stop decay (NSD), involves the Ski7 protein, a paralog of Hbs1 and eRF3 (Benard et al., 1999). This mechanism is necessary to destroy mRNAs that lack termination codons (Frischmeyer et al., 2002; van Hoof et al., 2002). Ski7 arose from duplication of Hbs1 by WGD (Kellis et al., 2004) or by an independent duplication of Hbs1 before WGD followed by loss in several species (Atkinson et al., 2008) (Figure 3). An interesting hypothesis links the appearance of Ski7 with the existence of the prion [PSI+] (Atkinson et al., 2008). [PSI+] is the aggregated (prion) form of the yeast protein Sup35 (eRF3) (Kushnirov & Ter-Avanesyan, 1998). Formation of [PSI+] decreases the amount of functional Sup35, leading to efficient read-through of nonsense mutations in ORFs (and possibly of normal termination codons) (Serio & Lindquist, 1999). The emergence of Ski7 in such organisms would thus create an additional system of mRNA quality control. However, [PSI+] formation has not been detected in natural, industrial or clinical isolates of Saccharomyces. In addition, the prion properties of Sup35 are conserved in various species of Saccharomyces as well as in Candida albicans and Pichia methanolica (Inge-Vechtomov et al., 2003), species in which Ski7 has not been found (Atkinson et al., 2008).
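The frameshift argument made earlier in this section (a shifted reading frame soon runs into a premature termination codon) can be illustrated with a minimal Python sketch. It is not taken from the cited studies. It assumes that the frequency of 3 × 10⁻⁵ applies per codon and that codons behave independently, and it uses a made-up toy ORF (TOY_ORF). The function names are likewise hypothetical. Only the three stop codons of the standard genetic code are taken as given.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical toy ORF, chosen only for illustration:
# ATG GCT GAT AAG CTT TGG TAA
TOY_ORF = "ATGGCTGATAAGCTTTGGTAA"


def first_stop_codon(seq, frame=0):
    """Return the codon index of the first stop codon read in `frame`.

    `frame` is the offset (0, 1 or 2) at which reading starts.
    Returns None if no complete stop codon is found in that frame.
    """
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return (pos - frame) // 3
    return None


def p_at_least_one_frameshift(n_codons, per_codon_rate=3e-5):
    """Probability of one or more frameshifts while translating n_codons.

    Assumes (for illustration) an independent, identical error rate
    at every codon.
    """
    return 1.0 - (1.0 - per_codon_rate) ** n_codons


if __name__ == "__main__":
    for frame in range(3):
        print("frame", frame, "first stop at codon", first_stop_codon(TOY_ORF, frame))
    print("P(frameshift, 400 codons) =", p_at_least_one_frameshift(400))
```

In this toy sequence the normal frame reaches its stop codon only at the end of the ORF, whereas one of the shifted frames meets a stop codon (TGA) at its second codon. Under the stated assumptions, roughly 1% of ribosomes translating a 400-codon ORF would shift frame at least once, which gives a sense of why surveillance of PTC-containing transcripts is useful.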

Conclusion
Successive duplications of genes encoding translation elongation factors led to the emergence of several protein complexes with different properties. The eRF1-eRF3 complex terminates translation, and the Dom34-Hbs1 complex is involved in mRNA quality control. Both eRF1 and eRF3 interact not only with each other but also with additional proteins. Some of these interactions are possibly mutually exclusive, and some of the proteins interacting with eRF1/eRF3 may be components of the complex that terminates translation. Possible candidates for involvement in termination are poly(A)-binding protein (PABP) and the Upf proteins (Upf1, Upf2 and Upf3). Interaction of eRF3 with PABP links termination of translation with initiation (Hoshino et al., 1999), while interaction with the Upf proteins involves the eRF proteins in nonsense-mediated decay (Amrani et al., 2006). The genetic data, derived mostly from S. cerevisiae, strongly suggest that the functions of eRF1 and eRF3 are not restricted to termination of translation (Inge-Vechtomov et al., 2003). Further studies are needed to characterize the non-translational functions of both proteins, such as those already shown for eEF1A (Mateyak & Kinzy, 2010).

Acknowledgments
This work was supported by the Russian Foundation for Basic Research (10-04-00237) and the Program of the Presidium of the Russian Academy of Sciences, The Origin and the Evolution of the Biosphere.

Fig. 1. Possible consequences of gene duplication (modified from Hahn, 2009). A and C - regulatory sequence changes; B and D - coding sequence changes. Since variant 3 (conservation) does not change the duplicated copies, it is not represented in the diagram. OF (grey) - old function, NF (black) - new function, LF (white) - lost function (attributed to both regulatory and structural sequences).

Fig. 2. Evolutionarily related proteins perform similar functions and interact with the same sites of the ribosome during translation. The most significant participants are shown. The arrows indicate the sequence of events. IF - initiation factor; EF - elongation factor; RF - release, or termination, factor; e - eukaryotic.

Fig. 3. The origin of the proteins involved in elongation, termination and mRNA quality control. The genes duplicated only in certain taxa are marked with asterisks: * - duplication unique to mammals (Hoshino et al., 1998; Jakobsen et al., 2001); ** - duplication described only from Saccharomyces (Atkinson et al., 2008); *** - duplication specific to several species of ciliates (Liang et al., 2001; Atkinson et al., 2008) and A. thaliana (Chapman & Brown, 2004). Branch lengths are not to scale. The progenitors of prokaryotic EF-G and EF-Tu were proposed to have first diverged from a common ancestral GTPase, and then each gave rise to two protein families corresponding to the elongation and termination factors (Nakamura & Ito, 1998; Inagaki & Doolittle, 2000; Atkinson et al., 2008). EF - elongation factor, RF - release factor, e - eukaryotic, a - archaeal.

Fig. 4. Neofunctionalization of termination factors in mRNA quality control systems. Three systems described for S. cerevisiae are shown. NSD (non-stop decay) is responsible for the degradation of transcripts lacking stop codons. NGD (no-go decay) acts on mRNAs on which ribosomes are stalled, for example by secondary structures that prevent translation. NMD (nonsense-mediated decay) destroys transcripts containing nonsense mutations. See text for details.