Growing Uses of 2A in Plant Biotechnology


Introduction
The combination, 'pyramiding' or 'stacking' of multiple genes in plants is a fundamental aspect of modern plant research and biotechnology. The most widely adopted stacked traits (herbicide tolerance and insect protection) provide growers with benefits of increased crop yield, simplified management of weed control and reduced insecticide use. The global acreage of stacked traits or more precisely genetically modified organisms bearing stacked traits is expected to increase rapidly in the near future, with the introduction of nutritional and/or industrial traits to satisfy the needs of consumers and producers [1]. Several approaches have been used to stack multiple genes into plant genomes and then to coordinate expression [2][3][4]. Stacking approaches include sexual crossing between plants carrying distinct transgenes [5,6], sequential re-transformation [7], and single-plasmid [8] or multiple-plasmid co-transformation [9]. These strategies, however, suffer from the inherent weakness that co-expression of the heterologous proteins is unreliable.
Due to limited genomic coding space, many viruses encode more than one protein from a single mRNA transcript. Internal ribosome entry site (IRES) sequences serve as a launching pad for internal initiation of translation, allowing expression of two or more genes from a single transcript [reviewed in 10]. A number of IRES motifs from plant [11] and animal [12] viruses have been used to direct the expression of multiple recombinant proteins in plants and plant cells [13,14]. However, widespread use of IRES motifs in plant biotechnology is limited: they are not small (~600 base pairs), adding to the size of the transgene; translation efficiency of a gene placed after the IRES is much lower than that of a gene located before the IRES [14]. One promising gene/protein strategy adopted by some viruses to ensure a balance of proteins in vivo is to express polyprotein precursors with cleavable linkers between the proteins of interest [15]. Several groups demonstrated the potential of this approach by co-expressing two proteins separated by the tobacco etch virus (TEV) NIa protease recognition sequence (heptapeptide cleavage recognition sequence ENLYFQS) together with the NIa proteinase [16][17][18]. The utility of the NIa protease is limited due to the presence of a nuclear-localizing signal (NLS) within the protease and the amount of energy necessary to express the 49 kDa protease. It is also possible to use linker sequences that are putative substrates of known endogenous plant proteases [19].
To bypass the need for an endogenous or recombinant accessory protease acting on the translated polypeptide product, a different approach involves the use of self-processing viral 2A peptide bridges [reviewed in 20,21]. The designation "2A" derives from the systematic nomenclature of protein domains within the polyproteins of picornaviruses. In foot-and-mouth disease virus (FMDV) and some other picornaviruses, the oligopeptide 2A region of the polyprotein manipulates the ribosome to "skip" the synthesis of the glycyl-prolyl peptide bond at its own carboxyl terminus, leading to the release of the nascent protein and translation of the downstream sequence [22]. Under the monikers of "Skipping", "Stop-Carry On" and "StopGo" translation, it allows the stoichiometric production of multiple, discrete, protein products from a single transgene [23,24]. Several recent review articles have amply covered the role of 2A biotechnology in animal systems [20,25]. This summary-review will provide an up-to-date overview of 2A and cover the wider application of 2A-polyproteins to the expression of multiple proteins in plants.

The co-translational model of 2A-mediated "cleavage"
FMDV, like other members of the family Picornaviridae, is a non-enveloped RNA virus which contains a single-stranded, positive-sense RNA molecule of approximately 8500 nt that functions as an mRNA [26]. This (+) RNA encodes a high molecular mass polyprotein that undergoes co-translational processing to yield the structural proteins (1A, 1B, 1C and 1D, commonly known as VP4, VP2, VP3 and VP1 respectively) which comprise the viral capsid, and the non-structural proteins (2A, 2B, 2C, 3A, 3B, 3Cpro and 3Dpol) that control the viral life cycle within host cells [27]. The 2A oligopeptide is only 18 amino acids (aa) long (-LLNFDLLKLAGDVESNPG-), defined by the co-translational "cleavage" at its C-terminus and a post-translational cleavage at its N-terminus, mediated by the virus-encoded proteinase 3Cpro [28]. Analysis of recombinant FMDV polyproteins [29] and artificial polyprotein systems in which 2A was inserted between two reporter proteins [22] showed that just 2A, plus the N-terminal proline of the downstream protein 2B, was sufficient for highly efficient co-translational "cleavage" (Figure 1, Panel A). Quantification of products using in vitro cell-free translation systems showed the product upstream of 2A accumulated in a molar excess over that downstream, at variance with a proteolytic model of 2A, which predicts a 1:1 stoichiometry of the cleavage products [23,30,31]. We and others have shown that 2A is not a proteinase, nor a substrate for a host-cell proteinase, but an autonomous element mediating a co-translational "recoding" event [27,29]. From these observations we proposed a model of the 2A reaction based on hydrolysis of the nascent chain from ribosome-associated tRNA at the peptidyl-transferase centre [23,24,30]. For in-depth reviews of the model see [24,32,33].

Figure 1. Panel A: Two individual polypeptides can be generated from one transcript using F2A to link the individual genes. Panel B: F2A cleavage efficiency. All constructs shared a common core consisting of CFP-F2A-RABD2a. Panel C: Sequences encoding HA-tagged CAH1 were fused in-frame to wild type and mutant versions of genes encoding the GTPases RABD2a, SAR1, and ARF1, linked by F2A; the pre-protein has the endomembrane targeting sequence. These synthetic polyproteins were efficiently cleaved when transiently expressed in protoplasts and in planta. CFP, enhanced cyan fluorescent protein; GUS, β-glucuronidase; SP, ER signal peptide of CAH1 protein; GS, Golgi targeting signal of N-acetylglucosaminyl transferase 1; RABD2a, Arabidopsis RABD2a GTPase; SAR1, Nicotiana tabacum SAR1p; ARF1, Arabidopsis ADP-ribosylation factor 1; HA, hemagglutinin epitope tag; CAH1, Arabidopsis α-CAH1 (adapted from [60]).

2A comprises two parts: an N-terminal region (without sequence conservation) predicted to form an alpha helix, and a C-terminal motif, -DxExNPG-, followed by a proline required for the reaction. Recently it was shown that the synonymous codon usage of this conserved motif is biased [34]. The amino acids E, S, N, P, G, P tend to use GAG, TCC, AAC, CCT, GGG and CCC respectively. The results also indicate that the synonymous codon usage of the 2A peptide has no effect on 2A activity. In summary, our results indicate that the conserved -DxExNPG- motif is conformationally restricted within the peptidyl transferase centre (PTC) of the ribosome and forms a tight turn, shifting the ester bond between the C-terminal glycine and tRNAGly (in the P site of the ribosome) into a conformation which rules out nucleophilic attack by prolyl-tRNAPro (in the A site), so that no peptide bond is formed. Although no stop codon is involved, eukaryotic translation release (termination) factors 1 and 3 (eRF1/eRF3) release the nascent protein from the ribosome [35-37]. Due to its mode of action, the 2A peptide has been described as a "cis-acting hydrolase element" (CHYSEL) [32].
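The sequence-level consequences of the motif described above can be illustrated with a short Python sketch. It is only a string model under stated assumptions: the function names and the placeholder flanking sequences are hypothetical, the split is treated as all-or-nothing (the real reaction can also end in termination, and efficiency depends on context), and the codon list simply restates the preferred codons reported in [34].

```python
import re

# F2A as quoted in the text (18 aa); the proline that follows belongs to 2B.
F2A = "LLNFDLLKLAGDVESNPG"

# Conserved C-terminal motif -DxExNPG-, which must be followed by a proline.
MOTIF = re.compile(r"D.E.NPG(?=P)")

# Preferred codons reported for E, S, N, P, G, P of the conserved motif [34].
PREFERRED_CODONS = ["GAG", "TCC", "AAC", "CCT", "GGG", "CCC"]


def skip_products(polyprotein: str) -> list[str]:
    """Split a polyprotein at each G|P junction of a -DxExNPG-P motif.

    Hypothetical helper: assumes every motif 'skips' with full efficiency.
    """
    products, start = [], 0
    for match in MOTIF.finditer(polyprotein):
        products.append(polyprotein[start:match.end()])  # product ends ...NPG
        start = match.end()                              # next product starts with P
    products.append(polyprotein[start:])
    return products


def preferred_motif_dna() -> str:
    """DNA for the residues E-S-N-P-G-P using the preferred codons above."""
    return "".join(PREFERRED_CODONS)


if __name__ == "__main__":
    # "MAAA" and "PMKK" are placeholder sequences, not real proteins.
    print(skip_products("MAAA" + F2A + "PMKK"))
    print(preferred_motif_dna())
```

In this model the upstream product keeps the whole 2A peptide as a C-terminal extension and the downstream product begins with proline, which matches the description of the reaction given in the text.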
Our model of this translational recoding event predicts two outcomes, either ribosomes terminate translation, or, translation of the downstream sequences resumes. Skipping induced by 2A sequences gives approximately equal expression of the proteins upstream and downstream of the 2A site as measured by: i) CAT and GUS enzyme activity [38]; ii) cell free translation in vitro and Western blot [22,23,31,39,40]; iii) GFP/FACS with antibiotic resistance [41]; iv) co-fluorescence reporting [42,43]; v) fluorescence resonance energy transfer (FRET) analysis [44] and vi) protein segregation in transgenic animals [45,46]. Since these sequences act co-translationally, artificial polyprotein systems may include signal sequences to localize different protein translation products to discrete sub-cellular sites.
Chimeric polyproteins incorporating 2A have been widely tested in eukaryotic systems, including mammalian [22], plant [38], insect [51], yeast [39] and fungal cells [52]. The 2A system does not work in prokaryotic cells: the reported proteolysis activity of 1D-2A in Escherichia coli cells [53] was not detected in equivalent constructions in our laboratory, showing "cleavage" specificity for eukaryotic systems alone [54]. The unique activity of 2A peptides has led to their use as tools for co-expression of two (or more) proteins in biomedicine and biotechnology [reviewed in 20,21,55]. The most widely used 2A sequence is derived from FMDV (hereafter referred to as "F2A") [42]. Other 2A peptides used successfully include "T2A" from Thosea asigna virus (TaV), "E2A" from equine rhinitis virus (ERAV) and "P2A" from porcine teschovirus-1 (PTV-1) [Table 2]. Comparing the in vitro activity of different 2As inserted between GFP and GUS, we have shown that T2A20 has the highest cleavage efficiency, followed by E2A20, P2A20 and F2A20 [31]. In 2A peptide-linked TCR:CD3 constructs, Szymczak and colleagues demonstrated that F2A22 and T2A18 have higher efficiency than E2A20 [44]. In human cell lines, zebrafish and mice, cleavage and targeting of NLS-EGFP and mCherry-CAAX to the nucleus and plasma membrane, respectively, was most efficient in P2A19-linked constructs, followed by T2A18, E2A20 and F2A22 [56]. To allay public fears and opposition to plants carrying a transgenic viral sequence, efficient 2A-like cellular sequences could be used.

Table note: The -DxExNPGP- motif conserved among 2A/2A-like sequences is shown in red.

Intracellular protein targeting of 2A constructs
For effective technologies, some synthesized proteins must be transported across membranes and directed towards other sites in order to function. Protein targeting occurs either co-translationally (targeting to endoplasmic reticulum [ER], Golgi, vacuole, plasma membrane) or post-translationally (targeting to nucleus, mitochondria, chloroplast, etc.) and is orchestrated by distinct signal sequences encoded within the polypeptide [42]. In plants, the original FMDV-2A sequence was tested in various artificial polyproteins using the reporter genes chloramphenicol acetyltransferase (CAT), β-glucuronidase (GUS) and green fluorescent protein (GFP) expressed in transgenic tobacco plants. This preliminary series of studies suggested that 2A cleaves proteins properly in plant cells [38,57] and directs protein targeting to different cellular compartments via either co- or post-translational mechanisms [58]. Subsequently, Samalova and co-workers questioned its use in plant systems, suggesting that the 2A sequence was dispensable for efficient cleavage of polyproteins carrying a single internal signal peptide: it appears that signal peptide cleavage by signal peptidase was responsible for processing the polyprotein. The use of a self-cleaving 2A was required when both halves of the fusion were translocated across the ER membrane; however, the upstream product was mis-sorted to the vacuole. Furthermore, it was shown that the FMDV 2A peptide resulted in low rates of polypeptide separation in plant cells when placed downstream of common fluorescent proteins (GFP and RFP derivatives) [43].
The Arabidopsis carbonic anhydrase (CAH1) is one of the few plant proteins known to be targeted to the chloroplast via the secretory pathway; the pre-protein has the endomembrane targeting sequence. The need for post-translational modifications, such as N-glycosylation, for proper folding, and to enhance stability and/or function of these proteins probably explains the use of this alternative trafficking pathway [59]. Recently, the FMDV 2A co-expression system was re-assessed to study the effects of three Ras-like small GTPase proteins, RABD2a, ARF1 and SAR1, on CAH1 protein trafficking in plant cells [60]. Members of this superfamily share several common structural features and act as molecular switches that regulate many aspects of plant vesicular transport [61,62]. Rabs regulate virtually all steps of membrane traffic, from the specification of membrane identity to the accuracy of vesicle targeting [63]. ARF1 has been shown to play a critical role in COPI-mediated retrograde trafficking, while SAR1 is involved in COPII-mediated ER-to-Golgi protein transport [reviewed in 64]. Exchange factors for ARF GTPases (ARF-GEFs) regulate vesicle trafficking in a variety of organisms. In animals and fungi, there are eight ARF-GEF families, but only the apparently ancestral GBF and BIG families are present in plants, suggesting that plant ARF-GEFs have acquired multiple roles in different trafficking pathways [65,66]. In Arabidopsis, the ARF-GEF GNOM-like 1 (GNL1) and its close homologue GNOM jointly regulate the retrograde COPI-mediated traffic from the Golgi to the ER, which is the ancient eukaryotic function of the GBF1 class [67]. Another line of research by Teh and Moore (2007) revealed that secretory traffic is resistant to the trafficking inhibitor brefeldin A (BFA), whereas endosomal recycling involves GNOM: GNL1 is a BFA-resistant GBF protein that functions with the BFA-sensitive ARF-GEF GNOM [68].
The 20aa 2A peptide from FMDV was used in this study to construct polyproteins that expressed trafficked fluorescent protein markers in fixed stoichiometry in different cellular compartments: N-ST-RFP-2A-GFP-HDEL produces a Golgi-localized RFP (red) and an ER-localized GFP (green), whereas N-secRFP-2A-GFP-HDEL produces an ER-localized GFP and an RFP that is targeted to the vacuole via the ER and Golgi.

The use of 2A multigene expression strategies in plant science -Caveats and proposals
The take-home message from F2A mutagenesis experiments is that the sequence is largely intolerant to amino acid substitution over its entire length [31,37]. While mutations of conserved amino acids have, in general, more pronounced effects than changes to non-conserved ones [31], variations at most positions within the peptide reduce activity: 2A peptides are optimized to function as a whole [37]. Sequences immediately upstream of 2A are known to be either critical or very important for activity [57,69-72]. Longer versions of F2A with extra sequences derived from the capsid protein ("1D"), which lies upstream of 2A in the FMDV polyprotein, produce higher levels of cleavage [23,29,47]. Specifically, N-terminal extension of 2A by 5aa of 1D improved "cleavage", but extension by 14aa of 1D or longer (21 and 39aa) produced complete "cleavage" and an equal stoichiometry of the up- and downstream translation products [23]. After "fine-tuning" of the F2A sequence we suggest that researchers opt for F2A30 (+11aa 1D). This 2A proved to be the most favourable in terms of both length and cleavage efficiency and was unaffected by the sequence of the upstream gene [73,74]. In the case of shorter 2As, cleavage efficiency has been improved by insertion of various spacer sequences such as Gly-Ser-Gly or Ser-Gly-Ser-Gly [41,44,45,75-77], the V5 epitope tag (-GKPIPNPLLGLDST-) [78], or a 3xFlag epitope tag [79] ahead of the 2A sequence. If opting for a shorter sequence, users should be aware that activity can be affected by the short amino acid tract, introduced by the cloning strategy, that links the upstream protein to 2A. For example, the F2A20 encoded by pGFP-F2A20-GUS was highly active [29,31], whereas that encoded by pGFP-F2A20-CherryFP was noticeably less active [73]. The only difference was the short "linker" between GFP and F2A created by the cloning strategy: -SGSRGAC- (pGFP-F2A20-GUS; linker derived from XbaI and SphI restriction sites) and -RAKRSLE- (pGFP-F2A20-CherryFP; linker derived from furin and XhoI restriction sites) [73]. Taken together, these observations are consistent with our translational model, in which 2A activity is a product of its interaction with the exit tunnel of the ribosome, which is thought to accommodate a nascent peptide of 30-40 amino acids [80].
When using the 2A system, it should be noted that the 2A oligopeptide remains as a C-terminal extension of the upstream fusion partner and the downstream protein must have an N-terminal proline residue. Although an N-terminal proline confers a long half-life upon a protein [81], it does prevent many N-terminal post-translational modifications that may be essential for activity. If this is the case, proteins that require authentic termini can be introduced as the first polyprotein domain. The need to target proteins to different subcellular locations within plant cells by C-terminal localization signals may be compromised if they contain a 2A-extension.
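The two design constraints just described (2A stays attached to the upstream partner, and the downstream product starts with proline) can be made concrete with a small Python sketch. It is a string model only, with assumptions stated plainly: `expected_products` is a hypothetical helper, the default Gly-Ser-Gly spacer is one of the spacers mentioned earlier for shorter 2As, "skipping" is assumed to be complete, and the example sequences are placeholders rather than real proteins.

```python
# FMDV 2A (18 aa) as quoted earlier in the text.
F2A = "LLNFDLLKLAGDVESNPG"


def expected_products(upstream: str, downstream: str,
                      spacer: str = "GSG") -> tuple[str, str]:
    """Return the two products expected from upstream-spacer-2A-P-downstream.

    Hypothetical helper: assumes the skip between the final Gly of 2A
    and the following Pro always occurs.
    """
    # The residue after 2A must be proline; add it if the caller omitted it.
    if not downstream.startswith("P"):
        downstream = "P" + downstream
    # The upstream product retains the spacer and 2A as a C-terminal extension.
    return upstream + spacer + F2A, downstream


if __name__ == "__main__":
    first, second = expected_products("MUPSTREAM", "MDOWNSTREAM")
    print(first)   # upstream protein carrying the spacer and 2A
    print(second)  # downstream protein with an N-terminal proline
```

A protein that needs an authentic N-terminus would be passed as `upstream` in this sketch, in line with the recommendation to place such proteins first in the polyprotein.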
In the case of proteins translocated into the ER, a strategy was adopted to include a furin proteinase cleavage site between the upstream protein and 2A [82,83]. Furin is a subtilisin-like serine endoprotease that cleaves precursors on the C-terminal side of the consensus sequence -Arg-X-Lys/Arg-Arg↓ (-RX(K/R)R-) in the trans-Golgi network (TGN) [84,85]. The furin cleavage sequences -↑RKRR-, -↑RRRR- and -↑RRKR-, which consist only of basic amino acids that can be efficiently removed by carboxypeptidases (↑), were used to remove 2A peptide-derived amino acids from the upstream antibody heavy chain during protein secretion [83]. Proteins expressed in plants could have their 2A extensions removed by endogenous proteinases acting on similar hybrid linker peptides. In 2004, François and colleagues connected the first nine amino acids (SN↑AADEVAT) of the LP4 peptide to the 20aa F2A to generate a hybrid linker peptide, LP4-2A [86]. LP4 is the fourth linker peptide of the naturally occurring polyprotein precursor originating from seed of Impatiens balsamina [87]. Cleavage of the polyprotein with plant defensin DmAMP1 from Dahlia merckii at its amino-terminus and plant defensin RsAFP2 from Raphanus sativus at its carboxy-terminus resulted in the release and targeting of (DmAMP1-SN) and RsAFP2 to different cellular compartments [86]. Recently, 2A and LP4-2A were used to connect the Bacillus thuringiensis (Bt) cry1Ah gene, which encodes a protein exhibiting strong insecticidal activity, and the mG2-epsps gene, which encodes a protein tolerant to glyphosate, the world's most important and widely used herbicide [72]. The expression level of the two genes linked by LP4-2A was higher than that of genes linked by 2A, regardless of the order of the genes within the vector. Furthermore, tobacco plants transformed with the LP4-2A fusion vectors showed better pest resistance and glyphosate tolerance compared to plants transformed with the 2A fusion constructs.
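The furin consensus -RX(K/R)R- quoted above lends itself to a simple pattern search when checking a candidate linker. The following Python sketch is illustrative only: `furin_sites` is a hypothetical helper, the example sequences are placeholders, and a sequence match says nothing about whether furin (or a plant proteinase) would actually cut in vivo.

```python
import re

# Furin consensus -R-X-(K/R)-R-. The lookahead allows overlapping sites
# to be reported as well.
FURIN_SITE = re.compile(r"(?=(R.[KR]R))")


def furin_sites(protein: str) -> list[tuple[int, str]]:
    """Return (cut_index, site) pairs for every consensus match.

    cut_index is the 0-based position immediately C-terminal of the
    four-residue site, where cleavage is described as occurring.
    """
    return [(m.start() + 4, m.group(1)) for m in FURIN_SITE.finditer(protein)]


if __name__ == "__main__":
    # The three all-basic linkers mentioned in the text each match once.
    for linker in ("RKRR", "RRRR", "RRKR"):
        print(linker, furin_sites(linker))
    # Placeholder sequence with one site embedded in flanking residues.
    print(furin_sites("AAARKRRGG"))
```

Slicing a sequence at `cut_index` gives the fragment that would remain upstream of the cut, which is the part from which the basic residues are then trimmed by carboxypeptidases according to the strategy described above.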

A strategy to improve transgene expression from the Chlamydomonas nuclear genome
Micro-algae have the potential to be low-cost bioreactors for recombinant protein (RP) production due to their relatively rapid growth rates, favourable transformation time, ease of containment and rapid scalability [88-90]. The availability of a complete genome sequence [91], coupled with the ability to manipulate all three genomes (chloroplast, nuclear and mitochondrial), makes Chlamydomonas reinhardtii an attractive species for biotechnologists [90,92]. While transgene expression from the alga's nuclear genome offers several advantages over chloroplast expression, such as post-translational modifications and protein targeting and/or secretion [93], yields of target RPs are often inadequate for industrial purposes. In an attempt to overcome this limitation, Rasala and colleagues constructed a C. reinhardtii nuclear expression vector using F2A to fuse GFP or xylanase 1 (xyn1) from Trichoderma reesei to the bleomycin/zeocin antibiotic resistance gene sh-ble [94]. High-value xylanases have numerous applications in the textile, paper, pulp, food and feed industries [reviewed in 95]. Efficient cleavage of GFP from Ble-2A was observed in the algal cytoplasm, leading to high-level GFP expression and built-in resistance to zeocin. Co-expression of Xyn1 with Ble-2A led to the selection of transformants with higher xylanase activity (~100-fold) compared to unfused Xyn1. Subsequently, the ble-2A expression system was used to secrete enzymatically active Xyn1 by insertion of a secretion signal peptide (SP) between F2A and Xyn1 (ble2A-SP-xyn1). In a follow-on study, the same strategy was used to express and compare six recombinant fluorescent proteins (FPs: blue mTagBFP, cyan mCerulean, green CrGFP, yellow Venus, orange tdTomato and red mCherry) in C. reinhardtii [96]. All FPs were easily detectable in live cells, and the ble-2A-FP polyproteins were efficiently processed to yield unfused FPs. CrGFP was shown to be the least fluorescent due to its low signal-to-noise ratio, while the FPs with longer emission wavelengths (Venus, tdTomato and mCherry) had the highest signal-to-noise ratios. In this study, the ble-2A vector was used to tag an endogenous gene (α-tubulin) with a fluorescent protein tag (mCerulean) that was readily detectable in Chlamydomonas using standard live-cell imaging techniques. Taken together, these results suggest this new expression system will become an important tool for both algal biotechnology and basic research.

Engineering plant metabolomes
Manipulation of plant metabolomes ranges from modifications that build extensions or branches onto existing biochemical pathways to very extensive changes, such as the rice C4 project. Here, the aspiration is to convert rice from a C3 plant (3-carbon molecule present in the first product of carbon fixation) to a more efficient C4 plant, eliminating photorespiration. As a major global crop species, rice has been the subject of intensive research. The 'Golden Rice' project was undertaken to address the problem of vitamin A deficiency (VAD), and is discussed in some detail here since it provides an interesting 'vignette' of the progress in transgenesis and plant metabolome engineering. It should be noted, however, that another 'biofortified' crop also designed to reduce VAD is the "Super-banana", originally developed for the Ugandan population, which recently gained approval to begin human trials in the United States. In this case, however, knowledge gained from a cultivar identified in Papua New Guinea enabled development of super-bananas by genome editing, rather than transgenesis, of a commercial cultivar.
A shortage of dietary vitamin A leads to VAD, resulting in impairment of sight and increases in the severity of a range of infectious diseases, and is estimated to lead to the premature deaths of 650,000-700,000 children under the age of 5 each year: a particular problem in parts of S.E. Asia and Africa. Whilst no rice cultivars produce provitamin A within the endosperm, the precursor geranylgeranyl diphosphate (GGDP) is produced. This precursor could be converted into β-carotene, which functions as provitamin A in humans. Converting GGDP to β-carotene was originally thought to require the activities of phytoene synthase, phytoene desaturase, ζ-carotene desaturase and lycopene β-cyclase, although recent analyses have shown that endogenous rice enzymes can substitute for lycopene β-cyclase in the conversion of lycopene to β-carotene. The agrobacterial vector construct pB19hpc encodes the daffodil phytoene synthase gene (psy), driven by the endosperm-specific glutelin (Gt1) promoter, and the phytoene desaturase (crtI) gene from the bacterium Erwinia uredovora, driven by the cauliflower mosaic virus (CaMV) 35S promoter, arranged in tandem (Figure 2, Panel A) [97]. Whilst the psy gene already encoded an N-terminal transit peptide sequence, the crtI sequence was modified to encode the N-terminal transit peptide sequence from the pea Rubisco small subunit, ensuring both proteins were imported into plastids, the site of GGDP biosynthesis. Selectable marker genes (aphIV and nptII) were also included. To complete the β-carotene biosynthetic pathway, plants were co-transformed with vectors pZPsC and pZLcyH. Vector pZPsC carries psy and crtI (as pB19hpc), but lacks the aphIV marker expression cassette. Vector pZLcyH encodes lycopene β-cyclase (lcy) from Narcissus, driven by the rice glutelin promoter, and the marker aphIV gene controlled by the CaMV 35S promoter, arranged in tandem. Again, lycopene β-cyclase carried a functional transit peptide to direct plastid import.
Whilst this proved to be a successful strategy (it is rare that one's experimental results are given a Papal blessing), the use of multiple promoters in such a strategy of transgenesis often leads to 'interference', known as 'transcriptional interference' or 'promoter suppression', probably caused by competitive binding of transcription factors and/or modification of DNA structure at one site that affects the other site. Such problems may be overcome by the inclusion of 'insulators', which serve to segregate an enhancer and an adjacent promoter into independent domains ('enhancer blocking').
The first-generation Golden Rice encoded the daffodil phytoene synthase gene, but this daffodil enzyme proved to be a rate-limiting step in β-carotene biosynthesis, since substitution with psy from maize produced a 23-fold increase in total carotenoids (with a preferential accumulation of β-carotene) compared to the original Golden Rice (Figure 2, Panel B) [98]. Again, tandem (rice glutelin) promoters were used. To avoid the problems of promoter interference, a third generation of Golden Rice was developed using two systems, both of which produce multiple proteins from a single transcript mRNA. In this strategy, the crtI and psy genes were either linked by a 2A sequence (creating a single ORF; pPAC construct) or separated by an IRES (pPIC construct; Figure 2, Panel C) [99]. In both cases, phytoene synthase from Capsicum was co-expressed with phytoene desaturase from Pantoea from a single transcription unit. The rice globulin promoter drove the expression of [Psy-2A-CrtI] or [Psy-IRES-CrtI]. The endosperm of transgenic PAC rice had a much more intense golden colour than did the PIC rice transformants, demonstrating that 2A was more efficient than an IRES in co-expression of PSY and CRTI and hence the synthesis of β-carotene. Indeed, immunoblot analyses of CRTI (the downstream protein in both cases) showed that 2A was nine-fold more effective than an IRES. It is well known that in the case of bicistronic mRNAs using an IRES, the gene downstream of the IRES is only translated ~1/10th as efficiently as the gene upstream of the IRES. In comparison to IRESes, 2A and 2A-like sequences have the advantages that they produce stoichiometric levels of the translational products and can be used to express multiple (>2) proteins at similar levels.

Figure 2. Panel A: The transcription of the daffodil phytoene synthase gene (psy), Erwinia phytoene desaturase (crtI) and the Narcissus lycopene β-cyclase (lcy) genes is driven either by the endosperm-specific glutelin (Gt1) or CaMV 35S promoters, as indicated. In all cases shown in this figure, sequences encoding the N-terminus of CRTI were preceded by those encoding functional transit peptides, to ensure import into plastids. The aphIV and nptII selectable markers used are shown, together with transcription terminators / poly(A) addition signals (light grey boxes) and Agrobacterium vector left and right borders (LB, RB). Panel B: In the second generation Golden Rice, the Erwinia phytoene desaturase (crtI) gene was co-expressed with the maize phytoene synthase gene (psy) under the control of tandem rice glutelin promoters (Glu1). The Agrobacterial vector also encoded a selectable marker cassette comprising the maize polyubiquitin (Ubi1) promoter with intron, hygromycin resistance (hptII) and nos terminator (shown as grey boxes). Panel C: In the third generation of Golden Rice, Capsicum phytoene synthase (psy) and Pantoea carotene desaturase (crtI) were co-expressed by linkage into a single ORF via a synthetic 2A sequence that was optimized for rice codons (pPAC construct), or the genes were linked by the insertion of an IRES into the intergenic region (pPIC construct). In both cases, transcription was driven by the rice globulin promoter (Glb). The Agrobacterial constructs also comprised a selectable marker (bar) driven by the 35S promoter and were flanked by a 5'-matrix attachment region (Mar) from the chicken lysozyme gene (not shown). Panel D: Sequences encoding the maize proteinase inhibitor (mpi) and the potato carboxypeptidase inhibitor (pci) were fused into a single ORF. The two proteinase inhibitors were linked using either (i) the processing site of the Bacillus thuringiensis Cry1B precursor protein (C) or (ii) the 2A sequence from foot-and-mouth disease virus (F2A). In both cases the wound-inducible mpi promoter (arrow) was used to drive the expression of the mpi-pci fusion genes. Panel E: Genes encoding the 3,3'-β-hydroxylase (crtZ) and 4,4'-β-oxygenase (crtW) from marine bacteria (Paracoccus spp.) were linked via F2A to form a single ORF. Again, both gene sequences were preceded by those encoding functional transit peptides to ensure import into plastids. Selectable markers (e.g. lacZ and hptII) are shown as grey boxes.
In light of the developments in the synthesis of β-carotene outlined above, it is interesting to note that whilst higher plants synthesize carotenoids, they do not possess the ability to form ketocarotenoids, potent antioxidants with numerous reported health benefits. Organisms capable of synthesizing ketocarotenoids are rare, although an early report showed that co-expression of the 3,3'-β-hydroxylase (crtZ) and 4,4'-β-oxygenase (crtW) from marine bacteria (Paracoccus spp.), linked via 2A (Figure 2, Panel E), led to the formation of ketolated carotenoids (astaxanthin, canthaxanthin and 4-ketozeaxanthin) from β-carotene and its hydroxylated intermediates by the construction of an astaxanthin pathway [100].

Improving resistance to abiotic/biotic stresses
Abiotic stresses such as drought, excessive salinity, and high and low temperature are critical factors limiting the productivity of agricultural crops. The development of genetically engineered plants with enhanced tolerance presents an important challenge in plant gene technology. A common response of plants to these environmental stresses is the accumulation of sugars and other compatible solutes. Trehalose is a nonreducing disaccharide that functions as a stress protection metabolite in many organisms [reviewed in 101]. In yeast, the trehalose-6-phosphate synthase (TPS1) and trehalose-6-phosphate phosphatase (TPS2) enzymes catalyse the conversion of glucose-6-phosphate and uridine diphosphate (UDP)-glucose to trehalose in a two-step pathway [102]. Several efforts have been undertaken to engineer the drought- and salt-tolerance of economically important plants using TPS1 and/or TPS2 genes from yeast and bacteria [103-105]. Both the TPS1 and TPS2 genes of Zygosaccharomyces rouxii were co-introduced into potato plants as a ZrTPS2-F2A-ZrTPS1 polyprotein in an attempt to develop stress-tolerant transgenic plants [106]. The resulting plants showed increased tolerance to drought and no visible phenotypic alterations. Glycinebetaine (N,N,N-trimethyl glycine; betaine) is regarded as an extremely effective compatible solute, which is able to restore and maintain the osmotic balance of living cells in response to high salinity, cold and drought [107]. In plants, betaine is produced by the two-step oxidation of choline via the two enzymes choline monooxygenase (CMO) and betaine aldehyde dehydrogenase (BADH). Transferring the betaine synthesis genes CMO and BADH from the halophyte Suaeda salsa to Pichia pastoris produced "CMO-F2A-BADH" recombinant yeasts with higher tolerance to salt, methanol, and high temperature stress [108]. The result indicates that this strategy could be used to improve the stress tolerance of commercially important crops such as potato, rice, tomato, and tobacco, which do not accumulate betaine.
Plants are substrates for a wide range of pests and pathogens, including fungi, bacteria, viruses, nematodes, insects, and parasitic plants [109]. To defend themselves against pathogen attack, plants produce a battery of antimicrobial peptides (AMPs), secondary metabolites and reactive oxygen species [reviewed in 110-112]. AMPs (such as defensins) are attractive candidates for transgenic applications for several reasons: their diverse antimicrobial activity, their low toxicity for non-target cells and the low cost, in terms of energy and biomass, of their expression [113]. To achieve resistance against a broader range of pathogens in plants, co-expression of transgenes encoding AMPs with different biochemical targets is an attractive approach. In the case of the plant defensins DmAMP1 and RsAFP2 (see above), the biological activity of the hybrid protein was higher than that of the individual parental proteins [86]. The potato AMPs snakin-1 (SN1) and defensin-1 (PTH1) were fused to improve plant protection against phytopathogens [114]. SN1 is active against both bacterial and fungal species, whereas PTH1 shows primarily antifungal activity [115,116]. Expressed in E. coli as a single fusion protein (linked by the F2A sequence), SN1 and PTH1 showed better antimicrobial activity against the majority of tested microorganisms than either individual protein. This finding is somewhat surprising, since F2A does not display any cleavage activity in E. coli [54]. Nevertheless, increased antibacterial and antifungal activity was reported in tobacco and potato plants expressing the snakin-defensin hybrid.
The expression of plant proteinase inhibitors is one strategy for increasing resistance against insects. The maize serine proteinase inhibitor (MPI) and the potato carboxypeptidase inhibitor (PCI) were co-expressed in rice using two strategies (Figure 2, Panel D) [117]. The first was to link the two gene sequences into a single ORF via a sequence encoding the proteinase processing site of the B. thuringiensis Cry1B precursor protein. The rationale here is that the translation product (a fusion protein) would be post-translationally cleaved into MPI and PCI by an endogenous rice proteinase. The second strategy was to link the mpi and pci genes via 2A: here, the proteins would be synthesized as discrete translation products ([MPI-2A] + PCI). Both co-expression strategies were successful and both types of transgenic rice showed increased resistance to the striped stem borer (Chilo suppressalis). Whilst both strategies benefit from co-expression using a single transgene, 2A offers an additional advantage in similar approaches. Post-translational processing by an endogenous proteinase restricts proteins downstream of the Cry1B linker to the cytoplasm, or to other subcellular sites reached by post-translational import; they cannot enter the exocytic pathway, which requires the co-translational recognition of signal sequences by the signal recognition particle. Since 2A works within the ribosome, proteins downstream of 2A can be given N-terminal signal sequences that direct these translation product(s) to the exocytic pathway.
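The ([MPI-2A] + PCI) outcome described above can be illustrated with a minimal Python sketch. It assumes the conserved C-terminal 2A motif D-x-E-x-N-P-G-P, with separation occurring between the final glycine and proline, and it assumes complete separation at every 2A site, which is an idealization. The function name `split_at_2a` and all sequences are hypothetical placeholders chosen for illustration; they are not the MPI, PCI or F2A sequences used in [117].

```python
import re
from typing import List

# Assumed conserved C-terminal 2A motif (D-x-E-x-N-P-G | P).
# The lookahead keeps the final proline out of the match, so that it
# becomes the N-terminal residue of the downstream product.
MOTIF_2A = re.compile(r"D.E.NPG(?=P)")


def split_at_2a(polyprotein: str) -> List[str]:
    """Return the discrete products expected from a 2A-linked ORF.

    Idealized model: every 2A site separates with 100% efficiency.
    """
    products = []
    start = 0
    for match in MOTIF_2A.finditer(polyprotein):
        # The upstream product retains the 2A sequence up to the glycine.
        products.append(polyprotein[start:match.end()])
        start = match.end()
    # Whatever follows the last 2A site is the final product.
    products.append(polyprotein[start:])
    return products


# Hypothetical placeholder sequences (illustration only).
UPSTREAM = "MAAAK"            # stands in for the first protein
LINKER_2A = "LLKLAGDVESNPGP"  # stands in for a 2A linker ending in the motif
DOWNSTREAM = "MSSSR"          # stands in for the second protein

orf = UPSTREAM + LINKER_2A + DOWNSTREAM
for product in split_at_2a(orf):
    print(product)
```

In this model the upstream product carries the 2A sequence as a C-terminal extension, whereas the downstream product gains only a single N-terminal proline. A signal sequence placed immediately after that proline would therefore still sit at the N-terminus of the downstream product.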
Glucosinolates (GLSs) present in cruciferous plants (e.g. cabbage, broccoli, and oilseed rape) play a defensive role against generalist insects [118] and pathogens [119]. However, crucifer-specialist insect herbivores like the economically important pest diamondback moth (DBM; Plutella xylostella) frequently use GLSs to stimulate oviposition [120]. An increase in the global area of brassica crops between 1993 and 2009, coupled with the high fecundity of DBM, especially in tropical regions, has resulted in the development of resistance to many broad-spectrum insecticides used in the field [121]. Recently, genetic engineering has been used to produce non-host GLS-containing plants as a first step towards the creation of "dead-end trap crops". The transfer of the six-step benzylglucosinolate (BGLS) pathway into tobacco plants using only two ORFs, consisting of the first three genes (GGP1-F2A-CYP83B1-F2A-CYP79A2) and the last three genes (SOT16-F2A-UGT74B1-F2A-SUR1), gave rise to BGLS-producing plants [122,123]. Importantly, these non-host plants were more attractive for DBM oviposition than wild-type tobacco plants. As larvae are unable to survive, the strategy of engineering oviposition cues into non-host plants offers an alternative trap crop approach to crop protection. The combination of abiotic and biotic stresses presents an added degree of complexity, as responses at a molecular level are largely controlled by different signalling pathways that can act antagonistically [124]. The pyramiding of several defence genes may therefore provide further opportunities for creating broad-spectrum stress tolerance in agronomically important crop plants [125].

Cost-effective production of cellulose degrading enzymes for biomass-to-fuel conversion
Dwindling fossil resources and increasing energy demands are driving the development of alternative feedstocks for producing fuels and chemicals. Cellulosic feedstocks such as crop residues, wood products and dedicated crops (e.g. switchgrass, salix) are among the leading alternatives because they are abundant, low in cost and do not compete with food sources. The bioconversion of lignocellulose biomass into fuels involves three major transformations: the production of saccharolytic enzymes (cellulases and hemicellulases), the hydrolysis of carbohydrate components present in pretreated biomass to sugars, and the fermentation of sugars to produce fuels such as ethanol and butanol [reviewed in 126,127]. Unfortunately, the high cost of enzymes is a major barrier in the biomass-to-fuel industry [128]. Results to date indicate that in planta enzyme expression offers a potential method for low-cost, large-scale enzyme production [129,130]. Current indications are that subcellular targeting [131,132] and simultaneous expression of recombinant cellulolytic enzymes [133] are key factors in optimizing their accumulation in transgenic plants. Transgenic expression of 2A-linked, chloroplast-targeted cellulolytic enzymes (β-glucosidase, BglB; xylanase, Xyl11; exoglucanase, E3; endoglucanase, Cel5A) in tobacco plants induced synergistic effects that led to more efficient hydrolysis of lignocellulose materials for bioethanol production [134]. Chloroplast transit peptides, from the small subunit of the Rubisco complex (Rs) and from Rubisco activase (Ra) [135], were fused to the N-termini of the enzymes. This study found a synergistic effect between BglB and Cel5A in the (RsBglB-F2A-RaCel5A) lines and between E3 and Cel5A in the (RsE3-F2A-RaCel5A) lines. The enzymes had higher activities, which led to enhanced hydrolysis of carboxymethyl cellulose (CMC) into glucose and cellobiose. A similar observation was made by Jung et al. (2010) with chloroplast-derived BglB and Cel5A [136].
Furthermore, supplementing the protein extracts of transgenic (RsBglB-F2A-RaCel5A) plants with CBH11 exoglucanase increased hydrolysis activity. While the cost or yield may not be the same for all chloroplast-derived enzymes, these are important steps in cellulose bioconversion.

Recombinant plant viruses as (Co-)expression systems
One major use for plant virus-based over-expression vectors is the production of immunogenic epitopes in plants. Using plants as hosts offers the benefits of a eukaryotic expression system that grows quickly, can be scaled up readily and shares no pathogens with humans or animals. Viral vectors naturally achieve extremely high over-expression levels and can systemically infect whole plants from small, inexpensive inocula, avoiding the requirement to produce transgenic plants. To achieve this, the modified virus has to retain its infectivity and its ability to move through the plant. A common approach to over-expressing foreign proteins from a plant virus genome is to fuse them to the capsid protein (CP). CPs are often the most highly expressed viral proteins, which ensures efficient over-expression without the need to re-engineer any regulatory sequences in the viral genome. If the epitopes are displayed on the virus particle surface, they are also easy to purify. However, CP fusions are not always tolerated as they can interfere with viral encapsidation and spread. 2A peptides can be used to rescue encapsidation and infectivity of CP-fusion viruses by providing a pool of unfused CP.
The first such use in a modified plant virus was the Potato virus X (PVX) 'overcoat' virus, in which green fluorescent protein (GFP) is fused to the N-terminus of the CP via an FMDV 2A linker [137]. The 2A-mediated 'cleavage' is incomplete, so that both free CP and the GFP-CP fusion are produced, enabling formation of fully infectious virus particles which also incorporate the GFP-CP fusion. GFP is exposed at the virion surface, permitting tracking of infection and imaging of virions [137]. A similar 'overcoat' principle was also used to image infections of the related Plantago asiatica mosaic virus (PlAMV) [138]. This 'overcoat' approach also enables the fusion of other peptides or proteins to the PVX CP. This was first demonstrated by expression of a single chain antibody against the herbicide diuron [139]. Approximately 100-250 μg of the antibody protein/g leaf fresh weight was produced, and 'overcoat' virions were easily purified by diuron-based immune-capture.
Subsequently, a number of antigenic epitopes have been expressed using PVX 'overcoat' vectors, including Rotavirus inner capsid protein [140], Classical swine fever virus glycoprotein [141], tuberculosis antigen ESAT-6 [142], and a consensus epitope from a Hepatitis C virus envelope protein [143]. Between 1 and 125 μg protein or virus/g leaf fresh weight, or up to 0.5-1% of total soluble protein, were obtained in these studies, and 'overcoat' virions could be purified by centrifugation [141,143]. In one case, greater amounts of protein were achieved with an alternative expression strategy using a duplicated subgenomic promoter [140], but such constructs tend to be genetically very unstable compared to 2A-based 'overcoat' vectors [144]. PVX virions carrying the Classical swine fever virus glycoprotein epitope produced an immunoprotective response in rabbits [141], and antibodies from a mouse immunized with PVX particles displaying the hepatitis C virus R9 epitope reacted with sera from infected patients [143]. For smaller epitopes, direct fusion to the PVX CP was sometimes possible, but for all larger foreign peptides, the 2A-mediated partial separation was required to enable encapsidation of systemically infective virus [143]. Filamentous viruses like PVX are particularly suited for surface presentation of antigens as they have a large number of CP units (~1270/PVX virion), and 'overcoat' over-expression vectors using 2A have also been developed for the related viruses Pepino mosaic virus (PepMV) [144] and PlAMV [145]. So far, plant virus-expressed epitopes have been purified mainly as virion-attached surface peptides, but with 2A linkers, it is also possible to produce them mainly as free proteins. For efficient surface display, an optimal ratio of free and fused CP has to be found that maximizes virus-displayed epitopes whilst still enabling efficient encapsidation. The range of 2A-like sequences with different 'cleavage' activities [31,37] will be useful in the development of further 'overcoat' vectors.
Expression vectors using 2A peptides have also been developed based on Cowpea mosaic virus (CPMV) [146], Bean pod mottle virus (BPMV) [147], and Wheat streak mosaic virus (WSMV) [148]. In the case of the CPMV and WSMV vectors, 2A linkers were used to overcome the problem that these viruses encode large polyproteins that are processed into functional subunits by viral proteases. Foreign proteins inserted into the polyprotein open reading frame need to be released, and 2A sequences provide an alternative to viral protease cleavage sites. In CPMV, using 2A instead of viral cleavage motifs reduced the number of additional amino acids attached to the over-expressed protein and, as in PVX 'overcoat' vectors, some of the foreign protein is displayed on virus particles, which are very stable and easy to purify. In WSMV, use of FMDV 2A (and also FMDV 1D/2A) sequences resulted in more efficient release of the foreign protein (GFP) than with viral proteinase sites [148]. WSMV vectors enable protein expression in cereal hosts. In the soybean-infecting BPMV vectors, 2A linkers were used both to enable insertion of foreign genes into the viral polyprotein open reading frame and to facilitate simultaneous co-expression of two different foreign proteins.

Food for thought
The first demonstration that 2A was active in plant cells used an artificial polyprotein which comprised two reporter proteins flanking 2A [38]. This co-expression system was soon adopted by plant virologists for use in both rod-shaped and icosahedral virus particles, either as high-level expression systems or to produce particles 'decorated' with fluorescent proteins, immunogens, single-chain antibodies etc. [137][138][139][140][141][142][143][144][145][146][147]. Here, plants are used simply as 'bioreactors' for the production of recombinant proteins or virus particles; the plants are not transgenic.
In the case of transgenic plants, the first reports of the use of 2A to co-express multiple proteins were 'proof-of-principle' studies or research tools [38,57], but within a few years plants were being genetically engineered to demonstrate how nutritional properties could be improved [105,149]. Whilst the use of 2A rapidly expanded in the arenas of animal biotechnology and biomedicine (e.g. monoclonal antibody production, cancer gene therapies, production of pluripotent stem cells: reviewed in [25]), progress in transgenic plants was slower, for a number of reasons, including the 'trickle-down' effects on plant biotechnology of EU policies concerning genetically-modified plants. Over the past few years, however, the 2A co-expression system has been used in the development of methods to engineer plant genomes [149], the expression of high-value proteins, the improvement of plant tolerance to biotic and abiotic stresses, the improvement of nutritional properties through metabolome engineering (vide supra) and the expression of plant storage proteins with amino acid content more suited to human nutrition [150]. The drive to improve agricultural productivity through the development of 'dual-use' crops necessitates complex strategies of plant engineering, and it seems clear that the use of 2A will continue to expand.