Chloroplasts carry out photosynthesis, a major metabolic process. These organelles have their own genome, and over the last three decades the chloroplast genome has been broadly studied and manipulated with genetic engineering tools. Transferring genes into the chloroplast offers advantages over inserting transgenes into the nuclear genome, including overexpression of foreign proteins, no positional effects, absence of epigenetic effects, uniparental inheritance of the transgenes, the ability to express multiple transgenes in operons and the possibility of eliminating the marker gene after transformation and integration of the foreign gene. More than 100 transgenes have now been reported as stably integrated into the chloroplast genome, including genes encoding enzymes with industrial value, biomaterials, biopharmaceuticals and vaccines, and genes for agronomic traits. Chloroplast genetic tools have been implemented in several important crops. Chloroplast engineering has thus been positioned as the most important technology for producing proteins and metabolites with biotechnological applications.
- chloroplast engineering
- foreign protein
- biotechnological applications
1.1. Chloroplasts: from cyanobacteria to bioreactors
Chloroplasts are organelles that made it possible to obtain electrons from water, as part of the basic photosynthetic machinery that evolved from cyanobacteria to flowering plants. In most plant groups they are indispensable because of their role in metabolic processes such as photosynthesis and the biosynthesis of fatty acids, amino acids, pigments and vitamins; any interruption of their normal metabolism can be lethal to the plant [1, 2].
All functions that occur in the chloroplast are regulated by a genome of around 120 genes carried on a double-stranded DNA molecule of ∼150 kb; these genes are involved in photosynthesis, replication, transcription and translation. The organization of the chloroplast genome allows site-directed gene integration without altering the integrity of endogenous genes, which has allowed the expression of genes encoding enzymes with industrial value, antibodies, antibiotics, vaccines and antigens, as well as genes of environmental importance. Because chloroplast genes are organized in operons, they can be expressed together, with the advantage of minor suppression effects; chloroplast expression yields up to 70% of total soluble protein (TSP), which has advanced plant biotechnology in agriculture, food, medicine and the environment [3, 4].
Different industries currently require metabolites to improve production processes. Several microorganism-based technologies have been used as sources of such metabolites, but they cannot satisfy demand at the rate required. Overexpression of several proteins in the chloroplast, on the other hand, has proved useful for producing specific proteins, although this technology requires knowledge of the metabolic route itself, because multiple isozymes may be present in the route; this is currently considered a disadvantage. Chloroplast engineering has been used efficiently to express omega-3 genes, omega-6 desaturases, tocopherols and tocotrienols, flavodoxin, carotenoids and vitamins in crops such as tomato, cauliflower, cabbage, rape, poplar, beet, potato and eggplant [5, 6].
2. Applications for agronomic traits
The growth of agriculture, industry and globalization has created opportunities for the production and trade of transgenic crops. Chloroplast engineering additionally makes it possible to generate transgenic crops in which transgene flow is contained, minimizing the outcrossing of transgenes to related weeds or crops; this characteristic has allowed the development of crops that express genes for useful agronomic traits via the chloroplast genome. Agricultural requirements have focused on crop protection through insect and herbicide resistance, drought and salt tolerance, and phytoremediation. For example, expression of the ‘cry2Aa2’ operon, which encodes a bio-insecticidal protein from Bacillus thuringiensis (Bt) that confers resistance to pests, led to accumulation of the protein at up to 45% of total soluble protein. This technology has also been used to protect crops against bacteria and fungi by expressing antimicrobial peptides [8, 9]. Chloroplast engineering has likewise had an impact on agricultural and environmental issues, conferring tolerance to salinity through expression of the betaine aldehyde dehydrogenase gene, or maximizing heavy metal removal by increasing the efficiency of root absorption, translocation to shoots and volatilization [10, 11].
Chloroplast engineering still faces obstacles, because transgene expression depends on factors such as the species, the regulatory regions used (promoters, 5′- and 3′-UTRs) and the availability of efficient tissue culture and regeneration protocols [12, 13]. Recent studies also indicate that overexpressed proteins can interact with intermediates of metabolic pathways and cause mutant phenotypes. This, together with the difficulty of reaching a homoplasmic state through somatic embryogenesis in monocotyledons, is one of the biggest obstacles to plastid engineering in agronomic crops. Although no transgene integration method is viable for all crop species, transformation has been achieved in tomato, cauliflower, cabbage, rape, poplar, beet, alfalfa, potato, carrot, cotton, oilseed rape, petunia, rice, soybean, sugar beet and eggplant [5, 6, 16, 17]. The efficient use of this technology in crops suggests that they could in future serve the biopharmaceutical sector.
3. Enzymes for cellulose degradation
Plant lignocellulosic biomass is composed mainly of polymeric sugars such as cellulose, hemicellulose and pectin, together with polyphenolics (lignin). This complexity hinders its use, because the residues are joined into crystalline microfibrils that are highly resistant to enzymatic hydrolysis [14, 19].
Nevertheless, cell wall constituents can be degraded by consortia of enzymes, such as pectin lyase, which degrades pectin through hydrolysis of the α-(1 → 4) bonds of polygalacturonic acid [20, 21]. Degrading the lignin residues embedded in plant biomass requires an enzyme cocktail that usually includes laccase ‘Lac’, lignin peroxidase ‘LiP’ and manganese peroxidase ‘MnP’; interestingly, all ligninases show high homology in their primary sequence [22, 23]. Ligninases are involved in the removal of an electron from the phenol moiety of lignin, which is stabilized by organic acid chelators and facilitates the degradation of phenolic compounds in the presence of H2O2. Because lignin is the most important source of aromatic polymers in nature, its decomposition is also necessary for carbon recycling [24, 25].
Pectin and lignin are not the only constituents of cell walls. Cellulose represents 40% of plant biomass, which makes it the most abundant carbohydrate synthesized by plants. It consists of 100–14,000 glucose residues joined by β-1,4 linkages to form crystalline cellulose, and releasing glucose units requires the synergistic action of several highly specialized enzymes. Endo-β-glucanases (1,4-β-D-glucan-4-glucohydrolase, ‘EC 3.2.1.4’) hydrolyse random internal glycosidic linkages, which decreases polymer length and gradually increases the concentration of reducing sugars [26, 27]. Exoglucanases (1,4-β-D-glucanglucohydrolase, ‘EC 3.2.1.74’, and 1,4-β-D-glucancellobiohydrolase, ‘EC 3.2.1.91’) then hydrolyse cellulose chains by removing cellobiose from the reducing or non-reducing ends, and finally β-glucosidases (β-D-glucoside glycohydrolase, ‘EC 3.2.1.21’) hydrolyse cellobiose to release D-glucose [28, 29].
However, the mechanism of cellulose degradation differs among organisms. For example, amorphous or soluble cellulose is degraded by endocellulases alone, whereas crystalline cellulose first requires an exocellulase and then a cellobiase to release two glucose moieties; sometimes a glucohydrolase acts as a component of the exoglucanase, releasing glucose rather than cellobiose from the non-reducing end of poly- and oligosaccharides. Cellobiose can also be oxidatively degraded to cellobionic acid by a cellobiose-quinone oxidoreductase [31, 32, 33]. In symbionts of termites and fungi, endo-1,4-glucanases cleave internal 1,4-glycosidic bonds and generate oligosaccharides, and cellobiohydrolases then split off cellobiose from the non-reducing end of the oligosaccharide chain; in aerobic bacteria, cellulose hydrolysis has been attributed to two types of enzymes that act like fungal endocellulases and cellobiases [32, 33, 34]. Exocellulases have also been found in a few bacteria, where crystalline cellulose is cleaved through intramolecular synergism of bacterial endocellulases [35, 36]. Regardless of the mechanism by which cell wall components are degraded, no enzymes with a high degradation capacity are currently available, and the need for them is increasing [26, 37].
To meet the need of industry for efficient enzymes, microorganisms are the first potential source of enzymes for genetic engineering [26, 38]; they have so far yielded endoglucanases, xylanases, cellobiohydrolases, exoglucanases, glycosyl hydrolases, glucuronoyl esterases, ferulic acid esterase and acetylesterases. However, the enzymatic activity obtained by expression in microorganisms is limited by the capacity inherent to the host system, and protein overexpression is only possible by changing the expression system. To supply the large quantities of protein required by industry, chloroplast genetic engineering has been used to express several hydrolytic enzyme genes, such as cellulases (bgl1C, cel6B, cel9A, xeg74, celA, celB, bgl1, Cel6, Cel7, EndoV, CelKI, Cel3, TF6A, Pga2, Vlp2 peroxidase genes), pectinases (PelA genes), ligninases (MnP-2 genes) and xylanases (xyn, xynA genes) [14, 39, 40, 41, 42, 43].
Expression of hydrolytic enzymes from the chloroplast genome has given positive results in the case of β-glucosidase (bgl1 gene): the transplastomic plants had longer internodes (150% of height) and increased leaf area (resulting in 190% more biomass), and mature plants flowered earlier. Expression of β-glucosidase also doubled the levels of the gibberellic acid precursors (GA53, GA44, GA19 and GA20), the hormone (GA1) and the catabolite (GA8), and increased the hormones indoleacetic acid and zeatin by more than 200%. Surprisingly, the density of glandular trichomes rose up to 10-fold, with higher production of natural sugar esters, which have been shown to be effective biopesticides against a range of insect species; the β-glucosidase-expressing plants carried 18-fold smaller aphid and whitefly populations because of the increased toxicity of their exudates. In contrast, transplastomic plants expressing laccases have been reported to show slightly retarded vegetative growth and light green leaves, which may be attributable to copper deficiency induced by ligand chelation by the laccase produced. In plants overexpressing xylanases, a 60% increase in xylanase (xynA gene) was observed and the fungal enzymes accumulated to more than 10-fold higher levels, but the transplastomic plants displayed pale-green-to-white leaves and severely retarded growth. In other work using bgl1C, cel6B, cel9A and xeg74 as a gene cocktail, all transplastomic lines displayed strong mutant phenotypes, with severe pigment deficiency and slow growth in soil, or did not survive because of insufficient photosynthetic performance. With manganese peroxidase expression, the transplastomic plants showed mild phenotypic effects in the greenhouse, with some leaves turning pale as they matured.
On the other hand, there are reports in which transgenic plants are morphologically and agronomically indistinguishable from untransformed plants. These results open new avenues for the large-scale production of other industrially useful cellulolytic enzymes through chloroplast expression [39, 48, 49].
Despite the differing results of these studies, enzymes with a high capacity to degrade plant biomass have gained prominence in industry. Because of the extensive processes in which they are involved, such as cotton processing, paper recycling, detergent formulation, juice extraction and animal feed additives, they rank as the third most commonly used enzymes [26, 37]. Enzyme treatments of plant biomass are now considered more cost-effective than mechanical processes, giving 20–40% energy savings, although the costs of producing microbial enzymes are expected to be high. Chloroplast technology can therefore provide an inexpensive source of active cellulases, which is critical for the efficient and cost-effective conversion of lignocellulosic biomass in different industries [26, 38].
4. Factors involved in plastid protein expression
Successful gene expression depends on several factors that regulate the gene, from its integration into the chloroplast genome to the modification of its sequence to improve translation.
4.1. Homologous recombination sequence
Recombination is a fundamental process conserved in all species. It is essential not only for maintaining genome stability and facilitating genetic diversity, but also for the repair, replication, maintenance and segregation of DNA [52, 53]; its principal role is to repair DNA damage caused by photooxidation and other environmental stresses. It has been proposed that the recA system regulates recombination and that the mechanism is limited by enzyme availability rather than by substrates. Although recombination in the chloroplast is well documented, its molecular mechanism is not well defined. Recombination is nevertheless an advantage in chloroplast engineering, because genes of interest and selectable marker genes flanked by homologous native chloroplast DNA sequences can be inserted into the chloroplast genome via homologous recombination.
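The targeting logic of flanked integration can be illustrated with a toy string model. This is only a sketch under stated assumptions: the function name `integrate_cassette` and all sequences are hypothetical, the flanks are assumed to be adjacent in the native plastome, and real recombination is an enzymatic process that a string substitution does not capture.

```python
def integrate_cassette(plastome: str, left_flank: str, right_flank: str, cassette: str) -> str:
    """Insert `cassette` at the junction of two adjacent homologous flanks.

    Toy model: the vector carries left_flank + cassette + right_flank, and the
    plastome carries left_flank + right_flank, so the cassette ends up between them.
    """
    target = left_flank + right_flank
    site = plastome.find(target)
    if site == -1:
        raise ValueError("flanks not found as adjacent regions in the plastome")
    insert_at = site + len(left_flank)
    return plastome[:insert_at] + cassette + plastome[insert_at:]


# Hypothetical toy sequences, not real plastid DNA.
plastome = "GGGTTTACGTACGTCCCAAATTTGGG"
left_flank = "ACGTACGT"
right_flank = "CCCAAA"
cassette = "ATGAAATAA"  # stands in for marker gene plus gene of interest

transplastome = integrate_cassette(plastome, left_flank, right_flank, cassette)
print(transplastome)
```

Because the insertion site is defined by the flanks, the endogenous sequence on either side is left unchanged, which is the property the text attributes to site-directed integration.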
4.2. Marker genes
Transplastomic lines, and integration of the gene of interest into the chloroplast genome, must be confirmed with selectable marker genes that confer resistance to biotic or abiotic stress. The first selectable marker was a mutant 16S rRNA gene that conferred spectinomycin resistance; however, stable integration of a marker gene was reported by Goldschmidt-Clermont using the aadA gene, which confers spectinomycin and streptomycin resistance and gave a 100-fold improvement over the result previously obtained by Svab and Maliga. In 1993, the neo gene (kanamycin resistance) was used as an alternative selection marker, and in 2002 the aphA6 gene was reported to give the same resistance with high transformation efficiency; Kumar et al. then obtained results eightfold better than with aphA6 by using a ‘double barrel’ of the aphA6 and nptII genes. Other selection markers have also been used, such as the aacC1 gene, which confers gentamycin resistance, and the bar gene, which confers resistance to phosphinothricin. Several selection mechanisms have been exploited, including the codA gene as a negative marker. More recently, the cat gene has been used for chloramphenicol resistance; it gives lower resistance than the aadA gene, but without spontaneous resistant mutants. Furthermore, Nuccio et al. used the badh gene, which confers salinity resistance, and Kumar et al. tested this marker in chloroplast transformation of carrot, showing it to be an alternative that avoids the disadvantages of antibiotic resistance markers.
Selection markers are chosen according to cell autonomy and portability. Some markers are dominant, like the aadA gene, whereas others are recessive, like the point mutations in the rRNA genes (rrnS and rrnL). Dominant markers are important for transforming the highly polyploid plastomes because they increase transformation frequency: they take effect at early stages of selection even though they may have integrated into only a minority of the plastomes. Recessive markers, on the other hand, give lower transformation efficiency and are effective only under random segregation, if the plastids carry sufficient copies of the transformed genome [63, 64]. Selection markers and genes of interest can be placed under a single promoter, because polycistronic RNAs are translated efficiently in plastids; it is also possible to make a translational fusion between a resistance gene and a reporter gene such as gfp, which helps to avoid phenotypic masking in transformed cells and improves the selection of transformed tissue.
4.3. Promoters
To obtain high levels of protein expression in plastids, the level of stable mRNA must first be increased, and a highly efficient promoter must then be used. For transcription, the plastomes of algae and higher plants encode a cyanobacterial-type RNA polymerase, usually called ‘PEP’ (plastid-encoded RNA polymerase), whose enzymatic core is encoded by the rpoA, rpoB, rpoC1 and rpoC2 genes. PEP recognizes σ70-type promoters with conserved −10 (TATAAT) and −35 (TTGACA) regions and starts transcription 5–7 nucleotides downstream of the −10 element; PEP is also associated with 8 auxiliary proteins that regulate transcription, such as the plastid transcription kinase (PTK), which is regulated by a phosphorylation factor [69, 70].
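As an illustration of how the −35 and −10 consensus elements quoted above can be located in a sequence, the following minimal sketch scans for exact matches to both hexamers. The spacer range of 15 to 19 nucleotides is an assumption borrowed from typical bacterial σ70 promoters and is not stated in this chapter; the function name and the example sequence are hypothetical, and real plastid promoters often deviate from the consensus.

```python
import re

MINUS_35 = "TTGACA"  # consensus -35 element given in the text
MINUS_10 = "TATAAT"  # consensus -10 element given in the text


def find_sigma70_like_promoters(seq, min_spacer=15, max_spacer=19):
    """Return (start, spacer_length) for each exact -35/-10 consensus pair.

    The spacer range is an assumption (typical of bacterial sigma-70 promoters).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are also reported.
    pattern = re.compile(
        f"(?=({MINUS_35}[ACGT]{{{min_spacer},{max_spacer}}}{MINUS_10}))"
    )
    hits = []
    for match in pattern.finditer(seq):
        matched = match.group(1)
        spacer = len(matched) - len(MINUS_35) - len(MINUS_10)
        hits.append((match.start(), spacer))
    return hits


# Hypothetical toy sequence with a 17-nt spacer between the two elements.
example = "GC" + MINUS_35 + "A" * 17 + MINUS_10 + "GCGCGC"
print(find_sigma70_like_promoters(example))
```

Exact matching keeps the sketch short; a position weight matrix would be a more realistic choice for scanning a real plastome.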
Alongside this eubacterial-type enzymatic machinery, plastids have acquired a second, phage-type polymerase (NEP, nuclear-encoded plastid RNA polymerase); its catalytic core is encoded by a nuclear gene. NEP recognizes three different promoters: NEP type I, recognized in a region of ∼15 nucleotides (−14 to +1) upstream of the transcription start site (+1); the subtype NEP Ib, with a conserved motif 18–20 nucleotides upstream of the YRTa motif, designated ‘box II’ or ‘GAA box’ [72, 73]; and NEP type II, recognized in the region downstream of the transcription start site (−5 to +25). Several photosynthetic genes and operons have PEP-type promoters, non-photosynthetic genes are transcribed by both PEP and NEP, and only a few genes are transcribed exclusively from NEP promoters.
Promoters recognized by PEP and NEP are differentially regulated, as shown by expression levels, and their efficiency can depend on sequences upstream of the start codon. Using gfp expression, the rrn promoter was recently reported to be 90-fold stronger than the psbA or trc promoters, indicating that transcription is more efficient from Prrn; other promoters, such as PclpP-53, have also been reported to be highly efficient, relying on NEP and not on the enzymatic core of PEP. Several reports have analysed the capacity of the rRNA operon promoter (Prrn), which can be fused with control sequences to improve protein expression. Expression levels of a protein are thus determined by its sequence or its promoter, and different promoters can be used to improve gene expression from the chloroplast genome.
4.4. 5′ and 3′ UTR sequences
Transgene expression in the plastid genome requires sequences that can be recognized by the plastid's own transcriptional machinery; however, the role and mechanism of the regulatory elements and flanking sequences are not well defined. Protein expression in plastids depends on mRNA stabilization; in addition, the translation and accumulation efficiency of proteins expressed from chimeric transgenes is determined by the 5′ and 3′ UTR regions and by mRNA-rRNA interactions. Loss of the 5′ and 3′ UTR regulatory elements leads to rapid degradation or low accumulation of transcripts. How translation initiation is activated is currently unknown, but ribosomal protein S1 may mediate affinity for the 5′UTR.
The 5′ and 3′ UTR elements are recognized by proteins that protect the mRNA against 5′-3′ exonucleases and processive 5′-3′ endonucleases; eukaryotic cells use this mechanism in the cytoplasm to remove defective mRNAs that could give rise to truncated proteins, which shows how the untranslated regions influence the stability and degradation of plastid mRNA. The 5′UTR is an A + T-rich region and contains cis elements that determine mRNA stability through interaction with the α subunit of RNA polymerase. Mutations in the 5′UTR have been reported to decrease protein expression drastically. The 3′UTR, in contrast, is considered a stabilizing region of the mRNA, which may be similar to or different from Shine-Dalgarno sequences; however, the regions upstream of the ATG depend strongly on the individual mRNA, and these sequences are therefore critical for initiating translation. The 3′UTR of plastid mRNA is an inverted repeat that can fold into a stem-loop, a structure involved in transcription termination in prokaryotes. It has been suggested that the 3′UTR is of less importance to mRNA stability; however, mRNA levels are determined more by the 3′UTR than by the 5′UTR. Recently, clpP has been reported as a potential source of regulatory sequences, based on results in potato.
4.5. Shine-Dalgarno sequences
Plastid gene expression is controlled at the post-transcriptional stage, especially during translation. Prokaryotic mRNA normally contains a 5′ untranslated region (5′UTR) and a Shine-Dalgarno (SD) sequence that serves as the ribosome binding site. Translational regulation is determined by the untranslated region upstream of the start codon, which is sometimes specific to each mRNA [91, 92]. The optimal position of the SD sequence is 4–9 nucleotides upstream of the start codon; however, several sequences have an SD at other distances, and in these cases specific proteins bind to the 5′UTR and redirect the 30S subunit to the start codon or to the next AUG downstream.
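The 4–9 nucleotide spacing rule can be checked on a candidate 5′UTR with a short sketch such as the one below. It assumes that the spacing is counted as the number of nucleotides between the 3′ end of the SD motif and the first base of the start codon, which is one common convention and is not specified in this chapter; the function names, the search window and the example sequence are hypothetical.

```python
def sd_spacing(mrna, start_index, sd="GGAGG", window=20):
    """Return the number of nucleotides between the nearest upstream SD motif
    and the start codon at `start_index`, or None if no motif is found.

    Assumption: spacing is measured from the 3' end of the motif to the
    first base of the start codon.
    """
    lo = max(0, start_index - window)
    # Normalize RNA to DNA letters so that either alphabet can be used.
    upstream = mrna[lo:start_index].upper().replace("U", "T")
    motif = sd.upper().replace("U", "T")
    pos = upstream.rfind(motif)
    if pos == -1:
        return None
    return len(upstream) - (pos + len(motif))


def spacing_is_optimal(spacing, low=4, high=9):
    """True when the spacing falls in the 4-9 nt range quoted in the text."""
    return spacing is not None and low <= spacing <= high


# Hypothetical toy transcript: SD motif followed by a 6-nt spacer and AUG.
mrna = "AAAUAGGAGGAUUAUCAUGGCU"
start = mrna.find("AUG")
spacing = sd_spacing(mrna, start)
print(spacing, spacing_is_optimal(spacing))
```

A construct whose spacing falls outside the range would, according to the text, depend on specific proteins to direct the 30S subunit to the start codon.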
Plastids and E. coli have an identical anti-SD sequence at the 3′ end of the 16S rRNA (5′-TGGATCACCTCCTT-3′), so the plastid SD sequence ‘GGAGG’ can be recognized in E. coli and vice versa; bacterial SD sequences have been reported to improve gene expression in plastids. It is still unknown, however, how SD sequences are recognized and how their use determines translational efficiency. A cistron can be processed in both systems, but translation efficiency is higher in plastids, possibly because of the plastid-specific ribosomal proteins ‘PSRPs’; although these proteins are not well defined, this may explain why they were acquired during evolution [94, 95].
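The base-pairing relationship between the anti-SD sequence and the SD motif can be verified directly: the reverse complement of the 16S rRNA 3′ end quoted above should contain ‘GGAGG’. The sketch below uses the DNA alphabet, as the text does; the helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


ANTI_SD_3PRIME = "TGGATCACCTCCTT"  # 3' end of 16S rRNA as quoted in the text
pairing_partner = reverse_complement(ANTI_SD_3PRIME)

print(pairing_partner)             # AAGGAGGTGATCCA
print("GGAGG" in pairing_partner)  # True
```

The CCTCC stretch in the rRNA is therefore the part that pairs with the SD motif on the mRNA.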
Multiple SD sequences upstream of the start codon have been reported to improve translation; however, adjacent SD sequences cannot be recognized by several ribosomes at the same time. Four SD sequences arranged with a spacer long enough (39 nt) to accommodate 30S ribosomal subunits can improve expression up to 71-fold by attracting more ribosomes; however, if a start codon, a stop codon or a mini-ORF lies between two SD sequences, translation is low. The relationship between the 5′UTR, the number of SD sequences and start codons, and mRNA stability is not well understood, but polycistronically expressed genes can contain canonical SD sequences upstream of each cistron, indicating that internal translation initiation can occur. Sometimes a polycistron is translated efficiently without intercistronic cleavage; how the downstream SD sequences are recognized is unknown, but regulatory proteins are thought to be involved [91, 92, 94].
5. Transformation methods
Plastid transformation is widely used in basic research and in biotechnological applications; initially developed in Chlamydomonas and tobacco, it is now feasible in several plant species. The technology requires the transfer of genes into the chloroplast genome, and the most common strategies for inserting transgenes are biolistic delivery and polyethylene glycol (PEG) treatment. The biolistic method has been tested with gold and tungsten particles; it is efficient because it is not limited by host-pathogen interactions, as methods based on viruses and bacteria are, and it is not restricted by the type of cell or by the length, sequence or conformation of the foreign DNA. Reports suggest that 0.4 μm particles give fourfold higher transformation efficiency than 0.6 μm particles and cause less cell damage; charged particles have also been shown to promote transformation events in separate genomes, such as those of the plastid and the nucleus, at the same time. The PEG method, on the other hand, is efficient and inexpensive, removes the dependence on particle bombardment and reduces the risk of cell death from the blast; depending on the species, it can be more efficient than particle bombardment. Both methods are efficient, but the choice depends on the crop.
6. Elimination of marker gene
One of the biggest impediments in developing transgenic plants is that manipulating and expressing multiple genes remains difficult, even though new methods have been developed. Moreover, once integration and expression have been achieved, the selectable marker genes that allowed transformed tissue to be detected need to be eliminated, because of problems associated with high-level protein expression and the possibility of horizontal transfer to microorganisms. Multiple strategies have been reported for eliminating marker genes, including co-transformation, transposons, homologous recombination, site-specific recombination and even P-DNA.
One of the most studied site-specific systems is Cre/loxP, which has two components: (a) two loxP sites, each of 34 bp (with inverted repeats), flanking the sequence to be eliminated, and (b) the Cre gene, encoding a 38 kDa recombinase that binds to the loxP sites and excises the sequence between them. Placing the Cre/loxP system under the control of the OsMADS45 promoter, so that the selectable marker gene is auto-excised, improves efficacy. Although the Cre/loxP system has been analysed in multiple plants, including Arabidopsis, Nicotiana tabacum, Zea mays and Oryza sativa, complete efficiency through the F2 progeny has been reported only in Brassica juncea. Other systems for site-specific elimination of selectable marker genes use the phage phiC31 integrase (INT), which mediates attB/attP recombination, Flp/frt from Saccharomyces cerevisiae, R/RS from Zygosaccharomyces rouxii and Gin/gix from bacteriophage Mu.
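The outcome of Cre-mediated excision can be sketched as a string operation: everything between two loxP sites is removed and a single loxP site remains. The sketch assumes that both sites are in the same orientation (the arrangement that leads to excision rather than inversion) and uses the canonical 34 bp loxP sequence; the function name and the toy gene sequences are hypothetical.

```python
# Canonical 34-bp loxP site: two 13-bp inverted repeats around an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(dna, site=LOXP):
    """Remove the segment flanked by the first two directly repeated sites.

    Assumption: both sites have the same orientation, so recombination
    excises the intervening DNA and leaves one site behind.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    return dna[:first] + dna[second:]


# Hypothetical toy construct: gene of interest, floxed marker, downstream DNA.
goi = "ATGGCC"
marker = "ATGAAATAA"
tail = "GGG"
construct = goi + LOXP + marker + LOXP + tail

marker_free = cre_excise(construct)
print(marker_free == goi + LOXP + tail)  # True
```

The single loxP site left behind is the footprint of the excision; the gene of interest and downstream sequence are unaffected in this model.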
Chloroplast genetic engineering is still under development but is emerging as a promising tool for improving agricultural crops, expressing biopharmaceuticals and manufacturing biomaterials. Although chloroplast transformation has been achieved in several important crops, it is not well established in crops such as cereals, which regenerate through somatic embryogenesis and for which tissue culture and regeneration protocols are inefficient, although there has been some significant progress in this regard. In addition, transformation methods for grasses are currently restricted to the nuclear genome, using Agrobacterium, electroporation and biolistics [97, 106]. Furthermore, plastid genetic transformation has enabled phylogenetic studies, which, like transformation aimed at genetic improvement, require knowledge of a genome that has not been studied in all species; this is also a disadvantage. Although a vector would in principle have to be prepared for each crop, potato and tomato have been transformed using plastid vectors from tobacco, indicating that differences in mRNA processing determine which vector can be used. On the other hand, although transgenes should not escape through pollen, rare hybrids can be generated if modified crops are pollinated by wild relatives; of the 13 most important crops in the world, 12 hybridize with wild relatives. In addition, paternal transmission of chloroplasts has been reported in tobacco, even though the inheritance of organelles is considered strictly maternal. New controls for containing transgene transmission should therefore be considered, and this is the biggest challenge in developing genetically modified crops.
Conflict of interest
The authors declare that they have no conflicts of interest.