Toward Future Engineering of the N-Glycosylation Pathways in Microalgae for Optimizing the Production of Biopharmaceuticals

Microalgae are eukaryotic, photosynthetic organisms that are commonly used in biotechnology to produce high-added-value molecules. Recently, biopharmaceuticals such as monoclonal antibodies have been successfully produced in microalgae such as Chlamydomonas reinhardtii and Phaeodactylum tricornutum. Most of these recombinant proteins are glycosylated, and it is well established that their glycan structures are essential for the bioactivity of the biopharmaceuticals. Therefore, prior to any commercial use of such algae-made biopharmaceuticals, it is necessary to characterize their glycan structures and eliminate any glycosylation differences from their human counterparts. In this context, the chapter summarizes successful attempts to produce biopharmaceuticals in microalgae and reviews current information regarding glycosylation pathways in microalgae. Finally, genome editing strategies that will be essential in the future to optimize the microalgae glycosylation pathways are highlighted.


Introduction
Microalgae are currently used for a broad spectrum of industrial applications including the food and livestock feed industries, bioenergy, cosmetics, healthcare and the environment [1][2][3][4]. Recently, owing to their numerous advantages (high growth rate, easy cultivation, low production cost, etc.), microalgae have emerged as solar-fueled green alternative cell factories for the production of recombinant proteins [4][5][6][7]. Among the different attempts to produce vaccines and biopharmaceuticals in microalgae, the production of monoclonal antibodies (mAbs) represents the most extensive work [7,8]. Indeed, the first significant effort to produce recombinant mAb fragments was made in the green microalga Chlamydomonas reinhardtii with the synthesis and accumulation in its chloroplast of a human single chain antibody directed against the herpes simplex virus glycoprotein D (HSV8-lsc) [8]. Later, a full-length human IgG1 directed against anthrax was successfully produced in the chloroplast of C. reinhardtii [9]. The Chlamydomonas-made mAb was able to bind the anthrax protective antigen 83 (PA83) [9]. In another study, a series of complex chimeric proteins was expressed in the chloroplast of C. reinhardtii. These chimeric proteins were composed of a single chain antibody fragment (scFv) targeting the B-cell surface antigen CD22, genetically fused either to the eukaryotic ribosome-inactivating protein gelonin from Gelonium multiflorum [10] or to Pseudomonas aeruginosa exotoxin A domains 2 and 3 [11]. These molecules, termed immunotoxins, were each encoded by a single gene that produces an antibody-toxin chimeric protein. Such algae-made immunotoxins are able to bind target B cells and efficiently kill them in vitro [11]. Full-length mAbs have also been expressed in the diatom Phaeodactylum tricornutum through nuclear transformation [12][13][14].
These mAbs are a recombinant mAb directed against the nucleoprotein of Marburg virus, a close relative of Ebola virus [14], and a human IgG1 directed against the hepatitis B virus surface antigen (HBsAg) [12,13]. The latter has been biochemically characterized in order to check the quality of the diatom-made mAb as well as its N-glycosylation profile [15]. Moreover, it has been demonstrated that this glycosylated antibody is able to bind human Fcγ receptors [16], suggesting that it could be effective in human therapy.
When the production of biopharmaceuticals is considered, their N-glycosylation has to be investigated. Indeed, among the biopharmaceuticals that were approved in 2016 and 2017, 96% were glycosylated [17]. The glycosylation of an approved biopharmaceutical represents a critical quality attribute (CQA) that may affect its safety and biological activities [18][19][20]. In addition, the introduction of nonhuman epitopes on the recombinant protein by the expression system may induce an immune response after injection into patients [21]. Thus, N-glycosylation is a real challenge for the commercial production of biopharmaceuticals. The glycosylation state of therapeutic proteins has to be accurately identified and characterized as per the World Health Organization and International Conference on Harmonization Q6B guidelines [17]. Therefore, in the context of developing microalgae as alternative platforms for the production of biopharmaceuticals, the capability of these unicellular eukaryotic cells to introduce N-glycans on their endogenous proteins and on recombinant proteins, as well as the regulation of this process, has to be considered and understood.

General aspects
N-glycosylation is a major post-translational modification of proteins in eukaryotes. Protein N-glycosylation starts with the synthesis of a lipid-linked oligosaccharide, formed by the transfer of monosaccharides onto a dolichol pyrophosphate (PP-Dol) anchored in the membrane of the endoplasmic reticulum (ER) through the action of a set of enzymes named asparagine-linked glycosylation (ALG) enzymes [22,23]. The final Glc3Man9GlcNAc2 precursor is transferred en bloc by the oligosaccharyltransferase (OST) complex onto the asparagine residues of the Asn-X-Ser/Thr consensus sequences of a protein [22] (Figure 1). Alternative consensus sequences, such as Asn-X-Cys and Asn-X-Val, have also been found to be glycosylated in some proteins [24][25][26].
Figure 1. Comparison of protein N-glycosylation pathways in eukaryotes. Biosynthesis steps occurring in the ER are gathered in the box. Mature N-glycan structures observed in mammals, plants, insects, yeasts and filamentous fungi are drawn according to [33]. Monosaccharide symbols denote N-acetylglucosamine, xylose, mannose, fucose, galactose and sialic acid. Asn, asparagine; PP-Dol, dolichol pyrophosphate; FuT, fucosyltransferase; GalT, galactosyltransferase; SiaT, sialyltransferase; XylT, xylosyltransferase; ALG, asparagine-linked glycosylation; OST, oligosaccharyltransferase.
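The Asn-X-Ser/Thr consensus rule lends itself to a simple sequence scan. The following is a minimal sketch, not a validated predictor: the function name `find_sequons` and its `third_residues` parameter are hypothetical, and the exclusion of proline at the X position is a commonly applied rule that is assumed here rather than stated in this chapter. A matching sequon only marks a candidate site; actual occupancy has to be determined experimentally.

```python
import re
from typing import List, Tuple


def find_sequons(protein: str, third_residues: str = "ST") -> List[Tuple[int, str]]:
    """Return (1-based Asn position, tripeptide) for each candidate N-glycosylation sequon.

    The default pattern is Asn-X-Ser/Thr. Passing third_residues="STC" or "STCV"
    also reports the alternative Asn-X-Cys / Asn-X-Val motifs.
    Assumption: X may be any residue except proline.
    """
    # A lookahead is used so that overlapping sequons (e.g. "NNTS") are all reported.
    pattern = re.compile(rf"(?=(N[^P][{third_residues}]))")
    sequence = protein.upper()
    return [(match.start() + 1, match.group(1)) for match in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Toy peptide: NGT and NAS are candidate sites, NPS is skipped (proline at X).
    print(find_sequons("ANGTNPSNAS"))
    # Including the alternative Asn-X-Cys motif.
    print(find_sequons("QNACK", third_residues="STC"))
```

For the toy peptide `ANGTNPSNAS`, the scan reports the asparagines at positions 2 and 8 and ignores the `NPS` tripeptide.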
In the ER, neo-synthesized glycoproteins are then subjected to a quality control process involving deglucosylation of their N-glycans by glucosidases and reglucosylation by a UDP-glucose:glycoprotein glucosyltransferase (UGGT). This generates monoglucosylated glycan intermediates that interact with ER-resident chaperones, thus ensuring proper folding of the glycoproteins [27]. Once the glycoprotein is correctly folded, α-glucosidase II removes the last glucose residue, and an ER mannosidase may remove one mannose residue, which leads to the formation of a Man9/8GlcNAc2 oligomannoside. These quality control events are conserved in eukaryotes because they are crucial for the secretion of well-folded proteins [28]. As a consequence, whatever the expression system used, a recombinant therapeutic protein leaving the ER exhibits an N-glycosylation similar to that of the reference protein, with only Man9/8GlcNAc2 oligomannosides attached to the Asn residues of the N-glycosylation consensus sites.
After transfer to the Golgi apparatus, the oligomannosides resulting from ER processing are modified by the action of specific mannosidases and glycosyltransferases [29]. These cell-specific Golgi enzyme repertoires give rise to various organism-specific oligosaccharides. In most eukaryotes, an N-acetylglucosaminyltransferase I (GnT I)-dependent N-glycan processing occurs (Figure 1). In this pathway, α-mannosidase I converts Man9/8GlcNAc2 into the branched isomer of Man5GlcNAc2. Then, the successive actions of GnT I, α-mannosidase II and GnT II give rise to the core GlcNAc2Man3GlcNAc2 that is common to most eukaryotes [27][28][29][30][31] (Figure 1). This core is then decorated by the action of specific glycosyltransferases that differ from one organism to another. As a result, the protein carries organism-specific N-glycans that confer on the mature protein its in vivo bioactivities [32]. It is worth noting that GnT I-independent N-glycan processing also occurs in some eukaryotes, such as filamentous fungi and yeasts, in which N-glycosylation in the Golgi apparatus results in the synthesis of high-mannose and hypermannose N-glycans, respectively (Figure 1). As a consequence, in the context of the production of biopharmaceuticals by genetic engineering, such a diversity of mature N-linked glycans is a limitation: the expression system used may introduce inappropriate epitopes and heterogeneous glycosylation on the therapeutics, and may also fail to introduce glycan sequences that are required for the in vivo bioactivity of the biopharmaceuticals.

Protein N-glycosylation in microalgae
Overall, protein N-glycosylation in microalgae has received little attention. A few studies published in the 1990s demonstrated, on the basis of affinodetection or enzymatic sequencing, that proteins secreted by green microalgae carry mainly oligomannosides or xylose-containing N-glycans [34][35][36]. More recently, mass spectrometry analyses of glycans N-linked to endogenous microalgae proteins have been reported. First, the 66 kDa cell wall glycoprotein from the red microalga Porphyridium sp. has been found to carry Man8GlcNAc2 and Man9GlcNAc2 oligomannosides containing 6-O-methyl mannose residues and substituted by one or two xylose residues [37,38] (Figure 2). Investigation of C. reinhardtii has demonstrated that proteins in this green microalga carry oligomannosides ranging from Man2GlcNAc2 to Man5GlcNAc2, as well as Man4GlcNAc2 and Man5GlcNAc2 N-glycans containing 6-O-methyl mannoses and substituted by one or two xylose residues (Figure 2) [39]. Initially reported as a branched oligomannoside, Man5GlcNAc2 was re-evaluated in 2017 as a linear sequence on the basis of ESI-MSn analyses [40]. Although mature N-glycans from Porphyridium sp. and C. reinhardtii share common structural features, the location of the xylose residues on the N-glycan differs between these two microalgae (Figure 2). As mature N-glycans do not exhibit any terminal GlcNAc residues, they were proposed to result from Golgi xylosylation and O-methylation of oligomannosides deriving from the precursor synthesized in the ER through a GnT I-independent processing, even though this remains to be completely elucidated and methylation occurring in the ER cannot yet be ruled out [38].
The N-glycan profile of P. tricornutum has been described as containing Man3GlcNAc2 to Man9GlcNAc2 oligomannosides and also minute amounts of paucimannosidic fucosylated N-glycans (Figure 2) [41]. In contrast to Porphyridium sp. and C. reinhardtii, these N-glycans result from a GnT I-dependent pathway (Figure 2) [41]. As evidence, the GnT I gene predicted in the P. tricornutum genome encodes an enzyme able to restore the maturation of complex-type N-glycans in the CHO Lec1 mutant, which lacks endogenous GnT I activity [41]. N-glycans arising from a GnT I-dependent pathway have also recently been reported in the green microalga Botryococcus braunii through a glycoproteomic approach [42]. In contrast to P. tricornutum, these N-glycans harbor a GlcNAc residue at the nonreducing end as well as mono- and di-O-methylations of the core mannose residue. Moreover, this N-glycan bearing a terminal GlcNAc resulting from GnT I activity can be further elongated with an additional hexose or methyl-hexose residue. In addition, proteins from this green microalga also exhibit methylated N-linked oligomannosides carrying core fucose and core xylose residues (Figure 2) [42].
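Because these structures were assigned largely by mass spectrometry, it can help to see how a glycan composition translates into a mass. The sketch below is illustrative only: the residue masses are standard monoisotopic values that are assumed here and are not taken from this chapter, the helper name `glycan_mass` is hypothetical, and the calculation applies to a neutral, underivatized free glycan (no adduct, no labeling, no permethylation).

```python
from typing import Dict

# Monoisotopic residue masses in Da (assumed standard values, not from the chapter).
RESIDUE_MASS: Dict[str, float] = {
    "Hex": 162.0528,     # hexose, e.g. mannose, glucose, galactose
    "HexNAc": 203.0794,  # N-acetylhexosamine, e.g. GlcNAc
    "dHex": 146.0579,    # deoxyhexose, e.g. fucose
    "Pent": 132.0423,    # pentose, e.g. xylose
    "Me": 14.0157,       # one O-methylation (net addition of CH2)
}
WATER = 18.0106  # added once for the free reducing-end glycan


def glycan_mass(composition: Dict[str, int]) -> float:
    """Monoisotopic mass of a neutral free glycan from its residue composition."""
    return WATER + sum(RESIDUE_MASS[residue] * count for residue, count in composition.items())


if __name__ == "__main__":
    man5 = {"Hex": 5, "HexNAc": 2}                              # Man5GlcNAc2
    man5_xyl_me = {"Hex": 5, "HexNAc": 2, "Pent": 1, "Me": 1}   # one xylose, one methyl
    print(f"Man5GlcNAc2: {glycan_mass(man5):.4f} Da")
    print(f"Man5GlcNAc2 + Xyl + Me: {glycan_mass(man5_xyl_me):.4f} Da")
```

Under these assumptions, a xylose adds about 132.04 Da and each O-methylation about 14.02 Da to an oligomannoside, which is the kind of mass shift that distinguishes the substituted glycans described above from plain oligomannosides.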
In support of these biochemical data, protein N-glycosylation pathways in microalgae can be drawn on the basis of public genomic databases. Microalgae genomes from different phyla are available [4,43]. Since protein N-glycosylation occurs in the ER and the Golgi apparatus, bioinformatics analyses of microalgae genomes must be carried out independently for the two compartments: a search for genes encoding proteins involved in precursor biosynthesis and ER protein quality control on the one hand, and a search for Golgi glycosidases and glycosyltransferases involved in the synthesis of mature N-glycans on the other hand.
Figure 2. Major mature N-linked glycans from the green microalgae Chlamydomonas reinhardtii and Botryococcus braunii, the red microalga Porphyridium sp. and the diatom Phaeodactylum tricornutum. N-glycan structures are drawn according to [33]. Monosaccharide symbols denote N-acetylglucosamine, xylose, mannose, fucose and galactose. Asn, asparagine; Me, methyl.
Genes encoding subunits of OST, glucosidases, as well as ER-resident UGGT and chaperones are predicted in microalgae genomes, suggesting that the process of ER quality control in these unicellular organisms is similar to the one described in other eukaryotic cells [41,44,45]. Among these putative ER candidates, only the activity of the α(1,3)-glucosidase, also called glucosidase II, from the red microalga Porphyridium sp. has been biochemically confirmed [44]. Most ALG genes are also predicted in microalgae genomes [39,41,44], suggesting that the synthesis of the oligosaccharide precursor is conserved overall. However, some ALG genes, namely ALG3, ALG9 and ALG12, are not predicted in C. reinhardtii [39,45]. These ER enzymes are involved in completing the biosynthesis of the Man9GlcNAc2-PP-Dol precursor, prior to its glucosylation, by adding mannose residues on the α(1,6)-mannose arm of the core (Figure 1). Reinvestigation in C. reinhardtii of the structure of oligomannosides and analysis of the ER N-glycan precursor [40] confirmed the absence of ALG3, ALG9 and ALG12 activities and the synthesis in this green microalga of linear oligomannoside sequences instead of the branched isomer initially proposed in [39]. It is worth noting that in this truncated ER pathway, the presence of the triglucosyl extension is likely sufficient to ensure interaction of the N-glycan precursor with the chaperones of the ER quality control process. In addition to the lack of ALG3, ALG9 and ALG12 in C. reinhardtii, other microalgae genomes lack genes encoding ALG10 and GCS1, an α(1,2)-glucosidase [44]. Because ALG10 is the α(1,2)-glucosyltransferase responsible for the addition of the α(1,2)-glucose residue on the precursor N-glycan and GCS1 is responsible for trimming this residue, we hypothesize that ER quality control in these microalgae involves only diglucosylated N-glycan intermediates.
With regard to Golgi N-glycosylation events, the presence of GnT I is predicted in some microalgae including haptophytes and cryptophytes, but not in C. reinhardtii, Volvox and Ostreococcus [41,42]. As mentioned previously, P. tricornutum GnT I activity was confirmed by complementation of the CHO Lec1 mutant cell line [41]. A recent study of B. braunii [42] confirmed the involvement of this transferase in the N-glycosylation pathway of this green microalga. Concerning other Golgi enzymes, α-mannosidases (CAZy GH47) and α(1,3)-fucosyltransferases (CAZy GT10) are also predicted in the microalgae genomes studied so far [41,44,45]. These enzymes are respectively involved in the trimming of mannose residues of oligomannosides and the transfer of fucose onto the proximal GlcNAc. Their sequences exhibit peptide motifs that were demonstrated to be required for the activity of such Golgi enzymes but, in contrast to GnT I, no biochemical data on their activity and specificity are available yet.
As depicted, protein N-glycosylation in microalgae is specific and differs largely from the one described in mammals (Figures 1 and 2). Therefore, producing biopharmaceuticals in microalgae with N-glycans compatible with their use in human therapy will be challenging and will require metabolic engineering of the N-glycosylation pathway. This will include the inactivation of enzymes that introduce nonhuman glycoepitopes onto N-linked glycans and the complementation of microalgae with appropriate glycosyltransferases to introduce missing glycan sequences. These strategies have already been successfully carried out for the engineering of the N-glycan pathways in plants and yeasts [46,47]. In addition, successful complementation with human glycosyltransferases requires the availability of appropriate nucleotide-activated sugars in the Golgi apparatus [48]. For instance, sialic acids, which terminate bi-antennary N-glycans in mammals, have not been reported in microalgae such as P. tricornutum and Porphyridium sp. [38,41]. Likewise, there is no evidence for the import of GlcNAc into the Golgi apparatus in microalgae exhibiting a GnT I-independent N-glycan pathway, even though putative candidates for a UDP-GlcNAc transporter have been identified in microalgae such as C. reinhardtii [49]. Indeed, the two GlcNAc residues of the chitobiose unit of N-linked glycans are transferred onto the PP-Dol lipid on the cytosolic face of the ER membrane. Metabolic engineering strategies are now feasible owing to the recent development of transgene expression and gene inactivation tools in microalgae, as summarized in the following section.

Different tools to generate genome-modified organisms
Classical strategies of genetic engineering involve the modulation of gene expression, including overexpression and inactivation by RNA interference [50][51][52]. The most widely used engineering methods are based on random insertional mutagenesis obtained by various processes such as conjugation, agitation with glass beads, electroporation, biolistic microparticle bombardment, Agrobacterium-mediated transformation or multipulse electroporation. The transformation step is followed by phenotypic selection using antibiotics to generate genome-modified organisms [53]. These processes have the advantage of being simple and of yielding a high number of transformed cells. For example, P. tricornutum transformation reached 1 transformant per 10^6 cells with a biolistic bombardment system [54]. However, cell-wall-less strains are required for almost all the classical methods quoted above [50,55]. Furthermore, the genetic stability of the mutations obtained after transformation by random insertion depends on the microalgae species [53]. For example, high stability has been shown in C. reinhardtii [55]. In contrast, mutagenesis was unstable in Thalassiosira weissflogii [56]. More recently, new tools have been developed in order to knock in, knock out, modify, replace, or insert genes. These new genetic engineering tools rely on nucleases that act as molecular scissors at specific loci [52]. A break in the DNA activates DNA repair mechanisms, which can be either homologous recombination (HR) or non-homologous end-joining (NHEJ) [52].
The HR results in sequence modification in the target locus [57]. In the NHEJ process, the two ends of the broken chromosome are stuck together causing small deletions or small insertions [57]. These events confer several modifications of the target gene such as gene inactivations or insertions. Very little is known about these mechanisms in microalgae due to their complexity as reported by Daboussi in 2017 [53].
Several recent studies have demonstrated that particular nucleases can be used to introduce stable targeted modifications by acting as molecular scissors. Among these nucleases are meganucleases (MNs), zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and, finally, the well-known clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 nuclease system. These four nuclease systems are described in the following paragraphs.
A meganuclease is an engineered endonuclease able to recognize and cleave a long specific DNA sequence of 18 to 30 base pairs. The meganuclease strategy requires redesigning a homing endonuclease from the LAGLIDADG family, in particular the I-CreI enzyme from C. reinhardtii, so that it targets the gene sequence to be modified [58]. This was tested for the first time in 2014 using P. tricornutum as a model [59]. In this study, two engineered meganucleases targeting genes involved in lipid metabolism made it possible to obtain 29% targeted mutagenesis [59]. Although successful, this strategy is time-consuming compared to the other alternatives [52].
Zinc finger nucleases (ZFNs) are hybrid proteins composed of the cleavage domain of the restriction enzyme FokI fused to designed zinc-finger DNA-binding domains [60]. The FokI cleavage domain is active only as a dimer [61]. Therefore, cleaving a given DNA target sequence requires the design of two different ZFNs that bind to adjacent half-sites of a specific locus. Each designed ZFN is able to recognize a sequence of 9-12 nucleotides in the genome [52]. A set of zinc finger nucleases has recently been used to modify, by insertion of template DNA, the Cop3 gene locus encoding a light-activated channel in C. reinhardtii [62]. Moreover, in 2017, genome editing was reliably performed using the ZFN strategy in order to inactivate and modify nuclear photoreceptor genes in this same microalga [63]. Despite these promising results, the ZFN system is rarely used because of its low specificity. Indeed, DNA cleavage requires both ZFN monomers to recognize their target sequences in the genome in the proper spatial orientation to assemble a functional ZFN [64]. In addition, the ZFN system is time-consuming to implement [64]. Nowadays, other designed nucleases such as TALENs and CRISPR/Cas9 are emerging in the scientific community to perform genome editing in microalgae.
The transcription activator-like effector nuclease (TALEN) system is similar to the ZFN system because it uses nucleases composed of a DNA-binding domain (here the TAL effector domain) fused to the nonspecific DNA cleavage domain of FokI [65]. TALEN proteins are characterized by a repeated 34-amino acid sequence that recognizes specific DNA sequences [66]. P. tricornutum lipid metabolism was recently modified using TALENs [59]. In this study, seven genes involved in this metabolism were modified. Targeted mutagenesis occurred at high frequency, reaching up to 56% of colonies [59]. This genetic engineering allowed the creation of a high lipid-producing strain by inactivating a key gene for carbohydrate energy storage [59]. Another team successfully inactivated the urease gene in P. tricornutum in 24% of transformed colonies [67]. In addition, the TALEN system has also been used to inactivate the red/far-red light-sensing phytochrome gene of this diatom [68].
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system is the best-known engineered nuclease system of this decade because it is a powerful and precise tool that has been applied in numerous eukaryotic organisms [69]. This system is based on the RNA-guided DNA cleavage defense system found in archaea and many bacteria. Indeed, these organisms are able to store bacteriophage DNA fragments acquired during a previous bacteriophage infection in the CRISPR locus, which is formed of DNA repeat sequences separated by unique DNA sequences. This system forms the basis of a bacterial defense in response to bacteriophage attacks [70]. This defense mechanism was first highlighted by Prof. Emmanuelle Charpentier and her team in 2011 [70,71]. The CRISPR/Cas9 system has been developed into a simple toolkit based on a custom single guide RNA (sgRNA) that contains a targeting sequence (crRNA sequence) and a Cas9 nuclease-recruiting sequence (tracrRNA) [52]. In microalgae, CRISPR/Cas9 has been used in C. reinhardtii [72]. However, Cas9 nuclease production appeared to be toxic for the microalga, limiting the efficiency with which genome-modified strains could be obtained [72]. Two years later, a new attempt was made in this same microalga using another strategy that avoids this toxicity [73]. Indeed, the authors succeeded in generating CRISPR/Cas9-induced NHEJ-mediated knock-in mutant strains at three loci [73]. In the same year, CRISPR/Cas9 gene knockout technology was used in P. tricornutum to generate mutants of the CpSRP43 gene, a member of the chloroplast signal recognition particle pathway. Using this strategy, the authors obtained a mutation efficiency of 31% [74]. This team targeted two other genes of the diatom using this technology and obtained mutation levels from 25 to 63% [74].
The adaptability of the CRISPR/Cas9 system has been demonstrated in other diatoms such as Thalassiosira pseudonana [75], as well as in the heterokont Nannochloropsis oceanica, where it was used to knock out nitrate reductase activity [76]. In conclusion, the CRISPR/Cas9 system is a promising technology to generate genome-modified organisms in microalgae. Table 1 compares this system with the other nuclease systems cited above in terms of their technical characteristics and highlights their advantages and disadvantages.
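Designing the targeting sequence of an sgRNA starts with locating candidate sites in the gene of interest. The sketch below illustrates this under assumptions that are not stated in this chapter: it considers the commonly used Streptococcus pyogenes Cas9, which is generally described as requiring an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nucleotide target. The function names `revcomp` and `find_cas9_targets` are hypothetical, and no scoring of off-target or on-target efficiency is attempted.

```python
import re
from typing import List, Tuple

# Assumption: 20-nt protospacer directly followed by an NGG PAM (SpCas9-style).
# The lookahead lets overlapping candidate sites be reported.
TARGET_WITH_PAM = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def find_cas9_targets(dna: str) -> List[Tuple[str, int, str]]:
    """List candidate (strand, start, protospacer) sites on both strands.

    `start` is 0-based and counted along the scanned strand, i.e. along the
    reverse complement for "-" hits.
    """
    sequence = dna.upper()
    hits = []
    for strand, scanned in (("+", sequence), ("-", revcomp(sequence))):
        for match in TARGET_WITH_PAM.finditer(scanned):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: one forward-strand site ending in a TGG PAM.
    toy_locus = "ACGTACGTACGTACGTACGT" + "TGG"
    for strand, start, protospacer in find_cas9_targets(toy_locus):
        print(strand, start, protospacer)
```

In practice, candidate sites found this way would still need to be checked for uniqueness in the genome of the microalga being engineered before a guide is chosen.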

Mutant libraries
The study of mutants impaired in a glycosidase or a glycosyltransferase involved in the N-glycan pathway is of great interest. Indeed, the synthesis of oligosaccharides is a sequential process. Inactivation of an enzyme usually results in the accumulation of its N-glycan substrate, which enables the step-by-step dissection of the entire pathway. Moreover, phenotyping mutants of the glycosylation pathway makes it possible to investigate to what extent protein N-glycan processing is required for normal growth and development. An indexed and mapped mutant library was created in C. reinhardtii by single random insertional mutagenesis of gene cassettes in 2016 [79]. This library can already be envisioned as a tool to study the function of genes encoding putative glycosyltransferases, glycosidases or even putative translocators in microalgae and to confirm their physiological roles through reverse genetic studies.
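The logic of dissecting a sequential pathway with mutants can be made concrete with a toy model. The sketch below encodes the GnT I-dependent Golgi steps described in the General aspects section as ordered changes in glycan composition and stops at the first inactivated enzyme, so that the substrate of that enzyme is what accumulates. It is a deliberately simplified illustration: the names `PATHWAY` and `process_glycan` are hypothetical, the starting glycan is fixed at Man9GlcNAc2, and real cells may further modify an accumulated intermediate through other enzymes.

```python
from typing import Dict, FrozenSet, List, Tuple

Glycan = Dict[str, int]

# Ordered GnT I-dependent steps: (enzyme, change in composition).
PATHWAY: List[Tuple[str, Glycan]] = [
    ("alpha-mannosidase I", {"Man": -4}),   # Man9GlcNAc2 -> Man5GlcNAc2
    ("GnT I", {"GlcNAc": +1}),              # -> GlcNAcMan5GlcNAc2
    ("alpha-mannosidase II", {"Man": -2}),  # -> GlcNAcMan3GlcNAc2
    ("GnT II", {"GlcNAc": +1}),             # -> GlcNAc2Man3GlcNAc2 core
]


def process_glycan(start: Glycan, knocked_out: FrozenSet[str] = frozenset()) -> Glycan:
    """Apply the pathway in order; stop at the first inactivated enzyme."""
    glycan = dict(start)
    for enzyme, change in PATHWAY:
        if enzyme in knocked_out:
            break  # later enzymes never receive their substrate
        for sugar, delta in change.items():
            glycan[sugar] = glycan.get(sugar, 0) + delta
    return glycan


if __name__ == "__main__":
    er_product = {"Man": 9, "GlcNAc": 2}
    print("wild type:", process_glycan(er_product))
    print("GnT I mutant:", process_glycan(er_product, frozenset({"GnT I"})))
```

In this model the wild type ends at the GlcNAc2Man3GlcNAc2 core (3 Man, 4 GlcNAc), whereas a GnT I mutant accumulates Man5GlcNAc2, which is the kind of diagnostic shift a mutant library allows one to look for.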

Conclusion
The production of biopharmaceuticals in microalgae currently requires a better understanding of the mechanism and regulation of the N-glycosylation pathway. Such information can be gained through the use of mutant libraries like the one recently developed for C. reinhardtii. Indeed, the characterization of each individual mutant will shed light on a specific step of N-glycan processing, and mutant cells could represent interesting cell lines for the production of biopharmaceuticals bearing a chosen N-glycan profile.
Once these pathways have been completely deciphered in the microalga model intended for the production of biopharmaceuticals, the humanization of the N-glycosylation pathway can be initiated using the engineered nuclease strategies recently developed in microalgae. We can now consider that microalgae transformed with these innovative genomic tools will constitute, in the near future, one of the most suitable green cell factories for the production of humanized biopharmaceuticals.
The authors are thankful to the University of Rouen Normandy, the Région Normandie and the I.U.F. for their financial support.

Conflict of interest
The authors have declared no conflict of interest.