Metabolic Engineering of Saccharomyces cerevisiae for Industrial Biotechnology

Saccharomyces cerevisiae is a widely used host for the production of value-added molecules such as pharmaceutical ingredients, therapeutic proteins, chemicals, biofuels and enzymes. S. cerevisiae, the baker’s yeast, is the most widely used yeast model because its genetics, physiology and biochemistry are well understood and because it has numerous applications in genetic engineering and fermentation technologies. Interest in developing and improving yeast strains for industrial biotechnology is increasing. Metabolic engineering develops industrial strains by manipulating yeast metabolism to enhance the production of value-added molecules. This chapter reviews the metabolic engineering strategies used to develop industrial yeast strains for biotechnological applications and highlights recent advances in the field, such as the use of CRISPR/Cas9.


Introduction
The term "metabolic engineering" was introduced in [1], which defines it as "the improvement of cellular activities by manipulation of enzymatic, transport and regulatory function of the cell with the use of recombinant DNA technology" [1].
Metabolic engineering manipulates the genetic information of a strain in order to "improve the cellular activities" of that strain [1]. Stress tolerance is important in industrial processes and often needs to be improved. Heterologous pathways and metabolic engineering themselves impose stress on the yeast strain: the compound of interest can be toxic to the host, and heterologous pathways are more sensitive than endogenous pathways. In addition, high salt, high temperature, high ethanol, acids and inhibitors encountered during industrial processes can stress the cells and impair process performance [2].
Metabolic engineering makes it possible to improve stress tolerance by modifying the cellular functions of yeast [3]. It can be subdivided into rational metabolic engineering, inverse metabolic engineering and evolutionary engineering.

Rational metabolic engineering
Rational metabolic engineering is the fundamental type of metabolic engineering. It focuses on the engineering of proteins and enzymes based on the knowledge of pathways and their regulation [4]. Protein activities are optimized to design a desired strain based on the protein and host information [5].
Application of systems biology helps to obtain protein and host information and also to model the system. In rational metabolic engineering, a mathematical model is needed to predict the strategies that can improve the strain [6].

Evolutionary engineering
Evolutionary engineering improves a strain by mutagenesis or by gene recombination and shuffling, followed by selection of cells with the desired phenotype. In other words, multiple cycles of random genetic perturbation and selection are performed sequentially [7,8]. The method mimics the natural evolutionary process [9,10]: an appropriate selective pressure drives the population toward the desired phenotype, and the molecular mechanisms behind the evolved trait are studied afterwards. Industrially important traits such as stress tolerance, product formation and substrate utilization can be improved in this way, and Saccharomyces cerevisiae can be made resistant to multiple types of stress. Examples include improved ethanol resistance [11], salt resistance [12] and freeze resistance [13].
In this method, a suitable screening method must be found for the stress of interest. Because the genetic modifications are random, mutations accumulate throughout the genome of the host, and it is not easy to determine which modification causes the strain improvement [16]. Nevertheless, evolved strains can provide genetic information about the improved phenotype, and this information can be used for inverse metabolic engineering [7,14]. Random methods for deleting or overexpressing genes [15] or for introducing random mutations can also be used.
While developing a microbial strain, the toxicity of the bioproduct and the tolerance of the host to it, cell growth during fermentation, and downstream processes must all be considered. Optimizing these factors together with cell performance is difficult because the relationship between genotype and phenotype is poorly understood.
Evolutionary engineering can be classified into two categories: adaptive laboratory evolution (ALE) and directed evolution [17].
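The cycle of random perturbation followed by selection described above can be illustrated with a deliberately simplified sketch. It is not a model of any published experiment: the scalar "tolerance" score, the Gaussian mutation step and all function names are assumptions made only for illustration.

```python
import random


def evolve(population, fitness, mutate, rounds, keep, rng):
    """Repeat: mutate every strain, then keep the `keep` fittest survivors."""
    for _ in range(rounds):
        # Random genetic perturbation: each parent yields one mutated child.
        offspring = [mutate(strain, rng) for strain in population]
        # Selective pressure: rank parents and offspring together by fitness.
        pool = sorted(population + offspring, key=fitness, reverse=True)
        population = pool[:keep]
    return population


def mutate_tolerance(tolerance, rng):
    """Hypothetical mutation: nudge a scalar tolerance score up or down."""
    return tolerance + rng.gauss(0.0, 0.5)


rng = random.Random(0)  # fixed seed so the run is repeatable
start = [5.0] * 10      # ten identical founder strains, arbitrary units
evolved = evolve(start, fitness=lambda t: t, mutate=mutate_tolerance,
                 rounds=20, keep=10, rng=rng)
print("best founder:", max(start), "best evolved:", round(max(evolved), 2))
```

Because parents compete with their offspring in every round, the best score in this toy population can never decrease. A real experiment has no such guarantee, and, as noted above, identifying which mutation caused the improvement remains the hard part.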

Directed evolution
In directed evolution, a chosen selection pressure is applied to develop enzymes with new or improved properties. Protein engineering uses directed evolution to enhance enzyme activity. Directed evolution focuses on a gene encoding a protein or enzyme, whereas ALE acts on the entire organism and relies on spontaneous mutations. Genetic diversity can be generated at the level of a single gene, a pathway or the whole genome [17,19].
Oligo-mediated targeted mutagenesis methods such as multiplex automated genome engineering (MAGE), its yeast adaptation yeast oligo-mediated genome engineering (YOGE), and RNAi-assisted genome evolution (RAGE) can be used for directed evolution [20,21].

Inverse metabolic engineering
Inverse metabolic engineering is applied in three steps. In the first step, the desired phenotype is identified, constructed or calculated. In the second step, the genetic or environmental factors that confer this phenotype are determined. In the third step, the phenotype of another strain or organism is transferred to the strain of interest by directed genetic or environmental manipulation [4]. Inverse metabolic engineering thus benefits from phenotypic differences. The host strain is exposed to different environmental conditions, the trait that makes an organism resistant is investigated, and the genetic basis of this trait is then identified. Transcriptomics, proteomics and metabolomics are used to identify the basis of the trait [14,22]. For example, xylose assimilation was improved in recombinant Saccharomyces cerevisiae by inverse metabolic engineering. A genomic fragment library from Pichia stipitis was introduced into S. cerevisiae expressing XYL1 and XYL2, and the transformants with a high growth rate on xylose were selected. Sequencing identified XYL3 as the gene responsible for the high growth rate on xylose [23]. Inverse metabolic engineering has several advantages. No prior information about the proteins in the pathway and their regulation is required. Industrial strains and actual production conditions can be used directly to identify the key genetic players. The final strain is developed with homologous genes, and new genetic targets can be discovered. When heterologous genes need to be added, rational metabolic engineering is the better choice. Despite the advantages of inverse metabolic engineering, strain development is most successful when all metabolic engineering methods are combined [9].
Microorganisms naturally synthesize many metabolites in their cells, but usually only at low levels. Metabolic engineering has made it possible to produce chemicals and proteins in larger quantities by improving the titer, yield and productivity of the product. These improvements have in turn driven the development of new metabolic engineering strategies. To develop and improve industrial strains, strategies that draw on metabolic engineering, systems biology and synthetic biology are used [24,25,26,27].
Metabolic engineering tools help to construct industrial strains. With them, novel metabolic pathways can be constructed, new metabolic engineering targets can be identified, gene expression can be controlled and stress tolerance can be increased. These tools and strategies are useful not only for strain development but also for fermentation strategies. Systems biology, synthetic biology and metabolic engineering together help industry to develop resistant and efficient strains [27].
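Titer, yield and productivity are not defined in the text. The sketch below assumes the conventions commonly used for batch fermentations: titer as product concentration (g/L), yield as product formed per substrate consumed (g/g), and volumetric productivity as titer per unit time (g/L/h). The function name and the example numbers are hypothetical.

```python
def fermentation_metrics(product_g, broth_volume_l, substrate_consumed_g, duration_h):
    """Return (titer, yield, productivity) for one batch run.

    Assumed definitions (not taken from the chapter):
      titer        = product mass / broth volume          [g/L]
      yield        = product mass / substrate consumed    [g/g]
      productivity = titer / fermentation time            [g/L/h]
    """
    titer = product_g / broth_volume_l
    product_yield = product_g / substrate_consumed_g
    productivity = titer / duration_h
    return titer, product_yield, productivity


# Hypothetical run: 50 g of product from 200 g of glucose in 2 L over 50 h.
titer, product_yield, productivity = fermentation_metrics(50.0, 2.0, 200.0, 50.0)
print(f"titer={titer} g/L, yield={product_yield} g/g, productivity={productivity} g/L/h")
```

Reporting all three together matters because they can move independently: a long fermentation may reach a high titer at low productivity, and a fast one may waste substrate and lower the yield.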

Systems biology and metabolic engineering
Systems biology focuses on interpreting cellular networks via computational simulations and omics data analysis [28].
Metabolic engineering research aims at engineered strains that produce chemicals, proteins, biofuels or materials of economic value. Metabolic engineering has been successful in developing strains that overproduce the product of interest at lab scale. For industry, however, the strains must be scaled up, and developing an industrial strain that can produce bioproducts is challenging and takes time, effort and money. The combination of metabolic engineering and systems biology is called systems metabolic engineering [24,29].
Traditional metabolic engineering approaches are integrated with systems biology and synthetic biology. Systems biology focuses on genome-scale computational simulation and omics analysis, and synthetic biology focuses on tools at the level of molecules and pathways. Genome engineering and evolutionary engineering focus on stress tolerance [28].
Systems metabolic engineering, the combination of metabolic engineering and systems biology, considers both cell growth and target chemical production in order to reach the desired phenotype. Omics technologies can be applied to engineered strains. Sequencing whole genomes helps to understand the differences between strains; however, the number of differences is large, so it is difficult to find the actual cause of a phenotype and the cost is high. Analysis of the transcriptome, proteome, metabolome and fluxome provides information about the differences between strains because these levels are closely related to the phenotype [9]. Omics technologies are also used in the last step of inverse metabolic engineering, where they reveal differences at the gene level. The candidate genes are then tested by deleting or overexpressing them in the strain, and the result of this experiment shows whether the genetic modification gives the desired phenotype. Omics technologies comprise genomics (the study of an organism's whole genome and the detection of genes), transcriptomics (mRNA detection, for example with gene expression microarrays), proteomics (detection of proteins to understand pathways and networks), metabolomics (detection of global metabolite profiles, that is, the metabolites and products of metabolism in a cell) and fluxomics (detection of metabolic fluxes) [30]. Omics analyses provide information on the cellular and metabolic characteristics of industrial strains and can indicate which genes or pathways are enhanced for the production of the bioproduct of interest [27].
Multi-omics data are used to investigate the characteristics of a strain, to compare genetically engineered strains and to select the best strain for metabolic engineering.
Genome-scale metabolic models (GEMs) describe the gene-protein-reaction (GPR) relationships for all the genes in an organism and help researchers to predict metabolic responses and fluxes in systems-level metabolic studies. GEMs help to develop strains that produce chemicals and drugs, to predict enzyme functions and interactions among cells or organisms, and to understand human diseases [31].
Since whole-genome sequencing became available, genome-scale metabolic models have helped to predict cellular metabolism and function and to identify targets for increasing the production of the compound of interest. The first GEM, for Haemophilus influenzae, was established in 1999. Minimization of metabolic adjustment (MOMA) is one of the widely used algorithms for identifying such targets. OptKnock is another program used for metabolic engineering [27].
Synthetic pathways for the compound of interest can be predicted by pathway prediction algorithms. Examples include the Biochemical Network Integrated Computational Explorer (BNICE), RetroPath, GEM-Path, OptStrain and DESHARKY [27].
Chemical production processes need stable or enhanced enzymes. Computational protein design tools design new or improved enzymes by identifying the core parts of the protein structure and the target sites for engineering [27].
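The gene-protein-reaction idea can be made concrete with a small sketch. It assumes the common convention that subunits of an enzyme complex are joined by "and" while isozymes are joined by "or"; the three-reaction model, the gene names and the function names are hypothetical and are not taken from any published GEM or software package.

```python
def rule_is_satisfied(rule, present_genes):
    """Evaluate a GPR rule given as a gene name or a nested (operator, ...) tuple."""
    if isinstance(rule, str):
        return rule in present_genes
    operator, *operands = rule
    results = [rule_is_satisfied(operand, present_genes) for operand in operands]
    if operator == "and":   # enzyme complex: every subunit gene is required
        return all(results)
    if operator == "or":    # isozymes: any one gene is sufficient
        return any(results)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical toy model: reaction name -> GPR rule.
TOY_MODEL = {
    "R1": "geneA",
    "R2": ("and", "geneB", "geneC"),
    "R3": ("or", "geneD", "geneE"),
}
ALL_GENES = {"geneA", "geneB", "geneC", "geneD", "geneE"}


def active_reactions(model, all_genes, deleted_genes=()):
    """List the reactions that can still be catalysed after deleting some genes."""
    present = set(all_genes) - set(deleted_genes)
    return sorted(name for name, rule in model.items()
                  if rule_is_satisfied(rule, present))


print(active_reactions(TOY_MODEL, ALL_GENES, deleted_genes=["geneB"]))
print(active_reactions(TOY_MODEL, ALL_GENES, deleted_genes=["geneD"]))
```

In a full GEM, the reactions switched off in this way would have their fluxes constrained to zero before fluxes are predicted, which is how in silico gene deletions are typically simulated.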

Synthetic biology and metabolic engineering
Synthetic biology combines engineering and biology to redesign or engineer biological systems [32]. Biomolecular components, networks and pathways are designed and then used to reprogram organisms [33]. Synthetic biology focuses on DNA synthesis and on the design and construction of novel metabolic pathways [27]. Once a synthetic pathway has been constructed, it should be optimized to maximize the yield of the product of interest. The target sites in metabolic pathways are chosen with the help of genome-scale metabolic models (GEMs) and multi-omics analyses, and high-performing, resistant and efficient strains are then constructed with synthetic biology tools [17,27]. Pathway optimization is achieved by controlling gene expression, which can be modulated through gene expression components or regulatory RNAs. Any modification of the 3′ or 5′ untranslated region (UTR), a transcription factor, the promoter, the ribosomal binding site or the terminator can change gene expression. RNA interference (RNAi) is also effective in controlling genes [34]. RNAi is a gene knockdown system in eukaryotes and an important method for metabolic engineering. Double-stranded RNA is cleaved into small interfering RNA (siRNA) by a protein called Dicer. The RNA-induced silencing complex (RISC) then uses the siRNA to reduce mRNA levels: its Argonaute protein binds the small guide RNA, which recognizes the mRNA of the target gene so that it can be degraded. RNAi has been widely used for metabolic engineering in eukaryotic organisms. In a prior study [34], hairpin RNA expression cassettes were constructed to improve itaconic acid production.
A disadvantage of metabolic engineering is that whole-genome sequencing is needed to identify the genetic basis of a trait. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has opened a new era for genome engineering [35] and has had a huge impact on the engineering of microbial cell factories. The endogenous DNA repair pathways, homology-directed repair (HDR) and non-homologous end joining (NHEJ), can be used to insert or delete genes. However, their efficiency is low for the metabolic engineering of microorganisms [36,37,38].
Endonucleases are used to induce double-strand breaks and thereby increase recombination efficiency [39]. A disadvantage of endonucleases is that they do not work well for large genomes. Transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases (ZFNs) are advantageous because of their sequence specificity [38]. TALENs focus on a smaller part of the genome. The CRISPR-Cas9 system is a defense mechanism that bacteria use against viruses. In a class II CRISPR/Cas system, Cas9 is an endonuclease that introduces a double-strand break. Trans-activating CRISPR RNA (tracrRNA) and CRISPR RNA (crRNA) together guide Cas9 to the target region, and Cas9 cleaves the DNA strand that is complementary to the crRNA guide sequence. In genome engineering, the crRNA and tracrRNA are fused into a single guide RNA.
The CRISPR-Cas system is used for metabolic engineering in S. cerevisiae [40]. CRISPR interference (CRISPRi) is a system for gene downregulation in which a catalytically deactivated Cas9 binds to the target DNA and blocks transcription initiation [41].
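How a guide sequence defines a Cas9 target can be shown with a short sequence scan. The sketch rests on assumptions that are not stated in the text: it uses the commonly cited requirement of Streptococcus pyogenes Cas9 for an NGG protospacer-adjacent motif (PAM) directly 3′ of a 20-nucleotide target, it searches the forward strand only, and it ignores off-target effects. The function name and the example sequence are hypothetical.

```python
def find_cas9_targets(dna, guide_len=20):
    """Return (start, protospacer, pam) for every site followed by an NGG PAM.

    Assumption: SpCas9-style targeting, forward strand only.
    """
    dna = dna.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(dna) - guide_len - 2):
        pam = dna[start + guide_len:start + guide_len + 3]
        if pam[1:] == "GG":  # "N" can be any base, so only check the GG
            hits.append((start, dna[start:start + guide_len], pam))
    return hits


# Hypothetical sequence: a 20-nt stretch followed by the PAM "TGG".
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for start, protospacer, pam in find_cas9_targets(example):
    print(start, protospacer, pam)
```

A practical design tool would also scan the reverse complement and score each candidate for uniqueness within the yeast genome before a guide is chosen.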

Biopharmaceuticals and metabolic engineering
The market for bio-based chemicals is estimated to reach USD 97.2 billion by 2023. According to a biopharmaceutical market forecast report, the market size was 239.8 billion in 2019 and is estimated to grow at a CAGR of 13.28% during the period 2020-2025. Biopharmaceuticals represent 25% of commercial drugs and about 40% of total pharmaceutical sales [42]. Total biopharmaceutical sales were over $275 billion in 2019, having doubled from $125 billion in 2012, and are growing at 12 percent annually. Recombinant proteins account for $200 billion of the $275 billion; the rest comes from non-recombinant vaccines as well as blood and plasma products [43]. Sales of most biopharmaceuticals have grown significantly compared with 2011. The top three categories by sales are recombinant proteins, monoclonal antibodies and insulin [44]. The US FDA and the European Medicines Agency have stimulated the biopharmaceutical industry to produce more biopharmaceuticals [43]. Medicines improve human health and longevity, and biopharmaceuticals play an important role in the treatment of many diseases [45]. The first biopharmaceutical was recombinant human insulin produced in Escherichia coli, which Eli Lilly launched on the market in 1982 [46].
Approximately 70% of potential drugs in development target diseases such as diabetes, cancer, and neurological and immunological diseases. The effectiveness of biopharmaceuticals in the treatment of cancer and HIV/AIDS has been observed over the last decade, and deaths due to these diseases have decreased with their use. This has increased the use of biopharmaceuticals in the global market [47].
Therapeutic enzymes, therapeutic proteins, recombinant growth factors, cell and gene therapies, recombinant hormones, synthetic immunomodulators, hormones, monoclonal antibodies and vaccines are biopharmaceuticals that have been used extensively as therapeutic agents [45]. The term "biopharmaceuticals" was coined in the 1980s and refers to pharmaceuticals produced using genetic engineering [48]. Biopharmaceuticals have many advantages, such as targeting only specific molecules, causing fewer side effects, and having high specificity and high activity [49]. Biopharmaceuticals are 100-1000 times larger than conventional drugs. Microbial or mammalian cells are needed to produce therapeutic proteins; consequently, biopharmaceuticals are produced in E. coli, yeast and mammalian cells [50].
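The growth figures quoted above can be cross-checked with the standard compound annual growth rate (CAGR) formula. The sketch assumes yearly compounding, takes 2012-2019 as seven periods and 2019-2025 as six, and leaves the 239.8 billion figure without a currency because the source does not state one. The helper names are hypothetical.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding `rate` once per year for `years` years."""
    return start_value * (1.0 + rate) ** years


# Sales of $125 billion (2012) and $275 billion (2019): seven yearly periods.
implied_rate = cagr(125.0, 275.0, 7)
print(f"implied annual growth 2012-2019: {implied_rate:.1%}")

# 239.8 billion in 2019 growing at 13.28% per year until 2025 (assumed six periods).
projected_2025 = project(239.8, 0.1328, 6)
print(f"projected market size in 2025: {projected_2025:.0f} billion")
```

Under these assumptions the implied rate for 2012-2019 comes out close to the 12 percent annual growth stated in the text, so the two sales figures and the growth rate are mutually consistent.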
Among the cell factories used for biopharmaceutical production, yeast has several advantages. Yeast grows on inexpensive medium compared with mammalian cell culture, which reduces the cost of biopharmaceutical production, and the fermentation technologies used for yeasts are well known and well established [51]. However, yeast also has disadvantages in the production of therapeutic proteins, notably its high glycosylation capability. If this high glycosylation capability is blocked, yeast can produce therapeutic proteins with humanized glycosylation [52].
Cell factories are chosen for their advantages in product quality, scale-up and downstream processes. The advantages of producing proteins in mammalian cells are properly folded protein, good pharmacokinetics and human-like N-glycosylation. However, mammalian cells are sensitive to bioprocessing conditions and their growth medium is expensive [53]. The first human insulin was produced in E. coli. Protein production in E. coli is advantageous because of the inexpensive medium, fast growth, high cell density culture, easy transformation and fast protein production rate. Its disadvantages are incorrect folding, low solubility and low secretion of the protein. S. cerevisiae offers proper folding, easy cultivation and correct post-translational modifications. The product can be secreted into the extracellular medium, which makes protein purification easier. S. cerevisiae is also free of pathogens and is the most commonly used yeast for recombinant protein production. S. cerevisiae is predominantly unicellular; however, many yeasts have both unicellular and multicellular lifestyles [54]. S. cerevisiae is used as a model for diseases and was the first eukaryotic organism whose genome was fully sequenced [55]. Its tolerance to chemical and physical stresses makes it a good model organism for protein production in industry [54].
A bioproduct process comprises four stages: (1) the first upstream process, (2) the second upstream process, (3) the midstream process and (4) the downstream process. Fermentable carbohydrates are converted in the first upstream process. A high-performance strain is developed in the second upstream process. In the midstream process, the strain is grown and produces the product of interest. The desired product is purified in the downstream process. Microorganisms should be optimized to produce the chemical or material of interest efficiently, and the required optimizations and modifications can be applied to the microorganisms by metabolic engineering [17].
Metabolic engineering uses synthetic biology, systems biology and evolutionary engineering to develop highly efficient microbial strains that produce important bioproducts. Target products should be selected first. Bioproducts include biofuels, bulk chemicals, polymers, fine chemicals (including drugs), materials and natural products. Bulk chemicals are low-priced and produced in large quantities; fine chemicals are more expensive and produced in small quantities.
Traditional petroleum-based plastics are produced from fossil fuels and are unsustainable; thus, bioplastics have become popular [56]. Bio-based polymers can replace petroleum-based plastics. Polyhydroxyalkanoates (PHAs) are natural polyesters that are produced in microorganisms and are biodegradable and biocompatible. Microorganisms can be metabolically engineered to produce more PHAs and poly(lactic-co-glycolic acid).
Biofuels are produced by biological and chemical processes. Fatty acid biosynthetic pathways, ethanol pathways, CoA-dependent reverse β-oxidation pathways, keto acid pathways and isoprenoid pathways are key pathways for biofuels. Bioethanol is one such biofuel; it is produced by microorganisms that have been improved by metabolic engineering. E. coli and Saccharomyces cerevisiae have been used for the production of biofuels.
Natural products are classified by structure into alkaloids, terpenoids, phenylpropanoids and polyketides. They are obtained from natural sources, but extraction is costly and gives low yields. Natural products can also be made by chemical synthesis; however, chemical routes can require multistep reactions and generate stereoisomers. Metabolic engineering strategies offer an alternative route to natural products.
The host strain is selected according to the product. Three cases are possible: the host strain already overproduces the desired product, the host strain produces it with low efficiency, or the host strain does not produce it at all. The host strain is then engineered using different metabolic engineering strategies. E. coli and S. cerevisiae are popular host strains for biodiesel production; however, the efficiency is low.
'Generally recognized as safe' (GRAS) microorganisms should be used to produce food and pharmaceutical products for safety reasons. Bacillus subtilis and S. cerevisiae are well-known GRAS organisms [57]. Artemisinic acid, a precursor of the anti-malarial drug artemisinin, is produced in S. cerevisiae by the introduction of heterologous pathways. Opioids have also been produced in S. cerevisiae [58]. Systems metabolic engineering can enhance the production of compounds such as artemisinic acid. Some target pathways do not exist in microorganisms, so the enzymes and metabolic pathways for the desired product must be designed by metabolic engineering. For example, lactams cannot be produced by natural pathways, and de novo pathways must be designed for them. Penicillin is a beta-lactam non-ribosomal peptide. Baker's yeast Saccharomyces cerevisiae can be metabolically engineered to produce and secrete penicillin: five genes of the benzylpenicillin pathway of Penicillium chrysogenum were integrated into S. cerevisiae, which then produced and secreted bioactive benzylpenicillin [59].
If the natural pathways for the desired product are unknown, prediction tools such as GEM-Path, DESHARKY, RetroPath and RetroRules are used for metabolic pathway design. Beneficial mutations should then be identified; colorimetric assays, spectrophotometry, fluorescence-activated cell sorting (FACS) or microfluidic sorting devices can be used for this purpose. Once the metabolic pathway has been constructed in the host strain, it should be optimized. Genome-scale metabolic simulation, plasmids, regulatory RNAs and genome engineering are used to optimize the pathways of the host strain. Recombination-mediated genetic engineering is used to optimize pathways and produce the desired product efficiently. Genome engineering tools include RecABCD system-based homologous recombination, λ Red recombination, site-specific recombination systems such as Cre-lox and flippase-flippase recognition target (Flp-FRT), zinc finger nucleases (ZFNs) and CRISPR/Cas [57]. β-Amyrin, a pentacyclic triterpenoid compound, was produced by an S. cerevisiae strain engineered with CRISPRi [60].
Scale-up fermentation is an important step for biopharmaceuticals. The growth performance of the strain and the optimal fermentation conditions are first validated in lab-scale fermentation (0.5-30 L). After lab-scale fermentation is approved, pilot-scale (30-3000 L) and large-scale (3000-20,000 L) fermentations are performed to examine the behavior of the strain and the product of interest. Full-scale production fermentation (20,000-2,000,000 L) is then performed to produce the biopharmaceutical. In scale-up fermentation, gradients in feed and oxygen concentrations and the maintenance of the genomic stability of high-performing strains are challenges [27].
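The volume ranges above can be encoded as a simple lookup, for example to label fermentation records by scale. The ranges quoted in the text share their boundary values, so the sketch assumes that each boundary volume belongs to the larger scale; that choice and the function name are illustrative assumptions.

```python
# (label, lower bound in litres, upper bound in litres), as quoted in the text.
SCALES = [
    ("lab-scale", 0.5, 30),
    ("pilot-scale", 30, 3_000),
    ("large-scale", 3_000, 20_000),
    ("full-scale", 20_000, 2_000_000),
]


def fermentation_scale(volume_l):
    """Classify a working volume in litres into one of the four scales.

    Assumption: a volume on a shared boundary (e.g. 30 L) counts as the
    larger scale; the overall maximum (2,000,000 L) counts as full-scale.
    """
    for label, low, high in SCALES:
        if low <= volume_l < high:
            return label
    if volume_l == SCALES[-1][2]:
        return SCALES[-1][0]
    raise ValueError(f"volume outside the listed ranges: {volume_l} L")


for volume in (10, 30, 5_000, 2_000_000):
    print(volume, "L ->", fermentation_scale(volume))
```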