Approximate number of identified natural metabolites.
Abstract
Since the 1940s, microbial secondary metabolites (SMs) have attracted the attention of the scientific community, and intensive research has been conducted to discover and identify novel microbial SMs. Since then, however, the discovery of novel SMs has decreased significantly due to several factors: 1) many microbes are unculturable, 2) traditional detection techniques are limited, and 3) not all SMs are expressed in the lab. Finding new techniques that can overcome these challenges has therefore become a high priority. The development of omics-based techniques such as genomics and metabolomics has revealed the potential for discovering novel SMs that are encoded in the microorganisms’ DNA but are not expressed in the lab, or are produced in undetectable amounts, by detecting the biosynthetic gene clusters (BGCs) associated with SM biosynthesis. Nowadays, the integration of metabolomics with gene-editing techniques such as CRISPR-Cas9 provides a successful platform for detecting and identifying known and unknown SMs and for increasing SM production.
Keywords
- metabolomics
- genetic engineering
- secondary metabolites identification
- genomic
- CRISPR-Cas9
- production of secondary metabolites
- microorganisms
- gene editing
1. Introduction
Since the introduction of penicillin into clinical use in the 1940s, microbial secondary metabolites (SMs) have attracted the attention of scientists all over the world. Penicillin proved to be a promising solution for many kinds of infections. As a result, the scientific world started to search for other microbial products that could be used to treat different diseases or be useful in other aspects of our life. The period between the 1940s and the 1960s is therefore referred to as “the golden era of SM discovery” [1, 2]. During the golden era, several SMs were discovered, characterized, and reported, and they are still used today. Unfortunately, after the golden era, the development of approved novel chemical scaffolds of secondary metabolites declined dramatically [1]. The decrease in the detection and identification of microbial secondary metabolites could be due to 1) almost 99% of the microbial community being unculturable [2] because of the difficulty of identifying optimal medium compositions, which means that the majority of SMs are probably still unidentified, and 2) scientists having focused on specific groups of microorganisms such as
All biochemical reactions carried out by an organism are collectively called metabolism, and the products of metabolism are called metabolites. Metabolites fall into two classes, primary and secondary. Primary metabolites are found in all living cells that are able to divide, whereas secondary metabolites are present only incidentally and do not immediately affect the organism’s life. Microbial SMs are low-molecular-mass products with unusual chemical structures that microorganisms usually produce during the late growth phase. They are not essential for the growth and development of the microbe but are associated with other functions such as competition, interactions, and defense [3, 4]. SMs show a variety of biological activities that can be exploited in different fields, for example as antitumor, immunosuppressive, antimicrobial, antiparasitic, and anthelmintic agents and in the food industry. An example of the importance of SMs in our life is the discovery of immunosuppressants such as cyclosporine A, which played a significant role in establishing the organ transplant field.
Nowadays, over 2 million SMs have been identified, with vast diversity in structure, function, and biosynthesis (Table 1). Plants (about 80%) and microbes (approximately 20%) are the primary sources of the secondary metabolites discovered [3]. Actinobacteria and fungi produce the bulk of the microbial SMs discovered to date [5]. Omics-based techniques such as genomics, metabolomics, proteomics, and transcriptomics have helped overcome the problem of unculturable microbes and have revealed that microorganisms have the potential to produce more secondary metabolites than originally expected [6, 7]. Using omics techniques, scientists have been able to detect SMs encoded by gene clusters on chromosomal DNA directly, without culturing the microbes.
Source | All known compounds | Bioactive |
---|---|---|
Plant kingdom | 600,000–700,000 | 150,000–200,000 |
Microbes | Over 50,000 | 22,000–23,000 |
Higher plants | 500,000–600,000 | ~100,000 |
Animal kingdom | 300,000–400,000 | 50,000–100,000 |
Protozoa | Several hundreds | 100–200 |
Vertebrates | 200,000–250,000 | 50,000–70,000 |
Marine animals | 20,000–25,000 | 7000–8000 |
Invertebrates | ~100,000 | NA |
Algae, lichens | 3000–5000 | 1500–2000 |
Insects, worms | 8000–10,000 | 800–1000 |
Owing to developments in genomics and bioinformatics, scientists can now access extensive genetic information and mine genomes for relevant biosynthetic gene clusters (BGCs) with the potential for valuable SM production [8]. Genetic engineering has therefore become widely used and is moving beyond traditional tools, opening a new era in the detection of novel secondary metabolites [9]. By using bioinformatic analysis of the putative secondary metabolite gene clusters in a sequenced genome, scientists have been able to predict new SMs that traditional techniques did not identify, because these SMs are either not produced under laboratory conditions or are produced in amounts too low for traditional techniques to detect [10, 11]. Metabolomics aims to characterize and identify SMs in natural and engineered biosystems.
Metabolomics-based techniques such as mass spectrometry (MS) and nuclear magnetic resonance (NMR) are accurate and can measure low-molecular-weight compounds. Both have been reported as important analytical techniques for detecting secondary metabolites under specific conditions [12]. This chapter provides an overview of metabolomics and genetic engineering techniques, especially CRISPR-Cas9, for the discovery and enhanced production of microbial secondary metabolites.
2. Genetic engineering for SMs detection
The genes associated with the biosynthesis of a secondary metabolite are together called a biosynthetic gene cluster (BGC). A BGC includes all the genetic information required for the regulation, assembly, modification, and biosynthesis of the secondary metabolite [13]. As mentioned previously, not all microorganisms can be cultured in the laboratory, so not all SMs can be obtained with traditional techniques (culturing and detection). In addition, many microbes contain silent or cryptic genes in their genomes that are responsible for the production of secondary metabolites. These silent BGCs are potentially significant for the discovery of novel secondary metabolites [13, 14, 15, 16].
Nowadays, instead of traditional detection techniques, genetic engineering tools are used to identify novel biosynthetic gene clusters (BGCs) [9]. Genetic engineering can be applied in both homologous and heterologous hosts. Gene manipulation in a homologous host allows the retention of factors necessary for the production of SMs, whereas gene manipulation in a heterologous host enables the activation of BGCs obtained from unculturable microorganisms [17].
A variety of genetic engineering techniques have been developed to induce the expression of genes of interest. In the field of metabolite production, several genome-editing techniques have been used to detect secondary metabolites and enhance their production, such as clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9, zinc finger nucleases (ZFNs), and transcription activator-like effector nucleases (TALENs) [18, 19]. While each technique has its advantages and disadvantages (Table 2), CRISPR-Cas9 has been reported to be the most promising technique for the discovery of SMs and the enhancement of their production [9, 17, 20, 21].
CRISPR/Cas9 | Zinc finger nucleases (ZFNs) | Transcription activator-like effector nucleases (TALENs) |
---|---|---|
Does not require any protein engineering steps; testing several gRNAs is very easy | Requires complex protein engineering to test each target | Requires protein engineering steps to test each target |
Introduces double-strand breaks (DSBs), or single-strand nicks with Cas9 nickase, in the target DNA | Induces DSBs in the target DNA | Induces DSBs in the target DNA |
Not required | Required | Required |
CRISPR consists of a single monomeric protein and a chimeric RNA | ZFNs are dimeric proteins that require only a protein component to function | TALENs are also dimeric and require a protein component to function |
Low mutation rate observed | High mutation rate observed in plants | Higher mutation rate than CRISPR |
crRNA, Cas9 protein | Zinc-finger domains; non-specific FokI nuclease domain | TALE repeat domains; non-specific FokI nuclease domain |
20–22 | 18–24 | 24–59 |
High | High | High |
Easy and very fast procedure | Complicated procedure that requires protein engineering expertise | Relatively easy procedure |
Can cleave methylated DNA in human cells; this has received little attention in plants and is an area of particular concern | Unable to do so | The ability of TALENs to cleave methylated DNA remains largely unresolved |
Main advantage: multiple genes can be edited at the same time, and only Cas9 is required | Extremely difficult to achieve with ZFNs | Multiplexing is extremely difficult with TALENs because each target requires distinct dimeric proteins |
2.1 Gene insertion/deletion
Gene insertion or deletion is useful not only for activating biosynthetic gene clusters but also for discovering novel SMs [22]. Several silent biosynthetic gene clusters have been refactored by replacing their promoters to yield natural products such as secondary metabolites [23, 24, 25, 26].
A promising technique recently developed in the genetic engineering field is multiplexed CRISPR-Cas9 and transformation-associated recombination (TAR)-mediated promoter engineering (mCRISTAR) [21, 27, 28, 29, 30]. mCRISTAR combines the advantages of the TAR and CRISPR-Cas9 techniques. In mCRISTAR, CRISPR-Cas9 introduces double-strand breaks in the promoter regions of the biosynthetic gene cluster (BGC), and the resulting fragments are then reassembled by TAR with synthetic, gene-cluster-specific promoter cassettes [21].
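As an illustration of how candidate Cas9 cut sites in a promoter region might be located, the following minimal Python sketch scans a DNA sequence for 20-nt protospacers that are immediately followed by an NGG protospacer-adjacent motif (PAM). The NGG PAM (the motif recognized by the commonly used Streptococcus pyogenes Cas9), the fixed 20-nt length, the example sequence, and all function names are assumptions made for illustration; none of them is taken from the mCRISTAR protocol [21].

```python
import re

PROTOSPACER_LEN = 20  # assumed guide length; Table 2 lists 20-22 for CRISPR/Cas9


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, length=PROTOSPACER_LEN):
    """List (strand, start, protospacer) for every site followed by an NGG PAM.

    Start positions on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead makes the scan report overlapping candidates as well.
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1)))
    return sites


# Hypothetical promoter fragment: flank + 20-nt protospacer + TGG PAM + flank.
promoter = "TTTT" + "ACGT" * 5 + "TGG" + "AAAA"
sites = find_target_sites(promoter)
for strand, start, protospacer in sites:
    print(strand, start, protospacer)
```

A real guide-design workflow would also score candidates for off-target matches elsewhere in the genome, which this sketch does not attempt.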
2.2 Gene cloning
Gene cloning consists of the following steps: 1) determining a suitable heterologous host, 2) cloning the target gene, 3) transferring the gene into the host, 4) expressing the gene in the host system, and 5) optimizing production [31].
Many new and useful cloning techniques have been introduced, such as transformation-associated recombination (TAR), Cas9-assisted targeting of chromosome segments (CATCH), and TAR-CRISPR [20, 32, 33]. CATCH is a cloning tool that uses the CRISPR-Cas9 system for direct cloning of BGCs into the host. Compared with PCR and restriction enzyme cloning techniques, CATCH appears to be more useful for the direct cloning of large gene clusters. The TAR technique has been used for about a decade to clone large BGCs, but it is associated with low cloning efficiency [20, 33]. To address this challenge, TAR and CRISPR-Cas9 have been coupled in a new approach called TAR-CRISPR [33], and a significant increase in cloning efficiency has been reported as a result [33]. TAR-CRISPR is a yeast-based cloning method; it differs from mCRISTAR, discussed earlier, which uses CRISPR-Cas9 to introduce double-strand breaks in the promoter regions of the BGC so that TAR can reassemble the fragments with synthetic, gene-cluster-specific promoter cassettes. TAR-CRISPR cloning is expected to support further development of BGC cloning and SM production.
While gene-editing techniques play a significant role in the detection and production of microbial secondary metabolites, metabolomics is also important in the identification and characterization of secondary metabolites produced by native or genetically modified microorganisms.
3. Identification and characterization of secondary metabolites
The identification and characterization of secondary metabolites are important. Metabolomics often requires a broad array of instrumentation, such as evaporative light scattering detectors (ELSD) for detecting lipids, coulometric array detectors for detecting redox-active compounds, and fluorescence spectrometers for detecting aromatic compounds, whereas other omics approaches such as genomics, transcriptomics, or proteomics can often be carried out with a single instrument.
Investigations of microbial secondary metabolites mainly follow one of two approaches, targeted or untargeted metabolite identification [34]. As the name suggests, a targeted experiment aims to identify a specific group of SMs that are already known, whereas an untargeted experiment aims to identify, on a large scale, the SMs produced by microorganisms, including both novel and known metabolites [35].
Nowadays, two general technologies are used as the primary tools in metabolomics: mass spectrometry (MS) and nuclear magnetic resonance (NMR) [4, 36, 37].
These high-throughput tools provide broad coverage of many classes of secondary metabolites, including amino acids, lipids, sugars, organic acids, and others.
Both NMR and MS have been used to identify targeted and untargeted secondary metabolites [38], and they are often complementary. MS provides information on molecular mass, whereas NMR can be used to differentiate between structural isomers [39]. MS is more sensitive than NMR and can detect a larger range of metabolites, while NMR is highly quantitative and reproducible but requires a larger amount of sample than MS [40, 41].
4. Data analysis
A major challenge in metabolomic experiments is the huge amount of information obtained from either NMR spectroscopy or MS [7, 37]. Computer software is therefore crucial for organizing this vast amount of data and extracting the significant information generated by NMR and MS [40, 42].
Because studying metabolites one at a time is impractical for visualizing changes between groups of metabolites, multivariate statistical approaches can be used to interpret the results. Principal component analysis (PCA) is one of the most extensively used of these approaches [39, 43, 44]. PCA simplifies the data without losing its core features and provides information on multivariate differences among secondary metabolites, whereas univariate statistical tests, such as the non-parametric Wilcoxon signed-rank and Kruskal–Wallis tests and the parametric Student’s t-test and ANOVA, can be used to analyze individual metabolites [45].
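To make the PCA step more concrete, the sketch below computes the first principal component of a small intensity table using only the Python standard library. The intensity values, the two sample groups, and the function names are hypothetical, and power iteration is used here simply as one convenient way to obtain the leading eigenvector of the covariance matrix; dedicated statistics packages would normally be used for real data.

```python
import math

# Hypothetical peak intensities (rows: samples; columns: metabolites A, B, C).
# Rows 0-2: wild-type cultures; rows 3-5: an engineered strain.
intensities = [
    [10.0, 5.0, 8.0],
    [11.0, 5.2, 7.5],
    [9.5, 4.9, 8.2],
    [20.0, 5.1, 3.0],
    [21.0, 5.0, 2.5],
    [19.5, 5.3, 3.2],
]


def center_columns(rows):
    """Subtract each column mean so every metabolite has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (divisor n - 1) of mean-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leading_component(cov, iterations=200):
    """Power iteration: returns the unit-length first PC and its eigenvalue."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue


centered = center_columns(intensities)
cov = covariance_matrix(centered)
pc1, eigenvalue = leading_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]

print("PC1 loadings:", [round(w, 3) for w in pc1])
print("Explained variance ratio:", round(explained, 3))
print("PC1 scores:", [round(s, 2) for s in scores])
```

In this toy data set the two groups separate along the first component because metabolites A and C change in opposite directions, which is the kind of group-level pattern PCA is used to reveal.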
Nowadays, most metabolites can be identified thanks to the development of many bioinformatics software tools. Two types of metabolite identification are applied: 1) definitive identification and 2) putative identification [7]. Many metabolomics databases are available online. Some are used for NMR data, such as the Biological Magnetic Resonance Data Bank (http://www.bmrb.wisc.edu/metabolomics/), while others are used for MS data, such as MassBank (http://www.massbank.jp), the Golm Metabolome Database (GMD, http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/gmd.html), NIST (http://www.nist.gov/srd/nist1a.htm), METLIN (http://metlin.scripps.edu), and MMCD (http://mmcd.nmrfam.wisc.edu) [46].
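One common way to arrive at a putative identification from MS data is to compare an observed m/z value with theoretical values from a reference list and to accept matches within a small mass error expressed in parts per million (ppm). The sketch below illustrates that idea only. The three-compound reference table, the 5 ppm tolerance, the observed value, and the function names are assumptions for illustration, and the code does not query any of the databases named above.

```python
# Monoisotopic masses (Da) of the elements used below, and the proton mass.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}
PROTON = 1.007276

# Hypothetical local reference list: compound name -> elemental composition.
REFERENCE = {
    "penicillin G": {"C": 16, "H": 18, "N": 2, "O": 4, "S": 1},
    "glucose": {"C": 6, "H": 12, "O": 6},
    "caffeine": {"C": 8, "H": 10, "N": 4, "O": 2},
}


def protonated_mz(formula):
    """Theoretical m/z of the singly protonated ion [M+H]+."""
    neutral = sum(MONOISOTOPIC[element] * count
                  for element, count in formula.items())
    return neutral + PROTON


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def putative_matches(observed_mz, tolerance_ppm=5.0):
    """Return (name, ppm error) for reference ions within the tolerance."""
    hits = []
    for name, formula in REFERENCE.items():
        error = ppm_error(observed_mz, protonated_mz(formula))
        if abs(error) <= tolerance_ppm:
            hits.append((name, round(error, 2)))
    return hits


observed = 335.1065  # hypothetical measured m/z
hits = putative_matches(observed)
print(hits)
```

A mass match of this kind supports only a putative identification; a definitive identification would additionally require comparison with an authentic standard, for example by retention time, fragmentation pattern, or NMR.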
5. Conclusion
Microorganisms are one of the most significant sources of SMs, which play important roles in many aspects of our life, including pharmaceutical, biomedical, and food applications. The integration of genetic engineering and metabolomics provides a powerful platform for the production, detection, and characterization of known and unknown secondary metabolites. In particular, the combination of CRISPR-Cas9 and metabolomics may improve the efficiency of microbial SM discovery. What is still needed is a comprehensive and sensitive technique that can provide complete information on any secondary metabolite under all conditions.
References
- 1. Li JWH, Vederas JC. Drug discovery and natural products: End of an era or an endless frontier? Biomeditsinskaya Khimiya. 2011;57(2):148-160
- 2. Pelaez F. The historical delivery of antibiotics from microbial natural products - Can history repeat? Biochemical Pharmacology. 2006;71(7):981-990
- 3. McMurry JE. Organic chemistry with biological applications. In: Secondary Metabolites: An Introduction to Natural Products Chemistry. Stamford, USA: Cengage Learning Ltd; 2015. pp. 1016-1046
- 4. Berg M, Vanaerschot M, Jankevics A, Cuypers B, Breitling R, Dujardin J-C. LC-MS metabolomics from study design to data-analysis – Using a versatile pathogen as a test case. Computational and Structural Biotechnology Journal. 2013;4:e201301002
- 5. Bérdy J. Bioactive microbial metabolites. The Journal of Antibiotics. 2005;58(1):1-26
- 6. Putri SP, Nakayama Y, Matsuda F, Uchikata T, Kobayashi S, Matsubara A, et al. Current metabolomics: Practical applications. Journal of Bioscience and Bioengineering. 2013;115:579-589
- 7. Go EP. Database resources in metabolomics: An overview. Journal of Neuroimmune Pharmacology. 2010;5(1):18-30
- 8. Blin K, Kim HU, Medema MH, Weber T. Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Briefings in Bioinformatics. 2019;20:1103-1113
- 9. Tong Y, Weber T, Lee SY. CRISPR/Cas-based genome engineering in natural product discovery. Natural Product Reports. 2019;36:1262-1280
- 10. Lim FY, Sanchez JF, Wang CCC, Keller NP. Toward awakening cryptic secondary metabolite gene clusters in filamentous fungi. Methods in Enzymology. 2012;517:303-324
- 11. Shuikan AM, Hozzein WN, Alzharani MM, Sandouka MN, Al Yousef SA, Alharbi SA, et al. Enhancement and identification of microbial secondary metabolites. In: Extremophilic Microbes and Metabolites - Diversity, Bioprospecting and Biotechnological Applications. London: IntechOpen; 2020
- 12. Lenders J, Frédérich M, De Tullio P. Nuclear magnetic resonance: A key metabolomics platform in the drug discovery process. Drug Discovery Today: Technologies. 2015;13:39-46
- 13. Bino RJ, Hall RD, Fiehn O, Kopka J, Saito K, Draper J, et al. Potential of metabolomics as a functional genomics tool. Trends in Plant Science. 2004;9(9):418-425
- 14. Tran PN, Yen MR, Chiang CY, Lin HC, Chen PY. Detecting and prioritizing biosynthetic gene clusters for bioactive compounds in bacteria and fungi. Applied Microbiology and Biotechnology. 2019;103:3277-3287
- 15. Valayil JM. Activation of microbial silent gene clusters: Genomics driven drug discovery approaches. Biochem Anal Biochem. 2016;5:276
- 16. Rutledge PJ, Challis GL. Discovery of microbial natural products by activation of silent biosynthetic gene clusters. Nature Reviews. Microbiology. 2015;13:509-523
- 17. Zhang MM, Wang Y, Ang EL, Zhao H. Engineering microbial hosts for production of bacterial natural products. Natural Product Reports. 2016;33:963-987
- 18. Miller JC, Holmes MC, Wang J, Guschin DY, Lee YL, Rupniewski I, et al. An improved zinc-finger nuclease architecture for highly specific genome editing. Nature Biotechnology. 2007;25:778-785
- 19. Jankele R, Svoboda P. TAL effectors: Tools for DNA targeting. Briefings in Functional Genomics. 2014;13:409-419
- 20. Yamanaka K, Reynolds KA, Kersten RD, Ryan KS, Gonzalez DJ, Nizet V, et al. Direct cloning and refactoring of a silent lipopeptide biosynthetic gene cluster yields the antibiotic taromycin A. Proceedings of the National Academy of Sciences. 2014;111:1957-1962
- 21. Kang HS, Charlop-Powers Z, Brady SF. Multiplexed CRISPR/Cas9- and TAR-mediated promoter engineering of natural product biosynthetic gene clusters in yeast. ACS Synthetic Biology. 2016;5:1002-1010
- 22. Voytas DF. Plant genome engineering with sequence-specific nucleases. Annual Review of Plant Biology. 2013;64:327-350
- 23. Lee NC, Larionov V, Kouprina N. Highly efficient CRISPR/Cas9-mediated TAR cloning of genes and chromosomal loci from complex genomes in yeast. Nucleic Acids Research. 2015;43:e55-e55
- 24. Horbal L, Marques F, Nadmid S, Mendes MV, Luzhetskyy A. Secondary metabolites overproduction through transcriptional gene cluster refactoring. Metabolic Engineering. 2018;49:299-315
- 25. Shao Z, Rao G, Li C, Abil Z, Luo Y, Zhao H. Refactoring the silent spectinabilin gene cluster using a plug-and-play scaffold. ACS Synthetic Biology. 2013;2:662-669
- 26. Bauman KD, Li J, Murata K, Mantovani SM, Dahesh S, Nizet V, et al. Refactoring the cryptic streptophenazine biosynthetic gene cluster unites phenazine, polyketide, and nonribosomal peptide biochemistry. Cell Chemical Biology. 2019;26:724-736
- 27. Pohl C, Kiel JAKW, Driessen AJM, Bovenberg RAL, Nygard Y. CRISPR/Cas9 based genome editing of Penicillium chrysogenum. ACS Synthetic Biology. 2016;5:754-764
- 28. Sander JD, Joung JK. CRISPR-Cas systems for editing, regulating and targeting genomes. Nature Biotechnology. 2014;32:347-355
- 29. Zhang MM, Wong FT, Wang Y, Luo S, Lim YH, Heng E, et al. CRISPR–Cas9 strategy for activation of silent Streptomyces biosynthetic gene clusters. Nature Chemical Biology. 2017;13:607
- 30. Li L, Zheng G, Chen J, Ge M, Jiang W, Lu Y. Multiplexed site-specific genome engineering for overproducing bioactive secondary metabolites in actinomycetes. Metabolic Engineering. 2017;40:80-92
- 31. Greunke C, Duell ER, D’Agostino PM, Glöckle A, Lamm K, Gulder TAM. Direct pathway cloning (DiPaC) to unlock natural product biosynthetic potential. Metabolic Engineering. 2018;47:334-345
- 32. Jiang W, Zhao X, Gabrieli T, Lou C, Ebenstein Y, Zhu TF. Cas9-assisted targeting of chromosome segments CATCH enables one step targeted cloning of large gene clusters. Nature Communications. 2015;6:810
- 33. Bonet B, Teufel R, Crüsemann M, Ziemert N, Moore BS. Direct capture and heterologous expression of Salinispora natural product genes for the biosynthesis of enterocin. Journal of Natural Products. 2015;78:539-542
- 34. Breitling R, Ceniceros A, Jankevics A, Takano E. Metabolomics for secondary metabolite research. Metabolites. 2013;3:1076-1083
- 35. Wu C, Kim HK, van Wezel GP, Choi YH. Metabolomics in the natural products field – A gateway to novel antibiotics. Drug Discovery Today: Technologies. 2015;13:11-17
- 36. Dunn WB, Broadhurst DI, Atherton HJ, Goodacre R, Griffin JL. Systems level studies of mammalian metabolomes: The roles of mass spectrometry and nuclear magnetic resonance spectroscopy. Chemical Society Reviews. 2011;40(1):387-426
- 37. Midelfart A, Dybdahl A, Gribbestad IS. Metabolic analysis of the rabbit cornea by proton nuclear magnetic resonance spectroscopy. Ophthalmic Research. 1996;28(5):319-329
- 38. Dettmer K, Aronov PA, Hammock BD. Mass spectrometry based metabolomics. Mass Spectrometry Reviews. 2007;26(1):51-78
- 39. Alia A, Ganapathy S, de Groot HJ. Magic angle spinning (MAS) NMR: A new tool to study the spatial and electronic structure of photosynthetic complexes. Photosynthesis Research. 2009;102(2-3):415-425
- 40. Dunn WB, Broadhurst D, Begley P, Zelena E, Francis-McIntyre S, Anderson N, et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols. 2011;6(7):1060-1083
- 41. Johnson CH, Gonzalez FJ. Challenges and opportunities of metabolomics. Journal of Cellular Physiology. 2012;227(8):2975-2981
- 42. Lu W, Bennett BD, Rabinowitz JD. Analytical strategies for LC-MS-based targeted metabolomics. Journal of Chromatography. B, Analytical Technologies in the Biomedical and Life Sciences. 2008;871(2):236-242
- 43. Hotelling H. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology. 1933;24:417-441
- 44. Young SP, Wallace GR. Metabolomic analysis of human disease and its application to the eye. Journal of Ocular Biology, Diseases, and Informatics. 2009;2(4):235-242
- 45. Broadhurst DI, Kell DB. Statistical strategies for avoiding false discoveries in metabolomics and related experiments. Metabolomics. 2006;2(4):171-196
- 46. Brown M, Dunn WB, Dobson P, Patel Y, Winder CL, Francis-McIntyre S, et al. Mass spectrometry tools and metabolite-specific databases for molecular identification in metabolomics. Analyst. 2009;134(7):1322-1332