Examples of different types of biological samples used for metabolomics analysis.
The center of multiomics is moving from genomics to phenomics (Figure 1). The proteome and the metabolome are the two main components of the phenome and are equally important. The concept and development of proteoforms have significantly enriched the content of the proteome. A book entitled “Proteoforms: Concept and Applications in Medical Sciences” has been published, focusing on proteomics at the proteoform level. That work motivated the editor to prepare another book focusing on metabolomics, which discusses (i) the methodology of metabolomics, including sample preparation, targeted metabolomics, and untargeted metabolomics based on nuclear magnetic resonance (NMR) or mass spectrometry (MS), and (ii) the applications of metabolomics in the research and practice of life science and medical science.
Metabolomics, an important aspect of phenomics, is the theory and methodology for studying the metabolome. It includes identifying the biochemical and molecular characteristics of the metabolome, characterizing interactions among different metabolites or between metabolites and genetic/environmental factors, and evaluating the biochemical mechanisms related to a given condition such as different pathophysiological processes. The metabolome contains all metabolites derived from nucleic acids, proteins, lipids, and sugars in a given cell, tissue, biological system, or body fluid. The metabolites in a metabolome interact with one another in enzymatic reaction systems to form metabolic networks. Metabolomic variation is associated with multiple factors, including genetic, environmental, internal, external, drug, and dietary factors. Current studies on metabolomic variations are still insufficient in both breadth and depth. It is necessary to develop high-sensitivity, high-throughput, and high-reproducibility methodology to maximize the coverage of metabolomic variations. Studies on metabolomic variations can directly help to clarify the molecular mechanisms of a disease, determine reliable therapeutic targets, and discover reliable biomarkers for precise prediction, diagnosis, and prognostic assessment in the context of predictive, preventive and personalized medicine (3P medicine, PPPM).
2. Importance of metabolomic variations in medical science
The metabolome contains all metabolites derived from nucleic acids, proteins, lipids, and sugars in a given cell, tissue, biological system, or body fluid [4, 5, 6]. The metabolites in a metabolome interact with one another in enzymatic reaction systems to form metabolic networks. Changes in metabolites are associated with multiple factors, including internal, external, genetic, environmental, drug, and dietary factors. Metabolomics is the theory and methodology for studying the metabolome, including identifying the biochemical and molecular characteristics of the metabolome, characterizing interactions among different metabolites or between metabolites and genetic/environmental factors, and evaluating the biochemical mechanisms related to a given condition such as different pathophysiological processes. Compared with a baseline metabolic profile, metabolomic variations can reflect the status of physiological and pathological processes, monitor the progression of a disease, and predict and assess drug effects, which benefits disease stratification and personalized/precision medicine in the context of PPPM.
3. Samples used to measure metabolomic variations
The biological samples used to measure metabolomic variations are diverse and complex, and include extracts from different cells, tissues, and body fluids (Table 1). Urine and serum/plasma [6, 17, 18] are the body fluids most commonly used to analyze the metabolome in different diseases because they are readily available, easy to prepare, and can be collected noninvasively or with minimal invasiveness. In addition, tears are good samples for analyzing the metabolome in eye diseases, exhaled air [20, 21] in pulmonary, airway, or other diseases, saliva in oral diseases, synovial fluid in arthritis, and cerebrospinal fluid (CSF) in nervous system diseases. In general, many biological samples are suitable for the metabolomics analysis of a disease. Metabolomics studies based on these different samples can directly or indirectly reflect the status of a disease, and may be used to understand its molecular mechanisms and to discover therapeutic targets and reliable biomarkers for prediction, diagnosis, and prognostic evaluation.
|Biological sample||Methods||Main results||References|
|HeLa cells||Gas cluster ion beam-secondary ion MS (GCIB-SIMS)||Purinosomes comprise nine enzymes that act synergistically, channeling the pathway intermediates to synthesize purine nucleotides, increasing the pathway flux, and influencing the adenosine monophosphate/guanosine monophosphate ratio.|||
|Carcinoma and adjacent normal tissues||UHPLC-Orbitrap MS||This method enables targeted profiling of over 400 biologically important metabolites covering 92 metabolic pathways|||
|Sweat||GC–MS and LC–MS/MS||As most of the identified metabolites are involved in key biochemical pathways, this study opens interesting possibilities to the use of dry sweat as a source of metabolite markers for specific disorders.|||
|Urine and plasma||HPLC-ESI-qTOF-MS||A total of 31 and 38 metabolites in plasma and urine, respectively, showed significant differences between healthy volunteers and Sjögren’s Syndrome patients and were proposed for their identification.|||
|Cerebrospinal fluid (CSF)||GC–MS and LC–MS/MS||A total of 274 CSF-derived metabolites were common to the discovery and replication cohorts in cancer-related fatigue.|||
|Saliva||UHPLC-qTOF-MS||The study identified and classified a total of 211 endogenous and exogenous salivary metabolites. The results reveal a distinct metabolite profile of dog and human saliva as 25 lipid compounds were identified only in canine saliva and eight dipeptides only in human saliva.|||
|Sputum||LC–MS/MS||The KEGG analysis revealed that the glycerophospholipid metabolism pathway was downregulated in severe COPD. Due to the critical role of glycerophospholipid metabolism in oxidative stress, significant negative correlations were discovered between glycerophospholipid metabolites and three oxidative stress products (SOD, MPO, and 8-iso-PGF2α). The diagnostic values of SOD, MPO, and 8-iso-PGF2α in induced sputum were found to exhibit high sensitivities and specificities in the prediction of COPD severity.|||
|Blood||1D 1H NMR spectroscopy methods||This has led to the absolute quantitation of nearly 70 metabolites in serum and plasma and nearly 80 in whole blood.|||
4. Methods used to measure metabolomic variations
Appropriate analytical methods are essential for detecting, identifying, and quantifying metabolomic variations in a given condition, for example, a disease state versus a control. These methods are mainly classified into targeted metabolomics and untargeted metabolomics. (i) Targeted metabolomics is a hypothesis-driven approach that mainly quantifies variations of known metabolites in a metabolome (such as metabolites derived from one or more known metabolic pathways) between or among research groups, and then uses multivariate statistical analysis to establish mathematical models. Such a model is then used to discriminate disease from healthy controls, treated from untreated groups, or different stages of a disease. The methods most often used for targeted metabolomics are selected/multiple reaction monitoring (SRM/MRM) analyses with optimized sample extraction and liquid chromatography-mass spectrometry (LC–MS) conditions on a triple quadrupole mass spectrometer (QqQ-MS). (ii) Untargeted metabolomics is a non-hypothesis-driven approach that globally detects, identifies, and quantifies metabolite variations in the metabolome of a biological system with as little bias as possible. It benefits the understanding of the molecular mechanisms of a disease and the discovery of new therapeutic targets/drugs and metabolite biomarkers for effective prediction, diagnosis, and prognosis. The methods most often used for untargeted metabolomics are mass spectrometry (MS)-based methods [6, 29] and nuclear magnetic resonance (NMR)-based methods [30, 31] (Figure 2).
(a) MS-based methods include ion mobility coupled with MS (IM-MS), which measures time, mass-to-charge ratio (m/z), and intensity variables; capillary electrophoresis coupled with MS (CE-MS), which measures time, m/z, and intensity variables [29, 32, 33]; gas chromatography coupled with MS (GC–MS), which measures time, m/z, and intensity variables [29, 34]; liquid chromatography coupled with MS (LC–MS), which measures retention time (RT), m/z, and intensity variables [26, 29, 35]; and direct injection coupled with MS (DI-MS), which measures m/z and intensity variables. IM-MS uses a buffer gas and a uniform or periodic electric field to separate ions according to their size and shape, followed by MS analysis. It is a very high-throughput and highly selective method that can readily separate isomeric and isobaric compounds. CE-MS uses electrokinetics to separate polar molecules, followed by MS analysis. It is a very good method for analyzing polar molecules in aqueous samples and for measuring inorganic and organic anions, with low running costs but relatively low throughput. GC–MS uses gas chromatography to separate molecules, followed by MS analysis. It is suitable for apolar and volatile compounds. Its advantages are the availability of universal databases for identification, high sensitivity, and high reproducibility; its disadvantages are that it detects only apolar and volatile compounds, requires derivatization of polar compounds, has low ionization discrimination, and requires a larger amount of sample. LC–MS uses liquid chromatography to separate molecules, followed by MS analysis.
LC–MS is suitable for polar to hydrophobic compounds. Its advantages are a minimal sample requirement, high sensitivity, high throughput, and flexibility in column chemistry, which widens the range of detectable compounds; its disadvantages are high ionization discrimination, the lack of large metabolite databases, and the need for specific chromatographic conditions for very polar molecules. DI-MS couples a nanospray source directly to MS and does not require chromatographic separation. Its advantages are a low sample volume requirement, high sensitivity, high throughput, and low cost; its disadvantages are high ionization discrimination, significant ion suppression, and an inability to separate isomeric and isobaric species. (b) NMR-based methods include one-dimensional, two-dimensional, and three-dimensional NMR methods (1D-NMR, 2D-NMR, and 3D-NMR), which use the interaction of spin-active nuclei (13C, 1H, 31P, 19F) with electromagnetic fields to obtain structural, chemical, and molecular environment information [30, 31]. Their advantages are that they are non-destructive, need minimal sample preparation, are highly reproducible, offer relatively high throughput, provide molecular dynamic and compartmental information through diffusional methods, and are supported by available databases; their disadvantages are low sensitivity, overlapping metabolite signals, and high instrumentation cost. MS-based and NMR-based methods are complementary for metabolomics analysis, and both produce very complex data. The processing, analysis, and annotation of these data are crucial steps in discovering potential and important metabolic biomarkers [37, 38]. However, compared with NMR-based metabolomics, MS-based metabolomics has a relatively low cost, high sensitivity and resolution, and very good analytical performance for measuring metabolomic variations in PPPM or PM practice.
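The description of MS features as retention time, m/z, and intensity can be made concrete with a small sketch. It is not taken from any cited study: the `Feature` class, the `annotate` helper, the 10 ppm tolerance, and the three-entry candidate table are all hypothetical, and the neutral masses are approximate monoisotopic values used only for illustration. It shows how an observed m/z might be matched against expected [M+H]+ ions within a ppm tolerance.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da, approximate mass added for an [M+H]+ ion


@dataclass(frozen=True)
class Feature:
    """One LC-MS feature: retention time (min), m/z, and intensity."""
    rt_min: float
    mz: float
    intensity: float


# Hypothetical mini-database: approximate monoisotopic neutral masses in Da.
CANDIDATES = {
    "hypoxanthine": 136.0385,
    "glucose": 180.0634,
    "citric acid": 192.0270,
}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(feature, candidates=CANDIDATES, tol_ppm=10.0):
    """Return candidate names whose [M+H]+ m/z is within tol_ppm of the feature."""
    hits = []
    for name, neutral_mass in candidates.items():
        theoretical = neutral_mass + PROTON_MASS
        if abs(ppm_error(feature.mz, theoretical)) <= tol_ppm:
            hits.append(name)
    return hits


# Synthetic features, invented for this example.
features = [
    Feature(rt_min=1.2, mz=181.0707, intensity=5.0e5),
    Feature(rt_min=3.4, mz=137.0458, intensity=2.1e5),
    Feature(rt_min=6.8, mz=250.1000, intensity=9.0e4),
]

for f in features:
    print(f.rt_min, f.mz, annotate(f))
```

A mass match of this kind cannot distinguish isomers that share an elemental formula, which is why the separation dimension (chromatography, electrophoresis, or ion mobility) matters and why DI-MS alone cannot resolve isomeric and isobaric species.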
5. Applications of metabolomics in life science and medical science
The metabolome is an important component of the phenome. Metabolomics conducts qualitative and quantitative analysis of all small-molecule metabolites in organisms, and searches for the relationships between metabolites and physiological and pathological changes. Its subjects are mostly small molecules with molecular weights of less than 1,000 Da. With the development of high-throughput technology, the study of living organisms has developed from single small molecules to multiomics, such as genomics, transcriptomics, proteomics, and metabolomics. Multiomics reflects molecular changes in a disease or biological process, and the molecules that can be identified may serve as valuable biomarkers. Metabolites are substances produced or consumed through metabolic processes. They are the final expression products, subject to genetic control and environmental influence. Because they carry the imprints of genomic, transcriptomic, epigenetic, and environmental effects, they are described as “associations between genotypes and phenotypes”. Metabolomics has been extensively applied in medical science and life science (Table 2), as well as in agriculture, food safety, and other fields. Metabolites, as the end products of gene expression, have been implicated in many diseases. For example, metabolomics has great potential for diabetes research: metabolic markers hold the potential to detect diabetes-related complications under subclinical conditions in the general population. Metabolomics is also used to identify key disease-related and disease-progression-related metabolic changes; defining metabolic changes during the Alzheimer’s disease (AD) trajectory and their relationship to clinical phenotypes has provided a powerful roadmap for drug and biomarker discovery.
Carmen Peña-Bautista’s work shows that untargeted analysis of human plasma samples from patients with early Alzheimer’s disease and healthy individuals, combined with sophisticated statistical tools, identified some metabolic pathways and plasma biomarkers. Nina P Paynter’s work shows that metabolomics also has important applications in cancer. Life processes are accompanied by metabolism, such as glycolysis and protein synthesis and metabolism. These fundamental features of cellular metabolism are reprogrammed in cancer cells to support their pathological levels of growth and proliferation. Metabolic reprogramming in malignant cells is likely the result of the multifactorial effects of genomic alterations (i.e., mutations of oncogenes and tumor suppressors), the tumor microenvironment (which imposes metabolic stress caused by compromised nutrient and oxygen availability), and other influences. We need to understand the complete breadth of metabolic abnormalities in cancer because some metabolic changes provide opportunities to develop novel therapeutic targets and predictive biomarkers. As mentioned in Yousra Ahmed-Salim’s study, combinations of more than one significant metabolite as a panel generally achieved, across different studies, a higher sensitivity and specificity for diagnosis than a single metabolite. Metabolomics has become a powerful platform for studying tissue samples. A common application of metabolomics is the discovery of biomarkers for diagnosis or for prediction of treatment sensitivity and prognosis. For example, Yousra Ahmed-Salim et al. conducted a systematic review of the application of metabolomics in the treatment of ovarian cancer. The most frequently described metabolite differences between the biological fluids and tissues of patients with ovarian cancer and those of healthy controls have been in phospholipids.
Su et al. interrogated metabolomics and gene-expression data from the NCI-60 cell lines to study relationships between metabolites and transcripts. They observed that the metabolome can distinguish cancer subtypes and that metabolite levels correlate well with gene expression under strong correlation models. In conclusion, through the high-throughput study of metabolites in organisms, with abundant sources of samples, metabolomics can more accurately determine the pathophysiological changes of diseases and identify effective biomarkers, and thereby further the understanding of the molecular mechanisms of diseases. It is thus beneficial to the prevention, diagnosis, and treatment of diseases.
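The observation that a panel of metabolites can achieve higher sensitivity and specificity than a single metabolite can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the eight samples are synthetic and deliberately constructed, the three metabolites are hypothetical, and the threshold of 1.2 and the "at least two of three elevated" rule are arbitrary choices, not a method from the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) from boolean truth and prediction lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # cases called positive
    fn = sum(1 for t, p in pairs if t and not p)      # cases missed
    tn = sum(1 for t, p in pairs if not t and not p)  # controls called negative
    fp = sum(1 for t, p in pairs if not t and p)      # controls called positive
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic data: (is_case, metabolite_1, metabolite_2, metabolite_3), arbitrary units.
samples = [
    (True, 2.1, 1.8, 0.9),
    (True, 0.8, 1.9, 1.7),
    (True, 1.6, 0.7, 1.5),
    (True, 1.9, 1.6, 1.8),
    (False, 1.7, 0.6, 0.5),
    (False, 0.5, 0.8, 0.7),
    (False, 0.9, 0.4, 1.6),
    (False, 0.6, 0.9, 0.8),
]
THRESHOLD = 1.2  # arbitrary cutoff, for illustration only

y_true = [s[0] for s in samples]
# Single-marker rule: metabolite_1 above the threshold.
single = [s[1] > THRESHOLD for s in samples]
# Panel rule: at least two of the three metabolites above the threshold.
panel = [sum(v > THRESHOLD for v in s[1:]) >= 2 for s in samples]

print("single marker:", sensitivity_specificity(y_true, single))
print("panel:        ", sensitivity_specificity(y_true, panel))
```

On this constructed data set the panel rule classifies every sample correctly while the single marker misses one case and misclassifies one control. Real data offer no such guarantee, and any panel would need validation in independent cohorts.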
|Metabolomics methods||Biological samples||Main discoveries||References|
|1D-NMR||Plasma samples from SARS-CoV-2 rRT-PCR-positive patients (n = 15, with multiple sampling timepoints) and age-matched healthy controls (n = 34, confirmed rRT-PCR negative), together with patients with COVID-19/influenza-like clinical symptoms who tested SARS-CoV-2 negative (n = 35).||The study observed four plasma cytokine clusters that expressed complex differential statistical relationships with multiple lipoproteins and metabolites. These included the following: cluster 1, comprising MIP-1β, SDF-1α, IL-22, and IL-1α, which correlated with multiple increased LDL and VLDL subfractions; cluster 2, including IL-10 and IL-17A, which was only weakly linked to the lipoprotein profile; cluster 3, which included IL-8 and MCP-1 and were inversely correlated with multiple lipoproteins. IL-18, IL-6, and IFN-γ together with IP-10 and RANTES exhibited strong positive correlations with LDL1–4 subfractions and negative correlations with multiple HDL subfractions.|||
|2D-NMR||Different aging regimes (crust from dry-aged beef, inner edible flesh of dry-aged beef, and wet-aged beef striploin)||NMR-based multivariable analyses could be used to distinguish the method, degree, and doneness of beef aging.|||
|3D-NMR||||For 19 of the 25 model metabolites, “Structure of unknown metabolomic mixture components by MS/NMR” yielded complete structures that matched those in the mixture independent of database information.|||
|DI-MS||A parasite–host cell system||The study applied a metabolic fingerprinting approach to evaluate metabolic changes induced by six different (candidate) drugs in a parasite–host cell system.|||
|LC–MS||||Those clinical strains that differed in their virulence and biofilm phenotype also had pronounced divergence in their metabolomes, as underlined by 332 features that were significantly differentially abundant with fold changes greater than 1.5 in both directions.|||
|GC–MS||Embryonic zebrafish||A total of 87 important endogenous metabolites such as citric acid and hypoxanthine were identified by universal databases or standards among 270 extracted metabolites, which consisted of sugars, amines, amino acids, nucleotides, fatty acids, and sterols.|||
|CE-MS||Plasma samples of acute corneal seizure mouse model.||Both electrically induced seizures showed decreased values of methionine, lysine, glycine, phenylalanine, citrulline, 3-methyladenine and histidine in mice plasma. However, a second provoked seizure, 13 days later, showed a less pronounced decrease of the mean concentrations of these plasma metabolites, demonstrated by higher fold change ratios.|||
|IM-MS||Breast cancer plasma samples||Analysis of the resulting data showed that phosphatidylcholines, triglycerides and diglycerides exhibited lower expression and phosphatidylserine showed increased expression in the breast cancer samples compared to those of healthy subjects. The coefficients of variation, determined by reference to the QC data, for all of the features identified as potential markers of disease, were 6% or less.|||
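Two of the summary statistics quoted in the table above, a fold-change cutoff of 1.5 in both directions and a quality-control (QC) coefficient of variation of 6% or less, can be sketched as follows. The function names and all intensity values are hypothetical, and the cited studies may have computed these quantities differently (for example, on log-transformed or normalized data).

```python
from statistics import mean, stdev


def fold_change(group_a, group_b):
    """Ratio of mean intensities, group A over group B."""
    return mean(group_a) / mean(group_b)


def changed_in_either_direction(fc, cutoff=1.5):
    """True if the feature is up by more than cutoff or down by more than 1/cutoff."""
    return fc > cutoff or fc < 1.0 / cutoff


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return stdev(values) / mean(values) * 100.0


# Synthetic intensities for three hypothetical features in two groups.
group_a = {"x": [300, 320, 310], "y": [100, 105, 95], "z": [50, 55, 45]}
group_b = {"x": [100, 110, 90], "y": [98, 102, 100], "z": [100, 110, 90]}

selected = [
    name for name in group_a
    if changed_in_either_direction(fold_change(group_a[name], group_b[name]))
]
print("features passing the fold-change cutoff:", selected)

# Repeated injections of a pooled QC sample for one feature (synthetic).
qc_intensities = [100, 102, 98, 101, 99]
qc_cv = cv_percent(qc_intensities)
print("QC CV (%):", round(qc_cv, 2), "acceptable:", qc_cv <= 6.0)
```

A fold-change filter is normally combined with a statistical test and multiple-testing correction; the cutoff alone says nothing about significance.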
Metabolomics, as an important aspect of phenomics, is emerging as a frontier field in life science and medical science. Many biological samples have been used to measure metabolomic variations, including extracts from different cells, tissues, organisms, and body fluids (for example, urine, serum/plasma, tears, exhaled air, saliva, synovial fluid, CSF, and sputum). Metabolomics is classified into targeted metabolomics and untargeted metabolomics. Targeted metabolomics is used to profile known metabolites with SRM/MRM methods. Untargeted metabolomics is used to globally profile unknown metabolites with NMR-based methods (1D-NMR, 2D-NMR, and 3D-NMR) and MS-based methods (DI-MS, LC–MS, GC–MS, CE-MS, and IM-MS). Metabolomics has been extensively applied in the research and practice of life science and medical science. However, current studies on metabolomic variations are still insufficient in breadth and depth. High-sensitivity, high-throughput, and high-reproducibility methodology is needed to maximize the coverage of metabolomic variations, in order to clarify the molecular mechanisms of a disease, determine effective therapeutic targets, and discover reliable biomarkers for prediction, diagnosis, and prognostic assessment in the context of PPPM practice.
The authors acknowledge the financial support from the Shandong First Medical University Talent Introduction Funds (to X.Z.), the Hunan Provincial Hundred Talent Plan (to X.Z.), and the Hunan Provincial High-Level Health Talents “225” Plan – Medical Academic Leader Funds (to X.Z.).
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations.
X.Z. conceived the concept, designed the manuscript, wrote and critically revised the manuscript, coordinated the work, and was responsible for the correspondence and financial support. J.Y., S.Z., N.L., and N.L. participated in the literature analysis and wrote parts of the manuscript.
Acronyms and abbreviations
CE-MS: Capillary electrophoresis coupled with mass spectrometry
DI-MS: Direct injection coupled with mass spectrometry
GC–MS: Gas chromatography coupled with mass spectrometry
IM-MS: Ion mobility coupled with mass spectrometry
LC–MS: Liquid chromatography coupled with mass spectrometry
NMR: Nuclear magnetic resonance
QqQ-MS: Triple quadrupole mass spectrometry
PPPM: Predictive, preventive and personalized medicine
SRM/MRM: Selected/multiple reaction monitoring
3P medicine: Predictive, preventive and personalized medicine