Open access peer-reviewed chapter

Diagnosis of Lung Cancer: What Metabolomics Can Contribute

By Elien Derveaux, Evelyne Louis, Karolien Vanhove, Liene Bervoets, Liesbet Mesotten, Michiel Thomeer and Peter Adriaensens

Submitted: February 14th 2018; Reviewed: June 1st 2018; Published: November 5th 2018

DOI: 10.5772/intechopen.79258



The reprogrammed metabolism of cancer cells is reflected in altered metabolite concentrations, which in turn can be used to define a specific metabolic phenotype or fingerprint for cancer. This contribution describes a metabolism-based discrimination between lung cancer patients and healthy controls, derived from an analysis of human blood plasma by proton nuclear magnetic resonance (1H-NMR) spectroscopy. This technique is becoming widely used in the field of metabolomics because it provides a highly informative spectrum that represents the relative metabolite concentrations. Cancer types are characterized by decreased or increased levels of specific plasma metabolites, such as glucose or lactate, compared to controls. Data analysis by multivariate statistics provides a classification model with high levels of sensitivity and specificity. Nuclear magnetic resonance (NMR) metabolomics might not only contribute to the diagnosis of lung cancer but also show potential for treatment follow-up and pave the way to a better understanding of disease-related alterations in biochemical pathways.


  • metabolomics
  • human blood plasma
  • metabolic phenotype
  • 1H-NMR spectroscopy
  • metabolite spiking
  • multivariate OPLS-DA statistics
  • lung cancer
  • cancer cell metabolism
  • biomarker

1. Introduction

Metabolomics, or metabolite profiling, comprises the study of the entire spectrum of low-molecular-weight metabolites and their cellular processes in a biological system [1, 2, 3, 4]. In addition to the large number of studies exploring the use of metabolomics in disease diagnosis and prognosis, its application has been extended to other research areas such as toxicology [5], nutrition [6], microbiology [7], and drug discovery [8]. Together with high-prevalence diseases such as diabetes [9], obesity [10, 11, 12], and neurological and cardiovascular disorders [13, 14], different types of malignant diseases including breast [15, 16], colorectal [17, 18], and lung cancer [19, 20, 21, 22, 23, 24] are being extensively examined with a metabolomics approach.

Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS), the latter optionally coupled to a gas or liquid chromatography system (GC-MS/LC-MS), are the analytical techniques primarily used in the field of metabolomics [25, 26, 27]. While 13C nuclei can be very useful for metabolite identification by NMR, the proton (1H) nucleus is the one mostly studied in metabolomics NMR experiments [28]. The 1H nucleus is omnipresent in metabolites, shows the highest relative sensitivity, and has a natural abundance of 99.98%. 1H-NMR spectroscopy is a noninvasive technique that needs no sample extraction and enables the identification and quantification of metabolites in biofluids as well as in tissues; it is therefore becoming widely used in the field of metabolomics [29]. Although 1H-NMR is less sensitive than MS, it has many advantages: it is nondestructive, allows easy quantification, has a low cost per sample, and requires minimal sample preparation, which results in excellent reproducibility and rapid, high-throughput data acquisition [30]. In a single run of a few minutes, the 1H spectrum of one sample provides information on the relative concentrations of all metabolites present. The metabolic phenotype provides a representative snapshot of an individual’s metabolic state and therefore enables the determination of cellular processes altered by disease [2].

Metabolites from a number of different diagnostic biofluids have already been examined in multiple studies, mostly involving human blood plasma, serum, or urine [1, 22, 31, 32]. In parallel with biofluids, intact tumor tissues are frequently evaluated since intra-tumor heterogeneity is currently one of the major causes of treatment failure [33, 34]. To that end, high-resolution magic angle spinning NMR (HR-MAS NMR) is gaining considerable attention as an analytical approach [35, 36, 37, 38].

This review presents the results of 1H-NMR metabolic profiling of lung cancer patients acquired by our research group and further explores how this method might contribute to an optimal treatment for lung cancer patients.

2. Methods

2.1. Sample collection and preparation

The experimental design focused on the analysis of fasting venous blood samples from lung cancer patients. Importantly, the exclusion criteria were (i) not having fasted for at least 6 h; (ii) a fasting blood glucose concentration ≥ 200 mg/dl; (iii) medication intake on the morning of blood sampling; and (iv) treatment for or history of cancer in the past 5 years, as described in the study of Louis et al. [20]. The blood samples were collected in lithium-heparin tubes and stored at 4°C within 5 min. Plasma aliquots were obtained by centrifugation at 1600 g for 15 min within 8 h after collection. Plasma sample preparation included a centrifugation step at 13,000 g for 4 min at 4°C and dilution of 200 μl of the supernatant with 600 μl deuterium oxide (D2O) containing 0.3 μg/μl trimethylsilyl-2,2,3,3-tetradeuteropropionic acid (TSP) as a chemical shift reference for the spectra. After presaturation for water suppression, the Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence was used to acquire slightly T2-weighted spectra on a 400 MHz (9.4 Tesla) NMR spectrometer [1].

2.2. Spectral processing

2.2.1. Binning

Before multivariate statistics are applied, the data acquired by 1H-NMR analysis have to be preprocessed. Preprocessing usually includes phasing, baseline correction, alignment, and normalization. In addition, the spectrum has to be divided into regions whose integration values (i.e., the areas under the peaks) can be used as variables for the statistical analysis. Binning or bucketing is a commonly used technique to produce such a reduced set of variables by segmenting the spectrum [39]. In point-wise binning, the spectrum is divided into equally sized bins. An important limitation of this method is that peaks may be split across bins, resulting in a loss of differentiating power and possibly in data misinterpretation. To overcome this, another methodology based on spiking of the plasma with known metabolites is proposed. In this approach, the 1H-NMR spectrum is divided into well-defined, variable-sized integration regions, which serve as the variables for multivariate statistical analysis [40].
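The peak-splitting problem of equally sized bins can be illustrated with a short sketch. Everything in it is an assumption made for illustration only: the spectrum is a single synthetic Lorentzian peak, and the bin width (0.04 ppm), the ppm window, and the function names are hypothetical rather than taken from the cited studies.

```python
import numpy as np

def lorentzian(ppm, center, half_width):
    """Unit-area Lorentzian line shape (illustrative peak model only)."""
    return (half_width / np.pi) / ((ppm - center) ** 2 + half_width ** 2)

def equal_width_bins(ppm, intensity, lo, hi, width):
    """Integrate the spectrum over equally sized bins between lo and hi."""
    n_bins = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_bins + 1)
    step = abs(ppm[1] - ppm[0])  # spacing of the ppm axis
    areas, _ = np.histogram(ppm, bins=edges, weights=intensity * step)
    return edges, areas

def integrate_region(ppm, intensity, lo, hi):
    """Integrate one variable-sized region chosen to enclose a whole peak."""
    step = abs(ppm[1] - ppm[0])
    mask = (ppm >= lo) & (ppm <= hi)
    return float(intensity[mask].sum() * step)

# Synthetic peak centred exactly on a bin boundary (1.00 ppm).
ppm = np.linspace(0.92, 1.08, 1601)
spectrum = lorentzian(ppm, center=1.00, half_width=0.002)

edges, bin_areas = equal_width_bins(ppm, spectrum, lo=0.92, hi=1.08, width=0.04)
bin_fractions = bin_areas / bin_areas.sum()

# A single region around the peak keeps its area in one variable.
region_fraction = integrate_region(ppm, spectrum, 0.96, 1.04) / bin_areas.sum()

print("fraction of area per equal-width bin:", np.round(bin_fractions, 3))
print("fraction of area in one enclosing region:", round(region_fraction, 3))
```

In this toy case, the peak area is divided roughly evenly over the two central bins, whereas a region drawn around the peak captures almost all of it in a single variable.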

2.2.2. Spiking methodology

To obtain a correct signal assignment, 1H-NMR spectra were acquired of reference plasma samples that had been spiked with a known metabolite. To this end, stock solutions were prepared by dissolving a relevant concentration of a known metabolite in a reference plasma sample. Reference plasma can be obtained by pooling the plasma of several blood samples from a healthy person. Next, a small amount of stock solution is added to a reference plasma NMR sample (e.g., 10 μl stock solution to 200 μl reference plasma and 600 μl D2O containing the TSP reference). This procedure can be repeated for all metabolites of interest, using a fresh reference sample for each metabolite. The outcome of these spiking experiments allows an accurate identification of the chemical shifts and J-coupling patterns. On our 400 MHz (9.4 Tesla) NMR spectrometer, the described spiking method led to a segmentation of the spectra into 110 well-defined integration regions [40]. After integration and normalization (relative to the total integrated area, excluding the contributions of TSP and water), these integration regions could be used as variables for multivariate statistical analyses.
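The normalization step described above can be sketched as follows. The region names and area values are hypothetical placeholders, not data from the study, and the function name is introduced only for this illustration.

```python
def normalize_regions(areas, exclude=("TSP", "water")):
    """Express each integration region as a fraction of the total area,
    leaving the excluded regions (reference and solvent) out of the total."""
    kept = {name: area for name, area in areas.items() if name not in exclude}
    total = sum(kept.values())
    return {name: area / total for name, area in kept.items()}

# Hypothetical raw integration values for a handful of regions.
raw_areas = {
    "TSP": 50.0,
    "water": 400.0,
    "lactate": 30.0,
    "glucose": 50.0,
    "valine": 20.0,
}

normalized = normalize_regions(raw_areas)
print(normalized)
```

Because every remaining region is divided by the same total, the normalized values sum to one and can be compared between samples regardless of differences in overall signal intensity.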

2.3. Multivariate statistics

Supervised orthogonal partial least squares discriminant analysis (OPLS-DA) was used to train and validate a classification model that enables optimal discrimination between lung cancer patients and a control population. The statistical classifier was constructed after detection and removal of outliers in the training data set via unsupervised principal component analysis (PCA). In addition, PCA was conducted to visualize significant intrinsic clusters in the case–control data set, on which the identification of possible confounders was based.
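One common way to screen for outliers with PCA is to compute Hotelling's T² from the scores of the first few components. The text does not specify which outlier criterion was applied, so the sketch below is an assumption rather than the authors' procedure; the data are random numbers with one artificially shifted sample, and the choice of two components is arbitrary.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA scores of mean-centred, unit-variance-scaled data via SVD."""
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0, ddof=1)
    U, S, _ = np.linalg.svd(Xs, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def hotelling_t2(scores):
    """Hotelling's T2 per sample: squared scores scaled by score variance."""
    variances = scores.var(axis=0, ddof=1)
    return (scores ** 2 / variances).sum(axis=1)

# Synthetic data: 100 samples x 10 variables, sample 0 shifted on purpose.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[0] += 15.0

t2 = hotelling_t2(pca_scores(X, n_components=2))
suspect = int(np.argmax(t2))
print("sample with the largest T2:", suspect)
```

In practice, T² values are usually compared against a confidence limit before a sample is removed, and any removal should be checked against the raw spectrum.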

Model characteristics such as the total explained intra- (R2X(Cum)) and intergroup (R2Y(Cum)) variation were examined together with the sensitivity and specificity values in order to evaluate the performance of the OPLS-DA classifier. The predictive ability (Q2(Cum)) of the model was demonstrated by cross-validation of the training set as well as by application of the model to an independent validation cohort.
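The performance measures mentioned above can be written out in a few lines. This is a minimal sketch with made-up labels and predictions: sensitivity and specificity follow their standard definitions, and Q2 is computed with the commonly used formula 1 − PRESS/SS, which is assumed here and may differ in detail from the cross-validation scheme of the software used in the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = patient, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def q2(y_true, y_cv_pred):
    """Q2 = 1 - PRESS / SS, using predictions made for left-out samples."""
    mean_y = sum(y_true) / len(y_true)
    press = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_cv_pred))
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - press / ss_total

# Hypothetical example: 4 patients and 5 controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(labels, predicted)

# Hypothetical cross-validated continuous predictions for 4 samples.
q2_value = q2([1, 1, 0, 0], [0.8, 0.6, 0.2, 0.4])
print(sens, spec, q2_value)
```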

3. Results

3.1. Detection of lung cancer

The assigned and normalized integration regions of the 1H-NMR spectrum reflect the relative metabolite concentrations and thus represent the metabolic phenotype. Therefore, they can be used as variables for multivariate OPLS-DA statistics in order to discriminate between lung cancer patients and healthy controls. By applying this methodology to lung cancer plasma samples, a classification model that enables discrimination between those two groups was trained. To this end, a large training cohort consisting of 233 lung cancer patients and 226 controls was used. Characteristics of the subjects included in the training and validation cohorts are summarized in Table 1. The trained OPLS-DA classifier correctly classified 78% of the lung cancer patients and 92% of the control group (Figure 1A) [19]. To confirm that the discrimination was purely due to differences in plasma metabolite concentrations, PCA was conducted to exclude possible confounders. By means of PCA score plots, it was confirmed that gender, smoking status, disease, and chronic obstructive pulmonary disease (COPD) are not confounders [19].

| Characteristic | Training cohort, C | Training cohort, LC | Validation cohort, C | Validation cohort, LC |
|---|---|---|---|---|
| Number of subjects, N | 226 | 233 | 89 | 98 |
| Gender, N (%) | | | | |
| Male | 119 (53) | 160 (69) | 44 (49) | 66 (67) |
| Female | 107 (47) | 73 (31) | 45 (51) | 32 (33) |
| Age, yrs. | 67 ± 11 | 68 ± 10 | 69 ± 10 | 64 ± 9 |
| BMI, kg/m2 | 28.3 ± 5.0 | 25.8 ± 4.5 | 28.4 ± 5.7 | 26.2 ± 4.7 |
| COPD, N (%) | 39 (17) | 119 (51) | 9 (10) | 35 (36) |
| Taking lipid-lowering medication, N (%) | 124 (55) | 122 (52) | 56 (63) | 39 (40) |
| Diabetes, N (%) | 23 (10) | 40 (17) | 20 (22) | 12 (12) |
| Smoking habits | | | | |
| Smoker, N (%) | 47 (21) | 113 (49) | 15 (17) | 48 (49) |
| Ex-smoker, N (%) | 102 (45) | 110 (47) | 36 (40) | 46 (47) |
| Non-smoker, N (%) | 77 (34) | 10 (4) | 38 (43) | 4 (4) |
| Pack years | 16 ± 24 | 33 ± 21 | 13 ± 18 | 38 ± 21 |
| Left, N (%) | | 103 (44) | | 40 (41) |
| Right, N (%) | | 119 (51) | | 54 (55) |
| Bilateral, N (%) | | 6 (3) | | 4 (4) |
| Unknown, N (%) | | 5 (2) | | 0 (0) |
| Number of tumors, N | | 239 | | 102 |
| Histological subtype | | | | |
| NSCLC-Adenocarcinoma, N (%) | | 91 (38) | | 46 (45) |
| NSCLC-Squamous carcinoma, N (%) | | 66 (28) | | 29 (28) |
| NSCLC-Adenosquamous carcinoma, N (%) | | 5 (2) | | 1 (1) |
| NSCLC-Carcinoid, N (%) | | 5 (2) | | 0 (0) |
| NSCLC-NOS, N (%) | | 8 (3) | | 6 (6) |
| SCLC, N (%) | | 30 (13) | | 15 (15) |
| Unknown, N (%) | | 34 (14) | | 5 (5) |
| Clinical stage according to 7th TNM edition | | | | |
| IA, N (%) | | 55 (23) | | 12 (12) |
| IB, N (%) | | 21 (9) | | 5 (5) |
| IIA, N (%) | | 11 (5) | | 7 (7) |
| IIB, N (%) | | 15 (6) | | 4 (4) |
| IIIA, N (%) | | 48 (20) | | 17 (16) |
| IIIB, N (%) | | 26 (11) | | 12 (12) |
| IV, N (%) | | 63 (26) | | 45 (44) |

Table 1.

Summary of the characteristics of the subjects included in the training and validation cohort.

BMI: Body mass index; C: controls; COPD: chronic obstructive pulmonary disease; LC: lung cancer patients; NOS: not otherwise specified; NSCLC: non-small cell lung cancer; SCLC: small cell lung cancer; and TNM: tumor, node, metastasis.

Figure 1.

OPLS-DA score plots, resulting from the classification of the training cohort of 233 lung cancer patients and 226 controls (A) and the independent validation cohort of 98 lung cancer patients and 89 controls (C). The AUC of ROC curves confirms the predictive ability of the classification model by cross-validation of the training cohort and an independent validation model (B). AUC: Area under the curve; C: controls; CV: cross-validation; LC: lung cancer patients; PS: predicted scores; and ROC: receiver operating characteristic.

While these results support the applicability of this methodology for the detection of lung cancer, no clear differentiation between tumor stages or histological subtypes could be detected yet; that is, none of the trained OPLS-DA models showed significant clustering of different tumor stages or histological subtypes. This is probably due to the limited number of lung cancer patients in the subgroups and the diffuse character of the subgroups formed on the basis of histology and clinical tumor stage. However, the ability of an OPLS-DA model to discriminate between 76 stage I lung cancer patients and 76 randomly selected controls with 74% sensitivity and 78% specificity strongly suggests that plasma metabolite phenotyping reveals the presence of lung cancer already during the early stages of tumor development (Figure 2) [19].

Figure 2.

OPLS-DA score plot, resulting from the classification of 76 stage-I lung cancer patients and 76 randomly selected controls of the training cohort. C: Controls.

3.2. Validation of the classification model

Importantly, once a promising classification model has been trained, its validity needs to be confirmed. When the metabolic fingerprint of a large cohort of patients and controls is available, this can be done by applying the model to an independent validation cohort consisting of a separate group of both lung cancer patients and controls. In this study, an independent cohort of 98 patients with lung cancer and 89 controls was used for validation of the trained classifier. The trained model shows a high predictive accuracy, with a sensitivity of 71% and a specificity of 81% (Figure 1B and C) [19].

3.3. Differentiation between cancer types

To further illustrate the potential of the methodology described above, this section demonstrates that different cancer types are characterized by a specific metabolite profile. To this end, the same workflow was applied to a data set of 54 lung cancer patients and 80 breast cancer patients. Again, the segmentation of the spectrum was based on metabolite spiking, and OPLS-DA statistics were used to train a classification model, this time to discriminate lung cancer from breast cancer. The resulting model allows a correct classification of both cancer types with a sensitivity of 93% (93% of the 54 lung cancer patients were correctly classified) and a specificity of 99% (99% of the 80 breast cancer patients were correctly classified) (Figure 3A). Validation of the model by applying it to an independent cohort of 81 lung cancer patients and 60 breast cancer patients confirmed these findings, with a sensitivity of 89% and a specificity of 82% (Figure 3B and C) [20]. Another recent study built on these promising results by establishing an OPLS-DA classification model that allows discrimination between three different types of cancer, that is, lung, breast, and colorectal cancer. After 1H-NMR measurements of 37 plasma samples of each patient group, multivariate statistics revealed that each type of cancer was represented by a specific metabolic signature (Figure 4) [41]. Since the metabolic phenotype allows a clear differentiation between different cancer types, it can be assumed that the metabolic profile should not be considered a general cancer marker but rather a distinguishing characteristic of a specific cancer type.

Figure 3.

OPLS-DA score plots, resulting from the classification of the training cohort of 54 lung cancer patients and 80 breast cancer patients (A) and the independent validation cohort of 81 lung cancer patients and 60 breast cancer patients (C). The AUC of ROC curves confirms the predictive ability of the classification model by cross-validation of the training cohort and an independent validation model (B). AUC: Area under the curve; BC: breast cancer patients; LC: Lung cancer patients; PS: predicted scores; and ROC: receiver operating characteristic.

Figure 4.

OPLS-DA score plot, resulting from the classification of a population of lung-, breast- and colorectal cancer patients, each group consisting of 37 individuals. CRC: Colorectal cancer patients; BC: breast cancer patients; and LC: lung cancer patients.

4. Reorganization of metabolic pathways

The metabolites that contributed the most to the differentiation between lung cancer patients and healthy controls were identified and selected based on their variable importance for projection (VIP) value and by means of an S-plot. The variables on the wings of the S-plot are the ones with the strongest contribution to the model and the highest statistical reliability [42]. Metabolic phenotyping of blood plasma shows that lung cancer patients are characterized by elevated glucose and decreased lactate levels, which implies an increased gluconeogenesis. This enhanced gluconeogenesis reflects the reaction of the human body to the Warburg effect or aerobic glycolysis, in which cancer cells rely on fermentation even under normoxic conditions, that is, on glycolysis leading to lactate production via fermentation of pyruvate. The Warburg effect, which takes place in cancer cells, can be observed in tumor tissue by means of 1H-NMR, as shown by Rocha et al. They demonstrated that lung tumors of different histological subtypes are all characterized by lowered glucose and increased lactate levels, which is consistent with the significantly enhanced glycolytic activity of cancer cells compared to normal cells [23]. Moreover, lung cancer patients show decreased phospholipid plasma levels, pointing to an increased lipogenesis and enhanced membrane synthesis, which is correlated with increased proliferation of cancer cells [43, 44, 45, 46]. Other metabolites with an increased concentration in lung cancer patients compared to controls are N-acetylated glycoproteins, β-hydroxybutyrate, leucine, lysine, tyrosine, threonine, glutamine, valine, and aspartate. Conversely, metabolites showing a decreased concentration in lung cancer patients are alanine, sphingomyelin, citrate, choline-containing phospholipids (e.g., phosphatidylcholine), and other phospholipids [19].
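An S-plot is usually drawn by plotting, for each variable, its covariance with the predictive score vector against its correlation with that vector. The sketch below only computes these two coordinates. The score vector t and the three variables are synthetic stand-ins, since fitting an OPLS-DA model is outside the scope of this illustration, and the function name is hypothetical.

```python
import numpy as np

def s_plot_coordinates(X, t):
    """Covariance and correlation of each column of X with the score vector t."""
    Xc = X - X.mean(axis=0)
    tc = t - t.mean()
    n = len(t)
    covariance = Xc.T @ tc / (n - 1)
    correlation = covariance / (Xc.std(axis=0, ddof=1) * tc.std(ddof=1))
    return covariance, correlation

# Synthetic stand-ins: t plays the role of the predictive OPLS-DA score.
rng = np.random.default_rng(1)
t = rng.normal(size=200)
noise = rng.normal(size=(200, 2))
X = np.column_stack([
    t + 0.1 * noise[:, 0],  # variable rising with the score
    -t,                     # variable falling with the score
    noise[:, 1],            # variable unrelated to the score
])

cov, corr = s_plot_coordinates(X, t)
print(np.round(cov, 2), np.round(corr, 2))
```

Variables with a large absolute covariance and a correlation close to +1 or −1 end up on the wings of the S-plot, while variables unrelated to the class separation stay near the origin.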

5. Metabolomics in daily clinical practice

5.1. Effect of the NMR magnetic field strength

The advantages and limitations of NMR spectrometers with a higher magnetic field strength were evaluated by comparing the results obtained for the same plasma samples on a medium-field (9.4 Tesla; 400 MHz) and a high-field (21.1 Tesla; 900 MHz) NMR spectrometer. A 900 MHz spectrum shows an improved resolution as well as a higher signal-to-noise (S/N) ratio compared to a 400 MHz spectrum (Figure 5) [47]. Because of these improved characteristics, measurements with a high-field spectrometer allow the integration regions to be defined more accurately using spiking experiments, resulting in less signal overlap and therefore in a larger number of integration regions that are representative of a single metabolite. Yet, the discriminative power of high- and medium-field spectra is rather comparable. These findings are in line with the study of Bertram et al., who demonstrated that the prediction performance, and thus the diagnostic information obtained from the spectra, strongly increases when the magnetic field strength is raised from 250 to 500 MHz, whereas a further increase from 500 to 800 MHz has a weaker effect on group discrimination [48]. However, analysis with a high-field spectrometer can be the preferred choice for the detection and identification of new, low-concentration metabolites and can therefore contribute to a better understanding of the underlying disturbed biochemical pathways of disease [47]. A drawback is the high cost of high-field spectrometers, which rises steeply with the magnetic field strength. By comparison, the cost of a 400 MHz spectrometer is in the order of €300,000, while a 900 MHz spectrometer can cost up to €2,750,000. A supplementary cryoprobe can add a further €200,000 to these estimates [47]. In addition, such instruments require an isolated building for their housing, which is less practical in a clinical setting. All things considered, medium-field (400–600 MHz) spectrometers will probably become the preferred instruments for future application in clinical metabolomics.
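The pairing of field strength and proton frequency quoted above follows from the Larmor relation, in which the resonance frequency equals the gyromagnetic ratio divided by 2π, multiplied by the field strength. The sketch below checks the two field strengths against that relation and compares the quoted prices. The constant is the standard value for 1H; the B0^(3/2) dependence of S/N is an often-quoted idealized approximation that is assumed here and is not a result from the cited studies.

```python
# Gyromagnetic ratio of 1H divided by 2*pi, in MHz per Tesla (standard value).
GAMMA_BAR_1H_MHZ_PER_T = 42.577

def larmor_frequency_mhz(b0_tesla):
    """1H resonance frequency in MHz for a magnetic field strength in Tesla."""
    return GAMMA_BAR_1H_MHZ_PER_T * b0_tesla

medium_field_mhz = larmor_frequency_mhz(9.4)   # close to 400 MHz
high_field_mhz = larmor_frequency_mhz(21.1)    # close to 900 MHz

# Idealized S/N gain, assuming S/N scales with B0 to the power 3/2.
idealized_snr_gain = (900 / 400) ** 1.5

# Ratio of the purchase prices quoted in the text (without cryoprobe).
cost_ratio = 2_750_000 / 300_000

print(round(medium_field_mhz, 1), round(high_field_mhz, 1))
print(round(idealized_snr_gain, 2), round(cost_ratio, 1))
```

Under these assumptions, the idealized sensitivity gain of the 900 MHz instrument is considerably smaller than the increase in purchase price, which is in line with the cost argument made in the text.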

Figure 5.

Comparison of the 1H-NMR spectra of human blood plasma acquired at a high-field (900 MHz) (top) and medium-field (400 MHz) (bottom) spectrometer. Both spectra are zoomed-in between 0.80 and 1.10 ppm. The top spectrum shows an increased resolution and improved S/N ratio. The paired labeled peaks each represent a methyl group of the amino acid valine. ppm: Parts per million.

5.2. Precision medicine

The contribution of metabolic phenotyping to the clinical environment, often referred to as pharmacometabolomics, can encompass the entire patient journey, from an improved screening selection and earlier diagnosis to follow-up for treatment response prediction and a more personalized choice of therapy [49]. Despite several challenges that accompany the implementation of such an innovative technique, for example, biomarker validation and cost-effectiveness [49, 50], the authors are convinced that metabolism-based biomarkers have the potential to contribute significantly to future standard clinical practice.

For lung cancer, metabolic phenotyping by means of 1H-NMR can further be useful ahead of low-dose computed tomography (LDCT) scanning, as a tool that delivers additional and complementary risk factors for a better selection of high-risk individuals. Currently, selection of those individuals is primarily based on age and smoking status/history [51]. The National Lung Screening Trial showed that mortality is significantly reduced when screening with LDCT is performed [52]. Although the sensitivity of LDCT screening is high and the number of diagnoses in early stages increases, the positive predictive value of LDCT is currently still low [53]. Other drawbacks of LDCT screening are the high rate of false positive results, the high risk of overdiagnosis, and consequently the additional radiation exposure due to avoidable diagnostic tests [54]. To meet the rising interest in improving the accuracy of risk prediction, promising clinically relevant diagnostic biomarkers that can add predictive value to existing models are indispensable [55, 56]. Therefore, a noninvasive blood-based screening test complementary to LDCT would be a valuable tool to reduce the number of individuals undergoing unnecessary and sometimes harmful follow-up procedures. Likewise, in a next phase, identification of prognostic biomarkers could assist in identifying early-stage lung cancer patients who would most likely benefit from current therapies, for example, surgery with curative intent or adjuvant chemotherapy [57].
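The reason why a better preselection of high-risk individuals can raise the positive predictive value follows from Bayes' rule, as the sketch below illustrates. The sensitivity, specificity, and prevalence values are hypothetical and chosen only to show the trend; they are not figures from the National Lung Screening Trial or from the studies cited here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result corresponds to true disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical test characteristics, identical in both scenarios.
SENSITIVITY = 0.90
SPECIFICITY = 0.75

# Hypothetical disease prevalence before and after a stricter preselection.
ppv_broad_selection = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.01)
ppv_enriched_selection = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.05)

print(round(ppv_broad_selection, 3), round(ppv_enriched_selection, 3))
```

With the test characteristics held fixed, enriching the screened population for individuals who actually have the disease increases the share of positive results that are true positives.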

In addition to the discovery of diagnostic and prognostic biomarkers, metabolic profiling is being extensively examined for its use in predicting individual therapy response [58, 59, 60, 61]. Personalized treatment will contribute to a reduction of adverse reactions by (i) prediction of the patient’s response and (ii) administration of the most efficient drug dose. Moreover, longitudinal monitoring of patients makes it possible to track post-interventional outcome or deviations in response and can therefore help pave the way toward long-term personalized health [49].

6. Conclusion

Analysis of metabolic changes in blood plasma by 1H-NMR spectroscopy allows significant discrimination between lung cancer patients and healthy controls. Additionally, metabolic phenotyping supports detection of lung cancer in all stages and enables differentiation between different cancer types such as breast and lung cancer. This indicates that a metabolomics approach can actively contribute to lung cancer diagnosis, even in early stages of tumor development. For daily clinical practice, where the main goal is to correctly classify patients, a medium-field (400–600 MHz) NMR spectrometer can provide sufficient discriminative power to perform clinical metabolomics. For research purposes, on the other hand, where disease-related disturbed pathways deserve a more extensive analysis, high-field NMR (e.g., 900 MHz) spectra are preferred. The ability of high-field NMR to observe a larger number of metabolites that are represented by a nonoverlapping signal permits a deeper look into the underlying affected metabolic pathways. We show that glucose levels are increased while lactate levels are decreased in the blood plasma of lung cancer patients. These aberrant metabolite concentrations indicate an increased gluconeogenesis as a counteraction of the body to the Warburg effect in the cancer cells. Moreover, the enhanced membrane synthesis of cancer cells is reflected in the lowered plasma levels of phospholipids.

Encouraged by these promising results, the authors strongly believe that 1H-NMR-based metabolic fingerprinting will become widely implemented in the clinic by serving as (i) an additional screening tool for lung cancer, (ii) a procedure to define complementary risk factors for current risk models toward an improved selection of individuals eligible for LDCT, and (iii) an innovative method to better characterize lung cancer patients in order to provide them with the best treatment strategies available.


This study is part of the Limburg Clinical Research Program (LCRP) UHasselt-ZOL-Jessa and is supported by Kom op tegen Kanker (Stand up to Cancer), the Flemish Cancer Society. The authors would like to thank Prof. Dr. Eric de Jonge and Prof. Dr. Philip Caenepeel for their support in sample recruitment.

Conflict of interest

The authors declare that they have no conflict of interest.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Elien Derveaux, Evelyne Louis, Karolien Vanhove, Liene Bervoets, Liesbet Mesotten, Michiel Thomeer and Peter Adriaensens (November 5th 2018). Diagnosis of Lung Cancer: What Metabolomics Can Contribute, Lung Cancer - Strategies for Diagnosis and Treatment, Alba Fabiola Costa Torres, IntechOpen, DOI: 10.5772/intechopen.79258. Available from:
