Diagnosis of Lung Cancer: What Metabolomics Can Contribute

The reprogrammed metabolism of cancer cells is reflected in altered metabolite concentrations, which in turn can be used to define a specific metabolic phenotype or fingerprint for cancer. In this contribution, a metabolism-based discrimination between lung cancer patients and healthy controls, derived from an analysis of human blood plasma by proton nuclear magnetic resonance (¹H-NMR) spectroscopy, is described. This technique is becoming widely used in the field of metabolomics because of its ability to provide a highly informative spectrum, representing the relative metabolite concentrations. Cancer types are characterized by decreased or increased levels of specific plasma metabolites, such as glucose or lactate, compared to controls. Data analysis by multivariate statistics provides a classification model with high levels of sensitivity and specificity. Nuclear magnetic resonance (NMR) metabolomics might not only contribute to the diagnosis of lung cancer but also shows potential for treatment follow-up and for paving the way to a better understanding of disease-related disturbances in biochemical pathways.

Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS), the latter optionally coupled to a gas or liquid chromatography system (GC-MS/LC-MS), are the analytical techniques primarily used in the field of metabolomics [25][26][27]. While ¹³C nuclei can be very useful for metabolite identification by NMR, the proton (¹H) nucleus is the one mostly studied in metabolomics NMR experiments [28]. The ¹H nucleus is omnipresent in metabolites, shows the highest relative sensitivity, and has a natural abundance of 99.98%. ¹H-NMR spectroscopy is a noninvasive technique that needs no sample extractions and that enables the identification and quantification of metabolites in biofluids as well as in tissues, and it is therefore becoming widely used in the field of metabolomics [29]. Although ¹H-NMR is less sensitive than MS, it has many advantages: it is nondestructive, allows easy quantification, has a low cost per sample, and requires minimal sample preparation, which results in excellent reproducibility and rapid, high-throughput data acquisition [30]. In a single run of a few minutes, the ¹H spectrum of one sample provides information regarding the relative concentrations of all metabolites present. The metabolic phenotype provides a representative snapshot of an individual's metabolic state and therefore enables the determination of cellular processes altered by disease [2].
Metabolites from a number of different diagnostic biofluids have already been examined in multiple studies, mostly involving human blood plasma, serum, or urine [1,22,31,32]. In parallel with biofluids, intact tumor tissues are frequently evaluated, since intra-tumor heterogeneity is currently one of the major causes of treatment failure [33,34]. To that end, high-resolution magic angle spinning NMR (HR-MAS NMR) is gaining great attention as an analytical approach [35][36][37][38].
This review presents the results of ¹H-NMR metabolic profiling of lung cancer patients acquired by our research group and further explores the benefits this method might deliver toward an optimal treatment for lung cancer patients.

Sample collection and preparation
The experimental design focused on the analysis of fasting venous blood samples from lung cancer patients. Importantly, the exclusion criteria were (i) not fasted for at least 6 h; (ii) fasting blood glucose concentration ≥ 200 mg/dl; (iii) medication intake on the morning of blood sampling; and (iv) treatment or history of cancer in the past 5 years, as described in the study of Louis et al. [20]. The blood samples were collected in lithium-heparin tubes and stored at 4°C within 5 min. Plasma aliquots were obtained after centrifugation at 1600 g for 15 min within 8 h after collection. Plasma sample preparation included a centrifugation step at 13,000 g for 4 min at 4°C and dilution of 200 μl of the supernatant with 600 μl deuterium oxide (D₂O) containing 0.3 μg/μl trimethylsilyl-2,2,3,3-tetradeuteropropionic acid (TSP) as a chemical shift reference for the spectra. After presaturation for water suppression, the Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence was used to acquire slightly T2-weighted spectra on a 400 MHz (9.4 Tesla) NMR spectrometer [1].
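The dilution step above can be checked with a few lines of arithmetic. The following is only a sketch: the volumes and the TSP concentration come from the text, while the assumption that the volumes are simply additive, and all variable names, are ours.

```python
# Volumes and concentration as given in the sample-preparation description.
supernatant_ul = 200.0         # plasma supernatant
d2o_ul = 600.0                 # D2O containing the TSP reference
tsp_in_d2o_ug_per_ul = 0.3     # TSP concentration in the D2O

# Assumption: volumes are additive on mixing.
total_ul = supernatant_ul + d2o_ul
plasma_dilution_factor = total_ul / supernatant_ul
tsp_mass_ug = tsp_in_d2o_ug_per_ul * d2o_ul
tsp_final_ug_per_ul = tsp_mass_ug / total_ul

print(f"total volume: {total_ul:.0f} ul")
print(f"plasma dilution factor: {plasma_dilution_factor:.1f}")
print(f"TSP in the NMR sample: {tsp_mass_ug:.0f} ug ({tsp_final_ug_per_ul:.3f} ug/ul)")
```

Under these assumptions the plasma is diluted fourfold and the NMR sample contains about 180 μg of TSP.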

Binning
Before multivariate statistics are applied, the data acquired by ¹H-NMR analysis should be preprocessed. Preprocessing usually includes phasing, baseline correction, alignment, and normalization. In addition, the spectrum has to be divided into regions whose integration value (i.e., the area under the peak) can be used as a variable for the statistical analysis. Binning or bucketing is a commonly used technique to produce such a reduced set of variables by segmenting the spectrum [39]. In point-wise binning, the spectrum is divided into equally sized bins. An important limitation of this method is the possible splitting of peaks across bins, resulting in a loss of differentiating power and possibly in misinterpretation of the data. To overcome this, another methodology, based on spiking of the plasma with known metabolites, has been proposed. In this approach, the ¹H-NMR spectrum is divided into well-defined, variable-sized integration regions, which serve as the variables for multivariate statistical analysis [40].
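The difference between equally sized bins and variable-sized integration regions can be illustrated with a toy spectrum. This is a minimal sketch and not the procedure of [40]: the triangular peak, the bin width, the region limits, and the function names are all hypothetical, and the ppm axis is assumed to be sorted in ascending order.

```python
from bisect import bisect_left, bisect_right


def integrate_region(ppm, intensity, lo, hi):
    """Trapezoidal area over the points with lo <= ppm <= hi.

    Assumes `ppm` is sorted in ascending order.
    """
    first = bisect_left(ppm, lo)
    last = bisect_right(ppm, hi)  # one past the last point inside the region
    area = 0.0
    for k in range(first, last - 1):
        area += 0.5 * (intensity[k] + intensity[k + 1]) * (ppm[k + 1] - ppm[k])
    return area


def equal_width_bins(start, stop, width):
    """Equally sized (lo, hi) bins covering the range from start to stop."""
    n_bins = int(round((stop - start) / width))
    # Edges are rounded so that they coincide with the rounded ppm grid below.
    return [(round(start + k * width, 6), round(start + (k + 1) * width, 6))
            for k in range(n_bins)]


# Toy spectrum: one triangular peak centred at 1.00 ppm (0.96 to 1.04 ppm).
ppm = [round(0.90 + 0.01 * k, 2) for k in range(21)]
intensity = [max(0, 4 - abs(k - 10)) for k in range(21)]

# Equally sized bins: a bin edge falls on the peak maximum.
fixed_bins = equal_width_bins(0.92, 1.08, 0.04)
fixed_areas = [integrate_region(ppm, intensity, lo, hi) for lo, hi in fixed_bins]

# Variable-sized region chosen to enclose the whole peak.
peak_area = integrate_region(ppm, intensity, 0.96, 1.04)

print(fixed_areas)
print(peak_area)
```

In this toy case the two central equal-width bins each receive about half of the peak, whereas the single variable-sized region captures its full area in one variable.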

Spiking methodology
To obtain a correct signal assignment, ¹H-NMR spectra were acquired of reference plasma samples to which a known metabolite had been added (spiked). To this end, stock solutions were prepared by dissolving a relevant concentration of a known metabolite in a reference plasma sample. Reference plasma can be obtained by pooling the plasma of several blood samples from a healthy person. Next, a small amount of stock solution can be added to a reference plasma NMR sample (e.g., 10 μl stock solution to 200 μl reference plasma and 600 μl D₂O containing the TSP reference). This procedure can be repeated for all metabolites of interest, using a fresh reference sample for each metabolite. The outcome of these spiking experiments allows an accurate identification of the chemical shifts and J-coupling patterns. On our 400 MHz (9.4 Tesla) NMR spectrometer, the described spiking method led to a segmentation of the spectra into 110 well-defined integration regions [40]. After integration and normalization (relative to the total integrated area, excluding the contributions of TSP and water), these integration regions could be used as variables for multivariate statistical analyses.
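The normalization described above, relative to the total integrated area with TSP and water excluded, can be sketched as follows. The region names and area values are hypothetical and serve only to show the arithmetic; the real data set comprises 110 integration regions.

```python
def normalize_regions(areas, excluded=("TSP", "water")):
    """Divide each region area by the total area of the non-excluded regions."""
    kept = {name: area for name, area in areas.items() if name not in excluded}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("total integrated area must be positive")
    return {name: area / total for name, area in kept.items()}


# Hypothetical integration values (arbitrary units) for a single spectrum.
raw_areas = {"TSP": 50.0, "water": 400.0, "lactate": 30.0,
             "glucose": 50.0, "lipids": 120.0}

normalized = normalize_regions(raw_areas)
print(normalized)
```

After this step the retained regions sum to 1, so each variable expresses a relative rather than an absolute signal contribution.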

Multivariate statistics
The statistical analysis used supervised orthogonal partial least squares discriminant analysis (OPLS-DA) to train and validate a classification model that enables optimal discrimination between lung cancer patients and a control population. The statistical classifier was constructed after detection and removal of outliers in the training data set via unsupervised principal component analysis (PCA). In addition, PCA was conducted to visualize significant intrinsic clusters in the case-control data set, on which the identification of possible confounders was based.
Model characteristics such as the total explained intragroup (R²X(cum)) and intergroup (R²Y(cum)) variation were examined together with sensitivity and specificity values in order to evaluate the performance of the OPLS-DA classifier. The predictive ability (Q²(cum)) of the model was demonstrated by cross-validation of the training set as well as by application of the model to an independent validation cohort.
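As an illustration of how an explained-variation measure and a cross-validated predictive measure differ, the sketch below computes both as one minus a ratio of sums of squares. This follows the commonly used definitions and is an assumption on our part: the text does not state the exact convention of the software used, and the class coding (1 for patient, 0 for control) and all numbers are hypothetical.

```python
def explained_fraction(observed, predicted):
    """Return 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(observed) / len(observed)
    ss_total = sum((y - mean) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical class membership: 1 = lung cancer patient, 0 = control.
y = [1, 1, 0, 0]
# Values fitted by a model trained on all four samples.
y_fitted = [0.9, 0.8, 0.1, 0.2]
# Values predicted for each sample while it was left out during cross-validation.
y_cross_validated = [0.8, 0.6, 0.3, 0.4]

r2y = explained_fraction(y, y_fitted)          # goodness of fit
q2 = explained_fraction(y, y_cross_validated)  # goodness of prediction
print(round(r2y, 2), round(q2, 2))
```

Because the cross-validated predictions come from samples the model has not seen, Q² is typically lower than R²Y; a large gap between the two is usually read as a sign of overfitting.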

Detection of lung cancer
The assigned and normalized integration regions of the ¹H-NMR spectrum reflect the relative metabolite concentrations and thus represent the metabolic phenotype. Therefore, they can be used as variables for multivariate OPLS-DA statistics in order to discriminate between lung cancer patients and healthy controls. By applying this methodology to lung cancer plasma samples, a classification model that enables discrimination between those two groups was trained. To this end, a large training cohort consisting of 233 lung cancer patients and 226 controls was used. Characteristics of the subjects included in the training and validation cohorts are summarized in Table 1. The trained OPLS-DA classifier resulted in a correct classification of 78% of the lung cancer patients and 92% of the control group (Figure 1A) [19]. To confirm that the discrimination was purely due to differences in plasma metabolite concentrations, PCA was conducted to exclude possible confounders. By means of PCA score plots, it was confirmed that gender, smoking status, disease, and chronic obstructive pulmonary disease (COPD) are not confounders [19].
While these results support the applicability of this methodology for the detection of lung cancer, no clear differentiation between tumor stages or histological subtypes has been detected so far; that is, none of the trained OPLS-DA models showed significant clustering of different tumor stages or histological subtypes. This is probably due to the limited number of lung cancer patients in the subgroups and the diffuse character of the subgroups formed on the basis of histology and clinical tumor stage. However, the ability of a constructed OPLS-DA model to discriminate between 76 stage I lung cancer patients and 76 randomly selected controls with 74% sensitivity and 78% specificity strongly suggests that plasma metabolite phenotyping reveals the presence of lung cancer already during the early stages of tumor development (Figure 2) [19].
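The sensitivity and specificity values quoted above are the fractions of correctly classified patients and controls, respectively. The sketch below shows the calculation on a handful of hypothetical labels; the label strings and the toy predictions are ours and are not data from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="cancer"):
    """Return (sensitivity, specificity) for a two-class problem."""
    tp = fn = tn = fp = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == positive:
            if prediction == positive:
                tp += 1   # patient classified as patient
            else:
                fn += 1   # patient missed
        else:
            if prediction == positive:
                fp += 1   # control classified as patient
            else:
                tn += 1   # control classified as control
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 4 patients and 5 controls.
truth = ["cancer"] * 4 + ["control"] * 5
predicted = ["cancer", "cancer", "cancer", "control",
             "control", "control", "control", "control", "cancer"]

sens, spec = sensitivity_specificity(truth, predicted)
print(sens, spec)
```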

Validation of the classification model
Importantly, after a promising classification model has been trained, its validity needs to be confirmed. When the metabolic fingerprint of a large cohort of patients and controls is available, this can be done by applying the model to an independent validation cohort consisting of both lung cancer patients and controls. In this study, an independent cohort of 98 patients with lung cancer and 89 controls was used for validation of the trained classifier. The trained model shows a high predictive accuracy, with a sensitivity of 71% and a specificity of 81% (Figure 1B and C) [19].

Table 1 (fragment; column headings not preserved in the source).
Taking lipid-lowering medication, N (%): 124 (55), 122 (52), 56 (63), 39 (40)
Diabetes, N (%): 23 (10), 40 (17), 20 (22), 12 (12)

Differentiation between cancer types
To further illustrate the potential of the methodology described above, this section demonstrates that different cancer types are characterized by a specific metabolite profile. To this end, the same workflow was applied to a data set of 54 lung cancer patients and 80 breast cancer patients. Again, the segmentation of the spectrum was based on metabolite spiking, and OPLS-DA statistics were used to train a classification model, this time to discriminate lung cancer from breast cancer. The resulting model allows a correct classification of both cancer types with a sensitivity of 93% (93% of the 54 lung cancer patients were correctly classified) and a specificity of 99% (99% of the 80 breast cancer patients were correctly classified) (Figure 3A). Validation of the model by applying it to an independent cohort of 81 lung cancer patients and 60 breast cancer patients confirmed these findings, with a sensitivity of 89% and a specificity of 82% (Figure 3B and C) [20]. Another recent study extended these promising results by establishing an OPLS-DA classification model that allows discrimination between three different types of cancer, that is, lung, breast, and colorectal cancer. After ¹H-NMR measurements of 37 plasma samples from each patient group, multivariate statistics revealed that each type of cancer was represented by a specific metabolic signature (Figure 4) [41]. Since the metabolic phenotype allows a clear differentiation between different cancer types, it can be assumed that the metabolic profile should not be considered a general cancer marker but rather a distinguishing characteristic of a specific cancer type.

Reorganization of metabolic pathways
The metabolites that contributed the most to the differentiation between lung cancer patients and healthy controls were identified and selected based on their variable importance for projection (VIP) value by means of an S-plot. The variables on the wings of the S-plot are the ones with the strongest contribution to the model and the highest statistical reliability [42].
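In the usual construction of an S-plot, each variable is placed according to its covariance (contribution) and its correlation (reliability) with the predictive score vector of the model. The sketch below computes these two coordinates for one variable. It is an illustration under that assumption, not the authors' software output: the scores and the variable values are hypothetical, and the VIP calculation itself is not shown.

```python
from math import sqrt


def s_plot_coordinates(scores, variable):
    """Return (covariance, correlation) between predictive scores and a variable."""
    n = len(scores)
    mean_t = sum(scores) / n
    mean_x = sum(variable) / n
    cov = sum((t - mean_t) * (x - mean_x)
              for t, x in zip(scores, variable)) / (n - 1)
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in scores) / (n - 1))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in variable) / (n - 1))
    return cov, cov / (sd_t * sd_x)


# Hypothetical predictive scores of four samples and two integration regions.
scores = [-2.0, -1.0, 1.0, 2.0]
region_a = [1.0, 2.0, 4.0, 5.0]   # follows the scores closely
region_b = [3.0, 1.0, 1.0, 3.0]   # unrelated to the scores

cov_a, corr_a = s_plot_coordinates(scores, region_a)
cov_b, corr_b = s_plot_coordinates(scores, region_b)
print(cov_a, corr_a)
print(cov_b, corr_b)
```

A variable such as `region_a`, with both a large covariance and a correlation close to 1 in absolute value, would sit on a wing of the S-plot, whereas `region_b` would sit near the center.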
Metabolic phenotyping of blood plasma shows that lung cancer patients are characterized by elevated glucose and decreased lactate levels, which implies an increased gluconeogenesis. This enhanced gluconeogenesis reflects the reaction of the human body to the Warburg effect, or aerobic glycolysis, in which cancer cells rely on fermentation even under normoxic conditions, that is, on glycolysis followed by conversion of pyruvate to lactate. The Warburg effect, which takes place in cancer cells, can be observed in tumor tissue by means of ¹H-NMR, as shown by Rocha et al. They demonstrated that lung tumors of different histological subtypes are all characterized by lowered glucose and increased lactate levels, which is consistent with the significantly enhanced glycolytic activity of cancer cells compared to normal cells [23]. Moreover, lung cancer patients show decreased plasma phospholipid levels, pointing to an increased lipogenesis and enhanced membrane synthesis, which is correlated with increased proliferation of cancer cells [43][44][45][46]. Other metabolites with an increased concentration in lung cancer patients compared to controls are N-acetylated glycoproteins, β-hydroxybutyrate, leucine, lysine, tyrosine, threonine, glutamine, valine, and aspartate. In contrast, metabolites showing a decreased concentration in lung cancer patients are alanine, sphingomyelin, citrate, choline-containing phospholipids (e.g., phosphatidylcholine), and other phospholipids [19].

Effect of the NMR magnetic field strength
The advantages and limitations of NMR spectrometers with a higher magnetic field strength were evaluated by comparing the results obtained for the same plasma samples on both a medium-field (9.4 Tesla; 400 MHz) and a high-field (21.1 Tesla; 900 MHz) NMR spectrometer. A 900 MHz spectrum shows an improved resolution as well as a higher signal-to-noise (S/N) ratio compared to a 400 MHz spectrum (Figure 5) [47]. Because of these improved characteristics, measurements with a high-field spectrometer make it possible to define the integration regions more accurately using spiking experiments, resulting in less signal overlap and therefore in a larger number of integration regions that are representative of a single metabolite. Yet the discriminative power of high-field and medium-field spectra is rather comparable. These findings are in line with the study of Bertram et al., who demonstrated that the prediction performance, and thus the diagnostic information obtained from the spectra, strongly increases when the magnetic field strength is raised from 250 to 500 MHz, whereas the effect of a further increase from 500 to 800 MHz appeared less strong as far as group discrimination is concerned [48]. However, analysis with a high-field spectrometer can be the preferred choice for the detection and identification of new, low-concentration metabolites and can therefore contribute to a better understanding of the underlying disturbed biochemical pathways of disease [47]. A drawback is the high cost of high-field spectrometers, which rises steeply with the magnetic field strength. By comparison, the cost of a 400 MHz spectrometer is on the order of €300,000, while a 900 MHz spectrometer can cost as much as €2,750,000. The need for a supplementary cryoprobe can raise these estimated amounts by a further €200,000 [47]. In addition, such instruments require an isolated building to house them, which is less practical in a clinical setting. Taking everything into account, medium-field (400-600 MHz) spectrometers will probably become the preferred instruments for future application in clinical metabolomics.
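The paired field strengths and frequencies quoted above follow from the Larmor relation, in which the ¹H resonance frequency is proportional to the magnetic field strength. The sketch below uses the proton gyromagnetic ratio divided by 2π, approximately 42.577 MHz per Tesla; this constant and the function name are not taken from the text.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per Tesla (approximate).
PROTON_GAMMA_BAR_MHZ_PER_T = 42.577


def proton_frequency_mhz(b0_tesla):
    """1H resonance frequency in MHz at a magnetic field strength in Tesla."""
    return PROTON_GAMMA_BAR_MHZ_PER_T * b0_tesla


for b0 in (9.4, 21.1):
    print(f"{b0} T -> {proton_frequency_mhz(b0):.0f} MHz")
```

The computed values lie close to the nominal 400 MHz and 900 MHz labels, which are rounded instrument designations.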

Precision medicine
The contribution of metabolic phenotyping toward the clinical environment, often referred to as pharmacometabolomics, can encompass the entire patient journey, starting from an improved screening selection and earlier diagnosis to a follow-up for treatment response prediction and an enhanced personalized choice of therapy [49]. Despite several challenges that accompany the implementation of such an innovative technique, for example, biomarker validation and cost-effectiveness [49,50], the authors are highly convinced that metabolism-based biomarkers carry the potential to contribute significantly to future standard clinical practice. For lung cancer, metabolic phenotyping by means of ¹H-NMR can further be useful ahead of low-dose computed tomography (LDCT) scanning, as a tool to deliver additional and complementary risk factors for a better selection of high-risk individuals. Currently, the selection of those individuals is primarily based on age and smoking status or history [51]. The National Lung Screening Trial found that mortality is significantly reduced when screening with LDCT is performed [52]. Although the sensitivity of LDCT screening is high and the number of diagnoses in early stages increases, the positive predictive value of LDCT is currently still low [53]. Other drawbacks of LDCT screening are the high rate of false positive results, the high risk of overdiagnosis, and consequently additional radiation exposure due to avoidable diagnostic tests [54]. To meet the rising interest in improving the accuracy of risk prediction, promising, clinically relevant diagnostic biomarkers that can add predictive value to existing models are indispensable [55,56]. Therefore, a noninvasive blood-based screening test complementary to LDCT would be a valuable tool to reduce the number of individuals undergoing unnecessary and sometimes harmful follow-up treatments.
Likewise, in a next phase, identification of prognostic biomarkers could assist in identifying early-stage lung cancer patients who would most likely benefit from current therapies, for example, surgery with curative intent or adjuvant chemotherapy [57].
Next to the discovery of diagnostic and prognostic biomarkers, metabolic profiling is being extensively examined for its use in the prediction of individual therapy response [58][59][60][61]. Personalized treatment will contribute to a reduction of adverse reactions by (i) prediction of the patient's response and (ii) administration of the most efficient drug dose. Moreover, longitudinal monitoring of patients makes it possible to track post-interventional outcome or deviations in response and can therefore assist in paving the way toward long-term personalized health [49].
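Why a screening test with high sensitivity can still have a low positive predictive value follows from Bayes' rule: when the disease is rare in the screened population, false positives outnumber true positives. The sketch below illustrates this dependence; the sensitivity, specificity, and prevalence values are purely hypothetical and are not figures for LDCT or for the NMR classifier.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test characteristics, evaluated at two hypothetical prevalences.
ppv_general = positive_predictive_value(0.90, 0.75, prevalence=0.01)
ppv_high_risk = positive_predictive_value(0.90, 0.75, prevalence=0.05)
print(f"{ppv_general:.3f} {ppv_high_risk:.3f}")
```

With identical test characteristics, the positive predictive value rises as the prevalence in the screened group rises, which is the rationale for selecting high-risk individuals more precisely before imaging.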

Conclusion
Analysis of metabolic changes in blood plasma by ¹H-NMR spectroscopy allows significant discrimination between lung cancer patients and healthy controls. Additionally, metabolic phenotyping supports detection of lung cancer in all stages and enables differentiation between different cancer types, such as breast and lung cancer. This indicates that a metabolomics approach can actively contribute to lung cancer diagnosis, even in early stages of tumor development. For daily clinical practice, where the main goal is to correctly classify patients, a medium-field (400-600 MHz) NMR spectrometer can provide sufficient discriminative power to perform clinical metabolomics. For research purposes, on the other hand, where disease-related disturbed pathways deserve a more extensive analysis, high-field NMR (e.g., 900 MHz) spectra are preferred. The ability of high-field NMR to observe a larger number of metabolites that are represented by a nonoverlapping signal permits a deeper look into the underlying affected metabolic pathways. We show that glucose levels are increased while lactate levels are decreased in the blood plasma of lung cancer patients. These aberrant metabolite concentrations indicate an increased gluconeogenesis as a counteraction of the body to the Warburg effect in the cancer cells. Moreover, the enhanced membrane synthesis of cancer cells is supported by the lowered plasma levels of phospholipids.
Encouraged by all these promising results, the authors strongly believe that ¹H-NMR-based metabolic fingerprinting will become widely implemented in the clinic by serving as (i) an additional screening tool for lung cancer, (ii) a procedure to define complementary risk factors for current risk models toward an improved selection of individuals eligible for LDCT, and (iii) an innovative method to better characterize lung cancer patients in order to provide them with the best treatment strategies available.