Raman Spectroscopy for In Vivo Medical Diagnosis

Raman spectroscopy is a noninvasive optical technique that can be used as an aid in diagnosing certain diseases and as an alternative to more invasive diagnostic techniques such as the biopsy. Because of these characteristics, Raman spectroscopy is also known as an optical biopsy technique. The success of Raman spectroscopy in biomedical applications rests on the fact that the molecular composition of healthy tissue differs from that of diseased tissue; in addition, several disease biomarkers can be identified in Raman spectra and used to diagnose certain medical conditions or monitor their progress. This chapter gives an overview of the use of Raman spectroscopy for in vivo medical diagnostics and demonstrates the potential of this technique to address biomedical issues related to human health.


Introduction
Raman spectroscopy is based on the inelastic scattering of photons, also known as the Raman effect, discovered by C. V. Raman in 1928 [1]. When a sample is illuminated with a light source, the incoming photons are absorbed or scattered. If a photon is absorbed, its energy is transferred to the molecules, whereas if a photon is scattered and its energy is conserved, the process is called elastic scattering. However, a small portion of the scattered photons (1 in every 10 billion photons) is scattered inelastically, which means that the photon energy changes slightly. This small energy difference between the incident and the scattered photon is the Raman effect. Raman spectroscopy has several advantages for biomedical applications: it is nondestructive, relatively fast to acquire, and provides information at the molecular level. Additionally, water produces weak Raman scattering, which means that the presence of water in the sample does not interfere with the spectrum being analyzed. The main disadvantages of Raman spectroscopy are the extremely weak Raman signal and the presence of undesirable noise sources such as the intense fluorescence background of biological samples.
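The energy difference described above is usually reported as a Raman shift in wavenumbers (cm⁻¹). The following minimal sketch converts between wavelengths and Raman shift; the function names are hypothetical, and the 785 nm excitation and 1450 cm⁻¹ band are used only as illustrative values.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift (cm^-1) between excitation and scattered wavelengths (nm)."""
    # 1e7 / wavelength_in_nm gives the wavenumber in cm^-1.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) of Stokes-scattered light for a given Raman shift."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)


if __name__ == "__main__":
    # Illustrative values only: 785 nm excitation, 1450 cm^-1 band.
    wavelength = scattered_wavelength_nm(785.0, 1450.0)
    print(f"Stokes line near {wavelength:.1f} nm")
```

A positive shift corresponds to Stokes scattering, in which the scattered photon has a longer wavelength than the excitation light.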

Instrumentation
A Raman spectrometer for in vivo measurements should be an integrated system that provides real-time spectral acquisition and analysis [1]. Such a system includes a light source, optics for light delivery to and collection from the sample, a spectrograph with a detector, and a computer interface. Lasers are used as the excitation source because they deliver enough power to the sample to record Raman spectra in a reasonable integration time. However, the laser power, integration time, and wavelength must be chosen carefully to optimize the Raman system for in vivo biomedical applications. For example, to avoid tissue damage, the maximum permissible exposure (defined by ANSI) and the temperature increase must be considered. The choice of laser power is therefore a compromise between achieving a good signal-to-noise ratio and minimizing tissue damage. In biological tissue, fluorophores can generate signals that mask or overwhelm the weak Raman signal, and multiple approaches have been proposed to avoid this fluorescence background, including excitation in the near infrared (NIR) [2]. Most biological fluorophores have no peak emission in this region of the spectrum, which results in a lower fluorescence background than visible or UV excitation. For these reasons, most Raman spectroscopy systems for skin diagnosis use a 785-nm diode laser as the excitation source, since it is a low-cost light source that generates low fluorescence and can penetrate deep into human tissue. For light delivery and collection, optical fibers are the most widely used method in clinical applications. The design of the Raman fiber probe varies with the clinical application. In the case of Raman spectroscopy of the skin, the probe consists of a single central delivery fiber surrounded by several collection fibers.
The selection of a suitable detection system is an important issue for Raman spectroscopy. The typical Raman detection system used for biomedical applications consists of a spectrograph attached to a cooled charge-coupled device (CCD). Most CCDs use a thermoelectric (TE) system to cool the detector down to −70 °C in order to reduce thermal noise. The spectrograph is coupled to the Raman probe on one side and to the CCD on the other. It is recommended that the spectrograph have a spectral resolution of 8-10 cm⁻¹ in order to provide detailed information on biological Raman bands. The spectral resolution depends on the optical parameters of the spectrograph, the diffraction grating, and the CCD pixel size. A schematic of the typical arrangement of these components is shown in Figure 1.

Data preprocessing
A major issue in biological Raman spectroscopy is the presence of undesirable background contributions from different sources, such as intrinsic fluorescence, noise introduced by the equipment, and noise generated by external sources.

Smoothing and denoising
The main sources of noise in Raman spectra from biological samples are shot noise, fluorescence background, flicker noise, dark current, and thermal noise. Thermal noise and dark signal can be reduced by using a Raman system with a high-quality, thermoelectrically cooled spectrometer. In most Raman spectra, the predominant noise is shot noise, which is associated with the particle nature of light. The approximate shot noise associated with a measurement of n counts is $n^{1/2}$, so the signal-to-noise ratio (S/N) is $n / n^{1/2} = n^{1/2}$ and improves as the number of counts n increases. In other words, because the signal increases proportionally with time, S/N can be improved by increasing the acquisition or averaging time. Several noise removal techniques can be applied to Raman spectra. Smoothing is often employed to remove high-frequency components from Raman spectra, based on the fact that noise appears as high-frequency fluctuations, whereas signals are assumed to be low frequency. One smoothing technique is Fourier filtering [3]. In this technique, the higher frequency fluctuations, which are considered to be only noise, are removed and the lower frequency ones are used to reconstruct the Raman spectrum without noise. One drawback of this method is that removing the higher frequency components often introduces artifacts and distortion in the Raman spectra. A commonly used smoothing technique is Savitzky-Golay (SG) filtering. The SG filter is a moving window-based local polynomial fitting procedure [4]. As the moving window size increases, some of the Raman bands may disappear. Therefore, it is very important to choose appropriate parameters, such as the polynomial order and the moving window size, to avoid loss of Raman data.
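The moving-window polynomial fit behind SG filtering can be sketched with NumPy alone. This is a minimal illustration, not a reference implementation: the function name, the default window of 11 points, the cubic order, and the reflective edge padding are all assumptions chosen for the example.

```python
import numpy as np


def savgol_smooth(intensity, window=11, order=3):
    """Smooth a spectrum by fitting a local polynomial in a moving window."""
    y = np.asarray(intensity, dtype=float)
    if window % 2 == 0 or window <= order:
        raise ValueError("window must be odd and larger than the polynomial order")
    half = window // 2
    # Reflect the spectrum at both ends so every point has a full window
    # (an assumption; other edge treatments are possible).
    padded = np.pad(y, half, mode="reflect")
    # Least-squares fit of a polynomial to the window positions -half..half.
    # The fitted value at the window center is a fixed linear combination
    # of the samples, given by the first row of the pseudo-inverse.
    positions = np.arange(-half, half + 1)
    design = np.vander(positions, order + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]
    # Reversing the weights turns the convolution into a sliding dot product.
    return np.convolve(padded, weights[::-1], mode="valid")
```

Increasing `window` or lowering `order` gives stronger smoothing, at the risk of flattening narrow Raman bands, which is the trade-off described above.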
Other smoothing methods are locally weighted scatter plot smoothing (LOWESS) [5] and wavelet filtering [6] whereby the spectrum is decomposed using the discrete wavelet transform in order to isolate the noise by localizing it in space and frequency. Once it is isolated, it can be set to zero and the inverse wavelet transform is used to reconstruct the data. In all the mentioned methods, parameters have to be chosen carefully to avoid the important Raman bands being eliminated during smoothing.
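The wavelet procedure described above (decompose, set the noise coefficients to zero, reconstruct) can be illustrated with a hand-written Haar transform. This is only a sketch under stated assumptions: the Haar wavelet, hard thresholding, the number of levels, and the threshold value are choices made for the example and are not prescribed by the cited work.

```python
import numpy as np


def haar_denoise(signal, levels=3, threshold=0.1):
    """Denoise by zeroing small Haar detail coefficients (hard thresholding)."""
    approx = np.asarray(signal, dtype=float).copy()
    if len(approx) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    # Forward transform: split into pairwise averages and differences.
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2)
        approx = (even + odd) / np.sqrt(2)
        # Coefficients below the threshold are treated as noise.
        detail[np.abs(detail) < threshold] = 0.0
        details.append(detail)
    # Inverse transform, from the coarsest level back to the finest.
    for detail in reversed(details):
        restored = np.empty(2 * len(approx))
        restored[0::2] = (approx + detail) / np.sqrt(2)
        restored[1::2] = (approx - detail) / np.sqrt(2)
        approx = restored
    return approx
```

With `threshold=0` the input is reconstructed unchanged; a threshold that is too high removes genuine Raman bands together with the noise.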

Background removal
As mentioned in the last section, one noise source in biological Raman spectra is the fluorescence background. This intrinsic fluorescence emission is several orders of magnitude greater than the Raman scattering intensity of biological tissues; therefore, fluorescence appears as a strong band that obscures Raman signals and must be removed before the Raman spectra can be analyzed. Background elimination has been performed using two approaches: experimental and computational. The experimental methods involve changes in the instrumentation and include shifted excitation [7], photobleaching [8], and time gating [9]. The drawbacks of these methods are the relatively complex instrumentation, the long acquisition times, and alterations in the sample that could make the analysis of biological samples difficult. Computational background removal, on the other hand, is easy to implement, inexpensive, and fast. Such methods include polynomial fitting [10][11][12], Fourier transform [13], wavelet transform [13], first- and second-order differentiation [14], multiplicative signal correction [15], linear programming [16], geometric approach [17], asymmetric least squares [18], methods based on iterative reweighted quantile regression [19], iterative exponential smoothing [20], and morphology operators [21,22]. However, the most used method is polynomial fitting because of its simplicity. In this method, a polynomial is fitted to the Raman spectrum and subsequently subtracted from it to eliminate background effects. The selection of the polynomial order is extremely important, because a polynomial of too high an order may fit Raman bands as if they were background and may be affected by high-frequency noise. To solve this issue, some modified polynomial fitting methods have been proposed. For example, the algorithm proposed by Zhao et al. [11], also known as the Vancouver Raman algorithm (VRA), is widely used for baseline correction in biomedical applications because of its effectiveness and simplicity. The main advantage of this method is that it accounts for noise effects and the Raman signal contribution. Figure 2 shows the Raman spectra of in vivo mouse skin tissue with and without fluorescence removal using the polynomial fitting method.
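The idea of fitting and subtracting a polynomial background can be sketched as follows. This is a simplified iterative variant written for illustration, in which points lying above the current fit are clipped to it before refitting; it is not the Vancouver Raman algorithm itself, which additionally accounts for noise. The function name, polynomial order, and stopping tolerance are assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly


def iterative_polyfit_baseline(shift, intensity, order=5, max_iter=100, tol=1e-3):
    """Estimate a smooth fluorescence baseline by repeated polynomial fitting."""
    x = np.asarray(shift, dtype=float)
    # Rescale the axis to [-1, 1] so the least-squares fit stays well conditioned.
    scaled = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    work = np.asarray(intensity, dtype=float).copy()
    fit = work
    for _ in range(max_iter):
        coeffs = npoly.polyfit(scaled, work, order)
        fit = npoly.polyval(scaled, coeffs)
        # Points above the fit are assumed to belong to Raman bands:
        # clip them to the fit so they pull the next fit up less.
        clipped = np.minimum(work, fit)
        converged = np.linalg.norm(clipped - work) <= tol * np.linalg.norm(work)
        work = clipped
        if converged:
            break
    return fit
```

The corrected spectrum is then `intensity - iterative_polyfit_baseline(shift, intensity)`. As noted above, an order that is too high lets the polynomial follow the Raman bands themselves.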

Normalization
Raman spectra from the same sample can have different intensity levels if they were acquired at different times or under different experimental conditions, such as different laser power levels. Normalization compensates for these differences so that the intensity of a given Raman band of the same material is the same, or as similar as possible, in all the spectra. One approach normalizes the spectrum as a whole. In vector normalization, the intensity at each frequency in the spectrum is divided by the square root of the sum of the squares of all intensities; in normalization to area, it is divided by the total area under the spectrum, so that the area of the normalized spectrum is 1.0. These normalizations are useful when the spectra do not share a common band. They have the advantage of not depending on any single band, but one disadvantage is that the background can contribute to the normalization [1]. Another approach is peak normalization, which uses the intensity at the central frequency of a particular Raman band as a reference (internal or external). The 1660 cm⁻¹ (amide I) and the 1450 cm⁻¹ (C-H vibrations) bands are commonly used as references because their intensities are not significantly affected by other changes in the sample [23]. This method assumes that the reference does not change from one spectrum to another and is therefore not suitable when the nature of the samples could lead to a shift in the band position.
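The normalization schemes discussed above can be sketched as three small functions: scaling by the square root of the sum of squares, scaling to unit area, and scaling to a reference band. The function names are hypothetical, and the 1450 cm⁻¹ default is taken from the text only as an example reference band.

```python
import numpy as np


def vector_normalize(intensity):
    """Divide by the square root of the sum of squared intensities."""
    y = np.asarray(intensity, dtype=float)
    return y / np.sqrt(np.sum(y ** 2))


def area_normalize(shift, intensity):
    """Divide by the area under the spectrum (trapezoidal rule)."""
    x = np.asarray(shift, dtype=float)
    y = np.asarray(intensity, dtype=float)
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return y / area


def peak_normalize(shift, intensity, reference_shift=1450.0):
    """Divide by the intensity at the sample closest to a reference band."""
    x = np.asarray(shift, dtype=float)
    y = np.asarray(intensity, dtype=float)
    index = int(np.argmin(np.abs(x - reference_shift)))
    return y / y[index]
```

All three remove a global intensity factor, such as a change in laser power, but only peak normalization depends on the reference band staying at a fixed position.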

Chemometrics
Chemometrics uses mathematical and statistical methods to extract chemical and physical information from chemical data or, for the subject under consideration, spectroscopic data. To identify components in a sample, one possibility is to use individual bands, but this is not the best option because a single band is not specific to one molecule: many molecules have a band at the same position. A more precise identification uses multiple bands or the complete spectrum. This approach treats each point in a spectrum as a variable, so that spectroscopic data can be arranged as a matrix in which the columns represent the variables (Raman shift or wavenumber) and the rows represent the observations (Raman spectra). To analyze data with more than one variable, multivariate data analysis is used. Many multivariate data analysis techniques are available, and their correct use depends on the objective of the analysis. The objective can be data description or exploratory analysis, discrimination, classification, clustering, regression, or prediction. The data analysis methods can also be divided into unsupervised and supervised methods. Unsupervised methods are used when no a priori knowledge is available; they are very useful for finding hidden structures in unlabeled data and are sometimes used as a first step before supervised methods. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) are examples of unsupervised methods. Supervised methods, on the other hand, need a priori information such as class labels; the analysis uses a training data set to find the patterns in the data and then validates the model with a test set. One example of a supervised method is partial least squares (PLS).

Principal component analysis (PCA)
Principal component analysis (PCA) is an unsupervised method often used to reduce the number of variables [24] and for exploratory analysis of data. PCA is based on the decomposition of the covariance matrix of the spectra matrix into eigenvectors and eigenvalues. The eigenvectors (or principal components) are orthogonal axes in the n-dimensional variable space and are ordered by decreasing value of the associated eigenvalue. This means that the principal components are uncorrelated with each other, as opposed to the original variables, which may be correlated. Their decreasing order also means that the first principal component explains the maximum amount of variance of the original data, the second one explains more variance than the third, and so on. The original data can be considered as an M×N matrix of M spectra sampled at N wavenumbers. Applied to this matrix, PCA yields three results: N principal components, an N×N matrix containing the coefficients for the transformation between the original data and the principal components, and N eigenvalues describing the importance of the corresponding principal components. The original M experimental spectra are thus expressed in terms of a new set of N 'synthetic' spectra called principal components. In summary, one advantage of PCA is that, by evaluating the relative importance of the consecutive principal components, it is possible to reduce the dimension of the original dataset by finding a smaller collection of variables that explain the highest amount of variance. Additionally, because changes in the Raman signal are uncorrelated with the noise in the spectra, the random noise and the significant spectral changes are separated into different principal components. Therefore, many principal components can be discarded, removing noise without losing useful information from the Raman signal.
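The decomposition described above can be sketched with NumPy. The sketch uses the singular value decomposition of the mean-centered spectra matrix, which yields the same principal components as the eigen-decomposition of the covariance matrix; the function name and return values are assumptions made for the example.

```python
import numpy as np


def pca(spectra, n_components):
    """PCA of an M x N spectra matrix (rows = spectra, columns = wavenumbers)."""
    data = np.asarray(spectra, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal components (loadings), already sorted
    # by decreasing singular value.
    _, singular, vt = np.linalg.svd(centered, full_matrices=False)
    # Eigenvalues of the covariance matrix follow from the singular values.
    eigenvalues = singular ** 2 / (data.shape[0] - 1)
    explained = eigenvalues / eigenvalues.sum()
    loadings = vt[:n_components]
    # Scores are the coordinates of each spectrum along the kept components.
    scores = centered @ loadings.T
    return scores, loadings, explained[:n_components], mean
```

Plotting the first two columns of `scores` against each other gives a scores plot, and `scores @ loadings + mean` reconstructs the spectra from the retained components only, which is how discarding components removes noise.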

Partial least squares (PLS)
PLS is one of the multivariate data analysis techniques most widely used with vibrational spectroscopy to estimate and quantify components in a sample [25]. Because it is a supervised method, the concentrations of the constituents in the calibration samples must be known. As with PCA, the noise observed in the spectra is isolated into separate latent variables (LVs), which are left out of the calibration, improving prediction precision. Nonlinear relationships between the properties of interest and the intensity can be accommodated in a PLS model by including multiple LVs.
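As an illustration of how latent variables are built from calibration spectra and known concentrations, the following is a minimal sketch of PLS regression for a single property (often called PLS1), written with the NIPALS scheme. It is an assumption-laden teaching example, not the procedure of any cited study; the function names and the model tuple are hypothetical.

```python
import numpy as np


def pls1_fit(spectra, concentration, n_components):
    """Fit a one-response PLS model; returns (coefficients, x_mean, y_mean)."""
    X = np.asarray(spectra, dtype=float)
    y = np.asarray(concentration, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # spectral loadings
    q = np.zeros(n_components)                # concentration loadings
    for a in range(n_components):
        # Direction in spectral space with maximal covariance with y.
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                            # scores of this latent variable
        tt = t @ t
        p = Xr.T @ t / tt
        q[a] = yr @ t / tt
        # Deflate: remove what this latent variable already explains.
        Xr = Xr - np.outer(t, p)
        yr = yr - q[a] * t
        W[:, a], P[:, a] = w, p
    coefficients = W @ np.linalg.solve(P.T @ W, q)
    return coefficients, x_mean, y_mean


def pls1_predict(spectra, model):
    """Predict the property for new spectra from a fitted model."""
    coefficients, x_mean, y_mean = model
    return (np.asarray(spectra, dtype=float) - x_mean) @ coefficients + y_mean
```

In practice the number of latent variables would be chosen by validation on a test set; keeping too many reintroduces the noise that the discarded LVs were meant to isolate.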

Classification and clustering models
Several data analysis methods focus on finding differences between spectra so that groups of spectra can be identified and classified. The most common methods used in biomedical Raman spectroscopy are k-nearest neighbors (KNN), hierarchical cluster analysis (HCA), artificial neural networks (ANN), discriminant analysis (DA), and support vector machines (SVM). The KNN method compares all spectra in the dataset using a similarity metric between spectra, such as the Euclidean distance. This method has been used in combination with PCA and Raman spectroscopy for the diagnosis of colon cancer [26]. HCA uses a variety of multivariate distance calculations, such as Euclidean and Mahalanobis metrics, to identify similar spectra and is one of the methods used in Raman and IR imaging [27]. Similarly, artificial neural networks can be used to identify clusters or to find patterns in complex data. ANNs are computational models inspired by the functionality and structure of the central nervous system; the networks consist of an interconnected group of nodes or neurons, which have different functions such as data input, output, storage, or forwarding. The layout of an ANN is defined by the number of layers and the number of neurons per layer. The use of ANN in the data analysis of blood serum Raman spectra allows for the differentiation between patients with Alzheimer's disease, patients with other types of dementia, and healthy individuals [28]. DA is a supervised data analysis technique that requires a priori knowledge of the group membership of each sample. DA computes a set of discriminant functions based on linear combinations of variables that maximize the variance between groups and minimize the variance within groups according to Fisher's criterion. It is sometimes very useful to combine PCA with linear discriminant analysis (LDA), in the so-called PC-LDA model, which improves the efficiency of classification because it automatically finds the most diagnostically significant features [29][30][31].
SVMs are kernel-based algorithms that transform data into a high-dimensional space and construct a hyperplane that maximizes the distance to the nearest data point of any of the input classes. Raman spectroscopy and SVM have been used as methods for cancer screening [32].
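Of the classifiers listed above, KNN is the simplest to sketch. The example below assigns a spectrum (or its PCA scores) to the majority class among its k nearest training samples using the Euclidean distance; the function name and the choice of k = 3 are assumptions made for illustration.

```python
from collections import Counter

import numpy as np


def knn_predict(train_features, train_labels, query, k=3):
    """Classify one sample by majority vote among its k nearest neighbors."""
    features = np.asarray(train_features, dtype=float)
    # Euclidean distance from the query to every training sample.
    distances = np.linalg.norm(features - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Because whole spectra have many correlated variables, the features passed to such a classifier are often the first few PCA scores rather than the raw intensities, as in the PCA and KNN combination mentioned above.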

Applications
The importance of in vivo Raman spectroscopy lies in its large number of potential biomedical applications. One application is in vivo noninvasive diagnosis, and most research papers focus on cancer and skin diagnosis. This section gives a broad overview of applications in cancer and skin diagnosis, with a focus on developments over the past 5 years.

Cancer diagnosis
Cancer is one of the most common clinical targets under investigation with Raman spectroscopy, because biological samples can be measured in vivo, in a minimally invasive way, and without labeling. One important step that enabled in vivo measurements of cancer in hollow organs was the development of fiber-optic Raman probes that can be used during endoscopy [33].

Lung cancer
Short et al. designed a Raman probe for in vivo detection of lung cancer during autofluorescence bronchoscopy [34], and they demonstrated the potential of Raman for in vivo diagnosis of lung cancer by reducing the false positives of autofluorescence bronchoscopy [35].

Gastrointestinal cancer
In 2014, Bergholt et al. [36] performed an in vivo diagnostic trial to classify dysplasia in Barrett's esophagus (BE). They reported a diagnostic sensitivity of 87.0% and a specificity of 84.7%, which demonstrates that real-time Raman spectroscopy can be performed prospectively in vivo when screening patients with suspicious BE. In a study conducted on mice with colon cancer, Taketani et al. [37] identified alterations in the molecular composition of lipids and collagen type I as the tumor advanced. The tumor lesion was discriminated from normal tissues of the control mouse with an accuracy of 86.8%. Stomach cancer diagnosis has been another application of biomedical Raman spectroscopy [38]. Bergholt et al. have also reported a statistically robust study in which 450 patients underwent Raman endoscopy for identifying gastric precancer based on PLS-DA [39]. The same group used in vivo Raman spectroscopy to characterize the properties of normal colorectal tissues and to assess distinctive biomolecular variations of different anatomical locations in the colorectum for cancer diagnosis. They concluded that the interanatomical Raman spectral variability of normal colorectal tissue is subtle compared to that of cancer tissue. Their PLS-DA model provided a diagnostic accuracy of 88.8%, a sensitivity of 93.9%, and a specificity of 88.3% for colorectal cancer detection [40].
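The figures of merit quoted in these studies (sensitivity, specificity, and accuracy) follow from comparing the classifier output with the reference diagnosis. A minimal sketch, with a hypothetical function name and boolean labels where True means diseased, is:

```python
def diagnostic_metrics(truth, predicted):
    """Sensitivity, specificity and accuracy from boolean labels (True = diseased)."""
    tp = fp = tn = fn = 0
    for actual, guess in zip(truth, predicted):
        if actual and guess:
            tp += 1      # diseased, correctly detected
        elif actual and not guess:
            fn += 1      # diseased, missed
        elif not actual and guess:
            fp += 1      # healthy, flagged as diseased
        else:
            tn += 1      # healthy, correctly cleared
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```

Sensitivity is the fraction of diseased cases detected and specificity the fraction of healthy cases correctly cleared; accuracy mixes both and therefore depends on the proportion of diseased cases in the study.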

Oral cancer
In a study conducted by Guze et al. [41], Raman spectra of oral diseases from 18 patients were classified as benign or malignant using PCA-LDA, and the method provided 100% specificity with 77% sensitivity. Murali Krishna et al. reported the potential of Raman spectroscopy to identify early changes in oral mucosa and the efficacy of this approach in oral cancer applications [42]. Comparing noncancer locations in smoking and nonsmoking populations yielded prediction accuracies from 75 to 98%. Another group reported the discrimination of normal oral tissue from different lesion categories with accuracies ranging from 82 to 89% [43]. The diseased tissues included oral squamous cell carcinoma (OSCC), oral submucosa fibrosis (OSMF), and oral leukoplakia (OLK); the diagnostic accuracies for the three diseased groups and the normal group were 89, 85, 82, and 85%, respectively. Recently, Lin et al. [44] reported the utility of fiber-optic-based Raman spectroscopy for real-time in vivo diagnosis of nasopharyngeal carcinoma (NPC) at endoscopy. A total of 3731 in vivo Raman spectra were acquired in real time from 95 subjects. The Raman spectra differed significantly between normal and cancerous nasopharyngeal tissues. Using PCA-LDA, their method provided a diagnostic accuracy of 93.1% (sensitivity of 93.6%; specificity of 92.6%) for nasopharyngeal cancer identification.

Skin cancer
A clinical study of 453 patients investigating different types of skin cancer was published in 2012 by Lui et al. [45]. The instrument used by the authors allowed an acquisition time of approximately 1 s, and the software preprocessed the spectra immediately, which made it possible to investigate skin lesions in real time. Benign and malignant skin lesions including melanomas, basal cell carcinomas, squamous cell carcinomas, actinic keratoses, atypical nevi, melanocytic nevi, blue nevi, and seborrheic keratoses were investigated and discriminated by multivariate analysis tools with sensitivities between 95 and 99%. Lim et al. determined the diagnostic capability of a multimodal spectral approach for in vivo noninvasive diagnosis of melanoma and nonmelanoma skin cancers [46]. They acquired reflectance, fluorescence, and Raman spectra from 137 lesions in 76 patients using optical fiber-based systems. They obtained the best classification of nonmelanoma skin cancers with the multimodal approach, whereas the best melanoma classification was obtained with Raman spectroscopy alone. A Raman probe to detect invasive brain cancer in situ and in real time in patients was developed by Jermyn et al. [47]. They demonstrated that Raman spectroscopy can accurately detect grade 2-4 gliomas in vivo during human brain cancer surgery and that it was possible to differentiate between brain invaded by cancer cells and normal brain, with sensitivity and specificity greater than 90%. Additionally, this approach can classify in real time, making it an invaluable tool for surgical procedures and decision making.

Atopic dermatitis
Several published works have used Raman spectroscopy to analyze the molecular composition of skin and correlate it with a history of atopic dermatitis (AD) and filaggrin gene (FLG) mutations; Kezic et al. measured natural moisturizing factors (NMFs) noninvasively on the skin of 137 Irish children with a history of moderate to severe AD [48]. González et al. detected the presence of the protein filaggrin in the skin of newborns using Raman spectroscopy and PCA as an early detection procedure for filaggrin-related AD [49]. To detect the presence of filaggrin in the Raman spectra, the coefficients of the principal components were calculated for each of the skin spectra from newborns. The first and second principal components accounted for 93.86% of all the explained variance of the original data. Figure 3 shows a graph of these two principal components, also known as a scores plot. In the figure, the gray solid circles correspond to the infants who developed AD; the rest of the subjects are grouped together around the location of the filaggrin spectrum, represented as a black solid circle. The geometrical distance of each Raman spectrum to the spectrum of filaggrin in the principal component plane indicates the amount of filaggrin in the subject. Smaller distances indicate a higher amount of filaggrin, and larger distances indicate a lower amount of filaggrin or a filaggrin with a different molecular structure than the molecule taken as the reference spectrum.
This result indicates that this approach can be used to identify the persons who are more susceptible to developing AD, making it possible to use this technique as a method for early detection of AD. González et al. validated the use of Raman spectroscopy as a noninvasive tool to detect filaggrin gene mutations [50]. In this study, the amount of filaggrin was estimated by computing the correlation between the pure filaggrin Raman spectrum and the skin spectra obtained from Mexican patients with AD; the genetic analysis showed that 8 out of the 19 patients (42%) presented an FLG mutation. These 8 patients presented the 2282del4 FLG mutation, 2 of which (10.5%) were homozygous and 6 (31.5%) heterozygous, whereas 1 (5.2%) was a compound heterozygote for the 2282del4 and the R501X mutations. These genetic results were compared to the estimated filaggrin amount; a lower correlation of a skin spectrum with the filaggrin spectrum indicates a lower filaggrin concentration. Figure 4 shows the results of the correlation for the patients with an FLG mutation (FLG −) and without an FLG mutation (FLG +). The patients with an FLG mutation presented an average correlation of 0.286, while the patients without an FLG mutation showed an average correlation of 0.4. Their results show that the correlation of the filaggrin Raman spectrum with the Raman spectra of skin can be an indicator of filaggrin gene mutations. In another work, Baclig et al. used a genetic algorithm to demonstrate that strongly reduced Raman spectral information is sufficient for clinical diagnosis of atopic dermatitis [51].
Figure 3. Plot of the two first principal components of the Raman spectra for each newborn (white circles) and the Raman spectrum of filaggrin (black circle). The infants identified with the numbers 1, 9, and 11 developed AD (gray circles) [49].
Figure 4. Correlation between the filaggrin Raman spectrum and the skin spectrum of subjects with (FLG −) and without (FLG +) filaggrin gene mutations [50].
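The correlation between a reference spectrum and a skin spectrum, used above as an estimate of the filaggrin amount, can be sketched as a Pearson correlation coefficient. Whether the cited study used exactly this coefficient is an assumption; the function name is hypothetical, and both spectra are assumed to be sampled on the same Raman shift axis.

```python
import numpy as np


def spectral_correlation(reference, spectrum):
    """Pearson correlation between two spectra sampled on the same axis."""
    reference = np.asarray(reference, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    if reference.shape != spectrum.shape:
        raise ValueError("spectra must have the same number of points")
    # corrcoef returns the 2 x 2 correlation matrix; take the off-diagonal term.
    return float(np.corrcoef(reference, spectrum)[0, 1])
```

The coefficient is insensitive to a constant offset and to an overall scale factor, so it compares the shape of the spectra rather than their absolute intensity.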

Skin aging
Tfayli et al. reported slight variability in skin lipids upon aging [52]. The Raman spectral features of the skin lipids shifted in lateral packing with increasing age of the volunteers. González et al. differentiated between chronological aging and photoinduced skin damage by PCA of in vivo Raman spectra from sun-protected and sun-exposed skin [53].

Nickel allergy
Alda et al. [54] detected biochemical differences in the structure of the skin of subjects with nickel allergy compared with healthy subjects. The Raman spectral differences between groups were classified using PCA.

Melasma
Moncada et al. [55] used Raman spectroscopy in melasma patients treated with a triple combination cream (tretinoin, fluocinolone, and hydroquinone) and found that the Raman skin spectra of the melasma patients showed differences in the peaks associated with melanin at 1352 and 1580 cm⁻¹ (Figure 5). The Raman skin spectrum of patients who did not respond to treatment (Figure 5B) showed peaks that are not well defined, which is consistent with molecule degradation and protein breakdown. These results are consistent with those reported previously by González et al. [56].

Other in vivo applications: UV/Vis Raman, Raman imaging, and SERS
In most in vivo Raman applications, near infrared (NIR) excitation sources are preferred. NIR wavelengths in the range of 780-1100 nm result in a lower fluorescence background in the tissue and simplify the analysis of the Raman bands in comparison to visible or UV excitation. Visible excitation sources have been used in various biomedical Raman applications [57]. However, visible wavelengths have several disadvantages for in vivo biomedical Raman applications, such as reduced penetration depth, autofluorescence, and heat generation. UV radiation is not used for in vivo measurements because of its mutagenicity. In Raman imaging [58], a laser spot scans the sample area and a Raman spectrum is acquired at every set point. The intensity of a specific Raman band or bands is used to build an image of cells and tissues. The Raman spectra can also be discriminated by chemometric analysis, and the result is an image of the sample that contains chemical information, also known as a Raman chemical image. Other methods of Raman imaging include coherent anti-Stokes Raman spectroscopy (CARS) and stimulated Raman scattering. These methods have been applied to study biochemical interactions in cells and tissues. However, the in vivo applications have been limited to animal models [59][60][61]. Raman imaging has the disadvantage that long integration times are needed, which limits its use for in vivo measurements in humans. Surface-enhanced Raman spectroscopy (SERS) has been used for in vivo biomedical applications to detect biomarkers in animal models [62][63][64]. However, the toxicity issues related to the nanoparticles used for SERS make this method infeasible for in vivo Raman measurements of human tissue.
Another alternative for in vivo biomedical applications is to combine Raman spectroscopy with other optical methods. For example, Raman spectroscopy has been combined with optical coherence tomography [65,66], confocal reflectance microscopy [67,68], diffuse reflectance, and fluorescence spectroscopy [46,69]. The disadvantages of the multimodal approach are the higher cost and complexity of the system needed to perform the measurements. However, compared with Raman spectroscopy alone, the multimodal approach has the advantage of providing complementary and more detailed information about the disease and a more accurate diagnosis in terms of both sensitivity and specificity.

Limitations
Among the disadvantages of Raman spectroscopy for biomedical applications is the weakness of the Raman effect, which is often accompanied by a stronger background signal, particularly in biological samples. Removing the background experimentally requires changes in instrumentation, which means high-complexity and high-cost systems. One alternative is the use of algorithm-based methods for fluorescence background removal. However, these methods cannot deal with all types of fluorescence without user intervention to adjust the algorithm parameters. Additionally, the complexity of the fitting algorithms makes them difficult for nonexperts to use. Another limitation is that not all molecules are Raman active, which means that some molecules do not give a Raman signal. The potential for damaging the sample through laser exposure, which depends on the excitation wavelength, has to be taken into account for in vivo measurements. To solve this problem, lower energy excitation sources in the NIR range are preferred. Demonstrating the safety of these devices to regulatory agencies is a very important step for clinical implementation. For in vivo diagnosis applications, larger studies are needed in order to test the reliability of the results. To date, only a small number of studies involving a sufficient number of patients have been reported. The lack of standardized and reliable methods for data analysis is an important limitation. Thus, standardization of measurement procedures, instrument calibration, processing, and evaluation of data is needed. The information provided by Raman spectra must also be displayed in a simple, user-friendly format that includes clinically relevant information for diagnosis.
Figure 5. Raman skin spectra of all melasma patients, grouped by patients who responded to the treatment (A) and patients who did not respond to treatment (B). For each group, the central solid line corresponds to the mean of the spectra of each group, and the gray shadow around this line represents the standard deviation [55].

Conclusions and outlook
From the applications described in this chapter, it is clear that Raman spectroscopy has great potential for in vivo measurements and identification of disease markers, which would make this technique a viable option for noninvasive medical diagnosis. Among the advantages of using Raman spectroscopy as a noninvasive tool for medical diagnosis is the fact that water is a weak Raman scatterer; therefore, it does not interfere with measurements. Also, the technique is noninvasive and fast and gives specific information about the structure and biochemical composition of samples, making it a viable option to identify molecules that are associated with disease.
Raman spectroscopy is likely to become a key player in in vivo and noninvasive medical diagnosis; however, to become a useful and reliable technique, it must be used along with signal processing methods and chemometrics in order to automate and increase the reliability of the measurements and the identification of the molecules of interest.
An area of development that would accelerate the use of Raman spectroscopy in a clinical environment is the design of low-cost and portable Raman spectrometers, which would make their use more appealing to the medical community. Research in this area could also lead to an integrated optics Raman spectrometer, which would make this technique usable in wearable health devices and for monitoring health parameters in a clinical environment.
It is the authors' belief that the combination of optimized instrumentation, standardized measurement procedures, preprocessing, and data analysis will allow Raman spectroscopy to become a powerful tool for disease diagnostics and a common clinical tool in a hospital environment.