Open access peer-reviewed chapter

Artificial Neural Networks (ANNs) for Spectral Interference Correction Using a Large-Size Spectrometer and ANN-Based Deep Learning for a Miniature One

By Z. Li, X. Zhang, G. A. Mohua and Vassili Karanassios

Submitted: May 2nd 2017. Reviewed: September 18th 2017. Published: December 20th 2017.

DOI: 10.5772/intechopen.71039

Abstract

Artificial neural networks (ANNs) are evaluated for spectral interference correction using simulated and experimentally obtained spectral scans. Using the same data set (where possible), the predictive ability of shallow depth ANNs was validated against partial least squares (PLS, a traditional chemometrics method). Spectral interference (in the form of overlaps between spectral lines) is a key problem in large-size, long focal length inductively coupled plasma-optical emission spectrometry (ICP-OES). Unless corrected, spectral interference can be sufficiently severe to the point of preventing precise and accurate analytical determinations. In miniaturized, microplasma-based optical emission spectrometry with a portable, short focal length spectrometer (having poorer resolution than its large-size counterpart), spectral interference becomes even more severe. To correct it, we are evaluating use of deep learning ANNs. Details are provided in this chapter.

Keywords

  • artificial neural networks (ANNs)
  • artificial intelligence
  • machine learning
  • deep learning
  • spectral interference
  • PLS
  • ICP
  • microplasma
  • portable optical emission spectrometry

1. Introduction

According to Webster’s dictionary, a neural network is defined as “a computer architecture in which a number of processors are interconnected in a manner suggestive of the connections between neurons in a human brain and which is able to learn by a process of trial and error”. The “processors” may be individual computers, but they do not have to be. Typically, such “processors” reside in the same computer and most frequently are referred to as “neurons”. Also according to Webster’s dictionary, artificial is something “made or produced by human beings rather than occurring naturally, typically as a copy of something natural”. In this chapter, spectral interference correction in optical emission spectrometry (OES) using artificial neural networks (ANNs) will be discussed.

Spectral interference (a long-standing problem in optical atomic spectrometry [1, 2, 3, 4]) arises when a spectral line emitted from an analyte (defined as the chemical species of interest) in an analytical sample (i.e., one to be used in chemical analysis) is overlapped with (or interfered by) a spectral line emitted from another chemical constituent that is also present in the same sample or in the sample’s matrix (defined as whatever an analyte is in). An example is Zinc (Zn) as an analyte (A) in an iron-rich soil matrix. In a widely used, 6000–10,000 K hot, inductively coupled plasma (ICP), iron (Fe) can emit as many as 4000 spectral lines, whereas Zn has fewer than 10 sensitive (i.e., analytically useful) lines, thus making interference from an Fe line (e.g., from the soil matrix) on a Zn line likely. An additional example will be discussed in conjunction with the simulated spectral windows shown in Figure 1.

Figure 1.

Spectral interferences in a simulated 100 pico-meter (pm)-wide spectral window (between 213.806 and 213.906 nm) centered on the analyte zinc (Zn) 213.856 nm spectral line.

Assuming that all species involved (per legend of Figure 1) are present in a sample and that they are all at equal concentrations, only Copper (Cu, at 213.853 nm) and Nickel (Ni, at 213.856 nm) are shown to overlap with the Zn spectral line (at 213.856 nm). Per legend, and with all participating species at equal concentrations, the spectral lines of V (Vanadium), W (Tungsten) and Re (Rhenium) are not sufficiently intense to be considered (or even to be graphed). However, if the concentrations of these elements are higher than that of the analyte Zn (e.g., by 10–100 times) rather than equal to it, they will have to be considered. Other elements from the periodic table do not have to be considered even if they are present in the sample or its matrix, because they do not emit spectral lines in the 100 pm-wide spectral window of interest. From this, the significance of spectral interference begins to emerge.

A more dramatic (and yet very realistic) example will be drawn from the practice of elemental analysis by optical emission spectrometry and will be discussed in conjunction with the data shown in Figure 2. In this example, Arsenic (As) is the Analyte (or simply A) and Cadmium (Cd) is the Interferent (or simply I). It is worth noting that water contaminated by As is a key problem affecting the health and well-being of more than 100 million inhabitants worldwide, because As occurs naturally in the soil surrounding water wells used as a drinking water supply in many parts of the world.

Figure 2.

Simulation of the spectral response obtained from a spectral scan (solid line) when using an optical emission spectrometer and a mixture containing arsenic (As) as the analyte (A) and cadmium (Cd) as the Interferent (I). C = concentration (regardless of units).

In the example shown in Figure 2, the intensities of the spectral lines of As (Analyte, A) and of Cd (Interferent, I) are about equal, as indicated by the horizontal dashed line crossing the intensity axis at 1 (the intensity axis was normalized to the maximum intensity of the As spectral line, and the intensity of the Cd line was scaled accordingly). To produce a 1:1 intensity ratio of A:I, the concentration ratio of A:I was 1:0.3 (due to the different sensitivities of the spectral lines involved). Furthermore, in actual practice, an analyst would only observe the “added” or measured response shown by the solid line in Figure 2. The individual responses obtained by the simulations (shown using the dashed and the dotted lines in Figure 2) would not be observed; they have been added to facilitate discussion.

In this example, the intensity of the combined spectral response for As and Cd (shown by the solid line in Figure 2, and obtained either by adding the individual responses for As and Cd or by experimentally measuring a sample containing both) is slightly more than 1.2 (indicated by the horizontal dotted line). Clearly, if the Cd interference on As is left uncorrected, the concentration of As will be reported to be higher than it actually is, in this example by about 20%. Errors as large as the one discussed here are unacceptable in analytical determinations because of potential legal or regulatory compliance consequences or health implications (e.g., in clinical analysis for medical diagnostics). In analytical practice (e.g., in commercial chemical analysis laboratories), spectral interference is addressed routinely using a number of methods, as will be briefly outlined below.
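The size of this error can be reproduced with a short numerical sketch. It assumes Gaussian line profiles, a hypothetical line width of 10 pm, and a hypothetical 7.6 pm separation between the two lines; none of these values come from the chapter, and the function name `gaussian_line` is illustrative only. With equal peak intensities, the interferent adds roughly 20% to the response at the center of the analyte line.

```python
import math

def gaussian_line(wavelength_pm, center_pm, peak, fwhm_pm):
    """Gaussian line profile (an assumed shape) evaluated at one wavelength."""
    sigma = fwhm_pm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return peak * math.exp(-0.5 * ((wavelength_pm - center_pm) / sigma) ** 2)

# Hypothetical values, chosen only to mimic the situation in Figure 2.
FWHM_PM = 10.0                       # assumed width of both lines
SEPARATION_PM = 7.6                  # assumed analyte-to-interferent separation
analyte_center = 0.0                 # wavelengths relative to the analyte line
interferent_center = -SEPARATION_PM

# Both lines have a peak intensity of 1 (the 1:1 intensity ratio of the example).
analyte_at_peak = gaussian_line(analyte_center, analyte_center, 1.0, FWHM_PM)
interferent_at_peak = gaussian_line(analyte_center, interferent_center, 1.0, FWHM_PM)

# Only the sum is observable by the analyst.
combined = analyte_at_peak + interferent_at_peak
relative_error_percent = 100.0 * (combined - analyte_at_peak) / analyte_at_peak

print(f"combined response at analyte line: {combined:.2f}")
print(f"overestimate if uncorrected: {relative_error_percent:.0f}%")
```

Changing `SEPARATION_PM` shows how quickly the error grows as the two lines move closer together.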

Several approaches have been used to correct the adverse effects of spectral interference. These traditional methods can be roughly divided into three categories: Chemistry-based approaches (i.e., via a chemical separation of an analyte from its matrix), Physics-based approaches (e.g., through use of high resolution spectrometry) and Mathematics-based (or statistics-based) approaches (e.g., via use of inter element correction factors or through use of Chemometrics). Briefly:

  • Chemistry-based approaches involve removal of either the analyte or the interferent using, for instance, some form of sample processing (e.g., via chemical separation). In general, approaches that require use of additional sample processing and manipulation steps are time-consuming and labor-intensive, thus adding to the overall cost per analysis [1, 2, 3, 4].

  • Physics-based approaches include:

    • Use of high resolution spectrometry [1] to resolve spectral interferences is not feasible in routine analytical laboratories because spectrometers with high resolution (e.g., those with a long focal length) are expensive and they drift.

    • Use of a non-interfered spectral line. This is not always possible, especially if an analyte has few spectral lines useful for analytical determinations and the interferent has many (per Zn and Fe example discussed in the introduction).

  • Mathematics-based (or statistics-based) approaches include:

    • Use of inter element correction (IEC) factors [5], in which the intensity of a spectral line of an interferent is measured at a different wavelength, and a correction factor is applied. Depending on the type of spectrometer used, this is not always possible.

    • Use of chemometrics methods [6, 7, 8, 9, 10, 11, 12, 13, 14, 15], defined as those involving the “application of mathematical or statistical methods for the treatment of chemical data”. Among others, examples of chemometrics approaches include adaptive filtering, factor analysis, orthogonal polynomials or curve fitting techniques.
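As an illustration of the inter element correction idea listed above, the following sketch subtracts an estimated interferent contribution from the signal measured at the analyte line. All intensities are hypothetical, and the helper names `iec_factor` and `iec_correct` are illustrative rather than taken from any instrument software. The factor is assumed to be determined once from a standard that contains only the interferent.

```python
def iec_factor(apparent_at_analyte_line, intensity_at_reference_line):
    """Correction factor from a standard containing only the interferent.

    It is the ratio of the signal the interferent produces at the analyte
    wavelength to the signal it produces at its own interference-free line.
    """
    return apparent_at_analyte_line / intensity_at_reference_line

def iec_correct(measured_at_analyte_line, interferent_at_reference_line, factor):
    """Subtract the estimated interferent contribution from the measured signal."""
    return measured_at_analyte_line - factor * interferent_at_reference_line

# Hypothetical intensities (arbitrary units).
k = iec_factor(apparent_at_analyte_line=50.0, intensity_at_reference_line=1000.0)
net = iec_correct(measured_at_analyte_line=620.0,
                  interferent_at_reference_line=2400.0,
                  factor=k)
print(k, net)
```

The approach needs a second, interference-free line of the interferent to be measurable, which is why it is not always possible on every type of spectrometer.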

Unlike the traditional approaches outlined above, ANNs have been “designed to find relationships in multi-variate data through learning” [16, 17, 18, 19], that is, to “learn by example”. As such, they provide an attractive alternative to the traditional methods mentioned above. Due to their unique capabilities and their significance in this work, a brief background on ANNs will be provided next.

2. Brief background on ANNs

ANNs fall under the umbrella of artificial intelligence (AI), a general term used to describe machines displaying human-like intelligence (presently, only in specific domains). Synonyms used in this field are reflective of the preferences of individual research groups. Examples of synonyms include cognitive computing, computational intelligence, and machine learning in which a machine is trained to use data (e.g., spectra, images, text, or speech) so that it can learn from examples on how to perform a task. These AI methods (no matter what they are called) are contrary to conventional programming paradigms in which explicit program instructions are issued to tell a machine how to perform a task.

ANNs were inspired by biological neurons, and in many respects they mimic them. The ideas behind artificial neurons and their networks date back to the late 1940s and early 1950s, followed by the landmark description of the perceptron [20] (a linear classifier, developed in the late 1950s). The influential paper by Hopfield [21] in 1982 addressed the limitations identified by Minsky and Papert and opened the field of ANNs to application in a diversity of disciplines [20]. To provide a few examples, ANNs have been applied to areas ranging from finance, to engineering, to physics, to chemistry, to geology, and to medicine and pharmacy. As has already been mentioned, a unique advantage of neural networks is their ability to “learn by example”.

In chemical analysis, interest in ANNs began around the mid-1980s. Since then, ANNs have been applied to many chemistry-related areas; limited examples include IR and UV spectra, classification, calibration, nuclear magnetic resonance (NMR), and ion mobility spectrometry (IMS). In our laboratory, we have been applying ANNs for spectral interference correction in analytical atomic spectrometry [22, 23, 24, 25, 26, 27].

A comprehensive description of the theory and practice of ANNs is beyond the scope of this chapter. Briefly [28, 29, 30, 31, 32], an ANN is formed by using many individual artificial neurons. An example of an artificial neuron is shown in Figure 3a. Neurons are typically organized in layers. The weight (that is, the strength of the connection between neurons) is adjustable. Furthermore, each neuron has its own weighted inputs and its own transfer function. The transfer function maps the input of a neuron (or of a layer) to its output; examples include a linear transfer function (Figure 3b), a log-sigmoid function, and a hard limit function.

Figure 3.

(a) Simplified illustration of a neuron and (b) of a linear transfer function. As shown above, the input to a neuron incorporates an adjustable input bias, b . The input x is multiplied by the strength of the weight (w), and the weight is being adjusted during learning. The product w*x is passed through a transfer function (F).
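The single-input neuron of Figure 3 can be written in a few lines. This is a minimal sketch that assumes the conventional form output = F(w*x + b), with the bias b added to the weighted input before the transfer function is applied; the three transfer functions are the ones named in the text, and the input values are arbitrary.

```python
import math

def linear(n):
    """Linear transfer function: the output equals the net input."""
    return n

def log_sigmoid(n):
    """Log-sigmoid transfer function: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def hard_limit(n):
    """Hard limit transfer function: 1 if the net input is >= 0, else 0."""
    return 1.0 if n >= 0.0 else 0.0

def neuron(x, w, b, transfer):
    """Single-input neuron: weighted input plus bias, passed through F."""
    return transfer(w * x + b)

# Arbitrary example values; here the net input w*x + b is exactly 0.
outputs = {
    name: neuron(x=0.5, w=2.0, b=-1.0, transfer=f)
    for name, f in [("linear", linear),
                    ("log-sigmoid", log_sigmoid),
                    ("hard limit", hard_limit)]
}
print(outputs)
```

During learning only `w` and `b` change; the choice of transfer function is fixed as part of the network design.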

In ANNs, learning can be supervised or unsupervised. In supervised learning (used in this work), the network adjusts its internal parameters (e.g., weights) so that a user-specified (i.e., expected) or target value at the output is reached. At each step, the error (defined as the difference between the output provided by the network and the target output provided by the user) is computed. Using a “learning rule”, the weights are adjusted until this error is minimized. The learning rate, which determines the rate (or speed) at which the weights change, is a key parameter. If the learning rate is too high (and as a consequence the learning step is too large), the network may become unstable. Conversely, if the learning rate is too low (due to use of a very small learning step), the network may take too long to converge, or it may get trapped at a local minimum. The challenge is to seek a balance between learning rate and convergence to the lowest possible minimum. To find the minimum, a gradient descent algorithm (i.e., one that uses the derivative of the error, typically obtained with the chain rule for derivatives) with momentum (m) was used. Specifically, a fraction m of the previous weight change is added to the current weight update. This approach allows the search to ignore small ridges in the error surface, thus reducing the possibility of being trapped at a local minimum. Momentum values between 0 and 1 can be used. A backpropagation-of-errors algorithm is often used together with gradient descent, and it was employed here. To generalize, backpropagation strives to find, by gradient descent, a set of weights that minimizes the error between the output of the network and the target output.
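The interplay of learning rate and momentum can be sketched on the smallest possible case: a single linear neuron trained on a handful of points. This is an illustration under stated assumptions, not the MATLAB training routine used in this work. The data, the values `learning_rate=0.1` and `momentum=0.9`, and the function name `train` are all hypothetical. With one neuron the gradient can be written directly; in a multi-layer network, backpropagation obtains the same quantity layer by layer with the chain rule.

```python
def train(xs, ts, learning_rate=0.1, momentum=0.9, epochs=500):
    """Fit t ~ w*x + b by gradient descent with momentum on the mean squared error."""
    w, b = 0.0, 0.0
    dw_prev, db_prev = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error = network output minus target, for every training example.
        errors = [(w * x + b) - t for x, t in zip(xs, ts)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # New step = plain gradient step + a fraction (momentum) of the previous step.
        dw = -learning_rate * grad_w + momentum * dw_prev
        db = -learning_rate * grad_b + momentum * db_prev
        w, b = w + dw, b + db
        dw_prev, db_prev = dw, db
    return w, b

# Hypothetical training set generated from the target relation t = 2x + 0.5.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ts = [2.0 * x + 0.5 for x in xs]
w, b = train(xs, ts)
print(f"w = {w:.4f}, b = {b:.4f}")
```

Setting `momentum=0.0` reduces the update to plain gradient descent, which needs many more epochs to reach the same weights on this example.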

The predictive ability of an ANN depends on the transfer function (Figure 3b), on the learning rule applied and on the network’s architecture. The architecture of the network is defined by the number of neurons in each layer, by the number of layers involved, by the transfer function, and by how the layers are connected to each other (and to the network’s inputs). A sequential, feed-forward architecture is the most widely used [28, 29, 30, 31, 32]. In this architecture, each neuron receives its inputs from neurons in the previous layer, and its output becomes an input to neurons in the next layer. An example of spectral scans used for training a feed-forward network architecture is shown in Figure 4.

Figure 4.

ANNs, top frame: Example spectral scans used as a training set. The intensity differs between scans because the concentrations (and hence intensities) of Analyte (A) and Interferent (I) are different. Multiple scans are shown in the top half; each scan (with data points taken typically every 0.080 nm) was fed individually into the network (bottom half).

For validation of network performance (i.e., the ability to predict A (analyte) and I (Interferent) concentrations) when given an “unknown” scan (i.e., one in which the network is asked to predict the correct or expected value), one spectral scan at a time is fed into the network (bottom frame of Figure 4 ) and the network returns a “predicted” concentration for the Analyte (A) and for the Interferent (I). For validation (as is typical in this field of research), the performance of ANNs was compared with the performance of a typical chemometric method, such as partial least squares (PLS).

3. Brief background on PLS

The key objective in PLS, or PLS regression (as it is often called), is to develop a mathematical model relating spectral response (e.g., via spectral scans) to concentration, in the concentration range of interest for an interfered analyte. In this case, a calibration model is developed using one of the several PLS algorithms; in this work, a non-linear iterative partial least squares (NIPALS) algorithm was used. For more details on PLS, the review by Geladi and Kowalski [33] is recommended. Additional references abound; for brevity, only a selected few are listed [34, 35, 36, 37, 38, 39, 40].

In PLS, an appropriate number of latent variables (or principal components) must be used. If too few latent variables are employed, the model becomes inadequate; if too many are utilized, both noise and signal are modeled. For comparison purposes, the same spectral scans were used (where possible) to evaluate the predictive ability of PLS and of ANNs. An example of spectral scans used with PLS is shown in Figure 5.
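To make the role of latent variables concrete, here is a compact sketch of PLS regression for a single response (often called PLS1), written with NIPALS-style deflation in plain Python. It is not the PLS toolbox used in this work. The Gaussian line shapes, the 5 pm separation, the concentration values, and all function names are hypothetical. Because the simulated scans are noise-free mixtures of two lines, two latent variables are enough to recover the analyte concentration of an unseen scan.

```python
import math

def gaussian(x, center, fwhm):
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """PLS for one response variable; returns means and per-component (w, p, q)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centered scans
    f = [v - y_mean for v in y]                                 # centered concentrations
    components = []
    for _ in range(n_components):
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]  # weights, X'y
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                               # scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate: remove what this latent variable explains from X and y.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}

def pls1_predict(model, scan):
    e = [v - m for v, m in zip(scan, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in model["components"]:
        t = dot(e, w)
        y_hat += q * t
        e = [v - t * l for v, l in zip(e, load)]
    return y_hat

# Hypothetical simulated scans: analyte line at 0 pm, interferent line at +5 pm.
grid = [float(i) for i in range(-20, 21)]
def simulate_scan(c_a, c_i):
    return [c_a * gaussian(x, 0.0, 10.0) + c_i * gaussian(x, 5.0, 10.0) for x in grid]

training = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2), (0.2, 0.8), (0.8, 0.5)]
X = [simulate_scan(a, i) for a, i in training]
y = [a for a, _ in training]
model = pls1_fit(X, y, n_components=2)
predicted = pls1_predict(model, simulate_scan(0.6, 0.4))
print(f"predicted analyte concentration: {predicted:.3f}")
```

With `n_components=1` the same data are fitted less well, which is the "too few latent variables" case described above; with noisy scans, adding components beyond two would start to model the noise.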

Figure 5.

PLS-top frame: Example spectral scans. Bottom frame: Simplified diagram of the PLS algorithm. To facilitate meaningful comparisons, simulations and experimentally obtained spectral scans (where possible) were the same as those used for ANNs.

4. Experimental

For both ANNs and PLS, MATLAB (www.mathworks.com) with the neural network and PLS toolboxes was used. To eliminate the effects of instrument drift (often referred to as 1/f noise) and (potentially) any other instrument-induced errors, both ANNs and PLS were initially tested using in-house developed spectral simulation software. An example output of this software is shown in Figure 1. The variables tested and the ranges used are listed in Table 1 and will be briefly discussed next.

| Variables tested | Range of test values | Experimental approach |
|---|---|---|
| Random noise level | 0, 3, 6, 9 to 18 (%) | Simulations only |
| Wavelength separation (Δλ) | 0, 3, 5, 7, 9, 15 (pm) | Simulations and spectral scans |
| Intensity ratio | 1:0.01 to 1:0.98 (~1:1) | Simulations and spectral scans |
| Wavelength shift | ±0.5 to ±1.8 (pm) | Simulations and aligned spectral scans |

Table 1.

Variables potentially affecting predictive ability of ANNs and PLS.

Random noise level: Spectral simulations were used to study the effect of random (white) noise, because simulations do not suffer from drift (or 1/f noise) and because a pre-determined amount of noise can be easily added to a simulated spectral scan.
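Adding a pre-determined amount of noise to a simulated scan can be sketched as follows. The chapter does not state how the percentage noise level was defined, so this example assumes zero-mean Gaussian noise whose standard deviation is the stated percentage of the scan maximum. The simulated peak and the function name `add_white_noise` are hypothetical.

```python
import math
import random

def add_white_noise(scan, noise_percent, rng):
    """Add zero-mean Gaussian noise; sigma is noise_percent of the scan maximum (assumed definition)."""
    sigma = (noise_percent / 100.0) * max(scan)
    return [value + rng.gauss(0.0, sigma) for value in scan]

# Hypothetical noise-free scan: one Gaussian peak on a 100-point grid.
clean = [math.exp(-0.5 * ((i - 50) / 8.0) ** 2) for i in range(100)]

rng = random.Random(0)  # fixed seed so the simulated scans are reproducible
noisy = {level: add_white_noise(clean, level, rng) for level in (0, 3, 6, 9, 18)}
print({level: round(max(scan), 3) for level, scan in noisy.items()})
```

Because the seed is fixed, the same noisy scans can be presented to both ANNs and PLS, which keeps the comparison between the two methods fair.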

Wavelength separation: What is meant by wavelength separation Δλ (typically a few pm) is shown in Figure 6. Spectral overlaps can be distinguished as “direct overlaps” (Figure 6a), intermediate overlaps (Figure 6b), and “wing type” overlaps (Figure 6c). The question addressed was “how far does an interfering spectral line have to be (in units of Δλ) before its effect on the analyte spectral line is negligible?” As listed in Table 1, the effect of Δλ on predictive ability was tested for separations as large as 15 picometers (pm).

Figure 6.

Wavelength separation (Δλ). (a) “direct spectral overlap”, small Δλ, (b) intermediate case, (c) “wing spectral overlap”, large Δλ. Depending on Δλ, the max measured response in the spectral axis visually “appears” to have shifted (but actually it has not), the visual effect is due to the presence of the interferent. The intensity was scaled to a max of 100% (for A) with I scaled appropriately. Intensities >100% are due to combined contributions of A and I.

Intensity ratio A:I affects the measured (or combined) response. For practical reasons, different ratios were used for simulations ( Table 2 ) and for spectral scans ( Table 3 ). An example is shown in Figure 7 .

| No. | Elements | Spectral line pairs, wavelength (nm) | Wavelength separation (pm) |
|---|---|---|---|
| 1 | Zn and Ni | 213.856 and 213.856 | 0 |
| 2 | Cr and Pt | 267.716 and 267.715 | 1 |
| 3 | Zn and Cu | 213.856 and 213.853 | 3 |
| 4 | Ni and Cr | 232.003 and 232.008 | 5 |
| 5 | B and Mo | 208.959 and 208.952 | 7 |
| 6 | Ca and Co | 315.887 and 315.878 | 9 |
| 7 | Be and V | 313.042 and 313.027 | 15 |

Table 2.

Analyte (A) and Interferent (I) line pairs used in spectral simulations.

| No. | Elements | Spectral line pairs, wavelength (nm) | Wavelength separation (pm) |
|---|---|---|---|
| 1 | Zn and Ni | 213.856 and 213.856 | 0 |
| 2 | Cr and Pt | 267.716 and 267.715 | 1 |
| 3 | Zn and Cu | 213.856 and 213.853 | 3 |
| 4 | Ni and Cr | 232.003 and 232.008 | 5 |
| 5 | B and Mo | 208.959 and 208.952 | 7 |
| 6 | Be and V | 313.042 and 313.027 | 15 |

Table 3.

Analyte and Interferent line pairs used for the experimentally obtained spectral scans.

Figure 7.

Effect of A:I intensity ratio. The intensity axis was scaled to a max of 100% (for A) with I scaled appropriately. Intensities >100% are due to combined contributions of A and I.

The spectral lines used for the spectral simulations are listed in Table 2 and those used for the experimentally obtained spectral scans in Table 3 .

For the experimentally measured scans, a sequential optical emission spectrometer (OES, a Czerny-Turner scanning monochromator) with a focal length of 0.75 m, an 1800 grooves/mm holographic grating, a photomultiplier tube (PMT) detector, and an ICP were used (Varian Liberty 100). The total weight of this instrument was 300 kg. This ICP-OES system was selected due to its spectral resolution and its ability to scan user-defined spectral windows. A diagram is shown in Figure 8; it is included here to facilitate explanation of some experimental observations made during the course of this research.

Figure 8.

Illustration of the 75 cm focal length (f) scanning spectrometer. Scans consisted of 60–160 points.

A liquid sample is introduced into a 1–2 kW plasma (Figure 8). For simplicity, assume that the plasma generates only two spectral lines (at wavelengths λ1 for Analyte A and λ2 for Interferent I) from a sample introduced into it. Scanning a wavelength window (dotted line, bottom of Figure 8) is accomplished using two mechanisms. First, the computer-controlled grating is rotated until the beginning of the desired wavelength range has been reached. Further scanning is then obtained by rotating the computer-controlled scanning plate. During a scan, the scanning plate is stopped and a measurement of the intensity is made at the wavelength where the plate was stopped. Then, the wavelength is incremented (typically by about 0.01 nm) by rotating the plate, another measurement of the intensity is made, and so on until the desired spectral range has been covered (typically ~0.100 nm). In other words, a scan is accomplished using a step-measure-and-repeat process. The dots in the “example spectral scan” (Figure 8) indicate intensity measurements at each step; the presentation software simply “connects the dots”. Such experimentally obtained spectral scans were used for both ANNs and PLS.

Wavelength shift (misalignment): When repeatedly scanning the same spectral window, it was discovered that scans were offset from each other. An example is shown in Figure 9a . To address the effect of this spectrometer limitation, a fixed amount of a reference element was added to the A and I mixtures. The reference element was selected so that (where possible) its spectral line was separated from the analyte and from the interfering peaks. Subsequently, the spectral scans were corrected by manually aligning them with respect to the marker peak ( Figure 9a ). An example is shown in Figure 9b .

Figure 9.

Wavelength shift (misalignment) due to scanning plate ( Figure 8 ) reset-errors. (a) Experimentally obtained signal response from Zn (Analyte) + Cu (Interferent) plus V (added to generate a marker peak). And (b) same as (a) but after manual alignment of spectral scans. See text for discussion.
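In this work the scans were aligned manually. The following sketch shows how the same idea could be automated, under the assumptions that the marker peak is the largest signal inside a known index window and that shifting by a whole number of data points is adequate. The synthetic scan, the window limits, and the function name `align_to_marker` are hypothetical.

```python
def align_to_marker(scan, marker_window, target_index):
    """Shift a scan by whole points so the marker-peak maximum lands on target_index."""
    start, stop = marker_window
    # Locate the marker peak as the largest value inside the search window.
    marker_index = max(range(start, stop), key=lambda i: scan[i])
    shift = target_index - marker_index
    n = len(scan)
    # Points shifted in from outside the scan are filled with the baseline value.
    aligned = [min(scan)] * n
    for i, value in enumerate(scan):
        j = i + shift
        if 0 <= j < n:
            aligned[j] = value
    return aligned, shift

def make_scan(offset):
    """Synthetic 50-point scan: marker peak near index 10, analyte peak near index 30."""
    scan = [0.0] * 50
    scan[10 + offset] = 1.0   # marker (reference element) peak
    scan[30 + offset] = 0.8   # analyte + interferent peak
    return scan

# A scan that is offset by two points is moved back onto the reference position.
aligned, shift = align_to_marker(make_scan(2), marker_window=(5, 16), target_index=10)
print(shift, aligned.index(1.0), aligned.index(0.8))
```

Sub-point shifts, such as the ±0.5 pm values in Table 1, would require interpolation between data points rather than the whole-point shift used here.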

The intensity ratio of A:I ranged from about 1:1 down to 1:0.01 (Table 1) for both simulations and experimental spectral scans. A:I ratios higher than 1:1 were not tested because they were deemed too unrealistic for practical analytical applications. If A:I is higher than 1:1, an alternative spectral line should be used (if possible), or the sample may have to undergo some form of chemical separation (per Section 1).

5. Results and discussion

From the large number of experiments that were run ( Tables 2 and 3 ), for brevity, only a few results will be included here (and are briefly discussed below).

5.1. Effect of A:I intensity ratio on predictive ability of ANNs and PLS for different Δλ

The predictive performance of ANNs and PLS was found to depend on two interrelated key variables, specifically A:I ratio and Δλ. Experimental results summarizing the absolute error, |Error|, are shown in Figure 10.

Figure 10.

Predictive ability of ANNs and PLS using only four pairs (selected for clarity) of overlapping spectral lines (scans obtained experimentally) with their Δλ ranging between 0 and 15 pm.

It can be concluded that ANNs (with an average prediction error of the concentration of an Analyte (A) of ~4.1%) performed about as well as PLS (with an average prediction error of ~4.4%).
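An average prediction error of this kind can be computed as sketched below. The chapter does not give the exact formula, so the example assumes the absolute percentage error of each prediction relative to its expected concentration, averaged over all validation scans. The concentration values and the function name `mean_abs_percent_error` are hypothetical.

```python
def mean_abs_percent_error(predicted, expected):
    """Average of 100 * |predicted - expected| / expected over all scans (assumed definition)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    total = sum(abs(p - e) / e for p, e in zip(predicted, expected))
    return 100.0 * total / len(expected)

# Hypothetical analyte concentrations (arbitrary units) for three validation scans.
expected = [1.0, 2.0, 4.0]
predicted = [1.05, 1.9, 4.0]
average_error = mean_abs_percent_error(predicted, expected)
print(f"average |Error| = {average_error:.2f}%")
```

Taking the absolute value before averaging prevents over- and under-predictions from cancelling each other.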

5.2. Effect of added white (random) noise on prediction of A:I ratios

To avoid the effect of 1/f noise, only simulated spectral scans were used for noise-studies. Noise levels tested were as high as an (unrealistic for practical applicability) 18%. Results are shown in Figure 11 .

Figure 11.

Predictive ability of ANNs and PLS for six pairs of simulated spectral lines with their Δλ ranging from 0 to 15 pm and with the added random noise level ranging from 0 to 18% (with two traces overlapping).

The average prediction error of ANNs was 5.0%, and that of PLS was 5.1%. As expected, prediction errors increased as noise levels increased. A key difference between ANNs and PLS is that PLS did poorly when Δλ was 0 pm (even at very low noise levels). Interestingly, for the simulated spectral scans with 0% added noise (Figure 11), both ANNs and PLS had an average prediction error of less than 1%; that is, the predictions were essentially error free. Thus, it can be concluded that predictive ability was (likely) noise-dependent (or noise-limited).

6. Conclusions (when using a large-size ICP spectrometer)

Shallow depth ANNs for spectral interference correction were experimentally evaluated using a large-size ICP spectrometer with a scanning monochromator (Figure 8) that had resolution typical of commercial systems. The ability of ANNs to predict the concentration of an Analyte (A) in a mixture of A with an Interferent (I) was used as a key figure-of-merit, and it was studied extensively (Figures 10 and 11). To validate predictive ability, A concentrations predicted by ANNs were compared with those obtained by PLS. Using experimental spectral scans, the average prediction error for ANNs was 4.1% and for PLS was 4.4%. Simulations were used to understand the origin of prediction errors for both of these methods. For simulated spectral scans, the average prediction error for Analyte (A) was 5.0% by ANNs and 5.1% by PLS. The higher errors obtained with simulations, compared with those obtained with experimental spectral scans, are likely due to use of high (by experimental standards) levels of noise. When low levels of noise were used, the prediction errors were less than 1% (Figure 11). In other words, the concentrations predicted by both methods were essentially error free. Clearly, methods capable of better discriminating between signals and noise are desirable.

ANNs may also find applicability in portable, miniaturized systems that can be used for “taking part of the lab to the sample” types of applications. Due to the short focal length of portable spectrometers employed in miniaturized systems, such spectrometers suffer from significant spectral overlaps (but not from wavelength shift). Interference using miniaturized systems will be discussed next.

7. Spectral interference correction in miniaturization using ANN-based, deep learning approaches

7.1. Why miniaturization?

We have been developing and characterizing miniaturized plasmas in the form of microplasmas [41, 42, 43, 44, 45, 46, 47, 48] that we fabricated using a variety of fabrication technologies [49, 50, 51, 52] for “taking part of the lab to the sample” types of applications [53, 54, 55, 56, 57, 58, 59, 60]. Microplasmas are arbitrarily defined as those with one critical dimension in the micrometer regime. To enable use of a non-thermal, non-equilibrium microplasma with an optical emission spectrometer outside of a lab, use of a portable spectrometer is also required.

It is expected that a miniaturized system of the type shown in Figure 12 will find wide applicability in many analytical situations. For example, it could be used in environmental monitoring to test water quality on-site (e.g., drinking water, lake water, river water, and ground water). Water monitoring is significant to this geographical area because Ontario alone has 250,000 lakes. Outside of Ontario (and Canada), and as already mentioned, Arsenic (As) in water wells affects more than 100 million inhabitants worldwide. Other examples of potential environmental uses include air quality monitoring (e.g., sick building syndrome), safety and security. In clinical analysis, such a portable, “shoe-box size” instrument could be used to test Pb (Lead) concentrations in blood (in the US alone, every child under the age of 7 has to be tested for potentially elevated Pb concentrations in their blood); to measure Na and K in blood (Na and K concentration determinations are mandatory in any medical checkup); and to measure Li (Lithium) in blood, which is required in certain cases because Li-containing medications are often prescribed as mood stabilizers. In space exploration, a system of the type shown in Figure 12 may be used in studies involving Ca loss in bones during space flight. This limited set of examples has been included here to highlight potential applicability of miniaturized, microplasma-based optical emission spectrometry systems.

Figure 12.

Illustration of a miniaturized microplasma-based optical emission system. Such a system may find use for chemical analysis on-site. The ability to obtain analytical results on-site and in (near) real-time is in stark contrast to the traditional sample collection-in-a-field and chemical analysis-in-a-lab approaches. On-site analysis has the potential to alter the traditional chemical analysis paradigm in which samples are collected in a field and are brought to a lab for analysis.

An example of a battery-operated, 3D-printed microplasma coupled to a portable, fiber-optic emission spectrometer is shown in Figure 12 . All components shown in Figure 12 can fit inside a shoe box.

For a system of the type shown in Figure 12 , in addition to requiring a battery-operated, light-weight, miniaturized plasma source (for use on-site rather than in a lab), the spectrometer must also be portable (generally meaning that it must have a short focal length). However, as the focal length of the optical spectrometer is reduced (e.g., from 75 cm as shown in Figure 8 to about 12.5 cm as shown in Figure 12 ), resolution decreases and spectral overlaps become more prevalent. An example is shown in Figure 13 .

Figure 13.

Left frame: Spectral window showing two Cu lines (acquired using a long focal length spectrometer, for example, one shown in Figure 8 ) demonstrating baseline resolution. Right frame, spectra from 200 to 850 nm acquired using a spectrometer with a short focal length ( Figure 12 , StellarNet co, http://www.stellarnet.us/). Right frame (insert): In this case, the same Cu lines cannot be resolved. Each 200–850 nm spectrum (or full spectral scan) is 2048 data points.

Some specifics: in optical emission spectrometry, resolution (R = λ/Δλ) is defined as the ability to baseline resolve two closely spaced wavelengths (Figure 13, left frame). As already mentioned, for use on-site, a portable optical spectrometer (with a short focal length) is required. But as focal length decreases, resolution degrades. And as resolution degrades, spectral overlaps become more severe, thus making spectral interference correction essential. To demonstrate the severity of spectral overlaps and the need for spectral interference correction, the superimposed spectra of Europium (Eu), Strontium (Sr), and Lead (Pb) shown in Figure 14 will be used as an example.
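To give a feel for the numbers involved, the sketch below evaluates resolution in its usual form, R = λ/Δλ, for two of the line pairs listed in Table 3. Using the mean wavelength of each pair for λ is an assumption made here for convenience, and the function name `resolving_power` is illustrative.

```python
def resolving_power(line_a_nm, line_b_nm):
    """R = lambda / delta-lambda, with lambda taken as the mean wavelength of the pair."""
    mean_nm = (line_a_nm + line_b_nm) / 2.0
    return mean_nm / abs(line_a_nm - line_b_nm)

# Line pairs taken from Table 3 (wavelengths in nm).
r_zn_cu = resolving_power(213.856, 213.853)   # separated by 3 pm
r_be_v = resolving_power(313.042, 313.027)    # separated by 15 pm
print(round(r_zn_cu), round(r_be_v))
```

Separating lines only a few pm apart therefore calls for R in the tens of thousands, which helps explain why overlaps become more severe as focal length, and with it resolution, is reduced.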

Figure 14.

Spectral overlaps observed when using Eu (solid line), Sr (dashed line), and Pb (dotted line).

Shown in Figure 14 are three spectra obtained by introducing individually (or separately) into a microplasma (Figure 12) equal concentrations of Eu, Pb, and Sr. To facilitate discussion, the spectra have been superimposed and graphed together. In a hypothetical sample containing these three elements, Sr is the analyte and Eu and Pb are the interferents. The most intense Sr line (at ~460 nm) is directly overlapped by two Eu lines (at ~459 nm). The second most intense Sr line (at around 407 nm) is overlapped by the most intense Pb line (at 405 nm), and the third most intense Sr line (~421 nm) is also directly overlapped by a Eu spectral line. Clearly, in this example, interference-free determinations of Sr are not possible, thus making spectral interference correction essential. In this laboratory, an ambitious goal of using ANN-based deep learning approaches is being pursued. Deep learning approaches can handle larger amounts of data (as opposed to shallow depth ANNs), and they are claimed to have improved abilities to distinguish signals from noise, thus likely having improved predictive abilities.
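To make the effect of such overlaps concrete, the following sketch builds synthetic Sr, Eu, and Pb spectra on a 2048-point, 200–850 nm grid and estimates how much the apparent Sr signal at ~460 nm is inflated when the interferents are present. It is an illustration only: the line positions follow the text, but the relative line heights, the 2 nm line width, the Gaussian shape, the additive (linear) mixing, and the representation of the two Eu lines near 459 nm by a single line are all assumptions, and no measured data are used.

```python
import numpy as np

# 2048-point full spectral scan from 200 to 850 nm, as described in the text.
WL = np.linspace(200.0, 850.0, 2048)

def line(center_nm, height, fwhm_nm=2.0):
    """One Gaussian emission line; shape and 2 nm width are assumptions."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((WL - center_nm) / sigma) ** 2)

# Synthetic unit-concentration spectra. Positions follow the text;
# heights are hypothetical and chosen only for illustration.
SR = line(460.0, 1.0) + line(407.0, 0.6) + line(421.0, 0.4)
EU = line(459.0, 0.8) + line(421.0, 0.5)   # two lines near 459 nm lumped into one
PB = line(405.0, 0.9)

def mixture(c_sr, c_eu, c_pb):
    """Spectrum of a mixed sample, assuming signals add linearly."""
    return c_sr * SR + c_eu * EU + c_pb * PB

def apparent_sr_error(c_sr, c_eu, c_pb, at_nm=460.0):
    """Percent error in the Sr signal read at one wavelength channel."""
    idx = int(np.argmin(np.abs(WL - at_nm)))
    measured = mixture(c_sr, c_eu, c_pb)[idx]
    true_sr = c_sr * SR[idx]
    return 100.0 * (measured - true_sr) / true_sr

print("Sr alone:        ", round(apparent_sr_error(1.0, 0.0, 0.0), 1), "%")
print("Sr with Eu and Pb:", round(apparent_sr_error(1.0, 1.0, 1.0), 1), "%")
```

In this toy model the error grows with the Eu concentration, so a single-channel reading at the Sr line cannot be trusted without a correction step.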

7.2. ANN-based deep learning

Unlike conventional or shallow depth ANNs (i.e., those with a few network layers, Figure 4), deep learning neural nets (or deep neural networks, or DNNs for short) have numerous hidden layers. An example is shown in Figure 15a. A key advantage of deep learning [61, 62, 63, 64, 65] ANNs versus shallow depth ANNs is that in deep learning the network continues to learn as the amount of data increases, thus increasing its learning and (likely) predictive abilities (thus reducing %|Error| of prediction). In sharp contrast, although shallow depth ANNs (e.g., with a few layers) may outperform deep learning when using relatively small data sets, their learning and predictive abilities plateau (Figure 15b).
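The structural difference between a shallow and a deep fully connected network can be sketched in a few lines. The code below is not the authors' network (the chapter does not specify an architecture); the layer sizes, the ReLU activation, the random weight initialization, and the single concentration output are all assumptions. It shows only an untrained forward pass that takes 2048-point spectra as input.

```python
import numpy as np

def init_network(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def forward(params, spectra):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    h = spectra
    for w, b in params[:-1]:
        h = np.maximum(0.0, h @ w + b)
    w, b = params[-1]
    return h @ w + b

def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(w.size + b.size for w, b in params)

# Hypothetical layer sizes: input is one 2048-point spectrum,
# output is one predicted analyte concentration.
shallow = init_network([2048, 16, 1])                     # 1 hidden layer
deep = init_network([2048, 256, 128, 64, 32, 16, 1])      # 5 hidden layers

batch = np.zeros((3, 2048))  # placeholder for three spectra
print(forward(shallow, batch).shape, count_parameters(shallow))
print(forward(deep, batch).shape, count_parameters(deep))
```

The only difference between the two networks is the list of layer sizes; the deep variant has many more trainable parameters, which is one reason it needs, and can make use of, larger training sets.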

Figure 15.

(a) Deep learning neural nets with every neuron connected to all neurons in neighboring layers (where n may be in the millions of layers). (b) Sketch of predictive (and learning) abilities of shallow depth and deep learning ANNs versus the amount of data used in a training set.

At present, deep learning is receiving significant attention. A limited number of examples follow: IBM has developed Watson [66] (reportedly a $24 billion investment so far); Google is offering time-limited free access to cloud machine learning [67]; Mobileye is marketing its advanced driver assist system (ADAS), claiming that it has been installed in many self-driving cars [68]; NVIDIA is marketing graphics processing units (GPUs) with deep learning abilities [69]; Intel, a semiconductor fabrication house, has a sizable investment in deep learning [70]; OpenText [71] is using deep learning under the name Magellan to rival IBM’s Watson; Qualcomm, a fabrication house of CPUs for smartphones, offers a software kit for neural networks [72]; Samsung is developing deep learning approaches for health applications [73]; Noah’s Ark Lab [74], funded by Huawei (a telecom company), is heavily investing in deep learning; Microsoft is offering a (currently free) “cognitive toolkit” for deployment of deep learning approaches [75]; and Apple is offering developers machine learning tools [76] for inclusion into any iOS app. Apple is also publishing its Machine Learning Journal. Overall, and one way or another, all of these companies are either using or promoting the use of ANN-based deep learning.

We are experimenting with deep learning for spectral interference correction using a miniaturized, portable spectrometer (of the type shown in Figure 12) and data of the type shown in Figures 13 and 14.

8. Overall conclusions

ANNs proved effective at addressing the key problem of spectral interference encountered in optical emission spectrometry. Due to the relatively small number of data points used (e.g., 60–160), training and validation times did not become bottlenecks. For method validation (as is typical in chemical analysis), ANNs were compared to PLS. It was concluded that the predictive ability of both methods was at ~5% and that both methods were noise-limited. Thus, our attention has now turned to ANN-based deep learning approaches that are reported to have improved abilities to distinguish signals from noise. Deep learning is being evaluated for use in miniaturized systems (with short focal length, portable spectrometers) in which spectral interference is typically more severe than in long focal length, large-size spectrometers. It is expected that application of deep learning approaches has the potential to lead to portable chemical analysis instruments that are “smaller, cheaper, smarter and faster” at producing precise and accurate analytical results on-site [53]. On-site analysis capabilities have the potential to cause a paradigm shift in classical chemical analysis (caption of Figure 12) by allowing practitioners “to bring part of the lab to the sample” so that analytical results can be obtained in-situ and in (near) real-time. Although large-size and miniaturized plasma-based instruments were used as an ANN application example, it is expected that the ideas presented here will have wider applicability to include non-plasma-based chemical analysis instruments regardless of their size.

Acknowledgments

Financial assistance from Natural Sciences and Engineering Research Council (NSERC) of Canada is gratefully acknowledged. A special thank you to Dr. Harold Szu for the numerous discussions we have had (VK) on ANNs during several SPIE conferences.

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Z. Li, X. Zhang, G. A. Mohua and Vassili Karanassios (December 20th 2017). Artificial Neural Networks (ANNs) for Spectral Interference Correction Using a Large-Size Spectrometer and ANN-Based Deep Learning for a Miniature One, Advanced Applications for Artificial Neural Networks, Adel El-Shahat, IntechOpen, DOI: 10.5772/intechopen.71039. Available from:
