Application of partial least squares (PLS) method to electroanalytical data.

## Abstract

Electroanalytical techniques rest on the interplay between electricity and chemistry, namely the measurement of electrical quantities such as charge, current or potential and their relationship to chemical parameters. Electrical measurements for analytical purposes have found many applications, including industrial quality control, environmental monitoring and biomedical analysis. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments and to provide maximum chemical information by analysing chemical data. The use of chemometrics in electroanalytical chemistry is not as widespread as in spectroscopy, although applications of these methods to the mathematical resolution of overlapping signals, calibration and model identification have recently been increasing. Electroanalytical methods can be improved by applying chemometrics to the simultaneous quantitative prediction of analytes or the qualitative resolution of complex overlapping responses. This chapter focuses on applications of first‐, second‐ and third‐order multivariate calibration coupled with voltammetric data for quantitative purposes. It has been written from both electrochemical and chemometrical points of view, with the aim of providing useful information for electrochemists and promoting the use of chemometrics in electroanalytical chemistry.

### Keywords

- chemometrics
- electroanalytical chemistry
- voltammetry
- quantification
- multivariate calibration

## 1. Introduction

### 1.1. Chemometrics

Chemometrics uses statistics, mathematics and formal logic: (i) to provide maximum relevant chemical information by analysing chemical data, (ii) to design or to select optimal experimental procedures and (iii) to obtain knowledge about chemical systems [1]. Chemometric analyses have gained wide acceptance over the past 20 years because they improve existing analytical methods for the study of complex samples. A large body of research deals with the development and testing of multivariate algorithms applied to difficult chemical scenarios [2, 3]. The main reason is that second‐ and higher‐order data can deal with unwanted interferences, in contrast to zeroth‐ and first‐order calibrations [4]. Modelling interferences that are not included in the calibration set allows accurate determination of the calibrated analytes even in the presence of uncalibrated interferences. This property is known as the ‘second‐order advantage’ [5], and it has great potential in multicomponent analysis. Second‐ and third‐order multivariate calibrations are gaining widespread acceptance in the analytical community because of the variety of second‐ and third‐order data produced by modern instruments and because of their appeal from the analytical standpoint.

### 1.2. Electroanalytical methods

Electroanalytical methods monitor an analyte by measuring the potential or current in an electrochemical cell containing the analyte [6–9]. Electrochemical analysis has benefited from the electronics revolution in two ways: (i) the development of neater, faster, simpler and arguably more affordable instrumentation and (ii) the applicability to rapid analysis. In addition, electrochemistry offers a wide range of analytical methods, e.g. polarography, potentiometry, coulometry and voltammetry, which cover a wide concentration range (from ppb to mg L^{−1} levels).

### 1.3. Chemometrics in electroanalytical chemistry

The use of chemometrics in electroanalytical chemistry is still in its infancy, and for many years the application of chemometrics to electroanalytical data was scarce compared with spectroscopic techniques. This limited use is related to the traditional relationship between mathematics and electroanalytical chemistry, which rests on three elements: (i) a hypothetical physicochemical picture of the processes, the transport phenomena and the nature of the measurements; (ii) the numerical solution of the mathematical formulation and (iii) the interpretation of the electroanalytical data and the determination of concentrations, constants and other parameters [10, 11]. This approach, usually called hard‐modelling, is common in electrochemical investigations and is regarded by electrochemists as the proper approach. However, postulating a theoretical physicochemical model is difficult when the transport phenomena, the electrode process and the perturbation by an excitation signal are complex. In these cases, other approaches that can provide more information about the system are required. Such an alternative or complementary approach is provided by chemometrics; it is based on results extracted from the statistical analysis of the data and is known as the soft‐modelling approach.

Chemometrics has different applications in electrochemical analyses, such as experimental design and optimization, data treatments, sample classification, calibration for determination of concentrations and model identification.

In general, instrumentation has developed significantly over the past two decades, and this is especially the case for electrochemical instruments. Such progress has provided increasing opportunities for the application of chemometrics in electrochemical analyses. Among the most relevant multivariate methods, multi‐way algorithms play an important role in numerous analytical fields [12, 13]. Combining an electrochemical technique with chemometric methods may provide a valuable resource for accurate analyte quantification when complete separation is not accomplished or when unexpected components are present in the sample being analysed. Chemometrics is particularly useful when it is coupled to multi‐way calibration, which has been applied, for instance, to excitation‐emission data [14], high‐performance liquid chromatography with diode array detection (HPLC‐DAD) [15–17], flow injection analysis‐diode array detection (FIA‐DAD) [18, 19], liquid chromatography‐attenuated total internal reflectance‐Fourier transform infrared spectrometry (LC‐ATR‐FTIR) [20], liquid chromatography‐diode array detection‐mass spectrometry (LC‐DAD‐MS) [20], pH‐DAD [21, 22], DAD‐kinetics [23] and differential pulse voltammetry [24].

### 1.4. Required information

#### 1.4.1. Calibration

According to the International Union of Pure and Applied Chemistry (IUPAC), calibration is, in a general sense, ‘an operation that relates an output quantity to an input quantity for a measuring system under given conditions’ [25, 26]. In analytical calibration, the input quantities of primary interest are the concentrations of a sample constituent of interest (the analyte), while the output quantities are the analytical signals or responses delivered by analytical instruments (a spectrometer, chromatograph, voltammetric equipment, etc.). Therefore, in this chapter, calibration means the operation of relating instrumental signals to analyte concentrations.

#### 1.4.2. Univariate calibration

A specific case of the general calibration process relates the content of a single analyte in a sample to a single value of an instrumental signal; it is called ‘univariate calibration’. In analytical chemistry, univariate calibration employs a calibration curve as a general method for determining the concentration of a constituent in an unknown sample [26].
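The calibration‐curve idea can be illustrated with a minimal sketch. The standards, units and function names below are hypothetical and are not taken from the cited works; a straight‐line response is assumed.

```python
def fit_calibration_line(concentrations, signals):
    """Ordinary least-squares fit of signal = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_s = sum(signals) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (s - mean_s)
              for c, s in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_c
    return slope, intercept


def predict_concentration(signal, slope, intercept):
    """Invert the calibration line to estimate the concentration of an unknown."""
    return (signal - intercept) / slope


# Hypothetical standards: concentrations and the peak current measured for each
standards_conc = [1.0, 2.0, 4.0, 6.0, 8.0]
standards_current = [0.52, 1.01, 2.03, 2.98, 4.02]

slope, intercept = fit_calibration_line(standards_conc, standards_current)

# A single signal value measured for the unknown sample (zeroth-order datum)
unknown_conc = predict_concentration(2.50, slope, intercept)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, unknown = {unknown_conc:.2f}")
```

Because only one number is measured per sample, such a model has no means of detecting or correcting for an interferent that contributes to the signal.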

#### 1.4.3. Multivariate calibration

A more general calibration process involves the relationship between the concentrations of various constituents in a test sample and multiple measured responses, i.e. it is multivariate instead of univariate [26, 27]. In contrast to univariate calibration, which works with a single instrumental response measured for each experimental sample, multivariate calibration works with many different signals for each sample. Depending on the instrumental setup, the data delivered for a single sample may have different degrees of complexity. The simplest multivariate data are those produced in vector form, i.e. as a series of responses that can be placed on top of each other to generate a mathematical object known as a column vector. This object is also referred to as having a single ‘mode’ or ‘direction’. Multivariate calibration using vectorial data has given rise to a highly fruitful analytical field that today is routine in many industrial laboratories and process control units [28]. Multiple analytes can be determined simultaneously in the presence of other, possibly unknown, constituents, provided these have been properly taken into account during the calibration phase [1, 29].

#### 1.4.4. Multi‐way calibration

Multi‐way calibration is based on many instrumental signals per sample, which can be meaningfully organized into a mathematical object with more modes than a vector, for example, a data table or matrix [26, 30].

The most important advantage of multi‐way calibration is that analytes can be determined in the presence of unexpected constituents in test samples. This is called the ‘second‐order advantage’.

Multi‐way calibration has further advantages relative to other calibration methods. One is the increase in sensitivity, because the measurement of redundant data tends to decrease the relative impact of noise in the signal. Selectivity also increases, because each new instrumental mode added to the data contributes positively to the overall selectivity. Another is the possibility of obtaining a qualitative interpretation of chemical phenomena through the study of multi‐way data, in a much better way than with univariate or first‐order data.

#### 1.4.5. Nomenclature for data and calibrations

In algebraic jargon, a scalar is a zeroth‐order object, a vector is first order, a matrix is second order, etc. A nomenclature exists for the different calibrations, based on the order of the data measured for a single sample: zeroth‐order calibration is equivalent to univariate calibration, first‐order multivariate calibration is equivalent to calibration with vectorial data per sample, second‐order multivariate calibration is equivalent to calibration with matrix data per sample and third‐order multivariate calibration is equivalent to calibration with three‐dimensional data arrays per sample. The list may continue with data arrays having additional modes per sample [26]. On the other hand, the data for a group of samples can be joined into an array with an additional mode, the sample mode. For example, univariate measurements for several samples can be grouped to form a vector, first‐order data can be placed adjacent to each other to create a matrix, etc. On this basis, an alternative nomenclature has been developed, in which calibrations are named according to the number of ways (modes) of the array for a sample set. Thus, zeroth‐order calibration is also one‐way calibration, first order is two‐way, second order is three‐way, etc. It is customary to refer to all calibration methodologies involving second‐ and higher‐order data (i.e. three‐way and beyond) as multi‐way calibration, which is thus a subdivision of multivariate calibration. Figure 1 provides a compact view of the hierarchy of data and calibrations.
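The two nomenclatures can be made concrete by looking at array shapes. The following sketch uses NumPy and hypothetical dimensions (number of potentials, pulse heights, pulse times and samples); the helper `stack_samples` is introduced here only for illustration.

```python
import numpy as np

# Hypothetical sizes of each mode
n_samples, n_potentials, n_pulse_heights, n_pulse_times = 5, 200, 4, 3

# Data measured for ONE sample: the number of array dimensions is the data order
per_sample_data = {
    "zeroth-order (scalar)": np.zeros(()),
    "first-order (vector)": np.zeros(n_potentials),
    "second-order (matrix)": np.zeros((n_potentials, n_pulse_heights)),
    "third-order (3D array)": np.zeros((n_potentials, n_pulse_heights, n_pulse_times)),
}


def stack_samples(single_sample, n):
    """Join the data of n samples along a new leading sample mode."""
    return np.stack([single_sample] * n, axis=0)


# Data for a SET of samples: one extra mode, so order k becomes (k + 1)-way
sample_sets = {name: stack_samples(data, n_samples)
               for name, data in per_sample_data.items()}

for name, data in per_sample_data.items():
    ways = sample_sets[name].ndim
    print(f"{name}: order {data.ndim} per sample, {ways}-way array for the set")
```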

### 1.5. Linearity and nonlinearity of second‐ and third‐order data

#### 1.5.1. Second‐order data

#### 1.5.1.1. Trilinear data

When a second‐order data array is processed, it is vital that the so‐called trilinearity condition is met. A three‐way data array **X** can be modelled by the following expression:

$$X_{ijk} = \sum_{n=1}^{N} a_{in}\, b_{jn}\, c_{kn} + E_{ijk} \qquad (1)$$

where *N* is the total number of chemical constituents generating the measured signal, *a*_{in} is the relative concentration or score of component *n* in the *i*th sample, and *b*_{jn} and *c*_{kn} are the intensities in the instrumental channels (or data dimensions) *j* and *k*, respectively. The values *E*_{ijk} are the elements of the three‐dimensional array **E**, which represents the residual error and has the same dimensions as **X**. The column vectors *a*_{n} are collected in the scores matrix **A**, whereas the vectors *b*_{n} and *c*_{n} are collected in the loading matrices **B** and **C** (usually *b*_{n} and *c*_{n} are normalized to unit length). For a three‐way data array to be trilinear, the signal must be linearly related to concentration and the component profiles must be constant across the different samples [31].
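As an illustration of the trilinear model described above, in which every element of the array is a sum over constituents of the product of one score and two profile intensities, the following sketch builds a noise‐free three‐way array with NumPy. The sizes and the random profiles are hypothetical; the matrices `A`, `B` and `C` play the roles of the scores and loading matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_channels_j, n_channels_k, n_constituents = 6, 50, 8, 2

A = rng.uniform(0.1, 1.0, size=(n_samples, n_constituents))     # scores a_in
B = rng.uniform(0.0, 1.0, size=(n_channels_j, n_constituents))  # profiles b_jn
C = rng.uniform(0.0, 1.0, size=(n_channels_k, n_constituents))  # profiles c_kn

# Trilinear array: X[i, j, k] = sum over n of A[i, n] * B[j, n] * C[k, n]
X = np.einsum("in,jn,kn->ijk", A, B, C)

# The same element computed explicitly, as a check of the index convention
i, j, k = 2, 10, 3
x_explicit = sum(A[i, n] * B[j, n] * C[k, n] for n in range(n_constituents))

# Unfolding the array sample-wise gives a matrix whose rank equals the
# number of constituents when the data are trilinear and noise-free
rank_unfolded = np.linalg.matrix_rank(X.reshape(n_samples, -1))
print(np.isclose(X[i, j, k], x_explicit), rank_unfolded)
```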

#### 1.5.1.2. Non‐trilinear data

To evaluate the linearity of a three‐way data array, one should first consider its basic ingredients, i.e. the individual data matrices, and whether they are bilinear or not. If they are bilinear, a further subdivision can be made according to the existence and number of trilinearity‐breaking modes: (i) when one of the data modes is non‐reproducible and breaks the trilinearity, the data are not trilinear but can be unfolded into a bilinear augmented matrix and (ii) when both data modes are trilinearity breaking, the data are not trilinear and cannot be unfolded into a bilinear augmented matrix. To distinguish these two non‐trilinear data types, we propose to call them non‐trilinear Type 1 and non‐trilinear Type 2, respectively. Finally, if the individual matrices are non‐bilinear, we have a fourth data type that we may call non‐trilinear Type 3. There is no point in further dividing Type 3 data according to the number of non‐reproducible modes, since such data are neither trilinear nor unfoldable into an augmented bilinear matrix [26, 31]. Figure 2 illustrates the classification of three‐way data for a sample set.
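The behaviour of non‐trilinear Type 1 data can be sketched numerically. In the hypothetical example below, the potential profiles of two constituents shift from sample to sample (the trilinearity‐breaking mode), while the profiles in the second instrumental mode are constant. Gaussian peaks, shifts and sizes are assumptions made only for illustration.

```python
import numpy as np


def gaussian(x, centre, width):
    return np.exp(-0.5 * ((x - centre) / width) ** 2)


potentials = np.arange(100)
rng = np.random.default_rng(0)
n_constituents = 2

# Constant profiles in the second instrumental mode (5 channels)
C = rng.uniform(0.2, 1.0, size=(5, n_constituents))
concentrations = np.array([[1.0, 0.5], [0.7, 1.0], [0.4, 0.8]])
shifts = [0, 3, 6]  # sample-dependent peak shift along the potential mode

sample_matrices = []
for conc, shift in zip(concentrations, shifts):
    B_i = np.column_stack([gaussian(potentials, 40 + shift, 5),
                           gaussian(potentials, 60 + shift, 5)])
    sample_matrices.append((B_i * conc) @ C.T)  # potential x second mode

# Augmenting along the shifting (potential) mode keeps the matrix bilinear
augmented_along_potential = np.vstack(sample_matrices)   # shape (300, 5)
# Augmenting along the reproducible mode does not
augmented_along_other = np.hstack(sample_matrices)       # shape (100, 15)

rank_potential = np.linalg.matrix_rank(augmented_along_potential)
rank_other = np.linalg.matrix_rank(augmented_along_other)
print(rank_potential, rank_other)
```

In this noise‐free sketch the first augmented matrix has a rank equal to the number of constituents, whereas the second has a higher rank because every shifted profile counts as a new contribution.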

#### 1.5.2. Third‐order data

#### 1.5.2.1. Quadrilinear data

Quadrilinearity of a four‐way array can be defined by extending Eq. (1) with an additional mode. A four‐way data array obtained by ‘joining’ the three‐dimensional data arrays of a sample set is quadrilinear if its elements can be thought of as being obtained through:

$$X_{ijkl} = \sum_{n=1}^{N} a_{in}\, b_{jn}\, c_{kn}\, d_{ln} + E_{ijkl} \qquad (2)$$

where the symbols are analogous to those in Eq. (1): *a*_{in}, *b*_{jn} and *c*_{kn} are now the intensities of constituent *n* in the three instrumental modes, and *d*_{ln} describes the changes in constituent concentrations along the sample mode. A requirement for quadrilinearity of a data array for a sample set is that the three instrumental profiles for each constituent are equal for all samples [26].

#### 1.5.2.2. Non‐quadrilinear data

Quadrilinearity may be lost if one or more modes behave as quadrilinearity‐breaking modes, in the sense that constituent profiles change from sample to sample along these modes. In the present case, there might be one, two or three quadrilinearity‐breaking modes. Hence, a pertinent classification of non‐quadrilinear third‐order/four‐way data would be into types 1, 2 and 3, respectively. On the other hand, data that are intrinsically non‐trilinear for each sample, because of mutual correlations among the phenomena in the different data modes, are classified as non‐quadrilinear of type 4 [26]. Figure 3 illustrates a classification tree.

### 1.6. Algorithms

#### 1.6.1. First‐order algorithms

The standard models are principal component regression (PCR) and partial least‐squares (PLS) analyses, although a number of different algorithms such as continuum power regression (CPR), multiple linear regression‐successive projections algorithm (MLR‐SPA), robust continuum regression (RCR), partial robust M‐regression (PRM), polynomial‐PLS (PLY‐PLS), spline‐PLS (SPL‐PLS) and radial basis function‐PLS (RBF‐PLS) exist. However, when the data behave in a non‐linear manner with respect to the analyte, a different approach is needed, such as an artificial neural network (ANN) or a least‐squares support vector machine (LS‐SVM).
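As a minimal sketch of first‐order calibration with PLS, the code below implements PLS for a single analyte (PLS1) with the NIPALS algorithm in NumPy and applies it to simulated data. The overlapping Gaussian peaks, the concentration design and the noise level are hypothetical, and the function `pls1_fit` is written for this illustration rather than taken from any cited work.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (one analyte) with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    X_res, y_res = X - x_mean, y - y_mean
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        w = X_res.T @ y_res                 # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = X_res @ w                       # sample scores
        p = X_res.T @ t / (t @ t)           # X loadings
        q = (y_res @ t) / (t @ t)           # regression of y on the scores
        X_res = X_res - np.outer(t, p)      # deflate X
        y_res = y_res - q * t               # deflate y
        weights.append(w)
        loadings.append(p)
        inner.append(q)
    W, P, q = np.array(weights).T, np.array(loadings).T, np.array(inner)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in the X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def gaussian(x, centre, width):
    return np.exp(-0.5 * ((x - centre) / width) ** 2)


# Hypothetical, strongly overlapping voltammetric peaks of two analytes
potentials = np.linspace(-0.2, 0.6, 80)
pure = np.vstack([gaussian(potentials, 0.15, 0.06),
                  gaussian(potentials, 0.25, 0.06)])

rng = np.random.default_rng(0)
levels = [0.2, 0.6, 1.0]
conc = np.array([[c1, c2] for c1 in levels for c2 in levels])  # 3 x 3 design
X_cal = conc @ pure + rng.normal(scale=1e-3, size=(9, 80))     # one vector per sample

coef, intercept = pls1_fit(X_cal, conc[:, 0], n_components=2)

# Test mixture: analyte 1 at 0.5 in the presence of analyte 2 at 0.8
x_test = np.array([0.5, 0.8]) @ pure
predicted_c1 = x_test @ coef + intercept
print(f"predicted concentration of analyte 1: {predicted_c1:.3f}")
```

Two latent variables are used because two constituents contribute to the signal; the second analyte is handled correctly only because it was present in the calibration set, which is the usual requirement of first‐order calibration.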

#### 1.6.2. Second‐order algorithms

Suitable algorithms for analysing second‐order data are parallel factor analysis (PARAFAC) [32], the generalized rank annihilation method (GRAM) [33], direct trilinear decomposition (DTLD) [34], multivariate curve resolution‐alternating least squares (MCR‐ALS) [35], bilinear least squares (BLLS) [36, 37] and alternating trilinear decomposition (ATLD) [38] and its variants, self‐weighted alternating trilinear decomposition (SWATLD) [39] and alternating penalty trilinear decomposition (APTLD) [40, 41].

An alternative way of working with second‐order data is to rearrange them into vectors and apply a first‐order algorithm, as in unfolded‐principal component regression (U‐PCR) and unfolded‐partial least squares (U‐PLS) [42]. Another alternative, which is a genuine multi‐way method, is multi‐way partial least squares (N‐PLS) [43]. These three methods can achieve the second‐order advantage when they are coupled to residual bilinearization (RBL) [44, 45].
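The unfolding step that precedes U‐PCR or U‐PLS amounts to a reshape of the three‐way array. The sketch below uses NumPy with hypothetical dimensions and random numbers in place of measured currents; it only illustrates the rearrangement, not the regression or the RBL step.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_potentials, n_pulse_heights = 4, 30, 5

# Hypothetical three-way array: one (potential x pulse height) matrix per sample
X = rng.normal(size=(n_samples, n_potentials, n_pulse_heights))

# Unfold each sample matrix into a row vector, giving a two-way array
# that a first-order algorithm can process
X_unfolded = X.reshape(n_samples, n_potentials * n_pulse_heights)

# The operation is lossless: refolding recovers the original array
X_refolded = X_unfolded.reshape(n_samples, n_potentials, n_pulse_heights)

# Element (j, k) of sample i ends up in column j * n_pulse_heights + k
i, j, k = 2, 7, 3
same_element = X_unfolded[i, j * n_pulse_heights + k] == X[i, j, k]
print(X_unfolded.shape, same_element)
```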

#### 1.6.3. Third‐order algorithms

Suitable quadrilinear models for third‐order data include PARAFAC, trilinear least‐squares (TLLS) with residual trilinearization (RTL) [46] and alternating penalty quadrilinear decomposition (APQLD) [47]. Models that allow for deviations from multilinearity in one way or another include PARAFAC2 (a variant of PARAFAC that allows profile variations in one of the data dimensions from sample to sample) [48], PARALIND (PARAFAC for linearly dependent systems) [49], MCR‐ALS [50], non‐bilinear rank annihilation (NBRA) [51], bilinear least squares (BLLS) extended to linearly dependent systems [22], U‐PLS [42], N‐PLS [52], non‐linear kernel‐PLS [53] and artificial neural networks (ANN) [54, 55]. To achieve the second‐order advantage, BLLS, PLS and ANN should be combined with RTL [44–46, 55–59].

### 1.7. Generation of second‐ and third‐order electrochemical data

Differential pulse voltammetry (DPV) is the technique most frequently used to generate second‐ and third‐order electrochemical data. Such data can be obtained by changing one or two of the instrumental parameters of DPV [60]. The theory behind this procedure is briefly discussed here. For the electrode reaction O + *n*e^{−} ⇌ Red, expressions for the current signal intensity in DPV are given in Ref. [61]. In these expressions, O and Red are the species involved in the electrode reaction, *n* is the number of electrons involved in the electrode reaction, *F* is the Faraday constant, *A* is the electrode area, *D*_{O} and *D*_{Red} are the diffusion coefficients of the O and Red species, respectively, *C*_{O}^{*} is the concentration of the O species at the electrode surface, *R* is the gas constant, *T* is the temperature, Δ*E*, *E* and *E*^{0′} are the pulse height, potential and formal potential of the electrode, respectively, and *τ* and *τ*′ are the pulse duration (pulse time) and the starting time of the potential pulse, respectively. For an electrochemical reaction, a data vector can be produced by sweeping the potential at constant Δ*E* and *τ*. Applying a different Δ*E* and sweeping the potential at constant *τ* produces a different data vector. In the same way, third‐order voltammetric data can be obtained by sweeping the potential at different pulse durations and pulse heights [62]. A literature survey shows that changing Δ*E* can cause non‐linearity in DPV data, whereas changing *τ* does not [60].
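For orientation, a commonly quoted textbook form of the DPV response for a reversible electrode reaction, written with the symbols defined above, is sketched below. It is given here as an assumption about the expressions referred to in Ref. [61]; the auxiliary quantities *P*_{A}, *σ* and *ξ* are introduced only for this sketch.

```latex
\begin{align}
  \mathrm{O} + n\mathrm{e}^{-} &\rightleftharpoons \mathrm{Red} \\
  \delta i &= \frac{nFAD_{\mathrm{O}}^{1/2} C_{\mathrm{O}}^{*}}
                   {\pi^{1/2}\,(\tau - \tau')^{1/2}}
              \left[ \frac{P_{A}\,(1 - \sigma^{2})}
                          {(\sigma + P_{A})(1 + P_{A}\sigma)} \right] \\
  P_{A} &= \xi \exp\!\left[ \frac{nF}{RT}
           \left( E + \frac{\Delta E}{2} - E^{0'} \right) \right],
  \qquad
  \sigma = \exp\!\left( \frac{nF}{RT}\,\frac{\Delta E}{2} \right),
  \qquad
  \xi = \left( \frac{D_{\mathrm{O}}}{D_{\mathrm{Red}}} \right)^{1/2}
\end{align}
```

In this form, the pulse timing enters only through the factor (*τ* − *τ*′)^{−1/2}, which scales the whole voltammogram, whereas Δ*E* enters through *P*_{A} and *σ* and therefore also affects the peak position and shape. This is consistent with the observation that changing Δ*E*, but not *τ*, can introduce non‐linearity.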

### 1.8. Data pre‐processing

#### 1.8.1. Shift correction or data alignment

Linearity is a property assumed by multivariate linear calibration algorithms. However, in many electroanalytical situations, slight deviations from linearity can be observed, for example in the presence of interactions among components. Generally, non‐linearity appears as signal shifts, peak broadening or an increase in peak height. Any of these problems hinders the application of multi‐linear data processing algorithms, and they become more serious when signals overlap. Therefore, aligning the voltammograms is an important step that should be performed before multi‐linear algorithms are applied. Data alignment is based on digitally moving a voltammogram towards a reference voltammogram, using an objective function such as the correlation coefficient, residual fit or similarity index to indicate the quality of the match. Most alignment algorithms require a reference voltammogram (template) to which all the remaining ones are aligned, and a suboptimal choice of the template can affect the alignment results [63]. The most frequently used alignment algorithms are correlation optimized warping (COW) [64], interval correlation optimised shifting (icoshift) [65], Gaussian peak adjustment (GPA) [66], Gaussian peak adjustment with transversal constraints (GPA2D) [67], asymmetric logistic peak adjustment (ALPA) [68], *shiftfit* [69, 70] and *pHfit* [71].
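The general idea of aligning a voltammogram to a reference by maximizing a correlation coefficient can be sketched as follows. This is a deliberately simple rigid‐shift search written for illustration; it is not an implementation of COW, icoshift or any of the other cited algorithms, and the function name and test signals are hypothetical.

```python
import numpy as np


def align_to_reference(voltammogram, reference, max_shift):
    """Rigidly shift a voltammogram (in data points) to best match a reference.

    The correlation coefficient is used as the objective function. np.roll
    wraps points around the ends, which is acceptable here only because the
    signal is assumed to be close to zero at both ends of the potential window.
    """
    best_shift, best_corr = 0, -np.inf
    for shift in range(-max_shift, max_shift + 1):
        candidate = np.roll(voltammogram, shift)
        corr = np.corrcoef(candidate, reference)[0, 1]
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return np.roll(voltammogram, best_shift), best_shift, best_corr


# Hypothetical peaks: the scan is displaced by 7 points and has a lower height
grid = np.arange(200)
reference = np.exp(-0.5 * ((grid - 100) / 8.0) ** 2)
shifted_scan = 0.8 * np.exp(-0.5 * ((grid - 107) / 8.0) ** 2)

aligned, best_shift, best_corr = align_to_reference(shifted_scan, reference, max_shift=15)
print(best_shift, round(best_corr, 4), int(np.argmax(aligned)))
```

A rigid shift of the whole scan is only appropriate when all signals move together; when individual peaks shift by different amounts, interval‐based or peak‐fitting approaches such as those listed above are needed.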

#### 1.8.1.1. COW

For a detailed description of the mathematical aspects of the COW algorithm, the reader is referred to Ref. [64].

#### 1.8.1.2. icoshift

For a detailed description of the mathematical aspects of the icoshift algorithm, the reader is referred to Ref. [65].

#### 1.8.1.3. shiftfit, pHfit, GPA, GPA2D and ALPA

The *shiftfit* algorithm corrects the data matrix for signal movements; for this purpose, it optimises by least squares the potential shift of every pure voltammogram with respect to a reference position. The correction relies on three functions: *peakmaker*, *shiftcalc* and *shiftfit*. *Peakmaker* generates a Gaussian peak as an initial estimate of the pure voltammograms. The *shiftcalc* function displaces every signal in the matrix of experimental voltammograms by a given potential shift Δ*E*. The *shiftfit* function iteratively optimises the values of Δ*E* to generate a corrected matrix (*I*_{cor}) in which all signals remain at the fixed potentials stated in the pure voltammograms matrix (*V*_{o}) [69, 70].

The *pHfit* algorithm can solve more intricate systems like those encountered in voltammetric pH titrations by imposing a shape restriction to the movements of the signals as a function of potential, by means of adjustable sigmoid or linear functions [71].

After *shiftfit* and *pHfit*, the GPA algorithm, based on a new strategy called parametric signal fitting (PSF), was proposed for the chemometric analysis of voltammetric data in which the pure signals do not maintain a constant shape [66]. It is based on fitting parametric functions that reproduce the shape of the signals. As a first approach, two Gaussian functions are fitted, one at each side of the signal, and their parameters are optimised by least squares. These parameters determine not only the height and position of the signals (as in the algorithms above) but also the width at both sides of the maximum. It is important to note that, unlike *shiftfit* and *pHfit*, the use of Gaussian functions restricts GPA exclusively to peak‐shaped signals. Moreover, although the fitting of Gaussian peaks has already been used in other situations, such as the resolution of UV‐vis spectra, in those approaches the symmetric character of the Gaussian function prevents an appropriate treatment of asymmetric signals. In the proposed method, the use of two separate Gaussian functions at both sides of the maximum (sharing the same height and position but having different widths) is a new and simple solution for the fitting of asymmetric peaks. Figure 4 summarizes the main steps of the fitting procedure.
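The peak shape used in this strategy, two half‐Gaussians sharing height and position but with different widths, can be written as a short function. The sketch below is only the parametric function, not the GPA fitting procedure; expressing the widths as Gaussian standard deviations is an assumption made here, and the parameter values are hypothetical.

```python
import numpy as np


def double_gaussian_peak(potential, height, position, width_left, width_right):
    """Asymmetric peak built from two half-Gaussians joined at the maximum.

    Both halves share the same height and position; width_left applies to
    potentials below the maximum and width_right to the rest. The widths are
    taken as standard deviations (an assumption of this sketch).
    """
    potential = np.asarray(potential, dtype=float)
    width = np.where(potential < position, width_left, width_right)
    return height * np.exp(-0.5 * ((potential - position) / width) ** 2)


# Hypothetical parameters: a peak at -0.50 V that is broader on its right side
height, position, width_left, width_right = 2.0, -0.50, 0.05, 0.08
potentials = np.linspace(-0.8, -0.2, 301)
signal = double_gaussian_peak(potentials, height, position, width_left, width_right)

# One width away from the maximum, each side has dropped to exp(-0.5) of the height
check = double_gaussian_peak([position - width_left, position, position + width_right],
                             height, position, width_left, width_right)
print(check / height)
```

In a least‐squares fit, the four parameters of each peak would be adjusted until the sum of such functions reproduces the measured voltammogram.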

A new method, GPA2D, was developed as a significant improvement of GPA. It includes, for the first time, transversal constraints to increase the consistency of the resolution along the different signals of a voltammetric dataset [67]. The aim of GPA2D is to give physicochemical meaning to the evolution of the signals and their shifts along the experimental axis. The imposition of transversal constraints makes this method more powerful for the analysis of voltammetric data, especially if they are non‐bilinear. Figure 5 shows the main structure of the program, which is based on a common GPA procedure with two alternative intermediate paths, depending on the kind of transversal constraint to be applied (signal shift evolution or equilibrium).

The asymmetric logistic peak adjustment (ALPA) was developed as a new function for the PSF of highly asymmetric electrochemical signals in non‐bilinear datasets or in the presence of irreversible electrochemical processes [68]. Figure 6 summarizes the main steps of the fitting procedure.

#### 1.8.2. Baseline correction

Baseline correction is considered a critical step for enhancing the signals and reducing the complexity of analytical data [72, 73]. To this end, Eilers et al. [74] introduced an algorithm for baseline elimination based on an asymmetric least squares splines regression (AsLSSR) approach. Details of the implementation of this method can be found in the literature [74, 75].
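The principle of asymmetric least squares can be sketched in a few lines. Note that the code below uses a penalized (Whittaker‐type) smoother with a second‐difference penalty rather than the spline regression of the cited AsLSSR method, so it is a simplified stand‐in. The parameter names, their default values and the simulated signal are assumptions made for illustration.

```python
import numpy as np


def asls_baseline(signal, smoothness=1e5, asymmetry=0.01, n_iter=10):
    """Estimate a baseline by asymmetric least squares (dense-matrix sketch).

    Points above the current baseline estimate (likely peaks) receive a small
    weight and points below it a large weight, so the smooth curve settles
    under the peaks.
    """
    n = signal.size
    second_diff = np.diff(np.eye(n), n=2, axis=0)        # (n - 2) x n operator
    penalty = smoothness * second_diff.T @ second_diff    # roughness penalty
    weights = np.ones(n)
    for _ in range(n_iter):
        baseline = np.linalg.solve(np.diag(weights) + penalty, weights * signal)
        weights = np.where(signal > baseline, asymmetry, 1.0 - asymmetry)
    return baseline


# Hypothetical voltammogram: a sloping baseline plus one Gaussian peak
x = np.arange(200)
true_baseline = 0.2 + 0.002 * x
peak = np.exp(-0.5 * ((x - 100) / 5.0) ** 2)
signal = true_baseline + peak

baseline = asls_baseline(signal)
corrected = signal - baseline
print(f"max baseline error: {np.max(np.abs(baseline - true_baseline)):.4f}")
```

The two tuning parameters control how stiff the baseline is and how strongly peaks are ignored; in practice they have to be chosen for each type of signal.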

## 2. Applications of first‐order multivariate calibration

Tables 1–3 summarize applications of first‐order multivariate calibration with different first‐order algorithms to electroanalytical data.

Technique | Application | Refs. |
---|---|---|
DPASV | Determination of Tl and Pb | [76] |
PSA (at Au electrode) | Determination of As in the presence of Cu and Sn | [77] |
DPP | Determination of furaltadone, furazolidone and nitrofurantoin | [78] |
DPV | Determination of 2‐(3)‐t‐butyl‐4‐methoxyphenol and propyl gallate | [79] |
NPP, DPP | Determination of furazolidone and furaltadone | [80] |
DPP | Determination of sulfadiazine, sulfamerazine and sulfamethazine | [81] |
SWV, SWAdSV | Determination of sulphamethoxypyridazine and trimethoprim in veterinary formulations | [82] |
DPP | Determination of Cu, Pb, Cd and Zn | [83] |
LSV | Determination of indomethacin and acemethacin | [84] |
ASV | Determination of Tl and Pb | [85] |
DPP | Determination of Pb, Cd and Sn(IV) | [86] |
LSV, CV, DC, DPP | Determination of propylgallate, butylated hydroxyanisole and butylated hydroxytoluene | [87] |
DPASV | Determination of Cu in the presence of Fe | [88] |
DPP | Variable selection for the determination of benzaldehyde and of Cu, Pb, Cd and Zn | [89] |
DPASV | Variable selection for the determination of the binary mixtures Tl/Pb and Cu/Fe(III) | [90] |
SWV, DPV | Determination of paraquat and diquat | [91] |
DPAdSV | Speciation of Cr (determination of Cr(III) and Cr(VI)) | [92, 94] |
FIA‐ED | Determination of 4‐nitrophenol, phenol and p‐cresol | [93] |
DPV | Determination of the anti‐inflammatory drugs indomethacin, acemethacin, piroxicam and tenoxicam | [95] |
LSV | Determination of mixtures of vapours (ethanol, acetaldehyde, acetylene, SO_{2}, NO_{2}, NO, O_{3}) | [96] |
DPAdSV | Determination of Al and Cr(VI) | [97] |
DPV | Determination of nordihydroguaiaretic acid | [98] |
DC, DPV, SWV | Determination of tocopherols in vegetable oils | [99] |
NPP (at UME array) | Monitoring of Staphylococcus aureus population | [100] |
NPP (at UME array) | Monitoring of Escherichia coli ATCC 13706 and Pseudomonas aeruginosa ATCC 27853 population | [101] |
CV | Determination of cysteine, tyrosine and tryptophan | [102] |
DPAdSV | Determination of In | [103] |

Technique | Application | Refs. |
---|---|---|
DPASV | Determination of Pb, Cd, Tl and In | [104] |
PSA (at Au electrode) | Determination of Cu, Zn, Cd and Pb | [105] |
DPP | Monitoring of freshness of milk (by an electronic tongue) | [106] |
DPV | Determination of Cu | [107] |
NPP, DPP | Determination of adenine and cytosine | [108] |
DPP | Determination of Cu and Mo | [109] |
SWV, SWAdSV | Determination of ethanol, fructose and glucose | [110] |
DPP | Determination of vitamins B6 and B12 in | [111] |
LSV | Determination of Cu | [112] |
ASV | Determination of ethanol, methanol, fructose and glucose | [113] |
DPP | Determination of nalidixic acid and its metabolite 7‐hydroxymethylnalidixic acid | [114] |

Chemometrical technique | Electrochemical technique | Application | Refs. |
---|---|---|---|
MLR/PLS | DPASV | Study of influence of pH and Ca in metal/fulvic interactions | [115] |
CLS/PLS/PCR | AdSV | Determination of synthetic colorants | [116] |
CLS/ILS/KF | SWASV | Determination of Pb, Cd, In and Tl | [117] |
CLS/PLS/PCR/MLR | DPP, NPP | Determination of Pb, Cd, Cu, Ni and V | [118] |
CLS/PLS/PCR/MLR | ASV | Determination of Pb, Cd, Cu and Zn | [119] |
PLS/NL‐PLS/PCR | CV | Determination of tryptophan in feed samples | [120] |
PLS/NL‐PLS/PCR/MLR/ANN | DuPSV | Determination of ethanol, fructose and glucose | [121] |
PLS/PCR | CV | Determination of cysteine, tyrosine and tryptophan | [122] |
PLS/ANN | DPV | Determination of catechol and hydroquinone at C fiber electrode | [123] |
PLS/ANN | DC, DPP | Determination of atrazine/simazine and terbutryn/prometryn | [124] |
CLS/PLS/PCR | LSV | Determination of synthetic food antioxidants | [125] |
CLS/PLS/PCR/MLR | DPSV | Determination of chlorpromazine and promethazine hydrochloride | [126] |
CLS/PLS/PCR/MLR | DPSV | Determination of five nitro‐substituted aromatic compounds | [127] |
PLS/PCR | ASV | Determination of Pb, Cd, In and Tl | [128] |
CLS/PLS/MCR‐ALS | ASV | Determination of Pb, Cd, In and Tl | [129] |
PLS/PCR/ANN | ASV | Determination of Pb and Tl | [130] |
PLS/PCR | DPSV | Determination of paracetamol and phenobarbital in pharmaceuticals | [131] |
CLS/PCR/PLS/KF/ANN | DPSV | Determination of parathion, fenitrothion and parathion | [132] |
PCR/PLS/GA‐PLS/ANN | DPV | Determination of cysteine, tyrosine and tryptophan | [133] |
CLS/PCR/PLS/ANN | DPV | Determination of propoxur, isoprocarb, carbaryl and carbofuran | [134] |
HPCR/HPLS/CPCR/MBPLS | ACV | Determination of brightener in industrial Cu electroplating baths | [135] |
PLS/ANN | CV | Determination of isoniazid and hydrazine | [136] |

## 3. Applications of second‐ and third‐order multivariate calibrations

In an interesting work, Kooshki et al. generated three‐way DPV data at pulse heights of 20–100 mV in 20 mV steps and analysed them by MCR‐ALS to determine tryptophan (Trp) in the presence of tyrosine (Tyr) as an uncalibrated interference at a glassy carbon electrode modified with gold nanoparticle‐decorated multiwalled carbon nanotubes (Au NPs/GCE) [60]. The data were non‐bilinear; therefore, the *shiftfit* algorithm was used to correct the observed shift in the data. Figure 7 shows the potential shift correction of the augmented data for three standard Trp solutions (top left) and for a synthetic mixture solution containing Trp and Tyr (top right). The corrected data were augmented, and MCR‐ALS was performed on them. The results of the potential shift correction and the MCR‐ALS analysis for the determination of Trp in the synthetic mixtures confirmed that MCR‐ALS converges on the shift‐corrected data with a low lack of fit. Finally, the authors assessed the analytical utility of the proposed method by applying it to the determination of Trp in a fresh meat sample.

Galeano‐Diaz et al. reported a method based on adsorptive stripping square wave voltammetry (Ad‐SSWV) for the simultaneous determination of fenitrothion (FEN) and its metabolites fenitrooxon (OXON) and 3‐methyl‐4‐nitrophenol (3‐MET) in environmental samples [137]. These three compounds produce an electrochemical signal due to an adsorptive‐reductive process at a hanging mercury drop electrode (HMDE). The voltammograms of FEN and OXON showed a very high degree of overlap, so second‐order multivariate calibration was tested to resolve the mixture of the three compounds. The accumulation time (*t*_{acc}) was chosen as the third variable (third way). With the aim of increasing the total *t*_{acc} without saturating the electrode, the equilibration time was fixed at 5 s and *t*_{acc} was varied from 5 to 25 s in 5 s intervals, so that five voltammograms were recorded for each sample. N‐PLS/RBL, U‐PLS/RBL and PARAFAC were tested for second‐order multivariate calibration, using the three‐way intensity‐potential‐accumulation time data. The U‐PLS/RBL model was reported to be the best second‐order algorithm for the simultaneous determination of these three compounds. Finally, the proposed method was applied to the analysis of river water samples as real cases, with encouraging results.

Another interesting work, entitled ‘second‐order data obtained from differential pulse voltammetry: determination of lead in river water using multivariate curve resolution‐alternating least‐squares (MCR‐ALS)’, was reported by Abdollahi et al. [24]. In this work, MCR‐ALS was applied to potential‐time second‐order data with the aim of achieving the electrochemical second‐order advantage. A simple approach (changing the pulse duration) was reported as the first route to generating second‐order DPV data. Because a linear dependency exists in the pulse duration profiles of the electroactive species in the mixture samples, the mixture data matrix is rank deficient; this rank deficiency was broken by matrix augmentation. Owing to potential shifts in the obtained data, MCR‐ALS could not achieve convergence on the augmented data, so the shifts were corrected with the *shiftfit* program. The MCR‐ALS results after shift correction show that the proposed method could be efficiently used for the determination of Pb^{2+} in the presence of unexpected interferents in the river water sample.
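
The way matrix augmentation breaks rank deficiency can be illustrated with a small simulation. The sketch below is not the data or code of the cited work: the Gaussian peak shapes, the peak positions, the pulse duration profile and the concentrations are all hypothetical values chosen for illustration. It builds a mixture matrix in which both species share the same pulse duration profile (rank 1) and then stacks it on a standard containing only the analyte (rank 2).

```python
import numpy as np

# Hypothetical potential axis (V) and Gaussian-shaped pure voltammograms.
potential = np.linspace(-0.8, 0.0, 200)

def peak(center, width=0.05):
    return np.exp(-0.5 * ((potential - center) / width) ** 2)

s_analyte = peak(-0.45)   # calibrated analyte (assumed peak position)
s_interf = peak(-0.38)    # overlapping, uncalibrated interferent (assumed)

# Assumed pulse duration profile, identical in shape for both species.
profile = np.array([1.0, 0.8, 0.65, 0.55, 0.5])

# Mixture sample: both species follow the same profile, so the
# pulse duration profiles are linearly dependent.
D_mix = np.outer(1.0 * profile, s_analyte) + np.outer(0.6 * profile, s_interf)

# Standard sample: analyte only.
D_std = np.outer(0.5 * profile, s_analyte)

# Column-wise augmentation: stack the matrices so that they share
# the potential axis (the columns).
D_aug = np.vstack([D_std, D_mix])

rank_mix = np.linalg.matrix_rank(D_mix, tol=1e-8)
rank_aug = np.linalg.matrix_rank(D_aug, tol=1e-8)
print(rank_mix, rank_aug)
```

In this toy case the mixture matrix alone cannot reveal two components, whereas the augmented matrix can, which is the prerequisite for resolving both contributions by a bilinear method such as MCR‐ALS.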

Khoobi et al. coupled DPV with MCR‐ALS for the simultaneous determination of betaxolol (Bet) and atenolol (Ate) at a multi‐walled carbon nanotube modified carbon paste electrode (MWCNT/CPE) [138]. Operating conditions were optimized with a central composite rotatable design (CCRD) and response surface methodology (RSM). Second‐order DPV data were then generated at different pulse heights and, after potential shift correction by the COW algorithm, were analysed by MCR‐ALS. Figure 8 shows the voltammograms resolved by the COW and MCR‐ALS procedures for the seven mixtures of Bet and Ate used to build the calibration curve. Finally, the developed method was successfully applied to the simultaneous determination of Bet and Ate in human plasma.
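
The idea behind potential shift correction can be sketched with a much simpler technique than COW. COW warps the signal piecewise, segment by segment; the example below instead applies a single rigid shift found by maximizing the cross‐correlation with a reference voltammogram. It is therefore only an illustration of the alignment step, not an implementation of COW, and the function name `estimate_shift`, the peak shape and the 5 mV step are assumptions.

```python
import numpy as np

def estimate_shift(reference, signal, max_shift=20):
    """Integer shift (in points) that best aligns `signal` with `reference`."""
    shifts = np.arange(-max_shift, max_shift + 1)
    # Score each candidate shift by the inner product with the reference.
    scores = [np.dot(reference, np.roll(signal, -s)) for s in shifts]
    return int(shifts[int(np.argmax(scores))])

# Hypothetical voltammogram: one Gaussian peak on a 5 mV grid.
potential = np.linspace(0.0, 1.0, 201)
reference = np.exp(-0.5 * ((potential - 0.5) / 0.04) ** 2)

# Simulate a sample whose peak is displaced by 5 points (25 mV).
shifted = np.roll(reference, 5)

shift = estimate_shift(reference, shifted)
aligned = np.roll(shifted, -shift)
print(shift)
```

A rigid shift of this kind is only adequate when the whole voltammogram moves by the same amount; when different peaks shift differently, a warping method such as COW is the more appropriate choice.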

In another work, Khoobi et al. used MCR‐ALS for the determination of dopamine (DA) in the presence of epinephrine (EP) using second‐order DPV data recorded at different pulse heights on a carbon paste electrode modified with gold nanoparticles (AuNPs/CPE) [139]. The CCRD was employed to generate an experimental programme for modelling the effects of different parameters on the voltammetric responses, and the RSM was applied to show the individual and interactive effects of the variables on the responses. The voltammograms of the samples were then collected into a column‐wise augmented data matrix and subsequently analyzed by MCR‐ALS. The effect of rotational ambiguity associated with a particular MCR‐ALS solution under a set of constraints was also studied, and the absence of rotational ambiguity was verified with the MCR‐BANDS method. Finally, the developed methodology gave satisfactory results for the determination of DA in the presence of EP in spiked human blood plasma samples.

Ghoreishi et al. reported a method based on coupling three‐way calibration with second‐order DPV data for the simultaneous quantification of sulfamethizole (SMT) and sulfapyridine (SPY) [140]. After the variables affecting the voltammetric responses had been optimized, the data were corrected for potential shifts by COW and then processed by MCR‐ALS. Finally, the method was applied to the simultaneous determination of SMT and SPY in spiked human serum and urine samples.

Masoum et al. generated second‐order electrochemical data by changing the pulse height as an instrumental parameter [141]. After potential shift correction, the MCR‐ALS results showed that second‐order calibration could be applied with great success to the determination of (+)‐catechin in the presence of gallic acid at the surface of a multi‐walled carbon nanotube modified carbon paste electrode. The ability of the proposed method was evaluated by determining (+)‐catechin in the presence of gallic acid in a green tea sample. In this study, fixed size moving window‐evolving factor analysis (FSMW‐EFA) [142] was used to locate pure variables, zero‐concentration regions and selective regions. The result of FSMW‐EFA is shown in Figure 9. In the peak region of the FSMW‐EFA plot, two curves lie above the noise level. Regions with no curve above the noise level are zero regions; regions with one curve above the noise level are pure regions (part 1 for gallic acid and part 2 for (+)‐catechin); and regions with two or more curves above the noise level are overlapped regions (part 1 + 2). As shown in Figure 9, both components have a selective region, so the selectivity constraint can be applied. The problem of rank deficiency was solved by analysing the rank‐deficient matrix together with other matrices, augmented in the column direction, that carry suitable information to detect the presence of the hidden components [143].
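
The reading of an FSMW‐EFA plot described above can be reproduced on simulated data. The following is a minimal sketch, assuming a window that slides along the potential axis and a fixed noise threshold; the function name `fsmw_efa`, the two Gaussian peaks, the concentration matrix `C` and the threshold are hypothetical and do not come from the cited study. The number of singular values above the threshold in each window gives the local rank: 0 in zero regions, 1 in selective regions and 2 in overlapped regions.

```python
import numpy as np

def fsmw_efa(D, window):
    """Singular values of every fixed-size window of adjacent columns of D."""
    n_windows = D.shape[1] - window + 1
    return np.array([
        np.linalg.svd(D[:, start:start + window], compute_uv=False)
        for start in range(n_windows)
    ])

# Hypothetical pure voltammograms on a 5 mV grid.
potential = np.linspace(0.0, 1.0, 201)
s_a = np.exp(-0.5 * ((potential - 0.40) / 0.04) ** 2)
s_b = np.exp(-0.5 * ((potential - 0.50) / 0.04) ** 2)
S = np.vstack([s_a, s_b])

# Assumed contributions of the two components in four rows
# (two single-component standards and two mixtures).
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.5], [0.5, 1.0]])
D = C @ S  # noise-free bilinear data, rows x potentials

noise_level = 1e-3  # assumed threshold standing in for the noise level
sv = fsmw_efa(D, window=7)
local_rank = (sv > noise_level).sum(axis=1)

# Window starting at 0.000 V: no signal; at 0.285 V: first component only;
# at 0.435 V: both components overlap.
print(local_rank[0], local_rank[57], local_rank[87])
```

With real data the threshold would be read from the singular values that track the noise, typically on a logarithmic plot, rather than fixed in advance.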

A work entitled ‘application of Fe doped ZnO nanorods‐based modified sensor for determination of sulfamethoxazole (SMX) and sulfamethizole (SMT) using chemometric methods in voltammetric studies’ was reported by Meshki et al. [144]. In this work, second‐order DPV data were produced by changing the pulse height and, after potential shift correction with the COW algorithm, were further processed by MCR‐ALS to exploit the second‐order advantage. The potential shift correction was carried out on a column‐wise augmented data matrix that contained the 13‐sample calibration set of SMX and SMT. MCR‐ALS was then performed on the corrected augmented data, and the lack of fit was lower than that obtained without potential shift correction. Finally, the proposed method was applied to the simultaneous determination of SMX and SMT in human blood serum and urine samples.

Jalalvand et al. reported the generation of second‐order DPV data by changing the pulse height and their application to the simultaneous quantification of norepinephrine (NE), paracetamol (AC) and uric acid (UA) in the presence of pteroylglutamic acid (FA) as an uncalibrated interference at an electrochemically oxidized glassy carbon electrode (OGCE) [145]. Several second‐order calibration models based on ANN‐RBL, U‐PLS/RBL, N‐PLS/RBL, MCR‐ALS and PARAFAC2 were used to exploit the second‐order advantage and to identify which technique offers the best predictions. The baseline of the DPV signals was corrected by the asymmetric least squares spline regression (AsLSSR) algorithm, and the observed potential shifts were corrected using the COW algorithm. All the algorithms achieved the second‐order advantage and were in principle able to overcome the presence of the unexpected interference. Comparison of the applied second‐order algorithms confirmed the superiority of U‐PLS/RBL for resolving complex systems (see Figure 10). The results of applying U‐PLS/RBL to the simultaneous quantification of the studied analytes in human serum samples were also encouraging.
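
Baseline removal by asymmetric weighting can be sketched as follows. The example is not AsLSSR, which fits splines; it is the simpler classical asymmetric least squares smoother with a second‐difference penalty, shown here only to convey the principle that points above the current baseline estimate receive a small weight `p` and points below it a weight `1 - p`. The function name `asls_baseline`, the parameter values and the simulated signal are assumptions.

```python
import numpy as np

def asls_baseline(y, lam=1e7, p=0.001, n_iter=20):
    """Asymmetric least squares baseline with a second-difference penalty."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)   # second-order difference operator
    penalty = lam * D.T @ D                # smoothness term
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted penalized least squares fit of the baseline z.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get a small weight.
        w = np.where(y > z, p, 1.0 - p)
    return z

# Hypothetical voltammogram: sloping baseline plus one Gaussian peak.
x = np.linspace(0.0, 1.0, 200)
baseline = 0.2 + 0.5 * x
y = baseline + np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)

z = asls_baseline(y)
corrected = y - z
print(float(np.max(np.abs(z - baseline))))
```

The smoothness parameter `lam` and the asymmetry `p` have to be tuned for each type of signal; the dense matrices used here are acceptable for a few hundred points, whereas longer signals would call for a sparse solver.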

Mora Diez et al. developed a method based on DPV coupled to second‐order data modelling with MCR‐ALS and U‐PLS/RBL for the quantitation of the pesticide ethiofencarb in the presence of fenobucarb and bendiocarb as interferences in tap water [146]. In this study, second‐order data were generated by using the hydrolysis time as the third variable and were modelled with MCR‐ALS and U‐PLS/RBL. Asymmetric least squares background correction adapted to second‐order data was used to remove the baseline of the data (Figure 11A). Figure 11B shows the voltammograms retrieved by MCR‐ALS for the three components in validation sample number 1. As can be seen, there is a high degree of overlap among the analyte and interferent signals. In addition, Figure 11C shows the corresponding time evolution profiles in this particular sample and those retrieved for three ethiofencarb standard samples. The areas under the kinetic profiles were used to build a calibration curve from which the concentration of ethiofencarb in the validation samples was obtained. After model building by U‐PLS/RBL, the outputs of MCR‐ALS and U‐PLS/RBL were compared by the elliptical joint confidence region (EJCR) method, which confirmed that U‐PLS/RBL performed better than MCR‐ALS.
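
The pseudo‐univariate calibration step, in which the areas under the resolved kinetic profiles are regressed against the standard concentrations, can be sketched as follows. All numbers are hypothetical: the time grid, the shape of the kinetic profile, the standard concentrations and the validation sample are invented for illustration and stand in for the profiles that MCR‐ALS would return.

```python
import numpy as np

def area(profile, t):
    """Area under a profile by the trapezoidal rule."""
    return float(np.sum((profile[1:] + profile[:-1]) * np.diff(t)) / 2.0)

# Hypothetical hydrolysis times and an assumed kinetic profile shape.
t = np.linspace(0.0, 10.0, 11)
shape = 1.0 - np.exp(-0.3 * t)

# Profiles resolved for three standards (assumed proportional to concentration).
conc_std = np.array([1.0, 2.0, 3.0])
areas_std = np.array([area(c * shape, t) for c in conc_std])

# Calibration line: area = slope * concentration + intercept.
slope, intercept = np.polyfit(conc_std, areas_std, 1)

# Profile resolved for a validation sample with a nominal value of 2.4.
profile_val = 2.4 * shape
conc_pred = (area(profile_val, t) - intercept) / slope
print(round(conc_pred, 3))
```

Because the interferents are resolved into their own profiles, only the analyte profile enters this regression, which is how the second‐order advantage is obtained with MCR‐ALS.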

Granero et al. reported a method based on three‐way calibration with second‐order square wave voltammetric (SWV) data for the simultaneous determination of ascorbic acid, uric acid and dopamine in the presence of glucose as an interfering species in lyophilized human serum samples [147]. The second‐order data were baseline‐ and shift‐corrected by the AsLSSR and COW algorithms, respectively, and then modelled by the U‐PLS/RBL second‐order algorithm. Finally, the developed analytical method was successfully applied to the determination of ascorbic acid, uric acid and dopamine in lyophilized human serum samples.

Jaworski et al. reported the application of multi‐way chemometric techniques to the analysis of AC voltammetric data [148]. In this study, three multi‐way calibration techniques were applied to determine the suppressor concentration in industrial copper electrometallization baths used in semiconductor manufacturing. PARAFAC for multi‐way array decomposition coupled with inverse least squares (ILS) regression (PARAFAC/ILS), DTLD coupled with ILS (DTLD/ILS) and multilinear partial least squares (N‐PLS) regression were employed to develop and test calibration models based on trilinear AC voltammetric data. The difficulties associated with the physical interpretation of very complex AC voltammograms were tackled with powerful chemometric tools, which have played a significant role in reviving interest in real‐life applications of AC based electroanalytical techniques.

Recently, an interesting work was published by Jalalvand et al., which reports the coupling of four‐way multivariate calibration with third‐order DPV data [62]. To achieve this goal, the DPV response of each sample was recorded 36 times: six current‐potential matrices were recorded at six different pulse durations, and each matrix consists of six vectors recorded at six different pulse heights. The three‐way data arrays obtained for the calibration samples and for each of the test samples were joined into a single four‐way data array. The recorded data were baseline‐corrected by AsLSSR. Because the data array was non‐linear, the non‐linearities were tackled by potential shift correction using the COW algorithm (see Figure 12), and the corrected array was subsequently analysed with U‐PLS/RTL and N‐PLS/RTL as third‐order multivariate calibration algorithms. A comprehensive and systematic strategy for comparing the performance of the two algorithms was presented in this work, in particular with a view to practical applications. The comparison was designed to identify which algorithm offers the best predictions for the simultaneous determination of levodopa (LD), carbidopa (CD), methyldopa (MD), acetaminophen (AC), tramadol (TRA), lidocaine (LC), tolperisone (TOP), ofloxacin (OF), levofloxacin (LOF) and norfloxacin (NOF) in the presence of benserazide (BA), dopamine (DP) and ciprofloxacin (COF) as uncalibrated interferences using a multi‐walled carbon nanotubes modified glassy carbon electrode (MWCNTs/GCE). This study demonstrated the superiority of U‐PLS/RTL for resolving complex systems. The results of applying U‐PLS/RTL to the simultaneous determination of the studied analytes in human serum samples as experimental cases were also encouraging.
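
The data layout described above can be made concrete with a short sketch. Only the 6 × 6 design (six pulse durations by six pulse heights, 36 voltammograms per sample) is taken from the text; the number of calibration samples, the number of potential points and the random numbers standing in for measured currents are hypothetical. The code stacks the three‐way arrays of the samples into a four‐way array and then unfolds each sample into one long vector, which is the form in which unfolded PLS models take their input.

```python
import numpy as np

# 6 pulse durations x 6 pulse heights per sample (from the text);
# n_cal and n_potentials are hypothetical.
n_cal, n_durations, n_heights, n_potentials = 5, 6, 6, 100

rng = np.random.default_rng(0)

# One three-way array (duration x height x potential) per sample;
# random numbers stand in for the measured currents.
cal_arrays = [rng.random((n_durations, n_heights, n_potentials))
              for _ in range(n_cal)]

# Join the three-way arrays along a new sample mode: four-way array.
X4 = np.stack(cal_arrays, axis=0)

# Unfold each sample into a single row vector of 36 concatenated voltammograms.
X_unfolded = X4.reshape(n_cal, -1)

print(X4.shape, X_unfolded.shape)
```

N‐PLS, in contrast, works on the four‐way array `X4` directly and keeps the multi‐way structure during decomposition.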

## 4. Conclusions

Multi‐dimensional data are being abundantly produced by modern analytical instrumentation, calling for new and powerful data‐processing techniques. Research in the last two decades has resulted in the development of a multitude of processing algorithms, each with its own sophisticated set of tools. Going from univariate data (a single datum per sample, employed in the well‐known classical univariate calibration) to multivariate data (data arrays per sample of increasingly complex structure and number of dimensions) is known to provide a gain in sensitivity and selectivity, combined with considerable analytical advantages. Nowadays, chemometrics is essential to exploiting the extraordinary potential of modern analytical instruments. This has been widely demonstrated with different types of signals, and electroanalytical chemistry cannot ignore this dominant trend. Modern electrochemical instrumentation provides reliable and reproducible data that are the basis of analytical methods with very low quantitation limits. Electrochemical methods are very attractive for coupling with multi‐way calibration because they provide excellent and low‐cost opportunities for accurate and reliable determination of analyte(s) and because their adjustable instrumental parameters make them very suitable for generating second‐ and third‐order data. The second‐order advantage, achieved with second‐ (or higher‐) order sample data, allows one not only to flag new samples containing components which do not occur in the calibration phase but also to model their contribution to the overall signal and, most importantly, to accurately quantitate the calibrated analyte(s). Voltammetric measurements assisted by multi‐way calibration are producing increasingly complex data structures, whose appropriate chemometric processing opens new dimensions in analytical studies.
Improved sensitivity and selectivity, the possibility of analyte quantitation in the presence of uncalibrated interferents, and the possibility of interpreting chemical phenomena qualitatively through the study of multi‐way data, in a much better way than with univariate or first‐order data, are some of the advantages which can be achieved. The main problems in coupling voltammetric data with multi‐way calibration are the baseline of the signals and sample‐to‐sample potential shifts in the analyte profiles, and chemometric tools are well suited to tackling both. Algorithms such as COW, icoshift, *shiftfit*, *pHfit*, GPA, GPA2D and ALPA can be used to correct the shifts, and the baseline of the signals can be removed with AsLSSR, a powerful chemometric tool. On the whole, voltammetric measurements assisted by multi‐way calibration are gaining the attention of scientists, and we hope this review will help to promote the use of multi‐way calibration in electroanalytical chemistry.

## Acknowledgments

ARJ hopes that this chapter will be useful to electrochemists and will promote the use of chemometrics in electrochemistry.