Process Monitoring Using Data-Based Fault Detection Techniques: Comparative Studies

Mohammed Ziyan Sheriff; Chiranjivi Botre; Majdi Mansouri; Hazem
Nounou; Mohamed Nounou; Mohammad Nazmul Karim

doi:10.5772/67347

Abstract

Data-based monitoring methods are often used to carry out fault detection (FD) when process models are not available. Partial least squares (PLS) and principal component analysis (PCA) are two basic multivariate FD methods; however, both can only be used to monitor linear processes, and both have therefore been extended to handle nonlinear data. Among these extended data-based methods, kernel PCA (KPCA) and kernel PLS (KPLS) are the most well known and widely adopted. KPCA and KPLS models have several advantages: they do not require nonlinear optimization, since only the solution of an eigenvalue problem is required. They also provide a better understanding of what kind of nonlinear features are extracted: the number of principal components (PCs) in the feature space is fixed a priori by selecting the appropriate kernel function. Therefore, the objective of this work is to use KPCA and KPLS techniques to monitor nonlinear data. The improved FD performance of KPCA and KPLS is illustrated through two simulated examples, one using synthetic data and the other using simulated continuous stirred tank reactor (CSTR) data. The results demonstrate that both KPCA and KPLS provide better detection than their linear counterparts.

Keywords

principal component analysis
partial least squares
kernels
fault detection
process monitoring

Author Information


Mohammed Ziyan Sheriff
- Artie McFerrin Department of Chemical Engineering, College Station, TX, USA
- Chemical Engineering Program, Texas A&M University at Qatar, Doha, Qatar
Chiranjivi Botre
- Artie McFerrin Department of Chemical Engineering, College Station, TX, USA
Majdi Mansouri
- Electrical and Computer Engineering Program, Texas A&M University at Qatar, Doha, Qatar
Hazem Nounou
- Electrical and Computer Engineering Program, Texas A&M University at Qatar, Doha, Qatar
Mohamed Nounou*
- Chemical Engineering Program, Texas A&M University at Qatar, Doha, Qatar
Mohammad Nazmul Karim
- Artie McFerrin Department of Chemical Engineering, College Station, TX, USA

*Address all correspondence to: mohamed.nounou@qatar.tamu.edu

1. Introduction

Process monitoring is an essential aspect of nearly all industrial processes, often required both to ensure safe operation and to maintain product quality. Process monitoring is generally carried out in two phases: detection and diagnosis. This chapter focuses only on the fault detection aspect. Fault detection methods can be categorized in a number of different ways. One popular categorization divides them into quantitative model-based methods, qualitative model-based methods, and data (process history)-based methods [1–3]. Figure 1 illustrates a general schematic of the fault detection phase.

Figure 1.
Schematic illustration of detection phase.

Quantitative model-based methods require knowledge of the process model, while qualitative model-based methods require expert knowledge of the given process. Hence, data-based methods are often used as they require neither prior knowledge of the process model nor expert knowledge of the process [4].

Data-based monitoring methods can be further classified into input model-based methods and input-output model-based methods. Input model-based methods only require the data matrix of the input process variables, while input-output model-based methods require both the input and output data matrices in order to formulate a model and carry out fault detection [5]. Input model-based methods are sometimes utilized when the input-output models cannot be formed due to the high dimensionality and complexity of a system being monitored [6]. However, input-output model-based methods do have the added advantage of being able to detect faults in both the process and the variables [5].

Principal component analysis (PCA) is a widely used input model-based method that has been used for monitoring a number of processes including air quality [7], water treatment [8], and semiconductor manufacturing [9]. On the other hand, partial least squares (PLS) is an input-output model-based method that has been applied in chemical processes to monitor online measurement variables and also to monitor and predict the output quality variable [10]. PLS has been applied to the monitoring of distillation columns, batch reactors [11], continuous polymerization processes [12], and other similar industrial processes that are described by input-output models. However, both PCA and PLS are fault detection techniques that only work reasonably well with linear data. PCA and PLS have been extended to handle nonlinear data by utilizing kernels to transform the data to a higher dimensional space, where linear relationships between variables can be drawn. The extensions, kernel principal component analysis (KPCA) and kernel partial least squares (KPLS), have both shown improved performance over the conventional PCA and PLS techniques when handling nonlinear data [5, 13]. T² and Q charts are commonly used as fault detection statistics. In the literature, the T² test has been found to be a less effective fault detection technique than the Q statistic, because the T² test captures only the variation of the data in the principal component subspace and not in the residual of the model [14].

In our previous works [5, 13, 15], we addressed the problem of fault detection using linear and nonlinear input models (PCA and kernel PCA) and input-output model (PLS and kernel PLS)-based generalized likelihood ratio test (GLRT), in which PCA, kernel PCA, PLS, and kernel PLS methods are used for modeling and the univariate GLRT chart is used for fault detection. In the current work, we propose to use the PCA, kernel PCA, PLS, and kernel PLS methods for multivariate fault detection through their multivariate charts Q and T2. The fault detection performance is evaluated using two examples, one using simulated synthetic data and the other utilizing a simulated continuous stirred tank reactor (CSTR) model.

The remainder of this chapter is organized as follows. Section 2 introduces linear PCA and PLS, along with the fault detection indices used for these methods. Section 3 then describes the idea of using kernels for nonlinear transformation of data, along with the kernel fault detection extensions: KPCA and KPLS. In Section 4, two illustrative examples are presented, one using simulated synthetic data and the other utilizing a simulated continuous stirred tank reactor. Finally, the conclusions are presented in Section 5.

2. Conventional linear fault detection methods

Before constructing either the PCA or PLS models, data are generally preprocessed to ensure that all process variables in the data matrix are scaled to zero mean and unit variance. This step is essential as different process variables are usually measured with varying standard deviations and means and often using different units.
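The preprocessing step can be sketched as follows. This is a minimal illustration, assuming NumPy and hypothetical helper names (`fit_scaler`, `apply_scaler`); the scaling parameters are estimated from the training data only and then reused for the testing data, which is a common convention rather than something the chapter prescribes.

```python
import numpy as np

def fit_scaler(X_train):
    """Return the per-variable mean and standard deviation of the training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return mean, std

def apply_scaler(X, mean, std):
    """Scale X to zero mean and unit variance using training statistics."""
    return (X - mean) / std

# Illustrative data: two variables with very different means and spreads
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[10.0, 300.0], scale=[0.5, 20.0], size=(200, 2))
X_test = rng.normal(loc=[10.0, 300.0], scale=[0.5, 20.0], size=(50, 2))

mean, std = fit_scaler(X_train)
Z_train = apply_scaler(X_train, mean, std)
Z_test = apply_scaler(X_test, mean, std)
```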

2.1. Principal component analysis (PCA)

Consider the input data matrix $X \in \mathbb{R}^{n \times m}$, where m and n represent the number of process variables and the number of observations, respectively. After preprocessing the data, singular value decomposition (SVD) can be utilized to express the input data matrix as follows:

$$X = TP^{T} \tag{1}$$

where $T = [t_1, t_2, t_3, \ldots, t_m] \in \mathbb{R}^{n \times m}$ is the matrix of transformed variables, whose columns are the score vectors, and $P = [p_1, p_2, p_3, \ldots, p_m] \in \mathbb{R}^{m \times m}$ is a matrix of orthogonal vectors, whose columns are the loading vectors; these are the eigenvectors of the covariance matrix of the input data matrix X. The covariance matrix can be computed as follows [13]:

$$\Sigma = \frac{1}{n-1} X^{T} X = P \Lambda P^{T}, \quad \text{with } P P^{T} = P^{T} P = I_m \tag{2}$$

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m)$ is a diagonal matrix that contains the eigenvalues associated with the m principal components, ordered as $\lambda_1 > \lambda_2 > \ldots > \lambda_m$, and $I_m$ is the identity matrix [16]. It should be noted that the full model built by PCA uses the same number of principal components as the original number of process variables in the input data matrix (m). However, since many industrial processes contain process variables that are highly correlated, a smaller number of principal components can be utilized to capture the variation in the process data [6]. The quality of the model built by PCA is dictated by the number of principal components retained. Overestimating the number could introduce noise that may mask important features in the data, while underestimating the number could decrease the prediction ability of the model [17].

Therefore, selection of the number of principal components is vital, and several methods have been developed for this purpose. A few popular techniques are cumulative percent variance (CPV) [13], scree plot and profile likelihood [18], and cross validation [19]. CPV is commonly utilized due to its computational simplicity and because it provides a good estimate of the number of principal components that need to be retained for most practical applications. CPV can be computed as follows [13]:

$$\mathrm{CPV}(l) = \frac{\sum_{i=1}^{l} \lambda_i}{\mathrm{trace}(\Sigma)} \times 100 \tag{3}$$

CPV is used to select the smallest number of principal components that represents a certain percentage of the total variance (e.g., 99%). Once the number of principal components to retain is determined, the input data matrix can then be expressed as [13]:

$$X = TP^{T} = [\hat{T}\ \ \tilde{T}]\,[\hat{P}\ \ \tilde{P}]^{T} \tag{4}$$

where $\hat{T} \in \mathbb{R}^{n \times l}$ and $\tilde{T} \in \mathbb{R}^{n \times (m-l)}$ represent the matrices containing the l retained principal components and the ignored (m − l) principal components, respectively. Likewise, the matrices that contain the l retained eigenvectors and the ignored (m − l) eigenvectors are represented by $\hat{P} \in \mathbb{R}^{m \times l}$ and $\tilde{P} \in \mathbb{R}^{m \times (m-l)}$, respectively.

After expansion, X can be expressed as [13]

$$X = \hat{T}\hat{P}^{T} + \tilde{T}\tilde{P}^{T} = \overbrace{X\hat{P}\hat{P}^{T}}^{\hat{X}} + \overbrace{X\left(I_m - \hat{P}\hat{P}^{T}\right)}^{E} \tag{5}$$

where matrix $\hat{X}$ is the modeled variation of X computed utilizing only the l retained principal components, while matrix E represents the residual space formed by variations that correspond to process noise.

The PCA model can be illustrated as shown in Figure 2.

Figure 2.
Schematic illustration of PCA.
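The PCA model of Eqs. (2) to (5) can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy, a hypothetical function name `pca_cpv`, and randomly generated correlated data; it selects the smallest number of principal components whose CPV reaches the chosen target and then splits X into its modeled and residual parts.

```python
import numpy as np

def pca_cpv(X, cpv_target=99.0):
    """PCA on scaled data X (n x m); keep the fewest PCs with CPV >= cpv_target."""
    n = X.shape[0]
    cov = X.T @ X / (n - 1)                       # covariance matrix, Eq. (2)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # sort eigenvalues descending
    eigvals, P = eigvals[order], eigvecs[:, order]
    cpv = 100.0 * np.cumsum(eigvals) / np.sum(eigvals)    # Eq. (3)
    l = int(np.searchsorted(cpv, cpv_target) + 1)         # first l reaching target
    l = min(l, X.shape[1])
    return eigvals, P, l

# Illustrative correlated data: 5 variables driven by 2 latent factors
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(300, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, P, l = pca_cpv(X, cpv_target=99.0)
P_hat = P[:, :l]
X_hat = X @ P_hat @ P_hat.T      # modeled part of Eq. (5)
E = X - X_hat                    # residual part of Eq. (5)
```

Because the residual is computed as the difference between X and its projection, the two parts always add back to X regardless of how many components are retained.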

2.2. Partial least squares (PLS)

PLS is a popular input-output technique used for modeling, regression, and classification, which has been extended to fault detection [20]. PLS relates the process variables ($X \in \mathbb{R}^{n \times m}$) and the quality variables ($Y \in \mathbb{R}^{n \times p}$) through a linear relationship between the input and output score vectors. The nonlinear iterative partial least squares (NIPALS) algorithm developed by Wold et al. is used to compute the score matrices and loading vectors [21]:

$$\begin{aligned} X &= TP^{T} + E = \sum_{i=1}^{M} t_i p_i^{T} + E \\ Y &= UQ^{T} + F = \sum_{j=1}^{M} u_j q_j^{T} + F \end{aligned} \tag{6}$$

where $E \in \mathbb{R}^{n \times m}$ and $F \in \mathbb{R}^{n \times p}$ are the PLS model residuals; $T \in \mathbb{R}^{n \times M}$ and $U \in \mathbb{R}^{n \times M}$ are the orthonormal input and output score matrices, respectively; P and Q are the loading matrices of the input (X) and output (Y) matrices, respectively; m and n are the number of process variables and observations in the input matrix (X); p is the number of quality variables in the output matrix (Y); and M is the total number of latent variables extracted. The NIPALS method is shown in Algorithm 1. The X and Y matrices are first standardized to zero mean and unit variance. The NIPALS algorithm is initialized by assigning one of the columns of the output matrix (Y) as the output score vector (u); at each iteration, the vectors t, u, p, and q are computed and stored, until M latent variables have been extracted.

A further modification of the NIPALS algorithm has been published in the literature [22], and several variations of the PLS technique have been reported elsewhere. Qin et al. presented a recursive PLS model [23], in which the PLS model is updated with new training data sets, and MacGregor et al. [24] developed a multiblock PLS model to monitor subsections of the process variables. For the monitoring of batch processes, PLS has been extended to the multiway PLS technique [25], which incorporates past batches in the training data set.

Being an input-output model, PLS can also be used as a regression tool to predict the quality variables (Y) from the online measurement variables (X). In the PLS model, the input and output matrices are related by

$$Y = XB + G \tag{7}$$

The regression coefficient B is computed as shown in Eq. (8):

$$B = W\left(P^{T}W\right)^{-1}C^{T} \tag{8}$$

Substituting the weights W, the loading matrix P, and the constant C from Algorithm 1:

$$B = X^{T}U\left(T^{T}XX^{T}U\right)^{-1}T^{T}Y \tag{9}$$

From Eqs. (7) and (9), the output matrix (Y) is predicted as

$$Y_{new} = X_{new}X^{T}U\left(T^{T}XX^{T}U\right)^{-1}T^{T}Y + G \tag{10}$$

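Algorithm 1: Modified NIPALS algorithm

1. Initialize the output score: $u = y_i$ (11)
2. Regress the weights on X: $w = X^{T}u / (u^{T}u)$ (12)
3. Normalize the weight: $w = w / \lVert w \rVert$ (13)
4. Compute the weights R: $r_1 = w_1$ and $r_j = \prod_{i=1}^{j-1}\left(I_{m \times m} - w_i p_i^{T}\right) w_j$ for $j > 1$ (14)
5. Compute the input score vector: $t = Xr / (r^{T}r)$ (15)
6. Compute the input loading vector: $p = X^{T}t / (t^{T}t)$ (16)
7. Compute the output loading vector: $q = Y^{T}t / (t^{T}t)$ (17)
8. Compute the output score vector: $u = Yq / (q^{T}q)$ (18)
9. Normalize the weights, loading vectors, and scores: $p = p / \lVert p \rVert$, $w = w \lVert p \rVert$, $t = t \lVert p \rVert$ (19)
10. Deflate the input and output matrices: $X = X - tp^{T}$, $Y = Y - tq^{T}$ (20)
11. Store the latent score vectors in T and U and the loading vectors in P and Q.
12. Repeat steps 2 to 11 until M latent variables are computed.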
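For readers who want to experiment, the following is a minimal sketch of a textbook NIPALS PLS loop in NumPy. It is not the modified algorithm listed above: it iterates the inner loop to convergence and omits the R weights and the normalization of step 9. The function name `nipals_pls`, the tolerance, and the test data are assumptions made for illustration. The regression matrix is formed as in Eq. (8), taking the output loadings Q in the role of C.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-12, max_iter=500):
    """Textbook NIPALS PLS on centered X (n x m) and centered Y (n x p)."""
    X, Y = X.copy(), Y.copy()
    n, m = X.shape
    n_out = Y.shape[1]
    T = np.zeros((n, n_components)); U = np.zeros((n, n_components))
    W = np.zeros((m, n_components)); P = np.zeros((m, n_components))
    Q = np.zeros((n_out, n_components))
    for a in range(n_components):
        u = Y[:, [0]].copy()                    # initialize with a column of Y
        for _ in range(max_iter):
            w = X.T @ u / (u.T @ u)             # weights regressed on X
            w /= np.linalg.norm(w)
            t = X @ w                           # input score vector
            q = Y.T @ t / (t.T @ t)             # output loading vector
            u_new = Y @ q / (q.T @ q)           # output score vector
            done = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if done:
                break
        p_vec = X.T @ t / (t.T @ t)             # input loading vector
        X -= t @ p_vec.T                        # deflation as in Eq. (20)
        Y -= t @ q.T
        T[:, [a]], U[:, [a]] = t, u
        W[:, [a]], P[:, [a]], Q[:, [a]] = w, p_vec, q
    B = W @ np.linalg.inv(P.T @ W) @ Q.T        # Eq. (8) with C taken as Q
    return T, U, W, P, Q, B

# Illustrative linear data with a small amount of output noise
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3)); X -= X.mean(axis=0)
Y = X @ rng.normal(size=(3, 2)) + 0.1 * rng.normal(size=(60, 2)); Y -= Y.mean(axis=0)

T, U, W, P, Q, B = nipals_pls(X, Y, n_components=3)
Y_hat = X @ B                                   # prediction as in Eq. (7), without G
```

When all latent variables are extracted and X has full column rank, the PLS regression matrix coincides with the ordinary least-squares solution, which gives a convenient sanity check for an implementation.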

2.3. Fault detection indices

Different fault detection indices can be used for the linear PCA and PLS techniques. The two most popular indices are the T² and Q statistics. T² measures the variation of the model, while the Q statistic measures the variation in the residual space, and these statistics will be described next.

2.3.1. T² statistic

The T² statistic measures the variation in the principal components at different time samples and is defined as follows [26, 27]:

$$T^{2} = X^{T}\hat{P}\hat{\Lambda}^{-1}\hat{P}^{T}X \tag{21}$$

where $\hat{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_l)$ is the diagonal matrix that contains the eigenvalues associated with the retained principal components. For testing data, a fault is declared when the T² value exceeds the value of the threshold as follows:

$$T^{2} \ge T_{\alpha}^{2} = \frac{(n^{2}-1)\,l}{n(n-l)}\,F_{\alpha}(l, n-l) \tag{22}$$

where α is the level of significance, generally chosen so that the confidence level lies between 90 and 99%, and $F_{\alpha}(l, n-l)$ is the critical value of the Fisher-Snedecor distribution with l and n − l degrees of freedom.

2.3.2. Q statistic

The Q statistic measures the projection of the data on to the residual subspace and allows the user to measure how well the data fit the PCA model. The Q statistic is defined as follows [16]:

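$$Q = \lVert \tilde{X} \rVert^{2} = \lVert (I - \hat{P}\hat{P}^{T})X \rVert^{2} \tag{23}$$

For testing data, a fault is declared when the threshold value is violated as follows [16]:

$$Q \ge Q_{\alpha} = \varphi_1 \left[ \frac{h_0 c_{\alpha}\sqrt{2\varphi_2}}{\varphi_1} + 1 + \frac{\varphi_2 h_0 (h_0 - 1)}{\varphi_1^{2}} \right]^{1/h_0} \tag{24}$$

where $\varphi_i = \sum_{j=l+1}^{m} \lambda_j^{i}$ for $i = 1, 2, 3$, $h_0 = 1 - \dfrac{2\varphi_1\varphi_3}{3\varphi_2^{2}}$, and $c_{\alpha}$ is the value of the standard normal distribution corresponding to the significance level α.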
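A minimal sketch of the two indices for a PCA model is given below. It assumes NumPy and the Python standard library, uses hypothetical function names, and treats `alpha` as the upper-tail probability (for example 0.01 for a 99% limit). The T² limit of Eq. (22) needs an F-distribution quantile and is left out here; only the Q limit of Eq. (24) is computed.

```python
import numpy as np
from statistics import NormalDist

def t2_statistic(X, P_hat, eigvals_hat):
    """T^2 of each row of X: squared scores scaled by the retained eigenvalues."""
    scores = X @ P_hat
    return np.sum(scores**2 / eigvals_hat, axis=1)

def q_statistic(X, P_hat):
    """Q of each row of X: squared norm of the residual, Eq. (23)."""
    residual = X - X @ P_hat @ P_hat.T
    return np.sum(residual**2, axis=1)

def q_threshold(eigvals, l, alpha=0.01):
    """Q limit of Eq. (24), built from the discarded eigenvalues."""
    discarded = eigvals[l:]
    phi1, phi2, phi3 = (np.sum(discarded**i) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * phi1 * phi3 / (3.0 * phi2**2)
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)   # standard normal quantile
    term = (h0 * c_alpha * np.sqrt(2.0 * phi2) / phi1
            + 1.0 + phi2 * h0 * (h0 - 1.0) / phi1**2)
    return phi1 * term ** (1.0 / h0)

# Illustrative data: 4 scaled variables driven by 2 latent factors
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(X.T @ X / (len(X) - 1))
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]      # descending order

l = 2                                                   # retained components (assumed)
t2 = t2_statistic(X, eigvecs[:, :l], eigvals[:l])
q = q_statistic(X, eigvecs[:, :l])
q_lim = q_threshold(eigvals, l, alpha=0.01)
alarms = q > q_lim                                      # flagged observations
```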

3. Nonlinear fault detection methods using kernel transformations

A popular way to extend PCA and PLS to nonlinear data is to project the data into a high-dimensional feature space, F, where the linear fault detection method is then applied. The authors in Ref. [28] used a projection of X for PLS response surface modeling, with a quadratic function as the mapping function:

$$\Phi: \mathcal{X} = \mathbb{R}^{2} \rightarrow F = \mathbb{R}^{3} \tag{25}$$

However, it is difficult to know the exact nonlinear transformation that makes a nonlinear data matrix linear in the feature space. According to Mercer’s theorem, a symmetric positive semi-definite function can be used to map the data into the feature space without knowing the explicit nonlinear function. This function is called the kernel function and is defined as the dot product of the mapped data in the feature space:

$$k(X_i, X_j) = \langle \Phi(X_i), \Phi(X_j) \rangle \tag{26}$$

Thus, kernel-based multivariate methods can be defined as nonlinear fault detection methods in which the input data matrix is mapped into high-dimensional feature space and developed linear models can be applied in the feature space for fault detection purposes.
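A small numerical check makes the kernel idea concrete. The explicit quadratic map used below, from two to three dimensions, is the common textbook form and is assumed here for illustration; it is not necessarily the exact map used in Ref. [28]. The check confirms that the dot product of the mapped points equals the squared dot product of the original points, so the feature-space inner product can be evaluated without forming the map.

```python
import numpy as np

def phi(x):
    """Explicit quadratic map from R^2 to R^3 (assumed textbook form)."""
    return np.array([x[0]**2, np.sqrt(2.0) * x[0] * x[1], x[1]**2])

def k_poly2(x, y):
    """Homogeneous polynomial kernel of degree 2, evaluated in the input space."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

lhs = float(np.dot(phi(x), phi(y)))   # inner product in the feature space
rhs = k_poly2(x, y)                   # kernel evaluated in the input space
```

For these two points the input-space dot product is 1, so both quantities evaluate to 1.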

Commonly used kernel functions are given below [29]:

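Radial basis function (RBF):

$$K(X, Y) = \exp\left(-\frac{\lVert X - Y \rVert^{2}}{c}\right) \tag{27}$$

Polynomial function:

$$K(X, Y) = \langle X, Y \rangle^{d} \tag{28}$$

Sigmoid function:

$$K(X, Y) = \tanh\left(\beta_0 \langle X, Y \rangle + \beta_1\right) \tag{29}$$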
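The three kernels can be evaluated for whole data matrices at once. The sketch below assumes NumPy and hypothetical function names; the parameters `c`, `d`, `beta0`, and `beta1` correspond to the symbols in Eqs. (27) to (29), and the values used in the example are arbitrary.

```python
import numpy as np

def rbf_kernel(A, B, c):
    """Eq. (27): exp(-||a - b||^2 / c) for every pair of rows of A and B."""
    sq = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / c)   # clip tiny negative round-off

def polynomial_kernel(A, B, d):
    """Eq. (28): dot product raised to the power d."""
    return (A @ B.T) ** d

def sigmoid_kernel(A, B, beta0, beta1):
    """Eq. (29): hyperbolic tangent of a scaled and shifted dot product."""
    return np.tanh(beta0 * (A @ B.T) + beta1)

rng = np.random.default_rng(4)
X = rng.normal(size=(20, 3))
K_rbf = rbf_kernel(X, X, c=2.0)
K_poly = polynomial_kernel(X, X, d=2)
K_sig = sigmoid_kernel(X, X, beta0=0.1, beta1=0.0)
```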

The next section describes the methodology of utilizing kernel transformations to extend linear PCA and PLS to the hyperdimensional space in order to carry out fault detection of nonlinear data.

3.1. Kernel principal component analysis (KPCA)

While PCA seeks to find the principal components by minimizing the data information loss in the input space, KPCA does this in the feature space (F). For KPCA learning using training data $X_1, X_2, \ldots, X_n \in \mathbb{R}^{m}$, the nonlinear mapping $\Phi: X \in \mathbb{R}^{m} \rightarrow Z \in \mathbb{R}^{h}$ extends the input data into the hyperdimensional feature space, whose dimension h can be very large and possibly infinite [30].

The covariance in the feature space can be computed as follows [31]:

$$C^{F} = \frac{1}{n}\sum_{j=1}^{n} \Phi(X_j)\,\Phi(X_j)^{T} \tag{30}$$

Similar to PCA, the principal components in the feature space can be found by diagonalizing the covariance matrix. In order to diagonalize the covariance matrix, it is necessary to solve the following eigenvalue problem in the feature space [31]:

$$\lambda v = C^{F} v \tag{31}$$

where $\lambda \ge 0$ are the eigenvalues and v are the corresponding eigenvectors.

In order to solve the eigenvalue problem, the following equation is derived [32]:

$$n\lambda\alpha = K\alpha \tag{32}$$

where K and α are the n × n kernel matrix and eigenvectors, respectively.

For a test vector X, the principal components (t) are extracted by projecting Φ(X) onto the eigenvectors $v_k$ in the feature space, where k = 1,…,l:

$$t_k = \langle v_k, \Phi(X) \rangle = \sum_{i=1}^{n} \alpha_i^{k} \langle \Phi(X_i), \Phi(X) \rangle \tag{33}$$

It is important to note that before carrying out KPCA, it is necessary to mean center the data in the high-dimensional space. This can be accomplished by replacing the kernel matrix K with the following [32]:

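$$\bar{K} = K - 1_n K - K 1_n + 1_n K 1_n \tag{34}$$

where $1_n = \dfrac{1}{n}\begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n \times n}$.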

3.1.1. T² statistic for KPCA

Variation in the KPCA model can be found using T² statistic, which is the sum of normalized squared scores, computed as follows [31]:

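$$T^{2} = [t_1, \ldots, t_l]\,\Lambda^{-1}\,[t_1, \ldots, t_l]^{T} \tag{35}$$

where $t_k$ is obtained from Eq. (33).

The confidence limit is computed as follows [31]:

$$T_{l,n,\alpha}^{2} \sim \frac{l(n-1)}{n-l}\,F_{l,n-l,\alpha} \tag{36}$$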

3.1.2. Q statistic for KPCA

In order to compute the Q statistic, the feature vector Φ(X) needs to be reconstructed. This is done by projecting t_k into the feature space using v_k as follows [31]:

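$$\hat{\Phi}_n(X) = \sum_{k=1}^{n} t_k v_k \tag{37}$$

The Q statistic in the feature space can now be computed as [31]

$$Q = \lVert \Phi(X) - \hat{\Phi}_l(X) \rVert^{2} \tag{38}$$

The confidence limit of the Q statistic can then be computed using the following equation [31]:

$$Q_{\alpha} \sim g\chi_{h}^{2} \tag{39}$$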

This limit is based on Box’s equation, obtained by fitting the reference distribution obtained using training data, to a weighted distribution. Parameter g is the weight assigned to account for the magnitude of the Q statistic, and h represents the degree of freedom. Considering a and b the estimated mean and variance of the Q statistic, g and h are approximated using g = b/2a and h = 2a²/b.
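The KPCA statistics for the training data, together with the weighted chi-square limit, can be sketched as follows. This is an illustration under several assumptions: NumPy and the standard library only, hypothetical function names, statistics evaluated on the training samples only, and the chi-square quantile replaced by the Wilson-Hilferty approximation so that no statistics package is needed. Here `alpha` is the upper-tail probability.

```python
import numpy as np
from statistics import NormalDist

def kpca_train_statistics(K, l):
    """T^2 and Q of the training samples from an uncentered kernel matrix K."""
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n     # Eq. (34)
    mu, E = np.linalg.eigh(Kc)                             # mu = n * lambda, Eq. (32)
    mu, E = mu[::-1], E[:, ::-1]                           # descending order
    alphas = E[:, :l] / np.sqrt(mu[:l])                    # unit-norm v_k in feature space
    scores = Kc @ alphas                                   # Eq. (33) for training data
    lam = mu[:l] / n
    t2 = np.sum(scores**2 / lam, axis=1)                   # Eq. (35)
    q = np.diag(Kc) - np.sum(scores**2, axis=1)            # Eq. (38)
    return t2, q

def chi2_quantile_wh(p, h):
    """Wilson-Hilferty approximation of the chi-square quantile (an assumption)."""
    z = NormalDist().inv_cdf(p)
    return h * (1.0 - 2.0 / (9.0 * h) + z * np.sqrt(2.0 / (9.0 * h))) ** 3

def q_limit(q_train, alpha=0.01):
    """Q limit of Eq. (39) with g = b/2a and h = 2a^2/b."""
    a, b = np.mean(q_train), np.var(q_train, ddof=1)
    g, h = b / (2.0 * a), 2.0 * a**2 / b
    return g * chi2_quantile_wh(1.0 - alpha, h)

# Illustrative data with an RBF kernel of width c = 3 (arbitrary choice)
rng = np.random.default_rng(6)
X = rng.normal(size=(100, 3))
sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
K = np.exp(-sq / 3.0)

t2, q = kpca_train_statistics(K, l=5)
limit = q_limit(q, alpha=0.01)
```

Monitoring new observations additionally requires centering the test kernel vectors with the training statistics, which is omitted here to keep the sketch short.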

3.2. Kernel partial least square (KPLS)

The KPLS methodology works by mapping the data matrices into the feature space and then applying the nonlinear partial least square algorithm and computing the loading and score vectors.

The mapped data points are given as

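$$\Phi: \mathcal{X} \rightarrow F, \qquad X \mapsto \left(\sqrt{\lambda_1}\,\varphi_1(X), \sqrt{\lambda_2}\,\varphi_2(X), \ldots, \sqrt{\lambda_n}\,\varphi_n(X)\right) \tag{40}$$

The kernel Gram function can be used to map the data into the feature space instead of explicitly using the nonlinear mapping function; this is called the kernel trick. The kernel Gram function is defined as the dot product of the mapping function:

$$k(X_i, X_j) = \langle \Phi(X_i), \Phi(X_j) \rangle \tag{41}$$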

As with the KPCA algorithm, the kernel matrix has to be mean centered before applying the NIPALS algorithm using Eq. (34).

The input score matrix and weights are computed as

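$$t = \bar{\phi}(X)^{T}R, \qquad R = \Phi^{T}U\left(T^{T}\bar{K}U\right)^{-1} \tag{42}$$

Thus, the score matrix is given as [33]

$$t = K_t^{T}Z \tag{43}$$

where $Z = U\left(T^{T}KU\right)^{-1}$.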

Now, the relationship between the input and output score matrices can be derived by combining Eqs. (15), (17), and (18):

$$t = XX^{T}u \tag{44}$$

In the feature space, replace X by its image Φ:

$$t = \Phi\Phi^{T}u \tag{45}$$

Substituting the kernel Gram function, $K = \Phi\Phi^{T}$, the input and output scores are given by

$$t = Ku, \qquad u = YY^{T}t \tag{46}$$

After every iteration, input kernel (K) and output matrix (Y) are deflated as

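$$\Phi\Phi^{T} = \left(\Phi - tt^{T}\Phi\right)\left(\Phi - tt^{T}\Phi\right)^{T} \tag{47}$$

The dot product $\Phi\Phi^{T}$ is replaced by the kernel Gram function K:

$$K = K - tt^{T}K - Ktt^{T} + tt^{T}Ktt^{T}, \qquad Y = Y - tt^{T}Y \tag{48}$$

Let $\{X_i\}_{i=1}^{n}$ be the training data and $\{X_j\}_{j=1}^{n}$ be the testing data, where $\Phi(X_i)$ is the mapped training data and $\Phi(X_j)$ is the mapped testing data. The kernel functions for the testing data are given as

$$\begin{aligned} K_t &= K(X, X_t) = \sum_{i=1}^{n} \sqrt{\lambda_i}\,\varphi_i(X)\,\sqrt{\lambda_i}\,\varphi_i(X_t) = \Phi(X)\cdot\Phi(X_t) = \Phi(X)^{T}\Phi(X_t) \\ K_{tt} &= K(X_t, X_t) = \sum_{i=1}^{n} \sqrt{\lambda_i}\,\varphi_i(X_t)\,\sqrt{\lambda_i}\,\varphi_i(X_t) = \Phi(X_t)\cdot\Phi(X_t) = \Phi(X_t)^{T}\Phi(X_t) \end{aligned} \tag{49}$$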

KPLS algorithm can also be used to predict output matrix Y from input matrix X as

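$$Y_t = \Phi_t B \tag{50}$$

where $\Phi_t$ is the testing data $\{X_j\}_{j=1}^{n}$ mapped into the feature space, and B is the regression coefficient, which is given as [34]

$$B = \Phi^{T}U\left(T^{T}KU\right)^{-1}T^{T}Y \tag{51}$$

Thus, combining Eqs. (50) and (51), we get the predicted output quality matrix:

$$Y_t = K_t U\left(T^{T}KU\right)^{-1}T^{T}Y \tag{52}$$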

Algorithm 2: Kernel partial least square (KPLS) algorithm

1. Compute the kernel matrix K.
2. Mean center the kernel matrix using Eq. (34).
3. For the first iteration, initialize the score vector: $u = y_i$.
4. Calculate the scores t and u using Eq. (46).
5. Deflate K and Y using Eq. (48).
6. Store the score vectors t and u in the cumulative matrices T and U.
7. Repeat steps 1 to 6 to extract M latent variables.
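The KPLS steps can be sketched as follows. This is an illustrative sketch that follows the commonly published kernel NIPALS formulation with normalized score vectors, not necessarily the authors' exact implementation; the function name `kpls_fit`, the convergence tolerance, the kernel width, and the data are assumptions. The kernel matrix is computed and centered once, and only the score update and deflation are repeated for each latent variable.

```python
import numpy as np

def kpls_fit(K, Y, n_components, tol=1e-12, max_iter=500):
    """Kernel PLS on a centered kernel matrix K (n x n) and centered Y (n x p)."""
    K_orig = K
    K, Y = K.copy(), Y.copy()
    n = K.shape[0]
    T = np.zeros((n, n_components)); U = np.zeros((n, n_components))
    for a in range(n_components):
        u = Y[:, [0]] / np.linalg.norm(Y[:, 0])       # initialize with a column of Y
        for _ in range(max_iter):
            t = K @ u; t /= np.linalg.norm(t)         # input score, Eq. (46)
            u_new = Y @ (Y.T @ t)                     # output score, Eq. (46)
            u_new /= np.linalg.norm(u_new)
            done = np.linalg.norm(u_new - u) <= tol
            u = u_new
            if done:
                break
        t = K @ u; t /= np.linalg.norm(t)             # score consistent with stored u
        D = np.eye(n) - t @ t.T
        K = D @ K @ D                                 # deflation, Eq. (48)
        Y = Y - t @ (t.T @ Y)
        T[:, [a]], U[:, [a]] = t, u
    Z = U @ np.linalg.inv(T.T @ K_orig @ U)           # Z = U (T^T K U)^{-1}
    return T, U, Z

# Illustrative nonlinear data, RBF kernel with c = 1 (arbitrary choice)
rng = np.random.default_rng(7)
X = rng.uniform(-1.0, 1.0, size=(80, 2))
Y = np.sin(3.0 * X[:, [0]]) + X[:, [1]]**2 + 0.05 * rng.normal(size=(80, 1))
sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
K = np.exp(-sq / 1.0)
n = K.shape[0]
one_n = np.full((n, n), 1.0 / n)
Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n    # Eq. (34)
Yc = Y - Y.mean(axis=0)

T, U, Z = kpls_fit(Kc, Yc, n_components=3)
Y_hat = Kc @ Z @ T.T @ Yc                             # Eq. (52) on the training data
```

With orthonormal scores, the training prediction reduces to the projection of Y onto the score vectors, which provides a simple consistency check.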

3.2.1. T² statistic for KPLS

The T² statistic for KPLS can be computed as

$$T_t^{2} = t^{T}\Lambda^{-1}t \tag{53}$$

where $\Lambda = (n-1)^{-1}T^{T}T$, and since the score matrix is orthonormal, $T^{T}T = I$, this leads to $\Lambda = (n-1)^{-1}I$. The score vector is $t = K_t^{T}Z$; hence the T² statistic is given by [33]

$$T_t^{2} = (n-1)\,K_t^{T}ZZ^{T}K_t \tag{54}$$

The threshold value for the T² statistic is computed using the inverse F-distribution and is given by [35]

$$T_{\alpha}^{2} = g \cdot \mathrm{finv}(\alpha, m, h) \tag{55}$$

where n and m are the total number of observations and variables in the input data matrix X, respectively, and

$$g = \frac{m(n^{2}-1)}{n(n-m)}, \qquad h = n - m \tag{56}$$

3.2.2. Q statistic for KPLS

As with the other data-based models, the Q statistic computes the mean square error of the residue from the KPLS model:

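$$\begin{aligned} Q &= \lVert \bar{\phi} - \tilde{\bar{\phi}} \rVert^{2} \\ Q &= \bar{\phi}^{T}\bar{\phi} - 2\bar{\phi}^{T}\tilde{\bar{\phi}} + \tilde{\bar{\phi}}^{T}\tilde{\bar{\phi}} \end{aligned} \tag{57}$$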

Substituting the kernel gram functions as the dot product of mapped points:

$$Q = K_{tt} - 2K_t^{T}KZt + t^{T}T^{T}KTt \tag{58}$$

where $Z = U\left(T^{T}KU\right)^{-1}$.

The threshold value for the Q statistic under the significance level of α [36] is given by

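$$Q_{\alpha} = g\,\chi_{\alpha}^{2}(h) \tag{59}$$

where g and h are given by

$$g = \frac{\mathrm{variance}(Q)}{2 \times \mathrm{mean}(Q)}, \qquad h = \frac{2 \times \mathrm{mean}(Q)^{2}}{\mathrm{variance}(Q)} \tag{60}$$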

A fault is declared in the system if the Q statistic value is higher than threshold value (Q_α) for new data set.

The following section demonstrates the implementation of the fault detection methods described above and analyzes the effectiveness of all techniques.

4. Illustrative examples

The effectiveness of the kernel extensions of PCA and PLS for fault detection purposes will be demonstrated through two illustrative nonlinear examples, using a simulated synthetic data set and a simulated continuous stirred tank reactor (CSTR).

4.1. Simulated synthetic data

Synthetic nonlinear data can be simulated through the following model [37]:

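$$\begin{aligned} x_1 &= u^{2} + 0.3\sin(2\pi u) + \varepsilon_1 \\ x_2 &= u + \varepsilon_2 \\ x_3 &= u^{3} + u + 1 + \varepsilon_3 \end{aligned} \tag{61}$$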

where u is a variable that is defined between −1 and 1 and ε_i is a variable of independent white noise distributed uniformly between −0.1 and 0.1. Training and testing data sets of 401 observations each are generated using the model above. The performance of KPCA and KPLS techniques is illustrated and compared to the conventional PCA and PLS methods for two different cases. In the first case, the sensor measuring the first variable x1 is assumed to be faulty with a single fault. In the second case, multiple faults are assumed to occur simultaneously in x1, x2, and x3.
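The data generation can be sketched as follows. The chapter does not state how u is sampled, so evenly spaced values between −1 and 1 are assumed here; the function name `simulate_synthetic` and the random seeds are also illustrative. The last lines inject the unit-magnitude sensor fault of the first case into x1.

```python
import numpy as np

def simulate_synthetic(n_obs=401, noise=0.1, seed=0):
    """Generate the three variables of Eq. (61) with uniform noise."""
    rng = np.random.default_rng(seed)
    u = np.linspace(-1.0, 1.0, n_obs)                  # assumed: evenly spaced u
    eps = rng.uniform(-noise, noise, size=(n_obs, 3))  # independent uniform noise
    x1 = u**2 + 0.3 * np.sin(2.0 * np.pi * u) + eps[:, 0]
    x2 = u + eps[:, 1]
    x3 = u**3 + u + 1.0 + eps[:, 2]
    return np.column_stack([x1, x2, x3])

X_train = simulate_synthetic(seed=0)
X_test = simulate_synthetic(seed=1)

# Case 1: unit fault on x1 for observations 200 to 250 (1-based, inclusive)
X_faulty = X_test.copy()
X_faulty[199:250, 0] += 1.0
```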

Figure 3 shows the generated data.

Case 1

In this case, a single fault of magnitude unity is introduced between observations 200 and 250 in x1 in the testing data set. The Gaussian kernel was chosen to model the nonlinearity in the process data. The most common fault detection metrics are the missed detection rate, the false alarm rate, and the out-of-control average run length (ARL₁). The missed detection rate is the percentage of observations in the faulty region that are not flagged as faulty, while the false alarm rate is the percentage of observations in the non-faulty region that are flagged as faulty. The false alarm and missed detection rates are also commonly referred to as Type I and Type II errors, respectively. ARL₁ is the number of observations it takes for a particular technique to flag a fault in the faulty region and is used to assess the speed of detection. The fault-free and faulty data are shown in Figures 4 and 5, respectively.

Figure 5.
Faulty data in the presence of single fault in x1.
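The three metrics can be computed from a vector of alarms and a vector marking the faulty region. The sketch below uses plain Python and a hypothetical function name; it assumes that ARL₁ counts faulty observations up to and including the first one that is flagged, and returns `None` when no faulty observation is flagged.

```python
def detection_metrics(alarms, is_faulty):
    """Return (missed detection %, false alarm %, ARL1) from boolean sequences."""
    n_faulty = sum(bool(f) for f in is_faulty)
    n_normal = len(is_faulty) - n_faulty
    missed = sum(1 for a, f in zip(alarms, is_faulty) if f and not a)
    false_alarms = sum(1 for a, f in zip(alarms, is_faulty) if a and not f)
    # ARL1: number of faulty observations seen until the first one is flagged
    arl1, count = None, 0
    for a, f in zip(alarms, is_faulty):
        if f:
            count += 1
            if a:
                arl1 = count
                break
    return 100.0 * missed / n_faulty, 100.0 * false_alarms / n_normal, arl1

# Illustrative example: 401 observations, fault at observations 200 to 250,
# detected from the third faulty observation onward, with no false alarms
is_faulty = [199 <= i <= 249 for i in range(401)]
alarms = [201 <= i <= 249 for i in range(401)]
missed_pct, false_alarm_pct, arl1 = detection_metrics(alarms, is_faulty)
```

In this made-up example two of the 51 faulty observations are missed, which corresponds to a missed detection rate of about 3.92%.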

The fault detection (FD) performance of PCA-, KPCA-, PLS-, and KPLS-based Q methods is shown in Figures 6 and 7 as well as Table 1. The results show that both KPCA and KPLS-based Q provide a better FD performance than the linear PCA- and PLS-based Q methods and are able to detect the faults with lower missed detection rates, false alarm rates, and ARL₁ values (see Table 1).

	Missed detection (%)	False alarm (%)	ARL₁
PLS-based Q statistic	90.1961	13.1429	36
KPLS-based Q statistic	3.9216	0	2
PCA-based Q statistic	100	7.4286	-
KPCA-based Q statistic	27.4510	5.1429	1

Table 1.

Summary of missed detection (%), false alarms (%), and ARL1 for case 1.

Figure 6.
Monitoring single fault using PCA- and KPCA-based Q methods—case 1.

Figure 7.
Monitoring single fault using PLS- and KPLS-based Q methods—case 1.

Case 2

In this case, multiple faults of magnitude unity are introduced between observations 200 and 250 in x1, 100 and 150 in x2, and 385 and 401 in x3 in the testing data set (as shown in Figure 8).

Figure 8.
Faulty data in the presence of multiple faults in x1, x2, and x3.

The FD performance of the kernel PCA and kernel PLS methods is illustrated and compared to that of the conventional PCA and PLS methods using the Q statistic. The Q statistic was chosen for analysis because it is often better able to detect smaller faults using the residual space, and for simplicity of analysis. The fault detection performance of a particular process monitoring technique can be assessed using multiple fault detection metrics.

As can be seen through Figures 9(a) and 10(a) and Table 2, the conventional linear PCA and PLS techniques are unable to effectively capture the nonlinearity present in the data set, which leads to entire sets of faults going undetected for both the linear PCA and PLS techniques. However, as demonstrated in Figures 9(b) and 10(b), the KPCA and KPLS-based Q techniques are better able to detect the faults with lower missed detection rates, false alarm rates, and ARL₁ values than the linear PCA and PLS methods (as shown in Table 2). These improved results can be attributed to the fact that the kernel techniques are able to capture the nonlinearity in the hyperdimensional feature space, providing better detection especially in this case where there are multiple faults in the system.

	Missed detection (%)	False alarm (%)	ARL₁
PLS-based Q statistic	67.2269	0	58
KPLS-based Q statistic	6.7227	0	1
PCA-based Q statistic	42.8571	0	1
KPCA-based Q statistic	8.4034	2.8369	1

Table 2.

Summary of missed detection (%), false alarms (%), and ARL1 for case 2.

Figure 9.
Monitoring multiple faults using PCA- and KPCA-based Q methods using simulated synthetic data—case 2.

Figure 10.
Monitoring multiple faults using PLS- and KPLS-based Q methods using simulated synthetic data—case 2.

4.2. Simulated CSTR model

In order to effectively assess the performance of the kernel PCA and kernel PLS techniques, it is also necessary to examine them on a process application. A continuous stirred tank reactor model can be used to generate nonlinear data, to which the fault detection charts can be applied to test their performance.

4.2.1. CSTR process description

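The dynamic model of the CSTR that was utilized for this simulated example is represented as follows [5]:

$$\begin{aligned} \frac{\partial C_A}{\partial t} &= \frac{F}{V}\left(C_{A0} - C_A\right) - k_0 e^{-E/RT} C_A \\ \frac{\partial T}{\partial t} &= \frac{F}{V}\left(T_0 - T\right) + \frac{(-\Delta H)}{\rho C_p} k_0 e^{-E/RT} C_A - \frac{q}{V \rho C_p} \\ q &= \frac{a F_c^{\,b+1}}{F_c + \dfrac{a F_c^{\,b}}{2 \rho_c C_{pc}}}\left(T - T_{cin}\right) \end{aligned} \tag{62}$$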

where k₀, E, F, and V represent the reaction rate constant, activation energy, flow rate (both inlet and outlet), and reactor volume, respectively. The concentration of A in the inlet stream and of B in the exit stream is represented by C_A and C_B, respectively. The temperatures of the inlet stream and of the cooling fluid in the jacket are T_i and T_j, respectively. ΔH, U, A, ρ, and C_p represent the heat of reaction, overall heat transfer coefficient, area through which the heat transfers to the cooling jacket, density, and heat capacity of all streams, respectively.

Using the described CSTR model, 1000 observations were generated, which were initially assumed to be noise-free. Zero-mean Gaussian noise with a signal-to-noise ratio of 20 was then added to the noise-free process observations in order to mimic real measurements. Figure 11 shows the generated CSTR data. This data set was then split into training and testing data sets of 500 observations each. Faults of magnitude 3σ, where σ is the standard deviation of the respective process variable, were added to the temperature and concentration process variables in the testing data set at three different locations: observations 101–150, 251–350, and 401–450. Figures 12 and 13 show the fault-free and faulty data, respectively. As in the previous example, the performance of the kernel PCA and kernel PLS methods is compared to that of the conventional linear PCA and PLS methods using the Q statistic.

Figure 11.
Generated continuously stirred tank reactor (CSTR) data.

Figure 13.
Faulty data in the presence of multiple faults in temperature and concentration.
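The noise contamination and fault injection steps can be sketched as follows. Since the chapter does not list the CSTR parameter values, the clean signals below are placeholders rather than reactor output. The signal-to-noise ratio is assumed to be the ratio of signal variance to noise variance, and the function names are hypothetical.

```python
import numpy as np

def add_noise(x, snr, rng):
    """Add zero-mean Gaussian noise; SNR is assumed to be var(signal) / var(noise)."""
    noise_std = np.sqrt(np.var(x, axis=0) / snr)
    return x + rng.normal(size=x.shape) * noise_std

def add_fault(x, start, stop, sigma, n_sigma=3.0):
    """Shift observations start..stop (1-based, inclusive) by n_sigma * sigma."""
    faulty = x.copy()
    faulty[start - 1:stop] += n_sigma * sigma
    return faulty

rng = np.random.default_rng(8)
time = np.linspace(0.0, 10.0, 500)
clean = np.column_stack([np.sin(time), np.cos(0.5 * time)])   # placeholder signals
noisy = add_noise(clean, snr=20.0, rng=rng)

sigma = noisy.std(axis=0, ddof=1)          # per-variable standard deviation
faulty = noisy
for start, stop in [(101, 150), (251, 350), (401, 450)]:
    faulty = add_fault(faulty, start, stop, sigma)
```

The standard deviation is computed once from the fault-free data so that every fault segment has the same magnitude.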

For this example, comparing the two conventional techniques, we can see that the PCA-based Q statistic is unable to detect all faults (see Figure 14(a)), while the PLS-based Q statistic is better able to detect the faults (see Figure 15(a)). However, the kernel PCA- and kernel PLS-based Q techniques provide charts with lower missed detection rates, false alarm rates, and ARL₁ values than their corresponding conventional techniques (see Figures 14(b) and 15(b)). These improved results can once again be attributed to the kernel techniques being able to effectively capture the nonlinearity of the data in the hyperdimensional feature space. The FD results from the two examples showed that the kernel PLS-based Q statistic provides better performance than the kernel PCA-based Q statistic. This is because kernel PCA is an input-space model that cannot take the output (quality) variables into consideration, whereas most chemical processes are described by input-output models.

Figure 14.
Fault detection using PCA- and kernel PCA-based Q methods using CSTR data.

Figure 15.
Fault detection using PLS- and kernel PLS-based Q methods using CSTR data.

5. Conclusion

In this chapter, nonlinear multivariate statistical techniques are used for fault detection. Kernel PCA and kernel PLS have been widely used to monitor various nonlinear processes, such as distillation columns and reactors. Thus, in the current work, both kernel PCA and kernel PLS methods are used for nonlinear fault detection of chemical processes. A commonly used fault detection index, the Q statistic, is used to detect faults in the system. The fault detection performance using linear and nonlinear input models (PCA and kernel PCA) and input-output models (PLS and kernel PLS) is evaluated through two simulated examples, a synthetic data set and a continuous stirred tank reactor (CSTR). The missed detection rate, false alarm rate, and ARL₁ are the metrics used to compare the fault detection techniques. The results of the two case studies showed that the kernel PCA- and kernel PLS-based Q methods provide improved fault detection performance compared to the conventional PCA- and PLS-based Q methods.

Acknowledgments

This work was made possible by NPRP grant NPRP7-1172-2-439 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

References

1. V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, “A review of process fault detection and diagnosis part I: Quantitative model-based methods,” Comput. Chem. Eng., vol. 27, pp. 293–311, 2003.
2. V. Venkatasubramanian, R. Rengaswamy, and S. N. Ka, “A review of process fault detection and diagnosis part II: Qualitative models and search strategies,” Comput. Chem. Eng., vol. 27, pp. 313–326, 2003.
3. V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, “A review of process fault detection and diagnosis: Part III: Process history based methods,” Comput. Chem. Eng., vol. 27, pp. 327–346, 2003.
4. Z. Ge, Z. Song, and F. Gao, “Review of recent research on data-based process monitoring,” Ind. Eng. Chem. Res., vol. 52, no. 10, pp. 3543–3562, 2013.
5. M. Mansouri, M. Nounou, H. Nounou, and N. Karim, “Kernel PCA-based GLRT for nonlinear fault detection of chemical processes,” J. Loss Prev. Process Ind., vol. 40, pp. 334–347, 2016.
6. I. Jolliffe, Principal component analysis. John Wiley & Sons, Ltd., 2002.
7. M. F. Harkat, G. Mourot, and J. Ragot, “An improved PCA scheme for sensor FDI: Application to an air quality monitoring network,” J. Process Control, vol. 16, no. 6, pp. 625–634, 2006.
8. J. P. George, Z. Chen, and P. Shaw, “Fault detection of drinking water treatment process using PCA and Hotelling’s T² chart,” pp. 970–975, 2009.
9. J. Yu, “Fault detection using principal components-based gaussian mixture model for semiconductor,” IEEE Trans. Semicond. Manuf., vol. 24, no. 3, pp. 432–444, 2011.
10. Y. Zhang, W. Du, Y. Fan, and L. Zhang, “Process fault detection using directional kernel partial least squares,” Ind. Eng. Chem. Res., vol. 54, no. 9, pp. 2509–2518, 2015.
11. P. Nomikos and J. F. MacGregor, “Multi-way partial least squares in monitoring batch processes,” Chemom. Intell. Lab. Syst., vol. 30, no. 1, pp. 97–108, 1995.
12. T. Kourti and J. F. MacGregor, “Process analysis, monitoring and diagnosis, using multivariate projection methods,” Chemom. Intell. Lab. Syst., vol. 28, pp. 3–21, 1995.
13. M. Mansouri, M. Z. Sheriff, R. Baklouti, M. Nounou, H. Nounou, A. Ben Hamida, and M. N. Karim, “Statistical fault detection of chemical process - comparative studies,” J. Chem. Eng. Process Technol., vol. 7, no. 1, pp. 1–10, 2016.
14. S. Joe Qin, “Statistical process monitoring: basics and beyond,” J. Chemom., vol. 17, no. 8–9, pp. 480–502, 2003.
15. J. E. Jackson and G. S. Mudholkar, “Control procedures for residuals associated with principal component analysis,” Technometrics, vol. 21, no. 3, pp. 341–349, 1979.
16. J. E. Jackson, “Quality control methods for several related variables,” Technometrics, vol. 1, no. 4, pp. 359–377, 1959.
17. M. Zhu and A. Ghodsi, “Automatic dimensionality selection from the scree plot via the use of profile likelihood,” Comput. Stat. Data Anal., vol. 51, no. 2, pp. 918–930, 2006.
18. G. Diana and C. Tommasi, “Cross-validation methods in principal component analysis: A comparison,” Stat. Methods Appl., vol. 11, no. 1, pp. 71–82, 2002.
19. R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in reproducing kernel hilbert space,” J. Mach. Learn. Res., vol. 2, pp. 97–123, 2002.
20. P. Geladi and B. R. Kowalski, “Partial least-squares regression: A tutorial,” Anal. Chim. Acta, vol. 185, pp. 1–17, 1986.
21. B. S. Dayal and J. F. MacGregor, “Recursive exponentially weighted PLS and its applications to adaptive control and prediction,” J. Process Control, vol. 7, no. 3, pp. 169–179, 1997.
22. S. J. Qin, “Recursive PLS algorithms for adaptive data modeling,” Comput. Chem. Eng., vol. 22, no. 4–5, pp. 503–514, 1998.
23. J. F. MacGregor, C. Jaeckle, C. Kiparissides, and M. Koutoudi, “Process monitoring and diagnosis by multiblock PLS methods,” AIChE J., vol. 40, no. 5, pp. 826–838, 1994.
24. T. Kourti, P. Nomikos, and J. F. MacGregor, “Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS,” J. Process Control, vol. 5, no. 4, pp. 277–284, 1995.
25. H. Hotelling, “Analysis of a complex of statistical variables into principal components,” J. Educ. Psychol., vol. 24, no. 6, pp. 417–441, 1933.
26. U. Krüger and L. Xie, Statistical monitoring of complex multivariate processes: With applications in industrial process control. Chichester, West Sussex; Hoboken, N.J: Wiley, 2012.
27. S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, “A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process,” J. Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.
28. S. Wold, N. Kettaneh-Wold, and B. Skagerberg, “Nonlinear PLS modeling,” Chemom. Intell. Lab. Syst., vol. 7, no. 1–2, pp. 53–65, 1989.
29. R. Rosipal, “Nonlinear partial least squares: An overview,” in Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques, H. Lodhi and Y. Yamanishi, Eds. ACCM, IGI Global, 2011, pp. 169–189. Retrieved from http://aiolos.um.savba.sk/~roman/Papers/npls_book11.pdf
30. S. W. Choi, C. Lee, J. M. Lee, J. H. Park, and I. B. Lee, “Fault detection and identification of nonlinear processes based on kernel PCA,” Chemom. Intell. Lab. Syst., vol. 75, no. 1, pp. 55–67, 2005.
31. J.-M. Lee, C. Yoo, S. W. Choi, P. A. Vanrolleghem, and I.-B. Lee, “Nonlinear process monitoring using kernel principal component analysis,” Chem. Eng. Sci., vol. 59, no. 1, pp. 223–234, 2004.
32. B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Comput., vol. 10, no. 5, pp. 1299–1319, 1998.
33. J. L. Godoy, D. A. Zumoffen, J. R. Vega, and J. L. Marchetti, “New contributions to non-linear process monitoring through kernel partial least squares,” Chemom. Intell. Lab. Syst., vol. 135, pp. 76–89, 2014.
34. R. Rosipal, “Kernel partial least squares for nonlinear regression and discrimination,” Neural Netw. World, vol. 13, no. 3, pp. 291–300, 2003.
35. M.-F. Harkat, S. Djelel, N. Doghmane, and M. Benouaret, “Sensor fault detection, isolation and reconstruction using nonlinear principal component analysis,” Int. J. Autom. Comput., vol. 4, no. 2, pp. 149–155, 2007.
36. S. A. Vejtasa and R. A. Schmitz, “An experimental study of steady state multiplicity and stability in an adiabatic stirred reactor,” AIChE J., vol. 16, no. 3, pp. 410–419, 1970.
37. C. Botre et al., “Kernel PLS-based GLRT method for fault detection of chemical processes,” J. Loss Prev. Process Ind., vol. 43, pp. 212–224, 2016.

[1] 1. V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, “A review of process fault detection and diagnosis part I: Quantitative model-based methods,” Comput. Chem. Eng., vol. 27, pp. 293–311, 2003.

[2] 2. V. Venkatasubramanian, R. Rengaswamy, and S. N. Ka, “A review of process fault detection and diagnosis part II: Qualitative models and search strategies,” Comput. Chem. Eng., vol. 27, pp. 313–326, 2003.

[3] 3. V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, “A review of process fault detection and diagnosis: Part III: Process history based methods,” Comput. Chem. Eng., vol. 27, pp. 327–346, 2003.

[4] 4. Z. Ge, Z. Song, and F. Gao, “Review of recent research on data-based process monitoring,” Ind. Eng. Chem. Res., vol. 52, no. 10, pp. 3543–3562, 2013.

[5] 5. M. Mansouri, M. Nounou, H. Nounou, and N. Karim, “Kernel PCA-based GLRT for nonlinear fault detection of chemical processes,” J. Loss Prev. Process Ind., vol. 40, pp. 334–347, 2016.

[6] 6. Jolliffe, Ian. Principal component analysis. John Wiley & Sons, Ltd., 2002.

[7] 7. M. F. Harkat, G. Mourot, and J. Ragot, “An improved PCA scheme for sensor FDI: Application to an air quality monitoring network,” J. Process Control, vol. 16, no. 6, pp. 625–634, 2006.

[8] 8. J. P. George, Z. Chen, and P. Shaw, “Fault detection of drinking water treatment process using PCA and Hotelling’s T 2 chart,” pp. 970–975, 2009.

[9] 9. J. Yu, “Fault detection using principal components-based gaussian mixture model for semiconductor,” IEEE Trans. Semicond. Manuf., vol. 24, no. 3, pp. 432–444, 2011.

[10] 10. Y. Zhang, W. Du, Y. Fan, and L. Zhang, “Process fault detection using directional kernel partial least squares,” Ind. Eng. Chem. Res., vol. 54, no. 9, pp. 2509–2518, 2015.

[11] 11. P. Nomikos and J. F. MacGregor, “Multi-way partial least squares in monitoring batch processes,” Chemom. Intell. Lab. Syst., vol. 30, no. 1, pp. 97–108, 1995.

[12] 12. T. Kourti and J. F. J. F. MacGregor, “Process analysis, monitoring and diagnosis, using multivariate projection methods,” Chemom. Intell. Lab. Syst., vol. 28, pp. 3–21, 1995.

[13] 13. M. Mansouri, M. Z. Sheriff, R. Baklouti, M. Nounou, H. Nounou, A. Ben Hamida, and M. N. Karim, “Statistical fault detection of chemical process - comparative studies,” J. Chem. Eng. Process Technol., vol. 7, no. 1, pp. 1–10, 2016.

[14] 14. S. Joe Qin, “Statistical process monitoring: basics and beyond,” J. Chemom., vol. 17, no. 8–9, pp. 480–502, 2003.

[15] 15. J. E. Jackson and G. S. Mudholkar, “Control procedures for residuals associted with principal component analysis,” Technometrics, vol. 21, no. 3, pp. 341–349, 1979.

[16] 16. J. E. Jackson “Quality control methods for several related variables,” Technometrics, vol. 1, no. 4, pp. 359–377, 1959.

[17] 17. M. Zhu and A. Ghodsi, “Automatic dimensionality selection from the scree plot via the use of profile likelihood,” Comput. Stat. Data Anal., vol. 51, no. 2, pp. 918–930, 2006.

[18] 18. G. Diana and C. Tommasi, “Cross-validation methods in principal component analysis: A comparison,” Stat. Methods Appl., vol. 11, no. 1, pp. 71–82, 2002.

[19] 19. R. Rosipal and L. J. Trejo, “Kernel partial least squares regression in reproducing kernel hilbert space,” J. Mach. Learn. Res., vol. 2, pp. 97–123, 2002.

[20] 20. P. Geladi and B. R. Kowalski, “Partial least-squares regression: A tutorial,” Anal. Chim. Acta, vol. 185, pp. 1–17, 1986.

[21] 21. B. S. Dayal and J. F. MacGregor, “Recursive exponentially weighted PLS and its applications to adaptive control and prediction,” J. Process Control, vol. 7, no. 3, pp. 169–179, 1997.

[22] 22. S. J. Qin, “Recursive PLS algorithms for adaptive data modeling,” Comput. Chem. Eng., vol. 22, no. 4–5, pp. 503–514, 1998.

[23] 23. J. F. MacGregor, C. Jaeckle, C. Kiparissides, and M. Koutoudi, “Process monitoring and diagnosis by multiblock PLS methods,” AIChE J., vol. 40, no. 5, pp. 826–838, 1994.

[24] 24. T. Kourti, P. Nomikos, and J. F. MacGregor, “Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS,” J. Process Control, vol. 5, no. 4, pp. 277–284, 1995.

[25] 25. H. Hotelling, “Analysis of a complex of statistical variables into principal components,” J. Educ. Psychol., vol. 24, no. 6, pp. 417–441, 1933.

[26] 26. U. Krüger and L. Xie, Statistical monitoring of complex multivariate processes: With applications in industrial process control. Chichester, West Sussex; Hoboken, N.J: Wiley, 2012.

[27] 27. S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, “A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process,” J. Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.

[28] 28. S. Wold, N. Kettaneh-Wold, and B. Skagerberg, “Nonlinear PLS modeling,” Chemom. Intell. Lab. Syst., vol. 7, no. 1–2, pp. 53–65, 1989.

[29] 29. Rosipal, R. (2011). Nonlinear partial least squares: An overview. In Lodhi H. and Yamanishi Y. (Eds.), Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques, pp. 169–189, 2011. ACCM, IGI Global. Retrieved from http://aiolos.um.savba.sk/~roman/Papers/npls_book11.pdf

[30] 30. S. W. Choi, C. Lee, J. M. Lee, J. H. Park, and I. B. Lee, “Fault detection and identification of nonlinear processes based on kernel PCA,” Chemom. Intell. Lab. Syst., vol. 75, no. 1, pp. 55–67, 2005.

[31] 31. J.-M. Lee, C. Yoo, S. W. Choi, P. A. Vanrolleghem, and I.-B. Lee, “Nonlinear process monitoring using kernel principal component analysis,” Chem. Eng. Sci., vol. 59, no. 1, pp. 223–234, 2004.

[32] 32. B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Comput., vol. 10, no. 5, pp. 1299–1319, 1998.

[33] 33. J. L. Godoy, D. A. Zumoffen, J. R. Vega, and J. L. Marchetti, “New contributions to non-linear process monitoring through kernel partial least squares,” Chemom. Intell. Lab. Syst., vol. 135, pp. 76–89, 2014.

[34] 34. R. Rosipal, “Kernel partial least squares for nonlinear regression and discrimination,” Neural Netw. World, vol. 13, no. 3, pp. 291–300, 2003.

[35] 35. M.-F. Harkat, S. Djelel, N. Doghmane, and M. Benouaret, “Sensor fault detection, isolation and reconstruction using nonlinear principal component analysis,” Int. J. Autom. Comput., vol. 4, no. 2, pp. 149–155, 2007.

[36] 36. S. A. Vejtasa and R. A. Schmitz, “An experimental study of steady state multiplicity and stability in an adiabatic stirred reactor,” AIChE J., vol. 16, no. 3, pp. 410–419, 1970.

[37] 37. Botre, Chiranjivi, et al. “Kernel PLS-based GLRT method for fault detection of chemical processes,” J. Loss Prevent. Process Industries, 43 (2016): 212–224.