Open access peer-reviewed chapter

An Improved Wavelet‐Based Multivariable Fault Detection Scheme

Written By

Fouzi Harrou, Ying Sun and Muddu Madakyaru

Reviewed: 03 April 2017 Published: 05 July 2017

DOI: 10.5772/intechopen.68947

From the Edited Volume

Uncertainty Quantification and Model Calibration

Edited by Jan Peter Hessling



Data observed from environmental and engineering processes are usually noisy and correlated in time, which makes fault detection more difficult because the presence of noise degrades detection quality. Multiscale representation of data using wavelets is a powerful feature extraction tool that is well suited to denoising and decorrelating time series data. In this chapter, we combine the advantages of multiscale partial least squares (MSPLS) modeling with those of the univariate EWMA (exponentially weighted moving average) monitoring chart, which results in an improved fault detection system, especially for detecting small faults in highly correlated, multivariate data. Toward this end, we apply the EWMA chart to the output residuals obtained from the MSPLS model. It is shown through simulated distillation column data that a significant improvement in fault detection can be obtained by using the proposed method compared with the conventional partial least squares (PLS)-based Q and EWMA methods and the MSPLS-based Q method.


Keywords

  • data uncertainty
  • multiscale representation
  • fault detection
  • data‐driven approaches
  • statistical monitoring schemes

1. Introduction

Monitoring chemical and environmental processes has attracted increasing attention from researchers and practitioners as a means of improving product quality and enhancing process safety. For example, detecting anomalies in chemical or environmental plants is expected to affect not only the productivity and profitability of these plants, but also the safety of people [1, 2]. To enhance process operation, we should monitor the process in an efficient manner and correctly detect abnormal events that may degrade product quality, operational reliability, and profitability, so that we can respond accordingly by making any necessary corrections to the process. Fault detection and diagnosis represent two vital components of process monitoring (see Figure 1), during which abnormal events are first identified and then isolated to ensure that they can be appropriately handled [2, 3]. Generally, faults in modern automatic processes are difficult to avoid and may result in serious process degradations. Even small deviations in process parameters can result in lost time, and catastrophic failure can bring devastating health, safety, and financial consequences. Because of this, engineers must continually improve the reliability of their processes, watching carefully for signs of anomalies that could lead to disaster. It is therefore crucial to detect and identify any possible faults or failures in the system as early as possible [2, 4, 5].

Figure 1.

Scheme of fault detection and diagnosis.

Keeping an automated process running smoothly and safely while producing the desired results remains a major challenge in many sectors. Various fault detection techniques have been developed for the safe operation of systems and processes. They fall into two main types: process‐history‐based approaches and model‐based approaches, as shown in Figure 2. Model‐based approaches compare analytically computed outputs with measured values and signal an alarm when large differences are detected [2, 6, 7]. Unfortunately, the effectiveness of model‐based fault‐detection approaches relies on the accuracy of the models used. When no process model is available, model‐free or process‐history‐based methods have been successfully used in process monitoring because they can effectively deal with highly correlated process variables [8, 9]. Such methods require minimal a priori knowledge of the process physics, but depend on the availability of quality input data. Process‐history‐based methods use implicit empirical models derived from the analysis of available data and rely on computational intelligence and machine learning methods [10–12]. In the last four decades, process‐history‐based methods such as principal component analysis (PCA) and partial least squares (PLS) have become increasingly important in statistical process monitoring and have been extensively applied in the field of chemometrics [5, 13, 14]. In contrast to classical univariate statistical process monitoring tools, these approaches take the correlations between variables into account and monitor a set of correlated variables simultaneously. Moreover, by projecting the original measurements into a latent subspace, latent variables (LVs) are monitored in a reduced dimensional space. A PCA or PLS model is built on good historical data collected under normal process operation [15, 16]. This model can then be used to monitor or predict the future behavior of the process [17].

Figure 2.

Fault detection methods.

However, most processes are dynamic, with various events occurring such as abrupt process changes, slow drifts, bad measurements due to sensor failures, and human errors. Data from these processes are not only cross‐correlated, but also autocorrelated. Applying conventional latent variable regression (LVR) methods directly to dynamic systems results in false alarms and poor sensitivity in detecting and discriminating different kinds of events. In addition, noisy data and model uncertainties negatively affect the performance of fault detection methods. In fact, wavelet‐based multiscale representation of data has been shown to provide effective noise‐feature separation in the data, to approximately decorrelate autocorrelated data, and to transform the data to better follow the Gaussian distribution [18]. Multiscale representation of data using wavelets has been widely used for data denoising, compression, and process monitoring [18–21].

The detection of incipient faults is crucial for maintaining the normal operation of a system by providing early fault warnings. The problem is that incipient anomalies are often too weak to be detected by conventional monitoring methods. The objective of this chapter is to extend existing fault detection techniques to take into account the uncertainty of the data. To this end, multiscale data representation, a powerful feature extraction tool, will be used to reduce false alarms by improving noise‐feature separation and decorrelating autocorrelated measurement errors. To do so, multiscale partial least squares (MSPLS)‐based exponentially weighted moving average (EWMA) fault detection techniques will be developed. The overarching goal of this work is to tackle multivariate challenges in process monitoring by merging the advantages of the EWMA chart and multiscale PLS modeling to enhance their performance. It is shown through simulated distillation column data that a significant improvement in detecting small faults can be obtained using the MSPLS‐EWMA approach compared with the PLS‐EWMA fault detection approach.

The remainder of this chapter is organized as follows. Section 2 gives a brief overview of the PLS and multiscale PLS approaches. In Section 3, the EWMA chart is briefly presented. Section 4 presents the proposed MSPLS‐EWMA fault‐detection procedure. Section 5 applies the proposed fault‐detection procedure to a simulated distillation column process. Finally, Section 6 concludes the chapter.


2. Preliminary materials

2.1. Partial least squares (PLS)‐based charts

The objective of PLS models is to find relations between input and output data blocks by relating their latent variables. A detailed description of the PLS technique is given in Ref. [22]. This data‐driven empirical statistical modeling approach is extremely useful in situations where either a first‐principles or analytical model is difficult to obtain or the measured variables are highly correlated (collinear) with each other. PLS methods have been extensively researched and applied in the chemometrics field.

Consider an input data matrix $X \in \mathbb{R}^{n \times m}$ and an output data matrix $Y \in \mathbb{R}^{n \times p}$, where $n$ is the number of samples or observations, and $m$ and $p$ are the numbers of input and output variables, respectively. The objective of PLS is to maximize the covariance between linear combinations of $X$ and $Y$. A PLS model consists of an inner model and an outer model [15, 23] (see Figure 3). The input and output matrices are related to the LVs via the outer model as follows [23]:

Figure 3.

Principle of PLS.

$$
\begin{cases}
X = \hat{X} + E = \sum_{i=1}^{l} t_i p_i^T + E = T P^T + E \\[4pt]
Y = \hat{Y} + F = \sum_{i=1}^{l} u_i q_i^T + F = U Q^T + F
\end{cases} \tag{1}
$$

where $\hat{X}$ and $\hat{Y}$ are the approximated data matrices of $X$ and $Y$, respectively; the matrices $T \in \mathbb{R}^{n \times l}$ and $U \in \mathbb{R}^{n \times q}$ consist of the $l$ retained LVs of the input and output data, respectively; $E \in \mathbb{R}^{n \times m}$ and $F \in \mathbb{R}^{n \times p}$ are the residual matrices containing the unexplained variance of the input and output data, respectively; and $P \in \mathbb{R}^{m \times l}$ and $Q \in \mathbb{R}^{p \times q}$ are the loading matrices of $X$ and $Y$, respectively. In practice, choosing a proper number $l$ of LVs is an important step in PLS modeling. If all LVs are used, the model may fit the noise and therefore lose predictive ability. Here, cross‐validation can be used to determine a proper number of LVs [24]. The inner model can be computed as

$$
U = T B + H, \tag{2}
$$

where B is a regression matrix and H is a residual matrix. The information in Y can be expressed as

$$
Y = T B Q^T + F^*, \tag{3}
$$

where the matrix $F^*$ is the residual matrix representing the unexplained variance.

2.2. Wavelet transform

Most engineering processes generate data with multiscale properties, signifying that they include both useful information and noise at different times and frequencies. The majority of fault detection approaches are based on time‐domain data (operating on a single time scale) and do not take the multiscale characteristics of the data into consideration. Wavelet analysis has been shown to represent data with multiscale properties, efficiently separating deterministic and stochastic features [18].

Multiresolution time series decomposition was initially applied by Mallat, who used orthogonal wavelet bases during data compression for image decoding [25]. Wavelets represent a family of basis functions that can be expressed as the following localized in both time and frequency [18]:

$$
\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t - b}{a}\right), \tag{4}
$$

where $a$ represents the dilation parameter, $b$ is the translation parameter [26], and $\psi(t)$ is the mother wavelet. Both parameters are commonly discretized dyadically as $a = 2^m$, $b = 2^m k$, $(m, k) \in \mathbb{Z}^2$, and the family of wavelets can then be represented as $\psi_{mk}(t) = 2^{-m/2}\,\psi(2^{-m} t - k)$. Different families of basis functions are created depending on the filters with which they are convolved, such as the Haar scaling function and the Daubechies filters [26, 27]. Dyadic discretization, through downsampling, halves the number of coefficients with every decomposition level; however, it also means that samples at nondyadic locations can only be decomposed after a certain time delay.

The discrete wavelet transform (DWT) analyzes the signal at different scales (or over different frequency bands) by decomposing it at each scale into a coarse approximation (low‐frequency information), $A$, and detail information (high‐frequency information), $D$. The DWT employs two sets of functions: the scaling functions $\phi_{j,k}(t) = \sqrt{2^{-j}}\,\phi(2^{-j} t - k)$, $k \in \mathbb{Z}$, and the wavelet functions $\psi_{j,k}(t) = \sqrt{2^{-j}}\,\psi(2^{-j} t - k)$, $j = 1, \ldots, J$, $k \in \mathbb{Z}$, which are associated with the low‐pass filter $H$ and the high‐pass filter $G$, respectively; the coarsest scale $J$ is usually termed the decomposition level. Any signal can be represented by a summation of the scaled and detailed signals as follows [26]:

$$
x(t) = \underbrace{\sum_{k=1}^{n 2^{-J}} a_{Jk}\,\phi_{Jk}(t)}_{A_J(t)} + \sum_{j=1}^{J} \underbrace{\sum_{k=1}^{n 2^{-j}} d_{jk}\,\psi_{jk}(t)}_{D_j(t)}, \tag{5}
$$

where $j$, $k$, $J$, and $n$ represent the dilation parameter, the translation parameter, the number of scales, and the number of observations in the original signal, respectively [28, 29]; $a_{Jk}$ and $d_{jk}$ are the scaling (approximation) and wavelet (detail) coefficients, respectively; and $A_J(t)$ and $D_j(t)$, $j = 1, 2, \ldots, J$, represent the approximated signal and the detail signals, respectively. By passing the signal through a series of high‐ and low‐pass wavelet filters, it is thus decomposed into signals at different scales, as shown in Figure 4.

Figure 4.

Principle of multiscale representation based on wavelet transform.
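As a concrete illustration of Eq. (5), the sketch below decomposes a dyadic-length signal with the orthonormal Haar filters, a minimal stand-in for the general filter pair $H$ and $G$; the helper name `haar_dwt` is ours, not from the chapter.

```python
import numpy as np

def haar_dwt(x, levels):
    """Decompose a length-2^J signal with the orthonormal Haar filters:
    at each level the low-pass output (approximation) and the high-pass
    output (detail) are kept after dyadic downsampling."""
    a = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        pairs = a.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0))  # d_jk
        a = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)              # a_jk
    return a, details  # coarsest approximation A_J and details D_1..D_J
```

Because the Haar basis is orthonormal, the coefficients preserve the energy of the original signal, which is a quick sanity check on any implementation.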

In the next subsection, we highlight the advantages of multiscale representation.

2.3. Advantages of multiscale representation

Conventional methods are referred to as time‐domain analysis methods. These methods are more sensitive to impulsive oscillations and are unable to extract frequencies and patterns that may be hidden in the data. Before the introduction of multiscale wavelet analysis, mathematical tools such as Fourier transform analysis, coherence function analysis, and power spectral density analysis were used. However, these tools force the signal to be expressed in terms of the basis of the tool being used; for example, Fourier transform analysis decomposes the signal into a sum of cosine and sine functions. Multiscale analysis helps overcome this limitation, as it examines the time and frequency domains simultaneously, whereas the Fourier transform can only switch between the time and frequency domains.

In a literature review of multiscale statistical process monitoring, Ganesan et al. state the following advantages of using wavelet coefficients in multiscale statistical process control (MSSPC) over conventional statistical process control (SPC) methods [20]:

  • The ability to separate noise from important features.

  • The wavelet coefficients of autocorrelated data are approximately decorrelated at multiple scales.

  • Data are closer to normality at multiple scales.

2.4. Separating noise from features

Two important applications, data compression and data denoising, can be achieved through wavelet multiscale decomposition. One of the biggest advantages of multiscale representation is its capacity to distinguish measurement noise from useful data features by applying low‐ and high‐pass filters to the data during multiscale decomposition. This allows the separation of features at different resolutions or frequencies, which makes multiscale representation a better tool for filtering or denoising noisy data than traditional linear filters, such as the mean filter and the EWMA filter. Despite their popularity, linear filters rely on defining a frequency threshold above which all features are treated as measurement noise. The ability of multiscale representation to separate noise has been used not only to improve data filtering, but also to improve the prediction accuracy of several empirical modeling methods and the accuracy of state estimators.

A noisy signal is filtered by a three‐step method [30]:

  • Apply the wavelet transform to decompose the noisy signal into the time‐frequency domain.

  • Threshold the detail coefficients, removing those below a selected threshold.

  • Transform the thresholded coefficients back into the original domain to obtain the filtered signal.
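The three steps above can be sketched as follows, assuming orthonormal Haar filters, soft thresholding, and the universal threshold with a median-based noise estimate; `haar_denoise` is an illustrative helper, not the authors' implementation.

```python
import numpy as np

def haar_denoise(x, levels=2):
    """Three-step wavelet denoising sketch: decompose with orthonormal
    Haar filters, soft-threshold the detail coefficients with the
    universal threshold, then reconstruct the filtered signal."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    # step 1: decompose
    a, details = x.copy(), []
    for _ in range(levels):
        pairs = a.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / s)
        a = (pairs[:, 0] + pairs[:, 1]) / s
    # step 2: threshold the details (noise level from the finest scale)
    sigma = np.median(np.abs(details[0])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))
    details = [np.sign(d) * np.maximum(np.abs(d) - thr, 0.0) for d in details]
    # step 3: reconstruct (exact inverse of the Haar decomposition)
    for d in reversed(details):
        up = np.empty(2 * len(a))
        up[0::2] = (a + d) / s
        up[1::2] = (a - d) / s
        a = up
    return a
```

A noise-free constant signal has zero detail coefficients, so the filter should return it unchanged; that is an easy first check.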

2.5. Multiscale PLS modeling

Data observed from environmental and engineering processes are usually noisy and correlated in time, which makes the fault detection more difficult as the presence of noise degrades fault detection quality, and most methods are developed for independent observations. Multiscale representation of data using wavelets is a powerful feature extraction tool that is well suited to denoising and decorrelating time series data.

The integrated multiscale PLS (MSPLS) modeling approach takes advantage of both latent variable regression and the denoising ability of multiscale wavelet decomposition, thus improving the prediction ability of the model, which in turn improves fault detection. The input data matrix $X$ and response matrix $y$ are decomposed at different scales using multiscale basis functions called wavelets. Let the decomposed data at scale $j$ be $X_j$ and $y_j$. The MSPLS model developed using the decomposed data can then be expressed as

$$
y_j = T_j B_j Q_j^T + F_j, \tag{6}
$$

where $X_j \in \mathbb{R}^{n \times m}$ is the filtered input data matrix at scale $j$, $y_j \in \mathbb{R}^{n \times 1}$ is the response output vector at scale $j$, and $F_j$ is the MSPLS model residual at the $j$th decomposition scale.

However, denoising the input and output variables prior to developing the model can result in poor prediction ability of the MSPLS model, due to the removal of features that may be important to the model. Therefore, in the proposed integrated MSPLS modeling approach, the optimum decomposition depth is selected based on the prediction ability of the developed MSPLS model. The integrated MSPLS modeling algorithm is summarized next [8].

  • Preprocess the training and testing data so that all variables are scaled to zero mean and unit variance.

  • Wavelet decomposition allows the data to be converted into wavelet coefficients. This changes the set of data from a single scale to multiple scales that allow for multiscale modeling.

  • Filter the training data at different scales using the filtering algorithm given in Section 2.4.

  • Build a PLS model using the filtered data at each scale. Cross‐validation is used to determine the number of LVs.

  • Use the estimated model from each scale to predict the output for the testing data and compute the cross‐validated mean square error.

  • Choose the PLS model with the smallest cross‐validated mean square error as the MSPLS model.
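The depth-selection logic of the algorithm above can be sketched as follows. For brevity, ordinary least squares stands in for the PLS inner model, and a coarse Haar approximation stands in for the full thresholding filter of Section 2.4; `haar_smooth` and `select_depth` are hypothetical helper names.

```python
import numpy as np

def haar_smooth(x, depth):
    """Coarse Haar approximation of each column of x at the given depth
    (a crude stand-in for the thresholding filter of Section 2.4)."""
    a = np.asarray(x, dtype=float)
    for _ in range(depth):
        a = a.reshape(len(a) // 2, 2, -1).mean(axis=1)
    # hold each coarse value over its 2**depth-sample support
    return np.repeat(a, 2 ** depth, axis=0)

def select_depth(X, y, depths=(0, 1, 2, 3)):
    """Fit a linear model (standing in for PLS) on the filtered training
    data at each depth; keep the depth with the smallest validation MSE."""
    n = len(y) // 2
    Xtr, ytr, Xte, yte = X[:n], y[:n], X[n:], y[n:]
    best = None
    for j in depths:
        Xf = haar_smooth(Xtr, j)
        yf = haar_smooth(ytr[:, None], j).ravel()
        beta, *_ = np.linalg.lstsq(Xf, yf, rcond=None)
        mse = float(np.mean((yte - Xte @ beta) ** 2))
        if best is None or mse < best[1]:
            best = (j, mse)
    return best  # (chosen depth, its validation mean square error)
```

In the chapter's actual procedure, a PLS model with cross-validated LVs replaces the least-squares fit, but the selection loop over decomposition depths is the same.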

Once an MSPLS model has been obtained from past normal operation, it can be used to monitor future deviations from normality. Two monitoring statistics, the $T^2$ and $Q$ statistics, are usually utilized for fault detection purposes [31]. First, the Hotelling $T^2$ statistic indicates the variation within the process model in the LV subspace. Second, the $Q$ statistic, also known as the squared prediction error (SPE), monitors how well the data conform to the model (see Figure 5).

Figure 5.

(a) Hotelling $T^2$ statistic and (b) $Q$ statistic.

The $T^2$ statistic, based on the number of retained LVs, $l$, is defined as [31]

$$
T^2 = \sum_{i=1}^{l} \frac{t_i^2}{\lambda_i}, \tag{7}
$$

where $\lambda_i$ is the $i$th eigenvalue of the covariance matrix of $X$. The $T^2$ statistic measures the variation in the LVs only. A large change in the LV subspace is indicated when points exceed the confidence limit of the $T^2$ chart, signaling a large deviation in the monitored system. The confidence limit for $T^2$ at level $(1 - \alpha)$ is related to the Fisher distribution, $F$, as follows [31]:

$$
T^2_{l,n,\alpha} = \frac{l\,(n - 1)}{n - l}\, F_{l,\,n-l,\,\alpha}, \tag{8}
$$

where $F_{l,\,n-l,\,\alpha}$ is the upper $100\alpha\%$ critical point of the $F$ distribution with $l$ and $n - l$ degrees of freedom.

The squared prediction error (SPE) or Q statistic, which is defined as [31]

$$
Q = e^T e, \tag{9}
$$

captures the changes in the residual subspace. Here, $e = x - \hat{x}$ represents the residual vector, which is the difference between a new observation, $x$, and its prediction, $\hat{x}$, via the MSPLS model. Eq. (9) expresses the $Q$ statistic directly as the total sum of squared variation in the residual vector $e$. The SPE can be considered a measure of the system‐model mismatch. The confidence limits for the SPE are given in Ref. [32]. This test declares an abnormal condition when $Q > Q_\alpha$, where $Q_\alpha$ is defined as

$$
Q_\alpha = \varphi_1 \left[ \frac{c_\alpha h_0 \sqrt{2 \varphi_2}}{\varphi_1} + 1 + \frac{\varphi_2 h_0 (h_0 - 1)}{\varphi_1^2} \right]^{1/h_0}, \tag{10}
$$


$$
\varphi_i = \sum_{j=l+1}^{m} \lambda_j^i, \quad i = 1, 2, 3, \tag{11}
$$

$$
h_0 = 1 - \frac{2 \varphi_1 \varphi_3}{3 \varphi_2^2}. \tag{12}
$$

Here, $c_\alpha$ is the value of the standard normal distribution at the $(1 - \alpha)$ percentile.
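Given the eigenvalues of the residual subspace, Eqs. (9)-(12) can be computed directly; `q_statistic` and `q_limit` below are illustrative names, with the 95% normal quantile hard-coded as an assumption.

```python
import math

def q_statistic(e):
    """Q (SPE) statistic of Eq. (9): squared norm of the residual vector."""
    return sum(v * v for v in e)

def q_limit(residual_eigvals, c_alpha=1.6449):
    """Confidence limit Q_alpha of Eqs. (10)-(12); residual_eigvals are
    the eigenvalues lambda_{l+1}..lambda_m of the residual subspace and
    c_alpha is the (1 - alpha) standard-normal quantile (1.6449 for
    alpha = 0.05)."""
    phi1, phi2, phi3 = (sum(lam ** i for lam in residual_eigvals)
                        for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * phi1 * phi3 / (3.0 * phi2 ** 2)
    bracket = (c_alpha * h0 * math.sqrt(2.0 * phi2) / phi1
               + 1.0 + phi2 * h0 * (h0 - 1.0) / phi1 ** 2)
    return phi1 * bracket ** (1.0 / h0)
```

A fault would then be declared for any sample whose `q_statistic` exceeds `q_limit`.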

However, the MSPLS‐based $T^2$ and $Q$ approaches fail to detect small faults [9]. Here, we use only the $Q$‐based chart as a benchmark for fault detection with PLS and MSPLS. Motivated by its power, the EWMA chart, a widely used univariate control chart, is proposed as an improved alternative for fault detection. The objective is to tackle MSPLS challenges in process monitoring by merging the advantages of the EWMA and MSPLS approaches to enhance their performance and widen their practical applicability.


3. EWMA monitoring charts

In this section, we briefly introduce the basic idea of the EWMA chart and its properties; for a more detailed discussion of EWMA charts, see Ref. [33]. The EWMA is a statistic that gives less weight to old data and more weight to new data. EWMA charts are able to detect small shifts in the process mean, since the EWMA statistic is a time‐weighted average of all previous observations. The EWMA control scheme was first introduced by Roberts [34] and is extensively used in time series analysis. The EWMA monitoring chart is an anomaly detection technique widely used by scientists and engineers in various disciplines [6, 33, 35]. Assume that $\{x_1, x_2, \ldots, x_n\}$ are individual observations collected from a monitored process. The EWMA is expressed as [33]

$$
\begin{cases}
z_t = \lambda x_t + (1 - \lambda)\, z_{t-1}, & \text{if } t > 0, \\
z_0 = \mu_0, & \text{if } t = 0.
\end{cases} \tag{13}
$$

The starting value $z_0$ is usually set to the mean of the fault‐free data, $\mu_0$; $z_t$ is the output of the EWMA, and $x_t$ is the observation from the monitored process at the current time. The forgetting parameter $\lambda \in (0, 1]$ determines how fast the EWMA forgets historical data. Equation (13) can also be written as

$$
z_t = \lambda \sum_{i=1}^{t} (1 - \lambda)^{t-i}\, x_i + (1 - \lambda)^{t} \mu_0, \tag{14}
$$

where $\lambda (1 - \lambda)^{t-i}$ is the weight given to $x_i$, which falls off exponentially for older observations. If $\lambda$ is small, more weight is assigned to past observations, and the chart is tuned to be efficient at detecting small changes in the process mean. On the other hand, if $\lambda$ is large, more weight is assigned to current observations, and the chart is more suitable for detecting large shifts [33]. In the special case $\lambda = 1$, the EWMA equals the most recent observation, $x_t$, and provides the same results as a Shewhart chart. As $\lambda$ approaches zero, the EWMA approaches the CUSUM criterion, which gives equal weight to current and historical observations.

Under fault‐free conditions, the standard deviation of z t is defined as

$$
\sigma_{z_t} = \sigma_0 \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2t}\right]}, \tag{15}
$$

where $\sigma_0$ is the standard deviation of the fault‐free or preliminary data set. Therefore, in such cases, $z_t \sim N(\mu_0, \sigma_{z_t}^2)$. In the presence of a mean shift at time point $1 \le \tau \le n$, however, $z_t \sim N\!\left(\mu_0 + \left[1 - (1 - \lambda)^{t - \tau + 1}\right](\mu_1 - \mu_0),\; \sigma_{z_t}^2\right)$. The upper and lower control limits (UCL and LCL) of the EWMA chart for detecting a mean shift are UCL/LCL $= \mu_0 \pm L \sigma_{z_t}$, where $L$ is a multiplier of the EWMA standard deviation $\sigma_{z_t}$. The parameters $L$ and $\lambda$ need to be set carefully [33]. In practice, $L$ is usually set to three, which corresponds to a false alarm rate of 0.27%. If $z_t$ falls within the interval [LCL, UCL], we conclude that the process is in control up to time point $t$; otherwise, the process is considered out of control.
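A minimal sketch of the EWMA recursion of Eq. (13) with the time-varying limits built from Eq. (15) might look as follows; `ewma_chart` is our name for the helper, with λ = 0.25 and L = 3 as assumed defaults.

```python
import math

def ewma_chart(x, mu0, sigma0, lam=0.25, L=3.0):
    """EWMA recursion z_t = lam*x_t + (1-lam)*z_{t-1} starting at mu0,
    with time-varying mu0 +/- L*sigma_z control limits; flags every
    sample whose EWMA statistic falls outside the limits."""
    z, alarms, limits = mu0, [], []
    for t, xt in enumerate(x, start=1):
        z = lam * xt + (1.0 - lam) * z
        sigma_z = sigma0 * math.sqrt(lam / (2.0 - lam)
                                     * (1.0 - (1.0 - lam) ** (2 * t)))
        limits.append((mu0 - L * sigma_z, mu0 + L * sigma_z))
        alarms.append(z < limits[-1][0] or z > limits[-1][1])
    return alarms, limits
```

An in-control sequence at the nominal mean should raise no alarms, while a sustained shift of a few standard deviations is flagged within the first few samples.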


4. Combining MSPLS model with EWMA chart: MSPLS‐EWMA

In this chapter, we combine the advantages of MSPLS modeling with those of the univariate EWMA monitoring chart, which results in an improved fault detection system, especially for detecting small faults in highly correlated, multivariate data. Toward this end, we apply the EWMA chart to the output residuals obtained from the MSPLS model (see Figure 6). Indeed, under normal operation with little noise and few errors, the residuals are close to zero, while they deviate significantly from zero in the presence of abnormal events. In this work, the output residuals from the MSPLS model are used as a fault indicator.

Figure 6.

Principle of MSPLS‐EWMA procedure.

As given in Eq. (6), the output vector y can be written as the sum of a predicted vector y ^ and a residual vector F , i.e.,

$$
y = \hat{y} + F. \tag{16}
$$

The residual of the output variable, $F = [f_1, \ldots, f_t, \ldots, f_n]$, which is the difference between the observed value of the output variable, $y$, and the predicted value, $\hat{y}$, obtained from the MSPLS model, is a potential indicator for fault detection. The EWMA statistic based on the residuals of the response variable can be calculated as follows:

$$
z_t = \lambda f_t + (1 - \lambda)\, z_{t-1}, \quad t \in [1, n]. \tag{17}
$$

In this case, since the EWMA control scheme is applied to the residuals of a single response variable, only one EWMA decision function needs to be computed to monitor the process.
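Putting Eqs. (15)-(17) together, monitoring the MSPLS output residuals could be sketched as below; `residual_ewma` is a hypothetical helper, and the fault-free residual standard deviation is either supplied or, as an assumption, estimated from the residuals themselves.

```python
import math

def residual_ewma(y, y_hat, lam=0.3, L=3.0, sigma0=None):
    """Apply the EWMA recursion of Eq. (17) to the model residuals
    f_t = y_t - yhat_t; fault-free residuals are assumed zero-mean,
    so samples with |z_t| > L*sigma_z are flagged as faulty."""
    f = [a - b for a, b in zip(y, y_hat)]
    if sigma0 is None:  # estimate the residual noise level (assumption)
        m = sum(f) / len(f)
        sigma0 = math.sqrt(sum((v - m) ** 2 for v in f) / (len(f) - 1))
    z, flags = 0.0, []
    for t, ft in enumerate(f, start=1):
        z = lam * ft + (1.0 - lam) * z
        sz = sigma0 * math.sqrt(lam / (2.0 - lam)
                                * (1.0 - (1.0 - lam) ** (2 * t)))
        flags.append(abs(z) > L * sz)
    return flags
```

With small zero-mean residuals the statistic stays inside the limits, while a sustained residual bias, such as the sensor faults studied in Section 5, is flagged quickly.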


5. Monitoring a simulated distillation column

In this section, the ability of the proposed MSPLS‐EWMA technique to detect faults is studied through simulated data, and the results are compared with those obtained using the traditional PLS‐EWMA method. In all monitoring charts, the red‐shaded area marks the region where the fault is injected into the test data, and the 95% control limits are plotted as horizontal dashed lines.

5.1. Description and data generation of the process

A distillation column is one of the most commonly used unit operations in the chemical process industries. The objective of the distillation operation is to separate components from a mixture. The operation of a distillation column is very energy intensive; therefore, monitoring such a process plays a very important role in reducing the cost of the operation. A schematic diagram of the distillation column is shown in Figure 7.

Figure 7.

Distillation column diagram.

The efficacy of the proposed fault detection strategy is tested using a distillation column simulated with the ASPEN simulation software [36]. The input variables consist of temperature measurements at different locations of the distillation column, along with the feed flow rate and the reflux flow rate. The light distillate from the reflux drum is considered the response variable. The operating conditions, nominal operating conditions, and detailed steps involved in the data generation can be found in Ref. [36]. The 1024 generated data samples are then corrupted with zero‐mean Gaussian white noise at a signal‐to‐noise ratio (SNR) of 10 dB and used for model development and for testing the fault detection (FD) strategy. Figure 8 shows dynamic data of the distillation column, i.e., variations of the light component for changes in the reflux and feed flows. The MSPLS model is developed from the first 512 data samples, and the remaining data points are used for testing. The optimal number of LVs is determined through cross‐validation and found to be three for the MSPLS model.

Figure 8.

Simulation of a distillation column: variation of input‐output data with SNR = 10 (Solid line: noise-free data; dots: noisy data).

A scatter plot of the measured and predicted data is presented in Figure 9. This plot indicates reasonable prediction performance of the selected model.

Figure 9.

Scatter plots of predicted and observed training data.

5.2. Detection results

After a process model has been successfully identified, we can proceed with fault detection. Three types of faults in distillation columns will be considered here: abrupt, intermittent, and gradual faults.

To quantify the efficiency of the proposed strategies, we use two metrics: the false alarm rate (FAR) and the missed detection rate (MDR) [37]. The FAR is the number of normal observations wrongly flagged as faulty (false alarms) divided by the total number of fault‐free data points. The MDR is the number of faulty observations wrongly classified as normal (missed detections) divided by the total number of faulty data points.
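These two metrics can be computed directly from the alarm flags of any chart; `far_mdr` is an illustrative helper that reports both rates as percentages.

```python
def far_mdr(alarms, fault_mask):
    """False alarm rate and missed detection rate (in percent) from a
    list of alarm flags and a same-length mask marking the samples
    where a fault is actually present."""
    normal = [a for a, f in zip(alarms, fault_mask) if not f]
    faulty = [a for a, f in zip(alarms, fault_mask) if f]
    far = 100.0 * sum(normal) / len(normal)          # alarms on normal data
    mdr = 100.0 * sum(not a for a in faulty) / len(faulty)  # misses on faults
    return far, mdr
```

For example, a chart that alarms on one of two normal samples and detects one of two faulty samples scores FAR = 50% and MDR = 50%.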

5.2.1. Case (A): abrupt fault detection

In this case study, an abrupt change is simulated by adding a small constant deviation, equal to 2% of the total variation in temperature $T_{c3}$, to the temperature sensor measurements between sample times 150 and 200. In this example, testing data with a low SNR (SNR = 5) are generated to evaluate the monitoring performance of the MSPLS‐EWMA and PLS‐Q charts. The results of the PLS‐Q and MSPLS‐Q statistics are shown in Figure 10(a) and (c), respectively, from which it can be seen that neither chart can detect this small fault. Figure 10(b) shows that the PLS‐EWMA chart is capable of detecting this simulated fault, but with many missed detections (MDR = 55% and FAR = 0.96%). Figure 10(d) shows that the MSPLS‐EWMA chart clearly detects this abrupt fault without missed detections (MDR = 0% and FAR = 0.96%).

Figure 10.

Monitoring results of PLS‐Q chart (a), PLS‐EWMA chart (b), MSPLS‐Q chart (c), and MSPLS‐EWMA chart (d) in the presence of a bias anomaly in the temperature sensor measurements ‘ T c 3 ’ with SNR = 30, Case (A).

5.2.2. Case (B): intermittent fault

In this case study, we introduce into the testing data a bias with an amplitude of 2% of the total variation in temperature $T_{c3}$ between samples 50 and 100, and a bias of 10% between samples 350 and 450. Figure 11(a)–(d) shows the monitoring results of the PLS‐based Q and EWMA charts and the MSPLS‐based Q and EWMA charts. Figure 11(a) shows that the PLS‐based Q chart has no power to detect this fault. From Figure 11(b), it can be seen that the PLS‐EWMA chart can indeed detect this fault, but with some missed detections. Figure 11(c) shows that the MSPLS‐Q chart can detect the intermittent faults, but with several missed detections. On the other hand, the MSPLS‐EWMA chart with $\lambda = 0.3$ correctly detects this intermittent fault (see Figure 11(d)). In this case study, we can see that detection performance is much enhanced when using the MSPLS‐EWMA chart compared to the others.

Figure 11.

Monitoring results of PLS‐Q chart (a), PLS‐EWMA chart (b), MSPLS‐Q chart (c), and MSPLS‐EWMA chart (d) in the presence of an intermittent sensor fault in ‘ $T_{c3}$ ’ with SNR = 30, Case (B).

5.2.3. Case (C): drift failure detection

A slow drift fault is simulated by adding a ramp change with a slope of 0.01 to the temperature sensor, $T_{c3}$, from sample 250 through the end of the testing data. Monitoring results of the PLS‐ and MSPLS‐based Q and EWMA statistics are shown in Figure 12(a)–(d). Figure 12(a) shows the monitoring results of the PLS‐Q chart, in which a signal is first given at sample 313 with a significant false alarm rate (FAR = 22.4%). Figure 12(b) shows that the PLS‐EWMA chart first detects the fault at the 290th observation. The MSPLS‐Q chart is shown in Figure 12(c), which first flags the fault at sample 323. Figure 12(d) shows that the MSPLS‐EWMA chart first detects the fault at the 288th observation. Therefore, fewer observations are needed for the MSPLS‐EWMA chart to detect the fault compared to the other charts.

Figure 12.

Monitoring results of PLS‐Q chart (a), PLS‐EWMA chart (b), MSPLS‐Q chart (c), and MSPLS‐EWMA chart (d) in the presence of a drift sensor anomaly in ‘ $T_{c3}$ ’ with SNR = 30, Case (C).

This case study testifies again to the superiority of the proposed approach compared to conventional PLS‐based fault detection. This chapter thus demonstrates through simulated data that a significant improvement in fault detection can be obtained by using the MSPLS model combined with the EWMA chart.


6. Conclusion

The objective of this chapter was to extend PLS fault‐detection methods to deal with uncertainty in the measurements. The developed approach merges the flexibility of the multiscale PLS model with the greater sensitivity of the EWMA control chart to incipient changes. Specifically, in this approach, the multiscale PLS model is constructed using the wavelet coefficients at different scales, and the EWMA monitoring chart is then applied to the model residuals to further improve the fault detection abilities of the PLS fault detection method. Using a simulated distillation column, we demonstrated the effectiveness of MSPLS‐EWMA in detecting abrupt, intermittent, and drift faults. The results show that MSPLS‐EWMA achieves better fault‐detection efficiency than the PLS‐EWMA, PLS‐Q, and MSPLS‐Q monitoring approaches.



Acknowledgments

This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No: OSR‐2015‐CRG4‐2582.


References

1. Aldrich C, Auret L. Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods. London: Springer; 2013
2. Isermann R. Model‐based fault‐detection and diagnosis: Status and applications. Annual Reviews in Control. 2005;29:71–85
3. Ralston P, DePuy G, Graham J. Computer‐based monitoring and fault diagnosis: A chemical process case study. ISA Transactions. 2001;40(1):85–98
4. Neumann J, Deerberg G, Schlüter S. Early detection and identification of dangerous states in chemical plants using neural networks. Journal of Loss Prevention in the Process Industries. 1999;12(6):451–453
5. Chiang L, Braatz R, Russell E. Fault Detection and Diagnosis in Industrial Systems. London: Springer; 2001
6. Harrou F, Fillatre L, Bobbia M, Nikiforov I. Statistical detection of abnormal ozone measurements based on constrained generalized likelihood ratio test. In: IEEE 52nd Annual Conference on Decision and Control (CDC), Firenze, Italy; IEEE; 2013. pp. 4997–5002
7. Harrou F, Fillatre L, Nikiforov I. Anomaly detection/detectability for a linear model with a bounded nuisance parameter. Annual Reviews in Control. 2014;38(1):32–44
8. Madakyaru M, Harrou F, Sun Y. Improved data‐based fault detection strategy and application to distillation columns. Process Safety and Environmental Protection. 2017;107:22–34
9. Harrou F, Nounou M, Nounou H, Madakyaru M. PLS‐based EWMA fault detection strategy for process monitoring. Journal of Loss Prevention in the Process Industries. 2015;36:108–119
10. Yin S, Ding SX, Haghani A, Hao H, Zhang P. A comparison study of basic data‐driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. Journal of Process Control. 2012;22(9):1567–1581
11. Yin S, Ding S, Xie X, Luo H. A review on basic data‐driven approaches for industrial process monitoring. IEEE Transactions on Industrial Electronics. 2014;61(11):6418–6428
12. Zhao Y, Wang S, Xiao F. Pattern recognition‐based chillers fault detection method using support vector data description (SVDD). Applied Energy. 2013;112:1041–1048
13. Liang W, Zhang L. A wave change analysis (WCA) method for pipeline leak detection using Gaussian mixture model. Journal of Loss Prevention in the Process Industries. 2012;25(1):60–69
14. Abdi H, Williams L. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics. 2010;2(4):433–459
15. Wold S, Ruhe A, Wold H, Dunn WJ III. The collinearity problem in linear regression: The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing. 1984;5(3):735–743
16. Yin S, Zhu X, Kaynak O. Improved PLS focused on key‐performance‐indicator‐related fault diagnosis. IEEE Transactions on Industrial Electronics. 2015;62(3):1651–1658
17. Harrou F, Kadri F, Khadraoui S, Sun Y. Ozone measurements monitoring using data‐based approach. Process Safety and Environmental Protection. 2016;100:220–231
18. Bakshi B. Multiscale PCA with application to multivariate statistical process monitoring. AIChE Journal. 1998;44(7):1596–1610
19. Yoon S, MacGregor J. Principal‐component analysis of multiscale data for process monitoring and fault diagnosis. AIChE Journal. 2004;50(11):2891–2903
20. Ganesan R, Das K, Venkataraman V. Wavelet‐based multiscale statistical process monitoring: A literature review. IIE Transactions. 2004;36(9):787–806
21. Li X, Yao X. Multi‐scale statistical process monitoring in machining. IEEE Transactions on Industrial Electronics. 2005;52(3):924–927
22. Geladi P, Kowalski B. Partial least‐squares regression: A tutorial. Analytica Chimica Acta. 1986;185:1–17
23. MacGregor J, Kourti T. Statistical process control of multivariate processes. Control Engineering Practice. 1995;3(3):403–414
24. Li B, Morris J, Martin E. Model selection for partial least squares regression. Chemometrics and Intelligent Laboratory Systems. 2002;64(1):79–89
25. Mallat S. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989;11(7):674–693
26. Gao R, Yan R. Wavelets: Theory and Applications for Manufacturing. New York: Springer; 2010
27. Zhou S, Sun B, Shi J. An SPC monitoring system for cycle‐based waveform signals using Haar transform. IEEE Transactions on Automation Science and Engineering. 2006;3(1):60–72
28. Strang G. Wavelets and dilation equations: A brief introduction. SIAM Review. 1989;31(4):614–627
29. Daubechies I. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics. 1988;41(7):909–996
30. Donoho DL, Johnstone IM, Kerkyacharian G, Picard D. Wavelet shrinkage: Asymptopia? Journal of the Royal Statistical Society B. 1995;57(2):301–369
31. Qin S. Statistical process monitoring: Basics and beyond. Journal of Chemometrics. 2003;17(8–9):480–502
32. Jackson J, Mudholkar G. Control procedures for residuals associated with principal component analysis. Technometrics. 1979;21(3):341–349
33. Montgomery DC. Introduction to Statistical Quality Control. New York: John Wiley & Sons; 2005
34. Roberts SW. Control chart tests based on geometric moving averages. Technometrics. 1959;1(3):239–250
35. Morton P, Whitby M, McLaws M‐L, Dobson A, McElwain S, Looke D, Stackelroth J, Sartor A. The application of statistical process control charts to the detection and monitoring of hospital‐acquired infections. Journal of Quality in Clinical Practice. 2001;21(4):112–117
36. Madakyaru M, Nounou M, Nounou H. Enhanced modeling of distillation columns using integrated multiscale latent variable regression. In: IEEE Symposium on Computational Intelligence in Control and Automation (CICA), 16–19 April 2013, Singapore; IEEE; 2013. pp. 73–80
37. Harrou F, Sun Y, Madakyaru M. Kullback‐Leibler distance‐based enhanced detection of incipient anomalies. Journal of Loss Prevention in the Process Industries. 2016;44:73–87
