Open access peer-reviewed chapter

Process Fault Diagnosis for Continuous Dynamic Systems Over Multivariate Time Series

Written By

Chris Aldrich

Submitted: June 13th, 2018 Reviewed: February 26th, 2019 Published: April 23rd, 2019

DOI: 10.5772/intechopen.85456



Fault diagnosis in continuous dynamic systems can be challenging, since the variables in these systems are typically characterized by autocorrelation, as well as time variant parameters, such as mean vectors, covariance matrices, and higher order statistics, which are not handled well by methods designed for steady state systems. In dynamic systems, steady state approaches are extended to deal with these problems, essentially through feature extraction designed to capture the process dynamics from the time series. In this chapter, recent advances in feature extraction from signals or multivariate time series are reviewed. These methods can subsequently be considered in a classical statistical monitoring framework, such as used for steady state systems. In addition, an extension of nonlinear signal processing based on the use of unthresholded or global recurrence quantification analysis is discussed, where two multivariate image methods based on gray level co-occurrence matrices and local binary patterns are used to extract features from time series. When considering the well-known simulated Tennessee Eastman process in chemical engineering, it is shown that time series features obtained with this approach can be an effective means of discriminating between different fault conditions in the system. The approach provides a general framework that can be extended in multiple ways to time series analysis.


Keywords

  • process fault diagnosis
  • statistical process control
  • machine learning
  • time series analysis
  • deep learning

1. Introduction

In the process industries, advanced process control is widely recognized as essential to meet the challenges arising from the trend toward more complex, larger scale circuit configurations, plant-wide integration, and having to make do with fewer personnel. In these environments, characterized by highly automated process operations, algorithms to detect and classify abnormal trends in process measurements are critically important.

Process diagnostic algorithms can be derived from a continuum spanning first-principles models on one end to entirely data-driven or statistical models on the other. The latter are typically based on historical process data and are seen as the most cost-effective approach to dealing with complex systems. As a consequence, data-driven diagnostic methods have seen considerable growth over the last couple of decades. Data-driven fault diagnosis can be traced back to control charts invented by Walter Shewhart at Bell Laboratories in the 1920s to improve the reliability of their telephony transmission systems. In these statistical process control charts, variables of interest were plotted as time series within statistical upper and lower limits. Shewhart’s methodology was subsequently popularized by Deming, and statistical concepts such as Shewhart control charts (1931), cumulative sum charts (1954), and exponentially weighted moving average charts were well established by the 1960s [1].

These univariate control charts do not exploit the correlation that may exist between process variables. In the case of process data, crosscorrelation is present, owing to restrictions enforced by mass and energy conservation principles, as well as the possible existence of a large number of different sensor readings on essentially the same process variable. These shortcomings have given rise to multivariate methods or multivariate statistical process control and related methods that have proliferated exponentially over the last number of years. These approaches can be viewed on the basis of the elementary operations involved in the fault diagnostic process, as outlined in Figure 1 [2].

Figure 1.

Generalized framework for unsupervised process fault diagnosis.

In this diagram, (i) a data matrix (X̌), representative of the process, is preprocessed or transformed to (ii) a data matrix X and then mapped to (iii) a feature space F within (iv) some bounded region L_F. These features can be used to (v) reconstruct the data (X̂), from which (vi) an error matrix E is generated, with scores again mostly confined to (vii) some bounded region L_E.

Fault detection and fault diagnosis are typically done in both the feature space F and the error space E, based on the use of forward (ℑ) and reverse (ℜ) mapping models and suitable confidence limits L_F and L_E for the feature and error spaces. Alternatively, forward mapping into the feature space alone can be used for process monitoring.

Preprocessing of the data prior to fault diagnosis has received considerable attention over the last decade or so as a basis for the development of methods that can deal with nonlinearities in the data, lagged variables and unfolding of higher dimensional data. These approaches will mostly be discussed in the second part of the chapter.

1.1 Steady state systems

Linear steady state Gaussian processes and the use of principal component analysis will first be considered as an example on the basis of this general framework, after which other methods proposed over the last few decades will be reviewed.

As mentioned in the previous section, univariate control charts do not exploit the correlation that may exist between process variables and when the assumptions of linearity, steady state, and Gaussian behavior hold, multivariate statistical process control based on the use of principal component analysis can be used very effectively for early detection and analysis of any abnormal plant behavior. Since principal component analysis plays such a major role in the design of these diagnostic models, a brief outline of the methodology is in order.

Analysis, monitoring, and diagnosis of process operating performance based on the use of principal components is well established. The basic theory can be summarized as follows: $\mathbf{X} \in \mathbb{R}^{N \times M}$ comprises the data matrix representative of the process, with M variables and N observations, typically scaled to zero mean and unit variance; $\mathbf{S}$ is the covariance matrix of the process variables; $\mathbf{P}$ is the loading matrix of the first $k < M$ principal components; $\boldsymbol{\Lambda}$ is a diagonal matrix containing the corresponding k eigenvalues of the decomposition; $\tilde{\mathbf{P}}$ is the loading matrix of the $M-k$ remaining principal components; and $\tilde{\boldsymbol{\Lambda}}$ is a diagonal matrix containing the $M-k$ remaining eigenvalues. The T² and Q-diagnostics (Eqs. 2 and 3) are commonly used in process monitoring schemes.

$$\mathbf{S} = \frac{\mathbf{X}^T \mathbf{X}}{N-1} = \mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^T + \tilde{\mathbf{P}} \tilde{\boldsymbol{\Lambda}} \tilde{\mathbf{P}}^T \quad \text{(E1)}$$
$$Q = (\mathbf{x} - \hat{\mathbf{x}})^T (\mathbf{x} - \hat{\mathbf{x}}) = \mathbf{x}^T \mathbf{C} \mathbf{x}, \quad \text{where } \mathbf{C} = \tilde{\mathbf{P}} \tilde{\mathbf{P}}^T \quad \text{(E2)}$$
$$T^2 = \mathbf{t}^T \boldsymbol{\Lambda}^{-1} \mathbf{t} = \mathbf{x}^T \mathbf{D} \mathbf{x}, \quad \text{where } \mathbf{D} = \mathbf{P} \boldsymbol{\Lambda}^{-1} \mathbf{P}^T \quad \text{(E3)}$$

In classical multivariate statistical process control based on principal component analysis, the control limits required for automated process monitoring are based on the assumption that the data are normally distributed. The upper control limit for T² at significance level α is calculated from N observations based on the F-distribution, that is,

$$\mathrm{UCL}_{T^2}^{\mathrm{PCA}} = \frac{k(N+1)(N-1)}{N(N-k)} F_{\alpha,k,N-k} \quad \text{(E4)}$$

The upper control limit for Q is calculated by means of an approximation to its weighted χ² distribution as:

$$\mathrm{UCL}_{Q}^{\mathrm{PCA}} = \theta_1 \left[ \frac{c_\alpha \sqrt{2 \theta_2 h_0^2}}{\theta_1} + 1 + \frac{\theta_2 h_0 (h_0 - 1)}{\theta_1^2} \right]^{1/h_0} \quad \text{(E5)}$$

where $\theta_i = \sum_{j=k+1}^{M} \lambda_j^i$ (for i = 1, 2, 3) and $h_0 = 1 - 2\theta_1\theta_3/(3\theta_2^2)$. Here, $c_\alpha$ is the standard normal deviate corresponding to the upper (1−α) percentile, while M is the total number of principal components (variables). The residual Q is more likely to have a normal distribution than the principal component scores, since it is a measure of the nondeterministic behavior of the system.
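As a compact illustration of Eqs. 1–5, the following sketch fits a PCA model to synthetic normal operating data and computes the T² and Q statistics together with their upper control limits. All data and dimensions (N, M, k) are synthetic, chosen purely for illustration.

```python
# Sketch of PCA-based monitoring (Eqs. 1-5) on synthetic NOC data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, M, k = 500, 5, 2
# Correlated normal operating data, scaled to zero mean and unit variance
Z = rng.standard_normal((N, k))
X = Z @ rng.standard_normal((k, M)) + 0.1 * rng.standard_normal((N, M))
X = (X - X.mean(axis=0)) / X.std(axis=0)

S = X.T @ X / (N - 1)                       # covariance matrix (Eq. 1)
eigval, eigvec = np.linalg.eigh(S)
order = np.argsort(eigval)[::-1]
lam, P = eigval[order], eigvec[:, order]
Pk, lam_k = P[:, :k], lam[:k]               # retained loadings and eigenvalues

T = X @ Pk                                  # scores
T2 = np.sum(T**2 / lam_k, axis=1)           # Hotelling's T^2 (Eq. 3)
Xhat = T @ Pk.T                             # reconstruction from k components
Q = np.sum((X - Xhat)**2, axis=1)           # residual Q statistic (Eq. 2)

alpha = 0.01
UCL_T2 = k * (N + 1) * (N - 1) / (N * (N - k)) * stats.f.ppf(1 - alpha, k, N - k)

theta = [np.sum(lam[k:]**i) for i in (1, 2, 3)]
h0 = 1 - 2 * theta[0] * theta[2] / (3 * theta[1]**2)
c_a = stats.norm.ppf(1 - alpha)
UCL_Q = theta[0] * (c_a * np.sqrt(2 * theta[1] * h0**2) / theta[0]
                    + 1 + theta[1] * h0 * (h0 - 1) / theta[0]**2) ** (1 / h0)
```

Under normal operating conditions, only a small fraction of the samples should exceed either limit.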

1.2 Unsteady state systems

Unlike steady state systems, unsteady state or dynamic systems show time dependence. This time dependence implies the presence of autocorrelation and/or nonstationarity [3]. Autocorrelation arises when the observations within a time series are not independent, while nonstationarity means that the parameters governing a process change over time, for example, the mean, covariance or other higher order statistics. Therefore, in principle at least, these systems cannot be treated directly by the methods dealing with steady state systems.

Broadly speaking, methodologies dealing with dynamic process systems are all aimed at dealing with the issues arising from the time dependence of the data. Essentially, these approaches are based on the analysis of a segment of the time series data, as captured by a fixed or a moving window, as indicated in Figure 2. The time series segment amounts to observation of the process over a time interval, and the window length should be sufficient to capture the dynamics of the systems.

Figure 2.

Dynamic process monitoring as an extension of steady state approaches.

Dynamic process monitoring can be as simple as monitoring the mean or the variance of a signal, in which case, a test window as shown in Figure 2 would not be required, and model maintenance would not be an issue. In more complex systems, as could be characterized by large multivariate sets of signals or high-dimensional signals, such as streaming video or hyperspectral data, feature extraction is often model-based. That is, a model derived from the data in the base window is applied to the data in the test window. For example, principal component models can be used for this purpose.

Where models are used and the nature of the signals changes as a result of process drift, recalibration of the models needs to be done either at regular intervals or episodically, that is, when a change occurs. Some models, such as those based on principal and independent components, can be updated recursively, as discussed in more detail in Sections 4 and 5. Alternatively, the model is rebuilt ab initio at regular intervals.

Moreover, most feature extraction methods are unsupervised, that is, the time series data are unlabeled. Where supervised methods are used, features are extracted based on their ability to predict some label, such as the future evolution of the time series.


2. Unsupervised feature extraction

In principle, any low-dimensional representation of the time series data would constitute a feature set, that is, of the data in the time series window $\mathbf{X} \in \mathbb{R}^{N \times M}$, containing N measurements of the M plant variables, together with time-lagged copies of these variables. These features can subsequently be dealt with by the same methods used for steady state systems, such as principal component analysis, independent component analysis, kernel methods, etc., some of which are considered in more detail below.

2.1 Dynamic principal component analysis (DPCA)

In dynamic PCA, first proposed by Ku et al. [4], the PCA model is built on the data matrix X residing in the window, augmented with lagged copies of the variables, to account for auto- and crosscorrelation between variables. This approach implicitly estimates the autoregressive structure of the data (e.g., [5]). As functions of the model, the T² and Q-statistics will also be functions of the lag parameters. Since the mean and covariance structures are assumed to be invariant, the same global model is used to evaluate observations at any future time point.
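A minimal sketch of the lagged (augmented) data matrix that underlies dynamic PCA is shown below; the helper name lag_matrix and the lag count are illustrative, not from the original reference.

```python
# Building the lagged data matrix for dynamic PCA: each row of the result
# stacks the current observation with its l most recent lags.
import numpy as np

def lag_matrix(X, l):
    """Stack X with its first l lags: row t becomes [x_t, x_{t-1}, ..., x_{t-l}]."""
    N, M = X.shape
    return np.hstack([X[l - j:N - j] for j in range(l + 1)])

X = np.arange(12.0).reshape(6, 2)   # 6 observations of 2 variables
Xd = lag_matrix(X, 2)               # shape (4, 6): 2 variables x (2 lags + 1)
```

An ordinary PCA model fitted to Xd then captures both the auto- and crosscorrelation structure.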

Although dynamic PCA is designed to deal with autocorrelation in the data, the resultant score variables can still be autocorrelated, or even crosscorrelated when no autocorrelation is present in the data [4, 6]. These autocorrelated score variables have the drawback that they can lead to higher rates of false alarms when using Hotelling’s T² statistic.

Several remedies have been proposed to alleviate this problem, for example, wavelet filtering [7], ARMA filtering [6], and the use of residuals from predictive models [8]. Nonlinear PCA models have been considered by several authors [9, 10, 11, 12, 13].

2.2 Independent component analysis

Stefatos and Hamza [14] and Hsu et al. [15] have introduced diagnostic methods using an approach based on dynamic independent component analysis capable of accurately detecting and isolating the root causes of individual faults. Nonlinear variants of these approaches have been investigated by Cai et al. [16], who have integrated the kernel FastICA algorithm with a manifold learning method known as locality preserving projection. Moreover, kernel FastICA was used to integrate FastICA and kernel PCA to exploit the advantages of both algorithms, as indicated by Zhang and Qin [17], Zhang [18], and Zhang et al. [19].

2.3 Slow feature analysis

Slow feature analysis [20] is an unsupervised learning method, whereby functions g(x) are identified to extract slowly varying features y(t) from rapidly varying signals x(t). This is done virtually instantaneously, that is, one time slice of the output is based on very few time slices of the input. Extensions of the method have been proposed by other authors [21, 22, 23].
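The idea can be sketched for the linear case: after whitening the signals, the directions along which the time-differenced signal has the smallest variance yield the slowest features. The two-source mixture below is synthetic and purely illustrative.

```python
# Minimal linear slow feature analysis: whiten x(t), then eigendecompose the
# covariance of the time derivative; small-eigenvalue directions are "slow".
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 1000)
slow = np.sin(t)                          # slowly varying source
fast = np.sin(40 * t)                     # rapidly varying source
A = np.array([[1.0, 0.5], [0.4, 1.0]])
X = np.column_stack([slow, fast]) @ A.T   # observed mixture x(t)
X = X - X.mean(axis=0)

# Whitening
d, E = np.linalg.eigh(np.cov(X.T))
W = E / np.sqrt(d)
Z = X @ W                                 # whitened signals (unit covariance)

# Covariance of the time difference; eigh sorts eigenvalues ascending,
# so the first column of V is the slowest direction
dZ = np.diff(Z, axis=0)
dd, V = np.linalg.eigh(np.cov(dZ.T))
y = Z @ V                                 # features, ordered slowest to fastest

# The slowest extracted feature should track the slow source closely
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
```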

2.4 Multiscale methods

Multiscale methods can be seen as a complementary approach preceding feature extraction from the time series. In this case, each process variable is extended or replaced by different versions of the variable at different scales. For example, with multiscale PCA, wavelets are used to decompose the process variables under scrutiny into multiple scale representations before application of PCA to detect and identify faulty conditions in process operations. In this way, autocorrelation of variables is implicitly accounted for, resulting in a more sensitive method for detecting process anomalies. Multiscale PCA constitutes a promising extension of multivariate statistical process control methods, and several authors have reported successful applications thereof [24, 25, 26, 27].

2.4.1 Wavelets

Bakshi [28, 29] has proposed the use of a nonlinear multiscale principal component analysis methodology for process monitoring and fault detection based on multilevel wavelet decomposition and nonlinear component extraction by the use of input-training neural networks. In this case, wavelets are first used to decompose the data into different scales, after which PCA is applied to the reconstituted time series data. Choi et al. [30] have proposed nonlinear multiscale multivariate monitoring of dynamic processes based on kernel PCA, while Xuemin and Xiaogang [31] have proposed an integrated multiscale approach where kernel PCA is applied to measured process signals decomposed with wavelets, together with a similarity factor to identify fault patterns.
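To make the multiscale idea concrete, here is a hand-rolled one-level Haar decomposition of a single variable; in practice a wavelet library (e.g., PyWavelets) and multiple decomposition levels would be used. The function name and test data are illustrative.

```python
# One-level Haar wavelet decomposition: the kind of per-variable multiscale
# preprocessing applied before PCA in multiscale PCA.
import numpy as np

def haar_level1(x):
    """Return (approximation, detail) coefficients of a one-level Haar DWT."""
    x = np.asarray(x, dtype=float)
    pairs = x.reshape(-1, 2)                          # non-overlapping pairs
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)  # low-pass (coarse scale)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # high-pass (fine scale)
    return approx, detail

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_level1(x)
# The orthonormal transform preserves energy: ||x||^2 == ||a||^2 + ||d||^2
```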

2.4.2 Singular spectrum analysis

With SSA, the time series is first embedded into a p-dimensional space by forming the so-called trajectory matrix. Singular value decomposition is then applied to decompose the trajectory matrix into a sum of elementary matrices [32, 33, 34], each of which is associated with a process mode.

Subsequently, the elementary matrices that contribute to the norm of the original matrix are grouped, with each group giving an approximation of the original matrix. Finally, the smoothed approximations or modes of the time series are recovered by diagonal averaging of the elementary matrices obtained from decomposing the trajectory matrix. Although SSA is a linear method, it can readily be extended to nonlinear forms, such as kernel-based SSA or SSA with autoassociative neural networks. Nonetheless, it has not been used widely in statistical process monitoring as yet, although some studies have provided promising results [2, 35, 36].
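The SSA steps described above (embedding, SVD, and recovery by diagonal averaging) can be sketched as follows; the window length p and the test series are illustrative.

```python
# Basic SSA: embed the series in a trajectory matrix, decompose by SVD into
# elementary matrices, and Hankelize (diagonal-average) each one back into a
# series. The components sum back to the original series exactly.
import numpy as np

def ssa(y, p):
    N = len(y)
    K = N - p + 1
    T = np.column_stack([y[i:i + p] for i in range(K)])  # trajectory matrix (p x K)
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    comps = []
    for i in range(len(s)):
        Ei = s[i] * np.outer(U[:, i], Vt[i])             # elementary matrix
        # diagonal averaging: anti-diagonal j of Ei corresponds to time index j
        comps.append(np.array([np.mean(Ei[::-1].diagonal(j - p + 1))
                               for j in range(N)]))
    return np.array(comps)

t = np.linspace(0, 4 * np.pi, 200)
y = np.sin(t) + 0.1 * np.random.default_rng(2).standard_normal(200)
comps = ssa(y, p=20)
recon = comps.sum(axis=0)   # exact reconstruction of the original series
```

Grouping the leading components then yields the smoothed modes of the series.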

Table 1 gives a summary of multiscale methods that have been considered in process monitoring schemes over the last two decades.

Methodology Comment References
Wavelets Variable decomposition with wavelets before building PCA models [37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]
Singular spectrum analysis Different variants have been proposed [2, 36, 48]

Table 1.

Data preprocessing methodologies for multiscale process monitoring.

2.5 Phase space methods

Phase space methods rely on the embedding of the data in a so-called phase space by the use of delayed vector methods, that is, a time series $\mathbf{y} \in \mathbb{R}^{N \times 1}$ is embedded in a matrix $\mathbf{X} \in \mathbb{R}^{(N-m+1) \times m}$ with rows $[x_t, x_{t-k}, \ldots, x_{t-k(m-1)}]$. Embedding can also be done by the use of principal components or singular value decomposition of $\mathbf{X}$, where k = 1 and m is comparatively large. In the latter case, the scores of the eigenvectors would represent an orbit or attractor with some geometrical structure, depending on the frequencies with which different regions of the phase space are visited. The topology of this attractor is a direct result of the underlying dynamics of the system being observed, and changes in the topology are usually an indication of a change in the parameters or structure of the system dynamics. Therefore, descriptors of the attractor geometry can serve as sensitive diagnostic variables to monitor abnormal system behavior.

2.5.1 Phase space attractor descriptors

For process monitoring purposes, the data captured in a moving window are embedded in a phase space, and descriptors such as correlation dimension [49, 50, 51], Lyapunov exponents, and information entropy [49] have been proposed to monitor deterministic or potentially chaotic systems. These approaches have not found widespread adoption in the industry yet, since the reliability of the descriptors may be compromised by high levels of signal noise.

2.5.2 Complex networks

Process circuits or plants lend themselves naturally to representation by networks, and process monitoring schemes can exploit this. For example, Cai et al. [52] have essentially considered a lagged trajectory matrix in the form of a complex network, whereby the variables and their lagged versions served as network vertices. The edges of the network were determined by means of kernel canonical correlation analysis (a nonlinear approach to correlation relationships between sets of variables). Features were extracted from the variables based on the dynamic average degree of each vertex in the network. A standard PCA model, as described in Section 1.1, was subsequently used to monitor the process. Case studies have indicated that this could yield considerable improvement in the reliability of the model to detect process disturbances.

2.5.3 Local recurrence quantification analysis

Any given sequence of numbers or time series can be characterized by a similarity matrix containing measures of similarity (e.g., Euclidean distances) between all pairwise points in the time series. A recurrence matrix is generated by binary quantization of the similarity matrix, based on a user-specified threshold value. This thresholded matrix can be portrayed graphically as a recurrence plot, amenable to qualitative interpretation. The recurrence matrix, consisting of zeros and ones, can also be used as a basis to extract features that are representative of the dynamic behavior of the time series. This approach is widely referred to as recurrence quantification analysis, and in process engineering, it has mainly been used in the description of electrochemical phenomena and corrosion [53, 54, 55, 56, 57, 58], but in principle it has general applicability to any dynamic system.

2.5.4 Global recurrence quantification analysis

More recent extensions of recurrence quantification analysis have been considered by using the unthresholded similarity matrix as a basis for feature extraction. This is also referred to as global, as opposed to (local) recurrence quantification described in Section 2.5.3. The resulting recurrence plot can consequently be treated as an artificial image amenable to analysis by a large variety of algorithms normally applied to textural images, as discussed in more detail in Section 4.
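The contrast between the local (thresholded) and global (unthresholded) variants can be sketched as follows for a univariate series; the threshold value and test signal are illustrative.

```python
# Local vs. global recurrence analysis: the local variant thresholds the
# similarity matrix into a binary recurrence matrix; the global variant keeps
# the full distance matrix as an artificial image for feature extraction.
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 8 * np.pi, 300)
x = np.sin(t) + 0.05 * rng.standard_normal(300)

D = np.abs(x[:, None] - x[None, :])   # similarity (distance) matrix
eps = 0.2                             # user-specified threshold
R = (D <= eps).astype(int)            # local RQA: binary recurrence matrix
G = D                                 # global RQA: unthresholded "image"

# A simple RQA descriptor: recurrence rate (fraction of recurrent points)
rec_rate = R.mean()
```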


3. Supervised feature extraction

3.1 Autoregressive models

Autocorrelated data can be addressed by fitting models to the data and analyzing the residuals, instead of the variables. With ARIMA models, crosscorrelation between the variables is not accounted for, and although multivariate models can also be employed using this approach, it becomes a complex task when there are many variables (m > 10), owing to the high number of parameters that must be estimated, as well as the presence of crosscorrelation [3, 59].

Apart from ARIMA models, other models, such as neural networks [60, 61, 62], decision trees [63], and just-in-time-learning with PCA [64], have also been proposed.

3.2 State space models

If it is assumed that the data matrix X contains all the dynamic information of the system, then the use of predictive models can be viewed as an attempt to remove all the dynamic information from the system to yield Gaussian residuals that can be monitored in the normal way. State space models offer a principled approach for the identification of the subspaces containing the data. This can be summarized as follows

$$\mathbf{x}_{k+1} = f(\mathbf{x}_k) + \mathbf{w}_k \quad \text{(E6)}$$
$$\mathbf{y}_k = g(\mathbf{x}_k) + \mathbf{v}_k \quad \text{(E7)}$$

where $\mathbf{x}_k$ and $\mathbf{y}_k$ are the respective state and measurement vectors of the system, and $\mathbf{w}_k$ and $\mathbf{v}_k$ are the plant disturbances and measurement errors, respectively, at time k. State space models and their variants have been considered by several authors [65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75].
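A toy scalar example of Eqs. 6–7 with a linear model illustrates why such residuals are useful for monitoring: once the known dynamics are removed, the one-step residuals are close to white noise, whereas the raw measurements remain strongly autocorrelated. All parameter values here are illustrative.

```python
# Scalar linear state space system: x_{k+1} = A x_k + w_k, y_k = C x_k + v_k.
# With the model known, one-step residuals e_k = y_k - C*A*x_{k-1} reduce to
# C*w_k + v_k, i.e., white noise suitable for conventional monitoring.
import numpy as np

rng = np.random.default_rng(4)
A, C = 0.9, 1.0
x, xs, ys = 0.0, [], []
for _ in range(2000):
    x = A * x + 0.1 * rng.standard_normal()          # w_k: plant disturbance
    xs.append(x)
    ys.append(C * x + 0.1 * rng.standard_normal())   # v_k: measurement error
xs, ys = np.array(xs), np.array(ys)

e = ys[1:] - C * A * xs[:-1]                  # one-step prediction residuals
r1 = np.corrcoef(e[:-1], e[1:])[0, 1]         # residual lag-1 autocorrelation
ry = np.corrcoef(ys[:-1], ys[1:])[0, 1]       # raw signal lag-1 autocorrelation
```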

3.3 Machine learning models

In principle, machine learning models are better able to deal with complex nonlinear systems than linear models, and some authors have considered the use of these approaches. For example, Chen and Liao [62] have used a multilayer perceptron neural network to remove the nonlinear and dynamic characteristics of processes to generate residuals that could be used as input to a PCA model for the construction of simple monitoring charts. Guh and Shiue [63] have used a decision tree to detect shifts in the multivariate means of process data. Auret and Aldrich [48] have considered the use of random forests in the detection of change points in process systems. In addition, Aldrich and Auret [2] have compared the use of random forests with autoassociative neural networks and singular spectrum analysis in a conventional process monitoring framework.

The application of deep learning in process monitoring is an emerging area of research that shows particular promise. This includes the use of stacked autoencoders [76], deep long short-term memory (LSTM) neural networks [77], and convolutional neural networks. Table 2 gives an overview of the feature extraction methods that have been investigated over the last few decades.

Class Method References
Unsupervised feature extraction Dynamic PCA Linear PCA [4, 8, 78, 79]
Partial PCA [12, 13]
Kernel PCA [80]
Multiscale [81]
Dynamic ICA ICA [14, 82, 83, 84, 85, 86, 87]
Kernel ICA [88]
Slow feature analysis [22, 23, 24]
ICA Standard [89]
Kernel [88]
Phase space and related methods Attractor descriptors [49, 50, 51, 90]
Recurrence quantification analysis [2, 53, 54, 55, 56, 57, 58]
Complex networks [52]
Dissimilarity [84, 91, 92]
Supervised feature extraction Autoregressive models [59, 71, 93]
State space models [65, 66, 67, 68, 69, 70, 76, 77, 93, 94, 95]
Machine learning Conventional [2, 48, 60, 61, 62]
Deep learning [76, 77, 96]

Table 2.

Approaches to the monitoring of continuous dynamic process systems.


4. Case study: Tennessee Eastman process

Finally, as an example of the application of a process monitoring scheme incorporating feature extraction from time series data in a moving window, the following study can be considered. It is based on the Tennessee Eastman benchmark process widely used in these types of studies. The feature extraction process considered here is an extension of the recurrence quantification analysis discussed in Section 2.5.3. Instead of using thresholded recurrence plots, unthresholded or global recurrence plots are considered, as explained in more detail below.

4.1 Tennessee Eastman process data

The Tennessee Eastman (TE) process was proposed by Downs and Vogel [97] and has been used as a benchmark in numerous process control and monitoring studies [98]. It captures the dynamic behavior of an actual chemical process, the layout of which is shown in Figure 3.

Figure 3.

Process flow of Tennessee Eastman benchmark problem.

The plant consists of five units, namely a reactor, a condenser, a compressor, a stripper, and a separator, as well as eight components (four gaseous reactants A, C, D, and E, one inert component B, and three liquid products F, G, and H) [97]. In this instance, the plant-wide control structure suggested by Lyman and Georgakis [99] was used to simulate the process and to generate data related to varying operating conditions. The data set is available at

A total of four data sets were used, that is, one data set associated with normal operating conditions (NOC) and the remaining three associated with three different fault conditions. The TE process comprises 52 variables, of which 22 are continuous process measurements, 19 are composition measurements, and the remaining 11 are manipulated variables. These variables are presented in Table 3. Each data set consisted of 960 measurements sampled at 3 min intervals.

Process measurement Composition measurement Manipulated variable
Variable Description Variable Description Variable Description
1 A Feed 23 Reactor feed component A 42 D feed flow
2 D Feed 24 Reactor feed component B 43 E feed flow
3 E Feed 25 Reactor feed component C 44 A feed flow
4 Total Feed 26 Reactor feed component D 45 Total feed flow
5 Recycle flow 27 Reactor feed component E 46 Compressor recycle valve
6 Reactor feed rate 28 Reactor feed component F 47 Purge valve
7 Reactor pressure 29 Purge component A 48 Separator product liquid flow
8 Reactor level 30 Purge component B 49 Stripper product liquid flow
9 Reactor temperature 31 Purge component C 50 Stripper steam valve
10 Purge rate 32 Purge component D 51 Reactor cooling water flow
11 Separator temperature 33 Purge component E 52 Condenser cooling water flow
12 Separator level 34 Purge component F
13 Separator pressure 35 Purge component G
14 Separator underflow 36 Purge component H
15 Stripper level 37 Product component D
16 Stripper pressure 38 Product component E
17 Stripper underflow 39 Product component F
18 Stripper temperature 40 Product component G
19 Stripper steam flow 41 Product component H
20 Compressor work
21 Reactor cooling water outlet temperature
22 Separator cooling water outlet temperature

Table 3.

Description of variables in Tennessee Eastman process.

The NOC samples were used to construct an off-line process monitoring model that consisted of a moving window of length b sliding along the time series with a step size s. The three fault conditions are summarized in Table 4. Fault conditions 3, 9, and 15 are the most difficult to detect, and many fault diagnostic approaches fail to do so reliably.

Fault number Description Type
3 D feed temperature Step change
9 Reactor feed D temperature Random variation
15 Condenser cooling water valve Sticking

Table 4.

Description of faults 3, 9, and 15 in the Tennessee Eastman process.

In this case study, the approach previously proposed by Bardinas et al. [96] is applied to the three fault conditions in the TE process. The methodology can be briefly summarized as shown in Figure 4.

Figure 4.

Process monitoring methodology (after Bardinas et al., 2018). (A) Time series matrix, (B) Segmented time series matrix, (C) Distance matrices, and (D) Features and labels.

A window of user defined length b slides along the time series (A) with a user defined step size s, yielding time series segments (B), each of which can be represented by a similarity matrix (C) that is subsequently considered as an image from which features can be extracted via algorithms normally used in multivariate image analysis (D).
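The windowing scheme of Figure 4 (steps A to C) can be sketched as follows; the function name and the values of b and s below are illustrative, not those used in the case study.

```python
# Slide a window of length b with step s along a multivariate series and
# compute one pairwise-distance matrix (the artificial "image") per segment.
import numpy as np

def segment_distance_matrices(X, b, s):
    """Return one (b x b) Euclidean distance matrix per window position."""
    mats = []
    for start in range(0, len(X) - b + 1, s):
        W = X[start:start + b]
        diff = W[:, None, :] - W[None, :, :]
        mats.append(np.sqrt((diff**2).sum(axis=2)))
    return mats

X = np.random.default_rng(5).standard_normal((200, 4))  # 200 samples, 4 variables
mats = segment_distance_matrices(X, b=50, s=25)
```

Each matrix in mats is then treated as an image from which GLCM or LBP features are extracted.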

4.2 Feature extraction

Two sets of features were extracted from the similarity or distance matrices, namely features from the gray level co-occurrence matrices of the images, as well as local binary pattern features, as briefly discussed below.

4.2.1 Gray level co-occurrence matrices (GLCMs)

GLCMs capture the distribution of gray level pairs of neighboring pixels in an image, based on the spatial relationships between the pixels. More formally, if $y_{ij}$ is an element of a GLCM associated with an image I of size $R \times S$ having L gray levels, then $y_{ij}$ can be defined as

$$y_{ij} = \sum_{r=1}^{R} \sum_{s=1}^{S} \begin{cases} 1, & \text{if } I(r,s) = i \text{ and } I(r+\Delta r, s+\Delta s) = j \\ 0, & \text{otherwise} \end{cases} \quad \text{(E8)}$$

where $(r, s)$ and $(r+\Delta r, s+\Delta s)$ denote the positions of the reference and neighboring pixels, respectively. From this matrix, various textural descriptors can be defined. Only four of these were used, as defined by Haralick et al. [100], namely contrast, correlation, energy, and homogeneity.
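A hand-rolled sketch of Eq. 8 with two of the Haralick descriptors follows; in practice, library routines such as scikit-image's graycomatrix/graycoprops would typically be used. The toy image and offset are illustrative.

```python
# GLCM for a small image with L gray levels and a (dr, ds) neighbor offset,
# followed by the Haralick contrast and energy descriptors.
import numpy as np

def glcm(I, L, dr, ds):
    G = np.zeros((L, L), dtype=int)
    R, S = I.shape
    for r in range(R):
        for s in range(S):
            rr, ss = r + dr, s + ds
            if 0 <= rr < R and 0 <= ss < S:
                G[I[r, s], I[rr, ss]] += 1   # count gray level pair (i, j)
    return G

I = np.array([[0, 0, 1],
              [0, 1, 1],
              [2, 2, 1]])
G = glcm(I, L=3, dr=0, ds=1)         # horizontal neighbor offset
P = G / G.sum()                      # normalized co-occurrence probabilities
i, j = np.indices(P.shape)
contrast = np.sum((i - j)**2 * P)    # Haralick contrast
energy = np.sum(P**2)                # Haralick energy (angular second moment)
```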

4.2.2 Local binary patterns (LBPs)

LBPs are nonparametric descriptors of the local structure of the image [101]. The LBP operator is defined for a pixel in the image as a set of binary values obtained by comparing the center pixel intensity with its neighboring pixels. If a neighboring pixel's intensity equals or exceeds that of the center pixel, this pixel is set to 1 (otherwise 0). Formally, given the central pixel’s coordinates $(x_0, y_0)$, the resulting LBP can be obtained in decimal form as

$$\mathrm{LBP}(x_0, y_0) = \sum_{p=0}^{P-1} s(i_p - i_0) \, 2^p \quad \text{(E9)}$$

where $i_0$ is the gray level intensity of the central pixel and $i_p$ that of its p-th neighbor, for P neighbors in total. Moreover, the function s is defined as

$$s(x) = \begin{cases} 0, & \text{if } x < 0 \\ 1, & \text{if } x \geq 0 \end{cases} \quad \text{(E10)}$$
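Eqs. 9 and 10 can be sketched for a single pixel and its 8-neighborhood as follows; the neighbor ordering and toy image are illustrative.

```python
# LBP code of one pixel: threshold the 8 neighbors against the center
# intensity (Eq. 10) and weight the bits by powers of two (Eq. 9).
import numpy as np

def lbp_pixel(I, r, c):
    """Decimal LBP code of pixel (r, c), neighbors taken clockwise from top-left."""
    neighbors = [I[r-1, c-1], I[r-1, c], I[r-1, c+1], I[r, c+1],
                 I[r+1, c+1], I[r+1, c], I[r+1, c-1], I[r, c-1]]
    center = I[r, c]
    return sum(int(n >= center) << p for p, n in enumerate(neighbors))

I = np.array([[6, 5, 2],
              [7, 6, 1],
              [9, 8, 7]])
code = lbp_pixel(I, 1, 1)   # LBP code of the center pixel
```

Applying the operator to every interior pixel and histogramming the codes yields the LBP feature vector of the image.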

4.3 Selection of window length and step size

Apart from the selection of a feature extraction method, one of the main choices that needs to be made in the process monitoring scheme is the length of the sliding window. If this is too small, the essential dynamics of the time series will not be captured. On the other hand, if it is too large, it will result in a considerable lag before any change in the process can be detected. There is also the possibility that transient changes may go undetected altogether. In the case of a moving window, the step size of the moves also needs to be considered. The selection of these two parameters can be done by means of a grid search, the results of which are shown in Figure 5.

Figure 5.

Grid search optimization of the window length (b) and step size (s).

As indicated in Figure 5, the optimal window size was b = 1000 and the step size was s = 20 for both the GLCM and LBP features that were used as predictors. With these settings, a 500-tree random forest model [102] was able to differentiate between normal operating conditions and the three fault classes with a reliability of approximately 82%.

In Figure 6, principal component score plots of the two optimal feature sets are shown. The large LBP feature set could not be visualized reliably, as the first two principal components could only capture 52.5% of the variance of the features. The variance in the smaller GLCM feature set, comprising four features, could be captured with high reliability by its first two principal components. Here, the differences between the normal operating data (“0” legend) and the other fault conditions (“3,” “9,” and “15”) are clear.

Figure 6.

Principal component score plots of GLCM (left) and LBP features (right).

4.4 Discussion

The approach outlined in Section 2.5.4 and considered in more detail in the above case study is an extension of recurrence quantification analysis, with the advantage that information is not lost by thresholding the similarity matrix of the signal. Also, while thresholding does not preclude the use of a wide range of feature extraction algorithms, recurrence quantification has mostly been applied to dynamic systems based on a set of engineered features that allow some modicum of physical interpretation.

In most diagnostic systems, this is not essential, and therefore more predictive feature sets may be constructed. These features could be engineered, as was considered in the case study, or they could be learned, by taking advantage of state-of-the-art developments in deep learning.

In addition, the following general observations can be made not only with regard to the approach considered in this case study but also to other approaches reviewed in this chapter.

  • Most of the nonlinear approaches used in steady state systems can be used in dynamic systems, and as a consequence, principal and independent component analysis and kernel methods have figured strongly in recent advances in dynamic process monitoring.

  • With the routine acquisition of ever larger volumes of data and more complex processing, it can be expected that the field will continue to benefit from advances in machine learning. The application of deep learning methods in particular is a highly promising emerging area of research.

  • Likewise, dynamic process monitoring is also likely to continue to benefit from closely related fields, such as process condition monitoring, structural health monitoring, change point detection, and novelty detection in other engineering or technical systems.

  • As with steady state process monitoring, fault identification has received comparatively little attention to date.


5. Conclusions and future work

Data-driven fault diagnosis of dynamic systems has advanced considerably over the last decade or more. In this chapter, the large variety of algorithms currently available has been discussed in terms of a feature extraction problem, where features are extracted from the data captured by sliding a window across the time series or, in some cases, by use of a fixed window. These features could be used in statistical process monitoring frameworks that are well established for steady state systems.

In addition, an extension of a recent approach to nonlinear time series analysis, namely recurrence quantification analysis, has been considered and shown to be an effective means of monitoring dynamic process systems, such as represented by the Tennessee Eastman benchmark problem in chemical engineering.

As mentioned in Section 4.4, a wide range of feature extraction algorithms can be used with unthresholded or global recurrence quantification analysis. In future work, the application of convolutional neural networks to extract features from global recurrence plots will be considered. This does not necessarily require a large amount of data, as pretrained networks, such as the AlexNet, ResNet, and VGG architectures, could possibly be used as is, in what would essentially be a texture analysis problem, similar to the work done by Fu and Aldrich [103, 104] on the recognition of flotation froth textures, for example.
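As an illustration of the texture-analysis view, gray level co-occurrence features [100] can be computed directly from a quantized global recurrence plot. The sketch below writes out the co-occurrence counting in plain numpy (libraries such as scikit-image provide equivalent functionality); the random image is a stand-in for a recurrence plot scaled to [0, 1]:

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one offset (dx, dy)."""
    q = np.minimum((img * levels).astype(int), levels - 1)  # quantize
    a = q[:q.shape[0] - dy, :q.shape[1] - dx]               # reference pixels
    b = q[dy:, dx:]                                         # offset neighbours
    C = np.zeros((levels, levels))
    np.add.at(C, (a.ravel(), b.ravel()), 1)                 # count co-occurrences
    return C / C.sum()                                      # joint probabilities

rng = np.random.default_rng(3)
img = rng.random((64, 64))        # stand-in for a global recurrence plot

P = glcm(img)
i, j = np.indices(P.shape)
contrast = ((i - j) ** 2 * P).sum()   # Haralick contrast
energy = (P ** 2).sum()               # Haralick energy (angular second moment)
```

Statistics such as `contrast` and `energy`, computed over several offsets, form the kind of compact texture feature vector used as predictors in the case study.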


Conflict of interest

The author declares no conflict of interest in this contribution.


  1. 1. Russell EL, Chiang LH, Braatz RD. Data-Driven Techniques for Fault Detection and Diagnosis in Chemical Processes. London: Springer; 2000
  2. 2. Aldrich C, Auret L. Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods. London: Springer-Verlag Ltd; 2013. Series: Advances in Pattern Recognition. ISBN: 978-1-4471-5184-5
  3. 3. De Ketelaere B, Hubert M, Schmitt E. A review of PCA-based statistical process monitoring methods for time-dependent, high-dimensional data. (Downloaded from: on 26 December 2014). 2013
  4. 4. Ku W, Storer RH, Georgakis C. Disturbance detection and isolation by dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems. 1995;30(1):179-196
  5. 5. Tsung F. Statistical monitoring and diagnosis of automatic controlled processes using dynamic principal component analysis. International Journal of Production Research. 2000;38(3):625-637
  6. 6. Kruger U, Zhou Y, Irwin GW. Improved principal component monitoring of large scale processes. Journal of Process Control. 2004;14(8):879-888
  7. 7. Luo R, Misra M, Himmelblau DM. Sensor fault detection via multiscale analysis and dynamic PCA. Industrial and Engineering Chemistry Research. 1999;38(4):1489-1495
  8. 8. Rato TJ, Reis MS. Defining the structure of DPCA models and its impact on process monitoring and prediction activities. Chemometrics and Intelligent Laboratory Systems. 2013;125:74-80
  9. 9. Choi SW, Lee I-B. Nonlinear dynamic process monitoring based on dynamic kernel PCA. Chemical Engineering Science. 2004;59(24):5897-5908
  10. 10. Cui P, Li J, Wang G. Improved kernel principal component analysis for fault detection. Expert Systems with Applications. 2008;34(2):1210-1219
  11. 11. Shao J-D, Rong G. Nonlinear process monitoring based on maximum variance unfolding projections. Expert Systems with Applications. 2009;36(8):11332-11340
  12. 12. Li R, Rong G. Fault isolation by partial dynamic principal component analysis in dynamic process. Chinese Journal of Chemical Engineering. 2006a;14(4):486-493
  13. 13. Li R, Rong G. Dynamic process fault isolation by partial DPCA. Chemical and Biochemical Engineering Quarterly. 2006b;20(1):69-77
  14. 14. Stefatos G, Hamza AB. Dynamic independent component analysis approach for fault detection and diagnosis. Expert Systems with Applications. 2010;37(12):8606-8617
  15. 15. Hsu C-C, Chen M-C, Chen L-S. A novel process monitoring approach with dynamic independent component analysis. Control Engineering Practice. 2010;18:242-253
  16. 16. Cai L, Tian X, Zhang N. Non-Gaussian process fault detection method based on modified KICA. CIESC Journal. 2012;63(9):2864-2868. (In Chinese)
  17. 17. Zhang Y, Qin SJ. Fault detection of nonlinear processes using multiway kernel independent component analysis. Industrial and Engineering Chemistry Research. 2007;46(23):7780-7787
  18. 18. Zhang Y. Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM. Chemical Engineering Science. 2009;64(5):801-811
  19. 19. Zhang Y, Li S, Teng Y. Dynamic process monitoring with recursive kernel principal component analysis. Chemical Engineering Science. 2012;72:78-86
  20. 20. Wiskott L, Berkes P, Franzius M, Sprekeler H, Wilbert N. Slow feature analysis. Scholarpedia. 2011;6(4):5282. Available from:
  21. 21. Zhang N, Tian X, Cai L, Deng X. Process fault detection based on dynamic kernel slow feature analysis. Computers and Electrical Engineering. 2015;41:9-17
  22. 22. Shang C, Huang B, Yang F, Huang D. Slow feature analysis for monitoring and diagnosis of control performance. Journal of Process Control. 2016;39:21-34
  23. 23. Guo F, Shang C, Huang B, Wang K, Yang F, Huang D. Monitoring of operating point and process dynamics via probabilistic slow feature analysis. Chemometrics and Intelligent Laboratory Systems. 2016;151:115-125
  24. 24. Fourie SH, De Vaal PL. Advanced process monitoring using an on-line non-linear multiscale principal component analysis methodology. Computers and Chemical Engineering. 2000;24(2–7):755-760
  25. 25. Rosen C, Lennox JA. Multivariate and multiscale monitoring of wastewater treatment operation. Water Research. 2001;35(14):3402-3410
  26. 26. Yoon S, MacGregor JF. Principal component analysis of multiscale data for process monitoring and fault diagnosis. AICHE Journal. 2004;50(11):2891-2903
  27. 27. Lee DS, Park JM, Vanrolleghem PA. Adaptive multiscale principal component analysis for on-line monitoring of a sequencing batch reactor. Journal of Biotechnology. 2005;116(2):195-210
  28. 28. Bakshi BR. Multiscale PCA with application to multivariate statistical process monitoring. AICHE Journal. 1998;44:1596
  29. 29. Bakshi BR. Multiscale analysis and modeling using wavelets. Journal of Chemometrics. 1999;13:415-434
  30. 30. Choi SW, Morris AJ, Lee IB. Nonlinear multiscale modelling for fault detection and identification. Chemical Engineering Science. 2008;63(8):2252-2266
  31. 31. Xuemin T, Xiaogang D. A fault detection method using multi-scale kernel principal component analysis. In: Proceedings of the 27th Chinese Control Conference. Kunming, Yunnan, China; 2008
  32. 32. Golyandina N, Nekrutkin V, Zhigljavsky A. Analysis of Time Series Structure: SSA and Related Techniques. New York, London: Chapman & Hall/CRC; 2001
  33. 33. Jemwa GT, Aldrich C. Classification of process dynamics with Monte Carlo singular spectrum analysis. Computers and Chemical Engineering. 2006;30:816-831
  34. 34. Hassani H. Singular spectrum analysis: Methodology and comparison. Journal of Data Science. 2007;5:239-257
  35. 35. Aldrich C, Jemwa GT, Krishnannair S. Multiscale process monitoring with singular spectrum analysis. Proceedings of the 12th IFAC Symposium on Automation in Mining, Mineral and Metal Processing. Vol. 12(1). Quebec City, QC: Canada. 2007. pp. 167-172. Code 85804
  36. 36. Krishnannair S, Aldrich C, Jemwa GT. Fault detection in process systems with singular spectrum analysis. Chemical Engineering Research and Design. 2016;113:151-168
  37. 37. Alexander SM, Gor TB. Monitoring, diagnosis and control of industrial processes. Computers and Industrial Engineering. 1998;35(1–2):193-196
  38. 38. Kano M, Nagao K, Hasebe S, Hashimoto I, Ohno H, Strauss R, et al. Comparison of statistical process monitoring methods: Application to the Eastman challenge problem. Computers and Chemical Engineering. 2000;24:175-181
  39. 39. Misra M, Yue HH, Qin SJ, Ling C. Multivariate process monitoring and fault diagnosis by multiscale PCA. Computers and Chemical Engineering. 2002;26(9):1281-1293
  40. 40. Li X, Yu Q, Wang J. Process monitoring based on wavelet packet principal component analysis. Computer Aided Chemical Engineering. 2003;14:455-460
  41. 41. Ganesan R, Das T, Venkataraman V. Wavelet-based multiscale statistical process monitoring: A literature review. IIE Transactions. 2004;36:787-806
  42. 42. Geng Z, Zhu Q. Multiscale nonlinear principal component analysis (nlpca) and its application for chemical process monitoring. Industrial and Engineering Chemistry Research. 2005;44(10):3585-3593
  43. 43. Wang D, Romagnoli JA. Robust multi-scale principal components analysis with applications to process monitoring. Journal of Process Control. 2005;15:869-882
  44. 44. Maulud A, Wang D, Romagnoli JA. A multi-scale orthogonal nonlinear strategy for multi-variate statistical process monitoring. Journal of Process Control. 2006;16(7):671-683
  45. 45. Zhang Y, Hu Z. Multivariate process monitoring and analysis based on multi-scale KPLS. Chemical Engineering Research and Design. 2011;89(12):2667-2678
  46. 46. Wang T, Liu X, Zhang Z. Characterization of chaotic multiscale features on the time series of melt index in industrial propylene polymerization system. Journal of the Franklin Institute. 2014;351:878-906
  47. 47. Yang Y, Li X, Liu X, Chen X. Wavelet kernel entropy component analysis with application to industrial process monitoring. Neurocomputing. 2015;147(1):395-402
  48. 48. Auret L, Aldrich C. Change point detection in time series data with random forests. Control Engineering Practice. 2010;18:990-1002
  49. 49. Legat A, Dolecek V. Chaotic analysis of electrochemical noise measured on stainless steel. Journal of the Electrochemical Society. 1995;142(6):1851-1858
  50. 50. Aldrich C, Qi BC, Botha PJ. Analysis of electrochemical noise with phase space methods. Minerals Engineering. 2006;19(14):1402-1409
  51. 51. Xia D, Song S, Wang J, Shi J, Bi H, Gao Z. Determination of corrosion types from electrochemical noise by phase space reconstruction theory. Electrochemistry Communications. 2012;15(1):88-92
  52. 52. Cai E, Liu D, Liang L, Xu G. Monitoring of chemical industrial processes using integrated complex network theory with PCA. Chemometrics and Intelligent Laboratory Systems. 2015;140:22-35
  53. 53. Cazares-Ibáñez E, Vázquez-Coutiño AG, García-Ochoa E. Application of recurrence plots as a new tool in the analysis of electrochemical oscillations of copper. Journal of Electroanalytical Chemistry. 2005;583(1):17-33
  54. 54. Acuña-González N, Garcia-Ochoa E, González-Sanchez J. Assessment of the dynamics of corrosion fatigue crack initiation applying recurrence plots to the analysis of electrochemical noise data. International Journal of Fatigue. 2008;30:1211-1219
  55. 55. Hou Y, Aldrich C, Lepkova K, Suarez LM, Kinsella B. Monitoring of carbon steel corrosion by use of electrochemical noise and recurrence quantification analysis. Corrosion Science. 2016;112:63-72
  56. 56. Hou Y, Aldrich C, Lepkova K, Machuca LL, Kinsella B. Effect of electrode size on the electrochemical noise measured in different corrosion systems. Electrochimica Acta. 2017;256:337-347
  57. 57. Hou Y, Aldrich C, Lepkova K, Kinsella B. Detection of under deposit corrosion in CO2 environment by electrochemical noise and recurrence quantification analysis. Electrochimica Acta. 2018a;274:160-169
  58. 58. Hou Y, Aldrich C, Lepkova K, Kinsella B. Identifying corrosion of carbon steel buried in iron ore and coal cargoes based on recurrence quantification analysis of electrochemical noise. Electrochimica Acta. 2018b;283:212-220
  59. 59. Xie L, Zhang J, Wang S. Investigation of dynamic multivariate chemical process monitoring. Chinese Journal of Chemical Engineering. 2006;14(5):559-568
  60. 60. Markou M, Singh S. Novelty detection: a review—Part 2: Neural network based approaches. Signal Processing. 2003;83(12):2499-2521
  61. 61. Augusteijn MF, Folkert BA. Neural network classification and novelty detection. International Journal of Remote Sensing. 2002;23(14):2891-2902
  62. 62. Chen J, Liao C-M. Dynamic process fault monitoring based on neural network and PCA. Journal of Process Control. 2002;12(2):277-289
  63. 63. Guh R, Shiue Y. An effective application of decision tree learning for on-line detection of mean shifts in multivariate control charts. Computers and Industrial Engineering. 2008;55(2):475-493
  64. 64. Cheng C, Chiu M. Nonlinear process monitoring using JITL-PCA. Chemometrics and Intelligent Laboratory Systems. 2005;76:1-13
  65. 65. Odiowei PP, Cao Y. State-space independent component analysis for nonlinear dynamic process monitoring. Chemometrics and Intelligent Laboratory Systems. 2010;103:59-65
  66. 66. Odiowei PP, Cao Y. Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimations. IEEE Transactions on Industrial Informatics. 2009b;6(1):36-45
  67. 67. Simoglou A, Argyropoulos P, Martin EB, Scott K, Morris AJ, Taam WM. Dynamic modelling of the voltage response of direct methanol fuel cells and stacks part I: Model development and validation. Chemical Engineering Science. 2001;56:6761-6772
  68. 68. Simoglou A, Martin EB, Morris AJ. Statistical performance monitoring of dynamic multivariate processes using state space modelling. Computers & Chemical Engineering. 2002;26:909-920
  69. 69. Simoglou A, Georgieva P, Martin EB, Morris AJ, Feyo de Azevedo S. On-line monitoring of a sugar crystallization process. Computers & Chemical Engineering. 2005;29:1411-1422
  70. 70. Russell EL, Chiang LH, Braatz RD. Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems. 2000;51:81-93
  71. 71. Negiz A, Cinar A. PLS, balanced, and canonical variate realization techniques for identifying VARMA models in state space. Chemometrics and Intelligent Laboratory Systems. 1997a;38(2):209-221
  72. 72. Stubbs S, Zhang J, Morris AJ. Fault detection in dynamic processes using a simplified monitoring-specific CVA state space approach. Computer Aided Chemical Engineering. 2009;26:339-344
  73. 73. Stubbs S, Zhang J, Morris AJ. Fault detection in dynamic processes using a simplified monitoring-specific CVA state space approach. Computers and Chemical Engineering. 2012;41:77-87
  74. 74. Karoui MF, Alla H, Chatti A. Monitoring of dynamic processes by rectangular hybrid automata. Nonlinear Analysis: Hybrid Systems. 2010;4(4):766-774
  75. 75. Khediri IB, Limam M, Weihs C. Variable window adaptive kernel principal component analysis for nonlinear nonstationary process monitoring. Computers and Industrial Engineering. 2011;61(3):437-446
  76. 76. Zhang Z, Jiang T, Li S, Yan Y. Automated feature learning for nonlinear process monitoring – An approach using stacked denoising autoencoder and k-nearest neighbor rule. Journal of Process Control. 2018;64:49-61
  77. 77. Mehdiyev N, Lahann J, Emrich A, Enke D, Fettke P, Loos P. Time series classification using deep learning for process planning: A case from the process industry. Procedia Computer Science. 2017;114:242-249
  78. 78. Lin WL, Qian Y, Li XX. Nonlinear dynamic principal component analysis for on-line process monitoring and diagnosis. Computers and Chemical Engineering. 2000;24(2–7):423-429
  79. 79. Dobos L, Abonyi J. On-line detection of homogeneous operation ranges by dynamic principal component analysis based time-series segmentation. Chemical Engineering Science. 2012;75:96-105
  80. 80. Liu X, Krüger U, Littler TB, Xie L, Wang S. Moving window kernel PCA for adaptive monitoring of nonlinear processes. Chemometrics and Intelligent Laboratory Systems. 2009;96(2):132-143
  81. 81. Mele FD, Musulin E, Puigjaner L. Supply chain monitoring: A statistical approach. Computer Aided Chemical Engineering. 2005;20:1375-1380
  82. 82. Lee JM, Yoo CK, Lee IB. Statistical monitoring of dynamic processes based on dynamic independent component analysis. Chemical Engineering Science. 2004;59:2995-3006
  83. 83. Odiowei PP, Cao Y. Nonlinear dynamic process monitoring using canonical variate analysis and kernel density estimations. Computer Aided Chemical Engineering. 2009a;27:1557-1562
  84. 84. Rashid MM, Yu J. A new dissimilarity method integrating multidimensional mutual information and independent component analysis for non-Gaussian dynamic process monitoring. Chemometrics and Intelligent Laboratory Systems. 2012b;115:44-58
  85. 85. Cai L, Tian X, Chen S. A process monitoring method based on noisy independent component analysis. Neurocomputing. 2014a;127:231-246
  86. 86. Cai L, Tian X, Zhang N. A kernel time structure independent component analysis method for nonlinear process monitoring. Chinese Journal of Chemical Engineering. 2014b;22(11–12):1243-1253
  87. 87. Cai L, Tian X. A new fault detection method for non-Gaussian process based on robust independent component analysis. Process Safety and Environmental Protection. 2014;92(6):645-658
  88. 88. Fan J, Wang Y. Fault detection and diagnosis of non-linear non-Gaussian dynamic processes using kernel dynamic independent component analysis. Information Sciences. 2014;259:369-379
  89. 89. Chen J, Yu J, Mori J, Rashid MM, Hu G, Yu H, et al. A non-Gaussian pattern matching based dynamic process monitoring approach and its application to cryogenic air separation process. Computers & Chemical Engineering. 2013;58:40-53
  90. 90. Ruschin-Rimini N, Ben-Gal I, Maimon O. Fractal geometry statistical process control for non-linear pattern-based processes. IIE Transactions. 2013;45(4):355-373
  91. 91. Alabi S, Morris A, Martin E. On-line dynamic process monitoring using wavelet-based generic dissimilarity measure. Chemical Engineering Research and Design. 2005;83:698-705
  92. 92. Yunus MYM, Zhang J. Multivariate process monitoring using classical multidimensional scaling and procrustes analysis. IFAC Proceedings Volumes (IFAC-PapersOnline). 2010;9(1):165-170
  93. 93. Negiz A, Cinar A. Statistical monitoring of multivariate dynamic processes with state-space models. AICHE Journal. 1997b;43(8):2002-2020
  94. 94. Alawi A, Morris AJ, Martin EB. Statistical performance monitoring using state space modelling and wavelet analysis. In: Proceedings of the 15th European Symposium on Computer Aided Process Engineering. 2005. pp. 1375-1381
  95. 95. Hill DJ, Minsker BS. Anomaly detection in streaming environmental sensor data: A data-driven modeling approach. Environmental Modelling and Software. 2010;25(9):1014-1022
  96. 96. Bardinas JP, Aldrich C, Napier LFA. Predicting the operational states of grinding circuits by use of recurrence texture analysis of time series data. PRO. 2018;6:17
  97. 97. Downs JJ, Vogel EF. A plant-wide industrial process control problem. Computers and Chemical Engineering. 1993;17(3):245-255
  98. 98. Detroja KP, Gudi RD, Patwardhan SC. Fault detection using correspondence analysis: Application to Tennessee Eastman challenge problem. IFAC Proceedings Volumes. 2006;39(2):705-710
  99. 99. Lyman PR, Georgakis C. Plant-wide control of the Tennessee Eastman problem. Computers and Chemical Engineering. 1995;19(3):321-331
  100. 100. Haralick RM, Shanmugam K, Dinstein IH. Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics. 1973;3(6):610-621
  101. 101. Ojala T, Pietikainen M, Harwood D. A comparative study of texture measures with classification based on featured distribution. Pattern Recognition. 1996;29(1):51-59
  102. 102. Breiman L. Random forests. Machine Learning. 2001;45(1):5-32
  103. 103. Fu Y, Aldrich C. Froth image analysis by use of transfer learning and convolutional neural networks. Minerals Engineering. 2018;115:68-78
  104. 104. Fu Y, Aldrich C. Flotation froth image recognition with convolutional neural networks. Minerals Engineering. 2019;132:183-190
