On-Line Monitoring of Batch Process with Multiway PCA/ICA


Introduction
Batch processes play an important role in the production and processing of low-volume, high-value products such as specialty polymers, pharmaceuticals and biochemicals. Generally, a batch process is a finite-duration process that involves charging the batch vessel with a specified recipe of materials, processing them under controlled conditions according to specified trajectories of process variables, and discharging the final product from the vessel.
Batch processes generally exhibit variations in the specified trajectories, errors in the charging of the recipe of materials, and disturbances arising from variations in impurities. If such problems are not detected and remedied in time, the abnormal conditions will degrade the quality of at least the current batch, and possibly of subsequent batches. Batch processes therefore need an effective strategy of real-time, on-line monitoring that detects and diagnoses faults and hidden troubles early, before the batch is completed or subsequent batches are produced, and that identifies the causes of the problems for the sake of safety and quality.
Based on multivariate statistical analysis, several chemometric techniques have been proposed for online monitoring and fault detection in batch processes. Nomikos and MacGregor (1994, 1995) first developed a powerful approach known as multiway principal component analysis (MPCA) by extending principal component analysis (PCA) to three-dimensional batch data. The main idea of their approach is to compress the normal batch data and extract information from it by projecting the process-variable trajectories onto a low-dimensional latent-variable space that summarizes both the variables and their time trajectories. Once normal batch behaviour has been established, a batch process can be monitored by comparing the time progression of its projections in the reduced space with that of the normal batch data. Several studies have investigated applications of MPCA (Chen & Wang, 2010; Jung-hui & Hsin-hung, Chen, 2006; Kosanovich et al., 1996; Kourti, 2003; Westerhuis et al., 1999).
In some cases many of the variables monitored in a process are not independent, and may be combinations of independent variables that are not measured directly. Independent component analysis (ICA) can extract these underlying factors or components from non-Gaussian multivariate statistical data, and defines a generative model for the observed data in which the variables are assumed to be linear or nonlinear mixtures of unknown latent variables called independent components (ICs) (Ikeda and Toyama, 2000). Whereas PCA captures the variance of the data and extracts uncorrelated latent variables from correlated data, ICA seeks to extract the separate ICs that constitute the variables. Furthermore, ICA imposes no orthogonality constraint, unlike PCA, whose direction vectors must be orthogonal. Yoo et al. (2004) extended ICA to batch processes by proposing on-line batch monitoring using multiway independent component analysis (MICA), and argued that ICA may reveal more information in non-Gaussian data than PCA.
The approach proposed by Nomikos and MacGregor (1994, 1995) rests on the strong assumption that all batches have equal duration and are synchronized. In practice, however, the operational period of almost every batch differs from the others because of batch-to-batch variations in impurities, in the initial charges of the recipe components, and in heat removal capability due to seasonal change, so operators have to adjust the operating time to obtain the desired product quality. Several methods deal with different durations for the MPCA algorithm. However, neither stretching all the data to the maximum length by simply repeating the last measurements, nor directly cutting all 'redundant trajectories' down to the minimum length, yields a sound process model. Kourti et al. (1996) used an indicator variable, against which the other variables are stretched or compressed, and applied it to an industrial batch polymerization process. Kassidas et al. (1998) presented an effective dynamic time warping (DTW) technique for synchronizing trajectories, which flexibly transforms them for modelling and monitoring within the MPCA framework. DTW appropriately translates, expands and contracts the process measurements to generate trajectories of equal duration; it relies on the principle of optimality of dynamic programming to compute the distance between two trajectories while time-aligning them (Labiner et al., 1978). Chen and Liu (2000) put forward an approach that transforms all the variables of a batch into a series of orthonormal coefficients by orthonormal function approximation (OFA), and then uses those coefficients for MPCA and multiway partial least squares (MPLS) modelling and monitoring (Chen and Liu, 2000, 2001). Each group of extracted coefficients can be regarded as a condensed form of its source trajectory, and the subsequent PCA projection of the coefficients reveals the variation of the process well.
For online monitoring with MPCA, Nomikos and MacGregor (1995) presented three solutions: filling the future observations with the mean trajectories from the reference database; taking the current deviations as the predicted values for the incomplete part of the process; and partial model projection, in which the known part of the trajectory is projected onto the corresponding partial loading matrix. The first two schemes estimate the future data by simply filling in hypothetical values, without considering possible subsequent variations; the third uses only part of the MPCA model, projecting the trajectories observed so far onto the corresponding part of the loading matrix to analyze the variation of local segments. The monitoring indices obtained with these three solutions may therefore be inaccurate. To reduce these monitoring errors, Gao and Bai (2007) developed a measure that estimates the future data of a new batch by calculating the Generalized Correlation Coefficients (GCC) between the new batch trajectory and the historical trajectories, and fills the unknown remainder of the new batch trajectory with the corresponding part of the historical trajectory that has the maximum GCC.
Recently, some papers on online monitoring of batch processes have dealt with GCC prediction after DTW synchronization with MPCA/MICA (Bai et al., 2009a, 2009b; Gao et al., 2008b), while other works have dealt with GCC prediction after OFA synchronization with MPCA/MICA (Bian, 2008; Bian et al., 2009; Gao et al., 2008a). These examples showed that both DTW and OFA integrate well with GCC prediction for MPCA/MICA.
In this chapter, a set of online batch process monitoring approaches is discussed. In real industrial batch processes the process data do not always follow a Gaussian distribution, so MICA may reveal more hidden variation than MPCA, despite its greater computational complexity. The two synchronization methods, DTW and OFA, are each applied in the compound monitoring approaches, and four solutions for the missing future values are compared in an example.
The chapter is organized as follows. Section 2 introduces the principle of DTW and the related synchronization method. Section 3 introduces the principle of OFA and describes how the coefficients extracted from the trajectories are used for modelling and monitoring. The three traditional solutions of Nomikos and MacGregor (1995) and GCC estimation are then discussed in Section 4. An industrial polyvinyl chloride (PVC) polymerization process is used to illustrate the integrated approaches in Section 5. Finally, conclusions are presented in Section 6.

Dynamic time warping
Dynamic time warping (DTW) is a flexible, deterministic pattern matching method for comparing two dynamic patterns that may not be perfectly aligned but have similar features. The patterns are locally translated, compressed and expanded so that similar features within the two patterns are matched (Kassidas et al., 1998). The problem can be discussed in terms of two general trajectories, R and T.

Symmetric and asymmetric DTW algorithm
Let T and R denote the multivariate trajectories of two batches, with matrices of dimension t×N and r×N, respectively, where t and r are the numbers of observations and N is the number of measured variables. In most cases t and r are not equal, so the two batches are not synchronized because they do not have a common length. Even if t=r, the trajectories may not be synchronized because their local characteristics differ. If one applies the monitoring scheme of MPCA (Nomikos and MacGregor, 1994) or of MICA (Yoo et al., 2004) after simply adding or deleting some measured points artificially, unnecessary variation will be included in the statistical model and the subsequent statistical tests will not be sensitive to faulty batches.
Based on the principle of dynamic programming, DTW warps the two trajectories so that similar events are matched and a minimum distance between them is obtained; to achieve this minimum distance, DTW shifts, compresses or expands some feature vectors (Nadler and Smith, 1993). Let i and j denote the time indices of the T and R trajectories, respectively. DTW finds the optimal path as a sequence F* of K points on a t×r grid:
$$F^* = \{c(1), c(2), \ldots, c(K)\} \qquad (1)$$
$$c(k) = [i(k), j(k)] \qquad (2)$$
where each point c(k) is an ordered pair indicating a position in the grid. The two univariate trajectories T and R in Figure 1 show the main idea of DTW.
Most DTW algorithms can be classified as either symmetric or asymmetric. In the symmetric scheme, both the time index i of T and the time index j of R are mapped onto a common time index k, as in Eqs. 1 and 2; the result of synchronization is not ideal, however, because the length of the synchronized trajectories often exceeds that of the reference trajectory. The asymmetric scheme, on the other hand, maps the time index of T onto the time index of R, or vice versa, so that one trajectory is expanded or compressed towards the other. Compared with Eqs. 1 and 2, the sequence becomes:
$$F^* = \{c(1), c(2), \ldots, c(j), \ldots, c(r)\} \qquad (3)$$
$$c(j) = [i(j), j] \qquad (4)$$
This implies that the path will go through each vector of R, but it may skip some vectors of T.

Endpoints, local and global constraints
In order to find the best path through the t×r grid, three rules of the DTW algorithm should be specified: (1) endpoint constraints, (2) local continuity constraints, and (3) global constraints. Under the global constraints, the search area is a strip of width M (M ≥ |t − r|) around the diagonal of the t×r grid, as shown in Fig. 3.
The endpoint constraints state that the initial and final points of both trajectories are located with certainty. The local continuity constraints take the characteristics of the time indices into account to avoid excessive compression or expansion of the two time scales (Myers et al., 1980).
By requiring a monotonic, non-negative path, the local constraints also restrict each point to a few nearest neighbors as predecessors, which prevents excessive compression or expansion (Itakura, 1975). The global constraints prevent large deviations from the linear path.

Minimum accumulated distance of the optimal path
As mentioned above, to find the best path through a grid of vector-to-vector distances, the DTW algorithm should minimize some total distance between the two trajectories. Because calculating the optimal normalized total distance is impractical, a feasible substitute is the minimum accumulated distance D_A(i, j) from point (1, 1) to point (i, j) (Kassidas et al., 1998). A suitable D_A(i, j) (Eq. 5) is built up recursively from the local distances along the paths permitted by the constraints, where
$$d(i, j) = [T(i,:) - R(j,:)]\, W\, [T(i,:) - R(j,:)]^T \qquad (6)$$
is the weighted local distance between the i-th vector of the T trajectory and the j-th vector of the R trajectory, and W is a positive definite weight matrix that reflects the relative importance of each measured variable.
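The accumulation of weighted local distances can be sketched in a few lines of NumPy. This is only an illustrative sketch: the step pattern used here (predecessors (i-1, j), (i-1, j-1) and (i, j-1), all with unit weight) is an assumption, the global band constraint is omitted, and the function names are hypothetical rather than taken from the cited work.

```python
import numpy as np

def local_distance(T, R, W):
    """Weighted local distance d(i, j) for every pair of time points."""
    diff = T[:, None, :] - R[None, :, :]            # shape (t, r, N)
    return np.einsum("ijn,nm,ijm->ij", diff, W, diff)

def dtw_accumulated_distance(T, R, W=None):
    """Return the accumulated-distance table D_A with shape (t, r).

    Assumed step pattern: each point may be reached from (i-1, j),
    (i-1, j-1) or (i, j-1), each with unit weight.
    """
    t, N = T.shape
    r = R.shape[0]
    W = np.eye(N) if W is None else W
    d = local_distance(T, R, W)
    D = np.full((t, r), np.inf)
    D[0, 0] = d[0, 0]                               # endpoint constraint: start at (1, 1)
    for i in range(t):
        for j in range(r):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0:
                best = min(best, D[i - 1, j])
            if j > 0:
                best = min(best, D[i, j - 1])
            if i > 0 and j > 0:
                best = min(best, D[i - 1, j - 1])
            D[i, j] = d[i, j] + best
    return D
```

The entry `D[-1, -1]` corresponds to D_A(t, r); a different local constraint would change only the three candidate predecessors inside the loop.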

Advantages and disadvantages of symmetric and asymmetric DTW
As mentioned above, DTW works with pairs of patterns, so the question arises whether the symmetric or the asymmetric algorithm is more suitable for synchronization. Symmetric DTW algorithms include all points of the original trajectories but produce expanded trajectories of various lengths, because the length is determined by DTW itself. After synchronization, each batch trajectory B_i is individually synchronized with the reference trajectory B_REF but, unfortunately, not with the other batches.
Although asymmetric algorithms may eliminate some points, they produce synchronized trajectories of equal length, because the time axis of each B_i is mapped onto that of B_REF; all B_i are therefore synchronized both with the reference trajectory B_REF and with each other.
Unavoidably, the asymmetric algorithms have to skip some points on the optimal path, so the characteristics of some segments may be lost after synchronization. An MPCA/MICA model built from such 'trimmed' trajectories is incomplete and may cause missed or false alarms.

Combining symmetric and asymmetric DTW
The essence of DTW is to match pairs of points of two trajectories during synchronization. First, with the symmetric DTW algorithm, the optimal path is reconstructed subject to the three constraints above and Eqs. 5 and 6. When the points of B_i are then aligned with B_REF for asymmetric synchronization, several situations can arise.

An improvement of DTW algorithm for more measurements
In some processes, the number of measurements may be too large for the available memory to hold all the calculated minimum accumulated distances D_A(i, j). Gao et al. (2001) presented a solution to this 'out of memory' problem. Their idea is that the large number of intermediate results D_A(i, j) need not all be kept until the final result D_A(t, r) has been obtained. Instead, the computation can be organized as local dynamic programming over strips of adjacent time intervals. The improved algorithm, which works under the three constraints and Eqs. 5 and 6, is shown in Fig. 3 and proceeds as follows:
2) Set i←(i+1) and compute D_A(i, :) with the aid of D_A(i-1, :);
3) Search for the local optimal path between the columns (i-1, :) and (i, :). The start point of the path is (I_P, J_P) and the relay end point is (I_E, J_E), where I_E=I_P+1 and J_E is determined by a comparison in which fix is the function that keeps only the integer part of the result of a computation;
4) Delete the column D_A(i-1, :), then set I_P←I_E, J_P←J_E;
5) Repeat steps 2 to 4 until i=t (t is one end point of the pair).
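The memory saving in the steps above comes from keeping only two adjacent slices of D_A at a time. The sketch below illustrates that idea alone: it returns the final accumulated distance D_A(t, r) while storing just the previous and current slices. It does not reproduce the relay-point path search with the fix function, and it assumes the same unit-weight step pattern as a plain DTW recursion; the function name is hypothetical.

```python
import numpy as np

def dtw_final_distance_low_memory(T, R, W=None):
    """Compute D_A(t, r) while storing only two slices of the table."""
    t, N = T.shape
    r = R.shape[0]
    W = np.eye(N) if W is None else W
    prev = None                                     # holds D_A(i-1, :)
    for i in range(t):
        diff = R - T[i]                             # shape (r, N)
        d_row = np.einsum("jn,nm,jm->j", diff, W, diff)
        cur = np.empty(r)                           # holds D_A(i, :)
        for j in range(r):
            candidates = []
            if prev is not None:
                candidates.append(prev[j])          # from (i-1, j)
                if j > 0:
                    candidates.append(prev[j - 1])  # from (i-1, j-1)
            if j > 0:
                candidates.append(cur[j - 1])       # from (i, j-1)
            cur[j] = d_row[j] + (min(candidates) if candidates else 0.0)
        prev = cur                                  # the older slice is discarded here
    return prev[-1]
```

Memory use grows with r instead of t×r, at the price of not being able to backtrack the full path afterwards, which is why the original algorithm records a relay point at every step.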

Procedure of synchronization of batch trajectories
The iterative procedure proposed for the synchronization of unequal batch trajectories (Kassidas et al., 1998) is a practical approach for industrial processes, and is presented here.
First of all, each variable of each batch should be scaled. Let B_i, i=1,…,I, be the scaled batch trajectories obtained from I raw batches of good quality. The scaling method finds the average range of each variable over the raw batches by averaging its range from each batch, divides each variable in all batches by its average range, and stores the average ranges for monitoring. Then synchronization begins.
Step 0: Select one of the scaled trajectories, B_k, as the reference trajectory B_REF according to the technological requirements. Set the weight matrix W equal to the identity matrix. Then execute the following steps for a specified maximum number of iterations.
Step 1: Apply the DTW method between each B_i and B_REF. Let $\tilde{B}_i$, i=1,…,I, be the synchronized trajectories, whose common duration is the same as that of B_REF.
Step 2: Compute the average trajectory $\bar{B}$ from the average values of all $\tilde{B}_i$.
Step 3: For each variable, compute the sum of squared deviations from $\bar{B}$; its inverse will be the new weight of that variable for the next iteration:
$$W(j,j) = \Big[\sum_{i=1}^{I}\sum_{k}\big(\tilde{B}_i(k,j) - \bar{B}(k,j)\big)^2\Big]^{-1} \qquad (8)$$
As a diagonal matrix, W should be normalized so that the sum of the weights equals the number of variables, that is, W is replaced by:
$$W \leftarrow W\,\frac{N}{\sum_{j=1}^{N} W(j,j)} \qquad (9)$$
Step 4: In most cases no more than 3 iterations are needed, so keep the same reference trajectory: B_REF=B_k. If more iterations are needed, set the reference equal to the average trajectory: $B_{REF} = \bar{B}$.
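Steps 2 and 3 can be illustrated with a short NumPy sketch. It assumes the synchronized trajectories are already stacked in one array of shape (I, K, N), and it follows the verbal description only (inverse sum of squared deviations, normalized so that the weights sum to the number of variables); the function name is hypothetical.

```python
import numpy as np

def update_weights(B_sync):
    """Return the normalized diagonal weight matrix W and the average trajectory.

    B_sync: array of shape (I, K, N) holding I synchronized trajectories
            with K time points and N variables each.
    """
    B_bar = B_sync.mean(axis=0)                     # average trajectory, shape (K, N)
    ssd = ((B_sync - B_bar) ** 2).sum(axis=(0, 1))  # sum of squared deviations per variable
    w = 1.0 / ssd                                   # inverse becomes the new weight
    w = w * (w.size / w.sum())                      # normalize: weights sum to N
    return np.diag(w), B_bar
```

Variables that are consistent from batch to batch receive large weights, so they dominate the local distance in the next DTW iteration.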

Offline implementation of DTW for batch monitoring
Now suppose the complete trajectory of a new batch, B_RAW,NEW (b_NEW×N), is available and needs to be monitored using MPCA/MICA. It has to be synchronized before the monitoring scheme is applied, because the new batch trajectory B_RAW,NEW will most probably not accord with the reference trajectory B_REF.
For scaling, each variable of the new batch B_RAW,NEW is divided by the average range stored from the reference batches, which gives the scaled new trajectory B_NEW. B_NEW is then synchronized with the reference trajectory B_REF, using the W obtained from Eqs. 8 and 9 in the synchronization procedure, to give the synchronized trajectory $\tilde{B}_{NEW}$, which can be used in the MPCA/MICA model.
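The range scaling used for both the reference batches and a new batch can be sketched as follows. The helper names are hypothetical, and the sketch assumes each batch is stored as a (duration × N) array whose duration may differ from batch to batch.

```python
import numpy as np

def average_ranges(batches):
    """Average, over all batches, of the range (max - min) of each variable."""
    ranges = np.array([b.max(axis=0) - b.min(axis=0) for b in batches])
    return ranges.mean(axis=0)

def scale_batch(batch, avg_range):
    """Divide every variable of one batch by its stored average range."""
    return batch / avg_range

# Two raw batches of different duration with two variables each.
raw = [
    np.array([[0.0, 10.0], [2.0, 30.0]]),
    np.array([[1.0, 0.0], [5.0, 10.0], [3.0, 40.0]]),
]
avg = average_ranges(raw)            # stored for later monitoring
scaled_new = scale_batch(raw[0], avg)
```

Storing `avg` once and reusing it for every new batch keeps the new trajectory on the same scale as the reference data.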

Orthonormal function approximation
For synchronous batch processes, the data are assumed to take the form of a three-way array: j=1,2,…,J variables are measured at k=1,2,…,K time intervals throughout i=1,2,…,I batch runs. The most effective way of unfolding the three-way data for monitoring is to put its slices (I×J) side by side to the right, starting with the one corresponding to the first time interval, which generates a large two-dimensional matrix (I×JK) (Nomikos and MacGregor, 1994, 1995; Wold et al., 1987). Each column of the two-dimensional matrix is treated as a new variable for building the PCA model. In some cases, however, the batch processes are asynchronous, so the two-dimensional matrix (I×JK) cannot be formed. Rather than translating, expanding and contracting the process measurements to generate equal durations as DTW does, OFA uses orthonormal functions to eliminate the problem caused by different operating times: the implicit system information is turned into several key parameters that cover the necessary part of the operating conditions for each variable in each batch (Chen and Liu, 2000; Neogi and Schlags, 1998).
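The batch-wise unfolding described above can be shown on a small synthetic array. The sketch assumes the data are stored as a NumPy array of shape (I, J, K); the column autoscaling that follows is the usual MPCA preprocessing and is added here as an assumption, and both function names are hypothetical.

```python
import numpy as np

def unfold_batchwise(X):
    """Unfold X (I x J x K) into a matrix (I x JK).

    The (I x J) slices are placed side by side, starting with the first
    time interval, so column k*J + j holds variable j at interval k.
    """
    I, J, K = X.shape
    # Bring time to the middle axis so that reshape keeps the J variables
    # of one time interval next to each other.
    return X.transpose(0, 2, 1).reshape(I, K * J)

def autoscale(X2d):
    """Mean-center each column and scale it to unit variance."""
    mean = X2d.mean(axis=0)
    std = X2d.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)              # guard against constant columns
    return (X2d - mean) / std, mean, std

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3, 4))                      # 5 batches, 3 variables, 4 intervals
X2d = unfold_batchwise(X)
Xs, mean, std = autoscale(X2d)
```

Subtracting the column means removes the average trajectory of every variable, which is why a zero in the scaled data stands for "on the mean trajectory".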

Orthonormal function
Using Orthonormal Function Approximation (OFA), the process measurements of each variable in each batch run can be mapped onto the same number of orthonormal coefficients that represent the key information. As a univariate trajectory, the profile of each variable in each batch run can be represented as a function F(t), which can be approximated in terms of an orthonormal set {φ_n} of continuous functions:
$$F(t) \approx F_N(C, t) = \sum_{n=0}^{N-1} \alpha_n \varphi_n(t)$$
where N is the number of terms and the coefficients $C = [\alpha_0, \alpha_1, \ldots, \alpha_{N-1}]^T$ are the projections of F(t) onto each basis function. The coefficients C of the orthonormal functions are therefore representative of the measured variable F(t) of one batch run. In practice, the coefficient α_n is derived by an orthonormal decomposition of F(t), which involves the vectors E_n = [E_n(t_1) E_n(t_2) … E_n(t_ki)]^T and Φ_n = [φ_n(t_1) φ_n(t_2) … φ_n(t_ki)]^T.
The Legendre polynomial basis is regarded as an effective choice because each batch run spans a finite time interval (Chen and Liu, 2000):
$$\varphi_n(t) = \sqrt{\frac{2n+1}{2}}\, P_n(t)$$
where t∈[-1,1] and P_n(t) is the Legendre polynomial of degree n. When n=0, the constant coefficient α_0 corresponds to $\varphi_0(t) = P_0(t)/\sqrt{2}$ with P_0(t)=1.
Before applying the orthonormal function approximation, the variables of the system, which have different units, need to be pretreated so that they are put on an equal basis. Mean centering of the measurement data is not necessary, however, because the constant coefficient α_0 belongs to the φ_0 orthonormal basis function; mean centering would only drive the constant coefficient for φ_0 to zero. The ratio convergence test for mathematical series is applied to determine the approximation error associated with reducing the number of basis functions (Moore and Anthony, 1989). The measure of approximation effectiveness, G_ij(N), is obtained from the approximation F_N(C, t). When a consistent minimum of G_ij(N) is reached, the required optimal number of terms N_ij can be chosen for measurement variable j in batch i (Moore and Anthony, 1989).
Most of the behavior of the original F(t) is therefore captured by the coefficients C. For each variable, however, the maximum number of terms over all batch runs is taken, so that enough terms are retained for the expansion F_N(t) to capture the main behavior of F(t) in every batch. The coefficient vector
$$c_{i,j} = [\alpha_{i,j,0}, \alpha_{i,j,1}, \ldots, \alpha_{i,j,N_j-1}]$$
represents the approximation function for measurement variable j in batch i, and N_j is the number of terms needed for variable j.
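A small sketch shows how trajectories of different length map onto the same number of coefficients. It makes several assumptions that are not taken from the chapter: the samples of each trajectory are placed on an evenly spaced grid over [-1, 1], the coefficients are obtained by least squares on the sampled basis rather than by the decomposition used by the original authors, and the function names are hypothetical. The basis is the orthonormal Legendre set, i.e. P_n scaled by sqrt((2n+1)/2).

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre_basis(t, n_terms):
    """Columns are phi_n(t) = sqrt((2n+1)/2) * P_n(t) for n = 0..n_terms-1."""
    P = legendre.legvander(t, n_terms - 1)          # shape (len(t), n_terms)
    scale = np.sqrt((2 * np.arange(n_terms) + 1) / 2.0)
    return P * scale

def ofa_coefficients(f, n_terms):
    """Least-squares coefficients of a sampled trajectory f on the basis."""
    t = np.linspace(-1.0, 1.0, len(f))              # map the batch duration onto [-1, 1]
    Phi = orthonormal_legendre_basis(t, n_terms)
    coeffs, *_ = np.linalg.lstsq(Phi, f, rcond=None)
    return coeffs

# Two runs of the same variable with different durations.
short_run = np.sin(np.linspace(0.0, 3.0, 50))
long_run = np.sin(np.linspace(0.0, 3.0, 80))
c_short = ofa_coefficients(short_run, 4)
c_long = ofa_coefficients(long_run, 4)
```

Both runs yield four coefficients regardless of duration, so the coefficient vectors of all variables can be placed side by side to form one row per batch.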

Offline implementation of OFA for batch monitoring
When a new batch is completed, the orthonormal function transformation is applied to it, and all the variables of the batch along the time trajectory become a row vector composed of a series of coefficients.

Traditional online monitoring schemes
The first approach assumes that the future measurements are in perfect accordance with their mean trajectories as calculated from the reference database, and therefore fills the unknown part of x_new with zeros. In other words, the batch is supposed to operate normally for the rest of its duration, with no deviations from its mean trajectories. According to the analysis of Nomikos and MacGregor (1995), the advantages of this approach are a good graphical representation of the batch operation in the t plots and quick detection of an abnormality in the SPE plot, whereas its drawback is that the t scores are slow to detect an abnormal operation, especially at the beginning of the batch run.
The second approach assumes that the future deviations from the mean trajectories will remain, for the rest of the batch duration, at their current values at time interval k; it therefore fills the unknown part of x_new with the current scaled values, on the assumption that the same errors will persist for the rest of the batch run. Although the SPE chart is less sensitive than in the first approach, the t scores pick up an abnormality more quickly (Nomikos and MacGregor, 1995). Nomikos and MacGregor (1995) also suggested a compromise that shares the advantages and disadvantages of the first two approaches: assuming that the future deviations decay linearly or exponentially from their current values to the end of the batch run.
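The first two filling schemes can be written down directly for an unfolded, scaled observation vector. The sketch assumes the vector is ordered time interval by time interval (J values per interval) and has already been mean-centered and scaled, so that zero stands for the mean trajectory; the function names are hypothetical.

```python
import numpy as np

def fill_with_zeros(x_known, J, K):
    """First approach: future deviations from the mean trajectory are zero."""
    x = np.zeros(J * K)
    x[: len(x_known)] = x_known
    return x

def fill_with_current_deviation(x_known, J, K):
    """Second approach: the deviations at interval k persist to the end."""
    k = len(x_known) // J                           # number of completed intervals
    current = x_known[-J:]                          # scaled deviations at interval k
    return np.concatenate([x_known, np.tile(current, K - k)])

x_known = np.array([1.0, 2.0, 3.0, 4.0])            # J = 2 variables, k = 2 intervals
x_zero = fill_with_zeros(x_known, J=2, K=4)
x_hold = fill_with_current_deviation(x_known, J=2, K=4)
```

Either completed vector has length JK and can be projected onto the full loading matrix like a finished batch.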
The third approach regards the unknown future observations as missing data from a batch in MPCA. To be consistent with the values already measured up to the current time k, and with the correlation structure of the observation variables in the database as defined by the p-loading matrices of the MPCA model, one can use the principal components of the reference database without making any explicit assumption about the unknown future values. MPCA projects the already known measurements $x_{new,k}$ onto the reduced space:
$$t_{k} = (P_k^T P_k)^{-1} P_k^T x_{new,k}$$
where P_k (kJ×R) is a matrix whose columns contain the elements of the p-loading vectors (p_r) of all R principal components from the start up to the current time interval k. The matrix $(P_k^T P_k)^{-1}$ is well conditioned even at early times, and approaches the identity matrix as k approaches the final time interval K because of the orthogonality of the loading vectors p_r (Nomikos and MacGregor, 1995). The advantage of this method is that once at least 10% of the measurements of the new batch trajectory are known, they suffice for the computation, and the resulting t scores are close to their actual final values. However, Nomikos and MacGregor (1995) also indicated that too little information results in quite large and unexplainable t scores at the early stage of a new batch run. Similarly, the third approach can be applied to the MICA model: the deterministic part of the independent component vector at time interval k, $\hat{s}_{d,k}$, can be calculated from the known measurements with W_d (Jk×1), the deterministic part of W_s, the separating matrix of the ICA algorithm.
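The projection onto the partial loading matrix is an ordinary least-squares problem, which the following sketch solves with NumPy. The loading matrix here is a synthetic orthonormal matrix built only for illustration, and the function name is hypothetical.

```python
import numpy as np

def scores_by_projection(x_known, P):
    """Estimate the scores from the first len(x_known) elements of a batch.

    P is the full loading matrix (JK x R); only its first kJ rows, P_k,
    are used, giving t = (P_k^T P_k)^{-1} P_k^T x_known.
    """
    P_k = P[: len(x_known), :]
    return np.linalg.solve(P_k.T @ P_k, P_k.T @ x_known)

rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.normal(size=(12, 2)))       # synthetic orthonormal loadings
t_true = np.array([1.5, -2.0])
x_full = P @ t_true                                 # a batch lying exactly in the model plane
t_half = scores_by_projection(x_full[:6], P)        # only half of the batch is known
```

For a batch that lies exactly in the model plane the scores are recovered from the known half alone; with real data the estimate becomes noisier as fewer rows of P are available.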
It is uncertain which of the above schemes is most suitable for a given batch process. Nomikos and MacGregor (1995) stated that each scheme suits a particular condition: the third for infrequent discontinuities, the second for persistent disturbances, and the first for non-persistent disturbances. They also suggested combining these schemes for online monitoring.

Online monitoring with filling similar subsequent trajectory
Generally, Correlation Coefficients (CC) measure the degree of correlation between two vectors; they are numerical values that express similarity in some sense. However, each multivariable trajectory is a matrix whose columns are the variables and whose rows follow the progression of time. The relationship between the matrices of two multivariable trajectories therefore cannot be expressed by a CC in the form of a single numerical value but only by a matrix, so one cannot judge the similarity of the two matrices by comparing a CC value. To solve this problem, a Generalized Correlation Coefficient (GCC) measure was proposed that is computed from the traces of covariance matrices, because the trace of a matrix, being the sum of its eigenvalues, expresses features of that matrix in some way (Gao and Bai, 2007). Suppose V (k×m) is a trajectory being monitored, where k is the current time interval and m is the number of variables, and Y (k×m) is another trajectory chosen from the history model database to be matched with V (k×m). Their GCC, ρ(V, Y), is defined in terms of tr(cov(V)), tr(cov(Y)) and tr(cov(V, Y)), where tr denotes the trace and cov(V), cov(Y) and cov(V, Y) are the covariance matrices of V, of Y, and between V and Y. When two trajectories are aligned with each other from the start, the range of the GCC is (0, 1], and the closer the GCC is to 1, the more similar they are. Care must be taken when the two trajectories are asynchronous, because the two matrices then have different dimensions and have to be handled with Eq. 22.

The procedure of online monitoring of asynchronous batch
The first step is to deal with the missing data of the online batch. The difficulty of online monitoring of an asynchronous batch lies in choosing the scheme properly. As mentioned above, the traditional schemes are relatively easy to implement, whereas the GCC approach needs more computation time. The ongoing new batch V (k×m) has to be compared with the many normal and abnormal batches stored in the history model database Ω. Because the matrix of the new batch run and that of a history batch run N_i (K_n×m)∈Ω, i=1,2,...,h, where h is the number of history batches stored in Ω, have different dimensions, a pseudo covariance is calculated instead of Eq. 21 (Gao et al., 2008b).
Then the trajectory N_i (K_n×m) that has the largest GCC with V (k×m) is chosen. If k<K_n, V (k×m) is extended by appending the part of N_i (K_n×m) from k+1 to K_n; otherwise V (k×m) is kept as it is. Although k is sometimes far less than K_n, the result of Eq. 22 still reveals a covariance-like relationship between the two matrices. Hence the lack of data of the online batch run can be overcome by filling in assumed values in different ways.
The second step is pre-treatment of the data. Before synchronization, all the measurements of the new batch should be scaled.
The third step is synchronization; one can choose DTW or OFA to deal with the asynchronous running trajectory. After that, the new test batch resembles an offline batch and can be projected onto the MPCA/MICA model.
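The first step of the procedure, choosing the most similar history trajectory and borrowing its remainder, can be sketched as follows. The similarity measure here is an assumed trace-based form, |tr(cov(V, Y))| / sqrt(tr(cov(V)) tr(cov(Y))), and history batches are compared over their first k rows only; neither is claimed to be the exact definition or pseudo covariance of the cited papers, and the function names are hypothetical.

```python
import numpy as np

def gcc(V, Y):
    """Assumed trace-based similarity of two (k x m) trajectories, in [0, 1]."""
    Vc = V - V.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    num = abs(np.trace(Vc.T @ Yc))                  # the 1/(k-1) factors cancel
    den = np.sqrt(np.trace(Vc.T @ Vc) * np.trace(Yc.T @ Yc))
    return num / den

def fill_from_history(V, history):
    """Extend V with the remainder of the most similar history batch.

    Returns (filled trajectory, index of the chosen batch, its similarity).
    History batches shorter than V are skipped in this sketch.
    """
    k = V.shape[0]
    scored = [(gcc(V, N[:k]), idx) for idx, N in enumerate(history) if N.shape[0] >= k]
    if not scored:
        return V, None, None
    best_score, best_idx = max(scored)
    best = history[best_idx]
    return np.vstack([V, best[k:]]), best_idx, best_score
```

With a large database the similarity has to be evaluated once per stored batch at every monitoring instant, which is the extra computation time mentioned in the text.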

Brief introduction to the technology of the PVC polymerization process
PVC is a thermoplastic resin produced when vinyl chloride molecules join to form chains of macromolecules, a process called polymerization. The vinyl chloride (VC) monomer, dispersed in aqueous suspension, is polymerized in a reactor, as shown in Fig. 6.

Fig. 6. Flow diagram of the PVC polymerization process
The conditions of the polymerization reaction change sharply because the contents of the reactor pass through a water phase, a liquid VC phase and a solid PVC phase at different stages of the reaction. At the start of the reaction, water, VC, suspension stabilizers and initiator are loaded into the reactor as required through their respective inlets, and are then stirred adequately to create a milky solution, a suspension of VC droplets.
Several indices, especially temperatures, should be monitored and controlled at each stage of the reaction. The nine important variables listed in Table 1 are shown for one batch in Fig. 7. At the beginning of the reaction, hot water is pumped into the jacket of the reactor to heat the reactor contents to the set temperature (57℃). This indirect heating stops once sufficient reaction heat has been generated by the reaction. Because PVC is insoluble in water and only slightly soluble in VC, it precipitates quickly during polymerization and forms solid-phase PVC granules inside almost every VC monomer droplet. Because the reaction is exothermic, the temperature of the reactor rises gradually, so the excess reaction heat must be removed promptly to keep the temperature constant. To cool the reactor, a flow of cooling water is pumped into the jacket surrounding it. The condenser on top of the reactor also condenses VC monomer from vapor to liquid. If the reactor temperature falls below the set point, hot water is again injected into the jacket; this is the automatic control of the process based on the important variables. At the end of the polymerization, a little gaseous VC monomer remains. This residual VC is absorbed from the exhaust gas byproduct, and the polymerization stops once the terminator takes effect.

Essentials of the batches in the training set and test set
Although a PVC batch lasts only several hours (3h~8h), the sampling frequency is comparatively high because the time-variant batch process has to be monitored online. The sampling interval is 5 seconds, so the number of measurements of any one batch lies in the range (2000, 6000), depending on how the duration is adjusted for different product requirements. Observation of the production shows that most batch durations are around 3200 measurements and that the distribution of the batches does not follow a normal distribution. Fig. 8 clearly shows the asynchronism of the chosen batches in the reactor temperature (variable 1). Ten batches (#1~#10) are taken from the plant as test data. Some problems of these batches during polymerization are listed in Table 2. The aim is to identify the abnormal batches with the two statistics SPE and T² of MPCA, or SPE and I² of MICA, and then to find which variables were affected.
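For readers unfamiliar with the two MPCA statistics, the sketch below computes T² and SPE for one unfolded, scaled batch vector. It uses the standard textbook definitions (T² as the sum of squared scores divided by the score variances, SPE as the squared residual norm), fits the PCA model by singular value decomposition, and omits the control limits; the data are synthetic and the function names are hypothetical.

```python
import numpy as np

def fit_pca(X, n_pc):
    """Loadings P (columns) and score variances of a mean-centered matrix X."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_pc].T
    eigvals = s[:n_pc] ** 2 / (X.shape[0] - 1)
    return P, eigvals

def t2_and_spe(x, P, eigvals):
    """Hotelling T^2 and squared prediction error of one observation x."""
    t = P.T @ x                                     # scores in the reduced space
    t2 = np.sum(t ** 2 / eigvals)
    residual = x - P @ t                            # part not explained by the model
    return t2, residual @ residual

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 6))
X = X - X.mean(axis=0)                              # reference data, mean-centered
P, eigvals = fit_pca(X, n_pc=2)
t2, spe = t2_and_spe(3.0 * P[:, 0], P, eigvals)
```

A batch that moves far along the model plane inflates T², while a batch that leaves the plane inflates SPE, which is why the two charts are read together.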

The offline monitoring of batches without intelligent synchronization
For those asynchronous batches modeling and monitoring, without intelligent synchronization of DTW or OFA, the rough method of synchrozation, to prune so-called redundant data over the specified terminal or to extend the short trajectories with the last values, is experimented. All the durations of reference batches and test batches should be 3200 measurements.
Then the reference data set is arranged as a three-way X (I×J×K), where I corresponds to 50 batches, J corresponds to 9 process variables, and K corresponds to 3200 th time intervals. With the reference batch data X, the MPCA and MICA models are constructed initially. Offline analysis of ten test batches is executed to show if this kind of rough construction of data for MPCA or MICA is appropriate or not. After batch-wise unfolding, 8 principal components of the MPCA model are determined by the cross-validation method (Nomikos and MacGregor, 1994), which explain 82.61% of the variability in the data. 8ICs are selected for the MICA for 77.54% variation of the whole data. Fig.9 shows the results of SPE based on MPCA and MICA under 99% control limit. It is clear that neither of MPCA nor MICA does well on the incorrect asynchronous multivariate statistic model: MPCA misses the detection of the batch #2, and MICA reports false alarm batches #4,#5, and misses #1,#2.  1.1527, 1.8648, 0.2390, 1.4778, 0.1742, 0.2118, 0.8186, 0.2760, 0.4592, 3.3258] from Eq.8, 9 for twice iterations. The MPCA model is built and its retained principal number is 8 to show 88.44%the variation of the batch process, whereas MICA retains 3 IC to explain the 93.85% of variation of data. All three solutions of of Nomikos and MacGregor (1995) and GCC are simulated compared with the offline analysis to find which one is the most appropriate in the batch process.
Fig. 10 shows the online monitoring SPE indices of the 10 test batches, compared with the offline results, for MPCA and MICA respectively. With the first solution, MPCA always misses the faults in abnormal batches because the solution smooths out the variation, and MICA also misses the alarms for #2 and #3. The SPE values of the second and third solutions are so large that they raise false alarms. By comparison, the SPE based on the GCC prediction retains enough information about the variation to identify the abnormal batches: only its MPCA results miss the abnormality of #4, and its MICA results perform well.
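The behaviour of the first two fill-in solutions can be illustrated with a small sketch. It assumes, as is usual in the MPCA literature but not restated in this section, that the first solution fills the unknown future with zero deviations from the mean trajectory and that the second holds the current deviation constant until the end of the batch. The function name `fill_future` and the numbers are hypothetical.

```python
def fill_future(deviations, total_length, method="zero"):
    """Complete a partially observed, mean-centered trajectory.

    deviations   : deviations from the mean trajectory observed so far
    total_length : number of samples in a complete batch
    method       : "zero" (first solution) or "current" (second solution)
    """
    if not deviations:
        raise ValueError("at least one observed deviation is required")
    if len(deviations) > total_length:
        raise ValueError("observed part is longer than a complete batch")
    remaining = total_length - len(deviations)
    if method == "zero":
        # The future is assumed to follow the mean trajectory exactly.
        fill_value = 0.0
    elif method == "current":
        # The latest deviation is assumed to persist to the end.
        fill_value = deviations[-1]
    else:
        raise ValueError("method must be 'zero' or 'current'")
    return list(deviations) + [fill_value] * remaining


# Made-up deviations of one scaled variable, observed for 2 of 5 samples.
observed = [0.2, 0.5]
zero_filled = fill_future(observed, 5, method="zero")
current_filled = fill_future(observed, 5, method="current")
print(zero_filled)
print(current_filled)
```

In this sketch the zero fill adds nothing to the sum of squared deviations, which is consistent with the smoothing effect noted above, while the constant fill repeats the latest deviation over the whole remaining horizon and can inflate the statistics.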

Online monitoring of PVC with OFA-MPCA and OFA-MICA
After OFA synchronization, the information in the original trajectories is extracted. Each variable of each batch run is transformed into two coefficients, so that, instead of a three-dimensional data block with irregular time lengths, a two-dimensional coefficient matrix Θ (50×18) inherits the main features of the primitive three-dimensional data block. MPCA and MICA are each applied to the new coefficient data. The online monitoring time point is set to the 800th measurement. The MPCA algorithm retains 12 PCs, which explain 89.52% of the variation of the data, whereas MICA retains 3 ICs, which account for 51.92% of the variability in the data. The first two solutions of Nomikos and MacGregor (1995) and GCC are tested against the offline analysis to find the best one for the batch process. Note that the third solution is not suitable for the coefficient matrix because the loading matrix is obtained not from the coefficients but from the primitive variables.
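How trajectories of unequal length collapse into a fixed-size row of Θ can be sketched as follows. The sketch assumes a two-term basis consisting of a constant and a linear function on [-1, 1] (the first two Legendre polynomials); the basis actually used for OFA is defined earlier in the chapter and may differ. The function names are hypothetical.

```python
def ofa_coefficients(trajectory):
    """Project one trajectory onto the basis {1, t} with t in [-1, 1]."""
    n = len(trajectory)
    if n < 2:
        raise ValueError("at least two measurements are required")
    # Map sample indices 0..n-1 onto a symmetric grid on [-1, 1].
    t = [2.0 * k / (n - 1) - 1.0 for k in range(n)]
    # On a symmetric grid the two basis functions are orthogonal, so each
    # least-squares coefficient is an independent projection.
    c0 = sum(trajectory) / n
    c1 = (sum(tk * yk for tk, yk in zip(t, trajectory))
          / sum(tk * tk for tk in t))
    return [c0, c1]


def batch_to_row(batch):
    """Concatenate the coefficients of all variables of one batch."""
    row = []
    for trajectory in batch:
        row.extend(ofa_coefficients(trajectory))
    return row


# Two made-up runs of the same linear trend, sampled 5 and 3 times.
print(ofa_coefficients([1.0, 2.0, 3.0, 4.0, 5.0]))
print(ofa_coefficients([1.0, 3.0, 5.0]))
```

Both runs yield the same pair of coefficients even though their lengths differ, which is the property that lets 9 variables with 2 coefficients each fill one 18-element row of Θ per batch.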
The Q-statistics, i.e. the SPE indices of the 10 test batches obtained from the various online monitoring solutions and the offline analysis, are drawn in Fig. 11 for MPCA and MICA respectively. As before, the first solution of Nomikos and MacGregor (1995) erases many fine features of the process, so it fails to detect the problems of many batches. The values from the second online monitoring method are too large to be drawn in Fig. 11 and always raise false alarms for these batches, so they have to be listed in a table instead.
The D-statistics of PVC, T² of OFA-MPCA and I² of OFA-MICA, are drawn in Fig. 12 as well. GCC likewise performs well on the D-statistics: both T² and I² are close to their offline counterparts. The first traditional solution cannot predict any small variation after the time of detection, and the second one always has an error too large to be drawn in Fig. 12, so its results have to be enumerated in Table 4. From Fig. 12, it can be seen that OFA-MICA misses the alarms for #1 and #4, but OFA-MPCA makes more errors: it misses #1 and #3 and raises false alarms for #5, #7, #9 and #10.
Consequently, Fig. 11 and Fig. 12 show that OFA-MICA performs better than OFA-MPCA on both the Q-statistics and the D-statistics.

Contribution plot of SPE and I² in OFA-MICA
Contribution plots can be used to diagnose an event in a non-conforming batch, assigning a cause to the abnormality by indicating which variables are predominantly responsible for the deviations (Jackson and Mudholkar, 1979). For instance, with the OFA-MICA approach, the online SPE and I² contribution plots of the 9 process variables at the 800th measurement of the faulty batch #3 are shown in Fig. 13 and Fig. 14. The pattern of contributions for GCC (upper right) closely resembles the offline one (upper left) and differs distinctly from the others (lower). In Fig. 13, the comparatively large SPE contributions come from the temperature of the baffle outlet (variable 4), the flow rate of jacket water (variable 8) and the stirring power (variable 9). Meanwhile, the notable I² contributions in Fig. 14 come from the temperature of the reactor jacket inlet (variable 2), the baffle outlet (variable 4) and the flow rate of jacket water (variable 8). Therefore, in agreement with the plant report in Table 2, the root cause is lower stirring power (the most conspicuous bar in Fig. 13), which in turn decreased other variables such as variable 4 and variable 8. It is inferred that the lower stirring power decreased the rate of the reaction, which generated less heat and required less cooling water.
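The idea behind an SPE contribution plot can be sketched as follows, assuming the usual definition in which the contribution of each variable is its squared residual between the scaled measurement and the model reconstruction. The function names are hypothetical, and the numbers are invented for illustration; they are not plant data.

```python
def spe_contributions(observed, reconstructed):
    """Return the per-variable contributions to SPE at one time point."""
    if len(observed) != len(reconstructed):
        raise ValueError("both vectors must have one entry per variable")
    return [(x - x_hat) ** 2 for x, x_hat in zip(observed, reconstructed)]


def rank_variables(contributions, top=3):
    """Return 1-based variable indices, largest contribution first."""
    order = sorted(range(len(contributions)),
                   key=lambda j: contributions[j], reverse=True)
    return [j + 1 for j in order[:top]]


# Invented scaled measurements of 9 variables and a model reconstruction.
observed = [0.1, -0.2, 0.0, 1.0, 0.1, 0.0, -0.1, 1.2, -2.0]
reconstructed = [0.0] * 9

contributions = spe_contributions(observed, reconstructed)
spe = sum(contributions)  # SPE is the sum of the individual contributions
top_three = rank_variables(contributions)
print(top_three)
```

The ranking only points to the variables that deviate most from the model. As in the discussion of batch #3 above, process knowledge is still needed to decide which of them is the root cause and which are consequences.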

Conclusion
This chapter introduces online monitoring approaches for batch processes that detect subtle abnormalities at an early stage. MICA reveals more of the nature of an abnormality than MPCA does. With DTW and OFA, two kinds of synchronization method, more accurate multivariate statistical models are constructed, and a new batch run is processed in the same way for correct monitoring. The GCC method predicts the unknown future data well for MPCA/MICA when the batch process is online. However, despite its accuracy, MICA is computationally more complicated than MPCA. Synchronization methods are not recommended when the batches are not seriously asynchronous, because any synchronization method consumes a large amount of time and memory. Similarly, compared with the three traditional solutions, GCC needs more computation time for the comparisons and a huge database of historical models. No single method is dominant for the online monitoring of batch processes. Future work may combine these approaches with the signed directed graph (SDG) to detect the root causes of faults (Vedam & Venkatasubramanian, 1999).

Acknowledgment
The author wishes to acknowledge the assistance of Miss Lina Bai and Mr. Fuqiang Bian, who carried out part of the simulation work for this chapter.