
Data Processing Approaches for the Measurements of Steam Pipe Networks in Iron and Steel Enterprises

Written By

Luo Xianxi, Yuan Mingzhe, Wang Hong and Li Yuezhong

Submitted: 16 November 2011 Published: 17 October 2012

DOI: 10.5772/47781

From the Edited Volume

Energy Efficiency - The Innovative Ways for Smart Energy, the Future Towards Modern Utilities

Edited by Moustafa Eissa


1. Introduction

Steam is an important secondary energy medium in iron and steel enterprises, accounting for nearly 10% of total energy consumption. If, during operation, all the produced steam meets the demands and none is bled off, the overall energy efficiency can be effectively improved. To this end, complex steam pipe networks and steam production scheduling systems have been set up. Steam scheduling obviously depends on real-time measured data from the steam pipe networks, so accurately measuring pressure, temperature and flow rate is essential for both safety and economic efficiency. Accurate measurements are also needed to accumulate the steam production or consumption and thus calculate the energy cost of each working procedure.

With the help of the Energy Management System (EMS), all the data are collected from the distributed instruments. In practice, the pressure and temperature measurements are usually accurate enough for the application unless the sensors or transducers fail. The mass flow rate measurements, however, are much less accurate, owing to the complex nature of steam itself, the lack of high-precision measuring instruments, interference, failures of the data transmission network, and other causes. The reliability of the mass flow rate measurements is therefore poor.

When the measured steam mass flow values deviate from the actual values beyond a certain extent, the automatic control system may depart substantially from the process requirements; in the worst case, steam bleeding or an accident may occur [1]. It is therefore unsatisfactory to make or adjust production decisions directly from the flow meter readings [2]. In energy management, the discrepancy between accumulated production and consumption makes it difficult to calculate energy costs, analyze where steam is used irrationally, and find the weak links in management. Improving the reliability of the steam flow rate measurement data is thus essential for normal production and energy conservation in iron and steel enterprises.

The objective of the work is depicted in figure 1. The real-time data (mainly the mass flow rate variables) are processed by three approaches. Fault data detection and reconstruction picks out the faulty data and reconstructs or estimates the true values. Gross error detection discovers and re-estimates the data carrying gross errors. Data reconciliation reduces the random errors and further improves the quality of the data.

Figure 1.

The objective of the work

For fault data detection and reconstruction, the reasons for the low accuracy of the steam flow rate data are introduced first. Statistical process control theory [3] is then applied to determine univariate and multivariate control limits for monitoring abnormal data online, and an approach is proposed to calculate the true mass flow rates from the thermal and hydraulic mathematical models of the steam pipe network.

In the section on gross error detection, the problem is defined and two basic detection approaches, the Measurement Test (MT) method and the Method of Pseudonodes (MP), are demonstrated.

In the section on data reconciliation, the constrained least-squares problem stated in the gross error detection section is discussed in detail, including the assumptions required for its application, the constraint equations, and the selection of the weighting matrix.

The presented data processing approaches can be programmed for computers to determine abnormality and improve the precision of the mass flow rate data.


2. Fault data detection and reconstruction

2.1. The reasons for the poor accuracy of steam flow rate measurement

There are many reasons for the poor accuracy of steam measurement data. The most important can be summarized as follows:

  1. The working conditions of the instrument deviating from the designed conditions

    At present, mass flow rates are mostly deduced from volume flow rates and density. However, changes of temperature and pressure during transmission cause the steam density to deviate from the originally designed value [3], so the measurement errors can be very large [1]. Moreover, some superheated steam turns into a vapor-liquid two-phase medium, which makes the precision worse.

  2. The Complexity of Steam Characteristics

    As the ambient temperature changes, the total amount of condensate water formed during transmission differs, which creates a difference between the measured production and consumption of steam. In addition, steam pipe leakage adds to the difference. The accumulated readings are therefore always doubtful.

  3. Wear or Damage of the Key Components

    When an orifice differential-pressure flow meter has been in use for a long time, the aperture size drifts from the original size because of adhering foreign bodies or erosion by the continuous high-temperature steam flow. Since the parameters cannot be adjusted in time and the instrument is hard to calibrate, the measurement errors accumulate.

  4. External Interference or Failure of the Data Transmission Channel

    Disturbances affecting the instruments, or failures occurring in the data transmission channels, introduce significant errors into the data received at the control center.

2.2. Determining control limits for fault data detecting

The abnormal data from a sensor (temperature, pressure or flow rate) may take three forms: rapid fluctuation with a large magnitude (induced by poor contacts in the instrument), a constant value without any variation (induced by failure of the data sampling system), or values outside the normal range. The first two cases are easy to discover and are not discussed here. For the last case, statistical process control is applied to determine control limits for data monitoring: when the value of a monitored variable (or function) exceeds its limits, the data are abnormal, unless the process itself is actually abnormal. Statistical process control is of two types, univariate and multivariate.

2.2.1. Determining the control limits for single variable

Statistical process control (SPC) and control charts were first proposed by Shewhart for quality prediction. Traditional SPC mainly treats single variables: when a value falls outside the normal range, the system raises an alarm so the operators can check whether the state is really abnormal. In this work, if the process is in fact operating normally, such values are judged to be fault data. Reasonably determined control limits reduce the probability of false alarms, and much research has focused on this problem [4-6]. Here, the empirical distribution function combined with the "3σ" principle [3] is applied to determine the control limits of the different variables.

Denote X ∈ R^(n×1) as one of the measurement data matrices, where n is the number of samples and each row of X is a measured sample. If X = (x1, x2, ..., xn)^T, the range of the data matrix is:

R_X = max(xi) − min(xi)

Divide the range into N intervals of equal length; the length of each interval is:

Δ = R_X / N
The corresponding intervals are marked c1, c2, ..., cN, so each element of X belongs to one of the intervals and the data are divided into N groups. The probability that a sampled value lies in the rth interval can be estimated as:

pr = vr / n,  r = 1, 2, ..., N

where vr is the number of data points lying in the rth interval. (The data are assumed independent of each other, which is usually a reasonable hypothesis.) The average probability density of each interval can then be written as:

p̄r = pr / Δ
The empirical distribution of X is obtained as a bar graph with the interval lengths as the bases and p̄r as the heights. When n and N are large enough, the bar graph approaches the global distribution of X. According to the central limit theorem, when the system has only one stable state, the measured data tend to a normal distribution as the sample size grows, so the global distribution of X can be written as the normal density:

f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²))
Figure 2 shows the distribution graph. The probability of data lying within μ ± 3σ is 99.73%. When a value of X falls outside this range, it is reasonable to suspect that the value is abnormal (or that the process is not functioning normally). That is the "3σ" principle. In practice, the region outside μ ± 3σ, with about 0.27% probability, is named the "red zone"; the region inside μ ± 2σ (about 95% probability) is the "green zone"; and the intervals between them are the "yellow zones". Four control limits are determined by this method.

Since the expectation and deviation of the global distribution of X are unknown, and their sample estimates are inaccurate, especially when the empirical distribution is not close to normal, the "3σ" principle cannot be applied directly.

Figure 2.

The probability density distribution graph of normal distribution

However, according to the "3σ" principle and the empirical distribution of X, the control limits can be deduced in reverse, by searching for the sample values at the corresponding probabilities of the limits.

Two special instances have to be noticed:

  1. No crossing point between the empirical distribution and the testing level.

    This means the corresponding sample value does not appear within the significance level range. To determine the limits, the empirical distribution can be linearly extrapolated outward, or the limits estimated from experience by comparison with the real process state.

  2. Two or more crossing points between the empirical distribution and the testing level on one side.

    This indicates disturbances in the data that make the empirical distribution fluctuate on one side. Following the "3σ" principle, find the interval of highest probability and search from it toward the two sides for the points that first accumulate the probability (1−α)/2.

Figures 3-5 show the control limits (four lines) and the monitoring results in a plant. The interval between the two green lines is the "green zone", indicating normal operation. The intervals outside the red lines are the "red zones", indicating fault data or process abnormality. The remaining two intervals are the "yellow zones", which remind the operators to watch the changes in the data. In figures 4 and 5, the red and green lines coincide on the right side; this can be explained by external equipment constraining the free movement of the variable.

By determining the limits of a single variable, wild values regarded as fault data can easily be discovered. However, two kinds of inherent errors (false alarms and missed alarms) cannot be avoided.

The approach of Univariate Statistical Process Control (USPC) can thus give a rough judgment of fault data. It is equally appropriate for monitoring the temperature and pressure variables.
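As a concrete illustration of the procedure above, the following sketch derives the four USPC control limits from the empirical distribution of a sample by the reverse search at the probabilities the normal "2σ" and "3σ" zones would occupy. The function name, bin count and test data are illustrative assumptions, not the chapter's plant data.

```python
import numpy as np

def uspc_limits(x, n_bins=50):
    """Derive the four USPC control limits of one variable from its
    empirical distribution, at the probabilities the normal 2-sigma
    and 3-sigma zones would occupy."""
    x = np.asarray(x, dtype=float)
    counts, edges = np.histogram(x, bins=n_bins)   # N equal-width intervals
    cdf = np.cumsum(counts) / counts.sum()         # empirical distribution

    def limit(p):
        # Right edge of the first interval whose cumulative probability
        # reaches p -- the "reverse search" described in the text.
        idx = int(np.searchsorted(cdf, p))
        return edges[min(idx + 1, n_bins)]

    return {"red_low": limit(0.00135), "green_low": limit(0.02275),
            "green_high": limit(0.97725), "red_high": limit(0.99865)}

# Hypothetical sensor sample: a variable fluctuating around 10 t/h.
rng = np.random.default_rng(0)
lims = uspc_limits(rng.normal(10.0, 2.0, 20000))
# For N(10, 2) the red limits should land near 10 - 3*2 = 4 and 10 + 3*2 = 16.
```

Readings outside the red limits are treated as fault data (or a genuine process abnormality), and the bands between the green and red limits play the role of the yellow warning zones.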

2.2.2. Determining the control limits of multivariate system based on PCA

The weakness of USPC is that the operators are prone to "information overload" when the number of variables is large. Moreover, it is too rough to judge the normality of data merely by monitoring alarms. In practice, the variables may be constrained by certain inherent functional relationships, and violation of these relationships indicates the existence of fault data. Principal Component Analysis (PCA) is one statistical approach to discovering the relationships among the variables. By determining control limits for two indices, Hotelling's T² and the SPE (Squared Prediction Error, Q), the multivariate process can be monitored conveniently.

Figure 3.

Monitoring the Steam Flow Rate of the Steel Making Process

Figure 4.

Monitoring the Flow Rate of Start Steam Boiler

Figure 5.

Monitoring the Steam Flow Rate of CDQ (Coke Dry Quenching)

The proposed approach is to distinguish different stable states of the process by empirical distribution function, and determine the control limits for each stable state.

  1. Differentiate the stable states and group the samples

    Denote the measurement matrix as X ∈ R^(n×m), where n is the number of samples and m is the number of monitored variables. The rows of X are the serial samples in time order.

    Denote X_i = (x_1i, x_2i, ..., x_ni)^T (i = 1, 2, ..., m) as the ith column. The empirical distribution of X_i can be derived as above. When the process has two or more stable states, the distribution bar graph shows multiple peaks.

    Figure 6 shows the flow rate curve and the distribution bar graph of a chemical process over a period of time. The two peaks represent different demands for steam in different states.

    Without loss of generality, a two-peak distribution is discussed. If the control limits were determined without considering the different states, the sensitivity of fault data detection would be too low for the multivariate process. Two normal distributions are fitted to the two peak sections. With their averages and deviations denoted X̄_i,s1, σ_i,s1 and X̄_i,s2, σ_i,s2, the rule for dividing the samples into the different groups (states) is given by equation (7):

    Figure 6.

    The curve and distribution bar graph of flow rate for a chemical process


    By rule (7), the samples are separated into two groups, and in the same way the groups can be grouped further with other variables. It is not advisable, however, to divide the sample into many groups with too few elements each. Assume only two groups of samples are derived, marked X_a ∈ R^(n1×m) and X_b ∈ R^(n2×m) (n1 + n2 ≤ n; some samples may be discarded because they belong to no state region).

  2. Determining the control limits of the different states with PCA

    Through grouping, the sample data become more concentrated. First standardize X_a; denote:


    In it, x1, x2, ..., xm ∈ R^(n1×1). Calculate each column's average and standard deviation:


    The measured data matrix is standardized as:


    Derive the eigenvalues of the covariance matrix C = (1/n1) X̃_a^T X̃_a and orthogonalize the matrix; the PCA model can then be written as:


    In the equations, P ∈ R^(m×A) is the loading matrix. Arranging the eigenvalues of C in descending order, the first A of them are denoted λ_a = (λ1, λ2, ..., λA), and the corresponding unit eigenvectors form the matrix P = (P1, P2, ..., PA). A is the number of selected components, T_a ∈ R^(n1×A) is the score matrix, and E is the error matrix.

    MSPC based on PCA defines two statistical variables: Hotelling's T² and the SPE (Q).

    If the jth measurement vector (a row vector) is denoted X̃_aj and the corresponding score vector T_aj (i.e. the jth row of the matrix T_a), then T²_aj is defined as [7]:


    When the testing level is α, its control limit can be calculated from the F distribution:


    The SPE (also called the statistical variable Q) is defined as:


    The control limit can be calculated by:


    In the equation, θ_i = Σ_{j=A+1..m} λ_j^i (i = 1, 2, 3), h0 = 1 − 2θ1θ3/(3θ2²), and c_α is the threshold value of the normal distribution at the testing level α. When there are no fault data in the sample and the process functions normally, the two statistical variables satisfy the following inequalities:


    The second data matrix X_b can be processed in the same way to derive its control inequalities:

  3. The monitoring results for the experimental data

    Three variables are considered, and some of the data are modified from the original data. Samples no. 167-172 and no. 350-355 show the transition between two states; samples no. 360-365 are modified data that deviate from the normal constraint. Figure 7 shows the curves of the three variables and the scatter diagram in 3-dimensional space.

    Two experiments are designed to verify the advantage of differentiating states in this work. Figure 8 shows the case without differentiating states, while figure 9 shows the differentiated case. As the figures show, differentiating the states and determining the control limits for each state is more sensitive to the transition of states and to fault data.

  4. Locate and isolate the fault data

    With MSPC, the fault data can be located and isolated by contribution diagrams. The contributions of each variable to the SPE and T² are written as (22) and (23), where ξ_i is the ith column of the unit matrix. The fault variable is judged to be the one with the larger contributions to the SPE and T².


Figure 7.

The changing curves and scatter diagram of three variables

Figure 8.

The result of monitoring without differentiating states

Figure 9.

The result of monitoring with differentiated states
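The T² and SPE limits described in this subsection can be sketched as follows. The data-generating model and all numerical values are hypothetical, and scipy is assumed available for the F and normal quantiles; this is an illustration of the standard formulas, not the chapter's plant model.

```python
import numpy as np
from scipy import stats

def pca_limits(X, A, alpha=0.05):
    """Build a PCA monitoring model from a normal-operation sample
    X (n x m) keeping A components; return a monitor function plus
    the T^2 and SPE control limits at testing level alpha."""
    n, m = X.shape
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mu) / sd                          # standardized data
    C = Xs.T @ Xs / (n - 1)                     # covariance matrix
    lam, P = np.linalg.eigh(C)
    lam, P = lam[::-1], P[:, ::-1]              # descending eigenvalues
    # T^2 control limit from the F distribution.
    t2_lim = A * (n - 1) * (n + 1) / (n * (n - A)) * stats.f.ppf(1 - alpha, A, n - A)
    # SPE control limit from the residual eigenvalues (theta_1..theta_3, h0).
    th = [np.sum(lam[A:] ** i) for i in (1, 2, 3)]
    h0 = 1 - 2 * th[0] * th[2] / (3 * th[1] ** 2)
    ca = stats.norm.ppf(1 - alpha)
    spe_lim = th[0] * (ca * np.sqrt(2 * th[1] * h0 ** 2) / th[0]
                       + 1 + th[1] * h0 * (h0 - 1) / th[0] ** 2) ** (1 / h0)

    def monitor(x):
        """Return (T^2, SPE) for one measurement vector x."""
        xs = (x - mu) / sd
        t = xs @ P[:, :A]                       # score vector
        t2 = np.sum(t ** 2 / lam[:A])           # Hotelling's T^2
        spe = np.sum((xs - t @ P[:, :A].T) ** 2)
        return t2, spe

    return monitor, t2_lim, spe_lim

# Hypothetical plant data: 3 correlated variables driven by 2 factors.
rng = np.random.default_rng(1)
scores = rng.normal(size=(500, 2))
W = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.7]])
X = scores @ W + 0.1 * rng.normal(size=(500, 3))

monitor, t2_lim, spe_lim = pca_limits(X, A=2)
t2, spe = monitor(X[0] + np.array([10.0, 0.0, 0.0]))  # planted fault
```

The planted fault breaks the correlation among the variables, so its SPE rises far above the limit even though each individual variable might still pass its univariate check.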

2.3. The steam pipe network modeling and calculating the fault data based on the model

After the fault data are detected, the next step is to estimate the true values of the variables. As experience shows, the measured temperature and pressure values are usually accurate, so the objective here is to calculate the flow rate values from the validated temperatures and pressures according to the steam network model.

2.3.1. The static model of steam pipe network

To deduce the model, the real conditions are simplified as follows: 1) the steam flow in the pipes is one-dimensional and axial; 2) the network is composed of nodes and branches (pipes); 3) there is no condensate or secondary steam, and their effects are neglected.

  1. The hydraulic and thermal models of a single pipe.

    • The hydraulic model of a single pipe

      By the law of momentum conservation the hydraulic model can be written as [8]:


      Usually, there’s little difference between the products of compressibility factors and temperatures in the same pipe. Considering the elbows, reducer extenders and other friction factors, the equivalent coefficient η is added to the equation. Equation (24) changes to (25).




      In accordance with the Altshul equation, the frictional factor λ is:


      After converting the units of D and q to mm and t/h (tons per hour), the Reynolds number is:


      In the equation, u is the characteristic velocity of the steam in the pipe, m/s, and μ is the mean dynamic viscosity coefficient, determined by equation (33) [9].

      Denote ρ1, ρ2 as the input and output steam densities. ρ1, ρ2, μ and ρm can be calculated with the following equations (31), (32), (33). The symbols not defined here are given in reference [9].

    • The thermal model of a single pipe

      By the law of energy conservation, the static thermal model can be written as follows:


      The definitions and units of T1, T2, L and q are the same as in equation (24). (Note: equation (34) is rewritten as (35) so that the thermal model has the same pattern as equation (26).)


      In terms of the IF-97 formulation, the symbols not defined here are given in reference [9].


      The heat loss per unit pipe length can be calculated using equation (38).


      In equation (38), a_w is determined by:

  2. Steam pipe network synthetic model

    • Incidence matrix of pipe network

      Draw the pipe network map, number the nodes in the order of steam sources, users and three-way nodes, and number the pipes in the order of leaf pipes and branch pipes. Suppose there are m1 intermediate nodes and m2 outer nodes, so that there are m = m1 + m2 nodes in total and p = m1 pipes. The element of the ith row and jth column of A ∈ R^(m×p), a_ij, is defined as:

    • The flow rate balance equation of the pipe network

      Denote:
      P = (P1², P2², ..., Pm²)^T, with Pi² (i = 1, ..., m) the square of the ith node's pressure, MPa²;
      T = (T1, T2, ..., Tm)^T, with Ti (i = 1, ..., m) the ith node's temperature;
      q = (q1, q2, ..., qp)^T, with qj (j = 1, ..., p) the mass flow rate of the jth pipe;
      Q = (Q1, Q2, ..., Qm)^T, with Qi (i = 1, ..., m) the external flow rate of the ith node;
      C_P* = diag(C_P1, C_P2, ..., C_Pp), with C_Pj (j = 1, ..., p) defined by equation (27);
      C_T* = diag(C_T1, C_T2, ..., C_Tp), with C_Tj (j = 1, ..., p) defined by equation (36).

      By the law of mass conservation, the total flow rate at each node should be 0, that is:


      According to the hydraulic and thermal equations, for the whole pipe network:


      Substitute (42) into (41):


      Equations (26), (27), (35), (36), (42), (43), (44) comprise the static model.
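The nodal balance (41) can be checked numerically. Since the layout of figure 10 is not reproduced in this text, the following sketch uses a hypothetical 4-node, 3-pipe tree instead, and the sign convention of the incidence matrix (+1 when a pipe enters a node, −1 when it leaves) is an assumption consistent with the definitions above.

```python
import numpy as np

# Hypothetical tree: source node 1 feeds intermediate node 4, which
# splits toward user nodes 2 and 3.
A = np.array([
    [-1,  0,  0],   # node 1 (source): pipe 1 leaves it
    [ 0,  1,  0],   # node 2 (user): pipe 2 enters it
    [ 0,  0,  1],   # node 3 (user): pipe 3 enters it
    [ 1, -1, -1],   # node 4 (intermediate): pipe 1 in, pipes 2 and 3 out
], dtype=float)
q = np.array([5.0, 2.0, 3.0])              # pipe mass flow rates, t/h
Q = np.array([5.0, -2.0, -3.0, 0.0])       # node injections: + produced, - consumed

residual = A @ q + Q                       # equation (41): A q + Q = 0
assert np.allclose(residual, 0.0)          # every node balances
```

The same residual vector ‖Aq + Q‖ is what the searching algorithm of the next subsection drives below its tolerance ξ.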

2.3.2. Hydraulic and thermal calculation based on searching

Industrial networks are equipped with temperature, pressure and flow meters at all nodes except the intermediate ones. The proposed algorithm calculates the flow rate of each pipe from the given conditions; the flow meter readings are used to evaluate the algorithm.

Figure 10.

Layout of steam pipe network

Take figure 10 as an example; the indices of the nodes and pipes are shown in the figure:

  1. Determine the elements of the matrices A and Q as defined above.


    As previously defined for equations (41) and (42), P5, P6, T5, T6 and q are unknown. From equation (27) it is easy to see that the friction factor and density depend on P5, P6, T5 and T6, so equations (42) and (43) can be applied for hydraulic iterative calculation only by presetting q. In the thermal calculation, however, only c_p depends on P5, P6, T5 and T6; after changing equation (34) into (35), q can be calculated directly.


    In equation (47), i is the pipe index and ΔT_i the temperature difference between input and output.

  2. Thermal calculation

    It can be inferred from the flow direction of the steam that:




    Set the searching start values for T5 and T6 as T1 and max(T3, T4) respectively, set the searching step size h, and begin searching T5, T6 within range (49). For each step, calculate c_p (equation (37)) and the flow rate of each pipe (equations (49), (50)), and test whether ‖Aq + Q‖ ≤ ξ1 is satisfied. If it is not satisfied, update T5, T6 by one step and continue the calculation; otherwise retain the temperature values and the corresponding flow rates.

  3. Hydraulic calculation

    As in the thermal calculation, first take the intermediate node temperatures found in the previous step, and set the start point and step size for the two intermediate pressures. Then search the region determined by inequality (48), calculating the density (equation (32)), frictional factor ((28), (29), (33)) and flow rate ((26), (42)) of each pipe at each step. When calculating the flow rates, presetting the initial values to the result of step 2 reduces the iterative calculation time. Test whether ‖Aq + Q‖ ≤ ξ2 is satisfied; if not, update P5, P6 by one step and continue; otherwise retain the pressure values and the corresponding flow rates.

  4. Thermal model Verification

    Substitute the intermediate nodes' temperatures (determined by step 2) and pressures and flow rates (determined by step 3) into the thermal model to verify the inequality ‖A C_T* A^T T + Q‖ ≤ ξ3. If it is satisfied, the calculation ends with the result of step 3. If not, take the intermediate pressures of step 3 as given, reduce the step size, and return to step 2 for another calculation cycle.

    In the algorithm, the values ξ1, ξ2, ξ3 can be set between 0.05 and 0.3, depending on the error requirements and calculation stability. The calculation flow chart is depicted in figure 11.
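To make the search concrete, the following toy sketch performs the thermal step for a single intermediate node fed by one pipe and draining into two. The constant c_p, the per-pipe heat losses Phi_i, the temperatures and the function names are all invented for illustration; they do not come from the chapter's tables.

```python
import numpy as np

CP = 2.2                                   # kJ/(kg*K), assumed constant
T1, T2, T3 = 300.0, 280.0, 285.0           # measured end temperatures
PHI = (44.0, 26.4, 8.8)                    # assumed heat loss of each pipe, kW

def pipe_flow(phi, t_in, t_out):
    """Equation (47)-style flow: q_i = Phi_i / (c_p * dT_i)."""
    return phi / (CP * (t_in - t_out))

def search_t4(step=0.05, xi=0.05):
    """Grid-search the intermediate temperature T4 over the admissible
    range (max(T2, T3), T1) until the node imbalance |q1 - q2 - q3|
    falls within the tolerance xi, as in the searching scheme above."""
    best = None
    for t4 in np.arange(max(T2, T3) + step, T1, step):
        q1 = pipe_flow(PHI[0], T1, t4)     # flow into the node
        q2 = pipe_flow(PHI[1], t4, T2)     # flows out of the node
        q3 = pipe_flow(PHI[2], t4, T3)
        imbalance = abs(q1 - q2 - q3)
        if best is None or imbalance < best[1]:
            best = (t4, imbalance, (q1, q2, q3))
        if imbalance <= xi:                # tolerance reached: stop searching
            break
    return best

t4, imbalance, flows = search_t4()
# The toy data were constructed so that the balance closes near T4 = 290.
```

The full algorithm nests an analogous pressure search (step 3) and the thermal verification (step 4) around this loop, shrinking the step size between cycles.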

2.3.3. Comparisons of calculation results and measurements

The specifications of each pipe are shown in table 1. The parameters in the model and algorithm are initially set as: η = 0.2, Δ = 0.2, β = 0.15, ε = 0.035, ξ1 = ξ2 = ξ3 = 0.3.

Substituting the specification data of table 1 and the temperature and pressure data of table 2 into the algorithm yields the calculated flow rate values. The comparison results are listed in table 2; the largest relative difference is less than 6%. The neglected factors in the model, the parameter errors and the measurement errors all contribute to the difference.

[Table 1 lists, for each pipe: Pipe No., Length (m), Inner Diameter, Outer Diameter, and Heat Insulation Layer Thickness (mm).]

Table 1.

The Specifications of the Pipes

[Table 2 lists, for each outer node: Pressure (MPa), Temperature, Measured q (t/h), Calculated q (t/h), and the relative difference.]

Table 2.

The Comparisons of the Measured Data and the Calculated Data.

Figure 11.

The flow chart of the flow rate calculation

The results demonstrate the validity of the model and the effectiveness of the algorithm; the proposed model and algorithm can be applied to simulate the running of a static steam pipe network and to reconstruct the discovered fault mass flow rate data.

For larger steam networks with more than 3 intermediate nodes, it is difficult to apply the algorithm directly. However, the pipe network can be divided into several smaller networks at the nodes with known temperature and pressure.


3. Gross errors detection

3.1. Problem definition

A section of the steam network, named "S2", of an iron and steel plant in China is shown in figure 12. In the figure, N1-N7 represent different production processes, the arrows point in the direction of steam flow, and the variables Xi or Xij represent the real steam mass flows. The electric valves are remotely controlled by the operators. Many industrial steam systems are similar in structure to this one, but much larger in scale.

Figure 12.

A section diagram of steam network

If all of the variables above are measured, and supposing that the electric valves are fixed at certain positions and that pipeline leakage and the amount of condensate water can be neglected, the constraint equations can be written on the basis of mass balance:


The overall balance equation can also be written out as:


If we denote

X = (X1, X2, ..., X18)^T

as the vector of mass flow rates in the steam network, (51) and (52) can be abbreviated as:


In (54), A is the incidence matrix, composed of the elements 1, −1 and 0.

If Y represents the vector of measured flow rates and X is the vector of true flow rates, then:


In the equation, W is the vector of measurement gross errors (or systematic errors), and ε is the vector of random measurement errors, each element being normally distributed with zero mean and known covariance matrix Q. The approaches for detecting and removing the gross errors from the measurements are discussed in this section.

When there is no gross error in the measurement vector, finding a set of adjustments to the measured flow rates so that equation (54) is satisfied is the problem of data reconciliation. Denoting the adjustment vector as a and the adjusted flow rate vector as X̂, we get:


Applying the least-squares method, the problem can be stated as a constrained least-squares problem:

min a^T Q⁻¹ a   subject to   A X̂ = 0

Q is the weighting matrix, usually selected as:

Q = diag(σ1², σ2², ..., σ18²)

where σi (1 ≤ i ≤ 18) is the deviation of the measured variable Xi.

The solution X̂* to the problem can be obtained by Lagrange multipliers [10]:

X̂* = Y − Q A^T (A Q A^T)⁻¹ A Y

The vector of residuals e is then:

e = Y − X̂* = Q A^T (A Q A^T)⁻¹ A Y
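The reconciliation solution can be sketched on a toy single splitter node (stream 1 in, streams 2 and 3 out). The measured flows and variances are illustrative, not plant data.

```python
import numpy as np

A = np.array([[1.0, -1.0, -1.0]])          # node balance: x1 - x2 - x3 = 0
Y = np.array([10.3, 6.1, 3.9])             # measured flow rates, t/h
Q = np.diag([0.2 ** 2, 0.15 ** 2, 0.1 ** 2])   # measurement variances

def reconcile(A, Y, Q):
    """Lagrange-multiplier solution of min a^T Q^-1 a s.t. A X_hat = 0:
    X_hat = Y - Q A^T (A Q A^T)^-1 A Y."""
    K = Q @ A.T @ np.linalg.inv(A @ Q @ A.T)
    X_hat = Y - K @ (A @ Y)
    return X_hat, Y - X_hat                # reconciled flows, residuals e

X_hat, e = reconcile(A, Y, Q)
assert np.allclose(A @ X_hat, 0.0)         # the adjusted flows balance exactly
```

The least accurate meter (the largest σi) absorbs the largest share of the adjustment, which is exactly the role of the weighting matrix Q.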
By gross error detection, the systematic errors in the measurements can be removed, and by data reconciliation the random errors can be reduced. However, if the gross errors are not removed first, the gross errors in some variables will propagate to other, accurately measured variables during data reconciliation, so gross error detection should be carried out in advance. There are two basic types of gross error detection methods, based on the measurement test (MT) and on the nodal imbalance test (NT).

3.2. Gross error detection based on test of residuals

Detection based on the test of residuals belongs to the statistical tests and has been described and evaluated [11]. The measurement test [12] is the basic algorithm:

  1. Apply the least-squares routine, using equations (58) and (59), to compute X̂* and e.

  2. Compute the test variable for each pipe (or stream):

    zj = ej / √(Vjj)

    In the equation, V is the covariance matrix of the residual vector:

    V = Q A^T (A Q A^T)⁻¹ A Q

    On the hypothesis that the measured value of the jth stream contains no gross error, zj follows the standard normal distribution.

  3. Compare zj with a critical test value zc. If |zj| > zc, denote stream j as a bad stream. The recommended choice [12] is zc = z(1−β/2), the 1−β/2 point of the standard normal distribution, where:

    β = 1 − (1 − α)^(1/n)

    Here n is the number of measurements tested (currently n = 18), α (recommended value 0.05) is the overall probability of a type I error for all the tests, and β is the probability of a type I error for each individual test. Denote by S the set of bad streams found by the above procedure; the measurements yj, j ∈ S, are considered to contain gross errors.

  4. If S is empty, proceed to step 7. Otherwise, remove the streams contained in S and aggregate the nodes connected to them. This yields a system of lower dimension with compressed incidence matrix A′, measurement vector Y′ and weighting matrix Q′. Denote by T the set of streams with measurement data in Y′.

  5. Replace A, Y and Q with A′, Y′ and Q′ respectively, and compute the least-squares estimates in T by applying equation (58).

  6. Solve equation (54) to compute the rectified values of the streams in S by substituting the estimates computed in step 5 for the data of the streams in T. The original measured data are used for the streams in the set R = U − (S ∪ T), where U is the set of all streams.

  7. The vector comprising the results of steps 5 and 6 and the original measured data for the streams in R is the rectified measurement vector. If S is empty, then Ŷ = X̂* and the rectification of the measured data is already completed in step 1.

Notations to the algorithm:

  1. In step 4 it is possible, in some cases, for good streams to be removed into S.

  2. Equation (62) and the critical value provide a conservative test, since the residuals are generally not independent; it is not always applied [13].

  3. The different significance levels will induce different results and effects.

  4. Because MT tends to spread the gross errors over all the measurements and can produce unreasonable (negative or absurd) reconciled data, the Iterative Measurement Test (IMT) and Modified IMT (MIMT) methods were proposed; their main frames, however, are the same as MT.
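Steps 1-3 of MT can be sketched on a hypothetical two-node series network (stream 1 → node A → stream 2 → node B → stream 3) with a +2 t/h gross error planted in stream 2; all numbers are invented for illustration.

```python
import numpy as np
from scipy import stats

A = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0, -1.0]])          # node balances of the series network
Y = np.array([10.1, 12.0, 9.9])            # stream 2 carries the gross error
Q = np.diag([0.15 ** 2] * 3)               # measurement covariance

S_inv = np.linalg.inv(A @ Q @ A.T)
e = Q @ A.T @ S_inv @ (A @ Y)              # residuals of the reconciliation
V = Q @ A.T @ S_inv @ A @ Q                # covariance of the residual vector
z = e / np.sqrt(np.diag(V))                # the test statistic z_j

n, alpha = len(Y), 0.05
beta = 1.0 - (1.0 - alpha) ** (1.0 / n)    # per-test level, equation (62)
zc = stats.norm.ppf(1.0 - beta / 2.0)      # critical value z_{1-beta/2}
bad = np.flatnonzero(np.abs(z) > zc)       # the candidate bad-stream set S
```

In this example all three |zj| exceed zc, illustrating the smearing tendency noted in point 4 above; the largest residual statistic still points at stream 2, which is the stream the iterative variants would remove first.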

3.3. Gross error detection based on nodal imbalance test

The algorithms for gross error detection based on the statistical test of nodal imbalances mainly follow the work of literature [14]: the nodal imbalance test is applied to each node and to aggregated nodes (i.e. pseudonodes) to locate and remove the gross errors. The basic algorithm, named the Method of Pseudonodes (MP), is as follows:

  1. Compute the nodal imbalance vector r and the statistical test vector z:

    r = A Y,  zi = ri / √((A Q A^T)ii)

    On the assumption that no systematic error exists, the variables zi so defined follow the standard normal distribution.
  2. Compare each zi with a critical test value zc = z(1−α/2), corresponding to the point of the significance testing level 1 − α/2. For instance, for a test at the 95% significance level, α = 0.05 and zc = 1.96. If |zi| ≤ zc is satisfied, the ith node is regarded as a good node, and all streams connected to node i are denoted good streams.

  3. If no bad nodes are detected in step 2, proceed to step 5. Otherwise, repeat steps 1 and 2 (changing the matrices and vectors accordingly) for pseudonodes containing 2, 3, ..., m nodes.

  4. Denote by S the set of all streams not marked good in the previous steps. The measurements yi, i ∈ S, are considered to contain gross errors.

  5. Steps 5-8: the procedure is the same as steps 4-7 of the MT algorithm.

Notations to the algorithm:

  1. The principal assumption is that the errors in two or more measurements do not cancel.

  2. In step 3, m is chosen according to its effect on locating the gross errors; if increasing m brings no improvement, the step can be stopped.

  3. By applying graph-theoretical rules [14], some streams can be determined to be badly measured streams. This additional identification may be useful in the MP procedure.

  4. Equation (62) is not applied in the algorithm to control type I error. The probability of a type I error in the nodal imbalance test is not necessarily equal to the probability of rejecting a good measurement in MT [15].

  5. The set S obtained in step 4 may be empty even though one or more nodes are truly bad. This usually happens when there is a leak in the network or when measurement errors cancel each other. To solve this problem, the Modified MP (MMP) and the combined MT-NT methods [16] have been proposed.

As the above two methods show, gross error detection and data reconciliation are inherently combined: gross error detection proceeds with the help of the least-squares routine, which actually yields the optimal estimates.
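Steps 1 and 2 of the MP algorithm can be sketched in code. This is a minimal sketch, not the authors' implementation: it assumes A is the node-stream incidence matrix, y the vector of measured flow rates and Q the covariance matrix of the measurement errors, so that r = Ay and z_i = r_i / sqrt((AQAᵀ)_ii); all variable names are illustrative.

```python
import numpy as np

def nodal_imbalance_test(A, y, Q, zc=1.96):
    """Flag suspect nodes by the nodal imbalance test.

    A  : (nodes x streams) incidence matrix (+1 inflow, -1 outflow)
    y  : measured mass flow rates
    Q  : covariance matrix of the measurement errors
    zc : critical value, e.g. 1.96 for alpha = 0.05
    Returns a boolean array, True where |z_i| > zc (suspect node).
    """
    r = A @ y                        # nodal imbalance vector
    V = A @ Q @ A.T                  # covariance of r
    z = r / np.sqrt(np.diag(V))      # standardized test statistics
    return np.abs(z) > zc
```

For step 3, rows of A belonging to the aggregated nodes would be summed to form the pseudonode rows before rerunning the same test.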


4. Data reconciliation

4.1. The basic problems of data reconciliation

As mentioned in Section 3, the data reconciliation problem is expressed as the solution of a constrained least-squares problem. However, the assumptions, the constraint equations and the weighting parameter matrix have to be discussed.

4.1.1. On the assumptions of data reconciliation in the present application

For the application to the steam network, there are four latent assumptions:

  1. The process is at the steady state or approximately steady state.

    Suppose that the electric valves are fixed at a certain position and that the mass flow rate in each stream has been close to a constant for a period of time. Then the constraint equation (54) can be written out on the basis of the mass balance for all the nodes (including real nodes and pseudonodes), and the solution of the problem is in accordance with the actual state.

  2. The measurement data are serially uncorrelated. This assumption makes it easier to estimate the deviations of the measurements. Although this is usually not the real case, the serial data can be preprocessed to reduce the conflict [17].

  3. The gross errors have been detected and removed. If a gross error remains in the measurement data, the data reconciliation procedure will propagate it over the other measurements.

  4. The constraint equations are linear. Equation (54) is linear in form; however, the true constraint equations also involve many other environmental variables and are obviously not linear. So this assumption is only approximately satisfied.

    Only when these four assumptions are nearly satisfied can the solution of the problem (equation (58)) be close to the true values.

4.1.2. On the constraint equation

Note that equations (51) and (52) are based on the assumptions of steady state and linear constraints, with no pipeline leakage or condensate water loss. However, pipeline leakage and condensate water cannot be avoided in practice, so equations (51) and (52) should be written as:




The additional term represents the vector of condensate water and leakage loss amounts in each constraint equation. Each of its elements depends on the environmental temperature τ, the pipe diameter D, the pipe length l, the steam temperature T, the pressure P and the flow rates of the main pipes.

It is difficult to set up exact mathematical models of these loss amounts. However, they can be approximated by a first-order Taylor expansion at the point (τ0, Di0, li0, Ti0, Pi0, Xi0):


The constants in equation (69) can be determined by multiple linear regression over a certain number of history data. The constraint equation then changes into:


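The regression step just described, fitting the constants of the linearized loss model from history data, might be sketched as follows. The data layout and names are illustrative assumptions, not the chapter's implementation: each history sample is a row of the six influencing variables (τ, D, l, T, P, X) together with an observed loss amount.

```python
import numpy as np

def fit_loss_constants(H, e):
    """Fit the first-order loss model e ~ k0 + k . (tau, D, l, T, P, X)
    by multiple linear regression over history data.

    H : (N, 6) history matrix, one row per sample of the six variables
    e : (N,) observed loss amounts
    Returns the intercept k0 and the coefficient vector k.
    """
    X = np.hstack([np.ones((H.shape[0], 1)), H])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(X, e, rcond=None)   # ordinary least squares
    return coef[0], coef[1:]
```

With the constants fitted, the loss term becomes an affine function of the measured variables, so the constraint set stays linear and the reconciliation problem below keeps its closed-form solution.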
The solution to the least-squares problem of reconciliation is:


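In code, the standard closed-form solution of this linearly constrained least-squares problem, x̂ = y - QAᵀ(AQAᵀ)⁻¹(Ay - c), can be sketched as below. This is a generic sketch under the chapter's linear-constraint assumption; A, c and the variable names are illustrative rather than tied to the chapter's equation numbering.

```python
import numpy as np

def reconcile(y, A, Q, c=None):
    """Least-squares data reconciliation:
    minimize (y - x)^T Q^{-1} (y - x)  subject to  A x = c.
    Closed form: x_hat = y - Q A^T (A Q A^T)^{-1} (A y - c).
    """
    if c is None:
        c = np.zeros(A.shape[0])        # plain mass balance: A x = 0
    V = A @ Q @ A.T
    x_hat = y - Q @ A.T @ np.linalg.solve(V, A @ y - c)
    return x_hat
```

The reconciled estimate satisfies the balance constraints exactly while staying as close to the measurements as the weighting Q allows.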
4.1.3. On selection of the weighted parameter matrix Q

The selection of Q directly influences the result of the data reconciliation. In theory, Q is recommended to be chosen as in equation (57). However, the deviations are usually unknown, or may drift as the instruments age. The deviation of each measurement can be estimated by the standard deviation of the sample data:


where X_ij is the jth sample of the variable X_i, and X̄_i is the average of X_i.

If a small number of high-precision instruments are applied and the corresponding elements in the matrix Q are given smaller values, the quality of the reconciled data will be greatly improved. Though some literature [18] [19] recommends methods to determine or adjust the matrix Q, no generally applicable method or theory is available.
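Estimating the diagonal of Q from N history samples, as described above, can be sketched as follows; the sample layout (one row per time instant, one column per measured variable) is an assumption for illustration.

```python
import numpy as np

def estimate_Q(samples):
    """Estimate the diagonal weighting matrix Q from history samples.

    samples : (N, n_vars) array; row j holds the jth sample of all variables.
    Q[i, i] is set to the unbiased sample variance of variable i.
    """
    var = np.var(samples, axis=0, ddof=1)   # ddof=1: divide by N - 1
    return np.diag(var)
```

High-precision instruments then show up automatically as small diagonal entries, so their measurements are adjusted least by the reconciliation.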

4.1.4. Simulation results

The simulation results of data reconciliation for the present application are shown in Table 3. The measured data are generated according to equation (73):


The standard deviations are set to 0.5 percent of the true values. In this simulation, the pipe network loss is not considered. As shown in Table 3, most of the rectified data are much closer to the true values. The simulation verifies the effectiveness of data reconciliation.
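For illustration, measured data of this kind (true value plus zero-mean Gaussian noise with σ_i equal to 0.5 percent of the true value, which is one plausible reading of equation (73)) could be generated as follows; the true values below are arbitrary placeholders, not the values of Table 3.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([100.0, 200.0, 300.0])   # hypothetical true flow rates
sigma = 0.005 * x_true                     # 0.5 percent of the true values
y_meas = x_true + sigma * rng.standard_normal(x_true.size)
```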

true value       5       5       50      25      25      60      30      30      30
measured value   5.347   5.159   54.751  25.086  26.097  62.289  32.297  32.386  30.561
rectified value  5.347   5.150   52.228  25.090  25.514  62.178  31.946  31.730  31.174
true value       15      15      40      20      20      20      25      25      30
measured value   15.735  15.668  42.585  21.419  21.509  20.552  26.699  26.638  32.488
rectified value  15.582  15.592  42.306  21.340  21.566  20.946  26.086  26.236  31.375

Table 3.

The Result of Data Rectification


5. Conclusion

In this chapter, three data processing approaches to improve data quality are demonstrated. The steam system can only be properly controlled when the data obtained from the EMS are accurate and reliable, yet the data may be influenced by many external factors. The approaches proposed in this chapter are designed to detect the fault data, locate and remove the gross errors, and reduce the random errors.

Four main reasons induce the low accuracy of the mass flow rate measurements. For single-variable monitoring, combining the "3σ" principle with the empirical distribution function is proposed to determine the control limit, and PCA is applied to determine the control limits for multivariate processes. With these limits, most fault data can be identified easily. For the fault data of flow rates, an approach is proposed to set up the mathematical model of the steam network and calculate the flow rates. The simulation and experimental results show the effectiveness of these approaches.

Two approaches to detect the gross errors, MT and MP, are demonstrated. Both proceed by selecting statistical variables that follow the standard normal distribution and applying hypothesis tests. Some notes on the two algorithms are given.

The constrained least-squares problem for the present application is discussed. The four assumptions are approximately satisfied when the steam network functions normally and the state is nearly static. The pipe network losses can be added to the constraint equations for more accurate results. The weighting parameter matrix influences the results of data reconciliation. Estimating the deviations of the instruments online and applying several high-precision instruments will improve the quality of the reconciled data.



Acknowledgements

This work is supported by the Knowledge Innovation Project of the Chinese Academy of Sciences (No. KGCX2-EW-104-3), the National Natural Science Foundation of China (No. 61064013) and the Natural Science Foundation of Jiangxi Province, China (No. 20114BAB201024).


References

  1. Tong T. The Analysis on the Flow Rate Measurement of Steam. Journal of Hebei Institute of Architectural Engineering, 2002: 43-45.
  2. Zhenjian D. On the Methods to Reduce the Steam Measurement Difference along the Transmission Line and the Existing Problems. China Metrology, 2003: 67-68.
  3. Zhang J. Y., X. H. Multivariate Statistical Process Control. Peking: Chemical Industry Press (CIP), 2002.
  4. Chand S. T. S. Estimating the Limits for Statistical Process Control Charts: A Direct Method Improving upon the Bootstrap. European Journal of Operational Research, 2007: 472-481.
  5. Kuo H. C., W. L. Comparisons of the Symmetric and Asymmetric Control Limits for X̄ and R Charts. Computers & Industrial Engineering, 2010: 903-910.
  6. Chen T. On Reducing False Alarms in Multivariate Statistical Process Control. Chemical Engineering Research and Design, 2010: 430-436.
  7. Zhou Donghua, Li Gang, eds. Data Driven Fault Diagnosis Technology for Industrial Process - Based on Principal Component Analysis and Partial Least Squares. Peking: Science Press, 2011.
  8. Zhang Zenggang. Research on Coupling Hydraulic & Thermal Calculation for Steam Pipe Network Theory & its Application (D). China University of Petroleum, 2008.
  9. Wagner W., A. K. Properties of Water and Steam. Springer-Verlag Berlin Heidelberg, 1998.
  10. Kuhn D. R., Davidson H. Computer Control II: Mathematics of Control. Chem. Eng. Prog., 1961: 57.
  11. Iordache C., Mah R. S. H. Performance Studies of the Measurement Test for Detection of Gross Errors in Process Data. AIChE Journal, 1985: 187.
  12. Mah R. S. H., Tamhane A. C. Detection of Gross Errors in Process Data. AIChE Journal, 1982: 828.
  13. Crowe C. M., Garcia Y. A. Reconciliation of Process Flow Rates by Matrix Projection. AIChE Journal, 1983: 881.
  14. Mah R. S., Stanley G. M., Downing D. Reconciliation and Rectification of Process Flow and Inventory Data. I&EC Proc. Des. Dev., 1976: 197.
  15. Serth R. W., Heenan W. A. Gross Error Detection and Data Reconciliation in Steam-metering Systems. AIChE Journal, 1986: 733-742.
  16. Mei C., Su H., Chu J. An NT-MT Combined Method for Gross Error Detection and Data Reconciliation. Chinese Journal of Chemical Engineering, 2006: 592-596.
  17. Kao C. S. et al. A General Prewhitening Procedure for Process Measurement Noises. Chem. Eng. Prog., 1992: 49.
  18. Wu Shengxi, Z. et al. Estimation of Measurement Error Variances/Covariance in Data Reconciliation. In: The 7th World Congress on Intelligent Control and Automation, Chongqing, China, 2008: 714-718.
  19. Narasimhan S., Shah S. L. Model Identification and Error Covariance Matrix Estimation from Noisy Data Using PCA. Control Engineering Practice, 2008: 146-155.
