Results of calculations for six symptoms; γ is dimensionless, and entropy decrease rate and representativeness factor are given in arbitrary units.
For complex objects, condition assessment is usually based on indirect symptoms related to residual processes such as vibration, noise, or heat generation. The number of available symptoms is often large, and it is necessary to select those that are most representative (i.e., most sensitive to condition parameters). Such selection may be based on singular value decomposition (SVD). An alternative approach is proposed that employs information content measures. In order to obtain a reliable condition assessment and a prognosis of its evolution (in particular, remaining useful life estimation), certain preprocessing of experimental data is necessary. This involves, among other things, life cycle normalization and the identification and removal of outliers. Suitable procedures are proposed and discussed. An example is presented for vibration-based symptoms of steam turbine technical condition.
- diagnostic symptom
- technical condition
- information content
Terms like condition assessment (which is basically equivalent to diagnosis) and prognosis are commonly used in technical sciences and have been defined in several ways. For any given class of diagnostic objects, there is a logical sequence of activities that may be summed up in four consecutive stages [1, 2]:
- Measurement (acquisition of data that contain information on object condition)
- Qualitative diagnosis (or recognition, i.e., identification and localization of failures and malfunctions)
- Quantitative diagnosis (estimation of damage advancement)
- Prognosis (forecast for object operation in future)
In structural health monitoring and condition-based maintenance, the third and fourth steps are of particular importance. Quantitative diagnosis is in fact an estimation of the current object condition. Once this has been accomplished, a prognosis may follow, which basically means remaining useful life (RUL) estimation on the basis of certain criteria. This is extremely important for proper and safe operation and cost-effective maintenance of complex and critical machinery.
Evolution of object condition may be described in terms of the hazard function [3, 4], which typically takes the form of the bathtub curve (Figure 1). Initially the hazard function decreases with time; this may be interpreted as “running-in.” During the normal operation period, the hazard function increases so weakly that it may be treated as constant. Finally, during the last stage of the object service life, the hazard function increases with time, in theory to infinity and in practice until the highest acceptable value is attained. For a wide range of objects, reliability is well described by the three-parameter Weibull distribution. In such a case, the hazard function in its classic form is given by [5, 6]
$$\lambda(\theta) = \frac{\beta}{\eta}\left(\frac{\theta - \gamma}{\eta}\right)^{\beta - 1} \tag{1}$$
where θ denotes time and β, η, and γ are parameters; γ is the location parameter (set to zero if θ = 0 corresponds to the beginning of object life—in such case, two-parameter distribution is obtained); η denotes characteristic life; and β is the shape factor. Cases β < 1, β = 1, and β > 1 correspond to three consecutive periods shown in Figure 1.
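The three periods of the bathtub curve can be illustrated with a minimal sketch, assuming the standard three-parameter Weibull hazard (β/η)·((θ − γ)/η)^(β − 1). The function names, the characteristic life of 1000 time units, and the sample times are hypothetical and serve only as an illustration.

```python
def weibull_hazard(theta, beta, eta, gamma=0.0):
    """Hazard function of the three-parameter Weibull distribution.

    theta: time, beta: shape factor, eta: characteristic life,
    gamma: location parameter (0 gives the two-parameter case).
    Defined here only for theta > gamma.
    """
    if theta <= gamma:
        raise ValueError("theta must exceed the location parameter gamma")
    return (beta / eta) * ((theta - gamma) / eta) ** (beta - 1.0)


def trend(beta, eta=1000.0, times=(100.0, 500.0, 900.0)):
    """Classify the hazard as decreasing, constant or increasing on `times`."""
    values = [weibull_hazard(t, beta, eta) for t in times]
    if all(a > b for a, b in zip(values, values[1:])):
        return "decreasing"
    if all(a < b for a, b in zip(values, values[1:])):
        return "increasing"
    return "constant"
```

With these assumptions, a shape factor below 1 yields a decreasing hazard (running-in), a shape factor of exactly 1 a constant one, and a shape factor above 1 an increasing one.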
For many objects it is impracticable or inconvenient to describe condition evolution in terms of the hazard function (or failure density). An alternative approach is based on the analysis of energy transformation and dissipation mechanisms, which leads to the energy processor model [1, 7]. This model implies that object condition is estimated in an indirect manner, from measurable physical quantities referred to as diagnostic symptoms. Each symptom is related to the power V(θ) of residual processes that accompany the principal process of energy transformation. In the simplest case, the ith symptom Si(θ) is given by
where V0 = V(θ = 0), Φ is the symptom operator, and θb denotes time to breakdown. A detailed description can be found in the literature; several modifications have been proposed [1, 8], but the basic principles have remained unchanged. The Si(θ) given by Eq. (2), referred to as the symptom life curve, is a monotonically increasing function with a vertical asymptote at θ = θb. As for the symptom operator, Weibull and Fréchet functions have been shown to give consistent results; they yield Si(θ) in the forms of
for the former and
for the latter; in both cases, Si0 = Si(θ = 0) and γ is the shape factor. For a given object, if a sufficient database is available, θb can be estimated by a relatively simple fitting procedure. This, in turn, allows RUL to be estimated. It has to be kept in mind that θb is obviously not equivalent to RUL, unless the most primitive “run-to-breakdown” operational policy is employed.
Large and complex objects usually generate many diagnostic symptoms, and their number in fact has no upper limit. It has to be kept in mind that the values of these symptoms depend on more than just condition parameters. If all symptoms Si are expressed in the form of a vector S(θ), then the following general relation holds [1, 9]:
$$\mathbf{S}(\theta) = \mathbf{S}\left[\mathbf{X}(\theta), \mathbf{R}(\theta), \mathbf{Z}(\theta)\right] \tag{5}$$
where X, R, and Z denote vectors of condition parameters, control parameters, and interference, respectively. Obviously, individual symptoms differ in their sensitivity to the components of all these vectors; it is thus necessary to select those which can be regarded as “the best.” The problem of selection was addressed at early stages of technical diagnostics development (see, e.g., ). Initially, at the stage of qualitative diagnosis, the principal criterion was symptom sensitivity to condition parameters. Quantitative diagnosis and prognosis imply a need to follow object condition evolution with time; thus, the symptom which best represents this process should be considered the most suitable one.
This chapter is devoted mainly to symptom evaluation and selection methods based on the analysis of information content measures. Some attention shall, however, also be paid to the method employing the singular value decomposition, the first that has been used for this purpose.
Suitability of symptom evaluation methods has been verified for a number of vibration-based symptoms generated by steam turbines operated at utility power plants. Details on symptom generation mechanisms may be found, e.g., in [1, 10, 11]. Absolute vibration velocity was recorded in the form of 23% constant percentage bandwidth (CPB) spectra, at points located at bearings and low-pressure turbine casings. Piezoelectric accelerometers were used with magnetic mountings, which allows for a frequency range well above 10 kHz. This implies that both “harmonic” (i.e., resulting directly from rotational motion) and “blade” (i.e., generated by the fluid flow system) components are recorded. Vibration amplitudes in frequency bands determined from turbine vibrodiagnostic models [1, 11] are the diagnostic symptoms to be evaluated. It has to be stressed here that presented methods are valid for a broad class of various diagnostic symptoms, irrespective of their physical origin.
2. Singular value decomposition
Singular value decomposition (SVD) is well known from linear algebra; concise description can be found, e.g., in . To the author’s best knowledge, the idea to employ this method in technical diagnostics goes back to the late 1990s . Application for vibration-based symptoms has shown this method to give consistent results .
The first step is to represent the symptom value database in the form of an m × n matrix O, where m denotes the number of symptoms and n is the number of symptom value readings. In principle, symptoms of different physical origins are compared, so all are normalized with respect to their values at θ = 0; moreover, 1 is subtracted from all normalized values, so they start from zero and are dimensionless. In accordance with general SVD rules, matrix O can be expressed as the following product:
$$\mathbf{O} = \mathbf{U}\,\boldsymbol{\Sigma}\,\mathbf{V}^{T} \tag{6}$$
where U and V are orthogonal matrices (m × m and n × n, respectively) and Σ is a diagonal m × n matrix, Σ = diag(σi). If the σi components are arranged in descending order, which is the accepted convention, Σ in Eq. (6) is uniquely determined. Components σi correspond to generalized faults, so that the sum given by
$$\sum_{i=1}^{p} \sigma_i \tag{7}$$
where p = min(m, n), represents the total damage advancement or lifetime consumption degree. Columns of the U and V matrices are the left-singular and right-singular vectors, denoted by ut and vt, respectively; those with 1 ≤ t ≤ p enter the decomposition. Eq. (6) can thus be rewritten as
$$\mathbf{O} = \sum_{t=1}^{p} \sigma_t\,\mathbf{u}_t\,\mathbf{v}_t^{T} \tag{8}$$
According to  and following notation used herein, the tth fault can be described by two discriminants, namely
This means that this fault can be expressed in terms of left-singular or right-singular vectors, which are generally interpreted as “input” and “output” [13, 15]. In the case of system condition evolution, “input” represents condition parameters and “output” represents symptoms. Obviously, the second discriminant, given by Eq. (10), is of practical use here, as condition parameters are typically nonmeasurable.
SVD analysis may be performed using one of the available software packages. In practical applications the first step is to analyze individual singular values. For a comparatively new object, the descent of consecutive singular values is rather slow; this means that a dominant failure mode has not yet appeared. On the other hand, with a considerable lifetime consumption degree, the first singular value dominates. Examples are shown in Figure 2. They refer to vibration-based symptoms generated by steam turbine fluid flow systems. In both cases illustrated in Figure 2, there are six such symptoms. For a turbine with a few dozen thousand hours logged (Figure 2a), contributions of the first three singular values to generalized damage are 36, 29, and 17%, respectively. For the second turbine (Figure 2b), which has logged well over 200,000 hours, the corresponding values are 48, 24.5, and 10%; the difference is clearly seen. The second step is to calculate contributions of individual symptoms to the first several (e.g., three) singular values. Corresponding graphs are shown in Figure 3. For the first turbine, dominant symptoms cannot be identified, although we may infer that symptom numbers 1 and 5 can be skipped. For the second turbine, however, dominance of symptom numbers 5 and 6 is clearly seen, and they may be judged most sensitive to the fluid flow system lifetime consumption.
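The two steps described above can be sketched as follows, assuming NumPy is available. The function names are hypothetical, and taking the absolute components of the left-singular vectors as the "contributions of individual symptoms" is an illustrative assumption, not necessarily the measure behind Figure 3.

```python
import numpy as np


def normalize_symptoms(readings):
    """Divide each symptom (row) by its first reading and subtract 1."""
    readings = np.asarray(readings, dtype=float)
    return readings / readings[:, [0]] - 1.0


def singular_value_shares(observation):
    """Return singular values and their percentage shares of the total."""
    sigma = np.linalg.svd(observation, compute_uv=False)
    return sigma, 100.0 * sigma / sigma.sum()


def symptom_contributions(observation, n_values=3):
    """Absolute components of the first left-singular vectors, per symptom.

    With symptoms stored as rows, column t of U has one component per
    symptom; its magnitude is used here as that symptom's contribution
    to the t-th singular value (an illustrative choice, see lead-in).
    """
    u, _, _ = np.linalg.svd(observation, full_matrices=False)
    return np.abs(u[:, :n_values])
```

A slowly descending sequence of shares would then correspond to the "new object" case, and a dominant first share to advanced lifetime consumption.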
3. Information content measures
3.1. The idea
The abovementioned energy processor model is, by its very nature, deterministic. From Eq. (5), however, it is clearly seen that symptom values depend not only on deterministic condition parameters Xi(θ) but also on control parameters Ri(θ) and interferences Zi(θ), which are random variables. Therefore, any symptom Si(θ) should in principle be treated as a random variable with time-dependent parameters.
For a given object operated at a given location, it is reasonable to assume that Ri(θ) and Zi(θ) are characterized by statistical distributions with time-independent parameters. At the same time, from Eq. (2) it is clearly seen that the influence of lifetime consumption θ/θb (or, more generally, of deterministic condition parameters) will increase as θ → θb. This means that Si(θ) will become more deterministic or, to put it differently, more predictable. As pointed out in , this corresponds to information content decrease, in the sense of Shannon entropy . Therefore, the symptom whose information content measure decreases with time at the highest rate is the one that is most sensitive to lifetime consumption mechanisms.
Investigations of information content and its measures were pioneered by Claude E. Shannon. In his fundamental work , he introduced an information content measure H(p1, p2, …, pn), later termed Shannon entropy, where pi is the probability of the ith event, and showed it to have the following form:
$$H(p_1, p_2, \ldots, p_n) = -K \sum_{i=1}^{n} p_i \log_b p_i \tag{11}$$
where K is a constant depending on the units used (irrelevant if only the decrease rate is of interest). The logarithm base b is typically set at 2, Euler's number e, or 10, with H expressed in bits, nats, and dits, respectively. Obviously
Shannon entropy was originally introduced for verbal communication; hence, a discrete random variable is involved. A diagnostic symptom in the sense of the energy processor model is in general continuous, so a counterpart of H known as continuous or differential entropy should be used. It is given by (see, e.g., )
$$h(S_i) = -\int p(S_i)\,\ln p(S_i)\,dS_i \tag{14}$$
where p(Si) is the probability density function. Despite formal similarity, Eq. (14) is not just a limiting case of Eq. (11) for n → ∞. Contrary to H, continuous entropy is not invariant under change of variables . Moreover, h can be negative, although a satisfactory physical explanation of the negative information content is still lacking. From the practical point of view, continuous entropy is very convenient, as for widely employed statistical distributions it is given by relatively simple analytical expressions.
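The contrast between the discrete and the continuous measure can be illustrated with a short sketch. It assumes the usual form H = −K Σ pi log_b pi for the discrete case and uses the well-known result that a uniform density on an interval of width a has differential entropy ln a (in nats). The function names are hypothetical.

```python
import math


def shannon_entropy(probabilities, base=2.0, k=1.0):
    """H = -K * sum(p_i * log_b(p_i)); terms with p_i == 0 contribute 0."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -k * sum(p * math.log(p, base) for p in probabilities if p > 0)


def uniform_differential_entropy(width):
    """Differential entropy (nats) of a uniform density on an interval."""
    if width <= 0:
        raise ValueError("width must be positive")
    return math.log(width)
```

A fair coin gives 1 bit, which can never be negative. The uniform density on an interval narrower than 1 gives a negative value, and rescaling the variable by a factor of 2 shifts the result by ln 2, which illustrates the lack of invariance under change of variables.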
It may be added here that several other entropy types have been proposed, e.g., by Hartley , Rényi , or Tsallis . Their use, however, has been limited. Hartley entropy is a specific case of the Shannon entropy, while Rényi entropy may be viewed as its generalization. Both Rényi and Tsallis entropies involve certain adjustable parameters of rather unclear physical meaning, which are generally difficult to estimate.
For the purpose of condition symptom evaluation, the time window procedure may be employed. A window containing a sufficient number of Si(θ) readings is moved along the time axis; for each position, the statistical distribution parameters within it are determined, and in this way the h(θ) curve is obtained. This in turn allows for estimation of the information content measure (ICM) decrease rate. In practice this involves certain problems, which shall be discussed in the following section.
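The time window procedure can be sketched as follows. This is a minimal illustration that assumes a normal distribution within each window, so that the continuous entropy follows from the window variance alone, and that estimates the decrease rate as a least-squares slope against the window index. Both function names are hypothetical.

```python
import math
import statistics


def windowed_normal_entropy(readings, window):
    """Continuous entropy per window position, normal distribution assumed.

    Uses h = 0.5 * ln(2 * pi * e * sigma^2) with the sample variance of
    each window; every window must contain non-constant data.
    """
    curve = []
    for start in range(len(readings) - window + 1):
        variance = statistics.variance(readings[start:start + window])
        curve.append(0.5 * math.log(2.0 * math.pi * math.e * variance))
    return curve


def linear_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

A negative slope of the resulting curve corresponds to an entropy decrease; the slope is expressed per window step, so converting it to a rate per unit time requires the sampling interval.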
3.2.1. Distribution type
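Obviously, in order to employ the abovementioned procedure, the symptom distribution type has to be determined. In general, distributions of diagnostic symptom values are of the right-hand tailed type . Weibull and gamma distributions are commonly used, with the probability density functions given by
$$p(S_i) = \frac{k}{\lambda}\left(\frac{S_i}{\lambda}\right)^{k-1} \exp\left[-\left(\frac{S_i}{\lambda}\right)^{k}\right] \tag{15}$$
and
$$p(S_i) = \frac{S_i^{k-1}}{\lambda^{k}\,\Gamma(k)} \exp\left(-\frac{S_i}{\lambda}\right) \tag{16}$$
respectively, where k is the shape factor, λ denotes the scale factor, and Γ is the gamma function. It has been shown for a number of cases [1, 22, 23] that results obtained with these two distributions are quantitatively similar. Moreover, although this might seem strange, the normal distribution given by
$$p(S_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(S_i-\mu)^2}{2\sigma^2}\right] \tag{17}$$
(μ and σ denote mean value and standard deviation, respectively) yields very similar results; this greatly simplifies calculations. Continuous entropy for these three distributions (Weibull, gamma, and normal, in that order) is given by the following relations :
$$h_W = \gamma_E\left(1 - \frac{1}{k}\right) + \ln\frac{\lambda}{k} + 1 \tag{18}$$
$$h_\Gamma = k + \ln\lambda + \ln\Gamma(k) + (1-k)\,\psi(k) \tag{19}$$
$$h_N = \frac{1}{2}\ln\left(2\pi e\,\sigma^2\right) \tag{20}$$
where γE is the Euler-Mascheroni constant and ψ(x) is the digamma function. An example of a comparison of results obtained with gamma, Weibull, and normal distributions is shown in Figure 4.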
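The closed-form entropies of the three distributions can be evaluated with a short sketch based on the standard textbook expressions in nats. The function names are hypothetical, and the digamma function is approximated here by a central difference of `math.lgamma`, which is a numerical shortcut chosen to avoid external dependencies rather than an exact evaluation.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma(x, step=1e-5):
    """Numerical digamma: central difference of math.lgamma (approximate)."""
    return (math.lgamma(x + step) - math.lgamma(x - step)) / (2.0 * step)


def weibull_entropy(k, lam):
    """Continuous entropy of a Weibull density (shape k, scale lam)."""
    return EULER_GAMMA * (1.0 - 1.0 / k) + math.log(lam / k) + 1.0


def gamma_entropy(k, lam):
    """Continuous entropy of a gamma density (shape k, scale lam)."""
    return k + math.log(lam) + math.lgamma(k) + (1.0 - k) * digamma(k)


def normal_entropy(sigma):
    """Continuous entropy of a normal density with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)
```

For k = 1 and λ = 1 both the Weibull and the gamma density reduce to the unit exponential density, so both expressions return 1, which provides a quick consistency check.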
Diagnostic symptom time histories often exhibit a considerable number of outliers. According to , “an outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs”; there is no generally accepted precise definition. From the point of view of information theory, outliers are equivalent to noise. As with the definition, there is no universal method for removing outliers. The “three-sigma rule,” which is often used for this purpose, is not applicable to distributions with long right-hand tails . Three-point averaging  merely flattens outliers instead of removing them. The author has suggested a procedure referred to as “peak trimming” , based on comparison of a data point with two adjacent points. If for the Si(θk) symptom value reading one of the following criteria is met:
then Si(θk) is considered an outlier and replaced by the average of the two adjacent readings. Upper and lower thresholds, ch and cl, are adjusted experimentally and depend on the object. In practice, the situation described by Eq. (21) is much more frequent, mainly as a result of the influence of control parameters and/or interference (cf. Eq. (5)). Very low symptom value readings, as in Eq. (22), are usually caused by plain measurement errors. The effect of peak trimming is illustrated in Figure 5.
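A minimal sketch of peak trimming is given below. The exact criteria of Eqs. (21) and (22) are not reproduced in it; the sketch ASSUMES that a reading is an outlier when it exceeds ch times both adjacent readings, or falls below cl times both adjacent readings. The function name is hypothetical, and the default thresholds are merely example values.

```python
def peak_trim(readings, c_high=1.5, c_low=0.7):
    """Replace isolated peaks and dips by the mean of their two neighbours.

    ASSUMED criteria (not taken from the original equations): a reading
    is an outlier if it exceeds c_high times BOTH neighbours, or falls
    below c_low times BOTH neighbours. End points are left unchanged,
    and comparisons always use the original, untrimmed neighbours.
    """
    trimmed = list(readings)
    for k in range(1, len(readings) - 1):
        prev, cur, nxt = readings[k - 1], readings[k], readings[k + 1]
        too_high = cur > c_high * prev and cur > c_high * nxt
        too_low = cur < c_low * prev and cur < c_low * nxt
        if too_high or too_low:
            trimmed[k] = 0.5 * (prev + nxt)
    return trimmed
```

Requiring both neighbours to satisfy the criterion keeps a genuine step change, where only one neighbour differs markedly, from being treated as an outlier.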
Fitting continuous distributions to experimental symptom value histograms within the time window limits requires at least weak stationarity. This implies that for every symptom Si the mean value and autocovariance must not change with time. In view of the fact that Si(θ) has a vertical asymptote at θb, this may be considered valid only for θ ≪ θb. As already mentioned, it may be assumed that control and interference (Eq. (5)) are represented by stationary stochastic processes. Therefore, Si(θ) may be viewed as a trend-stationary process, and, if the deterministic trend is removed, what is left is a stationary process . In fact, over a hundred years ago it was pointed out that, in time series analysis, a measure of deviation from trend and not from some “mean” or “average” should be taken into account . In other words, trend normalization should be performed prior to ICM analysis.
Trend may be determined by fitting a suitable function to experimental symptom time history. Weibull and Fréchet functions may be used for this purpose; for low values of θ, exponential function may be a good approximation. An obvious prerequisite is lack of abrupt (stepwise) changes; this issue shall be discussed in detail in the following section. Once this is performed, a procedure may be employed wherein each symptom value reading Si(θ) is replaced by trend-normalized value given by 
where subscript t denotes value determined from the estimated trend. An example of trend normalization (Weibull function fitting) is shown in Figure 6.
3.2.4. Abrupt changes
Complex and costly machines, such as power-generating units, are usually designed for long service life. During the period between commissioning and final withdrawal from use, they are usually subject to various processes of maintenance, repair, and overhaul. Each of them introduces changes of object properties, which influence both diagnostic symptom generation mechanisms and their propagation from origin to measurement points. So far, it has been tacitly assumed that each Si(θ) function, or symptom life curve, is a superposition of a monotonic and continuous trend Sit(θ) and random fluctuations. In general this is not the case. The deterministic trend is in fact a sequence of symptom life curves, each characterized by some specific values of Si(0) and θb. Of course, repair or overhaul is performed before the breakdown, so each curve is represented by a section of length θ0 < θb. This is shown schematically in Figure 7.
Figure 7 clearly shows that, if fitting a continuous function to experimental data is expected to yield consistent results, abrupt changes should be eliminated. In principle this is relatively simple. Each life cycle, and hence each symptom life curve, is characterized by the so-called logistic vector , which describes its “quality.” This vector may be replaced by its scalar measure L, which influences both Si(0) and θb. For a sequence such as that shown in Figure 7, one cycle is chosen as a reference; it may be convenient to use the one with the lowest initial value for this purpose, but this is not mandatory. Its value for θ = 0 is taken as a reference Sr(0). Then, for each other cycle, a normalizing factor Fi = Si(0)/Sr(0) is determined, and normalization is obtained by simple multiplication of all symptom readings in this cycle by 1/Fi.
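The life cycle normalization described above can be sketched as follows, assuming that the readings have already been split into cycles and that the first reading of each cycle represents its value at θ = 0. The function name and the data layout are hypothetical.

```python
def normalize_life_cycles(cycles):
    """Scale every life cycle to the initial value of a reference cycle.

    cycles: list of lists, each holding one cycle's readings in time
    order; the first reading of a cycle is taken as its initial value.
    The reference is the cycle with the lowest initial value (a
    convenient but not mandatory choice).
    """
    reference_start = min(cycle[0] for cycle in cycles)
    normalized = []
    for cycle in cycles:
        factor = cycle[0] / reference_start  # normalizing factor F
        normalized.append([value / factor for value in cycle])
    return normalized
```

After normalization every cycle starts from the same value, so the concatenated sequence no longer contains steps caused by differences in initial values alone.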
This idea may seem simple, but precise determination of the moment of transition from one life cycle to the next may be problematic. Sufficient operational documentation is not always available, and transitions are often masked by random fluctuations. A method for their detection is thus necessary. Such a method may be based on techniques originally developed for statistical process control.
In the 1920s Walter A. Shewhart developed a tool for determining whether a process (e.g., manufacturing) is under control, known as the process control chart. If that was the case, no modifications of process or control were needed; otherwise, an intervention was necessary, in order to restore stable and controlled operation . In 1954 E.S. Page proposed a more sensitive process control chart, employing cumulative sum and consequently named CUSUM . His approach consisted in introducing a quantity originally referred to as a “quality number,” developing an algorithm to estimate its changes and establishing a quantitative criterion. In general this quality number is a statistical parameter. If this procedure is employed for mean value, it can be used for detecting abrupt changes .
Let us assume that a variable x characterizes the process under consideration; its consecutive readings are x1, x2, …, xN. Each sample ⟨x1, …, xi⟩ has a probability density function given by pi(xi, φ); φ is a parameter which changes from φ0 to φ1 when an abrupt change occurs. The log-likelihood ratio ci given by
$$c_i = \ln\frac{p_i(x_i, \varphi_1)}{p_i(x_i, \varphi_0)} \tag{24}$$
defines the figure of merit. The cumulative sum Cm is then defined by
$$C_m = \sum_{i=1}^{m} c_i \tag{25}$$
If φ is the sample mean, then the Cm time history can be used for abrupt change detection. If there is no continuous trend, i.e., the process is stationary, Cm will fluctuate around zero and exhibit an upward or downward drift when an abrupt increase or decrease, respectively, has occurred. As already mentioned, in the case of a diagnostic symptom, there will always be such a trend, which can be neglected only for θ ≪ θb. Thus, Cm does not fluctuate around zero, but exhibits a continuous trend. An abrupt change, if sufficiently large, will then be indicated by a reversal of the Cm(θ) trend. This method can be employed for detecting transitions between consecutive life cycles. For the normal distribution which, as noted earlier, is often a good approximation, Cm is given by a simple expression:
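For a Gaussian mean shift from μ0 to μ1 with a common standard deviation σ, the log-likelihood ratio of a single reading reduces to the textbook form ci = (μ1 − μ0)/σ² · (xi − (μ0 + μ1)/2). The sketch below uses that form; it is an assumption and may differ in notation or scaling from the expression intended in this text. The function names are hypothetical, and the change estimate assumes an upward shift in otherwise stationary data.

```python
def gaussian_cusum(readings, mean_before, mean_after, sigma):
    """Cumulative sum of Gaussian log-likelihood ratios for a mean shift.

    Each term is c_i = (mu1 - mu0) / sigma^2 * (x_i - (mu0 + mu1) / 2),
    the log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2).
    """
    gain = (mean_after - mean_before) / sigma ** 2
    midpoint = 0.5 * (mean_before + mean_after)
    total, curve = 0.0, []
    for x in readings:
        total += gain * (x - midpoint)
        curve.append(total)
    return curve


def change_index(curve):
    """Index of the first reading after the CUSUM minimum (upward shift)."""
    return min(range(len(curve)), key=curve.__getitem__) + 1
```

In this stationary setting the curve drifts downward before the change and upward after it, so the minimum marks the last reading before the shift; with a superimposed trend, the analogous indicator is the reversal of the curve's slope.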
3.2.5. Representativeness factor
It may be said, in a descriptive manner, that ICM is a measure of the degree of process organization around a monotonically increasing trend. However, the rate of this increase should also be taken into account in symptom evaluation. Organization may take place around a weakly increasing curve; such a symptom is only weakly sensitive to object condition evolution and as such is of little use, despite a marked ICM decrease. A measure is thus required that would combine both sensitivity to condition parameters and the degree of process organization [23, 33]. Such a measure, termed representativeness factor R, is proposed in the following manner. A linear approximation is used for continuous entropy, and a Weibull approximation for the normalized symptom:
the representativeness factor is then defined as
Obviously, R should be positive: the larger the R, the more representative the symptom under consideration. An alternative approach may be adopted for the Fréchet approximation; the choice of either of these approximations does not influence the qualitative results of symptom assessment.
Measurement data for the first example were obtained with the intermediate-pressure turbine of a 260 MW power-generating unit; the first measurement was performed shortly after commissioning, and available data cover a period of almost 10 years. Vibration velocity was recorded at the front and rear bearings, in three mutually perpendicular directions. Components generated by the turbine fluid flow system are contained in four 23% CPB bands, which gives 24 available symptoms. Of these, as many as 13 symptoms have revealed no increasing trend; this may be attributed to the comparatively short period of operation, as evolution of the fluid flow system condition is usually rather slow. For the remaining 11 symptoms, measured values were normalized, and peak trimming was performed (Eqs. (21) and (22), with ch = 1.5 and cl = 0.7). It was followed by CUSUM analysis, which revealed an abrupt change at about 2200 days (see Figure 10). Normalization was thus performed according to the procedure outlined in Section 3.2.4. Trend normalization was based on the Weibull function assumption. Data processed in this manner were used for ICM analysis, with a time window length of 25 points and the normal distribution assumption (cf. Section 3.2.1).
Continuous entropy time histories are in some cases rather irregular, but nonetheless six of them exhibit a decreasing trend; an example is shown in Figure 11. For these six cases, representativeness factor was calculated in accordance with Eq. (29). Results are shown in Table 1. It is easily seen that the values of R vary within broad limits. Without doubt symptoms numbered 1 and 2 are the most representative ones. Symptoms 16, 18, and 24 are certainly inferior, while representativeness of symptom 5 is weak. In this manner, symptoms may be identified that are most suitable for fluid flow system condition assessment.
Figure 12 shows contributions of all 11 symptoms that exhibit an increasing trend to the first three singular values. It may be noted that results are basically consistent with those shown in Table 1. The main differences are:
- Comparatively high contributions of symptom number 5, which has a low representativeness factor
- Better result for symptom number 18
- Comparatively high contributions of symptom number 9, which is absent in Table 1 (lack of entropy decreasing trend)
| Symptom number | Symptom description (kHz) | Value of γ | Entropy decrease rate | Representativeness factor |
|---|---|---|---|---|
| 1 | FB-V 3.15 | 11.24 | 0.960 | 85.44 × 10⁻³ |
| 2 | FB-V 4 | 10.64 | 0.905 | 85.07 × 10⁻³ |
| 5 | FB-H 3.15 | 500.0 | 0.010 | 0.02 × 10⁻³ |
| 16 | RB-V 6.3 | 52.63 | 0.775 | 14.73 × 10⁻³ |
| 18 | RB-H 4 | 52.63 | 0.497 | 9.44 × 10⁻³ |
| 24 | RB-A 6.3 | 55.56 | 0.637 | 11.47 × 10⁻³ |
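The values in Table 1 appear to be numerically consistent with R being equal to the entropy decrease rate divided by γ. This relation is inferred here from the tabulated numbers alone and should be read as an assumption, not as a quotation of Eq. (29). The sketch below checks it within the rounding of the table; the function name and the tolerance are hypothetical.

```python
# (symptom number, gamma, entropy decrease rate, R as listed in Table 1)
TABLE_1 = [
    (1, 11.24, 0.960, 85.44e-3),
    (2, 10.64, 0.905, 85.07e-3),
    (5, 500.0, 0.010, 0.02e-3),
    (16, 52.63, 0.775, 14.73e-3),
    (18, 52.63, 0.497, 9.44e-3),
    (24, 55.56, 0.637, 11.47e-3),
]


def ratio_mismatches(rows, tolerance=0.05e-3):
    """Rows where rate / gamma deviates from the listed R by more than tolerance."""
    return [row for row in rows if abs(row[2] / row[1] - row[3]) > tolerance]
```

Under this assumed relation, the very low R of symptom 5 follows from both a small entropy decrease rate and a very large γ, i.e., a very weakly increasing trend.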
Before commenting on these findings, a second example will follow, this time for a comparatively old 200 MW unit with over 200,000 hours logged; available database covers over 16 years. Fluid flow system of the high-pressure turbine generates vibration components that are contained in ten 23% CPB frequency bands. Given two bearings and three directions, this means that as many as 60 symptoms have to be analyzed. In order to simplify the picture, a two-stage procedure was employed . First, for every measurement point and direction, two dominant symptoms were selected using the SVD approach. Twelve symptoms selected in this manner were then analyzed with both SVD and ICM methods. Results are shown in Figure 13 and Table 2.
| Symptom number | Symptom description (kHz) | Representativeness factor |
|---|---|---|
| 1 | FB-V 6.3 | −0.63 × 10⁻³ |
| 2 | FB-V 8 | −15.30 × 10⁻³ |
| 3 | FB-H 5 | 6.97 × 10⁻³ |
| 4 | FB-H 6.3 | 5.57 × 10⁻³ |
| 5 | FB-A 6.3 | 8.84 × 10⁻³ |
| 6 | FB-A 8 | 3.77 × 10⁻³ |
| 7 | RB-V 5 | −1.93 × 10⁻³ |
| 8 | RB-V 8 | 1.17 × 10⁻³ |
| 9 | RB-H 6.3 | 0.20 × 10⁻³ |
| 10 | RB-H 8 | 4.14 × 10⁻³ |
| 11 | RB-A 2 | −2.52 × 10⁻³ |
| 12 | RB-A 8 | 5.69 × 10⁻³ |
In Table 2, cases with R < 0 have been deliberately included, in order to demonstrate that symptoms with comparatively good rating based on the SVD analysis—in this case, symptom No. 2—sometimes have to be rejected. On the other hand, symptoms with rather high values of R—e.g., numbers 5, 10, and 12—are poorly rated by the SVD method. In fact, only symptom numbers 3 and 4 are chosen on the basis of both methods.
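The ICM-based selection for this example can be sketched as a simple ranking: symptoms with non-positive R are rejected, and the rest are ordered by decreasing R. The values are those of Table 2 in units of 10⁻³; the function name is hypothetical, and the sketch says nothing about the SVD rating, which has to be read from Figure 13.

```python
# Representativeness factor from Table 2, in units of 1e-3
TABLE_2 = {
    1: -0.63, 2: -15.30, 3: 6.97, 4: 5.57, 5: 8.84, 6: 3.77,
    7: -1.93, 8: 1.17, 9: 0.20, 10: 4.14, 11: -2.52, 12: 5.69,
}


def rank_by_representativeness(table):
    """Return (accepted symptoms by decreasing R, rejected symptoms with R <= 0)."""
    accepted = sorted(
        (number for number, r in table.items() if r > 0),
        key=table.get,
        reverse=True,
    )
    rejected = sorted(number for number, r in table.items() if r <= 0)
    return accepted, rejected
```

Intersecting the top of this ranking with the symptoms rated highly by the SVD analysis gives the subset supported by both methods.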
In order to comment on these two examples, it first has to be noted that neither the SVD nor the ICM approach can be considered a reference one. It seems that discrepancies between the results obtained with the two may be attributed to at least two possible causes. First, preprocessing of measurement data is based on relatively simple procedures, and their inherent deficiencies, such as inadequate robustness, may influence the final result. Second, the SVD method does not disqualify cases with entropy increase, which are rejected within the ICM approach. This question requires further study. As pointed out in , it seems justified to state that symptoms selected on the basis of both methods can be safely labeled as the most suitable ones for object condition assessment and prognosis.
In this chapter, a relatively straightforward and simple method is presented for evaluation of diagnostic symptoms from the point of view of their suitability for assessment and prognosis of technical condition evolution. For this purpose, the proper choice of symptoms is of prime importance, particularly for complex objects that generate a large number of various symptoms. In most cases it is very difficult, or even impossible, to make such a choice in a direct manner, even with extensive knowledge of object layout and operation. The proposed method is based on an analysis of an information content measure as a function of time, and the basic assumption is that the greater the general damage advancement, the more deterministic, and hence predictable, the symptom becomes. It turns out, however, that in order to obtain reliable results certain preprocessing of measurement data is mandatory. Results of this method have been compared with those obtained from singular value analysis, which had been proposed and tested earlier. This approach has been applied to vibration-based symptoms of steam turbines operated by power plants and shown to give consistent results. In general it can be applied to any symptom, irrespective of its physical origin, as well as to other machines or structures. In the author’s view, possible further development should concentrate on the preprocessing of measurement data and improvement of the representativeness factor. Other information content measures might also be worth considering; however, the best results have so far been obtained with continuous entropy.
Conflict of interest
The author of this text has no conflict of interest to declare.