1. Introduction
Rotary machinery is widely used in various types of engineering systems ranging from simple electric fans to complex machinery systems such as aircraft. A reliable online condition monitoring system is very useful in industries both as a quality control scheme and as a maintenance tool. In quality control, the early detection of faulty components can prevent machinery performance degradation and malfunction. As a maintenance tool, machinery health condition monitoring enables the establishment of a maintenance program based on an early warning. This can be of great value in cases involving critical machines (e.g., airplanes, power turbines, and chemical engineering facilities), where an unexpected shutdown can have serious economic or environmental consequences.
Condition monitoring is an act of fault diagnosis by means of appropriate observations from different information carriers, such as temperature, acoustics, lubricant, or vibration. Vibration-based monitoring is the most commonly used approach in industry because of its ease of measurement, and it is the approach adopted in this study.
Fault diagnosis is a sequential process involving two steps: representative feature extraction and pattern classification. Feature extraction is a mapping process from the measured signal space to the feature space. Representative features associated with the health condition of a machinery component (or subsystem) are extracted by using appropriate signal processing techniques. Pattern classification is the process of classifying the characteristic features into different categories. The classical approach, which is also widely used in industry, relies on human expertise to relate the vibration features to the faults. This method, however, is tedious and not always reliable when the extracted features are contaminated by noise. Furthermore, it is difficult for a diagnostician to deal with the contradicting symptoms if multiple features are used. The alternative is to use analytical tools (Li & Lee, 2005, Gusumano et al., 2002) and data-driven paradigms (Isermann, 1998). The latter will be utilized in this work because an accurate mathematical model is difficult to derive for a complex mechanical system, especially when it operates in noisy environments. Data-driven diagnostic classification can be performed by reasoning tools such as neural networks (Rish et al, 2005, Uluyol, 2006), fuzzy logic (Mansoori et al., 2007, Ishibuchi & Yamamoto, 2005), and neural fuzzy synergetic schemes (Wang, 2008, Uluyol et al., 2006).
Even though several techniques have been proposed in the literature for machinery condition monitoring, implementing a diagnostic tool for real-world monitoring applications remains a challenge because of the complexity of machinery structures and operating conditions. When a monitoring system is used in real-time industrial applications, the critical issue is its reliability. Missed alarms (i.e., the monitoring system fails to pick up existing faults) and false alarms (i.e., the monitoring system triggers an alarm because of noise instead of real faults) will seriously undermine its validity. To tackle these challenges, the objective of this research work is to develop a new technique, an integrated classifier, for real-time condition monitoring, especially in gear transmission systems. In this novel classifier, the monitoring reliability is enhanced by integrating the information of the object’s future states forecast by a multiple-step predictor; furthermore, the diagnostic scheme is adaptively trained by a novel recursive hybrid algorithm to improve its convergence and adaptive capability.
This chapter is organized as follows: Section 2 describes the integrated classifier, whereas the multiple-step predictor and monitoring indices are described in Section 3. Section 4 discusses the hybrid online training algorithm. In Section 5, the viability of the proposed integrated classifier is verified by experimental tests corresponding to different gear conditions.
2. Diagnostic system
The diagnostic classifier is used to integrate the selected features obtained by implementing appropriate signal processing techniques. The purpose is to make a more positive assessment of the health condition of the mechanical component (or subsystem) of interest. The diagnostic reliability in this suggested classifier will be enhanced by implementing the future (multi-step-ahead) states of the object’s conditions. The forecasting in this integrated classifier is performed for input variables so as to make it easier to track the error sources in diagnostic operations.
The developed classifier is an NF paradigm which is able to facilitate the incorporation of diagnostic knowledge from expertise and to extract new knowledge in operations by online/offline training. The diagnostic classification is performed by fuzzy logic (Jang 1993), whereas an adaptive training algorithm, as discussed in Section 4, is utilized to fine-tune the fuzzy system parameters and structures. The conditions of each object (machinery component or subsystem) are classified into three categories:
The diagnostic classification, in terms of the diagnostic indicator
where
When multiple features (input indices) are employed for diagnostic classification operations, the contribution of each feature combination (association) to the final decision depends, to a large degree, on the situation under which the diagnostic decision is made. Such a contribution is characterized by a weight factor
Similarly, the diagnostic classification based on the predicted monitoring indices, {
where
The number of rules is associated with the diagnostic reasoning operations of input state variables. In general, if all monitoring indices are
The input nodes in layer 1 transmit the monitoring indices {
Each node in layer 2 acts as a MF, which can be either a single node that performs a simple activation function or multilayer nodes that perform a complex function. The nodes in layer 3 perform the fuzzy
where
Defuzzification is undertaken in layer 4. By normalization, the fault diagnostic indicator will be
Similarly, the fault diagnostic indicator based on forecast inputs will be
The states of the diagnostic indicator
The final decision regarding the health condition of the object of interest is made by:
3. Prediction of monitoring indices
3.1. Monitoring indices
In general, most machinery defects are related to transmission systems, mainly gears and bearings. In this work, gears are used as an example to illustrate how to apply the proposed integrated classifier for machinery condition monitoring. In operations, the fault diagnosis of a gear train is conducted gear by gear. Because the measured vibration is an overall signal contributed from various vibratory sources, the primary step is to differentiate the signal specific to each gear of interest by using a synchronous average filter (Wang et al., 2001). By this filtering process, the signals which are non-synchronous to the rotation of the gear of interest (e.g., those from bearings, shafts and other gears) are filtered out. As a result, each gear signal is computed and represented in one full revolution, called the signal average.
Several techniques have been proposed in the literature for gear fault detection. However, because of the complexity in the machinery structures and operating conditions, each fault detection technique has its own advantages and limitations, and is efficient for some specific application only (Wang et al., 2001). Consequently, the selected features for fault diagnostics should be robust, that is, sensitive to component defects but insensitive to noise (i.e., the signal not carrying information of interest). In this case, three features from the information domains of energy, amplitude, and phase are employed for the diagnosis operation:
1. Wavelet energy function, using the overall residual signal which is obtained by bandstop filtering out the gear mesh frequency
2. Phase demodulation (McFadden, 1986), using the signal average;
3. Beta kurtosis, using the overall residual signal.
The details of these reference functions are listed in Appendix A.
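The synchronous average filtering that precedes these reference functions can be sketched as follows. This is a minimal illustration under stated assumptions, not the filter of (Wang et al., 2001): it assumes the vibration signal has already been resampled so that every revolution of the gear of interest contains the same number of samples, and the name `synchronous_average` and its parameters are hypothetical.

```python
import math


def synchronous_average(signal, samples_per_rev):
    """Average a vibration signal over whole revolutions of the gear of interest.

    Assumes `signal` is already resampled to a fixed number of samples per
    revolution; trailing samples that do not fill a revolution are dropped.
    """
    n_rev = len(signal) // samples_per_rev
    if n_rev == 0:
        raise ValueError("signal is shorter than one revolution")
    average = [0.0] * samples_per_rev
    for rev in range(n_rev):
        start = rev * samples_per_rev
        for k in range(samples_per_rev):
            average[k] += signal[start + k]
    return [value / n_rev for value in average]


# Synthetic example: one component synchronous with the gear (3 cycles per
# revolution) plus one non-synchronous component (2.5 cycles per revolution).
samples_per_rev = 64
n_rev = 100
signal = []
for i in range(samples_per_rev * n_rev):
    angle = 2.0 * math.pi * i / samples_per_rev
    signal.append(math.sin(3.0 * angle) + math.sin(2.5 * angle))

gear_signal = synchronous_average(signal, samples_per_rev)
```

In this synthetic case the 2.5-cycle component flips sign from one revolution to the next, so it cancels over an even number of revolutions and only the synchronous component remains in `gear_signal`. With real data, non-synchronous content is attenuated rather than removed exactly, which is why more revolutions give a cleaner signal average.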
Based on the derived reference functions, the monitoring indices are determined to quantify the feature characteristics. Each index is a function of two variables, magnitude and position. The magnitude of an index is determined as the normalized relative maximum amplitude value of the corresponding reference function; the position is where the maximum amplitude is located. Usually, the maximum amplitude positions in these reference functions do not coincide exactly due to the phase lags in signal processing. Based on simulation and test observations, an
Fig. 3 illustrates an example of the reference functions corresponding to a healthy gear with 41 teeth. Fig. 3a shows part of the original vibration signal measured from the experimental setup to be illustrated in Section 5. Fig. 3b represents the signal average of the gear of interest, which is obtained by synchronous average filtering; each wave represents a tooth period. Figs. 3c to 3e represent the resulting reference functions of the wavelet energy, beta kurtosis, and phase modulation, respectively. It is seen that no specific irregularities can be found from these reference functions for this healthy gear.
Fig. 4 shows the processing results corresponding to a cracked gear with 41 teeth. It is impossible to recognize the gear damage from the original signal (Fig. 4a). A slight signature irregularity can be recognized around 200 in the signal average graph (Fig. 4b). However, this gear damage can be identified clearly from the proposed reference functions (Figs. 4c to 4e). Although the maximum peak positions differ slightly from one graph to another, these peaks occur within one influence window (four tooth periods in this case).
Fig. 5 illustrates the processing results for a chipped gear (with 41 teeth). Some signature irregularity can be recognized around 200 in the signal average graph (Fig. 5b) due to this gear tooth damage. However, this defect can be clearly identified from the other three reference functions (Figs. 5c to 5e), and the monitoring indices are located within one influence window (four tooth periods).
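The way a monitoring index is read off a reference function, and the check that the peaks of several reference functions fall within one influence window, can be sketched as below. The normalization used here (peak divided by the mean absolute amplitude) is an assumption made for illustration only; the function names are hypothetical, and the four-tooth-period window is the value quoted for these examples.

```python
def monitoring_index(reference, tooth_count):
    """Return (magnitude, position) for one reference function.

    magnitude: peak absolute amplitude divided by the mean absolute amplitude
               (an assumed normalization, for illustration only).
    position:  location of the peak, expressed in tooth periods.
    """
    abs_values = [abs(v) for v in reference]
    mean_abs = sum(abs_values) / len(abs_values)
    if mean_abs == 0.0:
        raise ValueError("reference function is identically zero")
    peak_sample = max(range(len(abs_values)), key=abs_values.__getitem__)
    magnitude = abs_values[peak_sample] / mean_abs
    position = peak_sample * tooth_count / len(reference)
    return magnitude, position


def within_influence_window(positions, tooth_count, window=4.0):
    """Check that every pair of peak positions is at most `window` tooth periods apart.

    The reference functions cover one full revolution, so distances wrap around.
    """
    for a in positions:
        for b in positions:
            gap = abs(a - b) % tooth_count
            if min(gap, tooth_count - gap) > window:
                return False
    return True


# Hypothetical reference function for a 41-tooth gear, 10 samples per tooth,
# with a single pronounced peak at sample 200.
reference = [0.1] * 410
reference[200] = 2.0
magnitude, position = monitoring_index(reference, tooth_count=41)

peaks_agree = within_influence_window([position, 21.5, 19.0], tooth_count=41)
```

Expressing positions in tooth periods rather than samples keeps the window check independent of the sampling rate.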
3.2. Forecasting of the monitoring indices
System state forecasting is the process of predicting the future states of a dynamic system based on available observations. Several techniques have been suggested in the literature for time series forecasting. The classical methods use stochastic models (Chelidze & Cusumano, 2004), which are usually difficult to derive for mechanical systems with complex structures. More recent research on time series forecasting has focused on the use of data-driven paradigms, such as neural networks and neural fuzzy schemes (Tse & Atherton, 1999, Pourahmadi, 2001). In this work, the multi-step-ahead prediction of the input variables (indices) is performed by a predictor suggested in (Wang & Vrbanek, 2007), whose effectiveness has been verified: it can capture and track the system’s dynamic characteristics quickly and accurately, and it outperforms other related classical forecasting schemes.
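The idea of multi-step-ahead prediction of a monitoring index can be illustrated with a deliberately simple stand-in. The sketch below is not the NF predictor of (Wang & Vrbanek, 2007): it swaps in a linear least-squares model that maps the current index value directly to the value r steps ahead. The function names, the linear model, and the synthetic index history are all assumptions made for illustration.

```python
def fit_direct_predictor(series, r):
    """Fit x[t + r] ~ a * x[t] + b by ordinary least squares (r >= 1)."""
    inputs = series[:-r]
    targets = series[r:]
    n = len(inputs)
    if n < 2:
        raise ValueError("series too short for the requested horizon")
    mean_x = sum(inputs) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    if sxx == 0.0:
        raise ValueError("inputs are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, targets))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def forecast(last_value, a, b):
    """Predict the index r steps ahead of `last_value` with the fitted model."""
    return a * last_value + b


# Hypothetical history of one monitoring index that grows slowly over time.
history = [0.10 + 0.02 * t for t in range(20)]
a, b = fit_direct_predictor(history, r=3)
predicted = forecast(history[-1], a, b)
```

A linear model can only follow a steady trend; the appeal of a recurrent NF predictor is that it can also track nonlinear and time-varying behavior, which this stand-in does not attempt.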
Given a monitoring index
where
This NF predictor has a weighted feedback link to each node in layer 2 to deal with time explicitly as opposed to representing temporal information spatially. The context units copy the activations of output nodes from the previous time step, and allow the network to memorize clues from the past, which forms a context for current processing. This function of recurrent networks is valuable for predictors with limited and step inputs (i.e.,
where
where
The fuzzy system parameters are trained by using a hybrid algorithm: that is, the premise parameters in the MFs
4. Online training of the diagnostic classifier
The developed diagnostic classifier should be optimized in order to achieve the desired input-output mapping. Several training algorithms have been proposed in the literature for NF-based classification schemes (Figueiredo et al., 2004, Castellano et al., 2004). In offline training, representative data should cover all of the possible application conditions (Korbicz et al., 2004); such a requirement is usually difficult to achieve in real-world machinery applications because most machinery operates in noisy and uncertain environments. Furthermore, machinery dynamic characteristics may change suddenly, for instance, just after repair or regular maintenance. Therefore, an adaptive training algorithm is preferred in time-varying systems to accommodate different machinery conditions (Wang & Lee, 2002). In this case, a hybrid method based on the recursive Levenberg-Marquardt (LM) method and LSE will be adopted to train the integrated classifier. Such a training approach possesses randomness that may help to escape certain local minima.
4.1. Training the premise MF parameters
The nonlinear premise MF parameters will be trained by adopting the recursive LM method. The general LM algorithm possesses quadratic convergence close to a minimum. Its convergence property is still reasonable, even if the initial estimates are poor. In addition, the LM algorithm has been proven globally convergent in many applications by properly choosing the step factors.
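The damped update at the heart of the LM method can be sketched on a small example. The code below is a batch illustration, not the recursive formulation developed in this section: it assumes a Gaussian MF with two parameters (center and spread), fits it to target membership values, and adapts the damping factor by a simple accept/reject rule. The MF form, the function names, the damping schedule, and the synthetic data are all assumptions.

```python
import math


def gaussian_mf(x, center, sigma):
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def sum_squared_error(xs, targets, center, sigma):
    return sum((t - gaussian_mf(x, center, sigma)) ** 2 for x, t in zip(xs, targets))


def lm_step(xs, targets, center, sigma, damping):
    """One LM step: solve (J^T J + damping * I) * delta = J^T e for (center, sigma)."""
    h11 = h12 = h22 = g1 = g2 = 0.0
    for x, t in zip(xs, targets):
        mu = gaussian_mf(x, center, sigma)
        e = t - mu
        j_c = mu * (x - center) / sigma ** 2       # d(mu)/d(center)
        j_s = mu * (x - center) ** 2 / sigma ** 3  # d(mu)/d(sigma)
        h11 += j_c * j_c
        h12 += j_c * j_s
        h22 += j_s * j_s
        g1 += j_c * e
        g2 += j_s * e
    h11 += damping
    h22 += damping
    det = h11 * h22 - h12 * h12
    d_center = (h22 * g1 - h12 * g2) / det
    d_sigma = (h11 * g2 - h12 * g1) / det
    return center + d_center, sigma + d_sigma


def fit_mf(xs, targets, center, sigma, damping=1e-2, iterations=60):
    """Accept a step only if it lowers the error; adapt the damping accordingly."""
    error = sum_squared_error(xs, targets, center, sigma)
    for _ in range(iterations):
        new_center, new_sigma = lm_step(xs, targets, center, sigma, damping)
        if new_sigma > 0.0:
            new_error = sum_squared_error(xs, targets, new_center, new_sigma)
        else:
            new_error = float("inf")  # reject non-physical spreads
        if new_error < error:
            center, sigma, error = new_center, new_sigma, new_error
            damping = max(damping * 0.1, 1e-12)  # behave more like Gauss-Newton
        else:
            damping = min(damping * 10.0, 1e12)  # behave more like gradient descent
    return center, sigma


# Synthetic targets generated from a known MF (center 1.0, spread 0.3).
xs = [0.1 * i for i in range(21)]
targets = [gaussian_mf(x, 1.0, 0.3) for x in xs]
fitted_center, fitted_sigma = fit_mf(xs, targets, center=0.9, sigma=0.4)
```

The damping factor plays the role of the step factor mentioned above: a small value gives fast Gauss-Newton-like steps near a minimum, while a large value gives short, cautious steps when the initial estimates are poor.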
For a training data pair
where
To simplify expressions, the variable
where
The Hessian matrix can be expressed as
In implementation, instead of computing the
where
Correspondingly, (15) can be rewritten as
where
The computation of
Based on the matrix inversion formula and by some manipulations, Eq. (18) becomes
The recursive LM algorithm can be represented by
The denominator
By simulation tests with the requirements of the recognition rate ≥ 80%, reasonable training speed and accuracy, the following initial values are given to the related parameters in this study:
4.2. Implementation of the hybrid training method
In implementation, inside each training epoch, the nonlinear MF parameters in the classifier are optimized in the backward pass by using the recursive LM method, whereas the consequent linear rule weights are updated by LSE in the forward pass. In addition, after training or real application over some time period, if an updated rule weight wj is sufficiently small (e.g., wj < 0.01), the contribution of the related rule to the final classification operation can be neglected, and that rule can be removed from the rule base.
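The rule-removal step can be sketched in a few lines. This is only an illustration of the thresholding idea: the rule labels and weights are hypothetical, and whether the remaining weights are rescaled after pruning is not specified here, so the sketch leaves them unchanged.

```python
def prune_rules(rules, weights, threshold=0.01):
    """Drop rules whose weight wj is below `threshold`; keep the rest unchanged."""
    kept_rules = []
    kept_weights = []
    for rule, weight in zip(rules, weights):
        if weight < threshold:
            continue  # contribution to the final classification is negligible
        kept_rules.append(rule)
        kept_weights.append(weight)
    return kept_rules, kept_weights


# Hypothetical rule base with one negligible rule.
rules = ["R1", "R2", "R3"]
weights = [0.80, 0.005, 0.20]
rules, weights = prune_rules(rules, weights)
```

Pruning in this way shrinks the rule base over time, which reduces the computation needed in each subsequent training epoch.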
5. Performance evaluation
5.1. Experimental setup
Fig. 7 shows the experimental setup used in this study to verify the performance of the proposed integrated classifier.
The apparatus is anchored onto a massive concrete block. It consists of a 3-hp AC drive motor and a gearbox. The motor rotation is controlled by a speed controller which allows the tested gears to operate in the range of 20 to 4200 rpm. An optical sensor provides a one-pulse-per-revolution signal which is used as the reference for the time synchronous average filtering. The gearbox consists of two pairs of spur or helical gears. The shafts in the gearbox are mounted to the housing by rolling element bearings. The load is provided by a magnetic loading system which is connected to the output shaft. The speed of the drive motor and the load are adjusted to simulate different speed/load operating conditions. The vibration is measured using ICP accelerometers mounted on the gearbox housing along different orientations. After being properly preconditioned, the collected signals are fed to a computer for further processing.
5.2. Performance evaluation
To verify the viability of the proposed classifier, five gear cases are tested in this study as represented in Fig. 8:
a. healthy gears (C1);
b. gears having a tooth crack with 15% (C2) and 50% (C3) tooth root thickness;
c. gears having a chipped tooth with 10% (C2) and 40% (C3) tooth surface area removed.
These demonstrated faults belong to localized gear defects. From the signal property standpoint, when a localized fault occurs, some high-amplitude pulses will be generated due to impacts, which are relatively easier for a signal processing technique to recognize. When a localized fault propagates towards a distributed defect, the overall energy of the fault will increase, but it often becomes more wideband in nature and difficult to detect in the presence of the other vibratory components of the machine. This example identifies a characteristic of currently used fault detection techniques: It is usually easier to detect a distinct low-level narrowband tone than a high-level wideband signal in the presence of other signals or noises. Even though a distributed defect, such as pitting and wear, is initiated from a localized fault which is detectable as an incipient defect, most currently available vibration-based signal processing techniques cannot effectively detect an advanced distributed fault which, however, can be diagnosed based on other information carriers, such as acoustic signals.
To make a comparison, the diagnostic results from the following three classifiers are also listed:
Fuzzy classifier: A pure fuzzy system with a similar reasoning architecture as in Fig. 2 but without the use of predictors. The rule weight factors are chosen as those in the integrated classifier after initial training.
Classifier-1: An NF classifier with a similar reasoning architecture as in Fig. 2 but without predictors. Its MF parameters are trained by a gradient-LSE algorithm.
Classifier-2: Same as Classifier-1, but trained by the hybrid algorithm of the recursive LM and LSE.
Given the network architectures, the initial parameters of the three adaptive classifiers can be preliminarily trained by using data sets collected in previous tests on the same test apparatus, or be initialized by experience. These classifier parameters are then optimized in the subsequent online training processes.
During online tests, motor speed and load levels are randomly changed to simulate general and unusual machinery operating conditions. The tests are conducted under load levels from 0.5 to 3 hp, and motor speeds from 50 to 3600 rpm.
In online monitoring, based on test schedule and load/speed change frequency, the monitoring time-interval is set at 15 minutes; that is, all the monitoring schemes are applied automatically every 15 minutes for condition monitoring operations. Three-steps-ahead predictors (i.e., r = 3) are used in the integrated classifier. The selection of data size depends on noise reduction requirement; usually the data for the gear with the lowest speed should cover more than 100 revolutions. For example, if the slowest gear speed in the gearbox is 1200 rpm, the data acquisition process takes at least 5 seconds (15 seconds in this case). The monitoring is performed gear by gear. Three examples corresponding to healthy, cracked and chipped gears (all having 41 teeth) have been illustrated in Figs. 3 to 5, respectively.
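The data-size rule of thumb stated above reduces to a one-line calculation, sketched here for convenience. The function name and the default of 100 revolutions (taken from the text) are the only assumptions.

```python
def min_acquisition_seconds(slowest_gear_rpm, min_revolutions=100):
    """Shortest record length (s) that covers `min_revolutions` of the slowest gear."""
    if slowest_gear_rpm <= 0:
        raise ValueError("gear speed must be positive")
    revolutions_per_second = slowest_gear_rpm / 60.0
    return min_revolutions / revolutions_per_second


print(min_acquisition_seconds(1200))  # 5.0
```

At 1200 rpm the slowest gear turns 20 times per second, so 100 revolutions take 5 seconds; the 15-second records used in this case therefore cover about 300 revolutions.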
Each healthy gear condition is tested over 24 hours whereas each faulty gear condition is tested over 50 hours. In total, 386 data pairs are recorded for testing purposes. Table 1 summarizes the classification performance by different diagnostic schemes.
The fuzzy classifier records 15 missed alarms and 37 false alarms, with an overall reliability of 85.3%. Its relatively poor diagnostic performance is mainly due to the lack of learning capability. In addition, fixed or human-determined system parameters are subject to variations and are rarely optimal in terms of reproducing the desired classification outputs, which results in the fuzzy classifier not being optimized under different operating conditions.
Classifier-1 records 7 missed alarms and 21 false alarms, with an overall reliability of 92.5%. One difference between this NF system and the fuzzy classifier is related to the rule weight factors. Each signal processing technique (and the resulting feature) has a limited capability in fault detection. Even if the firing strengths of two fuzzy if-then rules are identical, their diagnostic reliabilities may be different under different machinery conditions. Therefore, rule weights play an important role in the diagnostic classification operations.
Classifier-2 records 7 missed alarms and 17 false alarms, with an overall reliability of 93.6%. The main difference between Classifier-2 and Classifier-1 is related to training algorithms. It is seen that the recursive LM algorithm is superior to the gradient method in convergence, and has the randomness to reduce the chance of becoming trapped in local minima. In addition, each rule has its own decision (mapping) space, whereas the MFs and the rule weights are directly associated with the characteristics of the decision space. The efficient optimization of classifiers can adjust the boundary characteristics of the decision space so as to reduce misclassifications. This property is especially important for classifiers with coarse fuzzy partitions.
The developed integrated classifier generates 3 missed alarms and 7 false alarms, with an overall reliability of 97.6%. Compared with Classifier-2, the integrated classifier can enhance the classification accuracy by properly implementing the future states of the classifier. It follows that adaptively fine-tuning the fuzzy parameters is necessary to enhance the approximation of the mapping from the observed symptoms to the underlying faults. In addition, the fault severity can be recognized because, to some extent, the greater the fault, the more pronounced the feature modulation, and the larger the monitoring indices will become.
The developed integrated diagnostic classifier provides a robust problem solving framework. Machinery conditions vary dramatically in real-world applications, and new system conditions may occur under different circumstances. With the help of an adequate learning algorithm, new information can be extracted from online training, and the diagnostic knowledge base can be expanded automatically to accommodate different machinery conditions.
In general, the deterioration history of most machinery components follows a “U curve” as illustrated in Fig. 9. It consists of four periods: the run-in stage (I), the normal operation period (II), and the initial (III) and advanced (IV) failure stages. Such a trend characteristic is easy for a capable NF predictor to capture. If a false alarm is generated during the healthy period II, the false alarm is induced by noise instead of a real defect. Based on the forecast result, the diagnostic state should lie in period III (or initial defect). However, random noise will disappear in the following processing steps, and the diagnostic indicator should return to period II (or healthy). Correspondingly, this misclassification can be prevented by the integrated classification/forecasting information. On the other hand, if an object is damaged, its diagnostic indicator should lie in period III (or IV). If a misclassification occurs, or the diagnostic indicator falls in period II, the forecast information will be contradictory to that from the classifier. Comprehensive analysis in Eq. (7) can avoid this possible missed alarm so as to improve fault diagnostic reliability. In both aforementioned examples, the classifier will be updated to accommodate such noise in subsequent monitoring applications.
6. Conclusions
In this chapter, an integrated classifier is developed for gear fault diagnostics. The purpose is to provide industries with a more reliable monitoring tool to prevent machinery system performance degradation, malfunction, and sudden failure. The classifier can integrate different features for a more positive assessment of the object’s health condition. The diagnostic reliability is improved by properly integrating the future states of the gear, which are forecast by multi-step predictors. An online hybrid training technique based on the recursive LM method and LSE is adopted to improve the classifier’s convergence and adaptive capability to accommodate different machinery conditions. The viability of the new integrated classifier has been verified by experimental tests corresponding to different gear conditions.
It should be noted, however, that although satisfactory results have been achieved with the developed integrated classifier, its network architecture is relatively complex, which may make it difficult to implement in some real-world applications. Future research will develop novel evolving fuzzy or neuro-fuzzy classification schemes for more effective diagnostic operations. New training algorithms will be proposed to further improve the training convergence. The proposed techniques will also be employed in real-world industrial applications in vehicles, wind turbines, and manufacturing facilities.
Acknowledgments
This work was partly supported by MC Technologies Inc. and Materials and Manufacturing Ontario in Canada.
References
1. Chelidze, D. & Cusumano, J. (2004). A dynamical systems approach to failure prognosis, vol. 126, pp. 1-7.
2. Castellano, G., Fanelli, A. & Mencar, C. (2004). An empirical risk functional to improve learning in a neuro-fuzzy classifier, Man, Cybernetics, Part B, vol. 34, pp. 725-31.
3. Figueiredo, M., Ballini, R., Soares, S., Andrade, M. & Gomide, F. (2004). Learning algorithms for a class of neurofuzzy network and applications, Man, Cybernetics, Part C, vol. 34, pp. 293-301.
4. Gusumano, J., Chelidze, D. & Chatterjee, A. (2002). Dynamical systems approach to damage evolution tracking, part 2: Model-based validation and physical interpretation, vol. 124, pp. 258-264.
5. Ishibuchi, H. & Yamamoto, Y. (2005). Rule weight specification in fuzzy rule-based classification systems, vol. 13, pp. 428-435.
6. Isermann, R. (1998). On fuzzy logic applications for automatic control, supervision, and fault diagnosis, vol. 28, pp. 221-235.
7. Jang, J. (1993). ANFIS: adaptive-network-based fuzzy inference system, Man, Cybernetics, vol. 23, pp. 665-685.
8. Korbicz, J., Koscielny, J., Kowalczuk, Z. & Cholewa, W. (2004). Springer.
9. Li, L. & Lee, H. (2005). Gear fatigue crack prognosis using embedded model, gear dynamic model and fracture mechanics, vol. 20, pp. 836-846.
10. Mansoori, E., Zolghadri, M. & Katebi, S. (2007). A weighting function for improving fuzzy classification systems performance, vol. 158, pp. 583-591.
11. McFadden, P. (1986). Detecting fatigue cracks in gears by amplitude and phase demodulation of the meshing vibration, vol. 108, pp. 165-170.
12. Pourahmadi, M. (2001). John & Sons.
13. Rish, I., Brodie, M., Ma, S., Odintsova, N., Beygelzimer, A., Grabarnik, G. & Hernandez, K. (2005). Adaptive diagnosis in distributed systems, vol. 16, pp. 1088-1109.
14. Tse, P. & Atherton, D. (1999). Prediction of machine deterioration using vibration based fault trends and recurrent neural networks, vol. 121, pp. 355-362.
15. Uluyol, O., Kim, K. & Nwadiognu, E. (2006). Synergistic use of soft computing technologies for fault detection in gas turbine engines, Man, Cybernetics, Part C, vol. 36, pp. 476-484.
16. Wang, J. & Lee, L. (2002). Self-adaptive neuro-fuzzy inference systems for classification applications, vol. 10, pp. 790-802.
17. Wang, W. (2008). An intelligent system for machinery condition monitoring, vol. 16, no. 1, pp. 110-122.
18. Wang, W., Ismail, F. & Golnaraghi, F. (2001). Assessment of gear damage monitoring techniques using vibration measurements, vol. 15, pp. 905-922.
19. Wang, W. & Vrbanek, J. (2007). A multi-step predictor for dynamic system property forecasting, vol. 18, pp. 3673-3681.