Adaptive Signal Decomposition Methods for Vibration Signals of Rotating Machinery

Vibration-based condition monitoring and fault diagnosis are becoming more common in the industry to increase machine availability and reliability. Considerable research efforts have recently been directed towards the development of adaptive signal processing methods for fault diagnosis. Two adaptive signal decomposition methods, i.e. the empirical mode decomposition (EMD) and the local mean decomposition (LMD), are widely used. This chapter is intended to summarize the recent developments mostly based on the authors' works. It aims to provide a valuable reference for readers on the processing and analysis of vibration signals collected from rotating machinery.


Introduction
Signal processing methods with adaptive basis functions are more effective in revealing the overlapping components in vibration signals. They are able to adaptively disassemble nonlinear and non-stationary signals into some simpler signal components. The empirical mode decomposition (EMD) [1] method and the local mean decomposition (LMD) [2] method have been recognized to be such effective adaptive signal processing methods.
Since the introduction of EMD in 1998 [1] and LMD in 2005 [2], many improvements and applications have been reported. In this chapter, we summarize the recent developments, mostly based on the authors' works. We hope that it is a valuable reference for readers on the processing and analysis of vibration signals collected from rotating machinery. Although the details of decomposition and the resulting signals are quite different, these two methods share some common advantages, for example, the adaptive property. They also share some common challenges, which will be addressed in Section 3. Ref. [4] provides a comparative study, and Ref. [5] reviews applications of EMD in the field of fault diagnosis.
No matter which of the two methods is used, a multi-component signal, $x(t)$, can be adaptively decomposed into $k$ mono-components, $x_p(t)$ ($p = 1, 2, \ldots, k$) (IMFs for EMD or PFs for LMD), and a residue, $u_k(t)$, and can be reconstructed by summing them together, i.e.

$$x(t) = \sum_{p=1}^{k} x_p(t) + u_k(t) \qquad (1)$$

Reported improvements in EMD and LMD
The EMD and the LMD methods are proven to be quite versatile in a broad range of applications for adaptively extracting signals of interest from noisy data. This section discusses their main and common challenges, including end effects, mode mixing, feature signal selection and strong noise reduction. After analysing each issue, the corresponding improvement is also shown. Other open issues, such as stopping criterion and envelope function, will be briefly discussed in Section 4.

End effects
End effects have plagued data analysis from the beginning of any known method [6]. The end effects were first mentioned in the spline fitting of the EMD. This section briefly reviews related improvements and then introduces an adaptive method to eliminate the end effects for the vibration signals collected from rotating machinery.

Improvements for eliminating end effects
Two ways have been proposed to eliminate end effects. One timid way is to use a sliding window [7], as is done routinely in Fourier analysis [6]. Sliding windows of various types have been applied successfully in Fourier analysis and continuous wavelet analysis. However, the choice of an appropriate and reliable window usually depends on the analysis method rather than on the data themselves, and windowing inevitably sacrifices some precious data near the ends [8]. This becomes a hindrance when the data are short.
The other way is to extend or predict the data beyond their existing range, which is still the best basic solution. Huang et al. [1] first proposed adding characteristic waves to treat the effects, in which the extra points are determined by the average of n-waves in the immediate neighbourhood of the ends. Motivated by this idea, some extension methods extend the temporal waveforms forward and backward by using all available information in the data, including feature-based extension, mirror or anti-mirror extension, intelligent prediction, pattern comparison, etc. [9].
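As a point of reference for the simplest of these options, mirror and anti-mirror extension can be sketched with NumPy's padding routine. This is only an illustration of the general idea, not the implementation used in the cited works; the function names `mirror_extend` and `truncate` are hypothetical.

```python
import numpy as np

def mirror_extend(x, n, anti=False):
    """Extend x by n samples on each side by reflecting it about its end points.

    anti=False gives an even (mirror) reflection,
    anti=True gives an odd (anti-mirror) reflection.
    """
    x = np.asarray(x, dtype=float)
    kind = "odd" if anti else "even"
    return np.pad(x, n, mode="reflect", reflect_type=kind)

def truncate(y, n):
    """Drop the n extended samples on each side after decomposition."""
    return y[n:len(y) - n]

x = np.array([1.0, 2.0, 4.0, 3.0])
ext = mirror_extend(x, 2)                  # even reflection about both end samples
anti_ext = mirror_extend(x, 2, anti=True)  # odd reflection about both end samples
```

The extended samples are removed again with `truncate` once the decomposition has been carried out on the longer signal.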
Prediction methods have been shown to perform well in extending data. It is not necessary to predict the whole time series; only the values and locations of the extrema adjacent to the ends are needed. However, as pointed out by Huang and Shen [6], data extension or prediction is a risky procedure even for linear and stationary processes. For nonlinear and non-stationary processes, problems such as the conditions for predictability, the prediction method and its accuracy are still open at present. Meanwhile, intelligent methods have their own shortcomings, including local minima and over-fitting in artificial neural networks (ANN) and sensitivity to parameter selection in both support vector regression and ANN.
Whichever method is used, the main idea is that the newly added points cause minimal interior perturbation while extending the signal, implicitly or explicitly, beyond its existing range. Furthermore, the extended data should repeat the form or features of the original signal well. The reliability of such an extension decreases sharply with its distance from the known data set, and thus care is needed when extending a signal only by adding extrapolated data to it [10]. Otherwise, the error of such an operation would propagate from the ends to the interior of the data and could even cause severe deterioration of the whole signal [9].
For most vibration signals generated by rotating machinery, non-linear and non-stationary properties are inherent, which makes data extension quite challenging. Although mirror image extension is easier to put into practice, the fact that the data mostly come from non-stationary stochastic systems must be faced. Fortunately, the vibration signal has a property that assists the extension: it is cyclo-stationary [11]. Meanwhile, extension based on the characteristics of the signal waveform seems to be more appropriate for problems of such complexity [10]. In the following section, an adaptive waveform extension method [9] is introduced to extend vibration signals and avoid error accumulation.

Adaptive data extension-based spectral coherence
To facilitate applications to condition monitoring and fault diagnosis, the designed extension method should combine good extension performance with ease of implementation. An adaptive extension method [9] was designed for vibration signals, mainly including three steps: waveform segmentation, spectral coherence comparison and waveform extension. Its main idea is to automatically search the interior of the signal for the waveforms whose frequency spectra are most similar to those at the ends, and then use their successive segments for signal extension. In this method, a critical point is how to measure waveform similarity. Although there are several similarity measures, such as the correlation coefficient, cross-correlation and waveform similarity, originally used in the fields of data fusion, pattern recognition and speech recognition, most of them are susceptible to noise and are not suitable for vibration signals, whose acquisition and transmission often suffer from noise. Therefore, an index measuring the spectral coherence [12] is introduced here. The procedure is described as follows [9]:
Step 1. Waveform segmentation. Identify zero crossings of the analyzed signal and then separate the signal into $N$ segments, $c_i(t)$ ($i = 1, \ldots, N$).
Step 2. Segment repetition and fast Fourier transform (FFT). Repeat each segment to form a long waveform and then conduct fast Fourier transform (FFT).
Step 3. Spectral coherence comparison. Use Eq. (2) to calculate the revised spectral coherence (RSC) values between the first segment $c_1(t)$ and the other segments, and then find the segment, denoted as $c_{back}(t)$, having the largest RSC value. Similarly, search for the segment $c_{for}(t)$ most similar to the last segment $c_N(t)$.
In Eq. (2), $C_i(F)$ and $C_j(F)$ are the frequency spectra of the signal components $c_i(t)$ and $c_j(t)$, respectively.
Step 4. Waveform extension. Use the previous segment of $c_{back}(t)$ for extending backward and the next segment of $c_{for}(t)$ for extending forward.
Based on this, the extended signal can be decomposed by the EMD or LMD method, and the extended samples are finally truncated before further analysis. Using the hidden periodicity, a cyclo-stationary signal, for example, a vibration signal, can be easily extended beyond its original range, and its temporal continuity in the time domain and spectral coherence in the frequency domain can be properly maintained.
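The four steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the method of Ref. [9]: the RSC of Eq. (2) is not reproduced here, so a normalised real cross-spectrum sum (`spectral_similarity`) is used as a stand-in, each segment is tiled to eight times the longest segment length, and all function names are hypothetical.

```python
import numpy as np

def segment_at_zero_crossings(x):
    """Step 1: split x wherever its sign changes."""
    negative = np.signbit(x)
    cuts = np.flatnonzero(negative[1:] != negative[:-1]) + 1
    return np.split(x, cuts)

def tiled_spectrum(segment, length):
    """Step 2: repeat a segment up to `length` samples and take its FFT."""
    reps = -(-length // len(segment))  # ceiling division
    return np.fft.rfft(np.tile(segment, reps)[:length])

def spectral_similarity(a, b):
    """Assumed stand-in for the RSC of Eq. (2): normalised real cross-spectrum sum."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.vdot(a, b).real / denom) if denom > 0 else 0.0

def adaptive_extend(x):
    """Steps 3-4: return (extended signal, samples added in front, samples added at the back)."""
    x = np.asarray(x, dtype=float)
    segs = segment_at_zero_crossings(x)
    n = len(segs)
    if n < 3:
        raise ValueError("need at least three segments")
    length = 8 * max(len(s) for s in segs)
    spectra = [tiled_spectrum(s, length) for s in segs]
    # c_back: segment most similar to the first one; the segment before it is prepended.
    back = max(range(2, n), key=lambda j: spectral_similarity(spectra[0], spectra[j]))
    # c_for: segment most similar to the last one; the segment after it is appended.
    fwd = max(range(0, n - 2), key=lambda j: spectral_similarity(spectra[-1], spectra[j]))
    head, tail = segs[back - 1], segs[fwd + 1]
    return np.concatenate([head, x, tail]), len(head), len(tail)
```

The two returned lengths allow the extended samples to be truncated after decomposition. For a noisy measured signal, the raw sign changes would likely need smoothing or a minimum segment length before Step 1, which this sketch omits.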

Experiment and analysis
A vibration signal collected from an industrial traction motor [13] is shown in Figure 2. The specification of the experiment setup is given in Table 1. This signal is cyclo-stationary with around three cycles, and its waveform is thus divided into six segments, which are marked in Figure 2. Figure 3 shows frequency spectra of these segments.
To estimate the influence caused by the end effects, a measure of energy change, $\theta$ [14], before and after decomposition is defined in terms of $R_x$, $R_p$ and $R_u$, the root-mean-square (RMS) values of the original signal $x(t)$, the $p$th product function $PF_p(t)$ and the residue signal $u_k(t)$, respectively. The measure satisfies $\theta \geq 0$. The closer the measure is to zero, the smaller the error between the original signal and the decomposition results, that is to say, the smaller the influence caused by end effects.
The revised spectral coherence (RSC) values $\gamma_{1,j}$ ($j = 2, \ldots, 6$), i.e. the RSC values between the segment $c_1(t)$ and each of the other segments, are shown above each sub-figure in Figure 3. It can be seen that the segment $c_5(t)$ has the largest RSC value of 0.97, and its previous segment $c_4(t)$ is then used for backward extension of $c_1(t)$. In a similar way, $c_2(t)$ has the largest RSC value with the last segment $c_6(t)$, and thus the segment following $c_2(t)$, i.e. $c_3(t)$, is used for forward extension. The extended vibration signal is shown in Figure 4, where the extended waveforms are shown in red. Its RSC value with the original signal is 0.94, and the measure $\theta$ is 0.005. Without extension, the measure $\theta$ is 0.106.
Note: $f_s$, sampling frequency; $f_r$, rotating frequency of the motor; $f_d$, characteristic defect frequency; BPFO, the ball pass frequency of the outer race; BPFI, the ball pass frequency of the inner race.
After applying the LMD method to the extended signal and truncating the extended parts, five PFs and a residue are obtained, the first three of which have larger correlation coefficient values with the original signal and are thus selected for further analysis. Their waveforms and envelope spectra are shown in Figure 5. In Figure 5(d), the identified characteristic frequency (104 Hz) and its harmonics (around 2 × and 3 × BPFO) can be easily observed.
The error between the theoretical value (114 Hz) and the identified one (104 Hz) is mainly caused by the inaccurate shaft speed after long use and the limited number of samples (only 0.12 s). In Figure 5(e) and (f), higher impulses are identified at the frequency of 25 Hz, corresponding to the motor rotating frequency. It indicates that PF1 is the signal generated by the inspected bearing with an outer race defect, and PF2 and PF3 are generated by the motor, which turned out after inspection to have an eccentricity problem. More cases on bearings and gears can be found in Ref. [9].
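The energy-change measure used in this experiment can be illustrated with a short sketch. The exact formula of Ref. [14] is not reproduced in this section; the code below assumes the commonly used form $\theta = |\sqrt{\sum_p R_p^2 + R_u^2} - R_x| / R_x$, which is consistent with $\theta \geq 0$ and with $\theta = 0$ for an energy-preserving decomposition. The function names are hypothetical.

```python
import numpy as np

def rms(x):
    """Root-mean-square value of a signal."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

def energy_change(x, components, residue):
    """Assumed form: theta = |sqrt(sum_p R_p^2 + R_u^2) - R_x| / R_x."""
    r_x = rms(x)
    total = sum(rms(c) ** 2 for c in components) + rms(residue) ** 2
    return float(abs(np.sqrt(total) - r_x) / r_x)
```

Under this assumption, mutually orthogonal components that sum to the original signal give a value close to zero, whereas spurious energy introduced near the ends by the decomposition increases it.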

Mode mixing
Another open problem for EMD and LMD is mode mixing. It was originally defined as a single IMF either consisting of signals of widely disparate scales, or a signal of a similar scale residing in different IMF components, which causes serious aliasing in the time-frequency distribution and makes the meaning of an IMF unclear [8]. This section focuses on solutions to the problem of mode mixing.

Separation of disparate components
According to the above definition, there are two possibilities: either completely different components existing in one IMF, or one component appearing in more than one IMF. To remove the former case, Wu and Huang [8] presented a noise-assisted signal processing method, called ensemble EMD (EEMD). In this method, white noise with a pre-setting amplitude is introduced to perturb the analyzed signal and enables the EMD method to visit all possible solutions in the finite neighbourhood of the true final IMF [8]; and the ensemble means of decomposition results help to remove the remaining noise in the results. For the EEMD method, two parameters, the noise amplitude and the ensemble number, are critical, the former of which has more influence on its performance [15]. In order to process signals adaptively, it is ideal to automatically find appropriate parameters for the analyzed signal.
A parameter optimization method [13] is designed for the EEMD. In this method, an index termed relative root-mean-square error (RMSE) is first used to evaluate the performance of the EEMD method when fixing a small ensemble number and setting various noise amplitudes, and then the signal-to-noise ratio (SNR) is introduced to evaluate the remaining noise in the results when gradually increasing the ensemble number.
For a signal, $x_o(k)$, it is assumed that it consists of main component(s), background noise and some components having small correlation coefficients with it. The chief component, which has the largest correlation coefficient with the signal $x_o(k)$, is marked as $c_{max}(k)$. The desired decomposition is to completely separate the component $c_{max}(k)$ from the others, and the relative RMSE is thus used to evaluate the separation performance when setting various noise amplitudes. Its formulation is

$$\text{Relative RMSE} = \sqrt{\frac{\sum_{k=1}^{S} \left( x_o(k) - c_{max}(k) \right)^2}{\sum_{k=1}^{S} \left( x_o(k) - \bar{x}_o \right)^2}}$$

where $\bar{x}_o$ is the mean of the signal $x_o(k)$, and $S$ is the number of samples in this signal. The value of this index is in the range of 0-1. The smaller this index is, the closer the component $c_{max}(k)$ is to the original signal. In that case, the extracted IMF contains not only the main component of interest but also other components, and thus the objective is not achieved. However, there exists a value of the noise amplitude that maximises the index. At this point, the error between $x_o(k)$ and $c_{max}(k)$ comes from noise and the other components, that is to say, the extracted IMF and the remainder of the original signal share no common component, and the main component of interest is extracted from the original signal. The corresponding value is the optimal noise amplitude. The procedure is briefly described as follows [13]:
Step 1. Set a small value of the initial ensemble number, for example, $N_E = 10$, and choose a relatively large value as the initial noise level, $L_N = l_0$. The noise amplitude $A$ is the noise level multiplied by the standard deviation of the signal.
Step 2. Perform the signal decomposition using the EEMD method and calculate the relative RMSE of the chief component c max (k).
Step 3. Decrease the noise level and repeat Step 2 until the change in the relative RMSE is negligible or small enough.
Step 4. Identify the optimal noise level corresponding to the maximal relative RMSE.
Once the optimal noise level is numerically determined, the ensemble number can be determined by comparing the SNR values when gradually increasing the ensemble number from its pre-setting value.
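The search for the optimal noise level can be sketched as follows. The sketch assumes that the relative RMSE is the root of the squared error between $x_o(k)$ and $c_{max}(k)$ divided by the squared deviation of $x_o(k)$ from its mean, and it sweeps a fixed list of noise levels instead of stopping when the change becomes negligible. The argument `decompose` is a hypothetical callable (for example, a wrapper around an EEMD implementation with a fixed ensemble number) that takes the signal and a noise amplitude and returns the IMFs; no particular library is implied.

```python
import numpy as np

def relative_rmse(x_o, c_max):
    """Assumed definition: sqrt(sum (x_o - c_max)^2 / sum (x_o - mean(x_o))^2)."""
    x_o = np.asarray(x_o, dtype=float)
    c_max = np.asarray(c_max, dtype=float)
    num = np.sum((x_o - c_max) ** 2)
    den = np.sum((x_o - x_o.mean()) ** 2)
    return float(np.sqrt(num / den))

def chief_component(x_o, imfs):
    """Return the IMF with the largest correlation coefficient with x_o."""
    cc = [np.corrcoef(x_o, c)[0, 1] for c in imfs]
    return imfs[int(np.argmax(cc))]

def optimal_noise_level(x_o, decompose, levels):
    """Steps 1-4: evaluate each noise level and return (best level, all scores)."""
    sigma = np.std(x_o)  # noise amplitude A = noise level * standard deviation
    scores = []
    for level in levels:
        imfs = decompose(x_o, level * sigma)
        scores.append(relative_rmse(x_o, chief_component(x_o, imfs)))
    best = int(np.argmax(scores))
    return levels[best], scores
```

The returned scores correspond to the curve of relative RMSE against noise level from which the maximum is read.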
To demonstrate this method, a vibration signal was collected from a small motor [13] and is shown in Figure 7(a). In the experiment, a fault was set on the outer race of the tested bearing. The specification of the experiment is shown in Table 1. The initial parameters are set as a relatively large noise level, $L_N = 2$, and the ensemble number $N_E = 10$. During the execution of the above procedure, the noise level is gradually decreased. For $0.1 \leq L_N \leq 2$, the noise level is decreased in steps of 0.1; for $0.01 \leq L_N < 0.1$, in steps of 0.01; for $0.001 \leq L_N < 0.01$, in steps of 0.001.
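The decreasing noise-level schedule described above can be generated as follows. This is a small sketch; the function name is hypothetical, and integer counters are used only to avoid floating-point drift in the step sizes.

```python
def noise_levels():
    """Noise levels from 2 down to 0.001 with steps of 0.1, 0.01 and 0.001."""
    coarse = [k / 10 for k in range(20, 0, -1)]    # 2.0, 1.9, ..., 0.1
    medium = [k / 100 for k in range(9, 0, -1)]    # 0.09, 0.08, ..., 0.01
    fine = [k / 1000 for k in range(9, 0, -1)]     # 0.009, 0.008, ..., 0.001
    return coarse + medium + fine

levels = noise_levels()
```

The schedule contains 38 candidate levels, so 38 decompositions with the small initial ensemble number are needed for a full sweep.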
After applying the EEMD method with the above optimization method to decompose the vibration signal, the relative RMSEs for various noise levels are shown in Figure 6. As shown in this figure, the maximal relative RMSE is reached at the noise level of 0.4, which is therefore the optimal one, and the correspondingly extracted IMF (IMF1) is shown in Figure 7(b).
Compared with the original signal, most of the noise and redundant components are removed from IMF1, and its kurtosis value is 26.07.
For comparison, the IMFs extracted when setting three non-optimal noise levels are also shown in Figure 7. Having determined the optimal noise level, an appropriate ensemble number is then determined. The variation in the SNR is shown in Figure 8. As the figure shows, when the ensemble number is less than 80, increasing the ensemble number gradually raises the SNR value. When the ensemble number is larger than 120, the SNR value fluctuates only slightly. Further increasing its value contributes only a minor increase in the SNR but a definite rise in computation cost. Therefore, using this optimization method, the parameters of the EEMD can be automatically determined according to the signal itself, instead of by empirical setting or trial and error. More cases on bearings can be found in Ref. [13].

Mixing of similar components
Although the EEMD method can successfully separate signal components with different scales, another kind of mode mixing still exists in the decomposition results, that is to say, one component may spread over more than one IMF. This results in energy dispersion and some redundant components without physical significance. It may be caused by the repeated sifting process and a strict stopping criterion. A simple and convenient solution is to combine the components from the same source. Therefore, the index of spectral coherence in Eq. (2) is used to evaluate the spectral similarity of two successive components and then combine the components with similar spectra into a natural IMF [12].
Using the index of spectral coherence, the similarity criterion of two successive IMFs obtained by the EEMD method is described as:
(1) If $\gamma_{j,j+1} \to 1$, the IMFs $c_j$ and $c_{j+1}$ are similar in the frequency domain, that is to say, they have spectral coherence over the whole frequency range. These two IMFs should therefore come from the same source and are combined into one natural IMF (NIMF).
(2) If $\gamma_{j,j+1} \to 0$, the two IMFs have no spectral coherence. They are viewed as two natural IMFs and are not combined.
(3) If $\gamma_{j,j+1}$ is around 0.5, the spectral coherence of the two IMFs cannot be determined. Such signal components are also viewed as two natural IMFs and are not combined.
The signal in Figure 2 is used to demonstrate the process of similarity analysis and combination. After applying the EEMD method with a noise level of 0.2 and an ensemble number of 30 to the signal, 12 IMFs are obtained, the first four of which have larger correlation coefficient values with the original signal and are shown in Figure 9. As shown in the figure, the frequency spectrum of IMF1 is dominated by high frequencies and centred at 12 kHz, which indicates that IMF1 corresponds to the signal generated by the faulty bearing in the traction motor; IMF3 and IMF4 share the common frequency of 920 Hz generated by the faulty motor. Furthermore, the revised spectral coherences of all IMFs are calculated, and the results are shown in Table 2. According to this table, there are three local minima, i.e. $\gamma_{2,3}$, $\gamma_{5,6}$ and $\gamma_{7,8}$. The RSC values of IMF3-IMF4 and IMF4-IMF5 are larger than 0.5, which shows their similarity in the frequency domain, and thus these three components are combined into one natural IMF. Between the second and the third local minima, IMF6 and IMF7 show spectral similarity. Similarly, the remaining components, IMF8-IMF12, also show spectral similarity and are thus merged into another natural IMF. The RSC value of IMF1 and IMF2 is not close to 1 or 0, and these two IMFs are thus two natural IMFs. The final results are shown in Figure 10. The last two components are practically residues. Based on the local minima of the RSC, a fusion rule [12] was designed to automatically combine components from the same source and remove the mode mixing in the original EMD method.
Other applications on bearings can be found in Ref. [12].
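The grouping logic described in this example can be sketched as follows. This is one possible reading of the fusion rule, not the rule of Ref. [12] itself: a boundary is assumed wherever the RSC between two successive IMFs is an interior local minimum or does not exceed a threshold of 0.5. The RSC values in `example_gammas` are hypothetical and chosen only to reproduce the pattern described in the text; they are not the values of Table 2.

```python
def fuse_imfs(gammas, threshold=0.5):
    """Group successive IMFs (1-based indices) into natural IMFs.

    gammas[j] is the coherence between IMF j+1 and IMF j+2.
    Assumed rule: start a new group where the value is an interior
    local minimum or does not exceed `threshold`.
    """
    groups, current = [], [1]
    for j, g in enumerate(gammas):
        is_local_min = 0 < j < len(gammas) - 1 and gammas[j - 1] > g < gammas[j + 1]
        if is_local_min or g <= threshold:
            groups.append(current)
            current = []
        current.append(j + 2)
    groups.append(current)
    return groups

# Hypothetical RSC values for 12 IMFs (11 successive pairs), for illustration only.
example_gammas = [0.48, 0.20, 0.80, 0.70, 0.30, 0.75, 0.40, 0.85, 0.90, 0.92, 0.95]
groups = fuse_imfs(example_gammas)
```

Each natural IMF would then be obtained by summing the IMFs in one group.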

Strong noise reduction
In real rotating machinery, a raw vibration signal generally consists of strong noise and two or more sources. Some disturbances, such as those caused by improper installation and surfacing of the installed sensors, random impacts from friction and contact forces, and external disturbances [16], are so strong that the signal of interest is completely overwhelmed. Therefore, recovering the feature signal from noise while preserving its features is a challenging problem. This section introduces a hybrid signal processing method [17] for noisy vibration signals.

Problem analysis
Although the EEMD method improves the scale separation ability of the EMD method, both methods rely on extrema to discriminate signals generated by various sources. When the signal of interest is completely overwhelmed by strong noise, there may be a lack of the extrema necessary for the EEMD method to separate the real signal from noise. An experimental signal collected from a bearing with an inner race defect is used to illustrate this problem [17]. The specification of the experiment is shown in Table 1. To simulate strong noise in real cases, Gaussian white noise was added to the experimental signal, and the generated noisy signal is shown in Figure 11. As shown in the figure, the impulses caused by the faulty bearing are completely masked by strong noise. After applying the EEMD method to this signal, 13 IMFs are obtained, and the first four, which have larger correlation coefficient values with the original signal, are shown in Figure 12; impulses are seldom observed in them and remain buried in noise. This is because the decomposition method lacks the necessary extrema generated by the tested faulty bearing.
For a signal with a relatively low signal-to-noise ratio, it is necessary to design an adaptive filter to extract the weak feature signal of interest from the noisy signal to facilitate further signal decomposition. A possible solution is to use the spectral kurtosis (SK), which has proven to be a powerful tool for identifying the existence of bearing faults buried in noise. Its value is large in frequency bands where the impulsive bearing fault signal is dominant, and is effectively zero where the spectrum is dominated by stationary components [16]. Based on this, an SK-based filter [18] was used to pre-process the signal in Figure 11 and remove part of the noise before decomposition. It is a kind of band-pass filter whose parameters, centre frequency and bandwidth, are selected on the basis of the SK. The filtered signal is shown in Figure 13. Although the filtered signal still contains some noise, its impulses are a little clearer than those in the original signal, and its kurtosis value is increased from 3.07 to 3.97. Consequently, a hybrid method is used to reinforce the performance of noise reduction.
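The behaviour of the spectral kurtosis described above can be illustrated with a short-time Fourier transform (STFT) based estimate, $SK(f) = \langle |X(t,f)|^4 \rangle / \langle |X(t,f)|^2 \rangle^2 - 2$. The sketch below is not the SK-based filter of Ref. [18]; it only estimates the SK with a fixed Hann window, and the window length `nperseg` and the 50% overlap are assumptions.

```python
import numpy as np

def spectral_kurtosis(x, nperseg=64):
    """STFT-based estimate: SK(f) = <|X|^4> / <|X|^2>^2 - 2, one value per frequency bin."""
    x = np.asarray(x, dtype=float)
    hop = nperseg // 2  # 50% overlap between frames (assumed)
    n_frames = 1 + (len(x) - nperseg) // hop
    idx = np.arange(nperseg)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(nperseg)
    spectrum = np.fft.rfft(frames, axis=1)
    p2 = np.mean(np.abs(spectrum) ** 2, axis=0)
    p4 = np.mean(np.abs(spectrum) ** 4, axis=0)
    return p4 / p2 ** 2 - 2.0
```

For stationary Gaussian noise, the estimate stays close to zero in the interior frequency bins, whereas repeated short bursts raise it in the band around their resonance; a band-pass filter can then be centred on the bins with the largest values.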

A hybrid method for strong noise reduction
Based on the individual performances of the foregoing two methods, a hybrid signal processing method that combines the EEMD and the SK-based filter [17] is introduced. First, an optimal band-pass filter based on SK is employed to remove part of the noise so that the local extrema of the signal are not completely concealed by noise. Then, the EEMD method with parameter optimization is applied to further decompose the filtered signal. As a result, the final signal can be separated from strong noise, which allows good detection of the defects while minimizing the distortion of the impulses. The main procedure is as follows:
Step 1. Pre-processing. Filter the raw signal using an optimal band-pass filter based on SK and obtain the filtered signal.
Step 2. Signal decomposition. Use the EEMD method to decompose the filtered signal into some IMFs.
Step 3. Selection of feature signal. Calculate the correlation coefficients between the obtained IMFs and the filtered signal, and select the IMF having the largest correlation coefficient (CC) as the resultant signal for further analysis.
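The three steps can be chained as in the following skeleton. The arguments `prefilter` and `decompose` are hypothetical callables standing in for the SK-based band-pass filter and the EEMD with optimized parameters; neither is implemented here, and the function name is an assumption.

```python
import numpy as np

def hybrid_denoise(x, prefilter, decompose):
    """Return (selected IMF, its index, all correlation coefficients)."""
    filtered = np.asarray(prefilter(x), dtype=float)               # Step 1: pre-processing
    imfs = [np.asarray(c, dtype=float) for c in decompose(filtered)]  # Step 2: decomposition
    # Step 3: correlation coefficient of each IMF with the filtered signal
    cc = np.array([np.corrcoef(filtered, c)[0, 1] for c in imfs])
    best = int(np.argmax(cc))
    return imfs[best], best, cc
```

Keeping the filter and the decomposition as arguments makes it straightforward to compare the hybrid method with either stage used alone.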

Experiment and comparison
In this sub-section, the filtered signal in Figure 13 is decomposed into 13 IMFs, the first three of which have CC values of 0.88, 0.76 and 0.18 with the filtered signal, and the rest of which have CC values close to zero. To save space, only the first four IMFs are shown in Figure 14.
According to the calculation results of CC, IMF1 has a larger correlation coefficient (0.88) than the other signal components and contains the main component in the filtered signal, and it is thus viewed as the bearing signal recovered from the noisy experimental signal. This result is also verified by the identified BPFI and its multiples as shown in Figure 15. There is also an error between the theoretical and identified values of BPFI, which is caused by the same reason mentioned in Section 3.1.3.
Compared with the filtered signal in Figure 13, the extracted bearing signal (IMF1 in Figure 14) is much cleaner, and the remaining noise in the filtered signal is almost completely separated and resides in IMF2. The kurtosis values of the raw signal, the filtered signal and IMF1 are 3.07, 3.97 and 11.29, respectively, showing an observable increase. It indicates that this hybrid method successfully reveals temporal impulses from a noisy signal while preserving its important features for accurate fault diagnosis. Figure 16 also shows the signal obtained by applying normal wavelet threshold denoising to the same noisy signal; its impulses are not as clear as those in Figure 14. More cases on faulty machine components, such as an outer race and a rolling ball, are given in Ref. [17].
Figure 15. The envelope spectrum of IMF1 shown in Figure 14 in the range of 0-1 kHz.
Figure 16. The filtered signal obtained by applying the normal wavelet threshold denoising to the same signal in Figure 11.

Feature signal component selection
After using the EMD or LMD method, many signal components are disassembled from the original signal. How to effectively select feature signals from many components is critical for further signal processing and analysis. This section primarily discusses the selection method of feature signal components.

Selection based on cluster analysis
For feature signal selection, a popular solution is to calculate statistical indicators of the signal, for example, the correlation coefficient (CC). Dybała and Zimroz [19] used this indicator to divide IMFs into three classes: noise-only IMFs corresponding to low indices and low CC values, signal-only IMFs, and trend-only IMFs corresponding to high indices and low CC values. However, it is possible that an impact signal caused by a damaged bearing is wrongly categorized into the class of noise [19]. Similar results can be found when only a single measure is used. A more sophisticated diagnostic method is needed to avoid misdiagnosis.
Referring to the idea of the cluster analysis, an adaptive selection method based on multiple statistic indicators is designed for selecting the feature signal of interest from many signal components [20].
In the anomaly detection, a branch of cluster analysis, a detector is designed to detect any object that deviates from the known state (usually the healthy state) [21]. Referring to this, the decomposed signal components are classified into two groups: feature signals and unrelated signals. The former is used for further analysis, and the latter is viewed as useless signals.
The key to this selection is how to evaluate the useful content in the analyzed signal. If the feature signal is wrongly classified into the useless part, the state of the monitored object may be misjudged. If an unrelated signal is wrongly marked as the feature signal, the conclusions based on the analyses of feature signals may be conflicting. To classify them correctly, some statistical indicators commonly used in anomaly detection and feature extraction are introduced here. Many studies indicate that they are good at representing hidden features of the analyzed signal. These indicators are jointly used to determine the classes of the decomposed components, not to determine the fault types of the tested object. In addition, the strategy of using multiple indicators is very common in pattern recognition, where various experts are combined with the aim of compensating for the weakness of each single expert [22]. This combination can be viewed as a kind of ensemble learning and can improve the classification accuracy in machine learning. Interestingly, combining individuals' opinions in order to reach a final decision is second nature to humans before making any crucial decision [23].
As for a large number of indicators, the distance evaluation technique (DET) [24] is introduced to quickly organize the classification result of each indicator (or expert). For more than one expert, their conclusions may not always coincide, and thus the principle of minority obeying majority [22,23] is introduced to solve their conflicts. The detailed selection is described in the following sub-section.

Adaptive feature signal selection
The process of the adaptive feature signal selection can be divided into two stages: classification by each expert and decision by all experts. Its procedure is described as follows:
Step 1. Calculate statistical indicators in the time and frequency domains for all decomposed signal components. The indicators include peak-to-peak (P-P), mean, absolute mean, max, root mean square (RMS), standard deviation (SD), skewness, kurtosis, crest factor (CF), shape factor (SF), impulse factor (IF), energy and correlation coefficient (CC).
Step 2. Normalize the values of each indicator and sort them in descending order.
Step 3. Classify using the DET. For each indicator, the DET makes the distance within a class shorter and the distance between classes longer, and then the components are classified into two groups.
Step 4. Vote by all 'experts'. For each signal component, count how many 'experts' (indicators) classify it into the same class.
Step 5. Draw a conclusion. Following the principle of minority obeying majority, the classification results of signal components can be finally determined.
Furthermore, the indicators that win in the voting are viewed as sensitive ones. After comparing the values of any sensitive indicator between the current state and the healthy one, signals in the class having obvious changes can be determined as feature signals.
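The selection procedure can be sketched as follows. Several simplifications are assumed: only the twelve time-domain indicators that need no reference signal are computed (the CC is omitted), the DET of Ref. [24] is replaced by a simple stand-in that cuts the sorted normalized values at their largest gap, and the vote counts how many indicators place a component in the upper group. All function names are hypothetical.

```python
import numpy as np

def time_indicators(x):
    """Step 1: statistical indicators of one signal component (time domain)."""
    x = np.asarray(x, dtype=float)
    mean, sd = x.mean(), x.std()
    abs_mean = np.mean(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    z = (x - mean) / sd
    return {
        "P-P": x.max() - x.min(), "mean": mean, "abs mean": abs_mean,
        "max": x.max(), "RMS": rms, "SD": sd,
        "skewness": np.mean(z ** 3), "kurtosis": np.mean(z ** 4),
        "CF": peak / rms, "SF": rms / abs_mean, "IF": peak / abs_mean,
        "energy": np.sum(x ** 2),
    }

def two_class_split(values):
    """Steps 2-3 (stand-in for the DET): normalize, sort descending, cut at the largest gap.

    Returns a boolean array, True for components in the upper group.
    """
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    v = (v - v.min()) / span if span > 0 else np.zeros_like(v)
    order = np.argsort(-v)
    gaps = v[order][:-1] - v[order][1:]
    cut = int(np.argmax(gaps))
    labels = np.zeros(len(v), dtype=bool)
    labels[order[:cut + 1]] = True
    return labels

def majority_vote(label_rows):
    """Steps 4-5: a component joins the upper class if more than half of the indicators say so."""
    rows = np.asarray(label_rows, dtype=bool)
    votes = rows.sum(axis=0)
    return votes * 2 > rows.shape[0]
```

In use, `two_class_split` is applied to the values of one indicator across all components, the resulting rows are stacked, and `majority_vote` gives the final two classes.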

Case 1: a vibration signal collected from a bearing with single defect
One of the experimental signals was collected from a small motor that involves a bearing with an outer race defect [17]. The specification of the experiment is shown in Table 1. After applying the LMD method to this signal, five PFs were obtained, and the indicator values of these five signal components were then calculated with 13 indicators in the time domain and another 13 indicators in the frequency domain. Figure 17 shows the indicator values after normalization. As shown in this figure, for the first indicator P-P (peak-to-peak), using the DET, PF3 and the other PFs are classified into two groups, while for the indicator Mean, PF5 and the other PFs belong to different groups. The classification results for all indicators in the time and frequency domains are shown in the columns 'Case 1' of Table 3. Based on the majority principle, PF1 and PF2 are finally classified into one class, and the rest of the PFs are classified into the other class. Comparing the energy value with that of a healthy bearing, PF1 and PF2 are determined as the feature signals of interest. To verify this conclusion, the envelope spectra of PF1-PF3 are shown in Figure 18. The characteristic defect frequency $f_d$ and its multiples are only identified in the spectra of PF1 and PF2, which demonstrates the right selection of feature signals.

Case 2: a vibration signal collected from a machine with two defects
Another vibration signal was collected from a traction motor that involves two faulty machine components, i.e. a faulty motor and a bearing with an outer race defect [13]. Its specification is also shown in Table 1. This signal was decomposed into seven PFs using the LMD. The classification results are shown in the columns 'Case 2' of Table 3. As a result, PF1, PF2 and PF3 are classified into one class, and the others belong to another class. The envelope spectra of the first four PFs were then examined in the range up to 1 kHz. The characteristic defect frequency f_d of the faulty bearing and its multiples are identified in PF1, and the rotating frequency f_r is identified in PF2 and PF3. No specific characteristic frequency can be identified in PF4. These results also match the real condition of the tested machine.

Discussion
The above results also indicate that statistical indicators differ in their sensitivity to abnormal states. Some are sensitive and closely related to faults, whereas others are neither sensitive nor stable. For the above experiments, the sensitive indicators include absolute mean, SD, RMS, energy and correlation coefficient in the time domain, and max, peak-to-peak, SD, RMS, energy and correlation coefficient in the frequency domain. Kurtosis, a commonly used indicator, does not show sensitivity to the feature signals in either the time or the frequency domain. Although energy is one of the sensitive indicators, its values for the five PFs in Case 1 are 0.048, 0.053, 0.192, 0.293 and 0.223; the latter three, which correspond to useless components, are much larger than the first two. Therefore, a single measure is not suitable for fault detection. Further work on the assessment of feature signals is necessary for online monitoring and diagnosis.

Future work
Although the EMD and LMD methods are quite simple in principle, they depend on a number of user-controlled tuning parameters and still lack an exact theoretical foundation. Feldman has given some theoretical analyses of the EMD method in Refs. [3,25]. However, the following issues remain to be addressed.

Stopping criterion
Whichever method is used, EMD or LMD, the adaptive signal decomposition is a 'sifting' process, and a criterion is needed to stop it at the right time, which is critical for signal processing. The more times the sifting is repeated, the closer to zero the local mean becomes [26]; that is, more sifting iterations are more likely to eliminate the riding waves and make the wave profiles symmetric. However, too many repetitions obliterate the amplitude variation and cause a loss of physical meaning. Therefore, it is not easy to define a criterion that satisfies the definition of IMFs while retaining enough physical sense of the amplitude and frequency modulations.
The standard stopping criterion is very rigorous and difficult to implement in practice. The most commonly used criterion is the three-threshold criterion [27], and the recommended setting of the three thresholds is applicable in most cases. Many modifications of this criterion have also been reported, but they have not yet been widely verified. Since most stopping criteria are summations over the global domain, an undesired feature is that the decomposition is sensitive to local perturbations and to the addition of new data [8].
Therefore, an open problem is to eliminate the extra sifting iterations caused by local changes.
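To make the role of the thresholds concrete, the following sketch evaluates one common formulation of a three-threshold test. It is an assumption that this matches the criterion of Ref. [27]: the evaluation function is taken as the ratio of the envelope mean to the envelope half-width, sifting stops when this ratio is below theta1 on all but a fraction alpha of the samples and below theta2 everywhere, and the defaults (0.05, 0.5, 0.05) are the values usually quoted for this formulation. The function name stop_sifting and the envelope data are hypothetical.

```python
def stop_sifting(upper, lower, theta1=0.05, theta2=0.5, alpha=0.05):
    """Return True if the current sifting result may be accepted."""
    sigma = []
    for u, l in zip(upper, lower):
        local_mean = (u + l) / 2        # mean of the upper and lower envelopes
        amplitude = (u - l) / 2         # half-width of the envelope band
        sigma.append(abs(local_mean) / amplitude if amplitude > 0 else float("inf"))
    fraction_above = sum(s > theta1 for s in sigma) / len(sigma)
    return fraction_above <= alpha and max(sigma) < theta2


n = 100
symmetric = stop_sifting([1.0] * n, [-1.0] * n)

# A small asymmetry on 4 % of the samples is tolerated by alpha ...
lower_small = [-1.0] * n
lower_small[10:14] = [-0.8] * 4
tolerated = stop_sifting([1.0] * n, lower_small)

# ... but one strongly asymmetric sample violates theta2 and forces more sifting.
lower_spike = [-1.0] * n
lower_spike[50] = 0.0
rejected = not stop_sifting([1.0] * n, lower_spike)
print(symmetric, tolerated, rejected)
```

The last case illustrates the sensitivity discussed above: a single local disturbance is enough to trigger further sifting of the whole record.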

Connection between local extrema
In the sifting process of the EMD method, a spline interpolation function is needed to connect the identified local extrema. Commonly used spline functions include the linear spline, quadratic spline, cubic spline and cubic Hermite spline (third-order polynomial). Generally, higher-order spline functions fit the original signal better, but they require additional, subjectively determined parameters and considerable computation time. The spline function should be selected so that it introduces the least interference and gives the maximum smoothness.
Similarly, in the LMD method a smooth connection between successive extrema is required to form a smoothly varying continuous function, and the parameter selection for the moving average is still being explored. Although modifications based on a single connection method or a hybrid method are sporadically reported, an appropriate criterion for selecting the connection method has received little attention and remains an open problem.
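The connection step of the LMD can be illustrated as follows. The sketch assumes the usual construction in which the local mean between two successive extrema is their average, held constant over that interval and then smoothed by a moving average. The function names, the end-point handling and the window width of 5 are hypothetical choices; as noted above, how to select the window is still an open question.

```python
import math


def local_extrema(x):
    """Indices of interior local maxima and minima."""
    return [i for i in range(1, len(x) - 1)
            if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]


def piecewise_local_mean(x):
    """Average of successive extrema, held constant between them."""
    ext = local_extrema(x)
    if len(ext) < 2:                      # too few extrema: fall back to the global mean
        return [sum(x) / len(x)] * len(x)
    m = [0.0] * len(x)
    for a, b in zip(ext, ext[1:]):
        level = (x[a] + x[b]) / 2
        for i in range(a, b):
            m[i] = level
    first, last = m[ext[0]], m[ext[-2]]
    for i in range(ext[0]):               # extend the first level to the left end
        m[i] = first
    for i in range(ext[-1], len(x)):      # extend the last level to the right end
        m[i] = last
    return m


def moving_average(y, width):
    """Centred moving average with a shortened window at the ends."""
    half = width // 2
    out = []
    for i in range(len(y)):
        window = y[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


# Synthetic test signal: a sinusoid riding on a constant offset of 2.
x = [2 + math.sin(2 * math.pi * i / 20) for i in range(100)]
smooth_mean = moving_average(piecewise_local_mean(x), width=5)
print(min(smooth_mean), max(smooth_mean))
```

For this signal the smoothed local mean recovers the offset; for real vibration signals the result depends noticeably on the chosen window width.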
Because the EMD and the LMD are data-driven analysis methods, they are essentially algorithmic in nature and hence suffer from the drawback that there is no well-established analytical formulation to support theoretical analysis and performance evaluation [28]. Accordingly, the relevant modifications mainly come from empirical case-by-case comparisons. Nevertheless, the EMD and the LMD have proven to be useful adaptive signal processing tools for vibration-based fault detection and diagnosis.