Open access peer-reviewed chapter

Condition Monitoring of Wind Turbine Structures through Univariate and Multivariate Hypothesis Testing

Written By

Francesc Pozo and Yolanda Vidal

Reviewed: 15 May 2018 Published: 26 September 2018

DOI: 10.5772/intechopen.78727

From the Edited Volume

Structural Health Monitoring from Sensing to Processing

Edited by Magd Abdel Wahab, Yun Lai Zhou and Nuno Manuel Mendes Maia

Abstract

This chapter presents a method for detecting wind turbine (WT) faults through univariate and multivariate hypothesis testing. The approach is data driven and relies on supervisory control and data acquisition (SCADA) data. First, a model is constructed from a healthy WT data set through multiway principal component analysis (MPCA). Afterward, the data of the WT to be diagnosed are projected onto the MPCA model space. Since the turbulent wind is a random process, the dynamic response of the WT can be considered a stochastic process, and thus the acquired SCADA measurements are treated as a random process. The objective is to determine whether the distribution of the multivariate random samples obtained from the WT to be diagnosed (healthy or not) is related to the distribution of the baseline. To this end, a test for the equality of population means is performed in both the univariate and the multivariate cases. Ultimately, the test results establish whether the WT is healthy or faulty. The performance of the proposed method is validated using an advanced benchmark that comprises a 5-MW WT subject to various actuator and sensor faults of different types.

Keywords

  • condition monitoring
  • wind turbines
  • principal component analysis
  • hypothesis testing

1. Introduction

The cost of wind energy depends strongly on the performance of the condition monitoring system. Advances in this area would decrease downtime, extend the WT lifetime, and ultimately reduce operation and maintenance (O&M) costs, which is one of the main challenges in wind energy as stated in “20% Wind Energy by 2030” [1].

Usually, condition monitoring comprises different systems (vibration analysis, oil monitoring, etc. [2]) for different parts and different types of faults, and it relies on expensive dedicated sensors that must be installed in the WT. Fault detection systems that use only the data already available from the turbine SCADA system, and that cover different parts and different types of faults, are therefore promising, since no additional sensors or data acquisition devices are needed. The SCADA signals provide rich information on the WT performance; thus, with appropriate algorithms, they can be used effectively for condition monitoring, prognostics, and remaining useful life prediction of WTs [3]. Several works have successfully used SCADA data for condition monitoring. For example, Ruiz et al. presented a machine learning approach [4], Zaher and McArthur proposed combining anomaly detection and data-trending techniques encapsulated in a multiagent framework [5], and Pozo and Vidal proposed a fault detection system based on principal component analysis [6].

In this work, following the enhanced benchmark challenge for wind turbine fault detection proposed in [7], a set of eight realistic fault scenarios is considered to develop a WT condition monitoring strategy. The strategy combines a SCADA data-driven baseline model based on MPCA (a reference pattern obtained from the healthy wind turbine) with univariate and multivariate hypothesis testing. Previous works using MPCA and hypothesis testing to detect structural damage [8] operate under the hypothesis of guided waves, that is, the vibration (guided wave) induced in the structure is known and always the same. In this work, however, the vibration is induced by the changing wind. The benchmark used here comprises different types of faults of a 5-MW WT simulated with FAST [9], a simulator that has been accepted by the scientific community and is widely used for WT-related research, e.g., [10, 11, 12].

The chapter is organized as follows. Section 2 briefly recalls the WT benchmark model. In Section 3, the condition monitoring strategy is stated. Simulation results are discussed in Section 4. Finally, conclusions are drawn in Section 5.

2. Wind turbine benchmark model

The benchmark model used here is proposed in [7]. It covers a 5-MW three-bladed, variable-speed WT modeled with the FAST simulator, detailed actuator and sensor models, and the different fault descriptions. For a complete description of the benchmark, please see reference [7]. Here, a short review is given to introduce the notation.

The specifications of the 5-MW reference WT are documented in [13]. This model has been used as a reference by research teams throughout the world to standardize baseline onshore and offshore wind turbine specifications. The typical features of the wind turbine are given in Table 1, and the assumed available SCADA data are given in Table 2. This work deals with the so-called full load region of operation. To run the simulations, turbulent wind data sets that cover this region have been generated with TurbSim [14], see Figure 1.

| Reference wind turbine | Magnitude |
| --- | --- |
| Rated power | 5 MW |
| Number of blades | 3 |
| Rotor/hub diameter | 126 m, 3 m |
| Hub height | 90 m |
| Cut-in, rated, and cut-out wind speed | 3, 11.4, and 25 m/s |
| Rated generator speed ($\omega_{ng}$) | 1173.7 rpm |
| Gearbox ratio | 97 |

Table 1.

WT properties.

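| Number | Sensor type | Symbol | Units |
| --- | --- | --- | --- |
| 1 | Generated electrical power | $P_{e,m}$ | kW |
| 2 | Rotor speed | $\omega_{r,m}$ | rad/s |
| 3 | Generator speed | $\omega_{g,m}$ | rad/s |
| 4 | Generator torque | $\tau_{c,m}$ | Nm |
| 5 | First pitch angle | $\beta_{1,m}$ | ° |
| 6 | Second pitch angle | $\beta_{2,m}$ | ° |
| 7 | Third pitch angle | $\beta_{3,m}$ | ° |
| 8 | Fore-aft acceleration at tower bottom | $a_{fa,m}^{b}$ | m/s² |
| 9 | Side-to-side acceleration at tower bottom | $a_{ss,m}^{b}$ | m/s² |
| 10 | Fore-aft acceleration at mid-tower | $a_{fa,m}^{m}$ | m/s² |
| 11 | Side-to-side acceleration at mid-tower | $a_{ss,m}^{m}$ | m/s² |
| 12 | Fore-aft acceleration at tower top | $a_{fa,m}^{t}$ | m/s² |
| 13 | Side-to-side acceleration at tower top | $a_{ss,m}^{t}$ | m/s² |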

Table 2.

Assumed available measurements.

These sensors are representative of the types of sensors that are available on an MW-scale commercial wind turbine.

Figure 1.

Wind speed signal with turbulence intensity set to 10%.

The generator-converter system can be approximated by a first-order ordinary differential equation, see [7], which is given by:

$$\dot{\tau}_r(t) + \alpha_{gc}\,\tau_r(t) = \alpha_{gc}\,\tau_c(t) \tag{1}$$

where $\tau_r$ and $\tau_c$ are the real generator torque and its reference (given by the controller), respectively. In the numerical simulations, $\alpha_{gc} = 50$, see [13]. Moreover, the power produced by the generator, $P_e(t)$, is given by (see [7]):

$$P_e(t) = \eta_g\,\omega_g(t)\,\tau_r(t) \tag{2}$$

where $\eta_g$ is the efficiency of the generator and $\omega_g$ is the generator speed. In the numerical experiments, $\eta_g = 0.98$ is used, see [7].
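
As a minimal sketch of Eqs. (1) and (2), the following Python fragment integrates the generator-converter model for a constant torque reference and then evaluates the electrical power. The reference torque of 40,000 Nm, the step size of 0.0125 s, and the function names are illustrative assumptions and are not taken from the benchmark; only $\alpha_{gc}$, $\eta_g$, and the rated generator speed come from the text and Table 1.

```python
import math

ALPHA_GC = 50.0   # generator-converter parameter in Eq. (1)
ETA_G = 0.98      # generator efficiency in Eq. (2)


def simulate_torque(tau_c, tau_r0, dt, steps):
    """Integrate Eq. (1) for a constant reference tau_c (Nm).

    The update is the exact solution of the first-order ODE over one step dt,
    so the result does not depend on the step size for a constant reference.
    """
    decay = math.exp(-ALPHA_GC * dt)
    tau_r = tau_r0
    history = [tau_r]
    for _ in range(steps):
        tau_r = tau_c + (tau_r - tau_c) * decay
        history.append(tau_r)
    return history


def electrical_power(omega_g, tau_r):
    """Eq. (2): power in W for omega_g in rad/s and tau_r in Nm."""
    return ETA_G * omega_g * tau_r


# Hypothetical operating point: rated generator speed, assumed torque reference.
omega_g = 1173.7 * 2.0 * math.pi / 60.0          # rpm converted to rad/s
torque = simulate_torque(tau_c=40000.0, tau_r0=0.0, dt=0.0125, steps=40)
print(torque[-1], electrical_power(omega_g, torque[-1]))
```

With $\alpha_{gc} = 50$ the time constant is 0.02 s, so the real torque settles on the reference within a fraction of a second.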

Each of the three pitch actuators is modeled as a closed-loop transfer function between the pitch angle, $\beta(s)$, and its reference, $\beta_r(s)$:

$$\frac{\beta(s)}{\beta_r(s)} = \frac{\omega_n^2}{s^2 + 2\xi\omega_n s + \omega_n^2} \tag{3}$$

where $\xi$ is the damping ratio and $\omega_n$ is the natural frequency; their fault-free values are $\xi = 0.6$ and $\omega_n = 11.11$ rad/s, see [7].
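
The behavior of Eq. (3) can be illustrated with a short step-response simulation. The sketch below is only an illustration under stated assumptions: it rewrites the transfer function as a two-state ordinary differential equation, integrates it with a classical fourth-order Runge-Kutta scheme, and compares the result with the closed-form underdamped step response. The function names, the 5° reference, and the step size are hypothetical choices.

```python
import math

XI = 0.6          # fault-free damping ratio
OMEGA_N = 11.11   # fault-free natural frequency in rad/s


def pitch_step_response(beta_ref, t_end, dt=0.001, xi=XI, omega_n=OMEGA_N):
    """Pitch angle at time t_end for a constant reference, starting from rest.

    State: (beta, rate) with rate = d(beta)/dt, integrated with classical RK4.
    """
    def deriv(beta, rate):
        return rate, omega_n ** 2 * (beta_ref - beta) - 2.0 * xi * omega_n * rate

    beta, rate = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(beta, rate)
        k2 = deriv(beta + 0.5 * dt * k1[0], rate + 0.5 * dt * k1[1])
        k3 = deriv(beta + 0.5 * dt * k2[0], rate + 0.5 * dt * k2[1])
        k4 = deriv(beta + dt * k3[0], rate + dt * k3[1])
        beta += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        rate += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return beta


def exact_step_response(beta_ref, t, xi=XI, omega_n=OMEGA_N):
    """Closed-form step response of Eq. (3), valid for 0 < xi < 1."""
    omega_d = omega_n * math.sqrt(1.0 - xi ** 2)
    envelope = math.exp(-xi * omega_n * t)
    shape = math.cos(omega_d * t) + xi / math.sqrt(1.0 - xi ** 2) * math.sin(omega_d * t)
    return beta_ref * (1.0 - envelope * shape)


print(pitch_step_response(5.0, 0.3), exact_step_response(5.0, 0.3))
```

Passing different values of `xi` and `omega_n` changes the actuator dynamics, which is how a change-in-dynamics fault could be emulated in such a sketch.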

The fault detection benchmark considers different types of faults at different components (sensors and actuators), as described in Table 3.

| Fault | Type | Description |
| --- | --- | --- |
| F1 | Pitch actuator | Change in dynamics: high air content in oil |
| F2 | Pitch actuator | Change in dynamics: pump wear |
| F3 | Pitch actuator | Change in dynamics: hydraulic leakage |
| F4 | Torque actuator | Offset (offset value equal to 2000 Nm) |
| F5 | Generator speed sensor | Scaling (gain factor equal to 1.2) |
| F6 | Pitch angle sensor | Stuck (fixed value equal to 5°) |
| F7 | Pitch angle sensor | Stuck (fixed value equal to 10°) |
| F8 | Pitch angle sensor | Scaling (gain factor equal to 1.2) |

Table 3.

Fault scenarios.

3. Condition monitoring (CM) strategy

The overall CM strategy is based on a three-tier framework:

  1. a multiway PCA (MPCA) model is built with the data that are collected from a healthy WT,

  2. when a new WT has to be diagnosed, the SCADA data are projected using the MPCA model created in step 1, and

  3. the final decision is based on both univariate and multivariate hypothesis testing (HT).

3.1. The wind as a source for the excitation: the need for a new paradigm

In general, vibration-based structural health monitoring (SHM) relies on the fact that an alteration or difference in physical properties due to damage or structural change produces changes in the dynamical response that may be detected. Figure 2 represents this paradigm: a healthy structure is excited with a prescribed signal to build a pattern. Afterward, the structure to be diagnosed is excited with exactly the same signal, and its response is measured, processed, and finally compared with the previous pattern. The strategy presented in Figure 2 is known as “guided waves in structures for SHM” [15].

Figure 2.

Vibration-based SHM is based on the fact that an alteration or difference in physical properties due to damage or structural change will motivate changes in dynamical responses that may be detected.

In the present chapter, the field of application is wind turbines and a realistic scenario is to consider that the excitation comes from the wind turbulence. The wind turbulence cannot be controlled and it is always different. Therefore, the paradigm of guided waves in WT for SHM as in Figure 2 cannot be considered. In this case, when the source of the excitation cannot be previously prescribed, a new paradigm is needed, as represented in Figure 3. The foundation of the new paradigm is that, even with a constantly different excitation, the CM strategy based on MPCA and univariate and multivariate HT will be able to disclose some hidden damage, misbehavior, or fault. To sum up, the fundamental idea behind the CM strategy is the hypothesis that a variation in the overall behavior of the WT, even with an unprescribed excitation, should be detected.

Figure 3.

The key idea behind the new paradigm of the detection strategy is the assumption that a change in the behavior of the overall system, even with a different excitation, has to be detected.

Because the only available excitation of the wind turbine is the wind turbulence, the paradigm in Figure 3 rests on the assumption that a change in the behavior of the overall system can be detected by the strategy based on PCA and statistical hypothesis testing even though the excitation differs between the baseline and the diagnosis. Section 4 presents the simulation results of the proposed CM strategy, which support this hypothesis.

3.2. Data-driven baseline modeling based on MPCA

Multiway principal component analysis (MPCA) is a natural extension of classical principal component analysis (PCA) to manage data in multidimensional arrays [16, 17]. A conventional two-dimensional data matrix can be treated as a two-way array, where experiments and variables (or discretization instant times) form the two different ways. Frequently, this arrangement has to be extended to multiway arrays, particularly if several sensors—in different experimental trials—are gathering data at different time instants. Consequently, MPCA is equivalent to the application of standard PCA to an unfolded version of the initial multiway array.

Westerhuis et al. [18] propose six different ways of unfolding a three-way data matrix. Besides, in [18], a critical analysis of several aspects of the treatment of multiway data is provided, including how the matrix is unfolded, but also mean-centering and scaling with respect to the effects on the analysis of batch data. Ruiz et al. [19] assign one of the first six letters of the alphabet to each one of the six different ways of unfolding. In this chapter, as well as in [6, 8, 20, 21], we have considered the so-called type E. However, we will present the collected SCADA data arranged in an already unfolded matrix.

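The MPCA modeling starts by measuring, from a healthy wind turbine, a sensor during $(nL-1)\Delta$ seconds, where $\Delta$ is the sampling time and $n, L \in \mathbb{N}$. The discretized measures of the sensor form a real vector

$$\left(x_{11}, x_{12}, \ldots, x_{1L}, x_{21}, x_{22}, \ldots, x_{2L}, \ldots, x_{n1}, x_{n2}, \ldots, x_{nL}\right) \in \mathbb{R}^{nL} \tag{4}$$

where the real number $x_{ij}$, $i = 1,\ldots,n$, $j = 1,\ldots,L$, corresponds to the measure of the sensor at time $\left((i-1)L + (j-1)\right)\Delta$ seconds. These collected data can be arranged in matrix form as follows:

$$\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1L} \\ \vdots & \vdots & \ddots & \vdots \\ x_{i1} & x_{i2} & \cdots & x_{iL} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nL} \end{pmatrix} \in \mathcal{M}_{n\times L}(\mathbb{R}) \tag{5}$$

where $\mathcal{M}_{n\times L}(\mathbb{R})$ is the vector space of $n\times L$ matrices over $\mathbb{R}$. It is worth noting that $n$ is the number of rows of the matrix in Eq. (5) and $L$ is the number of columns of the same matrix. The effect of the choice of $n$ and $L$ on the overall performance of the condition monitoring strategy is thoroughly analyzed in [21].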

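Let us assume that the SCADA data are now collected from $N \in \mathbb{N}$ sensors during the same period of time. In this case, the collected data, for each sensor, can be organized in a matrix as in Eq. (5). Subsequently, all the collected data coming from the whole set of sensors are concatenated and arranged in a matrix $X \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$ as follows:

$$X = \begin{pmatrix} x_{11}^{1} & x_{12}^{1} & \cdots & x_{1L}^{1} & x_{11}^{2} & \cdots & x_{1L}^{2} & \cdots & x_{11}^{N} & \cdots & x_{1L}^{N} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{i1}^{1} & x_{i2}^{1} & \cdots & x_{iL}^{1} & x_{i1}^{2} & \cdots & x_{iL}^{2} & \cdots & x_{i1}^{N} & \cdots & x_{iL}^{N} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{n1}^{1} & x_{n2}^{1} & \cdots & x_{nL}^{1} & x_{n1}^{2} & \cdots & x_{nL}^{2} & \cdots & x_{n1}^{N} & \cdots & x_{nL}^{N} \end{pmatrix}$$

$$= \Big( \underbrace{v_1 \mid v_2 \mid \cdots \mid v_L}_{X^1} \;\Big|\; \underbrace{v_{L+1} \mid \cdots \mid v_{2L}}_{X^2} \;\Big|\; \cdots \;\Big|\; \underbrace{v_{(N-1)L+1} \mid \cdots \mid v_{NL}}_{X^N} \Big) = \left( X^1 \mid X^2 \mid \cdots \mid X^N \right) \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R}) \tag{6}$$

where the superindex $k = 1,\ldots,N$ of each element $x_{ij}^{k}$ in the matrix denotes the sensor number. Matrix $X \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$, where $\mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$ is the vector space of $n\times (N\cdot L)$ matrices over $\mathbb{R}$, contains the measures from $N$ sensors at $nL$ discretization instants. Consequently, each row vector $x_i^T = X(i,:) \in \mathbb{R}^{N\cdot L}$, $i = 1,\ldots,n$, represents the measurements from all the sensors at time instants $\left((i-1)L + (j-1)\right)\Delta$ seconds, $j = 1,\ldots,L$. Equivalently, each column vector $v_j = X(:,j) \in \mathbb{R}^{n}$, $j = 1,\ldots,N\cdot L$, represents the measurements from sensor number $\lceil j/L \rceil$ at time instants $\left((i-1)L + \left(j - (\lceil j/L \rceil - 1)L - 1\right)\right)\Delta$ seconds, $i = 1,\ldots,n$, where $\lceil\cdot\rceil$ is the ceiling function.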
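
The arrangement in Eqs. (5) and (6) amounts to reshaping each sensor record into an $n\times L$ block and placing the blocks side by side. The following NumPy sketch illustrates this; the function name `unfold` and the input layout (one row per sensor, samples in chronological order) are assumptions made for the example.

```python
import numpy as np


def unfold(signals, n, L):
    """Arrange N sensor records of length n*L into the n x (N*L) matrix of Eq. (6).

    signals: array of shape (N, n*L); row k holds the samples of sensor k+1
    in chronological order.
    """
    signals = np.asarray(signals, dtype=float)
    if signals.ndim != 2 or signals.shape[1] != n * L:
        raise ValueError("each sensor record must contain n*L samples")
    # Each record becomes an n x L block as in Eq. (5); blocks are concatenated.
    blocks = [record.reshape(n, L) for record in signals]
    return np.hstack(blocks)


# Toy data: N = 2 sensors, n = 3 rows, L = 2 samples per row.
toy = np.arange(12).reshape(2, 6)
X = unfold(toy, n=3, L=2)
print(X)
```

In this toy case, column 3 of the result (index 2) belongs to sensor $\lceil 3/2 \rceil = 2$, in agreement with the ceiling rule stated above.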

The objective of the subsequent analysis is to build the MPCA model, that is, the square orthogonal matrix $P \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R})$ that is used to transform or project the original data matrix $X$ according to the following matrix-to-matrix product:

$$T = XP \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R}), \tag{7}$$

where the variance-covariance matrix of $T$ in Eq. (7) is diagonal.

In the approach proposed in this chapter, the model defined by matrix $P$ in Eq. (7) is based only on measures that come from a healthy wind turbine. Subsequently, data from the current WT to diagnose are projected using the matrix-to-matrix multiplication in Eq. (7). However, a different procedure can be considered, particularly when the goal is not just to detect damage or a fault but to classify it. In the latter case, matrix $X$ in Eq. (6) should contain measures from a WT in its healthy state and also in all the possible fault scenarios. This way, the model generated in matrix $P$ in Eq. (7) contains all the possible states of the structure.

3.2.1. Centering and scaling: group scaling (GS) vs. mean-centered group scaling (MCGS)

Considering that the data stored in matrix X are affected by a changing wind turbulence, come from different sensors, and could have different magnitudes and scales, some kind of preprocessing step is required to rescale the data [22, 23]. According to Westerhuis et al. [18], the way this preprocessing step is carried out may affect the overall performance of the CM strategy. In the present chapter, we present two possible choices that have some common core. These two alternatives are as follows:

  1. group scaling (GS) and

  2. mean-centered group scaling (MCGS).

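In the former case (GS), both the arithmetic mean and the variance of all measurements of the sensor are used. More precisely, for $k = 1, 2, \ldots, N$, we define

$$\mu_k = \frac{1}{nL}\sum_{i=1}^{n}\sum_{j=1}^{L} x_{ij}^{k}, \tag{8}$$

$$\sigma_k^2 = \frac{1}{nL}\sum_{i=1}^{n}\sum_{j=1}^{L} \left(x_{ij}^{k} - \mu_k\right)^2 \tag{9}$$

where $\mu_k$ and $\sigma_k^2$ are the arithmetic mean and the variance of the whole set of elements in matrix $X^k$, respectively. In this case, matrix $X = (x_{ij}^{k})$ is centered and scaled, using GS, to define a modified matrix $\breve{X} = \breve{X}_{\mathrm{GS}} = (\breve{x}_{ij}^{k})$ as

$$\breve{x}_{ij}^{k} := \frac{x_{ij}^{k} - \mu_k}{\sqrt{\sigma_k^2}}, \quad i = 1,\ldots,n,\; j = 1,\ldots,L,\; k = 1,\ldots,N. \tag{10}$$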

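In the latter case (MCGS), the arithmetic mean of all measurements of the sensor in the same column is used in the normalization. More precisely, for $k = 1, 2, \ldots, N$, we define

$$\mu_j^{k} = \frac{1}{n}\sum_{i=1}^{n} x_{ij}^{k}, \quad j = 1,\ldots,L, \tag{11}$$

where $\mu_j^{k}$ is the arithmetic mean of the measures placed in the same column. In this case, matrix $X = (x_{ij}^{k})$ is centered and scaled, using MCGS, to define a modified matrix $\breve{X} = \breve{X}_{\mathrm{MCGS}} = (\breve{x}_{ij}^{k})$ as

$$\breve{x}_{ij}^{k} := \frac{x_{ij}^{k} - \mu_j^{k}}{\sqrt{\sigma_k^2}}, \quad i = 1,\ldots,n,\; j = 1,\ldots,L,\; k = 1,\ldots,N, \tag{12}$$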

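where $\sigma_k^2$ is defined as in Eq. (9) using $\mu_k$ as in Eq. (8). It is worth noting that the only difference between the expressions in Eqs. (10) and (12) is how the elements in matrix $X = (x_{ij}^{k})$ are centered. When matrix $X = (x_{ij}^{k})$ is scaled and centered according to the MCGS strategy described in Eq. (12), the average value of each column vector in the scaled matrix $\breve{X}$ can be calculated, with $\sigma_k = \sqrt{\sigma_k^2}$, as

$$\frac{1}{n}\sum_{i=1}^{n} \breve{x}_{ij}^{k} = \frac{1}{n}\sum_{i=1}^{n} \frac{x_{ij}^{k} - \mu_j^{k}}{\sigma_k} = \frac{1}{n\sigma_k}\sum_{i=1}^{n} \left(x_{ij}^{k} - \mu_j^{k}\right) \tag{13}$$

$$= \frac{1}{n\sigma_k}\left[\left(\sum_{i=1}^{n} x_{ij}^{k}\right) - n\mu_j^{k}\right] \tag{14}$$

$$= \frac{1}{n\sigma_k}\left(n\mu_j^{k} - n\mu_j^{k}\right) = 0 \tag{15}$$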

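Taking advantage of the fact that the scaled matrix $\breve{X}$ is a mean-centered matrix, the variance-covariance matrix can be straightforwardly computed as the matrix-to-matrix product of the transpose of $\breve{X}$ and $\breve{X}$ itself, divided by $n-1$, where $n$ is the number of rows of matrix $X$ in Eq. (6). More precisely,

$$C_{\breve{X}} = \frac{1}{n-1}\breve{X}^{T}\breve{X} \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{16}$$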
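
The MCGS preprocessing and the covariance computation can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used in the chapter: the function names are hypothetical, and the data matrix is assumed to be laid out as in Eq. (6), with the $L$ columns of each sensor stored contiguously.

```python
import numpy as np


def mcgs_fit(X, N, L):
    """Column means (Eq. (11)) and per-sensor standard deviations (from Eq. (9))."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    col_mean = X.mean(axis=0)                          # mu_j^k, length N*L
    # View X as (row, sensor, column-within-sensor) and take the population
    # standard deviation over all n*L entries of each sensor block.
    sensor_std = X.reshape(n, N, L).std(axis=(0, 2))   # sigma_k, length N
    return col_mean, sensor_std


def mcgs_apply(X, col_mean, sensor_std, L):
    """Eq. (12): subtract the column mean, divide by the sensor standard deviation."""
    return (np.asarray(X, dtype=float) - col_mean) / np.repeat(sensor_std, L)


def covariance(X_scaled):
    """Eq. (16); valid because the MCGS-scaled matrix has zero column means."""
    n = X_scaled.shape[0]
    return X_scaled.T @ X_scaled / (n - 1)


# Synthetic data: N = 2 sensors with very different scales, L = 3, n = 20.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 6)) * np.array([1, 1, 1, 100, 100, 100]) + 5.0
mu_demo, sd_demo = mcgs_fit(X_demo, N=2, L=3)
X_scaled_demo = mcgs_apply(X_demo, mu_demo, sd_demo, L=3)
print(np.abs(X_scaled_demo.mean(axis=0)).max())        # numerically zero
```

Keeping the fitting and the application steps separate makes it possible to reuse the baseline statistics when new data are scaled later on.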

Clearly, GS and MCGS are not the only ways to center and scale data. For instance, feature scaling, also known as unity-based normalization, can also be considered. In this case, data are centered with respect to the minimum value and scaled with respect to the range of the set, that is,

$$\tilde{x}_{ij}^{k} := \frac{x_{ij}^{k} - \min x_{ij}^{k}}{\max x_{ij}^{k} - \min x_{ij}^{k}}, \quad i = 1,\ldots,n,\; j = 1,\ldots,L,\; k = 1,\ldots,N. \tag{17}$$

However, to easily compute the variance-covariance matrix in the CM strategy presented in this chapter, mean-centered group scaling (MCGS) is the method selected for centering and scaling. To avoid carrying the cumbersome notation $\breve{X}$ through the rest of this chapter, the centered and scaled matrix is renamed $X$, without the breve sign.

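The MPCA model is described by the latent vectors

$$p_j, \quad j = 1,\ldots,N\cdot L, \tag{18}$$

also known as eigenvectors or proper vectors, and the latent roots

$$\lambda_j, \quad j = 1,\ldots,N\cdot L, \tag{19}$$

also known as eigenvalues or proper values, of the variance-covariance matrix $C_X$ as follows:

$$C_X P = P\Lambda \tag{20}$$

where

$$P = \left(p_1 \mid p_2 \mid \cdots \mid p_{N\cdot L}\right) \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{21}$$

$$\Lambda = \left(\Lambda_{ij}\right) \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{22}$$

and

$$\Lambda_{jj} = \lambda_j, \quad j = 1,\ldots,N\cdot L \tag{23}$$

$$\Lambda_{ij} = 0, \quad i, j = 1,\ldots,N\cdot L,\; i \neq j \tag{24}$$

The latent vectors and latent roots in Eqs. (21) and (23) are arranged in descending order with respect to the absolute values of the latent roots, that is,

$$\lambda_i \geq \lambda_{i+1}, \quad i = 1,\ldots,N\cdot L - 1 \tag{25}$$

The latent vector $p_1$, corresponding to the largest latent root $\lambda_1$ (in absolute value), is called the first principal component (PC). Likewise, the latent vector $p_2$, corresponding to the second largest latent root $\lambda_2$ (in absolute value), is called the second principal component. In general, the latent vector $p_j$, $j = 1,\ldots,N\cdot L$, corresponding to the latent root $\lambda_j$, is called the $j$th principal component.

Matrix $T$ in Eq. (7) represents the matrix transformed or projected onto the principal component space, and it is also known as the score matrix.

When, for the sake of dimensionality reduction, a reduced number of principal components is considered,

$$\ell < N\cdot L, \tag{26}$$

a reduced multiway PCA model is assembled:

$$\hat{P} = \left(p_1 \mid p_2 \mid \cdots \mid p_{\ell}\right) \in \mathcal{M}_{(N\cdot L)\times \ell}(\mathbb{R}). \tag{27}$$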
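
The model in Eqs. (20) to (27) can be obtained numerically from an eigendecomposition of the variance-covariance matrix. The sketch below is a minimal illustration that assumes the input matrix has already been centered and scaled; the function name `mpca_model` is hypothetical, and `numpy.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np


def mpca_model(X_scaled, n_components):
    """Return the reduced model of Eq. (27) and all latent roots, sorted as in Eq. (25)."""
    X_scaled = np.asarray(X_scaled, dtype=float)
    n = X_scaled.shape[0]
    C = X_scaled.T @ X_scaled / (n - 1)        # variance-covariance matrix, Eq. (16)
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending latent roots
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs[:, :n_components], eigvals


# Synthetic, column-centered data standing in for the scaled baseline matrix.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
X_demo -= X_demo.mean(axis=0)
P_hat, latent_roots = mpca_model(X_demo, n_components=3)
T_demo = X_demo @ P_hat                        # scores, as in Eq. (7) with the reduced model
print(np.round(T_demo.T @ T_demo / (X_demo.shape[0] - 1), 6))
```

The printed matrix is diagonal up to rounding, with the three largest latent roots on its diagonal, which is the property required of the score matrix.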

3.3. HT-based condition monitoring

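As stated in Section 3.2, the MPCA model is based only on measures that come from a healthy wind turbine. Subsequently, data from the current WT to diagnose, which is subjected to a different wind turbulence, are gathered from as many sensors as in the modeling phase described in Section 3.2 and during a period of time, $(\nu L - 1)\Delta$ seconds, that is not necessarily equal to that of the modeling phase. These new data are arranged in a new matrix $Y$ in a similar way as in Eq. (6):

$$Y = \begin{pmatrix} y_{11}^{1} & y_{12}^{1} & \cdots & y_{1L}^{1} & y_{11}^{2} & \cdots & y_{1L}^{2} & \cdots & y_{11}^{N} & \cdots & y_{1L}^{N} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\ y_{i1}^{1} & y_{i2}^{1} & \cdots & y_{iL}^{1} & y_{i1}^{2} & \cdots & y_{iL}^{2} & \cdots & y_{i1}^{N} & \cdots & y_{iL}^{N} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\ y_{\nu 1}^{1} & y_{\nu 2}^{1} & \cdots & y_{\nu L}^{1} & y_{\nu 1}^{2} & \cdots & y_{\nu L}^{2} & \cdots & y_{\nu 1}^{N} & \cdots & y_{\nu L}^{N} \end{pmatrix}$$

$$= \Big( \underbrace{w_1 \mid w_2 \mid \cdots \mid w_L}_{Y^1} \;\Big|\; \underbrace{w_{L+1} \mid \cdots \mid w_{2L}}_{Y^2} \;\Big|\; \cdots \;\Big|\; \underbrace{w_{(N-1)L+1} \mid \cdots \mid w_{NL}}_{Y^N} \Big) = \left( Y^1 \mid Y^2 \mid \cdots \mid Y^N \right) \in \mathcal{M}_{\nu\times (N\cdot L)}(\mathbb{R}) \tag{28}$$

It should be noted that $\nu$ (the number of rows of matrix $Y$) does not need to match the natural number $n$, which is the number of rows of matrix $X$ in Eq. (6). However, the number of columns, $N\cdot L$, must agree.

The collected data in matrix $Y$ in Eq. (28) are first centered and scaled to form a matrix $\breve{Y} = (\breve{y}_{ij}^{k})$ similar to the one in Eq. (12):

$$\breve{y}_{ij}^{k} := \frac{y_{ij}^{k} - \mu_j^{k}}{\sqrt{\sigma_k^2}}, \quad i = 1,\ldots,\nu,\; j = 1,\ldots,L,\; k = 1,\ldots,N, \tag{29}$$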

where $\sigma_k^2$ and $\mu_j^{k}$ are the values of the variance and the arithmetic mean that have been previously calculated in Eqs. (9) and (11), respectively, with respect to $X$ in Eq. (6). After the preprocessing step, that is, centering and scaling the raw data collected from the current structure to diagnose, the scores related to each row vector

$$r^{i} = \breve{Y}(i,:) \in \mathbb{R}^{N\cdot L}, \quad i = 1,\ldots,\nu \tag{30}$$

are computed using a vector-to-matrix product:

$$t^{i} = r^{i}\cdot\hat{P} \in \mathbb{R}^{\ell}, \quad i = 1,\ldots,\nu \tag{31}$$

where matrix $\hat{P}$ is the reduced MPCA model in Eq. (27).

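Let us consider the canonical basis

$$\left\{e_1, e_2, \ldots, e_{\ell}\right\} \tag{32}$$

of the $\ell$-dimensional real vector space $\mathbb{R}^{\ell}$.

Given a row vector $r^{i}$ as in Eq. (30), the real number

$$t_1^{i} = t^{i}\cdot e_1 \in \mathbb{R} \tag{33}$$

is called the first score. Likewise, the scalar

$$t_2^{i} = t^{i}\cdot e_2 \in \mathbb{R} \tag{34}$$

is called the second score. In general, the scalar

$$t_j^{i} = t^{i}\cdot e_j \in \mathbb{R} \tag{35}$$

is called the score associated with the principal component $p_j$, $j = 1,\ldots,\ell$, or, simply, score $j$.

In addition, an $s$-dimensional vector can be built if more than one score is considered at the same time. Indeed,

$$t_s^{i} = \left(t_1^{i}, t_2^{i}, \ldots, t_s^{i}\right)^{T} \in \mathbb{R}^{s}, \quad s \leq \ell. \tag{36}$$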
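
The scaling of Eq. (29) and the projection of Eq. (31) can be combined in one step. The sketch below assumes that the column means, the per-sensor standard deviations, and the reduced model have already been computed from the baseline data; the function name `project_scores` and the toy numbers are hypothetical.

```python
import numpy as np


def project_scores(Y, col_mean, sensor_std, L, P_hat):
    """Scale Y with baseline statistics (Eq. (29)) and project it (Eq. (31)).

    Row i of the result is the score vector t^i; column j holds score j.
    """
    Y_scaled = (np.asarray(Y, dtype=float) - col_mean) / np.repeat(sensor_std, L)
    return Y_scaled @ P_hat


# Toy case: nu = 3 rows, N = 2 sensors, L = 2, and a trivial "model" that
# simply keeps the first three coordinates (chosen only to make the result readable).
Y_demo = np.arange(12.0).reshape(3, 4)
T_demo = project_scores(Y_demo, np.zeros(4), np.ones(2), 2, np.eye(4)[:, :3])
first_two_scores = T_demo[:, :2]      # the vectors of Eq. (36) with s = 2, one per row
print(T_demo.shape, first_two_scores.shape)
```

Note that the statistics passed to the function come from the healthy baseline, not from the data under diagnosis, which is what allows a shift in the new data to show up in the scores.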

3.3.1. Scores as a random sample

As stated in Section 3.1, the excitation of the WT comes from a changing turbulent wind, which can be viewed as a random signal. Therefore, the response of the WT can also be viewed as a random process, and so can the measurements in the row vector $r^{i}$ in Eq. (30). As a consequence, the vector $t^{i}$ inherits this random nature, and it can be regarded as an $\ell$-dimensional random vector on which the statistical approach of this chapter is built. As a motivating example, Figure 4 shows two three-dimensional samples: one is the three-dimensional baseline sample (left) and the other corresponds to faults 1, 4, and 7 (right). In a classic application of the PCA strategy in the field of SHM, the scores allow a separation, clustering, or visual grouping [24]. However, in this case, Figure 4 (right) clearly shows that a clustering, visual grouping, or separation cannot be performed. Therefore, more powerful and reliable tools are needed to detect a fault in the WT.

Figure 4.

Baseline sample (left) and sample from the wind turbine to be diagnosed (right).

In structural health monitoring or condition monitoring applications, the final decision on whether the structure, the actuator, and/or the sensor is healthy should not depend on graphical approaches. One of the most common ways to obtain reliable indicators of damage or faults is statistical hypothesis testing. The strategies of this kind differ in what the subject of the test is and, of course, in how the raw data collected by the sensors are arranged and preprocessed. For instance, in Zugasti et al. [25], damage detection is based on testing for significant changes in the parameter vector of an autoregressive model. A comprehensive three-tier modular structural health monitoring framework is proposed by Hackell et al. [26], where hypothesis testing is used to define decision boundaries, control charts, and ROC curves with the ultimate goal of distinguishing between healthy and potentially damaged data on an operational wind turbine. A somewhat different approach is presented by Ng et al. [27], which includes a vehicle health monitoring system where several univariate hypothesis tests are considered in parallel. Again in the field of structural health monitoring or condition monitoring of wind turbines, a recent work by Tsiapoki et al. [28] bases damage and ice detection on data normalization, feature extraction, and hypothesis testing (HT).

The use of univariate hypothesis testing as a key element for structural health monitoring or condition monitoring has been increasing in recent years as a reliable method. Variations of univariate HT for multiple indicators include running univariate tests in parallel, that is, testing each component of a parameter vector rather than the whole multidimensional parameter vector. The first approach for the detection of structural changes using multivariate hypothesis testing was proposed by Pozo et al. [8]. One of the key results in [8] is that multivariate HT yields better results in damage or fault detection than univariate tests alone. One interesting example in [8] shows that, for a given level of significance $\alpha$, the five independent univariate hypothesis tests

$$H_0: \mu_{c,i} = \mu_{h,i} \qquad H_1: \mu_{c,i} \neq \mu_{h,i} \tag{37}$$

where $i = 1, 2, \ldots, 5$, lead to a wrong decision, while the single multivariate HT

$$H_0: \mu_c = \mu_h \qquad H_1: \mu_c \neq \mu_h \tag{38}$$

where

$$\mu_c^{T} = \left(\mu_{c,1}, \mu_{c,2}, \ldots, \mu_{c,5}\right), \qquad \mu_h^{T} = \left(\mu_{h,1}, \mu_{h,2}, \ldots, \mu_{h,5}\right) \tag{39}$$

is able to correctly classify the structure. This example indicates that multivariate HT can be more reliable than univariate HT. However, this benefit comes at a price: to apply the multivariate HT, the data must follow a multivariate normal distribution. It may well happen that five sets of 50 samples

$$\left\{x_1^{i}, x_2^{i}, \ldots, x_{50}^{i}\right\} \sim N\left(\mu_i, \sigma_i\right), \quad i = 1, 2, \ldots, 5 \tag{40}$$

are each normally distributed, while for the sample of vectors

$$\left\{x_1, x_2, \ldots, x_{50}\right\} \sim N\left(\mu, \Sigma\right), \tag{41}$$

where

$$x_j = \left(x_j^{1}, x_j^{2}, \ldots, x_j^{5}\right)^{T}, \quad j = 1,\ldots,50 \tag{42}$$

and $\Sigma$ is the variance-covariance matrix, the multinormal distribution in Eq. (41) does not hold.
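
That normal marginals do not guarantee joint normality can be checked with a small synthetic example, which is an illustration added here and not taken from the chapter. The second variable below is the first one multiplied by an independent random sign: each variable is standard normal on its own, yet their sum is exactly zero about half of the time, which is impossible for a jointly normal pair with these marginals.

```python
import numpy as np

rng = np.random.default_rng(seed=1)              # fixed seed, for reproducibility only
x1 = rng.standard_normal(10_000)
sign = rng.choice([-1.0, 1.0], size=x1.size)     # independent random sign
x2 = sign * x1                                   # standard normal by symmetry

# For a jointly normal pair, x1 + x2 would be normal (or constant).
# Here it equals zero exactly whenever the sign is -1.
fraction_zero = np.mean(x1 + x2 == 0.0)
print(x2.mean(), x2.std(), fraction_zero)
```

In practice this means that a multivariate normality check on the score vectors is a separate step from checking each score individually.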

3.3.2. Univariate case: testing for the equality of means

In this section, we present how a fault is detected in the WT using univariate HT. To this end, we first have to define what we consider our baseline. Given a principal component $j = 1,\ldots,\ell$, the baseline sample is the set of $n$ real numbers $\left\{\tau_j^{i}\right\}_{i=1,\ldots,n}$ defined by

$$\tau_j^{i} := \left(X(i,:)\cdot\hat{P}\right)(j) = X(i,:)\cdot\hat{P}\cdot e_j, \quad i = 1,\ldots,n, \tag{43}$$

where $e_j$ is the $j$th vector of the canonical basis in Eq. (32), $\hat{P}$ is the MPCA model defined in Eq. (27), and $X$ is the centered and scaled matrix of the data collected from a healthy WT as in Eq. (6). Similarly, given a principal component $j = 1,\ldots,\ell$, the sample of the current WT to diagnose is defined as the set of $\nu$ real numbers

$$\left\{t_j^{i}\right\}_{i=1,\ldots,\nu} \tag{44}$$

as defined in Eq. (35).

Before the univariate HT is applied, the following assumptions must be made:

  1. the baseline sample $\left\{\tau_j^{i}\right\}_{i=1,\ldots,n}$ is a random sample of a normally distributed random variable (RV) with unknown mean $\mu_X$ and unknown variance $\sigma_X^2$ and

  2. the random sample $\left\{t_j^{i}\right\}_{i=1,\ldots,\nu}$ in Eq. (44) of the current WT to diagnose follows a normal distribution with unknown mean $\mu_Y$ and unknown variance $\sigma_Y^2$.

It is worth mentioning that the variances of these two samples are not assumed to be equal.

Let us define

$$\delta_{\mu} = \mu_X - \mu_Y \tag{45}$$

as the difference between these two mean values. Since we want to know whether the distributions of these two samples are related, this leads to a test of the hypothesis

$$H_0: \delta_{\mu} = 0 \quad \text{versus} \tag{46}$$

$$H_1: \delta_{\mu} \neq 0 \tag{47}$$

where the null hypothesis $H_0$ is “the sample of the WT to be diagnosed is distributed as the baseline sample” and the alternative hypothesis $H_1$ is “the sample of the WT to be diagnosed is not distributed as the baseline sample.” In other words, if the result of the test is that $H_0$ is accepted, the current WT is categorized as healthy. Otherwise, if $H_0$ is rejected in favor of $H_1$, this indicates the presence of some fault in the WT.

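Given the assumptions of normality and considering that the two variances are not necessarily equal, the test for the equality of means is based on the so-called Welch-Satterthwaite method [29], which is outlined below for the sake of completeness. If two random samples of size $n$ and $\nu$, respectively, are taken from two normal distributions $N\left(\mu_X, \sigma_X\right)$ and $N\left(\mu_Y, \sigma_Y\right)$ and the population variances are unknown and not necessarily equal, the random variable

$$\mathcal{W} = \frac{\left(\bar{X} - \bar{Y}\right) - \left(\mu_X - \mu_Y\right)}{\sqrt{\dfrac{S_X^2}{n} + \dfrac{S_Y^2}{\nu}}} \tag{48}$$

can be approximated with a t-distribution with $\rho$ degrees of freedom (DOF), that is

$$\mathcal{W} \sim t_{\rho} \tag{49}$$

where

$$\rho = \left\lfloor \frac{\left(\dfrac{s_X^2}{n} + \dfrac{s_Y^2}{\nu}\right)^2}{\dfrac{\left(s_X^2/n\right)^2}{n-1} + \dfrac{\left(s_Y^2/\nu\right)^2}{\nu-1}} \right\rfloor, \tag{50}$$

$S^2$ is the sample variance as a random variable, $s^2$ is the variance of a particular sample, $\bar{X}$ and $\bar{Y}$ are the sample means as random variables, and $\lfloor\cdot\rfloor$ is the standard floor function.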

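The value of the test statistic using the Welch-Satterthwaite method is defined as

$$t_{\mathrm{obs}} = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_X^2}{n} + \dfrac{s_Y^2}{\nu}}} \tag{51}$$

where $\bar{x}$ and $\bar{y}$ are the means of the particular samples. The magnitude $\left|t_{\mathrm{obs}}\right|$ is the fault indicator. We can then construct the following test:

$$\left|t_{\mathrm{obs}}\right| \leq t^{\star} \implies \text{Accept } H_0 \tag{52}$$

$$\left|t_{\mathrm{obs}}\right| > t^{\star} \implies \text{Accept } H_1 \tag{53}$$

where $t^{\star}$ is such that

$$\mathbb{P}\left(t_{\rho} > t^{\star}\right) = \frac{\alpha}{2}, \tag{54}$$

where $\alpha$ is the level of significance for the test. To sum up,

  1. $H_0$ is rejected if $\left|t_{\mathrm{obs}}\right| > t^{\star}$ (the WT is classified as not healthy) and

  2. $H_0$ is accepted if $\left|t_{\mathrm{obs}}\right| \leq t^{\star}$ (the WT is classified as healthy).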
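
The univariate decision rule of Eqs. (50) to (54) can be sketched with NumPy and SciPy as follows. This is a hedged illustration: the function name and the synthetic samples are assumptions, `scipy.stats.t.ppf` supplies the quantile $t^{\star}$, and the degrees of freedom are floored as in Eq. (50), which differs slightly from the unrounded value that `scipy.stats.ttest_ind(..., equal_var=False)` uses internally.

```python
import math

import numpy as np
from scipy import stats


def welch_fault_test(baseline, current, alpha=0.10):
    """Two-sided test for the equality of means with unequal variances.

    Returns (t_obs, rho, t_star, faulty), where faulty is True when H0 is rejected.
    """
    x = np.asarray(baseline, dtype=float)
    y = np.asarray(current, dtype=float)
    n, nu = x.size, y.size
    vx = x.var(ddof=1) / n                       # s_X^2 / n
    vy = y.var(ddof=1) / nu                      # s_Y^2 / nu
    t_obs = (x.mean() - y.mean()) / math.sqrt(vx + vy)                         # Eq. (51)
    rho = math.floor((vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (nu - 1)))  # Eq. (50)
    t_star = stats.t.ppf(1.0 - alpha / 2.0, df=rho)                            # Eq. (54)
    return float(t_obs), rho, float(t_star), bool(abs(t_obs) > t_star)


# Synthetic scores: a baseline sample and a clearly shifted sample of 50 elements.
rng = np.random.default_rng(0)
baseline_scores = rng.normal(0.0, 1.0, 500)
shifted_scores = rng.normal(2.0, 1.5, 50)
print(welch_fault_test(baseline_scores, shifted_scores))
```

The default `alpha=0.10` mirrors the level of significance used later in the simulations.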

3.3.3. Multivariate case: testing a multivariate mean vector

In Section 3.3.2, for each principal component $j = 1,\ldots,\ell$, a test for the equality of means is performed. This means that, for a single sample of the current structure to diagnose, we obtain $\ell$ decisions on whether the structure is healthy or not. In the present section, more than one principal component is considered jointly, thus defining a vector. Therefore, a test for the plausibility of a value for a normal population mean vector is performed.

As in Section 3.3.2, the objective of this work is to determine whether the distribution of the multivariate random samples that are obtained from the WT to be diagnosed (healthy or not) is connected to the distribution of the baseline.

Let us define $s \in \mathbb{N}$ as the number of PCs that are considered at the same time. Before the multivariate HT is applied, the following assumptions must be made:

  1. the baseline projection is a multivariate random sample (MRS) of a multivariate random variable (MRV) following a multivariate normal distribution (MVND) with known population mean vector $\mu_h \in \mathbb{R}^{s}$ and known variance-covariance matrix $\Sigma \in \mathcal{M}_{s\times s}(\mathbb{R})$ and

  2. the multivariate random sample of the WT to be diagnosed also follows an MVND with unknown multivariate mean vector $\mu_c \in \mathbb{R}^{s}$ and known variance-covariance matrix $\Sigma \in \mathcal{M}_{s\times s}(\mathbb{R})$.

In this case, contrary to what was assumed in Section 3.3.2, both multivariate random variables have the same known variance-covariance matrix.

As in Section 3.3.2, the question that arises here is whether a given $s$-dimensional vector $\mu_c$ is a reasonable value for the mean of an MVND $N_s\left(\mu_h, \Sigma\right)$. This leads to the following test of the hypothesis

$$H_0: \mu_c = \mu_h \quad \text{versus} \quad H_1: \mu_c \neq \mu_h, \tag{55}$$

where $H_0$ is “the MRS of the WT to be diagnosed is distributed as the baseline projection” and $H_1$ is “the MRS of the WT to be diagnosed is not distributed as the baseline projection.” In other words, if the result of the test is that $H_0$ is accepted, the current WT is categorized as healthy. Otherwise, if $H_0$ is rejected in favor of $H_1$, this indicates the presence of some fault in the WT.

In this case, the multivariate test is based on Hotelling’s $T^2$ statistic, which is outlined below. When an MRS of size $\nu$ is taken from an MVND $N_s\left(\mu_h, \Sigma\right)$, the RV

$$T^2 = \nu\left(\bar{X} - \mu_h\right)^{T} S^{-1}\left(\bar{X} - \mu_h\right) \tag{56}$$

is distributed as

$$T^2 \sim \frac{(\nu - 1)s}{\nu - s} F_{s,\nu - s}, \tag{57}$$

where $F_{s,\nu - s}$ denotes an RV with an F-distribution with $s$ and $\nu - s$ DOF, $\bar{X}$ is the sample mean vector as an MRV, and $\frac{1}{\nu}S \in \mathcal{M}_{s\times s}(\mathbb{R})$ is the estimated variance-covariance matrix of $\bar{X}$.

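The value of the test statistic is defined as

$$t_{\mathrm{obs}}^2 = \nu\left(\bar{x} - \mu_h\right)^{T} S^{-1}\left(\bar{x} - \mu_h\right), \tag{58}$$

and it is the fault indicator. We can then construct the following test:

$$t_{\mathrm{obs}}^2 \leq \frac{(\nu - 1)s}{\nu - s} F_{s,\nu - s}(\alpha) \implies \text{Accept } H_0, \tag{59}$$

$$t_{\mathrm{obs}}^2 > \frac{(\nu - 1)s}{\nu - s} F_{s,\nu - s}(\alpha) \implies \text{Accept } H_1, \tag{60}$$

where $F_{s,\nu - s}(\alpha)$ is the upper $(100\alpha)$th percentile of the $F_{s,\nu - s}$ distribution, that is,

$$\mathbb{P}\left(F_{s,\nu - s} > F_{s,\nu - s}(\alpha)\right) = \alpha, \tag{61}$$

where $\mathbb{P}$ is a probability measure and $\alpha$ is the level of significance for the test. To sum up,

  1. $H_0$ is rejected if $t_{\mathrm{obs}}^2 > \frac{(\nu - 1)s}{\nu - s} F_{s,\nu - s}(\alpha)$ (the WT is classified as not healthy) and

  2. $H_0$ is accepted if $t_{\mathrm{obs}}^2 \leq \frac{(\nu - 1)s}{\nu - s} F_{s,\nu - s}(\alpha)$ (the WT is classified as healthy).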
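
A compact sketch of the multivariate rule in Eqs. (58) to (61) is given below. It assumes that $S$ is the sample variance-covariance matrix of the sample under diagnosis and that the caller supplies $\mu_h$ (for instance, the mean of the baseline scores); both choices, like the function name and the synthetic data, are assumptions made for this illustration. The quantile is obtained with `scipy.stats.f.ppf`.

```python
import numpy as np
from scipy import stats


def hotelling_fault_test(sample, mu_h, alpha=0.10):
    """Hotelling T^2 test of H0: mean vector equals mu_h.

    sample: array of shape (nu, s), one s-dimensional score vector per row.
    Returns (t2_obs, threshold, faulty), where faulty is True when H0 is rejected.
    """
    X = np.asarray(sample, dtype=float)
    nu, s = X.shape
    diff = X.mean(axis=0) - np.asarray(mu_h, dtype=float)
    S = np.atleast_2d(np.cov(X, rowvar=False))             # sample variance-covariance matrix
    t2_obs = nu * float(diff @ np.linalg.solve(S, diff))   # Eq. (58)
    f_quantile = stats.f.ppf(1.0 - alpha, s, nu - s)       # upper 100*alpha-th percentile
    threshold = (nu - 1) * s / (nu - s) * f_quantile       # right-hand side of Eq. (59)
    return t2_obs, float(threshold), bool(t2_obs > threshold)


# Synthetic example: nu = 50 score vectors of dimension s = 3.
rng = np.random.default_rng(0)
scores_demo = rng.normal(size=(50, 3))
print(hotelling_fault_test(scores_demo, mu_h=np.full(3, 5.0)))
```

For $s = 1$ the statistic reduces to the square of the one-sample t statistic, which offers a quick consistency check of an implementation.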

4. Simulation results

The results of the CM strategies presented in Sections 3.3.2 and 3.3.3 are organized into three subsections. The absolute number of samples that are correctly identified and the absolute numbers of false alarms and missing faults are reported in Section 4.1. Sections 4.2 and 4.3 show the results as percentages rather than absolute values. More precisely, the sensitivity and the specificity are both covered in Section 4.2, together with the false-negative rate (FNR) and the false-positive rate (FPR). In addition, the true rates of both false negatives and false positives are given in Section 4.3.

For the validation of the CM strategies presented in Sections 3.3.2 and 3.3.3, 24 samples of $\nu = 50$ elements each have been examined, organized as follows:

• 8 samples of a faulty WT (one sample for each one of the different fault scenarios described in Table 3) and

• 16 samples of a healthy WT.

All samples are acquired with different wind data sets, generated with TurbSim [14] with the turbulence intensity set to 10%. These wind data have the following features:

  1. Kaimal turbulence model,

  2. logarithmic profile wind type,

  3. mean speed of 18.2 m/s simulated at hub height, and

  4. a roughness factor of 0.01 m.

Each sample of $\nu = 50$ elements comes from the measures collected during $(\nu L - 1)\Delta = 312.4875$ seconds. The values for these parameters are listed in Table 4.
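As a quick check of this duration, substituting the values from Table 4 (50 rows, $L = 500$, and a sampling time of 0.0125, read here as seconds) gives:

$$(\nu L - 1)\,\Delta = (50 \cdot 500 - 1) \cdot 0.0125 = 24999 \cdot 0.0125 = 312.4875 \text{ s}.$$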

We present, in Sections 4.1, 4.2, and 4.3, the results when the collected data are projected into:

  1. the first principal component,

  2. the second principal component,

  3. the third principal component,

  4. the first and the second principal components, jointly,

  5. the first seven principal components, jointly, and

  6. the first twelve principal components, jointly.

In the three univariate cases (items 1–3), we use the test for the equality of means, while in the three multivariate cases (items 4–6), we use the test for the plausibility of a value for a normal population. In both cases, the chosen level of significance is α=10%.

4.1. Types I and II errors

In this section, each of the 24 samples is assigned to one of the following four categories:

  1. sample from the healthy WT (healthy sample) classified by the hypothesis test as “healthy” (accept H0) [right decision],

  2. sample from the faulty WT (faulty sample) classified by the test as “faulty” (accept H1) [right decision],

  3. faulty sample classified as “healthy” [wrong decision/missing fault/type II error], and

  4. healthy sample classified as “faulty” [wrong decision/false alarm/type I error].

The results displayed in Table 6 are arranged according to the scheme in Table 5.
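The four categories can be tallied with a short helper. This is a minimal sketch in which the function name `tally_decisions` and the label strings `"H0"`/`"H1"` are assumptions made for illustration. The example labels are built only to match the aggregate counts of the “Score 1” block of Table 6, since the chapter does not say which individual samples were misclassified.

```python
from collections import Counter


def tally_decisions(truth, decision):
    """Count the four outcomes of the scheme in Table 5.

    truth    : iterable of "H0" (healthy sample) or "H1" (faulty sample)
    decision : iterable of "H0" (accept H0) or "H1" (accept H1)
    """
    truth = list(truth)
    decision = list(decision)
    if len(truth) != len(decision):
        raise ValueError("truth and decision must have the same length")

    counts = Counter(zip(decision, truth))
    return {
        "correct_healthy": counts[("H0", "H0")],  # right decision
        "missing_fault": counts[("H0", "H1")],    # type II error
        "false_alarm": counts[("H1", "H0")],      # type I error
        "correct_faulty": counts[("H1", "H1")],   # right decision
    }


# Illustrative labels consistent with the "Score 1" block of Table 6:
# 16 healthy samples, all accepted; 8 faulty samples, one accepted as healthy.
truth = ["H0"] * 16 + ["H1"] * 8
decision = ["H0"] * 16 + ["H0"] * 1 + ["H1"] * 7
score1 = tally_decisions(truth, decision)
print(score1)
```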

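| Parameter | Symbol | Magnitude |
|---|---|---|
| Number of rows | $\nu$ | 50 |
| Number of columns | $L$ | 500 |
| Sampling time | $\Delta$ | 0.0125 |
| Number of sensors | $N$ | 13 |

Table 4.

The collected measures are arranged in a $\nu \times (N \cdot L)$ matrix $Y$ as in Eq. (28)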

| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | Correct decision | Type II error (missing fault) |
| Accept $H_1$ | Type I error (false alarm) | Correct decision |

Table 5.

Scheme for the presentation of the results in Table 6

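| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 16 | 1 | Accept $H_0$ | 12 | 0 |
| Accept $H_1$ | 0 | 7 | Accept $H_1$ | 4 | 8 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 13 | 7 | Accept $H_0$ | 13 | 0 |
| Accept $H_1$ | 3 | 1 | Accept $H_1$ | 3 | 8 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 16 | 8 | Accept $H_0$ | 16 | 0 |
| Accept $H_1$ | 0 | 0 | Accept $H_1$ | 0 | 8 |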

Table 6.

Categorization of the samples with respect to the presence or absence of a fault and the result of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is ν=50 and the level of significance is α=10%

| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | Specificity ($1 - \alpha$) | False-negative rate ($\gamma$) |
| Accept $H_1$ | False-positive rate ($\alpha$) | Sensitivity ($1 - \gamma$) |

Table 7.

Relationship between specificity and sensitivity.

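| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 1.00 | 0.12 | Accept $H_0$ | 0.75 | 0.00 |
| Accept $H_1$ | 0.00 | 0.88 | Accept $H_1$ | 0.25 | 1.00 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 0.81 | 0.88 | Accept $H_0$ | 0.81 | 0.00 |
| Accept $H_1$ | 0.19 | 0.12 | Accept $H_1$ | 0.19 | 1.00 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 1.00 | 1.00 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 0.00 | Accept $H_1$ | 0.00 | 1.00 |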

Table 8.

Sensitivity and specificity of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is ν=50 and the level of significance is α=10%

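| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | $\mathbb{P}(H_0 \mid \text{accept } H_0)$ | True rate of false negatives $\mathbb{P}(H_1 \mid \text{accept } H_0)$ |
| Accept $H_1$ | True rate of false positives $\mathbb{P}(H_0 \mid \text{accept } H_1)$ | $\mathbb{P}(H_1 \mid \text{accept } H_1)$ |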

Table 9.

Relationship between the proportion of false negatives and false positives.

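| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 0.94 | 0.06 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 1.00 | Accept $H_1$ | 0.33 | 0.67 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 0.65 | 0.35 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.75 | 0.25 | Accept $H_1$ | 0.27 | 0.73 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 0.67 | 0.33 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 0.00 | Accept $H_1$ | 0.00 | 1.00 |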

Table 10.

True rate of false negatives and true rate of false positives of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is ν=50 and the level of significance is α=10%

4.2. Sensitivity and specificity

As in [20, 30], two more statistical indicators are analyzed to assess the efficiency of the test. On the one hand, the specificity of the test is defined as the fraction of samples from the healthy structure that are correctly classified. On the other hand, the sensitivity, or the power of the test, is defined as the fraction of samples from the faulty wind turbine that are correctly classified as such.

Table 8 reports the sensitivity and specificity of both the univariate and the multivariate HT for the 24 samples, arranged according to the scheme in Table 7.
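These rates follow directly from the counts in Table 6, normalizing each column (healthy or faulty samples) by its total. A minimal sketch, with the hypothetical function name `sensitivity_specificity` and argument names chosen for illustration:

```python
def sensitivity_specificity(correct_healthy, missing_fault,
                            false_alarm, correct_faulty):
    """Column-wise rates of Table 7 from the counts of Table 5.

    Each rate is normalized by the number of samples that truly are
    healthy (H0 column) or truly are faulty (H1 column).
    """
    healthy_total = correct_healthy + false_alarm
    faulty_total = missing_fault + correct_faulty
    if healthy_total == 0 or faulty_total == 0:
        raise ValueError("both healthy and faulty samples are required")

    return {
        "specificity": correct_healthy / healthy_total,
        "false_positive_rate": false_alarm / healthy_total,
        "false_negative_rate": missing_fault / faulty_total,
        "sensitivity": correct_faulty / faulty_total,
    }


# Counts of the "Score 2" block of Table 6.
score2 = sensitivity_specificity(correct_healthy=13, missing_fault=7,
                                 false_alarm=3, correct_faulty=1)
print(score2)
```

For the “Score 2” counts, this gives a specificity of 13/16 and a sensitivity of 1/8, which Table 8 reports rounded to two decimals.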

4.3. Reliability of the results

Finally, the true rate of false negatives and the true rate of false positives can be used to assess the performance of the proposed CM strategy. These two measures, closely related to Bayes’ theorem [31], are described in Table 9. On the one hand, the true rate of false negatives is the fraction of the samples classified as healthy that actually come from the faulty WT. On the other hand, the true rate of false positives is the fraction of the samples classified as faulty that actually come from the healthy WT.

Table 10 reports the true rate of false negatives and the true rate of false positives of both the univariate and the multivariate HT, arranged according to the scheme in Table 9.
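The entries of Table 10 can be recovered from the counts in Table 6 by normalizing each row (accept H0 or accept H1) by its total, following the conditional probabilities of Table 9. The sketch below uses the hypothetical function name `true_rates`. Returning 0.0 when no sample falls in a row is an assumption made here to mirror the 0.00 entries shown for “Score 3” in Table 10, where the ratio is otherwise undefined.

```python
def true_rates(correct_healthy, missing_fault, false_alarm, correct_faulty):
    """Row-wise rates of Table 9 from the counts of Table 5.

    Returns (true_rate_false_negatives, true_rate_false_positives), i.e.
    P(H1 | accept H0) and P(H0 | accept H1).
    """
    accepted_h0 = correct_healthy + missing_fault
    accepted_h1 = false_alarm + correct_faulty

    # Assumption: an empty row (0/0) is reported as 0.0.
    true_fn = missing_fault / accepted_h0 if accepted_h0 else 0.0
    true_fp = false_alarm / accepted_h1 if accepted_h1 else 0.0
    return true_fn, true_fp


# Counts of the "Score 1" and "Scores 1-7" blocks of Table 6.
score1_rates = true_rates(16, 1, 0, 7)     # (1/17, 0/7)
scores_1_7_rates = true_rates(13, 0, 3, 8)  # (0/13, 3/11)
print(score1_rates, scores_1_7_rates)
```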


5. Concluding remarks

A multifault detection method based on MPCA through uni- and multivariate hypothesis testing has been presented in this chapter. Its performance has been assessed on eight realistic faults in different components of the WT. Notably, the proposed strategy does not need extra sensors and uses only data already available from the WT SCADA system.

The three main conclusions, which show the benefits of multivariate statistical hypothesis testing over the univariate case for WT condition monitoring, are the following:

  1. Given a level of significance α=10%, when the first 12 scores are considered jointly, an accuracy of 100% is obtained, while in all the other studied cases, misclassifications are present.

  2. Multivariate analysis leads to average values of 100% for the sensitivity and 85.33% for the specificity, while for the univariate case, the average values are 33.33% and 93.67%, respectively.

  3. Multivariate analysis leads to an average true rate of false negatives of 0% and an average true rate of false positives of 20%, while for the univariate case, the average values are 24.67% and 25%, respectively.
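The averages quoted in the last two conclusions can be reproduced from the rounded entries of Tables 8 and 10. The following sketch copies those entries by hand. The dictionary layout and the helper name `average_percent` are assumptions made for illustration.

```python
from statistics import mean

# Rounded entries copied from Table 8 (sensitivity, specificity) and
# Table 10 (true rates), one value per case.
univariate = {  # score 1, score 2, score 3
    "sensitivity": [0.88, 0.12, 0.00],
    "specificity": [1.00, 0.81, 1.00],
    "true_rate_false_negatives": [0.06, 0.35, 0.33],
    "true_rate_false_positives": [0.00, 0.75, 0.00],
}
multivariate = {  # scores 1-2, scores 1-7, scores 1-12
    "sensitivity": [1.00, 1.00, 1.00],
    "specificity": [0.75, 0.81, 1.00],
    "true_rate_false_negatives": [0.00, 0.00, 0.00],
    "true_rate_false_positives": [0.33, 0.27, 0.00],
}


def average_percent(values):
    """Mean of a list of rates, expressed as a percentage."""
    return round(100 * mean(values), 2)


summary = {
    "univariate": {k: average_percent(v) for k, v in univariate.items()},
    "multivariate": {k: average_percent(v) for k, v in multivariate.items()},
}
print(summary)
```

Averaging the exact ratios from Table 6 instead of the rounded table entries would give slightly different figures, for example (12/16 + 13/16 + 16/16)/3 ≈ 85.42% for the multivariate specificity.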


Acknowledgments

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness through the research projects DPI2014-58427-C2-1-R and DPI2017-82930-C2-1-R, and by the Generalitat de Catalunya through the research project 2017 SGR 388.


Abbreviations

DOF: degrees of freedom
CM: condition monitoring
FAST: fatigue, aerodynamics, structures, and turbulence
FD: fault detection
FNR: false-negative rate
FPR: false-positive rate
GS: group scaling
HT: hypothesis testing
MCGS: mean-centered group scaling
MPCA: multiway principal component analysis
MRS: multivariate random sample
MRV: multivariate random variable
MVND: multivariate normal distribution
O&M: operation and maintenance
PCA: principal component analysis
RV: random variable
SCADA: supervisory control and data acquisition
SHM: structural health monitoring
WT: wind turbine

References

  1. Lindenberg S, Smith S, O’Dell K, Demeo E, Ram B. 20% wind energy by 2030: Increasing wind energy’s contribution to US Electricity Supply; Technical Report; Oak Ridge, TN, USA: U.S. Department of Energy; 2008. DOE/GO-102008-2567
  2. Tchakoua P, Wamkeue R, Ouhrouche M, Slaoui-Hasnaoui F, Tameghe TA, Ekemb G. Wind turbine condition monitoring: State-of-the-art review, new trends, and future challenges. Energies. 2014;7(4):2595-2630
  3. Qiao W, Lu D. A survey on wind turbine condition monitoring and fault diagnosis. Part I: Components and subsystems. IEEE Transactions on Industrial Electronics. 2015;62(10):6536-6545
  4. Ruiz M, Mujica LE, Alférez S, Acho L, Tutivén C, Vidal Y, Rodellar J, Pozo F. Wind turbine fault detection and classification by means of image texture analysis. Mechanical Systems and Signal Processing. 2018;107:149-167
  5. Zaher A, McArthur S. A multi-agent fault detection system for wind turbine defect recognition and diagnosis. In: Power Tech. Lausanne: IEEE; 2007. pp. 22-27
  6. Pozo F, Vidal Y. Damage and fault detection of structures using principal component analysis and hypothesis testing. In: Advances in Principal Component Analysis. Springer; 2018. pp. 137-191
  7. Odgaard P, Johnson K. Wind turbine fault diagnosis and fault tolerant control: An enhanced benchmark challenge. In: Proceedings of the 2013 American Control Conference (ACC); Washington, DC, USA; 2013. pp. 1-6
  8. Pozo F, Arruga I, Mujica LE, Ruiz M, Podivilova E. Detection of structural changes through principal component analysis and multivariate statistical inference. Structural Health Monitoring. 2016;15(2):127-142
  9. Jonkman J. NWTC Computer-Aided Engineering Tools (FAST); Last modified 28-October-2013; NWTC Information Portal: Washington, DC, USA; 2013
  10. Vidal Y, Tutiven C, Rodellar J, Acho L. Fault diagnosis and fault-tolerant control of wind turbines via a discrete time controller with a disturbance compensator. Energies. 2015;8(5):4300-4316
  11. Ochs DS, Miller RD, White WN. Simulation of electromechanical interactions of permanent-magnet direct-drive wind turbines using the fast aeroelastic simulator. IEEE Transactions on Sustainable Energy. 2014;5(1):2-9
  12. Beltran B, El Hachemi Benbouzid M, Ahmed-Ali T. Second-order sliding mode control of a doubly fed induction generator driven wind turbine. IEEE Transactions on Energy Conversion. 2012;27(2):261-269
  13. Jonkman JM, Butterfield S, Musial W, Scott G. Definition of a 5-MW reference wind turbine for offshore system development, Technical Report. Golden, Colorado: National Renewable Energy Laboratory; 2009. NREL/TP-500-38060
  14. Kelley N, Jonkman B. NWTC Computer-Aided Engineering Tools (Turbsim); Last modified 30-May-2013; NWTC Information Portal: Washington, DC, USA; 2013
  15. Ostachowicz W, Kudela P, Krawczuk M, Zak A. Guided Waves in Structures for SHM: The Time-Domain Spectral Element Method. Chichester, UK: John Wiley & Sons, Ltd; 2012
  16. Chai Y, Yang H, Zhao L. Data unfolding PCA modelling and monitoring of multiphase batch processes. IFAC Proceedings Volumes. 2013;46(13):569-574
  17. Ruiz M, Villez K, Sin G, Colomer J, Vanrrolleghem P. Influence of scaling and unfolding in PCA based monitoring of nutrient removing batch process. In: Fault Detection, Supervision and Safety of Technical Processes 2006. Amsterdam, The Netherlands: Elsevier; 2007. pp. 114-119
  18. Westerhuis JA, Kourti T, MacGregor JF. Comparing alternative approaches for multivariate statistical analysis of batch process data. Journal of Chemometrics. 1999;13(3-4):397-413
  19. Ruiz M, Mujica LE, Sierra J, Pozo F, Rodellar J. Multiway principal component analysis contributions for structural damage localization. Structural Health Monitoring. 2017:1475921717737971
  20. Pozo F, Vidal Y. Wind turbine fault detection through principal component analysis and statistical hypothesis testing. Energies. 2016;9(1):1-20
  21. Pozo F, Vidal Y, Serrahima JM. On real-time fault detection in wind turbines: Sensor selection algorithm and detection time reduction analysis. Energies. 2016;9(7):520
  22. Anaya M, Tibaduiza D, Pozo F. A bioinspired methodology based on an artificial immune system for damage detection in structural health monitoring. Shock and Vibration. 2015;2015:1-15
  23. Anaya M, Tibaduiza DA, Pozo F. Detection and classification of structural changes using artificial immune systems and fuzzy clustering. International Journal of Bio-Inspired Computation. 2017;9(1):35-52
  24. Mujica LE, Rodellar J, Fernández A, Güemes A. Q-statistic and t2-statistic PCA-based measures for damage assessment in structures. Structural Health Monitoring. 2011;10(5):539-553
  25. Zugasti E, González AG, Anduaga J, Arregui MA, Martínez F. Nullspace and autoregressive damage detection: A comparative study. Smart Materials and Structures. 2012;21(8):085010
  26. Häckell MW, Rolfes R, Kane MB, Lynch JP. Three-tier modular structural health monitoring framework using environmental and operational condition clustering for data normalization: Validation on an operational wind turbine system. Proceedings of the IEEE. 2016;104(8):1632-1646
  27. Ng HK, Chen RH, Speyer JL. A vehicle health monitoring system evaluated experimentally on a passenger vehicle. IEEE Transactions on Control Systems Technology. 2006;14(5):854-870
  28. Tsiapoki S, Häckell MW, Grießmann T, Rolfes R. Damage and ice detection on wind turbine rotor blades using a three-tier modular structural health monitoring framework. Structural Health Monitoring. 2017:1475921717732730
  29. Ugarte MD, Militino AF, Arnholt A. Probability and Statistics with R. Boca Raton, FL, USA: CRC Press/Taylor & Francis Group; 2008
  30. Pozo F, Vidal Y, Salgado Ó. Wind turbine condition monitoring strategy through multiway PCA and multivariate inference. Energies. 2018;11(4):749
  31. DeGroot MH, Schervish MJ. Probability and Statistics. London, UK: Pearson; 2012
