Open access peer-reviewed chapter

Condition Monitoring of Wind Turbine Structures through Univariate and Multivariate Hypothesis Testing

By Francesc Pozo and Yolanda Vidal

Submitted: January 29th 2018. Reviewed: May 15th 2018. Published: September 26th 2018.

DOI: 10.5772/intechopen.78727

Abstract

This chapter presents a fault detection method for wind turbines (WTs) based on univariate and multivariate hypothesis testing. The approach is data-driven and relies on supervisory control and data acquisition (SCADA) data. First, a model is built from a healthy WT data set through multiway principal component analysis (MPCA). Afterward, the data from a WT to be diagnosed are projected into the MPCA model space. Since the turbulent wind is a random process, the dynamic response of the WT can be considered a stochastic process, and thus the acquired SCADA measurements are treated as a random process. The objective is to determine whether the distribution of the multivariate random samples obtained from the WT to be diagnosed (healthy or not) is related to the distribution of the baseline. To this end, a test for the equality of population means is performed in both the univariate and the multivariate cases. Ultimately, the test results establish whether the WT is healthy or faulty. The performance of the proposed method is validated using an advanced benchmark that comprises a 5-MW WT subject to various actuator and sensor faults of different types.

Keywords

  • condition monitoring
  • wind turbines
  • principal component analysis
  • hypothesis testing

1. Introduction

The cost of wind energy depends strongly on the performance of the condition monitoring system. Advances in this area would decrease downtime, extend the WT lifetime, and ultimately reduce operation and maintenance (O&M) costs, which is one of the main challenges in wind energy, as stated in “20% Wind Energy by 2030” [1].

Usually, condition monitoring comprises different systems (vibration analysis, oil monitoring, etc. [2]) for different parts and different types of faults, and it relies on expensive dedicated sensors that must be installed in the WT. Therefore, fault detection systems that use only the data already available from the turbine SCADA system, and that cover different parts and different types of faults, are promising, since no additional sensors or data acquisition devices are needed. The SCADA signals provide rich information on WT performance; thus, with appropriate algorithms, they can be used effectively for condition monitoring, prognostics, and remaining useful life prediction of WTs [3]. SCADA data have already been used successfully for condition monitoring. For example, Ruiz et al. presented a machine learning approach [4], Zaher and McArthur combined anomaly detection and data-trending techniques encapsulated in a multiagent framework [5], and Pozo and Vidal proposed a fault detection system based on principal component analysis [6].

In this work, following the enhanced benchmark challenge for wind turbine fault detection proposed in [7], a set of eight realistic fault scenarios is considered to develop a WT condition monitoring strategy. The strategy combines a SCADA data-driven baseline model based on MPCA (a reference pattern obtained from the healthy wind turbine) with uni- and multivariate hypothesis testing. Previous works using MPCA and hypothesis testing to detect structural damage [8] rely on guided waves; that is, the vibration (guided wave) induced in the structure is known and always the same. In this work, however, the vibration is induced by the ever-changing wind. The benchmark comprises different types of faults of a 5-MW WT simulated with FAST [9], which has been accepted by the scientific community and is widely used for WT-related research, e.g., [10, 11, 12].

The chapter is organized as follows. Section 2 briefly recalls the WT benchmark model. In Section 3, the condition monitoring strategy is stated. Simulation results are discussed in Section 4. Finally, conclusions are drawn in Section 5.

2. Wind turbine benchmark model

The benchmark model used here was proposed in [7]. It covers a 5-MW, three-bladed, variable-speed WT modeled with the FAST simulator, detailed actuator and sensor models, and the different fault descriptions. For a complete description of the benchmark, please see reference [7]. Here, a short review is given to introduce the notation.

The specifications of the 5-MW reference WT are documented in [13]. This model has been used as a reference by research teams throughout the world to standardize baseline on- and off-shore wind turbine specifications. The typical features of the wind turbine are given in Table 1, and the assumed available SCADA data are given in Table 2. This work deals with the so-called full load region of operation. To run the simulations, turbulent wind data sets that cover this region have been generated with TurbSim [14], see Figure 1.

| Reference wind turbine | Magnitude |
|---|---|
| Rated power | 5 MW |
| Number of blades | 3 |
| Rotor/hub diameter | 126, 3 m |
| Hub height | 90 m |
| Cut-in, rated, and cut-out wind speed | 3, 11.4, and 25 m/s |
| Rated generator speed ($\omega_{ng}$) | 1173.7 rpm |
| Gearbox ratio | 97 |

Table 1.

WT properties.

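| Number | Sensor type | Symbol | Units |
|---|---|---|---|
| 1 | Generated electrical power | $P_{e,m}$ | kW |
| 2 | Rotor speed | $\omega_{r,m}$ | rad/s |
| 3 | Generator speed | $\omega_{g,m}$ | rad/s |
| 4 | Generator torque | $\tau_{c,m}$ | Nm |
| 5 | First pitch angle | $\beta_{1,m}$ | ° |
| 6 | Second pitch angle | $\beta_{2,m}$ | ° |
| 7 | Third pitch angle | $\beta_{3,m}$ | ° |
| 8 | Fore-aft acceleration at tower bottom | $a_{fa,m}^{b}$ | m/s² |
| 9 | Side-to-side acceleration at tower bottom | $a_{ss,m}^{b}$ | m/s² |
| 10 | Fore-aft acceleration at mid-tower | $a_{fa,m}^{m}$ | m/s² |
| 11 | Side-to-side acceleration at mid-tower | $a_{ss,m}^{m}$ | m/s² |
| 12 | Fore-aft acceleration at tower top | $a_{fa,m}^{t}$ | m/s² |
| 13 | Side-to-side acceleration at tower top | $a_{ss,m}^{t}$ | m/s² |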

Table 2.

Assumed available measurements.

These sensors are representative of the types of sensors that are available on an MW-scale commercial wind turbine.

Figure 1.

Wind speed signal with turbulence intensity set to 10%.

The generator-converter system can be approximated by a first-order ordinary differential equation, see [7], which is given by:

$$\dot{\tau}_r(t) + \alpha_{gc}\,\tau_r(t) = \alpha_{gc}\,\tau_c(t) \tag{1}$$

where $\tau_r$ and $\tau_c$ are the real generator torque and its reference (given by the controller), respectively. In the numerical simulations, $\alpha_{gc} = 50$, see [13]. Moreover, the power produced by the generator, $P_e(t)$, is given by (see [7]):

$$P_e(t) = \eta_g\,\omega_g(t)\,\tau_r(t) \tag{2}$$

where $\eta_g$ is the efficiency of the generator and $\omega_g$ is the generator speed. In the numerical experiments, $\eta_g = 0.98$ is used, see [7].
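As an illustration only, Eqs. (1) and (2) can be integrated numerically with a few lines of Python. The sketch below assumes a zero-order hold on the reference torque and uses the exact one-step solution of the first-order model; the function name, the constant reference torque, the constant generator speed, and the step size are hypothetical choices and are not taken from the benchmark.

```python
import math

def simulate_generator(tau_c, omega_g, dt, alpha_gc=50.0, eta_g=0.98, tau_r0=0.0):
    """Integrate tau_r' + alpha_gc*tau_r = alpha_gc*tau_c, holding tau_c constant over each step."""
    decay = math.exp(-alpha_gc * dt)  # exact decay factor over one step for a constant input
    tau_r = tau_r0
    torque, power = [], []
    for tc, wg in zip(tau_c, omega_g):
        tau_r = decay * tau_r + (1.0 - decay) * tc  # Eq. (1), discretized
        torque.append(tau_r)
        power.append(eta_g * wg * tau_r)            # Eq. (2)
    return torque, power

# Hypothetical inputs: constant reference torque [Nm] and generator speed [rad/s]
steps = 200
torque, power = simulate_generator([40000.0] * steps, [122.9] * steps, dt=0.0125)
```

With $\alpha_{gc} = 50$ the time constant is 0.02 s, so the real torque settles on its reference within a few tenths of a second.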

Each of the three pitch actuators is modeled as a closed-loop transfer function between the pitch angle, $\beta(s)$, and its reference, $\beta_r(s)$:

$$\frac{\beta(s)}{\beta_r(s)} = \frac{\omega_n^2}{s^2 + 2\xi\omega_n s + \omega_n^2} \tag{3}$$

where $\xi$ is the damping ratio and $\omega_n$ is the natural frequency, which take the fault-free values $\xi = 0.6$ and $\omega_n = 11.11$ rad/s, see [7].
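To make the role of $\xi$ and $\omega_n$ concrete, the following minimal sketch computes the step response of Eq. (3) in the time domain with a classical fourth-order Runge-Kutta scheme. The function name, the 5° reference step, and the integration step are assumptions made for illustration; the benchmark itself is simulated with FAST.

```python
def pitch_step_response(beta_ref, t_end, dt=0.001, xi=0.6, omega_n=11.11):
    """Integrate beta'' + 2*xi*omega_n*beta' + omega_n**2*beta = omega_n**2*beta_ref with RK4."""
    def deriv(beta, rate):
        # State derivative of the second-order model written as two first-order equations
        return rate, omega_n ** 2 * (beta_ref - beta) - 2.0 * xi * omega_n * rate

    beta, rate = 0.0, 0.0
    history = [beta]
    for _ in range(int(round(t_end / dt))):
        k1b, k1r = deriv(beta, rate)
        k2b, k2r = deriv(beta + 0.5 * dt * k1b, rate + 0.5 * dt * k1r)
        k3b, k3r = deriv(beta + 0.5 * dt * k2b, rate + 0.5 * dt * k2r)
        k4b, k4r = deriv(beta + dt * k3b, rate + dt * k3r)
        beta += dt * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
        rate += dt * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        history.append(beta)
    return history

# Hypothetical 5-degree step in the pitch reference, fault-free parameters
response = pitch_step_response(beta_ref=5.0, t_end=3.0)
```

Calling the same function with other values of `xi` and `omega_n` shows how a change in actuator dynamics alters the response to the same reference.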

The fault detection benchmark considers different types of faults at different components (sensors and actuators), as described in Table 3.

| Fault | Type | Description |
|---|---|---|
| F1 | Pitch actuator | Change in dynamics: high air content in oil |
| F2 | Pitch actuator | Change in dynamics: pump wear |
| F3 | Pitch actuator | Change in dynamics: hydraulic leakage |
| F4 | Torque actuator | Offset (offset value equal to 2000 Nm) |
| F5 | Generator speed sensor | Scaling (gain factor equal to 1.2) |
| F6 | Pitch angle sensor | Stuck (fixed value equal to 5°) |
| F7 | Pitch angle sensor | Stuck (fixed value equal to 10°) |
| F8 | Pitch angle sensor | Scaling (gain factor equal to 1.2) |

Table 3.

Fault scenarios.

3. Condition monitoring (CM) strategy

The overall CM strategy is based on a three-tier framework:

  1. a multiway PCA (MPCA) model is built with the data that are collected from a healthy WT,

  2. when a new WT has to be diagnosed, the SCADA data are projected using the MPCA model created in step 1, and

  3. the final decision is based on both univariate and multivariate hypothesis testing (HT).

3.1. The wind as a source for the excitation: the need for a new paradigm

In general, vibration-based structural health monitoring (SHM) relies on the fact that a change in physical properties due to damage or structural change causes changes in the dynamic response that can be detected. Figure 2 represents this paradigm: a healthy structure is excited by a prescribed signal to build a pattern. Afterward, the structure to be diagnosed is excited by exactly the same signal, and its response is measured, processed, and finally compared with the previous pattern. The strategy presented in Figure 2 is known as “guided waves in structures for SHM” [15].

Figure 2.

Vibration-based SHM relies on the fact that a change in physical properties due to damage or structural change causes changes in the dynamic response that can be detected.

In the present chapter, the field of application is wind turbines and a realistic scenario is to consider that the excitation comes from the wind turbulence. The wind turbulence cannot be controlled and it is always different. Therefore, the paradigm of guided waves in WT for SHM as in Figure 2 cannot be considered. In this case, when the source of the excitation cannot be previously prescribed, a new paradigm is needed, as represented in Figure 3. The foundation of the new paradigm is that, even with a constantly different excitation, the CM strategy based on MPCA and univariate and multivariate HT will be able to disclose some hidden damage, misbehavior, or fault. To sum up, the fundamental idea behind the CM strategy is the hypothesis that a variation in the overall behavior of the WT, even with an unprescribed excitation, should be detected.

Figure 3.

The key idea behind the new paradigm of the detection strategy is the assumption that a change in the behavior of the overall system, even with a different excitation, has to be detected.

Section 4 presents the simulation results of the proposed CM strategy, which support this hypothesis.

3.2. Data-driven baseline modeling based on MPCA

Multiway principal component analysis (MPCA) is a natural extension of classical principal component analysis (PCA) to manage data in multidimensional arrays [16, 17]. A conventional two-dimensional data matrix can be treated as a two-way array, where experiments and variables (or discretization instant times) form the two different ways. Frequently, this arrangement has to be extended to multiway arrays, particularly if several sensors—in different experimental trials—are gathering data at different time instants. Consequently, MPCA is equivalent to the application of standard PCA to an unfolded version of the initial multiway array.

Westerhuis et al. [18] propose six different ways of unfolding a three-way data matrix. Besides, in [18], a critical analysis of several aspects of the treatment of multiway data is provided, including how the matrix is unfolded, but also mean-centering and scaling with respect to the effects on the analysis of batch data. Ruiz et al. [19] assign one of the first six letters of the alphabet to each one of the six different ways of unfolding. In this chapter, as well as in [6, 8, 20, 21], we have considered the so-called type E. However, we will present the collected SCADA data arranged in an already unfolded matrix.

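The MPCA modeling starts by measuring, from a healthy wind turbine, a sensor during $(nL-1)\Delta$ seconds, where $\Delta$ is the sampling time and $n, L \in \mathbb{N}$. The discretized measures of the sensor form a real vector

$$\left(x_{11}, x_{12}, \ldots, x_{1L}, x_{21}, x_{22}, \ldots, x_{2L}, \ldots, x_{n1}, x_{n2}, \ldots, x_{nL}\right) \in \mathbb{R}^{nL} \tag{4}$$

where the real number $x_{ij}$, $i=1,\ldots,n$, $j=1,\ldots,L$, corresponds to the measure of the sensor at time $\left((i-1)L+(j-1)\right)\Delta$ seconds. These collected data can be arranged in matrix form as follows:

$$\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1L} \\
\vdots & \vdots & \ddots & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{iL} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nL}
\end{pmatrix} \in \mathcal{M}_{n\times L}(\mathbb{R}) \tag{5}$$

where $\mathcal{M}_{n\times L}(\mathbb{R})$ is the vector space of $n\times L$ matrices over $\mathbb{R}$. It is worth noting that $n$ is the number of rows of the matrix in Eq. (5) and $L$ is the number of columns of the same matrix. The effect of the choice of $n$ and $L$ on the overall performance of the condition monitoring strategy is thoroughly analyzed in [21].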

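Let us assume that the SCADA data are now collected from $N \in \mathbb{N}$ sensors during the same period of time. In this case, the collected data for each sensor can be organized in a matrix as in Eq. (5). Subsequently, the data coming from the whole set of sensors are concatenated and arranged in a matrix $X \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$ as follows:

$$\begin{aligned}
X &= \left(\begin{array}{cccc|ccc|c|ccc}
x_{11}^1 & x_{12}^1 & \cdots & x_{1L}^1 & x_{11}^2 & \cdots & x_{1L}^2 & \cdots & x_{11}^N & \cdots & x_{1L}^N \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\
x_{i1}^1 & x_{i2}^1 & \cdots & x_{iL}^1 & x_{i1}^2 & \cdots & x_{iL}^2 & \cdots & x_{i1}^N & \cdots & x_{iL}^N \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\
x_{n1}^1 & x_{n2}^1 & \cdots & x_{nL}^1 & x_{n1}^2 & \cdots & x_{nL}^2 & \cdots & x_{n1}^N & \cdots & x_{nL}^N
\end{array}\right) \\
&= \Big(\underbrace{v_1 \,|\, v_2 \,|\, \cdots \,|\, v_L}_{X^1} \,\Big|\, \underbrace{v_{L+1} \,|\, \cdots \,|\, v_{2L}}_{X^2} \,\Big|\, \cdots \,\Big|\, \underbrace{v_{(N-1)L+1} \,|\, \cdots \,|\, v_{N\cdot L}}_{X^N}\Big) \\
&= \left(X^1 \,|\, X^2 \,|\, \cdots \,|\, X^N\right) \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})
\end{aligned} \tag{6}$$

where the superindex $k=1,\ldots,N$ of each element $x_{ij}^k$ in the matrix denotes the sensor number. Matrix $X \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$, where $\mathcal{M}_{n\times (N\cdot L)}(\mathbb{R})$ is the vector space of $n\times (N\cdot L)$ matrices over $\mathbb{R}$, contains the measures from $N$ sensors at $nL$ discretization instants. Consequently, each row vector $x_i^T = X(i,:) \in \mathbb{R}^{N\cdot L}$, $i=1,\ldots,n$, represents the measurements from all the sensors at time instants $\left((i-1)L+(j-1)\right)\Delta$ seconds, $j=1,\ldots,L$. Equivalently, each column vector $v_j = X(:,j) \in \mathbb{R}^n$, $j=1,\ldots,N\cdot L$, represents measurements from sensor number $k = \lceil j/L \rceil$ at time instants $\left((i-1)L+(j'-1)\right)\Delta$ seconds, $i=1,\ldots,n$, where $j' = j-(k-1)L$ is the position of the column within the block $X^k$ and $\lceil\cdot\rceil$ is the ceiling function.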
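The arrangement in Eqs. (4) to (6) can be sketched with NumPy as follows. This is only an illustrative helper: the name `unfold` and the toy integer data are assumptions, and each sensor record is assumed to be stored as one row of length $nL$.

```python
import numpy as np

def unfold(signals, n, L):
    """Arrange N sensor records of length n*L (one per row) into the n x (N*L) matrix X."""
    signals = np.asarray(signals, dtype=float)
    # Each record is reshaped row-wise into the n x L block X^k of Eq. (5)
    blocks = [record.reshape(n, L) for record in signals]
    # The blocks are placed side by side: X = (X^1 | X^2 | ... | X^N), Eq. (6)
    return np.hstack(blocks)

# Toy data: N = 2 sensors, n = 3 rows, L = 4 columns
raw = np.arange(2 * 3 * 4).reshape(2, 12)
X = unfold(raw, n=3, L=4)
```

In this toy case, row $i$ of `X` holds samples $(i-1)L+1$ to $iL$ of the first sensor followed by the same samples of the second sensor.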

The objective of the subsequent analysis is to build the MPCA model, that is, the square orthogonal matrix $P \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R})$ that is used to transform or project the original data matrix $X$ according to the following matrix product:

$$T = XP \in \mathcal{M}_{n\times (N\cdot L)}(\mathbb{R}), \tag{7}$$

where the variance-covariance matrix of $T$ in Eq. (7) is diagonal.

In the approach proposed in this chapter, the model defined by matrix $P$ in Eq. (7) is based only on measures that come from a healthy wind turbine. Subsequently, data from the current WT to diagnose are projected using the matrix multiplication defined in Eq. (7). However, a different procedure can be considered, particularly when the goal is not just to detect damage or a fault but to classify it. In the latter case, matrix $X$ in Eq. (6) should contain measures from a WT in its healthy state and also in all the possible fault scenarios. This way, the model in matrix $P$ in Eq. (7) contains all the possible states of the structure.

3.2.1. Centering and scaling: group scaling (GS) vs. mean-centered group scaling (MCGS)

Considering that the data stored in matrix $X$ are affected by changing wind turbulence, come from different sensors, and could have different magnitudes and scales, some kind of preprocessing step is required to rescale the data [22, 23]. According to Westerhuis et al. [18], the way this preprocessing step is carried out may affect the overall performance of the CM strategy. In the present chapter, we present two possible choices that share a common core. These two alternatives are as follows:

  1. group scaling (GS) and

  2. mean-centered group scaling (MCGS).

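In the former case (GS), both the arithmetic mean and the variance of all measurements of the sensor are used. More precisely, for $k=1,2,\ldots,N$, we define

$$\mu^k = \frac{1}{nL}\sum_{i=1}^{n}\sum_{j=1}^{L} x_{ij}^k, \tag{8}$$

$$\left(\sigma^k\right)^2 = \frac{1}{nL}\sum_{i=1}^{n}\sum_{j=1}^{L}\left(x_{ij}^k-\mu^k\right)^2, \tag{9}$$

where $\mu^k$ and $(\sigma^k)^2$ are the arithmetic mean and the variance of the whole set of elements in matrix $X^k$, respectively. In this case, matrix $X=(x_{ij}^k)$ is centered and scaled, using GS, to define a modified matrix $\breve{X}=\breve{X}_{\mathrm{GS}}=(\breve{x}_{ij}^k)$ as

$$\breve{x}_{ij}^k := \frac{x_{ij}^k-\mu^k}{\sigma^k},\quad i=1,\ldots,n,\ j=1,\ldots,L,\ k=1,\ldots,N. \tag{10}$$

In the latter case (MCGS), the arithmetic mean of all measurements of the sensor in the same column is used in the normalization. More precisely, for $k=1,2,\ldots,N$, we define

$$\mu_j^k = \frac{1}{n}\sum_{i=1}^{n} x_{ij}^k,\quad j=1,\ldots,L, \tag{11}$$

where $\mu_j^k$ is the arithmetic mean of the measures placed in the same column. In this case, matrix $X=(x_{ij}^k)$ is centered and scaled, using MCGS, to define a modified matrix $\breve{X}=\breve{X}_{\mathrm{MCGS}}=(\breve{x}_{ij}^k)$ as

$$\breve{x}_{ij}^k := \frac{x_{ij}^k-\mu_j^k}{\sigma^k},\quad i=1,\ldots,n,\ j=1,\ldots,L,\ k=1,\ldots,N, \tag{12}$$

where $\sigma^k$ is the square root of the variance defined in Eq. (9), computed with $\mu^k$ as in Eq. (8). It is worth noting that the only difference between the expressions in Eqs. (10) and (12) is how the elements in matrix $X=(x_{ij}^k)$ are centered. When matrix $X=(x_{ij}^k)$ is scaled and centered according to the MCGS strategy described in Eq. (12), the average value of each column vector in the scaled matrix $\breve{X}$ can be calculated as

$$\frac{1}{n}\sum_{i=1}^{n}\breve{x}_{ij}^k = \frac{1}{n}\sum_{i=1}^{n}\frac{x_{ij}^k-\mu_j^k}{\sigma^k} = \frac{1}{n\sigma^k}\sum_{i=1}^{n}\left(x_{ij}^k-\mu_j^k\right) \tag{13}$$

$$= \frac{1}{n\sigma^k}\left[\left(\sum_{i=1}^{n}x_{ij}^k\right)-n\mu_j^k\right] \tag{14}$$

$$= \frac{1}{n\sigma^k}\left(n\mu_j^k-n\mu_j^k\right) = 0 \tag{15}$$

Since the scaled matrix $\breve{X}$ is mean-centered, the variance-covariance matrix can be computed directly as the product of the transpose of $\breve{X}$ and $\breve{X}$, divided by $n-1$, where $n$ is the number of rows of matrix $X$ in Eq. (6). More precisely,

$$C_{\breve{X}} = \frac{1}{n-1}\breve{X}^T\breve{X} \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{16}$$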
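A minimal NumPy sketch of the MCGS preprocessing and of the variance-covariance matrix in Eq. (16) is given below. The helper names `mcgs_fit` and `mcgs_apply` and the random toy data are assumptions made for illustration; the sketch also assumes that the columns of the data matrix are ordered sensor by sensor, as in Eq. (6).

```python
import numpy as np

def mcgs_fit(X, N, L):
    """Return the column means mu_j^k (N x L) and the standard deviations sigma^k (N,)."""
    blocks = X.reshape(X.shape[0], N, L)   # blocks[:, k, :] is the sensor block X^k
    mu_col = blocks.mean(axis=0)           # one mean per column, Eq. (11)
    sigma = blocks.std(axis=(0, 2))        # population std of each whole block, Eqs. (8)-(9)
    return mu_col, sigma

def mcgs_apply(X, mu_col, sigma):
    """Center each column with mu_j^k and scale each sensor block with sigma^k, Eq. (12)."""
    N, L = mu_col.shape
    blocks = X.reshape(X.shape[0], N, L)
    return ((blocks - mu_col) / sigma[:, None]).reshape(X.shape[0], N * L)

# Toy baseline: n = 20 rows, N = 2 sensors with very different scales, L = 3 columns each
rng = np.random.default_rng(0)
n, N, L = 20, 2, 3
X = rng.normal(size=(n, N * L)) * np.repeat([1.0, 50.0], L) + 3.0
mu_col, sigma = mcgs_fit(X, N, L)
X_scaled = mcgs_apply(X, mu_col, sigma)
C = X_scaled.T @ X_scaled / (n - 1)        # variance-covariance matrix, Eq. (16)
```

Because every column of the scaled matrix has zero mean, `C` coincides with the usual sample covariance of the columns.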

Clearly, GS and MCGS are not the only ways to center and scale data. For instance, feature scaling, also known as unity-based normalization, can also be considered. In this case, data are centered with respect to the minimum value and scaled with respect to the range of the set, that is,

$$\tilde{x}_{ij}^k := \frac{x_{ij}^k-\min x_{ij}^k}{\max x_{ij}^k-\min x_{ij}^k},\quad i=1,\ldots,n,\ j=1,\ldots,L,\ k=1,\ldots,N. \tag{17}$$

However, to easily compute the variance-covariance matrix in the CM strategy presented in this chapter, mean-centered group scaling (MCGS) is the method selected for centering and scaling. To avoid the cumbersome notation $\breve{X}$ throughout the rest of this chapter, the centered and scaled matrix is renamed $X$, without the breve sign.

The MPCA model is described by the latent vectors

$$p_j,\quad j=1,\ldots,N\cdot L, \tag{18}$$

also known as eigenvectors or proper vectors, and the latent roots

$$\lambda_j,\quad j=1,\ldots,N\cdot L, \tag{19}$$

also known as eigenvalues or proper values, of the variance-covariance matrix $C_X$ as follows:

$$C_X P = P\Lambda \tag{20}$$

where

$$P = \left(p_1 \,|\, p_2 \,|\, \cdots \,|\, p_{N\cdot L}\right) \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{21}$$

$$\Lambda = \left(\Lambda_{ij}\right) \in \mathcal{M}_{(N\cdot L)\times (N\cdot L)}(\mathbb{R}) \tag{22}$$

and

$$\Lambda_{jj} = \lambda_j,\quad j=1,\ldots,N\cdot L \tag{23}$$

$$\Lambda_{ij} = 0,\quad i,j=1,\ldots,N\cdot L,\ i\neq j \tag{24}$$

The latent vectors and latent roots in Eqs. (21) and (23) are arranged in descending order of the latent roots, which are nonnegative because $C_X$ is a variance-covariance matrix, that is,

$$\lambda_i \geq \lambda_{i+1},\quad i=1,\ldots,N\cdot L-1 \tag{25}$$

The latent vector $p_1$, corresponding to the largest latent root $\lambda_1$, is called the first principal component (PC). Likewise, the latent vector $p_2$, corresponding to the second largest latent root $\lambda_2$, is called the second principal component. In general, the latent vector $p_j$, $j=1,\ldots,N\cdot L$, corresponding to the latent root $\lambda_j$, is called the $j$th principal component.

Matrix $T$ in Eq. (7) represents the data projected onto the principal component space and is also known as the score matrix.

When, for the sake of dimensionality reduction, a reduced number of principal components

$$\ell < N\cdot L \tag{26}$$

is considered, a reduced multiway PCA model is assembled:

$$\hat{P} = \left(p_1 \,|\, p_2 \,|\, \cdots \,|\, p_{\ell}\right) \in \mathcal{M}_{(N\cdot L)\times \ell}(\mathbb{R}). \tag{27}$$
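The construction of the reduced model in Eqs. (16) to (27) can be sketched as follows. The function name `mpca_model` and the random toy data are assumptions; the sketch relies on `numpy.linalg.eigh`, which returns the eigenvalues of a symmetric matrix in ascending order, so they are reordered to match Eq. (25).

```python
import numpy as np

def mpca_model(X_scaled, n_components):
    """Return the reduced model P_hat (first n_components eigenvectors) and all eigenvalues."""
    n = X_scaled.shape[0]
    C = X_scaled.T @ X_scaled / (n - 1)          # variance-covariance matrix, Eq. (16)
    eigvals, eigvecs = np.linalg.eigh(C)         # symmetric eigenproblem, ascending order
    order = np.argsort(eigvals)[::-1]            # descending order, Eq. (25)
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs[:, :n_components], eigvals    # P_hat as in Eq. (27)

# Toy mean-centered data standing in for the scaled baseline matrix
rng = np.random.default_rng(1)
X_scaled = rng.normal(size=(30, 8))
X_scaled -= X_scaled.mean(axis=0)
P_hat, eigvals = mpca_model(X_scaled, n_components=3)
T_hat = X_scaled @ P_hat                         # scores on the retained components
```

The columns of `T_hat` are uncorrelated, and their variances equal the three largest latent roots, which is the diagonal structure required of the score matrix in Eq. (7).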

3.3. HT-based condition monitoring

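As stated in Section 3.2, the MPCA model is based only on measures that come from a healthy wind turbine. Subsequently, data from the current WT to diagnose, which is subjected to a different wind turbulence, are gathered from the same sensors as in the modeling phase described in Section 3.2 during a period of $(\nu L-1)\Delta$ seconds, which is not necessarily equal to the period used for the baseline. These new data are arranged in a new matrix $Y$ in a similar way as in Eq. (6):

$$\begin{aligned}
Y &= \left(\begin{array}{cccc|ccc|c|ccc}
y_{11}^1 & y_{12}^1 & \cdots & y_{1L}^1 & y_{11}^2 & \cdots & y_{1L}^2 & \cdots & y_{11}^N & \cdots & y_{1L}^N \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\
y_{i1}^1 & y_{i2}^1 & \cdots & y_{iL}^1 & y_{i1}^2 & \cdots & y_{iL}^2 & \cdots & y_{i1}^N & \cdots & y_{iL}^N \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots & \ddots & \vdots \\
y_{\nu 1}^1 & y_{\nu 2}^1 & \cdots & y_{\nu L}^1 & y_{\nu 1}^2 & \cdots & y_{\nu L}^2 & \cdots & y_{\nu 1}^N & \cdots & y_{\nu L}^N
\end{array}\right) \\
&= \Big(\underbrace{w_1 \,|\, w_2 \,|\, \cdots \,|\, w_L}_{Y^1} \,\Big|\, \underbrace{w_{L+1} \,|\, \cdots \,|\, w_{2L}}_{Y^2} \,\Big|\, \cdots \,\Big|\, \underbrace{w_{(N-1)L+1} \,|\, \cdots \,|\, w_{N\cdot L}}_{Y^N}\Big) \\
&= \left(Y^1 \,|\, Y^2 \,|\, \cdots \,|\, Y^N\right) \in \mathcal{M}_{\nu\times (N\cdot L)}(\mathbb{R})
\end{aligned} \tag{28}$$

It should be noted that $\nu$ (the number of rows of matrix $Y$) does not need to match the natural number $n$, which is the number of rows of matrix $X$ in Eq. (6). However, the number of columns, $N\cdot L$, must agree.

The collected data in matrix $Y$ in Eq. (28) are first centered and scaled to form a matrix $\breve{Y}=(\breve{y}_{ij}^k)$ similar to the one in Eq. (12):

$$\breve{y}_{ij}^k := \frac{y_{ij}^k-\mu_j^k}{\sigma^k},\quad i=1,\ldots,\nu,\ j=1,\ldots,L,\ k=1,\ldots,N, \tag{29}$$

where $\sigma^k$ and $\mu_j^k$ are the values obtained from the variance in Eq. (9) and the arithmetic mean in Eq. (11), respectively, computed with respect to $X$ in Eq. (6). After this preprocessing step, that is, centering and scaling the raw data collected from the current structure to diagnose, the scores related to each row vector

$$r^i = \breve{Y}(i,:) \in \mathbb{R}^{N\cdot L},\quad i=1,\ldots,\nu \tag{30}$$

are computed using a vector-to-matrix product:

$$t^i = r^i\cdot\hat{P} \in \mathbb{R}^{\ell},\quad i=1,\ldots,\nu \tag{31}$$

where matrix $\hat{P}$ is the reduced MPCA model in Eq. (27).

Let us consider the canonical basis

$$\left\{e_1, e_2, \ldots, e_{\ell}\right\} \tag{32}$$

of the $\ell$-dimensional real vector space $\mathbb{R}^{\ell}$.

Given a row vector $r^i$ as in Eq. (30), the real number

$$t_1^i = t^i\cdot e_1 \in \mathbb{R} \tag{33}$$

is called the first score. Likewise, the scalar

$$t_2^i = t^i\cdot e_2 \in \mathbb{R} \tag{34}$$

is called the second score. In general, the scalar

$$t_j^i = t^i\cdot e_j \in \mathbb{R} \tag{35}$$

is called the score associated with the principal component $p_j$, $j=1,\ldots,\ell$, or, simply, score $j$.

In addition, an $s$-dimensional vector can be built if more than one score is considered at the same time. Indeed,

$$t_s^i = \left[t_1^i\ \ t_2^i\ \ \cdots\ \ t_s^i\right]^T \in \mathbb{R}^s,\quad s\leq\ell. \tag{36}$$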
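The projection in Eqs. (29) to (36) can be sketched in a few lines. The function name `scores` and the toy inputs are assumptions chosen so that the result is easy to verify by hand: with zero means, unit standard deviations, and a model made of the first three canonical vectors, the first two scores of each row are simply the first two entries of that row.

```python
import numpy as np

def scores(Y, mu_col, sigma, P_hat, s):
    """Scale Y with the baseline statistics and return the first s scores of every row."""
    N, L = mu_col.shape
    nu = Y.shape[0]
    blocks = Y.reshape(nu, N, L)
    # Baseline means and standard deviations are reused, not recomputed from Y, Eq. (29)
    Y_scaled = ((blocks - mu_col) / sigma[:, None]).reshape(nu, N * L)
    T = Y_scaled @ P_hat          # row i is the score vector t^i, Eq. (31)
    return T[:, :s]               # row i is t_s^i, Eq. (36)

# Toy inputs: N = 2 sensors, L = 2 columns each, nu = 2 rows
mu_col = np.zeros((2, 2))
sigma = np.ones(2)
P_hat = np.eye(4)[:, :3]          # hypothetical reduced model with 3 components
Y = np.arange(8.0).reshape(2, 4)
t_s = scores(Y, mu_col, sigma, P_hat, s=2)
```

Reusing the baseline statistics is the essential design point: a shift in the mean of the current data then survives the preprocessing and can be picked up by the tests on the scores.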

3.3.1. Scores as a random sample

As stated in Section 3.1, the excitation of the WT comes from a changing turbulent wind, which can be viewed as a random signal. Therefore, the response of the WT can also be viewed as a random process, and so can the measurements in the row vector $r^i$ in Eq. (30). As a consequence, the vector $t^i$ inherits this random nature and can be regarded as an $\ell$-dimensional random vector, on which the statistical approach in this chapter is built. As a motivating example, two three-dimensional samples are represented in Figure 4: one is the baseline sample (left) and the other corresponds to faults 1, 4, and 7 (right). In a classic application of PCA in the field of SHM, the scores allow a separation, clustering, or visual grouping [24]. In this case, however, Figure 4 (right) clearly shows that no clustering, visual grouping, or separation can be performed. Therefore, more powerful and reliable tools are needed to detect a fault in the WT.

Figure 4.

Baseline sample (left) and sample from the wind turbine to be diagnosed (right).

In structural health monitoring or condition monitoring applications, the final decision on whether the structure, the actuator, and/or the sensor is healthy should not depend on graphical approaches. One of the most common ways to obtain reliable indicators of damage or faults is statistical hypothesis testing. The strategies of this kind differ in what is the subject of the test and, of course, in how the raw data collected by the sensors are arranged and preprocessed. For instance, in Zugasti et al. [25], damage detection is based on testing for significant changes in the parameter vector of an autoregressive model. A comprehensive three-tier modular structural health monitoring framework is proposed by Hackell et al. [26], where hypothesis testing is used to declare decision boundaries, control charts, and ROC curves with the ultimate goal of distinguishing between healthy and potentially damaged data on an operational wind turbine. A somewhat different approach is presented by Ng et al. [27], which includes a vehicle health monitoring system where several univariate hypothesis tests are considered in parallel. Again in the field of structural health monitoring or condition monitoring of wind turbines, a recent work by Tsiapoki et al. [28] bases damage and ice detection on data normalization, feature extraction, and hypothesis testing (HT).

The use of univariate hypothesis testing as a key element of structural health monitoring or condition monitoring has been increasing in recent years as a reliable method. Variations of these univariate HTs for multiple indicators include the use of univariate HTs in parallel, that is, testing each component of a parameter vector rather than the whole multidimensional parameter vector. The first approach for the detection of structural changes using multivariate hypothesis testing was proposed by Pozo et al. [8]. One of the key results in [8] is that multivariate HTs yield better results in damage or fault detection than univariate tests alone. One interesting example in [8] shows that, for a given level of significance $\alpha$, five independent univariate hypothesis tests

$$H_0: \mu_{c,i} = \mu_{h,i} \quad\text{versus}\quad H_1: \mu_{c,i} \neq \mu_{h,i}, \tag{37}$$

where $i=1,2,\ldots,5$, lead to a wrong decision, while the single multivariate HT

$$H_0: \mu_c = \mu_h \quad\text{versus}\quad H_1: \mu_c \neq \mu_h, \tag{38}$$

where

$$\mu_c^T = \left(\mu_{c,1}, \mu_{c,2}, \ldots, \mu_{c,5}\right),\qquad \mu_h^T = \left(\mu_{h,1}, \mu_{h,2}, \ldots, \mu_{h,5}\right), \tag{39}$$

is able to correctly classify the structure. This example shows that multivariate HT can be more reliable than univariate HT. However, these benefits come at a price: to apply the multivariate HT, the statistical distribution of the data must be multinormal. Of course, it may happen that five sets of 50 samples

$$\left\{x_1^i, x_2^i, \ldots, x_{50}^i\right\} \sim N\left(\mu_i, \sigma_i\right),\quad i=1,2,\ldots,5, \tag{40}$$

are normally distributed, while the multinormality of the sample of vectors

$$\left\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{50}\right\} \sim N\left(\boldsymbol{\mu}, \Sigma\right), \tag{41}$$

where

$$\mathbf{x}_j = \left(x_j^1, x_j^2, \ldots, x_j^5\right)^T,\quad j=1,\ldots,50, \tag{42}$$

and $\Sigma$ is the variance-covariance matrix, does not hold.

3.3.2. Univariate case: testing for the equality of means

In this section, we present how a fault is detected in the WT using univariate HT. To this end, we first have to define what we consider our baseline. Given a principal component $j=1,\ldots,\ell$, the baseline sample is the set of $n$ real numbers $\left\{\tau_j^i\right\}_{i=1,\ldots,n}$ defined by

$$\tau_j^i := \left(X(i,:)\cdot\hat{P}\right)(j) = X(i,:)\cdot\hat{P}\cdot e_j,\quad i=1,\ldots,n, \tag{43}$$

where $e_j$ is the $j$th vector of the canonical basis in Eq. (32), $\hat{P}$ is the MPCA model defined in Eq. (27), and $X$ is the centered and scaled matrix of the data collected from a healthy WT as in Eq. (6). Similarly, given a principal component $j=1,\ldots,\ell$, the sample of the current WT to diagnose is defined as the set of $\nu$ real numbers

$$\left\{t_j^i\right\}_{i=1,\ldots,\nu} \tag{44}$$

as defined in Eq. (35).

Before the univariate HT is applied, the following assumptions must be made:

  1. the baseline sample $\left\{\tau_j^i\right\}_{i=1,\ldots,n}$ is a random sample of a random variable (RV) normally distributed with unknown mean $\mu_X$ and unknown variance $\sigma_X^2$ and

  2. the random sample $\left\{t_j^i\right\}_{i=1,\ldots,\nu}$ in Eq. (44) of the current WT to diagnose follows a normal distribution with unknown mean $\mu_Y$ and unknown variance $\sigma_Y^2$.

It is worth mentioning that the variances of these two samples are not assumed to be equal.

Let us define

$$\delta_\mu = \mu_X-\mu_Y \tag{45}$$

as the difference between these two mean values. Since we want to know whether the distributions of these two samples are related, this leads to a test of the hypothesis

$$H_0: \delta_\mu = 0 \quad\text{versus} \tag{46}$$

$$H_1: \delta_\mu \neq 0 \tag{47}$$

where the null hypothesis $H_0$ is “the sample of the WT to be diagnosed is distributed as the baseline sample” and the alternative hypothesis $H_1$ is “the sample of the WT to be diagnosed is not distributed as the baseline sample.” In other words, if the result of the test is that $H_0$ is accepted, the current WT is categorized as healthy. Otherwise, if $H_0$ is rejected in favor of $H_1$, this indicates the presence of some fault in the WT.

Given the assumptions of normality and considering that the two variances are not necessarily equal, the test for the equality of means is based on the so-called Welch-Satterthwaite method [29], which is outlined below for the sake of completeness. If two random samples of size $n$ and $\nu$, respectively, are taken from two normal distributions $N(\mu_X,\sigma_X)$ and $N(\mu_Y,\sigma_Y)$ and the population variances are unknown and not necessarily equal, the random variable

$$W = \frac{\left(\bar{X}-\bar{Y}\right)-\left(\mu_X-\mu_Y\right)}{\sqrt{\dfrac{S_X^2}{n}+\dfrac{S_Y^2}{\nu}}} \tag{48}$$

can be approximated with a $t$-distribution with $\rho$ degrees of freedom (DOF), that is

$$W \hookrightarrow t_\rho \tag{49}$$

where

$$\rho = \left\lfloor \frac{\left(\dfrac{s_X^2}{n}+\dfrac{s_Y^2}{\nu}\right)^2}{\dfrac{\left(s_X^2/n\right)^2}{n-1}+\dfrac{\left(s_Y^2/\nu\right)^2}{\nu-1}} \right\rfloor, \tag{50}$$

$S^2$ is the sample variance as a random variable, $s^2$ is the variance of a particular sample, $\bar{X}$ and $\bar{Y}$ are the sample means as random variables, and $\lfloor\cdot\rfloor$ is the standard floor function.

The value of the test statistic under $H_0$, using the Welch-Satterthwaite method, is defined as

$$t_{\mathrm{obs}} = \frac{\bar{x}-\bar{y}}{\sqrt{\dfrac{s_X^2}{n}+\dfrac{s_Y^2}{\nu}}} \tag{51}$$

where $\bar{x}$ and $\bar{y}$ are the means of the particular samples. The quantity $t_{\mathrm{obs}}$ is the fault indicator. We can then construct the following test:

$$\left|t_{\mathrm{obs}}\right| \leq t^\star \implies \text{Accept } H_0 \tag{52}$$

$$\left|t_{\mathrm{obs}}\right| > t^\star \implies \text{Accept } H_1 \tag{53}$$

where $t^\star$ is such that

$$\mathbb{P}\left(t_\rho > t^\star\right) = \frac{\alpha}{2}, \tag{54}$$

where $\alpha$ is the level of significance for the test. To sum up,

  1. $H_0$ is rejected if $\left|t_{\mathrm{obs}}\right| > t^\star$ (the WT is classified as not healthy) and

  2. $H_0$ is accepted if $\left|t_{\mathrm{obs}}\right| \leq t^\star$ (the WT is classified as healthy).
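The univariate decision rule can be sketched with the Python standard library alone. The function names and the two toy samples below are assumptions; the critical value $t^\star$ is passed in as an argument, since it has to be read from a $t$-table or computed with a statistics package for the resulting $\rho$ and the chosen $\alpha$.

```python
import math
from statistics import mean, variance

def welch_statistic(baseline, current):
    """Return the fault indicator t_obs and the degrees of freedom rho."""
    n, nu = len(baseline), len(current)
    vx = variance(baseline) / n    # s_X^2 / n, with the sample variance (n - 1 in the denominator)
    vy = variance(current) / nu    # s_Y^2 / nu
    t_obs = (mean(baseline) - mean(current)) / math.sqrt(vx + vy)
    rho = math.floor((vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (nu - 1)))
    return t_obs, rho

def is_healthy(t_obs, t_star):
    """Accept H0 (healthy) when |t_obs| <= t_star, reject it otherwise."""
    return abs(t_obs) <= t_star

# Toy samples standing in for the baseline scores and the scores of the WT to diagnose
baseline = [1.0, 2.0, 3.0, 4.0, 5.0]
current = [2.0, 4.0, 6.0, 8.0, 10.0]
t_obs, rho = welch_statistic(baseline, current)
# For rho = 5 and alpha = 0.10, the tabulated two-sided critical value is 2.015
healthy = is_healthy(t_obs, t_star=2.015)
```

For these toy samples, $t_{\mathrm{obs}} \approx -1.897$ and $\rho = 5$, so $H_0$ is accepted at $\alpha = 10\%$ even though the sample means differ, which illustrates how the sample variances enter the decision.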

3.3.3. Multivariate case: testing a multivariate mean vector

In Section 3.3.2, a test for the equality of means is performed for each principal component $j=1,\ldots,\ell$. This means that, for a single sample of the current structure to diagnose, we obtain $\ell$ decisions on whether the structure is healthy. In the present section, more than one principal component is considered jointly, thus defining a vector. Therefore, a test for the plausibility of a value for a normal population mean vector is performed.

As in Section 3.3.2, the objective of this work is to determine whether the distribution of the multivariate random samples that are obtained from the WT to be diagnosed (healthy or not) is connected to the distribution of the baseline.

Let us define $s$ as the number of PCs that are considered at the same time. Before the multivariate HT is applied, the following assumptions must be made:

  1. the baseline projection is a multivariate random sample (MRS) of a multivariate random variable (MRV) following a multivariate normal distribution (MVND) with known population mean vector $\mu_h \in \mathbb{R}^s$ and known variance-covariance matrix $\Sigma \in \mathcal{M}_{s\times s}(\mathbb{R})$ and

  2. the multivariate random sample of the WT to be diagnosed also follows an MVND with unknown multivariate mean vector $\mu_c \in \mathbb{R}^s$ and known variance-covariance matrix $\Sigma \in \mathcal{M}_{s\times s}(\mathbb{R})$.

In this case, contrary to what we assumed in Section 3.3.2, both multivariate random variables have the same known variance-covariance matrix.

As in Section 3.3.2, the question that arises here is whether a given $s$-dimensional vector $\mu_c$ is a reasonable value for the mean of an MVND $N_s(\mu_h,\Sigma)$. This leads to the following test of the hypothesis

$$H_0: \mu_c = \mu_h \quad\text{versus}\quad H_1: \mu_c \neq \mu_h, \tag{55}$$

where $H_0$ is “the MRS of the WT to be diagnosed is distributed as the baseline projection” and $H_1$ is “the MRS of the WT to be diagnosed is not distributed as the baseline projection.” In other words, if the result of the test is that $H_0$ is accepted, the current WT is categorized as healthy. Otherwise, if $H_0$ is rejected in favor of $H_1$, this indicates the presence of some fault in the WT.

In this case, the multivariate test is based on Hotelling’s $T^2$ statistic, which is outlined below. When an MRS of size $\upsilon$ is taken from an MVND $N_s(\mu_h,\Sigma)$, the RV

$$T^2 = \upsilon\left(\bar{X}-\mu_h\right)^T S^{-1}\left(\bar{X}-\mu_h\right) \tag{56}$$

is distributed as

$$T^2 \hookrightarrow \frac{(\upsilon-1)s}{\upsilon-s}F_{s,\upsilon-s}, \tag{57}$$

where $F_{s,\upsilon-s}$ denotes an RV with an $F$-distribution with $s$ and $\upsilon-s$ DOF, $\bar{X}$ is the sample mean vector as an MRV, and $\frac{1}{\upsilon}S \in \mathcal{M}_{s\times s}(\mathbb{R})$ is the estimated variance-covariance matrix of $\bar{X}$.

The value of the test statistic is defined as

$$t_{\mathrm{obs}}^2 = \upsilon\left(\bar{x}-\mu_h\right)^T S^{-1}\left(\bar{x}-\mu_h\right), \tag{58}$$

and is the fault indicator. We can then construct the following test:

$$t_{\mathrm{obs}}^2 \leq \frac{(\upsilon-1)s}{\upsilon-s}F_{s,\upsilon-s}(\alpha) \implies \text{Accept } H_0, \tag{59}$$

$$t_{\mathrm{obs}}^2 > \frac{(\upsilon-1)s}{\upsilon-s}F_{s,\upsilon-s}(\alpha) \implies \text{Accept } H_1, \tag{60}$$

where $F_{s,\upsilon-s}(\alpha)$ is the upper $(100\alpha)$th percentile of the $F_{s,\upsilon-s}$ distribution, that is,

$$\mathbb{P}\left(F_{s,\upsilon-s} > F_{s,\upsilon-s}(\alpha)\right) = \alpha, \tag{61}$$

where $\mathbb{P}$ is a probability measure and $\alpha$ is the level of significance for the test. To sum up,

  1. $H_0$ is rejected if $t_{\mathrm{obs}}^2 > \frac{(\upsilon-1)s}{\upsilon-s}F_{s,\upsilon-s}(\alpha)$ (the WT is classified as not healthy) and

  2. $H_0$ is accepted if $t_{\mathrm{obs}}^2 \leq \frac{(\upsilon-1)s}{\upsilon-s}F_{s,\upsilon-s}(\alpha)$ (the WT is classified as healthy).
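A minimal NumPy sketch of the multivariate fault indicator and of its threshold follows. The function names and the four-point toy sample are assumptions chosen so that the result can be checked by hand; the $F$ percentile is passed in as an argument (if SciPy is available, `scipy.stats.f.ppf(1 - alpha, s, upsilon - s)` is one way to obtain it).

```python
import numpy as np

def hotelling_t2(sample, mu_h):
    """Return t_obs^2 = upsilon * (xbar - mu_h)^T S^{-1} (xbar - mu_h) for a (upsilon x s) sample."""
    sample = np.asarray(sample, dtype=float)
    upsilon = sample.shape[0]
    diff = sample.mean(axis=0) - np.asarray(mu_h, dtype=float)
    S = np.atleast_2d(np.cov(sample, rowvar=False))   # sample variance-covariance matrix
    return float(upsilon * diff @ np.linalg.solve(S, diff))

def t2_threshold(upsilon, s, f_crit):
    """Scale the upper-alpha percentile of F(s, upsilon - s) into the T^2 threshold."""
    return (upsilon - 1) * s / (upsilon - s) * f_crit

# Toy sample of upsilon = 4 score vectors with s = 2 components and a hypothetical baseline mean
sample = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
t2_obs = hotelling_t2(sample, mu_h=[1.0, 0.0])
```

Here the sample mean is the origin and $S = \tfrac{2}{3}I$, so $t_{\mathrm{obs}}^2 = 4 \cdot 1.5 = 6$. The WT would be classified as healthy whenever this value does not exceed `t2_threshold(upsilon, s, f_crit)`.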

4. Simulation results

The results of the CM strategies presented in Sections 3.3.2 and 3.3.3 are organized into three subsections. The absolute number of samples that are correctly identified and the absolute number of false alarms and missing faults are included in Section 4.1. Sections 4.2 and 4.3 show the results not as absolute values but as percentages. More precisely, the sensitivity and the specificity are both included in Section 4.2, together with the false-negative rate (FNR) and the false-positive rate (FPR). In addition, the true rate of both false negatives and false positives is given in Section 4.3.

For the validation of the CM strategies presented in Sections 3.3.2 and 3.3.3, 24 samples of $\nu=50$ elements each have been examined, organized as follows:

• 8 samples of a faulty WT (one sample for each one of the different fault scenarios described in Table 3) and

• 16 samples of a healthy WT.

All samples are acquired under different wind data sets, generated with TurbSim [14] with a turbulence intensity of 10%. These wind data sets have the following features:

  1. Kaimal turbulence model,

  2. logarithmic profile wind type,

  3. mean speed of 18.2 m/s simulated at hub height, and

  4. a roughness factor of 0.01 m.

Each sample of $\nu = 50$ elements comes from the measures collected during $(\nu L - 1)\Delta = 312.4875$ seconds. The values for these parameters are listed in Table 4.
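As a quick check of this duration, the values listed in Table 4 can be substituted directly (assuming, as the result in seconds implies, that the sampling time $\Delta$ is expressed in seconds):

```latex
(\nu L - 1)\,\Delta = (50 \cdot 500 - 1) \cdot 0.0125\ \text{s}
                    = 24\,999 \cdot 0.0125\ \text{s}
                    = 312.4875\ \text{s}.
```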

We present, in Sections 4.1, 4.2, and 4.3, the results when the collected data are projected into:

  1. the first principal component,

  2. the second principal component,

  3. the third principal component,

  4. the first and the second principal components, jointly,

  5. the first seven principal components, jointly, and

  6. the first twelve principal components, jointly.

In the three univariate cases (items 1–3), we use the test for the equality of means, while in the three multivariate cases (items 4–6), we use the test for the plausibility of a value for a normal population. In both cases, the chosen level of significance is $\alpha = 10\%$.

4.1. Types I and II errors

In this section, each of the 24 samples falls into one of the following four categories:

  1. sample from the healthy WT (healthy sample) classified by the hypothesis test as “healthy” (accept $H_0$) [right decision],

  2. sample from the faulty WT (faulty sample) classified by the test as “faulty” (accept $H_1$) [right decision],

  3. faulty sample classified as “healthy” [wrong decision/missing fault/type II error], and

  4. healthy sample classified as “faulty” [wrong decision/false alarm/type I error].

The results are displayed in Table 6, arranged according to the scheme in Table 5.
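The four categories above amount to a 2×2 tally of (decision, true condition) pairs. The following minimal sketch shows one way to build that tally; the function name `confusion_counts` and the string labels are hypothetical, and the example labels are constructed only to reproduce the "Score 1" counts reported in Table 6.

```python
from collections import Counter


def confusion_counts(truth, decision):
    """Tally samples into the four cells of the scheme in Table 5.

    truth[i] is "H0" (healthy sample) or "H1" (faulty sample);
    decision[i] is the hypothesis accepted by the test for that sample.
    """
    counts = Counter(zip(decision, truth))
    return {
        "correct_healthy": counts[("H0", "H0")],        # accept H0, healthy sample
        "type_II_missing_fault": counts[("H0", "H1")],  # accept H0, faulty sample
        "type_I_false_alarm": counts[("H1", "H0")],     # accept H1, healthy sample
        "correct_faulty": counts[("H1", "H1")],         # accept H1, faulty sample
    }


# 16 healthy and 8 faulty samples; decisions chosen to match "Score 1" in Table 6.
truth = ["H0"] * 16 + ["H1"] * 8
decision = ["H0"] * 16 + ["H0"] * 1 + ["H1"] * 7
score1 = confusion_counts(truth, decision)
```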

| Parameter | Symbol | Magnitude |
|---|---|---|
| Number of rows | $\nu$ | 50 |
| Number of columns | $L$ | 500 |
| Sampling time | $\Delta$ | 0.0125 s |
| Number of sensors | $N$ | 13 |

Table 4.

The collected measures are arranged in a $\nu \times NL$ matrix $Y$ as in Eq. (28).

| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | Correct decision | Type II error (missing fault) |
| Accept $H_1$ | Type I error (false alarm) | Correct decision |

Table 5.

Scheme for the presentation of the results in Table 6.

| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 16 | 1 | Accept $H_0$ | 12 | 0 |
| Accept $H_1$ | 0 | 7 | Accept $H_1$ | 4 | 8 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 13 | 7 | Accept $H_0$ | 13 | 0 |
| Accept $H_1$ | 3 | 1 | Accept $H_1$ | 3 | 8 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 16 | 8 | Accept $H_0$ | 16 | 0 |
| Accept $H_1$ | 0 | 0 | Accept $H_1$ | 0 | 8 |

Table 6.

Categorization of the samples with respect to the presence or absence of a fault and the result of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is $\nu = 50$ and the level of significance is $\alpha = 10\%$.

| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | Specificity ($1 - \alpha$) | False-negative rate ($\gamma$) |
| Accept $H_1$ | False-positive rate ($\alpha$) | Sensitivity ($1 - \gamma$) |

Table 7.

Relationship between specificity and sensitivity.

| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 1.00 | 0.12 | Accept $H_0$ | 0.75 | 0.00 |
| Accept $H_1$ | 0.00 | 0.88 | Accept $H_1$ | 0.25 | 1.00 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 0.81 | 0.88 | Accept $H_0$ | 0.81 | 0.00 |
| Accept $H_1$ | 0.19 | 0.12 | Accept $H_1$ | 0.19 | 1.00 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 1.00 | 1.00 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 0.00 | Accept $H_1$ | 0.00 | 1.00 |

Table 8.

Sensitivity and specificity of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is $\nu = 50$ and the level of significance is $\alpha = 10\%$.

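| | Healthy sample ($H_0$) | Faulty sample ($H_1$) |
|---|---|---|
| Accept $H_0$ | $\mathbb{P}(H_0 \mid \text{accept } H_0)$ | True rate of false negatives $\mathbb{P}(H_1 \mid \text{accept } H_0)$ |
| Accept $H_1$ | True rate of false positives $\mathbb{P}(H_0 \mid \text{accept } H_1)$ | $\mathbb{P}(H_1 \mid \text{accept } H_1)$ |

Table 9.

Relationship between the proportion of false negatives and false positives.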

| | $H_0$ | $H_1$ | | $H_0$ | $H_1$ |
|---|---|---|---|---|---|
| Score 1 | | | Scores 1–2 | | |
| Accept $H_0$ | 0.94 | 0.06 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 1.00 | Accept $H_1$ | 0.33 | 0.67 |
| Score 2 | | | Scores 1–7 | | |
| Accept $H_0$ | 0.65 | 0.35 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.75 | 0.25 | Accept $H_1$ | 0.27 | 0.73 |
| Score 3 | | | Scores 1–12 | | |
| Accept $H_0$ | 0.67 | 0.33 | Accept $H_0$ | 1.00 | 0.00 |
| Accept $H_1$ | 0.00 | 0.00 | Accept $H_1$ | 0.00 | 1.00 |

Table 10.

True rate of false negatives and true rate of false positives of the test considering the first score, the second score, and the third score (left) and scores 1–2 (jointly), scores 1–7 (jointly), and scores 1–12 (jointly) (right), when the size of the samples to diagnose is $\nu = 50$ and the level of significance is $\alpha = 10\%$.

4.2. Sensitivity and specificity

As in [20, 30], two more statistical indicators are analyzed to assess the efficiency of the test. On the one hand, the specificity of the test is defined as the fraction of samples from the healthy WT that are correctly classified as healthy. On the other hand, the sensitivity (or the power of the test) is defined as the fraction of samples from the faulty WT that are correctly classified as faulty.

The sensitivity and specificity of both the univariate HT and the multivariate case with respect to the 24 samples are displayed in Table 8, arranged according to the scheme in Table 7.
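The definitions above translate directly into ratios of the counts in Table 6. The sketch below is illustrative only; the function name `rates` and its argument names are hypothetical (`tn` and `fp` are the healthy samples for which $H_0$ and $H_1$ were accepted, and `fn` and `tp` are the faulty samples for which $H_0$ and $H_1$ were accepted).

```python
def rates(tn, fn, fp, tp):
    """Column-wise rates of the scheme in Table 7, computed from absolute counts."""
    healthy = tn + fp  # all samples from the healthy WT
    faulty = fn + tp   # all samples from the faulty WT
    return {
        "specificity": tn / healthy,
        "false_positive_rate": fp / healthy,
        "sensitivity": tp / faulty,
        "false_negative_rate": fn / faulty,
        "accuracy": (tn + tp) / (healthy + faulty),
    }


# Counts taken from Table 6.
score2 = rates(tn=13, fn=7, fp=3, tp=1)        # second score alone
scores_1_12 = rates(tn=16, fn=0, fp=0, tp=8)   # first twelve scores, jointly
```

For the second score this gives a specificity of 13/16 = 0.8125 and a sensitivity of 1/8 = 0.125, which Table 8 reports rounded as 0.81 and 0.12.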

4.3. Reliability of the results

Finally, the true rate of false negatives and the true rate of false positives can be used to assess the performance of the proposed CM strategy. These two measures, closely related to Bayes’ theorem [31], are described in Table 9. On the one hand, the true rate of false negatives is the fraction of samples classified as healthy that actually come from the faulty WT. On the other hand, the true rate of false positives is the fraction of samples classified as faulty that actually come from the healthy WT.

The true rate of false negatives and the true rate of false positives of both the univariate HT and the multivariate case are displayed in Table 10, arranged according to the scheme in Table 9.
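Unlike sensitivity and specificity, these two rates condition on the decision of the test (the rows of Table 6) rather than on the true state of the WT (the columns). A minimal sketch, with the hypothetical function name `true_rates` and the same hypothetical argument names as before, is given below. When no sample falls in a row, the ratio is undefined; the sketch returns 0.0 in that case, which is an assumption chosen to match the 0.00 entries that Table 10 shows for the third score.

```python
def true_rates(tn, fn, fp, tp):
    """Row-wise rates of the scheme in Table 9, computed from absolute counts."""
    accepted_h0 = tn + fn  # samples classified as healthy
    accepted_h1 = fp + tp  # samples classified as faulty

    def ratio(part, whole):
        # Assumption: report 0.0 when the row is empty (ratio undefined).
        return part / whole if whole else 0.0

    return {
        "true_rate_false_negatives": ratio(fn, accepted_h0),  # P(H1 | accept H0)
        "true_rate_false_positives": ratio(fp, accepted_h1),  # P(H0 | accept H1)
    }


# Counts taken from Table 6.
score1 = true_rates(tn=16, fn=1, fp=0, tp=7)
scores_1_2 = true_rates(tn=12, fn=0, fp=4, tp=8)
score3 = true_rates(tn=16, fn=8, fp=0, tp=0)
```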

5. Concluding remarks

A multifault detection method based on MPCA through uni- and multivariate hypothesis testing has been presented in this chapter. The method has been evaluated on eight realistic faults in different components of the WT, and it needs no extra sensors, since it uses only data already available from the WT SCADA system.

The three main conclusions, which show the benefits of multivariate statistical hypothesis testing over the univariate case for WT condition monitoring, are the following:

  1. Given a level of significance $\alpha = 10\%$, when the first 12 scores are considered jointly, an accuracy of 100% is obtained, while in all the other studied cases, misclassifications are present.

  2. Multivariate analysis leads to average values of 100% for the sensitivity and 85.33% for the specificity, while for the univariate case, the average values are 33.33% and 93.67%, respectively.

  3. Multivariate analysis leads to an average true rate of false negatives of 0% and an average true rate of false positives of 20%, while for the univariate case, the average values are 24.67% and 25%, respectively.
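The averages quoted in conclusions 2 and 3 can be reproduced from the rounded entries of Tables 8 and 10. The short check below is a sketch for verification only; the variable and function names are hypothetical, and the numbers are transcribed from the tables in the order (score 1, score 2, score 3) for the univariate case and (scores 1–2, scores 1–7, scores 1–12) for the multivariate case.

```python
from statistics import fmean


def average_percent(values):
    """Mean of a list of rates, expressed as a percentage rounded to two decimals."""
    return round(100 * fmean(values), 2)


# Table 8 (sensitivity and specificity).
uni_sensitivity = [0.88, 0.12, 0.00]
uni_specificity = [1.00, 0.81, 1.00]
multi_sensitivity = [1.00, 1.00, 1.00]
multi_specificity = [0.75, 0.81, 1.00]

# Table 10 (true rates of false negatives and false positives).
uni_true_fn = [0.06, 0.35, 0.33]
uni_true_fp = [0.00, 0.75, 0.00]
multi_true_fn = [0.00, 0.00, 0.00]
multi_true_fp = [0.33, 0.27, 0.00]

summary = {
    "multivariate": [average_percent(v) for v in
                     (multi_sensitivity, multi_specificity, multi_true_fn, multi_true_fp)],
    "univariate": [average_percent(v) for v in
                   (uni_sensitivity, uni_specificity, uni_true_fn, uni_true_fp)],
}
```

Each list in `summary` holds, in order, the average sensitivity, specificity, true rate of false negatives, and true rate of false positives, all in percent.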

Acknowledgments

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness through the research projects DPI2014-58427-C2-1-R and DPI2017-82930-C2-1-R, and by the Generalitat de Catalunya through the research project 2017 SGR 388.

Abbreviations

DOF: degrees of freedom
CM: condition monitoring
FAST: fatigue, aerodynamics, structures, and turbulence
FD: fault detection
FNR: false-negative rate
FPR: false-positive rate
GS: group scaling
HT: hypothesis testing
MCGS: mean-centered group scaling
MPCA: multiway principal component analysis
MRS: multivariate random sample
MRV: multivariate random variable
MVND: multivariate normal distribution
O&M: operation and maintenance
PCA: principal component analysis
RV: random variable
SCADA: supervisory control and data acquisition
SHM: structural health monitoring
WT: wind turbine

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Francesc Pozo and Yolanda Vidal (September 26th 2018). Condition Monitoring of Wind Turbine Structures through Univariate and Multivariate Hypothesis Testing, Structural Health Monitoring from Sensing to Processing, Magd Abdel Wahab, Yun Lai Zhou and Nuno Manuel Mendes Maia, IntechOpen, DOI: 10.5772/intechopen.78727. Available from:
