Preventive Maintenance and Fault Detection for Wind Turbine Generators Using a Statistical Model

Vigilant fault diagnosis and preventive maintenance have the potential to significantly decrease the costs associated with wind generators. As wind energy technology continues to advance and worldwide adoption grows, the application of fault diagnosis techniques will become more imperative. Fault diagnosis and preventive maintenance techniques for wind turbine generators are still at an early stage compared to the mature strategies used for generators in conventional power plants. The cost of wind energy can be further reduced if failures are predicted before they develop into a major structural failure, which leads to less unplanned maintenance. The high maintenance cost of wind turbines means that predictive strategies like fault diagnosis and preventive maintenance techniques are necessary to manage the life cycle costs of critical components. Squirrel-Cage Induction Generators (SCIG) are the prevailing generator type and are more robust and cheaper to manufacture than other generator types used in wind turbines. A statistical model was developed using SCADA data to estimate the relationships between winding temperatures and other variables. Predicting faults in stator windings is challenging because the unhealthy condition rapidly evolves into a functional failure.


Introduction
Wind energy has evolved into a mature, cost effective and sustainable power technology. Wind turbines continue to grow in size and new topologies allow for better integration into electricity grids. Power electronics development has provided the functionality of variable speed operation, which is more energy efficient. A wind turbine typically comprises 8000 parts or more, with the blades, rotor, main bearing, drivetrain and power module being its major components. Figure 1 depicts the typical components of a wind turbine. A major component of the power module is the electrical generator. Squirrel-Cage Induction Generators (SCIG) are currently the most common electrical generator type used in wind turbines, because they are robust and cheaper to manufacture than other generator types. Because a wind turbine is a complex power system, it is important to understand how failures occur despite the technology's current level of maturity. High reliability and availability are expected over a typical 20-year design life.

Wind energy evolution
Wind energy adoption has grown continuously year on year. The global installed wind energy capacity is illustrated in Figure 2.
The operating principle of all wind turbines makes use of either aerodynamic lift or aerodynamic drag forces. Aerodynamic lift forces are perpendicular to the direction of the wind whereas drag forces act in the same direction as the wind. Modern day wind turbines are mainly designed to use aerodynamic lift forces where the rotor blades are turned into the direction of the wind. The perpendicular lift force produces the required driving torque via the leverage of the rotor. Only wind turbines operating on aerodynamic lift are discussed here; these are classified according to the direction of the rotating axis, i.e. horizontal axis wind turbines (HAWTs).

Wind energy cost
Wind energy has reached commercial maturity remarkably fast and its cost has dropped so significantly that it is now cost competitive with coal power generation [2]. For any power generation technology, the cost of production is variable and influenced by technology maturity, operating conditions, location and the capacity rating of the plant [3]. The levelised cost of energy (LCOE) for wind energy is affected mainly by the following factors [4]:
• Operation and Maintenance (O&M) costs;
• Annual energy production (AEP);
• Capital costs;
• Financing costs.
Figure 3 indicates the capital cost breakdown of a typical onshore wind turbine installation and it is evident that the major costs are related to the turbine itself. LCOE can be reduced if wind turbine manufacturers enhance turbine technology so that a variety of designs are available for different wind resource conditions. This can be achieved through larger rotors, improved blade aerodynamics and taller towers [4].
The capacity factor (CF) is the ratio of the energy a wind turbine actually produces over a given period (normally a year) to the energy it would have produced had it operated at rated or name plate capacity for that whole period. Capacity factors for onshore wind turbines fall in the range between 30 and 35% [4]. This figure varies considerably depending on turbine design and the local wind resource. In conventional power generation technologies the AEP is generally proportional to the generator size. However, in a wind turbine the rotor swept area can have a bigger influence than the generator size on the power generation capability [5].
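As a minimal sketch of the capacity factor calculation, the Python snippet below divides the annual energy production by the energy that would be produced at name plate capacity for every hour of the year. The function name and the annual energy figure of 6500 MWh are hypothetical and chosen only for illustration. Attributing the same annual energy to both generator ratings is a simplifying assumption.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def capacity_factor(aep_mwh, rated_mw, hours=HOURS_PER_YEAR):
    """Ratio of actual energy produced to energy at rated capacity."""
    return aep_mwh / (rated_mw * hours)


# Hypothetical example: the same rotor delivering 6500 MWh per year,
# coupled to a 2.3 MW and to a 3.0 MW generator (illustrative values only).
for rated_mw in (2.3, 3.0):
    cf = capacity_factor(6500.0, rated_mw)
    print(f"{rated_mw} MW generator: CF = {cf:.1%}")
```

Under this simplification the smaller generator rating yields the higher capacity factor, because the same energy is divided by a smaller name plate capacity.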
Therefore the relationship between the rotor swept area and generator size can influence the capacity factors of wind turbines. In other words, a wind turbine with a specific rotor swept area connected to two generators of different sizes will have different capacity factors. The smaller generator will operate at a higher capacity factor than the bigger generator under the same wind conditions. Wind turbine manufacturers should therefore optimise this relationship for specific site conditions and grid integration requirements to ensure the lowest possible costs. O&M costs of wind turbines vary over the lifespan of the plant and escalate with age as the risk of equipment failure increases. The O&M costs of wind turbines have reduced considerably over the last 30 years and account for between 20 and 30% of the total life cycle costs for onshore projects [2]. O&M costs for offshore wind projects are higher because of the severe operating conditions at sea, access to site, complex maintenance tasks and transmission infrastructure costs. The costs for onshore wind projects are approximately USD 30-60/MWh versus USD 71-155/MWh for offshore projects [6].

Speed characteristics of wind turbines
Wind turbines can rotate at a fixed speed, where the optimum energy conversion takes place at a specific wind speed, or at variable speed, which requires a more complicated electrical design [7] but is efficient over a range of wind speeds. The fixed speed of the wind turbine technology depends on the gearbox ratio, the frequency of the grid and the electrical generator design characteristics [8]. From 1980 to the early 1990s all wind turbines used for large scale power generation were fixed speed and used gearboxes.
Fixed speed wind turbines are rugged and cost effective to build but experience higher power fluctuations as a result of the constant generator speed against varying wind speeds [7]. These turbines unfortunately draw large amounts of reactive power from the grid, which is compensated for by installing power factor correction capacitors. The disadvantage of power factor correction capacitors is that they introduce power quality problems like harmonic resonance on the grid [8].
Variable speed wind turbines are designed to reduce mechanical stresses, maximise wind energy capture and provide smoother output power, which is more suited to the grid. This technology became popular in the 1990s, at the same time as advances in power electronics, reactive power control, variable speed induction generators and synchronous generator systems [9].
By connecting the electrical generator via a power electronics system to the grid, the wind turbine speed can be adjusted. Harmonic currents from the power electronics systems in variable speed wind turbines also cause power quality problems. Associated transient voltage peaks of 100 times more than the expected values between windings cause insulation damage of windings and ultimately failure of the machine [10].
For a certain wind resource with specific Weibull distribution parameters, it was shown that the additional annual energy captured by a variable speed turbine was 2.3% more than that of a similarly rated fixed speed turbine. The additional costs of a variable speed wind turbine compared to a fixed speed wind turbine of the same rating at a given location are offset by its ability to capture more energy from the wind [10].
Although the difference might appear small, the additional energy generated over the life cycle of the wind turbine, which is typically 20 years, can deliver substantial generation profit.
Power regulation is normally done by pitching the rotor blades, stall control or a combination of the two in order to avoid overloading the wind turbine. The aerodynamic forces acting on the rotor and the output power of the turbine are reduced during high wind speeds. Variable speed wind turbines in conjunction with dynamic blade pitch for power and load control are considered the accepted industry standard for most modern wind turbines.

Wind turbine classes
Factors such as the average yearly wind speed magnitude, wind turbulence and severe gust speeds determine whether a wind turbine design is suited for safe operation at a particular site. The International Electrotechnical Commission (IEC) standard IEC 61400-1 stipulates the different wind turbine classes based on aerodynamic loading [11]. Wind turbines classified as low wind "Class IV" i.e. S111 according to IEC 61400-1 are now becoming feasible to enter the power generation market [11].
Wind classes I, II and III can be equated to high, medium and low wind sites in general.
Locations with low wind resources are suited for wind turbines designed with bigger rotors and higher towers to balance energy conversion and costs. These wind turbine types are largely coupled to smaller drivetrain and power generating units to increase their effectiveness in these less promising wind conditions. Medium and low wind turbines have become more popular than high wind turbines, with Asia leading the international market.

Wind turbine generator types
The electrical generator in the wind turbine converts the mechanical energy from the turbine rotor into electrical energy which is supplied to the grid. In conventional power systems where synchronous generators are used, power is produced at constant speed. Applying these generating systems in wind energy is a challenge because of the variable nature of the resource.
Induction generators, also known as asynchronous generators because they do not rotate at a fixed speed, are the most commonly used electrical generators in WECs today. The application of induction generators in the power industry is limited compared to induction motors, which are seen as the workhorses in power systems, consuming approximately 33% of global generated electricity. There are several advantages that make induction generators suitable for wind energy technologies, as mentioned by Das et al. [12].
Induction generators are classified according to their rotor structure, namely squirrel cage and wound rotor types. The stator designs of both induction machines are the same. The term power converter in the following paragraphs refers to all power electronic systems such as soft-starters, inverters, rectifiers or frequency converters.

The squirrel cage induction generator
SCIGs are used in fixed speed or variable speed wind turbine concepts. In the fixed speed concept the SCIG stator is connected to the grid via a power transformer and a power converter is used to reduce the inrush current. The function of the capacitor bank is to reduce the reactive power consumption and support the generator voltage. This configuration is also known as the Danish concept and the first generation was directly connected to the grid without any power converters. Technology developments and the subsequent reduction in power electronics costs have been the main drivers for the use of SCIGs in variable speed wind turbines. Here the generator is connected to the grid via a full rated converter, which controls the stator current instead. This configuration has full control of real and reactive power and operates across the full speed range.
The generator is more compact and lighter compared to other full converter designs. This type of configuration is predominantly used by Siemens Wind Power, which has a 4.1% global market share [11]. According to [11], North America has an installed capacity of 1.5 GW and the rest of the world 0.98 GW, excluding European and Asian markets.
The power quality of SCIGs at low and high wind speeds is better than that of wound rotor induction generators (WRIGs), while the latter produce fewer harmonics near synchronous speed [76]. Other attributes which make SCIGs desirable over WRIGs are:
• Better grid stability because of the larger converter;
• No brush or slip ring maintenance, as well as reduced losses;
• Robust rotors which can provide better electrical and mechanical performance;
• Cost effectiveness and ready availability.
The converter in this configuration needs to be sized to the full capacity of the generator, which makes it very expensive. The harmonic filters are also rated at full converter capacity which is costly and difficult to design [13]. The performance of the converter has to be very good over the entire power range to ensure optimum efficiency and generation capacity.

Synchronous generators
Synchronous generators are a mature technology in fossil fuel and nuclear power systems and produce grid power at constant speed. Their robustness and ability to control grid voltage by adjusting the rotor excitation make them ideal for power systems. This is particularly important during grid problems like faults, where the generator has to remain connected to the grid and support the grid voltage through reactive power control. Because of these attributes synchronous generators are now being used in WECs, and their rotors can be separately excited or make use of permanent magnets [13].
For a synchronous generator the absence of slip rings, gearbox and external excitation reduces the overall losses, while the full rated power converter maintains its flexibility. The full rated converter and magnetic material costs make this concept very expensive but energy efficiency is improved [13]. Different permanent magnet synchronous generator designs are described and analysed in [14].

Wind turbine failures overview
Failures in wind turbines can result from various sources including poor quality, inferior design and manufacturing standards, construction and erection deficiencies, local operating conditions, transmission system design and general maintenance [15]. Mechanical failures occur most often, gearbox failures cause the longest downtimes, and failure rates above one failure per turbine annually are still common [16]. The failure rate of the majority of wind turbine components or systems increases as designs move away from well-established designs towards new concepts, which are less mature. A similar increase was observed as the wind turbine generator rating increases from small to large [17]. In a study of about 800 wind turbines it was established that the availability was over 90% for the majority of turbines irrespective of size [15]. This study also showed that the difference between availability figures amongst major wind turbine manufacturers was small. The primary cause of failures is wear out, as the hazard rate increases during the last phase of component design life [17]. The authors in [16] also concluded that the average downtime reduced as technologies improved. A survey of more than 1500 wind turbines in Germany over a 15 year period, covering the failure rates and downtime of subsystems, showed that generator failures represent approximately 4% of the total number of failures in the wind turbines.

Generator failures
The major causes of failure in electrical machines, irrespective of their application, are related to bearings and windings. The following components are responsible for the majority of failures in wind turbines using induction generators [18]:
• Bearings;
• Winding failures in both the stator and rotor;
• Rotor cages and leads;
• Slip rings;
• Magnetic wedges in the stator;
• Cooling plant.
The size of the generator also influences which components fail as manufacturers try to optimise designs for various power requirements and wind conditions. The three major faults identified across various generator ratings are summarised in Table 1 [18]. Failure modes 1-3 represent the major faults ranging from most dominant to less dominant failure modes.
Rotor winding problems in small to medium generators are caused by conductor and banding failures while stator winding problems are related to contamination and maintenance issues. Failures of bearings, stator windings and rotor windings contribute more than 80% of the total failures in induction machines [18]. This translates to a failure distribution for bearings (41%), stator (37%), rotor (10%) and other faults (12%).

Stator windings failures
The main ageing mechanisms causing insulation failure of rotor and stator windings are thermal effects, vibration stresses, voltage spikes from the power converters and material degradation because of temperature changes. Environmental conditions can accelerate insulation degradation and moist operating conditions should be avoided. The occurrence of short circuits escalates with time and is caused by overheating, ageing and vibrations, while open circuits result from termination problems or damaged windings. Voltage spikes caused by power converters in variable speed induction machines are also responsible for winding insulation failures. Because of very fast switching times in the PWM circuit, multiple reflected waves travel between the converter and the machine. Impedance differences between the output cable and the generator create these reflected waves, which become more severe as the cable length and the switching frequency of the semiconductors increase [16]. The reflected waves occur at the front of the voltage wave and can reach magnitudes of up to 2.5 kV for a generator rated at 690 V.
Winding insulation design requirements should comply with the following conditions as a minimum [19]:
• Design life and mean time between failure (MTBF) of 20,000 h under accelerated ageing test conditions;
• Rated voltage capacity test plus 10-15% and then 2.5 kV peak-peak "withstand" voltage after the ageing test;
• Initial partial discharge voltage test higher than the maximum peak-peak voltage after the ageing test.

Stator wedge failures
Conductive wedges are used to keep the stator windings in the core and secure them against mechanical forces and vibrations. They also improve efficiency, limit magnetic flux distortion and inrush currents, and improve the thermal properties of the machine [19]. There are instances of exposed stator coils where the wedges came loose and fell out of the stator slots. Figure 4 shows an example of this [18]. The rotating magnetic field is the main reason stator wedges become loose, and this can result in ground faults and/or damage to stator coils.

Bearing failures
Bearing failures contribute significantly to wind generator failures; common causes are incorrect installation or misalignment as well as poor lubrication, overheating and mechanical breakage [15]. Bearing wear through normal ageing together with "indentation, smearing, surface distress, corrosion", electric current flow and overloading can also lead to bearing failure. It is recommended that maintenance practices comply with bearing lubrication schedules to reduce bearing failure rates. Damaged bearings can cause excessive vibrations of the rotor, which disturb the uniform shape of the air gap between the stator and rotor. If not detected, these vibrations can cause contact between the stator and rotor, which will lead to catastrophic damage of both components.

Maintenance strategies
Maintenance is the activity that assists production operations in achieving optimum levels of availability, reliability and operability at the lowest cost. Maintenance strategies can be broadly classified into three main types, namely breakdown maintenance, preventive maintenance and corrective maintenance.
Currently all three maintenance strategies or a combination of them are used in the wind industry, depending on the age of the wind turbine. Breakdown maintenance is the typical "run to failure" approach, preventive maintenance is done before a problem leads to a failure and corrective maintenance is scheduled to rectify existing plant specific problems. Preventive maintenance is further classified as use-based or predictive maintenance. The former is performed at predetermined instances, which are related to the age of the equipment or to elapsed calendar time [20]. Use-based maintenance can lead to over or under maintenance as resources are not optimally used [21].
Condition based maintenance has the capability to estimate the remaining useful life of equipment in order to implement the best maintenance strategy before failure occurs. This can be done through inspections or by using sensors to monitor variables such as temperature, voltage, current, noise or vibration to determine the condition of the equipment. The process of condition monitoring can be online or offline and is made up of three primary steps [22]:
• Data acquisition: gathering data that is pertinent to equipment health;
• Data processing: analytical verification, comprehension and refinement of collected data;
• Decision-making: deciding which maintenance strategy is ideal to ensure long term plant health at the lowest cost.

Condition monitoring techniques in wind turbines
The application of condition monitoring in WEC systems is ideal, as concluded by [20]. Several condition monitoring techniques like oil analysis, vibration analysis, electrical effects, acoustic emissions, ultrasonic methods, radiographic inspections, strain measurements, thermography, temperature measurements, shock pulse method and equipment performance are used, as discussed by [21]. Current wind turbine condition monitoring focuses on critical equipment like the gearbox, generator and main bearing, which are high-cost components and cause long downtimes.
Vibration analysis is the most common condition monitoring method used in wind turbines although its ability to detect electrical faults could be limited. Its effectiveness in direct driven or other modern wind turbine concepts is also questionable. Probabilistic measures in addition to data received from sensors are required for a more precise determination of the equipment condition as the operating nature of wind turbines is stochastic.

Generator stator windings condition monitoring
Accurate condition monitoring techniques for stator winding faults are required as these faults are the second largest failure mechanism in generators. Shorted windings cause the most damage in the machine as they produce additional heat in the windings, which further reduces the design life of the winding insulation material. These faults originate as undetected inter turn faults that gradually isolate multiple turns or when an arc exists between two points on a winding. Detection of inter turn winding faults is complex because the machine can still operate without any obvious fault signatures. These faults can rapidly evolve and cause complete failure of the winding and damage to the machine.
Temperature monitoring is considered one of the oldest condition monitoring techniques and is commonly used in wind turbines to detect abnormalities in bearings and generator windings [22]. High stator winding temperatures under normal operating conditions are generally a sign of possible winding damage. Other factors such as high ambient temperatures or problems with the generator cooling have a similar effect. Insulation life is reduced by 50% for every 10°C increase in temperature as oxidation rates increase above certain temperature limits. Oxidation makes the insulation material fragile and some parts of the winding might experience delamination.
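The 10°C rule quoted above can be expressed as a simple halving relationship. The sketch below is an illustration only: the function name and the sample temperature rises are assumptions, and real insulation ageing depends on the material and its operating history.

```python
def relative_insulation_life(temp_rise_c, halving_interval_c=10.0):
    """Fraction of the reference insulation life that remains, assuming
    life halves for every `halving_interval_c` degrees of temperature rise."""
    return 0.5 ** (temp_rise_c / halving_interval_c)


# Illustrative temperature rises above the reference operating temperature
for rise in (0, 10, 20, 30):
    fraction = relative_insulation_life(rise)
    print(f"+{rise} degC: {fraction:.3f} of reference life")
```

Under this rule of thumb, a winding that runs 20°C hotter than its reference temperature retains a quarter of its reference insulation life.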
The majority of modern wind turbines are designed with condition monitoring systems, which incorporate a Supervisory Control and Data Acquisition (SCADA) system. One of the functions of the SCADA system is to capture operating parameters from the wind turbine. Various mechanical and electrical sensors measure operating and performance data, which are recorded and stored on a computer system for analysis. Analysis of SCADA data for fault prognosis is seen as a cost effective maintenance strategy although its data content does not reveal abnormalities in a clear and explicit manner.
Proper data analysis and modelling techniques are required to identify and understand component degradation. This will enhance component health predictions and guarantee the implementation of optimum maintenance strategies. According to [21] physical models depend on detailed understanding of failure modes whereas data driven models involve extensive data requirements to validate continuous degradation processes.
The application of SCADA data as a condition monitoring technique in the wind industry has become a prevalent research topic. These methods usually consist of various physical and statistical models of a particular system. Harmonics in line currents and magnetic flux, torque pulsations, reduced mean torque, high losses, abnormal winding temperatures and reduced efficiency are all indicators which highlight problems in induction machines [23]. The literature reveals that inter turn faults and asymmetries in the rotor or stator are the main focus of most condition monitoring techniques [23]. Electrical signature analyses of the stator parameters such as current, voltage and power under steady state operating conditions prove to be successful in sensing winding faults as well as other failure mechanisms.

SCIG design parameters
The SCADA data was obtained from Siemens, the operator of the electrical utility Eskom's Sere wind farm in the Western Cape, South Africa. This is a 100 MW wind farm with a total of 46 × 2.3 MW turbines. The design parameters of the SCIG in this study are shown in Table 2.

Prediction model for stator winding temperatures
SCADA data from two wind turbines is used to model generator winding temperature between minimum and maximum output power which corresponds to 0-2.4 MW. Data for wind turbines (WTs) number 4 and number 38 were collected from June 2015 until October 2015. The maximum designed generator stator winding insulation temperature for the wind turbines is 155°C, which corresponds to a Class F rated insulation material.
Multiple linear regression analysis is a statistical method that estimates or models relationships between different variables that are linked in a nondeterministic way [24]. It uses more than one independent variable, whereas simple linear regression has only one. The stator winding temperature prediction model is designed using Stepwise Regression (SR) in Microsoft Excel. The model output also highlights which variables have the biggest influence on stator winding temperature. In this study, the stator winding temperature is taken to represent the generator temperature.
SR performs multiple regressions that add or remove independent variables at each step based on a partial F-test on the new independent variable. The F-test determines whether the variables are jointly important and whether they have a significant effect on the dependent variable. SR initially selects the independent variable with the highest correlation with the dependent variable, then adds or removes independent variables in the model based on their F-test values, which should be higher than or at least equal to the previous value.
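When there are two independent variables, the F-test value is calculated using [24]:

$$F_1 = \frac{SS_R(\beta_1 \mid \beta_2, \beta_0)}{MS_E(x_1, x_2)}$$

where $F_1$ is the F-statistic of independent variable $x_1$; $SS_R(\beta_1 \mid \beta_2, \beta_0)$ is the regression sum of squares due to $\beta_1$, given that $\beta_2$ and $\beta_0$ are already in the model; $MS_E(x_1, x_2)$ is the mean square error for the model containing $x_1$ and $x_2$; and $\beta_0$, $\beta_1$, $\beta_2$ are the regression coefficients.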
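The partial F-test for the two-variable case can be sketched in a few lines of Python. This is an illustrative sketch, not the Excel procedure used in the study: the function name and the sample data are hypothetical, and the extra regression sum of squares is obtained as the difference between the regression sums of squares of the model with both variables and the model with $x_2$ only.

```python
def partial_f_statistic(x1, x2, y):
    """Partial F-statistic for x1, given that x2 and the intercept are
    already in the model (two-regressor case)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centred sums of squares and cross-products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)

    # Least squares slopes of the full model y ~ x1 + x2 (Cramer's rule)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det

    ss_r_full = b1 * s1y + b2 * s2y      # regression SS with x1 and x2
    ss_r_reduced = s2y ** 2 / s22        # regression SS with x2 only
    ms_e = (syy - ss_r_full) / (n - 3)   # mean square error, full model
    return (ss_r_full - ss_r_reduced) / ms_e


# Hypothetical data: y depends strongly on x1 and weakly on x2
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.2, -0.05, 0.0]
y = [3 + 2 * a + 0.5 * b + e for a, b, e in zip(x1, x2, noise)]
print(f"Partial F for x1: {partial_f_statistic(x1, x2, y):.1f}")
```

A large value indicates that adding $x_1$ explains considerably more variation than the model containing $x_2$ alone, which is the criterion a stepwise procedure uses to keep a variable.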
The following assumptions are made to establish how a linear regression model fits the data [24].
• The residuals should be uncorrelated random variables with a zero average and constant variance.
• The residuals should be normally distributed.
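• The order of the model is correct and the data being investigated has linear characteristics.

A linear regression model where the dependent variable Y is related to k regressor (independent) variables has the form [24]:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon$$

where $\beta_0, \beta_1, \dots, \beta_k$ are the regression coefficients and $\epsilon$ is the random error term.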
The model therefore provides an acceptable estimation of the dependent variable across certain ranges of the independent variables because the real relationship between them cannot be determined [24]. Regression coefficients represent the rate at which the dependent variable changes in relation to individual independent variables.
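They are calculated in SR using the least squares method, represented in matrix notation by:

$$\hat{\beta} = (X'X)^{-1} X'y$$

where X is the matrix of observations of the independent variables (with a leading column of ones for the intercept) and y is the vector of observed values of the dependent variable. The predicted value of y is obtained by [24]:

$$\hat{y} = X\hat{\beta}$$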
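The least squares calculation can be sketched in plain Python by forming and solving the normal equations. This is a minimal illustration and not the Excel implementation used in the study: the function names are hypothetical, and the sample data (ambient temperature, nacelle temperature, output power in MW) is synthetic, generated from assumed coefficients so that the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0, *row] for row in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Predicted value for one observation of the independent variables."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Synthetic observations: (ambient temp, nacelle temp, output power in MW)
samples = [(10, 20, 0.5), (15, 22, 1.5), (12, 30, 2.3),
           (20, 25, 0.1), (18, 35, 1.2), (25, 28, 2.0)]
# Assumed coefficients used only to generate illustrative winding temperatures
winding = [10 + 0.5 * at + 0.8 * nt + 12 * p for at, nt, p in samples]

beta = fit_least_squares(samples, winding)
print([round(b, 3) for b in beta])
print(round(predict(beta, (16, 27, 1.8)), 2))
```

Solving the normal equations directly is adequate for a handful of well-scaled variables such as these; for larger or poorly conditioned problems a dedicated numerical library would be the safer choice.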

Evaluating the adequacy of the model
The SR model needs to satisfy certain criteria to justify whether its linear function is sufficient to predict generator stator winding temperature over the proposed output power range of the wind turbines.
The following parameters are selected as variables in the SR model:

• Ambient Temperature (AT)
The AT refers to the outside temperature conditions. The outside air is used to cool the generator as well as the inside of the nacelle. This independent variable is labelled as "Mean Ambient Tmp" in the SR model.

• Nacelle Temperature (NT)
The temperature in the nacelle affects the generator operating conditions directly as well as other components. High nacelle temperatures cause the generator to run hotter which affects its performance. The nacelle temperature is not regulated. This independent variable is labelled as "Mean Nacelle Tmp" in the SR model.

• Generator Output Power (GOP)
The stator winding temperature is related to the square of the phase current flowing in the windings. Therefore the higher the generated output power, the hotter the windings become. This independent variable is labelled as "Active Power" in the SR model.

• Stator Winding Temperature (SWT)
The stator winding temperature is the dependent variable, which the model regresses. Knowing which independent variable has the highest influence on stator winding temperature is important to optimise the generator operation. The SWT is predicted by the model based on the values of the independent variables AT, NT and GOP. The dependent variable is labelled as "Mean Winding Tmp U1" in the SR model.

Significance of regression model
The first check of whether the SR model is acceptable is to evaluate the Coefficient of Determination $R^2$ ($0 \le R^2 \le 1$), which serves as a goodness-of-fit measure. It shows the proportion of the variation of the dependent variable explained by the independent variables. A value of $R^2$ close to 1 is ideal but it does not always imply that the model fits the data best or that future predictions by the model are perfect. It is affected by the number of independent variables, the scatter or distribution of the independent variable(s) and the addition of higher order polynomial terms of the independent variable(s) to the model [24]. $R^2$ can be calculated using:

$$R^2 = \frac{SS_R}{SS_T}$$

where $SS_R$ is the regression sum of squares and $SS_T$ is the total sum of squares.
The F-test, based on the F-distribution, confirms the significance of the regression model. The following hypotheses are tested: $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$; $H_1: \beta_j \neq 0$ for at least one $j$.
The F-critical value of the F-distribution is calculated in Microsoft Excel using the function F.INV(Probability, DoF 1, DoF 2), where F.INV calculates the inverse of the F-distribution; Probability, 95% confidence level; DoF 1, degrees of freedom of the regression (number of independent variables); DoF 2, degrees of freedom of the residuals.
The F-model value needs to be larger than the F-critical value for the Null Hypothesis $H_0$ to be rejected, which confirms that the model fits the data adequately at a 95% confidence level. Additionally, the regression coefficients ($\beta_0$ to $\beta_3$) in this model should all have p-values less than 0.05, which also confirms that $H_0$ can be rejected.
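The decision rule can be sketched in a few lines of Python. The helper names, the sums of squares and the sample size below are hypothetical, and the residual degrees of freedom are taken as n - k - 1, the usual convention for a model with an intercept. The critical value is passed in rather than computed, on the assumption that it was read from Excel or an F-table; for 3 and 20 degrees of freedom at the 95% level it is approximately 3.10.

```python
def f_statistic(ss_r, ss_e, k, n):
    """F-model value = (SS_R / k) / (SS_E / (n - k - 1))."""
    ms_r = ss_r / k              # mean square of the regression
    ms_e = ss_e / (n - k - 1)    # mean square of the residuals
    return ms_r / ms_e


def regression_is_significant(f_model, f_critical):
    """Reject the null hypothesis when the F-model value exceeds F-critical."""
    return f_model > f_critical


# Hypothetical numbers: 3 regressors, 24 observations (20 residual DoF).
f_model = f_statistic(ss_r=90.0, ss_e=10.0, k=3, n=24)
print(f_model, regression_is_significant(f_model, f_critical=3.10))
```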

Analysis of residuals
The residuals, also called the errors, are defined as the difference between the actual observation and the observation predicted by the model:
$$e_i = y_i - \hat{y}_i$$
where $e_i$, residual or error; $y_i$, actual observation; $\hat{y}_i$, predicted observation from the model.
Plotting the residuals illustrates how well the model fits the data and reveals any deviations from the assumptions made when applying linear regression. To check for normality in the residuals of the model, a normal probability plot of the residuals can be obtained in Microsoft Excel. A plot of the residuals versus the predicted observation ŷ i can be obtained in the same manner. For the model to be adequate, this plot has to show the residuals scattered in a horizontal band about the zero average without any distinctive pattern [24]. Residual plots can have one of the four general outlines shown in Figure 5. Figure 5(a) shows that the model is ideal, whereas the other plots (b-d) contain anomalies which show that the model could be inadequate for the data sample.
In this study a normal probability plot of residuals versus their standardised Z-scores is given. The procedure to construct the normal probability plot is as follows:
• Obtain the normal residuals from Microsoft Excel (SR);
• Rank each of the residuals;
• Calculate the percentile, or proportion of the residuals that is smaller than a particular residual, from its rank and n, the number of observations;
• Calculate the Z-score using the Microsoft Excel Normal Distribution function;
• Plot a scatter of residuals vs. Z-scores.
This method is considered an improvement on the normal probability plot of the residuals in Microsoft Excel. If the residuals are normally distributed, 99.73% of the data will fall within 3 standard deviations of the mean. Therefore we can conclude that residuals with Z-scores outside ±3 do not have the same characteristics as the rest of the data and are possible outliers.
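The procedure above can be sketched in Python using only the standard library. Two points are assumptions rather than details taken from the study: the percentile of the residual with rank i is computed as (i - 0.5)/n, a common plotting-position convention, and the standard normal inverse is taken from `statistics.NormalDist` in place of the Excel function. The function name and the residual values are illustrative.

```python
from statistics import NormalDist


def normal_probability_points(residuals):
    """Return (residual, z_score) pairs for a normal probability plot."""
    n = len(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    # Rank the residuals from smallest (rank 1) to largest (rank n).
    for rank, residual in enumerate(sorted(residuals), start=1):
        percentile = (rank - 0.5) / n        # assumed plotting position
        z_score = std_normal.inv_cdf(percentile)
        points.append((residual, z_score))
    return points


# Illustrative residuals only, in degrees Celsius.
for residual, z in normal_probability_points([0.4, -1.2, 0.1, 2.3, -0.6]):
    print(f"{residual:6.2f}  {z:6.3f}")
```

If the residuals are approximately normal, the resulting points lie close to a straight line when the residuals are plotted against the Z-scores.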

Intrinsically linear models
Linear regression can also be applied to investigate nonlinear characteristics between variables. Instead of a straight line, linear regression can fit curves to the data, which could be more appropriate for nonlinear conditions. In this case transformation of the dependent and/or independent variables is required.
Intrinsically linear models, or curve fitting of the data, can be implemented through polynomial regression, where the independent variables are transformed into consecutive powers, i.e. X, X 2 , X 3 etc. Polynomial regression is used to detect any nonlinearity between the independent and dependent variables. Therefore the 2nd and 3rd powers of all three independent variables AT, NT and GOP, together with the linear values, are used in the SR model.
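A cubic polynomial with one independent variable has the following form:
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \varepsilon \tag{8}$$
If we set $x_1 = X$, $x_2 = X^2$, $x_3 = X^3$, then Eq. (8) can be rewritten as:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$$
This is a multiple linear regression model similar to Eq. (2).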
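To illustrate why a cubic polynomial is still a linear regression problem, the sketch below builds the transformed regressors X, X^2 and X^3 and estimates the coefficients by ordinary least squares through the normal equations. It is a pure-Python illustration under stated assumptions: the function names and sample data are hypothetical, and the study itself used Microsoft Excel rather than this code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_cubic(xs, ys):
    """Return [b0, b1, b2, b3] for Y = b0 + b1*X + b2*X^2 + b3*X^3."""
    # Each row holds the intercept term and x1 = X, x2 = X^2, x3 = X^3.
    rows = [[1.0, x, x ** 2, x ** 3] for x in xs]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Illustrative data generated from a known cubic, so the fit should recover it.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x - 0.5 * x ** 2 + 0.25 * x ** 3 for x in xs]
print([round(b, 6) for b in fit_cubic(xs, ys)])
```

The solver never needs to know that the columns are powers of the same variable, which is the sense in which the model is intrinsically linear.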

Statistical model analysis
Generator output power, followed by nacelle temperature, affects stator winding temperature the most, as shown in Table 3.
This is expected, as higher generated output power causes more current to flow through the windings, and the heat generated is proportional to the square of the current. The nacelle temperature represents the ambient temperature of the generator and therefore also has a large impact. Insulation material of electrical machines is generally designed for an ambient temperature of 40°C, and higher temperatures degrade the winding insulation material. Temperatures higher than 40°C in the nacelle can therefore cause the generator to shut down to maintain the temperature rise limit of the insulation, which is 105°C for Class F. The temperature rise limit is calculated by subtracting the ambient temperature from the maximum hot temperature of the insulation (155°C − 40°C = 115°C), with a hot-spot allowance of 10°C typically deducted to arrive at 105°C. Outside air, whose temperature is referred to as ambient temperature in the SR model, is used to cool the stator windings. The outside air temperature has a limit of 45°C before the controller shuts down the machine to prevent overheating of the stator. Dirty or blocked air filters can also reduce cooling effectiveness. These can be checked during routine maintenance activities and replaced as required.
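According to the SR model, the stator winding temperature for WT4 and WT38 can be predicted using equations of the form:
$$SWT = \beta_0 + \beta_1 AP + \beta_2 NT + \beta_3 AT$$
with a separate set of fitted coefficients for each turbine, where AP, active power (generator output power); NT, nacelle temperature; AT, ambient temperature.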
It can be concluded that the location and wind resource of the two turbines have a significant impact on the stator winding temperature. Environmental conditions could be less ideal for one turbine, which affects the cooling of the nacelle and generator. Access to optimum wind conditions means a higher capacity factor and also higher average stator winding temperatures. The level of maintenance also needs consideration, as one turbine can be exposed to severely dusty or moist conditions.

Adequacy of the SR model
Linear regression models such as SR need to meet certain criteria to model the relationships between variables accurately. It is generally assumed that the relationships between the variables are linear for the modelling to be successful.

Significance of the model
The coefficient of determination, or $R^2$, indicates how well the independent variables explain the variability in the dependent variable. The SR model calculated $R^2$ = 0.911 for WT4 and $R^2$ = 0.9234 for WT38. Although the value of $R^2$ in both models is high, the ability of the models to predict stator winding temperature accurately is not guaranteed. It does, however, indicate that GOP, NT and AT have a strong influence on the stator winding temperature.
The F-test value confirms whether the regression is significant. If the F-test value falls to the left of the F-critical value in the F-distribution, the Null Hypothesis is not rejected, which means the regressors have no demonstrable influence on the dependent variable. If F-test > F-critical, the Null Hypothesis is rejected. The ANOVA tables of both SR models in Table 4 show that the regression is significant, which means the models for both wind turbines are adequate. In Table 4 the p-values of the regressors are all less than 0.05, which also confirms the significance of the model.

Using intrinsically linear models
The use of intrinsically linear models allows linear regression to model nonlinear relationships through the transformation of the variables. In this study a 3rd degree polynomial regression model was applied to establish if it predicts stator-winding temperature more accurately than the straight-line model. The results of the polynomial regression models of WT4 and WT38 are shown in Table 5.
The value of $R^2$ in the polynomial regression models shows an improvement of less than 0.1% compared to the SR models. Therefore both models explain the variation in stator winding temperature by the independent variables with essentially the same accuracy. The F-test value of the SR model is much higher than that of the polynomial regression model, which means the SR model is more significant. The significance of the independent variables as determined by SR indicates that the linear independent variables are more important than the transformed independent variables. The SR model is simple, easy to implement and performs better than the polynomial regression model according to the tests that were done. Considering the complexity and time-consuming development of the polynomial regression model, its application in this study is not justified.

Performance of the SR models
The regression model in this study is applied to identify abnormally high stator winding temperatures in the induction generator. Stator temperature SCADA logs at 10-minute intervals during November 2015 are used as input to both wind turbine models. High stator winding temperatures outside the normal operating range of the generator can possibly be attributed to:
• Physical damage of the stator winding;
• Inadequate maintenance or cooling;
• Incorrect measurements;
• Equipment failure; or
• Adverse operating conditions.
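One way such a check could be automated is sketched below in Python. Everything specific in it is an assumption made for illustration: the coefficient values, the record field names, and the rule of flagging a reading when the measured temperature exceeds the prediction by more than three residual standard deviations. The study does not state its alarm threshold, so this is not the authors' implementation.

```python
def predict_swt(coeffs, ap, nt, at):
    """Predicted stator winding temperature from a fitted linear model."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * ap + b2 * nt + b3 * at


def flag_abnormal(records, coeffs, residual_sd, limit=3.0):
    """Return (time, residual) for readings far above the model prediction."""
    flagged = []
    for rec in records:
        predicted = predict_swt(coeffs, rec["ap"], rec["nt"], rec["at"])
        residual = rec["swt"] - predicted  # actual minus predicted
        # Only abnormally HIGH temperatures are of interest here.
        if residual > limit * residual_sd:
            flagged.append((rec["time"], residual))
    return flagged


# Hypothetical coefficients and 10-minute SCADA records (not study data).
coeffs = (10.0, 0.02, 0.8, 0.3)
records = [
    {"time": "00:00", "ap": 1000.0, "nt": 30.0, "at": 20.0, "swt": 61.0},
    {"time": "00:10", "ap": 1000.0, "nt": 30.0, "at": 20.0, "swt": 75.0},
]
print(flag_abnormal(records, coeffs, residual_sd=2.0))
```

A flagged reading is only a prompt for investigation, since any of the causes listed above, including an incorrect measurement, could explain it.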
For WT4, where stator winding temperatures are below 40°C, the temperatures predicted by the SR model are higher than the actual temperatures. This overestimation can also be observed in the higher temperature ranges, although the prediction errors are smaller. The SR model for WT38 performs similarly when the stator winding temperatures are below 40°C but frequently underestimates at higher temperatures. The performances of the SR models for WT4 and WT38 are shown in Figures 6 and 7. Both models are able to predict the temperature trends in an acceptable manner and show very good accuracy when the stator winding temperatures are between 50°C and 90°C.
The SR model deficiencies at the two extreme ends of the data distribution are possibly caused by nonlinear behaviour. These data points fall outside the three standard deviations of the normal distribution of temperature ranges as shown by Figure 7. There is a clear deviation by these data points away from the straight-line function used in the SR model. Because wind turbines produce power below rated capacity the majority of the time, very low power regions just above the cut-in wind speed can result in different stator winding temperatures even if the environmental conditions are the same. These represent the stator winding temperatures below 40°C, where the SR model performance is inadequate. Above rated wind speed the wind turbine control system regulates its output power, which requires predominantly nonlinear control strategies. The rotor blade aerodynamics are changed rapidly to prevent excess power generation and loading on the wind turbine.

Conclusion
The aim of this study was to develop a new condition monitoring technique for stators in SCIGs.
A statistical model was developed using SCADA data to estimate the relationships between winding temperatures and other variables. Predicting faults in stator windings is challenging because the unhealthy condition rapidly evolves into a functional failure. The analysis of SCADA data was shown to be adequate as a condition monitoring tool for stator windings. The statistical model showed that active power, ambient temperature and nacelle temperature have significant effects on stator winding temperature. The capability of the model is supported by the analysis of the normal probability plots of the residuals, the F-test and the value of $R^2$. The statistical model performs very well when the wind turbine produces power at a constant rate below rated capacity. This operating region of the wind turbine has a more linear characteristic. Since a wind turbine spends the majority of the time in this operating region, the model can be used as a condition monitoring tool for the SCIGs at Sere and similar wind farms.

Author details
Ian Kuiler, Marco Adonis* and Atanda Raji
*Address all correspondence to: adonism@cput.ac.za
Cape Peninsula University of Technology, Cape Town, South Africa