Automated Fault Detection System for Wind Farms

Because of the pollution and health hazards associated with energy generation from nonrenewable resources, the focus is now on renewable resources. This chapter aims to provide an automated fault-detection system that increases the robustness of offshore wind farms. The method uses a flexible threshold for processing the collected sample values. A fuzzy inference system (FIS) is designed for automatic real-time fault detection; the resulting system is named the FIS-based fault detection system (FFDS) for offshore wind farms. The method uses the concepts of combination-summation (CS) and flow-directions to determine the extent of fault occurrence in the wind farm. Based on the working conditions of the wind farm, preventive or corrective measures are suggested to the remote observer. The performance of these methods is evaluated in MATLAB.


Introduction
Wind energy is freely and abundantly available. It is a renewable resource that will never be exhausted. If properly utilized, it can lead to greener and safer energy generation than coal-generated electricity. It is also one of the lowest-priced renewable energy technologies available today [1].
In 2015, energy produced in the United States was about 91% of U.S. energy consumption due to lower petroleum imports [2]. The majority of energy production was due to fossil fuels, i.e., coal, petroleum, and natural gas. According to Ref.
However, using natural gas for energy generation has several issues. First, methane leaks during drilling and extraction of natural gas from wells and during its transportation in pipelines [3]. Methane is stronger than CO2 at trapping heat and causing global warming. Methane emissions range from 1 to 9% of total life cycle emissions. In addition, natural gas-fired power plants contribute to acid rain and ground-level ozone, both of which can damage forests and agricultural crops [4].
The present renewable energy-based generation plants, such as offshore wind farms, are not entirely capable of fulfilling the future needs of society. For this reason, wind-based energy generation is still not very popular and is unable to replace coal or natural gas-based energy production. The monitoring and control systems in use are now obsolete, and new methods are required.
The control and maintenance actions depend entirely on human intervention, which is a time-consuming process. These challenges lead to extra cost for emergency maintenance, component screening, and physical designs.
Wind turbines consist of several components and are subject to various failures of an electrical and mechanical nature [5], e.g., imbalance in the electrical controls, gearbox, and yaw system. Some failures are more frequent and cause longer downtime of the whole system. These faults cause rotor imbalance, unbalance and harmonics in the air-gap flux, increased torque pulsation, increased losses, and reduced efficiency by directly affecting the power, current, and voltage output of the generator. Therefore, monitoring of these critical components should have the highest priority so that plant downtime can be reduced. An offshore wind turbine generator system requires monitoring of parameters such as sea-surface temperature, wind velocity, water salinity, wave heights, and strain [6,7]. However, the monitoring of wind turbine parts has several practical difficulties, e.g., limited accessibility, the large size and complex geometry of the blades, and the effect of environmental parameters.
Several papers have discussed methods to detect faults in wind farms, e.g., gearbox fault detection using discrete wavelet transformation [8]. Similarly, high-frequency vibration data collected from gearbox testing were used for gearbox fault detection in Ref. [9], which included a k-means clustering algorithm. The drawbacks of this system are the assumption that the underlying process is stationary and the elimination of the time factor. Brandão et al. [10] discuss neural networks for fault forecasting of wind turbine gearboxes. Badihi [11] presents protection against the decreased power generation caused by turbine blade erosion and debris on the blades. A fault diagnosis method based on signal analysis and recognition is presented in [12]. Time-frequency representations have been proposed in the literature [13-15]. These techniques have high complexity and poor resolution [16]. One approach used the Hilbert transformation in a doubly fed induction generator-based wind turbine [17].
Hence, automated systems are needed to remotely monitor these parameters and to notify the observer about faults in the system. By using wireless sensor networks (WSNs), we can ensure reliable operation of the wind farm. This helps in reducing manual intervention, and the wind farm can be monitored around the clock. The following sections discuss how this can be performed.

Flexible threshold selection scheme
In the past, monitoring systems used a constant threshold to record the data, independent of the time of day or month. The constant threshold is calculated as the average of the dataset. Observations show that such a scheme does not give accurate results if there are changes in the scene or environment pertaining to the parameters under consideration, e.g., the temperature of air during daytime is higher than at night. Similar variation occurs across seasons, e.g., the average temperature during winter is different from the average temperature during summer. Hence, if a constant or fixed threshold value is chosen for the entire dataset, it is likely to give suboptimal results for both scenes. Moreover, if the chosen fixed threshold value is very high, it will result in many missed detections, and if it is very low, it will lead to many false positives.
Hence, the threshold value should be selected using a scheme that allows the threshold to change dynamically to accommodate variations in the time of data recording. This method gives better performance in terms of sensed parameters. The threshold provides a reference for finding values that are higher or lower than the threshold, both of which may indicate health failures in the wind farm.
The WSN topology in the wind farm consists of tower-fixed nodes [18]. These are wireless sensor nodes attached to the towers that continuously sense the parameter values (samples) throughout the day and night. This information is converted into data packets that are transmitted to the sink node by taking multiple hops through the scattered sensor nodes. The sink node is located at the end of the wind farm. Every tower-fixed node is allocated a fixed, locally unique address called the RTN id (row-tower-node), which is transmitted to identify the originator of the packet.
Suppose $X_D = \{X_1, X_2, \ldots, X_N\}$ is the set of samples collected by the tower-fixed sensor nodes during the day period, and $Y_N = \{Y_1, Y_2, \ldots, Y_N\}$ is the set of samples collected by the tower-fixed sensor nodes during the night period.
The decision to choose a new threshold for a dataset depends on the correlation between the datasets. The correlation is a measure of the similarity between the two datasets. If the correlation of the two datasets is high, the two datasets correspond to similar time durations of the collected data, which eliminates the need for calculating another threshold for the new dataset. Conversely, low correlation is indicative of large variations and necessitates the calculation of new thresholds for better data interpretation.
The correlation between the two datasets, $R(X_D, Y_N)$, can be expressed as [19]:

$$R(X_D, Y_N) = \frac{\sum_{i=1}^{N} (X_i - X_m)(Y_i - Y_m)}{\sqrt{\sum_{i=1}^{N} (X_i - X_m)^2 \, \sum_{i=1}^{N} (Y_i - Y_m)^2}}$$

where $X_i$ and $Y_i$ are the values of datasets $X_D$ and $Y_N$ at time instant "i", $X_m$ and $Y_m$ are the average values of the datasets $X_D$ and $Y_N$, and N is the number of samples in each dataset, which should be the same for $X_D$ and $Y_N$. Figure 1 shows the scatter plot for the wind speed dataset and its computed correlation coefficient. Table 1 shows the degree of similarity between the datasets depending on the calculated correlation coefficients.
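The correlation check described above can be sketched in a few lines. This is a minimal sketch, assuming the standard sample correlation coefficient implied by the definitions of X m, Y m, and N; the function names and the `cutoff` value are hypothetical and are not taken from Table 1.

```python
import math

def pearson_correlation(x, y):
    """Correlation R(X_D, Y_N) between two equal-length sample lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("datasets must have the same length N >= 2")
    x_m = sum(x) / len(x)  # average of the day dataset
    y_m = sum(y) / len(y)  # average of the night dataset
    cov = sum((xi - x_m) * (yi - y_m) for xi, yi in zip(x, y))
    var_x = sum((xi - x_m) ** 2 for xi in x)
    var_y = sum((yi - y_m) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant dataset")
    return cov / math.sqrt(var_x * var_y)

def needs_new_threshold(x, y, cutoff=0.7):
    """Return True when the datasets are dissimilar enough to recompute a threshold.

    The cutoff of 0.7 is an illustrative assumption, not a value from the chapter.
    """
    return pearson_correlation(x, y) < cutoff
```

A high coefficient means the existing threshold can be reused, while a low coefficient triggers a new threshold calculation for the new dataset.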
To calculate the thresholds $T_X$ and $T_Y$, the method prefers the geometric mean of the datasets with "N" samples over the arithmetic mean:

$$T_X = \left(\prod_{i=1}^{N} X_i\right)^{1/N}, \qquad T_Y = \left(\prod_{i=1}^{N} Y_i\right)^{1/N}$$

We consider the geometric mean because the datasets are characterized by a majority of similar values. The use of a geometric mean normalizes the ranges being averaged, so that no range dominates the weighting, and a given percentage change in any of the properties has the same effect on the geometric mean. Table 2 shows the calculated threshold values for the flexible threshold method and the mean method (MM).
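The difference between the two threshold choices can be illustrated with a short sketch. It assumes strictly positive samples (the geometric mean is otherwise undefined) and computes the product in log space to avoid overflow; the function names are hypothetical.

```python
import math

def geometric_mean_threshold(samples):
    """Flexible threshold: N-th root of the product of the N samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if any(s <= 0 for s in samples):
        raise ValueError("geometric mean requires strictly positive samples")
    # exp(mean(log(x))) equals the N-th root of the product of the samples.
    return math.exp(sum(math.log(s) for s in samples) / len(samples))

def arithmetic_mean_threshold(samples):
    """Static mean-method threshold: plain average of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(samples) / len(samples)
```

For a dataset with one extreme value, such as `[1, 1, 1, 100]`, the geometric mean stays close to the bulk of the samples, whereas the arithmetic mean is pulled toward the outlier.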
Furthermore, the method maps the continuous sample values onto a finite set of discrete levels, without changing the meaning of the information, using quantization. To do this, first, the distance matrices $d_X$ and $d_Y$ are calculated; these matrices represent the values of $X_D$ and $Y_N$ after thresholding with $T_X$ and $T_Y$, respectively. Finally, the quantization is performed on these values independently with respect to their maximum and minimum values. This can be expressed as

$$Q \propto (\max(d) - \min(d))$$

where Q is the number of quantization levels for distance matrix d. The present scenario considers five quantization levels, 0, 1, 2, 3, and 4, calculated from Eq. (10). If the variation in the datasets and the total number of samples are large, the number of levels may also be increased for better accuracy in fault prediction. This would increase the size of the transmitted packets because the number of bits required to encode each level in binary would also increase. The sensor nodes would then spend more energy on packet transmission and reception, lowering the WSN lifetime. Thus, the number of quantization levels should be chosen to provide accurate fault prediction without compromising the network lifetime. Each level carries a significant and distinct meaning regarding the sensed value, e.g., level "0" indicates that there is no difference between the sample value and the threshold. Similarly, levels "2," "3," and "4" indicate increased levels of variation. Figure 2 depicts the above method.
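The thresholding and quantization steps can be sketched as follows. This is only an illustration under stated assumptions: the distance is taken as the absolute difference between a sample and the threshold, the range between the minimum and maximum distance is split into equal-width bins, and the function names are hypothetical.

```python
def quantize_distances(samples, threshold, levels=5):
    """Map each sample to a level in 0..levels-1 by its distance from the threshold."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Assumption: distance is the absolute difference from the threshold.
    distances = [abs(s - threshold) for s in samples]
    d_min, d_max = min(distances), max(distances)
    if d_max == d_min:
        # All samples are equally far from the threshold: a single level suffices.
        return [0] * len(samples)
    # Assumption: equal-width bins across the observed distance range.
    step = (d_max - d_min) / levels
    # The largest distance falls on the upper edge, so clamp it to the top level.
    return [min(int((d - d_min) / step), levels - 1) for d in distances]

def bits_per_level(levels):
    """Number of bits needed to encode one of `levels` distinct values in binary."""
    if levels < 2:
        raise ValueError("at least two levels are required")
    return (levels - 1).bit_length()
```

With five levels, each level needs 3 bits; raising the count above eight levels would need 4 bits per level, which is the packet-size trade-off discussed above.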

Simulation results and discussion
The flexible threshold selection (FTS) method is compared with the mean method (MM). We have considered a total of 72 samples collected during daytime and nighttime for wind speed. The sampling frequency is 1 sample per 10 minutes over a period of 12 hours of daytime and 12 hours of nighttime. As the simulation results show, the flexible threshold method gives better performance and more accurate results for parameter monitoring. Table 2 lists the range of collected samples and their thresholds calculated using FTS and MM. As observed, the datasets for source 1 (X) and source 2 (Y) each have small variations. If both datasets are instead considered to be one single dataset, the variation of values is large. This causes the static threshold selected using the MM method to be biased toward the higher values. However, this is not the case with the dynamic threshold. We can calculate different thresholds for datasets collected at different times, which adapt to the natural variations of the values. Hence, the flexible method is unbiased toward any extreme values and gives a balanced view of the data under consideration. The MM method does not compute a new threshold for each dataset; its threshold remains unchanged for any dataset, making it an unrealistic choice.
Using two different thresholds, one for each source, together with the correlation between the sources, allows each dataset to be considered individually.
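The bias of a single pooled threshold can be seen in a small sketch. The wind speed values below are made-up illustrative numbers, not data from Table 2, and the function names are hypothetical; positive samples are assumed.

```python
import math

def geometric_mean(samples):
    """N-th root of the product of N strictly positive samples."""
    return math.exp(sum(math.log(s) for s in samples) / len(samples))

def compare_thresholds(day, night):
    """Return the pooled mean-method threshold and the per-period flexible thresholds."""
    pooled = day + night
    return {
        "mm_pooled": sum(pooled) / len(pooled),  # one static threshold for everything
        "fts_day": geometric_mean(day),          # threshold for the day dataset only
        "fts_night": geometric_mean(night),      # threshold for the night dataset only
    }

# At 1 sample per 10 minutes, a 12-hour period yields this many samples.
SAMPLES_PER_PERIOD = 12 * 60 // 10

# Hypothetical wind speeds in m/s, chosen only to illustrate the effect.
example_day = [12.0, 12.5, 13.0]
example_night = [4.0, 4.5, 5.0]
example_result = compare_thresholds(example_day, example_night)
```

In this example the pooled threshold is 8.5, which lies far from both groups of samples, while each flexible threshold sits inside the range of its own dataset.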
It is clear from the above discussion that the choice of an appropriate threshold has a large impact on the quantization levels. The MM method for threshold selection is only able to detect large variations in the values, i.e., levels "2" and "3," whereas in the FTS method the detected levels have a distributed pattern, i.e., it can detect both small and large variations. Also, the levels detected by the FTS method are more consistent than those of the MM method, which provides a very accurate status of the conditions of the wind farm. Figure 2 shows that the FTS method yields a majority of level "0" occurrences over other levels, unlike the MM method, where the majority is level "2" occurrences. Thus, it can be concluded that the flexible method is unbiased toward the larger values in the datasets and hence provides better accuracy of monitored parameters. The graph in Figure 2 is generated from real-time data from the Burbon-Nysted wind farm, Denmark.
Figure 2. (i) Collected samples of maximum and minimum wave heights, (ii) quantized levels using the FTS method, and (iii) quantized levels using the MM method [19,20].

Fault detection scheme
This system is called the fuzzy inference system (FIS)-based fault detection system (FFDS). It is an automated system that gives the remote observer precise information about the health condition of the wind farm and raises alarms for taking corrective or preventive measures to maintain the reliability of the farm.
The observer need not observe all the properties of the parameter values as a single signal; rather, the degree of similarity between the values forms the basis for choosing a new threshold. This is a simple method that helps in obtaining real-time data for monitoring purposes. These data, when analyzed, can predict possible fault occurrences.

Automatic fault diagnosis method
The fault detection scheme uses combination-summation (CS) and flow directions (FDs) to design the FIS [9]. This helps derive significant information from the quantized levels about fault event occurrence in the monitored data samples of the offshore wind farm. The received quantized levels corresponding to monitored data samples represent the surrounding environmental conditions in the offshore wind farm. Because the received values are fuzzy in form, the FIS is used to provide an accurate interpretation of the environmental conditions.
For determining the CS and FD, five consecutive received levels are considered in one period of time "T," where $T = \{t_1, t_2, t_3, t_4, t_5\}$ represents five consecutive time intervals. Depending on the permutation and the summation of the levels, fuzzy logic is used to predict fault occurrences. For example, if the levels received at times "t" are $l_{t_1}$, $l_{t_2}$, $l_{t_3}$, $l_{t_4}$, and $l_{t_5}$, then the summation of levels is

$$CS = \sum_{i=1}^{5} l_{t_i}$$

where the range of CS is [0-20]. The value 20 indicates constant occurrence of level 4, i.e., 44444. The obtained levels can be either repeating or nonrepeating, e.g., 22222, 31224, 01234, and 43210, as depicted in Figure 3. The CS values for these sequences are 10, 12, 10, and 10, respectively. As observed, the CS alone is not sufficient for fault prediction, because different sequences can share the same CS. Fault prediction can give accurate results if the corresponding FD is also considered with the values. Here, FD indicates whether the received levels are increasing, decreasing, remaining stable, or varying constantly. For example, a CS of 10 can arise from multiple level sequences, but if the FD has a rising edge, the combination suggests fault event occurrence in the future and calls for immediate preventive action. If the levels increase constantly, then the FD is considered to be a rising edge, shown by an arrow in the upward direction (Figure 3).
The remote observer is able to derive meaningful information from these received quantized levels based on the fuzzy-logic rules presented in Table 3.
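The CS and FD of a five-level window can be computed with a short sketch. The classification rule for the FD used here (strictly increasing, strictly decreasing, all equal, otherwise varying) and the label strings are assumptions for illustration; the fuzzy rules of Table 3 are not modeled.

```python
def combination_summation(levels):
    """Sum of five consecutive quantized levels, each in 0..4, so CS lies in 0..20."""
    if len(levels) != 5:
        raise ValueError("exactly five consecutive levels are expected")
    if any(not 0 <= level <= 4 for level in levels):
        raise ValueError("levels must lie between 0 and 4")
    return sum(levels)

def flow_direction(levels):
    """Classify the trend of a level window (assumed rule, see lead-in)."""
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    if all(d == 0 for d in diffs):
        return "stable"
    if all(d > 0 for d in diffs):
        return "rising"
    if all(d < 0 for d in diffs):
        return "falling"
    return "varying"
```

For the sequences mentioned in the text, 22222, 01234, and 43210 all give a CS of 10 but are classified as stable, rising, and falling, respectively, while 31224 gives a CS of 12 and is classified as varying.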
In this system, the fuzzy set "F" can be described [7,8] as

$$F = \{(\omega, m(\omega)) \mid \omega \in U\}$$

where ω is the combination-summation and flow-directions of the levels received by the remote observer, m(ω) is the membership function for the received level, and U is the universal set representing the set of all levels, as shown in Table 3. The membership function alerts the remote observer whenever the probability of fault occurrence becomes high (Figure 4). It can be formulated in terms of the risk $R_i$.
This method is very simple to implement and efficient in enhancing the WSN lifetime. Table 4 shows the comparison for wind speed data computed from Figure 2. The results confirm that the FFDS is able to predict the conditions of the wind farm accurately. As shown, it predicts normal operation of the farm, whereas MM is only able to detect extreme values of levels "0" and "3." This leads to false alarms for corrective measures due to inaccurate calculations. Thus, it can be concluded that the FTS method is unbiased toward the larger values in the datasets and hence provides better accuracy of monitored parameters. Table 5 provides the details of the simulation parameters used in the study. The sink node is located at the farthest point in the field. Figure 5 shows the round in which all the nodes in the area become dead (network lifetime).

Simulations and discussion
The FTS method gives an accurate view of the parameter values in real time, and the threshold selection does not show any bias toward a particular value, which is confirmed by Table 4. Also, as observed from Figure 5, the lifetime of the WSN is increased by nearly 10 times with a packet size of 23 bits. These observations indicate that the flexible threshold selection method improves the WSN lifetime by increasing energy savings with respect to earlier methods. Moreover, it is suitable for automated monitoring for all area sizes, for large numbers of nodes, and when the amount of information to be transmitted is large.

Conclusions
This chapter discusses the flexible threshold selection method for efficient environment monitoring of the offshore wind farm. It uses the degree of similarity between the previous and the current datasets for calculating a geometric mean-based flexible threshold, which does not get biased by extreme values in the datasets. The method is compared with the static mean method of threshold selection, and the performance is seen to be enhanced.
The automated fault detection method is also presented in this chapter. This is a simple method that uses small integer values for indicating faulty conditions of the wind farm in real time. The method uses a fuzzy inference system that takes integer values as input and gives output in the form of a fault status prediction for the farm. Based on these predictions, the system suggests corrective or preventive measures. The method is shown to be very accurate in predicting the fault condition based on sensed parameter values. In addition, this method allows the size of the transmitted data packets to be reduced to 23 bits, which helps in increasing the overall network lifetime of the WSN deployed in the wind farm.
Figure 5. Round in which all the nodes become dead [21].
Fault Diagnosis and Detection 324