Recurrent Self-Organizing Map for Severe Weather Patterns Recognition

Weather pattern recognition is very important for improving forecasting skill for severe storm conditions over a given area of the Earth. Severe weather can damage electric and telecommunication systems, besides causing material losses and even loss of life (Cooray et al., 2007; Lo Piparo, 2010; Santos et al., 2011). For the electrical sector in particular, it is strategic to recognize weather patterns that may help predict weather-caused damage. Severe weather forecasting is crucial to reduce permanent damage to system equipment and outages in transmission or distribution lines.


Introduction
This study aimed to evaluate the applicability of temporal extensions of the Self-Organizing Map (Kohonen, 1990, 2001) to severe weather pattern recognition over the eastern Amazon region, which may be used to improve weather forecasting and to mitigate the associated risks and damages. A large part of this region is located at low latitudes, where severe weather is usually associated with the formation of peculiar meteorological systems that generate a large amount of local rainfall and a high number of lightning occurrences. These systems are noted for their intense convective activity (Jayaratne, 2008; Williams, 2008).
Convective index pattern recognition has been studied by means of neural networks to determine the best predictors among the sounding-based indices for thunderstorm prediction and intensity classification (Manzato, 2007). The model was tested for conditions in northern Italy (Manzato, 2008). Statistical regression methods have also been applied to radiosonde and lightning observation data obtained over areas of the Florida Peninsula in the U.S.A. (Shafer & Fuelberg, 2006).
These contributions have shown that such applications should be pursued in order to find the best statistical prediction tools. Moreover, the skill achieved depends largely on the hour of the sounding and on the local climatic conditions. So far, few studies have been carried out for the extremely moist tropical conditions prevailing over the Amazon region, where the data for the case studies analyzed in this chapter were obtained.
In this context, convective pattern recognition for the Amazon region may be used in local weather forecasting. These forecasts support decision-making on preventive actions to avoid further damage to the electrical system. Outages in this system lead to productivity and information losses in industrial production processes, which contribute negatively to the electric power quality indices (Rakov & Uman, 2005).
This study sought to recognize patterns of severe weather indices, starting from an atmospheric sounding database. Atmospheric instability may be inferred from the atmospheric profiling data provided by radiosondes. The stability indices drawn from observed atmospheric conditions have been used to warn people of potential losses (Peppier, 1988). Thus, this work analyzed the capacity of the Self-Organizing Map (SOM) and two of its temporal extensions, the Temporal Kohonen Map and the Recurrent Self-Organizing Map (Chappell & Taylor, 1993; Koskela et al., 1998a, 1998b; Varsta et al., 2000; Varsta et al., 2001), for clustering and classification of atmospheric sounding patterns, in order to contribute to weather studies over the Brazilian Amazon. This type of neural network was chosen because it uses only the input parameters, which makes it well suited to problems where the patterns are unknown.
There are other temporal extensions of the SOM, such as the Recursive SOM (RecSOM) (Hammer et al., 2004; Voegtlin, 2002), the Self-Organizing Map for Structured Data (SOMSD) (Hagenbuchner et al., 2003) and the Merge Self-Organizing Map (MSOM) (Strickert & Hammer, 2005), all of them global context algorithms. The option in this work was to apply local context algorithms, leaving the application of global context algorithms in this knowledge area to future studies. Recent studies on the TKM and RSOM networks should also be mentioned (Cherif et al., 2011; Huang & Wu, 2010; Ni & Yin, 2009).
In summary, the usefulness of these recurrent neural networks for severe weather pattern recognition was evaluated using the original SOM algorithm and its extensions TKM and RSOM, stability indices data (Peppier, 1988), multivariate statistical techniques (principal component analysis and k-means), the confusion matrix (Han & Kamber, 2006) and receiver operating characteristics (ROC) analysis (Fawcett, 2006).
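In the SOM, the winner neuron for an input vector $x(t)$ is the one with the smallest quantization error, according to equation 1:

$$b(t) = \arg\min_i \lVert x(t) - w_i(t) \rVert \qquad (1)$$

Where:
• $b(t)$ is the index (position) of the winner neuron, at time $t$.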
The neurons of the SOM cooperate to receive future incoming stimuli in an organized manner around the winner neuron. The winner neuron is the center of a topological neighbourhood in which neurons help each other to receive input signals over the iterations of network training. Thus, once the winner neuron is found, its weights are adjusted to increase their similarity with the input vector, and the same is done for the weights of its neighbours, by the update rule of equation 2:

$$w_i(t+1) = w_i(t) + \eta(t)\, h_{ib}(t)\, [x(t) - w_i(t)] \qquad (2)$$

Where:
• $x(t)$ is the input vector;
• $w_i(t)$ is the weight vector of neuron $i$;
• $\eta(t)$ is the learning rate;
• $h_{ib}(t)$ is the neighbourhood function between neuron $i$ and the winner neuron $b$.

Usually, the learning rate $\eta(t)$ is defined by equation 3:

$$\eta(t) = \eta_0 \exp\left(-\frac{t}{\tau_1}\right) \qquad (3)$$

Where:
• $t$ is the number of iterations;
• $\eta_0$ is the initial value of the learning rate (value between 0 and 1);
• $\tau_1$ is the time constant.
The neural network gradually loses its ability to learn over time, so that new data cannot drastically change the knowledge consolidated over many iterations. The time constant influences the network learning as follows: a high $\tau_1$ value produces a long period of intensive learning.
The neighbourhood function in a SOM reproduces, in a similar way, the interactions of biological neurons, which stimulate their neighbours less and less as the lateral distance between them increases. In the SOM, this feature is reproduced by the parameter $h_{ib}(t)$, which determines how much each neuron is readjusted to receive future input stimuli. The largest adjustments are applied to the winner neuron and its closest neighbours, and smaller ones to the neurons further from the winner, because this parameter decreases with increasing lateral distance. The Gaussian function is normally used to represent the rate of cooperation between the neurons, by equation 4:

$$h_{ib}(t) = \exp\left(-\frac{l_{ib}^2}{2\sigma^2(t)}\right) \qquad (4)$$

Where:
• $l_{ib}$ is the lateral distance between neurons $i$ and $b$;
• $\sigma(t)$ is the effective width of the topological neighbourhood.
As the effective width of the topological neighbourhood diminishes with time, increasingly specialized network regions are built for certain input patterns. Over the course of the iterations the radius of the neighbourhood should become smaller, which implies lower $h_{ib}(t)$ values over time and results in a restricted and specialized neighbourhood. For this, the exponential function is usually used, according to equation 5:

$$\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\tau_1}\right) \qquad (5)$$

Where:
• $\sigma_0$ is the initial value of the effective width;
• $\tau_1$ is a time constant.
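The SOM training iteration described by equations 1 to 5 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the parameter names (`eta0`, `tau_eta`, `sigma0`, `tau_sigma`) and the tiny 2 x 2 grid with its numerical values are hypothetical, and the two time constants are given separate names here so that they can be tuned independently.

```python
import math

def winner(weights, x):
    """Equation 1: index of the neuron whose weight vector is closest to x."""
    def dist2(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))

def decay(initial, t, time_constant):
    """Equations 3 and 5: exponential decay of the learning rate or of the width."""
    return initial * math.exp(-t / time_constant)

def neighbourhood(pos_i, pos_b, sigma):
    """Equation 4: Gaussian of the lateral (grid) distance between neurons i and b."""
    l2 = sum((a - b) ** 2 for a, b in zip(pos_i, pos_b))
    return math.exp(-l2 / (2.0 * sigma ** 2))

def som_step(weights, positions, x, t,
             eta0=0.5, tau_eta=1000.0, sigma0=2.0, tau_sigma=1000.0):
    """One training iteration (equation 2); updates weights in place, returns the winner."""
    b = winner(weights, x)
    eta = decay(eta0, t, tau_eta)        # learning rate at iteration t
    sigma = decay(sigma0, t, tau_sigma)  # neighbourhood width at iteration t
    for i, w in enumerate(weights):
        h = neighbourhood(positions[i], positions[b], sigma)
        weights[i] = [wi + eta * h * (xi - wi) for xi, wi in zip(x, w)]
    return b

# Hypothetical 2 x 2 grid of neurons with two-dimensional weight vectors
positions = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
b = som_step(weights, positions, x=[0.9, 0.8], t=0)
print(b, weights[b])
```

In this example the neuron at grid position (1, 1) wins and its weights move halfway towards the input, while the other neurons move by smaller amounts that depend on their grid distance to the winner.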

Temporal Kohonen Map (TKM)
The SOM was originally designed for static data processing. For pattern recognition in dynamic data, it becomes necessary to include the temporal dimension in the algorithm. A pioneer algorithm in this adaptation was the Temporal Kohonen Map (TKM) (Chappell & Taylor, 1993). It introduces temporal processing using the same update rule as the original SOM, changing only the way the winner neuron is chosen. It uses the activation history of the neurons, by equation 6:

$$V_i(t) = d\, V_i(t-1) - \frac{1}{2} \lVert x(t) - w_i(t) \rVert^2 \qquad (6)$$

Where:
• $V_i(t)$ is the neuron activation, at time $t$;
• $d$ is a time constant (value between 0 and 1).

A TKM algorithm flow diagram is displayed in Figure 1. The current activation of the neuron depends on its previous activation. In the original SOM, at each new iteration, the value 1 is applied to the winner neuron and the value 0 to the other neurons, which creates an abrupt change in the neuron activation. In the TKM the activation value changes smoothly (leaky integrator), because it uses the previous activation value, as shown in equation 6. In the TKM algorithm, the neuron with the highest activation is considered the winner neuron, according to equation 7:

$$b(t) = \arg\max_i V_i(t) \qquad (7)$$

After choosing the winner neuron, the TKM network performs operations identical to those of the original SOM.
The basic differences between the TKM and SOM networks are:
• To determine the winner neuron, the TKM must calculate and record the activation $V_i(t)$, while the SOM only needs to calculate the quantization error;
• The winner neuron in the TKM is the one with the greatest activation $V_i(t)$, while in the SOM it is the one with the smallest quantization error $\lVert x(t) - w_i(t) \rVert$.
Interestingly, for d=0 the TKM network becomes equivalent to the SOM network used for static data (Salhi et al., 2009), because the activation then depends only on the current quantization error.
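The effect of the time constant d on the choice of the winner can be illustrated with a short Python sketch. It assumes the usual leaky-integrator form of the TKM activation, V_i(t) = d V_i(t-1) - 0.5 ||x(t) - w_i(t)||^2, with the winner being the neuron of highest activation. The two one-dimensional neurons, the input sequence and the function names are hypothetical values chosen only for illustration.

```python
def tkm_activations(prev_v, weights, x, d):
    """Leaky-integrator activation: V_i(t) = d * V_i(t-1) - 0.5 * ||x(t) - w_i(t)||^2."""
    new_v = []
    for v, w in zip(prev_v, weights):
        err2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
        new_v.append(d * v - 0.5 * err2)
    return new_v

def tkm_winner(v):
    """The winner is the neuron with the highest activation."""
    return max(range(len(v)), key=lambda i: v[i])

# Hypothetical map with two neurons and a short input sequence
weights = [[0.0], [1.0]]
sequence = [[0.0], [0.0], [0.9]]

def run(d):
    """Winner index at each time step for a given time constant d (weights kept fixed)."""
    v = [0.0] * len(weights)
    winners = []
    for x in sequence:
        v = tkm_activations(v, weights, x, d)
        winners.append(tkm_winner(v))
    return winners

print(run(0.0))  # no memory: behaves like the static SOM winner selection
print(run(0.9))  # strong memory of the previous inputs
```

With d = 0 the last input (0.9) selects neuron 1, exactly as the static SOM would. With d = 0.9 neuron 0 still wins at the last step, because neuron 1 carries the penalty accumulated over the two earlier inputs.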

Recurrent self-organizing map (RSOM)
Another algorithm that introduced temporal processing to the SOM was the Recurrent Self-Organizing Map (RSOM), which uses a new form of winner neuron selection and a new weight update rule (Koskela et al., 1998a, 1998b; Varsta et al., 2000; Varsta et al., 2001). This algorithm moved the leaky integrator from the unit outputs into the inputs. The RSOM stores information in the map units (difference vectors) that takes the past input vectors into account, by equation 8:

$$y_i(t) = (1 - \alpha)\, y_i(t-1) + \alpha\, [x(t) - w_i(t)] \qquad (8)$$

Where:
• $y_i(t)$ is the recursive difference vector of neuron $i$, at time $t$;
• $\alpha$ is the leaking coefficient (value between 0 and 1).
Taking the term $x(t) - w_i(t)$ as the quantization error, the winner neuron is the one with the smallest recursive difference, i.e., the smallest weighted sum of the present and past quantization errors, according to equation 9:

$$b(t) = \arg\min_i \lVert y_i(t) \rVert \qquad (9)$$

In the RSOM the weights are updated according to equation 10:

$$w_i(t+1) = w_i(t) + \eta(t)\, h_{ib}(t)\, y_i(t) \qquad (10)$$

Where:
• $\eta(t)$ is the learning rate;
• $h_{ib}(t)$ is the neighbourhood function.

Thus, the RSOM takes the past inputs into account and also starts to remember space-time patterns explicitly.
An RSOM algorithm flow diagram is shown in Figure 2.

Fig. 2. RSOM algorithm flow diagram
The basic differences between the RSOM and SOM networks are:
• To determine the winner neuron, the RSOM must calculate and record the recursive difference $y_i(t)$, while in the SOM the choice criterion is the quantization error;
• The winner neuron in the RSOM is the one with the smallest recursive difference $y_i(t)$, while in the SOM it is the one with the smallest quantization error.
Note that if $\alpha = 1$ the RSOM network becomes identical to a SOM network (Salhi et al., 2009). Angelovič (2005) describes several advantages of using the RSOM for prediction systems. The first is its low computational complexity, in contrast to global models. The second is unsupervised learning, which allows models to be built from the data with only a little a priori knowledge.
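One RSOM iteration can be sketched as follows, assuming the usual formulation in which the recursive difference is y_i(t) = (1 - alpha) y_i(t-1) + alpha (x(t) - w_i(t)), the winner minimizes the norm of y_i(t), and the weights move along y_i(t) scaled by the learning rate and a Gaussian neighbourhood. The names `alpha`, `eta` and `sigma`, and the three-neuron one-dimensional map, are hypothetical choices made for this illustration only.

```python
import math

def rsom_step(weights, y, positions, x, alpha, eta, sigma):
    """One RSOM iteration; updates y and weights in place and returns the winner index."""
    # Leaky integration of the difference vectors (recursive difference)
    for i, w in enumerate(weights):
        y[i] = [(1.0 - alpha) * yi + alpha * (xi - wi)
                for yi, xi, wi in zip(y[i], x, w)]
    # Winner: neuron with the smallest norm of the recursive difference
    b = min(range(len(y)), key=lambda i: sum(c * c for c in y[i]))
    # Weight update along y_i, weighted by a Gaussian neighbourhood around b
    for i, w in enumerate(weights):
        l2 = sum((p - q) ** 2 for p, q in zip(positions[i], positions[b]))
        h = math.exp(-l2 / (2.0 * sigma ** 2))
        weights[i] = [wi + eta * h * yi for wi, yi in zip(w, y[i])]
    return b

# Hypothetical one-dimensional map with three neurons
positions = [(0,), (1,), (2,)]
weights = [[0.0], [0.5], [1.0]]
y = [[0.0], [0.0], [0.0]]

# With alpha = 1 the difference vector is just x - w_i, so the step matches a SOM step
b = rsom_step(weights, y, positions, x=[0.9], alpha=1.0, eta=0.5, sigma=1.0)
print(b, weights)
```

Lower values of `alpha` give more weight to the accumulated past differences, so the same input can select a different winner depending on the sequence that preceded it.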

Materials and methods
This section discusses the study area, the data pre-processing and the training of the models.

Study area and data pre-processing
The study used sounding data from the weather station designated SBBE, number 82193 (Belem airport), for the period 2003 to 2010. The data were obtained from the University of Wyoming website. Figure 3 shows the station location, in the eastern Amazon region, with its geographic coordinates (latitude 1.38 S, longitude 48.48 W).

Data characteristics
The sounding data are obtained by radiosondes carried by meteorological balloons. A radiosonde can measure various atmospheric parameters, such as atmospheric pressure, temperature, dewpoint temperature and relative humidity, among others, at various atmospheric levels. These parameters are used to calculate sounding indices that seek to describe the state of the atmosphere at a given time. Figure 4 shows an example of sounding indices collected from a radiosonde launched on January 1, 2010 at 12 h UTC.

Fig. 3. SBBE station location (Belem airport).
Several indicators have been developed to evaluate the atmospheric static stability, which is used for thunderstorm forecasting (Peppier, 1988). Some indicators take as instability factors the temperature difference and the humidity difference between two pressure levels, while others add to these factors the characteristics of the wind (speed and direction) at the same pressure levels. There are also indices based on the energy requirements for the occurrence of convective phenomena. Some indices and parameters used for thunderstorm forecasting are:

Data selection
A selection algorithm was used to identify all 24 available atmospheric indices from radiosoundings performed at 12 h UTC (9 h local time) in the period analyzed. Subsequently, the indices calculated with virtual temperature were eliminated, leaving 18 indices.
After normalization of these 18 indices by the standard deviation, principal component analysis was used to reduce the number of variables. Among the 18 principal components, the first three represented about 70% of the total variance. Four variables related to severe weather conditions had considerable coefficients in the linear combinations that define these principal components, namely: SWEAT index (SWET), Convective Available Potential Energy (CAPE), Level of Free Convection (LFCT) and Precipitable Water (PWAT). Therefore, these four variables were defined as the components of the input vectors of the SOM, TKM and RSOM networks. Figure 5 shows the variance explained by the principal components, up to the ninth principal component.

Fig. 5. Variance explained for principal components
The SWEAT index (or Severe Weather Threat Index) uses several variables (dewpoint, wind speed and direction, among others) to determine the likelihood of severe weather. The Convective Available Potential Energy (CAPE) is the integral of the positive area on a Skew-T sounding diagram. It exists when the difference between the equivalent potential temperature of the air parcel and the saturated equivalent potential temperature of the environment is positive. This means that the pseudo-adiabat of the displaced air parcel is warmer than the environment (unstable condition). The Level of Free Convection (LFCT) is the lower boundary of the CAPE region. At this level the temperature of a lifted air parcel becomes equal to that of the environment. Once an air parcel is lifted to the LFCT it will rise all the way to the top of the CAPE region. The Precipitable Water (or Precipitable Water Vapor) is a parameter that gives the amount of moisture in the troposphere.

Data cleansing
The input vectors contained four variables: SWEAT index (SWET), Convective Available Potential Energy (CAPE), Level of Free Convection (LFCT) and Precipitable Water (PWAT). The vectors containing missing data from one or more variables were discarded. At the end of the cleansing process, a total of 1774 examples were obtained.
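The cleansing rule described above, discarding every vector with at least one missing variable, can be sketched as follows. The record layout (one dictionary per sounding, with `None` or NaN marking a missing value) and the numerical values are hypothetical; only the four variable names come from the text.

```python
def drop_incomplete(records, fields=("SWET", "CAPE", "LFCT", "PWAT")):
    """Keep only the records in which every field is present and is a valid number."""
    clean = []
    for rec in records:
        values = [rec.get(f) for f in fields]
        # v == v is False only for NaN, so this rejects None, strings and NaN
        if all(isinstance(v, (int, float)) and v == v for v in values):
            clean.append([float(v) for v in values])
    return clean

# Hypothetical soundings: the second lacks LFCT and the third has a NaN CAPE
records = [
    {"SWET": 210.0, "CAPE": 1500.0, "LFCT": 900.0, "PWAT": 55.0},
    {"SWET": 190.0, "CAPE": 300.0, "LFCT": None, "PWAT": 48.0},
    {"SWET": 230.0, "CAPE": float("nan"), "LFCT": 870.0, "PWAT": 60.0},
]
vectors = drop_incomplete(records)
print(len(vectors), vectors)
```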

Data normalization
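To reduce the differences in magnitude among the values of the input vectors, min-max normalization was applied to each variable, according to equation 11:

$$x' = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (11)$$

Where:
• $x$ is the original value of the variable and $x'$ is its normalized value;
• $x_{min}$ and $x_{max}$ are the minimum and maximum values of the variable in the data set.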
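A minimal sketch of min-max normalization applied column by column is given below, assuming the common variant that maps each variable to the range 0 to 1. The function name and the sample values are hypothetical, and the handling of a constant column (mapped to 0.0) is a choice made here to avoid a division by zero.

```python
def min_max_normalize(vectors):
    """Rescale each column of a list of equal-length vectors to the range [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in vectors]

# Hypothetical values for two variables with very different magnitudes
data = [[200.0, 1000.0], [250.0, 3000.0], [300.0, 2000.0]]
print(min_max_normalize(data))
```

After this step all variables share the same scale, so no single index dominates the Euclidean distances used by the networks.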

Clusters formation for evaluation of the models
To evaluate the applicability of the SOM and two of its temporal extensions, the Temporal Kohonen Map (TKM) and the Recurrent Self-Organizing Map (RSOM), to the recognition of weather patterns related to atmospheric instability factors, clusters were built using the K-means technique. It generated three clusters, containing 697, 484 and 593 examples for clusters 1, 2 and 3, respectively. Figure 6 shows the characteristics of the three clusters according to the four variables analyzed.
Among the characteristics that differentiate the clusters, cluster 1 has CAPE and PWAT concentrated at low values, while cluster 2 has CAPE concentrated at low values but PWAT at high values. In cluster 3 both CAPE and PWAT are concentrated at high values. Another distinctive feature is the gradual rise of the LFCT median value from cluster 1 to cluster 3. Cluster 1 also has a lower SWET median value than clusters 2 and 3.
Fig. 6. Characteristics of the clusters
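The K-means step can be illustrated with a short sketch of Lloyd's algorithm. This is not the procedure or software used in the study: the initial centroids are passed in explicitly to keep the example deterministic, and the six two-dimensional points, loosely imitating normalized (CAPE, PWAT) pairs, are hypothetical.

```python
def kmeans(points, centroids, iterations=100):
    """Lloyd's algorithm: returns (labels, centroids) after convergence or the iteration cap."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical normalized (CAPE, PWAT) pairs: low/low, low/high and high/high groups
points = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.9], [0.2, 0.8], [0.9, 0.9], [0.8, 0.8]]
labels, centroids = kmeans(points, centroids=[points[0], points[2], points[4]])
print(labels, centroids)
```

The resulting labels can then serve as the reference classes against which the trained maps are evaluated.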

Training and evaluation of the models
For the performance analysis of the networks (SOM, TKM and RSOM), three maps were constructed for each network type, with grids of 5 x 5, 7 x 7 and 9 x 9 units, giving nine maps in total. A ROC analysis was then performed. The ROC graph is a technique for visualizing and evaluating classifiers based on their performance (Fawcett, 2006). A ROC graph allows the relative tradeoffs of a discrete classifier (one whose output is only a class label) to be identified. In a ROC graph the true positive rate (tp rate) of a classifier is plotted on the Y axis, while the false positive rate (fp rate) is plotted on the X axis. Fig. 7 shows a ROC graph with three classifiers labeled A through C.
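For a discrete classifier with three labels, each label yields one point in ROC space when it is treated as the positive class against the other two. The sketch below derives those points and the global accuracy from a confusion matrix. The matrix values are hypothetical and are not results from this study, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption of this example.

```python
def roc_points(confusion):
    """confusion[i][j] = number of examples of true class i predicted as class j.
    Returns {class index: (fp_rate, tp_rate)} using a one-vs-rest reading."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    points = {}
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n)) - tp
        tn = total - tp - fn - fp
        points[c] = (fp / (fp + tn), tp / (tp + fn))
    return points

def global_accuracy(confusion):
    """Fraction of examples on the main diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total

# Hypothetical 3-class confusion matrix (10 examples per true class)
confusion = [[8, 1, 1],
             [2, 7, 1],
             [0, 2, 8]]
print(roc_points(confusion))
print(global_accuracy(confusion))
```

A point closer to the upper left corner (high tp rate, low fp rate) indicates a better classifier for that label. The sketch assumes every class has at least one positive and one negative example, otherwise the rates are undefined.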

Results
This section presents the results of the assessment of the studied networks, Self-Organizing Map (SOM), Temporal Kohonen Map (TKM) and Recurrent Self-Organizing Map (RSOM), for severe weather pattern classification. Table 1 shows the confusion matrices and the global accuracy of the neural networks studied. The TKM and RSOM classifiers performed better than the original SOM and, between the recurrent networks, the RSOM showed the best results.
Fig. 8 exhibits the ROC graph for the SOM, TKM and RSOM when the grids of the neural networks are 5 x 5. The RSOM presents a larger tp rate and a smaller fp rate for labels 2 and 3. For label 1, the SOM network was the most liberal and the RSOM the most conservative. Therefore, for this grid, the results indicate that the RSOM classifier performs better than the other networks analyzed in this work.
Figure 9 displays the ROC graph of the analyzed models for the 7x7 grid. The RSOM network has a larger tp rate and a smaller fp rate for labels 1 and 2. For label 3 the SOM and TKM networks showed similar liberal characteristics, while the RSOM network showed a more conservative behavior. For these dimensions the results also indicate a better performance of the RSOM classifier compared with the SOM and TKM networks.
The ROC graph for the SOM, TKM and RSOM in the 9x9 grid is presented in Figure 10. For this grid, the RSOM network has a larger tp rate and a smaller fp rate for all three labels considered, which further confirms that the RSOM classifier performed best among all the networks analyzed.
Fig. 10. ROC graph for the SOM, TKM and RSOM in 9x9 grid
Table 2 shows a comparison among the U-Matrices of the networks studied.
The U-Matrices are representations of the self-organizing networks in which the Euclidean distance between the codebook vectors of neighbouring neurons is represented in a two-dimensional color scale image.
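A simplified U-matrix can be computed as sketched below. Several U-matrix variants exist; this one assigns to each unit the mean Euclidean distance between its codebook vector and those of its horizontal and vertical grid neighbours, which is an assumption of this example rather than the exact variant used for Table 2. The function name and the tiny 1 x 3 codebook are hypothetical.

```python
import math

def u_matrix(codebook):
    """codebook[r][c] is the weight vector of the unit at grid row r, column c.
    Returns a grid with the mean distance from each unit to its 4-neighbours."""
    rows, cols = len(codebook), len(codebook[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(math.dist(codebook[r][c], codebook[rr][cc]))
            u[r][c] = sum(dists) / len(dists)  # assumes a grid with at least two units
    return u

# Hypothetical 1 x 3 map: the first two units are similar, the third is far away
codebook = [[[0.0], [0.1], [1.0]]]
print(u_matrix(codebook))
```

High values mark borders between clusters and low values mark regions of similar units, which is what the color scale of a U-matrix image conveys.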

U-matrix of the SOM, TKM and RSOM networks
The RSOM network gave the clearest visualization among the networks studied, clearly distinguishing the three clusters in the data set used for training this neural network.
Legend: The color scale represents the Euclidean distance between the codebook vectors of neighbouring neurons.
Table 2. U-matrix of the SOM, TKM and RSOM networks
Table 3 compares the labeling of the neurons after the training process, using the activation frequency as the criterion. The RSOM network shows a higher degree of organization than the other networks. The labels used were blue for cluster 1, green for cluster 2 and red for cluster 3.

Legend: Blue for cluster 1. Green for cluster 2. Red for cluster 3.
Table 3. Labeling of the SOM, TKM and RSOM network neurons in different grids
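Labeling by activation frequency can be read as a majority vote: each neuron receives the cluster of the training examples for which it won most often. The sketch below follows that reading, which is an interpretation of the criterion named in the text. The function name, the input layout and the sample values are hypothetical.

```python
from collections import Counter

def label_neurons(winners, clusters, n_neurons):
    """winners[k] is the index of the neuron won by training example k and
    clusters[k] is the cluster of that example. Each neuron gets the cluster
    that activated it most often, or None if it never won."""
    counts = [Counter() for _ in range(n_neurons)]
    for b, c in zip(winners, clusters):
        counts[b][c] += 1
    return [cnt.most_common(1)[0][0] if cnt else None for cnt in counts]

# Hypothetical winners for six training examples on a map with four neurons
winners = [0, 0, 0, 1, 1, 2]
clusters = [1, 1, 2, 2, 2, 3]
print(label_neurons(winners, clusters, n_neurons=4))
```

A new sounding is then classified by finding its winner neuron and reading that neuron's label.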

Time constants variation of the TKM and RSOM classifiers
One difference between the SOM network and its temporal extensions TKM and RSOM is that the performance of the latter changes when the time constants vary. Sections 4.4.1 and 4.4.2 show the results when the time coefficients (d and α) vary in the range 0 to 1.

d variation
For different d values the TKM network presented different global accuracies, with lower values near the limits of the range 0 to 1. Table 4 shows the confusion matrices and Figure 11 shows the superposition of the global accuracies due to the d variation. For each TKM dimension studied the points were spaced at 0.25 intervals.
Figure 12 shows the ROC graph of the TKM model with 5x5 map units and d variation. It indicates that for lower values of d (d=0.10 and d=0.35) this classifier presented more conservative characteristics for labels 1 and 3, and the most liberal behaviour for label 2. On the other hand, for higher values of d (d=0.60 and d=0.85), there is in general a decrease of the tp rate and an increase of the fp rate for all labels. One may conclude, therefore, that the TKM performed better for the lower values of d.
Fig. 12. ROC graph for the TKM model in 5x5 grid with d variation
Figure 13 displays a ROC graph for the TKM model with 7x7 grid and d variation. In this case, the superior performance of this classifier for the lower values of d is even more evident. Indeed, its best performance was found for d=0.35 and the worst for d=0.85.
The results for the TKM model with 9x9 grid and d variation are displayed in Figure 14. In this case, the performance of the model for d=0.60 approaches the results obtained for lower values of d, such as d=0.10 and d=0.35. The same approximation was observed for d=0.85, even though it remains the worst performance case for the TKM model. Therefore, the conclusion was that the smaller values of d provided the best performance for cluster classification by this network type.

α variation
For different α values the RSOM network showed a significant variation in global accuracy. Table 5 shows the confusion matrices and Figure 15 shows the superposition of the global accuracies due to the α variation, for each RSOM dimension studied. Figure 16 shows a ROC graph for the RSOM model with 5x5 grid and α variation. In such cases the classifier becomes nearly ideal, with tp rate approaching 100% and fp rate near 0%. On the other hand, the performance of this classifier becomes very poor at the lowest extreme (α=0.10) when compared with the other α values. In summary, the RSOM network offered the best cluster classification performance, provided that intermediate α values are used.

Conclusion
This work aimed to evaluate the applicability of the local temporal extensions of the self-organizing map (Temporal Kohonen Map and Recurrent Self-Organizing Map) to severe weather pattern recognition. The study area was the eastern Amazon region, one of the areas of the Earth with a higher frequency and intensity of extreme weather events, especially in terms of lightning occurrences.
Its purpose was to contribute to the identification of a useful tool for weather studies, in order to reduce the damages associated with this natural phenomenon. The performance analysis of the recurrent neural networks (TKM and RSOM) in relation to the original SOM resulted in the following main conclusions:
• The global accuracy of the TKM and RSOM classifiers changes significantly depending on the choice of the time coefficients (d and α, respectively);
• The recurrent neural networks (TKM and RSOM), used as classifiers, performed better than the original SOM when their time coefficients were set to intermediate values within the range 0 to 1;
• The U-matrix of the RSOM network gave a better cluster visualization (in terms of separation) than those of the other networks studied, allowing a clear differentiation among the three clusters;
• The labeling of the neurons in the maps, after the training, was better defined for the RSOM network than for the other networks studied;
• Finally, after the ROC analysis, it was concluded that among the neural networks studied the RSOM had the best performance as a classifier, confirming its usefulness as a potential tool for studies related to severe weather pattern recognition.