Open access peer-reviewed chapter

Graph Mining Based SOM: A Tool to Analyze Economic Stability

By Marina Resta

Submitted: March 23rd 2012. Reviewed: July 2nd 2012. Published: November 21st 2012.

DOI: 10.5772/51240


1. Introduction

Living in times of Global Financial Crisis (GFC) has posed new challenges to researchers and financial policy makers, who are in search of tools to monitor or to prevent the occurrence of critical situations. This issue can be approached from various perspectives.

From the economic standpoint, two basic strands have emerged: a first group of contributions debated the central role of systemic risk in conditioning countries' financial fragility; a second one concerned the role (either positive or negative) of the financial sector in economic growth. Given their relevance to our work, we discuss each of them in more detail.

As regards the first aspect, several definitions of systemic risk have been proposed (see for instance [1], [2], [3] and [4]), but none of them is widely accepted. Nevertheless, we agree with the position of [5], who claimed that systemic risk can be identified by the presence of two distinct elements: an initial random shock, as the source of systemic impact, and a contagion mechanism (such as the interbank market or the payment system), which spreads the negative shock wave to other members of the system. Along this vein, a growing body of empirical research has already bloomed: [6] suggested a network approach to analyze the impact of liquidity shocks on financial systems; a similar approach was followed by [7] for the United Kingdom, by Boss [8] for Austria, and by [9] for Switzerland; more recently, Soramaki et al. (2012) developed a software platform, Financial Network Analysis (fna), also available in a free web version, that employs graph models for various purposes, including monitoring the spreading effects of financial contagion.

A second related point concerns the evaluation of how the financial sector can condition countries' economic growth. There is general agreement in the financial economics literature about the existence of a link between bankruptcies and the business cycle. However, the same does not apply when one is asked to identify the methods and the variables by which bankruptcies and the business cycle interact. Research has basically moved along four directions. A number of papers focused on the application of discriminant analysis to a set of accounting variables (see for instance [10], [11], [12] and [13]). A second group of papers (see, among others, [14]) employs the methodology initiated by [15], who used logistic regression models (logit) on macroeconomic variables. A third strand focuses on duration models, i.e. models that measure how long the economic system remains in a certain state. This is the line followed, for example, by [16] and [17]. Finally, there are plenty of (more or less) sophisticated econometric techniques aimed at estimating bankruptcies by means of macroeconomic variables. Interested readers may take a look at [18] and [19].

From the research streams discussed above, all dealing with crisis and financial (in)stability, we extract three discussion issues. First, our review highlighted that, in general, every period of crisis has a strong financial component. Second, the economic literature addressed the analysis mainly by means of either macroeconomic or accounting data. Finally, we want to focus on a methodological issue: quantitative papers generally studied the problem by means of econometric techniques; only over the past decade have soft computing methods (namely, graph models) become of some interest for economic researchers and policy makers. Starting from this point, we think that there is enough room to add something new in the following directions: (i) studying the emergence of instability by way of financial markets data; (ii) using a hybrid approach that combines graph models with non-linear dimension reduction techniques, in detail with Self Organizing Maps [20]. To this purpose, it helps to remember that Self Organizing Maps (SOM) are nowadays a landmark among soft computing techniques, with applications that span virtually all fields of knowledge. However, while the use of SOM in robotics, medical imaging and character recognition, to cite the most important examples, is documented by a substantial body of literature (interested readers may take a look at [21], [22] and [23]), economics and financial markets seem relatively less explored, with some notable exceptions (from the pioneering works of [24] and [25] to, more recently, [26] and [27]). Such a lack of financial applications is hard to justify, given the great potential of this kind of technique.

The rationale of this contribution is to offer some insights into the use of SOM to explore how financial markets organize during critical periods, i.e. deflation, recession and so on. Something similar has already been discussed in [28] and [29], who deal with the use of SOM as a support tool for Early Warning Systems (EWS), alerting the decision maker in case of critical economic situations. However, the present contribution goes one step further in several respects. The first element of innovation lies in the examined data. We studied markets characterized by different levels of (in)stability, but instead of using either financial or macroeconomic indicators, as is generally done in the literature, we employed historical time-series of price levels for every enterprise quoted in the related stock exchanges, and we then trained a SOM for each market. A second innovative item lies in the use of the SOM best matching units obtained in this way to build the corresponding Minimum Spanning Tree (MST). In this way we were able both to capture the cluster structure of every market and to analyze the impact of emerging patterns on the economic situation of the country. This was done both in a static way, i.e. by observing the situation with data referring to a fixed one-year period (from December 2010 to December 2011), and in a dynamic way, by comparing the MSTs obtained for each country with data extracted by means of a 300-day moving window over a time interval with an overall length of 3000 days (approximately ten years).

Our major findings may be summarised as follows: (i) using SOM we got an original representation of financial markets; (ii) by building the corresponding MST from the SOM winning nodes it was possible both to emphasize the relations among the various quoted enterprises, and to check for the emergence of critical patterns; (iii) we provided a global representation of countries' financial situation that generates information that can help policy makers to realize more efficient interventions in periods of higher instability.


2. Methodology

As stated in Section 1, we examined financial markets data by means of a hybrid technique that relies on the joint use of SOM and the graph formalism. To make this framework easier to understand, we recall some basic definitions and notational conventions for both tools.

2.1. Self Organizing Maps: Basic principles

A Self Organizing Map (SOM) is a single layer neural network, where neurons are set along an n-dimensional grid: typical applications assume a two-dimensional rectangular grid, but hexagonal as well as toroidal grids are also possible. Each neuron has as many components as the input patterns: mathematically this implies that both neurons and inputs are vectors embedded in the same space. Training a SOM requires a number of steps to be performed sequentially. For a generic input pattern x we have:

  1. to evaluate the distance between x and each neuron of the SOM;

  2. to select the neuron (node) with the smallest distance from x. We will call it winner neuron or Best Matching Unit (BMU);

  3. to correct the position of each node according to the results of Step 2., in order to preserve the network topology.

Steps 1-3 can be repeated once or more than once for each input pattern: a good stopping criterion generally consists in monitoring the so-called Quantization Error (QE), i.e. a weighted average over the Euclidean norms of the differences between the input vectors and the corresponding BMUs. When QE goes below a proper threshold level, say $10^{-2}$ or lower, it might be suitable to stop the procedure.

In this way, once the learning procedure is concluded, we get an organization of the SOM that takes into account how the input space is structured, and projects it into a lower dimensional space where closer nodes represent neighboring input patterns.
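Steps 1-3 and the stopping criterion can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this chapter: the linearly decaying learning rate, the Gaussian neighborhood, the initialization from randomly drawn input patterns, and all function and parameter names (`train_som`, `bmu_index`, `quantization_error`, `lr0`, `sigma0`) are assumptions made for illustration. The QE below is a plain, unweighted mean.

```python
import math
import random

def bmu_index(weights, x):
    """Steps 1-2: index of the neuron closest (Euclidean distance) to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def quantization_error(weights, data):
    """Unweighted mean distance between each input and its BMU."""
    return sum(math.dist(weights[bmu_index(weights, x)], x) for x in data) / len(data)

def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols rectangular SOM; returns (weights, grid coordinates)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: neurons start from randomly chosen input patterns.
    weights = [list(rng.choice(data)) for _ in grid]
    sigma0 = sigma0 or max(rows, cols) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            br, bc = grid[bmu_index(weights, x)]
            # Step 3: move the BMU and its grid neighbors towards x.
            for j, (r, c) in enumerate(grid):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[j] = [w + lr * h * (xi - w) for w, xi in zip(weights[j], x)]
            step += 1
    return weights, grid
```

In practice one would call `quantization_error` after each epoch and stop as soon as it falls below the chosen threshold.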

2.2. Graphs models: A brief review and some notational conventions

In order to understand how graph theory can be used in cluster analysis, it is worth reviewing some basic terminology.

From the mathematical point of view, a graph (network) G = (V,E) is perfectly identified by a (finite) set V, and a collection E ⊆ V ×V, of unordered pairs {u, v} of distinct elements from V. Each element of V is called a vertex (point, node), and each element of E is called an edge (line, link). Edges of the form (u,u), for some u∈ V, are called self-loops, but in practical applications they typically are not contained in a graph.

A sequence of connected vertices forms a path; the number n of vertices (i.e. the cardinality of V) defines the order of the graph and is denoted by |V| := n. In a similar way, the number m of edges (the cardinality of E) is called the size of the graph and is denoted by |E| := m. Finally, the number of neighbors of any vertex v ∈ V in the graph identifies its degree.

Moreover, the graph G will be claimed to be:

  • directed, if the edge set is composed of ordered vertex (node) pairs; undirected, if the edge set is composed of unordered vertex pairs;

  • simple, if it has no loops or multiple edges;

  • acyclic, if there is no possibility to loop back to any vertex; cyclic, if the contrary holds;

  • connected, if there is a path in G between any given pair of vertices, otherwise it is disconnected;

  • regular, if all the vertices of G have the same degree;

  • complete, if every two distinct vertices are joined by exactly one edge;

  • a path, if it consists of a single path;

  • bipartite, if the vertex set can be split into two sets in such a way that each edge of the graph joins a vertex in the first set to a vertex in the second;

  • a tree, if it is connected and it has no cycles. If G is a connected graph, the spanning tree in G will be a subgraph of G which includes every vertex of G and is also a tree. The minimum length spanning tree is called Minimum Spanning Tree (MST).

Our brief explanation highlights that the Minimum Spanning Tree is nothing but a particular graph with no cycles, where all nodes are connected and edges are selected in order to minimize the sum of distances.

Graph representation relies on building the adjacency matrix, i.e. the matrix that marks adjacent vertices with one and non-adjacent vertices with zero. Figure 1 provides an explanatory example.
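As a small illustration of these definitions, the following sketch builds the adjacency matrix of an undirected simple graph and reads the order, size and vertex degrees from it. The four-vertex graph and the helper name `adjacency_matrix` are hypothetical and are not the graph of Figure 1.

```python
def adjacency_matrix(n, edges):
    """Symmetric 0/1 matrix of an undirected simple graph on vertices 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1  # undirected: the matrix is symmetric
    return A

# Hypothetical example graph with four vertices and four edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
A = adjacency_matrix(4, edges)

order = len(A)                         # |V| = n
degrees = [sum(row) for row in A]      # degree of each vertex
size = sum(degrees) // 2               # |E| = m, each edge is counted twice
```

Each row sum gives the degree of the corresponding vertex, and the sum of all degrees is twice the number of edges.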

In a number of real world applications it is common to use the graph theory formalism, representing the problem data through an undirected graph. Each node is associated with a sample in the feature space, while each edge is associated with the distance between the nodes it connects under a suitably defined neighborhood relationship. A cluster is thus defined as a connected subgraph, obtained according to criteria peculiar to each specific algorithm. Algorithms based on this definition are capable of detecting clusters of various shapes and sizes, at least when they are well separated. Moreover, isolated samples form singleton clusters, which can be easily discarded as noise in cluster detection problems.
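The idea of a cluster as a connected subgraph can be sketched as follows. This is only an illustration under the assumption that the neighborhood relationship has already been turned into an edge list; the function name `connected_components` and the toy edge list are hypothetical.

```python
def connected_components(n, edges):
    """Return the clusters (sorted vertex lists) of an undirected graph."""
    neighbours = {v: set() for v in range(n)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first visit of everything reachable from `start`.
        stack, component = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for w in neighbours[v] - seen:
                seen.add(w)
                stack.append(w)
        clusters.append(sorted(component))
    return clusters

# Toy example: two clusters and one isolated sample (vertex 5).
clusters = connected_components(6, [(0, 1), (1, 2), (3, 4)])
noise = [c for c in clusters if len(c) == 1]  # singleton clusters
```

Here the isolated vertex ends up in a singleton cluster and can be filtered out as noise.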

Figure 1.

From left to right: the adjacency matrix (a) for an undirected graph, and the corresponding graph (b). The ones in the matrix indicate the existence of a connection among nodes, while zeroes mean no connection.

With this in mind, one can easily understand that coupling SOM (which satisfies topology preservation features) with graphs (which do not require any a priori assumption about the input space distribution) should result in a very powerful tool to analyze data domains.

2.3. A hybrid model combining SOM to MST

SOM achieves a two-dimensional representation of the input domain, while preserving the basic relations among neighboring patterns: points that are close in their original r-dimensional space (r>>2) are still close to each other in the SOM grid; in addition, they are projected into a space where relations can be easily visualized and understood. However, sometimes this is not enough.

Consider the problem of representing basic relations among quoted companies in a market (for example, the Italian market). Figure 2 shows the SOM once the relations among Italian quoted companies have been learned.

Here we have a SOM with an overall good performance in terms of quantization error (QE < $10^{-3}$), but the winner nodes are much closer than desired, which makes it difficult to understand the actual significance of their closeness.

Moving one step forward, we suggested a hybrid procedure that combines SOM and MST (see also [31]). The idea by itself is not entirely new: [32], for instance, suggested a variant of SOM where neighborhood relationships during the training stage were defined along the MST; [33] and, more recently, [34] applied a MST to SOMs to connect similar nodes with each other, thus visualizing related nodes on the map. In all cited cases this was done by calculating the squared difference between neighboring units on the trained map, and using this value to color the edge separating the units.

Figure 2.

SOM representation of Italian quoted companies.

In a similar manner, we applied a clustering procedure whose main steps can be summarized as follows:

  1. Define a SOM M (made of a number n of neurons w) over the input space and run it.

  2. For each input sample extract the corresponding BMU. We set:

     $B = \{ w : w \in M,\ w \text{ is a BMU} \}$    (1)

  3. Build the correlation matrix C among the nodes belonging to B.

  4. Use C as starting point to compute the MST. In particular, since C is symmetric, one can consider only the lower (L) or upper (U) triangular part of the matrix, and:

     i. Sort the elements of L (U) in decreasing order, thus moving from L to the list $L_{ord}$ (from U to $U_{ord}$).

     ii. Set the coordinates in C of the first element of $L_{ord}$ ($U_{ord}$) as those of the first two nodes of the MST.

     iii. For each element in $L_{ord}$ ($U_{ord}$), add the corresponding couple from C to the MST; if the graph is still acyclic (i.e. no loops are added to the MST), then keep the inserted link, otherwise discard it.

     iv. Repeat step iii. until all the elements in $L_{ord}$ ($U_{ord}$) have been examined, and then stop the procedure.

The result is a filtering of the available information that lets only the most significant patterns emerge.
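Steps 3 and 4 can be sketched in a few lines of plain Python. This is only one possible reading of the procedure, not the code used for the experiments: it assumes that the BMUs are available as a list of weight vectors with non-zero variance, uses the Pearson correlation for the matrix C, and detects cycles with a union-find structure (a Kruskal-style construction). The names `pearson` and `correlation_spanning_tree` are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length vectors (non-zero variance assumed)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_spanning_tree(bmus):
    """Return the tree as a list of (i, j, correlation) links between BMUs."""
    n = len(bmus)
    # Step 3 and step i: upper-triangular correlations, sorted in decreasing order.
    pairs = sorted(((pearson(bmus[i], bmus[j]), i, j)
                    for i, j in combinations(range(n), 2)), reverse=True)
    parent = list(range(n))

    def find(i):
        # Root of the component containing i (with path compression).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    # Steps ii-iv: keep a link only if it does not close a cycle.
    for c, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, c))
    return tree
```

For n BMUs the returned tree has n-1 links; sorting correlations in decreasing order is equivalent to sorting a correlation-based distance in increasing order.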

3. Experimental settings and results discussion

Our work aims to demonstrate how a fully data-driven approach can help to analyze complex financial situations in quite an intuitive way, thus making SOM-MST a very reliable tool for policy makers as well.

We performed both static and dynamic analyses, as we are going to explain. As starting point for the static analysis we selected a market, and for each quoted enterprise we took all available price levels (pl) from December 2010 to December 2011. In this way, for the generic i-th stock (i=1,…,N, where N is the overall number of quoted enterprises) we got the time-series $S^{(i)} = \{pv_k^{(i)}\}$ of length T-1, where:

$pv_k^{(i)} = \log pl_{k+1}^{(i)} - \log pl_k^{(i)}, \quad k = 1, \dots, T-1$    (2)

The transformation described in (2) turns price levels into price log-returns: this is a common practice in empirical financial studies to avoid any trend effect in data. The final result was a matrix Σ of dimensions N × (T-1), containing T-1 log-returns for each of the N quoted enterprises. As final step, we performed on Σ the procedure explained in Sec. 2.3, coupling SOM with MST.
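A minimal sketch of this preprocessing step is given below. It assumes the usual definition of log-returns (the logarithm of the ratio between consecutive price levels) and that all price series have the same length T; the function names and the sample prices are purely illustrative.

```python
import math

def log_returns(prices):
    """Turn T price levels into T-1 log-returns."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def returns_matrix(price_series):
    """Stack the log-returns of N stocks into an N x (T-1) matrix (list of rows)."""
    return [log_returns(prices) for prices in price_series]

# Illustrative prices for two hypothetical stocks observed on T = 3 days.
sigma = returns_matrix([[100.0, 110.0, 121.0], [50.0, 50.0, 25.0]])
```

Each row of `sigma` is the input pattern associated with one quoted enterprise.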

The dynamic procedure is similar to the static one, but instead of considering the last-year sample for each stock, we examined a number of fixed-length samples, going back in time (when possible) up to ten years. In practice, taking as starting point t=3000 the day Dec. 30 2011, we built for each stock the block B1 going from t=2701 to t=3000; the block B2 with data from t=2401 to t=2700; and so on, down to the block B10 that goes from t=1 to t=300. Hence, instead of having a single block of data to analyze, in the dynamic procedure we can monitor the situation of the country with different sets of data. Moreover, taking advantage of the network representation, one can look at graph statistics for every year and compare them over the ten-year time horizon.

We applied our methodology to the German and Spanish markets. Our choice has a precise motivation: we examined countries characterized by different levels of (in)stability. At the end of 2011 the Spanish financial equilibrium seemed heavily compromised, while Germany still maintained its leading role in Europe.
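The block construction can be sketched as follows. The helper `backward_blocks` is hypothetical; it simply cuts non-overlapping windows of 300 observations starting from the most recent one and stops when the series is too short to fill a further block.

```python
def backward_blocks(series, width=300, n_blocks=10):
    """Split a series into blocks B1, B2, ... going back in time from its end."""
    blocks = []
    for b in range(n_blocks):
        end = len(series) - b * width
        start = end - width
        if start < 0:      # not enough history for another full block
            break
        blocks.append(series[start:end])
    return blocks

# With t = 1, ..., 3000 the first block covers t = 2701..3000
# and the tenth block covers t = 1..300.
time_index = list(range(1, 3001))
blocks = backward_blocks(time_index)
```

The same slicing can be applied to every row of the log-returns matrix to obtain one input matrix per block.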

4. Results and discussion

Before discussing the various cases separately, we spend a few words on some common features of our simulation study.

Starting from the static case, we examined German, Italian and Spanish markets from 30 December 2010 to 30 December 2011. For each market we considered data of quoted enterprises, transforming them according to the formula given in (2). Table 1 highlights some basic details concerning the markets we have considered.


Table 1.

Markets main features.

The column Country reports the name of the countries whose assets have been examined, while Idx indicates the name of the national market index that has been employed to pick up quoted stocks; NrS is the number of stocks we included for every market; finally, MD gives the input matrix dimensions in our simulation study. In particular, we referred to the CDAX (Composite Deutscher Aktienindex) index for the German market, and to the IGBM (Index General de Bolsa Madrid) index in the case of Spain. As a straightforward observation, one can argue that the overall number of quoted enterprises in those markets should be higher than the one reported in the third column of Table 1. However, for the sake of comparison among graphs, we needed to eliminate from the markets those stocks for which it was not possible to go back in time far enough (at least 600 days, approximately corresponding to two and a half years of market trading).

4.1. The case of Germany

Applying our procedure led us to obtain the skeleton framework of the German stock exchange that is shown in Figure 3.

Our procedure found 14 clusters. At first glance the clusters seem to be natural, in the sense discussed in [35], i.e.:

  • each node is member of exactly one group;

  • each node has many edges to other members of its group;

  • each node has few or even no edges to nodes of other groups.

Figure 3.

German Market topology, as resulting in the static case. Natural clusters are highlighted with different colors.

This cluster structure is partly due to the filtering procedure operated on SOM BMUs after the learning stage, but the resulting organization makes sense also if we look at the statistical features of the clusters (Table 2) as well as at their composition, by industrial reference sector (Table 3).

CL. ID.   mu   std   sk   ku   SR

Table 2.

Statistical properties of the network of German stocks in the period December 2010-December 2011.

We examined basic cluster statistics: mean (mu), standard deviation (std), skewness (sk), and kurtosis (ku). We also evaluated the Sharpe Ratio of every cluster (SR):

$SR = \dfrac{mu - r_f}{std}$    (3)

where $r_f$ (rf) is the risk-free rate, and mu, std are as described above. According to the financial literature, SR is a profitability index that measures how attractive a risky investment is with respect to a riskless investment with return equal to rf: the ratio, in fact, compares the excess return (numerator) with the additional risk the investor takes on when moving money from the riskless asset (whose standard deviation is zero) to the riskier one (denominator), whose standard deviation is greater than zero. The beauty of SR lies in the fact that it can be easily interpreted, giving an idea of the general attractiveness/profitability of the companies included in each group; at the same time, if we assume rf=0, the index becomes the reciprocal of the coefficient of variation, and it also has a (quite) trivial statistical interpretation.
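The cluster statistics can be computed as in the sketch below. It assumes population (biased) moments, non-excess kurtosis, and the Sharpe Ratio taken as the excess of the mean over rf divided by the standard deviation; the chapter does not state which estimators were used, so these choices, as well as the name `cluster_statistics`, are illustrative only.

```python
import math

def cluster_statistics(returns, rf=0.0):
    """Mean, standard deviation, skewness, kurtosis and Sharpe Ratio of a sample."""
    n = len(returns)
    mu = sum(returns) / n
    # Central moments of order 2, 3 and 4 (population versions).
    m2 = sum((r - mu) ** 2 for r in returns) / n
    m3 = sum((r - mu) ** 3 for r in returns) / n
    m4 = sum((r - mu) ** 4 for r in returns) / n
    std = math.sqrt(m2)
    return {
        "mu": mu,
        "std": std,
        "sk": m3 / m2 ** 1.5,
        "ku": m4 / m2 ** 2,
        "SR": (mu - rf) / std,   # with rf = 0 this is 1 / coefficient of variation
    }

stats = cluster_statistics([0.01, 0.02, 0.03])
```

With rf=0 the value of `stats["SR"]` coincides with mu/std, i.e. the reciprocal of the coefficient of variation.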

The analysis of the results showed that all clusters have positive mean, relatively low variability, and good profitability (with the exception of CL01 and CL03, whose Sharpe Ratio is the lowest over all examined cases). Besides, companies' returns are positively skewed.

Moving to Table 3, we checked whether companies tend to aggregate according to the sector they belong to or not, as well as if clusters composition may have affected the results that we have shown in Table 2.


Table 3.

Clusters percentage composition according to the reference industrial sector.

In general, clusters did not show an exclusive, but rather a dominant composition. Looking at Table 3, in fact, CL01 exhibits a dominant percentage of companies from both the Services (Serv) and Logistics (Log) sectors (20%), while CL02 is equally divided among firms belonging to the Banking and Finance (B&F), Health-Care (HC), Logistics and Components (Comp) sectors, which share the same 14.29% percentage. B&F dominates cluster CL03 (30%) as well as CL05 (42.86%), CL11 (21.43%), and CL13 (23.08%). Hi-Tech companies (HT) are preferably grouped into clusters CL04 (25%) and CL10 (28.57%). Companies working in the Health Care sector (HC) are more numerous in clusters CL07 and CL08 (17.39% and 27.28% respectively). Finally, clusters CL06 and CL09 have their most representative elements in companies of the Energy sector (En) (20% and 14.29%), while CL14 is dominated by Heavy Industry (HI) companies.

This seems to suggest that, despite the variety of sectors represented in the German Stock Exchange, only a small number of them (i.e. the clusters' dominant sectors) may be considered the true driving engine of the German economy. This information, together with that retrieved by looking at the Sharpe Ratio scores, has strengthened the belief that Hi-Tech and Energy are, at present, the most challenging areas for investors in the German market.

As a counterpart, we observed that there are plenty of sectors (Fashion (Fash), Luxury Goods (Lux), Housing (Hou), Retail Services (Ret Serv.), Food and Drinking (F&D), Entertainment (Ent), Press (Press), Import/Export (Imp/Exp), Public Utilities (PU), Telecommunications (TCom), Automotive (Auto), Gardening (Gard) and Manufacturing (Man)) whose incidence on cluster composition is lower or, better, that did not seem to cluster at all. While this sounds reasonable for some niche sectors (Luxury and Gardening, to give some examples), it is more surprising for other sectors (mainly Automotive and Telecommunications) that are known worldwide as strengths of the German economy. This evidence, however, is somewhat aligned with the policy strategy that the German government has adopted in most recent times.

We can then conclude that Germany did not particularly suffer from the critical situation common to most European countries. The role played by both the Hi-Tech and Energy sectors has probably been a key factor. However, from now on Germany should carefully monitor the state of B&F companies, which are those currently performing worse. Other sectors like F&D, Hou, Press, and Auto need to be constantly checked as well, since they seem to be in a stage whose evolution (towards either better or worse phases) is uncertain.

At this point it makes sense to test whether the current snapshot we have captured for Germany is the result of a strategic choice or of a kind of natural evolution from previous situations. To do this we performed a dynamic analysis going back in time from December 2011 to December 2001. As explained in Section 3, we scanned data by means of a moving window, thus obtaining 10 matrices of dimensions 207×300, where 207 is the number of companies included in the simulation and 300 is the number of log-returns we took for each of them.

In order to make the discussion as clear as possible, we focused on the analysis of the periods 2004-2005 and 2007-2008. The period 2004-2005, in fact, marks the onset of some symptoms anticipating the world financial crisis, while the period 2007-2008 is generally acknowledged as the one where the deepest effects of the crisis were felt.

Figure 4 shows the market skeleton frame obtained for the German Stock Exchange in the periods 2007-2008 and 2004-2005 respectively. Tables 4-7 detail basic statistics and clusters composition.

Figure 4.

German market topology in 2008 (a) and 2004 (b).


Table 4.

Clusters percentage composition for the German market in the period 2007-2008.


Table 5.

Clusters percentage composition for the German market in the period 2004-2005.


Table 6.

Clusters percentage composition during the period 2007- 2008.


Table 7.

Clusters percentage composition during the period 2004- 2005.

In both cases cluster statistics show (once again) positive mean and skewness, and low variability. The Sharpe Ratio is generally higher than that found in the static analysis. Looking at cluster composition, we primarily observe that it did not remain unchanged from one period to another. However, it has been possible to isolate dominant sectors. In particular, in the period 2007-2008, B&F companies prevail in five out of thirteen clusters (CL01, CL03, CL07, CL08 and CL13); HC and Imp/Exp firms share dominance in CL02; Hi-Tech is the dominant sector in clusters CL06, CL10, CL11 and CL12. Finally, Logistics and TCom companies are concentrated in CL04 and CL09 respectively. Combining these results with the values of the Sharpe Ratio, it seems possible to claim that good performances are mainly due to the leading activity of the Hi-Tech sector. Besides, by comparison with the performances discussed in the static analysis, Germany gave the impression of having suffered from the global crisis with some delay.

Measure           2004-2005 (NETG1)   2007-2008 (NETG2)   2010-2011 (NETG3)
Average Degree    1.990               1.989               1.990

Table 8.

Measures of network organization. A comparison among German market topologies during the periods under examination. NET.

The most interesting results, in our opinion, come from the analysis of the period December 2004-December 2005. The first element to highlight is that in this case we have only 11 clusters (versus 13 in the period 2007-2008, and 14 in the period 2010-2011). As regards cluster composition, we now have B&F companies dominating clusters CL01, CL05, CL06, CL07, CL08, and CL10; the fashion sector prevails in CL02, entertainment companies in CL03, housing companies in CL09, while Hi-Tech dominates the remaining clusters (CL04, CL05 and CL11). If we compare these results to those previously discussed, it is quite clear that during the observed period we witnessed various company reactions to the crisis: while Hi-Tech as well as financial companies maintained similar behaviors (and this is confirmed by their tendency to be clustered together), companies in other sectors did not group in any way. A possible explanation might lie in some policy action taken by the national government in order to steer the economy and to protect sectors with higher exposure.

To conclude, the joint use of SOM and MST also makes it possible to analyze the results from a network (graph theory) perspective. To this aim, Table 8 shows some relevant measures of network organization for the German market in the periods under examination.

Before discussing the values, we briefly explain the meaning of the observed variables. The Average Degree (AD) expresses the average number of ties of the network nodes, and measures how immediate the risk is for a node of catching whatever is flowing through the network. In the examined cases higher scores should mean an exposure to abrupt changes in the market arrangement. However, the AD values we have obtained are low and very similar to each other. The Graph Density (GD) measures how close the network is to being complete: since a complete graph has all possible edges, its GD is 1; the lower this value, the farther the graph is from being complete. The values in our nets are almost the same, and low: NETG1, NETG2 and NETG3 are all far from being complete. Note that the reason lies in the filtering procedure applied by MST to SOM, which cleaned the original map of less significant ties. The Modularity, on the other hand, is a concept close to that of clustering, since it examines the attitude to community formation in the net, and it is then strictly related to the possibility of disclosing clusters in a net. In order to be significant, values need to be higher than 0.4. This threshold has been largely exceeded in all examined nets.
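The three measures can be sketched as follows for an undirected graph given as an edge list. The formulas are the standard ones (AD = 2m/n, GD = 2m/(n(n-1)), and Newman's modularity for a given partition); the chapter does not report which software or partition was used, so the function names and the assignment of nodes to communities are assumptions.

```python
def average_degree(n, edges):
    """Mean number of ties per node: every edge contributes to two degrees."""
    return 2 * len(edges) / n

def graph_density(n, edges):
    """Fraction of the n(n-1)/2 possible edges that are present."""
    return 2 * len(edges) / (n * (n - 1))

def modularity(edges, community):
    """Newman modularity of a partition; `community` maps node -> label."""
    m = len(edges)
    internal, degree_sum = {}, {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:                       # edge inside one community
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy example: a path 0-1-2-3 split into two communities.
path = [(0, 1), (1, 2), (2, 3)]
q = modularity(path, {0: "A", 1: "A", 2: "B", 3: "B"})
```

Since a spanning tree on n nodes has n-1 edges, its average degree is 2(n-1)/n; for n=207 this gives about 1.990, in line with the values of Table 8.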

4.2. The case of Spain

As done for Germany, we begin with the static analysis for the period December 2010-December 2011. Our procedure identified eight clusters, as shown in Figure 5.

Figure 5.

Skeleton framework of the Spanish stock exchange in the period: 30 December 2010 - 30 December 2011.


Table 9.

Basic statistics for clusters in the Spanish stock exchange. The reference period is: Dec. 2010 - Dec. 2011.

At first glance cluster statistics are not dramatic enough to justify the present critical situation of the Spanish market: the mean is positive, and so is the Sharpe Ratio. The latter is admittedly quite low, which can be read as a signal of overall reduced market profitability. Nevertheless, a warning comes from matching mean to skewness. Skewness, in fact, is negative: in this light the positivity of the mean can be justified only by the presence of bursts (and hence speculative movements), as a look at the behavior of the Spanish market over the past year (Fig 6) confirms.

Figure 6.

The behaviour of Spanish market (log returns) in the period December 2010-December 2011.

Fig 6, in fact, shows the log-return dynamics in the Spanish market in the period December 2010-December 2011. The spiky nature of the observed time series stands out immediately.

Moving to the analysis of cluster composition (Table 10), by comparison with the situation discussed for Germany a number of sectors are now missing (this is the case, for instance, of Imp/Exp, Ret. Serv., and High Tech). In addition, companies in the B&F sector are widely disseminated and dominate five out of eight clusters. In the remaining three clusters Housing (Hou) and Paper Factories (Pap, a new entry with respect to what was seen for Germany) have a dominant position.

The aforementioned cluster structure suggests a key to understanding the present financial instability in Spain: the high number of financial companies in the market makes it weak and prone to speculation (as the bursts visible in Fig 6 confirm in turn). On the other hand, since the Housing sector has been the driving engine of the global crisis, it is reasonable that its higher influence in the Spanish market composition has negatively conditioned its behaviour.


Table 10.

Cluster percentage composition for Spain in the period December 2010-December 2011.

Replicating for Spain the analysis already performed for Germany suggests a number of additional issues to be discussed. Figure 7 shows the market organization in the periods 2007-2008 and 2004-2005, while Tables 11-14 report the corresponding basic statistics and cluster composition.

Figure 7.

Skeleton framework of the Spanish market in the periods 2007-2008 (a) and 2004-2005 (b).


Table 11.

Clusters statistics for Spain in the period: 2007-2008.


Table 12.

Clusters statistics for Spain in the period: 2004-2005.


Table 13.

Cluster percentage composition for Spain in the period 2007-2008.

Looking at Table 11, basic statistics for 2008 depict a situation that cannot be interpreted in a precise way: clusters CL01, CL04, CL06, and CL07 have positive mean, skewness and Sharpe Ratio (SR); CL08 has positive mean and SR; CL03 and CL05 have gone negative; while CL02 is a hybrid of the above states, with negative mean and SR and positive skewness. Going back to 2004, Table 12 shows two clusters (CL06, CL08) negative in mean, SR and skewness, two negative only in mean and SR (CL01, CL03), and all remaining clusters with positive statistics.

The key to understanding the crisis of Spain lies in cluster composition. While in 2004 (Table 14) the Spanish market exhibited its strongest component in the Energy sector, this had disappeared by 2008, as Table 13 shows. The snapshot for this period shows a market dominated by banks (i.e. exposed to speculation), as well as by sectors such as luxury goods and fashion, which did not provide any protective shield in a period of global crisis.


Table 14.

Cluster percentage composition for Spain in the period 2004-2005.

Turning to network statistics (Table 15), we may observe that the values of NETS2 and NETS3 are quite similar; conversely, they differ from those referring to the first period under examination (NETS1). In an attempt to give the data an economic interpretation, we can say that NETS2 and NETS3 mirror a steady situation. Moreover, looking at the Density values, the Spanish market gives the impression of a place where each company is going its own way. Such a de-clustering orientation confirms the present exposure of the country to external speculative attacks.

Average Degree: 1.75 (NETS1), 1.974 (NETS2), 1.974 (NETS3)

Table 15.

Measures of network organization. A comparison among market topologies during the periods under examination.
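Measures such as Average Degree and Density follow directly from the number of nodes N and edges E of an undirected graph: the average degree is 2E/N and the density is 2E/(N(N-1)). The sketch below assumes these standard definitions; the function name `network_measures` and the small example graph are hypothetical, and the chapter may rely on a different tool or convention.

```python
def network_measures(n_nodes, edges):
    """Average degree and density of an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; duplicates and self-loops
    are ignored.
    """
    unique = {frozenset((u, v)) for u, v in edges if u != v}
    m = len(unique)
    avg_degree = 2.0 * m / n_nodes
    density = 2.0 * m / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    return avg_degree, density


# Hypothetical 5-node tree (4 edges), not one of the networks in Table 15.
tree_edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
avg_degree, density = network_measures(5, tree_edges)
print(avg_degree, density)
```

For any tree with N nodes these formulas reduce to an average degree of 2(N-1)/N and a density of 2/N, so the average degree stays below 2 and the density shrinks as the number of nodes grows.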

5. Conclusion

In this chapter we provided an example of how to use Self Organizing Maps (SOMs) as a tool to analyze financial stability.

We moved from raw data (price levels) of listed companies to provide a snapshot of a country's financial situation, and then we applied a hybrid procedure coupling SOMs and the Minimum Spanning Tree (MST). We tested our approach on two markets featuring different levels of (in)stability: the German and the Spanish Stock Exchanges.

Our study made it possible to highlight the most important relations among listed companies, as well as the natural clusters that tend to form in those markets.

In particular, in the case of Germany we captured the country's situation in three periods (2004-2005, 2007-2008 and 2010-2011). The study suggested that the German government was able to pay attention to warning signals emerging from the market. In this way Germany applied measures that allowed it to face last year's critical situation. By protecting sectors with a strong tradition and promoting emerging sectors, Germany played a game that seems to keep the country at the margins of the current global crisis.

On the other hand, the case of Spain suggests the existence of a weak market dominated by banks, which has been highly exposed to investor speculation. Local authorities neither took warning signals properly into account nor applied corrective or protective measures. On the positive side, our procedure highlighted some directions in which policy makers could operate in order to reduce instability.

To conclude, the joint SOM-MST approach seems able to suggest suitable recipes that governments might consider when directing their policy efforts.


  • Fashion (Fash), Luxury Goods (Lux), Housing (Hou), Retail Services (Ret Serv.), Food and Drinking (F&D), Entertainment (Ent), Press (Press), Import/Export (Imp/Exp), Public Utilities (PU), Telecommunications (TCom), Automotive (Auto), Gardening (Gard) and Manufacturing (Man).

© 2012 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Marina Resta (November 21st 2012). Graph Mining Based SOM: A Tool to Analyze Economic Stability, Applications of Self-Organizing Maps, Magnus Johnsson, IntechOpen, DOI: 10.5772/51240. Available from:
