6 Functional Holography and Cliques in Brain Activation Patterns

Yael Jacob1,2,3, David Papo1,4, Talma Hendler1,2 and Eshel Ben-Jacob3,5 1The Sackler School of Medicine, Tel Aviv University, Tel Aviv, 2Functional Brain Imaging Unit, Wohl Institute for Advanced Imaging, Tel Aviv Sourasky Medical Center, Tel Aviv , 3School of Physics and Astronomy, Tel Aviv University, Tel Aviv, 4Center for Biomedical Technology, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 5The Center for Theoretical and Biological Physics, University of California San Diego, La Jolla, California 1,2,3Israel 4Spain 5USA


Introduction
The brain is a complex spatially extended biological system, where a great number of neurons (~10 11 ) interact to carry out extremely sophisticated tasks. Alongside a wellestablished tradition of studies of single neuron activity, a wealth of neuroimaging techniques has been developed where brain activity at various spatial scales is observed in terms of multichannel recordings of the dynamics of its components.
Early neuroimaging studies of brain activity mainly focused on the functional specialization of segregated brain modules. The main concern of these studies was that of finding which brain areas change their activity as subjects carry out well-controlled tasks. A robust statistical underpinning for the quantitative analysis of results was offered by the general linear model and Gaussian field theory (Worsley & Friston, 1995), which allowed delineating a collection of significant cortical activations and deactivations associated with the execution of these tasks. From a computational point of view, this general univariate framework treated the brain as a collection of independent brain regions.
While the brain developed largely segregated modules, communication between and within these modules is essential to the transfer and processing of information. Accordingly, neuroimaging studies started incorporating the idea that the neural activity associated with the execution of given cognitive tasks is indeed diffuse, and that the influence that one brain region exerts over the others cannot be neglected. As a consequence, over the past few years, the neuroimaging literature has seen a shift towards a focus on measures of functional integration of brain activity. Many methods were developed to estimate functional and effective connectivity (Friston, 1994). These methods were designed to investigate how a generally rather small set of brain areas interact, and how different experimental manipulations may affect their mutual relationships (Friston et al., 1997). More or less coarse-grained brain regions are identified with the nodes of a network, while some metric of brain coupling between these regions is identified with an edge between these nodes. Prominent among these methods are data-driven methods such as Independent component analysis (ICA) (McKeown et al., 1998), Fuzzy Clustering Analysis (FCA) (Windischberger et al., 2003), Temporal Clustering Analysis (TCA) (Zhao et al., 2007), and autoregressive models such as the Granger Causal Mapping (GCM) (Goebel et al., 2003) and model-driven dynamical models expressed in dynamic causal modeling (DCM) (Friston et al., 2003). The former set of methods started exploiting the inherent multivariate and stochastic nature of fMRI data. Model-driven approaches, on the other hand, used causal influences among neural sources to produce an explicit computational model generating the observed signal. This method improved on early methods by incorporating an explicit temporal component into effective connectivity estimation (Penny et al., 2004). The main merits of these methods were that of making explicit the spatially non-local nature of task-related brain activity, and of adding to it a (rather coarse) temporal dimension. However, these methods are typically limited in the number of regions they can incorporate. Furthermore, while these methods incorporate the idea that correlations among neuronal assemblies play an important role in brain activity (Segev et al., 2004), no clear distinction between information processing and information transfer is made, and the output is essentially a flow-chart of communication between nodes. As a consequence, the meaningfulness of the networks that are delineated boils down to the combined functional properties attributed to the segregated brain regions that are identified with the network nodes, but it is unrelated to some general property of the network per se. This in turn implies, among other things, that no clear relationship exists between brain anatomy, the structure of functional networks of brain activity and the dynamics taking place on them.
While single region activity can be characterized in a straightforward way through timevarying profiles of amplitudes of some aspect of brain activity, network activity needs appropriate non-trivial observables to be defined. Graph theory (Boccaletti et al., 2006) offers a convenient and flexible way to analyze topological properties of systems with a network organization . Most importantly for neuroscientists, graph theory can be used to understand the complex relationship between structure, dynamics and function in the brain. Graph theory shows that the topology of structural networks influences the dynamical processes (namely synchronization) taking place on them (Boccaletti et al., 2006). For instance, small-world properties of dense or clustered local connectivity with relatively few long-range connections confer distinctive dynamical and functional properties: in addition to optimizing information processing (Strogatz, 2001), facilitating synchronization (Bucolo et al., 2003), ensuring rapid response and emergence of coherent oscillations (Lago-Fernández et al., 2000), and conferring resilience against pathological attack, small-world architecture has been shown to provide an optimum trade-off between efficiency and wiring costs, conferring high local and global efficiencies for relatively low connection costs (Latora & Marchiori, 2001).
Recently, an increasing number of neuroimaging studies using graph theoretical tools have started showing that the brain developed in such a way that a clear correspondence exists between anatomical network topology and dynamical processes taking place on it. It has been convincingly shown that brain anatomical networks have characteristically smallworld properties of dense or clustered local connectivity with relatively few long-range connections (Sporns et al., 2004). Similarly, human brain functional networks associated with the execution of cognitive tasks have also been associated with fractal small-world architecture Bassett et al., 2006;Eguíluz et al., 2005;Salvador et al., 2005), which support efficient parallel information transfer at relatively low costs and is differently impaired by normal aging and pharmacological manipulations (Achard & Bullmore, 2007;Bassett et al., 2009). Furthermore, specific neuroanatomical connectivity patterns are univocally associated with given functional complexity levels, and networks capable of producing highly complex functional dynamics share common structural motifs (see e.g. (Sporns et al., 2000(Sporns et al., , 2002). Finally, simulations showed that brain dynamics exhibits a modular hierarchical organization, where clusters coincide with the topological community structure of anatomical networks (Zhou et al., 2006).
Arguably graph theory's greatest strengths is that it has made possible to address a whole range of new research questions, far exceeding the original main one addressed by neuroimaging, of localizing brain activity, particularly issues related to how the brain organizes its activity as it carries out tasks of arbitrary complexity. A relative limitation of graph theoretical applications, in their current form, to neuroimaging is that both computations and visualization of functional brain networks are performed based on the Euclidean coordinates of observed activity. However, it has long been known that there is no straightforward correspondence between spatial and functional proximity between brain regions, so that regions that are contiguous to each other can in fact be involved in the execution of completely different tasks. It is then of great interest to be able to represent the topology of the functional space, and ultimately to delineate the correspondence between anatomical and functional spaces.
Here we propose a new method, Functional Holography (FH), designed to describe the information content of a network as it functions as a whole unit. The term used for the familiar holograms indicates that the photographic plates can capture the whole information about the 3D image. The FH method can overcome the main limitations of previous methods by visualizing networks of correlated activity in an auxiliary space of correlations and linking the components according to similarities between them.
The main objectives of the FH method are: 1. To overcome the limitations of existing methods taking into account only a fraction of the network components. 2. To identify underlying functional motives embedded in complex spatio-temporal behavior. 3. To identify functional subgroups functional clusters and to reveal the causal relations between them. 4. To relate the observed temporal ordering activity propagation to underlying causal motives propagation of information and causal connectivity. 5. To be able to compare the activity of two different networks or different modes of behavior of the same network.
In the remainder of this chapter, we will first illustrate the mathematical procedure of the method; we will then show some applications of the method to various neurophysiological signals, and will finally conclude by discussing the scope of the FH method in the context of brain imaging data analysis.

FH analysis
The FH method was developed from the perspective of cultured networks (Segev et al., 2004). Yet it can be applied to essentially any type of neural signal from the analysis of slices, to cortex-recorded activities, ranging from electro-or magneto-encephalographic (EEG and MEG respectively) to functional magnetic resonance imaging (fMRI) signals. Moreover, it makes it possible to place the recordings from all of these levels within the same presentation schema for comparative studies.
The FH approach allows identifying additional motifs embedded in the inter-neuron correlation matrices-analogous to the inter-location coherence matrices evaluated for ECoG recordings of brain activity (Milton & Jung, 2002;Towle et al., 1999) that are not transparent in the real space connectivity networks. The correlation matrix is represented in a higher dimensional space of functional correlations, or correlation affinities.
The FH method involves the following steps: 1. Evaluation of the similarity matrix between components. 2. Clustering by sorting or reordering of the similarity matrix.
3. Construction of a matrix of functional correlations. 4. Dimension reduction. 5. Retrieval from the correlation matrix of the information lost in the dimension reduction. 6. Inclusion of temporal causal information describing the activity propagation in the network. 7. Holographic zooming and comparison.

Correlation matrices
The first stage in the FH analysis is computation of the signals correlation matrices -the matrices of correlations between the dynamical responses of all pairs of signals. We used the Pearson formula (Pearson, 1901)  For N signals, the pair-wise correlations define a symmetric NxN correlation matrix. In order to reveal subgroups in the correlation matrix, we make use of the commonly used dendrogram clustering algorithm (Dubes & Jain, 1980). This algorithm reorders the correlation matrix such that highly correlated signals are closely located. This is performed using the correlation distance D(i,j) between signals (i) and (j), which is the Euclidean distance between the rows i and j in the correlation matrix (the vectors of correlations of each one of the signals with all other ones) where is the correlation vector between signal (i) and all other signals. Next, the algorithm reorders the correlation matrix by sorting it according to the hierarchical tree of correlation distances. In such a way we produce a real metric that satisfies the triangle inequality. In Fig. 1 we illustrate the analysis with a simple example. We generate 25 signals to imitate a multichannel recording of the activity of a network of 25 components. The signals ( Fig. 1a) include two subgroups of periodic signals with higher correlations and a group of random signals. In Fig. 1b we show the corresponding correlation matrix computed using the Pearson correlations. Applying the dendrogram clustering algorithm ( Fig. 1c) on the correlation matrix, the subgroups are delineated in the resulting sorted (reordered) matrix (Fig. 1d). The correlation matrix can be associated with the correlation space, i.e. the N-1 dimensional space of correlations (Baruchi et al., 2006;. We note that the correlation space does not represent a real space in the sense that the eigenvectors do not create an orthogonal mathematical space.

Collective normalization
The next step of the analysis is designed to capture mutual or relative effects between several signals. A collective normalization of the correlations (cross-correlation) is performed and an affinity matrix is computed. The affinity transformation represents a collective property of all channels, and can help capturing hidden collective motifs related to functional connectivity in the network behaviors (Baruchi et al., 2006;. The affinity matrix is calculated using the meta-correlation matrix MC(i,j), which is the Pearson's correlation between the rows of the reordered correlation matrix of any two components (i) and (j) as described in Eq.3. The affinity collective normalized matrix is the product of the correlation matrix and the meta-correlation matrix as defined in Eq. 4.
This MC matrix is calculated on the reshuffled rows of the matrix in such a way that all the elements between the signals (i) and (j) themselves are not included in the calculation. We note that the affinity transformation is performed after rescaling the range of the correlations to [0,1].

 
Ci  The dendrogram tree. The vertical axis is the correlation distance between the signals (the Euclidian distance between the vectors of correlations of each signal with all the others, or the row in the correlation matrix that corresponds to the signal). Longer/shorter distances correspond to lower/higher correlations. (d) The sorted correlation matrix using the dendrogramed clustering algorithm. In this matrix the two subgroups form distinct clusters.

Dimension reduction and construction of the holographic networks
To search for hidden functional motifs of brain activity induced by the execution of a given task, dimension reduction of the correlation matrices is performed. Principal component analysis (PCA) a standard dimension reduction algorithm can be used to extract the maximal relevant information embedded in the signal correlation matrices. The relevant information can then be presented in a 3-dimensional principal component space (Baruchi et al., 2006; the axes of which are the three leading PCA principal vectors. Each node is placed in this space according to its three eigenvalues for the three leading principal vectors. Reduction to three dimensions (projection on the three leading principal components) typically extracts most of the relevant information (above 85%), (Baruchi et al., 2006;Madi et al., 2008). To retrieve the information lost as a result of dimension reduction, we link each pair of nodes by lines color coded according to the correlations between them (Baruchi et al., 2006;Madi et al., 2008). The result is a holographic representation (Fig. 2) of a network (or manifold) of linked nodes in the PCA space. Fig. 2. Holographic representation of the synthetically produced signals from Fig.1 in the 3D space. The axes are the three leading principal PCA vectors of the correlation matrix. Each node is located in this space according to its eigenvalues corresponding to the leading principal vectors. All the nodes with correlations above 0.8 are linked by lines color coded according to the correlations (represented in the colorbar), creating the holographic manifold.

Holographic zooming
Often, one is interested in more details about a part of the manifold. Details cannot be extracted simply by rescaling of the axes as done, for example, when a part of a picture is magnified. The idea of the holographic zooming is to take advantage of the collective normalization in the following way: 1) Identifying the part of the manifold to be magnified; 2) isolating the subsimilarity matrix for the cluster; 3) performing a second iteration on this matrix, i.e., the affinity transformation, dimension reduction and construction of a manifold (see Fig. 3).

Inclusion of temporal information
An essential, though often neglected aspect of brain activity is represented by its temporal dimension. The similarity matrices, the cornerstone of the FH method exposed so far, do not include essential information about the temporal propagation of activity across the components. When available, this information can be presented in temporal ordering matrices the generic T i,j element of which describes the relative timing or phase difference between the activity of components i and j. Various methods can be used to evaluate the temporal ordering matrices. Recently, a new notion-the temporal center of mass, or temporal location was introduced in the context of cultured networks (Segev et al., 2004), but works equally well for other fast continuous neurophysiological signals, ECoG and EEG, and though with minor modifications even for relatively slow signals, viz. fMRI. Fig. 3. Holographic zooming. The FH algorithm conducted for each cluster separately; (a) cluster 1 (b) cluster 2 (c) noise signals nodes. All the nodes are linked by lines color coded according to the correlations (represented in the colorbar), creating the holographic manifold. Note that the two clusters are highly correlated whereas the noise group has no high or low correlations between them. The FH diagrams in (d,e and f) represent the same diagrams as in (a,b and c) from a different point of view while all nodes with correlations above 0.8 are linked.
The idea is to regard the activity density of each node i as a temporal weight function so that it's temporal center of mass, T in , during a synchronized bursting event (SBEs), i.e. a time segment in which most of the recorded neurons exhibit rapid firing is given by T i where the integral is over the time window of the SBE, and T n marks the temporal location of the n th SBE, which is the combined "center of mass" of all the neurons. The temporal center of mass of each neuron can vary between the different SBEs. Therefore we define the relative timing of a neuron i to be the average of the sequence of SBEs. Similarly, we define the temporal ordering matrix as follows: , Interestingly, when the temporal information is superimposed on the 3-D space of leading PCA eigenvectors, the activity propagates along the manifold in an orderly fashion from one end to the other (Fig. 4). For this reason, it is proposed to view the resulted manifold, which includes the temporal information as a causal manifold.  Fig. 3a,b respectively. The activity propagation is added by coloring the nodes location according to the relative phases or time lag between them. Blue is for early time (negative phases) and red for late times (positive phases). Note that adding this information helps to reveal the phase shifts imposed in the generation of the signals.

Holographic comparison and superposition of networks activity
A versatile method of data analysis should also come with the ability to quantitatively assess difference between experimental conditions. Clustering algorithms are often used for comparison between the activities of different networks, e.g., gene expression in different groups of patients, or between two modes of behavior of the same network, e.g., during and between epileptic seizures of the same patient. We propose the following holographic comparison between networks: 1) Compute the PCA leading eigenvectors of the affinity matrix for each network. 2) Project the affinity matrix of each network on the leading eigenvectors of the other one. Clearly, this approach can also be used for comparison between different modes of activity of the same networks, like the above-mentioned case of brain activity in between and during seizure, or different clusters identified in a given matrix. Once the clusters are identified, the similarity matrix for each is isolated from the combined matrix and the above two stages are applied. The holographic superposition is designed as additional method for comparison between different modes of activity of the same network. The idea is similar to the holographic comparison; however, the projection is on the mutual PCA leading eigenvectors, i.e. the leading eigenvectors of a combined matrix that includes the different modes.

Quantifying cluster information: Cluster entropy
Once clusters of functional brain activity are singled out, it is often useful to describe them in a quantitative fashion. This in particular enables to compare different clusters within and across subjects.
Entropy has been used in statistics and information theory to develop measures of the information content of signals (Shannon, 1948). However, entropy can also be used to measure the amount of information or variance embedded in a cluster, and to quantify the deviation of the cluster's eigenvalue distribution from a uniform one (Alter et al., 2000). This idea has been used in the context of biological systems (Varshavsky et al., 2007;Varshavsky et al., 2006) and economic systems (Shapira et al., 2009). The eigenvalue entropy is defined as where Ω is given by, N the number of signals, and denotes the matrix eigenvalues. S ranges from 0 to 1. Note that is a normalization factor ensuring that S reaches its maximum (S=1) for a uniform eigenvalue distribution (i.e. random correlations matrix).

Extracting topological information: MST analysis
The correlation matrix of the system creates a topological network structure, where the links between nodes are the pair-wise correlations, and the correlation coefficients as the weights of these links. Valuable information can be extracted from the topological properties of the network, over and above the mere localization in the brain volume. To extract this information, graph theoretical techniques can be applied to the data.
The graph induced by the correlation matrix is complete and therefore difficult to interpret per se. Extracting meaningful information from this complete graph involves providing a more compact description of the graph and analyzing its topological properties (e.g. (Newman, 2003;Tumminello et al., 2007). A graph and its connectivity can be synthetically described by its minimum spanning tree (MST) (West, 2001), i.e. a connected, undirected graph composed of subgroups of edges with the following two properties: I) The tree spans the graph, i.e. connects all the nodes of the graph. The number of links retained is (n − 1) for a network of (n) nodes. II) The sum of the edges' weights is minimal out of all possible spanning trees. The MST creates a subgraph without loops, maintains the connection of all nodes, using only the links with minimal weight.
The topological structure of the constructed tree creates a new visualization of the complex system, which allows visually tracking clusters of nodes, as well as structural similarities and differences of the system under different conditions. Other graph properties, such as node degree, node centrality and betweenness can be used to extract information from the tree (Newman, 2003;Tumminello et al., 2007;West, 2001).
The MST can be constructed based of the correlation matrix (i.e. the correlation based system) obtained by the FH algorithm. Plotting the MST upon the PCA affinity space and on the anatomical slice image enables us to monitor the dynamical changes of the selected voxels and the tree they create (their connections) over the entire time course of the experiment.

Functional Holography and Cliques in Brain Activation Patterns 111
The MST method requires assigning weights to the links between the nodes. The weights are assigned according to the affinity matrix. A commonly used distance transformation is the ultrametric distance, suggested by Rammal et al. (1986) (Rammal et al., 1986 and Mantegna et al. (2000) (Mantegna & Stanley, 2000). Using this distance, the pairwise correlation coefficient for each pair of nodes is translated into the distance between those two nodes. The ultrametric distance is UD(i,j) given by (9) where C(i,j) is the correlation coefficient between nodes i and j. This distance metric satisfies the ultrametric inequality, (I) The result of the ultrametric transformation is that strong positive correlations are translated into short distances, and strong negative correlations are translated into long distances. For the case of perfect positive correlation, i.e. C(i,j) = 1, the distance is 0; for the case of no correlation, i.e. C(i,j) = 0.5, the distance is 1; and for the case of perfect negative correlation, i.e. C(i,j) = 0, the distance is √2. The ultrametric distance matrix describes the complete network´s topological structure that yields no significant information (West, 2001). To construct the MST, the Kruskal algorithm (Kruskal, 1956) can be applied. This algorithm is considered greedy, as it runs in polynomial time (this problem however is not particularly severe if the networks have about 300 nodes, which renders the problem computable), and in each phase some local optimum is chosen. Fig. 5 demonstrates the use of the Kruskal algorithm to find the MST for a complete graph. link, DF with length 6, is highlighted using the same method. (d) The next-shortest links are AB and BE, both with length 7. AB is chosen arbitrarily, and is highlighted. The link BD cannot be chosen as it would form a cycle (ABD) and has therefore been marked in red. (e) The process continues to highlight the next-smallest link, BE with length 7. Many more links are highlighted in red at this stage: BC because it would form the loop BCE, DE because it would form the loop DEBA, and FE because it would form FEBAD. (f) Finally, the process finishes with the link EG, of length 9, and the minimum spanning tree is found.
The MST analysis can be used in combination with FH analysis. The FH analysis performs the aforementioned PCA dimension reduction algorithm creating a visualization of this complex network. The nodes in the reduced 3D space are then linked according to the MST connections, with lines color coded according to their original correlations. These lines create a topological MST manifold upon the 3D correlations space (see figs. 11 and 14 in section 2.2.2). This MST manifold is displayed on the brain slice image in the same way as that of the FH, thus providing information about connectivity on the real space (see figs. 12 and 15 in section 2.2.2). Fig. 6. The MST representation of the synthetically produced signals from Fig.1. The magenta and green colors represent clusters 1 and 2 respectively, and the cyan color represents the noise signals.

Dissimilarity measure for MSTs
The MST can conveniently be used to quantify similarities between different networks of brain activity. This can be done by resorting to the divergence rate measure developed by Lee et al. (Lee et al., 2006). This measure is based on the information metric d(X,Y), which quantifies the conditional entropies (or the difference) between two information sources, where H(X|Y) and H(Y|X) are the conditional entropies between sources X and Y. This metric satisfies the triangle inequality. The conditional entropy H(X|Y) denotes the amount of information that is obtained by measuring an information source Y with the knowledge of a different source X. The information gain can be approximated by the information change between two different sources. The source X is transformed into the Y source space by the transformation Y=f(X). The average information change by the transformation is defined as where N is the number of elements of an information source.
The divergence rate can be defined as an approximation of the average information change in order to apply it between two MSTs, Where D(X) (i) is the sum of all distances taken from a reference node i to the neighboring nodes in the XMST, and D(X|Y) (i) is the sum of all distances taken from the node i to the nodes in the YMST. Given the distances between node i to its neighbors in the XMST, D(Y|X) (i) evaluates how much the distance changed in the YMST. Note that the number of neighboring nodes for every reference node is different. This measure evaluates how much information is needed on average to explain YMST, given XMST. Then, to quantify the dissimilarity between two MSTs the metric distance is given by, If the MSTs are identical, D(X,Y) = 0, otherwise D(X,Y) > 0. Subgroups in the metric distance can then be delineated using the dendrogram clustering algorithm.

Analyzing ECoG recorded human brain activity
The occurrence of epilepsy is rising and is estimated to affect, at some level, 1%-2% of the world population (Towle et al., 2002). Due to availability of many antiepileptic drugs, approximately 80% of all epileptic patients can be kept seizure free. But for the remaining 20%, the only cure is surgical resection of the seizure focus (Chkhenkeli et al., 1998;Doyle & Spencer, 1998). One of the most challenging tasks facing epileptologists is precise identification of brain areas to be removed so that the problem can be cured with minimal damage and side effects. Often, the precise location of the epileptogenic region remains uncertain after obtaining conventional, noninvasive measurements such as electroencephalogram (EEG) and magnetoencephalogram (MEG) cannot provide sufficient information because of the relatively low spatial resolution of these methods. In these cases, the activity is directly recorded by the electrocorticography (ECoG) procedure in which the recording electrodes are placed directly on the brain surface.
Here we illustrate how the FH method can be applied to reveal the existence of hidden causal manifolds in the electrical brain activity of epileptic patients with implanted electrodes. We note that the method can also be applied to experimental seizure studies that have gained much attention (Ben-Jacob et al., 2007 ). Typical results are presented in Fig. 7. Fig. 7. Holographic networks of recorded brain activity. The holographic networks are for the ECoG recorded human brain activity for the inter-Ictal and Ictal activities. (a) and (b) show the connectivity diagram for the inter-Ictal and Ictal respectively, constructed upon the set of electrodes placed on the surface of the brain (the frontal lobe in this case). (c) and (d) show the corresponding dendrogram correlation matrices. (e) and (f) show the corresponding FH manifolds in the PCA affinity space. In the analysis we included only electrodes whose correlations with the other electrodes are above noise level. Note, that the locations change their functional role during seizure (Ictal) relative to those during the inter-Ictal durations.
Notably, the manifold of the inter-ictal activity has a very simple topology of almost circular horseshoe like part and another subgroup perpendicular to its plane and position at the center of the horseshoe. Although the new manifold has as expected a more complex topology, it retains some of the features of the one associated with inter-Ictal activity, when viewed from specific angle. Preliminary analyses also indicate that causal features are captured when the temporal (i.e., phase coherences) information is imposed on the manifolds. These results bear the promise that functional holography might become a valuable diagnostic procedure in the treatment of intractable epilepsy.

Analyzing fMRI recorded human brain activity
When applied to fMRI data, FH is an effective clustering method, capable of capturing system level networks using voxel-voxel correlation matrices (Jacob et al., 2010). Here we show how the algorithm using a dendrogram clustering method combined with a standard deviation (STD) filter can effectively be used to identify and extract voxel clusters. Subjects were instructed to clench and open either their left or right hand, according to an auditory cue. The paradigm consisted of 11 blocks of 114 volumes. Each consisted of a resting period with cross fixation (6-15 s), an auditory instruction period regarding hand movement (right or left; 3 s), and a period of hand movement execution (15 s). The blocks were presented in a constant order across subjects with regard to which hand to move. Two types of sequences were examined: repetitive (two consecutive movements of the same hand) and alternating hand movements.
Even for this simple hand clenching motor task, the FH analysis conducted for a single block revealed interesting motifs. For example, unilateral hand movement yielded two clusters, one located in the contralateral primary motor cortex showing increased signal (i.e. activation), and the other one in the ipsilateral homologue region, showing reduced signal (i.e. deactivation) (Fig. 8). Inspection of repetitive vs. alternating hand movements suggested that this pattern could be indicative of an inhibition mechanism of the ipsilateral hemisphere. In addition, a single-block level analysis, using only 12 time points corresponding to a 36 second recording session, was enough to determine which hand was moved by the subject, while other methods required the entire experiment time course. Moreover, cluster quantification based on eigenvalue entropy showed lower entropy for the motor-dominant hemisphere clusters. This lower entropy demonstrates less variability in the cluster's correlations, suggesting a higher modular organization in specific motor dominant hemisphere.
The MST was extracted for each subject and each block. The divergence rate measure was used for quantification of the structural similarities and differences of the system under the two different conditions of right or left hand movement. Fig. 9 displays examples of the dendrograms constructed from the divergence rate measure. Each dendrogram represents the distance, i.e. the divergence rate measure or similarity between all pairwise MSTs of the experiment's different blocks. Half of the subjects exhibited good separation between right and left hand movement MSTs (e.g. Fig. 9a), displaying clusters of 3-4 same hand movement trees. The other half of the subjects did not yield such good separation (e.g. Fig. 9b). Fig. 10 depicts the MSTs for a single subject that showed good separation. To elucidate the functional and structural meaning of the MST, the tree's nodes were colored according to their location in the brain; the red and blue colors represent voxels in the right and left hemisphere respectively, while yellow represents the midline SMA region. All the resulting MSTs showed two distinct clusters, one dominated by the right (red) and one by the left hemisphere (blue). The interesting result is that this representation highlights for all blocks a few red voxels in the blue cluster and blue ones in the red cluster, these voxels are the same in every block. To further investigate whether the two groups of subjects as obtained by the divergence rate measure, differed in terms of topological structure of their functional networks of correlated activity, the MST was used in combination with the FH visualization. The MST constructed for every single block shows a good separation between the hemispheres. For half of the subjects, the divergence rate measure allowed to partition the MSTs into two clusters of right and left hand movements. Figs.11-15 depicts the MSTs on the FH correlation PCA space, and on the anatomical brain slice image of a single subject that showed a good separation between right and left hand movements . A single subject who did not yield a good separation is also shown (Figs.13-15). Looking at the dynamic changes of the MSTs along the experiment time course in the anatomical slice image it becomes visible that blocks of sequences of repetitive hand movements resulted in numerous connections between the clusters as opposed to the alternating hand movements' blocks. The graph in Fig. 16 displays the average Z score of the number of connections between the clusters for each block across subjects (N=15). This demonstrates that these two kinds of sequences can also be differentiated by the MST visualization.  coded according to the brain anatomic structure. The red and blue colors represent the right and left hemisphere voxels respectively, and the yellow color represents the SMA area. The MSTs give a good separation between the right and left hemisphere. Note that in every MST there are few red voxles in the blue cluster and one blue voxle in the red cluster, since all the voxles are labeled a closer inspection shows that these outlier voxels are the exact same voxles in every MST. Each 3D graph represent the same voxels (color coded according to their original cluster with magenta and green), for each experiment block. These voxels are presented in their new location in the correlation space and connected according to the block's MST with lines color coded according to their correlation coefficient. In this visualization it is hard to detect the similarities and dissimilarities between the tree's structures. However it becomes visible in this case, that right hand movement's MSTs yielded a better separation of the clusters as opposed to left hand movements. Fig. 12. The same MSTs from figs. 10 and 11 constructed upon the brain slice EPI image. This presentation shows that according to the MSTs the two hemispheres are highly coupled (with positive correlation and with negative correlation) in the motor task. Fig. 13. Example of the MSTs constructed for a single left-handed subject that does not show a good separation in the divergence rate dendrogram. Displaying the MSTs for all ten blocks i.e. right hand (RH) and left hand (LH) movements. The red and blue colors represent the right and left hemisphere voxels respectively, and the yellow color represents the SMA area. Although this subject does not yield a good separation between the right and left hand MSTs his MSTs do show a good separation between the right and left hemispheres.  fig. 13 constructed on the FH affinity PCA space. Demonstrating the MSTs of a subject who displayed no similarities or dissimilarities between right and left hand block MSTs in the divergence rate dendrogram tree. Fig. 15. The same MSTs from figs. 13 and 14 constructed upon the brain slice image. In this presentation, the two hemispheres seem to be highly negatively connected within the motor task blocks.
The topological structure of the constructed tree allows visual tracking clusters of nodes, and the variations undergone by the system as it faces different experimental conditions. This visualization is of paramount importance when dealing with highly complex systems, and is particularly helpful in the identification of clusters and their hierarchies. Thus two different clusters, each dominated by one hemisphere (figs. 10 and 13) or "outlier" voxels, which may have an important part in inter-hemispheric communication, could be highlighted.
Finally we point out one drawback of the MST method. When applied on a correlationbased system the MST uses the shortest distances. This induces a bias to positive correlations, while anti-correlations are overlooked. Further analysis of the data treating positive and the negative correlations on a par level (e.g. using the absolute values of the correlation matrix) may be recommended.
Overall, with this extremely simple example, we have illustrated how analyzing a very basic topological network property of networks of correlated activity associated with different cognitive conditions can reveal, in a rather parsimonious way, it's most important connections, suggesting the potential of this type of analysis in dealing with more challenging and rich data. Fig. 16. The averaged Z score of number of links in the MST connecting the two clusters across subjects (N=15) for each experiment block. The two repetitive hand movements (i.e. blocks 2 and 8) resulted in higher average than the rest of the blocks. A statistical Z-test was calculated for each block with the null hypothesis that the scores in each block are a random sample from a normal distribution with mean zero. According to this test the two repetitive hand movements were found significant (p=2x10 -4 and p=6.9x10 -10 for blocks 2 and 8 respectively).

Conclusion
Over the past two decades, the development of new neuroimaging techniques has produced spectacular improvements in the amount of detail with which brain activity can be monitored. As precision has rapidly been gained, though, so has the typical data set size grown steadily. The range of questions that researchers and clinicians alike have started finding an answer for with neuroimaging techniques has also dramatically expanded. All this, in turn, has created a demand for new methods of data analysis, These new methods were developed on the one hand to provide new ways to represent brain activity, and on the other hand, to make quantitative sense of the rich information embedded in very high dimensional data and to visualize them in a way that can be read and understood in a sufficiently straightforward way by researchers first and, ultimately, by clinicians.
In this chapter, we presented FH a method which effectively tackles these issues. The FH algorithm deals with the multivariate and multiscale nature of brain imaging data sets and simplifies their complexity by representing patterns in a low-dimensional space which preserves the higher dimensional information of the original pattern of connectivity. In this sense, the FH analysis may be regarded as a system-level analysis that produces a complete, holographic, representation of brain activation.
The method also provides an effective visualization of the system, of critical importance when dealing with highly complex systems, and is particularly helpful in the identification of clusters and their hierarchies. Even more important though is the ability of the FH analysis to reveal subtle, system-level dynamical features that are hard to detect through other methods, even at the single subject level. And that might be overlooked due to prior assumptions by hypothesis-driven methods. In fact, the FH method can capture sensitive hemodynamic variations at the single block level, without further need for averaging or for contrasts between experimental conditions. In addition the method requires far less time point to localize activations than other clustering methods, viz. ICA (Bell & Sejnowski, 1995), FCA (Windischberger et al., 2003) or TCA (Zhao et al., 2007), suggesting that the FH method may play a prominent role in the development of classification algorithms for blind identification of different conditions in extremely short time series.
It is important to portray the FH method not only as an alternative but also as a valuable complement to existing methods. For instance, its dimension reduction step could be carried out using a variety of clustering techniques. Perhaps even more cogently, there is a clear complementarily between network theory and the FH method. The application of the former that we presented, i.e. the MST, clearly represents but one out of the many possible applications. To the extremely vast field of issues that network theory allows to address in a versatile but quantitatively rigorous and qualitatively explicit way, the FH method adds a compact representation in an auxiliary field that makes functional networks more explicit, as it divorces them from the anatomical space in which they live.
A distinctive quality of the FH method is represented by its versatility. While originally developed for cultured neural networks, the method can be applied to the analysis of essentially any type of signal, including the main tools for system-level neuroimaging, viz. EEG/MEG and fMRI. Although fewer examples of application to the latter are and further investigation of the method on different (viz. event-related) designs is needed, the proposed method shows great potential even for fMRI data in differentiating experimental conditions particularly when the corresponding signals are separated (Jacob et al., 2010). Since the outcome of the analysis is a holographic presentation in an abstract reduced space, it represents an ideal tool for multi-modal analysis of data from experiments combining EEG´s temporal precision with fMRI´s spatial one. Finally, the principles and implementation of the FH analysis are relatively simple and straightforward; taken together with the methods efficiency in delineating and tracking the time-varying unfolding of fine details of clustered activity at different spatial scales, it may represent a tool of election for brain scientists and for clinical neurologists alike.