Bidimensional functions. The range limits the values of
1. Introduction
Particle Swarm Optimization (PSO) is an evolutionary computation technique that exploits social intelligence found in bird flocking or fish schooling (Kennedy & Eberhart, 1995). The PSO algorithm generates potential solutions expressed as a particle swarm exploring the search space. Like genetic algorithms, a fitness function indicates the quality of the candidate solutions. The fitness function may have more than one optimum and some local optima may also interest the user. Many variants of PSO are capable of dealing with multimodal optimization problems. As many real world applications present several equivalent or nearby solutions, tools must be provided for the identification of multiple optima. Algorithms have been designed that either explore sequentially the search space or introduce a kind of parallelism. For instance, by analogy with bird swarms, niching strategies have been implemented.
In our model, called WPSO, particles are considered as celestial bodies, subject to the effect of gravity. A high density of particles causes the creation of a gravitational singularity called a wormhole. A particle falls into a wormhole if it crosses a boundary called the event horizon. It is then rejected into another region of space. Wormholes possess two interesting features. Firstly, a wormhole indicates a potential solution: the aggregation of particles around a particular spot reveals the existence of a local or optimal fitness value. Secondly, the presence of a wormhole avoids the premature convergence of the whole swarm since particles that are close to a previously explored region are redirected into another space region. A wormhole can then be seen as a mutation operator that dynamically preserves the diversity of the swarm.
The WPSO algorithm has been applied to the discovery of gene expression patterns in DNA microarrays. This new technology enables the simultaneous monitoring of the expression levels of thousands of genes in particular conditions (the effect of pathologies, drugs or stress, …). Large-scale experiments can be carried out to observe the co-expression of genes in order to better understand the underlying biological processes. An expression pattern is an expression profile common to a set of genes, characterizing a cancer signature or a biological function such as immune response or programmed cell death. Since these patterns correspond generally to dense regions of the expression measurements, our method is particularly well adapted.
The rest of the paper is organized as follows. Section 2 discusses related work; Section 3 describes the principle of PSO; Section 4 is devoted to the presentation of WPSO and a study of its performances; Section 5 deals with the application of WPSO to microarray data analysis. Finally a conclusion and some perspectives are presented in Section 6.
2. Related work
Evolutionary algorithms have been successfully applied to solve multimodal optimization problems. The concept of speciation has been inspired from the observation of biological evolution. Different species are in competition with each other in territories containing finite resources. These species tend to occupy different environmental niches.
The Island model (Whitley et. al. 1998) is for instance a technique providing multiple subpopulations that helps to preserve genetic diversity. Niching methods have been incorporated in Genetic Algorithms to promote the formation and maintenance of subpopulations (Mahfoud 1995). Two kinds of approaches have been proposed: while parallel niching methods maintain several niches simultaneously, sequential niching methods apply iteratively a swarm on the search space to find multiple optima. Sequential Niching (Beasley et. al. 1993) is based on a derating function that operates on the fitness landscape at a position where a solution was found. Sharing (Goldberg 1987) is another mechanism for maintaining population diversity. The fitness of an individual is scaled by the number of similar individuals in order to penalize redundant solutions. This sharing function depends on the distance between individuals, which is a way to control the diversity of species. Deterministic Crowding (Mahfoud 1995) is a selection method that facilitates the formation of niches, these latter being maintained by an elitist scheme.
PSO algorithm can easily be extended to perform multimodal optimization. An intuitive adaptation consists in restricting the social cognition of a particle. In the SPSO algorithm (Li 2004), species are created dynamically at each iteration according to the location of the particles. A species is determined by a neighborhood, defined by a representative particle and a radius specified by the user. In the UPSO algorithm, introduced by (Parsopoulos and Vrahatis 2004), the evolution of the particle velocity is controlled by two terms related to a global and a local neighborhood. The PVPSO method proposed by (Schoeman & Engelbrecht 2005) uses vector operations to determine the particles belonging to a niche. More precisely, dot products indicate if a particle converges on the neighbourhood-best of a given niche. Another parallel niching technique has been proposed in (
Brits.a et. al. 2002
). The NbestPSO algorithm is based on neighborhood-best defined for each particle
3. Particle swarm optimization
PSO is an algorithm inspired by the model of social learning that simulates the flocking behavior of birds and fish. A PSO is based on a population of interacting elements (particles). A swarm of particles explores an n-dimensional space; the location of a particle determines a potential solution. Each particle knows the most promising position (
The algorithm starts with a population composed of random particles. The location of each particle can be arbitrary defined by using a uniform random sequence but it has been shown that randomized low-discrepancy sequences better cover the search space and then improve the performances of PSO (Nguyen et. al. 2007). In our implementation, particle locations have been initialized by using Sobol quasi random sequences. In the following step of the algorithm, the fitness value is computed for each particle and
where:
In eq.1,
The next step of the algorithm consists in updating the location vector as follows:
The algorithm iterates until convergence is achieved or a maximal number of iterations is reached. The convergence of the swarm can be defined in different ways, a common approach being the convergence of the
4. Gravitational singularities and PSO
4.1. Description of the WPSO algorithm.
The WPSO algorithm proposed in this paper follows the same principles as those described previously, as long as the density of particles remains below a critical value. The convergence of many particles into a region causes the formation of a gravitational singularity, as it is the case when a massive star collapses. In our model, the agglomeration of particles gives rise to the formation of a wormhole. An intra-universe wormhole is a hypothetical entity that connects two locations of the universe, offering a short cut through space and time. Particles approaching a wormhole are then instantaneously projected in another region (in our case a random location of the search space). This feature guarantees that a previously found solution is not reachable and it preserves the diversity of the swarm.
The algorithm works as summarized in Fig. 1. In the beginning, particles move in a uniform space as in the classical model. After a certain time, a high density of particles gives birth to a wormhole. At this stage, the wormhole is not entirely formed. The group of particles (called W) that have created the wormhole are separated from the rest of the main swarm. This group goes across the space until convergence, as if its particles are the only individuals in the search space. When W converges, the wormhole is mature and remains in a constant location, determined by the position of the best particle in W at convergence. Particles belonging to W are then released and redirected in a random location and restored to the swarm; a new cycle can then be performed.
The wormhole is defined by its event horizon, that is the gravity field where the space-time is so bent that nothing can escape from it. In our case, the event horizon is the region where particles vanish. Our implementation determines this boundary by a radius around the best particle belonging to W. This radius remains constant from the birth to the mature stage of the wormhole; only the center of the wormhole moves during its formation.
We need also to specify the critical point that causes the formation of the wormhole. As already stated, the birth of a wormhole is due to a high density of particles. One could compute the density of every particle by examining its neighborhood. A less computationally expensive strategy consists in only considering the best particle at a given time. The density is then defined as the number of particles
The formation of a gravitational singularity induces two types of particles: those at the origin of the wormhole and the rest of the swarm. The former group searches the refined location of an optimum. As one expects this group to converge in a short period of time, it is necessary to decrease the inertia of the particles. For that purpose, we consider that the particles evolve in different times; the lifetime of the group W corresponds to the maximal period of time
where
The convergence of W group is determined by the velocity of its best particle. When the velocity is less than a given threshold ε=10−5, we consider that the location is stable and the wormhole mature. Note that any particle can be trapped by already-formed wormholes. A group W can then be dispersed and may be eventually unable to converge after a period of
4.2. Experimental results.
Six well-known benchmark functions have been used for comparison sake. These two-dimensional functions presenting multiple global and local optima are described in Table 1.
Label | Function | Range | Global | Local |
f1 | z =(x2+y-11)2-(x+y2-7)2 | -10,+10 | 4 | 0 |
f2 | z =sin(2.2πx+π/2)(2-|y|/2) (3-|x|/2) +sin(2.2π y2+π/2)(2-|y|/2) (2-|x|/2) |
-2,+2 | 4 | 8 |
f3 | z =cos(x)2+sin(y)2 | -4,+4 | 6 | 0 |
f4 | z =((cos(2x+1)+2cos(3x+2)+3cos(4x+3)+ 4cos(5x+4)+5cos(6x+5)) (cos(2y+1)+2cos(3y+2)+3cos(4y+3)+ 4cos(5y+4)+5cos(6y+5)) |
-2.0,+2.5 | 2 | 38 |
f5 | z =(y2-4.5 y2)xy-4.7cos(3x- y2(2+x)) sin(2.5πx) +(0.3x)2 |
-1.2,+1.2 | 1 | 9 |
f6 | z=4x2–2.1x4+(1/3)x6+xy–4y2+4y4 | -1.9,+1.9 | 2 | 4 |
In (Özcan & Yılmaz 2007), these functions have been used to test the performances of the PSO algorithms presented in section 2: SPSO, NichePSO, mNichePSO, NbestPSO, UPSO, PVPSO) and CPSO. The same parameters and evaluation metrics have been taken into consideration for comparison purpose with WPSO.
The input parameters common to all the algorithms are given in Table 2. The additional parameters of WPSO are given in Table 3.
swarm size | 50 |
number of iterations | 25,000 |
wmin | 0.6 |
wmax | 0.8 |
g1 | 1.5 |
g2 | 1.5 |
vmax | range_size/20 |
r | range_size/8 |
nw | 15 |
tmax | 500 |
The measure proposed in (Özcan and Yılmaz 2007) has been retained for the estimation of performances: the overall success rate
where
Table 4 shows the performance results observed for the functions of Table 2. CPSO and mNichePSO efficiently locate global optima in absence of local optima (functions f1 and f3). These two algorithms also respectively achieve the second and third best average performances, preceded by WPSO. Our algorithm performs well in functions presenting many local optima (functions f2, f4 and f5). It seems surprising that it is not always capable of finding all the optima of simpler functions, such as f1, but this is due to the restriction of the performance test requiring to set the same input parameters for all the functions. The experiments reveal that the most sensible parameter of WPSO is the radius denoting the event horizon. This may explain the weakness of some results, even if the merging of wormholes tends to minimize this effect.
PSO algorithm | f1 | f2 | f3 | f4 | f5 | f6 | avr.osr. | stdev. |
SPSO | 0.95 | 0.48 | 0.72 | 0.27 | 0.64 | 0.82 | 0.65 | 0.24 |
NichePSO | 0.19 | 0.21 | 0.45 | 0.13 | 0.66 | 0.45 | 0.38 | 0.21 |
NbestPSO | 0.90 | 0.43 | 0.74 | 0.31 | 0.46 | 0.58 | 0.57 | 0.22 |
CPSO | 1.00 | 0.66 | 1.00 | 0.52 | 0.63 | 0.62 | 0.74 | 0.21 |
UPSO | 0.88 | 0.54 | 0.55 | 0.48 | 0.45 | 0.54 | 0.57 | 0.16 |
mNichePSO | 1.00 | 0.66 | 0.99 | 0.14 | 0.47 | 0.91 | 0.70 | 0.34 |
PVPSO | 0.21 | 0.38 | 0.46 | 0.20 | 0.35 | 0.71 | 0.42 | 0.20 |
WPSO | 0.98 | 0.92 | 0.99 | 0.58 | 0.84 | 0.62 | 0.82 | 0.18 |
5. Application to the analysis of DNA microarrays.
5.1. PSO for the extraction of gene expression patterns.
DNA microarrays allow to monitor the expression level of thousands of genes in a single experiment. They provide a better understanding of how cells work and respond to different stimuli. For instance, this technology is being applied to early diagnosis of cancer (Ochs & Godwin 2003) or drug discovery (Debouck & Goodfellow 1999). A common microarray data mining application consists in the analysis of the co-expression of genes. The underlying idea is that genes respecting a certain common expression pattern may be co-regulated. The identification of these patterns or signatures can help biologists to infer regulatory modules or biological functions (Yeung 2004). Generally, as microarray experiments focus on a particular disease or on the effect of a particular drug, only a low fraction of genes reveals a significant expression profile, related to the observed phenomenon. Therefore, we need to extract the most relevant signatures among many sets of poorly co-expressed genes. This task can then be seen as a multimodal optimization problem where the fitness is associated to the quality of the extracted signatures. In our application, a pattern is defined by an expression profile (i.e. a set of expression levels) and its neighborhood, that is the
The WPSO algorithm described in the previous section has been adapted to the analysis of DNA microarrays. The rule leading to the merging of two wormholes has been refined. Experiments have revealed that the distance between two pattern profiles is not an optimal criterion to determinate if two signatures are similar. A more precise indicator consists in the estimation of the likelihood of this situation. The corresponding hypothesis assumes that the members of both signatures are actually issued from the same distribution. We use the Bayesian Information Criterion (BIC) to accept or reject a wormhole on this basis. The BIC (Schartz 1978) is associated to the maximization of a log likelihood function and is capable of estimating the best model. The BIC formula used in the PSO is described in (Pelleg & Moore 2000).
This figure shows a heat map graphical representation of the expression matrix. Data have been ordered in rows and columns using a hierarchical clustering, using the Euclidean distance and the ward method (dendrogram not showed). Rows correspond to genes and columns to samples (i.e. a microarray experiment). The conventional color palette shows under-expressed values in green, normal values in black and over-expressed value in red. The left bar indicates from which class each gene is issued (σ(1)=0.1, σ(2)=0.2, σ(3)=0.3, σ(4)=0.4, σ(5)=0.5, σ(6)=1). Note that the dense signatures (σ=0.1 to σ=0.3) are correctly aggregated in the figure, but the last two patterns have been aggregated with genes issued from the noise pattern.
5.2. Experiments on synthetic data.
To study the efficiency of the PSO, synthetic patterns have been generated. The creation of the expression matrix is based on the generation of five signatures. First, a pattern profile is randomly generated, using a uniform distribution in the range [-1,+1]. Second, 20 members of the pattern are generated: for each point of the profile, the gene expression value is expressed as the corresponding expression value of the pattern, perturbed by a Gaussian noise having a null mean and a standard deviation of σ. The homogeneity of the five gene sets are given by σ; the selected values are respectively (0.1, 0.2, 0.3, 0.4, 0.5). A random noise is added to the expression matrix, defined by a pattern having only null expression values and a strong standard deviation (σ =1). A fraction
The four lines correspond to the mean value of the performance measure, expressed for five values of σ (indicated in abscise). Each line is relative to a different fraction
The same input parameters as for the previous experiments have been used, except for the event horizon radius (set to 10) and the dimension of the search space. Gene profiles have been normalized (mean set to zero and variance to one). This pre-processing reduces the variation of measurements between genes, and limits the search space to a uniform range set to [-3,+3].
The performance test identifies the patterns given by the WPSO: the 20th closest genes to each pattern of the matrix are extracted and compared to the actual list of genes generated for the creation of the matrix. The performance measure is the similarity of these two sets, expressed as the ratio of two cardinals: the intersection over the union. 50 random matrices have been produced for each of the four matrix sizes. The results are summarized in Fig. 3. One can observe that the PSO algorithm perfectly identifies strong signatures (σ=0.1 to σ=0.3), independently of the matrix size. The most heterogeneous patterns (σ=0.4 to σ=0.5) are more problematic and sensible to the matrix size. These are limit cases where the probability of finding a neighbor issued from the noise pattern is not negligible. As shown in the example of Fig. 2, the corresponding clusters are indeed not correctly discriminated by the hierarchical clustering algorithm. It is thus not surprising that the increase of the number of noisy genes decreases the performance measure.
5.3. Experiments on real data.
The WPSO algorithm has been applied to a well-known study (Ross et. al. 2000), which particularly illustrates the power of microarray analysis. This remarkable work explores the variation of expression of 8,000 genes among 60 cell lines derived from human tumors. The dataset then presents a vast potential of expression patterns, due to the variety of the observed phenotypes. In a similar manner as that proposed in the original paper, the set of genes has been filtered to remove the genes that do not show a significant variation of expression among the cell lines. After this pre-processing, a set of 1,200 genes has been retained and normalized like in the previous section. The same algorithm has also been applied.
Fig. 4 shows the resulting expression patterns. An interesting feature of these signatures is that they are related to sub-sets of cell lines specific to certain tissue origins. The blue lines indicate the most remarkable observations: for instance, columns labeled A (resp. B, C, D, E) are notably differentially expressed in the 6th (resp. 4th, 2nd, 1st, 5th) signature and specific to samples belonging to breast (resp. colon, leukaemia, melanoma, renal) tumors.
The overall average silhouette width (
6. Conclusion
An original PSO algorithm called WPSO has been proposed, inspired from the concept of gravitational singularity. The convergence of particles generates a wormhole whose action is twofold: during its formation, it searches the best location of an optima; in a mature phase, it rejects particles to other regions of the search space. Experiments reveal that this algorithm provides efficient results on benchmark functions. A bioinformatics application has been implemented to extract signatures of co-expressed genes from microarray data. On a real experiment, the discovered patterns are tight, well separated and related to known phenotypes. As the size of event horizon determines the efficiency of the WPSO algorithm, a major improvement would consist in an automatic adaptation of this parameter according to the shape of the fitness function. A potential indicator for this optimization could be the frequency of particles absorbed by a wormhole.
References
- 1.
Beasley D. Bull D. R. Martin R. R. 1993 A Sequential Niching Technique for Multimodal Function Optimization, 1(2),101 125 , MIT Press.0302-9743 - 2.
Brits R. Engelbrecht A. P. van den Bergh F. 2002 Solving Systems of Unconstrained Equations using Particle Swarm Optimization, in ,3 6 Piscataway, NJ,0-78037-437-1 Hammamet, Tunisia. - 3.
Brits R. Engelbrecht A. P. van den Bergh F. 2002 A niching particle swarm optimizer, in ,2 692 696 , Singapur. - 4.
Debouck C. Goodfellow P. N. 1999 DNA microarrays in drug discovery and development, .,21 48 50 , January. - 5.
Goldberg D. E. Richardson J. 1987 Genetic Algorithm with Sharing for Multimodal Function Optimization, in ,41 49 , Cambridge, Massachusetts, United States. - 6.
Kennedy J. Eberhart R. 1995 Particle swarm optimization, in ,1942 1948 , Piscataway, NJ. - 7.
Kennedy J. 1997 The particle swarm: Social adaptation of knowledge, in ,303 308 . IEEE Service Center, Piscataway, NJ. - 8.
Li X. 2004 Adaptively Choosing Neighborhood Bests using Species in a Particle Swarm Optimizer for Multimodal Function Optimization, in ,105 116 , LNCS 3102, eds. Deb, K. et al., Springer-Verlag, Seattle, USA. - 9.
Mahfoud S. 1995 Niching Methods for Genetic Algorithms . , University of Illinois at Urbana Champaign, intern. report 95001. - 10.
Nguyen Q. U. Nguyen X. H. McKay R. I. Tuan P. M. 2007 Initialising PSO with randomised low-discrepancy sequences: the comparative results, in ,1985 1992 , 25-28 September, Singapore. - 11.
Ochs M. F. Godwin A. K. 2003 Microarrays in cancer: research and applications, ,Vol. suppl.,4 15 , March. - 12.
Özcan E. Yılmaz M. 2007 Particle Swarms for Multimodal Optimization. , Eds. Lecture Notes In Computer Science,4431 Springer-Verlag, Berlin, Heidelberg,366 375 , Warsaw, Poland, April 11- 14.0302-9743 - 13.
Parsopoulos K. E. Vrahatis M. N. 2004 UPSO: A Unified Particle Swarm Optimization Scheme, Lecture Series on Comp. and Computational Sci.,1 , Vouliagmeni-Kavouri, Greece,868 873 . - 14.
Pelleg D. Moore A. 2000 X-means: Extending K-means with efficient estimation of the number of clusters, in ,727 734 .1-55860-707-2 - 15.
Ross D. T. Scherf U. Eisen M. B. Perou C. M. Rees C. Spellman P. Iyer V. Jeffrey S. S. Van de Rijn M. Waltham M. Pergamenschikov A. Lee J. C. Lashkari D. Shalon D. Myers T. G. Weinstein J. N. Botstein D. Brown P. O. 2000 Systematic variation in gene expression patterns in human cancer cell lines. Nat Genet,march,24 3),227 235 ,1061-4036 - 16.
Rousseeuw P. 1987 Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. .1 20 no 1,53 65 . Elsevier Science Publishers B.0377-0427 - 17.
Schoeman I. L. Engelbrecht A. P. 2005 A Parallel Vector-Based Particle Swarm Optimizer, ,268 271 .978-3-21124-934-5 - 18.
Schwarz G. 1978 Estimating the Dimension of a Model , ,6 2 461 464 ,0090-5364 - 19.
Whitley D. Rana S. Heckendorn R. B. 1998 The Island Model Genetic Algorithm: On Separability, Population Size and Convergence, ,7 33 47 . - 20.
Yeung K. Medvedovic M. Bumgarner R. 2004 From co-expression to co-regulation: how many microarray experiments do we need? , ,5 7 R48,1465-6906