Open access peer-reviewed chapter

Generalized Simulated Annealing

By Yang Xiang, Sylvain Gubian and Florian Martin

Submitted: March 24th 2016Reviewed: September 28th 2016Published: April 26th 2017

DOI: 10.5772/66071

Downloaded: 946


Many problems in mathematics, statistics, finance, biology, pharmacology, physics, applied mathematics, economics, and chemistry involve the determination of the global minimum of multidimensional real‐valued functions. Simulated annealing methods have been widely used for different global optimization problems. Multiple versions of simulated annealing have been developed, including classical simulated annealing (CSA), fast simulated annealing (FSA), and generalized simulated annealing (GSA). After revisiting the basic idea of GSA using Tsallis statistics, we implemented a modified GSA approach using the R package GenSA. This package was designed to solve complicated nonlinear objective functions with a large number of local minima. In this chapter, we provide a brief introduction to this R package and demonstrate its utility by solving non‐convexoptimization problems in different fields: physics, environmental science, and finance. We performed a comprehensive comparison between GenSA and other widely used R packages, including rgenoud and DEoptim. GenSA is useful and can provide a solution that is comparable with or even better than that provided by other widely used R packages for optimization.


  • classical simulated annealing (CSA)
  • fast simulated annealing (FSA)
  • generalized simulated annealing (GSA)
  • GenSA

1. Introduction

Determining the global minimum of a multidimensional function is the focus of many problems in statistics, biology, physics, applied mathematics, economics, and chemistry [16]. Although there is a wide spectrum of problems, computing the global minimum remains a challenging task, because, for example, modern problem dimensionality is increasing.

The optimization of convex functions is usually reasonably conducted using standard optimization approaches, such as simplex optimization, the steepest descent method, and the quasi‐Newton method. These methods can also provide reasonable results for the study of simple non‐convex functions with only a few dimensions and well‐separated local minima.

Deterministic methods are usually faster than stochastic methods although they tend to be trapped into a local minimum. To overcome this particular issue, stochastic methods have been widely developed and can determine a good approximation of the global minimum with a modest computational cost. Among stochastic methods, genetic algorithms [7], evolution algorithms [8], simulated annealing (SA) [9], and taboo search [1012] have been successfully applied.

Among popular approaches, genetic algorithms [7] mimic the process of natural DNA evolution. In this approach, a population of randomly generated solutions is generated. The solutions are encoded as strings and evolve over many iterations toward better solutions. In each generation, the fitness of each individual in the population is evaluated, and in the next generation, strings are generated by crossover, mutation, and selection, based on their fitness. Differential evolution belongs to such genetic algorithms.

Ant colony optimization (ACO) [13] is another set of stochastic optimization methods, which is inspired by ants wandering to find food for the colony. An ant starts wandering randomly while laying down pheromone trails that will influence other ants because they will be attracted (increase in probability) by the trail, and if they eventually locate food, will return and reinforce the trail. To avoid the algorithm converging to local minima, the pheromone trail is set to evaporate proportionally to the time it takes to traverse the trail to decrease its attractiveness. As a consequence, the pheromone density of short paths becomes higher than that of longer paths. The design of ACO perfectly matches graph‐based optimization (e.g., traveling salesman problem), but it can be adapted to determine the global minimum of real‐valued functions [14] by allowing local random moves in the neighborhood of the current states of the ant.

The SA algorithm was inspired by the annealing process that takes place in metallurgy, whereby annealing a molten metal causes it to achieve its global minimum in terms of thermodynamic energy (crystalline state) [9]. In the SA algorithm, the objective function is treated as the energy function of a molten metal, and one or more artificial temperatures are introduced and gradually cooled, which is analogous to the annealing technique, to attempt to achieve the global minimum. To escape from local minima, this artificial temperature (or set of temperatures) acts as a source of stochasticity. Following the metallurgy analogy, at the end of the process, the system is foreseen to reside inside the attractive basin of the global minimum (or in one of the global minima if more than one global minimum exists). In classical simulated annealing (CSA), the visiting distribution is a Gaussian function (a local search distribution) for each temperature. It has been observed that this distribution is not optimal for moving across the entire search space [5]. Generalized simulated annealing (GSA) was developed to overcome this issue by using a distorted Cauchy‐Lorentz distribution.

For a more extensive review of stochastic optimization algorithms, see the review provided by Fouskakis and Draper [15].

The R language and environment for statistical computing will be the language of choice in this chapter because it enables a fast implementation of algorithms, access to a variety of statistical modeling packages, and easy‐to‐use plotting functionalities. These advantages make the use of R preferable in many situations to other programming languages, such as Java, C++, Fortran, and Pascal [16].

In this chapter, we elaborate further on the background and improvements of GSA and the use of the R package GenSA [17], which is an implementation of a modified GSA. We will also discuss the performance of GenSA and show that it outperforms the genetic algorithm (R package rgenoud) and differential evolution (R package DEoptim) in an extensive testbed comprising 134 testing functions based on the success rate and number of function calls. The core function of GenSA is written in C++ to ensure that the package runs as efficiently as possible. The utility of this R package and its use will be presented by way of several applications, such as the famous Thomson problem in physics, non‐convex portfolio optimization in finance, and kinetic modeling of pesticide degradation in environmental science.

2. Method

As mentioned above, SA methods attempt to determine the global minimum of any objective function by simulating the annealing process of a molten metal. Given an objective function f(x)with  x=(x1, x2, , xn)T, we attempt to determine its global minimum using SA. The general procedure for SA is as follows:

  1. Generate an initial state x0=(x10, x20, , xn0)Trandomly and obtain its function value E0=f(x0). An initial temperature T0is set. imax is set to be any big integer.

  2. For step i=1to imax,

    • The temperature Tiis decreased according to some cooling function.

    • Generate a new state  xi=xi1+Δx, where Δxfollows a predefined visiting distribution (e.g., Gaussian distribution). Ei=f(xi)and ΔE=EiEi1.

    • Calculate the probability p of xi1xi.If  p<random(0, 1), xiis set back to its previous state xi1and Eiis also set back to  Ei1.

  3. Output the final state ximaxand its function value  Eimax.

We provide more details of SA methods as follows.

2.1. Classical simulated annealing (CSA)

According to the process of cooling and the visiting distribution, SA methods can be classified into several categories, amongwhich CSA [9], fast simulated annealing (FSA) [18], and the GSA [19] are the most common.

In CSA, proposed by Kirkpatrick et al., the visiting distribution is a Gaussian function, which is a local search distribution [5, 19]:


where Δxis the trial jump distance of variable x and T is an artificial temperature in the reduced unit. In a local search distribution, for example, a Gaussian distribution Δxis always localized around zero. The jump is accepted if it is downhill of the energy/fitness/objective function. If the jump is uphill, it might be accepted according to an acceptance probability, which is computed using the Metropolis algorithm [20]:


Geman and Geman [21] showed that for the classical case, a necessary and sufficient condition for having probability 1 of ending at the global minimum is that the temperature decreases logarithmically with the simulation time, which is impossible in practice because this would dramatically increase the computational time.

2.2. Fast simulated annealing (FSA)

In 1987, Szu and Hartley proposed a method called FSA [18], in which the Cauchy‐Lorentz visiting distribution, that is, a semi‐local search distribution, is introduced:


where D is the dimension of the variable space. In a semi‐local search distribution, for example, the Cauchy‐Lorentz distribution, the jumps Δxare frequently local, but can occasionally be quite long. The temperature Tin FSA decreases with the inverse of the simulation time, and the acceptance algorithm is the Metropolis algorithm shown in Eq. (2).

2.3. Generalized simulated annealing (GSA)

2.3.1. Introduction to GSA

A generalization of classical statistical mechanics was proposed by Tsallis and Stariolo [19]. In the Tsallis formalism, a generalized statistic is built from generalized entropy:


where qis a real number, iis the index of the energy spectrum, and sqtends to information entropy:


when q1. Maximizing the Tsallis entropy with the constraints


where εiis the energy spectrum, and the generalized probability distribution is determined to be


where zqis the normalization constant that ensures that piintegrates to 1. This distribution pointwise converges to the Gibbs‐Boltzmann distribution, where qtends to 1.

The GSA algorithm [19] refers to the generalization of both CSA and FSA according to Tsallis statistics. It uses the Tsallis‐Stariolo form of the Cauchy‐Lorentz visiting distribution, whose shape is controlled by the parameter qv:


Please note that qvalso controls the rate of cooling:


where Tqvis the visiting temperature. In turn, a generalized Metropolis algorithm is used for the acceptance probability:


where β=1/(KTqa). The acceptance probability is controlled by the artificial temperature, Tqa. When qv=1and qa=1, then GSA is exactly CSA. Another special instance is given by qv=2and qa=1for which GSA is exactly FSA. Asymptotically, GSA has a similar behavior thanthe stochastic steepest descentas T0. A faster cooling than CSA and FSA is achieved when  qv>2.

It has been shown in a few instances that GSA is superior to FSA and CSA. Xiang et al. established that a pronounced reduction in the fluctuation of energy and a faster convergence to the global minimum are achieved in the optimization of the Thomson problem and Nickel cluster structure [46]. Lemes et al. [22] observed a similar trend when optimizing the structure of a silicon cluster.

2.3.2. Improvement of GSA

A simple technique to accelerate convergence is as follows:


where Tqais the acceptance temperature, Tqvis the visiting temperature, and t is the number of time steps. Our testing shows that this simple technique accelerates convergence [6].

The performance of GSA can be further improved by combining GSA with a local search method, large‐scale bound‐constrained Broyden‐Fletcher‐Goldfarb‐Shanno (BFGS) method [23] for a smooth function, or Nelder‐Mead [24] for a non‐smooth function. A local search is performed at the end of each Markov chain for GSA.

3. Results

The GenSA R package has been developed and added to the toolkit for solving optimization problems in the Comprehensive R Archive Network (CRAN) R packages repository. The package GenSA has proven to be a useful tool for solving global optimization problems [17].

3.1. Usage

The GenSA R package provides a unique function called GenSA whose arguments were described in the associated help [25]:

  1. par: Numeric vector. Initial values for the parameters to be optimized over. Default value is NULL, in which case, the initial values will be generated automatically.

  2. lower: Numeric vector with a length of par. Lower bounds for the parameters to be optimized over.

  3. upper: Numeric vector with a length of par. Upper bounds for the parameters to be optimized over.

  4. fn: A function to be minimized, where the first argument is the vector of parameters over which minimization is to take place. It should return a scalar result. The function has to always return a numerical value; however, not applicable (NA) and infinity are supported.

  5. ...: Allows the user to pass additional arguments into fn.

  6. control: A list of control parameters, discussed below. The control argument is a list that can be used to control the behavior of the algorithm. Some components of control are the following:

  • maxit: Integer. Maximum number of iterations of the algorithm. Default value is 5000, which could be increased by the user for the optimization of a very complicated objective function.

  • threshold.stop: Numeric. The program will stop when the objective function value is ≤ threshold.stop. Default value is NULL.

  • smooth: Logical. TRUE when the objective function is smooth, or differentiable almost everywhere, and FALSE otherwise. Default value is TRUE.

  • Integer. Maximum number of calls of the objective function. Default value is 10,000,000.

  • max.time: Numeric.Maximum running time in seconds.

  • temperature: Numeric. Initial value for the temperature.

  • visiting.param: Numeric.Parameter for the visiting distribution.

  • acceptance.param: Numeric.Parameter for the acceptance distribution.

  • verbose: Logical. TRUE means that messages from the algorithm are shown. Default value is FALSE.

  • trace.mat: Logical. Default value is TRUE, which means that the trace matrix will be available in the returned value of the GenSA call.

The default values of the control components are set for a complicated optimization problem. For typical optimization problems with medium complexity, GenSA can determine a reasonable solution quickly; thus,using the variable threshold.stop to the expected function value is recommended to make GenSA stop at an earlier stage. A maximum run time can be also set by max.time argument or argument for setting the maximum run time or number of calls, respectively.

3.2. Performance

An extensive performance comparison of currently available R packages for continuous global optimization problems has been published [26]. In this comparison, 48 benchmark functions were used to compare 18 R packages for continuous global optimization. Performance was measured in terms of the quality of the solutions determined and speed. The author concluded that GenSA and rgenoud are recommended in general for continuous global optimization [26]. Based on this conclusion, we set out to perform a more extensive performance test by including more benchmark functions and the additional algorithm DEoptim. The SciPy Python scientific toolkit provides an extensive set of 196 benchmark functions. Because these 196 benchmark functions are coded in Python, we had to convert the Python code to R code. As a result, a subset containing 134 functions was available for this test. One hundred runs using a random initial starting point were performed for every combination of the 134 benchmark functions and the aforementioned three methods. We used a local search method to further refine the best solution provided by Deoptim, because this technique provides a more accurate final result [17]. The default values of the parameters of every package were used in this comparison. A tolerance of 1e-8 was used to establish whether the algorithm determines the global minimum value.

The reliability of a method can be measured by the success rate%, which is defined as the number of successful runs over 100 runs. For each testing function, the number of function calls required to achieve the global minimum was recorded for every method, and we refer to this as the number of function calls. Please note that rgenoud required a longer time to complete 100 runs compared with GenSA and DEoptim. Table 1 shows the success rate% and the average number of function calls (in parentheses).

Function nameGenSADEoptim‐LBFGSrgenoud
AMGM100% (93.0)100% (62.6)100.0% (88.8)
Ackley01100% (746.2)100% (1710.0)100.0% (1840.1)
Ackley02100% (182.9)100% (1703.6)100.0% (2341.6)
Ackley03100% (352.5)100% (1420.2)100.0% (1860.9)
Adjiman100% (33.3)100% (1133.9)100.0% (1677.5)
Alpine01100% (756.0)70% (2756.8)95.0% (46852.2)
Alpine02100% (56.4)100% (913.6)100.0% (1688.0)
BartelsConn100% (263.1)100% (1539.8)100.0% (2343.5)
Beale100% (145.6)100% (1311.8)100.0% (1711.0)
BiggsExp02100% (85.6)100% (763.3)100.0% (1710.5)
BiggsExp03100% (190.7)100% (2614.4)100.0% (1975.9)
BiggsExp04100% (3498.8)88% (8182.5)100.0% (4234.5)
BiggsExp0598% (40117.8)14% (10864.2)17.0% (13871.5)
Bird100% (112.3)100% (1777.3)100.0% (1702.9)
Bohachevsky1100% (875.1)100% (1107.5)100.0% (2673.1)
Bohachevsky2100% (1372.9)100% (1155.2)76.0% (2554.5)
Bohachevsky3100% (623.4)100% (1342.4)96.0% (2596.9)
BoxBetts100% (129.2)100% (1866.3)100.0% (2018.8)
Branin01100% (42.2)100% (1747.6)100.0% (1694.9)
Branin02100% (1495.7)28% (2830.9)96.0% (1752.3)
Brent100% (11.0)100% (987.5)100.0% (1687.9)
Bukin02100% (39.9)100% (1477.4)100.0% (1679.7)
Bukin04100% (217.9)100% (1029.4)100.0% (1744.2)
Bukin060% (NA)0% (NA)0.0% (NA)
CarromTable100% (119.5)100% (2040.9)100.0% (1729.6)
Chichinadze100% (517.4)100% (1063.9)92.0% (2219.8)
Colville100% (4515.8)100% (8230.9)100.0% (2996.1)
CosineMixture100% (22.0)100% (3875.3)100.0% (1670.6)
CrossInTray100% (70.8)100% (1512.8)100.0% (1772.1)
CrossLegTable0% (NA)0% (NA)2.0% (51829.0)
CrownedCross0% (NA)0% (NA)2.0% (16045.0)
Cube100% (1717.2)100% (2030.3)52.0% (21649.8)
DeVilliersGlasser01100% (2343.8)0% (NA)1.0% (43919.0)
DeVilliersGlasser022% (173501.0)0% (NA)0.0% (NA)
Deb01100% (57.1)100% (4000.0)100.0% (1700.8)
Deb03100% (78.8)100% (4028.9)100.0% (1708.1)
Decanomial100% (1519.3)100% (741.4)100.0% (2050.8)
DeckkersAarts100% (74.6)100% (1525.0)100.0% (1988.7)
Dolan100% (2504.7)1% (10293.0)82.0% (25067.4)
DropWave100% (3768.7)85% (4009.8)83.0% (2973.3)
Easom97% (5077.5)100% (1343.0)100.0% (1875.7)
EggCrate100% (122.9)100% (912.2)100.0% (1697.2)
ElAttarVidyasagarDutta100% (675.7)93% (1625.9)100.0% (2115.7)
Exp2100% (85.3)100% (781.2)100.0% (1707.7)
Exponential100% (20.6)100% (580.3)100.0% (1682.5)
FreudensteinRoth100% (395.4)83% (2620.7)100.0% (1700.9)
Gear100% (2225.8)100% (1118.0)93.0% (1609.6)
Giunta100% (39.6)100% (592.3)100.0% (1686.6)
GoldsteinPrice100% (158.7)100% (1023.0)100.0% (1703.5)
Gulf100% (1739.0)100% (4076.4)1.0% (1810.0)
Hansen100% (149.7)100% (104.0)100.0% (132.0)
HimmelBlau100% (53.7)100% (2384.8)100.0% (1698.7)
HolderTable100% (138.0)100% (2010.3)100.0% (1701.7)
Hosaki100% (49.8)100% (583.2)100.0% (1699.5)
Infinity100% (225.8)100% (113.4)100.0% (492.0)
JennrichSampson0% (NA)0% (NA)0.0% (NA)
Keane100% (21.4)100% (679.1)100.0% (629.5)
Leon100% (128.0)100% (1338.2)35.0% (2156.5)
Levy05100% (152.8)100% (144.7)100.0% (224.6)
Levy13100% (867.3)100% (1138.5)100.0% (1771.7)
Matyas100% (33.8)100% (967.7)100.0% (1702.4)
McCormick100% (42.2)100% (747.8)100.0% (1705.4)
Michalewicz100% (97.3)100% (69.0)100.0% (157.3)
MieleCantrell100% (2935.1)100% (1942.7)100.0% (3438.9)
Mishra0381% (138334.7)0% (NA)24.0% (5745.0)
Mishra040% (NA)0% (NA)13.0% (1833.2)
Mishra05100% (1289.1)77% (1396.9)100.0% (1680.0)
Mishra060% (NA)0% (NA)0.0% (NA)
Mishra07100% (38.9)100% (3055.9)100.0% (1679.7)
Mishra08100% (1519.3)100% (741.4)100.0% (1900.8)
Mishra09100% (80.0)85% (4832.0)99.0% (12356.0)
Mishra11100% (30.9)100% (3152.2)100.0% (1613.1)
MultiModal100% (134.5)100% (481.8)100.0% (1691.6)
NewFunction011% (243864.0)0% (NA)27.0% (9823.3)
NewFunction020% (NA)0% (NA)33.0% (12260.0)
Parsopoulos100% (29.9)100% (3267.7)100.0% (1683.9)
Paviani100% (423.3)100% (18764.2)100.0% (2168.7)
PenHolder100% (94.3)100% (1391.3)100.0% (1701.6)
Plateau100% (35.5)100% (22.4)100.0% (27.6)
Powell100% (692.4)100% (6501.1)100.0% (2052.4)
Price01100% (27.4)100% (2107.0)100.0% (1686.6)
Price02100% (1242.4)92% (4031.3)85.0% (2087.7)
Price03100% (3840.5)64% (2574.6)56.0% (1855.1)
Price04100% (440.1)100% (1075.4)94.0% (6413.9)
Qing0% (NA)0% (NA)0.0% (NA)
Quadratic100% (26.0)100% (872.2)100.0% (1695.8)
Quintic100% (928.2)76% (2694.1)36.0% (42496.5)
Rastrigin100% (482.4)98% (3753.6)100.0% (1840.2)
Ripple01100% (5742.4)71% (4053.3)99.0% (4758.6)
Ripple25100% (414.7)99% (3165.8)100.0% (1771.5)
RosenbrockModified100% (1343.5)24% (3625.6)95.0% (2329.0)
RotatedEllipse01100% (26.0)100% (1345.8)100.0% (1699.9)
RotatedEllipse02100% (28.0)100% (1218.1)100.0% (1700.8)
Salomon95% (9790.3)86% (4050.4)21.0% (3143.1)
Schaffer0199% (7105.6)92% (3092.8)59.0% (4520.3)
Schaffer02100% (2078.9)100% (2125.2)100.0% (2677.8)
Schaffer03100% (2786.3)91% (4096.2)99.0% (3236.3)
Schaffer04100% (2920.9)98% (4063.5)86.0% (6505.1)
Schwefel01100% (131.2)100% (651.8)100.0% (1734.5)
Schwefel04100% (63.9)100% (842.8)100.0% (1698.8)
Schwefel06100% (3536.0)100% (2302.3)100.0% (1790.4)
Schwefel20100% (2079.8)100% (1670.2)100.0% (1779.1)
Schwefel21100% (2158.6)100% (1853.2)100.0% (1774.8)
Schwefel22100% (2772.0)100% (1678.6)100.0% (1775.0)
Schwefel26100% (131.3)100% (1648.2)100.0% (1687.5)
Schwefel36100% (361.3)100% (1298.7)100.0% (1930.6)
SixHumpCamel100% (83.3)100% (909.2)100.0% (1695.7)
Sodp100% (64.3)100% (477.7)100.0% (1697.5)
Sphere100% (17.4)100% (730.2)100.0% (1683.2)
Step100% (471.0)100% (219.3)100.0% (1131.5)
Step2100% (433.4)100% (222.0)100.0% (1177.1)
StyblinskiTang100% (94.7)100% (861.4)100.0% (1824.2)
TestTubeHolder100% (839.0)98% (3957.5)100.0% (2088.5)
ThreeHumpCamel100% (86.3)100% (817.5)100.0% (1699.0)
Treccani100% (49.1)100% (1015.2)100.0% (1696.4)
Trefethen64% (20972.8)0% (NA)46.0% (29439.5)
Trigonometric02100% (1766.2)100% (1319.6)99.0% (4809.6)
Ursem01100% (47.7)100% (682.1)100.0% (1753.8)
Ursem03100% (584.0)100% (1645.7)100.0% (1777.7)
Ursem04100% (210.2)100% (1372.3)100.0% (1847.3)
UrsemWaves100% (2565.0)26% (4025.8)50.0% (1674.8)
VenterSobiezcczanskiSobieski100% (698.3)100% (1748.0)100.0% (2004.8)
Vincent100% (41.4)100% (3013.4)100.0% (1703.5)
Wavy100% (535.1)100% (3110.8)100.0% (1745.0)
WayburnSeader01100% (156.1)96% (2258.3)93.0% (16665.5)
WayburnSeader02100% (262.8)100% (1772.3)100.0% (1826.0)
Wolfe100% (50.3)100% (3252.9)100.0% (1720.7)
XinSheYang02100% (1048.0)87% (2284.5)100.0% (1768.5)
XinSheYang04100% (457.0)88% (3329.7)100.0% (1777.5)
Xor100% (26204.2)48% (18245.1)100.0% (1852.9)
YaoLiu09100% (482.3)98% (3753.6)100.0% (1824.3)
Zacharov100% (59.7)100% (944.3)100.0% (1696.6)
Zettl100% (123.4)100% (846.8)100.0% (1712.5)
Zirilli100% (131.0)99% (702.3)100.0% (1695.1)

Table 1.

The success rate% and average number of function calls (in parentheses) for GenSA, DEoptim, and rgenoud.

The mean of the success rate% over all the benchmark functions is 92, 85, and 86% for GenSA, DEoptim, and rgenoud, respectively. Because the number of function calls changes dramatically, the median rather than the mean of the number of function calls is provided: 244.3 for GenSA, 1625.9 for Deoptim, and 1772.1 for rgenoud.

A heatmap of the success rate% for GenSA, DEoptim, and rgenoud is displayed in Figure 1 . The color scaling from red to green represents the success rate% from 0 to 100. Clearly, GenSA has a larger green region (high success rate%) than DEoptim and rgenoud.

Figure 1.

Heat map of the success rate% for GenSA, DEoptim, and rgenoud.

Both the success rate% and number of function calls show that GenSA performed better than DEoptim and rgenoud in the testbed composed of 134 benchmark functions.

3.3. Applications

3.3.1. The Thomson problem in physics

The physicist J.J. Thomson posed the famous Thomson problem after proposing his plum pudding atomic model, based on his knowledge of the existence of negatively charged electrons within neutrally charged atoms [27]. The objective of the Thomson problem is to determine the minimum electrostatic potential energy configuration of N equal point charges constrained at the surface of a unit sphere that repel each other with a force given by Coulomb's law. The Thomson model has been widely studied in physics [2831]. In the Thomson model, the energy function is


The number of metastable structures (local minima) of the Thomson problem grows exponentially with N [28]. The region containing the global minimum is often small compared with those of other minima [30]. The Thomson problem has been selected as a benchmark for global optimization algorithms in a number of previous studies [4, 5, 28, 32]. The Thomson problem has been solved by both deterministic methods, including steepest descent [28], and stochastic methods, including (but not limited to) constrained global optimization (CGO) [29], the GSA algorithm [4, 5], genetic algorithms [30], and the Monte Carlo method [31, 33]. Typically, deterministic methods with multiple starts can provide a good solution when there are fewer point charges, whereas stochastic methods have to be used when N is large.

We can define an R function for the Thomson problem as follows:

>Thomson.fn<‐ function(x) {<<‐ + 1

x <‐ matrix(x, ncol = 2)

y <‐ t(apply(x, 1, function(z) {

c(sin(z [1]) * cos(z [2]),

sin(z [1]) * sin(z [2]), cos(z [1])) }))

n <‐ nrow(x)

tmp<‐ matrix(NA, nrow = n, ncol = n)

index<‐ cbind(as.vector(row(tmp)), as.vector(col(tmp)))

index<‐ index [index [, 1] < index [, 2],, drop=F]

rdist<‐ apply(index, 1, function(z) {

tmp<‐ 1/sqrt(sum((y [z [1],] ‐ y [z [2],])^2))


res<‐ sum(rdist)



In this example, we chose six point charges because our purpose is only to show how to use our package GenSA. The global energy minimum of six equal point charges on a unit sphere is 9.98528137 [28].

We applied GenSA with default settings to the Thomson problem. As the number of point charges is small, GenSA can determine the global minimum easily. We set the maximum number of function calls allowed,, to 600:

>n.particles<‐ 6 # regular octahedron with global minimum 9.98528137

>lower.T<‐ rep(0, 2 * n.particles)

>upper.T<‐ c(rep(pi, n.particles), rep(2 * pi, n.particles))


>options(digits = 9)


><<‐ 0

>out.GenSA<‐ GenSA(par = NULL, lower = lower.T, upper = upper.T,

fn = Thomson.fn, control = list(

>print(out.GenSA[c(“value”, “counts”)])


[1] 9.98528137


[5] 600

>cat(“GenSA call function”,, “times.\n”)

GenSA call function 600 times.

The global minimum 9.98528137 for six point charges is determined within 600 function calls.

3.3.2. Kinetic modeling of pesticide degradation

Various types of pesticides have been widely used in modern agriculture. It is important to calculate the concentration of a pesticide in groundwater and surface water. We will show how GenSA can be used to fit a degradation model for a parent compound with one transformation product. All the data and models are from the R packages mkin [34] and FOCUS (2006) [35].

After loading the library, we obtain the data (FOCUS Example Dataset D). The observed concentrations are in the column named “value” at the time specified in column “time” for the two observed variables named “parent” and “m1.”




>options(digits = 9)



'data.frame':44 obs. of3 variables:

$ name : Factor w/2 levels “m1”,”parent”: 2 2 2 2 2 2 2 2 2 2 ...

$ time :num0 0 1 1 3 3 7 7 14 14 ...

$ value: num99.5 102 93.5 92.5 63.2 ...

The measured concentration of parent and m1 change with time is displayed in the lower panel of Figure 2 .

Figure 2.

Kinetic modeling of pesticide degradation. Upper panel: illustration of pesticide degradation model. Lower panel: experimental concentration data and fitting curves for parent and m1.

According to the kinetic model displayed in the upper panel of Figure 2 , we define the derivative function as follows:

>df<‐ function(t, y, parameters) {

+d_parent<‐ ‐parameters [”k_parent_sink”] * y [”parent”] ‐

+parameters [”k_parent_m1”] * y [”parent”]

+d_m1 <‐ parameters [”k_parent_m1”] * y [”parent”] ‐

+parameters [”k_m1_sink”] * y [”m1”]

+return(list(c(d_parent, d_m1)))

+ }

There is one initial concentration, parent_0, and three kinetic parameters, k_parent_m1, k_parent_sink, and k_m1_sink, whose values need to be fitted out by fitting the concentration curves. The initial concentration of m1, m1_0, is always zero. We define the objective function, fn, as the sum of the squares of residuals (deviations predicted from observed values for both the parent and m1):

>fn<‐ function(x,

+names_x = c(“parent_0”, “k_parent_sink”, “k_parent_m1”, “k_m1_sink”),

+names_y = c(“parent”, “m1”),

+names_parameters = c(“k_parent_sink”, “k_parent_m1”, “k_m1_sink”),

+tspan = if (!is.null(dat.fitting))

+sort(unique(dat.fitting [[”time”]])) else NULL,

+df, dat.fitting = NULL) {

+m1_0 = 0

+names(x) <‐ names_x

+y0 <‐ c(x [”parent_0”], m1_0)

+names(y0) <‐ names_y

+parameters<‐ x [c(“k_parent_sink”, “k_parent_m1”, “k_m1_sink”)]


+out<‐ ode(y0, tspan, df, parameters)

+if (is.null(dat.fitting)) {

+rss<‐ out

+} else {

+rss<‐ sum(sapply(c(“parent”, “m1”), function(nm) {

+o.time<‐ as.character(dat.fitting [dat.fitting$name == nm,


+sum((out [, nm][match(o.time, out [, “time”])] ‐

+dat.fitting [dat.fitting$name == nm, “value”])^2,

na.rm = TRUE)




+ }

Then, we use GenSA to estimate the four parameters. As the model is not complicated, we limit the running time of GenSA to 5 seconds by setting max.time = 5:

>res<‐ GenSA(fn = fn, lower = c(90, rep(0.001, 3)),

+upper = c(110, rep(0.1, 3)),

+control = list(max.time = 5), df = df,

+dat.fitting = FOCUS_2006_D)

>names(res$par) <‐ c(“parent_0”, “k_parent_sink”, “k_parent_m1”, “k_m1_sink”)

>print(round(res$par, digits = 6))

parent_0 k_parent_sinkk_parent_m1k_m1_sink


GenSA successfully determines the correct value of the initial concentration of the parent and the three kinetic parameters. The fitting curves for the parent and m1 are displayed in the lower panel of Figure 2 .

3.3.3. Finance: risk allocation portfolios

Portfolio selection problems were addressed by mean‐risk models in the 1950s. The most popular measures of downside risk are the value‐at‐risk (VaR) and conditional VaR(CVaR). Portfolio weights for which the portfolio has the lowest CVaR and each investment can contribute at most 22.5% to the total portfolio CVaR risk were estimated using differential evolution algorithms in Mullen et al. [16] and Ardia et al. [36]. The code for the objective function in portfolio optimization is rewritten below from Ardia et al. [36]:


> tickers <‐ c(“GE”, “IBM”, “JPM”, “MSFT”, “WMT”)

>getSymbols(tickers, from = “2000‐12‐01”, to = “2010‐12‐31”)

[1] “GE” “IBM” “JPM” “MSFT” “WMT”

> P <‐ NULL

>for(ticker in tickers) {

+tmp<‐ Cl(to.monthly(eval(parse(text = ticker))))

+P <‐ cbind(P, tmp)

+ }

>colnames(P) <‐ tickers

> R <‐ diff(log(P))

> R <‐ R [‐1,]

> mu <‐ colMeans(R)

> sigma <‐ cov(R)


>pContribCVaR<‐ ES(weights = rep(0.2, 5),

+method = “gaussian”, portfolio_method = “component”,

+mu = mu, sigma = sigma)$pct_contrib_ES

>obj<‐ function(w) {

+if (sum(w) == 0) {w <‐ w + 1e‐2 }

+w <‐ w/sum(w)

+CVaR<‐ ES(weights = w,

+method = “gaussian”, portfolio_method = “component”,

+mu = mu, sigma = sigma)

+tmp1 <‐ CVaR$ES

+tmp2 <‐ max(CVaR$pct_contrib_ES ‐ 0.225, 0)

+out <‐ tmp1 + 1e3 * tmp2



GenSA can be used to determine the global optimum of this function using a bounded search domain from 0 to 1 values for the five parameters to be optimized:


>lb<‐ rep(0, 5) # lower bounds, minimum values for all 5 parameters are 0

>ub<‐ rep(1, 5) # upper bounds, maximum values for all 5 parameters are 1

> out1.GenSA <‐ GenSA(fn = obj, lower = lb, upper = ub)

For non‐differentiable objective functions, the smooth parameter in the control argument can be set to FALSE, which means that the Nelder‐Mead method is used in the local search:

> out2.GenSA <‐ GenSA(fn=obj, lower=rep(0, 5), upper=rep(1, 5),


The parameter is set to 3000 to make the algorithm stop earlier:

> out2.GenSA$value

[1] 0.1141484884

> out2.GenSA$counts

[1] 3000

>cat(“GenSA call functions”,, “times.\n”)

GenSA call functions 3000 times.

>wstar.GenSA<‐ out2.GenSA$par

>wstar.GenSA<‐ wstar.GenSA/sum(wstar.GenSA)

>rbind(tickers, round(100 * wstar.GenSA, 2))

[,1] [,2] [,3] [,4] [,1]

tickers “GE” “IBM” “JPM” “MSFT” “WMT”

”18.92” “21.23” “8.33” “15.92” “35.6”

>100 * (sum(wstar.GenSA * mu) ‐ mean(mu))

[1] 0.03790568876

GenSA determined a minimum of 0.1141484884 within 3000 function calls.

4. Discussion and conclusions

The discrete optimization problem, in particular, the feature selection problem, exists extensively. GSA can also be used for the discrete problem. Please refer to [37] for details.

GSA is a powerful method for the non‐convex global optimization problem. We developed an R package GenSA based on Tsallis statistics and GSA. In an extensive performance testbed composed of 134 benchmark functions, GenSA provided a higher average success rate% and a smaller median of the number of function calls compared with two widely recognized R packages: DEoptim and rgenoud. GenSA is useful and can provide a solution that is comparable with or even better than that provided by other widely used R packages for optimization.

R is very good for program prototype. When there is a need for heavy computation, other computational languages, such as C/C++, Fortran, Java, and Python,are recommended. Considering both speed and usability, aPython version of GSA, PyGenSA, is being developed and will be released within the SciPy scientific toolkitat the end of 2016.


This work would not have been possible without the R open‐source software project. We would like to thank our colleague Julia Hoeng for inspirational discussions and support. This study was funded by Philip Morris International.

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Yang Xiang, Sylvain Gubian and Florian Martin (April 26th 2017). Generalized Simulated Annealing, Computational Optimization in Engineering - Paradigms and Applications, Hossein Peyvandi, IntechOpen, DOI: 10.5772/66071. Available from:

chapter statistics

946total chapter downloads

1Crossref citations

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

A Simulated Annealing Based Optimization Algorithm

By Yoel Tenne

Related Book

First chapter

Final Results of Assembly Line Balancing Problem

By Waldemar Grzechca

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us