A Review and Comparative Study of Firefly Algorithm and its Modified Versions

Firefly algorithm is one of the well-known swarm-based algorithms; it gained popularity within a short time and has many applications. It is easy to understand and implement. Existing studies show that it is prone to premature convergence and suggest relaxing the assumption of constant parameters. To boost the performance of the algorithm, different modifications have been proposed by several researchers. In this chapter, we review these modifications of the standard firefly algorithm, grouped into parameter modifications, modified search strategies and changes of the solution space that make the search easier using different probability distributions. The modifications cover continuous as well as non-continuous problems. Different studies, including hybridization of firefly algorithm with other algorithms, extensions of firefly algorithm for multiobjective, multilevel and dynamic optimization problems, constraint handling and convergence studies, will also be briefly reviewed. A simulation-based comparison will be provided to analyse the performance of the standard as well as the modified versions of the algorithm.


Introduction
An optimization problem refers to the maximization or minimization of an objective function by setting suitable values for the variables from a set of feasible values. These problems appear not only in complex scientific studies but also in our day-to-day activities. For instance, when a person wants to go from one place to another and has multiple possible routes, a decision needs to be made on which route to take. The decision can be made with the objective of minimizing travel time, fuel consumption and so on. Problems with a small number of alternatives can easily be solved by evaluating the outcome of each alternative. In real problems, however, the number of alternatives is not always finite and small. Hence, different solution methods are proposed based on the behaviour of the problem.
Since the introduction of evolutionary algorithms, many studies have been conducted on heuristic algorithms. Introducing new algorithms has been one of the leading research areas [1]. Currently, there are more than 40 metaheuristic algorithms [2]. Most of these new algorithms are introduced by mimicking a scenario from nature. For instance, genetic algorithm is inspired by Darwin's theory of survival of the fittest [3]; particle swarm optimization is another metaheuristic algorithm mimicking how a swarm moves by following each other [4]; firefly algorithm is inspired by how fireflies signal each other using flashing light to attract mates or to identify predators [5] and prey predator algorithm is another new algorithm inspired by the behaviour of a predator and its prey [6]. These algorithms use different degrees of exploration and exploitation based on their different search mechanisms.
Firefly algorithm is among those metaheuristic algorithms which have many applications. Its uncomplicated and easy steps, together with its effectiveness, attract researchers from different disciplines. Different studies have been performed to modify the standard firefly algorithm to boost its performance and to make it suitable for a problem at hand. In this chapter, a comprehensive study will be presented on firefly algorithm and its modified versions. A brief discussion on extended firefly algorithm with other relevant studies will also be provided. In the next section, a discussion on optimization problems with their solution methods will be given, followed by a review of studies on firefly algorithm, which includes a discussion on the standard firefly algorithm with its modified versions and other relevant studies on firefly algorithm, in Section 3. In Section 4, a comparative study based on simulation results will be presented, followed by a summary of the chapter in Section 5.
When these problems are formulated mathematically, they are called mathematical optimization problems. Such a problem has a set of feasible actions, also called the feasible region, and a measure of performance of these actions called the objective. A standard single objective minimization problem can be given as in Eq. (1):

min f(x) subject to x ∈ S, (1)
where f : ℝ^n → ℝ is called the objective function, S is the feasible region and the vector x is the decision variable. A vector x* is said to be an optimal solution for the minimization problem given in Eq. (1) if and only if x* ∈ S and f(x*) ≤ f(x), ∀x ∈ S. A local solution x′ is a member of S with f(x′) ≤ f(x) for all x in a neighbourhood of x′.
In a broad sense, optimization solution methods can be categorized as exact and approximate solution methods. Exact solution methods are methods which use an exhaustive search for the exact solution in the solution space. They use mathematical and statistical arguments to get an exact solution, mainly through calculus-based and iterative procedures. Perhaps Fermat was the first to use a calculus-based argument to solve optimization problems [19]. Iterative methods were first proposed and used by Newton and Gauss [20]. Since then, several exact solution methods have been proposed and used for different problems. Branch and bound, the simplex method and the gradient descent method are good examples of exact solution methods. However, complex problems modelled from complex real-world situations become challenging for these deterministic solution methods. This led to the search for new 'out of the box' ways of solving such problems, which in turn gave rise to the birth of metaheuristic algorithms.
Metaheuristic algorithms are approximate solution methods for an optimization problem which use randomness with an 'educated guess' in their search mechanism and try to improve the quality of the solutions at hand through iterations, starting from a randomly generated set of feasible solutions, by exploring and exploiting the solution space. Even though these algorithms do not guarantee optimality, they have been shown to give reasonable and acceptable solutions. Furthermore, they have the advantage of not being affected much by the behaviour of the problem; this makes them useful in many applications. Having a variety of algorithms gives the option to choose a suitable one to solve a problem according to its behaviour.

Introduction
Nature has been an inspiration for the introduction of many metaheuristic algorithms. It has managed to find solutions to problems not by instruction but through experience. Natural selection and survival of the fittest were the main motivation behind the early metaheuristic algorithms. Different animals communicate with each other through different modes of communication. Fireflies use their flashing property to communicate. There are around 2000 firefly species, each with its own distinct flash pattern. They usually produce a short flash with a certain pattern. The light is produced by a biochemical process called bioluminescence. The flashing communication is used to attract mates and also to warn predators. Based on the pattern of the light, a suitable mate communicates back by either mimicking the same pattern or responding with a specific pattern. It also needs to be noted that the light intensity decreases with distance; hence, a flashing light emanating from a firefly gets a response from fireflies around it within the visual range of the flash.
In addition to creating the beautiful view of a summer sky, fireflies have motivated and have been the centre of many scientific studies [5,21,22]. In the sense of optimization, if we consider the fireflies as solutions on the landscape of the solution space, then the attraction and movement of fireflies can inspire an optimization algorithm in which solutions follow better (brighter) solutions. Hence, firefly algorithm is motivated and inspired by these properties.

The standard firefly algorithm
Firefly algorithm is a swarm-based metaheuristic algorithm which was introduced by Yang [5]. The algorithm mimics how fireflies interact using their flashing lights. The algorithm assumes that all fireflies are unisex, which means any firefly can be attracted by any other firefly; the attractiveness of a firefly is directly proportional to its brightness, which depends on the objective function. A firefly will be attracted to a brighter firefly. Furthermore, the brightness decreases with distance according to the inverse square law, as given in Eq. (2):

I(r) = I_0 / r^2. (2)
If the light is passing through a medium with a light absorption coefficient γ, then the light intensity at a distance r from the source can be given as in Eq. (3):

I(r) = I_0 e^(−γr), (3)
where I_0 is the light intensity at the source. Similarly, the brightness, β, can be given as in Eq. (4):

β = β_0 e^(−γr^2). (4)
A generalized brightness function for ω ≥ 1 is given in Eq. (5) [5]:

β = β_0 e^(−γr^ω). (5)

In fact, any monotonically decreasing function can be used.
In the algorithm, randomly generated feasible solutions, called fireflies, will be assigned a light intensity based on their performance in the objective function. This intensity will be used to compute the brightness of the firefly, which is directly proportional to its light intensity. For minimization problems, the solution with the smallest functional value will be assigned the highest light intensity. Once the intensity or brightness of the solutions is assigned, each firefly will follow fireflies with better light intensity. The brightest firefly will perform a local search by randomly moving in its neighbourhood. Hence, for two fireflies, if firefly j is brighter than firefly i, then firefly i will move towards firefly j using the updating formula given in Eq. (6):

x_i := x_i + β_0 e^(−γ r_ij^2)(x_j − x_i) + αε(), (6)
where β_0 is the attractiveness of x_j at r = 0 (in [5] the author recommended β_0 = 1 for implementation), γ is an algorithm parameter which determines the degree to which the updating process depends on the distance r_ij between the two fireflies, α is an algorithm parameter for the step length of the random movement and ε() is a random vector with entries drawn uniformly from [0, 1]. For the brightest firefly, x_b, the second (attraction) expression in Eq. (6) will be omitted, as given in Eq. (7):

x_b := x_b + αε(). (7)

These updates of the locations of the fireflies continue with the iterations until a termination criterion is met. The termination criterion can be a maximum number of iterations, a tolerance from the optimum value if it is known, or no improvement in consecutive iterations. The algorithm is summarized in Table 1.
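As a concrete illustration, the steps summarized above can be sketched in Python. This is a minimal sketch of the standard firefly algorithm for box-constrained minimization, not a reference implementation: the centred noise term (rand − 1/2), the boundary clipping and the bookkeeping of the best solution seen so far are common implementation choices rather than part of the description in [5].

```python
import numpy as np

def firefly_algorithm(f, bounds, n_fireflies=20, max_iter=100,
                      beta0=1.0, gamma=0.01, alpha=0.2, seed=0):
    """Minimize f over a box [lo, hi]; sketch of the standard firefly algorithm."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    x = rng.uniform(lo, hi, size=(n_fireflies, dim))
    fx = np.array([f(xi) for xi in x])
    best_x, best_f = x[np.argmin(fx)].copy(), float(fx.min())
    for _ in range(max_iter):
        order = np.argsort(fx)            # index 0 = brightest (minimization)
        x, fx = x[order], fx[order]
        for i in range(n_fireflies):
            for j in range(i):            # every firefly j is brighter than firefly i
                r2 = float(np.sum((x[i] - x[j]) ** 2))
                beta = beta0 * np.exp(-gamma * r2)               # Eq. (4)
                eps = rng.uniform(size=dim) - 0.5                # centred noise
                x[i] = x[i] + beta * (x[j] - x[i]) + alpha * eps # Eq. (6)
            if i == 0:                    # brightest firefly: random walk, Eq. (7)
                x[0] = x[0] + alpha * (rng.uniform(size=dim) - 0.5)
            x[i] = np.clip(x[i], lo, hi)  # keep the firefly feasible
            fx[i] = f(x[i])
            if fx[i] < best_f:            # record the best solution seen so far
                best_x, best_f = x[i].copy(), float(fx[i])
    return best_x, best_f

# usage: minimize the 2-D sphere function f(x) = x1^2 + x2^2 over [-5, 5]^2
lo, hi = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
best_x, best_f = firefly_algorithm(lambda v: float(np.sum(v * v)), (lo, hi))
```

The double loop over sorted fireflies reproduces the "each firefly follows every brighter firefly" behaviour of Table 1; termination here is simply a fixed iteration budget.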

Modified versions of firefly algorithm with critical analysis
Firefly algorithm is efficient and easy to implement. It is also suitable for parallel implementation. However, research shows that it is slow to converge and easily gets trapped in local optima for multimodal problems. In addition, the updates depend solely on current performance; no memory of previous best solutions and performances is kept, which may lead to losing better solutions. Furthermore, since the parameters are fixed, the search behaviour remains the same under all conditions in all iterations. Hence, modifying the standard firefly algorithm to boost its performance has been one of the research issues. Furthermore, the standard firefly algorithm is designed for continuous optimization problems; hence, in order to use it for non-continuous problems, it needs to be modified and adjusted.

Modification for problems with continuous variables
Basically, there are three classes of modification. Class 1 is modification of the parameters: the parameters of the algorithm are modified while the same updating mechanisms or formulas are used. Class 2 contains new updating mechanisms; it includes modifications which change part or all of the updating formulas, add mutation operators and the like. The last category, Class 3, includes modifications of the search space (with the same updating mechanism, it may be easier to switch to another 'easy-to-search' space) and changes in the probability distribution used when generating random numbers. The categories are not necessarily disjoint, as some modifications may fall in multiple classes.

Class 1 (parametric modification)
In the standard firefly algorithm, the parameters in Eq. (6) are user-defined constants. Like any other metaheuristic algorithm, the performance of firefly algorithm depends on these parameter values. They control the degree of exploration and exploitation.
Some modifications of firefly algorithm make these parameters variable and adaptive. In recent research on the modification of firefly algorithm, the parameters α, γ and also r have been modified. The modification of α affects the random movement of the firefly, whereas modifying either γ or r affects the degree of attraction between fireflies. Adjusting the brightness at the origin, β_0, has also been done in some studies.

a. Modifying the random movement parameter
To deal with parameter identification of infinite impulse response (IIR) and nonlinear systems, firefly algorithm is modified in [23]. The random movement is modified based on initial and final step lengths α_0 and α_∞ using α := α_∞ + (α_0 − α_∞)e^(−Itr), where Itr is the iteration number. In addition, a fourth term, given by αε(x_i − x_b) where x_b is the brightest of all the fireflies, is added to the updating process so that firefly algorithm resembles and has a search behaviour like particle swarm optimization. In order to implement this modification, the initial and final randomization parameters, α_0 and α_∞, need to be supplied by the user. The randomization parameter is set to decrease exponentially and vanishes within a few iterations; for example, if α_0 = 1 and α_∞ = 0.2, starting from 0.089 in the first iteration it decreases to 0.0001 by the seventh iteration. Furthermore, the additional term in the updating formula takes the solution x_i away from x_b with a step length of αε. This contradicts the algorithm's principle of following the current best solution. Assuming that the step length for the new term as well as for the randomness is the same, there is only one additional parameter, either α_0 or α_∞, in place of the single parameter α.
Another firefly algorithm with adaptive α is presented in [24], where α is made adaptive based on the iteration number Itr and the maximum number of iterations MaxItr. In addition, two new solutions are generated based on three solutions chosen randomly from the population; the better of the two new solutions and x_i will replace x_i and pass to the next iteration. This approach has also been used to solve the optimal capacitor placement problem [25]. A similar modification of α with additional mutation and crossover operators is given in [26].
In [27], the randomization parameter is modified based on the number of iterations using α = 0.4/(1 + e^(0.005(Itr − MaxItr))). The simulation results on 16 benchmark problems show that the modification increases the performance of the standard firefly algorithm significantly.
In extending firefly algorithm for multiobjective problems, an adaptive α is proposed and used in [28]. Here α is made adaptive based on the iteration number and is given by α := α_0 (0.9)^Itr.
Hence, the step length decreases faster than linearly.
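The adaptive α schedules discussed above can be contrasted in a few lines of Python. Only the functional forms follow the descriptions of [23], [27] and [28]; the default parameter values are illustrative assumptions.

```python
import math

def alpha_exponential(t, a0=1.0, a_inf=0.2):
    # [23]: alpha decays exponentially from a0 towards the final value a_inf
    return a_inf + (a0 - a_inf) * math.exp(-t)

def alpha_sigmoid(t, t_max=1000):
    # [27]: sigmoid-shaped decay driven by the iteration number t
    return 0.4 / (1.0 + math.exp(0.005 * (t - t_max)))

def alpha_geometric(t, a0=0.5):
    # [28]: alpha := a0 * 0.9^t, which decreases faster than linearly
    return a0 * 0.9 ** t
```

All three schedules are monotonically decreasing in the iteration number; they differ in how quickly the random step length is damped, and hence in how quickly the search shifts from exploration to exploitation.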
Self-adaptive step firefly algorithm is another modification, done to the third term of the updating process, by Yu et al. [29]. The step length α_i of firefly i is updated based on historical information and the current situation, using the iteration number Itr together with f_i^best(Itr − 1) and f_i^best(Itr − 2), the best performance of solution x_i up to iterations Itr − 1 and Itr − 2, respectively. Sixteen two-dimensional benchmark problems are used, showing that the proposed approach produces better results with smaller standard deviation. Controlling the randomization parameter based on a solution's previous and current performance is a promising idea; the update works in such a way that whenever the solution approaches the brightest firefly its step length decreases. However, since the performance of the solutions needs to be saved, the memory complexity should be studied.
Another modification of the random movement parameter based on the historic performance of the solution is presented in [30]. The parameter is adapted based on the solution's best position up to the current iteration, x_i,best, and the global best solution up to the current iteration, x_gbest.

b. Modifying the attraction parameters
The attraction of one firefly by another depends on the light intensity at the source of the brighter firefly as well as on the distance between them and the light absorption coefficient of the medium.
A small change in the distance between two fireflies results in a quick decrease of the attraction term. To deal with this problem, Lin et al. [31] introduced a virtual distance which puts r in between 0 and 1, defined by r′ = (r − r_min)/(r_max − r_min), with r_min = 0 and r_max computed from x_max(i) and x_min(i), the maximum and minimum values of the ith dimension over all fireflies, respectively. Furthermore, β is set as β = β_0 γ(1 − r′). In later iterations, the swarm tends to converge around an optimal solution; the distance r decreases and so does r_max. However, in most cases, the decreasing rate of r is faster than that of r_max, resulting in a slight increase in β. In order to overcome the possibility of the attraction term dominating the random movement, the authors proposed a new updating equation for later iterations. Indeed, the new updating formula omits the random movement of the firefly: a firefly only moves towards a brighter firefly with a step length of βαε.
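The virtual-distance rescaling can be sketched as follows. This is a sketch of the idea in Lin et al. [31], with r_min = 0 as stated above; taking r_max as the largest pairwise distance in the current swarm is an assumption made here for a self-contained example.

```python
import numpy as np

def virtual_distances(x):
    """Rescale all pairwise firefly distances into [0, 1].

    x : array of shape (n_fireflies, dim). Returns an (n, n) matrix of
    r' = (r - r_min) / (r_max - r_min) with r_min = 0 and r_max the
    largest pairwise distance in the swarm (illustrative choice).
    """
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # pairwise r
    r_min, r_max = 0.0, d.max()
    return (d - r_min) / (r_max - r_min)

# usage: three fireflies in 2-D; the farthest pair gets r' = 1
swarm = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
r_prime = virtual_distances(swarm)
```

Because r′ stays in [0, 1] regardless of the scale of the search space, the attraction term no longer collapses for moderately distant fireflies.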
Tilahun and Ong [32] suggested that, rather than setting β_0 = 1, it should be a function of the light intensity, given by β_0 = e^(I_0,j − I_0,i) for a firefly i attracted to firefly j, where I_0,j and I_0,i are the intensities of fireflies j and i at r = 0. In addition, moving the brightest firefly randomly may decrease its brightness; hence a direction which improves the brightness is chosen from m random directions. If no improving direction is among these m directions, the firefly stays in its current position.
The complexity of the algorithm may increase with the new parameter m, and this should be taken into consideration.
Due to the non-repetition and ergodicity of chaos, chaotic maps can carry out overall searches at higher speeds. Hence, Gandomi et al. [33] proposed a modification of β and γ using chaos functions. This approach has attracted many researchers and has been used in different problem domains. It has been successfully applied using Chebyshev chaos mapping for MRI brain tissue segmentation in [34], for heart disease prediction using Gaussian mapping [35], for reliability-redundancy optimization [36] and for solving definite integral problems in [37]. In [38], chaotic mapping is used for β or γ. In addition, α is made to decrease based on the intensity of the solutions using α = α_max − (α_max − α_min)(I_max − I_mean)/(I_max − I_min), where I_max, I_mean and I_min are the maximum, the average and the minimum intensities of the solutions.
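Chaotic parameter tuning replaces a fixed β or γ with successive values of a chaotic map. As an illustration, the logistic map, one of the classic maps considered in the chaos-based firefly literature [33], can generate such a sequence; the initial value and the choice of map here are illustrative.

```python
def logistic_map_sequence(n, x0=0.7, mu=4.0):
    """Generate n values of the logistic map x := mu * x * (1 - x).

    With mu = 4 the map is chaotic on (0, 1): the sequence is non-repeating
    and spreads over the whole interval, which is the property exploited
    when it replaces a fixed beta or gamma value.
    """
    seq, x = [], x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return seq

# usage: a chaotic schedule of 100 candidate gamma values
gamma_schedule = logistic_map_sequence(100)
```

In a chaotic firefly variant, the kth iteration would simply read its β or γ from such a sequence instead of using a constant.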
Another modification in this category is done in [39], where β is modified using β = (β_max − β_min)e^(−γr^2) + β_min, with β_min and β_max being user-supplied values. A similar modification is also done in [40].

c. Modifying both the random and attraction movement parameters
The performance of the standard algorithm deteriorates with an increase in the problem dimension and the size of the feasible region. To overcome this challenge, Yan et al. [41] proposed a modification of the standard firefly algorithm. This modification is done on the generalized distance term given in Eq. (5), in which r^ω is replaced by r/(K · n · Range), where K is a constant parameter, n is the dimension of the problem and Range is the maximum range of the dimensions. The parameter α also reduces linearly with the iteration from a starting value α_0 to a final value α_end. In addition, a firefly is attracted to another firefly if it is brighter and if it is winking. The winking is based on a probability p_w = 0.5 + 0.1 count_i, where count_i is the value of firefly i's winking state counter. The larger the counter, the greater the probability of shifting the state. The maximum counter value is five, after which it is reset to 0.
In order to solve the economic dispatch problem, firefly algorithm is modified in [42]. To increase the exploration property, the authors replaced the Cartesian distance by the minimum variation distance. In addition, they used a mutation operator on α, but no explanation of how the mutation works is given.
To deal with premature convergence, firefly algorithm has also been modified based on the light intensity [43]. A light intensity difference ξ is defined using a new parameter η_1, and α is set as α = α_0(0.02 r_max) based on another new algorithm parameter η_2. The modification works so that, of two fireflies, the brighter one has smaller attraction and randomness step lengths compared to the dimmer one.
For the optimal sizing and siting of a voltage-controlled distribution generator in a distributed network, firefly algorithm is modified and used in [44]. The problem is to minimize the power loss by selecting optimal locations for distributed generations and the power produced. In the modification, β_0 = 1, whereas γ and α are modified based on the problem property (location and maximum power per location in each iteration). This modification is done based on the problem characteristics; the effectiveness and solution quality of a metaheuristic algorithm depend on the proper tuning of the algorithm parameters as well as on the behaviour of the landscape of the objective function.
Another modification of the standard firefly algorithm to be listed in this category is done in [45]. The randomization parameter α has been made adaptive using α = α_max − Itr(α_max − α_min)/MaxGen. Furthermore, the distance function has been made to depend not on the fireflies' locations in the landscape of the feasible region but on their brightness or functional values; two fireflies with similar performance in the objective function are considered to be near each other.
For path planning of autonomous underwater vehicles, the parameters γ and α of the standard firefly algorithm are modified in [46]. Furthermore, the updating formula is defined as x_i := x_i + β(x_i − x_j) + αrε. As the iteration number increases, α decreases and γ increases linearly, implying both the random movement and the attraction decrease as a function of the iteration. In the updating formula, the random movement is multiplied by the distance.
A similar approach in which the parameters α, β and γ are encoded in the solution is proposed in [47]. Unlike in [44], the update is done using ψ := ψ + σ_ψ N(0, 1), where σ_ψ := σ_ψ e^(τ′N(0,1) + τN(0,1)) for learning parameters τ, τ′ and ψ ∈ {α, β, γ}. Another modification that can be listed in this category is done in [48]. The parameter γ is modified using γ = γ_max − (γ_max − γ_min)(Itr/MaxGen)^2 for 2 ≤ γ_max ≤ 4 and 0.5 ≤ γ_min ≤ 1. In addition, for a new parameter λ, α = α_max − (α_max − α_min)(Itr/G_0)^λ, where G_0 is the iteration number at which α = α_min. This results in α decreasing quicker than a linear function if λ is in the range (0, 1), linearly if λ = 1 and slower than a linear function if λ > 1. Furthermore, in order to overcome trapping of the solutions in a local optimal solution, a Gaussian-distributed move is applied to the brightest solution; this is applied if the variance of the solutions over the preceding M iterations, for a predetermined M, is less than a given precision parameter η. The authors also suggested that chaos, particularly cubic mapping, can be used to improve the distribution of the initial solutions.
In [49], γ and α are computed using γ = 0.03|G_1| and α = 0.03|G_2|, where G_1 and G_2 are generated from the Gaussian (normal) distribution with mean 0 and variance 1. Supported by two case studies on multivariable proportional-integral-derivative (PID) controller tuning, a similar study was done in [50]. The authors used the Tinkerbell map to tune γ, with normalized values generated by the map in the range [0, 1]. In addition, α is modified to decrease linearly with the iteration number.

Class 2 modifications (new updating mechanisms)
The updating mechanism in the standard firefly algorithm is guided by Eqs. (6) and (7). In Class 1 modifications, the same updating equations are used but with adaptive parameters. Class 2 modifications include modifications of the updating equations: changes in the updating process of the best (the brightest) and the worst (the dimmest) solutions, changes to part of the updating equations and modifications with an additional mutation operator.

a. Modifying the movement of the brightest or dimmest firefly
In high-dimensional problems, the exploration is weak, which results in premature convergence. To deal with this, two modifications of the standard firefly algorithm are proposed in [51]. First, for the initial N random solutions, their opposites are generated, where the opposite of x is given by x_min + x_max − x, and the best N solutions are chosen from the N solutions and their opposites. Second, the brightest solution x_b is updated in such a way that, similar to the previous modification, the best solution improves or remains unchanged in each iteration.
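The opposition-based initialization step can be sketched directly from the description above. This is a sketch, assuming a box-constrained minimization problem with bounds lo and hi playing the roles of x_min and x_max.

```python
import numpy as np

def opposition_based_init(f, lo, hi, n, seed=0):
    """Opposition-based initialization as described in [51] (sketch).

    Generate n random solutions, form their opposites x_min + x_max - x,
    and keep the best n of the combined pool of 2n points (minimization).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, size=(n, len(lo)))
    opposites = lo + hi - x                 # opposite point of each solution
    pool = np.vstack([x, opposites])
    vals = np.array([f(p) for p in pool])
    keep = np.argsort(vals)[:n]             # best n of the 2n candidates
    return pool[keep]

# usage: initialize 10 fireflies for the 2-D sphere function
lo, hi = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
pop = opposition_based_init(lambda v: float(np.sum(v * v)), lo, hi, 10)
```

By construction, the selected population is never worse than the plain random sample it started from, which is the appeal of the opposition step.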
Opposition-based learning is also used in [52] to update the dimmest solution x_w, the solution with the worst performance, based on an algorithm parameter p. Indeed, relocating the worst solution to a new position may give the algorithm a good explorative behaviour.

b. Mutation incorporation
Jumper firefly algorithm is a modified firefly algorithm in which a memory of the performance of each of the solutions is kept [53]. A criterion called the hazard condition is defined, and solutions are tested based on their previous performance. If a solution is in hazardous condition, it is randomly replaced by a new solution. Hence, based on the hazard condition, a mutation can be performed by replacing a weak solution, identified from its previous performance, by a new one.
Another modification in this category is done by Kazemzadeh-Parsi [54], where in each iteration k random 'newborn' solutions are generated to replace the weakest solutions. In addition to this mutation, the k_1 highest-ranked solutions from the previous iteration replace the same number of weak solutions. The other modification is that, rather than moving a firefly consecutively towards each brighter firefly, a single combined direction, based on the average (1/l)∑_{j=1}^{l} x_j of the l brighter fireflies x_j, is computed and used. A similar approach is used in [55]. In the first case, where newborn solutions replace weak solutions, the number of such solutions should not be large; otherwise the method behaves like an exhaustive search. In the second case, whenever some solutions are replaced by solutions from the previous iteration, the incoming solutions will more or less repeat the search behaviour of the previous iteration.
Another modification of firefly algorithm, introducing new solutions through mutation and crossover, is given in [26,56]. In addition to the adaptive parameter α, the authors introduced two mutation operators and five crossover operators based on the mutated solutions. The first mutation operator combines three randomly selected solutions x_q1, x_q2 and x_q3, all different from x_i, using x_mute1 = x_q1 + ε(x_q2 − x_q3); the second is based on the solution from the first mutation operator together with the best and worst solutions, using x_mute2 = x_mute1 + εt(x_b − x_w) for an iteration t. Based on these two solutions, five new solutions are generated by the crossover operators, and the best one among the mutated as well as the five new solutions replaces x_i. A similar modification of α and two mutation types are also proposed in [24]. In [57], the parameter α is made adaptive using chaotic mapping and mutation operators.
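The two mutation operators can be sketched as follows. This follows the formulas x_mute1 = x_q1 + ε(x_q2 − x_q3) and x_mute2 = x_mute1 + εt(x_b − x_w) quoted above; how ε is drawn and how t enters the scaling are assumptions based on the description, not details confirmed by [26,56].

```python
import numpy as np

def mutate(pop, i, x_best, x_worst, t, rng):
    """Sketch of the two mutation operators of [26,56].

    pop     : (n, dim) array of current fireflies
    i       : index of the firefly being mutated (excluded from the draw)
    x_best  : best solution x_b, x_worst : worst solution x_w
    t       : current iteration number
    """
    dim = pop.shape[1]
    # pick three distinct solutions q1, q2, q3, all different from i
    candidates = [k for k in range(len(pop)) if k != i]
    q1, q2, q3 = rng.choice(candidates, size=3, replace=False)
    eps1 = rng.uniform(size=dim)
    x_mute1 = pop[q1] + eps1 * (pop[q2] - pop[q3])      # difference mutation
    eps2 = rng.uniform(size=dim)
    x_mute2 = x_mute1 + eps2 * t * (x_best - x_worst)   # best-worst pull
    return x_mute1, x_mute2

# usage on a small illustrative population
pop = np.arange(15.0).reshape(5, 3)
m1, m2 = mutate(pop, 0, pop[1], pop[4], 0, np.random.default_rng(1))
```

The first operator is the classic difference-vector mutation also used in differential evolution; the second adds an iteration-scaled pull along the best-to-worst direction.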

c. New updating strategy
This is another category of Class 2 modifications, in which the updating formula, given by Eqs. (6) and (7), is modified or changed. The first modification to mention in this category is the one proposed in [57]. For a firefly i attracted by another firefly j, the search is updated to be in the vicinity of x_j, as given by x_i := x_j + β(x_j − x_i) + αε. Furthermore, after the update, only improving solutions are accepted. Since the update is done based on the location of x_j, the region between the two solutions will not be explored, and the search will be trapped in a local optimum provided x_j is a local solution and the step lengths are small. Through the iterations, the solutions will be forced into a neighbourhood of the best solution, and the diversity of the solutions will be low.
A similar modification in the vicinity of the brighter firefly is given in [58]. The authors proposed two updating formulas, 'with division' and 'without division', as they name them. The updating formula without division is given by x_i := x_j + αε. Once the fireflies are sorted according to their brightness, increasing with their index, the updating formula is defined by x_i := x_j + α_j ε, which decreases α as the brightness increases and gives a good intensification or exploitation property. In addition, the parameters α and γ are made adaptive, which puts this modification in both Class 1 and Class 2 of our classification. A similar discussion holds for the related work in [57]. In addition, unlike in [58], in [59] there is an attraction term in the direction from the brighter x_j towards x_i; that means the brighter firefly is moved in a non-promising direction to replace x_i. Hence, in this sense, the modification in [58] is better, as it moves the solution not in a non-promising direction but randomly.
For a data clustering problem, the standard firefly algorithm is modified in [60,61]. The authors proposed a new updating formula to increase the influence of the brightest firefly.
The new updating formula adds an attraction towards the best solution found so far, x_gbest; that is, a firefly is attracted not only to brighter fireflies but also to x_gbest. Suppose there are l fireflies brighter than x_i; at the end of the iteration the total attraction term will be ∑_{j=1}^{l} β(x_j − x_i) + l β_0 e^(−γ r_{i,gbest}^2)(x_gbest − x_i). Hence, repeatedly moving a firefly towards the best solution increases the attraction step length, which, depending on the feasible region, may not be acceptable. Furthermore, the best solution found so far can be a local optimum, and the solutions may then be forced to follow that local solution rather than exploring other regions in each loop of the iteration. In [61], four ten-dimensional and four twenty-dimensional benchmark problems, along with the Iris data set, are used for clustering. A similar modification is done in [62], where, in addition to updating α in a decreasing manner, the updating formula for a solution x_i is modified based on the brighter firefly x_b and the best solution from memory, gbest, with a new algorithm parameter λ. Fuzzy firefly algorithm is another modification of the standard firefly algorithm [63]. Even though the authors start with a wrong claim, saying that "in the standard firefly algorithm, only one firefly in each iteration can affect others and attract its neighbours", they try to increase the exploration property of the algorithm by adding a term in which the top k fireflies attract each firefly, with l being a new algorithm parameter. The effect of the best k fireflies is doubled, and if this updating mechanism is applied for each brighter firefly x_j, their effect is more than doubled. Furthermore, multiplying the random term by the second expression affects its step length and removes the pure random movement. Hence, it forces the fireflies to follow the best k solutions, and exploring other regions is not possible with this update.
Another modification with a new, piecewise updating formula is proposed in [64], where β_0 is computed based on the location of x_i, normalized by the locations of the fireflies in the search space; γ is computed in direct relation with β_0 and two additional parameters; and η is a value based on the difference between the locations of the fireflies.
Diversity-guided firefly algorithm is one of the recent modified versions [65]. The modification is done to keep the solutions as diverse as possible with respect to a given threshold. The updating mechanism of the standard firefly algorithm is used until the diversity of the solutions falls below the given threshold. The diversity is measured by diversity = (1/(NL)) ∑_{i=1}^{N} ||x_i − x̄||, where N is the number of fireflies, L is the longest diagonal of the feasible region, and x̄ is the average position of all fireflies. If the diversity is less than the predefined threshold value, then the updating formula becomes x_i := x_i + β(x_j − x_i) + αε(x_i − x̄). The proposed modification is effective in diversifying the solutions, since it replaces the random movement by a move away from the average position of the fireflies.
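The diversity measure and the threshold-switched update from [65] can be sketched as below; the threshold default and function names are illustrative assumptions.

```python
import numpy as np

def diversity(pop, L):
    """Normalized average distance of the fireflies from their mean position;
    L is the longest diagonal of the feasible region."""
    x_bar = pop.mean(axis=0)
    return np.mean(np.linalg.norm(pop - x_bar, axis=1)) / L

def diversity_guided_move(x_i, x_j, pop, L, threshold=0.1,
                          beta=1.0, alpha=0.2, rng=None):
    """Standard firefly move, except that when diversity drops below the
    threshold the random walk is replaced by a move away from the mean."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.random(x_i.shape) - 0.5
    if diversity(pop, L) < threshold:
        # low diversity: push the solution away from the average position
        return x_i + beta * (x_j - x_i) + alpha * eps * (x_i - pop.mean(axis=0))
    return x_i + beta * (x_j - x_i) + alpha * eps      # standard update
```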
In [66,67], a mutated firefly algorithm is proposed in which the brighter firefly donates some of its features based on a new algorithm parameter called the probability of mutation, p_m. The features copied from the bright firefly, and their number, are not specified; however, based on the context, it appears that some components of the vector x_i are replaced by the corresponding components of the brighter firefly x_j. In [66], this mutation operator replaces the updating formula given in Eq. (6), whereas in [67], the mutation is performed after the update has taken place.
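Under the reading given above, the operator can be sketched as a component-wise copy; since [66,67] do not specify the details, this is one plausible interpretation, not the papers' exact operator.

```python
import numpy as np

def mutate_from_brighter(x_i, x_j, p_m=0.2, rng=None):
    """Copy each component of the brighter firefly x_j into x_i with
    probability p_m (one plausible reading of the operator in [66,67])."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x_i.shape) < p_m
    return np.where(mask, x_j, x_i)
```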
In [68], a firefly located at x_i first checks the direction towards each brighter firefly and looks for the one that improves its performance; it moves towards such a firefly only if the move increases its brightness. Checking the direction towards each of the solutions may increase the complexity. Furthermore, in order to escape a local solution, some solutions should be allowed to decrease in performance; hence, this modification is highly affected by local optima, especially in misleading problems.
Another modification in this category is introduced to deal with economic dispatch optimization of thermal units [69]. A memory is used to record the best solution found so far. Based on cultured differential evolution, the updating formula is modified using the normalization terms f_max − f_min and b = x_max − x_min, where x_max and x_min are the maximum and minimum components of the vector x, respectively.
Another modification in this category, again with a new updating formula, is presented in [70].

Class 3 modifications (change in search space or probability density functions)
This class of modifications operates at the abstract level and includes two types. The first is changing the solution space to a space in which the search is easier, and the second concerns the type of probability distribution used to generate a random direction vector for the random movement.

a. Change in search space
In the modified version presented in [71], each component k of a solution x_i is represented by a quaternion, x_i(k) = (y_1^(i), y_2^(i), y_3^(i), y_4^(i)), and the updating is done over the quaternion space. In order to compute the brightness, the solution is mapped back to the Euclidean search space by taking the norm, x_i(k) = ||(y_1^(i), y_2^(i), y_3^(i), y_4^(i))||. Even though the search space increases fourfold, it is interesting to zoom into each component and perform the search there. However, since a norm is used to convert the quaternion space to the search space, a mechanism to deal with negative values should be studied, and more mathematical support should be provided along with a complexity study.
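The decoding step can be sketched as follows; the function names are ours, and note how the norm makes every decoded component non-negative, which is exactly the caveat raised above.

```python
import numpy as np

def decode(q_firefly):
    """Map a quaternion-encoded firefly (a D x 4 array) to a point in the
    D-dimensional search space by taking the Euclidean norm of each
    component's quaternion, as described for [71]."""
    return np.linalg.norm(q_firefly, axis=-1)

def brightness(q_firefly, objective):
    """Evaluate the objective on the decoded (Euclidean) representation."""
    return objective(decode(q_firefly))
```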

b. Change in probability distribution function
Perhaps the first work which tries to adapt the random movement in the updating process is by Farahani et al. [72,73]. They start from the incorrect claim that "In standard Firefly algorithm, firefly movement step length is a fixed value. So all the fireflies move with a fixed length in all iterations", which ignores the random variable ε that makes the step length vary between 0 and α. They let the step length decrease with iteration and introduced a new move which updates each solution using x_i := x_i + αε(1 − p), where p is a random vector drawn from a Gaussian distribution. This increases the randomness of the algorithm, as each solution moves randomly once in addition to the usual updating equation. The same modification is also employed in [74].
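The extra Gaussian-scaled random move can be sketched as below; the function name and the exact distribution of ε are our assumptions, following the standard algorithm's uniform random term.

```python
import numpy as np

def gaussian_random_step(x_i, alpha, rng=None):
    """Extra random move from [72,73]: x_i := x_i + alpha*eps*(1 - p),
    with eps uniform and p drawn from a Gaussian; applied after the usual
    firefly update."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.random(x_i.shape) - 0.5
    p = rng.normal(size=x_i.shape)
    return x_i + alpha * eps * (1.0 - p)
```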
To enhance the random movement of the firefly algorithm, Levy firefly algorithm is introduced in [75]. This is the first modification of the firefly algorithm in which the Levy distribution guides the random movement by generating a random direction as well as the step length. The update formula is modified as x_i := x_i + β(x_j − x_i) + α sign(ε) ⊗ Levy, where ⊗ indicates componentwise multiplication between the random vector from the Levy distribution and the sign vector. Similarly, in [76], the Levy distribution is used to guide the random term of the updating formula.
In addition, the parameter α is made to decrease with iteration using α = α_max/Itr². Furthermore, what the authors call information exchange between top fireflies is performed: two solutions are randomly chosen from the top fireflies, and a new solution is generated on the line joining the two fireflies, near the brighter one. A similar approach of using the Levy distribution, with the step length generated using a chaotic random number, has applications in image enhancement [77]. The same update using the Levy distribution and the same formula for α are used in [78]. In addition to these updates, a communication between top fireflies is used in [79]. The Levy distribution, along with other probability distributions, is suggested for the randomized parameter and used in [80,81].
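One common way to realize the Levy term in such updates is Mantegna's algorithm, sketched below; this is a generic realization, not necessarily the exact generator used in [75-81].

```python
import numpy as np
from math import gamma as G

def levy_step(size, beta=1.5, rng=None):
    """Levy-distributed step via Mantegna's algorithm: u/|v|^(1/beta)
    with u ~ N(0, sigma^2) and v ~ N(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = (G(1 + beta) * np.sin(np.pi * beta / 2)
             / (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def levy_move(x_i, x_j, beta_attr=1.0, alpha=0.1, rng=None):
    """x_i := x_i + beta*(x_j - x_i) + alpha * sign(eps) (x) Levy, as in [75]."""
    rng = np.random.default_rng() if rng is None else rng
    sign = np.sign(rng.random(x_i.shape) - 0.5)
    return x_i + beta_attr * (x_j - x_i) + alpha * sign * levy_step(x_i.shape, rng=rng)
```

The heavy tail of the Levy step produces occasional long jumps, which is what gives these variants their improved exploration.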

Modifications for problems with non-continuous variables
Even though the firefly algorithm was introduced for continuous problems, due to its effectiveness it has been modified for non-continuous problems as well. In this section, we will look at three classes of modification. The first covers modifications made to solve binary problems. The second is for integer-valued problems, including problems whose variables take discrete values. The last is for mixed problems, in which some of the variables are continuous and the rest are non-continuous.

Modifications for binary problems
To deal with the set covering problem, a binary firefly algorithm is proposed in [82]. There is no modification in the updating process other than converting the solution to either one or zero. Three ways of conversion, working dimension-wise, are proposed in [82]. After a solution x_i is updated, for each component p of x_i, three rules based on a transfer function T, which maps the new value of x_i(p) into the interval [0, 1], are given, with eight candidate transfer functions. The first rule of conversion compares the p-th components of x_i before and after the update, while the last rule uses x*, the best solution from memory.
Another modification for binary problems which works dimension-wise is presented in [83]. The update formula of the standard firefly algorithm is used; after the update, each component is mapped into the interval [0, 1] using the tanh(x_i(p)) function for the p-th component of solution x_i. If the result is greater than a user-defined threshold value, then x_i(p) = 1; otherwise, it will be 0. Similar work using a sigmoid function is given in [84,85]. In [86-88], the tangent hyperbolic sigmoid function is used both for discretizing the solution and in the updating process, with a parameter λ close to one.
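The transfer-function binarization used by these papers can be sketched as follows; the absolute value inside tanh (to keep negative components in [0,1]) and the threshold default are our assumptions.

```python
import numpy as np

def binarize_tanh(x, threshold=0.5):
    """Deterministic rule as in [83]: squash each component with |tanh(.)|
    into [0, 1], then compare against a user-defined threshold."""
    t = np.abs(np.tanh(x))
    return (t > threshold).astype(int)

def binarize_sigmoid_stochastic(x, rng=None):
    """Probabilistic rule as in [84,85]: set bit p to 1 with
    probability sigmoid(x[p])."""
    rng = np.random.default_rng() if rng is None else rng
    s = 1.0 / (1.0 + np.exp(-x))
    return (rng.random(x.shape) < s).astype(int)
```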
Another discrete firefly algorithm, designed to deal with a job scheduling problem, is proposed in [89]. Each firefly x_i in the problem has two classes of index, x_i(p, q), where p represents the jobs and q their priority in that particular firefly. After the update, real values are changed to binary using a sigmoid function applied to x_i(p, q). Based on the values of the sigmoid function for each job p, the priority q with the highest probability is assigned the value 1, and the remaining priorities for that job the value 0.
In [90], the firefly algorithm has been modified for a dynamic knapsack problem. The conversion of the solutions is done based on the properties of the problem using priority-based encoding. In addition to adapting the algorithm to the problem, some modifications are done to increase its effectiveness. One of the modifications is that a firefly i moves towards firefly j if j is brighter and ε < rank_i; no further updating mechanism is mentioned in the chapter. A similar modification is used in [91]. In addition to the discretization done in [90], the authors in [91] proposed an adaptive step length given by α := 1 − φe^{−mod(Itr/(Itr+1), 1)}α for a scaling parameter φ. Furthermore, after the updates, two additional moves are introduced. The first is a random flight performed by 10% of the top fireflies with probability 0.45; the move is accepted only if it is improving. The second is a local search around x_b: after 10% of the iterations, x_b performs a local search, and the update is accepted if it is improving. These additional local searches help to improve the quality of the solution. Furthermore, the authors mention that chaotic mappings can be used to generate the random numbers.
Another modification in this category is presented in [92]. In addition to the discretization, a Levy flight centred at x_b is used. Three discretization methods, the sigmoid function, the elf function and the rounding function, are used to change values into the range [0, 1], along with three updating processes. In the first, the update is done in the continuous space and a sigmoid function changes the results to binary; in the second, the update is done in the discrete space, so the discretized results can be used directly; and the third, instead of using the updating formula, uses a probabilistic method based on the sigmoid function.

Modifications for integer optimization problems
In [93], the firefly algorithm has been modified to deal with software modularization as a graph-partitioning problem. Initially, random integer-encoded solutions are generated. The Hamming distance, the number of differing entries between two solutions with the same index, is used to measure the distance between two solutions. The update is done by switching a number of entries of a firefly with the corresponding entries from a brighter firefly.
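The distance and the entry-switching move can be sketched as below; how many entries are copied per move (n_copy) is not fixed by [93], so it is left as a parameter here.

```python
import numpy as np

def hamming(a, b):
    """Number of positions where two integer-encoded solutions differ."""
    return int(np.sum(a != b))

def move_toward(x_i, x_j, n_copy, rng=None):
    """Copy n_copy randomly chosen differing entries from the brighter
    firefly x_j into x_i (a sketch of the update in [93])."""
    rng = np.random.default_rng() if rng is None else rng
    diff = np.flatnonzero(x_i != x_j)
    pick = rng.choice(diff, size=min(n_copy, diff.size), replace=False)
    x = x_i.copy()
    x[pick] = x_j[pick]
    return x
```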
Another modification in this category is done in [94]. The modification is based on the concept of a random key, proposed in [95]. The method maps the random-number space [0,1]^D to the problem space.
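The core of the random-key idea, decoding a continuous point into a permutation by ranking its components, can be sketched in one line:

```python
import numpy as np

def random_key_decode(keys):
    """Random-key mapping [95]: a point in [0,1]^D decodes to the
    permutation given by the ranks of its components, so the continuous
    firefly update can be applied to the keys unchanged."""
    return np.argsort(keys)
```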
In [96,97], the standard firefly algorithm is modified for loading pattern enhancement. Random solutions are generated using random permutations, and the distance between fireflies, d(x_i, x_j), is measured using the Hamming distance. The updating process is separated and applied sequentially: first the β step, a move due to the attraction, and then the α step, a move due to the random movement. In the β step, entries with the same index and value in both fireflies x_i and x_j are preserved, and then an entry is copied from x_j if ε < β, where β = 1/(1 + γd(x_i, x_j)²); otherwise the x_i entry is kept. The α step is done using x_i := Int(x_i + αε), with a swapping mechanism to preserve feasibility. A similar modification of sequentially applying the β step and the α step is also used in [98], with additional modifications on β and α in which α scales with the dimension of the problem. It is a good idea to adapt and increase the step length with the dimension of the problem. For instance, when D = 12, the step length α starts from 11 and decreases to zero in the last iterations. However, if the feasible region lies in [0, 4], then for more than 60% of the iterations α will be at least 4; hence, the moves in the first 60-plus percent of the run are not acceptable, because they force the solution out of the feasible region. The modification therefore needs to consider the size of the feasible region. Another modification similar to [97] is given in [99], with the additional change of keeping the best solution and using it in the updating process: based on ρ = 0.5 + 0.5·Itr/MaxGen, if ε > ρ then the brighter firefly x_j is replaced by the global best from memory.
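The β step on permutation encodings from [96,97] can be sketched as follows. To keep the result a valid permutation, this sketch realizes each copy as a swap; the papers' own feasibility mechanism may differ.

```python
import numpy as np

def beta_step(x_i, x_j, gamma=0.1, rng=None):
    """Attraction step: matching entries are preserved; a mismatched entry
    is taken from the brighter firefly x_j with probability
    beta = 1/(1 + gamma*d^2), using a swap so feasibility is preserved."""
    rng = np.random.default_rng() if rng is None else rng
    d = int(np.sum(x_i != x_j))                  # Hamming distance
    beta = 1.0 / (1.0 + gamma * d * d)
    x = x_i.copy()
    for k in range(len(x)):
        if x[k] != x_j[k] and rng.random() < beta:
            m = int(np.flatnonzero(x == x_j[k])[0])
            x[k], x[m] = x[m], x[k]              # swap keeps a valid permutation
    return x
```

With γ = 0 the attraction probability is 1, so the move reproduces x_j exactly; larger distances make each copy less likely, mirroring the continuous attraction law.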
For the travelling salesman problem, the firefly algorithm has been modified in [100]. Initial solutions are generated as permutations of D cities, each represented as a string of these numbers. The distance between two solutions is computed using r = 10A/D, where A is the number of differing arcs. The movement is done by randomly selecting a movement length between 2 and r and then applying inversion mutation towards a better solution; if there is no better solution, a random move is made. Each firefly produces m solutions, and the best N solutions are chosen to pass to the next generation.
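The arc-based distance and the inversion move from [100] can be sketched as below; treating arcs as undirected is our assumption, as is the exact choice of segment position.

```python
import numpy as np

def arc_distance(t1, t2):
    """r = 10*A/D, where A is the number of arcs of tour t1 not present
    in tour t2 (arcs treated as undirected here)."""
    D = len(t1)
    arcs2 = {frozenset((t2[i], t2[(i + 1) % D])) for i in range(D)}
    A = sum(frozenset((t1[i], t1[(i + 1) % D])) not in arcs2 for i in range(D))
    return 10.0 * A / D

def inversion_move(tour, length, rng=None):
    """Inversion mutation: reverse a randomly placed segment of the
    given length, yielding another valid tour."""
    rng = np.random.default_rng() if rng is None else rng
    D = len(tour)
    start = int(rng.integers(0, D - length + 1))
    t = list(tour)
    t[start:start + length] = reversed(t[start:start + length])
    return t
```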
Another modification in this category is proposed in [101]. The decision variables x_i represent an assembly sequence. In the update, the random movement is omitted, and the attraction move is done in the discrete space. The attraction direction is computed for each dimension k using s_ji(k) = x_j(k) if x_j(k) ≠ x_i(k), and 0 otherwise, and the update is done based on these components. In addition, a visual range dv, which restricts a firefly to being influenced only by fireflies within its visual range, is introduced; it is computed from the iteration count relative to MaxGen and the parameters dv_max and dv_min. This means that a firefly is not attracted to every brighter firefly, but only to brighter fireflies within its visible range.
The firefly algorithm has been discretized for a supplier selection problem in [102]. The sum of the absolute differences between the entries is used to measure the distance, r = ∑_{k=1}^{D} |x_j(k) − x_i(k)|. In addition, the movement is modified based on the properties of the problem, using rounding up for the step length. In most cases, a modification specific to a problem is effective for that particular problem or class of problems, but it is hard to generalize to problems in other domains. It would be interesting to generalize the approach and test it in other problem domains as well.

Modifications for mixed optimization problems
Perhaps the first modification of the standard firefly algorithm in this category is presented in [103]. The updating of solutions is conducted using the updating mechanism of the standard firefly algorithm; the discrete variables are dealt with through a constraint handling mechanism based on a penalty function. In addition, the authors propose two ways to generate a diverse set of random initial solutions. An adaptive random step length is also proposed, using a similar updating approach, in [104]. The same approach is improved in [105] by adding a scaling parameter for the random movement based on the difference between the maximum and minimum values of each variable. Portfolio optimization can be expressed as a mean-variance problem, which belongs to the group of quadratic mixed-integer programming problems. In [106,107], the firefly algorithm has been extended with the use of a rounding function and a constraint handling approach; Deb's method [108] is also used for constraint handling. In addition, α is modified to decrease with iteration.
The decreasing scenario for α starting from the first iteration may not always be a good idea. Perhaps it is better to keep α constant for a number of iterations and to start decreasing it based on the performance of the solutions, especially when the solutions approach an optimum point; this can be one of the possible directions for future work. Some modifications make the parameter α adaptive based on the performance of the solutions [29,38,43]. Some of the modifications also involve a random term, so that α behaves neither in a decreasing nor in an increasing way [47,49,59]. In addition, other approaches, such as encoding the parameters in the solution [109] and modifying the parameters based on the problem [44], are also proposed.
The attraction term has also been modified in different ways. An adaptive light absorption coefficient of the medium, changing with iteration, is given in some studies. These modifications use an increasing function [46], a decreasing function [43,67] or a function of γ that is neither increasing nor decreasing [33-38, 49, 50]. The modification is neither increasing nor decreasing especially when a chaotic distribution [33-38] or a normal distribution [34,49,50] is used to compute the update. Increasing γ implies a decrease of the attraction step length, and decreasing it implies an increase in the step length. The attraction step length β has also been modified directly; a chaotic mapping is used to modify β in some of the studies [33-38]. It should be noted that using a chaotic map to update γ also updates β: for instance, Figure 2 shows that when γ is updated using a logistic map, the resulting β is also chaotic.
Hence, γ and β should not both be updated at the same time. In addition to updating γ, β has also been modified based on minimum and final values [39,40], on the location of the solution [31] and on the light intensity of solutions [32]. In addition, different approaches are used to measure the distance between the fireflies [31,41,45]. Modifying the feasible region should be considered as introducing a very large step length that may take the solution far away from the brighter solution and possibly out of the feasible region. The properties of the random step should also be properly tuned in agreement with the attraction; otherwise, the random movement may dominate the attraction step length. The movement of the best solution should likewise be tuned properly: if it is allowed to decrease, its best performance may be lost. Hence, the approaches used to preserve the best solution are indeed effective [32,52].
Mutation is another good approach to diversify the solutions, which in turn increases the exploration behaviour of the algorithm [24, 53-58]. However, generating many solutions may hinder the search, as it will take long to run. In addition, accepting weaker solutions should also be incorporated for deceptive problems; a solution sometimes needs to get worse in order to escape local optima.
Modifying the update equation is another interesting modification featured in some studies [56, 58-61, 63-67, 69]. Some of these studies suggest that the update should be done in the vicinity of the brighter firefly [58,60]. This is not always a good idea, as the region between the two solutions will not be explored. Some of the studies increase the attraction towards brighter fireflies [61,64]; however, increasing the step length of the attraction may dominate the random movement or even take the solution out of the feasible region. A memory is utilized to save the best solution found, and an additional attraction term towards that global solution is added in [63,69]; this is a good idea, as the best solution will not be lost through the iterations. To increase the diversity of the solutions, an effective modification is proposed in [110]; with this kind of modification, the diversity of the solutions is preserved and the exploration behaviour of the algorithm improves.
Basically, two updating strategies are proposed for the non-continuous case. The first uses the same updating formula and changes the results to discrete values afterwards [82,94,104]; the second modifies the updating formula to work in the discrete space [97-101]. The first approach is susceptible to trapping in local solutions and may miss the optimal solution: the optimal solution of a continuous relaxation of a discrete problem is not always an optimal solution of the discrete problem, so the algorithm will tend to converge to the optimum of the continuous version. Hence, the second approach has an advantage in such cases.

Benchmark problems and simulation setup
Five benchmark problems are selected from different categories, as presented in Table 2. The simulations are performed on an Intel® Core™ i3-3110M CPU @ 2.40 GHz with a 64-bit operating system; MATLAB 7.10.0 (R2010a) is used for these simulations. The algorithm parameters are set as given in Table 2 for dimensions 2 and 5.

Simulation results and discussion
The simulation results, presented in Table 3, show that some of the algorithms are very expensive in terms of computational time but give good results, whereas others have small running times. For instance, in the second problem with dimension 2, FFA3 outperforms all the others on average, with an average CPU time of 8.8, whereas FFA1 and FFA2 give good approximations with smaller average CPU times. In general, FFA4 is very effective in solution quality but not in computational time, while FFA1 and FFA2 give good approximate results with smaller CPU times compared to FFA4. However, when the dimension increases, FFA2 outperforms FFA1, perhaps due to the fixed random direction m used in all the simulations.

Conclusion
In this chapter, a detailed review of modified versions of the firefly algorithm is presented. The modifications aim to boost its performance for both continuous and non-continuous problems. Three classes of modifications are discussed for continuous problems. The first is parameter-level modification, which improves the performance of the algorithm. The second class operates at the level of the updating mechanism, in which new updating equations or mechanisms are introduced. The last class is at the abstract level, in which changes of the solution space and of the probability distribution of the randomness term are discussed. The strengths and weaknesses of the approaches are also presented. Simulation results show that the mutation-incorporated firefly algorithm gives better results at a larger computational cost, whereas versions of the firefly algorithm with opposition-based learning and an elitist movement for the brighter firefly give approximate solutions with smaller computational times. Hence, with a proper implementation, the mutation operator and the elitist move of the brighter firefly, along with a possible implementation of the opposition-based approach, may perform better.