A Multilevel Genetic Algorithm for the Maximum Satisfaction Problem

Genetic algorithms (GAs), which belong to the class of evolutionary algorithms, are regarded as highly successful when applied to a broad range of discrete as well as continuous optimization problems. This chapter introduces a hybrid approach combining a genetic algorithm with the multilevel paradigm for solving the maximum constraint satisfaction problem (Max-CSP). The multilevel paradigm refers to the process of dividing large and complex problems into smaller ones, which are hopefully much easier to solve, and then working backward toward the solution of the original problem, using the solution reached at a child level as the starting solution for the parent level. The promising performance of the proposed approach is demonstrated by comparisons on conventional random benchmark problems.


Introduction
Many problems in the field of artificial intelligence can be modeled as constraint satisfaction problems (CSPs). A CSP is a tuple $\langle X, D, C \rangle$, where $X = \{x_1, x_2, \ldots, x_n\}$ is a finite set of variables, $D = \{D_{x_1}, D_{x_2}, \ldots, D_{x_n}\}$ is a finite set of domains, so that each variable $x \in X$ has a corresponding discrete domain $D_x$ from which it can be instantiated, and $C = \{C_1, C_2, \ldots, C_k\}$ is a finite set of constraints. Each $k$-ary constraint restricts a $k$-tuple of variables $(x_1, x_2, \ldots, x_k)$ and specifies a subset of $D_1 \times \cdots \times D_k$, each element of which is a combination of values that the variables cannot take simultaneously. A solution to a CSP requires the assignment of a value to each of the variables from its domain such that all the constraints on the variables are satisfied. The maximum constraint satisfaction problem (Max-CSP) aims at finding an assignment that maximizes the number of satisfied constraints. Max-CSP can thus be regarded as a generalization of CSP in which the solution maximizes the number of satisfied constraints. In this chapter, attention is focused on binary CSPs, where all constraints are binary, that is, they are based on the Cartesian product of the domains of two variables. This is not a restriction in principle, since any non-binary CSP can theoretically be converted to a binary CSP [1].

Algorithms for solving CSPs typically apply the so-called 1-exchange neighborhood, under which two solutions are direct neighbors if, and only if, they differ at most in the value assigned to one variable. Examples include the minimum conflict heuristic (MCH) [2], the break method for escaping from local minima [3], and various enhanced versions of MCH (e.g., a randomized iterative improvement of MCH called WMCH [4], MCH with tabu search [5], and evolutionary algorithms [6]). A second family of techniques works by introducing weights on variables or constraints in order to avoid local minima. Methods belonging to this category include GENET [7], guided local search [8], the exponentiated subgradient [9], discrete Lagrangian search [10], scaling and probabilistic smoothing [11], evolutionary algorithms combined with stepwise adaptation of weights [12], and methods based on dynamically adapting weights on variables [13] or on both variables and constraints [14]. Methods based on large neighborhood search have recently attracted several researchers for solving the CSP [15]. The central idea is to reduce the size of the local search space by relying on continual relaxation (removing elements from the solution) and re-optimization (re-inserting the removed elements). Finally, the work in [16] introduces a variable depth metaheuristic combining a greedy local search with a self-adaptive weighting strategy on the constraint weights.
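To make the definitions above concrete, the following is a minimal sketch of a binary Max-CSP in Python. The representation (a dictionary of no-good value pairs per constrained variable pair), the toy instance, and all function names are assumptions made for illustration and are not taken from the chapter. The neighborhood generator yields the assignments that differ in exactly one variable.

```python
from itertools import product

# Hypothetical toy instance: three variables sharing the domain {0, 1}.
# nogoods[(i, j)] holds the value pairs (s_i, s_j) that violate the constraint on x_i, x_j.
domains = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
nogoods = {
    (0, 1): {(0, 0), (1, 1)},  # x_0 and x_1 must differ
    (1, 2): {(0, 0), (1, 1)},  # x_1 and x_2 must differ
    (0, 2): {(0, 0), (1, 1)},  # x_0 and x_2 must differ
}

def satisfied(assignment, nogoods):
    """Number of constraints whose current value pair is not a no-good."""
    return sum((assignment[i], assignment[j]) not in bad
               for (i, j), bad in nogoods.items())

def one_exchange_neighbors(assignment, domains):
    """Yield every assignment that differs from `assignment` in exactly one variable."""
    for var, values in domains.items():
        for value in values:
            if value != assignment[var]:
                neighbor = dict(assignment)
                neighbor[var] = value
                yield neighbor

def max_csp_brute_force(domains, nogoods):
    """Exhaustive Max-CSP solver; only viable for very small instances."""
    variables = sorted(domains)
    best = max(product(*(domains[v] for v in variables)),
               key=lambda vals: satisfied(dict(zip(variables, vals)), nogoods))
    return dict(zip(variables, best))
```

On this toy instance no assignment satisfies all three constraints, so the CSP is insoluble, while the Max-CSP optimum satisfies two of them.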

Multilevel context
The multilevel paradigm is a simple technique which at its core involves recursive coarsening to produce smaller and smaller problems that are easier to solve than the original one. Multilevel techniques have been developed since the 1960s and are among the most efficient techniques for solving large algebraic systems arising from the discretization of partial differential equations. In recent years, it has been recognized that an effective way of enhancing metaheuristics is to use them in the multilevel context. The pseudo-code of the multilevel genetic algorithm is shown in Algorithm 1. Figure 1 illustrates the multilevel paradigm used for six variables and two coarsening levels. The multilevel paradigm consists of four phases: coarsening, initial solution, uncoarsening, and refinement. The coarsening phase merges the variables associated with the problem to form clusters. The clusters are used recursively to construct a hierarchy of problems, each representing the original problem but with fewer degrees of freedom. The coarsest level is then used to compute an initial solution. The solution found at the coarsest level is uncoarsened (extended to give an initial solution for the parent level) and then improved using a chosen optimization algorithm. A common feature of multilevel algorithms is that any solution in any of the coarsened problems is a legitimate solution to the original one. Optimization algorithms using the multilevel paradigm draw their strength from coupling the refinement process across different levels.
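The four phases can be sketched as a generic driver. This is an illustrative skeleton under stated assumptions, not the chapter's Algorithm 1: the function `multilevel_solve`, its callable parameters, and the toy plug-ins `pair_up` and `copy_down` are hypothetical names, and the refinement step is left as a pluggable function (here the identity) where a genetic algorithm would normally run.

```python
def multilevel_solve(problem, coarsen, initial_solution, project, refine, levels):
    """Generic multilevel driver; the five callables are hypothetical plug-ins."""
    # Coarsening: build the hierarchy from the original problem down to the coarsest one.
    hierarchy, maps = [problem], []
    for _ in range(levels):
        coarse, mapping = coarsen(hierarchy[-1])
        hierarchy.append(coarse)
        maps.append(mapping)
    # Initial solution at the coarsest level, followed by refinement there.
    solution = refine(hierarchy[-1], initial_solution(hierarchy[-1]))
    # Uncoarsening: extend the child solution to the parent level, then refine it.
    for level in range(levels - 1, -1, -1):
        solution = project(solution, maps[level])
        solution = refine(hierarchy[level], solution)
    return solution

def pair_up(variables):
    """Toy coarsening: merge consecutive variables into clusters of size at most 2."""
    clusters = [variables[i:i + 2] for i in range(0, len(variables), 2)]
    return list(range(len(clusters))), clusters

def copy_down(solution, clusters):
    """Give every member of a cluster the value held by the cluster."""
    return {v: solution[c] for c, members in enumerate(clusters) for v in members}

# Six variables and two coarsening levels; the coarsest problem has two clusters.
result = multilevel_solve(
    list(range(6)),
    coarsen=pair_up,
    initial_solution=lambda coarse: {c: "ab"[c] for c in coarse},
    project=copy_down,
    refine=lambda level_problem, solution: solution,  # identity stands in for the optimizer
    levels=2,
)
```

Because every cluster value is copied to all of its members, the coarse solution always expands to a complete assignment of the original six variables, which is the sense in which a coarse solution is a legitimate solution to the original problem.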

Multilevel genetic algorithm (GA)
GAs [17] are stochastic methods for global search and optimization and belong to the group of nature-inspired metaheuristics that make up so-called natural computing, a fast-growing interdisciplinary field in which a range of techniques and methods are studied for dealing with large, complex, and dynamic problems with various sources of potential uncertainty. GAs simultaneously examine and manipulate a set of possible solutions. A gene, the smallest unit of genetic information, is a part of a chromosome (solution). Every gene is able to assume different values called alleles. All genes of an organism form a genome, which affects the appearance of the organism, called its phenotype. The chromosomes are encoded using a chosen representation, and each can be thought of as a point in the search space of candidate solutions. Each individual is assigned a score (fitness) value that allows its quality to be assessed. The members of the initial population may be generated randomly or by more sophisticated mechanisms that produce an initial population of high-quality chromosomes. The reproduction operator selects chromosomes from the population (randomly or based on the individual's fitness) to be parents and enters them in a mating pool. Parent individuals are drawn from the mating pool and combined so that information is exchanged and passed to offspring, depending on the probability of the crossover operator. The new population is then subjected to mutation and enters an intermediate population. The mutation operator introduces diversity into the population and is generally applied with a low probability to avoid disrupting crossover results. Finally, a selection scheme is used to update the population, giving rise to a new generation. The set of solutions, called the population, thus evolves from generation to generation through repeated application of an evaluation procedure based on genetic operators. Over many generations, the population becomes increasingly uniform until it ultimately converges to optimal or near-optimal solutions. The different steps of the multilevel weighted genetic algorithm are described as follows:
• construction of levels: let $G = (V, E)$ be an undirected graph with vertices $V$ and edges $E$. The set $V$ denotes the variables, and each edge $(x_i, x_j) \in E$ implies a constraint joining the variables $x_i$ and $x_j$. Given the initial graph $G_0$, the graph is repeatedly transformed into smaller and smaller graphs $G_1, G_2, \ldots, G_k$. To coarsen a graph from $G_j$ to $G_{j+1}$, a number of different techniques may be used. In this chapter, when combining a set of variables into clusters, the variables are visited in a random order. If a variable $x_i$ has not been matched yet, the algorithm randomly selects one of its unmatched neighboring variables $x_j$, and a new cluster consisting of these two variables is created. Its neighbors are the combined neighbors of the merged variables $x_i$ and $x_j$. Variables that cannot be matched are simply copied to the next level.
• initial assignment: the process of constructing a hierarchy of graphs ceases as soon as the size of the coarsest graph reaches some desired threshold. A random initial population is generated at the lowest level. The chromosomes, which are assignments of values to the variables, are encoded as strings of bits, the length of which is the number of variables. At the lowest level, the length of the chromosome is equal to the number of clusters. The initial solution is simply constructed by assigning a random value $v_i$ to all variables in a cluster. In this work, it is assumed that all variables have the same domain (i.e., the same set of values); otherwise, different random values should be assigned to each variable in the cluster. All the individuals of the initial population are evaluated and assigned a fitness expressed in Eq. (1), which counts the number of constraint violations, where $\langle (x_i, s_i), (x_j, s_j) \rangle$ denotes the constraint between the variables $x_i$ and $x_j$, with $x_i$ assigned the value $s_i$ from $D_{x_i}$ and $x_j$ assigned the value $s_j$ from $D_{x_j}$.
• initial weights: the next step of the algorithm assigns a fixed weight equal to 1 to every constraint. The distribution of weights over the constraints aims at forcing hard constraints with large weights to be satisfied, thereby preventing the algorithm from getting stuck at a local optimum at a later stage.
• optimization: having computed an initial solution at the coarsest graph, GA starts the search process from the coarsest level $G_k = (V_k, E_k)$ and moves level by level toward the original graph. The motivation behind this strategy is that the order in which the levels are traversed offers a better mechanism for performing diversification and intensification. The coarsest level allows GA to view any cluster of variables as a single entity, so that the search is guided toward faraway regions of the solution space and restricted to only those configurations in which the variables grouped within a cluster are assigned the same value. As the switch from one level to another implies a decrease in the size of the neighborhood, the search is intensified around solutions from previous levels in order to reach better ones.
• genetic operators: the task of the crossover operator is to reach regions of the search space with higher average quality. The two-point crossover operator is applied to each matched pair of individuals. It selects two random points within a chromosome and then interchanges the segments of the two parent chromosomes between these points to generate two new offspring.
• survivor selection: selection acts on the individuals in the current population and, based on each individual's quality (fitness), determines the next population. In the roulette method, selection is stochastic and biased toward the best individuals. The first step is to calculate the cumulative fitness of the whole population as the sum of the fitness of all individuals. After that, the probability of selection is calculated for each individual as the ratio of its fitness to this cumulative fitness.
• updating weights: the weight of each currently violated constraint is then increased by one, whereas newly satisfied constraints have their weights decreased by one before the start of a new generation.
• termination condition: GA is considered to have converged when the best individual remains unchanged for five consecutive generations.
• projection: once GA has reached the convergence criterion with respect to a child level graph $G_k = (V_k, E_k)$, the assignment reached on that level must be projected onto its parent graph $G_{k-1} = (V_{k-1}, E_{k-1})$. The projection algorithm is simple: if a cluster belonging to $G_k$ is assigned the value $vl_i$, the merged pair of clusters that it represents in $G_{k-1}$ is also assigned the value $vl_i$.
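Several of the steps listed above can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation (which was written in C): all function names and data layouts are assumptions, mutation is omitted, and `roulette_select` expects scores for which larger means better, so converting a violation count into such a score is left to the caller.

```python
import random

def coarsen(neighbors, rng):
    """One coarsening step by random matching.

    neighbors maps each vertex to the set of vertices it shares a constraint with.
    Returns the coarse adjacency and, for each cluster, the list of merged vertices.
    """
    order = list(neighbors)
    rng.shuffle(order)                          # visit the variables in random order
    owner, clusters = {}, []
    for v in order:
        if v in owner:
            continue
        free = sorted(u for u in neighbors[v] if u != v and u not in owner)
        members = [v, rng.choice(free)] if free else [v]   # no free neighbor: copy alone
        for u in members:
            owner[u] = len(clusters)
        clusters.append(members)
    coarse = {c: set() for c in range(len(clusters))}
    for v, adjacent in neighbors.items():
        for u in adjacent:
            if owner[u] != owner[v]:            # combined neighbors of the merged vertices
                coarse[owner[v]].add(owner[u])
    return coarse, clusters

def project(cluster_values, clusters):
    """Give every member of a cluster the value reached for that cluster."""
    return {v: cluster_values[c] for c, members in enumerate(clusters) for v in members}

def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points of two equal-length parents."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def roulette_select(population, scores, rng):
    """Pick one individual with probability score / sum(scores).

    Assumes non-negative scores with a positive sum, larger meaning better.
    """
    return rng.choices(population, weights=scores, k=1)[0]

def update_weights(weights, previously_violated, now_violated):
    """Add 1 to each currently violated constraint, subtract 1 from each newly satisfied one."""
    for c in now_violated:
        weights[c] += 1
    for c in previously_violated - now_violated:
        weights[c] -= 1
```

Passing an explicit `random.Random` instance to each operator keeps runs reproducible for a given seed, which is convenient when many independent runs with different seeds are compared.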

Experimental setup
The benchmark instances were generated using model A [18] as follows: each instance is defined by the 4-tuple $\langle n, m, p_d, p_t \rangle$, where $n$ is the number of variables; $m$ is the size of each variable's domain; $p_d$, the constraint density, is the proportion of pairs of variables that have a constraint between them; and $p_t$, the constraint tightness, is the probability that a pair of values is inconsistent. From the $n(n-1)/2$ possible constraints, each one is independently chosen to be added to the constraint graph with probability $p_d$. Given a constraint, each value pair is selected with probability $p_t$ to become a no-good. Model A will on average have $p_d \times n(n-1)/2$ constraints, each of which has on average $p_t \times m^2$ inconsistent pairs of values. For each density-tightness pair, we generate one soluble instance (i.e., at least one solution exists). Because of the stochastic nature of GA, each algorithm performs 100 independent runs, each with a different random seed. Many NP-complete or NP-hard problems show a phase transition point that marks the spot where we go from problems that are under-constrained, and so relatively easy to solve, to problems that are over-constrained, and so relatively easy to prove insoluble. Problems that are on average harder to solve occur between these two types of relatively easy problem. The values of $p_d$ and $p_t$ are chosen in such a way that the generated instances are within the phase transition. In order to predict the phase transition region, a formula for the constrainedness [19] of binary CSPs was defined by:

$$\kappa = \frac{n-1}{2}\, p_d \log_m\!\left(\frac{1}{1-p_t}\right)$$

The tests were carried out on a DELL machine with an 800 MHz CPU and 2 GB of memory. The code was written in C and compiled with the GNU C compiler version 4.6. The following parameters were fixed experimentally:
• Population size = 50
• Stopping criterion for the coarsening phase: the reduction process stops as soon as the number of levels reaches 3. At this level, MLV-WGA generates an initial population.
• Convergence during the optimization phase: if there is no observable improvement of the fitness function of the best individual during five consecutive generations, MLV-WGA is assumed to have reached convergence and moves to a higher level.
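The instance generation described in this section can be sketched as below. The function names and the dictionary-of-no-goods layout are assumptions for illustration. The constrainedness function uses the form commonly given for binary CSPs, $\kappa = \frac{n-1}{2}\, p_d \log_m \frac{1}{1-p_t}$, which should be checked against [19]. The sketch does not filter for soluble instances.

```python
import math
import random
from itertools import combinations, product

def generate_model_a(n, m, p_d, p_t, rng):
    """Random binary CSP with n variables sharing the domain {0, ..., m-1}.

    Each of the n(n-1)/2 variable pairs becomes a constraint with probability p_d;
    each of its m*m value pairs becomes a no-good with probability p_t.
    """
    nogoods = {}
    for i, j in combinations(range(n), 2):
        if rng.random() < p_d:
            nogoods[(i, j)] = {pair for pair in product(range(m), repeat=2)
                               if rng.random() < p_t}
    return nogoods

def constrainedness(n, m, p_d, p_t):
    """kappa = (n - 1) / 2 * p_d * log_m(1 / (1 - p_t)); assumed form, requires p_t < 1."""
    return (n - 1) / 2 * p_d * math.log(1 / (1 - p_t), m)

# Hypothetical parameter values, chosen only to exercise the functions.
instance = generate_model_a(n=10, m=4, p_d=0.5, p_t=0.3, rng=random.Random(1))
kappa = constrainedness(n=10, m=4, p_d=0.5, p_t=0.3)
```

Values of `kappa` near 1 are usually taken to indicate the phase transition region, so a function of this kind can help choose the $p_d$ and $p_t$ pairs for a given $n$ and $m$.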

Results
The plots in Figures 2 and 3 compare WGA with its multilevel variant MLV-WGA. The improvement in quality imparted by the multilevel context is immediately clear. Both WGA and MLV-WGA exhibit what is called a plateau region, a region in the search space where the crossover and mutation operators leave the best solution or the mean solution unchanged. However, this region is shorter with MLV-WGA than with WGA. The multilevel context uses the projected solution obtained at $G_{m+1} = (V_{m+1}, E_{m+1})$ as the initial solution for $G_m = (V_m, E_m)$ for further refinement. Even though the solution at $G_{m+1}$ is at a local minimum, the projected solution may not be at a local minimum with respect to $G_m$. The projected assignment is already a good solution, which leads WGA to converge to a better solution within a few generations. Tables 1-3 show a comparison of the two algorithms. For each algorithm, the best (Min) and the worst (Max) results are given, while Mean represents the average solution. MLV-WGA outperforms WGA in 53 cases out of 96, gives similar results in 20 cases, and is beaten in 23 cases. The performance of the two algorithms differs significantly: the difference in total performance is between 25 and 70% in favor of MLV-WGA. Comparing the worst performances of both algorithms, MLV-WGA gave worse results than WGA in 15 cases, both algorithms gave similar results in 8 cases, and MLV-WGA performed better than WGA in 73 cases. Looking at the average results, MLV-WGA does between 16 and 41% better than WGA in 84 cases, while the differences are very marginal in the remaining cases where WGA beats MLV-WGA.

Conclusion
In this work, a multilevel weighted genetic algorithm is introduced for Max-CSP. The results show that, in most cases, the multilevel genetic algorithm returns a better solution than the standard genetic algorithm for an equivalent run-time. The multilevel paradigm offers a better strategy for performing diversification and intensification. This is achieved by allowing GA to view a cluster of variables as a single entity, so that the search is guided and restricted to only those assignments in the solution space in which the variables grouped within a cluster are assigned the same value. As the size of the clusters gets smaller from one level to another, the size of the neighborhood becomes adaptive, which allows different regions of the search space to be explored while the search is intensified by exploiting the solutions from previous levels in order to reach better solutions.