A Multilevel Approach Applied to SAT-Encoded Problems

Introduction

The satisfiability problem
The satisfiability problem (SAT), which is known to be NP-complete (7), plays a central role in many applications in the fields of VLSI computer-aided design, computing theory, and artificial intelligence. Generally, a SAT problem is defined as follows. A propositional formula $\Phi = \bigwedge_{j=1}^{m} C_j$ with $m$ clauses and $n$ Boolean variables is given. Each Boolean variable $x_i$, $i \in \{1, \ldots, n\}$, takes one of the two values, True or False. A clause, in turn, is a disjunction of literals, and a literal is a variable or its negation. Each clause $C_j$ has the form:

$$C_j = \Big( \bigvee_{k \in I_j} x_k \Big) \vee \Big( \bigvee_{l \in \bar{I}_j} \bar{x}_l \Big),$$

where $I_j, \bar{I}_j \subseteq \{1, \ldots, n\}$, $I_j \cap \bar{I}_j = \emptyset$, and $\bar{x}_i$ denotes the negation of $x_i$. The task is to determine whether there exists an assignment of values to the variables under which $\Phi$ evaluates to True. Such an assignment, if it exists, is called a satisfying assignment for $\Phi$, and $\Phi$ is called satisfiable. Otherwise, $\Phi$ is said to be unsatisfiable. Since we have two choices for each of the $n$ Boolean variables, the size of the search space $S$ becomes $|S| = 2^n$. That is, the size of the search space grows exponentially with the number of variables. Since most known combinatorial optimization problems can be reduced to SAT (8), the design of special methods for SAT can lead to general approaches for solving combinatorial optimization problems.
Most SAT solvers use a Conjunctive Normal Form (CNF) representation of the formula Φ.
In CNF, the formula is represented as a conjunction of clauses, with each clause being a disjunction of literals. For example, P ∨ Q is a clause containing the two literals P and Q.
The clause P ∨ Q is satisfied if either P is True or Q is True. When each clause in Φ contains exactly k literals, the resulting SAT problem is called k-SAT.
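The CNF definitions above can be made concrete with a small sketch. It assumes a DIMACS-style encoding (an assumption of this example, not something the text prescribes): a clause is a list of non-zero integers, where k stands for the variable with index k and -k for its negation. The function names are illustrative only.

```python
from itertools import product

def clause_satisfied(clause, assignment):
    """A clause holds if at least one of its literals is True."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def formula_satisfied(clauses, assignment):
    """A CNF formula holds if every clause holds."""
    return all(clause_satisfied(c, assignment) for c in clauses)

def brute_force_sat(clauses, n):
    """Enumerate all 2**n assignments; only feasible for very small n."""
    for values in product([False, True], repeat=n):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if formula_satisfied(clauses, assignment):
            return assignment
    return None

# (x1 or x2) and (not x1 or x2) and (not x2 or x3): a 2-SAT instance
example = [[1, 2], [-1, 2], [-2, 3]]
model = brute_force_sat(example, 3)
```

The exhaustive loop makes the exponential growth of the search space tangible: each additional variable doubles the number of assignments that may have to be visited.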
The rest of the paper is organized as follows. Section 2 provides an overview of algorithms used for solving the satisfiability problem. Section 3 reviews some of the multilevel techniques that have been applied to other combinatorial optimization problems. Section 4 gives a general description of memetic algorithms. Section 5 introduces the multilevel memetic algorithm. Section 6 presents the results obtained from testing the multilevel memetic algorithm on large industrial instances. Finally, in Section 7 we present a summary and some guidelines for future work.

SAT solvers
One of the earliest local search algorithms for solving SAT is GSAT (36)(3)(38). Basically, GSAT begins with a randomly generated assignment of values to variables, and then uses the steepest descent heuristic to find the new variable-value assignment which best decreases the number of unsatisfied clauses. After a fixed number of moves, the search is restarted from a new random assignment. The search continues until a solution is found or a fixed number of restarts have been performed. A widely used variant of GSAT is the Walksat algorithm, together with its own variants (37). A selection strategy (5) based on both fitness and diversity has also been proposed to choose the individuals that participate in the reproduction phase of a genetic algorithm. Experiments showed that the resulting genetic algorithm was able to find solutions of a higher quality than the scatter evolutionary algorithm (4).
Because these meta-heuristics are stochastic in nature and lack theoretical guidelines, deploying them involves extensive experiments to find good noise or walk probability settings. To avoid manual parameter tuning, new methods have been designed to automatically adapt parameter settings during the search (25)(32), and results have shown their effectiveness for a wide range of problems.
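The GSAT procedure described at the start of this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the original implementation: clauses use a DIMACS-style integer encoding, and the names `gsat`, `max_tries` and `max_flips` are chosen for this example.

```python
import random

def num_unsat(clauses, assignment):
    """Count the clauses with no true literal under the assignment."""
    return sum(
        not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def gsat(clauses, n, max_tries=10, max_flips=100, rng=None):
    """Steepest-descent flips with random restarts (GSAT-style sketch)."""
    rng = rng or random.Random(0)
    for _ in range(max_tries):
        # Restart from a fresh random assignment.
        assignment = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(max_flips):
            if num_unsat(clauses, assignment) == 0:
                return assignment
            # Score every single-variable flip.
            scores = {}
            for v in range(1, n + 1):
                assignment[v] = not assignment[v]
                scores[v] = num_unsat(clauses, assignment)
                assignment[v] = not assignment[v]
            best = min(scores.values())
            # Break ties at random among the best flips.
            v = rng.choice([u for u, s in scores.items() if s == best])
            assignment[v] = not assignment[v]
        if num_unsat(clauses, assignment) == 0:
            return assignment
    return None  # no satisfying assignment found within the budget
```

Recomputing the number of unsatisfied clauses for every candidate flip keeps the sketch short; practical solvers maintain incremental scores instead.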

Multilevel techniques
The multilevel paradigm is a simple technique which, at its core, involves recursive coarsening to produce smaller and smaller problems that are easier to solve than the original one. The pseudo-code of the multilevel generic algorithm is shown in Algorithm 1.
The multilevel paradigm consists of four phases: coarsening, initial solution, uncoarsening and refinement. The coarsening phase aims at merging the variables associated with the problem to form clusters. The clusters are used in a recursive manner to construct a hierarchy of problems, each representing the original problem but with fewer degrees of freedom. The coarsest level can then be used to compute an initial solution. The solution found at the coarsest level is uncoarsened (extended to give an initial solution for the next level) and then improved using a chosen optimization algorithm. A common feature that characterizes multilevel algorithms is that any solution in any of the coarsened problems is a legitimate solution to the original one. Optimization algorithms using the multilevel paradigm draw their strength from coupling the refinement process across different levels.
Multilevel techniques were first introduced when dealing with the graph partitioning problem (GPP) (1)(14)(16)(22)(23)(44) and have proved to be effective in producing high quality solutions at a lower cost than single level techniques. The traveling salesman problem (TSP) was the second combinatorial optimization problem to which the multilevel paradigm was applied (45)(46), and it showed a clear improvement in the asymptotic convergence of the solution quality. When the multilevel paradigm was applied to the graph coloring problem (42), the results did not seem to be in line with the general trend observed in GPP and TSP, as its ability to enhance the convergence behavior of the local search algorithms was restricted to some classes of problems. Graph drawing is another area where multilevel techniques gave a better global quality to the drawing, and the author suggests their use to both accelerate and enhance force-directed placement algorithms (43).
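The four phases described in this section can be sketched as a generic driver. This is an illustrative outline only, not a reproduction of Algorithm 1: the callbacks `coarsen`, `initial_solution`, `extend`, `refine` and `size` are hypothetical names that a concrete problem would have to supply.

```python
def multilevel(problem, coarsen, initial_solution, extend, refine, size, min_size):
    """Generic multilevel driver built from problem-specific callbacks."""
    # Coarsening phase: build the hierarchy P_0, P_1, ..., P_L.
    levels = [problem]
    maps = []  # maps[l] relates level l to level l + 1
    while size(levels[-1]) > min_size:
        coarser, mapping = coarsen(levels[-1])
        if size(coarser) >= size(levels[-1]):
            break  # coarsening no longer reduces the problem
        levels.append(coarser)
        maps.append(mapping)

    # Initial solution at the coarsest level, followed by refinement there.
    solution = refine(levels[-1], initial_solution(levels[-1]))

    # Uncoarsening and refinement, one level at a time.
    level = len(levels) - 1
    while level > 0:
        solution = extend(solution, maps[level - 1])
        solution = refine(levels[level - 1], solution)
        level -= 1
    return solution
```

Passing the phases as functions keeps the driver independent of the problem, which mirrors how the same paradigm has been reused for partitioning, routing and coloring problems.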

Memetic Algorithms (MAs)
An important prerequisite for the multilevel paradigm is the use of an optimization search strategy to carry out the refinement at each level. In this work, we propose a memetic algorithm (MA) that we use for the refinement phase. Algorithm 2 provides a canonical memetic algorithm.
MAs represent the set of hybrid algorithms that combine genetic algorithms and local search. In general, the genetic algorithm improves the solution while the local search fine-tunes it. They are adaptive search and optimization algorithms that take their inspiration from genetics and the evolution process (31). Memetic algorithms simultaneously examine and manipulate a set of possible solutions. Given a specific problem to solve, the input to MAs is an initial population of solutions called individuals or chromosomes. A gene is part of a chromosome and is the smallest unit of genetic information. Every gene is able to assume different values called alleles. All genes of an organism form a genome, which affects the appearance of an organism, called the phenotype. The chromosomes are encoded using a chosen representation, and each can be thought of as a point in the search space of candidate solutions. Each individual is assigned a score (fitness) value that allows its quality to be assessed. The members of the initial population may be randomly generated or produced by more sophisticated mechanisms that yield an initial population of high quality chromosomes.
The reproduction operator selects (randomly or based on the individual's fitness) chromosomes from the population to be parents and enters them in a mating pool. Parent individuals are drawn from the mating pool and combined so that information is exchanged and passed to offspring, depending on the probability of the crossover operator. The new population is then subjected to mutation and entered into an intermediate population. The mutation operator introduces diversity into the population and is generally applied with a low probability to avoid disrupting crossover results. The individuals from the intermediate population are then enhanced with a local search and evaluated. Finally, a selection scheme is used to update the population, giving rise to a new generation. The individuals in the population evolve from generation to generation by repeated application of an evaluation procedure that is based on genetic operators and a local search scheme. Over many generations, the population becomes increasingly uniform until it ultimately converges to optimal or near-optimal solutions.

The Multilevel Memetic Algorithm (MLVMA)
The implementation of a multilevel algorithm for the SAT problem requires four basic components: a coarsening algorithm, an initialization algorithm, an extension algorithm (which takes the solution on one problem and extends it to the parent problem), and a memetic algorithm which is used during the refinement phase. In this section we describe all the components necessary for the derivation of a memetic algorithm operating in a multilevel context.
The process is graphically illustrated in Figure 1 using an example with 10 variables. The coarsening phase uses two levels to coarsen the problem down to three clusters. Level 0 corresponds to the original problem. A random coarsening procedure merges the variables randomly in pairs, leading to a coarser problem with 5 clusters. This process is repeated, leading to the coarsest problem with 3 clusters. An initial solution is generated where the first cluster is assigned the value true and the remaining two clusters are assigned the value false. At the coarsest level, our MA generates an initial population and then improves it. As soon as the convergence criterion is reached at Level 2, the uncoarsening phase takes the solution from that level and extends it to give an initial solution for Level 1, and the refinement then proceeds. This iterative process ends when MA meets the stopping criterion at Level 0.

Coarsening
The coarsening procedure has been implemented so that each coarse problem $P_{l+1}$ is created from its parent problem $P_l$ by merging variables and representing each merged pair $v_i$ and $v_j$ with a child variable that we call a cluster in $P_{l+1}$. The coarsening scheme uses a simple randomized algorithm similar to (16). The variables are visited in a random order. If a variable $v_i$ has not been merged yet, then we select an unmerged variable $v_j$ at random, and a cluster consisting of these two variables is created. Unmatched variables are simply copied to the next level. The newly formed clusters are used to define a new and smaller problem, and the coarsening process is iterated recursively until the size of the problem reaches some desired threshold.
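One level of the random coarsening just described can be sketched as follows. This is a minimal illustration that assumes the items of a level are identified by the indices 0 to n-1; the function name `coarsen` and the tuple representation of clusters are choices made for this example.

```python
import random

def coarsen(n, rng):
    """Merge the items 0..n-1 of one level into random pairs.

    Returns a list of clusters; each cluster is a tuple holding the
    indices of the items it merges (a single index if unmatched).
    """
    order = list(range(n))
    rng.shuffle(order)  # visit the variables in random order
    unmerged = set(order)
    clusters = []
    for v in order:
        if v not in unmerged:
            continue  # already merged as somebody's partner
        unmerged.remove(v)
        if unmerged:
            # Sorting only makes the random choice reproducible per seed.
            partner = rng.choice(sorted(unmerged))
            unmerged.remove(partner)
            clusters.append((v, partner))
        else:
            clusters.append((v,))  # unmatched item is copied as-is
    return clusters

level1 = coarsen(10, random.Random(1))           # 10 variables -> 5 clusters
level2 = coarsen(len(level1), random.Random(2))  # 5 clusters -> 3 clusters
```

Applying the function repeatedly to the number of clusters of the previous level yields the hierarchy; with 10 variables, two rounds give 5 and then 3 clusters, as in the example of Figure 1.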

Initial solution & refinement
As soon as the coarsening phase has ended, a memetic algorithm is used at the different levels. The next subsections describe the main features of the memetic algorithm used in this work.

Fitness function
The notion of fitness is fundamental to the application of memetic algorithms. It is a numerical value that expresses the performance of an individual (solution) so that different individuals can be compared. The fitness of a chromosome (individual) is equal to the number of clauses that are unsatisfied by the truth assignment represented by the chromosome.
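The fitness definition above translates directly into code. The sketch below assumes a bit-list chromosome in which position i-1 holds the value of variable i, and DIMACS-style integer clauses; both conventions are assumptions of this example.

```python
def fitness(chromosome, clauses):
    """Number of clauses left unsatisfied by the chromosome (lower is better)."""
    unsatisfied = 0
    for clause in clauses:
        # A positive literal k wants bit k-1 to be 1, a negative one wants 0.
        if not any(chromosome[abs(lit) - 1] == (1 if lit > 0 else 0)
                   for lit in clause):
            unsatisfied += 1
    return unsatisfied

clauses = [[1, 2], [-1, 2], [-2, 3]]
```

A fitness of 0 therefore identifies a satisfying assignment, so the memetic algorithm is minimizing this value.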

Representation
A representation is a mapping from the state space of possible solutions to a state of encoded solutions within a particular data structure. The chromosomes (individuals), which are assignments of values to the variables, are encoded as strings of bits, the length of which is the number of variables (or clusters if MA is operating on a coarse level). The values True and False are represented by 1 and 0 respectively. In this representation, an individual X corresponds to a truth assignment and the search space is the set $S = \{0, 1\}^n$.

Initial population
An initial solution is generated using a population consisting of 50 individuals. According to our computational experience, larger populations do not bring effective improvements in the quality of the results. At the coarsest level, MA randomly generates an initial population of 50 individuals in which each gene's allele is assigned the value 0 or 1.

Crossover
The task of the crossover operator is to reach regions of the search space with higher average quality. New solutions are created by combining pairs of individuals in the population and then applying a crossover operator to each chosen pair. Combining pairs of individuals can be viewed as a matching process. The individuals are visited in random order. An unmatched individual $i_k$ is matched randomly with an unmatched individual $i_l$. Thereafter, the two-point crossover operator is applied to each matched pair of individuals with a given crossover probability. The two-point crossover selects two random points within a chromosome and then interchanges the segments of the two parent chromosomes between these points to generate two new offspring. Recombination can be defined as a process in which a set of configurations (solutions referred to as parents) undergoes a transformation to create a set of configurations (referred to as offspring). The creation of these descendants involves locating and combining features extracted from the parents. The two-point crossover was chosen because of the results presented in (41), where the differences between the various crossovers are not significant when the problem to be solved is hard. The work conducted in (39) shows that the two-point crossover is more effective when the problem at hand is difficult to solve. In addition, the author proposes an adaptive mechanism that lets evolutionary algorithms choose which forms of crossover to use, and how often to use them, while solving a problem.
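The random matching and two-point crossover described above can be sketched as follows. The function names and the handling of an odd population size (the leftover individual is copied unchanged) are assumptions of this example.

```python
import random

def two_point_crossover(p1, p2, rng):
    """Swap the segment between two random cut points of the parents."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    child1 = p1[:a] + p2[a:b] + p1[b:]
    child2 = p2[:a] + p1[a:b] + p2[b:]
    return child1, child2

def recombine(population, p_c, rng):
    """Match individuals at random and cross each pair with probability p_c."""
    order = list(range(len(population)))
    rng.shuffle(order)  # random matching of individuals
    offspring = []
    for k in range(0, len(order) - 1, 2):
        a, b = population[order[k]], population[order[k + 1]]
        if rng.random() < p_c:
            a, b = two_point_crossover(a, b, rng)
        offspring.extend([list(a), list(b)])
    if len(order) % 2:  # odd population: last individual stays unmatched
        offspring.append(list(population[order[-1]]))
    return offspring
```

Because the operator only exchanges genes that sit at the same positions, the number of ones in each column of the population is unchanged by recombination; new alleles can only enter through mutation.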

Mutation
The purpose of mutation, which is the secondary search operator used in this work, is to generate modified individuals by introducing new features into the population. Through mutation, the alleles of the produced child have a chance to be modified, which enables further exploration of the search space. The mutation operator takes a single parameter $p_m$, which specifies the probability of performing a possible mutation. Let $C = c_1, c_2, \ldots, c_m$ be a chromosome represented by a binary chain in which each gene $c_i$ is either 0 or 1. In our mutation operator, each gene $c_i$ is mutated by flipping its allele from 0 to 1 or vice versa if the probability test is passed. The mutation probability ensures that, theoretically, every region of the search space is explored. If, on the other hand, mutation is applied to all genes, the evolutionary process degenerates into a random search with no benefit from the information gathered in preceding generations. The mutation operator prevents the search process from being trapped in local optima while adding to the diversity of the population, thereby increasing the likelihood that the algorithm will generate individuals with better fitness values.
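The bit-flip mutation described above is short enough to sketch directly. The function name `mutate` and the use of an explicit random generator are choices made for this example.

```python
import random

def mutate(chromosome, p_m, rng):
    """Flip each gene independently with probability p_m."""
    return [1 - gene if rng.random() < p_m else gene for gene in chromosome]

rng = random.Random(0)
child = mutate([0, 1, 1, 0, 1], 0.1, rng)  # usually changes few or no genes
```

With p_m = 0 the chromosome is returned unchanged, and with p_m = 1 every gene is flipped, which corresponds to the degenerate case mentioned in the text.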

Selection
The selection operator acts on individuals in the current population. During this phase, the search for the global solution gets a clearer direction, whereby the optimization process is gradually focused on the relevant areas of the search space. Based on the quality (fitness) of each individual, it determines the next population. In the roulette method, the selection is stochastic and biased toward the best individuals. The first step is to calculate the cumulative fitness of the whole population as the sum of the fitness of all individuals. After that, the probability of selection is calculated for each individual as

$$p_i = \frac{f_i}{\sum_{j} f_j},$$

where $f_i$ is the fitness of individual $i$.
Finally, the last component of our MA is the use of local improvers. By introducing local search at this level, the search within promising areas is intensified. This local search should be able to quickly improve the quality of a solution produced by the crossover operator, without diversifying it into other areas of the search space. In the context of optimization, this raises a number of questions regarding how best to take advantage of both aspects of the whole algorithm. With regard to local search, there are the issues of which individuals will undergo local improvement and with what degree of intensity. Care should be taken, however, to balance the evolutionary component (exploration) against the local search component (exploitation). Bearing this in mind, the strategy adopted here is to let each chromosome go through a low-intensity local improvement. Algorithm 3 shows the local search algorithm used. This heuristic is run for one iteration, during which it seeks the variable-value assignment with the largest decrease or the smallest increase in the number of unsatisfied clauses. A random tie-breaking strategy is used between variables with identical scores.
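The one-iteration local improvement described above can be sketched as follows. This is an illustration written from the prose description, not a transcription of Algorithm 3; it assumes bit-list chromosomes and DIMACS-style integer clauses, and the function names are hypothetical.

```python
import random

def count_unsat(chromosome, clauses):
    """Number of clauses with no true literal under the bit-list chromosome."""
    return sum(
        not any(chromosome[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in clause)
        for clause in clauses
    )

def local_search_step(chromosome, clauses, rng):
    """Flip the variable giving the largest decrease (or smallest increase)."""
    current = list(chromosome)  # work on a copy, leave the input untouched
    base = count_unsat(current, clauses)
    best_delta, candidates = None, []
    for i in range(len(current)):
        current[i] ^= 1  # tentatively flip gene i
        delta = count_unsat(current, clauses) - base
        current[i] ^= 1  # undo the flip
        if best_delta is None or delta < best_delta:
            best_delta, candidates = delta, [i]
        elif delta == best_delta:
            candidates.append(i)
    chosen = rng.choice(candidates)  # random tie-breaking
    current[chosen] ^= 1
    return current
```

Note that the step always performs a flip, even when every flip increases the number of unsatisfied clauses, which matches the "smallest increase" wording in the text.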

Convergence criteria
As soon as the population tends to lose its diversity, premature convergence occurs and all individuals in the population tend to be identical, with almost the same fitness value. At each level, the proposed memetic algorithm is assumed to have reached convergence when no further improvement of the best solution (the fittest chromosome) has been made during two consecutive generations.
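The stagnation test above can be sketched as a small helper. The sketch assumes that the best fitness of each generation (the lowest number of unsatisfied clauses) is recorded in a list; the name `has_converged` and the `patience` parameter are introduced for this example, with `patience=2` reflecting the two consecutive generations mentioned here.

```python
def has_converged(best_history, patience=2):
    """True if the last `patience` generations brought no improvement.

    best_history[g] is the best (lowest) fitness seen in generation g.
    """
    if len(best_history) <= patience:
        return False  # not enough generations to judge
    recent_best = min(best_history[-patience:])
    earlier_best = min(best_history[:-patience])
    return recent_best >= earlier_best
```

A larger `patience` makes the test more tolerant of plateaus at the cost of extra generations per level.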


Uncoarsening
Having improved the assignment at level $L_{m+1}$, the assignment must be projected onto its parent level $L_m$. The uncoarsening process is trivial: if a cluster $C_i \in L_{m+1}$ is assigned the value true, then the matched pair of clusters that it represents, $C_j$ and $C_k \in L_m$, are also assigned the value true, and likewise for false. The idea of refinement is to use the population projected from $L_{m+1}$ onto $L_m$ as the initial population for further improvement using the proposed memetic algorithm. Even though the population at $L_{m+1}$ is at a local minimum, the projected population at level $L_m$ may not be at a local optimum. Because the projected population is already a good starting point and contains high-quality individuals, MA converges within a few generations to a better assignment.
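The projection step described above can be sketched as follows. It assumes that each coarse cluster is stored as a tuple of the indices of the finer-level items it merges; this data layout and the function names are assumptions of the example.

```python
def project(coarse_chromosome, clusters):
    """Copy each cluster's value to the finer-level items it represents.

    clusters[c] is a tuple with the finer-level indices merged into cluster c.
    """
    n_fine = sum(len(members) for members in clusters)
    fine = [0] * n_fine
    for c, members in enumerate(clusters):
        for m in members:
            fine[m] = coarse_chromosome[c]
    return fine

def project_population(population, clusters):
    """Project every individual to obtain the next level's initial population."""
    return [project(individual, clusters) for individual in population]

clusters = [(0, 3), (1, 4), (2,)]          # 5 finer items merged into 3 clusters
fine = project([1, 0, 1], clusters)
```

Since every item of a cluster inherits the same value, the projected assignment leaves the same clauses unsatisfied as the coarse one, so refinement at the finer level starts from an equally good solution.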

Bounded model checking
The instances used in our experiments arise from model checking (6), which is considered to be one among many real-world problems that are often characterized by large and complex search spaces. Model checking is an automatic procedure for verifying finite-state concurrent systems. Given a model of a design and a specification in temporal logic, one is interested in checking whether the model satisfies the specification. Methods for automatic model checking of complex hardware design systems are gaining wide industrial acceptance compared to traditional techniques based on simulation. The most widely used of these methods is called Bounded Model Checking (BMC) (2). In BMC the design to be validated is represented as a finite state machine, and the specification is formalized by writing temporal logic properties. The reachable states of the design are then traversed in order to verify the properties. The basic idea in BMC is to find bugs or counterexamples of length k.
In practice, one looks for longer counterexamples by incrementing the bound k, and if no counterexample exists after a certain number of iterations, one may conclude that the correctness of the specification holds. The main drawback with model checking real systems is the so-called state-explosion problem: as the size of the system being verified increases, the total state space of the system increases exponentially. This problem makes exhaustive search intractable.
In recent years, there has been a growing interest in applying methods based on propositional satisfiability (SAT) (29)(12) in order to improve the scalability of model checking. The BMC problem can be reduced to a propositional satisfiability problem, and can therefore be solved by SAT solvers. Essentially, there are two phases in BMC. In the first phase, the behavior of the system to be verified is encoded as a propositional formula. In the second phase, that formula is given to a propositional decision algorithm, i.e., a satisfiability solver, to either obtain a satisfying assignment or to prove there is none. If the formula is satisfiable, a bug has been located in the design; otherwise one cannot in general conclude that there is no bug, and one must increase the bound and search for "larger bugs".

Test suite
We evaluated the performance of the multilevel memetic algorithm on a set of large problem instances taken from real industrial bounded model checking hardware designs. This set is taken from the SATLIB website (http://www.informatik.tu-darmstadt.de/AI/SATLIB). All the benchmark instances used in this experiment are satisfiable instances. The parameters used in the experiments are:
• Crossover probability = 0.85.
• Stopping criterion for the coarsening phase: the coarsening stops as soon as the size of the coarsest problem reaches 100 variables (clusters). At this level, MA generates an initial population.
• Convergence during the refinement phase: if no improvement of the fitness function of the best individual has been observed during 10 consecutive generations, MA is assumed to have reached convergence and moves on to the next level.

Experimental results
Figures 2-9 show how the best assignment (fittest chromosome) progresses during the search. The plots immediately show the dramatic improvement obtained using the multilevel paradigm. The performance of MA is unsatisfactory and degrades further on larger problems, where its percentage excess over the solution is higher than that of MLVMA. The curves do not cross, implying that MLVMA dominates MA. The plots suggest that problem solving with MLVMA happens in two phases. In the first phase, which corresponds to the early part of the search, MLVMA behaves as a hill-climbing method. During this phase, which can be described as a long one, up to 85% of the clauses are satisfied. The best assignment improves rapidly at first, and then flattens off as we mount the plateau, marking the start of the second phase. The plateau spans a region in the search space where flips typically leave the best assignment unchanged, and occurs more specifically once the refinement reaches the finest level. Comparing the multilevel version with the single level version, MLVMA is far better than MA, making it the clear leading algorithm. The key to the efficiency of MLVMA is the multilevel paradigm: MLVMA draws its strength from coupling the refinement process across different levels. This paradigm offers two main advantages which enable MA to become much more powerful in the multilevel context:
• During the refinement phase, MA applies a local transformation (i.e., a move) within the neighborhood (i.e., the set of solutions that can be reached from the current one) of the current solution to generate a new one. The coarsening process offers a better mechanism for performing diversification (i.e., the ability to visit many different regions of the search space) and intensification (i.e., the ability to obtain high quality solutions within those regions).
• By allowing MA to view a cluster of variables as a single entity, the search becomes guided and restricted to only those configurations in the solution space in which the variables grouped within a cluster are assigned the same value. As the size of the clusters varies from one level to another, the size of the neighborhood becomes adaptive and allows the possibility of exploring different regions in the search space while intensifying the search by exploiting the solutions from previous levels in order to reach better solutions.
Figure 9 shows the convergence behavior, expressed as the ratio between the best chromosomes of the two algorithms as a function of time. The plots show that the curves stay below the value 1, leading to the conclusion that MLVMA is faster than MA. The asymptotic performance offered by MLVMA is impressive and dramatically improves on MA. In some cases, the difference in performance reaches 30% during the first seconds and is maintained during the whole search process. In other cases, however, the difference in performance continues to increase as the search progresses.

Conclusion
In this work, we have described a new approach for addressing the satisfiability problem which combines the multilevel paradigm with a simple memetic algorithm. In order to get a comprehensive picture of the new algorithm's performance, we used a set of benchmark instances drawn from Bounded Model Checking. The experiments have shown that MLVMA works quite well with a random coarsening scheme combined with a simple MA used as a refinement algorithm. The random coarsening provided a good global view of the problem, while MA used during the refinement phase provided a good local view. It can be seen from the results that the multilevel paradigm greatly improves the MA and always returns a better solution for the equivalent runtime. The quality of the solution provided by MLVMA can get as high as 77%. A scale-up test shows that the difference in performance between the two algorithms increases with larger problems. Our future work aims at investigating other coarsening schemes and studying other parameters which may influence the interaction between the memetic algorithm and the multilevel paradigm.

Algorithm 2: A Canonical Memetic Algorithm
begin
  Generate initial population;
  Evaluate the fitness of each individual in the population;
  while (convergence not reached)
    Select individuals according to a scheme to reproduce;
    Breed, if necessary, each selected pair of individuals through crossover;
    Apply mutation, if necessary, to each offspring;
    Apply local search to each chromosome;
    Evaluate the fitness of the intermediate population;
    Replace the parent population with a new generation;
end

Fig. 4. bmc-ibm-5: |V| = 9396, |C| = 41207. Along the horizontal axis we give the time (in seconds), and along the vertical axis the number of unsatisfied clauses.

Fig. 9. Results on the convergence behavior for bmc-ibm-2.cnf, bmc-ibm-3.cnf, bmc-ibm-5.cnf. Along the horizontal axis we give the time (in seconds), and along the vertical axis the convergence rate.

Walksat (18)(17)(26)(27) first randomly picks an unsatisfied clause and then, in a second step, randomly selects one of the variables with the lowest break count appearing in the selected clause.
(13) ... getting stuck in non-attractive areas of the underlying search space have become increasingly popular in SAT solving. A new approach to clause weighting known as Divide and Distribute Fixed Weights (DDFW) (20) exploits the transfer of weights from neighboring satisfied clauses to unsatisfied clauses in order to break out from local minima. Recently, a strategy based on assigning weights to variables (33) instead of clauses greatly enhanced the performance of the Walksat algorithm, leading to the best known results on some benchmarks. Evolutionary algorithms are heuristic algorithms that have been applied to SAT and many other NP-complete problems. Unlike local search methods that work on a single current solution, evolutionary approaches evolve a set of solutions. GASAT (21)(24) is considered to be the best known genetic algorithm for SAT. GASAT is a hybrid algorithm that combines a specific crossover and a tabu search procedure. Experiments have shown that GASAT provides very competitive results compared with state-of-the-art SAT algorithms. Gottlieb et al. proposed several evolutionary algorithms for SAT (13). Results presented in that paper show that evolutionary algorithms compare favorably to Walksat. Finally, Boughaci et al. introduced a new selection strategy (5).
Due to the randomized nature of the algorithms, each problem instance was run 20 times with a cutoff parameter (max-time) set to 300 seconds. We use |.| to denote the number of elements in a set, e.g., |V| is the number of variables, while |C| denotes the number of clauses. Table shows the instances used in the experiment. The tests were carried out on a DELL machine with an 800 MHz CPU and 2 GB of memory. The code was written in C and compiled with the GNU C compiler version 4.6. The parameters used in the experiment are those listed under the Test suite heading.