A Simulated Annealing Algorithm for the Satisfiability Problem Using Dynamic Markov Chains with Linear Regression Equilibrium

Since its appearance, the Simulated Annealing algorithm has proven to be an efficient method for solving combinatorial optimization problems such as the Boolean Satisfiability problem. New algorithms based on two cycles have emerged: an external cycle for temperatures and an internal one, named Metropolis. These algorithms usually use the same Markov chain length in the Metropolis cycle for every temperature. In this paper we propose a method based on linear regression to detect Metropolis equilibrium. Experimentation shows that the proposed method is more efficient than the classical one, since it obtains the same quality of final solution with less processing time.


Introduction
Today there is considerable interest in developing new and efficient algorithms to solve hard problems, mainly those considered in complexity theory (NP-complete or NP-hard) [8]. The Simulated Annealing algorithm proposed by Kirkpatrick et al. [18] and Cerny [5,6] is an extension of the Metropolis algorithm [23] used for the simulation of the physical annealing process, and it is especially applied to NP-hard problems where it is very difficult to find the optimal solution or even near-optimal solutions.
The efficiency and efficacy of the Simulated Annealing algorithm are determined by the cooling scheme, which consists of the initial (c_i) and final (c_f) temperatures, the cooling function f(c_k), and the length of the Markov chain (L_k) established by the Metropolis algorithm. For each value of the control parameter c_k (temperature), Simulated Annealing accomplishes a certain number of Metropolis decisions. In this regard, to get better performance from the algorithm, a relation between the temperature and Metropolis cycles can be established [13].
The Simulated Annealing algorithm can get optimal solutions efficiently only if its cooling scheme parameters are correctly tuned. Due to this, experimental and analytical parameter-tuning strategies are currently being studied; one of them, known as ANDYMARK [13], is an analytical method that has been shown to be more efficient. The objective of these methods is to find better ways to reduce the required computational resources and to increase the quality of the final solution. This is done by applying different accelerating techniques, such as variations of the cooling scheme [3,27], variations of the neighborhood scheme [26], and parallelization techniques [12,26].
In this chapter an analytic, adaptive method to establish the length of each Markov chain dynamically for the Simulated Annealing algorithm is presented; the method determines equilibrium in the Metropolis cycle using the Linear Regression Method (LRM). LRM is applied to solve satisfiability problem instances and is compared against the classical ANDYMARK tuning method.

Background
In complexity theory, the satisfiability problem is a decision problem. The question is: given a Boolean expression, is there some assignment of TRUE and FALSE values to the variables that makes the entire expression true? A formula of propositional logic is said to be satisfiable if logical values can be assigned to its variables in a way that makes the formula true.
The propositional satisfiability problem, which asks whether a given propositional formula is satisfiable, is of critical importance in various areas of computer science, including theoretical computer science, algorithmics, artificial intelligence, hardware design, electronic design automation, and verification. The satisfiability problem was the first problem proven to be NP-complete [7] and is fundamental to the analysis of the computational complexity of many problems [28].

Boolean satisfiability problem (SAT)
An instance of SAT is a Boolean formula consisting of the following components:
• A set S of n variables x_1, x_2, x_3, ..., x_n.
• A set L of literals; a literal l_i is a variable x_i or its negation ¬x_i.
• A set of m clauses C_1, C_2, C_3, ..., C_m, where each clause consists of literals l_i joined by the logical connective OR (∨).
SAT becomes easier if the number of literals in a clause is limited to 2, in which case the problem is called 2-SAT; this problem can be solved in polynomial time [2,10]. One of the most important restrictions of SAT is HORN-SAT, where the formula is a conjunction of Horn clauses (a Horn clause is a clause with at most one positive literal). This problem is solved by the polynomial-time Horn-satisfiability algorithm [9].
The 3-satisfiability problem (3-SAT) is the special case of k-satisfiability (k-SAT) in which each clause contains exactly k = 3 literals. 3-SAT is NP-complete and is used as a starting point for proving that other problems are also NP-hard [31]; this is done by a polynomial-time reduction from 3-SAT to the other problem [28].
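To make the cost function used by the annealer concrete, the following minimal sketch (our own illustration, not the chapter's code) evaluates a CNF formula in the usual signed-integer convention: a positive literal k stands for x_k and a negative literal -k for its negation.

```python
def satisfied_clauses(clauses, assignment):
    """Count the clauses made true by the assignment.

    clauses    : list of clauses, each a list of non-zero ints;
                 literal k stands for x_k, -k for NOT x_k.
    assignment : dict mapping variable index -> bool.
    """
    count = 0
    for clause in clauses:
        # A clause is true if at least one literal agrees with the assignment.
        if any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            count += 1
    return count

# A tiny 3-SAT instance: (x1 v x2 v ~x3) AND (~x1 v x3 v x2)
clauses = [[1, 2, -3], [-1, 3, 2]]
assignment = {1: True, 2: False, 3: True}
print(satisfied_clauses(clauses, assignment))  # both clauses satisfied -> 2
```

The number of satisfied clauses is exactly the solution quality measure (SQM) used later in the experiments, so maximizing it (or minimizing the number of false clauses) is the natural energy for SA on SAT.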

Simulated Annealing algorithm
Simulated Annealing improves on simple local search through the introduction of two elements. The first is the Metropolis algorithm [23], in which some states that do not improve the energy are accepted because they allow the solver to explore more of the space of possible solutions. Such "bad" states are allowed using the Boltzmann criterion: e^{−ΔJ/T} > rnd(0, 1), where ΔJ is the change of energy, T is a temperature, and rnd(0, 1) is a random number in the interval [0, 1). J is called a cost function and corresponds to the free energy in the case of annealing a metal. If T is large, many "bad" states are accepted, and a large part of the solution space is explored.
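The Boltzmann criterion can be sketched directly; this is a minimal illustration with our own naming (the `rng` parameter is introduced only to make the example deterministic):

```python
import math
import random

def metropolis_accept(delta_j, temperature, rng=random.random):
    """Boltzmann acceptance: always accept improvements (delta_j <= 0);
    accept a deterioration delta_j > 0 with probability exp(-delta_j / T)."""
    if delta_j <= 0:
        return True
    return math.exp(-delta_j / temperature) > rng()

# At a very high temperature almost any deterioration is accepted;
# at a low temperature only tiny ones are.
print(metropolis_accept(5.0, 1e9, rng=lambda: 0.5))  # exp(~0) ~ 1  -> True
print(metropolis_accept(5.0, 0.1, rng=lambda: 0.5))  # exp(-50) ~ 0 -> False
```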
The second is, again by analogy with annealing of a metal, to lower the temperature. After visiting many states and observing that the cost function declines only slowly, one lowers the temperature, and thus limits the size of allowed "bad" states. After lowering the temperature several times to a low value, one may then "quench" the process by accepting only "good" states in order to find the local minimum of the cost function.
The elements of Simulated Annealing are:
• A finite set S.
• A cost function J defined on S. Let S* ⊂ S be the set of global minima of J.
• For each i ∈ S, a set S(i) ⊂ S \ {i} called the set of neighbors of i.
• For every i, a collection of positive coefficients q_ij, j ∈ S(i), such that Σ_{j∈S(i)} q_ij = 1. It is assumed that j ∈ S(i) if and only if i ∈ S(j).
• A nonincreasing function T : N → (0, ∞) called the cooling schedule. Here N is the set of positive integers, and T(t) is called the temperature at time t.
The Simulated Annealing algorithm consists of a discrete-time inhomogeneous Markov chain x(t) [4]. If the current state x(t) is equal to i, a neighbor j of i is chosen at random; the probability that any particular j ∈ S(i) is selected is equal to q_ij. Once j is chosen, the next state x(t + 1) is determined as follows: if J(j) ≤ J(i), then x(t + 1) = j; otherwise, x(t + 1) = j with probability e^{−(J(j)−J(i))/T(t)} and x(t + 1) = i with the complementary probability. Now consider the homogeneous Markov chain x_T(t), in which the temperature T(t) is held at a constant value T. Assuming that x_T(t) is irreducible and aperiodic, and that q_ij = q_ji for all i, j, then x_T(t) is a reversible Markov chain and its invariant probability distribution is given by π_T(i) = (1/Z_T) e^{−J(i)/T}, where Z_T is a normalizing constant. It is evident that as T → 0 the probability π_T concentrates on the set S* of global minima of J; this property remains valid even if the condition q_ij = q_ji is relaxed [11].
In the optimization context we can generate an optimal element with high probability if we produce a random sample according to the distribution π_T, known as the Gibbs distribution. Generating an element of S by simulating the Markov chain x_T(t) until it reaches equilibrium yields the Metropolis algorithm [23].
The Simulated Annealing algorithm can also be viewed as a local search method that occasionally moves to higher values of the cost function J; these moves help Simulated Annealing escape from local minima. A proof of convergence of the Simulated Annealing algorithm can be found in [4]. Figure 1 shows the classic Simulated Annealing algorithm. In the algorithm, the temperature cycle spans steps 2 to 5; within this temperature cycle, steps 3 and 4 correspond to the Metropolis algorithm.
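The two nested cycles just described (temperatures outside, Metropolis inside) can be sketched as follows. This is a generic outline under our own naming, not the chapter's exact pseudocode; note the fixed Metropolis length `chain_len` at every temperature, which is precisely what the later sections replace with a dynamic length.

```python
import math
import random

def simulated_annealing(cost, neighbor, s0, t_init, t_final, alpha, chain_len):
    """Classic SA: an outer temperature cycle with exponential cooling and an
    inner Metropolis cycle of fixed length chain_len at every temperature."""
    s, t = s0, t_init
    best = s
    while t > t_final:                      # temperature cycle
        for _ in range(chain_len):          # Metropolis cycle
            cand = neighbor(s)
            delta = cost(cand) - cost(s)
            if delta <= 0 or math.exp(-delta / t) > random.random():
                s = cand
                if cost(s) < cost(best):
                    best = s
        t *= alpha                          # cooling: T_{k+1} = alpha * T_k
    return best

# Toy usage: minimize (x - 3)^2 over the integers with +/-1 moves.
random.seed(0)
res = simulated_annealing(lambda x: (x - 3) ** 2,
                          lambda x: x + random.choice((-1, 1)),
                          s0=20, t_init=10.0, t_final=0.01,
                          alpha=0.9, chain_len=30)
print(res)  # typically 3
```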

Traditional Simulated Annealing algorithms
As described above, the Metropolis cycle of the Simulated Annealing algorithm is repeated until thermal equilibrium is reached. We now use the formalism of Markov chains to estimate how many times the Metropolis cycle must be repeated so that, with some probability, all solutions of the search space are explored.
Similarly, we can estimate good values for the initial and final temperatures of the temperature cycle. All these estimates are made before running the Simulated Annealing algorithm, using information from the SAT instance being solved. It is well known that Simulated Annealing requires a well-defined neighborhood structure and other parameters such as the initial and final temperatures T_i and T_f. To determine these parameters we follow the method proposed in [30], whose analysis we summarize below.
Let P_A(S_j) be the probability of accepting a proposed solution S_j generated from the current solution S_i, and P_R(S_j) the probability of rejecting it; the rejection probability can be established in terms of P_A(S_j) as P_R(S_j) = 1 − P_A(S_j). Accepting or rejecting S_j depends only on the size of the cost deterioration that the change would produce in the current solution: P_A(S_j) = g(ΔJ_ij), where ΔJ_ij = J(S_j) − J(S_i), J(S_i) and J(S_j) are the costs associated with S_i and S_j respectively, and g(ΔJ_ij) is the probability of accepting the cost difference ΔJ_ij. The solution S_j proposed from S_i may be any solution defined by the following neighborhood scheme:

Definition 2. Let V_{S_i} ⊂ S be the set of neighboring solutions of S_i, where V : S → S is a mapping and S is the solution space of the problem being solved.
It can be seen from Definition 2 that the neighbors of a solution S_i depend only on the neighborhood structure V established for a specific problem. Once V is defined, the maximum and minimum cost deteriorations can be written as ΔJ_{Vmax} = max{ J(S_j) − J(S_i) : S_j ∈ V_{S_i}, S_i ∈ S } and ΔJ_{Vmin} = min{ J(S_j) − J(S_i) : S_j ∈ V_{S_i}, S_i ∈ S }, where ΔJ_{Vmax} and ΔJ_{Vmin} are the maximum and minimum cost deteriorations of the objective function J, respectively.

Markov Chains and Cooling Function
The Simulated Annealing algorithm can be seen as a sequence of homogeneous Markov chains, where each Markov chain is constructed for descending values of the control parameter T > 0 [1]. The control parameter is set by a cooling function of the form T_{k+1} = f(T_k), and T_k must satisfy lim_{k→∞} T_k = 0. At the beginning of the process T_k has a high value and the probability of accepting a proposed solution is high; as T_k decreases this probability also decreases, and only good solutions are accepted at the end of the process. In this regard every Markov chain makes a stochastic walk in the solution space until the stationary distribution is reached, so a strong relation exists between the Markov chain length (L_k) and the cooling speed of Simulated Annealing. Because the Markov chains are built through a neighborhood sampling method, the maximum number of different solutions rejected at T_f, when the current solution S_i is the optimal one, is the neighborhood size |V_{S_i}|. In this regard the maximum Markov chain length is a function of |V_{S_i}|; in general, L_k can be established as L_k ≤ L_max = g(|V_{S_i}|), where L_max is the Markov chain length when T_k = T_f, and g(|V_{S_i}|) is a function giving the maximum number of samples that must be taken from the neighborhood V_{S_i} in order to evaluate an expected fraction of different solutions at T_f. The value of L_max depends only on the number of elements of V_{S_i} that will be explored at T_f.
Usually a Simulated Annealing algorithm uses a uniform probability distribution G(T_k) = 1/|V_{S_i}|, given by a random sampling-with-replacement method, to explore V_{S_i} at any temperature T_k. In this regard, the probability of obtaining a particular solution S_j in N samples is P_A(S_j) = 1 − (1 − 1/|V_{S_i}|)^N. Notice that P_A(S_j) may be understood as the expected fraction of different solutions obtained when N samples are taken. Solving for N gives N = ln(P_R(S_j)) / ln(1 − 1/|V_{S_i}|) ≈ C·|V_{S_i}|, where we define C = −ln(P_R(S_j)), with P_R(S_j) = 1 − P_A(S_j) the rejection probability. The constant C establishes the level of exploration to be done, so different levels of exploration can be applied. For example, if 99% of the neighborhood is to be explored, the rejection probability is P_R(S_j) = 0.01, and we obtain C = −ln(0.01) ≈ 4.60. Taking L_max = C·|V_{S_i}| thus fixes the maximum Markov chain length.
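The approximation N ≈ C·|V| can be checked numerically against the exact sampling formula; the sketch below is our own code under that reading of the derivation.

```python
import math

def max_chain_length(neighborhood_size, p_reject=0.01):
    """Markov chain length needed so that the expected fraction of unseen
    neighbors at T_f is p_reject, sampling uniformly with replacement.

    Exact:  N = ln(p_reject) / ln(1 - 1/|V|)
    Approx: N ~ C * |V|  with  C = -ln(p_reject).
    """
    v = neighborhood_size
    exact = math.log(p_reject) / math.log(1.0 - 1.0 / v)
    approx = -math.log(p_reject) * v
    return exact, approx

exact, approx = max_chain_length(1000, p_reject=0.01)
print(round(-math.log(0.01), 2))    # C = 4.61 (the 4.60 quoted in the text)
print(round(exact), round(approx))  # both near 4605 for |V| = 1000
```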

Definition 3.
The exploration set of the search space, Φ_C, is the set of solutions sampled from the neighborhood when the exploration constant is C. Because a high percentage of the solution space should be explored, C varies in the range 1 ≤ C ≤ 4.6, which guarantees a good level of exploration of the neighborhood at T_f.
When the process is at the beginning, the temperature T_i is very high. This is because in the Boltzmann distribution the acceptance probability is directly related to the cost increment, P_A = e^{−ΔJ/T_k}, where T_k is the temperature parameter; therefore T_k = −ΔJ / ln(P_A). At the beginning of the process, P_A is close to one (normally 0.99 [21]) and the temperature is extremely high; almost any solution is accepted at this temperature, so the stochastic equilibrium of a Markov cycle is reached with the first guess solution. Similarly, when the process is ending, the acceptance probability is very small (typically 0.01) and the temperature is close to zero, but the Metropolis cycle is very long.
For a SAT instance, the values ΔJ_{Vmax} and ΔJ_{Vmin} of the energy change between states can be estimated at the beginning of the execution of the Simulated Annealing algorithm. To estimate these values we can count, for each variable, the number of clauses containing it: the largest number of clauses that can change when the value of a single variable is flipped is an upper bound on the maximum change of energy, ΔJ_{Vmax}. Similarly, the minimum energy change ΔJ_{Vmin} can be estimated by counting the clauses that change when a new neighbor is created and taking the lowest of these values. One criticism of Simulated Annealing is that the long execution time of standard Boltzmann-type annealing has often driven projects to use a temperature schedule too fast to satisfy the sufficiency conditions required to establish a truly ergodic search. In this chapter we use an exponential temperature schedule consistent with the Boltzmann algorithm: T_k = α^k T_i. From this we obtain T_{k+Δk} = α^{Δk} T_k, and if Δk is equal to 1 we obtain the equation for two successive values of the temperature, T_{k+1} = αT_k, with 0 < α < 1 and k ≥ 1, where T_k is the "temperature" and k is the "time" index of annealing [16,17].
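The estimates of ΔJ_{Vmax}, ΔJ_{Vmin}, T_i and T_f described above can be sketched as follows; counting variable occurrences as the bound on the energy change is our reading of the chapter's description, and all names are ours.

```python
import math
from collections import Counter

def temperature_bounds(clauses, p_accept_init=0.99, p_accept_final=0.01):
    """Estimate initial/final temperatures for a SAT instance.

    Flipping one variable can change at most (and at least) as many clauses
    as that variable occurs in, so occurrence counts bound the energy change;
    each temperature then follows from T = -dJ / ln(P_A).
    """
    occurrences = Counter(abs(lit) for clause in clauses for lit in clause)
    dj_max = max(occurrences.values())   # upper bound on the energy change
    dj_min = min(occurrences.values())   # lower bound on the energy change
    t_init = -dj_max / math.log(p_accept_init)
    t_final = -dj_min / math.log(p_accept_final)
    return t_init, t_final

clauses = [[1, 2, -3], [-1, 3, 2], [2, -3, 1]]
t_i, t_f = temperature_bounds(clauses)
print(t_i > t_f)  # the initial temperature is far higher -> True
```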

Simulated Annealing algorithm with dynamic Markov chain length
In [13,20,21] the authors showed that a strong relation exists between the cooling function and the length of the Markov chain. For the Simulated Annealing algorithm, the stationary distribution of each Markov chain is given by the Boltzmann probability distribution, which is a family of curves that vary from a uniform distribution to a pulse function.
At the very beginning of the process (with T_k = T_i), Simulated Annealing has a uniform distribution, hence any guess would be accepted as a solution; likewise, any neighbor of the current solution is also accepted as a new solution. Thus, when Simulated Annealing is just starting, the Markov chain length is really small, L_k = L_1 ≈ 1. As the temperature cycle runs, for values of k greater than 1, the value of T_k is decremented by the cooling function T_{k+1} = αT_k [16] until the final temperature is reached (T_k = T_f), where α is normally in the range [0.7, 0.99] [1].
In this regard the length of each Markov chain must be incremented at every temperature cycle in a way similar, but inverse, to how T_k is decremented. This means that L_k must be incremented until L_max is reached at T_f, by applying a Markov chain increment factor (β). The cooling function is applied many times until the final temperature T_f is reached. Because the Metropolis cycle finishes when stochastic equilibrium is reached, it can also be modeled as a Markov chain: L_{k+1} = βL_k. Here L_k represents the length of the current Markov chain at a given temperature, that is, the number of iterations of the Metropolis cycle at temperature T_k, and L_{k+1} represents the length of the next Markov chain. In this Markov model, β represents the increment in the number of iterations of the next Metropolis cycle.
If the cooling function T_{k+1} = αT_k is applied over and over, n times, until T_k = T_f, the geometric function T_f = α^n T_i is easily obtained. Knowing the initial temperature (T_i), the final temperature (T_f) and the cooling coefficient (α), the number of times the Metropolis cycle is executed can be calculated as n = ln(T_f / T_i) / ln(α). Carrying out a similar process for the growth of the Markov chain length yields another geometric function, L_max = β^n L_1. Once n is known, the value of the increment coefficient is β = (L_max / L_1)^{1/n}. Once L_max (computed from the neighborhood size as described above), L_1 and β are known, the length of each Markov chain for each temperature cycle can be computed dynamically, growing from L_1 = 1 at T_i to L_max at T_f. First the algorithm obtains T_i and T_f from the acceptance probabilities and the energy-change bounds; with both values it calculates n and then β [30].
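The computation of n and β, and the resulting schedule of chain lengths, can be sketched as follows (a minimal illustration with our own names and example values):

```python
import math

def markov_schedule(t_init, t_final, alpha, l_max, l_1=1.0):
    """Number of temperature steps n = ln(T_f/T_i)/ln(alpha) and the
    chain-growth factor beta = (L_max/L_1)^(1/n), so that the recursion
    L_{k+1} = beta * L_k grows from L_1 at T_i to L_max at T_f."""
    n = math.log(t_final / t_init) / math.log(alpha)
    beta = (l_max / l_1) ** (1.0 / n)
    return n, beta

n, beta = markov_schedule(t_init=300.0, t_final=0.65, alpha=0.95, l_max=4605)
lengths = [1.0]
for _ in range(int(n)):
    lengths.append(lengths[-1] * beta)   # L_{k+1} = beta * L_k

print(round(n), round(beta, 3))
print(round(beta ** n))  # recovers l_max = 4605 by construction
# lengths[-1] sits just below l_max because n is not an integer.
```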
Figure 2 shows the Simulated Annealing algorithm with the Markov chain modifications described above. Below we explain how linear regression can make the Simulated Annealing algorithm run more efficiently without losing quality in the solution.

Linear Regression Method (LRM)
In Section 3.2 we explained how to estimate the initial and final temperatures for the SAT instances provided to the Simulated Annealing algorithm, which determines whether each instance is satisfiable or not.
As shown in Figure 3, the Metropolis algorithm visits various configurations with different energies at a given temperature.
The typical behavior of the energy at a given temperature can be observed in Figure 3. We set out to determine when the Metropolis cycle reaches equilibrium without running the whole cycle. Stop criterion: in order to determine this zone adaptively, we fit a straight line by least squares and stop the Metropolis cycle if the slope of this line is equal to or smaller than zero. Linear regression (LRM) is a well-known method, but it had never before been applied to detect Metropolis equilibrium in Simulated Annealing.
Suppose that the data set consists of the points (x_i, y_i), i = 1, ..., n. We want to find a function f such that f(x_i) ≈ y_i. To attain this goal, we suppose that the function f has a particular form containing some parameters (a_1, a_2, ..., a_m) which need to be determined. In our problem the form is the straight line f(x, a, b) = ax + b, where a and b are not yet known; we take x_i = i, so f(x_i, a, b) = f(i, a, b) ≈ J_i, the energy of the i-th configuration visited in the Metropolis cycle.
As usual, we now seek the values of a and b that minimize the sum of the squares of the residuals, S = Σ_{i=1}^{n} (y_i − (a x_i + b))². As is well known, the regression equations are obtained by differentiating S with respect to each parameter a and b, which gives the system of linear equations a Σx_i² + b Σx_i = Σx_i y_i and a Σx_i + bn = Σy_i. Defining the constants S_x = Σx_i, S_xx = Σx_i², S_y = Σy_i and S_xy = Σx_i y_i, the system can be rewritten as aS_xx + bS_x = S_xy and aS_x + bn = S_y, and the slope of the fitted line is a = (n S_xy − S_x S_y) / (n S_xx − S_x²). In our data x_i = 1, 2, 3, ..., n, so we can write S_x = n(n + 1)/2 and S_xx = n(n + 1)(2n + 1)/6, while S_y = Σ J_i and S_xy = Σ i·J_i; substituting these expressions gives the slope directly in terms of the energies J_i alone.
In order to apply LRM to traditional Simulated Annealing, we use the following strategy:
1. The Metropolis cycle runs as usual, as explained in Section 3.3, using the maximum Markov chain length L_max^C calculated with the exploration constant C.
2. When the number of Metropolis repetitions L equals L_max^{C−1} (the chain length calculated with exploration constant C − 1), the slope a of the fitted line is evaluated; if it is equal to or smaller than zero, the Metropolis cycle is stopped even though it has not reached the length L_max.
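The closed-form slope for x_i = 1, ..., n and the resulting stop test can be sketched as follows (our code, not the authors'):

```python
def regression_slope(energies):
    """Least-squares slope a of the line fitted to the points (i, J_i),
    i = 1..n, using the closed form
    a = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    with Sx = n(n+1)/2 and Sxx = n(n+1)(2n+1)/6."""
    n = len(energies)
    sx = n * (n + 1) / 2.0
    sxx = n * (n + 1) * (2 * n + 1) / 6.0
    sy = sum(energies)
    sxy = sum(i * j for i, j in enumerate(energies, start=1))
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

def reached_equilibrium(energies, eps=0.0):
    """LRM stop criterion: end the Metropolis cycle early once the fitted
    slope is <= eps, i.e. the energy is no longer decreasing."""
    return regression_slope(energies) <= eps

print(regression_slope([10, 8, 6, 4]))    # strictly decreasing -> -2.0
print(reached_equilibrium([5, 5, 5, 5]))  # flat energy -> True
```

Because the slope uses only running sums, it can be maintained incrementally inside the Metropolis loop at O(1) cost per iteration.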
Notice from the slope formula and from Figure 4 that the computation of LRM is O(n), where n is the number of points used to compute the slope; thus the asymptotic complexity of Simulated Annealing with LRM is not affected [19,22].

Experimental results
In order to evaluate the LRM algorithm we used the SAT instances listed in Table 1 and Table 2. Some of these instances were generated using the programs proposed by Horie et al. [15] and the others come from SATLIB [14]. We generated several instances that have the same clause-to-variable ratio σ [24,25].
The efficiency of the algorithm was measured by its execution time; we also recorded a solution quality measure (SQM), taken as the number of "true" clauses in an instance at the end of the program execution.

Experiment design
Experiments with these algorithms require considerable run time, because each SAT instance is solved several times in order to take average performance values.
Another important element to consider is guaranteeing that the execution conditions of the various algorithms are similar (because we measure execution time). In this regard, the first thing we did was evaluate a set of computers with similar hardware and software conditions.
For this task we used benchmark programs available on the Internet, which test different aspects of each computer and report the result of the evaluation [29,32,33].
In Table 3, MOS means millions of operations per second, MSS millions of strings per second, and MBTS millions of bytes transferred per second. As Table 3 shows, the machines offer comparable performance. The results of both algorithms, SA_C and SA_LRM, were compared using two quotients, which we call the time improvement Q_time and the quality improvement Q_quality, defined as the ratio of the SA_LRM value to the SA_C value, expressed as a percentage, for execution time and for SQM respectively. If Q_quality is close to 100%, both algorithms find equally good solutions; the Q_time factor, however, should decrease, which implies that the new algorithm SA_LRM is faster than SA_C. From the experimental results in Table 4 and Table 5 we obtain the average values of Q_time and Q_quality, shown in Table 6.
As can be seen in Table 6, the quality factor Q_quality is very close to 100%, which implies that the SA_LRM algorithm finds solutions as good as those of the SA_C algorithm. It is important to note that 37% of the SAT instances used in the experiments are unsatisfiable, which implies that their SQM cannot equal 100% and therefore the quality factor must be less than 100%.
Also in Table 6, we see that the factor Q_time falls below 30%, showing that our algorithm SA_LRM is about 70% faster than the SA_C algorithm while maintaining the same solution quality.
As shown in Figure 5, for some instances the reduction in run time is only 25%, while for others the run time is reduced by up to 90%.
Table 6. Q_time and Q_quality averages for all instances tested

Conclusions
In this paper a new adaptive Simulated Annealing algorithm named SA_LRM was presented; it uses the least-squares method to find the equilibrium zone of the Metropolis cycle. When this zone is found, our algorithm aborts the Metropolis cycle even though the iterations calculated with the dynamic Markov chains have not been completed. Experimentation shows that SA_LRM is more efficient than algorithms tuned using only an analytical method.

Author details
Felix