Improvements in Simulated Quenching Method for Vehicle Routing Problem with Time Windows by Using Search History and Devising Means for Reducing the Number of Vehicles

© 2012 Kokubugata et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Improvements in Simulated Quenching Method for Vehicle Routing Problem with Time Windows by Using Search History and Devising Means for Reducing the Number of Vehicles


Introduction
In many countries, rationalization of freight transportation is recognized to be an important problem. For example in Japan surrounded by sea, about 60% of freight transportation is carried out by road traffic. The proportion of freight road transportation to total road transportation is close to half. Although development of information technology accelerates electronic communication, physical distribution of goods is left behind. On the contrary, because electronic commerce has enhanced door-to-door delivery services, delivery distribution of goods has increased in urban areas. The demands for high-quality delivery services such as small-amount high frequency deliveries with time windows have been made by many clients (including companies and individuals).
From the aspect of freight carrier, decease of fuel consumption makes big profit, since the proportion of fuel to total cost is large. The rationalization in terms of increasing the loading rate and decreasing the total travel time is aimed not only for reducing operational costs in each freight carrier but also for relieving traffic congestion, saving energy and reducing exhaust gas. Effective distribution of goods should be realized by sophisticated delivery planning.
A typical delivery problem is modelled mathematically in Vehicle Routing Problem (VRP). In VRP, scattered clients are serviced only once by exactly one of plural vehicles with load

Vehicle Routing Problem with Time Windows (VRPTW))
In actual delivery operations, delivery time windows are often imposed by clients. Time window at node i is described as [ei, li], where ei is the earliest service starting time, li is the latest service starting time at node i. Vehicle routing problem taking account of time windows is called Vehicle Routing Problem with Time Windows (VRPTW). The first characteristic of VRPTW is the strictness of restriction on solutions. It often imposes increasing the number of operating vehicles. The second characteristic is the existence of waiting time. If the vehicle arrives before ei, it must wait until ei and then starts unloading service. Because VRP belongs to NP-hard problems, VRPTW belongs to them, too. Moreover, time windows make sequential delivery order restrictive. Hence, although both of VRP and VRPTW belong to same NP-hard problems in computational complexity theory, from a point of view with making practical algorithms, VRPTW is more difficult than VRP because of its tight constraint.

Precedent studies on heuristics for VRPTW
Because VRPTW belongs to NP-hard problems, exact methods are not fit for large problems. Therefore, heuristics have been important in the application of the VRPTW. Before the proposed method is explained, precedent studies on heuristics for VRPTW are introduced briefly. The heuristics for solving routing problems are classified into two major classes. The one is the family of traditional heuristics and the other is the family of metaheuristics including Simulated Annealing.

Traditional heuristics approaches for VRPTW
Comprehensive survey on traditional heuristics for VRPTW is presented in [3] by Bräysy & Gendreau. In this section, an outline of it is sketched. The traditional heuristics have been specially invented for solving VRPTW. They utilize the proper characteristics of VRPTW. They are further classified into two types.
The first one is the type of constructive heuristics that produce vehicle routes by merging existing routes or inserting nodes into existing routes. Ioannou et al. proposed an efficient constructive heuristic in [4]. They use the generic sequential insertion framework proposed in [5] by Solomon.
The second one is the type of improvement heuristics which make changes in one vehicle route or between several vehicle routes. Bräysy proposed several efficient local search heuristics in [6] using a three-phase approach. In the first phase, several initial solutions are created using the route construction heuristics with different combinations of parameter values. In the second phase, an effort is put to reduce the number of routes. In the third phase, classical Or-opt exchanges, which replace three edges in the original tour by three new ones without modifying the orientation of the route, are used to minimize total travelled distance.  [11]. They combined the SA process with the parallel construction approach that incorporates improvement procedures during the construction process.

Metaheuristics for VRPTW
In a comprehensive survey on metaheuristics for VRPTW given by Bräysy & Gendreau in [12], it is described that some hybrid methods are very effective and competitive with two good GA algorithms listed above. They are briefly introduced as follows. Bent & Van Hentenryck present a two-stage hybrid metaheuristic in [13], where in the first stage is a basic SA used to minimize the number of routes, and the second stage focuses on distance minimization using the large neighbourhood search. Bräysy presents a four-phase deterministic metaheuristic algorithm in [14] which is based on a modification of the variable neighbourhood search. Ibaraki et al. propose three metaheuristics in [15] to improve randomly generated initial solutions.

Data model and method of generating neighbours in searching process of simulated quenching for VRPTW
Although some precedent methods based on metaheuristics mentioned above show good performance, their procedures are considerably complex. In particular, the local search procedures incorporated into them are rather complicated. In practical application of VRPTW algorithms to real-world problems, ease of implementation and flexibility are very important as well as quality of solution, running time and robustness. Hence, the authors of this chapter have proposed a simpler data model and a one-phase algorithm to solve VRPTW in [16] which is not the two-phase algorithm composed by construction and improvement.

Data modelling for VRPTW
The model to express a state of solution of VRPTW is realized as a sequence of integers, i.e., a string. In the string, the position of an integer, which is a symbol of the node with demand, implies not only which vehicle tours the node but also the routing order of it. An example of the string model is illustrated in Figure 2. The special number '0' should be interpreted not only as the depot but also as the delimiter which partitions the trips. If the number of vehicles is denoted by m, (m−1) '0's are provided in the string. If there is no integer between '0' and '0', the relevant vehicle is not in use.
This data model is coincidentally similar to that invented for the solution based on a kind of GA. It was introduced by Gendreau et al. in [17] as the original idea was given by Van Breedam in [18]. However, the proposed transformation rules in this chapter based on the data model are quite different from those of precedent methods. They will be described in the following section.

Transformation rules for generating neighbours
In a repetition in the proposed procedure, a new state of solution is generated from the present state by one of the following three types of transformation rules for generating neighbours. The first rule is to exchange an integer with another one in the string. The

Node with Demand
Depot second rule is to delete an arbitrary integer and then insert it to another position in the string. The third rule is that after a substring is taken out temporally, the direction of the substring is reversed, and then embedded in the place where the substring is taken out. These three transformation rules are illustrated in Figure 3.
Note that the rules are also applied to the special number '0' in the string illustrated in Figure 2. In other words, '0' is treated impartially with other integers. If 'one-to-one exchange' is executed within a substring partitioned by two '0's, only a route is changed. An example of the case is illustrated in Figure 4. If 'one-to-one exchange' is executed between two non-zero integers striding over '0', two nodes are exchanged between two routes. An example of this case is illustrated in Figure 5. If 'one-to-one exchange' is executed between a non-zero integer and '0', two routes are merged, while another route is divided into two routes. An example is illustrated in Figure 6.
When the second transformations rule 'delete and insert' is applied, several different cases also arise. If a non-zero integer is deleted and inserted at '0', a node is moved to another vehicle route. An example is illustrated in Figure 7.
When the third transformations rule 'partial reversal' is applied, several different cases also arise. If a substring including '0' is reversed, the relevant plural routes are changed. An example is illustrated in Figure 8. These three transformation rules were originally invented for VRP in [19] by the authors of this chapter.

Objective function
The objective of the VRPTW is the minimization of total cost which is subject to constraints including the loading capacity of each vehicle and the time windows imposed by clients. The objective function of the VRPTW is formulated as follows.
where s = (s1, s2, · · · , sn) is a string that consists of the nodes with demands and a depot; s0 and sn+1 are the implicit expressions of the depot omitted in the string s; ck is the servicing Figure 6. A result of 'one-to-one exchange' between non-zero integer and '0'. cost at the node k (if k = 0,then ck = 0); dk,l is the minimal traversing cost from the node k to the node l. Each value of dk,l might be given by input data; or calculated as the Euclidean distances between a pair of coordinates of nodes; or calculated by the shortest path search algorithm (Warshall-Floyd's algorithm) when road network is given and vehicles must follow the roads in the network.

One to One Exchange
In order to impose solutions of VRPTW to satisfy time window constraints and load capacities and to reduce the number of vehicles in use, three penalty terms are added to the objective function (1) as follows:

One to One Exchange
where i s a is arriving time at node si; i s l is the latest service starting time at node si; m is the number of vehicles in use; i s w is the amount of demand at node si; zk is the position of kth '0' in the string s = (s1, s2, · · · , sn) (provided that z0 = 0; zm = n+1) and Wk is the loading capacity of vehicle k. α, β and γ are weight parameters.

Optimization algorithm using Simulated Quenching
Simulated Quenching is adopted as the optimization technique for the proposed method since it is characterized by simple stochastic procedures and by global searching scope. In the original Simulated Annealing (SA), starting with a random initial state, it is expected to approach an equilibrium point. In order to obtain global optimum, cooling schedule should be logarithmic. However, it spends too much time to implement it. Hence, in practical applications, exponential cooling schedule (3) is often adopted.
In the proposed method, it is adopted too. According to the strict theory of Simulated Annealing, the optimization technique using exponential cooling schedule (3) belongs to Simulated Qeunching (SQ) as described in [20]. SQ is considerd to be a practical variant of SA.
In the proposed method, the three transformation rules described in Sec. 4.2 are applied randomly to the string model. The entire algorithm for the VRP is described as follows.

Partial Reversal
Step III, that is the main part of this algorithm, is detailed as follows.
The words noted by capital letters are parameters used in SA and SQ and values of them are specified in Sec 6.2. As descibed in Sec. 4.2, the transformation procedure of a solution of the proposed method is carried out randomly to all over the string data model. Hence, the transformation might derive changes in a vehicle route on one occation, it might derive changes over several vehicle routes on other occation. This method is applied to VRP in [19], VRP with backhaouls (VRPB) in [21], Pick up and Delivery Problem (PDP) and VRPTW in [16] by the autors of this chaper. It is also applied to other types of routing problems including Capacitated Arc Routing Problem (CARP) in [22] and a general routing problem with nodes, edges, and arcs (NEARP) in [23]. A precise analysis of this method is presented in [24].

Improvement of optimization algorithm based on SQ by adaptation of devices inspired by ACO
Most of metaheuristics belong to stochastic local search (SLS) which starts at some position in search space and iteratively moves to neighbour, using randomised choices in generating and modifying candidates. In application of metaheuristics, both intensification and diversification are important. For sufficient convergence of solution, intensification of search scope is necessary. On the other hand, in order to avoid stagnation in local but not global minimum area, diversification of search scope is also necessary. As compared to 'stupid fly', search process in pure SA and SQ is completely random. Making use of history records during search processes might be possible to improvement of optimization algorithm based on SQ.

Ant Colony Optimization
ACO was introduced by Dorigo et al. in [25]. It was inspired by foraging behaviour of ants. Ants often communicate via chemicals known as pheromones, which are deposited on the ground in the form of trails. With time, paths leading to the more interesting food sources become more frequented and are marked with larger amounts of pheromone. Pheromone trails provide the basis for stochastic trail-following behaviour underlying, for example, the collective ability to find shortest paths between a food source and the nest. ACO is described as follows.

Initialise pheromone trails; While termination criterion is not satisfied Generate population sp of candidate solutions using subsidiary randomised constructive search Perform subsidiary local search on sp;
Update pheromone trails based on sp (6) In applying ACO to TSP (Travelling Salesman Problem) which is single vehicle version of VRP, details are specified as follows.
1. Pheromone trail τij is associated with each edge (i, j) in G, while heuristic values ηij = 1 / dij is used, where di j is traversing cost of edge (i, j). 2. In the beginning, all weights are initialised to small value τ0. 3. In constructive search, each artificial ant starts with randomly chosen node and iteratively extends partial round trip  by selecting node not contained in  with probability: where N'(i) is the feasible neighbourhood of node i, that is, the set of all neighbours of i that are not contained in  . a and b are parameters which control the relative impact of the weights vs. the heuristic values.
4. After the constructive search, subsidiary local search which is iterative improvement based on standard 2-exchange neighbourhood is operated until local minimum is reached.

In the end of loop, pheromone trail is updated according to
where 0 1    is a parameter regarding vaporization of pheromone, E(s k ) is total cost of the k th ant's cycle s k , m is the number of ants (= size of sp) and Q is a constant. Criterion for weight increase is based on intuition that edges contained in short round trips should be preferably used in subsequent constructions.
As mentioned in Sec. 3.2, not so many methods based on ACO for VRPTW are appeared in the literature.

Application of information obtained in search history to SQ
One of main drawbacks of SA and SQ which are pointed out by users of other metaheuristics is lack of learning in search history, that is, blind random search often compared to 'stupid fly'. Although the complete ACO belongs population-based SLS methods in which genetic algorithm is also contained, a predecessor of ACO is Ant System which is a single ant version of ACO. It was also proposed by Dorigo et al. in [25] and it is recognized as a member of Adaptive Iterated Construction Search (AICS) methods. In the Ant System single artificial ant works and uses information obtained in its own preceding searches. Utilization of some information about good solutions obtained in the preceding search processes is able to be incorporated into SQ procedures. It would be possible to overcome the blind random searches in SQ. Because traversed arcs in good solutions of VRPTW are recognized as characteristics of them, such arcs are expected to be not drastically changed in the succeeding search processes.
In order to embody the idea described above, artificial pheromone trail τij is associated with each edge (i, j) and τij is updated at the end of the loop of temperature T in SQ procedures. The 'characteristic of good solution' is embodied in increase of probability of selecting better candidate in random search process in SQ. In end of loop, weight is updated according to , where 0 1    is a parameter regarding vaporization of pheromone,  , j). When edges (a, i) and (i, b) which are connected with node i are frequently contained in the best routes and costs of the edges are small, the value of ri becomes small. Because such a situation of node i is agreeable to good solutions, it should not be drastically changed in the succeeding search processes. In order to embody the idea stated before, ri is used for assigning node i the biased small probability with which node i is selected for change, instead of obeying uniform distribution as described in Sec. 4.2. That is to say, node i is selected for transformation with probability: The core part of the basic SQ algorithm (5) is replaced by the revised algorithm which is called SQph described as follows.

A device for decreasing the number of vehicles in use
When performance of plural solutions for VRPTW is compared, the first measure is the number of vehicles in use, which is denoted by m, while the second is total cost E. Although SQph is expected to utilize characteristics of good solutions already found and to reduce E, it could not reduce m directly. In order to reduce it, another device should be included in SQ procedures. In the string model described in Sec. 4.1, successive substring of '0's is interpreted as there is no tours between two '0's, that is to say that there is a vehicle not in use. For example shown in Figure 9, when '0' is replaced by another symbol in the string, one vehicle will become not in use. In order to urge to carry out this kind of transformation, artificial pheromone trail associated with edges (i, 0) or (0, i) should be decreased.
The effect of the device (14) could make the probability p0 large according to mechanism described in Sec.5.2. This further revised algorithm in which the device (14) is incorporated with pheromone update (10) is called SQph * in this chapter.

Computational experiments on the proposed method
Computational experiments have been attempted for testing the performance of the proposed method compared with basic SQ method. They have been tried on typical instances for VRPTW.

Solomon's benchmark problems and extended problems for VRPTW
Solomon's benchmark problem sets are produced by Solomon in [5] and provided from Solomon's own website in [26]. They are extremely popular VRPTW instances, and have been used for testing performance of methods by many researchers. Although in some of instances, optimum solutions have been already found by using exact methods, in others, they have not found yet. In both cases, the best solutions found by heuristics have been presented in the literature. Instances including 25, 50, and 100 clients have been provided

One to One Exchange
from Solomon's problem sets. In the instances, each position of clients is given as xcoordinate and y-coordinate. Link cost between client i and client j is calculated with the Euclidian distance. Service time ci is also given to each client i, in addition to the earliest arriving time ei and the latest arriving time li. In this benchmark problems, link cost is directly considered as traversing time of (i, j). Arriving time ai of each node i is calculated by summing up link cost of traversing edges and service time of traversing nodes. Concerning 100 clients problem sets, the geographical data are clustered in C-series 23 instances, randomly generated in R-series 17 instances, and a mix of random and clustered structures in RC-series 16 instances.
Gehring & Homberger extended Solomon's benchmark problems to larger scale problem sets including 200, 400, 600, 800 and 1000 clients in [27]. They are provided from their website [28]. Concerning 200 clients problem sets, the geographical data are clustered in Cseries 20instances, randomly generated in R-series 19 instances, and a mix of random and clustered structures in RC-series 20 instances.
In this chapter, all of 56 instances with 100 clients from Solomon's problems and all of 59 instances with 200 clients from Gehring & Homberger's problems are chosen for computational experiments.

Values of parameters used in the algorithms and specs of the computer in use
In the computations, the values of the parameters with respect to SQ that appear in the basic SQ, SQph and SQph * are set commonly as follows according to the preliminary experiments and the reference to the recommended values by Johnson et al. in [29][30].
are set according to the preliminary execution.
The computational experiments are executed on Windows 7, with Core i7 960, 3.2GHzCPU.

Comparison between basic SQ and SQph
Ratio of application of artificial pheromone trail τij to all transformations is set to 3 cases regarding 50%, 75% and 100%. In other words, the ratio of random transformation is 50%, 25% and 0% in each case. Computational experiments are performed ten times on all benchmark problems with 100 clients and 200 clients.
Concerning the number of vehicles in use m, significant difference is not detected. Regarding total traversing cost E, improvement ratio of SQph to SQ is illustrated in Figure 10.
Values corresponding to the best cost solutions and the worst cost solutions of SQ in ten executions are shown by two bars. Computing time consumed by the methods using SQ and SQph is illustrated in Figure 11. According to these results, larger ratios of application of artificial pheromone trail bring better total traversing cost and longer computing time. Figure 10. Improvement ratio of total traversing cost E using SQph to that using SQ.

Comparison between SQ, SQph and SQph *
In order to evaluate effect of device for reducing the number of vehicles in use m, results of experiments using four types of SQph * are compared. Processes in SQ are divided into two parts. The first half processes correspond to higher temperature, the latter processes to lower temperature. δ (coefficient of reducing pheromone on the edges connecting depot and other clients) is set to 0.25, 0.5 and 1(= not reducing). Four types of SQph * (SQph * 1, SQph * 2, SQph * 3, SQph * 4) are defined in Table 1.
In the latter half processes in SQ δ = 0.25 δ = 0.5 δ = 1 In the first half processes in SQ

Comparison of the number of vehicles m
Computational experiments are performed ten times on all benchmark problems with 100 clients and 200 clients. Regarding the number of vehicles in use m, improvement ratio of SQph * to SQph100% (also to SQ) is illustrated in Figure 12. According to this result, severer reducing coefficient δ (corresponding to SQph * 3 and SQph * 4) brings smaller number of vehicles in use. Improvement in worst cases is larger than in best cases. This situation might be caused by the fact that the value of m is so large in worst case using SQph100% that there is potential to be greatly improved by using SQph * .

Comparison of traversing cost E
Comparison of traversing cost E is significant only between situations based on same number of vehicles m. There are 15 instances with 100 clients and 23 benchmark instances with 200 clients in which optimal m is already obtained by basic SQ. Because in these instances further reduction of m cannot be brought by SQph * , comparison of E is attempted in these instances. Regarding E, improvement ratio of SQph * to SQ is illustrated in Figure 13.
As defined in Table 1, SQph * 2 and SQph * 4 accompany reduction of pheromone on the edges connecting depot conducted only in the first processes in SQph, while SQph * 1 and SQph * 3 accompany reduction of the pheromone in all processes in SQph. These results are interpreted to mean that most of reduction of the number of vehicles in use is likely attained in the first processes in SQph * , while convergence of E mainly depends on the latter processes in SQph. Computing time consumed by the methods using SQ, SQph and SQph * is compared in Figure 14. Figure 13. Improvement ratio of total traversing cost E using SQph and SQph * to that using SQ. To summarize these experiments, SQph * 4 is the most well-balanced method among methods discussed in this chapter by taking account of both of reduction of the number of vehicles in use m and reduction of total traversing cost E. Moreover, regarding computing time, SQph * 4 is moderate. Although computing time consumed by SQph * 4 is about two times longer than that by basic SQ, it takes about only 1.6 min for solving 100 clients problems, and it takes about 6.5 min even in 200 clients problems. It is applicable to make actual vehicle routing plans in freight carriers.

Conclusion
In this chapter, in order to relieve blind searches in Simulated Quenching (SQ), that is are a practical variant of Simulated Annealing (SA), utilization good characteristics of history records during SQ search processes is attempted. Two new devices which are inspired by the effect of pheromone in ant colony optimization (ACO) are adjusted and incorporated into SQ procedures to solve VRPTW. The one is a device to reduce total traversing cost E and the other is a device to reduce the number of vehicles in use m. By computational experiments on all of 56 benchmark instances with 100 clients and all of 59 benchmark instances with 200 clients, it is shown that both of two devices are effective. However, there is a trade-off between effects for reducing E and for reducing m. Taking account of putting the right device in the right place, SQph * 4 in which the device for reducing m is set in the first half processes and the device for reducing E is set in all processes in SQ seems to be the best method. Moreover, it is moderate in computing time consumed. Reducing m in the first half processes is corresponding to diversification, while reducing E in all processes is corresponding to intensification of search. Hence, it is considered that this method improves both diversification and intensification in SQ procedures.
As mentioned before, ease of implementation and flexibility are very important as well as quality of solution, running time and robustness in practical application of VRPTW algorithms to real-world problems. The proposed method is composed by a simple data model and straightforward one-phase algorithm to solve VRPTW. Therefore, the proposed method has comparative ease of implementation and much flexibility.
Two devices incorporated in SQ procedures in this chapter are able to be incorporated into SQ procedures in other routing problems which are embodied in the string model. As introduced in Sec.4.4, VRP with backhaouls (VRPB), Pickup and Delivery Problem (PDP), Capacitated Arc Routing Problem (CARP) and a general routing problem with nodes, edges, and arcs (NEARP) have aleady been embodied in string model and solved by SQ method. Application of two devices to these problems are left for future study.