Search Algorithms on Logistic and Manufacturing Problems

The supply chain comprises problems with different levels of complexity. Nowadays, the design of distribution networks and production scheduling are among the most complex problems in logistics. It is widely known that large instances of these problems cannot be solved through exact methods within reasonable time, and specialized optimization software is frequently needed. To overcome this situation, search algorithms have been developed and applied to obtain approximate solutions to large problems within reasonable time. In this context, the present chapter describes the development of Genetic Algorithms (an evolutionary search algorithm) for vehicle routing, product selection, and production scheduling problems within the supply chain. These algorithms were evaluated by using well-known test instances. This work also provides a general discussion on the design of these search algorithms for logistics problems.


Introduction
According to the Council of Supply Chain Management Professionals (CSCMP), logistics is defined as the process of planning, implementing and controlling all operations and information flow for the efficient and effective transportation and storage of goods or services from a point of origin to a point of consumption. As presented in Figure 1, many operations are involved in a logistics network, and manufacturing is a crucial operation to transform inbound goods (e.g., raw materials) into outbound goods (e.g., end products, sub-assemblies, work-in-process, etc.) throughout this network.
Due to the complexity of these operations, many of which involve problems of NP-hard computational complexity, research and improvement efforts require the use of advanced quantitative and qualitative strategies and tools. Among these, Search Algorithms such as meta-heuristics have been proposed to solve large NP-hard problems to near-optimality within reasonable time [1].
As presented in Figure 1, transportation is needed for the efficient flow of goods throughout the supply chain (SC). Thus, the analysis and solution of routing problems are the first set of problems to be addressed in this chapter.
Then, manufacturing planning is needed to achieve the required quantities of sub-assemblies and end-products to supply the customers (or even other suppliers) in time through the SC. Thus, production planning problems are the second set of problems to be addressed in this chapter. Note that both sets are mutually important and dependent for the appropriate performance of the SC.
While there are many search algorithms or meta-heuristic approaches to solve these problems, this chapter addresses the specific configuration settings to apply Genetic Algorithms (GA) to solve both sets of problems. As the solutions have different representations (i.e., permutations, binary chains, real numbers), having a common algorithmic base can lead to a better understanding for successful implementation for other problems and contexts.
GA are based on the natural selection principle of "survival of the fittest", where individuals within a population compete with each other for vital resources (e.g., food, shelter) and/or to attract mates for reproduction. Due to this selection mechanism, poorly performing individuals are expected to have less chance to survive than the most adapted or "fit" individuals, which are more likely to reproduce and pass their good characteristics on to their offspring, making them better adapted to their environment [2]. Figure 2 presents the general structure and main elements of a GA. This meta-heuristic is population-based. Thus, it works by continuously improving a set of solutions by using reproduction operators which drive the search through the solution space of the problem. This set, known as the population, consists of N feasible solutions which are evaluated through a fitness function (e.g., the total distance equation, or objective function, which determines the total cost associated with each solution). Then, the solutions with the best fitness values become candidates for reproduction to (hopefully) pass their best features on to new solutions and improve the overall population in the next generation (iteration). It is expected that after X generations the mean fitness of the population converges to a local optimum.
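The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation evaluated in this chapter: binary tournament selection, elitist replacement, the function name `genetic_algorithm`, its parameters, and the small bit-string demo problem are all hypothetical choices made here for illustration (the actual settings are those of Table 1 and Figure 4).

```python
import random

def genetic_algorithm(init, fitness, crossover, mutate,
                      pop_size=30, generations=100, seed=0):
    """Minimise `fitness` with a basic generational GA (illustrative sketch)."""
    rng = random.Random(seed)
    # Population of N candidate solutions.
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Assumed selection: the fitter of two random individuals is a parent.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            offspring.append(mutate(crossover(p1, p2, rng), rng))
        # Assumed replacement: keep the best pop_size of parents plus offspring,
        # so the best fitness never gets worse from one generation to the next.
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return population[0]

# Hypothetical demo problem: minimise the number of zeros in a bit string.
N_BITS = 20

def init_bits(rng):
    return [rng.randint(0, 1) for _ in range(N_BITS)]

def count_zeros(x):
    return N_BITS - sum(x)

def one_point(a, b, rng):
    cut = rng.randrange(1, N_BITS)
    return a[:cut] + b[cut:]

def flip_one_bit(x, rng):
    y = x[:]
    i = rng.randrange(N_BITS)
    y[i] = 1 - y[i]
    return y

best = genetic_algorithm(init_bits, count_zeros, one_point, flip_one_bit)
print(best, count_zeros(best))
```

Because the representation-specific parts (`init`, `fitness`, `crossover`, `mutate`) are passed in as functions, the same loop can in principle be reused for permutations, binary chains, and real-valued vectors.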
Within this context, the present chapter addresses the different representations of candidate solutions, fitness functions, and reproduction operators for the application of GA to solve the following sets of problems:
• Routing Planning (Section X.2): Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP).
This chapter ends with a discussion of the results and the practical implications of the future work (Section X.5).

Traveling salesman problem
The Traveling Salesman Problem (TSP) represents the scenario of a salesperson who must visit each place within a set of cities or towns. This must be performed with the following considerations: the salesperson starts and ends the whole journey at a single location (i.e., the main office) and must visit each place only once [3]. Although this is the basic understanding of the TSP, the main feature of finding a single route, or sequence of minimum distance or cost, is shared by other real-world applications such as vehicle routing [4], production planning [5], service time [6], and design of computer networks [7]. Figure 3 presents an overview of the TSP model with n = 12 cities.
Note that each single route that complies with the previous restrictions represents a candidate solution, and there are as many as n! candidate solutions if brute-force search were to be considered as the solving method to find the optimal or best solution. Just for the example presented in Figure 3, there are up to 12! = 479,001,600 (approximately 4.79e+008) feasible solutions to visit all 12 cities. This number grows factorially as n increases linearly. Thus, if just a single city is added to the TSP, the number of feasible solutions can increase to 13! = 6.23e+009.
This leads to a solution space that, although finite, is too large to enumerate if large sets of cities are considered. The TSP is an NP-hard problem, which is very difficult to solve to optimality within reasonable time, even with the most advanced computational systems. Thus, different heuristics and meta-heuristics have been developed to provide fast near-optimal solutions. Among these the following can be mentioned [8]: Nearest Neighbor (NN), Simulated Annealing (SA), Tabu Search (TS), Genetic Algorithm (GA), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO) and Tree Physiology Optimization (TPO).
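As a small illustration of the two points above (how a candidate route is evaluated, and how fast the number of candidate routes grows), the following sketch computes the length of a closed tour and the factorial counts quoted in the text. The function name `tour_length`, the use of Euclidean distances, and the four-city example are assumptions made here; they are not taken from the chapter's implementation.

```python
import math

def tour_length(tour, coords):
    """Length of the closed route that visits `tour` in order and returns to the start."""
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = coords[tour[i]]
        x2, y2 = coords[tour[(i + 1) % len(tour)]]  # wraps around to the first city
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# Hypothetical instance: four cities on the corners of a unit square.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(tour_length([0, 1, 2, 3], square))  # perimeter of the square: 4.0
print(tour_length([0, 2, 1, 3], square))  # crossing route, longer than 4.0

# Number of permutations of n cities for n = 12 and n = 13.
print(math.factorial(12))  # 479001600
print(math.factorial(13))  # 6227020800
```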
As presented in [8] GA and SA are among the most suitable heuristics, achieving error gaps from best known solutions within the 10% mark for small (n < 100), moderate (100 < n < 150) and large (150 < n < 450) TSP instances. However, within the context of TSP solutions, it is always recommended to test the solving methods with very large instances (i.e., n > 500) to corroborate their performance.
Thus, the developed GA considers TSP instances with n ≈ 1000. For this purpose, the GA considers the structure presented in Figure 2 with the settings and reproduction operators presented in Table 1 and described in Figure 4. The GA was implemented in MATLAB on an Intel Core i7-5500 CPU at 2.40 GHz with 8 GB RAM. Testing was performed with a set of TSP instances from the TSPLIB95 database [9]. The details of these instances, including the GA's population size N used for each case, are presented in Table 2. The results of the tests can be observed in Figure 5 and Figure 6.
As presented in Figure 5, the mean error gap through all instances begins to decrease as the selection and reproduction mechanisms of the GA start to operate on the initial and updated populations. By the 300th generation the mean error gap falls below the 10% mark, and it finally reaches approximately 7% by the 1000th generation. This is consistent with the performance reported in [8].
Finally, Figure 6 presents the performance of the GA based on the size of the test instances (n). With the settings reported in Table 1, as n increases, the GA takes more time to converge to a local optimum whose error gap, in some cases, is slightly over the 10% mark. Also, the size of the population (N) must be increased to improve the search performance.
Based on these findings, particularly for the TSP with n ≈ 1000, the following recommendations can be made:
• Diversification of solutions depends on the size of the population (N) and the size of the TSP (n). Because N is the only controllable parameter, it is important to find an appropriate balance between it and n, because a large N can increase the memory load of the algorithm, which is already affected by n.
• A larger number of generations should be considered for large TSP problems. This is because convergence may get slower due to n, independently of the reproduction or selection operators, or the size of the population.
• Integration with other heuristics or meta-heuristics can improve the initial population or some of the search operators and, thus, the convergence of the GA through all generations. This process, called hybridization, has led to very suitable results for large TSP instances [10].
As an example of hybridization, Figures 5 and 6 present the performance of the revised GA (hybrid-GA) with a much smaller N (= 50 for all instances) and a Greedy algorithm to improve four offspring (two by crossover, one by flip mutation, one by swap mutation) which are included within the updated population. This increases the speed of the GA, which reaches the 10% mark by the 100th generation, with a final mean error gap of 5% by the 1000th generation. Also, improvement on the large instances (n > 500) is observed, achieving error gaps under the 10% mark.
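The permutation operators and the greedy improvement step mentioned above can be sketched as follows. The exact operators of Table 1 and Figure 4 and the Greedy algorithm of the hybrid-GA are not reproduced in this text, so every function below is an assumption: a plain swap mutation, a flip mutation read as a segment reversal, a simplified order-based crossover, and a 2-opt style local search standing in for the greedy improvement. All names are hypothetical.

```python
import math
import random

def swap_mutation(tour, rng):
    """Exchange the cities at two randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    child = tour[:]
    child[i], child[j] = child[j], child[i]
    return child

def flip_mutation(tour, rng):
    """Reverse the segment between two randomly chosen cut points."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place; fill the rest in the order cities appear in p2."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[i:j + 1])
    filler = [c for c in p2 if c not in kept]
    return filler[:i] + p1[i:j + 1] + filler[i:]

def distance_matrix(coords):
    return [[math.hypot(ax - bx, ay - by) for bx, by in coords] for ax, ay in coords]

def greedy_improve(tour, dist):
    """Apply improving segment reversals (2-opt moves) until none is left.

    Assumes symmetric distances, so reversing a segment changes only two edges.
    """
    n = len(tour)
    best = tour[:]
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a city; nothing to exchange
                a, b = best[i], best[i + 1]
                c, d = best[j], best[(j + 1) % n]
                # Replace edges (a,b) and (c,d) by (a,c) and (b,d) if shorter.
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d] - 1e-12:
                    best[i + 1:j + 1] = best[i + 1:j + 1][::-1]
                    improved = True
    return best

# Hypothetical usage: untangle a crossing route on a unit square.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(greedy_improve([0, 2, 1, 3], distance_matrix(square)))
```

All three reproduction operators return valid permutations, so no repair step is needed before the improvement step is applied to the offspring.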

Capacitated vehicle routing problem
The Capacitated Vehicle Routing Problem (CVRP) represents an extension of the TSP. As shown in Figure 7, the CVRP determines a set of routes that start and end at a specific place or location (e.g., a distribution center). These routes must visit or serve a finite number of locations and meet their demand requirements. Each route must be served by a single vehicle (e.g., a salesperson) with finite capacity, and only one vehicle can serve a location. Thus, the CVRP can be understood as a variant of the multiple-TSP with capacity restrictions [11].
As in the case of the TSP, the CVRP is a combinatorial problem of NP-hard complexity, for which no exact polynomial-time algorithm is known [12]. Due to this, the CVRP has been addressed by different meta-heuristics such as Tabu Search (TS) [13,14], GA [15], SA [16,17], and Particle Swarm Optimization (PSO) [18]. For this case, the GA presented in Figure 2 was modified to solve the CVRP. The GA and its configuration settings are presented in Figure 8 and Table 3 respectively. Note that the reproduction operators remain the same as considered for the TSP. Testing was performed with a set of instances from the CVRPLIB database [19,20]. Table 4 presents the details of the selected instances.
As presented in Figure 9, the mean error gap reaches the 10% mark by the 200th generation, and approximately 8.5% by the 1000th generation. In contrast to the patterns observed in Figure 6, in Figure 10 there is no clear relationship between the size of the instance (n) and the error gap. Thus, there are large instances with very small error gaps (approximately 6%) and medium instances with large error gaps (over 10%). This, however, is expected because there are more tasks to be performed on the CVRP, such as route segmenting and capacity restriction compliance. For this reason, GAs are frequently considered for small CVRP instances (n < 200) [15,21].
Table 4. CVRPLIB instances for GA testing (columns: name of the instance, size of the CVRP (n)).
Based on these findings, particularly for the CVRP with n ≈ 1000, the following recommendations can be made:
• Due to the size of the population and the additional tasks, faster processes are needed for diversification of solutions. For example, Tabu Search (TS) uses small sets of candidate solutions (neighbors) through the consideration of movements (or moves). Also, premature convergence to a local optimum can be reduced by forbidding certain moves (i.e., making them tabu) which would make the algorithm revisit a region within the solution space. This is an advantage when compared to GA, which requires populations of complete candidate solutions, and where avoiding previously obtained solutions may require additional tasks.
• Hybridization can improve the convergence and overall search performance of near-optimal solutions. For example, implementing a tabu mechanism on the population can reduce the rate of previously visited solutions (same solutions) and even dynamically reduce the size of the population.
• Initial convergence of the GA, and overall initial performance, may benefit from an initial population with very suitable solutions. However, this may restrict the diversification of solutions through later generations.
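The additional tasks mentioned for the CVRP (route segmenting and capacity compliance) can be illustrated with a short sketch. It assumes that a candidate solution is a permutation of customers that is cut into routes greedily, opening a new route whenever the next customer would exceed the vehicle capacity. This decoding rule, the function names, and the small instance are assumptions made here for illustration; the chapter's own procedure is the one summarized in Figure 8.

```python
def split_routes(permutation, demand, capacity):
    """Cut a customer permutation into consecutive routes that respect capacity."""
    routes, current, load = [], [], 0
    for customer in permutation:
        if demand[customer] > capacity:
            raise ValueError("a single customer exceeds the vehicle capacity")
        if load + demand[customer] > capacity:
            routes.append(current)      # close the current route
            current, load = [], 0       # and start a new vehicle
        current.append(customer)
        load += demand[customer]
    if current:
        routes.append(current)
    return routes

def cvrp_cost(routes, dist, depot=0):
    """Total distance of all routes, each starting and ending at the depot."""
    total = 0.0
    for route in routes:
        stops = [depot] + route + [depot]
        total += sum(dist[a][b] for a, b in zip(stops, stops[1:]))
    return total

# Hypothetical instance: depot 0 and customers 1..4 placed along a line.
demand = {1: 4, 2: 3, 3: 5, 4: 2}
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
routes = split_routes([1, 2, 3, 4], demand, capacity=7)
print(routes)                   # [[1, 2], [3, 4]]
print(cvrp_cost(routes, dist))  # 12.0
```

With this kind of decoding, the permutation operators used for the TSP can be kept unchanged, and feasibility is enforced when the fitness is evaluated.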

Economic lot quantity with multiple items
In manufacturing, an important aspect is the supply of resources such as raw materials, sub-assemblies, end/final products, etc. The availability of these resources must comply with time and cost restrictions.
Within this aspect, the Economic Lot Quantity (EOQ) models aim to estimate the lot size Q which minimizes the operational costs associated with inventory management. In general, Q minimizes the following cost function:
$$T = \frac{C_o D}{Q} + \frac{C_h Q}{2} \qquad (1)$$
where $C_o$ is the ordering cost per lot, $C_h$ is the holding cost per unit of product, and D is the cumulative demand through a planning horizon [22]. As presented in Figure 11, Q can also be understood as the lot size that equates the total order cost with the total holding cost through a planning horizon (which minimizes T):
$$\frac{C_o D}{Q} = \frac{C_h Q}{2} \qquad (2)$$
Note that Eq. (2) leads to:
$$Q = \sqrt{\frac{2 C_o D}{C_h}} \qquad (3)$$
which computes the optimal value for Q. Now, if N items with independent orders are considered, then:
$$T = \sum_{i=1}^{N} \left( \frac{C_{o,i} D_i}{Q_i} + \frac{C_{h,i} Q_i}{2} \right) \qquad (4)$$
Under the assumption of independence, $Q_i$ can be optimally computed by using Eq. (3) for each item [22]. Thus, for the present case, the GA is only developed to verify its efficiency to solve the EOQ to optimality with a large N.
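As a worked check of the closed-form solution, the sketch below assumes the standard EOQ cost function T(Q) = C_o D / Q + C_h Q / 2, whose minimizer is Q = sqrt(2 C_o D / C_h). The function names and the numerical values are hypothetical and serve only to show that the ordering and holding costs coincide at the optimum, and that the exact value gives a reference against which a GA result can be compared.

```python
import math

def eoq(order_cost, holding_cost, demand):
    """Lot size that minimises the total ordering plus holding cost."""
    return math.sqrt(2.0 * order_cost * demand / holding_cost)

def total_cost(q, order_cost, holding_cost, demand):
    """Ordering cost (C_o * D / Q) plus holding cost (C_h * Q / 2)."""
    return order_cost * demand / q + holding_cost * q / 2.0

# Hypothetical item: C_o = 100 per lot, C_h = 2 per unit, D = 10000 units.
q_star = eoq(100.0, 2.0, 10000.0)
print(q_star)                                    # 1000.0
print(total_cost(q_star, 100.0, 2.0, 10000.0))   # 2000.0

# Independent items: the closed form is applied item by item.
items = [(100.0, 2.0, 10000.0), (50.0, 1.0, 400.0)]
print([eoq(co, ch, d) for co, ch, d in items])   # [1000.0, 200.0]
```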
The GA follows the standard structure presented in Figure 2. As the solution consists of a set of $Q_i$ values, the restrictions associated with permutations (such as in the case of TSP/CVRP) are not present. Thus, a simpler crossover operator can be used. Figure 12 presents an overview of the linear crossover operator used for the GA. On the other hand, Table 5 presents the configuration settings of the GA.
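Figure 12 is not reproduced in this text, so the exact form of the linear crossover is not known here. One common reading, shown below purely as an assumption, is a weighted average of two real-valued parents with a single random weight; the function name `linear_crossover` and the parent values are hypothetical.

```python
import random

def linear_crossover(parent_a, parent_b, rng):
    """Blend two real-valued parents with one random weight in [0, 1]."""
    alpha = rng.random()
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]

# Hypothetical parents: lot sizes for three items.
rng = random.Random(0)
child = linear_crossover([100.0, 250.0, 40.0], [140.0, 200.0, 60.0], rng)
print(child)
```

Under this reading every lot size of the child lies between the corresponding lot sizes of its parents, so no repair is needed, unlike the permutation-based case.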
The average results for different randomly generated sets of N products are presented in Figure 13. As this is a simpler problem than both the TSP and the CVRP, optimality can be reached within 100-200 generations. Note that an exact method should always be preferred if one is available and its results can be obtained within reasonable time.

Knapsack problem
The Backpack or Knapsack Problem (KP) is a binary multicriteria problem of NP-hard computational complexity, and it is frequently considered as a strategy to select items to maximize profits without violating capacity restrictions [23,24].
The KP can be mathematically formulated through a vector of binary variables where $x_j = 1$ if the item j is selected, and $x_j = 0$ otherwise. Then, if $p_j$ is a measure of importance (in this case, profit) for an item j, $w_j$ represents the size of said item, and cv is the size of the backpack, the problem consists of selecting the items whose binary variables $x_j$ maximize the total profit while satisfying the following restrictions [24]:
$$\max \sum_{j=1}^{n} p_j x_j \qquad (5)$$
$$\sum_{j=1}^{n} w_j x_j \le cv \qquad (6)$$
$$x_j \in \{0, 1\}, \quad j = 1, \ldots, n \qquad (7)$$
Table 5. GA settings for the multiple-item EOQ.
The KP can also be extended to consider more restrictions. For example, if cv is the volumetric capacity of the backpack, cz can be added to include its weight capacity. Thus, if $w_j$ represents the volume of the item j, $z_j$ can be used to represent its weight, leading to the following restriction:
$$\sum_{j=1}^{n} z_j x_j \le cz \qquad (8)$$
Figure 14 presents an overview of the reproduction operator for the GA considered to solve a large KP instance. Note that, due to the binary nature of the decision variable, the crossover and mutation operators can be implemented faster. Then, the configuration settings of the GA are reviewed in Table 6.
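A minimal sketch of how a binary chain can be evaluated and recombined for the KP with volume and weight capacities is given below. The handling of infeasible selections (a fitness of zero), the one-point crossover, the bit-flip mutation with a fixed rate, and all names and numbers are assumptions made here for illustration; the operators actually used are those of Figure 14 and Table 6.

```python
import random

def knapsack_fitness(x, profit, volume, weight, cv, cz):
    """Total profit of the selection x, or 0 if any capacity is exceeded."""
    used_volume = sum(w * s for w, s in zip(volume, x))
    used_weight = sum(z * s for z, s in zip(weight, x))
    if used_volume > cv or used_weight > cz:
        return 0  # assumed treatment of infeasible selections
    return sum(p * s for p, s in zip(profit, x))

def one_point_crossover(a, b, rng):
    """Head of parent a followed by the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def bit_flip_mutation(x, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in x]

# Hypothetical instance with three items.
profit = [10, 7, 4]
volume = [5, 4, 3]
weight = [2, 2, 1]
print(knapsack_fitness([1, 0, 1], profit, volume, weight, cv=8, cz=4))  # 14
print(knapsack_fitness([1, 1, 0], profit, volume, weight, cv=8, cz=4))  # 0 (volume 9 > 8)
```

Since any string of zeros and ones is a valid chromosome, both operators work directly on the bits; only the capacity restrictions have to be checked when the fitness is computed.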
Based on the instance reported in [24], six random test instances with N = 250 items were generated. Figure 15 presents the mean results for these instances. Error gap assessment was performed with the optimization software Lingo. This led to an error gap of 4.0% which is consistent with the results reported in [24].

Genetic algorithm for production scheduling problems
This chapter ends with an application of GA for solving one of the most useful models for manufacturing planning. This model, known as the Permutation Flow-Shop Scheduling Problem (PFSP), consists of finding the optimal sequence of N-jobs to be processed on M-machines [25]. The optimal sequence of jobs is the one that minimizes the makespan of the N-jobs through the M-machines, that is, the completion time of the last job on the last machine. Note that this sequencing implies two important restrictions: (a) no job can be started on the following machine until it is finished on the previous machine; and (b) a job cannot be started on a machine if it is busy processing another job. As a consequence, this is one of the main strategies to reduce idle and waiting times within a workshop [26].
For illustration purposes, Figure 16 shows an example of a solution for a 5-jobs (a, b, c, d, e) and 3-machines (1, 2, 3) PFSP. Note that each job may take different processing times depending on the assigned machine, and the established sequence remains the same for all machines. Thus, the established sequence has a direct effect on the completion time or makespan.
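Restrictions (a) and (b) translate directly into a recurrence for the completion times: a job starts on a machine only when it has left the previous machine and the machine has finished the previous job. The sketch below applies this recurrence to compute the makespan of a job sequence. The function name `makespan` and the small 3-jobs, 2-machines instance are hypothetical and are not taken from Figure 16 or Table 7.

```python
from itertools import permutations

def makespan(sequence, proc):
    """Completion time of the last job on the last machine.

    proc[job][machine] is the processing time of `job` on `machine`.
    """
    n_machines = len(proc[0])
    completion = [0] * n_machines  # completion time of the latest job per machine
    for job in sequence:
        for k in range(n_machines):
            # (a) the job must have left machine k-1; (b) machine k must be free.
            ready = completion[k - 1] if k > 0 else 0
            completion[k] = max(completion[k], ready) + proc[job][k]
    return completion[-1]

# Hypothetical instance: 3 jobs, 2 machines.
proc = [[3, 2], [1, 4], [2, 1]]
print(makespan([0, 1, 2], proc))  # 10
print(makespan([1, 0, 2], proc))  # 8

# For such a small instance the optimum can be verified by enumeration.
print(min(makespan(p, proc) for p in permutations(range(3))))  # 8
```

In a GA for the PFSP this function plays the role of the fitness function, while the permutation operators used for the TSP can be reused on the job sequence.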
The information (i.e., processing times) of a PFSP with N-jobs and M-machines is frequently presented as shown in Table 7. As in the case of the TSP/CVRP models, the PFSP is also of NP-hard computational complexity; thus, meta-heuristic methods are frequently considered to solve it within reasonable time.
As it is a permutation-based problem, the structure and settings considered for the TSP GA (see Figure 2 and Table 1) were considered for the PFSP with 500 generations. For testing purposes, the library and best results reported in [27] for 30 randomly selected 20-jobs, 20-machines PFSP instances were considered. The results are presented in Figure 17.
As observed, the mean error gap reaches the 10% mark at the beginning of the GA, with a final mean error gap of 0.005% by the 500th generation. Thus, the GA can provide near-optimal results for the PFSP.

Conclusions and future work
In this chapter the basic elements of a GA were reviewed to describe its application to different logistics and manufacturing problems. The routing problems, beyond the transportation context, can be applied to machine maintenance schemes or material changing services within production plants to minimize operational times. Also, they can be applied to improve the material flow through the warehouse, which is a main facility within the SC. Operations such as order-picking and bin-shelving can be optimized by modeling them as TSP instances [28].
On the other hand, the KP for selection of items is a problem shared with other contexts such as waste reduction in cutting processes, selection of investments and portfolios, decisions for capital budgeting and asset-backed securitization [29]. The PFSP has also been extended to other fields, such as the scheduling of quality control tasks on different machines [30]. Thus, solving these combinatorial problems, particularly those of large scale, is very important due to their impact on other scientific and industrial fields.
Within the search algorithms, the GA can provide very suitable results for these problems. However, as presented in Sections X.2, X.3, and X.4, final performance depends on the type of problem. While the GA can achieve mean error gaps under the 10% mark for TSP/CVRP, for the PFSP the GA can achieve near-optimal results under the 1% mark.
These results were supported by extensive experiments which were performed with well-known test databases or libraries. In practice, these experiments also provide important feedback to consider alternative meta-heuristics or develop hybrid approaches for improvement of performance. This is because, as reviewed, a single meta-heuristic or search algorithm may not be enough to solve all problems if near-optimality is required. In this case, hybridization between different methods has improved the search mechanisms of meta-heuristics, either deterministic or stochastic. Also, the integration with mathematical programming (which implies an exact solving method) has provided innovative proposals to solve NP-hard problems [31].
Future work is extensive in this field because:
• better solving methods are required due to the presence of increasingly complex combinatorial problems;
• advanced mathematical modeling is required to reduce the complexity of NP-hard problems and thus make them more amenable to optimization through meta-heuristics or exact methods such as Branch & Bound;
• automatic decision models require the use of Big Data Analysis which, to some extent, depends on meta-heuristic methods.
Thus, as a concluding remark, it can be stated that any advance on these algorithms can have an impact on different fields. Just to mention an important field within the current industry, meta-heuristics are playing an important role in the implementation of dynamic decision models within Industry/Manufacturing 4.0 systems. Within this context, recent works have reported the application and improvement of these search algorithms for cost-efficient deployment of computing systems in logistics centers [32], dynamic CVRP [33], and development of Digital-Twin platforms [34].