## 1. Introduction

The theory of the computational complexity is the part of the theory of the computation that studies the resources required during the calculation to solve a problem (Cook, 1983). The resources commonly studied are the time (execution number of an algorithm to solve a problem) and the space (amount of resources to solve a problem). In this area exist problems classifications that are approached within the theory of the complexity, some definitions of the complexity classes are related to this investigation:

P class.-It is the class of recognizable languages by a determinist Turing Machine of one tape in polynomial time (Karp, 1972).

NP class.-It is the class of recognizable languages by a Non-determinist Turing Machine of one tape in polynomial time (Karp, 1972).

NP-equivalent class.-It is the class of problems that are considered NP-easy and NP-hard (Jonsson & Bäckström, 1995).

NP-easy class.-It is the class of problems that are recognizable in polynomial time by a Turing Machine with one Oracle (subroutine). In other words a problem

*X*is NP-easy if and only if a*Y*problem exists in NP like*X*is reducible Turing in polynomial (Jonsson & Bäckström, 1995).NP-hard class.-A Q problem is NP-hard if each problem in NP is reducible to Q (Garey & Johnson, 1979; Papadimitriou & Steiglitz, 1982). It is the class of problems classified as problems of combinatorial optimization at least as complex as NP.

NP-complete class.-A

*L*language is NP-complete if*L*is in NP, and Satisfiability ≤_{p}*L*(Cormen et al., 2001; Karp, 1972; Cook, 1971). It is the class of problems classified like decision problems.

A combinatorial optimization problem is either a minimization problem or a maximization problem and consists of three parts: a) a set of instances, b) candidate solutions for each instance, c) a solution value (Garey & Johnson, 1979). The combinatorial optimization problems that was used in this paper: the General Asymmetric Traveling Salesman Problem (ATSP), JobShop Scheduling Problem (JSSP) and Vehicle Routing Problem with Time Windows (VRPTW).

The general Asymmetric Traveling Salesman Problem (ATSP), which can be stated as follows: given a set of nodes and distances for each pair of nodes, find a route of minimal overall length that visits each of the nodes exactly once (Cormen et al., 2001). The distance of node *i* to node *j* and the distance of node *j* to node *i* can be different (Equation 1, 2, 3, 4).

The JobShop Scheduling Problem (JSSP) contains a number of machines and a set of Jobs each one with precedence restrictions, the problem is to solve the question if exist a scheduling of jobs that help to improve and to efficiency the use of the machines being eliminated the idle times. It is recognized by that it does not have to be able human nor machine sufficiently fast that it can obtain the optimal solution for JSSP due to the solutions space, which cannot be expressed by a polynomial function (deterministic algorithm), the space of solutions for this kind of problem can be only expressed like an exponential function. For the problem of JSSP is necessary to diminish makespan (*cmax*), this can be formulated as it follows (Equation 5, 6, 7, 8):

The Vehicle Routing Problem with Time Windows (VRPTW) is a combinatorial optimization problem complex (Toth & Vigo, 2001; Ruiz-Vanoye et al., 2008b; Cruz-Chávez et al., 2008). The VRPTW (Toth & Vigo, 2001) consists basically of to minimize the costs of subject transportation to time restrictions of each route and capacity on the cradle of the demand of each client (Equation 9 and 10).

In this paper, we propose specify the complexity of instances for problems of combinatorial optimization (ATSP, VRPTW, and JSSP).

## 2. Related works

An important measurement of complexity that we can attribute to Shannon (1949), is the complexity of Boolean circuits. For this measurement he is advisable to assume that *f* function at issue transforms finite strings of bits into finite strings of bits, and the complexity of *f* is the size of the smaller Boolean circuit than it calculates *f* for all the inputs of *n* length. But, it does not exist a classification of the complexity of instances for combinatorial optimization problems.

In some combinatorial optimization problems exists diverse types of instances size at the moment, for example:

REINELT (Reinelt, 1991) mentions that in General Asymmetric Traveling Salesman Problem (ATSP) instances exist whose size based on the number of cities or *nc*. See Table 1.

YAMADA (Yamada & Nakano, 1997) mentions that for the problem JobShop Scheduling Problem (JSSP) the size of the problem is the number of jobs or *J* and the number of machines or *M*. See Table 2.

SOLOMON (Solomon, 1987; Toth & Vigo, 2001) mentions that in the problem Vehicle Routing Problem with Time Windows (VRPTW) exists classifications of type C for instances clustered, type RC for instances Random and Clustered, Type R for Random instances and in addition the instance is determining by the number of clients or *CN*. See Table 3.

## 3. Complexity of Instances for Combinatorial Optimization Problems

In this section, we propose a basic methodology (Figure 1) for create the metric that permit to classify or to determine the complexity instances of combinatorial optimization problems.

Step 1. To identify the maximum instance solved of the problem.

Step 2. To identify the instances parameters of the problem.

Step 3. Elaboration of the general metric of instances complexity taking into account measured of descriptive statistics and the parameters of the problems (Equation 11).

Where: *PS* is the instance average size at the rate of the instance maximum solved of the problem, *AM* is the sum of the arithmetic mean of each instance parameters (that have several elements), *SD* is the sum of the standard deviations of each type of parameter (that have several elements), *GD* is the sum of the genetic distances which quantifies the similarity or differentiates between the populations from the frequencies of the elements from each parameters (that have several elements) of the problem, *RA* is the sum of the existing reasons between the greatest value and the value smaller of each parameter (that has several elements) of the problem.

## 4. Experimentation

The experimentation was performed on a computer with Celeron processor 1.5GHz, 512 MB of memory, 80 GB of hard disk.

We use: the ATSP instances obtained of TSPLIB (Reinelt, 1991) benchmark, the JSSP instances of JSSPLIB (Yamada & Nakano, 1997) benchmark, the VRPTW instances of Solomon (Solomon, 1987) benchmark, and a genetic algorithm from Heuristics Lab (Wagner & Affenzeller, 2004) software.

A genetic algorithm (GA) is one of the heuristic methods used to find approximate solutions to NP-complete problems, GA is inspired by the Darwinian principles of the evolution of the species, and use own techniques of the genetics, such as: inheritance, mutation, natural selection and recombination (or crossover). The simplest form of genetic algorithm involves three types of operators: selection, crossover (single point), and mutation (Holland, 1975; Mitchell, 1998). Selection, this operator selects chromosomes in the population for reproduction. The fitter the chromosome, the more times it is likely to be selected to reproduce. Crossover, this operator randomly chooses a locus and exchanges the subsequences before and after that locus between two chromosomes to create two offspring. The crossover operator roughly mimics biological recombination between two single-chromosome organisms. Mutation, this operator randomly flips some of the bits in a chromosome. Mutation can occur at each bit position in a string with some probability. The genetic algorithm (Wagner & Affenzeller, 2004) was used to verify the time and the quality of instances solution with the purpose of determining if the metric generated classify in complexity terms. The input parameters were: selection operator = Roulette, crossover operator = OX, mutation operator = Simple Inversion, generations = 1000, population size = 100, mutation rate = 0.05, replacement strategy = Elitism, crossover rate = 1, n-Elitism = 1, tournament group size = 2.

### 4.1. Experimentation in ATSP

Using the process contained in the general methodology to ATSP, the steps they would be of the following way:

Step 1. The maximum instance solved for ATSP is of 1.904.711 cities or World TSP (Applegate et al., 2006).

Step 2. The instances parameters of general ATSP can be codified in a hypothetical language: *L* = “*nc*, *d*_{1}, *d*_{2}, …, *d _{z}*”. In other words two general parameters

*nc*= number of cities and

*d*=distances between the cities.

_{z}Step 3. Elaboration of the instances complexity metric for ATSP. In order to create the complexity general metric, for which it is necessary to determine *PS* (Equation 12), *AM* (Equation 13), *SD* (Equation 14), *GD* (Equation 15), *RA* (Equation 16) indicators.

Where: *PS* is the average size of the instance at the rate of the instance maximum solved of ATSP, *nc* is the value of the problem to solve and *ncmax* is the size of greater instance solved.

Where: *AM* is the sum of the arithmetic mean of each parameter (that has several elements) of the instance, *d _{i}* are the elements of the parameter that has several elements called matrix of cities and

*nc*is the number of cities.

Where: *SD* is the sum of the standard deviations of each type of parameter (that has several elements), *d _{i}* = Value of distances between cities

*i*,

*nc*= number of cities,

*AM*= index of the arithmetic mean of distances.

Where: *GD* is the sum of the genetic distances which quantifies the similarity or differentiates between the populations from the frequencies of the elements from each parameter of the problem.

Where: *RA* is the sum of the existing reasons between the greatest value and the smallest value of each parameter (that has several elements) of the NP-hard problem, min *d _{i}* = minimal distance between cities, max

*d*= value of maximum distances between cities.

_{i}In Table 4 are the results of applying the instances complexity metric (IC) on ATSP instances, in addition to the summary of the obtained result to execute algorithm GA with the ATSP instances.

Where: *d* is the difference between best know and the obtained solution, *t* is the solution time for the instance on the GA algorithm. In the instance complexity (*IC*) between ft70 and ftv70 instances exist different values for instances with equal number of cities (*nc*), indicates that the ft70 instance is more complex than ftv70 instance and is verified with the obtained solution to apply the GA algorithm to the instances. In addition to being able to compare between the ftv70 and ftv64 instances, the ftv70 instance is more complex than ftv64 instance.

### 4.2. Experimentation in JSSP

Using the methodology with problem JSSP, are the following steps:

Step 1. The maximum instance resolved is of 20 by 20 symmetrical (Yamada & Nakano, 1997).

Step 2. The parameters of the instances of the JSSP can be codified in a hypothetical language: *L* = “*nj*, *nm*, *J*_{1}(*m*_{0}, *pt*_{0},…..m_{nm-1}, pt_{nm-1}),…, J_{nj} (m_{0}, pt_{0},..., m_{nm-1}, pt_{nm-1}). Where nj = Number of jobs, nm = number of machines, J = Jobs, m = machine and pt = processing time.

Step 3. Elaboration of the complexity metric for JSSP. In order to create the complexity general metric is used the Equation 1, for which it is necessary to determine PS (Equation 17), AM (Equation 18), SD (Equation 19), DG (Equation 20), RA (Equation 21) indicators.

Where: PS is the so large average of the instance at the rate of the maximum instance from problem JSSP. And MAXJSSP = the maximum instance solved in literature or MAX(nj ∗ nm).

Where the AM metric is the sum of the arithmetic mean of the m, pt parameters.

Where the SD metric contains the sum of the standard deviations of the m, pt, J parameters. And MA is the arithmetic mean of m parameter; MApt is the arithmetic mean of pt parameter.

Where the GD metric contains the genetic distances between the populations by the frequencies of the parameters (m, pt).

Where the RA metric contains the sum of the reasons between the minimal and maximuml values of the m and pt parameters.

The Table 5 contains the results of applied the complexity metric for JSSP, and the results of the GA on JSSP instances. This can be observed that in the Instance Complexity (IC) between la01 and la05 instances exist different values with equal Number of Jobs and Number of Machines, indicates that the la01 instance is more complex than the la05 instance is verified with the quality of the obtained solution. Also sample that in between the yn1 and yn4 instances, the yn4 instance is more complex that yn1 instance. Where: d is the difference between best know and the obtained solution, t is the solution time for the instance on the GA algorithm.

### 4.3. Experimentation in VRPTW

Using the methodology with VRPTW, the following steps are obtained:

Step 1. The maximum instance solved of problem VRP is F-n135-k7, with VN = 12, C = 2210, CN = 135 (Toth & Vigo, 2001).

Step 2. The parameters of the instances of problem VRPTW can be codified in a hypothetical language: L = “VN, C, (CN_{1}, XCO_{1}, YCO_{1}, D_{1}, RT_{1}, DT_{1}, ST_{1}, …, CN_{z}, XCO_{z}, YCO_{z}, D_{z}, RT_{z}, DT_{z}, ST_{z})”. Where VN = Vehicle Number, C = Capacity, CN = Customer Number, XCO = X Coord., YCO = Y Coord., D = Demand, RT = Ready Time, DT = Due date, ST = Service Time.

Step 3: Elaboration of the complexity metric for problem VRPTW. In order to create the complexity general metric the Equation 1 is used, for which it is necessary to determine PS (Equation 22), AM (Equation 23), SD (Equation 24), GD (Equation 25), RA (Equation 26) indicators.

Where: PS = is the problem size of the instance at the rate of the maximum instance solved from VRPTW, max(VN * C * CN_{n}) = the value of the maximum instance solved by literature.

Where: In AM contains the sum of the arithmetic mean of the XCO, YCO, D, RT, DT y ST parameters.

(24) |

Where: SD contains the sum of the standard deviations of the XCO, YCO, D, RT, DT y ST parameters. AMXCO is the arithmetic mean of XCO parameter, AMYCO is the arithmetic mean of YCO parameter, AMD is the arithmetic mean of D parameter, AMRT is the arithmetic mean of RT parameter, AMDT is the arithmetic mean of DT parameter, and AMST is the arithmetic mean of ST parameter.

Where: GD contains the genetic distances between populations from the frequencies of each parameter (XCO, YCO, D, RT, DT y ST).

(26) |

Where RA contains the sum of the reasons between the minimal and maximuml values of the XCO, YCO, D, RT, DT y ST parameters.

In Table 6 are the results of applying the complexity metric on VRPTW instances, in addition to the summary of the obtained result to execute algorithm GA with the VRPTW instances. Where: d is the difference between best know and the obtained solution, t is the solution time for the instance on the GA algorithm, VN = Vehicle Number, C = Capacity, CN = Customer Number. It can be observer that in the Instance Complexity (IC) between c204 and c201 different values for instances with equal Vehicle Number Capacity and Customer Number exist, this indicates that the instance c204 is more complex than c201 and is verified with the obtained solution to apply GA to the instances.

It is necessary to mention that the values of complexity of the instances of single VRPTW are comparable between instances of the same problem.

## 5. Conclusions

We can conclude that the creation of a mathematical expression based on the descriptive statistics able is possible to measure the complexity the instances of combinatorial optimization problems, in this paper was demonstrated for ATSP, VRPTW and JSSP. It is necessary to mention that, the values of instances complexity of ATSP, VRPTW and JSSP problems are only comparable between instances of the same problem.

The intention of this paper was to classify the complexity of the instances and not therefore the complexity between problems NP, as future works, we propose to validate the methodology for other NPs problems.