Using Grid Computing for Constructing Ternary Covering Arrays


Table 1. Parameters of Web-based system example.

Fig. 1 shows the CA corresponding to CA(9; 2, 4, 3); given that its strength and alphabet are t = 2 and v = 3, respectively, the combinations that must appear at least once in each subset of size N × t are the $v^t = 9$ ordered pairs of symbols from {0, 1, 2}. Finally, to make the mapping between the CA and the Web-based system, every possible value of each parameter in Table 1 is labeled by the row number. Table 2 shows the corresponding pair-wise test suite; each of its nine experiments is analogous to one row of the CA shown in Fig. 1.

The trivial mathematical lower bound for a covering array is $v^t \le \mathrm{CAN}(t, k, v)$; however, this number is rarely achieved. Therefore, determining achievable bounds is one of the main research lines for CAs. Given the values of t, k, and v, the optimal CA construction problem (CAC) consists in constructing a CA(N; t, k, v) such that the value of N is minimized.
The construction of CAs attaining CAN(2, k, 2) can be done efficiently according to Kleitman & Spencer (1973); the same is possible for CA(2, k, v) when the cardinality of the alphabet is $v = p^n$, where p is a prime number and n a positive integer value (Bush, 1952). However, in the general case determining the covering array number is known to be a hard combinatorial problem (Colbourn, 2004; Lei & Tai, 1998). This means that there is no known efficient algorithm to find an optimal CA for any level of interaction t or alphabet v. For the values of t and v for which no efficient algorithm is known, we use approximate algorithms to construct them. Some of these approximate strategies must verify that the matrix they are building is a CA. If the matrix is of size N × k and the interaction is t, there are $\binom{k}{t}$ different combinations of columns, which implies a cost of $O(N \times \binom{k}{t})$ for the verification (when the matrix has $N \ge v^t$ rows; otherwise it can never be a CA and its verification is pointless). For small values of t and v the verification of CAs can be handled with sequential approaches; however, when we try to construct CAs of moderate values of t, v and k, the time spent by those approaches is impractical. This scenario shows the necessity of Grid strategies to construct and verify CAs.
Grid Computing is a technology which allows sharing resources between different administration domains, in a transparent, efficient and secure way. The resources comprise computation hardware (supercomputers or clusters) or storage systems, although it is also possible to share information sources, such as databases or scientific equipment. So, the main concept behind the Grid paradigm is to offer a homogeneous and standard interface for accessing these resources. In that sense, the evolution of Grid middleware has enabled the deployment of Grid e-Science infrastructures delivering large computational and data storage capabilities. The current infrastructures rely mainly on Globus Toolkit (Globus Alliance, 2011), UNICORE (Almond & Snelling, 1999), GRIA (Surridge et al., 2005) or gLite (gLite, 2011) as core middleware supporting several central services dedicated to user management, job metascheduling, data indexing (cataloguing) and the information system, providing a consolidated virtual view of the whole or larger parts of the infrastructure. The availability of hundreds and thousands of processing elements (PEs) and the efficient storage of Petabytes of data are expanding the knowledge in areas such as particle physics, astronomy, genetics or software testing. Thus, Grid Computing infrastructures are a cornerstone of current scientific research.
This work reports the use of Grid Computing through the European production infrastructure provided by the European Grid Infrastructure (EGI) (EGI, 2011) project. The availability of this kind of computing platform makes feasible the execution of computing-intensive applications, such as the construction and verification of CAs. In this work we focus on the construction of ternary CAs when 5 ≤ k ≤ 100 and 2 ≤ t ≤ 4.
The chapter is structured as follows. First, Section 2 offers a review of the relevant related work. Then, the algorithm for the verification of CAs is presented in Section 3. Section 4 details the algorithm for the construction of CAs by using a simulated annealing algorithm. Next, Section 5 explains how to parallelize the previous algorithm using a master-slave approach. Building on this parallelization, Section 6 describes how to develop a Grid implementation of the construction of CAs. The results obtained in the experiments performed in the Grid infrastructure are shown in Section 7. Finally, Section 8 presents the conclusions derived from the research presented in this work.

Relevant related work
Because of the importance of the construction of (near) optimal CAs, much research has been carried out in developing effective methods for constructing them. There are several reported methods for constructing these combinatorial models. Among them are: (a) direct methods, (b) recursive methods, (c) greedy methods, and (d) metaheuristic methods. In this section we describe the work most relevant to the construction of CAs.
Direct methods construct CAs in polynomial time and some of them employ graph or algebraic properties. There exist only some special cases where it is possible to find the covering array number using polynomial order algorithms. Bush (1952) reported a direct method for constructing optimal CAs that uses Galois finite fields, obtaining all CA($q^t$; t, q + 1, q) where q is a prime or a prime power and q > t. Rényi (1971) determined sizes of CAs for the case t = v = 2 when N is even. Kleitman & Spencer (1973) and Katona (1973) independently determined covering array numbers for all N when t = v = 2. Williams & Probert (1996) proposed a method for constructing CAs based on algebraic methods and combinatorial theory. Sherwood (2008) described some algebraic constructions for strength-2 CAs developed from index-1 orthogonal arrays, ordered designs and CAs. Another direct method that can construct some optimal CAs is named zero-sum (Sherwood, 2011). Zero-sum leads to CA($v^t$; t, t + 1, v) for any t > 2; note that the degree is a function of the strength. Recently, cyclotomic classes based on Galois finite fields have been shown to provide examples of binary CAs, and more generally examples are provided by certain Hadamard matrices (Colbourn & Kéri, 2009).
Recursive methods build larger CAs from smaller ones. Williams (2000) presented a tool called TConfig to construct CAs. TConfig constructs CAs using recursive functions that concatenate small CAs to create CAs with a larger number of columns. Moura et al. (2003) introduced a set of recursive algorithms for constructing CAs based on CAs of small sizes. Some recursive methods are product constructions (Colbourn & Ling, 2009; Colbourn et al., 2006; Martirosyan & Colbourn, 2005). Colbourn & Torres-Jimenez (2010) presented a recursive method to construct CAs using perfect hash families. The advantage of the recursive algorithms is that they construct almost minimal arrays for particular cases in a reasonable time. Their basic disadvantages are a narrow application domain and the impossibility of specifying constraints.
The majority of commercial and open source test data generating tools use greedy algorithms for CAs construction (AETG (Cohen et al., 1996), TCG (Tung & Aldiwan, 2000), IPOG (Lei et al., 2007), DDA (Bryce & Colbourn, 2007) and All-Pairs (McDowell, 2011)). AETG popularized greedy methods that generate one row of a covering array at a time, attempting to select the best possible next row; since that time, TCG and DDA algorithms have developed useful variants of this approach. IPOG instead adds a factor (column) at a time, adding rows as needed to ensure coverage. The greedy algorithms provide the fastest solving method.
A few Grid approaches have been found in the literature. Torres-Jimenez et al. (2004) reported a mutation-selection algorithm over Grid Computing for constructing ternary CAs. Younis et al. (2008) presented a Grid implementation of the modified IPOG algorithm (MIPOG). Calvagna et al. (2009) proposed a solution for executing the reduction algorithm over a set of Grid resources.
Metaheuristic algorithms are capable of solving a wide range of combinatorial problems effectively, using generalized heuristics which can be tailored to suit the problem at hand. Heuristic search algorithms try to solve an optimization problem by the use of heuristics. A heuristic search is a method of performing a minor modification of a given solution in order to obtain a different solution.
Some metaheuristic algorithms, such as TS (Tabu Search) (Gonzalez-Hernandez et al., 2010; Nurmela, 2004), SA (Simulated Annealing) (Cohen et al., 2003; Martinez-Pena et al., 2010; Torres-Jimenez & Rodriguez-Tello, 2012), GA (Genetic Algorithm) and ACA (Ant Colony Optimization Algorithm) (Shiba et al., 2004) provide an effective way to find approximate solutions. Indeed, an SA metaheuristic has been applied by Cohen et al. (2003) for constructing CAs. Their SA implementation starts with a randomly generated initial solution M whose cost E(M) is measured as the number of uncovered t-tuples. A series of iterations is then carried out to visit the search space according to a neighborhood. At each iteration, a neighboring solution M′ is generated by changing the value of the element $a_{i,j}$ to a different legal member of the alphabet in the current solution M. The cost of this iteration is evaluated as ΔE = E(M′) − E(M). If ΔE is negative or equal to zero, then the neighboring solution M′ is accepted. Otherwise, it is accepted with probability $P(\Delta E) = e^{-\Delta E / T_n}$, where $T_n$ is determined by a cooling schedule. In their implementation, Cohen et al. use a simple geometric function $T_n = 0.9998\,T_{n-1}$ with an initial temperature fixed at $T_i = 0.20$. At each temperature, 2000 neighboring solutions are generated. The algorithm stops either if a valid covering array is found, or if no change in the cost of the current solution is observed after 500 trials. The authors justify their choice of these parameter values based on some experimental tuning. They conclude that their SA implementation is able to produce smaller CAs than other computational methods, sometimes improving upon algebraic constructions. However, they also indicate that their SA algorithm fails to match the algebraic constructions for larger problems, especially when t = 3. Some of these approximate strategies must verify that the matrix they are building is a CA.
If the matrix is of size N × k and the interaction is t, there are $\binom{k}{t}$ different combinations of columns, which implies a cost of $O(N \times \binom{k}{t})$ (given that the verification cost per combination is O(N)). For small values of t and v the verification of CAs can be handled with sequential approaches; however, when we try to construct CAs of moderate values of t, v and k, the time spent by those approaches is impractical. For example, when t = 5, k = 256 and v = 2 there are 8,809,549,056 different combinations of columns, which require days for their verification. This scenario shows the necessity of grid strategies to solve the verification of CAs.
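The figure quoted above can be checked directly. The short sketch below counts the column combinations and the total number of row scans; the value N = 32 (the lower bound $v^t$ for v = 2, t = 5) is only an illustrative assumption, not a size reported in this chapter.

```python
from math import comb


def verification_cost(N, k, t):
    """Return (number of t-column combinations, total row scans N * C(k, t))."""
    combos = comb(k, t)
    return combos, N * combos


# N = 32 is a hypothetical row count used only to illustrate the growth.
combos, scans = verification_cost(N=32, k=256, t=5)
print(combos, scans)
```

Even at this minimal N, the number of row scans is already in the hundreds of billions, which is consistent with the motivation for distributing the work.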
The next section presents an algorithm for verifying whether a given matrix is a CA. The algorithm is designed for implementation on grid architectures.

An algorithm for the verification of covering arrays
In this section we describe a grid approach for the problem of verification of CAs. See (Avila-George et al., 2010) for more details.
A matrix M of size N × k is a CA(N; t, k, v) if and only if every t-tuple of columns contains the set of combinations of symbols described by $\{0, 1, ..., v - 1\}^t$. We propose a strategy that uses two data structures called P and J, and two injections between the sets of t-tuples and combinations of symbols, and the set of integer numbers, to verify that M is a CA.
Let $C = \{c_1, c_2, ..., c_{\binom{k}{t}}\}$ be the set of the different t-tuples. A t-tuple $c_i = \{c_{i,1}, c_{i,2}, ..., c_{i,t}\}$ is formed by t numbers; each number $c_{i,j}$ denotes a column of matrix M. The set C can be managed using an injective function $f(c_i) : C \to I$ between C and the integer numbers; this function is defined in Eq. 1.
Now, let $W = \{w_1, w_2, ..., w_{v^t}\}$ be the set of the different combinations of symbols, where $w_i \in \{0, 1, ..., v - 1\}^t$. The injective function $g(w_i) : W \to I$ is defined in Eq. 2. The function $g(w_i)$ is equivalent to the transformation of a v-ary number to the decimal system.
The use of the injections represents an efficient method to manipulate the information that will be stored in the data structures P and J used in the verification process of M as a CA. The matrix P is of size $\binom{k}{t} \times v^t$ and it counts the number of times that each combination appears in M in the different t-tuples. Each row of P represents a different t-tuple, while each column corresponds to a different combination of symbols. The management of the cells $p_{i,j} \in P$ is done through the functions $f(c_i)$ and $g(w_j)$; while $f(c_i)$ retrieves the row related to the t-tuple $c_i$, the function $g(w_j)$ returns the column that corresponds to the combination of symbols $w_j$. The vector J is of size t and it helps in the enumeration of all the t-tuples $c_i \in C$. Table 3 shows an example of the use of the function $g(w_j)$ for the covering array CA(9; 2, 4, 3) (shown in Fig. 1). Column 1 shows the different combinations of symbols. Column 2 contains the operation from which the equivalence is derived. Column 3 presents the integer number associated with that combination.
Table 3. Mapping of the set W to the set of integers using the function $g(w_j)$ in CA(9; 2, 4, 3) shown in Fig. 1.
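As an illustration of the mapping just described, the following sketch converts a combination of symbols, read as a v-ary number, into an integer. The digit order (leftmost symbol most significant) is an assumption made here; the chapter only states that g is equivalent to a v-ary to decimal conversion.

```python
from itertools import product


def g(w, v):
    """Map a combination of symbols w (digits in base v) to an integer.

    Assumption: the leftmost symbol is the most significant digit.
    """
    value = 0
    for symbol in w:
        value = value * v + symbol  # Horner's rule for base-v conversion
    return value


# All combinations for t = 2, v = 3, as in CA(9; 2, 4, 3).
for w in product(range(3), repeat=2):
    print(w, g(w, 3))
```

Under this assumption the nine ternary pairs map onto the integers 0 to 8 without collisions, which is what makes g usable as a column index of P.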
The matrix P is initialized to zero. The construction of matrix P is direct from the definitions of $f(c_i)$ and $g(w_j)$; it counts the number of times that a combination of symbols $w_j \in W$ appears in each subset of columns corresponding to a t-tuple $c_i$, and increases the value of the cell $p_{f(c_i), g(w_j)} \in P$ by that number.
Table 4(a) shows the use of the injective function $f(c_i)$. Table 4(b) presents the matrix P of CA(9; 2, 4, 3). The different combinations of symbols $w_j \in W$ are in the first rows. The number appearing in each cell referenced by a pair $(c_i, w_j)$ is the number of times that combination $w_j$ appears in the set of columns $c_i$ of the matrix CA(9; 2, 4, 3).
In summary, to determine whether a matrix M is a CA, the number of different combinations of symbols per t-tuple is counted using the matrix P. The matrix M is a CA if and only if the matrix P contains no zero.
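The counting procedure summarized above can be sketched in a few lines. This is a simplified sequential illustration, not the chapter's implementation: it builds one row of P at a time instead of storing the whole matrix, enumerates t-tuples with `itertools.combinations` rather than through f, and uses a hypothetical 9 × 4 ternary array that is not necessarily the one in Fig. 1.

```python
from itertools import combinations


def count_missing(M, t, v):
    """Return the number of zero cells in P, i.e. missing combinations in M."""
    k = len(M[0])
    missing = 0
    for cols in combinations(range(k), t):   # one row of P per t-tuple
        p_row = [0] * (v ** t)               # one cell per combination of symbols
        for row in M:
            index = 0
            for c in cols:                   # base-v conversion, as done by g
                index = index * v + row[c]
            p_row[index] += 1
        missing += p_row.count(0)
    return missing


# Hypothetical array: rows (a, b, a+b, a+2b) mod 3 for all a, b in {0, 1, 2}.
M = [[a, b, (a + b) % 3, (a + 2 * b) % 3] for a in range(3) for b in range(3)]
print(count_missing(M, t=2, v=3))
print(count_missing(M[:-1], t=2, v=3))
```

The full array has no missing combinations, while dropping any single row leaves one uncovered pair in each of the six column pairs.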
The grid approach takes as input a matrix M and the parameters N, k, v, t that describe the CA that M is expected to be. The algorithm also requires the sets C and W. It outputs the total number of combinations that M is missing in order to be a CA. Algorithm 1 shows the pseudocode of the grid approach for the problem of verification of CAs; in particular, it shows the process performed by each core involved in the verification. The strategy followed by Algorithm 1 is simple: it involves a block distribution model of the set of t-tuples. The set C is divided into n blocks, where n is the number of processors; the size of a block B is equal to $\lceil |C| / n \rceil$. The block distribution model keeps the code simple; it allows the assignment of each block to a different core, such that the sequential algorithm (SA) can be applied to verify each block.
Algorithm 1: Grid approach to verify CAs. This algorithm assigns the set of t-tuples C to size different cores.
Input: A covering array file, the number of processors (size) and the current processor id (rank).
Result: A file with the number of missing combinations of symbols.

The t_wise function first counts, for each different t-tuple $c_i$, the times that a combination $w_j \in W$ is found in the columns of M corresponding to $c_i$. After that, it calculates the missing combinations $w_j \in W$ in $c_i$. Finally, it transforms $c_i$ into $c_{i+1}$, i.e. it determines the next t-tuple to be evaluated.
The pseudocode for the t_wise function is presented in Algorithm 2. For each different t-tuple (lines 5 to 28) the function performs the following actions: it counts the number of times a combination $w_j$ appears in the set of columns indicated by J (lines 6 to 14, where the combination $w_j$ is the one appearing in $M_{n,J}$, i.e. in row n and t-tuple J); then, the counter covered is increased by the number of different combinations with a number of repetitions greater than zero (lines 10 to 12). After that, the function calculates the number of missing combinations (line 15). The last step of each iteration of the function is the calculation of the next t-tuple to be analyzed (lines 16 to 27). The function ends when all the t-tuples have been analyzed (line 5).
Algorithm 2: Function to verify a CA.

To distribute the work, it is necessary to calculate the initial t-tuple for each core according to its ID (denoted by rank); the scalar index of this t-tuple is F = rank · B. Therefore, a method is needed to convert the scalar F to the equivalent t-tuple $c_i \in C$. The sequential generation of each t-tuple $c_i$ previous to $c_F$ can be a time-consuming task. This is where the main contribution of our grid approach lies: its simplicity is combined with a clever strategy for computing the initial t-tuple of each block.
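The block distribution can be sketched as follows. The function name `block_bounds` and the half-open range convention are assumptions made for this illustration; only the block size $\lceil |C| / n \rceil$ and the starting index F = rank · B come from the text.

```python
from math import comb


def block_bounds(k, t, size, rank):
    """Return the half-open range [first, last) of t-tuple indices owned by `rank`."""
    total = comb(k, t)                 # |C|, the number of t-tuples
    B = (total + size - 1) // size     # ceil(|C| / size) in integer arithmetic
    first = min(rank * B, total)       # F = rank * B, clipped for trailing ranks
    last = min(first + B, total)
    return first, last


# Six t-tuples of CA(9; 2, 4, 3) split among four hypothetical cores.
for rank in range(4):
    print(rank, block_bounds(k=4, t=2, size=4, rank=rank))
```

With this convention the last ranks may receive an empty block when |C| is not a multiple of the number of cores; every index is still assigned exactly once.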
We propose the getInitialTuple function as a method that generates $c_F$ (see Algorithm 3), according to a lexicographical order, without generating its previous t-tuples $c_i$, where i < F. To explain the purpose of the getInitialTuple function, let us consider the CA(9; 2, 4, 3) shown in Fig. 1. This CA has as set C the elements found in column 1 of Table 4(a). The getInitialTuple function is optimized to find the vector $J = \{J_1, J_2, ..., J_t\}$ that corresponds to F. The values $J_i$ are calculated by Algorithm 3.

Algorithm 3: Get initial t-tuple to PA.
Input: Parameters k and t; the scalar corresponding to the first t-tuple ($K_l$).
Output: The initial t-tuple.
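A possible way to obtain the t-tuple at position F without enumerating its predecessors is standard lexicographic unranking of combinations. The sketch below is one such procedure and is not claimed to be identical to Algorithm 3; columns are numbered from 1 and F is assumed to be zero-based.

```python
from itertools import combinations
from math import comb


def get_initial_tuple(k, t, F):
    """Return the F-th (zero-based) t-subset of {1..k} in lexicographic order."""
    if not 0 <= F < comb(k, t):
        raise ValueError("F out of range")
    J = []
    col = 1
    for pos in range(t):
        remaining = t - pos - 1
        # Skip whole groups of tuples that share a smaller column at this position.
        while True:
            group = comb(k - col, remaining)
            if F < group:
                break
            F -= group
            col += 1
        J.append(col)
        col += 1
    return J


# Compare against direct enumeration for k = 4, t = 2.
for F, expected in enumerate(combinations(range(1, 5), 2)):
    print(F, get_initial_tuple(4, 2, F), expected)
```

Only binomial coefficients are evaluated, so the cost is independent of how many t-tuples precede position F.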
In summary, Algorithm 3 only requires the computation of O(t × k) binomials to compute the n initial t-tuples of the parallel algorithm (PA). This represents a great improvement in contrast with the naive approach that would require the generation of all the $\binom{k}{t}$ t-tuples, as done in the sequential algorithm. The next three sections present a simulated annealing approach to construct CAs. Section 4 describes in depth the components of our algorithm. Section 5 presents a method for parallelizing our SA algorithm. Section 6 describes how to implement our algorithm on a grid architecture.

An algorithm for the construction of covering arrays using a simulated annealing technique
Often the solution space of an optimization problem has many local minima. A simple local search algorithm proceeds by choosing a random initial solution and generating a neighbor from that solution. The neighboring solution is accepted if it is a cost-decreasing transition. Such a simple algorithm has the drawback of often converging to a local minimum. The simulated annealing algorithm (SA), though by itself a local search algorithm, avoids getting trapped in a local minimum by also accepting cost-increasing neighbors with some probability. SA is a general-purpose stochastic optimization method that has proven to be an effective tool for approximating globally optimal solutions to many types of NP-hard combinatorial optimization problems. In this section, we briefly review the SA algorithm and propose an implementation to solve the CAC problem.
SA is a randomized local search method based on the simulation of the annealing of metal. The acceptance probability of a trial solution is given by Eq. 3, where T is the temperature of the system and ΔE is the difference of the costs between the trial and the current solutions (the cost change due to the perturbation). Eq. 3 means that the trial solution is accepted with nonzero probability $e^{-\Delta E / T}$ even when the solution deteriorates (uphill move).
Uphill moves enable the system to escape from local minima; without them, the system would be trapped in a local minimum. Too high a probability of uphill moves, however, prevents the system from converging. In SA, the probability is controlled by the temperature in such a manner that at the beginning of the procedure the temperature is sufficiently high, so that uphill moves are accepted with high probability, and as the calculation proceeds the temperature is gradually decreased, lowering the probability (Jun & Mizuta, 2005).
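The acceptance rule described above is commonly written as the Metropolis criterion, $P = 1$ if $\Delta E \le 0$ and $P = e^{-\Delta E / T}$ otherwise. The following sketch assumes that form; the function name and the optional `rng` parameter are illustrative choices.

```python
import math
import random


def accept(delta_e, temperature, rng=random):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


rng = random.Random(0)
# Fraction of uphill moves (delta_e = 1) accepted at a high and a low temperature.
for T in (10.0, 0.1):
    accepted = sum(accept(1, T, rng) for _ in range(10000))
    print(T, accepted / 10000)
```

At the higher temperature most uphill moves are accepted, while at the lower temperature they become rare, which is the behavior the cooling process relies on.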

Internal representation
The following paragraphs describe each of the components of our SA implementation. The description assumes the matrix representation of a CA. A CA can be represented as a matrix M of size N × k, where the columns are the parameters and the rows are the cases of the test set that is constructed. Each cell $m_{i,j}$ in the array accepts values from the set $\{1, 2, ..., v_j\}$, where $v_j$ is the cardinality of the alphabet of the j-th column.

Initial solution
The initial solution M is constructed by generating M as a matrix with maximum Hamming distance. The Hamming distance d(x, y) between two rows x, y ∈ M is the number of elements in which they differ. Let $r_i$ be a row of the matrix M. To generate a random matrix M of maximum Hamming distance the following steps are performed:
1. Generate the first row $r_1$ at random.
2. Generate two rows $c_1$, $c_2$ at random, which will be candidate rows.
3. Select the candidate row $c_i$ that maximizes the Hamming distance according to Eq. 4 and add it as the i-th row of the matrix M.
4. Repeat from step 2 until M is completed.
An example is shown in Fig. 2; the number of symbols that differ between rows $r_1$ and $c_1$ is 4, and between $r_2$ and $c_1$ it is 3, summing up to 7. Then, the Hamming distance for the candidate row $c_1$ is 7.

Fig. 2. Example of the Hamming distance between two rows $r_1$, $r_2$ that are already in the matrix M and a candidate row $c_1$.
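The four steps above can be sketched as follows. Since Eq. 4 is not reproduced here, the sketch assumes, following the worked example, that a candidate is scored by the sum of its Hamming distances to all rows already in M; symbols are taken from {0, ..., v − 1} for convenience.

```python
import random


def hamming(x, y):
    """Number of positions in which rows x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def initial_solution(N, k, v, rng):
    """Build an N x k matrix by repeatedly keeping the better of two random rows."""
    M = [[rng.randrange(v) for _ in range(k)]]                # step 1
    while len(M) < N:                                         # step 4
        candidates = [[rng.randrange(v) for _ in range(k)]    # step 2
                      for _ in range(2)]
        # Step 3: assumed score is the total distance to the rows already chosen.
        best = max(candidates, key=lambda c: sum(hamming(r, c) for r in M))
        M.append(best)
    return M


M0 = initial_solution(N=9, k=4, v=3, rng=random.Random(1))
for row in M0:
    print(row)
```

Choosing between only two candidates keeps the construction cheap while still biasing the starting matrix towards rows that differ from each other.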

Evaluation function
The evaluation function E(M) is used to estimate the goodness of a candidate solution.
Previously reported metaheuristic algorithms for constructing CAs have commonly evaluated the quality of a potential solution (covering array) as the number of combinations of symbols missing in the matrix M (Cohen et al., 2003; Nurmela, 2004; Shiba et al., 2004). Thus, the expected solution has zero missing combinations.
The proposed SA implementation uses this definition of the evaluation function. Its computational complexity is $O(N \binom{k}{t})$.

Neighborhood function
Given that our SA implementation is based on Local Search (LS), a neighborhood function must be defined. The main objective of the neighborhood function is to identify the set of potential solutions which can be reached from the current solution in an LS algorithm. In case two or more neighborhoods present complementary characteristics, it is then possible and interesting to create more powerful compound neighborhoods. The advantage of such an approach is well documented in (Cavique et al., 1999). Following this idea, and based on the results of our preliminary experiments, a neighborhood structure composed of two different functions is proposed for this SA algorithm implementation.
Two neighborhood functions were implemented to guide the local search of our SA algorithm. The neighborhood function $N_1(s)$ randomly selects a missing t-tuple and then tries setting the j-th combination of symbols in every row of M. The neighborhood function $N_2(s)$ randomly chooses a position (i, j) of the matrix M and makes all possible changes of symbol.
During the search process a combination of both $N_1(s)$ and $N_2(s)$ neighborhood functions is employed by our SA algorithm. The former is applied with probability P, while the latter is applied with probability 1 − P. Our parallel simulated annealing approach for solving the CAC problem is presented in Section 5.
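One possible reading of the two neighborhood functions and of their combination is sketched below. The interpretation of $N_1$ (pick a random missing combination and place it in each row in turn) and all function names are assumptions; symbols are taken from {0, ..., v − 1}.

```python
import random
from itertools import combinations, product


def missing_combinations(M, t, v):
    """List every (columns, symbols) pair that no row of M covers."""
    missing = []
    for cols in combinations(range(len(M[0])), t):
        seen = {tuple(row[c] for c in cols) for row in M}
        missing += [(cols, w) for w in product(range(v), repeat=t) if w not in seen]
    return missing


def n1(M, t, v, rng):
    """Assumed N1: place one random missing combination in each row in turn."""
    missing = missing_combinations(M, t, v)
    if not missing:
        return []
    cols, w = rng.choice(missing)
    neighbors = []
    for i in range(len(M)):
        trial = [row[:] for row in M]
        for c, symbol in zip(cols, w):
            trial[i][c] = symbol
        neighbors.append(trial)
    return neighbors


def n2(M, v, rng):
    """N2: pick a random cell and try every other symbol in it."""
    i, j = rng.randrange(len(M)), rng.randrange(len(M[0]))
    neighbors = []
    for symbol in range(v):
        if symbol != M[i][j]:
            trial = [row[:] for row in M]
            trial[i][j] = symbol
            neighbors.append(trial)
    return neighbors


def compound_neighborhood(M, t, v, P, rng):
    """Use n1 with probability P and n2 otherwise."""
    return n1(M, t, v, rng) if rng.random() < P else n2(M, v, rng)


rng = random.Random(0)
M = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
print(len(n1(M, 2, 3, rng)), len(n2(M, 3, rng)))
```

In this reading $N_1$ yields N candidate matrices and $N_2$ yields v − 1; the caller would evaluate them and keep the one with the lowest cost.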

Cooling schedule
The cooling schedule determines the degree of uphill movement permitted during the search and is thus critical to the SA algorithm's performance. The parameters that define a cooling schedule are: an initial temperature, a final temperature or a stopping criterion, the maximum number of neighboring solutions that can be generated at each temperature, and a rule for decrementing the temperature. The literature offers a number of different cooling schedules, see for instance (Aarts & Van Laarhoven, 1985; Atiqullah, 2004). In our SA implementation we preferred a geometrical cooling scheme, mainly for its simplicity. It starts at an initial temperature $T_i$ which is decremented at each round by a factor α using the relation $T_k = \alpha T_{k-1}$. For each temperature, the maximum number of visited neighboring solutions is L. It depends directly on the parameters of the studied covering array (N, k and v, where v is the maximum cardinality of M). This is because more moves are required for CAs with alphabets of greater cardinality.
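The geometric rule $T_k = \alpha T_{k-1}$ can be illustrated with a small generator. The parameter values in the usage lines are hypothetical and chosen only so that the sequence is short and exact.

```python
def cooling_schedule(t_initial, t_final, alpha):
    """Yield the temperatures T_i, alpha*T_i, alpha^2*T_i, ... while above t_final."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    T = t_initial
    while T > t_final:
        yield T
        T *= alpha


# Hypothetical parameters for illustration only.
print(list(cooling_schedule(4.0, 1.0, 0.5)))
print(sum(1 for _ in cooling_schedule(1.0, 0.001, 0.99)))
```

Values of α close to 1 produce many temperature levels, so the total work grows with both the number of levels and the per-level budget L.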

Termination condition
The stop criterion for our SA is either when the current temperature reaches $T_f$, when it ceases to make progress, or when a valid covering array is found. In the proposed implementation a lack of progress exists if after φ (frozen factor) consecutive temperature decrements the best-so-far solution is not improved.

SA Pseudocode
Algorithm 4 presents the simulated annealing heuristic as described above. Its functions have the following meaning: INITIALIZE computes a start solution and initial values of the parameters T and L; GENERATE selects a solution from the neighborhood of the current solution, using the neighborhood function $N_3(s, x)$; CALCULATE_CONTROL computes a new value for the parameter T (cooling schedule) and the number of consecutive temperature decrements with no improvement in the solution.
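A compact sketch of the overall loop is given below. It is a simplification, not Algorithm 4 itself: the start solution is random instead of the Hamming-distance construction, GENERATE changes a single random cell rather than using the compound neighborhood, and all default parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def cost(M, t, v):
    """E(M): number of missing combinations of symbols over all t-tuples."""
    missing = 0
    for cols in combinations(range(len(M[0])), t):
        seen = {tuple(row[c] for c in cols) for row in M}
        missing += v ** t - len(seen)
    return missing


def simulated_annealing(N, k, t, v, T=1.0, T_f=1e-3, alpha=0.99, L=50, phi=20, seed=0):
    rng = random.Random(seed)
    # INITIALIZE (simplified: purely random start solution).
    M = [[rng.randrange(v) for _ in range(k)] for _ in range(N)]
    current = cost(M, t, v)
    best, best_cost = [row[:] for row in M], current
    frozen = 0
    while T > T_f and frozen < phi and best_cost > 0:
        improved = False
        for _ in range(L):
            # GENERATE (simplified): change one cell to a different symbol.
            i, j = rng.randrange(N), rng.randrange(k)
            old = M[i][j]
            M[i][j] = rng.choice([s for s in range(v) if s != old])
            new = cost(M, t, v)
            delta = new - current
            if delta <= 0 or rng.random() < math.exp(-delta / T):
                current = new
                if current < best_cost:
                    best, best_cost = [row[:] for row in M], current
                    improved = True
            else:
                M[i][j] = old                    # reject: undo the move
            if best_cost == 0:
                break
        # CALCULATE_CONTROL: cool down and update the frozen counter.
        T *= alpha
        frozen = 0 if improved else frozen + 1
    return best, best_cost


best, best_cost = simulated_annealing(N=10, k=4, t=2, v=3, seed=1)
print(best_cost)
```

The loop stops on any of the three conditions described in the termination section: final temperature reached, φ consecutive decrements without improvement, or a matrix with zero missing combinations.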

Parallel simulated annealing
In this section we propose a parallel strategy to construct CAs using a simulated annealing algorithm.
A common approach to parallelizing simulated annealing is to generate several perturbations in the current solution simultaneously. Some of them, those with small variance, locally explore the region around the current point, while those with larger variances globally explore the feasible region. If each process uses a different perturbation or move generation, each process will probably obtain a different solution at the end of its iterations. This approach may be described as follows:
1. The master node sets T = $T_0$, generates an initial solution using the Hamming distance algorithm (see Section 4.2) and distributes it to each worker.
2. At the current temperature T, each worker begins to execute its iterative operations (L).
3. At the end of the iterations, the master is responsible for collecting the solution obtained by each process at the current temperature and broadcasts the best of them among all participating processes.
4. If the termination condition is not met, each process reduces the temperature and goes back to step 2; otherwise the algorithm terminates.
Algorithm 5 shows the pseudocode for the master node. The function INITIALIZE computes a start solution (using the Hamming distance algorithm) and initial values of the parameters T and L. The master node distributes the initial parameters to the slave nodes and awaits the results. Every L iterations, the slaves send their results to the master node (see Algorithm 6). The master node selects the best solution. If the termination criterion is not satisfied, the master node computes a new value for the parameter T (cooling schedule) and the number of consecutive temperature decrements with no improvement in the solution.
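The communication pattern between master and workers can be sketched as follows. This is not Algorithms 5 and 6: a thread pool stands in for the distributed nodes, the toy cost function and perturbation replace the CA-specific ones, and all names and parameter values are hypothetical. Python threads give no real speed-up here; the sketch only shows the collect-and-broadcast cycle.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def worker(args):
    """Run L Metropolis steps at temperature T; return (best cost, best solution)."""
    solution, T, L, seed, cost, perturb = args
    rng = random.Random(seed)              # each worker explores differently
    current, current_cost = list(solution), cost(solution)
    best, best_cost = list(current), current_cost
    for _ in range(L):
        trial = perturb(current, rng)
        delta = cost(trial) - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            current, current_cost = trial, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best_cost, best


def master(initial, cost, perturb, workers=4, T=1.0, T_f=0.01, alpha=0.9, L=100):
    best, best_cost = list(initial), cost(initial)
    round_id = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while T > T_f and best_cost > 0:
            # Broadcast the current best solution and temperature to every worker.
            jobs = [(best, T, L, round_id * workers + w, cost, perturb)
                    for w in range(workers)]
            results = list(pool.map(worker, jobs))          # collect all results
            round_cost, round_best = min(results, key=lambda r: r[0])
            if round_cost < best_cost:
                best, best_cost = round_best, round_cost
            T *= alpha                                      # cooling schedule
            round_id += 1
    return best, best_cost


def toy_cost(s):
    return sum(x * x for x in s)


def toy_perturb(s, rng):
    trial = list(s)
    trial[rng.randrange(len(trial))] += rng.choice((-1, 1))
    return trial


solution, value = master([5, -3, 4], toy_cost, toy_perturb)
print(solution, value)
```

Giving each worker its own seed is what makes the workers return different solutions from the same starting point, so that selecting the best of them is meaningful.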

Grid Computing approach
Simulated annealing (SA) is inherently sequential and hence very slow for problems with large search spaces. Several attempts have been made to speed up this process, such as the development of special purpose computer architectures (Ram et al., 1996). As an alternative, we propose a Grid deployment of the parallel SA algorithm for constructing CAs introduced in the previous section. In order to fully understand the Grid implementation developed in this work, this section first introduces the Grid Computing platform used and then presents the different execution strategies.

Grid Computing Platform
The evolution of Grid middleware has enabled the deployment of Grid e-Science infrastructures delivering large computational and data storage capabilities. Current infrastructures, such as the one used in this work, EGI, rely mainly on gLite as core middleware, in some cases supporting several services. World-wide initiatives such as EGI aim at linking and sharing components and resources from several European NGIs (National Grid Initiatives).
In the EGI infrastructure, jobs are specified through a job description language (Pacini, 2001) or JDL that defines the main components of a job: executable, input data, output data, arguments and restrictions. The restrictions define the features a resource should provide, and could be used for meta-scheduling or for local scheduling (such as in the case of MPI jobs). Input data could be small or large and job-specific or common to all jobs, which affects the protocols and mechanisms needed. Executables are either compiled or multiplatform codes (scripts, Java, Perl) and output data suffer from similar considerations as input data.
The key resources in the EGI infrastructure are extensively listed in the literature, and can be summarized as:
1. WMS / RB (Workload Management System / Resource Broker): Meta-scheduler that coordinates the submission and monitoring of jobs.
2. CE (Computing Element): … one (sequential) or more (parallel) computing nodes. In the case that no free computing nodes are available, jobs are queued. Thus, the load of a CE must be considered when estimating the turnaround of a job.
3. WN (Working Nodes): Each one of the computing resources accessible through a CE. Due to the heterogeneous nature of the Grid infrastructure, the response time of a job will depend on the characteristics of the WN hosting it.
4. SE (Storage Element): Storage resources in which a task can store long-living data to be used by the computers of the Grid. This practice is necessary due to the size limitation imposed by current Grid middleware on the job file attachment (10 Megabytes in the gLite case). So, use cases which require access to files that exceed that limitation are forced to use these Storage Elements. Nevertheless, the construction of CAs is not a data-intensive use case and thus the use of SEs can be avoided.

Some of these terms are referenced throughout the following sections.

Preprocessing task: Selecting the most appropriate CEs
A production infrastructure such as EGI involves tens of thousands of resources from hundreds of sites, involving tens of countries and a large human team. Since it is a general-purpose platform, and although there is a common middleware and a recommended operating system, heterogeneity in the configuration and operation of the resources is inevitable. This heterogeneity, along with other social and human factors such as the large geographical coverage and the different skills of operators, introduces a significant degree of uncertainty in the infrastructure. Even considering that the service level required is around 95%, it is statistically likely that in each large execution some sites will not be working properly. Thus, prior to beginning the experiments, it is necessary to run empirical tests to define a group of valid computing resources (CEs) and in this way address resource setup problems. These tests can give some real information such as computational speed, primary and secondary memory sizes and I/O transfer speed. When there are huge quantities of resources, these data help to establish quality criteria for choosing resources.

Asynchronous schema
Once the computing elements to which the jobs will be submitted have been selected, the next step is to specify the jobs correctly. The specification must be written in the job description language (JDL) of gLite. An example of a JDL file can be seen in Fig. 3.
As can be seen in Fig. 3, the specification of the job includes: the virtual organisation where the job will be launched (VirtualOrganisation), the main file that will start the execution of the job (Executable), the arguments that will be used for invoking the executable (Arguments), the files in which the standard outputs will be dumped (StdOutput and StdError), and finally the result files that will be returned to the user interface (OutputSandBox).
The most important part of the execution therefore lies in the program (a shell script) specified in the Executable field of the description file. Using a shell script instead of invoking the executable (gridCA) directly is mandatory because of the heterogeneous nature of the Grid. Although the conditions vary between resources, as stated before, site administrators are recommended to install Unix-like operating systems, so a shell script can be expected to run on any machine of the Grid infrastructure. The source code, in contrast, must be compiled dynamically on each of the computing resources hosting the jobs. Thus, the shell script basically works as a wrapper that looks for a gcc-like compiler (the source code is written in the C language), compiles the source code and finally invokes the executable with the proper arguments (the values of N, k, v and t, respectively).
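The logic of the wrapper can be illustrated as follows. The chapter implements it as a shell script; the sketch below renders the same three steps (locate a compiler, compile, invoke with N, k, v and t) in Python purely for illustration. The source file name `gridCA.c`, the list of candidate compilers and the `-O2` flag are assumptions.

```python
import shutil
import subprocess

def find_compiler(candidates=("gcc", "cc", "clang")):
    """Return the path of the first C compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

def build_commands(compiler, n, k, v, t, source="gridCA.c", binary="./gridCA"):
    """Return the compile command and the run command as argument lists."""
    compile_cmd = [compiler, "-O2", "-o", binary, source]   # source name is assumed
    run_cmd = [binary, str(n), str(k), str(v), str(t)]      # arguments: N, k, v, t
    return compile_cmd, run_cmd

def run_job(n, k, v, t):
    """Compile the source on the hosting node and launch the executable."""
    compiler = find_compiler()
    if compiler is None:
        raise RuntimeError("no C compiler available on this node")
    compile_cmd, run_cmd = build_commands(compiler, n, k, v, t)
    subprocess.run(compile_cmd, check=True)   # stop if compilation fails
    subprocess.run(run_cmd, check=True)       # stop if the executable fails
```

Raising an error when no compiler is found lets the job fail early, so that the controlling system can resubmit it to another resource.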
One of the most crucial parts of any Grid deployment is the development of an automatic system for controlling and monitoring the evolution of an experiment. Basically, the system is in charge of submitting the different gLite jobs (the number of jobs is equal to the value of the parameter N), monitoring the status of these jobs, resubmitting them (when a job has failed, or has completed successfully but the SA algorithm has not yet converged) and retrieving the results. This automatic system has been implemented as a master process which periodically (or asynchronously, as the name of the schema suggests) oversees the status of the jobs.
This system must possess the following properties: completeness, correctness, quick performance and efficiency in the usage of the resources. Regarding completeness, we have to take into account that an experiment involves a large number of jobs, and it must be ensured that all of them are successfully completed at the end. Correctness implies a guarantee that all jobs produce correct results which are presented comprehensibly to the user, and that the data used are properly updated and coherent during the whole experiment (the master must correctly update the file with the .ca extension shown in the JDL specification in order for the Simulated Annealing algorithm to converge). The quick performance property implies that the experiment finishes as quickly as possible. In that sense, the key aspects are a good selection of the resources that will host the jobs (according to the empirical tests performed in the preprocessing stage) and an adequate resubmission policy (sending new jobs to the resources that are proving most productive during the execution of the experiment).
Finally, if the on-the-fly tracking of the most productive computing resources is done correctly, efficiency in the usage of the resources will be achieved.
Due to the asynchronous behavior of this schema, the number of slaves (jobs) that can be submitted (the maximum size of N) is limited only by the infrastructure. However, other schemas, such as the one presented in the next section, could achieve better performance in certain scenarios.
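One possible way to track productivity on the fly is to count, per CE, how many jobs succeeded and failed, and to send each resubmission to the CE with the best record. The sketch below is an assumption about how such a policy might look; the scoring rule (success ratio, then number of completed jobs) and the class name are hypothetical.

```python
from collections import Counter

class ProductivityTracker:
    """Keeps per-CE success and failure counts during an experiment."""
    def __init__(self, ces):
        self.done = Counter({ce: 0 for ce in ces})
        self.failed = Counter({ce: 0 for ce in ces})

    def record(self, ce, ok):
        """Register the outcome of one job that ran on the given CE."""
        (self.done if ok else self.failed)[ce] += 1

    def best_ce(self):
        """CE with the highest success ratio; ties broken by completed jobs, then name."""
        def score(ce):
            total = self.done[ce] + self.failed[ce]
            ratio = self.done[ce] / total if total else 0.5  # neutral value for untested CEs
            return (ratio, self.done[ce], ce)
        return max(self.done, key=score)

tracker = ProductivityTracker(["ce-a", "ce-b"])
tracker.record("ce-a", True)
tracker.record("ce-a", False)
tracker.record("ce-b", True)
tracker.record("ce-b", True)
print(tracker.best_ce())
```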
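The control loop of the asynchronous master can be sketched as below. The `FakeGrid` class is a stand-in invented for this example so that the loop can run without any middleware; it does not reproduce the gLite interface, and job states are simplified to "DONE" and "FAILED" (the case of a completed job whose SA run has not converged would be handled like a failure, by resubmitting).

```python
import itertools

class FakeGrid:
    """Stand-in for the middleware: the final state of each submission is scripted."""
    def __init__(self, outcomes):
        self._outcomes = iter(outcomes)   # one final state per submission, in order
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, task):
        job_id = next(self._ids)
        self._jobs[job_id] = next(self._outcomes)
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]         # "DONE" or "FAILED"

def run_async(grid, tasks, max_rounds=100):
    """Submit one job per task, poll, and resubmit failures until all are done."""
    pending = {grid.submit(t): t for t in tasks}
    finished, submissions = [], len(tasks)
    for _ in range(max_rounds):
        if not pending:
            break
        for job_id, task in list(pending.items()):
            state = grid.status(job_id)
            del pending[job_id]
            if state == "DONE":
                finished.append(task)
            else:                         # failed: resubmit the same task
                pending[grid.submit(task)] = task
                submissions += 1
        # a real master would sleep here between polling rounds
    return finished, submissions

grid = FakeGrid(["DONE", "FAILED", "DONE", "DONE"])
finished, submissions = run_async(grid, ["job0", "job1", "job2"])
print(finished, submissions)
```

Because the master only polls, it never holds an open connection to the jobs, which is why the number of jobs is bounded only by the infrastructure.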

Synchronous schema
This schema uses a sophisticated mechanism known, in Grid terminology, as the submission of pilot jobs. The submission of pilot jobs is based on the master-worker architecture and is supported by the DIANE (DIANE, 2011) and Ganga (Moscicki et al., 2009) tools. When the processing begins, a master process (a server) is started locally; it provides tasks to the worker nodes until all the tasks have been completed, at which point the workers are dismissed. The worker agents, in turn, are jobs running on the Working Nodes of the Grid which communicate with the master. The master must keep track of the tasks to ensure that all of them are successfully completed, while the workers provide access to a CPU, previously obtained through scheduling, which processes the tasks. If for any reason a task fails or a worker loses contact with the master, the master immediately reassigns the task to another worker. The whole process is shown in Fig. 4. So, in contrast to the asynchronous schema, in this case the master is continuously in contact with the slaves.
However, before starting the execution of the master/worker jobs, it is necessary to define their characteristics. Firstly, the specification of a run must include the master configuration (workers and heartbeat timeout). It is also necessary to establish master scheduling policies such as the maximum number of times that a lost or failed task is assigned to a worker, the reaction when a task is lost or fails, and the number of resubmissions before a worker is removed. Finally, the master must know the arguments of the tasks and the files shared by all tasks (the executable and any auxiliary files). At this point, the master can be started using the specification described above. After checking that everything is correct, the master waits for incoming connections from the workers.
Workers are generic jobs, submitted to the Grid, that can perform any operation requested by the master. These workers must be submitted to the CEs selected in the preprocessing stage. When a worker registers with the master, the master automatically assigns it a task.
This schema has several advantages derived from the fact that a worker can execute more than one task. Only when a worker has successfully completed a task does the master assign it a new one. In addition, when a worker demands a new task it is not necessary to submit a new job. In this way, the queuing time of the task is greatly reduced. Moreover, the dynamic behavior of this schema allows it to achieve better performance than the asynchronous schema.
However, there are also some disadvantages that must be mentioned. The first issue concerns the unidirectional connectivity between the master host and the worker hosts (Grid nodes). While the master host needs inbound connectivity, the worker node needs outbound connectivity. The connectivity problem in the master can be solved easily by opening a port on the local host; the connectivity of the worker, however, depends on the configuration of the remote system (the CE). This extra detail must therefore be taken into account when selecting the computing resources. Another issue is defining an adequate timeout value. If, for some reason, a task that is working correctly suffers temporary connection problems and exceeds the timeout threshold, the worker will be removed by the master. Finally, a key factor is identifying the right number of worker agents and tasks.
In addition, if the number of workers is on the order of thousands (i.e. when N is about 1000), bottlenecks may appear, with the master being overwhelmed by the excessive number of connections.
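The interplay between task assignment, heartbeats and the timeout can be illustrated with a toy master. This is a simulation written for this explanation only; it does not use or reproduce the DIANE or Ganga interfaces, and all class and method names are hypothetical. Time is passed in explicitly so that the behavior is deterministic.

```python
from collections import deque

class Master:
    """Toy master: hands out tasks, tracks heartbeats, reassigns lost tasks."""
    def __init__(self, tasks, heartbeat_timeout):
        self.queue = deque(tasks)
        self.timeout = heartbeat_timeout
        self.running = {}      # worker -> (task, time of last heartbeat)
        self.completed = []

    def request_task(self, worker, now):
        """Called by an idle worker; returns a task, or None if none is queued."""
        if not self.queue:
            return None
        task = self.queue.popleft()
        self.running[worker] = (task, now)
        return task

    def heartbeat(self, worker, now):
        """Refresh the last-seen time of a worker that is still running a task."""
        if worker in self.running:
            task, _ = self.running[worker]
            self.running[worker] = (task, now)

    def report_done(self, worker):
        task, _ = self.running.pop(worker)
        self.completed.append(task)

    def expire_workers(self, now):
        """Remove silent workers and put their tasks back in the queue."""
        lost = [w for w, (_, seen) in self.running.items() if now - seen > self.timeout]
        for w in lost:
            task, _ = self.running.pop(w)
            self.queue.append(task)
        return lost

master = Master(["t1", "t2", "t3"], heartbeat_timeout=5)
master.request_task("w1", now=0)      # w1 receives t1
master.request_task("w2", now=0)      # w2 receives t2
master.report_done("w1")              # w1 finishes t1 ...
master.request_task("w1", now=2)      # ... and takes t3 without a new Grid job
master.heartbeat("w1", now=6)
lost = master.expire_workers(now=7)   # w2 has been silent since 0: removed, t2 requeued
master.report_done("w1")
master.request_task("w1", now=8)      # w1 picks up the reassigned t2
master.report_done("w1")
print(lost, master.completed)
```

The example also shows the timeout trade-off mentioned above: a worker with a temporary connection problem would be expired in exactly the same way as a dead one.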

Experimental results
This section presents an experimental design and the results derived from testing the approach described in Section 6. In order to show the performance of the SA algorithm, two experiments were carried out. The purpose of the first experiment was to fine-tune the probabilities with which the neighborhood functions are selected. The second experiment evaluated the performance of SA on a new benchmark proposed in this chapter. The results were compared against two well-known tools from the literature that construct CAs: TConfig (recursive constructions) and ACTS (a greedy algorithm named IPOG-F).
In all the experiments the following parameters were used for our SA implementation:

Fine tuning the probability of execution of the neighborhood functions
It is well known that the performance of an SA algorithm is sensitive to parameter tuning. We therefore followed a methodology to fine-tune the two neighborhood functions used in our SA algorithm. The fine tuning was based on the following linear Diophantine equation:

$$P_1 x_1 + P_2 x_2 = q$$

where $x_i$ represents a neighborhood function and its value is set to 1, $P_i$ is a value in {0.0, 0.1, ..., 1.0} that represents the probability of executing $x_i$, and $q$ is set to 1.0, which is the maximum probability of executing any $x_i$. A solution to the given linear Diophantine equation must satisfy

$$\sum_{i=1}^{2} P_i x_i = 1.0$$

This equation has 11 solutions; each solution is an experiment that tests the degree of participation of each neighborhood function in our SA implementation in accomplishing the construction of a CA. Every combination of the probabilities was applied by SA to construct the set of CAs shown in Table 5(a), and each experiment was run 31 times; from the data obtained for each experiment we calculated the median. A summary of the performance of SA with the probabilities that solved 100% of the runs is shown in Table 5(b). Finally, given the results shown in Fig. 5, the best configuration of probabilities was P 1 = 0.3 and P 2 = 0.7, because it found the CAs in the shortest time (median value). The values P 1 = 0.3 and P 2 = 0.7 were kept fixed in the second experiment.
In the next subsection, we present further computational results obtained from a performance comparison among our SA algorithm, a well-known greedy algorithm (IPOG-F) and a tool named TConfig that constructs CAs using recursive functions.
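The count of 11 experiments can be checked by enumeration. The sketch below assumes, from the definitions given above, that the equation has the form P1·x1 + P2·x2 = q with x1 = x2 = 1 and q = 1.0; exact fractions are used instead of floats so that sums such as 0.3 + 0.7 compare equal to 1 without rounding problems. The function name is hypothetical.

```python
from fractions import Fraction

def probability_settings(step=Fraction(1, 10), q=Fraction(1)):
    """Enumerate (P1, P2) on the grid {0, step, ..., 1} with P1 + P2 = q."""
    levels = [i * step for i in range(int(1 / step) + 1)]
    return [(p1, p2) for p1 in levels for p2 in levels if p1 + p2 == q]

settings = probability_settings()
print(len(settings))
print([(float(p1), float(p2)) for p1, p2 in settings])
```

Each pair is one tuning experiment; the configuration reported as best, (0.3, 0.7), is one of them.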

Comparing SA with the state-of-the-art algorithms
For the second of our experiments we obtained the ACTS and TConfig software. We created a new benchmark composed of 60 ternary CA instances where 5 ≤ k ≤ 100 and 2 ≤ t ≤ 4.
The SA implementation reported by Cohen et al. (2003) for solving the CAC problem was intentionally omitted from this comparison because, as its authors acknowledge, this algorithm fails to produce competitive results when the strength of the arrays is t ≥ 3. The results from this experiment are summarized in Table 6, which presents in the first two columns the strength t and the degree k of the selected benchmark instances. The best sizes N found by the TConfig tool, the IPOG-F algorithm and our SA algorithm are listed in columns 3, 4 and 5, respectively. Next, Fig. 6 compares the results shown in Table 6.
From Table 6 and Fig. 6 we can observe that our SA algorithm obtains solutions of better quality than the other two tools. Finally, each of the 60 ternary CAs constructed by our SA algorithm has been verified by the algorithm described in Section 3. In order to minimize the execution time required by our SA algorithm, the following rule has been applied when choosing the
Fig. 6. Graphical comparison of the performance among TConfig, IPOG-F and our SA to construct ternary CAs when 5 ≤ k ≤ 100 and 2 ≤ t ≤ 4.

Conclusions
In large problem domains, testing is limited by cost. Every test adds to the cost, so CAs are an attractive option for testing.
Simulated annealing (SA) is a general-purpose stochastic optimization method that has proven to be an effective tool for approximating globally optimal solutions to many types of NP-hard combinatorial optimization problems. However, the sequential implementation of the SA algorithm converges slowly, which can be improved using Grid or parallel implementations. This work focused on constructing ternary CAs with a new SA approach, which integrates three key features that largely determine its performance:
1. An efficient method to generate initial solutions with maximum Hamming distance.
2. A carefully designed composed neighborhood function which allows the search to quickly reduce the total cost of candidate solutions, while avoiding getting stuck in local minima.
The empirical evidence presented in this work showed that SA improved the size of many CAs in comparison with the tools that are among the best found in the state-of-the-art of the construction of CAs.
To make up for the time the algorithm takes to converge, we proposed an implementation of our SA algorithm for Grid Computing. The main conclusion extracted from this point was the possibility of using two different schemas (asynchronous and synchronous) depending on the size of the experiment. On the one hand, the synchronous schema achieves better performance but is limited by the maximum number of slave connections that the master can keep track of. On the other hand, the asynchronous schema is slower but experiments with a huge value of N can be seamlessly performed.
As future work, we aim to extend the experiments to instances where 100 ≤ k ≤ 20000 and 2 ≤ t ≤ 12, and to compare our results against the best upper bounds found in the literature (Colbourn, 2011).
Finally, the new CAs are available in the CINVESTAV Covering Array Repository (CAR), which is available upon request at http://www.tamps.cinvestav.mx/~jtj/CA.php.