Simulation results for .
The task allocation problem is a key element in the solution of several applications from different engineering fields. With the explosion of the amount of information produced by the today Internet-connected solutions, scheduling techniques for the allocation of tasks relying on grids, clusters of computers, or in the cloud computing, is at the core of efficient solutions. The task allocation is an important problem within some branch of the computer sciences and operations research, where it is usually modeled as an optimization of a combinatorial problem with the inconvenience of a state explosion problem. This chapter proposes the modeling of the task allocation problem by the use of Colored Petri nets. The proposed methodology allows the construction of compact models for task scheduling problems. Moreover, a simulation process is possible within the constructed model, which allows the study of some performance aspects of the task allocation problem before any implementation stage.
- colored Petri nets
- task allocation problem
- modeling and simulation
- distributed computing
The services provided by Internet-based modern applications require the execution of massively parallel processes in order to produce time-effective information for different requirements. The processing of these “Big Data” is very computing, demanding in almost all of the cases. Thus, vast server farms installed over the world are ready to process the requests from different users. The digital data available in the web are huge and diverse, therefore complex, and in a continuous growth rate.
Nowadays, most of the queries on Internet are produced from smartphones and personal computers. However, the data traffic on web is expected to grow in an exponential rate, as the technology related to the Internet of Things (IoT) allows the connection of other devices from the houses, smart warehouses, automated driving cars, machinery from the agriculture, UAV’s for goods delivering, or smart cities, to mention a few. In order to cope with the challenges related to a massive number of connected devices, new ways of computing emerged. Most of them are based on parallel and distributed computing, such as clusters, grids, and cloud computing paradigms.
At the core of these paradigms are the tasks scheduling and resources allocation problems. Adaptations of theory, methods, algorithms, tools, and others are arising in order to cope with the increasing number of cores per processor, as well as with the interconnected and heterogeneous computers, considering that, in many cases, those computing architectures are connected through the Internet. The tasks scheduling and resources allocation were well studied in the context of architectures with a relatively small number of processors. One of the fundamental concerns deals with designing algorithms for an efficiently allocation of the tasks among all the available processes and resources.
The answer to this question may lead to a hard work, since, in the worst case, it involves an exponential number of possible solutions . Thus, a simulation process, for scenarios with a big number of computing elements, or cores, as in cloud and grid computing, is a valuable tool. In this context, the modeling and simulation are disciplines widely used to obtain feedback and statistics information, to improve associated applications and algorithms. Through experimentation in simulators, a fast prototyping process allows performance evaluation and parameter tuning for different workload conditions. To perform an efficient simulation process, a suitable modeling method is required. The Petri nets (PN’s) and its extension, such as colored PN (CPN), have been considered as an efficient modeling technique for concurrent and distributed systems.
The simulation and implementation stages of scheduling and resource allocation problems in distributed systems have been successfully addressed [2–9]. Moreover, variants of PN’s are used for modeling and simulation of other distributed concepts related to cloud services and computing [9–16], as well as in the field of the diagnosis of discrete event systems [17–19], reconfiguration [20–24], fuzzy inference  or identification .
This work proposes the modeling of the task allocation problem by the use of a methodology based on CPN’s. It also shows how to simulate these models in order to obtain relevant information for the design process. To illustrate the techniques, this work considers a tree structure and supposes that the set of tasks and the set of processes are fixed and defined before the task allocation is performed as in Ref. . The proposed CPN modeling method represents the behavior of a set of processors, or working threads, that traverse a tree structure from the root to a single leaf in order to acquire a task. The processes decide what route to take at each node of the tree structure, by using a decision function. Then, it routes the tree backwards updating some global variables that serves as inter-process communication mechanism.
The proposed method allows obtaining a CPN, which is a compact representation of the problem, and at the same time, it allows simulating its behavior and obtaining performance graphics. The colored tokens represent the processes, tasks, and other variables of interest in the problem. A decision function is attached to some transitions of the CPN model. This function influences the behavior of the process or thread. Due to the graphical nature of the CPN, in addition to the mathematical fundamentals, this methodology provides a great framework to simulate and analyze the task allocation problem, avoiding the state space explosion, and thus, having a compact representation of a complex behavior. The flexibility of the modeling framework allows simulating different scenarios, using different decision functions, as well as varying the number of initial processes or threads. Additionally, the Petri Net model can be easily extended to simulate a wider number of tasks.
The rest of this chapter includes a background of the task allocation and several techniques that have been used for addressing this problem. The modeling framework based on colored Petri nets is next presented and the modeling methodology is detailed. This methodology highlights the main aspects of the task allocation problem, such as the concepts of task and process, as well as the balancing policies and how to capture them by CPN blocks. The simulation of the obtained CPN models is then presented and discussed. Some performance charts are provided. The charts allow the study of key aspects of task allocation problem such as the effective work done by a processor, the task contention, the effects of different balancing policies, to mention a few. With this information, the scientists and engineers are able to study different aspects of the problem prior any implementation stage in a particular field of interest.
2. Task allocation problem
Task allocation problem has been addressed for several purposes and using different techniques. Analytic solutions are in some cases prohibitive since they require the analysis of an exponential number of possible solutions. Thus, some designs use probabilistic approaches to produce good solutions in a reasonable time. Jevtić et al.  present a distributed bee’s algorithm for static task allocation inspired on the behavior of a colony of bees and based on the optimization of a cost function. They also show results using the Arena simulator. Delle Fave et al.  introduced a decentralized optimization for static task assignment based on a constrained utility function, firstly evaluated by simulations. Johnson et al.  state a distributed multi-agent algorithm for static multi-task allocation as an optimization problem based on mixed-integer programming. They also provide some results of Monte Carlo simulation. Zhao-Pin et al.  introduced a distributed and static task allocation algorithm for multi-agents that learn how to determine by themselves what tasks should realize and how to form coalitions for cooperation, via their proposed profit-sharing learning mechanism. Macarthur et al.  introduced a dynamic and distributed algorithm for multi-agent task allocation problems, based on fast-max-sum algorithm combined with a branch-and-bound technique, in order to reduce the execution time in the allocation of task to its respective agent. Alistarh et al. [27, 33] present a decentralized task allocation, static and dynamic, based on to-do trees, where decisions on every inner node are based on a probability function.
One of these most used techniques, due to its simplicity of implementation, is based on a rooted tree structure that the process traverses in order to acquire a task [7, 8]. The tree structure allows to the working processes for executing a non-deterministic behavior, which is quite suitable for real environments. Typically, in these tree structures, every reference to a task is allocated on the leaves. Thus, every internal vertex allows to a working process taking a decision based on a function over the number of tasks in every leaf of the subtrees from each child of the current vertex. Hence, the decision function, which could be a simple deterministic algorithm, or a sophisticated statistical or stochastic procedure, is a key element for which a process traverses through the tree structure, in this kind of task allocation techniques.
Figure 1 shows a binary tree that represents a scheduling problem where eight tasks have to be attended. The height of the tree is , where the level of is 2, the vertex is said to be parent, or ancestor, of and , and the latter two are said to be children, or descendant, of . The vertex is the root, and every vertex is internal except for all of the level 3, i.e., with , which are known as leafs of the tree. The tree formed by the vertexes and the corresponding edges that connects them, is a subtree with as a root.
From the graph theory point of view, a DT is an acyclic graph in which a unique path connects every node to others, represented in a top-to-bottom fashion, where the root, i.e., initial node, is plotted at the top of the tree. The DT is a binary decision tree (BDT) if at every node is considered at most two possible decisions, as shown in Figure 1. For more information on trees see Ref. .
2.1. Task scheduling with DT
By using a tree structure like the one depicted in Figure 1, the task allocation algorithm is relatively simple. Each working process shares the tree structure and iterates in an infinite loop, trying to acquire a new task to execute, which is located at the leafs of the tree. Each working process starts at the root node of the tree and decides whether to go to the left or the right child node. The process knows an approximation of the total pending tasks at the left and right subtrees at every step. For example, in Figure 1, each working thread knows that at the beginning of the scheduling process, at the root node, that there exist four pending tasks at the left subtree and four pending tasks at the right one. Then, at every stage, the processes decide to traverse to the left or right in an asynchronous fashion. The total number of pending tasks is backward updated from the lower nodes to the uppers as the tasks are successfully executed by the threads. Notice that, the number of pending tasks at the left or right subtree is an approximation since this number may be not updated yet.
Since the processes are asynchronous among them, it is possible that they arrive to the leaf nodes that may occur at different rates. Every time that a process reaches a leaf, it asks for the execution of the task. If such a task is free, then the process blocks this task, executes the activity, and marks the task as executed. Then, it goes back to the root node to seek for a new task to execute, while the information of the remaining tasks is backward updated from the leaf to the root node. It is possible that when a process reaches a leaf node, the corresponding task was already executed by another process. In this case, the process goes back to the root node, and this miss is counted to measure the effectiveness of the algorithms.
In Eq. (1), it represented a decision rule that a working process may execute at every level of the BDT. Let x be the total number of pending tasks in the left subtree and y be the total number of pending tasks in the right subtree, while r is a random number. Thus, the probability to go to the left subtree in a flip coin is given by r. As greater the ratio, it is greater the probability to go to the left child of the current node. One of the main objectives of this work is to measure the effectiveness of a decision rule as Eq. (1) by means of a simulation process.
3. Colored Petri nets
PN’s are a modeling framework that combines a graphical visualization with a mathematical model . The CPN’s are one of the extensions to PN’s [9, 17, 35–42], which allow the construction of compact representations of big models. In this section, basic notions of CPN are presented.
The main characteristic that differentiates CPN from other type of nets resides in the token definition. In a CPN, the tokens can stand for complex data besides of a single value, such as an integer, a real, Boolean or strings. This characteristic allows for the representation of elaborated data types, similar to those used by high-level programming languages. This ability exploits the multi-set cardinality to construct compact models that otherwise are in the power set of the colored tokens. The formal definition of a CPN is as follows .
Definition 1. A Colored Petri Net (CPN) is a tuple
where its elements are described as:
P is a finite set of places of , with ,
T is a finite set of transitions of , such that , with
B is finite set of color classes,
F is a set of conditions,
is the color domain mapping,
is a conditional color mapping, and
are matrices representing the input and output incidence matrices of , such that and are mappings for every pair ; where for all , such that .
The incidence matrix is . The mapping given by , defines for each conditional color mapping f of , the token bag to be removed from p, in the occurrence of t, under color condition f. In the same way, specifies the token bag to be added to p, in the occurrence of t, under color condition f.
Definition 2. A marking M of a CPN is a vector of size , such that for every . The vector denotes the initial marking of the net and the pair is known as a CPN System.
Definition 3. A transition in a CPN System is said to be enabled for a color condition f in a marking M iff , denoted as .
Definition 4. The marking evolution w.r.t a color condition f is given by , denoted by . For a general color condition, it is denoted as where is implicit in a context.
The net in Figure 2 represents a very simple CPN System where:
, where and are variables,
and are as depicted, and
, where , , , .
Notice that, at the initial condition, . That is , thus, t1 can be fired for any color binding due to the condition in f1. Suppose that, t1 fires for color condition f1 with , then , with , where , , , by as detailed in the figure. Now, and , in such a way, that if is fired, then , with , where , , , . However, if is fired from for the color condition , then , with , where , .
Roughly speaking, the CPN binds the even number in to the character “a” in and the odd number to character “b.” This is defined by the guard function attached to . The function discards the element of the token bound to the firing of , while the firing of adds two to the element of the token and recovers the r element that is put back to the place . Thus, the tones in changes to and , while the characters “a” and “b” in remain the same all the time. The “R” in the place stands for “resources,” which is the function intended for the characters in this CPN model.
Notice that, the bindings and are discarded by the function . This is one of the advantages of the CPN modeling tool, since otherwise, the combinatorial explosion of possible bindings turns untreatable under some circumstances that typically arises in real applications. For more information about the CPN modeling formalism and related analysis and tools, see Ref. .
4. Modeling and simulation of task scheduling
The construction of a generalized model for a BDT based on CPN considers different features and classes. The procedure includes the identification of the transitions and places of the model, the arcs and its labeling, as well as the multi-set of colored tokens. The remaining of this section is devoted to detail the methodology and illustrate it by short examples.
4.1. Structure of the task allocation as a CPN
Consider the following steps for the construction of a CPN model that captures the structure of a BDT. It is supposed that the BDT is a full binary tree with a complete set of tasks represented by a structure called Task Array. Thus, the size of the Task Array is , where the height of the tree is H.
Step 1. Labeling nodes in the BDT
Firstly, suppose that the DT is a full binary tree. Then, assign “0” to every left edge and “1” to every right edge. This labeling represents the result of the flip coin in the task allocation procedure of a BDT discussed in the previous section.
After that, some information is used to label every node with a pair , where is the level of the node, and is the corresponding position from left to right. For example, consider the root node in the section of the BDT depicted in Figure 3. It is a fraction of the tree of height of Figure 1. Then, to label the children of the root node proceed as follows:
Let be and , with if this node is the left child, otherwise.
Therefore, the label of the left child is , while the label of the right child is .
Step 2. Building the CPN System
Let be a CPN System for a dynamic binary decision tree for classes and features with constructed as follows:
, where is the set of places with assigned tokens of type O, is the set of places with assigned tokens of type R, is the place with assigned tokens of type A. , and where . Places of type O represent every level from the root to the leaves, forward, and backwards, i.e., a token in one of those place contains the information of , , . Places of type R represent the state of every subtree from every node at that level, i.e., the value of the function for every node of the same level given by and . Place of type A represent the Task Array state, i.e., the information of the availability of the task.
T is the set of transitions with .
for , , and where , , for , , and , with , , , , variables and , constants.
, such that , , , .
for all , for all and for .
for , for , for and
, for , , for , , for , , for , , for , , , for , , for , where:
, such that .
, such that .
, such that .
, such that .
, for , , for , , for , , for , , for , , for , , for , where:
, such that . is a decision function.
, such that .
, such that , where returns 1 if for a color class with , , otherwise 0.
, such that ,
, such that
, such that .
, such that .
The initial marking, assuming that the Task Array is initially full, is defined as follows:
, where for .
assuming that the Task Array is initially full, as mentioned.
Figure 4 shows the relation of the tree structure with the proposed CPN as stated by the previous definition. Observe that places of type O represent every level from the root to the leaves, forward, and backwards, i.e., a token in one of those place contains the information of , , and . Places of type R represent the state of every subtree from every node at that level, i.e., the value of the function for every node of the same level given by and . Place of type A represents the Task Array state, i.e., the information of the availability of the task.
4.2. Dynamics of the task allocation as a CPN
Typically, in these tree structures, every reference to a task is allocated on the leaves. Thus, every internal vertex allows to a working process taking a decision based on a function over the number of tasks in every leaf of the subtrees from each child of the current vertex. Hence, the decision function, which could be a simple deterministic algorithm, or a sophisticated statistical or stochastic procedure, is a key element for which a process traverses through the tree structure, in this kind of allocation techniques. The following definitions allow capturing these dynamics aspects of the task allocation based on a BDT.
Definition 5. The is a position function defined from a set of vertexes of a tree to the set of natural numbers as , such that where is the numeric position of vertex with respect to all the vertexes in the same level and counting from left to right.
Definition 6. The is a function defined from a set of vertexes of a tree to the set of natural numbers as , such that from the root node to
Notice that, in Figure 2, the identification of a vertex is given by where and .
Definition 7. The is a weight function defined from the set of edges of a tree to the set containing 0 and 1 as such that if connects a parent vertex to its left child, and 1 if it connects to its right child. Let consider a vertex of a tree denoted as such that and . Then, for parent of left child and right child , with their respective edges and , with , by definition of , then and .
Notice that, the weight function provides a zero for an edge connecting a father with its left child and a one for the edge connecting it with the right child. In the modeling methodology here introduced, it is assumed that there exists a data structure called Task Array, which may have a reference to a task, if it is initially inserted, and the task has not been taken yet. This Task Array is mapped to the location of the leaves of a tree of height from left to right, i.e., the array has a capacity for referencing a maximum of tasks. Figure 1 shows a tree of height 3, with 8 leaves that are associated with the location in the Task Array, where references to task are . Accordingly, the following functions are defined.
Definition 8. The function is defined from the set of vertexes of a tree to the set of natural numbers as , where of leaves in the subtree of root with a task ready to be taken.
Definition 9. The is a function defined from the set of vertexes of a tree to itself as , such that , if is child of .
Definition 10. The is a recursive function defined from the set of vertexes of a tree of height , to the set of natural numbers as , such that:
(Induction): , for every pair , where , .
In Figure 2, for example, is shown in the left of every vertex as , e.g., .
Observe that function generates the value of a node based on the value of his parent. Notice that, there are 8 leaves in level 3, and then, every node value is exactly one entry in the Task Array. For example, the value at points to the entry . Now, consider the nodes in level 2, which are 4, exactly the half of those at level 3. Every vertex at this level, points to the first entry in a range of 2, i.e., has a value of 0, and has a value of 2. Thus, every vertex at level 1, points to the first entry of the two partitions of the Task Array, and therefore, points to the first entry of the whole of the Task Array.
For illustration purposes, suppose that there is one node with two threads and eight tasks to be executed. Accordingly, and . The CPN System for modeling this problem is shown in Figure 4.
Notice that accordingly to the proposed method, a full binary tree is obtained when for some , since, in this case, is exactly equal to . In the particular example, , since . Then, , with ; , with ; with ; , with ; since ; , since , and .
The following functions complement the CPN model:
, for . This marking represents the current state of every process, i.e., at the beginning, processes are at root node with a value 0. being 0 means that no task has been assigned to the current process.
, such that , where: , , , , , , , , , , , , , . These markings provide the information about the available for every subtree constructed from the node identified as .
such that , where: , , , , , , , . These markings provide the information about the availability of the task located at , i.e., if the task is available, 0 otherwise.
The color mapping is for , for and . That is, the places accept tokens of type O, the places accept tokens of type R, and accept tokens of type A. The conditional color mapping is for , for , and for and . The - and matrices for the net of this example are shown in Figure 5.
Thus, Figure 4 represents a CPN that captures the structure of a BDT and the behavior of the working threads over the tree of height . They represents the places and transitions for the traveling from the root down to a leaf, and the reverse way from the leaf up to the root, on the left and right side of the CPN, respectively. Additionally, the central places, marked as “R,” represent the “resources” in the system, i.e., the shared memory registers and the tasks.
Consider an initial token in place as described by the initial marking . Now, the binding for requires two tokens from , besides one initial token in , subject to conditions given by . Since in this case due to , then the required tokens from are and , where , assuming that the Task Array is full of tasks to be attended. Thus, the output of due to is , assuming .
The sketches of CPN in Figure 6 show the three main marking evolutions in the CPN subject to the binding functions . The marking evolution subject to from the current state is shown in (a). In (b), the tokens are taken by the depicted transition, since , then In (c), the transition has fired, then , since assuming that . The tokens remain the same, since . In (d), the marking evolution subject to from the current marking is updated to the marking as shown in (f), since the task reference represented by the marking is available as shown in (e). Notice that, this marking is updated to as shown in (f), as expected. In (g), the input state of the transitions subject to is represented with the marking . Then, it is binding by with the token , as illustrated in (h). Thus, the input token is updated to by , as well as, is updated to by , as shown in the section (i) of the figure.
One of the main advantages of modeling the task allocation problem by using CPN’s is the possibility of varying the parameters during the simulation process, as well as the flexibility of the net structure in order to cope with a greater number of task to be attended. One of the key parameters for the simulation of the problem is the number of tokens of type O, which represents the number of processes or threads in a specific problem. The other important parameter is the decision function α, which is related to the spreading of processes or threads through the tree, by updating of and functions. It is clear, for example, that the distribution of the processes or threads at every level in the tree structure directly influences the contention at the acquisition of the tasks.
The next section explains the simulation process that is possible to execute with the proposed modeling methodology and how these simulations can help explore different parameters in order to optimize the task allocation problem.
5. Simulation of the CPN model
This section shows how to simulate a CPN model as that obtained by the methodology detailed in the previous subsection. The simulation process allows to investigate the performance of the model under parametric variations. First, the general characteristics of the simulation framework are introduced. Then, the simulation and results of a CPN model, representing a problem with 1024 tasks, are given.
In order to simplify the construction of the CPN model by using the herein proposed methodology, and thus, simulate its behavior, a CPN tool is used . This environment allows editing, simulating and analyzing CPN models [36–42] and was successfully used for a variety of applications areas [44–48].
The model parameters that have to be considered are a number of initially available tasks , the height of the tree with respect to the number of tasks , and the decision function . For this example, those values are , , and for a given such that and where is a randomly generated number. These parameters are fixed during every simulation process. However, the number of threads is varying between simulations, with , for a total of 10 simulations runs.
In the simulation framework, there are different measurements of parameters that can be obtained. For illustrative purposes, in this example, the attention is focused on study the impact on the performance of the task allocation procedure when the number of processes, or working threads, is increasing.
Table 1 summarizes some results obtained from the simulations performed on the tools framework for a model with model described previously. The number of tasks was constant, while the number of processes was increased by a power of two, represented in the first column. The second column represents the average of events required per process to complete all the tasks. It provides a measure of the speed by increasing the number of processes. The third column represents the total number of readings to the task array. The forth column represents the total number of read collisions, i.e., when a process reads a task that was previously attended by other process. The fifth column represents the average of the reads to the task array executed by each process. The sixth column shows the average of the failed reads or collisions executed by each process. The seventh column shows the proficiency of the execution by the total amount of processes. Finally, the eighth column represents the average of total reads, including the failed ones, to the task array required per task.
|Processes (d)||Average events||Reads||Collisions||Reads per process||Fails per process||Proficiency||Read per task|
The plot in Figure 7 shows the average of event, in the simulation framework, that each process required for completing all the tasks in the array. Notice that, as the number of processes increases exponentially, the number of events per processes decreases almost logarithmically.
The plot in Figure 8 shows the average of reads that each process has performed to the task array for the completion of all the tasks. As in the previous figure, notice that the number of reads decreases almost logarithmically as the number of processes increases exponentially.
The plot in Figure 9 shows the average of collisions, i.e., a process’s read of a task that was previously attended by other process, as the number of processes increases. The figure shows that the maximum number of average collisions occurs when the number of processes is 32 in these experiments. The collisions, or failed reads, are highly influenced by the decision rule .
The plot in Figure 10 shows the proficiency of attending all the tasks in the array as the number of processes increases. The proficiency increases almost exponentially as the number of processes increases to 256. The ideal behavior is to double the proficiency as the number of processes also double. However, the collisions at reading the tasks undermine this proficiency, as expected. Notice that, the plot is logarithmic.
The plot in Figure 11 shows the average of work that each process has to do for completing a task, measured as the number of reads. This number slightly increases as the number of processes increases due to the growth in the number of collisions.
This work presented a framework, based on Colored Petri nets, for the modeling and simulation of task allocation problems, which arises in environments such as grid or cluster of computers. The framework allows the construction of complex problems in a compact way. Additionally, a simulation process of the constructed models permits the study of different key aspects of the allocation strategies. Some analysis that could be performed includes the impact of different decision rules of the processes for allocating the tasks, the effect of a greater number of processes in the contention of the task’s acquisition or the ration of the increased speed to the attention of tasks by a greater number of processes, among others. Additionally, the methodology allows with ease the construction of structures for an incremental model construction and its respective simulation process.
The proposed methodology allows with ease the extension to tree structures as well as tree structures with a greater number of tasks and their respective simulation process.