Region-failure information

## 1. Introduction

Wireless Sensor networks (WSNs) have recently received increased attractiveness driven by many mission-critical applications such as battlefield reconnaissance and homeland security monitoring. Mission critical here refers to networking for application domains whose infrastructure and operations are absolutely necessary for an organization to carry out its mission [1]. Therefore, the main feature that must be guaranteed by all networks running mission critical applications is the network continuity.

However, due to the nature of the deployment field, these networks are vulnerable to natural disasters such as earthquakes, tornadoes or floods. Moreover, they are also subject to physical attacks such as an Electro-Magnetic Pulse (EMP) attack and security breaches such as sinkhole and selective forwarding attacks [2]. Such real world events may happen in particular geographical areas and disrupt specific parts of the network. Therefore, the geographical layout of the network topology determines the impact of such events on the network’s connectivity.

Several contributions in the literature have addressed the failure modeling and survivability problems. The authors in [3, 4] tackled the single link failure problem in the logical topology. The authors in [5] focus on the dual link failure assumption. Most of these studies are based on a common assumption that failures are independent of their locations and randomly distributed across the network, which fails to reflect several real scenarios. Such real-world events have geographical nature, and therefore, the geographical structure of the network affects the impact of these events. Under such region failure scenarios, several network components within a geographically correlated region may be simultaneously destroyed, resulting in network holes, cuts (partitions) or even breakdown of the overall network connectivity as shown in Fig. 1. Therefore, it is essential to assess the vulnerabilities of mission critical networks to such region-failures.

Some research has been conducted to understand the impact of region failures on wired backbone networks such as [11–18]. On the other hand, the cut detection problem has been investigated by [19] and [20].

Recently, few studies have tackled the region-failure problem in wireless networks. The authors in [7–9] investigated the region- based connectivity issue in wireless networks and demonstrated the effect of the transmitting power on maintaining a region-based connectivity in the presence of single and multiple region failures. The authors in [10] proposed a more general Probabilistic Region Failure (PRF) Model to capture the key features of geographically correlated region failures. They also developed a framework to apply the PRF model for the reliability assessment of wireless mesh networks.

All of the aforementioned studies about regional-failures consider a worst-case cut as the cut which maximizes or minimizes certain performance metric (such as capacity) of the intersected links. However, such a definition is inadequate to capture many realistic situations where the faulty nodes may influence larger number of nodes rather than larger number of link-cuts. we claim that, using the number of failed links as the main criteria for defining the worst-case region cut underestimates the impact of a region failure on the overall network performance.

Therefore, in this chapter, we first introduce a new definition for a worst-case cut (partition) due to failure regions. Then, we identify the location of a disaster that would have the maximum impact on a test network using both definitions. Finally, we conduct a deeper analysis to understand the behavior of a single and dual region-failures on the network performance with mission-critical nodes. Our simulation results indicate that, current studies in regional failures under estimate the impact of the worst-case cut due to their dependence on a relaxed definition for the worst-case region-cut.

The rest of this chapter is organized as follows. Section 2 presents the problem investigation. Section 3 demonstrates our proposed scheme for identifying the worst-case cut under single and dual region-failures. Section 4 presents our experimental results. Finally, Section 5 concludes our chapter.

## 2. Problem investigation

Most of the available studies [21–23] in the literature consider that the failure probability of a node is independent of its location in the deployment area. Few studies [6–10] addressed the region failure problem of spatially correlated network nodes in the physical topology. Available studies focusing on region failures consider link-cuts due to a region-failure as the main criteria for identifying the worst-case region-failure and can be interpreted in terms of the network capacity and throughput. However, such fault scenario is inadequate to capture many realistic situations where the failure region may influence larger number of nodes rather than larger number of link-cuts as shown in Fig.2.

For the network shown in Fig. 2, suppose that all links have the same capacity. Then, it can be easily seen that the region failure in Fig. 2a leads to 10 link-cuts while the region failure shown in Fig. 2b leads to only 6 link-cuts. Based on the current studies, the region with the maximum number of link-cuts is identified as the worst-case cut (i.e the region in Fig. 2a). On the other hand, the failure region depicted in Fig. 2b disconnects the dashed region (8 nodes) from the rest of the network which has a greater impact on the network than the case shown in Fig. 2a which has only 6 disconnected nodes.

Therefore, we first propose a new model for the worst-case region-cut considering disconnected nodes (Node-Based) due to a region-failure as the major criteria for identifying the worst-case region-cut. In our proposed model, we may select a region with less number of link-cuts as the worst-case cut if it isolates a larger number of nodes. Moreover, in our proposed model mission-critical nodes are given more weight during the worst-case analysis process to reflect the importance of its service continuity.

## 3. Proposed model

The network model we consider here is composed of a base station, a sink node, and a set of distributed wireless sensor nodes. The base station can be inserted in any suitable place whether in the field or somewhere else. It is directly connected to the sink node through a wire or wireless link. The sink node is a wireless sensor with high capability in memory, processing, power, and wireless coverage. It works as an intermediate node between the base station and the other sensors. It receives commands from the base station and then conveys them to the deployed sensors. In addition, it collects data from the sensors and sends it to the base station. The other sensor nodes are categorized into mission-critical (MC) sensor nodes and regular sensor nodes. A mission-critical sensor node is a node that is responsible for sensing or reading mission-critical information such as a sensor node in the battle field. while the regular sensor node is any other node. These sensors which have limited capabilities in their battery-powers, memory, and processing are distributed all over the area of interest in such a way that any deployed node has at least one path to the sink node.

In the rest of this section we present the problem formulation of our proposed model in section 3.1. Then, in section 3.2 we introduce the routing process based on the above mentioned network model. Finally, we demonstrate the region failure analysis under single and dual region-failures in section 3.3.

### 3.1. Routing process

Nodes in a routing table are classified into three categories: 1) parent node, 2) sibling node and 3) child node. A parent node is a node in the transmission range of another sending node and having a hop count one less than the sending node. A sibling node is a node in the transmission range of another sending node and having the same hop count as the sending node. A child node is a node in the transmission range of another sending node and having a hop count one more than the sending node. After deploying sensor nodes into the network field, the routing tables of the underlying sensor nodes are established. To illustrate the process, a network is constructed in Fig. 3. The network has eight sensors: a sink node, two mission-critical sensors (*A* and *F*), and five regular sensors (*B*, *C*, *D*, *E*, and *G*).

The routing process is started when the sink node broadcasts a setup packet to all nodes within its transmission range. The setup packet contains several parameters including the number of hops *h* between the sending node and the sink node. Each sensor node receiving the setup packet from the sink node sets its value of *h* to the incremented value of the received *h* and marks the sink node as a parent node. For the network shown in Fig. 3a, each of the nodes *A*, *B*, and *C* in the solid circle marks the sink node as its parent. Then, every node with hop count of 1 broadcasts a newly constructed setup packet. For example, suppose that the node *B* sends its setup packet. In this case, the sink node (belongs to the third category), the sensor nodes *A* and *C* (belong to the second category), and the sensor nodes *D* and *E* (belong to the first category) mark the node *B* as a child, sibling, and parent, respectively. Each node in the first category (in this case, *D* and *E*) sets its value of *h* to 2 (i.e., the value of received *h* + 1). Each of the nodes *D* and *E* constructs its setup packets and broadcasts it to its neighbors. This broadcasting process is repeated hop-by-hop by the nodes of the first category until all the deployed sensor nodes establish their routing tables. This will end up with the Fig. 3b, where a headed-arrow represents a link between a parent node and a child and it starts from the parent and heads into its child. A line with no arrow-head refers to a link between two sibling nodes.

### 3.2. Problem formulation

When considering circular cuts, we assume that a disaster results in a cut of radius *r*, which is centered at [x,y]. We define the worst-case region-cut of a network topology *nt* as *WCC*(*nt*) which can be evaluated as follows.

Where *rw*_{x,y} is the weight function of the region-cut centered at (x,y) and can be evaluated as follows.

Where *c* is a constant that gives the disconnected nodes higher weight than link-cuts, *ds*_{x,y} is the set of disconnected sensor nodes due to the region-cut centered at (*x*,*y*), and *nw*(*s*_{i}) is the weight of the sensor node *s*_{i} that differentiates between a regular node and a mission-critical node.

*f s*_{x,y} is the set of failed sensor nodes within the region-failure centered at (x,y), and *pw*(*s*_{j}) is the weight of the sensor node *s*_{j} on the path from a source node *S* to the sink node. The *pw*(*s*_{j}) is used to distinguish between a node on a regular path or a mission-critical path. *br*_{x,y} is the set of paths (bridges) connecting a disconnected network partition (due to the region-cut centered at (x,y)) with a connected network section. *bw*(*b*_{i}) is the weight of the bridge *b*_{i}.

Equation 1 indicates that, the worst-case cut can be located by evaluating a weight function *rw*_{x,y} for each region cut and the region-cut with maximum weight is considered the worst-case region-cut.

Equation 2 evaluates the weight of a region cut centered at [x,y] by estimating the importance of three main factors: disconnected nodes, failed nodes, and link-cuts as described in Eq. 2.

The first term *nw*(*s*_{i}), we provide more weight to mission-critical nodes than regular nodes as indicated by Equation 3.

Where *RS* is the set of all regular sensor nodes in the network field and *MS* is the set of all mission-critical sensor nodes.

The second term

This weight function *pw*(*s*_{j}) is used to grant higher weights to the failed nodes within the region failure that are used in forwarding mission-critical information as described in Eq. 4.

Where *MP* is the set of mission-paths which we refer to as paths carrying mission-critical information, *SC* is set of children of the sensor *s*_{j} and *pn*_{k} is the number of parents of the sensor *s*_{k.}

In Eq.4, the path weight *pw*(*s*_{j}) is assigned a value of zero if a failed node is a regular node that is not located on a mission-path (i.e., it is not used to forward mission-critical information). on the other hand, if the failed node is mission-critical node it receives higher path weight.

Finally, if the failed node *s*_{j} is a regular node that is located on a mission-critical path, then the path weight is evaluated recursively by the summation of the ratio of path weights *pw*(*s*_{k}) of the child node *k* to the number of parent nodes *pn*_{k} of the child node *k*.

The last term *bw*(*b*_{i}) is the weight of the bridge *b*_{i}. This weight function reflects the importance of a failed link due to a region-cut. Here, a failed link can participate in more than one bridge. Hence, we are interested in calculating the For example, if all links on the failed bridge are not located on mission-path it is given a weight of 1.

On the other hand, if all links on the failed bridge are located on mission-path it is given the highest weight.

Finally, if not all of the links on the failed bridge are located on a mission-path, its weight is a percentage of the highest weight. This percentage can be evaluated by the ratio of the number of links carrying mission-critical information of the bridge

where *b*_{i} is a linking path (bridge) connecting a disconnected node with a connected nodes and passing through failed nodes. Each of the connected and disconnected nodes should have a direct link with a failed node. The bridge direction should be from a child toward a parent (i.e. any bridge linking two sibling nodes is excluded). *b*_{i}, and *b*_{i}.

To further illustrate the evaluation of the worst-case region-failure, we provide the following example shown in Fig. 4. In this example compare two different region failures as shown in Fig. 4c and Fig. 4d. Here fater, we also provide the corresponding analytical calculations.

For the network shown in Fig. 4a, to find the worst-case region-failure we have to apply the routing process discussed earlier to configure the routing tables by the category (parent, sibling, child) of each neighbor. The result of routing phase for the given network topology is depicted in Fig. 4b. As described earlier in this section, the headed-arrow represents a child node toward the head and a parent node toward the tail while a line without heads represents a sibling relationship. Hereafter in the following example, we assume that the value of the constant c is equal to 5.

For the failure *Region*1 shown in Fig. 4c, to find the worst-case region-cut, we apply Eq. 2 of our proposed model to get the weight of this region-failure as follows. For the calculation of the node weight *nw*(*s*_{i}) based on Eq. 3, each node in the disconnected partition of the topology is assigned a value of 1. Hence, *pw*(*s*_{j}) we use Eq. 4. As all the failed nodes within the region-failure are not carrying mission critical information, then *bw*(*b*_{i}) we use Eq. 5. Here we have 6 different paths (bridges) crossing the region-failure and none of these bridges is carrying mission-critical informations. Hence, each bridge will be given a weight of 1. Consequently,

Similarly, For the failure *Region*2 shown in Fig. 4d, we apply Eq. 2 to get the weight of this region-failure as follows. For the calculation of the node weight *nw*(*s*_{i}) based on Eq. 3,each node in the disconnected partition of the topology is assigned a value of 1 except each of the mission-critical nodes *A* and *B* will have the value of 32. Hence,

For the calculation of the path weight *pw*(*s*_{j}) we use Eq. 4. As the failed nodes within the region-failure are carrying mission critical information of node *A* and *B*, we have to find the weight of each node on a mission-path to the sink node. The estimation of the *pw*(*s*_{j}) is performed as follows. first, each mission-critical node such as *A* and *B* receives a weight of 32 based on equation 4. Since the node *B* has only two parent nodes (node *F* and node *E*), each of them will receive half of the weight of the node *B* (i.e 16). Node *D* will receive its *pw*(*D*) weight of 24 which is the sum of 8 (half of the *pw*(*F*) as node *F* has two parent nodes) and 16 (the *pw*(*E*) as node *E* has only one parent node) received from its child nodes *F* and *E*, respectively. In the same way, Node *G* will have a *pw*(*G*) weight of 8 ( half of the *pw*(*F*) of its child node *F*), Node *C* will have a *pw*(*C*) weight of 52 (the sum of *pw*(*G*), *pw*(*A*), and half of *pw*(*D*)), Node *H* will receive a *pw*(*H*) of zero since the node *H* is not on a mission-path, Node *I* will receive a *pw*(*I*) weight of 12 which is half of the *pw*(*D*) weight of its child node *D*, Node *J* will receive a *pw*(*J*) of 64 that is the sum of *pw*(*I*) and *pw*(*C*). This process continues in the same way up to the sink node as shown in Fig. 4d. Then, *z* without a *pw*(*z*) weight indicates that this node is not on a mission-path.

Finally, we calculate the bridge weight *bw*(*b*_{i}) for each bridge using Eq. 5.In Fig. 4d, we have 4 different paths (bridges) crossing the region-failure and all of them carrying mission-critical informations. Hence, each bridge will be given a weight of 5 according to Equation 4. Consequently, *rw*_{Region2} = 5 * 70 + 76 + 20 = 446.

Based on Eq. 1 of the proposed model, the region-failure with maximum weight will be chosen as the worst-case region-failure (*Region*2 in this example).

### 3.3. Region failure algorithms

In this section we introduce the main algorithms used to find out the worst-case region-failure under the single and dual region-failure scenarios in Section 3.3.1 and Section 3.3.2, respectively.

Hereafter, we present the main algorithms used in our model. The algorithm shown in Fig.5 finds all sensors within a region-failure. The algorithm depicted in Fig. 6 removes the failed sensors from the network topology (*nt*). The algorithm shown in Fig. 7 finds all paths between a given source/destination pair.

The algorithm shown in Fig. 5 demonstrates how to find sensor nodes located within the region-failure. The input parameters to this algorithm are the network topology *nt*, center (*i*,*j*) and radius *r* of the region-failure. In this algorithm, we first calculate the distance *d* from the center of region failure to the sensor node. If d is less than the radius of region-failure, then the sensor is located within the failed region.

The algorithm shown in Fig. 6 presents our strategy for removing a sensor node from the routing table (*rt*) of all sensors in the network field. The input parameters to this algorithm are the network topology *nt*_{1} and the list *rs* of sensor nodes to be removed. In this algorithm we perform the following steps. First, for each sensor node in the removed sensor list (*rs*) we search for that sensor id. If the sensor ID. is found in any routing table of a sensor node that belongs to the network topology, then it is removed from the routing table and the network topology is updated with the changes made.

The algorithm shown in Fig. 7 presents how to check if a sensor node *s* is still connected or not after a region-failure happens. In other words, we need to find all paths *p* from the given node *s* to the destination node *d* and add these paths to the empty list *paths* of paths. This is accomplished by checking first if the path already exists in the paths list *paths*. If it does not exist, a path search is initiated from each parent of the node *s*. If a path is found, the search is terminated, the found path is added to the paths list and the node *s* is considered a connected node. On the other hand, if no path exists from the parent node of node *s*, then a path search is initiated from the sibling node of node *s*. Finally, if no path is found from both the parent and the sibling nodes of node *s* the node is marked as disconnected node.

#### 3.3.1. Single region-failure

The algorithm shown in Fig. 8 demonstrates how to estimate the number of disconnected nodes due to a single region-failure. The input parameters to this algorithm are the network topology *nt*, the radius of the failure region *r*, the increment value Δ*r* for the radius *r*, the threshold of disconnected nodes *ns*_{th}, and the network field’s length *n f l* and width *nfw*. In this algorithm, we first generate a failure region with radius *r*. then we find all sensors located within the region by the algorithm shown in Fig. 5. Sensors located within the failure region are then removed from the routing tables of all nodes in the network topology by executing the algorithm shown in Fig. 6. For the rest of the remaining sensors, we investigate the availability of a path from each node to the sink node by carrying out the algorithm depicted in Fig. 7. If a path is found the node is marked as connected node and disconnected otherwise. then we calculate the number of disconnected nodes due to the failure region. The above mentioned scenario is repeated by incrementing the coordinates of center of the region-failure until the whole topology is scanned by region-failures. Finally, Equation 1 is applied to estimate the worst case region-failure.

#### 3.3.2. Dual region-failure

Under dual region-failure scenario, we propose the following algorithm shown in Fig. 9 to estimate the number of disconnected nodes within the given network topology.

The algorithm shown in Fig. 9 demonstrates our strategy to determine the number of disconnected nodes under dual region-failures scenario. The input parameters to this algorithm are the same as that of a single region failure shown in Fig. 8. In this algorithm, at the beginning, we follow similar steps to that used to find the number of disconnected nodes of a single region failure where we first generate a failure region with radius *r*. then we find all sensors located within the region by the algorithm shown in Fig. 9. Sensors located within the failure region are then removed from the routing tables of all nodes in the network topology by executing the algorithm shown in Fig. 6. The abovementioned steps are repeated for the second region-failure. Now, we have come up with a network topology without the failed nodes due to the dual region-failure. Then, we examine the path availability from each node in the network topology to the sink node by executing the algorithm depicted in Fig. 7. If no path is available from a node to the sink node the node is marked as a disconnected node. On the other hand, if a path is found the node is marked as connected node. then we calculate the number of disconnected nodes due to the failure region. The above mentioned scenario is repeated by iterating all the possible combinations of the regions centers until the whole topology is visited by region-failures. As the number of disconnected nodes due to dual region-failures is usually large, we use a threshold such that we get the dual region-failures that lead to a number of disconnected nodes greater than the predefined threshold.

Finally, Equation 1 is applied to estimate the worst case region-failure.

## 4. Results

In this section, we present our simulation results. In our simulations we consider the network topology shown in Fig. 10 in which node 0 is the sink node. The failure information of different region failures generated during our simulations results are shown in Table 1 and Table 2, respectively.

Region id | Failed Nodes IDs | No. of Failed links | No. of Disconnected Nodes |

1 | 12 | 4 | 8 |

2 | 11, 12 | 6 | 8 |

3 | 12, 13 | 6 | 7 |

4 | 2, 3 | 8 | 5 |

5 | 31, 32 | 11 | 3 |

6 | 13 | 3 | 3 |

7 | 7, 8 | 6 | 3 |

8 | 7 | 3 | 3 |

9 | 21 | 3 | 2 |

Region id | Failed Nodes’ IDs at Region 1 | Failed Nodes IDs at Region 2 |

1 | 6 | 2, 3 |

2 | 4 | 2, 3 |

3 | 1, 4 | 2 |

4 | 1, 4 | 2, 3 |

5 | 2 | 12 |

6 | 2 | 11, 12 |

7 | 2 | 12, 13 |

8 | 2, 3 | 5, 11 |

9 | 2, 3 | 12 |

10 | 2, 3 | 11, 12 |

11 | 2, 3 | 12, 13 |

12 | 5, 11 | 7, 8 |

13 | 12 | 7 |

14 | 12 | 7, 8 |

15 | 12 | 31, 32 |

16 | 11, 12 | 7 |

17 | 11, 12 | 7, 8 |

18 | 11, 12 | 31, 32 |

19 | 12, 13 | 31,32 |

Note that, due to the large number of region-failures generated, we present only region failures that lead to disconnecting more than two and eight nodes for the single and dual failure scenarios, respectively.

Hereafter, we investigate the location of the worst-case region-failure under Link-based, Node-based (without a mission critical node) and Node-based with a mission critical node in Fig. 12, Fig. 14 and Fig. 16, respectively. In the Node-based with a mission critical node, the node 28 is chosen as a Mission-Critical(MC) node.

The results shown in Fig. 11 show that, based on the traditional definition of the worst-case region cut, the failure region with id 5 is considered the worst-region cut as its failure leads to having the maximum number of failed links. The location of this failure-region is depicted in Fig 12. It is also notable that, according to Table 1 the failure of this region results in disconnecting 3 sensor nodes namely, node 33, 34, and 35 as depicted in Fig. 12.

The results shown in Fig. 13 clearly indicate that, in absence of mission-critical nodes within the network topology the proposed model is very clever to find out the worst-case region-failure. The worst-case region-failure is the region-failure that has the maximum impact on the network performance which is region 2 in this case as its failure results in disconnecting 8 sensor nodes, namely, 13, 14, 15, 16, 17, 18, 19 and 20. However, according to Fig. 11, the failure of this region leads to failure of 6 links only.

The results shown in Fig. 15 indicate clearly that, with the proposed model of a region-cut, introducing a mission-critical node into the network topology leads to a change in selecting the worst-case region cut which is region 4 as its failure results in disconnecting 5 sensor nodes, namely, 7, 8, 21, 23, and 28 as shown in Table 1. However, according to Fig. 11, the failure of this region leads to failure of 8 links.

Therefore, we claim that, using the failed links as the only criteria for defining the worst-case region cut is impractical as it ignores the case that the failure of some nodes may lead to failure of few links however its impact on the network is more severe due to disconnecting larger number of nodes. Moreover, it disregards the fact that some network nodes have higher priority than others.

Hereafter, we investigate the location of the worst-case dual region-failures without including a MC-node and with a MC-node in Fig. 17 and Fig. 18, respectively.

The locations of the worst-case dual region-failures shown in Fig. 17 indicate that, using the link-based approach (blue regions including nodes 11,12 and 31, 32) lead to cutting of 17 links and disconnecting 12 nodes. On the other hand, under the dual region failure scenario, using the proposed Node-based approach, the worst-case region-failure depicted in Fig. 17 (red regions including nodes 1,4 and 2) have only 8 link-cuts however, it results in a full disconnection of all the network nodes. This is due to the fact that, based on our approach, the worst-case region cuts are located near by the sink node and the failure of these nodes lead to isolating the sink node from the whole network. The results shown in Fig. 18 show that, by introducing the MC-node number 28, the worst-case dual region failure remains unchanged as the earlier case without MC-node. This is due to fact that, by introducing the MC-node, no further influence can affect the network as the maximum impact has been already happened by having a complete disconnected sensor network.

## 5. Conclusion

In this chapter, we introduced a new model for a worst-case cut (partition) due to failure regions. The proposed model takes into consideration the physical correlation among the locations of the network nodes and the possible priority of some nodes over the others. Based on the proposed model, We identified the location of a disaster that would have the maximum impact on a sample test network under single and dual region-failure scenario. Extensive Simulation results indicate that, using the number of failed links as the main criteria for defining the worst-case region cut, underestimates the impact of a region failure on the overall network performance.