Secure and Energy-Efficient Communication in IoT/CPS

Secure and energy-efficient routing remains an open research problem despite the plethora of routing protocols proposed in the literature. Most, if not all, routing protocols designed specifically for resource-constrained wireless devices follow the same perspective and have almost reached their maximum improvement. This chapter describes the design of a cross-layer secure multi-hop zone routing protocol (MZRP) and a hybrid energy-efficient medium access control (MAC) protocol that combines the benefits of carrier sense multiple access (CSMA) and time-division multiple access (TDMA). MZRP employs the artificial neural network (ANN) self-organizing map (SOM) algorithm, which is run at the coordinator or base station (BS) to divide the area into multi-level zones. Cluster heads (CHs) are then chosen in each zone using k-medoids. MZRP is more energy efficient than dual-hop and HT2HL: it extends the network lifetime using the hybrid MAC, and the security algorithm employed requires fewer key update messages.


Introduction
The architecture of the global network, the Internet, will expand drastically to accommodate the connections of billions of IP-enabled or non-IP smart wireless devices ("things") at the network edge, which ushers in the era of the Internet of Things (IoT). Depending on the application, the data model can be simple when small "things" are connected (e.g., wearables, smart home, etc.) or complex when big "things" are connected (e.g., automotive, smart city, etc.). Hence, new architectures and technologies have been developed in the fifth generation (5G) to meet the sky-rocketing growth of the IoT [1].
The non-IP network edge utilizes an IoT architecture supported by IPv6 over low-power wireless personal area networks (6LoWPAN) to achieve seamless integration of heterogeneous networks, which enables the interaction between the network edge "things" and the outside world, the Internet [2]. A typical 6LoWPAN network consists of a few hundred or even thousands of low-cost, tiny intelligent sensor nodes (SNs) embedded with sensing transducers, a microcontroller, and radio transceiver units, all of which are powered by a battery or another source of energy [3].
They form an ad hoc network among themselves and with a gateway or 6LoWPAN edge router, which acts as a centralized network coordinator or base station (BS). The BS refers to the gateway or 6LoWPAN edge router throughout this chapter. This type of network is known as a wireless sensor network (WSN), which forms the major hardware of the IoT. Different WSNs can be seamlessly connected by routers to provide tunnels to the backbone [4].
The SNs in many applications have no fixed power source. Hence, they are powered by batteries whose replacement may be infeasible or impossible. The WSN's lifetime depends on the lifetime of the batteries that operate the SNs. Thus, energy saving is the most important factor in designing WSNs, which calls for energy-efficient techniques to prolong the network lifetime of WSNs [5].
The WSN's lifetime maximization and the individual SN's energy conservation have been investigated by researchers, and several approaches have been proposed in the literature. The overall energy consumption of a SN is due to sensing, processing, and communication. The radio transceiver, which operates in transmit, receive, idle, and sleep modes, has the highest energy expenditure. The energy consumption in sleep mode or low power listening (LPL) is very small; hence, putting the radio transceivers of SNs in sleep mode can significantly increase the network lifetime, as will be shown later. This is governed by the performance of the protocols built across all layers of the open systems interconnection (OSI) protocol stack, especially the routing and medium access control (MAC) protocols within the SNs. Hence, routing and MAC protocols have a critical impact on the network lifetime of WSNs, and careful design of energy-efficient routing and MAC protocols is needed to avoid wasting energy and to use the available energy wisely.
In a single-hop scenario, SNs send or receive messages to/from the destination directly if they are within transmission range of each other. If they are far apart, i.e., the distance between a SN and the destination exceeds the transmission range, indirect multi-hop communication is chosen, where the message is relayed by several intermediate nodes from the SN to the destination.
A simple approach is flooding, where a SN sends a query packet for route discovery that propagates over the network until the destination is reached. However, this results in a broadcast storm because the message keeps propagating even after the route has been discovered. Thus, it is not suitable for energy-constrained SNs as the energy consumption is huge. Data fusion employing passive clustering is another approach that achieves energy savings by processing data in the network using machine learning and artificial intelligence (ML/AI) algorithms. Intuitively, this reduces the traffic and the energy consumption significantly.
Hierarchical clustering has been proven to be an effective method for energy optimization in which SNs are organized into clusters. Each cluster has a cluster head (CH), which aggregates the data from cluster member nodes (MNs), and the CHs communicate directly to the BS using single-hop routing or relay the data through intermediate nodes using multi-hop routing. This combines the proactive approach within cluster (intra-cluster) and reactive approach across clusters (inter-cluster).
In battery-powered SNs, the design of a low duty-cycle MAC protocol is crucial for energy savings. Most wireless technologies use CSMA as the channel access mechanism. The MAC layer of a SN is based on the IEEE 802.15.4 standard, which defines the specifications of the lowest two layers of the OSI protocol stack; that is, the physical (PHY) and MAC layers. The PHY layer of the IoT protocol stack is based on the IEEE 802.15.4 standard; however, the MAC layer is time-slotted channel hopping (TSCH) based on the IEEE 802.15.4e standard to meet the high robustness and reliability requirements of industrial applications [6]. Hence, these intelligent wireless devices provide the backbone infrastructure for cyber-physical systems (CPS) to monitor physical processes [1].
Distributing SNs across different places and connecting them via wireless links makes WSNs vulnerable to attacks, which can be internal or external. External attacks can be prevented using link-layer encryption and authentication with a global key (GK). However, internal attacks are difficult to defend against, since a compromised node inside the network is hard to detect: its identity can still be verified using its private key (PK).
Hence, routing and forwarding packets can cause security problems, as routing protocols are designed to exchange network topology between SNs to establish routes but not to provide defense against malicious attackers. Security in WSNs is still an open research area, as many questions remain unanswered. The existing network security approaches are either computationally complex or do not utilize resources efficiently, which makes the core security issue a big challenge for highly resource-constrained SNs.
The challenge is to design energy-efficient routing and MAC protocols that prolong the network lifetime of WSNs and that also consider aspects related to security and data mining using artificial intelligence (AI) and machine learning (ML) algorithms under resource constraints. Thus, a careful design of routing and MAC protocols is required to enhance the data collection, improve security, and increase the lifetime of WSNs in the thing-fog-cloud IoT architecture [6].
Our contribution is the design of a cross-layer secure multi-hop zone-based routing protocol (MZRP) and a hybrid energy-efficient MAC called X-SREM. MZRP combines hierarchical clustering routing with a hybrid MAC that switches between CSMA and TDMA based on data traffic. Low duty-cycle operation is achieved using cooperative strategies that shut down the radio transceivers of SNs at all times except during their assigned time slots. This design can achieve security and large energy savings by eliminating interference among densely deployed SNs, providing collision-free data transmission within clusters, and reducing network traffic by removing redundancy with data fusion algorithms.
The remainder of this chapter is organized as follows: Section 2 presents the related work. In Section 3, the problem is described. In Section 4, the models are defined. Section 5 describes our proposed cross-layer MZRP and hybrid MAC protocol. Section 6 presents the performance evaluation and analysis of the proposed protocol compared to two benchmark heterogeneous protocols, dual-hop and HT2HL. The conclusion of this research is presented in Section 7.

Related work
Hierarchical clustering routing and low duty cycle MAC protocols have been proven methods to reduce the energy consumption of SNs and to eliminate the interference among them. Heinzelman et al. [7] proposed low energy adaptive clustering hierarchy (LEACH) and LEACH centralized (LEACH-C) [8], and they are the reference routing protocols in homogeneous hierarchical clustering category. LEACH has provided the basis for many later studies.
For instance, threshold sensitive energy efficient sensor network protocol (TEEN) and adaptive TEEN (APTEEN) are proposed in [9,10], respectively. The election of CHs in these protocols is a random process. TEEN and APTEEN combine the hierarchical and data-centric approaches. Both reduce data transmission by performing data fusion at CHs, hence achieving large energy conservation. Hybrid energy efficient distributed clustering (HEED) was introduced in [11]. In HEED protocol, CHs are elected periodically based on the residual energy of the SNs and the intra-cluster communication cost which incorporates cluster radius to adjust the transmission power for inter-cluster broadcast. The previous protocols and their variants are classified as homogeneous scheme protocols.
The stable election protocol (SEP), proposed in [12], and distributed energy-efficient clustering (DEEC), proposed in [13], are the two well-known protocols for heterogeneous WSNs. In SEP and DEEC, the SNs are assumed to have different energy levels and processing power. Both assume that a fraction m of the total SNs are equipped with a factor α of additional resources compared to the normal SNs. Although both follow the same approach, DEEC considers the residual energy of SNs in the CH selection process whereas SEP does not. SEP and DEEC and their variants are classified as heterogeneous scheme protocols.
Alharthi and Johnson [14] suggested HT2HL to deal with some drawbacks in LEACH. Abu Shiba et al. [15] proposed CH-LEACH, which uses the k-means algorithm to select the CHs, and Park et al. [16] proposed an efficient CH selection using k-means to maximize the energy efficiency of WSNs. Abdul Latiff et al. [17] compared the performance of optimization algorithms for clustering in WSNs and validated that particle swarm optimization (PSO) achieves a better network lifetime. The CHs are responsible for performing data fusion in the manner of a typical artificial neural network (ANN), as we will discuss later.
The previous protocols, many of their variants [18], and a plethora of others in the literature are single-hop hierarchical clustering routing protocols for WSNs. The single-hop assumption is not realistic, as the sink might not be reachable by all CHs, or the network lifetime is drastically reduced because more energy is consumed over long distances. Multi-hop routing has more advantages and can solve this problem by choosing optimal routes in the tree formed between the leaves (CHs) and the root (BS).
Multi-hop LEACH finds paths using multi-hop approach like two-level LEACH [19] and MR-LEACH. In [20,21], the authors proposed new routing techniques to minimize the next hop distance which reduces the energy. Al-Sodairi and Ouni [22] proposed EM-LEACH for load balancing and extended lifetime of WSNs. Modified RPL was proposed by Alharthi et al. in [23].
Routing protocols based on intelligent algorithms such as ANN, fuzzy logic, PSO, genetic algorithm (GA), etc. have been surveyed in [24] for network lifetime optimization. Table 1 reviews some existing single-hop, multi-hop, and intelligent routing protocols for WSNs.

Problem statement
The design of energy-efficient routing and MAC protocols aims to maximize the network lifetime ($T_{net}$). The battery of a SN i will drain out under flow $f = \{f_{i,j}\}$ in a time ($T_i$). Hence, the objective is to minimize the total energy consumption $E_{tot}$. The energy consumption in one cycle is the sum of the energy in the active period ($E_a$), which depends on the radio mode (e.g., transmit, receive, or idle), and the energy in the sleep period ($E_s$). For n cycles, $E_{tot} = \sum_{i=1}^{n} (E_a + E_s)$. Most, if not all, previous protocols are based on LEACH and have inherited its drawbacks. Some drawbacks are highlighted in [14], and other important ones are discussed here. First, although enough CHs are elected in some rounds, other rounds elect more, fewer, or no CHs. Second, the randomness employed in the election of CHs leads to poor cluster separation and compactness. Finally, re-clustering consumes a large amount of energy, yet it is performed unnecessarily every round, so SNs with less remaining energy can be elected as CHs after 1/p rounds, where p is the percentage of CHs. This lowest-energy route will deplete the energy of the nodes along the path and may lead to network partitioning [20].
Data aggregation is simple and aims to reduce the number of messages. Hierarchical routing protocols play a significant role in choosing the SNs that perform data aggregation by constructing a loop-free spanning tree. Hence, the latency decreases as the path length decreases, and the accuracy and energy efficiency increase. The resilience of aggregation can be improved using hierarchical techniques, which provide defense against routing and aggregation attacks.

System models
The following section describes the network model for three-tier heterogeneous WSNs and defines the radio energy model which is used to compute the energy dissipated by the SNs.

Network model
Consider the three-tier thing-fog-cloud IoT architecture shown in Figure 1 and a WSN containing N nodes in tier-0 and fog nodes in tier-1. The nodes are distributed uniformly at random in a 2D grid or a square sensor field (area A = L × L) to monitor the environment.
Two or more types of SNs might exist inside WSNs; e.g., full function devices (FFD), reduced function devices (RFD), and the coordinator in Zigbee, and Class A, B, and C devices in LoRa networks. Let S be the set of FFD SNs such that S = {S1, S2, … , Si}, where Si denotes the i-th SN and 1 ≤ i ≤ N represents the decimal ID of the SN. A SN can be equipped with one or more sensors of the same or different types (e.g., temperature, pressure, etc.). The BS is static, located at the center, and has unlimited energy. The SNs are unaware of their location, remain fixed after deployment, and each has a unique ID as shown in Figure 2(a).
The network in tier-0 can be denoted as a graph $G = (V, E)$, where $V$ is a set of vertices representing the SNs such that $|V| = N$ and $E \subseteq V^2$ is a set of edges that represent the bidirectional wireless links between pairs of SNs $v_i$ and $v_j$. Hence, $(v_i, v_j) \in E$ if $v_i$ and $v_j$ are neighbors; i.e., the distance between them, $d(v_i, v_j)$, is less than the range of each other as shown in Figure 2(b):
$$d(v_i, v_j) < R$$
where $R$ is the communication range. We define $c_{i,j}(v_i, v_j)$ as the cost of energy on the edge between $v_i$ and $v_j$, assuming that the links are symmetric. That is, $c_{i,j}(v_i, v_j) = c_{j,i}(v_j, v_i)$.
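The neighbor relation above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `build_graph`, the field size of 100 m, the node count, and the range of 20 m are hypothetical values chosen for the example, not parameters taken from this chapter.

```python
import math
import random


def build_graph(positions, comm_range):
    """Return adjacency lists; i and j are neighbors when d(v_i, v_j) < R."""
    adj = {i: [] for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < comm_range:
                # Links are bidirectional, so the edge is stored in both lists.
                adj[i].append(j)
                adj[j].append(i)
    return adj


# Hypothetical deployment: 100 SNs placed uniformly at random in an L x L field.
rng = random.Random(1)
L = 100.0
nodes = [(rng.uniform(0, L), rng.uniform(0, L)) for _ in range(100)]
graph = build_graph(nodes, comm_range=20.0)
```

Because every edge is inserted in both directions, the resulting adjacency lists are symmetric, which matches the assumption of symmetric links.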
Let $G' = (C, E')$ be a subgraph, where $C$ is a set of vertices representing the CHs such that $|C| = k$ and $E' \subseteq C^2$ is a set of edges that represent the bidirectional wireless links between pairs of CHs $C_i$ and $C_j$. Let $P(v_i, v_j) = (v_i, v_1, \ldots, v_j)$ be the set of multiple paths in $G'$ from $C_i$ to $C_j$; the other FFD SNs are in sleep mode and neither take part in routing nor act as fog nodes at this moment, as shown in Figure 3.

Radio energy model
There are various models for radio energy consumption, though most of the researchers have used the energy model in [8]. The model for communication between SNs is depicted in Figure 4.
The energy consumed by the transmitter circuit to transmit a B-bit message over a distance d, $E_{Tx}(B, d)$, comprises the energy spent in the electronics and in the power amplifier. $E_{elec}$ depends on the electronic circuits of the transceiver used for digital coding, modulation, demodulation, and other functions. $E_{amp}$ is the energy dissipated in the power amplifier, and it depends on the power loss (PL) of the channel model used. The PL in the free-space and log-distance models can be described using the Friis equation as the ratio of the transmitted power to the received power ($P_t/P_r$), which depends on the wavelength λ, the product G of the equal gains of the transmit and receive antennas, the distance d between the antennas, and the PL exponent δ, which depends on the propagation environment. Table 2 gives the typical values of δ.
Hence, the power loss is proportional to $d^2$ in free space ($E_{fs}$) and to $d^4$ in two-ray or multipath fading ($E_{amp}$).
The values used are $E_{fs}$ = 10 pJ/bit/m² and $E_{amp}$ = 13 fJ/bit/m⁴. The energy dissipated for receiving a B-bit message is $E_{Rx}(B) = B \cdot E_{elec}$. The $E_{elec}$ of the transmitter and receiver circuits are equal and can be set to 50 nJ/bit. The CH and its MNs can communicate provided that the received power is greater than the sensitivity of the receiver.
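As a rough illustration of the radio energy model, the following Python sketch uses the constants quoted above. It assumes the piecewise form of the widely used first-order radio model, in which the $d^2$ term applies below a crossover distance $d_0 = \sqrt{E_{fs}/E_{amp}}$ and the $d^4$ term applies above it. The crossover distance and the function names `e_tx` and `e_rx` are assumptions made for this example and are not stated in this chapter.

```python
import math

E_ELEC = 50e-9   # J/bit, electronics energy of transmitter and receiver
E_FS = 10e-12    # J/bit/m^2, free-space amplifier coefficient
E_AMP = 13e-15   # J/bit/m^4, multipath amplifier coefficient (as quoted in the text)

# Assumed crossover distance at which the two amplifier terms are equal.
D0 = math.sqrt(E_FS / E_AMP)


def e_tx(bits, d):
    """Energy in joules to transmit `bits` bits over distance d (metres)."""
    if d < D0:
        return bits * (E_ELEC + E_FS * d ** 2)
    return bits * (E_ELEC + E_AMP * d ** 4)


def e_rx(bits):
    """Energy in joules to receive `bits` bits."""
    return bits * E_ELEC
```

With these constants the crossover distance is roughly 27.7 m, so a CH that is 50 m from its next hop already pays the $d^4$ cost, which is one motivation for relaying over shorter hops.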
The process of data aggregation follows the same concept as a feed-forward multilayer perceptron (MLP) in ANN or fuzzy logic, as depicted in Figure 5.
The input data $x_i$ $(x_1, x_2, \ldots, x_n)$ are measured by groups of SNs in each cluster. The CHs aggregate the inputs $x_i$ using quantization techniques to process the data locally, which results in quantized data $(q_1, q_2, \ldots, q_n)$ before sending them to the fusion center. A weight $w_i$ is assigned to each SN, together with a bias $b_k$, as described in Eq. (5).
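The aggregation step can be sketched in Python as follows. The sketch assumes the usual single-neuron form, a weighted sum of the quantized inputs plus a bias with an identity activation, and a uniform quantizer; both are assumptions for illustration and may differ from Eq. (5). The function names and the step size are hypothetical.

```python
def quantize(x, step):
    """Uniform quantizer: round x to the nearest multiple of `step`."""
    return round(x / step) * step


def aggregate(readings, weights, bias=0.0):
    """Weighted sum of quantized readings plus a bias (identity activation)."""
    if len(readings) != len(weights):
        raise ValueError("one weight is required per reading")
    return sum(w * q for w, q in zip(weights, readings)) + bias


# Hypothetical temperature readings from three member nodes, equally weighted.
raw = [20.2, 21.8, 24.1]
quantized = [quantize(x, 0.5) for x in raw]
fused = aggregate(quantized, [1 / 3] * 3)
```

With equal weights and zero bias, the fused value is simply the mean of the quantized readings; unequal weights would let a CH favor more trusted members.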
The energy consumption due to data aggregation ($E_{DA}$) is 5 pJ/bit. The CHs can perform sensing (e.g., temperature, pressure, etc.) and compare their readings with the data aggregated from other SNs. The sensed signal is analog and is converted to a binary value of D bits using an analog-to-digital converter (ADC). If the readings from the SNs fall in the interval [−M, M], they are aggregated, compressed, and sent to upper-level CHs or zone gateways, which have lower variance (i.e., $D_a \in [D_{a1}, D_{a2}]$ where $D_{a1} \ge -D$ and $D_{a2} < D$). Otherwise, they are identified as outliers since the readings do not match the aggregated confidence interval $[D_{a1}, D_{a2}]$.
The aggregated data are compressed into a single fixed-length packet by applying transforms such as the discrete wavelet transform (DWT) and the discrete cosine transform to the source bits, which achieve a signal-to-noise ratio (SNR) of 20 dB or better while reducing the number of transmitted bits. Relay nodes do not aggregate data, as the data they forward are less correlated.

Table 2. Typical values of δ.

Figure 5. Data aggregation process in AI model.

Security model
Wireless links are susceptible to passive eavesdropping and active impersonation. The former violates confidentiality by eavesdropping on exchanged messages without being detected, while the latter tries to inject erroneous messages, drop packets, or change routing information.
Symmetric cryptography or group key management is a common security scheme for WSNs. Each SN, as well as the sink, needs pairwise cryptographic keys for authentication. The global keys (GKs) are shared among the SNs in the network; however, private keys (PKs) are shared between the sink and individual SNs only.
To build the security mechanism, the confidentiality, integrity, and availability (CIA) triad model with passive attacks is assumed, as shown in Figure 6.

Proposed protocol
The proposed cross-layer secure MZRP and hybrid energy-efficient MAC (X-SREM) has three phases: the intelligent network configuration phase (INCP), the clustering and rotation of CHs phase (CRCP), and the steady-state phase (SSP). MZRP is a hybrid protocol that divides the sensing field into a multi-level hierarchy to solve the minimum transmit power topology control problem faced in multi-hop wireless networks. Each level is divided into zones using a self-organizing map (SOM) neural network, and zones are further divided into regions, which are composed of groups of clusters led by cluster heads (CHs). The BS calculates the proximity of all SNs; those that can communicate directly with the BS are assigned the zone-0 identifier and serve as border gateways (BGs), and the CHs in zone-1 forward their data directly to the zone-0 gateways. The CHs in the lower-level zones choose the least-cost multi-hop paths to forward their data to the BS.

Intelligent network configuration phase (INCP)
The initial phase is the network configuration, which begins with a "hello" message, carrying the TDMA schedule, sent from the BS to all SNs so that they advertise themselves. Based on the distance, neural network clustering using SOM is performed at the BS. The input to the neural network is a set of three-element vectors (a 3 × 100 matrix) that define the x- and y-coordinates of the SNs and their distance to the BS. The SOM layer is a 1-by-10 layer of neurons that classifies the SNs into zones in a 2D grid, as illustrated in Figure 7 and described in Algorithm 1.
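The zone assignment can be illustrated with a minimal one-dimensional SOM written in plain Python. This is a sketch and not Algorithm 1: the learning rate, the neighborhood width, the number of epochs, the linear decay schedule, and the field layout are all assumptions chosen for the example. Only the input format (x, y, and distance to the BS for 100 SNs) and the 1-by-10 layer follow the description above.

```python
import math
import random


def best_unit(weights, x):
    """Index of the neuron whose weight vector is closest to sample x."""
    return min(range(len(weights)), key=lambda u: math.dist(weights[u], x))


def train_som(samples, n_units=10, epochs=50, lr0=0.5, seed=0):
    """Train a 1-by-n_units SOM and return the neuron weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise each neuron from a randomly chosen sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    sigma0 = n_units / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs           # assumed linear decay schedule
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.5)
        for x in rng.sample(samples, len(samples)):
            bmu = best_unit(weights, x)
            for u in range(n_units):
                # Gaussian neighbourhood on the 1-D neuron lattice.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    weights[u][k] += lr * h * (x[k] - weights[u][k])
    return weights


# Hypothetical field: 100 SNs in a 100 m x 100 m area, BS at the center.
rng = random.Random(42)
bs = (50.0, 50.0)
nodes = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]
features = [(x, y, math.dist((x, y), bs)) for x, y in nodes]
som = train_som(features)
zones = [best_unit(som, f) for f in features]
```

Each SN is assigned the index of its best-matching neuron. Mapping these indices to zone levels (zone-0, zone-1, and so on) by distance to the BS is left out of the sketch.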

Clustering and rotation of CH phase (CRCP)
Once SNs are assigned to zones, the BS runs an ML algorithm (e.g., k-medoids) to determine the k CHs in each zone that minimize the cost function described in [17].
The BS broadcasts the "hello" message again to all SNs, this time including information about the elected CHs. Once the CHs are selected, the clusters are fixed for the entire network lifetime, and the members are sorted in a list, based on their distance from the BS, to become CHs if the energy level of the current CH falls below a threshold value defined as f2/k. The zones created around the BS based on the distance using SOM and the CHs selected using k-medoids are shown in Figure 8(a), and data aggregation is shown in Figure 8(b).
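The CH selection within a zone can be sketched with a simple alternating k-medoids procedure in Python. The sketch assumes that the cost is the sum of Euclidean distances from the members to their medoid, which stands in for the cost function of [17] and may differ from it. The function names and the optional `init` argument are hypothetical.

```python
import math
import random


def assign(points, medoids):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, init=None, max_iter=100, seed=0):
    """Alternate assignment and medoid update until the medoids stop changing."""
    rng = random.Random(seed)
    medoids = list(init) if init is not None else rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)   # keep a medoid that attracted no points
                continue
            # The new medoid is the member with the smallest total distance.
            best = min(members, key=lambda c: sum(
                math.dist(points[c], points[j]) for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Two well-separated groups of SNs; both initial medoids start in the first group.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
chs, clusters = k_medoids(pts, k=2, init=[0, 1])
```

Unlike k-means, the medoids are always actual SNs, which is what makes the method suitable for electing CHs. This alternating scheme can stop in a local optimum, so the result depends on the initial medoids.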

Steady-state phase (SSP)
The SSP breaks the data transmission into frames using hybrid TDMA/CSMA, where member nodes (MNs) send the data to their CHs in the time slots allocated to them. Hence, SNs sleep periodically with a duty cycle equal to $T_A/(T_A + T_S)$, where $T_A$ and $T_S$ are the active and sleep times, respectively. This is repeated periodically to save the SNs' energy.
The CHs aggregate the data and send them to the BS or zone gateway (ZG) using carrier sense multiple access (CSMA) and fixed spreading codes. Based on traffic, CHs switch between TDMA and CSMA to transmit the data. For instance, when the energy threshold is reached, the SNs with the highest energy will be elected as CHs, and this process continues to manage the routing using the AI algorithm in an entirely distributed fashion. The round structure is shown in Figure 9.
The flow chart of MZRP is shown in Figure 10. Single-hop communication takes place between the MNs and their CHs, and also between the CHs and the BS if the distance between them is less than the communication range. Otherwise, data are relayed from lower-level CHs through intermediate CHs at upper levels until they reach the sink, using the minimum-cost multi-hop route based on zone ID, remaining energy, and link quality with minimum hop count, as shown in Figure 11.
The cost of a link (i, j) is a function of $E_{res}$, the initial energy $E_{init}$ minus the energy consumption ($E_{Tx} + E_{Rx} + E_{DA}$) defined before, and of the constants e, q, and l. The route selection is described in Algorithm 2.

    …
        if (HOP of next element equals HOP of the CH) then
            add j to n_i
        end if
        j ← next element in the list
    end while
    sort n_i according to c_{i,j} in descending order
    NH ← top of the list n_i

Now, a logical key tree is produced using the topology key hierarchy (TKH) algorithm [25] based on the model shown in Figure 12. The algorithm is described below.
The total number of messages in a key tree of degree d is $\sum^{h-1}(\ldots) - 1$ [26]. The total cost of a key update is the sum of the cost of the revoked subtree (RS), $tree_i$, plus the sum of the costs of the subtrees ∉ RS. The model in Figure 12 is a novel mechanism for security in WSNs to defend against aggregation and routing attacks. The subtrees are rooted at T1, T2, and T3 or BGs.
The sink uses a public function $F$ to generate a key chain $K_i$, where $K_i = F(K_{i+1})$. Each SN stores an initial key $K_0$ before its deployment, where $K_0 = F^n(K)$. Hence, the sink's first message is encrypted using $K_1 = F^{n-1}(K)$ and transmitted. Once it has been received by all SNs, the sink reveals $K_1$, and each SN computes $F(K_1) = F(F^{n-1}(K))$ and validates that the result matches its initial key (i.e., $K_0$). Similarly, the consecutive keys can be revealed until $K_n = K$ is reached.
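The key chain described above can be sketched in Python. The text only requires a public function $F$; SHA-256 is used here as an assumed choice of one-way function, and the function names `make_chain` and `verify` are hypothetical.

```python
import hashlib


def F(key: bytes) -> bytes:
    """Public one-way function (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(key).digest()


def make_chain(secret: bytes, n: int):
    """Return [K_0, ..., K_n] with K_n = secret and K_i = F(K_{i+1})."""
    chain = [secret]
    for _ in range(n):
        chain.append(F(chain[-1]))
    chain.reverse()          # chain[i] is now F applied (n - i) times to the secret
    return chain


def verify(revealed: bytes, last_trusted: bytes) -> bool:
    """A SN accepts a revealed key if hashing it gives the last trusted key."""
    return F(revealed) == last_trusted


# The sink builds the chain; every SN is preloaded with K_0 only.
chain = make_chain(b"sink-secret", 5)
k0 = chain[0]
accepted = verify(chain[1], k0)      # the sink reveals K_1 after its first message
```

Since $F$ is one-way, a node that knows $K_0$ cannot compute $K_1$ in advance, yet it can check $K_1$ with a single hash once the sink reveals it. After a key is accepted, it becomes the trusted value for checking the next one.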
For further processing of the sensed data and when a decision has to be made, fog nodes have database storage to store the generated data temporarily and run AI/ML to decide what to do next; they then send commands to the BS to instruct it which action to take. The big data are sent to the cloud for storage and higher-power computation. Figure 13 illustrates the relationship between IoT, AI/ML, and big data.

Simulation
MATLAB R2019a is used to evaluate the proposed cross-layer MZRP and hybrid MAC protocol based on the model described in Section 4.1. The BS performs SOM to create the zones and k-medoids to elect CHs. Other simulation parameters are listed below. The average remaining energy in MZRP is higher than in dual-hop and HT2HL, as shown in Figure 14(a). The simulations of MZRP, dual-hop, and HT2HL are repeated, and the average values for the deaths of the first node, half of the nodes, and the last node (FND, HND, and LND) are shown in the histogram of Figure 14(b). MZRP outperforms dual-hop and HT2HL in terms of network lifetime; hence, it is more energy efficient.
The MAC protocol is responsible for the active/sleep schedule. The average power consumption of a SN can be approximated in terms of the cycle period $T_{cycle}$ (i.e., $T_A + T_S$), the current drawn in sleep mode $I_S$, and the battery voltage $V$ [27]. Figure 15(a) compares the remaining energy in CSMA and TDMA for different numbers of SNs accessing the channel. TDMA is better when the number of SNs increases, as it offers a collision-free medium: SNs transmit in their time slots and then turn their transceivers off. Thus, SNs have less active time in TDMA than in CSMA, as shown in Figure 15(b).
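The effect of the duty cycle on the average power can be illustrated with a small Python sketch. It assumes the common time-weighted form $P_{avg} = V (I_A T_A + I_S T_S) / T_{cycle}$, where the active-mode current $I_A$ is a symbol introduced for this example. The numerical values are hypothetical and are not the simulation parameters of this chapter.

```python
def duty_cycle(t_active, t_sleep):
    """Fraction of each cycle during which the radio is active."""
    return t_active / (t_active + t_sleep)


def average_power(v, i_active, i_sleep, t_active, t_sleep):
    """Time-weighted average power over one active/sleep cycle (assumed form)."""
    t_cycle = t_active + t_sleep
    return v * (i_active * t_active + i_sleep * t_sleep) / t_cycle


# Hypothetical node: 3 V battery, 20 mA when active, 1 uA asleep, 1% duty cycle.
dc = duty_cycle(0.01, 0.99)
p_avg = average_power(3.0, 20e-3, 1e-6, 0.01, 0.99)
p_always_on = average_power(3.0, 20e-3, 1e-6, 1.0, 0.0)
```

Under these assumed numbers the average power drops from 60 mW when the radio is always on to about 0.6 mW at a 1% duty cycle, which illustrates why a TDMA schedule with short active slots extends the lifetime.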
The TKH cryptographic algorithm [25] is used in MZRP, and the key management is static (i.e., no update) as long as re-clustering does not happen, that is, until the energy of the CHs reaches the threshold value. When new CHs are elected, the topology changes and the keys must be updated; hence, dynamic key management is then required. The TKH algorithm has a low cost compared to logical key hierarchy (LKH) and key link tree (KLT) [26], as shown in Figure 16.
Clearly, TKH requires the fewest key update messages; hence, the energy consumption of SNs using TKH is lower.

Conclusion
The cross-layer energy-efficient multi-hop zone-based routing protocol MZRP and hybrid MAC (X-SREM) are proposed in this chapter. AI/ML algorithms are at the heart of our design: SOM is used to divide the area into zones, the k-medoids algorithm is used for clustering, and ANN is used by the CHs to aggregate data. MZRP extends the network lifetime; hence, its performance is better than that of dual-hop and HT2HL. Load balancing is achieved through rotation of CHs within clusters when the energy level reaches the threshold value of 50% of the initial energy, since this has been tested to be the optimum threshold for re-clustering. The fog nodes provide data analytics near the network edge and relay big data to the cloud for further processing and storage. Further enhancing the network lifetime of MZRP is left as future work.