Confidentiality and Integrity for IoT/Mobile Networks

This chapter discusses how to ensure confidentiality and integrity for data flows in IoT applications. While confidentiality can be addressed by access control, cryptography, or information flow analysis, integrity remains an open challenge. This chapter proposes using error-correcting codes to guarantee integrity, i.e., to maintain and assure the errorless state of data. Besides errors, many communication channels also cause erasures, i.e., cases where the receiver cannot decide which symbol the received waveform represents. The chapter proposes a method that corrects both errors and erasures together, and that is efficient in reducing memory storage as well as decoding complexity.


Introduction
It is estimated that the Internet of Things (IoT) will generate billions of dollars in profit for industries over the next two decades, and many organizations have started to develop and implement their own IoT strategies. IoT-enabled devices will generate and transmit so much data that security should be a top concern. IoT users require communication technologies that guarantee both efficiency and security. This chapter discusses how to guarantee two main properties of security, i.e., confidentiality and integrity, for IoT applications.

Confidentiality
Securing the data manipulated by information systems has been a challenge over the past few years. Several methods to limit information disclosure have been proposed, such as access control and cryptography. These are useful approaches: they can prevent confidential information from being read or modified by unauthorized users. However, they share a fundamental limitation: they do not regulate how information propagates after it has been released. For example, access control prevents unauthorized file access, but is insufficient to control how the data is used afterwards. Similarly, cryptography provides a shield for exchanging information privately across a nonsecure channel, but gives no guarantee about the confidentiality of private data after it is decrypted. Thus, neither access control nor encryption provides a complete solution for protecting the confidentiality of information systems.
To ensure confidentiality for an information system, e.g., an IoT system, it is necessary to show that the system as a whole enforces a confidentiality policy, i.e., by analysing how information flows within the system. The analysis must show that information controlled by a confidentiality policy cannot flow to a place where that policy is violated. Thus, the confidentiality policy we wish to enforce is an information flow policy, and the method that enforces it is information flow analysis.
Information flow analysis is a technique that has recently become an active research topic. In general, the approach of information flow security is based on the notion of interference [1]. Informally, interference exists inside a system when private data affect public data, e.g., an attacker might guess private data from observing public data. Noninterference, i.e., the absence of interference, is often used to prove that an information system is secure.
Noninterference is required for applications where users need their private data strictly protected. However, many practical IoT applications may leak minor information; such systems include password checkers, cryptographic operations, etc. For instance, consider an attacker trying to guess a password: even a wrong guess leaks secret information, i.e., it reveals what the real password is not. Similarly, there is a flow of information from the plaintext to the ciphertext, since the ciphertext depends on the plaintext. These applications are rejected by the definition of noninterference.
However, insecurity arises only when the leakage exceeds a specific threshold, or amount of interference. If the interference in the system is small enough, e.g., below a threshold given by a specific security policy, the system is considered secure. The security analysis that determines how much information flows from high level, i.e., secret data, to low level, i.e., public output, is known as quantitative information flow. It is concerned with measuring the leakage of information in order to decide whether the leakage is tolerable.
Qualitative information flow analysis, i.e., noninterference, aims to determine whether a program leaks private information or not; such absolute security properties always reject a program if it leaks any information. Quantitative information flow analysis offers a more general security policy, since it provides a method to tolerate a minor leakage: it computes how much information has been leaked and compares this with a threshold. By adjusting the threshold, the security policy can be adapted to different applications; in particular, if the threshold is 0, the quantitative policy reduces to the qualitative one. The idea of quantitative information flow analysis is discussed in detail in [2], one of our papers; readers can refer to it for more information.
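As a toy illustration of this thresholded policy (our own sketch, not taken from [2]), the leakage caused by a single password guess can be measured as the drop in Shannon entropy of the secret once the attacker observes the public accept/reject outcome. The 4-bit PIN and the uniform prior below are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def guess_leakage(n):
    """Expected information (bits) leaked by one guess against a secret
    uniform over n values: H(secret) - H(secret | guess outcome)."""
    prior = entropy([1.0 / n] * n)
    p_wrong = (n - 1) / n
    # Correct guess: the secret is fully known (entropy 0).
    # Wrong guess: n - 1 equally likely candidates remain.
    posterior = p_wrong * entropy([1.0 / (n - 1)] * (n - 1))
    return prior - posterior

# Toy model: a 4-bit PIN, i.e., 16 equally likely values.
print(f"leakage per guess: {guess_leakage(16):.3f} bits")
```

A policy with threshold 0 rejects this checker outright, while a quantitative policy can accept it if roughly a third of a bit per guess is deemed tolerable.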

Integrity
Integrity means maintaining and assuring the accuracy and completeness of data. However, during wireless transmission in IoT applications, messages can become erroneous for many reasons, e.g., attenuation, distortion, or the addition of noise. An error means the receiver cannot correctly decode the signal to obtain the right symbol. In order to protect data against errors, channel coding, i.e., error-correcting codes, is required. Error-correcting codes ensure proper performance of IoT systems: they ensure the integrity of communication links in the presence of noise, distortion, and attenuation [3][4][5][6]. The use of a parity bit as an error-detecting mechanism is one of the simplest and most well-known schemes used in digital communication. Data is partitioned into blocks, and to each block an additional bit is appended to make the number of 1 bits in the block, including the appended bit, even. If a single bit-error occurs within the block, the number of 1's becomes odd, which allows for the detection of single errors [7,8].
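The even-parity scheme just described can be sketched in a few lines (a minimal illustration, not tied to any particular standard):

```python
def add_parity_bit(block):
    """Append a bit that makes the total number of 1s even (even parity)."""
    return block + [sum(block) % 2]

def check_parity(block_with_parity):
    """Return True if no single-bit error is detected (even number of 1s)."""
    return sum(block_with_parity) % 2 == 0

codeword = add_parity_bit([1, 0, 1, 1])   # -> [1, 0, 1, 1, 1]
print(check_parity(codeword))             # no error detected
codeword[2] ^= 1                          # flip one bit in transit
print(check_parity(codeword))             # the single error is detected
```

Note that two bit-errors cancel out in the modulo-2 sum, which is why a single parity bit detects only an odd number of errors.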
Error-correcting codes are often applied in telecommunications. Many early applications of coding were developed for deep-space and satellite communication systems, in which signals from satellites and spacecraft are sent back to earth; the channel for such transmission is space and the earth's atmosphere. These communication systems not only have limitations on their transmitted power, but also introduce errors into weak signals due to solar activity and atmospheric conditions. Error-correcting codes are an excellent method to guarantee the integrity of these communication links; with them, most of the data sent could be correctly decoded here on earth. As examples, a binary (32,6,16) Reed-Muller code was used during the Mariner and Viking missions to Mars around 1970, and a convolutional code was used on the Pioneer 10 and 11 missions to Jupiter and Saturn in 1972. The (24,12,8) Golay code was used in the Voyager 1 and Voyager 2 spacecraft transmitting color pictures of Jupiter and Saturn in 1979 and 1980. When Voyager 2 went on to Uranus and Neptune, the code was switched to a concatenated Reed-Solomon code for its substantially more powerful error-correcting capabilities.
Block and convolutional codes are also applied in the Global System for Mobile Communications (GSM), the most popular digital cellular mobile communication system. Reed-Solomon and Viterbi codes have been used for nearly 20 years for the delivery of digital satellite TV. Low-density parity-check (LDPC) codes are now used in many recent high-speed communication standards, such as Digital Video Broadcasting-S2 (DVB-S2), WiMAX, and 10GBase-T Ethernet [9].
Most error-correcting codes, in general, are designed to correct or detect errors. However, in addition to errors, many channels cause erasures, i.e., cases in which the demodulator cannot decide whether the received waveform represents bit 0 or 1. Basically, decoding over such channels can be done by first deleting the erased symbols and then decoding the resulting vector with respect to the punctured code, i.e., the code in which all erased positions have been removed. For any given linear code and any given maximum number of correctable erasures, Abdel-Ghaffar and Weber [7] introduced a parity-check matrix yielding parity-check equations that do not check any of the erased symbols and which are sufficient to characterize the punctured code. This allows for the separation of erasures from errors to facilitate decoding. However, these parity-check matrices have too many redundant rows. To reduce decoding complexity, parity-check matrices with a small number of rows are preferred. This chapter proposes a method that builds such a matrix with a smaller number of rows.
Organization of the chapter: The rest of this chapter is organized as follows. Section 2 introduces the main ideas of error-correcting codes, errors, and erasures. Section 3 presents methods to construct a parity-check matrix that can correct both errors and erasures. Section 4 discusses a general solution for the covering design, which is used in the proposal. Finally, Section 5 concludes the chapter.

Codes, errors and erasures

Linear block codes
Let C be an [n, k, d] linear block code, i.e., C is a k-dimensional subspace of the n-dimensional vector space. The set of codewords of C can be defined as the null space of the row space of an r × n parity-check matrix H = (h_{i,j}) of rank n − k. Since a vector x is a codeword of C iff xH^T = 0, where the superscript T denotes the transpose, we can derive r parity-check equations,

PCE_i: h_{i,1}x_1 + h_{i,2}x_2 + … + h_{i,n}x_n = 0, for i = 1, 2, …, r.

An equation PCE_i(x) is said to check x in position j iff h_{i,j} ≠ 0.

Erasures
Sometimes, at the receiver, the demodulator cannot decide which symbol the received waveform represents. In this case, we declare the received symbol as an erasure. When the received codeword contains erasures instead of errors, the iterative decoding can be used [8].
Here, we summarize the iterative decoding procedure using an example of the (7,4,3) binary Hamming code. A vector x = (x_1, x_2, …, x_7) is a codeword iff xH^T = 0; hence, every codeword has to satisfy three parity-check equations, denoted A, B, and C. Assume that the received vector is **010*0, where an erased symbol is denoted by *. Equation A checks x_1, x_3, x_4, and x_5. If exactly one of these four symbols is erased, it can be retrieved from this equation; thus, x_1 = x_3 + x_4 + x_5 = 1. Similarly, we can derive x_2 = 1 and x_6 = 0 from Equations B and C. Therefore, the iterative decoding decides that the transmitted codeword is 1101000.
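The iterative procedure can be sketched as follows. The parity-check matrix below is an assumed cyclic form of the (7,4,3) Hamming code, chosen to be consistent with the worked example (its first row checks x_1, x_3, x_4, x_5); the chapter's own matrix may differ.

```python
# Assumed cyclic (7,4,3) Hamming parity-check matrix (illustrative choice).
H = [
    [1, 0, 1, 1, 1, 0, 0],   # Equation A: checks x1, x3, x4, x5
    [0, 1, 0, 1, 1, 1, 0],   # Equation B: checks x2, x4, x5, x6
    [0, 0, 1, 0, 1, 1, 1],   # Equation C: checks x3, x5, x6, x7
]

def iterative_erasure_decode(r, H):
    """Repeatedly fill any erasure (None) that is the only erased symbol
    in some parity-check equation, until no equation makes progress."""
    r = list(r)
    progress = True
    while progress and None in r:
        progress = False
        for row in H:
            erased = [j for j, h in enumerate(row) if h and r[j] is None]
            if len(erased) == 1:
                j = erased[0]
                # the erased symbol is the XOR of the other checked symbols
                r[j] = sum(r[k] for k, h in enumerate(row) if h and k != j) % 2
                progress = True
    return r

# Received **010*0 (erasures in positions 1, 2, and 6):
print(iterative_erasure_decode([None, None, 0, 1, 0, None, 0], H))  # recovers 1101000
```

Equation A fills x_1 first, Equation C then fills x_6, and a second pass over Equation B fills x_2, mirroring the derivation above.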
Iterative decoding is successful iff the erasures do not fill the positions of a nonempty stopping set. A stopping set is a set of positions in which no parity-check equation checks exactly one symbol in these positions. The performance of iterative decoding techniques for correcting erasures depends on the sizes of the stopping sets associated with the parity-check matrix representing the code. A parity-check matrix with redundant rows can benefit decoding performance, i.e., by eliminating small stopping sets, at the price of increased decoding complexity. More information on stopping sets can be found in [2,10,11].
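The defining condition of a stopping set is easy to test directly. The sketch below is our own illustration, using the standard (7,4,3) Hamming parity-check matrix as an assumed example:

```python
# Assumed example: the standard (7,4,3) Hamming parity-check matrix.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def is_stopping_set(positions, H):
    """A set of positions is a stopping set if no parity-check equation
    checks exactly one position in the set."""
    return all(sum(row[j] for j in positions) != 1 for row in H)

print(is_stopping_set([4, 5, 6], H))  # True: every row checks 0 or >= 2 of them
print(is_stopping_set([5, 6], H))     # False: the last row checks only one of them
```

If the three positions of the stopping set above are all erased, iterative decoding stalls: no equation pins down a single unknown symbol.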

Separation of errors from erasures
In this part, we discuss how to handle errors together with erasures. In this case, we can apply an algorithm using trials in which the erasures are replaced by 0 or 1, and the resulting vector is decoded by a decoder capable of correcting errors. For binary codes, two trials are sufficient [8,12].
For example, if C is a binary (n, k) code with Hamming distance d = 2t_ε + t_? + 1, then C can correct t_ε errors and t_? erasures. In the absence of erasures, C is able to correct up to t_ε + ⌊t_?/2⌋ errors. Let r be a received vector having at most t_ε errors and at most t_? erasures. Suppose the decoder constructs two vectors r_0 and r_1, where r_i is obtained by filling all erasure positions in r with the symbol i, i = 0, 1. Since C is binary, in either r_0 or r_1 at least half of the erasure locations hold the right symbols. Hence, either r_0 or r_1 has distance at most t_ε + ⌊t_?/2⌋ from the transmitted codeword, and any standard error-correction technique can be applied. If the decoder decodes both r_0 and r_1 to codewords, and these codewords are the same, then this is the transmitted codeword. If they are different, then there is one, and only one, vector requiring at most t_ε changes in non-erasure positions to become the right codeword. More information on this algorithm can be found in [6].
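A minimal sketch of this two-trial procedure for the binary (7,4,3) Hamming code (an assumed example with t_ε = 0 and t_? = 2, where a single-error syndrome decoder stands in for "any standard error-correction technique"):

```python
def syndrome(v, H):
    """Binary syndrome vH^T over GF(2)."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def hamming_correct(v, H):
    """Single-error syndrome decoder: a nonzero syndrome equals the
    column of H at the (single) error position."""
    s = syndrome(v, H)
    v = list(v)
    if any(s):
        v[list(zip(*H)).index(s)] ^= 1
    return v

def decode_with_erasures(r, H, t_err):
    """Two-trial decoding: fill the erasures (None) with all 0s, then all
    1s, error-decode both trials, and keep a candidate codeword that
    changes at most t_err non-erased positions of r."""
    erased = {j for j, x in enumerate(r) if x is None}
    for fill in (0, 1):
        trial = [fill if x is None else x for x in r]
        c = hamming_correct(trial, H)
        changes = sum(1 for j, x in enumerate(r)
                      if j not in erased and x != c[j])
        if changes <= t_err:
            return c
    return None  # decoding failure

# Assumed example: standard (7,4,3) Hamming matrix; d = 3 = 2*0 + 2 + 1,
# i.e., 0 errors together with up to 2 erasures.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
print(decode_with_erasures([None, None, 1, 0, 1, 0, 1], H, 0))
```

Returning the first candidate that passes the change-count test relies on the uniqueness argument above: at most one decoded vector requires no more than t_ε changes in the non-erasure positions.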
Abdel-Ghaffar and Weber proposed another way of decoding over such channels [7]. First, all erasures are deleted from the received message. Errors in the resulting vector are then corrected based on the punctured code, i.e., the code whose codewords consist of the symbols in positions which are not erased. After all errors have been corrected, the erasures are recovered by iterative decoding.
The decoder can compute a parity-check matrix for the punctured code after receiving the codeword. However, this leads to a time delay which is unacceptable, especially in IoT applications. To reduce the time delay, we can store the parity-check matrices of all punctured codes corresponding to all erasure patterns. The drawback of this solution is that it requires huge memory storage at the decoder.
Abdel-Ghaffar and Weber proposed using a separating matrix with redundant rows, providing enough parity-check equations which do not check any of the erased symbols and are sufficient to form a parity-check matrix for the punctured code obtained by deleting all erasures [7]. Having parity-check equations that do not check any of the erased symbols leads to the concept of separation of errors from erasures.
The basic concept of this decoding technique can be illustrated by the following example. We consider an (8,4,4) binary extended Hamming code with the parity-check matrix H shown in Figure 1.
A normal parity-check matrix has only four rows, such as the first four rows of this separating matrix; allowing redundant rows simplifies the decoding of erasures in addition to errors. Assume that we receive a vector r = 0*011000 with one erasure in the second position. Applying the decoding technique mentioned above, we first delete the erasure; the resulting vector is r' = 0011000. This vector r' can be considered a codeword of the (7,4,3) punctured code. In H, the first, second, and sixth rows have zeros in the second position, which means that the three corresponding parity-check equations do not check the erased symbol. Based on these three rows, we can form a parity-check matrix H' for the punctured code, as shown in Figure 2.
Using H', r' is decoded into 0011010. Putting back the erasure, we get 0*011010. The third row of H, which checks the erased symbol, can be used to recover the erasure. Thus, the decoded codeword corresponding to r is 01011010.
A normal parity-check matrix cannot be used for decoding both errors and erasures together. Decoding becomes feasible when we pay the price of storing a parity-check matrix with more rows than a normal one. In order to reduce the memory storage as well as the decoding complexity, a parity-check matrix with a small number of rows is preferred.
Given any linear code and any given maximum number of correctable erasures, Abdel-Ghaffar and Weber introduced separating matrices yielding parity-check equations that do not check any of the erased symbols and which are sufficient to characterize all punctured codes corresponding to this maximum number of erasures [7]. This allows for the separation of erasures from errors to facilitate decoding. However, their proposal yields separating matrices which typically have too many redundant rows. The rest of this chapter discusses an improved method, based on covering designs, to construct such separating matrices with a smaller number of rows.

Set separation
Let H = (h_{i,j}) of rank n − k be an r × n parity-check matrix of C, r ≥ n − k. Let S be a subset of {1, 2, …, n} and T be a subset of {1, 2, …, r}, and define H_S^T = (h_{i,j}), with i ∈ T and j ∈ S, to be the |T| × |S| submatrix of H. For the code C of length n, define C_S = {c_S : c ∈ C} to be the punctured code consisting of all codewords of C in which the symbols in positions indexed by S̄ = {1, 2, …, n} \ S are deleted. Clearly, C_S is a linear code over GF(q) of length n' = |S| and dimension k' ≤ k.

Definition 1 [7]: A parity-check matrix H of an [n, k, d] linear code C separates a set S of size |S| ≤ d − 1 iff the submatrix of H formed by the rows having zeros in all positions indexed by S has rank n − k − |S|.

Definition 2 [7]: If H separates all sets S of size l for a fixed l ≤ min{d, n − k} − 1, it is l-separating.

If H is an l-separating parity-check matrix of the code C, then based on H we can construct a parity-check matrix for any code punctured in up to a fixed number l of symbols. H has two features:

• H can separate erasures from errors, since H has enough parity-check equations that do not check any of the erased symbols and are sufficient to characterize the punctured code. This means that errors in the punctured code, formed by deleting the erased symbols, can be corrected using a submatrix of H.

• In case l ≤ min{d, n − k} − 1, H has no stopping set of size l or less. For any pattern of l or fewer erasures, not only are there enough parity-check equations that do not check any of the erased symbols to characterize the punctured code, but there is also a parity-check equation that checks exactly one of the erased symbols. This means that after all errors have been corrected, the erasures can be recovered by the iterative decoding procedure.
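Definitions 1 and 2 can be checked mechanically over GF(2). The sketch below is our own illustration: it tests whether enough rows avoid a given erasure set, using the standard (7,4,3) Hamming matrix (an assumed example) with and without redundant rows.

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank of a 0/1 matrix over GF(2), using integer bitmasks."""
    masks = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while masks:
        pivot = masks.pop()
        if pivot == 0:
            continue
        rank += 1
        lead = 1 << (pivot.bit_length() - 1)
        masks = [m ^ pivot if m & lead else m for m in masks]
    return rank

def separates(H, S, redundancy):
    """Definition 1: the rows of H with zeros in every position of S
    must have rank (n - k) - |S|."""
    avoiding = [row for row in H if all(row[j] == 0 for j in S)]
    return bool(avoiding) and gf2_rank(avoiding) == redundancy - len(S)

def is_l_separating(H, n, redundancy, l):
    """Definition 2: H separates every set of l positions."""
    return all(separates(H, S, redundancy) for S in combinations(range(n), l))

# Assumed example: the standard (7,4,3) Hamming parity-check matrix...
H_base = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
# ...extended with all nonzero GF(2) combinations of its rows (redundant rows).
def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

H_full = H_base + [xor(H_base[0], H_base[1]), xor(H_base[0], H_base[2]),
                   xor(H_base[1], H_base[2]),
                   xor(H_base[0], xor(H_base[1], H_base[2]))]

print(is_l_separating(H_base, 7, 3, 1))  # False: one position is checked by every row
print(is_l_separating(H_full, 7, 3, 1))  # True: redundant rows buy separation
```

The full-rank matrix fails already for l = 1, because one column has no zero entry at all, while adding redundant rows from the row space makes every single-erasure pattern separable.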

Separating matrix
Let H' be a full-rank parity-check matrix, and let S_i ⊆ {1, 2, …, n}, i = 1, 2, …, (n choose l), be the distinct subsets of {1, 2, …, n} of size l. For each i, it is trivial that H'_{S_i} has rank l, since l ≤ min{d, n − k} − 1. By elementary row operations on H', we can obtain, for each i = 1, 2, …, (n choose l), an (n − k) × n matrix H'_i of rank n − k such that its last n − k − l rows have zeros in the positions indexed by S_i (Figure 3).
Let H_I be the matrix whose rows are the union of the sets of the last n − k − l rows in H'_i, i = 1, 2, …, (n choose l); this matrix is l-separating, but it typically contains many redundant rows.

A more efficient separating matrix
In this section, we propose a method that constructs an l-separating matrix with a smaller number of rows. The method implements the idea of a covering design [13,14]. Basically, given 1 ≤ t ≤ u ≤ v, a (v, u, t) covering design is a collection of u-element subsets of V = {1, 2, …, v}, called blocks, such that each t-element subset of V is contained in at least one block; e.g., {1, 2} is contained in {1, 2, 3}. For our specific situation, consider an (n, b, l) covering design. Let B = {B_j} be a set of b-element subsets, 1 ≤ l ≤ b ≤ min{d, n − k} − 1, such that every l-element subset S_i is contained in at least one member of B, and assign each S_i to one block B_j that contains it. For any B_j, by elementary row operations on H', we can obtain an (n − k) × n matrix of rank n − k such that its last n − k − b rows have zeros in the positions indexed by B_j. After arranging columns, we obtain a matrix H^1_j with the format shown in Figure 5 (Step 1). Consider now a set S_i assigned to B_j; by further elementary row operations, H^1_j can be changed into a matrix such that rows l + 1, l + 2, …, b have zeros in the positions indexed by S_i, and rows b + 1, b + 2, …, n − k have zeros in the positions indexed by B_j. After column arrangement, we obtain a matrix with the format shown in Figure 6 (Step 2).
Following this method, if S_i and S_{i'} are assigned to the same block B_j, the last n − k − b rows of the corresponding matrices are the same, so they need to be stored only once per block. It follows that the matrix whose rows are the union, over j = 1, 2, …, |B|, of these shared last n − k − b rows, together with the rows l + 1, l + 2, …, b obtained in Step 2 for each subset S_i, is l-separating and has fewer rows than the construction of the previous section.

Covering design
Consider a (v, u, t) covering design, where 1 ≤ t ≤ u ≤ v.
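The defining property can be verified mechanically. The checker below is our own sketch (not the Approach 2 construction, whose steps depend on figures not reproduced here), shown on a known four-block (5, 3, 2) cover:

```python
from itertools import combinations

def is_covering_design(v, u, t, blocks):
    """Check the defining property of a (v, u, t) covering design:
    every t-element subset of {1, ..., v} lies in at least one block."""
    assert all(len(set(b)) == u and set(b) <= set(range(1, v + 1))
               for b in blocks)
    return all(any(set(s) <= set(b) for b in blocks)
               for s in combinations(range(1, v + 1), t))

# A small (5, 3, 2) cover with four blocks:
cover = [{1, 2, 3}, {1, 4, 5}, {2, 4, 5}, {3, 4, 5}]
print(is_covering_design(5, 3, 2, cover))      # every pair is covered
print(is_covering_design(5, 3, 2, cover[:3]))  # e.g., {3, 4} is uncovered
```

Exhaustive checking like this is feasible only for small parameters, which is exactly why general constructions such as the one proposed here are of interest.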
Example 1: The covering design problem has been investigated for many years. However, there is still no general optimal solution for all triples (v, u, t). In this section, we propose a covering design valid for all triples (v, u, t); this design is not optimal, but it gives a general solution for the problem. We can merge the last two subsets in each column (except the special column) because: (a) the first u − 1 elements in the last two subsets are also in another subset in the column, so any subset of size t formed by using these u − 1 elements can be formed by another subset in the column; and (b) any subset of size t containing {1, v − 1} or {1, v} can be formed by subsets in the special column.

Example 2: Given v = 9, u = 5, t = 4, we follow the first three steps of Approach 2. First, take {1} out of the set. Following Step 2, form all subsets of size 4 and arrange them into columns. Put {1} back into each subset of size 4 to obtain subsets of size 5; the subsets shown in boxes are the subsets which can be merged (Figure 8). Step 4 of Approach 2 then gives the result shown in Figure 9, and it is easy to see that any subset of size 4 can be formed by the subsets there.

Conclusion

This chapter has proposed a way to build a separating parity-check matrix with a smaller set of rows. This method reduces both decoding complexity and memory storage. Besides, we also present a covering design. This design is not optimal, but it gives a general solution for all triples (v, u, t).