
## Abstract

This chapter introduces novel applications of deep reservoir computing (RC) systems in cyber-security and wireless communication. RC systems are a class of recurrent neural networks (RNNs). Traditional RNNs are challenging to train because of vanishing and exploding gradients, whereas RC systems are easier to train and have shown similar or even better performance. Spatio-temporal correlations are essential in both the cyber-security and wireless communication domains, which makes RC models a good choice for exploring them. In this chapter, we explore the applications and performance of delayed feedback reservoirs (DFRs) and echo state networks (ESNs) in the cyber-security of smart grids and in symbol detection for MIMO-OFDM systems, respectively. DFRs and ESNs are two different types of RC models. We also introduce a spiking structure for DFRs, since spiking artificial neural networks are more energy efficient and more biologically plausible.

### Keywords

- recurrent neural networks
- reservoir computing
- delayed feedback reservoir
- echo state networks
- cyber-security
- smart grids
- MIMO-OFDM

## 1. Introduction

Smart grids are a new generation of power grids that provide more intelligent and efficient power transmission and distribution. However, smart grids are vulnerable to security threats unless properly protected. False data injection (FDI) attacks are the first and most common type of attack on smart grids. There are two major types of FDI attack: single-period (opportunistic) attacks and multi-period (dynamic) attacks. In a single-period attack, the adversary waits for an opportunity and then launches the attack instantaneously. In a dynamic attack, by contrast, the adversary drives the system toward its desired state gradually over time. Single-period attacks have been widely studied in the literature and are more easily detected by supervisory control and data acquisition (SCADA) systems. In this chapter, we focus on multi-period (dynamic) attacks [1, 2, 3, 4, 5].

State vector estimation (SVE) is the first technique used to tackle FDI detection in smart grids. However, SVE fails to detect stealthy FDI attacks with low magnitudes. In recent years, both supervised and unsupervised machine learning (ML) approaches have been proposed for FDI detection in smart grids, and ML-based techniques have generally performed better than SVE. However, the ML techniques proposed so far cannot capture the rich spatio-temporal correlations that exist between different components of smart grids. Therefore, in this chapter, we introduce spiking delayed feedback reservoirs (DFRs) for FDI detection in smart grids, as they are very energy efficient and can capture these spatio-temporal correlations. DFRs are an energy-efficient class of reservoir computing systems [6, 7, 8].

Figure 1 shows the structure of a reservoir computing (RC) system. RC systems have three layers: the input, reservoir, and output layers. The architecture of RC systems is based on recurrent neural networks (RNNs). However, unlike in conventional RNNs, the weights of the hidden (reservoir) layer are fixed and are not trained. The reservoir weights have to be initialized such that the echo state property is satisfied. In practice, the echo state property is commonly ensured by requiring that, in order to form a fading memory, the spectral radius (the largest absolute eigenvalue) of the reservoir weight matrix be less than 1. The spectral radius is a design parameter and plays an important role in the performance of RC systems. DFRs, echo state networks (ESNs), and liquid state machines (LSMs) are three different categories of RC systems. In all three, a recurrent network serves as the reservoir (or liquid), whose synaptic weights are fixed and require no training. The output weights are the only weights that require training in RC models, which reduces the computational complexity of RC models compared with traditional RNNs [9, 10, 11, 12].
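The role of the fixed, rescaled reservoir weights can be illustrated with a minimal sketch. It assumes a standard ESN-style update, x ← tanh(W x + w_in u), which may differ from the chapter's own state equation. Instead of computing eigenvalues, the sketch rescales W by its maximum absolute row sum, which is an upper bound on the spectral radius and therefore a sufficient (but stricter than necessary) condition. All function names and parameter values are hypothetical.

```python
import math
import random


def make_reservoir(n, scale=0.9, seed=0):
    """Random n x n reservoir matrix rescaled so its max absolute row sum equals `scale`."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # The max absolute row sum (infinity norm) bounds the spectral radius from above.
    norm = max(sum(abs(v) for v in row) for row in w)
    return [[scale * v / norm for v in row] for row in w]


def step(w, w_in, x, u):
    """One reservoir update x <- tanh(W x + w_in * u) for a scalar input u."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + wi * u)
            for row, wi in zip(w, w_in)]


def state_gap(n=20, steps=200, seed=1):
    """Drive two different initial states with the same input and return their final distance."""
    rng = random.Random(seed)
    w = make_reservoir(n)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xa = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xb = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for t in range(steps):
        u = math.sin(0.1 * t)  # arbitrary shared input signal
        xa, xb = step(w, w_in, xa, u), step(w, w_in, xb, u)
    return max(abs(a - b) for a, b in zip(xa, xb))


if __name__ == "__main__":
    print("distance between the two state trajectories:", state_gap())
```

Because the reservoir forgets its initial condition, the two trajectories become practically indistinguishable, so the state depends only on the recent input history. None of the weights in `make_reservoir` are ever updated; only a readout on top of the states would be trained.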

Equation (1) expresses the states of reservoir nodes,

where

where

The DFR is an RC system with a ring topology, in which a single artificial neuron and a delay loop together form the reservoir layer. There are multiple choices for the single artificial neuron of the DFR. In this chapter, we use a spiking neuron as the single nonlinear neuron of the DFR. Spiking neurons are one of several mathematical models of biological neurons. Spikes are the main signals that neurons in the brain use for communication. Hence, representing neuron activity as spikes tends to be more biologically plausible. Energy efficiency is another motivation for using spiking neurons. The TrueNorth chip consumes only 70 milliwatts (mW) to run 1 million spiking neurons with 256 million synapses [13, 14, 15]. The energy efficiency of spiking neural networks (SNNs) also makes them a suitable choice for hardware implementations of artificial neurons [16, 17].

So far, several models of spiking neurons, including the leaky integrate-and-fire (LIF) and Hodgkin-Huxley models, have been proposed to mimic the behavior of neurons in the brain [18]. LIF models have been used more commonly than other spiking neuron models because of their simplicity and ease of hardware implementation [19, 20]. A spiking neuron fires a spike as soon as a stimulating current applied to its membrane drives the membrane voltage above a certain threshold value. The relationship between the stimulating current and the membrane voltage is expressed as follows:

where *ohms* and

Figure 2 shows the topology of our proposed spiking DFR. The structure consists of multiple blocks. The input block receives the smart grid measurements, which must be encoded before being processed by the DFR. There are two major types of encoding schemes for spiking neurons, namely rate encoding and temporal encoding [22]. In rate encoding, the number of spikes fired by the neuron encodes the stimulus. In temporal encoding, the exact time at which a spike fires carries the information. Rate encoding has been studied extensively in the literature. However, recent studies have shown that temporal encoding schemes are more efficient than rate encoding schemes.

Several experiments have shown that temporal encoding is more likely to be the encoding scheme used by biological neurons. Neurons in the lateral geniculate nucleus, the retina, and the visual cortex respond to stimuli with millisecond (ms) precision. Temporal encoding schemes also compare favorably with rate encoding approaches in terms of computational complexity [23]. Therefore, in this chapter, we focus on temporal encoding schemes.
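The difference between the two encoding families can be made concrete with a small sketch. It assumes a stimulus normalized to [0, 1], a 100 ms encoding window, and a simple latency code as the temporal scheme. These choices, and the function names, are illustrative assumptions and are not the specific encoder used in this chapter.

```python
def rate_encode(value, window_ms=100.0, max_rate_hz=200.0):
    """Rate code: a value in [0, 1] becomes a number of spikes within the window."""
    value = min(max(value, 0.0), 1.0)
    n_spikes = round(value * max_rate_hz * window_ms / 1000.0)
    if n_spikes == 0:
        return []
    interval = window_ms / n_spikes
    # Spikes are evenly spaced; only their count carries the information.
    return [interval * (k + 0.5) for k in range(n_spikes)]


def latency_encode(value, window_ms=100.0):
    """Temporal (latency) code: a larger value produces a single, earlier spike."""
    value = min(max(value, 0.0), 1.0)
    return [(1.0 - value) * window_ms]


if __name__ == "__main__":
    for v in (0.1, 0.5, 0.9):
        print(v, len(rate_encode(v)), "spikes vs. one spike at", latency_encode(v)[0], "ms")
```

In this sketch the latency code represents each value with a single spike, whereas the rate code needs a spike count that grows with the value, which is one intuition for why temporal schemes can be cheaper to process.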

After the smart grid measurements are encoded, the encoded data are converted to an analog current. This current is then fed into the nonlinear node, which in our case is an LIF neuron. For each current signal, the LIF neuron generates the corresponding spike train, which then passes through a delay loop. The delay loop together with the LIF neuron forms the reservoir layer of the DFR. We repeat this process until the reservoir states corresponding to every smart grid measurement have been generated. The interspike intervals (ISIs) of each spike train are used as the training features for the readout layer [24]. In this chapter, a multi-layer perceptron (MLP) is used as the readout layer, and the features extracted in the reservoir layer are used to train it. Each sample is labeled according to its class: 1 for compromised samples and 0 for uncompromised samples.
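The current-to-spike-train-to-ISI part of this pipeline can be sketched as follows. The sketch assumes the textbook LIF form τ dV/dt = −V + R·I with a threshold and reset, integrated with a forward Euler step. The parameter values (10 MΩ, 10 ms, 15 mV threshold) are illustrative assumptions, and the delay loop and MLP readout are omitted.

```python
def lif_spike_times(current, r_m=10.0, tau_m=10.0, v_th=15.0, v_reset=0.0,
                    dt=0.1, t_end=200.0):
    """Euler simulation of tau_m * dV/dt = -V + r_m * I with threshold and reset.

    Units (assumed): current in nA, r_m in MOhm, voltages in mV, times in ms.
    """
    v = v_reset
    spikes = []
    for k in range(int(round(t_end / dt))):
        v += dt * (-v + r_m * current) / tau_m
        if v >= v_th:
            spikes.append((k + 1) * dt)  # record the spike time
            v = v_reset                  # reset the membrane after firing
    return spikes


def interspike_intervals(spikes):
    """Differences between consecutive spike times, used here as readout features."""
    return [b - a for a, b in zip(spikes, spikes[1:])]


if __name__ == "__main__":
    for i_in in (1.0, 2.0, 3.0):
        isi = interspike_intervals(lif_spike_times(i_in))
        print(f"I = {i_in} nA -> {len(isi)} intervals, first few: {isi[:3]}")
```

With these assumed parameters, a current whose steady-state voltage R·I stays below the threshold produces no spikes, and stronger currents produce shorter intervals, so the ISI sequence is a compact feature of the input level.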

Equation (4) expresses the governing equation for DFR,

where *N* equidistant delay units within the delay loop. Dividing the total delay into *N* equidistant delay units is expressed as follows:

where

DFRs have drawn a lot of attention because of their capability to map data from a low-dimensional space to a high-dimensional space. As shown in Figure 3, this mapping can make data that are not linearly separable in the low-dimensional space linearly separable in the high-dimensional space. Chaos theory, through Lyapunov analysis, has shown that delay systems can exhibit high-dimensional behavior if the delay value is tuned such that the system operates at the edge of chaos. The Lyapunov dimension of a chaotic delay system is directly determined by the delay value [25]. In this chapter, we examine the effect of the delay value on the performance of the DFR in detecting dynamic hidden attacks in smart grids.

In this chapter, we also look at symbol detection in multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems. In wireless communication systems, multicarrier access techniques are realized through OFDM, which converts a frequency-selective fading channel into multiple flat-fading subchannels [26, 27, 28]. Spectral efficiency, transceiver structure, channel capacity, and robustness against interference are all improved by applying OFDM in wireless communication systems [29, 30, 31, 32, 33]. MIMO systems are also used extensively in wireless communication systems including HSPA+ (3G), WiMAX (4G), and Long Term Evolution (4G LTE). MIMO improves the capacity of the wireless link by transmitting symbols over multiple paths. The combination of MIMO and OFDM is called a MIMO-OFDM system [34, 35, 36, 37, 38], and it has been shown to be very effective in exploiting the benefits of both.
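The conversion of a frequency-selective channel into flat-fading subchannels can be checked numerically with a minimal single-antenna sketch. It assumes a noiseless channel, a cyclic prefix at least as long as the channel memory, and a hypothetical three-tap impulse response. Under these assumptions each received subcarrier equals the transmitted symbol multiplied by one complex channel coefficient.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def ofdm_through_channel(symbols, taps):
    """Send one OFDM symbol through a multipath channel and return the received subcarriers."""
    n, cp = len(symbols), len(taps) - 1
    time = dft(symbols, inverse=True)   # subcarriers -> time-domain samples
    tx = time[n - cp:] + time           # prepend the cyclic prefix
    # Linear convolution with the channel impulse response.
    rx = [sum(taps[l] * tx[i - l] for l in range(len(taps)) if i - l >= 0)
          for i in range(len(tx))]
    return dft(rx[cp:cp + n])           # drop the prefix, return to subcarriers


if __name__ == "__main__":
    symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
    taps = [1.0, 0.5, 0.2j]             # hypothetical multipath channel
    received = ofdm_through_channel(symbols, taps)
    gains = dft(taps + [0.0] * (len(symbols) - len(taps)))
    for y, h, x in zip(received, gains, symbols):
        print(abs(y - h * x))           # per-subcarrier mismatch
```

The per-subcarrier relation y = h·x is what lets a conventional receiver equalize each subcarrier with a single division, provided the gains h are known accurately.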

To detect the transmitted symbols accurately at the receiver (Rx), it is essential to estimate the wireless channel state information (CSI) precisely [39, 40, 41]. CSI estimation is one of the major challenges of MIMO-OFDM systems. There are two major approaches to CSI estimation. The first leverages blind channel estimation to obtain the statistical properties of the channel [42]. The second is based on training symbols sent by the transmitter (Tx) and received by the Rx [29, 43, 44]. Training-based CSI estimation techniques have been adopted in many advanced communication systems, including 3GPP LTE/LTE-Advanced. In blind techniques, no computational overhead is incurred, but they are suitable only for channels that vary very slowly with time [45]. Training-based techniques can be applied to any channel regardless of its statistical properties. Therefore, learning-based techniques, including artificial neural networks, have been studied extensively in the literature [46, 47, 48] as wireless channel estimation mechanisms. RNNs have also been studied in [49, 50, 51, 52] for CSI estimation and symbol detection. Because conventional RNNs are difficult to train, we introduce echo state networks (ESNs) for symbol detection and CSI estimation in MIMO-OFDM wireless communication systems.

## 2. Problem formulation of smart grids attack detection

The state and topology of smart grids are the two major targets that are manipulated by the adversaries [53]. The state of the smart grids is the key factor in determining the measurements values. A linear function

where

where

where

The dynamic attack

where *A* is the magnitude of the attack, *cos* is the cosine function, and *N*(0,1) is a normally distributed vector with zero mean and unit variance.

MATPOWER is a publicly available toolbox [54] that can be used to simulate smart grids. In this chapter, we use MATPOWER to simulate the meters of a smart grid with 14 buses. There are 34 meters in total in an IEEE 14-bus smart grid. We assume that the level of access of the adversary, defined as the number of meters that the attacker can compromise, ranges from 0 to 34. The dataset used for training, testing, and validation is unbalanced, meaning that the numbers of compromised and uncompromised samples are not equal: 80% of the samples are uncompromised and 20% are compromised. In total, 10,000 samples for training and 10,000 samples for testing and validation are generated using MATPOWER.

## 3. Attack detection performance of DFR

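The performance metrics for evaluation are **accuracy** and **F1**, defined as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad F_1 = \frac{2\,TP}{2\,TP + FP + FN}$$

where *TP*, *TN*, *FP*, and *FN* are the numbers of true positives, true negatives, false positives, and false negatives, respectively, with compromised samples taken as the positive class.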
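Both metrics follow the standard confusion-matrix definitions. A minimal sketch is given below, assuming that compromised samples (label 1) are the positive class and that F1 is reported as 0 when there are no true positives, false positives, or false negatives. The 80/20 split mirrors the unbalanced dataset described in Section 2, and the two example detectors are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) with label 1 (compromised) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(y_true, y_pred):
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # convention: 0 when undefined


if __name__ == "__main__":
    y_true = [0] * 80 + [1] * 20                       # 80% uncompromised, 20% compromised
    always_clean = [0] * 100                           # detector that never raises an alarm
    partial = [0] * 75 + [1] * 5 + [1] * 15 + [0] * 5  # 5 false alarms, 15 of 20 attacks caught
    print(accuracy(y_true, always_clean), f1_score(y_true, always_clean))
    print(accuracy(y_true, partial), f1_score(y_true, partial))
```

On an unbalanced dataset, a detector that never flags an attack still reaches 80% accuracy while its F1 is 0, which is why both metrics are reported.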

Accuracy of attack detection for three different methods and magnitude of attacks, *A* = 0.1, 1, and 10.

To evaluate the performance of our proposed spiking DFR model, we compare our results with an MLP and an SNN. The MLP is trained using the backpropagation algorithm, and the SNN is trained using the precise spike driven (PSD) algorithm. PSD, introduced in [21], uses temporal encoding and learns the hetero-associations that exist in spatio-temporal spike patterns. As shown in Figures 4 and 5, spiking DFR + MLP outperforms both MLP and SNN in terms of **accuracy** and **F1**. This is because the spiking DFR + MLP can map the data from a low-dimensional space to a high-dimensional space and also captures the spatio-temporal correlations that exist between different components of smart grids. Based on our simulation results, the average **accuracy** of attack detection reaches **94.6%** when spiking neurons, a DFR, and an MLP are combined in a single platform. This improvement is observed for all attack magnitudes and numbers of compromised measurements. In our baseline model, where only SNNs are used, the average **accuracy** is **77.92%**. The hybrid spiking DFR and MLP model therefore improves the average **accuracy** by about **17 percentage points**. The **F1** measure shows an even larger improvement: the combination of spiking neurons, DFR, and MLP achieves an **F1** of **78%**, whereas the SNN trained with the PSD algorithm achieves about **25%** for dynamic attack detection, an increase of about **53 percentage points**.

### 3.1 Delay effect on the performance of DFR

As mentioned in Section 1, DFRs cannot exhibit high-dimensional behavior unless the delay value is tuned such that the DFR operates at the edge of chaos. In this section, we show that the delay value can significantly affect the performance of the DFR for hidden dynamic attack detection in smart grids. Figure 6 shows the performance of the DFR for different delay values. For a delay of 40 milliseconds (ms), spiking DFR + MLP achieves its highest **F1** and **accuracy**, whereas a delay of 10 ms yields the lowest performance. This observation implies that the spiking DFR + MLP operates at the edge of chaos and shows high-dimensional behavior only for a proper delay value. The phase portraits of the DFR for varying delay times are shown in Figure 7. Phase portraits track the dynamic behavior of a delay system and reveal whether it is periodic or chaotic. It is suggested in [25] that a dynamic system can show high-dimensional behavior if its delay is tuned properly. We also investigate the solution of the delay differential equation (DDE) to further explore the dynamic behavior of our model. As demonstrated in Figure 7, the DDE is used to model the dynamic behavior of the nonlinear function as the delay varies.

Figure 7 shows that varying the delay value can shift the behavior of the delay system from periodic, to the edge of chaos, to completely chaotic.
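How the delay alone can change the qualitative behavior of a delay system can be illustrated with a swapped-in textbook example, the Mackey-Glass equation dx/dt = βx(t−τ)/(1 + x(t−τ)ⁿ) − γx(t). This is not the spiking DFR model of this chapter, and its dimensionless delay values are unrelated to the millisecond delays in Figure 6. The parameter values below are the commonly used ones and, like the function names, are assumptions of this sketch.

```python
def mackey_glass(tau, beta=0.2, gamma=0.1, n=10, dt=0.1, t_end=800.0, x0=1.2):
    """Euler integration of dx/dt = beta*x(t-tau)/(1+x(t-tau)**n) - gamma*x(t)."""
    lag = int(round(tau / dt))
    xs = [x0] * (lag + 1)  # constant history on [-tau, 0]
    for _ in range(int(round(t_end / dt))):
        x_now, x_lag = xs[-1], xs[-1 - lag]
        xs.append(x_now + dt * (beta * x_lag / (1.0 + x_lag ** n) - gamma * x_now))
    return xs[lag:]        # samples for t >= 0


def tail_range(xs, fraction=0.25):
    """Spread of the last part of the trajectory; near zero means a steady state."""
    tail = xs[int(len(xs) * (1.0 - fraction)):]
    return max(tail) - min(tail)


if __name__ == "__main__":
    for tau in (2.0, 8.0, 17.0):
        print(f"tau = {tau}: tail range = {tail_range(mackey_glass(tau)):.4f}")
```

For a short delay the trajectory settles to a fixed point, while longer delays sustain oscillations that become irregular. Plotting x(t) against x(t−τ) for each run would give phase portraits in the spirit of Figure 7.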

### 3.2 Complexity analysis

In this section, we analyze the complexity of our approach in terms of training time. The computational complexity of the spiking DFR + MLP comes from calculating the states of the reservoir layer and updating the weights of the readout layer during training. In the spiking DFR model, the weights of the input and reservoir layers are fixed and are not trained, which makes DFRs significantly more computationally efficient than other types of RNNs. In traditional RNNs, all hidden layers must be trained, which makes them very difficult to train. Complexity is measured by the total number of floating-point operations (FLOPs). The training time of RC-based learning techniques corresponds to the complexity of the model as well [55]. To evaluate the computational complexity of our proposed model, we compare its training time with those of the baseline approaches, i.e., MLP and SNN. Table 1 presents the training times (complexity) of spiking DFR + MLP, MLP, and SNN.

Algorithm | Training time |
---|---|
Spiking DFR + MLP | 16.69 s |
MLP | 3.2 s |
SNN | 90 s |

As shown in Table 1, the SNN trained by the PSD algorithm has the highest computational complexity. The spiking DFR + MLP and the MLP rank second and third, respectively. As shown in Figure 2, the spiking DFR + MLP contains several additional building blocks, namely the temporal encoding, spike-to-current, and reservoir blocks, so its computational complexity is higher than that of a simple MLP. However, the superior detection performance of our model justifies its use as the attack detection platform in smart grids.

## 4. Reservoir computing-based symbol detection

### 4.1 Received signal

We assume there are

where

where

The channel model is defined according to the ray-tracing principle

where

### 4.2 Symbol detection framework

In symbol detection, we aim to estimate

where

In this way, we can rewrite the received signal model (14) as:

where

The symbol detection requires learning

Moreover, an input buffer can be incorporated to further improve the symbol detection performance as proposed in [31]. To this end, the input of RC at time

### 4.3 One layer learning

We consider the special case in which the output consists of only one layer. According to the dynamic equation of inner states, denoted as

where

where *p*th OFDM symbol. Specially, for the comb pilots,

For solving the problem (20),

or through an online version, such as gradient descent or recursive least squares [57]. For multiple output layers, training follows the same method as for multi-layer feed-forward neural networks, via the forward and backward propagation procedure [58].
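The online, gradient-descent option for a single linear output layer can be sketched as a least-mean-squares (LMS) update. The sketch assumes real-valued reservoir states and scalar targets generated from hypothetical "true" readout weights, whereas the symbol detection problem in this chapter involves complex-valued signals and its own objective (20). The function name, learning rate, and epoch count are illustrative assumptions.

```python
import random


def train_readout_lms(states, targets, lr=0.05, epochs=200):
    """Fit y ~ w . s by per-sample gradient descent on the squared error."""
    w = [0.0] * len(states[0])
    for _ in range(epochs):
        for s, y in zip(states, targets):
            err = sum(wi * si for wi, si in zip(w, s)) - y      # prediction error
            w = [wi - lr * err * si for wi, si in zip(w, s)]    # gradient step
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    true_w = [0.5, -1.0, 2.0, 0.25]  # hypothetical readout weights to recover
    states = [[rng.uniform(-1.0, 1.0) for _ in true_w] for _ in range(50)]
    targets = [sum(wi * si for wi, si in zip(true_w, s)) for s in states]
    print(train_readout_lms(states, targets))
```

Only the readout vector `w` is updated. The reservoir states are treated as fixed features, which is what keeps the training cost of this step low.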

### 4.4 Simulation results

Figure 9 compares the BER performance of the reservoir computing-based symbol detection methods, namely the simple echo state network (ESN) and the echo state network with windows (WESN), with that of the conventional methods, namely linear minimum mean squared error (LMMSE) and sphere decoding (SD). For the conventional methods, the CSI is obtained by LMMSE channel estimation [59, 60]. We also consider the impact of power amplifier (PA) non-linearity at the transmitter side. When the transmitted signal enters the nonlinear region of the PA, it suffers strong distortion, which can lead to poor BER performance. The figure shows that the learning-based methods perform best in the low-SNR regime and in the nonlinear region. This is because conventional methods rely on accurate CSI, which cannot be obtained in these two cases, while learning-based methods are more robust to such model mismatch.

## 5. Conclusion

In this chapter, the emerging applications of spiking DFRs and ESNs were explored. We introduced the combination of spiking neurons, DFRs, and MLPs as the main platform to detect FDI attacks in smart grids. Our simulation results showed that spiking DFR + MLP outperforms both SNN and MLP in terms of **accuracy** and **F1**. The combination of DFRs and spiking neurons is capable of mapping the data to a high-dimensional space and capturing the spatio-temporal correlations that exist between different components of smart grids. The effect of the delay value on the performance of the DFR was also studied in this chapter. We showed that DFRs can show high-dimensional behavior only for delay values that make them operate at the edge of chaos. The computational complexity of our model was also studied. In the use case of ESNs for MIMO-OFDM symbol detection, this learning-based framework can perform better than conventional channel model-based methods when the obtained channel information is imperfect or a model mismatch exists. The cost of learning is low, i.e., it does not require a large number of pilots, which permits the application of this technique in practical systems.

## Acknowledgments

The work of K. Hamedani, L. Liu, and Z. Zhou is supported in part by the U.S. National Science Foundation under grants ECCS-1802710, ECCS-1811497, CNS-1811720, and CCF-1937487.