Owing to the recent growth of intelligent systems and their ability to adapt to varying conditions, deep learning has become very attractive to many researchers. In general, neural networks are used to implement different stages of processing systems based on learning algorithms that adjust their weights and biases. This chapter introduces neural network concepts and describes the major elements that make up a network. It also describes different types of learning algorithms and activation functions, with examples. These concepts are then illustrated in standard applications. The chapter will be useful for undergraduate students, and also for postgraduate students who have a basic background in neural networks.
- neural network
- digital signal processing
- supervised learning
- unsupervised learning
- time series
1. Introduction
The artificial neural network is a computing technique designed to simulate the way the human brain solves problems. In 1943, McCulloch, a neurophysiologist, and Pitts, a logician, published a seminal paper titled “A logical calculus of the ideas immanent in nervous activity” in the Bulletin of Mathematical Biophysics, in which they explained how the brain works and how simple processing units, the neurons, work together in parallel to make a decision based on the input signals.
The similarity between artificial neural networks and the human brain is that both acquire their skills in processing data and finding solutions through training.
2. Neural network’s architecture
To illustrate the structure of the artificial neural network, we must first look at the anatomy and function of the human brain.
The human brain consists of tens of billions of computing units, called neurons, which work in parallel and exchange information through their connectors, the synapses. Each neuron sums all the information coming into it, and if the result exceeds a given threshold potential, it fires an action potential, sending a pulse via its axon to the next stage. The anatomy of a human neuron is shown in Figure 1.
In the same way, an artificial neural network consists of simple computing units, the artificial neurons, and each unit is connected to other units via weighted connections. Each unit calculates the weighted sum of its inputs and produces its output using a squashing function, or activation function. Figure 2 shows the block diagram of an artificial neuron.
Based on the block diagram and function of the neural network, three basic elements of the neural model can be identified:
- Synapses, or connecting links, each characterized by a weight or strength: the input signal connected to the neuron is multiplied by the synaptic weight.
- An adder for summing the weighted inputs.
- An activation function to produce the output of the neuron. It is also referred to as a squashing function, in that it squashes (limits) the amplitude range of the output signal to a finite value.
The bias has the effect of increasing or decreasing the net input of the activation function, depending on whether it is positive or negative, respectively.
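Mathematically, the output of the neuron can be described as

$$y = \varphi\left(\sum_{i=1}^{n} w_i x_i + b\right) \qquad (1)$$

where
- $x_1, x_2, \dots, x_n$ are the input signals.
- $w_1, w_2, \dots, w_n$ are the respective weights of the neuron.
- $b$ is the bias.
- $\varphi(\cdot)$ is the activation function.

To clarify the effect of the bias on the performance of the neuron, the output given in Eq. (1) is computed in two stages. The first stage forms the weighted sum of the inputs, which is denoted as $u$:

$$u = \sum_{i=1}^{n} w_i x_i \qquad (2)$$

Then, the output of the adder is given in Eq. (3):

$$v = u + b \qquad (3)$$

and the output of the neuron is

$$y = \varphi(v) \qquad (4)$$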
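The two-stage computation described above (weighted sum, then bias, then activation) can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the function names `neuron_output` and `logistic`, the logistic activation, and the numeric inputs and weights are all assumptions chosen for the example.

```python
import math

def neuron_output(inputs, weights, bias, activation):
    """Return activation(sum_i w_i * x_i + bias) for a single artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    u = sum(w * x for w, x in zip(weights, inputs))  # adder: weighted sum of inputs
    v = u + bias                                     # bias shifts the net input
    return activation(v)

def logistic(v):
    """Assumed activation for this example: the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-v))

x = [1.0, 2.0]        # illustrative input signals
w = [0.5, -0.25]      # illustrative weights; the weighted sum is 0 here
y_no_bias = neuron_output(x, w, 0.0, logistic)
y_pos_bias = neuron_output(x, w, 2.0, logistic)
print(y_no_bias, y_pos_bias)
```

With these values the weighted sum is zero, so the first output sits at the midpoint of the sigmoid, and a positive bias raises the net input and therefore the output.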
3. Types of activation function
The activation function defines the output of a neuron as a function of the adder's output $v$. The following sections describe the different activation functions:
3.1. Linear function
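Here the neuron output is proportional to the input, as shown in Figure 5, and it can be described by

$$\varphi(v) = k\,v \qquad (5)$$

where $v$ is the adder's output and $k$ is a constant of proportionality.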
3.2. Threshold (step) function
This activation function is shown in Figure 6, where the output of the neuron is given by

$$\varphi(v) = \begin{cases} 1, & v \ge 0 \\ 0, & v < 0 \end{cases} \qquad (6)$$

In neural computation, such a neuron is referred to as the McCulloch-Pitts model in recognition of the pioneering work done by McCulloch and Pitts (1943); the output of the neuron takes on the value of 1 if the induced local field $v$ of that neuron is nonnegative and 0 otherwise. This statement describes the all-or-none property of the McCulloch-Pitts model.
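As an illustration of the all-or-none property, a threshold neuron with hand-picked weights can act as a logic gate. The sketch below is an assumption-laden example rather than part of the original model description: the weights (1, 1) and the biases −1.5 and −0.5 are chosen by hand, and the names `mp_neuron`, `logic_and`, and `logic_or` are hypothetical.

```python
def step(v):
    """Threshold activation: 1 if the induced local field is nonnegative, else 0."""
    return 1 if v >= 0 else 0

def mp_neuron(inputs, weights, bias):
    """McCulloch-Pitts style neuron: threshold applied to weighted sum plus bias."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(v)

def logic_and(a, b):
    # Fires only when both inputs are 1 (1 + 1 - 1.5 >= 0).
    return mp_neuron([a, b], [1, 1], -1.5)

def logic_or(a, b):
    # Fires when at least one input is 1 (1 - 0.5 >= 0).
    return mp_neuron([a, b], [1, 1], -0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, logic_and(a, b), logic_or(a, b))
```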
3.3. Sigmoid function
The most common type of activation function in neural networks is the sigmoid function, described by

$$\varphi(v) = \frac{1}{1 + e^{-v}} \qquad (7)$$

Figure 7 shows the sigmoid activation function. This function is clearly nonlinear, and it can produce an analogue output, unlike the threshold function, whose output takes only the discrete values 0 and 1.
Also, the output of the sigmoid activation function is bounded between 0 and 1, which gives it an advantage over the linear activation function, whose output ranges from $-\infty$ to $+\infty$.
3.4. Tanh activation function
This activation function shares the advantages of the sigmoid function, but its output ranges between −1 and 1, as shown in Figure 8.
The output is described by

$$\varphi(v) = \tanh(v) = \frac{e^{v} - e^{-v}}{e^{v} + e^{-v}} \qquad (8)$$
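The four activation functions of this section can be compared side by side with a short Python sketch. The function names and the slope parameter `k` of the linear function are assumptions made for illustration; the sigmoid is taken to be the standard logistic function.

```python
import math

def linear(v, k=1.0):
    """Linear activation: output proportional to the input (k is an assumed slope)."""
    return k * v

def threshold(v):
    """Step activation: 1 for nonnegative input, 0 otherwise."""
    return 1.0 if v >= 0 else 0.0

def sigmoid(v):
    """Logistic sigmoid: smooth, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))

def tanh(v):
    """Hyperbolic tangent: smooth, bounded between -1 and 1."""
    return math.tanh(v)

for v in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"v={v:+.1f}  linear={linear(v):+.3f}  step={threshold(v):.0f}  "
          f"sigmoid={sigmoid(v):.3f}  tanh={tanh(v):+.3f}")
```

The printed table shows the unbounded linear output next to the bounded sigmoid and tanh outputs. The two bounded functions are related by tanh(v) = 2·sigmoid(2v) − 1, so tanh can be viewed as a rescaled, zero-centred sigmoid.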
4. Neural network models
The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network. Three main models can be identified for neural networks; the two feedforward models are described below.
4.1. Single-layer feedforward neural network
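In a layered neural network, the neurons are organized in the form of layers. The simplest structure is the single-layer feedforward network, which consists of input nodes connected directly to a single layer of neurons. The node outputs are determined by the activation function, as shown in Figure 9.
Mathematically, the inputs are represented as a vector $\mathbf{x}$ with dimensions of $n \times 1$, the weights as a matrix $\mathbf{W}$ with dimensions of $m \times n$, and the outputs as a vector $\mathbf{y}$ with dimensions of $m \times 1$, as given in Eq. (9):

$$\mathbf{y} = \varphi\left(\mathbf{W}\mathbf{x} + \mathbf{b}\right) \qquad (9)$$

where $n$ is the number of inputs, $m$ is the number of neurons, $\mathbf{b}$ is the $m \times 1$ vector of biases, and $\varphi(\cdot)$ is applied element-wise.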
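The matrix form of a single-layer network can be sketched as follows, assuming a weight matrix with one row per neuron, a column of inputs, and one bias per neuron. The function name `layer_forward`, the tanh activation, and the numeric values are illustrative assumptions.

```python
import math

def layer_forward(W, x, b, activation):
    """Single-layer forward pass.

    W: list of m rows, each holding n weights (one row per neuron)
    x: list of n inputs
    b: list of m biases
    Returns a list of m outputs, activation(W x + b) applied element-wise.
    """
    if len(W) != len(b) or any(len(row) != len(x) for row in W):
        raise ValueError("shape mismatch between W, x, and b")
    return [activation(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

W = [[1.0, 0.0, -1.0],   # neuron 1 weights
     [0.5, 0.5, 0.5]]    # neuron 2 weights
x = [2.0, 1.0, 2.0]      # three inputs
b = [0.0, -1.5]          # one bias per neuron
y = layer_forward(W, x, b, math.tanh)
print(y)
```

Here three inputs feed two neurons, so the weight matrix is 2 × 3 and the output has two elements.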
4.2. Multilayer feedforward neural network
The second class of feedforward neural network is distinguished by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons, as shown in Figure 10.
By adding one or more hidden layers, the network is enabled to extract higher-order statistics from its input.
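One way to see what a hidden layer adds is the XOR function, which a single threshold neuron cannot compute but a network with one hidden layer can. The sketch below uses hand-set weights rather than trained ones; the weights, biases, and names (`layer`, `xor_network`) are assumptions chosen for illustration.

```python
def step(v):
    """Threshold activation."""
    return 1 if v >= 0 else 0

def layer(inputs, weights, biases):
    """Forward pass through one layer of threshold neurons."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hidden layer: neuron 1 acts as OR, neuron 2 acts as AND (hand-set values).
HIDDEN_W = [[1, 1], [1, 1]]
HIDDEN_B = [-0.5, -1.5]
# Output neuron fires when OR is on and AND is off.
OUTPUT_W = [[1, -1]]
OUTPUT_B = [-0.5]

def xor_network(x1, x2):
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]

truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```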
5. Neural network training
Training a neural network is the process of adjusting the values of its weights and biases so that the network performs the desired function correctly.
Learning methods, or algorithms, can be classified into the following categories:
5.1. Supervised learning
In supervised learning, the data are presented as pairs (input, desired output), and the learning algorithm adapts the weights and biases according to the error signal between the actual output of the network and the desired output, as shown in Figure 11.
As a performance measure for the system, we may think in terms of the mean squared error or the sum of squared errors over the training sample, defined as a function of the free parameters (i.e., synaptic weights) of the system.
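A minimal sketch of supervised learning with the mean squared error as the performance measure is given below. It trains a single linear neuron by batch gradient descent, which is one common choice and not necessarily the algorithm the chapter has in mind. The training pairs, learning rate, number of epochs, and function names are assumptions for illustration.

```python
def mse(w, b, samples):
    """Mean squared error of the neuron y = w * x + b over (input, desired) pairs."""
    return sum((w * x + b - d) ** 2 for x, d in samples) / len(samples)

def train_linear_neuron(samples, lr=0.05, epochs=2000):
    """Batch gradient descent on the MSE with respect to the weight and bias."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Error signal: actual output minus desired output for each pair.
        errors = [(w * x + b - d, x) for x, d in samples]
        grad_w = sum(2.0 * e * x for e, x in errors) / n
        grad_b = sum(2.0 * e for e, _ in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Illustrative (input, desired output) pairs generated by d = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear_neuron(samples)
final_error = mse(w, b, samples)
print(w, b, final_error)
```

Starting from zero weights, the error signal drives the weight toward 2 and the bias toward 1, and the mean squared error falls toward zero.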
5.2. Unsupervised learning
To perform unsupervised learning, a competitive learning rule is used. For example, we may use a neural network that consists of two layers: an input layer and a competitive layer. The input layer receives the available data. The competitive layer consists of neurons that compete with each other (in accordance with a learning rule) for the “opportunity” to respond to features contained in the input data (Figure 12).
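The competition described above can be sketched with a simple winner-take-all rule, in which only the neuron whose weight vector is closest to the input is updated. This is one standard form of competitive learning and is assumed here. The data points, initial weights, learning rate, and function names are illustrative.

```python
def winner(weights, x):
    """Index of the competitive neuron whose weight vector is closest to x."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))

def train_competitive(data, weights, lr=0.1, epochs=50):
    """Winner-take-all learning: move only the winning neuron toward the input."""
    weights = [list(w) for w in weights]
    for _ in range(epochs):
        for x in data:
            k = winner(weights, x)
            weights[k] = [wi + lr * (xi - wi) for wi, xi in zip(weights[k], x)]
    return weights

# Unlabeled data forming two clusters, near (0, 0) and near (1, 1).
data = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1),
        (1.0, 1.1), (1.1, 1.0), (0.9, 1.0), (1.0, 0.9)]
weights = train_competitive(data, [(0.4, 0.4), (0.6, 0.6)])
print(weights)
```

No desired outputs are supplied; each neuron's weight vector drifts toward the centre of the cluster it keeps winning.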
6. Neural networks’ applications in digital signal processing
Digital signal processing can be defined using the field-of-interest statement of the IEEE Signal Processing Society, as follows:
Signal processing is the enabling technology for the generation, transformation, extraction, and interpretation of information. It comprises the theory, algorithms with associated architectures and implementations, and applications related to processing information contained in many different formats broadly designated as signals. Signal processing uses mathematical, statistical, computational, heuristic, and/or linguistic representations, formalisms, modelling techniques and algorithms for generating, transforming, transmitting, and learning from signals.
Based on this definition, many neural network structures could be developed to achieve the different processes mentioned in the definition.
6.1. Classification
One of the most important applications of artificial neural networks is classification, which is used in digital signal processing applications such as speech recognition, signal separation, and handwriting recognition and detection.
Objects of interest can be classified according to their features. Classification can be treated as a probabilistic process: an object is assigned to a given class when the probability that it belongs to that class exceeds the probability that it belongs to any other class.
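Assume that $\mathbf{x}$ is the vector of features for the objects of interest, which can be classified into classes $c_i \in C$, where $C$ is the pool of classes. Then, the classification process is described by Eq. (12):

$$\hat{c} = \arg\max_{c_i \in C} P(c_i \mid \mathbf{x}) \qquad (12)$$

that is, $\mathbf{x}$ is assigned to class $c_i$ if $P(c_i \mid \mathbf{x}) > P(c_j \mid \mathbf{x})$ for all $j \neq i$.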
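The rule of assigning an object to its most probable class can be sketched as below. The use of a softmax to turn raw network outputs into class probabilities is an assumption added for this example and is not stated in the chapter. The class labels, scores, and function names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1 (assumed step)."""
    m = max(scores)                      # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(probabilities, classes):
    """Return the class with the highest probability for the given feature vector."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best]

classes = ["c1", "c2", "c3"]             # hypothetical pool of classes
scores = [2.0, 1.0, 0.1]                 # hypothetical network outputs for one object
probs = softmax(scores)
label = classify(probs, classes)
print(probs, label)
```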
One example of classification is the detection of a QPSK modulator's output, where detection is considered a special case of classification.
Assume that the received signal is $r$:

$$r = s + n \qquad (13)$$

where $n$ is the normally distributed noise signal and $s$ is the transmitted signal.
The output of QPSK modulator is shown in Figure 13, where the samples are arranged in four classes.
By adding white Gaussian noise, the received signal will be as shown in Figure 14.
The neural network shown in Figure 15 is used to detect and demodulate the received signal, where the network consists of one hidden layer with five neurons and an output layer with two neurons.
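The detection problem can be simulated with a short sketch that generates noisy QPSK samples and classifies each one into one of the four classes. The detector below is not the trained network of Figure 15: it is a hand-set stand-in made of two threshold output neurons, one per signal component. The bit-to-symbol mapping, amplitude, noise level, and random seed are all assumptions.

```python
import math
import random

A = 1 / math.sqrt(2)  # assumed symbol amplitude (unit-energy constellation)

def modulate(bit_i, bit_q):
    """Map a bit pair to a QPSK point; bit 0 -> +A, bit 1 -> -A (assumed mapping)."""
    return (A * (1 - 2 * bit_i), A * (1 - 2 * bit_q))

def step(v):
    return 1 if v >= 0 else 0

def detect(r):
    """Two threshold neurons, each with weight -1 and bias 0, one per component."""
    return (step(-1.0 * r[0]), step(-1.0 * r[1]))

rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 0.1              # assumed noise standard deviation
sent = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(1000)]
received = []
for bit_i, bit_q in sent:
    s = modulate(bit_i, bit_q)
    n = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    received.append((s[0] + n[0], s[1] + n[1]))   # received = transmitted + noise

detected = [detect(r) for r in received]
errors = sum(d != b for d, b in zip(detected, sent))
print("symbol errors:", errors)
```

At this low noise level the four clusters stay well separated, so the sign of each component is enough to recover the bits. A trained network becomes useful when the decision regions are not known in advance.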
6.2. Time series prediction
A series is a sequence of values expressed as a function of a parameter; in a time series, the values are a function of time. Many applications express their data as time series, for example, meteorology, where the temperature is described as a time series.
An interesting problem in time series analysis is predicting future values of the series; neural networks can be used to predict future values in three ways:
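- Predict the future values based on the past values of the same series; this approach can be described by

$$\hat{x}(t+1) = f\big(x(t), x(t-1), \dots, x(t-p)\big) \qquad (14)$$

- Predict the future values based on the values of a relevant time series, where

$$\hat{x}(t+1) = f\big(z(t), z(t-1), \dots, z(t-q)\big) \qquad (15)$$

- Predict the future values based on both previous cases, where

$$\hat{x}(t+1) = f\big(x(t), \dots, x(t-p), z(t), \dots, z(t-q)\big) \qquad (16)$$

Here $x(t)$ is the series to be predicted, $z(t)$ is the relevant series, $f(\cdot)$ is the mapping implemented by the neural network, and $p$ and $q$ are the numbers of past values used.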
Figure 18 shows the series predicted by the neural network using the first approach: the samples of the original series are given over a fixed period, and the neural network then predicts the future values of the series based on its behavior.
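The first approach, predicting a series from its own past values, can be sketched with a single linear neuron trained by the LMS rule on a sliding window of past samples. This is a deliberately small stand-in for the network behind Figure 18, whose details are not given here. The sine-wave series, window length of 2, learning rate, and function names are assumptions.

```python
import math

def make_windows(series, p):
    """Pair the p most recent values (newest first) with the value that follows."""
    return [(series[t - p:t][::-1], series[t]) for t in range(p, len(series))]

def train_lms(pairs, p, lr=0.5, epochs=50):
    """Train one linear neuron (no bias) with the LMS rule w += lr * error * input."""
    w = [0.0] * p
    for _ in range(epochs):
        for x, d in pairs:
            e = d - sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + lr * e * xi for wi, xi in zip(w, x)]
    return w

def forecast(history, w, steps):
    """Predict future values, feeding each prediction back in as an input."""
    buf = list(history)
    p = len(w)
    out = []
    for _ in range(steps):
        x = buf[-p:][::-1]
        y = sum(wi * xi for wi, xi in zip(w, x))
        out.append(y)
        buf.append(y)
    return out

omega = 0.3
series = [math.sin(omega * t) for t in range(1000)]
known, future = series[:800], series[800:]      # observed period, then held-out future
w = train_lms(make_windows(known, 2), 2)
predicted = forecast(known, w, len(future))
max_error = max(abs(a - b) for a, b in zip(predicted, future))
print(w, max_error)
```

A sampled sine wave satisfies x(t+1) = 2cos(ω)·x(t) − x(t−1) exactly, so two past values are enough for a linear neuron here. Real series generally need longer windows and nonlinear hidden layers.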
6.3. Independent component analysis
The goal of independent component analysis (ICA) is to separate linearly mixed signals. ICA is a type of blind source separation, in which the separation is performed without prior information about the source signals or the mixing coefficients. Although the blind source separation problem is, in general, underdetermined, useful solutions can be obtained under certain assumptions.
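The ICA model assumes that $n$ independent signals $s_i(t)$, where $i = 1, 2, \dots, n$, are mixed using the matrix

$$\mathbf{A} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}$$

Then, the mixed signal can be expressed as

$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t)$$

Because the separation process is blind, that is, both $\mathbf{A}$ and $\mathbf{s}(t)$ are unknown, ICA assumes that the source signals are statistically independent and have non-Gaussian distributions.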
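The neural network shown in Figure 19 is used to estimate the unmixing matrix $\mathbf{W}$.
The separated signals $\mathbf{y}(t)$ are obtained from the mixed signals $\mathbf{x}(t)$ as

$$\mathbf{y}(t) = \mathbf{W}\,\mathbf{x}(t)$$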
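The mixing and unmixing model can be illustrated with a two-signal sketch in which the mixing matrix is known, so the ideal unmixing matrix is simply its inverse. This only demonstrates the model: in a real blind separation the mixing matrix is unknown and the unmixing matrix has to be estimated. The source signals, matrix entries, and helper names are assumptions.

```python
import math

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Two illustrative sources: a sine wave and a square wave.
sources = [[math.sin(0.1 * t), 1.0 if (t // 25) % 2 == 0 else -1.0]
           for t in range(200)]
A = [[1.0, 0.5],
     [0.3, 1.0]]                                  # assumed mixing matrix
mixed = [mat_vec(A, s) for s in sources]          # mixed = A * sources
W = inverse_2x2(A)                                # ideal unmixing matrix (A is known here)
separated = [mat_vec(W, x) for x in mixed]        # separated = W * mixed
max_error = max(abs(y - s)
                for ys, ss in zip(separated, sources)
                for y, s in zip(ys, ss))
print(max_error)
```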
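Different methods can be applied to find the unmixing matrix $\mathbf{W}$, for example, a natural-gradient-based rule defined as

$$\Delta\mathbf{W} = \eta\left[\mathbf{I} - f(\mathbf{y})\,g(\mathbf{y})^{T}\right]\mathbf{W}$$

where $\mathbf{y}$ is the vector of separated signals, $\eta$ is the training factor, and both $f(\cdot)$ and $g(\cdot)$ are odd functions.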
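A single update step of a natural-gradient-style rule can be sketched as below, assuming the update has the form ΔW = η[I − f(y)g(y)ᵀ]W with the outer product averaged over a batch of output samples. The batch averaging, the particular odd functions f(y) = y³ and g(y) = y, the training factor, and the function name `natural_gradient_step` are all assumptions made for illustration.

```python
def natural_gradient_step(W, samples, eta, f=lambda y: y ** 3, g=lambda y: y):
    """Return W + eta * (I - mean[f(y) g(y)^T]) W for a batch of output vectors y.

    f and g are assumed odd functions applied element-wise.
    """
    n = len(W)
    # Average outer product f(y) g(y)^T over the batch.
    C = [[0.0] * n for _ in range(n)]
    for y in samples:
        for i in range(n):
            for j in range(n):
                C[i][j] += f(y[i]) * g(y[j]) / len(samples)
    # G = I - C
    G = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(n)] for i in range(n)]
    # W_new = W + eta * G W
    return [[W[i][j] + eta * sum(G[i][k] * W[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]

W0 = [[1.0, 0.0],
      [0.0, 1.0]]                     # start from the identity matrix
W1 = natural_gradient_step(W0, [[1.0, 2.0]], eta=0.1)
print(W1)
```

In practice this step would be repeated many times, recomputing the outputs from the current unmixing matrix before each update, until the matrix stops changing.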