Open access peer-reviewed chapter

Neural Network Principles and Applications

Written By

Amer Zayegh and Nizar Al Bassam

Submitted: 28 February 2018 Reviewed: 20 July 2018 Published: 05 November 2018

DOI: 10.5772/intechopen.80416

From the Edited Volume

Digital Systems

Edited by Vahid Asadpour


Abstract

Due to the recent trend toward intelligent systems and their ability to adapt to varying conditions, deep learning has become very attractive to many researchers. In general, neural networks are used to implement different stages of processing systems based on learning algorithms, by adjusting their weights and biases. This chapter introduces the concepts of neural networks, with a description of the major elements that make up a network. It also describes the different types of learning algorithms and activation functions, with examples, and illustrates these concepts in standard applications. The chapter will be useful for undergraduate students, and also for postgraduate students who have only a basic background in neural networks.

Keywords

  • neural network
  • neuron
  • digital signal processing
  • training
  • supervised learning
  • unsupervised learning
  • classification
  • time series

1. Introduction

The artificial neural network is a computing technique designed to simulate the human brain's method of problem-solving. In 1943, McCulloch, a neurobiologist, and Pitts, a statistician, published a seminal paper titled "A logical calculus of the ideas immanent in nervous activity" in the Bulletin of Mathematical Biophysics [1], in which they explained how the brain works and how simple processing units (neurons) work together in parallel to make a decision based on the input signals.

The similarity between artificial neural networks and the human brain is that both acquire their skills in processing data and finding solutions through training [1].


2. Neural network’s architecture

To illustrate the structure of the artificial neural network, an anatomical and functional look at the human brain must first be taken.

The human brain consists of about 10^11 computing units, "neurons," working in parallel and exchanging information through their connectors, "synapses." Each neuron sums all the information coming into it, and if the result exceeds a given threshold, called the action potential, it sends a pulse via the axon to the next stage. Human neuron anatomy is shown in Figure 1 [2].

Figure 1.

Human neuron anatomy.

In the same way, an artificial neural network consists of simple computing units, "artificial neurons," and each unit is connected to the others via weighted connections; each unit then calculates the weighted sum of its incoming inputs and computes its output using a squashing, or activation, function. Figure 2 shows the block diagram of an artificial neuron.

Figure 2.

Block diagram of artificial neuron.

Based on the block diagram and function of the neural network, three basic elements of the neural model can be identified:

  1. Synapses, or connecting links, each characterized by a weight or strength: the input signal x_i connected to neuron k is multiplied by the synaptic weight w_ki.

  2. An adder for summing the weighted inputs.

  3. An activation function to produce the output of a neuron. It is also referred to as a squashing function, in that it squashes (limits) the amplitude range of the output signal to a finite value.

The bias bk has the effect of increasing or decreasing the net input of the activation function, depending on whether it is positive or negative, respectively.

Mathematically, the output of neuron k can be described as

y_k = \varphi\left( \sum_{i=1}^{m} x_i w_{ki} + b_k \right)    (1)

where

x_1, x_2, x_3, ..., x_m are the input signals,

w_k1, w_k2, w_k3, ..., w_km are the corresponding synaptic weights of neuron k,

b_k is the bias, and

\varphi is the activation function.

To clarify the effect of the bias on the performance of the neuron, the output given in Eq. (1) is processed in two stages, where the first stage includes the weighted inputs and their sum, denoted S_k:

S_k = \sum_{i=1}^{m} x_i w_{ki}    (2)

Then, the output of the adder is given in Eq. (3):

v_k = S_k + b_k    (3)

and the output of the neuron is

y_k = \varphi(v_k)    (4)

Depending on the value of the bias, the relationship between the weighted input S_k and the adder output v_k is modified [3], as shown in Figure 3.

Figure 3.

Effect of bias.

The bias can be considered as an input signal x_0 fixed at +1, with a synaptic weight equal to the bias b_k, as shown in Figure 4 [3].

Figure 4.

Neuron structure with the bias considered as an input [1].
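To make Eqs. (1)-(4) concrete, the following Python sketch (using NumPy) computes a single neuron's output; the input values, weights, bias, and the choice of a sigmoid activation (introduced in Section 3.3) are arbitrary illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def neuron_output(x, w, b, phi):
    """Single artificial neuron: weighted sum of the inputs plus the bias,
    passed through the activation function phi (Eqs. (1)-(4))."""
    v = np.dot(x, w) + b          # adder output v_k = S_k + b_k
    return phi(v)                 # y_k = phi(v_k)

# Illustrative values only: three inputs with a sigmoid activation
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
print(neuron_output(x, w, b, sigmoid))
```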


3. Types of activation function

The activation function defines the output of the neuron as a function of the adder's output v_k. The following sections describe the most commonly used activation functions.

3.1. Linear function

The neuron output is proportional to the input, as shown in Figure 5.

Figure 5.

Linear activation function.

It can be described by

y_k = v_k    (5)

3.2. Threshold (step) function

This activation function is shown in Figure 6, where the output of the neuron is given by

y_k = \begin{cases} 1 & \text{if } v_k \ge 0 \\ 0 & \text{if } v_k < 0 \end{cases}    (6)

Figure 6.

Threshold activation function.

In neural computation, such a neuron is referred to as the McCulloch-Pitts model in recognition of the pioneering work done by McCulloch and Pitts (1943); the output of the neuron takes on the value of 1 if the induced local field of that neuron is nonnegative and 0 otherwise. This statement describes the all-or-none property of the McCulloch-Pitts model [4].

3.3. Sigmoid function

The most common type of activation function in neural networks is described by

y_k = \frac{1}{1 + e^{-v_k}}    (7)

Figure 7 shows the sigmoid activation function. It is clearly observed that this function is nonlinear in nature and can produce an analogue output, unlike the threshold function, which produces output only at the discrete values 0 and 1.

Figure 7.

Sigmoid activation function.

We can also note that the sigmoid activation function is limited between 0 and 1, which gives it an advantage over the linear activation function, whose output ranges from −∞ to +∞ [5].

3.4. Tanh activation function

This activation function has the advantages of the sigmoid function, while it is characterized by an output range between −1 and 1, as shown in Figure 8.

Figure 8.

Tanh activation function.

The output is described by

y_k = \frac{2}{1 + e^{-2 v_k}} - 1    (8)
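The four activation functions of Eqs. (5)-(8) can be written directly as code. The sketch below is a straightforward NumPy implementation; the test vector v is arbitrary.

```python
import numpy as np

def linear(v):                     # Eq. (5): y_k = v_k
    return v

def threshold(v):                  # Eq. (6): 1 if v_k >= 0, otherwise 0
    return np.where(v >= 0, 1.0, 0.0)

def sigmoid(v):                    # Eq. (7): output limited to (0, 1)
    return 1.0 / (1.0 + np.exp(-v))

def tanh_act(v):                   # Eq. (8): output limited to (-1, 1)
    return 2.0 / (1.0 + np.exp(-2.0 * v)) - 1.0

v = np.linspace(-5, 5, 11)         # arbitrary test values
for f in (linear, threshold, sigmoid, tanh_act):
    print(f.__name__, f(v))
```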


4. Neural network models

The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network [1]. Two main models of the neural network are described in the following subsections.

4.1. Single-layer feedforward neural network

In a layered neural network, the neurons are organized in the form of layers [1]. The simplest structure is the single-layer feedforward network, which consists of input nodes connected directly to a single layer of neurons. The node outputs are computed by the activation function, as shown in Figure 9.

Figure 9.

Single-layer neural network.

Mathematically, the inputs are represented as a vector of dimension 1×i, the weights as a matrix of dimension i×k, and the outputs as a vector of dimension 1×k, as given in Eq. (9):

\begin{bmatrix} y_1 & y_2 & \cdots & y_k \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_i \end{bmatrix} \begin{bmatrix} w_{11} & \cdots & w_{1k} \\ w_{21} & \cdots & w_{2k} \\ \vdots & & \vdots \\ w_{i1} & \cdots & w_{ik} \end{bmatrix}    (9)
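Eq. (9) is simply a vector-matrix product. A minimal NumPy sketch is given below; the dimensions, random weights, and the added bias and sigmoid activation are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

i, k = 4, 3                        # number of inputs and neurons (arbitrary)
x = rng.normal(size=(1, i))        # input row vector, dimension 1 x i
W = rng.normal(size=(i, k))        # weight matrix, dimension i x k
b = rng.normal(size=(1, k))        # one bias per neuron

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
y = sigmoid(x @ W + b)             # output row vector, dimension 1 x k
print(y.shape, y)
```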

4.2. Multilayer feedforward neural network

The second class of feedforward neural network distinguishes itself by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons, as shown in Figure 10.

Figure 10.

Multilayer feedforward neural network.

By adding one or more hidden layers, the network is enabled to extract higher-order statistics from its input [1].
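A hidden layer inserts one more weighted sum and activation between the input and the output layer. The following sketch of a forward pass through one hidden layer illustrates the idea; the dimensions and random weights are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Illustrative dimensions only: 4 inputs, 5 hidden neurons, 2 output neurons
W1, b1 = rng.normal(size=(4, 5)), np.zeros((1, 5))
W2, b2 = rng.normal(size=(5, 2)), np.zeros((1, 2))

def forward(x):
    """Forward pass through one hidden layer and the output layer."""
    h = sigmoid(x @ W1 + b1)       # hidden-layer activations
    return sigmoid(h @ W2 + b2)    # output-layer activations

print(forward(rng.normal(size=(1, 4))))
```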


5. Neural network training

The process of calibrating the values of the weights and biases of the network so that it performs the desired function correctly is called training of the neural network [2].

Learning methods, or algorithms, can be classified into the following types:

5.1. Supervised learning

In supervised learning, the data are presented in the form of pairs (input, desired output), and the learning algorithm adapts the weights and biases depending on the error signal between the actual output of the network and the desired output, as shown in Figure 11.

Figure 11.

Supervised learning.

As a performance measure for the system, we may think in terms of the mean squared error or the sum of squared errors over the training sample defined as a function of the free parameters (i.e., synaptic weights) of the system [1].
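As a concrete illustration of learning from (input, desired output) pairs with a squared-error performance measure, the sketch below applies gradient descent to a single sigmoid neuron; the toy data set, learning rate, and number of epochs are arbitrary assumptions and are not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Toy training pairs (illustrative only): inputs X and desired outputs d
X = rng.normal(size=(100, 3))
d = (X.sum(axis=1, keepdims=True) > 0).astype(float)

w = np.zeros((3, 1))
b = 0.0
eta = 0.5                                     # learning rate (arbitrary)

for epoch in range(500):
    y = sigmoid(X @ w + b)                    # actual network output
    e = d - y                                 # error signal (desired - actual)
    mse = np.mean(e ** 2)                     # mean-squared-error performance measure
    grad_v = -2.0 * e * y * (1 - y) / len(X)  # d(MSE)/dv for each sample
    w -= eta * X.T @ grad_v                   # move the weights down the gradient
    b -= eta * grad_v.sum()                   # move the bias down the gradient
print("final MSE:", mse)
```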

5.2. Unsupervised learning

To perform unsupervised learning, a competitive learning rule is used. For example, we may use a neural network that consists of two layers—an input layer and a competitive layer. The input layer receives the available data. The competitive layer consists of neurons that compete with each other (in accordance with a learning rule) for the “opportunity” to respond to features contained in the input data (Figure 12) [1].

Figure 12.

Unsupervised learning.
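A simple winner-take-all version of the competitive rule described above can be sketched as follows: each competitive neuron holds a weight vector, the neuron whose weights are closest to the current input wins, and only the winner's weights are moved toward that input. The data, number of neurons, and learning rate below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Unlabeled data from three illustrative clusters
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(200, 2))
               for c in ([0.0, 0.0], [3.0, 0.0], [0.0, 3.0])])
rng.shuffle(X)

W = rng.normal(size=(3, 2))                 # one weight vector per competitive neuron
eta = 0.05                                  # learning rate (arbitrary)

for x in X:
    winner = np.argmin(np.linalg.norm(W - x, axis=1))  # closest neuron wins
    W[winner] += eta * (x - W[winner])                  # only the winner moves toward the input

print(W)   # each weight vector typically settles near one cluster centre
```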


6. Neural networks’ applications in digital signal processing

Digital signal processing can be defined using the field-of-interest statement of the IEEE Signal Processing Society as follows:

Signal processing is the enabling technology for the generation, transformation, extraction, and interpretation of information. It comprises the theory, algorithms with associated architectures and implementations, and applications related to processing information contained in many different formats broadly designated as signals. Signal processing uses mathematical, statistical, computational, heuristic, and/or linguistic representations, formalisms, modelling techniques and algorithms for generating, transforming, transmitting, and learning from signals. [6].

Based on this definition, many neural network structures could be developed to achieve the different processes mentioned in the definition.

6.1. Classification

One of the most important applications of an artificial neural network is classification, which can be used in different digital signal processing applications such as speech recognition, signal separation, and handwriting recognition and detection [7].

The objects of interest can be classified according to their features, and the classification process can be considered a probabilistic process, since an object is assigned to a given class when the probability that it belongs to that class is higher than the probability that it belongs to any other class [8].

Assume that X is the feature vector of the object of interest, which can be classified into classes c_i belonging to ψ, where ψ is the pool of classes. Classification is then applied as follows:

X \text{ belongs to class } c_i \text{ if } P(c_i \mid X) > P(c_j \mid X) \text{ for all } j \neq i    (10)

To decrease the difficulty of solving the probability relations in Eq. (10), a discriminant function Q_i(X) is used, and Eq. (10) then becomes

Q_i(X) > Q_j(X) \text{ if } P(c_i \mid X) > P(c_j \mid X), \; j \neq i    (11)

The classification process is then described by Eq. (12):

X \text{ belongs to class } c_i \text{ if } Q_i(X) > Q_j(X) \text{ for all } j \neq i    (12)
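Eqs. (10)-(12) state that a feature vector is assigned to the class whose discriminant value is largest. A minimal sketch, with two hypothetical discriminant functions standing in for Q_i(X), is:

```python
import numpy as np

def classify(x, discriminants):
    """Assign x to the class whose discriminant Q_i(x) is largest (Eq. (12))."""
    scores = [Q(x) for Q in discriminants]
    return int(np.argmax(scores))

# Hypothetical discriminants: negative squared distances to two class centres
Q0 = lambda x: -np.sum((x - np.array([0.0, 0.0])) ** 2)
Q1 = lambda x: -np.sum((x - np.array([3.0, 3.0])) ** 2)

print(classify(np.array([2.5, 2.0]), [Q0, Q1]))   # prints 1
```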

One example of classification is the detection of the QPSK modulator output, where detection is considered a special case of classification.

Assume that the received signal is X:

X = s + n    (13)

where n is the normally distributed noise signal and s is the transmitted signal.

The output of QPSK modulator is shown in Figure 13, where the samples are arranged in four classes.

Figure 13.

QPSK modulator output.

By adding white Gaussian noise, the received signal will be as shown in Figure 14.

Figure 14.

QPSK output with noise.

The neural network shown in Figure 15 is used to detect and demodulate the received signal, where the network consists of one hidden layer with five neurons and an output layer with two neurons.

Figure 15.

Structure of neural network.
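The chapter does not give the training details of this network, so the following sketch only illustrates the idea: it generates noisy QPSK samples as in Figures 13 and 14 and trains a comparable network, with five hidden neurons and two output neurons, by gradient descent on the MSE. The noise level, learning rate, bit mapping, and number of epochs are assumptions, not values from the chapter.

```python
import numpy as np

rng = np.random.default_rng(4)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Noisy QPSK samples: four constellation points plus white Gaussian noise
symbols = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / np.sqrt(2)
labels = rng.integers(0, 4, size=2000)
X = symbols[labels] + 0.2 * rng.normal(size=(2000, 2))
D = ((labels[:, None] >> np.arange(2)) & 1).astype(float)   # two target bits per symbol

# Network with one hidden layer of five neurons and an output layer of two neurons
W1, b1 = rng.normal(size=(2, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)
eta = 1.0

for epoch in range(2000):
    H = sigmoid(X @ W1 + b1)               # hidden-layer outputs
    Y = sigmoid(H @ W2 + b2)               # output-layer outputs (two bits)
    E = D - Y                              # error signal
    dY = -2.0 * E * Y * (1 - Y) / len(X)   # MSE gradient at the output layer
    dH = (dY @ W2.T) * H * (1 - H)         # gradient propagated to the hidden layer
    W2 -= eta * H.T @ dY;  b2 -= eta * dY.sum(axis=0)
    W1 -= eta * X.T @ dH;  b1 -= eta * dH.sum(axis=0)

pred = (Y > 0.5).astype(int)
print("training bit error rate:", np.mean(pred != D))
```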

Figures 16 and 17 show the performance of the neural network evaluated using the mean squared error (MSE) criterion.

Figure 16.

MSE of training, validation, and test vs no. of epochs.

Figure 17.

Training parameters and results.

6.2. Time series prediction

A series is a sequence of values given as a function of a parameter; in the case of a time series, the values are a function of time. Many applications use time series to express their data, for example, meteorology, where the temperature is described as a time series [7].

The interesting problem in time series is the prediction of future values of the series; neural networks can be used to predict future values in three ways [9]:

  • Predict the future values based on the past values of the same series; this way can be described by

    \hat{y}_t = E\left[ y_t \mid y_{t-1}, y_{t-2}, \ldots \right]    (14)

  • Predict the future values based on the values of relevant time series, where

    \hat{y}_t = E\left[ y_t \mid x_t, x_{t-1}, x_{t-2}, \ldots \right]    (15)

  • Predict the future values based on both previous cases, where

    \hat{y}_t = E\left[ y_t \mid x_t, x_{t-1}, x_{t-2}, \ldots, y_{t-1}, y_{t-2}, \ldots \right]    (16)

Figure 18 shows the series predicted by the neural network based on the first way, where samples of the original series are given over a determined period, and the neural network then predicts the future values of the series based on the series' behavior; a minimal sketch of this first approach is given after Figure 18.

Figure 18.

Time series prediction.
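The sketch below illustrates the first approach (Eq. (14)): past values of the series are collected in a sliding window and used to predict the next value. For brevity, a linear least-squares predictor stands in for the neural network of Figure 18; the series, window length, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative series: a noisy sinusoid standing in for measured data
t = np.arange(300)
y = np.sin(0.1 * t) + 0.05 * rng.normal(size=t.size)

p = 5                                               # number of past values used (Eq. (14))
X = np.column_stack([y[k:len(y) - p + k] for k in range(p)])  # rows: [y_{t-p}, ..., y_{t-1}]
target = y[p:]                                      # value to be predicted, y_t

# Linear predictor fitted by least squares; the neural network of Figure 18
# would replace this linear map with a trained nonlinear one.
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
y_hat = X @ coef
print("one-step prediction MSE:", np.mean((y_hat - target) ** 2))
```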

6.3. Independent component analysis

The goal of independent component analysis (ICA) is to separate linearly mixed signals. ICA is a type of blind source separation, where the separation is performed without prior information about the source signals or the mixing coefficients. Although the blind source separation problem is, in general, not fully specified, a usable solution can be obtained under some assumptions [10].

The ICA model assumes that n independent signals s_i(t), where i = 1, 2, 3, ..., n, are mixed using the matrix

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}    (17)

Then, the mixed signal x_i(t) can be expressed as

x_i(t) = \sum_{j=1}^{n} a_{ij}\, s_j(t), \qquad i = 1, \ldots, n    (18)

As the separation process is blind, both a_ij and s_j(t) are unknown; thus, ICA assumes that the source signals are statistically independent and have non-Gaussian distributions [11].

The neural network shown in Figure 19 is used to estimate the unmixing matrix W.

Figure 19.

ICA neural network [12].

The separated signals y_i(t) are given as

y = W x    (19)

W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix}    (20)

Different methods can be applied to find W; for example, the natural-gradient-based method defines the update of W as

\frac{dW}{dt} = \eta(t)\left[ I - f(y(t))\, g^{T}(y(t)) \right] W    (21)

where \eta(t) is the training factor (learning rate) and both f and g are odd functions.
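A batch version of the natural-gradient update in Eq. (21) can be sketched as follows, using f(y) = tanh(y) and g(y) = y as the odd functions. The sources, mixing matrix, learning rate, and number of iterations are illustrative assumptions, and the result depends on the step size.

```python
import numpy as np

rng = np.random.default_rng(6)

# Two illustrative independent, super-Gaussian sources mixed by a random matrix A
n, T = 2, 5000
S = rng.laplace(size=(n, T))                # source signals s(t)
A = rng.normal(size=(n, n))                 # unknown mixing matrix
X = A @ S                                   # observed mixtures x(t) = A s(t)

W = np.eye(n)                               # unmixing matrix estimate
eta = 0.05                                  # training factor (arbitrary)
f = np.tanh                                 # odd function f
g = lambda y: y                             # odd function g

for _ in range(2000):                       # batch natural-gradient updates, Eq. (21)
    Y = W @ X                               # current separated signals y = W x
    dW = (np.eye(n) - f(Y) @ g(Y).T / T) @ W
    W += eta * dW

print(W @ A)   # approaches a scaled permutation matrix if separation succeeds
```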

References

  1. Haykin S. Neural Networks and Learning Machines. 3rd ed. Hamilton, Ontario, Canada: Pearson Education, Inc.; 2009. 938 p
  2. Smith S. The Scientist and Engineer's Guide to Digital Signal Processing. 2011
  3. Hu Y, Hwang J. Handbook of Neural Network Signal Processing. Boca Raton: CRC Press; 2002
  4. Milad MAMRAN. Neural Network Demodulator For. International Journal of Advanced Studies. 2016;5(7):10-14
  5. Michalík J. Applied Neural Networks for Digital Signal Processing with DSC TMS320F28335. Technical University of Ostrava; 2009
  6. Constitution. IEEE Signal Processing Society [Online]. 2018. Available from: https://signalprocessingsociety.org/volunteers/constitution [Accessed: March 03, 2018]
  7. Kriesel D. A Brief Introduction to Neural Networks. Bonn: University of Bonn, Germany; 2005
  8. Gurney K. An Introduction to Neural Networks. London: CRC Press; 1997
  9. Maier H, Dandy G. Neural networks for the prediction and forecasting of water resources variables: A review of modelling issues and applications. Environmental Modelling & Software. 2000;15(1):101-124
  10. Hansen LK, Larsen J, Kolenda T. On independent component analysis for multimedia signals. In: Multimedia Image and Video Processing. CRC Press; 2000. pp. 175-199
  11. Mørup M, Schmidt MN. Transformation invariant sparse coding. In: IEEE International Workshop on Machine Learning for Signal Processing (MLSP). Informatics and Mathematical Modelling, Technical University of Denmark (DTU); 2011
  12. Pedersen MS, Wang D, Larsen J, Kjems U. Separating underdetermined convolutive speech mixtures. In: ICA 2006; 2006
