Artificial intelligence (AI) is a branch of computer science that seeks to create intelligence. While humans have long used computers to simplify many tasks, AI provides new ways to use computers. For instance, voice recognition software uses AI to convert spoken sounds into the equivalent text. AI includes several techniques; the artificial neural network (ANN) is one of them.
Humans use their intelligence to solve complex problems and perform daily tasks. Human intelligence is provided by the brain. Small processing units called neurons are the main components of the human brain. ANNs try to imitate part of the behavior of the human brain. Thus, artificial neurons are designed to mimic the activities of biological neurons.
Humans learn by experience: they are exposed to events that encourage their brains to acquire knowledge. Similarly, ANNs extract information from a data set; this set is typically called the training set and is organized in the same way that schools design their courses’ content. ANNs also provide an excellent way to better understand biological neurons. In practice, some problems may be described by a data set, and an ANN is typically trained using such a set. For other problems, building a data set may be very difficult or even impossible, because the data set has to capture all possible cases of the experiment.
Simulated annealing (SA) is a method that can be used to solve a wide range of optimization problems. SA is a robust technique because it can escape local minima instead of being trapped by them. Additionally, a mathematical model is not required to apply SA to most optimization problems.
This chapter explores the use of SA to train an ANN without the requirement of a data set. The chapter ends with a computer simulation where an ANN is used to drive a car. Figure 1 shows the system architecture. SA is used to provide a new set of weights to the ANN. The ANN controls the acceleration and rotation speed of the car. The car provides feedback by sending vision information to the ANN. The distance traveled along the road from the Start is used by the SA method. At the beginning of the simulation, the ANN does not know how to drive the car. As the experiment continues, SA is used to train the ANN. Each time the temperature decreases, the ANN improves its driving skills. By the end of the experiment, when the temperature has reached its final value, the ANN and the car have evolved to the point that they can easily navigate a maze.
2. Artificial neural networks
An ANN is a computational method inspired by biological processes to solve problems that are very difficult for computers or humans. One of the key features of ANNs is that they can adapt to a broad range of situations. They are typically used where a mathematical equation or model is missing, see . The purpose of an ANN is to extract, map, classify or identify some sort of information that is presumably hidden in the input, .
The human brain is composed of processing units called neurons. Each neuron is connected to other neurons to form a neural network. Similarly, the basic components of an ANN are the neurons. Neurons are arranged in layers inside the ANN. Each layer has a fixed number of neurons, see . For instance, the ANN shown in Figure 2 has three layers: the input layer, the hidden layer, and the output layer.
An ANN accepts any kind of input that can be expressed as a set of numeric values; typical inputs may include: an image, a sound, a temperature value, etc. The output of an ANN is always dependent upon its input. That is, a specific input will produce a specific output. When a set of numeric values is applied to the input of an ANN, the information flows from one neuron to another until the output layer generates a set of values.
2.2. Activation function
The internal structure of an artificial neuron is shown in Figure 3(a). The output value z is given by:
$$z = f(y), \qquad y = \sum_{i} w_i x_i + w_b \tag{1}$$
where x_i is the i-th input of the neuron and each w_i is called a weight. A fixed input, known as the Bias, is applied to the neuron; its value is always 1 and w_b is the respective weight for this input. The neuron also includes an activation function, denoted by f(y) in Figure 3(a). Without the Bias, the output of the network would be f(0) when all inputs are zero. One common activation function used in multilayer ANNs is:
A neuron can be active or inactive: when the neuron is active its output is 1, and when it is inactive its output is 0. Some input values activate some neurons, while other values may activate other neurons. For instance, in Figure 4, the sound of the word Yes will activate the first output neuron, while the sound of the word No will activate the second neuron at the output of the network. The structure of the ANN shown in this figure is very simple, with only two neurons.
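The neuron computation described above can be sketched in C++ as follows. This is only a minimal illustration: the logistic function is an assumption, chosen because its output lies between 0 (inactive) and 1 (active), and the names `Activation` and `NeuronOutput` are hypothetical rather than taken from the original implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic activation: maps any real y into the open interval (0, 1).
// Assumption: used here only as a common stand-in for the chapter's
// activation function, which is not reproduced in this text.
double Activation(double y)
{
    return 1.0 / (1.0 + std::exp(-y));
}

// Output z of one neuron: z = f(sum_i w[i] * x[i] + wb * 1).
// The bias input is fixed at 1, so only its weight wb is needed.
double NeuronOutput(const std::vector<double>& x,
                    const std::vector<double>& w,
                    double wb)
{
    assert(x.size() == w.size());
    double y = wb;                          // bias contribution (input = 1)
    for (std::size_t i = 0; i < x.size(); ++i)
        y += w[i] * x[i];                   // weighted sum of the inputs
    return Activation(y);
}
```

With all inputs at zero and a zero bias weight, the output is f(0), which is 0.5 for the logistic function; a nonzero bias weight shifts this resting output toward the active or the inactive state.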
Before an ANN can be used for any practical purpose, it must be trained. An ANN learns during its training. Throughout the learning process, the ANN weights are repeatedly adjusted.
3.1. Structured learning
In some instances, ANNs may learn from a data set. This set is typically known as the training set and, as its name indicates, it is used to train a new ANN. The training set has two parts: the input and the target. The input contains the set of inputs that must be applied to the network. The target includes the set of desired values at the output of the ANN. In other words, each sample (case) in the training set completely specifies all inputs, as well as the outputs that are desired when those inputs are presented, see . During the training, each case in the training set is presented to the network, and the output of the network is compared with the desired output. After all cases in the training set have been processed, an epoch (or iteration) is completed by updating the weights of the network. There are several methods for updating the weights at each epoch. All these methods update the weights in such a way that the error (measured between the actual output of the network and the desired output) is reduced at each epoch, see .
Some training methods are based on the gradient of the error (they are called gradient-based methods). These methods quickly converge to the closest local minimum. The most typical gradient-based methods to train an ANN are the variable metric method, the conjugate gradient method, and the Levenberg-Marquardt method. To make the training of an ANN robust, it is recommended to combine gradient-based methods with other optimization methods such as SA.
When an ANN is trained using a data set, the set typically includes many training cases, and the training is done all at once. This kind of training is called structured learning because the knowledge is organized in the data set for the network to learn. One of the main disadvantages of structured learning is that the training set has to be prepared to describe the problem at hand. Another disadvantage of structured learning is that if more cases are added to the training set, the ANN has to be trained again starting from scratch. As ANN training is time consuming, structured learning may be inadequate for problems that require continuous adaptation.
3.2. Continuous learning
In continuous learning, an ANN does not require a data set for training; the ANN learns by experience and is able to adapt progressively by incorporating knowledge gradually. Because some problems cannot be appropriately described by a data set, and because training using a data set can be time consuming, continuous learning is important for real-time computing (RTC), where there is a "real-time constraint".
After the ANN training has been completed, the network performance has to be validated. When using ANNs, the validation process is extremely important; as a matter of fact, the validation is as important as the training, see . The purpose of the validation is to predict how well the ANN will behave under other conditions in the future. The validation process may be performed using a data set called the validation set. The validation set is similar to the training set but not identical to it. Under normal circumstances (when the ANN is properly used), the error obtained during training and during validation should be similar.
4. Simulated annealing
SA is an optimization method that can be used to solve a broad range of problems, . SA is recommended for complex optimization problems. The algorithm begins at a specific temperature; as time passes the temperature gradually decreases following a cooling schedule as shown in Figure 5. The solution is typically described by a set of variables, but it can be described by other means. Once the algorithm has started, the solution progressively approaches the global minimum that presumably exists in a complex error surface, see and . Because of its great robustness, SA has been used in many fields including the training of ANNs with structured learning, .
One of the key features of SA is that it always provides a solution, even though the solution may not be optimal. For some optimization problems that cannot be easily modeled, SA may provide a practical option to solve them.
5. Simulated annealing evolution
Simulated annealing evolution combines ANNs, continuous learning and SA. In simulated annealing evolution, an ANN does not require a training set; instead, the ANN gradually learns new skills or improves existing ones by experience. Figure 5 shows how SA evolution works. In this figure, a typical cooling schedule used in SA is displayed. Suppose that there is a 2D landscape with valleys and hills as shown in this figure. Suppose also that it is desired to find the deepest valley in this landscape. Each of the balls in this figure represents an ANN. At the beginning of the simulation, the high initial temperature produces a state of high energy; the balls shake powerfully and are able to easily traverse the high hills of the 2D terrain. In other words, each ANN is exploring, that is, the ANN is in the initial step of learning. As the temperature decreases, the energy of the balls decreases, and the movement of the balls is more restricted than at high temperatures, see . Thus, as the temperature diminishes, the ANN has less freedom to explore new information, as the network has to integrate new knowledge with the previous one. By the end of the cooling schedule, it is desired that one of the balls has reached the deepest valley in the landscape; in other words, that one ANN has learned a set of skills.
At each temperature, an ANN (in the set) has the chance to use its knowledge to perform a specific task. As the temperature decreases, each ANN has the chance to improve its skills. If the ANNs are required to incorporate new skills, temperature cycling can be used, see . Specifically, an ANN may learn by a combination of SA and some sort of hands-on experience. Thus, simulated annealing evolution is the training of an ANN using SA without a training set.
Each ANN may be represented by a set of coefficients or weights. For illustrative purposes only, suppose that an ANN may be described by a set of two weights w11 and w12. In Figure 6, there are three individuals; each is represented by a small circle with its two weights: w11 and w12. At every temperature, each ANN is able to explore and use its abilities. At high temperatures, each ANN has limited skills and most of the ANNs will perform poorly at the required tasks. The large gray shaded circle in the figure indicates how the ANNs are considering different sets of values of w11 and w12 for testing and for learning. As the temperature decreases, the ANNs get closer to each other, illustrating the fact that most of them have learned a similar set of skills. By the time the temperature has reached its final value, the skills of each ANN will be helpful to the degree that the network is able to perform the task at hand. At this moment, the simulation may end, or the temperature may be increased if there are new skills that need to be incorporated.
6. Problem description
To illustrate how to use simulated annealing evolution, this section presents a simple learning problem. The problem consists of using an ANN to drive a car on a simple road. Clearly, the problem has two objects: the road and the car. The road includes a Start point and a Finish point as shown in Figure 7. The road includes several straight segments and turning points. The car is initially placed at the Start. The driving is performed by an ANN that is integrated with the car. Specifically, the ANN indirectly manipulates several parameters of the car such as position, speed and direction. The purpose of the simulation is to train the ANN without a training set. At the beginning of the simulation, the ANN does not know how to drive; as the evolution continues, the ANN learns and improves its skills. Its goal is to drive the car quickly from the Start to the Finish without hitting the bounds of the road (that is, the car must always stay inside the road). To solve this problem, simulated annealing evolution was used to train the ANN.
The simulation was performed using object oriented programming (OOP); the respective UML diagrams for the simulation are shown in Figures 8 and 9. The two basic classes are shown in the diagram of Figure 8: Point and Bounds. The Point class represents a point in a 2-Dimensional space; the diagram indicates that this structure includes only two floating point variables: x and y. The Bounds class, in the same UML diagram, is used to describe the bounds of an object when the object is at different positions and rotation angles. The main purpose of the Bounds class is to detect collisions when one object moves around other objects.
6.1. The object class
Figure 9 shows the respective UML diagram for the Object class. This class represents a static object in a 2-Dimensional space. The class name is displayed in italics, indicating that this class is abstract. As can be seen, the Render() method is also displayed in italics; hence, it is a virtual method and must be implemented by the non-abstract derived classes. If the experiment includes some sort of visualization, the Render() method may be used to perform custom drawing. There are many computer technologies that can be used to perform drawing; some of them are Microsoft DirectX, Microsoft GDI, Microsoft GDI+, Java Swing and OpenGL. From the UML diagram of Figure 9, it can be seen that each object has a position, a rotation angle (theta) and a set of bounds. The method IsCollision() takes another object of type Object and returns true when the bounds of one object collide with the bounds of the other object. This method was implemented using a simple version of the algorithm presented by .
6.2. The mobile class
This class represents a moving object in a 2-Dimensional space. Each Mobile object has a speed, an acceleration and a rotation speed as shown in the UML diagram of Figure 9. This class is derived directly from the Object class and, because the Render() method is not implemented, it is also abstract. The UpdateBounds() method of this class takes the number of seconds at which the object bounds will be computed. This method is extremely useful when moving an object around other objects; for instance, if the bounds of one object intersect with the bounds of another object, the object cannot move. This check is implemented internally in the method IsCollision(). To update the bounds of a Mobile object, the speed of the object may be computed using Equation 3,
$$v_2 = v_1 + a\,t \tag{3}$$
where v1 is the initial velocity of the object, v2 is its final velocity, a is the acceleration, and t stands for the time during which the object moved. Similarly, the position of the object can be updated using another kinematic equation. To compute the new position of the object, Equation 4 can be used,
$$d = v_1 t + \tfrac{1}{2} a\,t^2 \tag{4}$$
where the symbol d stands for the displacement of the object. In most cases, the object moves along a line described by its rotation angle and its position. Thus d has to be accordingly projected in the coordinate system as shown by Equations 5 and 6,
In some cases, the object may be rotating at a constant speed and, hence, the rotation angle has also to be updated at each period of time.
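The update of speed, position and rotation angle described above can be sketched in C++ as shown next. This is a hedged illustration, not the original UpdateBounds() code: the `MobileState` record and the `Advance` function are hypothetical names, the projection assumes that the rotation angle is measured in radians from the positive x axis, and the heading at the start of the step is used for the whole step.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical state record; the field names follow the attributes that
// the text lists for a moving object, not the original source code.
struct MobileState
{
    double x = 0.0;             // position
    double y = 0.0;
    double theta = 0.0;         // rotation angle in radians (assumed)
    double speed = 0.0;         // v
    double acceleration = 0.0;  // a
    double rotationSpeed = 0.0; // radians per second (assumed)
};

// Advance the state by t seconds with constant acceleration and
// constant rotation speed.
MobileState Advance(const MobileState& s, double t)
{
    assert(t >= 0.0);
    MobileState next = s;
    // Displacement along the heading: d = v1*t + a*t^2/2
    const double d = s.speed * t + 0.5 * s.acceleration * t * t;
    // Project d onto the coordinate axes using the current heading
    next.x = s.x + d * std::cos(s.theta);
    next.y = s.y + d * std::sin(s.theta);
    // Final speed: v2 = v1 + a*t
    next.speed = s.speed + s.acceleration * t;
    // A constant rotation speed changes the heading linearly in time
    next.theta = s.theta + s.rotationSpeed * t;
    return next;
}
```

Because the heading is held fixed during each step, the sketch is only accurate for short time steps when the object is rotating; a simulation refreshed many times per second satisfies this condition.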
During the experiments, the methods UpdateBounds() and IsCollision() of the Mobile class are used together to prevent object collisions. First, the simulator calls the UpdateBounds() method to compute the bounds of the object at some specific time, and then it calls the IsCollision() method to check for potential collisions with other objects.
6.3. The road
The simulation experiments were performed using only two classes: the Road class and the Car class. The UML diagram of the Road class is shown in Figure 9. The Road class is derived directly from the Object class; the method Render() is implemented to draw the road displayed in Figure 7 using Microsoft GDI and Scalable Vector Graphics (SVG).
When the car leaves the Start, the ANN has to make its first right turn at 90°; as the car is just starting to accelerate, this turn is easy. The following turn is also to the right at 90° and the ANN should not have any trouble making this turn. Then, if the ANN wants to drive the car to point A, it has to make two turns to the left.
The straight segment from A to B should be easy to drive; unfortunately, the ANN may continually accelerate and reach point B at a very high speed. The turn at point B is the most difficult of all the turns, because the car has to make a right turn at 90° at high speed. Because the simulation is over when the car hits the bounds of the road, as soon as the ANN can see the turn at point B, it has to start reducing the speed of the car. Once the ANN has managed to drive past point B, reaching the Finish should not be difficult.
6.4. The car
The car used for the simulation is shown in Figure 10. The car has a position represented by x and y in Figure 10; the car rotation is represented by θ. The speed and acceleration vectors are represented by v and a respectively. The arrow next to the rotation speed in Figure 10 indicates that the car is capable of turning. The car has several variables to store its state (position, theta, speed, acceleration, rotationSpeed and neuralNetwork) as shown in the UML diagram of Figure 9. The Car class derives directly from the Mobile class and implements the method Render() to draw the car of Figure 10 using Microsoft GDI and SVG.
When v = 0 and a = 0, the values of x and y do not change. When v ≠ 0 and a = 0, the values of x and y will change while v remains constant. When a ≠ 0, the values of x, y and v will change. The method GetDistance() computes the distance that the car has traveled along the road from the Start; this distance is represented by s in Figure 7.
Figure 11 illustrates how the car is able to receive information about its surroundings. The car had seven vision lines, illustrated by the arrows in the figure. To prevent the ANN from driving backwards, no vision lines were placed at the back of the car. Each value of d1, d2, ..., d7 represents the distance from the center of the car to the closest object in the direction of the vision line. These values were computed using the bounds of the road. To create a more interesting environment for the ANN, the values of d1, d2, ..., d7 were computed at low resolution and the car could not see objects located far away from it.
In real life, a car driver is not able to modify directly the position or velocity of the car; the driver only controls the acceleration and the turning speed. As mentioned before, each car in the simulation has an ANN to do the driving. At each period of time, the ANN receives the vision information from the surroundings and computes the acceleration and the rotation speed of the car. Figure 12 shows the ANN of the car; the ANN has eight inputs and two outputs. As can be seen from this figure, the speed of the car is also applied to the input of the network; this is very important because the ANN needs to react differently depending on the current speed of the car. For instance, if the car moves at high speed and faces a turn, it needs to reduce its speed appropriately before turning. As a matter of fact, the ANN needs to be ready for an unexpected turn and may regulate the speed of the car constantly.
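The flow of information through the driving network can be sketched in C++ as follows. This is an illustration under stated assumptions: the logistic activation, the row layout with the bias weight stored last, and the names `LayerOutput` and `Drive` are all hypothetical, and the text does not state how the two outputs are scaled into a physical acceleration and rotation speed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// One layer: each row holds the weights of one neuron, bias weight last.
// Assumption: the logistic activation and this layout are illustrative.
std::vector<double> LayerOutput(const Matrix& weights,
                                const std::vector<double>& input)
{
    std::vector<double> output;
    output.reserve(weights.size());
    for (const std::vector<double>& row : weights)
    {
        assert(row.size() == input.size() + 1);
        double y = row.back();              // bias weight times fixed input 1
        for (std::size_t i = 0; i < input.size(); ++i)
            y += row[i] * input[i];
        output.push_back(1.0 / (1.0 + std::exp(-y)));
    }
    return output;
}

// Input: seven vision distances followed by the car speed (8 values).
// Output: two values, read as acceleration and rotation speed commands.
std::vector<double> Drive(const Matrix& hiddenWeights,
                          const Matrix& outputWeights,
                          const std::vector<double>& input)
{
    return LayerOutput(outputWeights, LayerOutput(hiddenWeights, input));
}
```

For a network with four hidden neurons, the hidden matrix would have 4 rows of 9 weights (8 inputs plus bias) and the output matrix 2 rows of 5 weights (4 hidden outputs plus bias).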
7. Experimental results
This section explains how SA was used to train the ANN. The implementation of SA was divided into three steps: initialization, perturbation and error computation.
The ANN training process using SA is illustrated in Figure 13. The simulation starts by randomly setting the network weights using a uniform probability distribution U[-30, 30]. In the second block, a copy of the weights is stored in a work variable. In the third block, the temperature is set to the initial temperature. For the simulation experiments, the initial temperature was set to 30. In the fourth block (iteration = 1), the optimization algorithm begins the first iteration. Then, the work variable (a set of weights) is perturbed. After the perturbation is completed, the ANN weights are set to these new weights. At this moment, the ANN is allowed to drive the car and the error is computed as shown in the figure. The temperature decreases exponentially and the number of iterations is updated as shown in the flow diagram. The simulation ends when the error reaches the desired goal or when the temperature reaches its final value (a value of 0.1 was used).
The cooling schedule used in the simulations is described by Equation 7,
$$T_{j+1} = c\,T_j \tag{7}$$
where T_{j+1} is the next temperature value, T_j is the current temperature, and 0 < c < 1. Clearly, the cooling schedule of Equation 7 is exponential and therefore faster than a logarithmic one; for this reason, Simulated Quenching (SQ) is being used for the training of the ANN, see .
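As a small worked example of the exponential schedule, the C++ sketch below computes the cooling factor c and the resulting list of temperatures. The function names are hypothetical, and the sketch assumes that the number of temperatures counts both the initial and the final value; with the values reported for the experiments (initial temperature 30, final temperature 0.1, 10 temperatures), this assumption gives c of approximately 0.53.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Cooling factor c such that `count` temperatures run from tInitial down
// to tFinal when T[j+1] = c * T[j].
// Assumption: both end points are included in `count`.
double CoolingFactor(double tInitial, double tFinal, int count)
{
    assert(tInitial > tFinal && tFinal > 0.0 && count >= 2);
    return std::pow(tFinal / tInitial, 1.0 / (count - 1));
}

// List of temperatures visited by the exponential schedule.
std::vector<double> CoolingSchedule(double tInitial, double tFinal, int count)
{
    const double c = CoolingFactor(tInitial, tFinal, count);
    std::vector<double> temperatures;
    double t = tInitial;
    for (int j = 0; j < count; ++j)
    {
        temperatures.push_back(t);
        t *= c;                             // T[j+1] = c * T[j]
    }
    return temperatures;
}
```

Each step multiplies the temperature by the same factor, so the early steps remove most of the temperature range and the later steps refine the solution at low temperatures.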
Observe that each time the ANN weights are perturbed, the ANN is allowed to drive the car. Then, the error is computed and the oracle decides whether the new set of weights is accepted or rejected using the probability of acceptance of Equation 8, see and . Some implementations of SA accept a new solution only if the new solution is better than the old one, i.e. they accept the solution only when the error decreases; see and . The probability of acceptance is defined as
A uniform probability distribution was used to generate states for subsequent consideration. At high temperatures, the algorithm may frequently accept an ANN (a set of weights) even if the ANN does not drive better than the previous one. During this phase, the algorithm explores a very wide range looking for a globally optimal ANN, and it is not too concerned with the quality of the ANN. As the temperature decreases, the algorithm is more selective, and it accepts a new ANN only if its error is less than or very similar to the error of the previous solution, following the decision made by the oracle.
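The behavior of the oracle can be illustrated with a short C++ sketch. Since the expression of Equation 8 is not reproduced in this text, the sketch assumes the widely used Metropolis form, in which a worse solution is accepted with probability exp(−ΔE/T); this is a stand-in and not necessarily the formula used in the experiments. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Metropolis-style acceptance probability (an assumed form, not
// necessarily the chapter's Equation 8).
double AcceptanceProbability(double oldError, double newError,
                             double temperature)
{
    assert(temperature > 0.0);
    const double delta = newError - oldError;
    if (delta <= 0.0)
        return 1.0;                         // not worse: always accept
    return std::exp(-delta / temperature);  // worse: accept less often as T falls
}

// The oracle draws u from U[0, 1) and accepts the new weights when u < p.
template <class Engine>
bool Accept(double oldError, double newError, double temperature,
            Engine& engine)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    return uniform(engine)
           < AcceptanceProbability(oldError, newError, temperature);
}
```

Under this assumed form, the same increase in error is accepted far more often at the initial temperature than near the final one, which matches the exploratory and selective phases described above.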
7.1. SA initialization
Because of the properties of the activation function of Equation 2, the output of an ANN is limited. As mentioned before, an ANN is trained by adjusting the weights that connect the neurons. The training of an ANN can be simplified if the input applied to the ANN is limited. Specifically, if the input values are limited to the range from −1 to 1, then the ANN weights are limited approximately to the range from −30 to 30, and . To simplify the simulation, the input values of the ANN were scaled from −1 to 1. Therefore, the SA initialization consisted of simply assigning a random value from −30 to 30, using a uniform probability distribution, to each of the ANN weights, as shown in the C++ code of Figure 14. Observe that the random number generator uses the (ISO/IEC TR 19768) C++ Library Extensions TR1: default_random_engine and uniform_real. In this case, the ANN has two sets of weights: the hidden weights and the output weights. Each set of weights was stored in a matrix using the vector template from the Standard Template Library (STL); each matrix was built using a vector of vectors.
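A minimal sketch of this initialization is given next; it is not the code of Figure 14. It uses `std::uniform_real_distribution` from the C++11 `<random>` header, which is the standardized successor of `tr1::uniform_real`, and the function name `RandomWeights` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Build a rows x cols weight matrix (a vector of vectors) whose entries
// are drawn from a uniform distribution over [-30, 30).
Matrix RandomWeights(std::size_t rows, std::size_t cols,
                     std::default_random_engine& engine)
{
    assert(rows > 0 && cols > 0);
    std::uniform_real_distribution<double> uniform(-30.0, 30.0);
    Matrix weights(rows, std::vector<double>(cols));
    for (std::vector<double>& row : weights)
        for (double& w : row)
            w = uniform(engine);            // one independent draw per weight
    return weights;
}
```

The same function can be called twice, once for the hidden weights and once for the output weights, sharing a single engine so that the two matrices receive different random values.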
7.2. SA perturbation
The code of Figure 15 shows the implementation of the SA perturbation using the C++ language. First, each ANN weight was perturbed by adding a random value from −T to T using a uniform probability distribution (tr1::uniform_real), where T is the current temperature. Second, if the perturbed weight was outside the valid range from −30 to 30, the value was clipped to ensure that the weight remained inside the valid range.
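The two perturbation steps can be sketched as follows. This is an illustration and not the code of Figure 15: it relies on the C++11 `<random>` header instead of TR1, and the name `Perturb` and its `limit` parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Add a random value drawn uniformly from [-T, T) to every weight, then
// clip the result so that it stays inside [-limit, limit].
void Perturb(Matrix& weights, double temperature,
             std::default_random_engine& engine, double limit = 30.0)
{
    assert(temperature > 0.0 && limit > 0.0);
    std::uniform_real_distribution<double> uniform(-temperature, temperature);
    for (std::vector<double>& row : weights)
        for (double& w : row)
            w = std::min(limit, std::max(-limit, w + uniform(engine)));
}
```

At the initial temperature of 30, a single perturbation can move a weight across half of its valid range, while at the final temperature of 0.1 the weights change only slightly, so the search narrows as the temperature falls.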
7.3. SA error computation
In order to measure the driving performance of the ANN, an error function E was defined as shown in Equation 9,
where the variable s represents the distance along the road measured from the Start to the current position of the car, as shown in Figure 7. As can be seen, the value of the error decreases as the car drives along the road. The smallest error is obtained when the car reaches the Finish.
The code of Figure 16 illustrates the implementation of the error function. The function starts by setting the ANN weights. The variable deltaTimeSec is the time step used to refresh the simulation; a value of 16.7 milliseconds was used, which provides a refresh frequency of 60 Hz (so that the simulation could be rendered on a computer display at 60 frames per second). Next, the function begins a while block; at each iteration the car bounds are updated and the simulation checks for a collision between the car and the road. If there is a collision, the simulation stops and the error is computed. If there is no collision, the ANN computes the vision information and updates the acceleration and rotation speed of the car.
Several experimental simulations were performed using different configurations to analyze the behavior of the ANN and the car.
In the first simulation, the speed of the car was not applied to the input of the ANN; in all cases, the ANN was not able to turn at point B. The ANN could see the approaching turn at point B only at the last moment and did not have enough time to reduce the speed of the car.
In the second simulation, the number of neurons in the hidden layer was varied from 0 to 5. When the number of neurons in the hidden layer was zero, the ANN was able to drive the car to the Finish in 90% of the cases. When the number of neurons in the hidden layer was increased to one, the car was always able to get to the Finish. It was also observed that the ANN drove faster when using more neurons in the hidden layer; thus, the car reached its destination more quickly. When the number of neurons in the hidden layer was increased to 5, there was no noticeable change in the performance of the car compared with the case when the ANN had 4 neurons in this layer.
The third experiment consisted of varying the number of vision lines described in Figure 11. The number of vision lines was varied from 3 to 7. When using 3 vision lines, the ANN was able to reach point A in 50% of the cases and point B in 10% of the cases, and it was never able to get to the Finish. When the number of vision lines was set to 4, the ANN was able to drive the car to the Finish in 50% of the cases. When the number of vision lines was set to 5, 6 or 7, the ANN was always able to drive the car to the Finish. However, the ANN always drove faster when using more vision lines.
The SA parameters were set to be compatible with the ANN weights. The initial temperature was 30 and the final temperature was 0.1. Some experiments were performed using lower final temperatures, but there were no noticeable changes in the performance of the ANN. The number of temperatures was set to 10, using 20 iterations per temperature. Some tests were performed using more iterations, but there were no improvements. All the simulations were run using an exponential cooling schedule.
To validate the training of the ANN, another road similar to the one shown in Figure 7 was built. In all cases, the ANN behaved similarly on both roads: the road for training and the road for validation.
An ANN is a method inspired by biological processes. An ANN can learn from a training set. In this case, the problem has to be described by a training set. Unfortunately, some problems cannot be easily described by a data set. This chapter proposes the use of SA to train an ANN without a training set. We call this method simulated annealing evolution because the ANN learns by experience during the simulation.
Simulated annealing evolution can be used to train an ANN in a wide range of cases. Because human beings learn by experience, simulated annealing evolution is similar to human learning.
An optimization problem was designed to illustrate how to use SA to train an ANN. The problem included a car and a road. An ANN was used to drive the car on a simple road. The road had several straight segments and turning points. The objective of the ANN was to drive the car from the Start to the Finish of the road. At the beginning of the simulation, the car was placed at the Start and the ANN weights were set to random values. Obviously, the ANN could not drive the car very far without hitting the bounds of the road and stopping the simulation. By the time the temperature reached its final value, the ANN was able to drive successfully to the Finish of the road, as was described in the experimental results.
During the simulation, the car had a set of vision lines to compute the distance to the closest objects. The distance from each vision line (measured from the car to the closest object) was applied to the input of an ANN. It was noticed that the ANN performed much better when the speed of the car was also applied to the input of the ANN.
The number of neurons in the hidden layer of the ANN was varied during the simulations. It was observed that when the number of neurons in the hidden layer was increased, the ANN was able to reach the Finish more quickly. It was also observed that when using 5 or more neurons in the hidden layer, the performance of the ANN did not improve. It was noticed, however, that when using zero neurons in the hidden layer, the ANN could not always drive the car to the Finish.
The car vision consisted of a set of lines. Experimental simulations were performed varying the number of vision lines from 3 to 7. The experimental results indicated that when 3 vision lines were used, the ANN did not have enough information and could not drive successfully to the Finish. It was also observed that when the number of vision lines was increased, the driving of the ANN was smoother. Finally, it was noticed that when the number of vision lines was increased to 8 or more, the ANN did not improve its performance (meaning that there were no observable changes in its driving).