Underactuation, the impulsive nature of impact with the environment, the presence of feet and the large number of degrees of freedom are the basic problems in the control of biped robots. Underactuation is naturally associated with dexterity. For example, headstands are considered dexterous; in this case, the contact point between the body and the ground acts as a pivot without actuation. The impact between the lower limbs of the biped walker and the environment makes the dynamics of the system impulsive. The foot-ground impact is one of the main difficulties in the design of robust control laws for biped walkers. Unlike robotic manipulators, biped robots are always free to detach from the walking surface, and this leads to various types of motion. Finally, the many degrees of freedom in the mechanism of a biped robot make the coordination of the links difficult. For these reasons, designing a practical controller for biped robots remains a challenging problem. These features also make it difficult to apply traditional stability margins.
In fully actuated biped walkers, where the stance foot remains flat on the ground during the single support phase, well-known algorithms such as the Zero Moment Point (ZMP) principle guarantee the stability of the biped robot. The ZMP is defined as the point on the ground about which the net moment of the ground reaction forces has no component along the two axes that lie in the ground plane. Takanishi, Shin, Hirai and Dasgupta have proposed methods of walking pattern synthesis based on the ZMP. Under this stability criterion, as long as the ZMP lies strictly inside the support polygon of the foot, the desired trajectories are dynamically feasible. If the ZMP lies on the edge of the support polygon, the trajectories may not be dynamically feasible. The Foot Rotation Indicator (FRI) is a more general form of the ZMP. The FRI is the point on the ground where the net ground reaction force would have to act to keep the foot stationary. If the FRI is within the convex hull of the stance foot, the robot can walk without rolling over the toe or the heel; this is called fully actuated walking. If the FRI lies outside the projection of the foot on the ground, the stance foot rotates about the toe or the heel; this is called underactuated walking. For bipeds with point feet and Passive Dynamic Walkers (PDW) with curved feet in the single support phase, the ZMP heuristic is not applicable. Westervelt has used Hybrid Zero Dynamics (HZD) and the Poincaré mapping method to analyze the stability of RABBIT in the underactuated phase. The controller proposed in this approach is organized around the hybrid zero dynamics so that the stability analysis of the closed-loop system may be reduced to a one-dimensional Poincaré mapping problem. HZD involves the judicious choice of a set of holonomic constraints that are imposed on the robot via feedback control.
Extracting the eigenvalues of the Poincaré return map is commonly used for analyzing PDW robots. However, the use of these eigenvalues assumes periodicity and is valid only for small deviations from the limit cycle.
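As a rough illustration of the return-map analysis mentioned above, the following sketch estimates the multiplier (the single eigenvalue) of a one-dimensional Poincaré map at its fixed point by central differences. The map `toy_map` is a hypothetical stand-in chosen only so the example runs; it is not the return map of any biped, and the function names are assumptions of this sketch.

```python
def find_fixed_point(P, x0, iters=200):
    """Iterate x <- P(x); this converges only if the fixed point is attracting."""
    x = x0
    for _ in range(iters):
        x = P(x)
    return x


def return_map_multiplier(P, x_star, h=1e-6):
    """Central-difference estimate of dP/dx at the fixed point x_star."""
    return (P(x_star + h) - P(x_star - h)) / (2.0 * h)


def toy_map(x):
    """Hypothetical 1-D return map (NOT a biped model): P(x) = 0.5*x + 1."""
    return 0.5 * x + 1.0


x_star = find_fixed_point(toy_map, x0=0.0)
lam = return_map_multiplier(toy_map, x_star)

# A fixed point of the return map is locally stable when |lambda| < 1.
is_locally_stable = abs(lam) < 1.0
```

The test only says something about small perturbations around the fixed point, which is the limitation noted in the text.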
The ZMP criterion has become a very powerful tool for trajectory generation in biped walking. However, it requires stiff joint control of prerecorded trajectories, which leads to poor robustness on unknown rough terrain, whereas humans and animals show remarkable robustness when walking on irregular terrain. It is well known in biology that there are Central Pattern Generators (CPG) in the spinal cord that are coupled with the musculoskeletal system. The CPG and the feedback networks can coordinate the body links of vertebrates during locomotion. Several mathematical models have been proposed for a CPG. Among them, Matsuoka's model has been studied the most. In this model, a CPG is modeled by a Neural Oscillator (NO) consisting of two mutually inhibiting neurons. Each neuron in this model is represented by a nonlinear differential equation. This model has been used by Taga and Miyakoshi in biped robots. Kimura has used this model at the hip joints of quadruped robots.
The robot studied in this chapter is a 5-link planar biped walker in the sagittal plane with point feet. The model for such a robot is hybrid: it consists of a single support phase and a discrete map that models the frictionless impact and the instantaneous double support phase. In this chapter, the goal is to coordinate and control the body links of the robot by a CPG and a feedback network. The outputs of the CPG are the target angles in the joint space, where P controllers at the joints are used as servo controllers. For tuning the parameters of the CPG network, the control problem of the biped walker is defined as an optimization problem. It is shown that such a control system can produce a stable limit cycle (i.e. stride). The structure of this chapter is as follows. Section 2 models the walking motion, consisting of the single support phase and the impact model. Section 3 describes the CPG model and the tuning of its parameters. In Section 4, a new feedback network is proposed. In Section 5, the problem of walking control of the biped robot is defined as an optimization problem for tuning the weights of the CPG network, and the structure of the genetic algorithm used to solve this problem is described. Section 6 presents simulation results obtained in the MATLAB environment. Finally, Section 7 contains some concluding remarks.
2. Robot model
The overall motion of the biped involves continuous phases separated by abrupt changes resulting from the impact of the lower limbs with the ground. In the single support phase and the double support phase, the biped is a mechanical system that is subject to unilateral constraints. In this section, the biped is assumed to be a planar robot consisting of rigid links with revolute and parallel actuated joints that form a tree structure. In the single support phase, the mechanical system consists of DOF, where DOF are associated with the actuated joint coordinates, two DOF are associated with the unactuated horizontal and vertical displacements of the robot in the sagittal plane, and one DOF is associated with the unactuated orientation of the robot in the sagittal plane. With these assumptions, the generalized position vector of the system ( ) can be split into two subsets and . It can be expressed as
where encapsulates the joint coordinates and which is the unactuated DOF between the stance leg and the ground. Also is the Cartesian coordinates of the stance leg end.
A. Single support phase
Figure 1 depicts the single support phase and configuration variables of a 5-link biped robot ( ). In the single support phase, second order dynamical model immediately follows from Lagrange's equation and the principle of virtual work -
where is the symmetric and positive definite inertia matrix, includes centrifugal and Coriolis terms and is the vector containing gravity terms. Also includes the joint torques applied at the joints of the robot, is the input matrix, includes the joint frictions modeled by viscous and static friction terms, is the Jacobian at the stance leg end. Also is the ground reaction force at the stance leg end. With setting in (2), the dynamic equation of the mechanical system can be rewritten as the following form
where is the total mass of the robot. If we assume that the Cartesian coordinates are attached to the stance leg end and that the stance leg end is stationary (i.e. in contact with the ground and not slipping), these assumptions (i.e. ) allow one to solve for the ground reaction force as an explicit function of , . Under these assumptions, the dynamic equation in (3) also reduces to a lower-dimensional mechanical model that describes the single support phase when the stance leg end is stationary, as follows
where and is a nonlinear mapping of . Also is the state space of the reduced model where is a simply connected, open subset of . Note that is an unactuated DOF in (4) (i.e. without actuation) and hence . It can be shown that
where and are the coordinate of the mass center of link and the mass center of the robot, respectively, is the mass of the link and is the gravitational acceleration. With assumption and , we have
where is the Jacobian matrix at the center of mass, also and . The validity of the reduced model in (4) depends on the following two conditions
where is the static friction coefficient between the stance leg end and the ground. The first condition in (8) ensures that the stance leg end is in contact with the walking surface, and the second condition ensures that slipping does not occur at the stance leg end. The dynamic equation of (4) in state-variable form is expressed as where is the state vector. If we assume that and , we get and
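The two validity conditions on the ground reaction force described in this subsection (contact is maintained, and no slipping occurs) can be sketched as a simple check. The names `f_tangential`, `f_normal` and `mu` are assumptions of this sketch, and the strict inequalities are an assumed reading of the conditions in (8).

```python
def stance_contact_valid(f_tangential, f_normal, mu):
    """Return True if the reduced single-support model is valid.

    f_tangential: horizontal ground reaction force at the stance leg end
    f_normal:     vertical ground reaction force at the stance leg end
    mu:           static friction coefficient
    """
    # Condition 1: the ground must push on the leg (unilateral contact).
    stays_on_ground = f_normal > 0.0
    # Condition 2: the tangential force must stay inside the friction cone.
    does_not_slip = abs(f_tangential) < mu * f_normal
    return stays_on_ground and does_not_slip
```

In a simulation loop, such a check would typically be evaluated at every integration step of the single support phase, and the step would be rejected as soon as it fails.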
B. Frictionless impact model
In this section, the following assumptions are made for modeling the impact:
A1. the impact is frictionless (i.e. ). The main reason for this assumption is the difficulty that arises when dry friction is introduced;
A2. the impact is instantaneous;
A3. the reaction forces due to the impact at impact point can be modeled as impulses;
A4. the actuators at joints are not impulsive;
A5. the impulsive forces due to the impact may result in instantaneous change in the velocities, but there is no instantaneous change in the positions;
A6. impact results in no slipping and no rebound of the swing leg; and
A7. stance foot lifts from the ground without interaction.
With these assumptions, the impact equation can be expressed by the following equation
where is the impulsive force at the impact point and is the Jacobian matrix at the swing leg end. Assumption A6 implies that the impact is plastic. Hence, the impact equation becomes
This equation is solvable if the coefficient matrix has full rank. The determinant of the coefficient matrix is equal to and it can be shown that the coefficient matrix has full rank iff the robot is not in singular position. The solution of the equation in (11) can be given by the following equation
and also and . The map from to without relabeling is
After solving these equations, it is necessary to change the coordinates since the former swing leg must now become the stance leg. Switching due to the transfer of pivot to the point of contact is done by relabeling matrix ,  . Hence, we have
In equation (16), is the impact mapping where is the set of points of the state-space where the swing leg touches the ground. and are the state vector of the system after impact and the state vector of the system before impact, respectively. Also, we have
where by . The ground reaction force due to the impact can be shown as the following form
where by . The validity of the results of equation (17) depends on the following two conditions
where and . The first condition is to ensure that the swing foot lifts from the ground at . The second condition is to ensure that the impact results in no slipping . The valid results are used to re-initialize the model for next step. Furthermore, the double support phase has been assumed to be instantaneous. If we define
the hybrid model of the mechanical system can be given by
where . For , this model is not valid. Moreover, the validity conditions in (8) cannot be expressed as a function of alone; they must be expressed as a function of .
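To make the plastic impact model of Section 2.B more concrete, the sketch below solves the block linear system that such models usually lead to: D (qd_plus - qd_minus) = J^T F together with J qd_plus = 0, where D is the inertia matrix, J the Jacobian at the swing leg end, and F the impulse. This block form, the function names and the toy numbers are assumptions of the sketch; the matrices are not those of the 5-link robot.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular coefficient matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def plastic_impact(D, J, qd_minus):
    """Return (qd_plus, F) for a plastic (no rebound, no slip) impact."""
    n, m = len(D), len(J)
    A = [[0.0] * (n + m) for _ in range(n + m)]
    b = [0.0] * (n + m)
    for i in range(n):
        for j in range(n):
            A[i][j] = D[i][j]            # upper-left block: D
        for k in range(m):
            A[i][n + k] = -J[k][i]       # upper-right block: -J^T
        b[i] = sum(D[i][j] * qd_minus[j] for j in range(n))
    for k in range(m):
        for j in range(n):
            A[n + k][j] = J[k][j]        # lower-left block: J
    sol = solve_linear(A, b)
    return sol[:n], sol[n:]


# Toy data (hypothetical, two velocity coordinates and one constraint).
D = [[2.0, 0.0], [0.0, 1.0]]
J = [[1.0, 1.0]]
qd_plus, F = plastic_impact(D, J, qd_minus=[1.0, 1.0])
```

The solver raises an error when the coefficient matrix is singular, which mirrors the full-rank requirement stated for the impact equation. After this step the coordinates would still have to be relabeled so that the former swing leg becomes the stance leg.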
3. Control system
Neural control of human locomotion is not yet fully understood, but there is much evidence suggesting that the main control in vertebrates is performed by neural circuits in the spinal cord called central pattern generators (CPG), which are coupled with the musculoskeletal system. These central pattern generators, together with reflexes, can produce rhythmic movements such as walking, running and swimming.
A. Central pattern generator model
Several mathematical models have been proposed for the CPG. In this section, the neural oscillator model proposed by Matsuoka is used. In this model, each neural oscillator consists of two mutually inhibiting neurons (i.e. an extensor neuron and a flexor neuron). Each neuron is represented by the following nonlinear differential equations
where suffixes and mean flexor muscle and extensor muscle, respectively. Also suffix means the th oscillator. is the inner state of th neuron, is the output of the th neuron, is a variable which represents the degree of self-inhibition effect of the th neuron, is an external input from brain with a constant rate and is a feedback signal from the mechanical system which can be an angular position or an angular velocity. Moreover, and are the time constants associated with and , respectively, is a constant representing the degree of the self-inhibition influence on the inner state and is a connecting weight between the th and th neurons. Finally, the output of the neural oscillator is a linear combination of the extensor neuron inner state and the flexor neuron inner state
The positive or negative value of corresponds to activity of the flexor or extensor muscle, respectively. The output of the neural oscillator can be used as a reference trajectory, a joint torque or a phase. In this chapter, it is used as a reference trajectory at the joints. The studied robot (see Fig. 1) has four actuated joints (i.e. the hip and knee joints of the legs). We assume that one neural oscillator is used for generating the reference trajectory at each of the actuated joints.
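The behaviour of a single neural oscillator of this kind can be sketched numerically. The block below integrates the commonly cited form of the two-neuron Matsuoka oscillator with forward Euler; the equation form, all parameter values (`tau`, `tau_prime`, `beta`, `w`, `u0`), the output weights `p_f` and `p_e`, and the zero feedback are assumptions chosen for illustration, not the values tuned in this chapter.

```python
def simulate_matsuoka(tau=0.1, tau_prime=0.2, beta=2.5, w=-2.0, u0=1.0,
                      p_f=1.0, p_e=1.0, feed=lambda t: (0.0, 0.0),
                      dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of one flexor/extensor neural oscillator."""
    u_f, u_e = 0.1, 0.0   # inner states (asymmetric start to leave equilibrium)
    v_f, v_e = 0.0, 0.0   # self-inhibition (adaptation) states
    ts, outs = [], []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        y_f, y_e = max(0.0, u_f), max(0.0, u_e)   # neuron outputs
        feed_f, feed_e = feed(t)                   # feedback from the body
        du_f = (-u_f - beta * v_f + w * y_e + u0 + feed_f) / tau
        du_e = (-u_e - beta * v_e + w * y_f + u0 + feed_e) / tau
        dv_f = (-v_f + y_f) / tau_prime
        dv_e = (-v_e + y_e) / tau_prime
        u_f += dt * du_f
        u_e += dt * du_e
        v_f += dt * dv_f
        v_e += dt * dv_e
        ts.append(t)
        # Oscillator output: linear combination of the two inner states,
        # positive for flexor activity and negative for extensor activity.
        outs.append(p_f * u_f - p_e * u_e)
    return ts, outs


ts, outs = simulate_matsuoka()
```

With no feedback the output settles into a self-sustained oscillation at the endogenous frequency; passing a nonzero `feed` function is how sensory signals would entrain it.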
B. Tuning of the CPG parameters
The walking period is a very important factor, since it strongly influences stability, maximum speed and energy consumption. The walking mechanism has its own natural frequency, determined mainly by the lengths of the leg links. It appears that humans exploit the natural frequencies of their arms, swinging them like pendulums at comfortable frequencies equal to the natural frequencies. Human arms can be thought of as masses connected by springs, whose frequency response makes the energy and the control required to move the arm vary with frequency. Humans certainly learn to exploit the dynamics of their limbs for rhythmic tasks. Robotic examples of this idea include open-loop stable systems in which the dynamics are exploited so that little or no active control is required for stable operation (e.g. PDW). At the resonant frequency, the control need only inject a small amount of energy to maintain the vibration of the mass of the arm segment on the spring of the muscles and tendons. Extracting and using the natural frequencies of the links is therefore a desirable property of a robot controller. Accordingly, we match the endogenous frequency of each neural oscillator with the resonant frequency of the corresponding link. Moreover, when the swinging or supporting motions of the legs are close to free motion, there is no additional acceleration or deceleration and the motion is effective. The endogenous frequency of the CPG is its frequency when no input is applied, and it is mainly determined by and . In this section, we change the value of with constant value of . In this case, the endogenous frequency of the CPG is proportional to . It was pointed out that the proper value of for stable oscillation is within . After tuning the time constants of the CPG, the other parameters of the CPG can be tuned by using the necessary conditions for free oscillation. These necessary conditions can be written in the following form
Table I specifies the lengths, masses and inertias of each link of the robot studied in this chapter. Using these data and the extracted resonant frequencies of the links, we match the endogenous frequency of the CPG with the resonant frequency of each link. In this case, is designed at and for all of the neural oscillators. According to the conditions in (24), we tune and to 2 and -2, respectively. Also is equal to 5. The amplitude of the output signal of the CPG is approximately proportional to , and . The output parameters of the CPGs (i.e. and of the oscillators at the knee and hip joints) can be determined from the amplitude of the desired walking algorithm. Table II specifies the designed values of the output parameters of the oscillators at the knee and hip joints of the robot.
4. Feedback network
It is well known in biology that the CPG network, together with feedback signals from the body, can coordinate the members of the body, but there is not yet a suitable biological model for the feedback network. The control loop used in this section is shown in Fig. 2, where encapsulates the actuated joint coordinates and there is no feedback signal from the unactuated DOF (i.e. ). The feedback network in this control loop provides autonomous adaptation of the CPG network. In other words, by using the feedback network, the CPG network (i.e. the higher level of the control system) can correct its outputs (i.e. the reference trajectories) under various conditions of the robot.
where is a constant value and also is the neutral point of this feedback loop at hip joints. We tune the and to 1 and , respectively.
One of the important factors in the control of walking is the coordination of the knee and hip joints in each leg. For tuning the phase difference between the oscillators of the knee and hip joints in each leg, we propose the following feedback structure, which is applied only at the oscillators of the knee joints
where and are constant values, is the neutral point of the tonic stretch reflex signal at the knee joints and is a unit step function. The first terms of the feedback signals in (26) are the tonic stretch reflex terms. These terms are active in the stance phase (i.e. ). With these terms, we force the mechanical system to hold the stance knee at a certain angular position (i.e. ) during the single support phase, as in human knee joints. We call the bias of the stance knee. In this section, we tune and to 10 and , respectively. The second terms in (26) are active in the swing phase (i.e. ). These terms force the knee oscillator to increase its output at the beginning of the swing phase (i.e. ) and to decrease its output at the end of the swing phase (i.e. ). We tune and to 4 and , respectively.
5. Tuning of the weights in the CPG network
The coordination and the phase differences among the links of the biped robot in the discussed control loop are determined by the synaptic weights of the connections in the CPG network. There are two kinds of connections in the CPG network: connections among the flexor neurons and connections among the extensor neurons. The neural oscillators in the CPG network can be relabeled as shown in Fig. 3. According to this relabeling law, NO1, NO2, NO3 and NO4 correspond to the right knee, the right hip, the left hip and the left knee neural oscillators, respectively. We denote the weight matrices among the flexor and extensor neurons by and , respectively. According to the symmetry between the right leg and the left leg, these matrices can be written in the following form
This symmetry can be given by the following equations
In this chapter, we assume . With this assumption and the symmetry between the legs, there are six unknown weights to be determined (bold lines in Fig. 3). Tuning the unknown weights of the CPG network requires a notion of stability for biped robots. However, stability and stability margins for biped robots are difficult to define precisely, especially for underactuated biped robots with point feet. Since the robot discussed in this chapter has point feet, the ZMP heuristic is not applicable for trajectory generation or for verifying the dynamic feasibility of trajectories during execution. In addition, the magnitudes of the eigenvalues of the Poincaré return map may be sufficient for analyzing periodic bipedal walking, but they are not sufficient for analyzing nonperiodic motions such as walking over discontinuous rough terrain. Large deviations from a limit cycle, such as those caused by a push, also cannot be analyzed using this technique. Some researchers have suggested that the angular momentum about the Center of Mass (CoM) should be minimized throughout a motion. However, it has been shown that minimizing the angular momentum about the CoM is neither a necessary nor a sufficient condition for stable walking. For these reasons, we define the control problem of underactuated biped walking as an optimization problem for tuning the weights of the CPG network. The unknown weights are determined by finding the optimal solution of this problem. The total cost function of the optimization problem in this chapter is defined as a sum of sub cost functions and is given by
and . Also are the positive weights. The first sub cost function in (29) can be defined as a criterion of the difference between the distance travelled by the robot in the sagittal plane and the desired distance
where is the step length of the th step, is the time duration of the th step and is an upper bound of the traveled distance. Also, is the duration of the simulation. This sub cost function is a good criterion of the stability.
The second sub cost function in (29) can be defined as the least value of the normalized height of the CoM of the mechanical system during simulation and it can be given by
where is the value of the height of the CoM where the vector is equal to zero. Since the biped should maintain an erect posture during locomotion, this sub cost function is defined as a criterion of the erect body posture.
The regulation of the rate of change of the angular momentum about the CoM is not a good indicator of whether a biped will fall, but the reserve in angular momentum that can be used to help recover from a push or other disturbance is important. We use the rate of change of the angular momentum about the CoM to define the third sub cost function. Setting and in equation (5), where is the distance from the stance leg end to the CoM and is the angle from the stance leg end to the CoM with the vertical being zero (see Fig. 4), equation (5) becomes
where and . Also, the total momentum about the stance leg end consists of the angular momentum of the CoM rotating the stance foot plus the angular momentum about the CoM
where and are the angular momenta about the stance leg end and the CoM, respectively. Also the net angular momentum rate change is equal to , . Differentiating equation (34), setting in it and comparing with equation (33), it can be shown that
Hence, the third sub cost function is defined as follows
In this chapter, , and and the control problem of the biped walking is defined as the optimal solution of the following optimization problem
The optimal solution can be determined using a genetic algorithm, an evolutionary algorithm based on natural selection. Here, each generation contains 400 chromosomes; at the end of each generation, 50% of the chromosomes are preserved and the others are discarded. Roulette-wheel selection is employed, and 100 selections are made with this strategy. One-point crossover then produces 200 new chromosomes. Mutation is applied with a probability of 10% to all chromosomes except the elite chromosome, which has the highest fitness. Each parameter is encoded in 8 bits.
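The generation scheme described above can be sketched as follows. This is only an illustrative reading of the text: it assumes that each of the 100 selections draws a pair of parents, that mutation flips one random bit of a chromosome with probability 10%, and that six parameters are encoded per chromosome. The fitness function `toy_fitness` and the decoding range are hypothetical placeholders, not the chapter's cost function.

```python
import random

BITS, N_PARAMS, POP = 8, 6, 400   # 8 bits per parameter, 400 chromosomes


def decode(chrom, lo=-1.0, hi=1.0):
    """Map each 8-bit slice to a real value in [lo, hi] (range is assumed)."""
    vals = []
    for i in range(N_PARAMS):
        bits = chrom[i * BITS:(i + 1) * BITS]
        n = int("".join(str(b) for b in bits), 2)
        vals.append(lo + (hi - lo) * n / (2 ** BITS - 1))
    return vals


def toy_fitness(chrom):
    """Hypothetical positive fitness; stands in for the walking cost."""
    return 1.0 / (1.0 + sum((x - 0.5) ** 2 for x in decode(chrom)))


def next_generation(pop, fitness, rng):
    ranked = sorted(pop, key=fitness, reverse=True)
    survivors = [c[:] for c in ranked[:POP // 2]]      # keep the best 50%
    weights = [fitness(c) for c in survivors]
    children = []
    for _ in range(100):                                # 100 roulette selections
        a, b = rng.choices(survivors, weights=weights, k=2)
        cut = rng.randrange(1, BITS * N_PARAMS)         # one-point crossover
        children.append(a[:cut] + b[cut:])
        children.append(b[:cut] + a[cut:])
    new_pop = survivors + children                      # 200 + 200 = 400
    for chrom in new_pop[1:]:                           # index 0 is the elite
        if rng.random() < 0.10:
            i = rng.randrange(len(chrom))
            chrom[i] ^= 1                               # flip one bit
    return new_pop


rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(BITS * N_PARAMS)] for _ in range(POP)]
best_history = []
for _ in range(20):
    pop = next_generation(pop, toy_fitness, rng)
    best_history.append(max(toy_fitness(c) for c in pop))
```

Because the elite chromosome is copied unchanged into the next generation, the best fitness never decreases from one generation to the next.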
6. Simulation results
In this section, a 5-link planar biped robot is simulated in the MATLAB environment. Table I specifies the lengths, masses and inertias of each link of the robot. This is the model of RABBIT. RABBIT has gear reducers between its motors and links. In this biped robot, the joint friction is modeled by viscous and static friction terms as described by . Joint PI controllers are used as servo controllers. Because of the abrupt changes resulting from the impacts in the hybrid model, the servo controllers do not include derivative terms. We have designed , , and for the servo controllers at the hip and knee joints. Also, in the optimization problem, we tune and . Using the genetic algorithm, the optimal solution of the optimization problem in (37) is determined after 115 generations. The optimal solution of the optimization problem in (37) is equal to
The period of the neural oscillators in the biped robot with the best fitness is equal to . The time between consecutive impacts for this robot is equal to . Also the step length during the walking (the distance between consecutive impacts) is equal to . The snapshots of one step for the best biped robot at limit cycle in this set of experiments are depicted in Fig. 5. In this picture, the left leg is taking a step forward. It can be seen that the swing leg performs a full swing and it allows sufficient ground clearance for the foot to be transferred to a new location. In Fig. 6, the CPG outputs and the joint angle positions of the leg joints during are shown with dashed lines and solid lines, respectively. Figure 7 depicts the phase plot and the limit cycle of joint angle vs. velocity at the unactuated joint ( plane) during . Also Fig. 8 depicts the limit cycles at the phase plots of the leg joints during .
The control signals of the servo controllers during are depicted in Fig. 9. The validity of the reduced single support phase model and the impact model can be verified from the ground reaction forces plotted in Fig. 10.
To evaluate the robustness of the limit cycle of the closed-loop system, an external force is applied to the body of the biped robot as a disturbance. We assume that the external force is applied at the center of mass of the torso and that it is given by where is the disturbance amplitude, is the time when the disturbance is applied, is the duration of the pulse and is a unit step function. The stick figure of the robot for a pulse with amplitude and with pulse duration equal to which is applied at is shown in Fig. 11. This figure shows the robustness of the limit cycle against the disturbance. Also, Fig. 12 shows the stable limit cycle at the unactuated joint. Figure 13 shows the maximum amplitudes of the positive and negative pulses, as a function of pulse duration, that do not cause the robot to fall.
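A disturbance of the kind described above (an amplitude switched on at a given time for a given duration, built from unit step functions) can be sketched as a rectangular pulse. The function and parameter names, and the numbers in the usage lines, are assumptions of this sketch and are not the values used in the experiments.

```python
def unit_step(t):
    """Unit step function: 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0.0 else 0.0


def pulse_force(t, amplitude, t_start, duration):
    """Rectangular force pulse: amplitude on [t_start, t_start + duration)."""
    return amplitude * (unit_step(t - t_start)
                        - unit_step(t - t_start - duration))


# Hypothetical usage: a 50 N push starting at t = 1.0 s and lasting 0.1 s.
before = pulse_force(0.90, 50.0, 1.0, 0.1)
during = pulse_force(1.05, 50.0, 1.0, 0.1)
after = pulse_force(1.20, 50.0, 1.0, 0.1)
```

A negative `amplitude` gives a push in the opposite direction, which corresponds to the negative pulses mentioned for Fig. 13.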
7. Conclusion
In this chapter, a hybrid model was used for modeling the underactuated biped walker. This model consists of the single support phase and the instantaneous impact phase. The double support phase was also assumed to be instantaneous. For controlling the robot in underactuated walking, a CPG network and a new feedback network were used. It was shown that the period of the CPG is the most important factor influencing the stability of the biped walker. Biological experiments show that humans exploit the natural frequencies of their arms, swinging them like pendulums at comfortable frequencies equal to the natural frequencies. Extracting and using the natural frequencies of the links is a desirable property of a robot controller. Accordingly, we match the endogenous frequency of each neural oscillator with the resonant frequency of the corresponding link. In this way, the swinging or supporting motion of the legs is closer to the free motion of a pendulum or an inverted pendulum, respectively, and the motion is more effective.
It is well known in biology that the CPG network, together with feedback signals from the body, can coordinate the members of the body, but there is not yet a suitable biological model for the feedback network. In this chapter, we use the tonic stretch reflex model as the feedback signal at the hip joints of the biped walker, as studied before. However, one of the most important factors in the control of walking is the coordination, or phase difference, between the knee and hip joints in each leg. We address this by introducing a new feedback structure for the knee joint oscillators. This new feedback structure forces the mechanical system to hold the stance knee at a constant value during the single support phase. It also forces the swing knee oscillator to increase its output at the beginning of the swing phase and to decrease its output at the end of the swing phase.
The coordination of the links of the biped robot is determined by the weights of the connections in the CPG network. For tuning the synaptic weight matrix of the CPG network, we define the control problem of the biped walker as an optimization problem. The total cost function in this problem is defined as a sum of sub cost functions, each of which evaluates a different criterion of walking: the distance travelled by the biped robot in the sagittal plane, the height of the CoM, and the regulation of the angular momentum about the CoM. This problem is solved using a genetic algorithm, which determines the synaptic weight matrix of the CPG network for the biped walker with the best fitness. Simulation results show that such a control loop can produce a stable and robust limit cycle in the walking of the biped. These results also show the ability of the proposed feedback network to correct the CPG outputs. Finally, this chapter shows that using the resonant frequencies of the links reduces the number of unknown parameters in the CPG network, which makes the genetic algorithm easier to apply.