Open access peer-reviewed chapter

Cooperative Adaptive Learning Control for a Group of Nonholonomic UGVs by Output Feedback

Written By

Xiaonan Dong, Paolo Stegagno, Chengzhi Yuan and Wei Zeng

Submitted: February 12th, 2019 Reviewed: May 24th, 2019 Published: April 22nd, 2020

DOI: 10.5772/intechopen.87038



This chapter proposes a high-gain observer-based cooperative deterministic learning (CDL) control algorithm that enables a group of identical unicycle-type unmanned ground vehicles (UGVs) to track desired reference trajectories. The vehicle positions are measured, while the velocities are estimated by the high-gain observer. In the trajectory tracking controller, a radial basis function (RBF) neural network (NN) estimates the unknown vehicle dynamics online, and CDL guarantees the convergence of the NN weights and the accuracy of the estimate. The major challenge and novelty of this chapter is to track the reference trajectory with this observer-based CDL algorithm without full knowledge of the vehicle state and the vehicle model. In addition, every vehicle in the system learns the unmodeled dynamics along the union of the trajectories experienced by all vehicle agents, so that the learned knowledge can be reused to follow any reference trajectory defined in the learning phase. The learning-based tracking convergence, the consensus learning results, and the use of learned knowledge for tracking experienced trajectories are established using the Lyapunov method. Simulation results illustrate the effectiveness of the algorithm.


  • cooperative control
  • deterministic learning
  • neural network
  • multi-agent systems
  • distributed adaptive learning and control
  • unmanned ground vehicles

1. Introduction

The two-wheel-driven, unicycle-type vehicle is one of the most common mobile robot platforms, and many research results have been published on this system [1, 2, 3, 4]. There are two major challenges in controlling it: obtaining knowledge of all state variables, and accurately modeling the system. For the unicycle-type vehicle used in this chapter, both the vehicle position and velocity are required for trajectory tracking control. The position can be obtained using cameras or GPS signals, while direct measurement of the velocity is difficult. State observers have been proposed to estimate the full state of a system from the measured signals [5, 6]; however, traditional observers require knowledge of the system model for accurate state estimation. High-gain observers have been proposed to estimate the unmeasured state variables when the system model is not fully known to the observer, and the estimated states can be used for control purposes [7, 8, 9, 10]. In this chapter, we follow the standard high-gain observer design method [8] to estimate the vehicle velocity from the measured vehicle position.

For the second challenge, adaptive control has been introduced to deal with system uncertainties [11, 12], and neural network (NN) based control is further able to deal with nonlinear system uncertainties [11, 13]. Although tracking control can be achieved by NN-based adaptive control, traditional NN-based control methods fail to achieve parameter (NN weight) convergence. Because of this shortcoming, the controller must keep updating the system parameters (NN weights) whenever it is operating, which is time consuming and computationally demanding. To overcome this deficiency, a deterministic learning (DL) method has been proposed to model the system uncertainties under the partial persistency of excitation (PE) condition [14]. More specifically, it has been shown that the system uncertainties can be accurately modeled with a sufficiently large number of radial basis function (RBF) NN nodes, and that the local NN weights updated online by DL converge to their optimal values, provided that the input signal of the RBFNN is recurrent.

Because the RBFNN estimate is only locally accurate around the recurrent trajectory, it becomes a disadvantage when there are multiple tracking tasks. The learned knowledge of the system uncertainties, represented by the RBFNNs, cannot be directly applied to a different control task, and a large number of different tasks requires a significant amount of storage space. In recent years, distributed control has become a rising topic in the control of multiple coordinated agents [15, 16, 17, 18, 19, 20]. In this chapter, we take the idea of communication inside the multi-agent system (MAS) and apply it to DL, such that in the learning phase, any vehicle in the MAS is able to learn the unmodeled dynamics not only along its own trajectory, but also along the trajectories of all other vehicle agents in the MAS. In other words, the NN weights of all vehicles in the MAS converge to a common constant, which represents the unmodeled dynamics along the union trajectory of all vehicles, and any vehicle in the MAS is able to use this knowledge to achieve trajectory tracking for any control task learned in the learning phase.

The main contributions of this chapter are summarized as follows.

  1. A high-gain observer is introduced to estimate the vehicle velocities using the measurement of vehicle position.

  2. An observer and RBFNN-based adaptive learning control algorithm is developed for a multi-vehicle system, such that each vehicle agent will be able to follow the desired reference trajectory.

  3. An online cooperative adaptive NN learning law is proposed, such that the RBFNN weight of all vehicle agents will converge to one common value, which represents the unmodeled dynamics of the vehicle along the union trajectories experienced by all vehicle agents.

  4. An observer and experience-based controller is developed using the common NN model obtained from the learning phase, such that vehicles are able to follow the reference trajectory experienced by any vehicle before with improved control performance.

The rest of the chapter is organized as follows. Section 2 briefly describes some preliminaries on graph theory and the RBFNN-based DL method, and then presents the vehicle dynamics and the problem statement. The main results of this chapter, including the high-gain observer design, CDL-based trajectory tracking control, accurate cooperative learning using RBF NNs, and experience-based trajectory tracking control, are provided in Section 3. Simulation results of an example with four vehicles running three different tasks are provided in Section 4. Conclusions are drawn in Section 5.

Notations. $\mathbb{R}$, $\mathbb{R}_+$ and $\mathbb{Z}_+$ denote, respectively, the set of real numbers, the set of positive real numbers and the set of positive integers; $\mathbb{R}^{m\times n}$ denotes the set of $m\times n$ real matrices; $\mathbb{R}^{n}$ denotes the set of $n\times 1$ real column vectors; $I_n$ denotes the $n\times n$ identity matrix; $O_{m\times n}$ denotes the zero matrix of dimension $m\times n$; the subscript $(\cdot)_k$ denotes the $k$th column vector of a matrix; $|\cdot|$ is the absolute value of a real number, and $\|\cdot\|$ is the 2-norm of a vector or a matrix, i.e., $\|x\|=(x^Tx)^{1/2}$; $\dot z$ denotes the total derivative of $z$ with respect to time; $\partial/\partial z$ denotes the Jacobian matrix as $\partial/\partial z=[\partial/\partial z_1\ \cdots\ \partial/\partial z_n]$.


2. Preliminaries and problem statement

2.1 Graph theory

In a graph defined as $G=(V,\varepsilon,A)$, the elements of $V=\{1,2,\ldots,n\}$ are called vertices, the elements of $\varepsilon$ are pairs $(i,j)$ with $i,j\in V$, $i\neq j$, called edges, and the matrix $A$ is called the adjacency matrix. If $(i,j)\in\varepsilon$, then agent $i$ is able to receive information from agent $j$, and agents $i$ and $j$ are called adjacent. The adjacency matrix is thus defined as $A=[a_{ij}]_{n\times n}$, in which $a_{ij}>0$ if and only if $(i,j)\in\varepsilon$, and $a_{ij}=0$ otherwise. If there exists a path between any two nodes $v_i,v_j\in V$, then the graph $G$ is called connected. Furthermore, the graph $G$ is called fixed if $\varepsilon$ and $A$ do not change over time, and undirected if $(i,j)\in\varepsilon$ implies that the pair $(j,i)$ is also in $\varepsilon$. According to [21], the Laplacian matrix $L=[l_{ij}]_{n\times n}$ associated with the undirected graph $G$ is given by

$$l_{ij}=\begin{cases}\sum_{j=1,\,j\neq i}^{n}a_{ij}, & i=j,\\ -a_{ij}, & i\neq j.\end{cases}$$

If the graph is connected, then $L$ is a positive semi-definite symmetric matrix with one zero eigenvalue and all other eigenvalues positive; hence $\operatorname{rank}(L)\le n-1$.
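As a small numerical illustration of these properties, the sketch below builds the Laplacian of an undirected graph and checks symmetry, the single zero eigenvalue, and the rank. The 4-agent path graph and the helper name `laplacian` are assumptions made for the example; they are not the communication graph used later in the chapter.

```python
import numpy as np

def laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a

# Hypothetical undirected path graph 1-2-3-4 (illustration only).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

L = laplacian(A)
eigvals = np.linalg.eigvalsh(L)      # ascending eigenvalues of the symmetric L
rank = np.linalg.matrix_rank(L)

print("eigenvalues:", np.round(eigvals, 3))
print("rank:", rank)
```

For a connected graph the smallest eigenvalue is zero and the second smallest is strictly positive, which is the property the consensus term of the learning law relies on.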

2.2 Localized RBF neural networks and deterministic learning

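The RBF networks can be described by $f_{nn}(Z)=\sum_{i=1}^{N_n}w_is_i(Z)=W^TS(Z)$ [22], where $Z\in\Omega_Z\subset\mathbb{R}^q$ is the input vector, $W=[w_1,\ldots,w_{N_n}]^T\in\mathbb{R}^{N_n}$ is the weight vector, $N_n$ is the NN node number, and $S(Z)=[s_1(\|Z-\mu_1\|),\ldots,s_{N_n}(\|Z-\mu_{N_n}\|)]^T$, with $s_i(\cdot)$ being a radial basis function, and $\mu_i$ $(i=1,2,\ldots,N_n)$ being distinct points in state space. The Gaussian function $s_i(\|Z-\mu_i\|)=\exp\left[-\frac{(Z-\mu_i)^T(Z-\mu_i)}{\sigma_i^2}\right]$ is one of the most commonly used radial basis functions, where $\mu_i=[\mu_{i1},\mu_{i2},\ldots,\mu_{iq}]^T$ is the center of the receptive field and $\sigma_i$ is the width of the receptive field. The Gaussian function belongs to the class of localized RBFs in the sense that $s_i(\|Z-\mu_i\|)\to 0$ as $\|Z\|\to\infty$. It is easily seen that $S(Z)$ is bounded and that there exists a real constant $S_M\in\mathbb{R}_+$ such that $\|S(Z)\|\le S_M$ [14].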
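The regressor vector $S(Z)$ can be sketched in a few lines. The code below is a minimal illustration assuming Gaussian nodes with a common width placed on a regular lattice; the helper names `lattice_centers` and `rbf_vector` are hypothetical, and the lattice size and width are borrowed from the simulation section of this chapter purely as example values.

```python
import numpy as np

def lattice_centers(low, high, per_axis, dim):
    """Centers on a regular lattice covering [low, high]^dim."""
    axes = [np.linspace(low, high, per_axis)] * dim
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=1)   # shape (N_n, dim)

def rbf_vector(z, centers, sigma):
    """Gaussian regressor S(Z) with entries exp(-||Z - mu_i||^2 / sigma^2)."""
    diff = centers - np.asarray(z, dtype=float)
    return np.exp(-np.sum(diff ** 2, axis=1) / sigma ** 2)

centers = lattice_centers(0.0, 4.0, per_axis=5, dim=2)   # 25 nodes
S_near = rbf_vector([1.0, 2.0], centers, sigma=0.7)      # input on a center
S_far = rbf_vector([50.0, 50.0], centers, sigma=0.7)     # input far away

print("active nodes near the input:", int(np.sum(S_near > 0.1)))
print("largest activation far away:", S_far.max())
```

Only the few nodes whose centers lie close to the input respond noticeably, which is the localization property that the deterministic learning results exploit.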

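It has been shown in [22, 23] that for any continuous function $f(Z):\Omega_Z\to\mathbb{R}$, where $\Omega_Z\subset\mathbb{R}^q$ is a compact set, and for the NN approximator with a sufficiently large node number $N_n$, there exists an ideal constant weight vector $W^*$ such that, for any $\epsilon^*>0$, $f(Z)=W^{*T}S(Z)+\epsilon$, $\forall Z\in\Omega_Z$, where $|\epsilon|<\epsilon^*$ is the ideal approximation error. The ideal weight vector $W^*$ is an "artificial" quantity required for analysis, and is defined as the value of $W$ that minimizes $|\epsilon|$ for all $Z\in\Omega_Z\subset\mathbb{R}^q$, i.e., $W^*\triangleq\arg\min_{W\in\mathbb{R}^{N_n}}\left\{\sup_{Z\in\Omega_Z}|f(Z)-W^TS(Z)|\right\}$. Moreover, based on the localization property of RBF NNs [14], for any bounded trajectory $Z(t)$ within the compact set $\Omega_Z$, $f(Z)$ can be approximated by using a limited number of neurons located in a local region along the trajectory: $f(Z)=W_\zeta^{*T}S_\zeta(Z)+\epsilon_\zeta$, where $\epsilon_\zeta$ is the approximation error, with $\epsilon_\zeta=O(\epsilon)=O(\epsilon^*)$, $S_\zeta(Z)=[s_{j_1}(Z),\ldots,s_{j_\zeta}(Z)]^T\in\mathbb{R}^{N_\zeta}$, $W_\zeta^*=[w_{j_1}^*,\ldots,w_{j_\zeta}^*]^T\in\mathbb{R}^{N_\zeta}$, $N_\zeta<N_n$, and the integers $j_i=j_1,\ldots,j_\zeta$ are defined by $|s_{j_i}(Z_p)|>\theta$ ($\theta>0$ is a small positive constant) for some $Z_p\in Z(k)$.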

It is shown in [14] that for a localized RBF network $W^TS(Z)$ whose centers are placed on a regular lattice, almost any recurrent trajectory $Z(k)$ (see [14] for a detailed definition of "recurrent" trajectories) can lead to the satisfaction of the PE condition of the regressor subvector $S_\zeta(Z)$. This result is recalled in the following lemma.

Lemma 1 [14, 24]. Consider any recurrent trajectory $Z(k):\mathbb{Z}_+\to\mathbb{R}^q$. If $Z(k)$ remains in a bounded compact set $\Omega_Z\subset\mathbb{R}^q$, then for the RBF network $W^TS(Z)$ with centers placed on a regular lattice (large enough to cover the compact set $\Omega_Z$), the regressor subvector $S_\zeta(Z)$ consisting of RBFs with centers located in a small neighborhood of $Z(k)$ is persistently exciting.

2.3 Vehicle model and problem statement

As shown in Figure 1 , this unicycle-type vehicle is a nonholonomic system, with the constraint force preventing the vehicle from sliding along the axis of the actuated wheels. The nonholonomic constraint can be presented as follows

Figure 1.

A unicycle-type vehicle.


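$$A^T(q_i)\,\dot q_i=0\tag{1}$$

in which $A(q_i)=[\sin\theta_i\ \ {-\cos\theta_i}\ \ 0]^T$, and $q_i=[x_i\ \ y_i\ \ \theta_i]^T$ is the vector of generalized coordinates of the $i$th vehicle ($i=1,2,\ldots,n$, with $n$ being the number of vehicles in the MAS). $(x_i,y_i)$ and $\theta_i$ denote the position and orientation of the vehicle with respect to the ground coordinate frame, respectively.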

With this constraint, the number of degrees of freedom of the system is reduced to two. With the vehicle driven independently by the two actuated wheels, one on each side, the no-slip kinematics of the $i$th vehicle is


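$$\dot q_i=J(q_i)\,u_i=\begin{bmatrix}\cos\theta_i&0\\ \sin\theta_i&0\\ 0&1\end{bmatrix}\begin{bmatrix}v_i\\ \omega_i\end{bmatrix}\tag{2}$$

where $v_i$ and $\omega_i$ are the linear and angular velocities measured at the center between the driving wheels, respectively. The dynamics of the $i$th vehicle can be described by [25]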


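$$M(q_i)\ddot q_i+C(q_i,\dot q_i)\dot q_i+F(q_i,\dot q_i)+G(q_i)=B(q_i)\tau_i+A(q_i)\lambda_i\tag{3}$$

in which $M\in\mathbb{R}^{3\times3}$ is a positive definite matrix that denotes the inertia, $C\in\mathbb{R}^{3\times3}$ is the centripetal and Coriolis matrix, $F\in\mathbb{R}^{3\times1}$ is the friction vector, and $G\in\mathbb{R}^{3\times1}$ is the gravity vector. $\tau_i\in\mathbb{R}^{2\times1}$ is the vector of system inputs, i.e., the torque applied on each driving wheel, and $B=\frac{1}{r}\begin{bmatrix}\cos\theta_i&\cos\theta_i\\ \sin\theta_i&\sin\theta_i\\ R&-R\end{bmatrix}\in\mathbb{R}^{3\times2}$ is the input transformation matrix, projecting the system input $\tau_i$ onto the space spanned by $(x,y,\theta)$, in which $D=2R$ is the distance between the two actuated wheels, and $r$ is the radius of the wheel. $\lambda_i$ is a Lagrange multiplier, and $A\lambda_i\in\mathbb{R}^{3\times1}$ denotes the constraint force.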

Matrices $M$ and $C$ in Eq. (3) can be derived using the Lagrangian equation with the following steps. First we calculate the kinetic energy of the $i$th vehicle agent


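$$T_i=\frac{1}{2}m\left(\dot x_{ic}^2+\dot y_{ic}^2\right)+\frac{1}{2}I\dot\theta_{ic}^2\tag{4}$$

where $m$ is the mass of the vehicle, $I$ is the moment of inertia measured at the center of mass, and $x_{ic}$, $y_{ic}$, and $\theta_{ic}$ are the position and orientation of the vehicle at the center of mass, respectively. With $d$ denoting the distance from the center between the driving wheels to the center of mass, the following relation can be obtained from Figure 1:

$$x_{ic}=x_i+d\cos\theta_i,\qquad y_{ic}=y_i+d\sin\theta_i,\qquad \theta_{ic}=\theta_i\tag{5}$$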


Then Eq. (4) can be rewritten into


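$$T_i=\frac{1}{2}\dot q_i^TM\dot q_i\tag{6}$$

in which $M=\begin{bmatrix}m&0&-md\sin\theta_i\\ 0&m&md\cos\theta_i\\ -md\sin\theta_i&md\cos\theta_i&md^2+I\end{bmatrix}$. It will be shown later that the inertia matrix $M$ shown above is identical to that in Eq. (3). The dynamics equation of the system is then given by the following Lagrangian equation [26],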


in which $L(q_i,\dot q_i)=T(q_i,\dot q_i)-U(q_i)$ is the Lagrangian of the $i$th vehicle, $U(q_i)$ is the potential energy of the vehicle agent, $\lambda\in\mathbb{R}^{k\times1}$ is the Lagrangian multiplier, and $A^T\lambda$ is the constraint force. $Q_i=B(q_i)\left[\tau_i-f(u_i)\right]$ denotes the external force, where $\tau_i$ is the force generated by the actuator, and $f(u_i)$ is the friction on the actuator. Then Eq. (7) can be rewritten into


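By setting $C(q_i,\dot q_i)\dot q_i=\dot M\dot q_i-\left(\partial T_i/\partial q_i\right)^T$, $F(q_i,\dot q_i)=B(q_i)f(\dot q_i)$, and $G(q_i)=\left(\partial U_i/\partial q_i\right)^T$, Eq. (8) can be transformed into Eq. (3). Notice that the form of $C\in\mathbb{R}^{n\times n}$ is not unique; however, with a proper definition of the matrix $C$, the matrix $\dot M-2C$ is skew-symmetric. The $ij$th entry of $C$ is defined as follows [26].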


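$$c_{ij}=\sum_{k=1}^{n}c_{ijk}\,\dot q_k\tag{9}$$

where $\dot q_k$ is the $k$th entry of $\dot q$, and $c_{ijk}=\frac{1}{2}\left(\frac{\partial m_{ij}}{\partial q_k}+\frac{\partial m_{ik}}{\partial q_j}-\frac{\partial m_{jk}}{\partial q_i}\right)$ is defined using the Christoffel symbols of the first kind. The centripetal and Coriolis matrix is then calculated as $C=\begin{bmatrix}0&0&-md\dot\theta_i\cos\theta_i\\ 0&0&-md\dot\theta_i\sin\theta_i\\ 0&0&0\end{bmatrix}$. Since the vehicle is operating on the ground, the gravity vector $G$ is equal to zero. The friction vector $F$ is assumed to be a nonlinear function of the generalized velocity $u_i$, and is unknown to the controller.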

To eliminate the nonholonomic constraint force $A(q_i)\lambda_i$ from Eq. (3), we left-multiply the equation by $J^T(q_i)$, which yields:


From Eqs. (1) and (2), we have $J^TA=0_{2\times1}$, so the dynamic equation of $u_i$ simplifies to




The number of degrees of freedom of the vehicle dynamics is now reduced to two. Since $J^TB$ is of full rank, for any transformed torque input $\bar\tau_i$ there exists a unique corresponding actual torque input $\tau_i\in\mathbb{R}^2$ that is applied on the wheels.
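The reduction can be checked numerically. The sketch below assumes that substituting $\dot q_i=Ju_i$ into the dynamics and left-multiplying by $J^T$ gives the reduced matrices $\bar M=J^TMJ$ and $\bar C=J^T(M\dot J+CJ)$; this is the usual construction, stated here as an assumption rather than as the chapter's exact definition. It uses the matrices $M$, $C$, $A$ quoted in the text, with the physical parameters from the simulation section and an arbitrary test configuration.

```python
import numpy as np

m, I, d = 2.0, 0.2, 0.1           # mass, inertia, offset (simulation section)
theta, theta_dot = 0.7, 1.3       # arbitrary test configuration (assumption)
s, c = np.sin(theta), np.cos(theta)

M = np.array([[m, 0.0, -m * d * s],
              [0.0, m, m * d * c],
              [-m * d * s, m * d * c, m * d ** 2 + I]])
# Time derivative of M along the motion (only theta varies in M).
M_dot = theta_dot * np.array([[0.0, 0.0, -m * d * c],
                              [0.0, 0.0, -m * d * s],
                              [-m * d * c, -m * d * s, 0.0]])
C = np.array([[0.0, 0.0, -m * d * theta_dot * c],
              [0.0, 0.0, -m * d * theta_dot * s],
              [0.0, 0.0, 0.0]])
J = np.array([[c, 0.0], [s, 0.0], [0.0, 1.0]])
J_dot = theta_dot * np.array([[-s, 0.0], [c, 0.0], [0.0, 0.0]])
A = np.array([s, -c, 0.0])

N = M_dot - 2.0 * C                     # expected to be skew-symmetric
M_bar = J.T @ M @ J                     # assumed reduced inertia matrix
C_bar = J.T @ (M @ J_dot + C @ J)       # assumed reduced Coriolis matrix

print("J^T A =", J.T @ A)
print("M_bar =\n", np.round(M_bar, 4))
print("C_bar =\n", np.round(C_bar, 4))
```

Under these assumptions the reduced inertia matrix is constant and diagonal, $\operatorname{diag}(m,\ md^2+I)$, and the reduced Coriolis matrix is skew-symmetric with off-diagonal entries $\mp md\dot\theta_i$.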

The main challenges in controlling the system are that (i) direct measurement of the linear and angular velocities is not feasible, and (ii) the system parameter matrices $\bar C$ and $\bar F$ are unknown to the controller.

Based on the above system setup, we are ready to formulate the objective of this chapter. Consider a group of $n$ homogeneous unicycle-type vehicles, where the kinematics and dynamics of each vehicle agent are described by Eqs. (2) and (11), respectively. The communication graph of these $n$ vehicles is denoted as $G$. Regarding this MAS, we have the following assumption.

Assumption 1. The graph $G$ is undirected and connected.

The objective of this chapter is to design an output-feedback adaptive learning control law for each vehicle agent in the MAS, such that

  1. State estimation: The immeasurable generalized velocities $u_i=[v_i\ \ \omega_i]^T$ can be estimated by a high-gain observer using the measurement of the generalized coordinates $q_i=[x_i\ \ y_i\ \ \theta_i]^T$.

  2. Trajectory tracking: Each vehicle in the MAS will track its desired reference trajectory, which is quantified by $(x_{ri}(t),y_{ri}(t),\theta_{ri}(t))$; i.e., $\lim_{t\to\infty}\left(x_i(t)-x_{ri}(t)\right)=0$, $\lim_{t\to\infty}\left(y_i(t)-y_{ri}(t)\right)=0$, $\lim_{t\to\infty}\left(\theta_i(t)-\theta_{ri}(t)\right)=0$.

  3. Cooperative learning: The unknown homogeneous dynamics of all the vehicles can be locally accurately identified along the union of the trajectories experienced by all vehicle agents in the MAS.

  4. Experience-based control: The identified/learned knowledge from the cooperative learning phase can be reused by each local vehicle to perform stable trajectory tracking with improved control performance.

In order to apply the deterministic learning theory, we have the following assumption on the reference trajectories.

Assumption 2. The reference trajectories $x_{ri}(t)$, $y_{ri}(t)$, $\theta_{ri}(t)$ for all $i=1,\ldots,n$ are recurrent.


3. Main results

3.1 High-gain observer design

In mobile robotics control, the position of the vehicle can be easily obtained in real time using GPS signals or camera positioning, while the direct measurement of the velocities is much more difficult. For the control and system estimation purposes, the velocities of the vehicle are required for the controller. To this end, we follow the high-gain observer design method in [8, 9], and introduce a high-gain observer to estimate the velocities using robot positions. First, we define two new variables as follows


$$p_{xi}=x_i\cos\theta_i+y_i\sin\theta_i,\qquad p_{yi}=-x_i\sin\theta_i+y_i\cos\theta_i\tag{12}$$

Notice that the operation above can be considered as projecting the vehicle position onto a frame whose origin is fixed to the origin of the ground coordinates, and whose axes are parallel to the body-fixed frame of the vehicle. The coordinates of the vehicle in this rotating frame are $(p_{xi},p_{yi})$; hence, $p_{xi}$ and $p_{yi}$ can be calculated from the measurement of the position and the orientation. The rotation rate of this frame equals the angular velocity of the vehicle, $\dot\theta_i=\omega_i$. Based on this, we design the high-gain observer for $\omega_i$ as


in which $\delta$ is a small positive scalar to be designed, and $l_1$ and $l_2$ are parameters to be chosen such that $\begin{bmatrix}-l_1&1\\ -l_2&0\end{bmatrix}$ is Hurwitz stable. The time derivatives of the coordinates defined in Eq. (12) are given by $\dot p_{xi}=v_i+p_{yi}\omega_i$ and $\dot p_{yi}=-p_{xi}\omega_i$, and we then design the high-gain observer for $v_i$ as


To limit peaking in the observer transient, and in turn improve the transient response, the parameter $\delta$ cannot be chosen too small [9]. When a globally bounded control is used, decreasing $\delta$ does not induce the peaking phenomenon in the state variables of the system; the achievable reduction of $\delta$ is then limited by practical factors such as measurement noise and sampling rate [7, 27]. Following [8], it can be shown that the estimation error $z_i=u_i-\hat u_i$ between the actual and estimated velocities of the $i$th vehicle converges to zero; the detailed proof is omitted here due to space limitations.
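The idea can be illustrated with a generic two-state high-gain observer that estimates the derivative of a measured signal. This is a sketch of the standard textbook structure, $\dot{\hat\theta}=\hat\omega+(l_1/\delta)(\theta-\hat\theta)$, $\dot{\hat\omega}=(l_2/\delta^2)(\theta-\hat\theta)$, and is an assumption rather than a transcription of Eqs. (13) and (14). The test signal $\theta(t)=\sin t$, the step size, and the function name `simulate_observer` are likewise hypothetical; $\delta$, $l_1$, $l_2$ take the values quoted in the simulation section.

```python
import numpy as np

def simulate_observer(delta=0.01, l1=1.0, l2=1.0, dt=1e-4, t_end=3.0):
    """Explicit-Euler simulation of a generic high-gain observer.

    Returns the rate-estimation error omega - omega_hat at every step.
    """
    steps = int(round(t_end / dt))
    theta_hat, omega_hat = 0.0, 0.0
    err = np.empty(steps)
    for k in range(steps):
        t = k * dt
        theta = np.sin(t)          # measured output (hypothetical signal)
        omega = np.cos(t)          # true rate, used only to log the error
        innov = theta - theta_hat  # output estimation error
        theta_hat += dt * (omega_hat + (l1 / delta) * innov)
        omega_hat += dt * ((l2 / delta ** 2) * innov)
        err[k] = omega - omega_hat
    return err

err = simulate_observer()
print("initial |error|:", abs(err[0]))
print("max |error| over the last 0.5 s:", np.max(np.abs(err[-5000:])))
```

After a short transient the rate error stays small, of the order of $\delta$ times the signal's second derivative, which is why a small $\delta$ is attractive until noise and sampling effects dominate.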

3.2 Controller design and tracking convergence analysis

After obtaining the linear and angular velocities from the high-gain observer, we now proceed to the trajectory tracking. First, we define the tracking error $\tilde q_i$ by projecting $q_{ri}-q_i$ onto the body coordinate frame of the $i$th vehicle, with the $x$ axis pointing to the front and the $y$ axis to the left of the vehicle, as shown in Figure 2.

Figure 2.

Projecting tracking error onto the body-fixed frame.


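$$\tilde q_i=\begin{bmatrix}\tilde x_i\\ \tilde y_i\\ \tilde\theta_i\end{bmatrix}=\begin{bmatrix}\cos\theta_i&\sin\theta_i&0\\ -\sin\theta_i&\cos\theta_i&0\\ 0&0&1\end{bmatrix}\left(q_{ri}-q_i\right)\tag{15}$$

Using the constraint Eq. (1) and the kinematics Eq. (2), and denoting by $v_{ri}$ and $\omega_{ri}$ the linear and angular velocities of the reference trajectory, we obtain the derivative of the tracking error as follows

$$\dot{\tilde x}_i=v_{ri}\cos\tilde\theta_i+\omega_i\tilde y_i-v_i,\qquad \dot{\tilde y}_i=v_{ri}\sin\tilde\theta_i-\omega_i\tilde x_i,\qquad \dot{\tilde\theta}_i=\omega_{ri}-\omega_i\tag{16}$$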


where $v_i$ and $\omega_i$ are the linear and angular velocities of the $i$th vehicle, respectively.

In order to apply backstepping control theory, we treat $v_i$ and $\omega_i$ in Eq. (16) as virtual inputs. Then, following the methodology of [28], we can design a stabilizing virtual controller as


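$$u_{ci}=\begin{bmatrix}v_{ci}\\ \omega_{ci}\end{bmatrix}=\begin{bmatrix}v_{ri}\cos\tilde\theta_i+K_x\tilde x_i\\ \omega_{ri}+v_{ri}K_y\tilde y_i+K_\theta\sin\tilde\theta_i\end{bmatrix}\tag{17}$$

in which $K_x$, $K_y$, and $K_\theta$ are all positive constants. It can be shown that this virtual velocity controller is able to stabilize the closed-loop system Eq. (16) kinematically when $v_i$ and $\omega_i$ are replaced with $v_{ci}$ and $\omega_{ci}$, respectively. To this end, we define the following Lyapunov function for the $i$th vehicle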


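$$V_{1i}=\frac{\tilde x_i^2}{2}+\frac{\tilde y_i^2}{2}+\frac{1-\cos\tilde\theta_i}{K_y}\tag{18}$$

and the derivative of $V_{1i}$ is

$$\dot V_{1i}=-K_x\tilde x_i^2-\frac{K_\theta}{K_y}\sin^2\tilde\theta_i\le 0\tag{19}$$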


Since $\dot V_{1i}$ is negative semi-definite, we can conclude that this closed-loop system is stable, i.e., the tracking error $\tilde q_i$ of the $i$th vehicle is bounded.
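The kinematic part of the design can be tried out in simulation. The sketch below treats the velocities as direct inputs (no dynamics, no observer) and applies a virtual velocity law of the form $v_c=v_r\cos\tilde\theta+K_x\tilde x$, $\omega_c=\omega_r+v_rK_y\tilde y+K_\theta\sin\tilde\theta$ to the body-frame tracking error. The circular reference, initial poses, step size, and the function name `track` are assumptions chosen for illustration.

```python
import numpy as np

def track(Kx=1.0, Ky=1.0, Kth=1.0, dt=0.01, t_end=40.0):
    """Euler simulation of kinematic tracking; returns the position error norm."""
    q = np.array([0.0, 0.0, 0.0])      # vehicle pose (x, y, theta)
    qr = np.array([1.0, 0.5, 0.3])     # reference pose (hypothetical start)
    vr, wr = 1.0, 0.5                  # hypothetical circular reference
    history = []
    for _ in range(int(round(t_end / dt))):
        c, s = np.cos(q[2]), np.sin(q[2])
        # Rotate the pose error into the body-fixed frame of the vehicle.
        R = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
        xe, ye, the = R @ (qr - q)
        history.append(np.hypot(xe, ye))
        # Virtual velocity law applied directly as the vehicle input.
        v = vr * np.cos(the) + Kx * xe
        w = wr + vr * Ky * ye + Kth * np.sin(the)
        # Unicycle kinematics for the vehicle and for the reference.
        q = q + dt * np.array([v * c, v * s, w])
        qr = qr + dt * np.array([vr * np.cos(qr[2]), vr * np.sin(qr[2]), wr])
    return np.array(history)

errors = track()
print("initial position error:", errors[0])
print("final position error:", errors[-1])
```

Because the reference moves with nonzero velocity, the error does not get stuck on the set where only $\tilde x$ and $\sin\tilde\theta$ vanish, which matches the invariance argument given for the kinematic loop.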

Remark 1. In addition to the stability conclusion above, we can also conclude asymptotic stability by finding the invariant set of $\dot V_{1i}=0$. By setting $\dot V_{1i}=0$, we have $\tilde x_i=0$ and $\sin\tilde\theta_i=0$. Applying this result to Eqs. (16) and (17), we find that the invariant set equals $\{\tilde x_i=0,\ \tilde y_i=0,\ \sin\tilde\theta_i=0\}\cup\{\tilde x_i=0,\ \sin\tilde\theta_i=0,\ \tilde y_i\neq0,\ v_{ri}=0,\ \omega_{ri}=0\}$. Under Assumption 2, the velocity of the reference cannot be constant over time, so the only invariant subset of $\dot V_{1i}=0$ is the origin $\tilde q_i=0$. Therefore, we can conclude that the closed-loop system Eqs. (16) and (17) is asymptotically stable [29].

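Following the idea of backstepping control, we then derive the transformed torque input $\bar\tau_i$ for the $i$th vehicle with the following steps. By defining the error between the virtual controller $u_{ci}$ and the actual velocity $u_i$ as $\tilde u_i=[\tilde v_i\ \ \tilde\omega_i]^T=u_{ci}-u_i$, we can rewrite Eq. (16) in terms of $\tilde v_i$ and $\tilde\omega_i$ as

$$\dot{\tilde x}_i=\tilde v_i+\omega_i\tilde y_i-K_x\tilde x_i,\qquad \dot{\tilde y}_i=v_{ri}\sin\tilde\theta_i-\omega_i\tilde x_i,\qquad \dot{\tilde\theta}_i=\tilde\omega_i-v_{ri}K_y\tilde y_i-K_\theta\sin\tilde\theta_i\tag{20}$$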


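Then we define a new Lyapunov function $V_{2i}=V_{1i}+\frac{1}{2}\tilde u_i^T\bar M\tilde u_i$ for the closed-loop system Eq. (20), whose derivative can be written as

$$\dot V_{2i}=-K_x\tilde x_i^2-\frac{K_\theta}{K_y}\sin^2\tilde\theta_i+\tilde u_i^T\left(\begin{bmatrix}\tilde x_i\\ \dfrac{\sin\tilde\theta_i}{K_y}\end{bmatrix}+\bar M\dot{\tilde u}_i\right)\tag{21}$$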


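To make the system stable, the term $\tilde u_i^T\left(\begin{bmatrix}\tilde x_i\\ \sin\tilde\theta_i/K_y\end{bmatrix}+\bar M\dot{\tilde u}_i\right)$ needs to be negative definite. From the definition of $\tilde u_i$ and Eq. (11), we have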


Motivated by the results of [9], it is easy to show that this term is negative definite if $\bar\tau_i$ is designed to be


where $K_u$ is a positive constant. Since the actual linear and angular velocities of the vehicle are unknown, we use $\hat v_i$ and $\hat\omega_i$ generated by the high-gain observer Eqs. (13) and (14) to replace $v_i$ and $\omega_i$ in Eq. (23). From the discussion in the previous subsection, the convergence of the velocity estimates is guaranteed.

In Eq. (23), $\bar C(u_i)$ and $\bar F(u_i)$ are unknown to the controller. To overcome this issue, an RBFNN is used to approximate this nonlinear uncertain term, i.e.,


in which $S(X_i)$ is the vector of RBFs, with the variable (RBFNN input) $X_i=u_i$, $W^*$ is the common ideal estimation weight of this RBFNN, and $\epsilon_i$ is the ideal estimation error, which can be made arbitrarily small given a sufficiently large number of neurons. Consequently, we propose the implementable controller for the $i$th vehicle as follows


For the NN weights used in Eq. (25), we propose an online NN weight updating law as follows


where $\Gamma$, $\gamma$, and $\beta$ are positive constants.
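To make the role of the three gains concrete, the following sketch implements one plausible form of a cooperative weight update, $\dot{\hat W}_i=\Gamma S(X_i)\tilde u_i^T-\gamma\hat W_i-\beta\sum_{j}a_{ij}(\hat W_i-\hat W_j)$. This form is an assumption inferred from the terms that appear in the Lyapunov derivative Eq. (31), not a transcription of Eq. (26). The function name `weight_step`, the path graph, and the array shapes are hypothetical. The demonstration switches off the learning and leakage terms to isolate the consensus effect.

```python
import numpy as np

def weight_step(W_hat, S, u_tilde, Lap, Gamma, gamma, beta, dt):
    """One explicit-Euler step of the assumed cooperative update for all agents.

    W_hat: (n, N, 2) weights, S: (n, N) regressors, u_tilde: (n, 2) errors,
    Lap: (n, n) graph Laplacian.
    """
    learn = Gamma * np.einsum("ik,il->ikl", S, u_tilde)      # Gamma * S u~^T
    leak = -gamma * W_hat                                    # leakage term
    # (Lap @ W_hat)_i equals sum_j a_ij (W_i - W_j) for L = D - A.
    consensus = -beta * np.einsum("ij,jkl->ikl", Lap, W_hat)
    return W_hat + dt * (learn + leak + consensus)

# Hypothetical undirected path graph with four agents.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Lap = np.diag(A.sum(axis=1)) - A

rng = np.random.default_rng(0)
n, N = 4, 3
W0 = rng.normal(size=(n, N, 2))
W = W0.copy()
for _ in range(5000):
    # No excitation (S = 0, u_tilde = 0) and no leakage: consensus term only.
    W = weight_step(W, np.zeros((n, N)), np.zeros((n, 2)), Lap,
                    Gamma=10.0, gamma=0.0, beta=10.0, dt=1e-3)

spread = np.max(np.abs(W - W.mean(axis=0)))
print("largest deviation from the agent average:", spread)
```

With a connected undirected graph the consensus term alone drives all agents to the average of their initial weights; in the full law the learning term then shapes that common value along the union of the agents' trajectories.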

Theorem 1. Consider the closed-loop system consisting of the $n$ vehicles in the MAS described by Eqs. (2) and (11), the reference trajectory $q_{ri}(t)$, the high-gain observer Eqs. (13) and (14), the adaptive NN controller Eq. (25) with the virtual velocity Eq. (17), and the online weight updating law Eq. (26). Under Assumptions 1 and 2, for any bounded initial condition of all the vehicles and $\hat W_i=0$, the tracking error $\tilde q_i$ converges asymptotically to a small neighborhood around zero for all vehicle agents in the MAS.

Proof: We first derive the error dynamics of the velocity between $u_{ci}$ and $u_i$ using Eqs. (22) and (25)


where $\epsilon_i=[\epsilon_{vi}\ \ \epsilon_{\omega i}]^T$ and $\tilde W_i=W^*-\hat W_i$. Notice that the convergence of $\hat u_i$ to $u_i$ is guaranteed by the high-gain observer. Then we derive the error dynamics of the NN weights as follows


For the closed-loop system given by Eqs. (20), (27), and (28), we can build a positive definite function $V$ as


whose derivative is equal to


By using Eqs. (27) and (28), the equation above is equivalent to

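$$\begin{aligned}
\dot V={}&\sum_{i=1}^{n}\Bigg\{\tilde x_i\left(\tilde v_i+\omega_i\tilde y_i-K_x\tilde x_i\right)+\tilde y_i\left(v_{ri}\sin\tilde\theta_i-\omega_i\tilde x_i\right)+\frac{\sin\tilde\theta_i}{K_y}\left(\tilde\omega_i-v_{ri}K_y\tilde y_i-K_\theta\sin\tilde\theta_i\right)\\
&\quad+\tilde u_i^T\left[\tilde W_i^TS(X_i)+\epsilon_i-K_u\tilde u_i-\begin{bmatrix}\tilde x_i\\ \dfrac{\sin\tilde\theta_i}{K_y}\end{bmatrix}\right]
+\operatorname{trace}\left(\tilde W_i^T\left[-S(X_i)\tilde u_i^T+\frac{\gamma\hat W_i}{\Gamma}+\frac{\beta}{\Gamma}\sum_{j=1}^{n}a_{ij}\left(\hat W_i-\hat W_j\right)\right]\right)\Bigg\}\\
={}&\sum_{i=1}^{n}\Bigg\{-K_x\tilde x_i^2-\frac{K_\theta}{K_y}\sin^2\tilde\theta_i-K_u\tilde u_i^T\tilde u_i+\tilde u_i^T\epsilon_i+\tilde u_i^T\left[\tilde W_i^TS(X_i)\right]-\operatorname{trace}\left(\left[\tilde W_i^TS(X_i)\right]\tilde u_i^T\right)+\operatorname{trace}\left(\frac{\gamma\tilde W_i^T\hat W_i}{\Gamma}\right)\Bigg\}\\
&\quad+\operatorname{trace}\left(\sum_{i=1}^{n}\frac{\beta}{\Gamma}\tilde W_i^T\sum_{j=1}^{n}a_{ij}\left(\hat W_i-\hat W_j\right)\right)\\
={}&\sum_{i=1}^{n}\left\{-K_x\tilde x_i^2-\frac{K_\theta}{K_y}\sin^2\tilde\theta_i-K_u\tilde u_i^T\tilde u_i+\tilde u_i^T\epsilon_i+\frac{\gamma}{\Gamma}\operatorname{trace}\left(\tilde W_i^T\hat W_i\right)\right\}-\frac{\beta}{\Gamma}\operatorname{trace}\left(\tilde W^T(L\otimes I)\tilde W\right)
\end{aligned}\tag{31}$$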

where $L$ is the Laplacian matrix of $G$, and $\tilde W=[\tilde W_1^T\ \cdots\ \tilde W_n^T]^T$. Since $\beta$ and $\Gamma$ are positive, and $L$ is positive semi-definite, we have $\frac{\beta}{\Gamma}\operatorname{trace}\left(\tilde W^T(L\otimes I)\tilde W\right)\ge0$. Notice that the estimation error can be made arbitrarily small with a sufficiently large number of neurons, and that $\gamma$ is the leakage term, chosen as a small positive constant. Therefore, we can conclude that the closed-loop system Eqs. (20), (27), and (28) is stable, i.e., $\dot V\le0$, if the following condition holds


Hence, the closed-loop system is stable, and all tracking errors are bounded. Since all variables in Eq. (31) are continuous (i.e., $\ddot V$ is bounded), applying Barbalat's Lemma [30] gives $\lim_{t\to\infty}\dot V=0$, which implies that the tracking error $\tilde q_i$ of every agent converges to a small neighborhood of zero, whose size depends on the norm of $\tilde u_i^T\epsilon_i+\frac{\gamma}{\Gamma}\operatorname{trace}\left(\tilde W_i^T\hat W_i\right)$. Q.E.D.

3.3 Consensus convergence of NN weights

In addition to the tracking convergence shown in the previous subsection, we show in this subsection that all vehicles in the system are able to learn the unknown vehicle dynamics along the union trajectory (denoted as $\bigcup_{i=1}^{n}\zeta_i(X_i(t))$) experienced by all vehicles.

By defining $\tilde v=[\tilde v_1\ \cdots\ \tilde v_n]^T$, $\tilde\omega=[\tilde\omega_1\ \cdots\ \tilde\omega_n]^T$, $\tilde W_v=[\tilde W_{1,1}\ \cdots\ \tilde W_{n,1}]^T$, and $\tilde W_\omega=[\tilde W_{1,2}\ \cdots\ \tilde W_{n,2}]^T$, we combine the error dynamics in Eqs. (27) and (28) for all vehicles into the following form:


in which


where $S=\operatorname{diag}\left(S(X_1),S(X_2),\ldots,S(X_n)\right)$, and


As shown in Theorem 1, the tracking error $\tilde q_i$ converges to a small neighborhood of zero for all vehicle agents in the MAS. Furthermore, the ideal estimation errors $\epsilon_{vi}$ and $\epsilon_{\omega i}$ can be made arbitrarily small given a sufficient number of RBF neurons, and $\gamma$ is chosen to be a small positive constant; therefore, we can conclude that the norm of $E$ in Eq. (33) is small. In the following theorem, we show that $\hat W_i=[\hat W_{i,1}\ \ \hat W_{i,2}]$ converges to a small neighborhood of the common ideal weight $W^*$ for all $i=1,\ldots,n$ under Assumptions 1 and 2.

Before proceeding further, we denote the system trajectory of the $i$th vehicle as $\zeta_i$ for all $i=1,\ldots,n$. Using the same notation as [14], the subscripts $(\cdot)_\zeta$ and $(\cdot)_{\bar\zeta}$ denote the parts of $(\cdot)$ related to the regions close to and away from the trajectory $\zeta$, respectively.

Theorem 2. Consider the error dynamics Eq. (33). Under Assumptions 1 and 2, for any bounded initial condition of all the vehicles and $\hat W_i=0$, along the union of the system trajectories $\bigcup_{i=1}^{n}\zeta_i(X_i(t))$, all local estimated neural weights $\hat W_{\zeta i}$ used in Eqs. (25) and (26) converge to a small neighborhood of their common ideal value $W^*_\zeta$, and locally accurate identification of the nonlinear uncertain dynamics $H(X(t))$ can be obtained by $\hat W_i^TS(X)$ as well as $\bar W_i^TS(X)$ for all $X\in\bigcup_{i=1}^{n}\zeta_i(X_i(t))$, where


$$\bar W_i=\operatorname{mean}_{t\in[t_{ai},t_{bi}]}\hat W_i(t)\tag{34}$$

with $[t_{ai},t_{bi}]$ ($t_{bi}>t_{ai}>T_i$) being a time segment after the transient period of tracking control.

Proof: According to [14], if the nominal part of the closed-loop system shown in Eq. (33) is uniformly locally exponentially stable (ULES), then $\tilde v$, $\tilde\omega$, $\tilde W_v$, and $\tilde W_\omega$ converge to a small neighborhood of the origin, whose size depends on the value of $\|E\|$.

The problem now reduces to proving ULES of the nominal part of system Eq. (33). To this end, we resort to the results of Lemma 4 in [31]. It states that if Assumptions 1 and 2 therein are satisfied, and the associated vector $S_\zeta(X_i)$ is PE for all $i=1,\ldots,n$, then the nominal part of Eq. (33) is ULES. Assumption 1 therein is automatically verified since $S$ is bounded, and Assumption 2 therein also holds if we set the counterparts $P=\begin{bmatrix}\Gamma m&0\\ 0&I\end{bmatrix}$ and $Q=\begin{bmatrix}2\Gamma K_uI_n&0\\ 0&K_uI_n\end{bmatrix}$. Furthermore, the PE condition of $S_\zeta(X_i)$ is also met if $X_i$ of the learning task is recurrent [14], which is guaranteed by Assumption 2 and the results of Theorem 1. Therefore, we can conclude that $\tilde v$, $\tilde\omega$, $\tilde W_v$, and $\tilde W_\omega$ converge to a small neighborhood of the origin, whose size depends on the small value of $\|E\|$.

Similar to [24], the convergence of $\hat W_{\zeta i}$ to a small neighborhood of $W^*_\zeta$ implies that for all $X\in\bigcup_{i=1}^{n}\zeta_i(X_i(t))$, we have


where $\epsilon_{1\zeta i}=\tilde W_{\zeta i}^TS_\zeta(X)+\epsilon_{\zeta i}$ is close to $\epsilon_{\zeta i}$ due to the convergence of $\tilde W_{\zeta i}$. With $\bar W_i$ defined in Eq. (34), Eq. (35) can be rewritten into


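where $\bar W_{\zeta i}=[w_{1\zeta}\ \cdots\ w_{k\zeta}]^T$ is a subvector of $\bar W_i$ and $\epsilon_{2\zeta i}$ is the error when $\bar W_{\zeta i}^TS_\zeta(X)$ is used as the system approximation. After the transient process, the difference between $\epsilon_{1\zeta i}$ and $\epsilon_{2\zeta i}$ is small for all $i=1,\ldots,n$.

On the other hand, due to the localization property of Gaussian RBFs, both $S_{\bar\zeta}(X)$ and $\bar W_{\bar\zeta}^TS_{\bar\zeta}(X)$ are very small. Hence, along the union trajectory $\bigcup_{i=1}^{n}\zeta_i(X_i(t))$, the entire constant RBF network $\bar W^TS(X)$ can be used to approximate the nonlinear uncertain dynamics, as demonstrated by the following equivalent equations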


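where the differences between $\epsilon_{1i}$ and $\epsilon_{1\zeta i}$, and between $\epsilon_{2i}$ and $\epsilon_{2\zeta i}$, are small for all $i=1,\ldots,n$. Therefore, the conclusion of Theorem 2 can be drawn. Q.E.D.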

3.4 Experience-based trajectory tracking control

In this subsection, we use the knowledge learned in the previous subsections to propose an experience-based trajectory tracking control method, such that the experience-based controller is able to drive each vehicle to follow any reference trajectory experienced by any vehicle in the learning stage.

To this end, we replace the NN weight $\hat W_i$ in Eq. (25) by the converged constant NN weight $\bar W_i$ for the $i$th vehicle. The experience-based controller for the $i$th vehicle is therefore constructed as follows


in which $\dot u_{ci}$ is the derivative of the virtual velocity controller from Eq. (17), and $\bar W_i$ is obtained from Eq. (34) for the $i$th vehicle. The system model Eqs. (2) and (11), and the high-gain observer design Eqs. (13) and (14) remain unchanged.

Theorem 3. Consider the closed-loop system consisting of Eqs. (2) and (11), the reference trajectory $q_{ri}\in\bigcup_{j=1}^{n}q_j(t)$, the high-gain observer Eqs. (13) and (14), and the experience-based controller Eq. (38) with the virtual velocity Eq. (17). For any bounded initial condition, the tracking error $\tilde q_i$ converges asymptotically to a small neighborhood around zero.

Proof: Similar to the proof of Theorem 1, by defining $\tilde q_i$ and $\tilde u_i$ to be the errors between the position and velocity of the $i$th vehicle and its associated reference trajectory, we have the error dynamics of the $i$th vehicle as


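With the same high-gain observer design used in the learning-based tracking, the convergence of $\hat u_i$ to $u_i$ is also guaranteed. For the closed-loop system shown above, we can build a positive definite function as


and the derivative of $V_i$ is

$$\begin{aligned}
\dot V_i={}&\tilde x_i\dot{\tilde x}_i+\tilde y_i\dot{\tilde y}_i+\frac{\sin\tilde\theta_i}{K_y}\dot{\tilde\theta}_i+\tilde u_i^T\bar M\dot{\tilde u}_i\\
={}&\tilde x_i\left(\tilde v_i+\omega_i\tilde y_i-K_x\tilde x_i\right)+\tilde y_i\left(v_{ri}\sin\tilde\theta_i-\omega_i\tilde x_i\right)+\frac{\sin\tilde\theta_i}{K_y}\left(\tilde\omega_i-v_{ri}K_y\tilde y_i-K_\theta\sin\tilde\theta_i\right)\\
&+\tilde u_i^T\left[\epsilon_{2i}-K_u\tilde u_i-\begin{bmatrix}\tilde x_i\\ \dfrac{\sin\tilde\theta_i}{K_y}\end{bmatrix}\right]\\
={}&-K_x\tilde x_i^2-\frac{K_\theta}{K_y}\sin^2\tilde\theta_i-K_u\tilde u_i^T\tilde u_i+\tilde u_i^T\epsilon_{2i}
\end{aligned}\tag{41}$$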

where $\epsilon_{2i}=H(X_i)-\bar W_i^TS(X_i)$. Following arguments similar to those in the proof of Theorem 1, given positive $K_x$, $K_y$, $K_\theta$, and $K_u$, we can conclude that the Lyapunov function $V_i$ is positive definite and $\dot V_i$ is negative semi-definite in the region $K_x\tilde x_i^2+\frac{K_\theta}{K_y}\sin^2\tilde\theta_i+K_u\tilde u_i^T\tilde u_i\ge\tilde u_i^T\bar\epsilon_i$. As in the proof of Theorem 1, Barbalat's Lemma shows that $\lim_{t\to\infty}\dot V_i=0$, and the tracking errors converge to a small neighborhood of zero. Q.E.D.


4. Simulation studies

Consider four identical vehicles, whose unknown friction vector is assumed to be a nonlinear function of $v_i$ and $\omega_i$ as follows: $\bar F=\begin{bmatrix}0.1mv_i+0.05mv_i^2\\ 0.2I\omega_i+0.1I\omega_i^2\end{bmatrix}$. Since we assume the vehicles are operating on a horizontal plane, the gravitational vector $\bar G$ is equal to zero. The physical parameters of the vehicles are given as $m=2$ kg, $I=0.2$ kg·m², $R=0.15$ m, $r=0.05$ m and $d=0.1$ m. The reference trajectories of the vehicles are given by


and for all vehicles, the orientations of reference trajectories and vehicle velocities satisfy the following equations


The parameters of the observer Eqs. (13) and (14) are given as $\delta=0.01$ and $l_1=l_2=1$. The parameters of the controller Eq. (25) with Eq. (17) are given as $K_x=K_y=K_\theta=1$ and $K_u=2$. The parameters of Eq. (26) are given as $\Gamma=10$, $\gamma=0.001$, and $\beta=10$. For each $i=1,2,3,4$, since $X_i=[v_i\ \ \omega_i]^T$, we construct the Gaussian RBFNN $\hat W_i^TS(X_i)$ using $N=5\times5=25$ neuron nodes with the centers evenly placed over the state space $[0,4]\times[0,4]$ and the standard deviation of the Gaussian function equal to 0.7. The initial positions of the vehicles are set at the origin, with the velocities set to zero, and the initial weights of the RBFNNs are also set to zero. The connection between the vehicles is shown in Figure 3, and the Laplacian matrix $L$ associated with the graph $G$ is

Figure 3.

Connection between four vehicles.


The simulation results are as follows. Figure 4a shows that the observer error converges to a close neighborhood around zero in a very short time period, Figure 4b shows that all tracking errors $\tilde x_i$ and $\tilde y_i$ converge to zero, and Figures 5a–f show that each vehicle (blue triangles) tracks its own reference trajectory (red solid circles) in the 2-D frame. Figure 6b shows that the NN weights of all vehicle agents converge to the same constant, and Figure 6a shows that the RBFNNs of all vehicles are able to accurately estimate the unknown dynamics, as the estimation errors converge to a small neighborhood around zero.

Figure 4.

Observer errors and tracking errors using observer-based controller. (a) Observer errors using observer (13) and (14). (b) Tracking errors using controller (25) with (17) and (26).

Figure 5.

Snapshots of trajectory tracking using controller Eq. (25) with Eqs. (17) and (26). (a) Time at 0 seconds. (b) Time at 1 second. (c) Time at 4 seconds. (d) Time at 9 seconds. (e) Time at 16 seconds. (f) Time at 25 seconds.

Figure 6.

Estimation errors and NN weight convergence. (a) Estimation errors using controller (25) with (17) and (26). (b) Weight vector 1-norm of $W_v$ and $W_\omega$.
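The agreement of the NN weights across agents can be illustrated with the consensus mechanism in isolation. The sketch below is a simplification under stated assumptions: it integrates only $\dot{W}=-kLW$ with a hypothetical gain $k$, omits the error-driven adaptation terms of Eq. (26), uses one scalar weight per agent, and uses a hypothetical four-node path graph rather than the graph of Figure 3.

```python
# Hypothetical undirected path graph 1-2-3-4 (NOT the graph of Figure 3).
ADJACENCY = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]


def laplacian(adj):
    """Graph Laplacian L = D - A for an adjacency matrix given as nested lists."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def consensus_step(weights, lap, gain, dt):
    """One Euler step of dW/dt = -gain * L W (one scalar weight per agent)."""
    n = len(weights)
    return [weights[i] - dt * gain * sum(lap[i][j] * weights[j] for j in range(n))
            for i in range(n)]


L = laplacian(ADJACENCY)
W = [1.0, -2.0, 0.5, 4.5]      # arbitrary initial weights, for illustration only
for _ in range(5000):          # 50 s of simulated time with dt = 0.01
    W = consensus_step(W, L, gain=1.0, dt=0.01)
```

For an undirected connected graph, this update preserves the sum of the weights, so all agents approach the average of the initial values. In the chapter's algorithm the common limit is instead shaped by the adaptation terms driven by each agent's own trajectory.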

We next demonstrate the result of Theorem 3, which states that after the learning process, each vehicle is able to use the learned knowledge to follow any reference trajectory experienced by any vehicle during the learning stage. In this part of the simulation, the experience-based controller Eq. (38) is implemented with the same parameters as in the previous subsection, such that vehicle 1 follows the reference trajectory of vehicle 3, vehicle 2 follows the reference trajectory of vehicle 1, and vehicle 3 follows the reference trajectory of vehicle 2. The initial positions of the vehicles are set at the origin, with all velocities equal to zero.
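One way to picture the experience-based stage is sketched below. It assumes that the constant weights are obtained by time-averaging the converged weight estimates over a window, as is common in deterministic learning [14, 24]; the exact definition used with Eq. (38) is the one given earlier in the chapter. The function names and the sample data in the usage note are hypothetical.

```python
def average_weights(weight_history, start, end):
    """Element-wise mean of the weight vectors at samples start..end-1.

    Assumption: the window lies after the weights have converged.
    """
    window = weight_history[start:end]
    if not window:
        raise ValueError("averaging window is empty")
    n = len(window[0])
    return [sum(w[k] for w in window) / len(window) for k in range(n)]


def experience_based_estimate(w_bar, regressor):
    """Fixed-weight estimate W_bar^T S(X); no adaptation law is integrated."""
    return sum(w * s for w, s in zip(w_bar, regressor))
```

For example, `average_weights([[0.0, 0.0], [1.0, 2.0], [1.2, 1.8], [0.8, 2.2]], 1, 4)` discards the first (transient) sample and averages the remaining three. Because the weights are frozen, the controller only evaluates the regressor at run time, which is the source of the reduced computational cost.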

The simulation results are as follows. Figure 7a shows that the observer errors converge to a small neighborhood of zero within a very short time. Figures 8a–c show that each vehicle (blue triangles) tracks its assigned reference trajectory (red solid circles), and Figure 7b shows that all tracking errors $\tilde{x}_i$ and $\tilde{y}_i$ converge to zero.

Figure 7.

Observer errors and tracking errors using observer-based controller. (a) Observer errors using observer (13) and (14). (b) Tracking errors using controller (38) with (17).

Figure 8.

Snapshots of trajectory tracking using controller Eq. (38) with Eq. (17). (a) time at 0 seconds. (b) time at 4 seconds. (c) time at 16 seconds.


5. Conclusion

In this chapter, a high-gain observer-based CDL control algorithm has been proposed to estimate the unmodeled nonlinear dynamics of a group of homogeneous unicycle-type vehicles while tracking their reference trajectories. It has been shown that state estimation, trajectory tracking, and consensus learning are all achieved using the proposed algorithm. More specifically, any vehicle in the system is able to learn the unmodeled dynamics along the union of trajectories experienced by all vehicles, with the state variables provided by measurements and observer estimates. In addition, we have shown that, with the converged NN weights, this knowledge can be applied to a vehicle to track any experienced trajectory with reduced computational complexity. Simulation results have been provided to demonstrate the effectiveness of the proposed algorithm.


  1. Yu X, Liu L, Feng G. Trajectory tracking for nonholonomic vehicles with velocity constraints. In: IFAC-PapersOnLine. Vol. 48, No. 11. 2015. pp. 918-923
  2. Chen X, Jia Y. Simple tracking controller for unicycle-type mobile robots with velocity and torque constraints. Transactions of the Institute of Measurement and Control. 2015;37(2):211-218
  3. Seyboth GS, Wu J, Qin J, Yu C, Allgöwer F. Collective circular motion of unicycle type vehicles with nonidentical constant velocities. IEEE Transactions on Control of Network Systems. 2014;1(2):167-176
  4. Dong X, Yuan C, Wu F. Cooperative deterministic learning-based trajectory tracking for a group of unicycle-type vehicles. In: ASME 2018 Dynamic Systems and Control Conference; American Society of Mechanical Engineers. 2018. p. V003T30A006
  5. Luenberger DG. Observing the state of a linear system. IEEE Transactions on Military Electronics. 1964;8(2):74-80
  6. Luenberger D. An introduction to observers. IEEE Transactions on Automatic Control. 1971;16(6):596-602
  7. Lee KW, Khalil HK. Adaptive output feedback control of robot manipulators using high-gain observer. International Journal of Control. 1997;67(6):869-886
  8. Khalil HK. High-gain observers in nonlinear feedback control. In: 2008 International Conference on Control, Automation and Systems; IEEE. 2008. pp. xlvii-xlvii
  9. Zeng W, Wang Q, Liu F, Wang Y. Learning from adaptive neural network output feedback control of a unicycle-type mobile robot. ISA Transactions. 2016;61:337-347
  10. Boker A, Yuan C. High-gain observer-based distributed tracking control of heterogeneous nonlinear multi-agent systems. In: 2018 37th Chinese Control Conference (CCC); IEEE. 2018. pp. 6639-6644
  11. Rossomando FG, Soria CM. Identification and control of nonlinear dynamics of a mobile robot in discrete time using an adaptive technique based on neural PID. Neural Computing and Applications. 2015;26(5):1179-1191
  12. Miao Z, Wang Y. Adaptive control for simultaneous stabilization and tracking of unicycle mobile robots. Asian Journal of Control. 2015;17(6):2277-2288
  13. Fierro R, Lewis FL. Control of a nonholonomic mobile robot using neural networks. IEEE Transactions on Neural Networks. 1998;9(4):589-600
  14. Wang C, Hill DJ. Deterministic Learning Theory for Identification, Recognition, and Control. Vol. 32. Boca Raton, FL: CRC Press Taylor & Francis Group; 2009
  15. Cai X, de Queiroz M. Adaptive rigidity-based formation control for multirobotic vehicles with dynamics. IEEE Transactions on Control Systems Technology. 2015;23(1):389-396
  16. Yuan C. Distributed adaptive switching consensus control of heterogeneous multi-agent systems with switched leader dynamics. Nonlinear Analysis: Hybrid Systems. 2017;26:274-283
  17. Yuan C, Licht S, He H. Formation learning control of multiple autonomous underwater vehicles with heterogeneous nonlinear uncertain dynamics. IEEE Transactions on Cybernetics. 2017;99:1-15
  18. Yuan C, Zeng W, Dai S. Distributed model reference adaptive containment control of heterogeneous uncertain multi-agent systems. ISA Transactions. 2019;86:73-86
  19. Yuan C, He H, Wang C. Cooperative deterministic learning-based formation control for a group of nonlinear uncertain mechanical systems. IEEE Transactions on Industrial Informatics. 2019;15(1):319-333
  20. Stegagno P, Yuan C. Distributed cooperative adaptive state estimation and system identification for multi-agent systems. IET Control Theory and Applications. 2019;13(1):815-822
  21. Agaev R, Chebotarev P. The matrix of maximum out forests of a digraph and its applications. arXiv preprint math/0602059; 2006
  22. Park J, Sandberg IW. Universal approximation using radial-basis-function networks. Neural Computation. 1991;3(2):246-257
  23. Buhmann MD. Radial Basis Functions: Theory and Implementations. Vol. 12. Cambridge, UK: Cambridge University Press; 2003
  24. Wang C, Hill DJ. Learning from neural control. IEEE Transactions on Neural Networks. 2006;17(1):130-146
  25. Fierro R, Lewis FL. Control of a nonholonomic mobile robot: Backstepping kinematics into dynamics. In: Proceedings of the 34th IEEE Conference on Decision and Control. Vol. 4; IEEE. 1995. pp. 3805-3810
  26. Siciliano B, Sciavicco L, Villani L, Oriolo G. Robotics: Modelling, Planning and Control. London, United Kingdom: Springer Science and Business Media; 2009
  27. Oh S, Khalil HK. Nonlinear output-feedback tracking using high-gain observer and variable structure control. Automatica. 1997;33(10):1845-1856
  28. Kanayama Y, Kimura Y, Miyazaki F, Noguchi T. A stable tracking control method for an autonomous mobile robot. In: Proceedings 1990 IEEE International Conference on Robotics and Automation; IEEE. 1990. pp. 384-389
  29. Ioannou PA, Sun J. Robust Adaptive Control. Courier Corporation; 2012
  30. Barbalat I. Systemes d’équations différentielles d’oscillations non linéaires. Reviews Mathematics Pures Applied. 1959;4(2):267-270
  31. Chen W, Wen C, Hua S, Sun C. Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative PE condition via consensus. IEEE Transactions on Automatic Control. 2014;59(1):91-106
