Cooperative Adaptive Learning Control for a Group of Nonholonomic UGVs by Output Feedback

Xiaonan Dong; Paolo Stegagno; Chengzhi Yuan; Wei Zeng

doi:10.5772/intechopen.87038

Abstract

A high-gain observer-based cooperative deterministic learning (CDL) control algorithm is proposed in this chapter for a group of identical unicycle-type unmanned ground vehicles (UGVs) to track desired reference trajectories. The positions of the vehicles are measured, while the velocities are estimated by the high-gain observer. In the trajectory tracking controller, a radial basis function (RBF) neural network (NN) is used to estimate the unknown vehicle dynamics online, and NN weight convergence and estimation accuracy are guaranteed by CDL. The major challenge and novelty of this chapter is to track the reference trajectory using this observer-based CDL algorithm without full knowledge of the vehicle state and vehicle model. In addition, every vehicle in the system is able to learn the unmodeled dynamics along the union of trajectories experienced by all vehicle agents, so that the learned knowledge can be reused to follow any reference trajectory defined in the learning phase. The learning-based tracking convergence and consensus learning results, as well as the use of learned knowledge for tracking experienced trajectories, are established using the Lyapunov method. Simulation results are given to show the effectiveness of the algorithm.

Keywords

cooperative control
deterministic learning
neural network
multi-agent systems
distributed adaptive learning and control
unmanned ground vehicles

Author Information


Xiaonan Dong*
- Department of Mechanical, Industrial and Systems Engineering, University of Rhode Island, USA
Paolo Stegagno
- Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, USA
Chengzhi Yuan*
- Department of Mechanical, Industrial and Systems Engineering, University of Rhode Island, USA
Wei Zeng
- School of Mechanical and Electrical Engineering, Longyan University, China

*Address all correspondence to: dong xn@uri.edu and cyuan@uri.edu

1. Introduction

The two-wheel-driven, unicycle-type vehicle is one of the most common mobile robot platforms, and many research results have been published regarding this system [1, 2, 3, 4]. There are two major challenges in controlling this system: knowledge of all state variables, and accurate modeling of the system. For the unicycle-type vehicle used in this chapter, both the vehicle position and velocity are required for trajectory tracking control. The position of the vehicle can be obtained using cameras or GPS signals, while direct measurement of the vehicle velocity is difficult. State observers have been proposed to estimate the full state of the system from the measured signals [5, 6]; however, traditional observers require knowledge of the system model for accurate state estimation. High-gain observers have been proposed to estimate the unmeasured state variables when the system model is not fully known to the observer, and the estimated states can be used for control purposes [7, 8, 9, 10]. In this chapter, we follow the standard high-gain observer design method [8] to estimate the vehicle velocity from the measured vehicle position.

For the second challenge, adaptive control has been introduced to deal with system uncertainties [11, 12], and neural network (NN) based control can further handle nonlinear system uncertainties [11, 13]. Although tracking control can be achieved by NN-based adaptive control, traditional NN-based control methods fail to achieve parameter (NN weight) convergence. This shortcoming requires the controller to keep updating the system parameters (NN weights) whenever it is operating, which is time consuming and computationally demanding. To overcome this deficiency, a deterministic learning (DL) method has been proposed to model the system uncertainties under the partial persistency of excitation (PE) condition [14]. More specifically, it has been shown that the system uncertainties can be accurately modeled by a radial basis function (RBF) NN with a sufficiently large number of neurons, and that the local NN weights updated online by DL converge to their optimal values, provided that the input signal of the RBF NN is recurrent.

Since the RBFNN estimation is only locally accurate around the recurrent trajectory, it becomes a disadvantage when there are multiple tracking tasks. The learned knowledge of the system uncertainties, represented by the RBFNNs, cannot be directly applied to a different control task, and a significant amount of storage space is needed for a large number of different tasks. In recent years, distributed control has become a rising topic regarding the control of multiple coordinated agents [15, 16, 17, 18, 19, 20]. In this chapter, we take the idea of communication inside the multi-agent system (MAS) and apply it to DL, such that in the learning phase, any vehicle in the MAS is able to learn the unmodeled dynamics not only along its own trajectory, but also along the trajectories of all other vehicle agents in the MAS. In other words, the NN weights of every vehicle in the MAS converge to a common constant, which represents the unmodeled dynamics along the union trajectory of all vehicles, and any vehicle in the MAS is able to use this knowledge to achieve trajectory tracking for any control task learned in the learning phase.

The main contributions of this chapter are summarized as follows.

A high-gain observer is introduced to estimate the vehicle velocities using the measurement of vehicle position.
An observer and RBFNN-based adaptive learning control algorithm is developed for a multi-vehicle system, such that each vehicle agent will be able to follow the desired reference trajectory.
An online cooperative adaptive NN learning law is proposed, such that the RBFNN weight of all vehicle agents will converge to one common value, which represents the unmodeled dynamics of the vehicle along the union trajectories experienced by all vehicle agents.
An observer and experience-based controller is developed using the common NN model obtained from the learning phase, such that vehicles are able to follow the reference trajectory experienced by any vehicle before with improved control performance.

In the following sections, we briefly describe some preliminaries on graph theory and the RBFNN-based DL method, then present the vehicle dynamics and the problem statement, all in Section 2. The main results of this chapter, including the high-gain observer design, CDL-based trajectory tracking control, accurate cooperative learning using RBF NNs, and experience-based trajectory tracking control, are provided in Section 3. Simulation results of an example with four vehicles running three different tasks are provided in Section 4. The conclusions are drawn in Section 5.

Notations. $\mathbb{R}$, $\mathbb{R}_+$, and $\mathbb{Z}_+$ denote, respectively, the set of real numbers, the set of positive real numbers, and the set of positive integers; $\mathbb{R}^{m \times n}$ denotes the set of $m \times n$ real matrices; $\mathbb{R}^n$ denotes the set of $n \times 1$ real column vectors; $I_n$ denotes the $n \times n$ identity matrix; $O_{m \times n}$ denotes the zero matrix of dimension $m \times n$; the subscript $(\cdot)_k$ denotes the $k$th column vector of a matrix; $|\cdot|$ is the absolute value of a real number, and $\|\cdot\|$ is the 2-norm of a vector or a matrix, i.e., $\|x\| = (x^T x)^{1/2}$; $\dot z$ denotes the total derivative of $z$ with respect to time; $\partial/\partial z$ denotes the Jacobian operator $\frac{\partial}{\partial z} = \left[\frac{\partial}{\partial z_1} \ \cdots \ \frac{\partial}{\partial z_n}\right]$.

2. Preliminaries and problem statement

2.1 Graph theory

In a graph defined as $G = (V, \varepsilon, A)$, the elements of $V = \{1, 2, \ldots, n\}$ are called vertices, the elements of $\varepsilon$ are pairs $(i, j)$ with $i, j \in V$, $i \neq j$, called edges, and the matrix $A$ is called the adjacency matrix. If $(i, j) \in \varepsilon$, then agent $i$ is able to receive information from agent $j$, and agents $i$ and $j$ are called adjacent. The adjacency matrix is defined as $A = [a_{ij}]_{n \times n}$, in which $a_{ij} > 0$ if and only if $(i, j) \in \varepsilon$, and $a_{ij} = 0$ otherwise. If there exists a path between any two nodes $v_i, v_j \in V$, then the graph $G$ is called connected. Furthermore, the graph $G$ is called fixed if $\varepsilon$ and $A$ do not change over time, and undirected if, for every $(i, j) \in \varepsilon$, the pair $(j, i)$ is also in $\varepsilon$. According to [21], the Laplacian matrix $L = [l_{ij}]_{n \times n}$ associated with the undirected graph $G$ is defined by $l_{ii} = \sum_{k = 1, k \neq i}^{n} a_{ik}$ and $l_{ij} = -a_{ij}$ for $i \neq j$. If the graph is connected, then $L$ is a positive semi-definite symmetric matrix with one zero eigenvalue and all other eigenvalues positive, and hence $\operatorname{rank}(L) = n - 1$.
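The Laplacian construction above can be illustrated with a small sketch. The four-agent ring topology and unit edge weights below are assumptions chosen only for illustration; they are not the communication graph used in the chapter. The sketch builds $L$ from $A$ and evaluates the quadratic form $x^T L x$, which is zero when all agents agree and positive otherwise.

```python
def laplacian(adj):
    """Return L with l_ii = sum_{k != i} a_ik and l_ij = -a_ij for i != j."""
    n = len(adj)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                lap[i][j] = -adj[i][j]
                lap[i][i] += adj[i][j]
    return lap


def quadratic_form(lap, x):
    """Evaluate x^T L x, which is non-negative for an undirected graph."""
    n = len(x)
    return sum(x[i] * lap[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical undirected, connected ring of four agents with unit weights.
A = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
L = laplacian(A)
row_sums = [sum(row) for row in L]                       # each row sums to zero
consensus_value = quadratic_form(L, [1.0, 1.0, 1.0, 1.0])    # agreement
disagreement_value = quadratic_form(L, [1.0, 0.0, 0.0, 0.0])  # one agent differs
```

The zero row sums correspond to the zero eigenvalue with eigenvector of all ones, which is the property the consensus term of the learning law relies on.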

2.2 Localized RBF neural networks and deterministic learning

The RBF networks can be described by $f_{nn}(Z) = \sum_{i=1}^{N_n} w_i s_i(Z) = W^T S(Z)$ [22], where $Z \in \Omega_Z \subset \mathbb{R}^q$ is the input vector, $W = [w_1 \ \cdots \ w_{N_n}]^T \in \mathbb{R}^{N_n}$ is the weight vector, $N_n$ is the NN node number, and $S(Z) = [s_1(\|Z - \mu_1\|) \ \cdots \ s_{N_n}(\|Z - \mu_{N_n}\|)]^T$, with $s_i(\cdot)$ being a radial basis function and $\mu_i$ ($i = 1, 2, \ldots, N_n$) being distinct points in state space. The Gaussian function $s_i(\|Z - \mu_i\|) = \exp\left[-\frac{(Z - \mu_i)^T (Z - \mu_i)}{\sigma_i^2}\right]$ is one of the most commonly used radial basis functions, where $\mu_i = [\mu_{i1} \ \mu_{i2} \ \cdots \ \mu_{iq}]^T$ is the center of the receptive field and $\sigma_i$ is the width of the receptive field. The Gaussian function belongs to the class of localized RBFs in the sense that $s_i(\|Z - \mu_i\|) \to 0$ as $\|Z\| \to \infty$. It is easily seen that $S(Z)$ is bounded and there exists a real constant $S_M \in \mathbb{R}_+$ such that $\|S(Z)\| \leq S_M$ [14].
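As a minimal sketch of the Gaussian regressor and the network output $W^T S(Z)$, the code below evaluates both on a small lattice. The 3 × 3 lattice, the common width 0.7, and the constant weights are hypothetical values picked for illustration, not the settings used in the chapter's simulation.

```python
import math


def rbf_vector(z, centers, width):
    """Gaussian regressor S(Z) with s_i = exp(-||Z - mu_i||^2 / width^2)."""
    values = []
    for mu in centers:
        dist_sq = sum((zk - mk) ** 2 for zk, mk in zip(z, mu))
        values.append(math.exp(-dist_sq / width ** 2))
    return values


def rbf_network(z, weights, centers, width):
    """Evaluate f_nn(Z) = W^T S(Z)."""
    s = rbf_vector(z, centers, width)
    return sum(w * si for w, si in zip(weights, s))


# Hypothetical 3 x 3 regular lattice on [-1, 1]^2 with a common width.
centers = [(cx, cy) for cx in (-1.0, 0.0, 1.0) for cy in (-1.0, 0.0, 1.0)]
width = 0.7
s_at_center = rbf_vector((0.0, 0.0), centers, width)   # input on a lattice point
s_far_away = rbf_vector((50.0, 50.0), centers, width)  # input far from all centers
weights = [0.5] * len(centers)
f_value = rbf_network((0.0, 0.0), weights, centers, width)
```

The far-away input shows the localization property: every basis function is negligible there, so only neurons near the trajectory contribute to the output.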

It has been shown in [22, 23] that for any continuous function $f(Z): \Omega_Z \to \mathbb{R}$, where $\Omega_Z \subset \mathbb{R}^q$ is a compact set, and for the NN approximator with a sufficiently large node number $N_n$, there exists an ideal constant weight vector $W^*$ such that for any $\epsilon^* > 0$, $f(Z) = W^{*T} S(Z) + \epsilon$, $\forall Z \in \Omega_Z$, where $|\epsilon| < \epsilon^*$ is the ideal approximation error. The ideal weight vector $W^*$ is an "artificial" quantity required for analysis, and is defined as the value of $W$ that minimizes $|\epsilon|$ for all $Z \in \Omega_Z \subset \mathbb{R}^q$, i.e., $W^* := \arg\min_{W \in \mathbb{R}^{N_n}} \left\{ \sup_{Z \in \Omega_Z} \left| f(Z) - W^T S(Z) \right| \right\}$. Moreover, based on the localization property of RBF NNs [14], for any bounded trajectory $Z(t)$ within the compact set $\Omega_Z$, $f(Z)$ can be approximated using a limited number of neurons located in a local region along the trajectory: $f(Z) = W_\zeta^{*T} S_\zeta(Z) + \epsilon_\zeta$, where $\epsilon_\zeta$ is the approximation error, with $\epsilon_\zeta = O(\epsilon) = O(\epsilon^*)$, $S_\zeta(Z) = [s_{j_1}(Z) \ \cdots \ s_{j_\zeta}(Z)]^T \in \mathbb{R}^{N_\zeta}$, $W_\zeta^* = [w_{j_1}^* \ \cdots \ w_{j_\zeta}^*]^T \in \mathbb{R}^{N_\zeta}$, $N_\zeta < N_n$, and the integers $j_i = j_1, \ldots, j_\zeta$ are defined by $|s_{j_i}(Z_p)| > \theta$ ($\theta > 0$ is a small positive constant) for some $Z_p \in Z(k)$.

It is shown in [14] that for a localized RBF network $W^T S(Z)$ whose centers are placed on a regular lattice, almost any recurrent trajectory $Z(k)$ (see [14] for the detailed definition of "recurrent" trajectories) can lead to the satisfaction of the PE condition of the regressor subvector $S_\zeta(Z)$. This result is recalled in the following Lemma.

Lemma 1 [14, 24]. Consider any recurrent trajectory $Z(k): \mathbb{Z}_+ \to \mathbb{R}^q$. If $Z(k)$ remains in a bounded compact set $\Omega_Z \subset \mathbb{R}^q$, then for the RBF network $W^T S(Z)$ with centers placed on a regular lattice (large enough to cover the compact set $\Omega_Z$), the regressor subvector $S_\zeta(Z)$ consisting of RBFs with centers located in a small neighborhood of $Z(k)$ is persistently exciting.

2.3 Vehicle model and problem statement

As shown in Figure 1, this unicycle-type vehicle is a nonholonomic system, with the constraint force preventing the vehicle from sliding along the axis of the actuated wheels. The nonholonomic constraint can be written as

$$A^T(q_i)\,\dot q_i = 0, \tag{1}$$

in which $A(q_i) = [\sin\theta_i \ \ {-\cos\theta_i} \ \ 0]^T$, and $q_i = [x_i \ y_i \ \theta_i]^T$ is the vector of generalized coordinates of the $i$th vehicle ($i = 1, 2, \ldots, n$, with $n$ being the number of vehicles in the MAS). $(x_i, y_i)$ and $\theta_i$ denote the position and orientation of the vehicle with respect to the ground coordinate frame, respectively.

With this constraint, the degree of freedom of the system is reduced to two. Independently driven by the two actuated wheels on each side of the vehicle, the no-slip kinematics of the $i$th vehicle is

$$\dot q_i = \begin{bmatrix} \dot x_i \\ \dot y_i \\ \dot\theta_i \end{bmatrix} = \begin{bmatrix} \cos\theta_i & 0 \\ \sin\theta_i & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v_i \\ \omega_i \end{bmatrix} \overset{\text{def}}{=} J(q_i)\,u_i, \tag{2}$$

where $v_i$ and $\omega_i$ are the linear and angular velocities measured at the center between the driving wheels, respectively. The dynamics of the $i$th vehicle can be described by [25]

$$M(q_i)\,\ddot q_i + C(q_i, \dot q_i)\,\dot q_i + F(q_i, \dot q_i) + G(q_i) = B(q_i)\,\tau_i + A(q_i)\,\lambda_i, \tag{3}$$

in which $M \in \mathbb{R}^{3 \times 3}$ is a positive definite matrix that denotes the inertia, $C \in \mathbb{R}^{3 \times 3}$ is the centripetal and Coriolis matrix, $F \in \mathbb{R}^{3 \times 1}$ is the friction vector, and $G \in \mathbb{R}^{3 \times 1}$ is the gravity vector. $\tau_i \in \mathbb{R}^{2 \times 1}$ is the vector of system inputs, i.e., the torque applied on each driving wheel, and $B = \frac{1}{r} \begin{bmatrix} \cos\theta_i & \cos\theta_i \\ \sin\theta_i & \sin\theta_i \\ R & -R \end{bmatrix} \in \mathbb{R}^{3 \times 2}$ is the input transformation matrix, projecting the system input $\tau_i$ onto the space spanned by $(x, y, \theta)$, in which $D = 2R$ is the distance between the two actuated wheels and $r$ is the radius of the wheel. $\lambda_i$ is a Lagrange multiplier, and $A\lambda_i \in \mathbb{R}^{3 \times 1}$ denotes the constraint force.

Matrices $M$ and $C$ in Eq. (3) can be derived using the Lagrangian equation with the following steps. First we calculate the kinetic energy of the $i$th vehicle agent

$$T_i = \frac{m\left(\dot x_{ic}^2 + \dot y_{ic}^2\right)}{2} + \frac{I\,\dot\theta_{ic}^2}{2}, \tag{4}$$

where $m$ is the mass of the vehicle, $I$ is the moment of inertia measured at the center of mass, and $x_{ic}$, $y_{ic}$, and $\theta_{ic}$ are the position and orientation of the vehicle at the center of mass, respectively. The following relation can be obtained from Figure 1:

$$\begin{cases} x_{ic} = x_i + d\cos\theta_i \\ y_{ic} = y_i + d\sin\theta_i \\ \theta_{ic} = \theta_i \end{cases}, \qquad \begin{cases} \dot x_{ic} = \dot x_i - d\,\dot\theta_i \sin\theta_i \\ \dot y_{ic} = \dot y_i + d\,\dot\theta_i \cos\theta_i \\ \dot\theta_{ic} = \dot\theta_i \end{cases} \tag{5}$$

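Then Eq. (4) can be rewritten as

$$\begin{aligned} T(q_i, \dot q_i) &= \frac{m\left[\left(\dot x_i - d\,\dot\theta_i \sin\theta_i\right)^2 + \left(\dot y_i + d\,\dot\theta_i \cos\theta_i\right)^2\right]}{2} + \frac{I\,\dot\theta_i^2}{2} \\ &= \frac{1}{2}\left[ m\dot x_i^2 + m\dot y_i^2 + \left(md^2 + I\right)\dot\theta_i^2 - 2md\sin\theta_i\,\dot x_i\dot\theta_i + 2md\cos\theta_i\,\dot y_i\dot\theta_i \right] \\ &= \frac{\dot q_i^T M(q_i)\,\dot q_i}{2}, \end{aligned} \tag{6}$$

in which $M = \begin{bmatrix} m & 0 & -md\sin\theta_i \\ 0 & m & md\cos\theta_i \\ -md\sin\theta_i & md\cos\theta_i & md^2 + I \end{bmatrix}$. It will be shown later that the inertia matrix $M$ above is identical to that in Eq. (3). The dynamics equation of the system is then given by the following Lagrangian equation [26],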

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right)^T - \left(\frac{\partial L}{\partial q_i}\right)^T = A(q_i)\,\lambda_i + Q_i, \tag{7}$$

in which $L(q_i, \dot q_i) = T(q_i, \dot q_i) - U(q_i)$ is the Lagrangian of the $i$th vehicle, $U(q_i)$ is the potential energy of the vehicle agent, $\lambda \in \mathbb{R}^{k \times 1}$ is the Lagrangian multiplier, and $A\lambda$ is the constraint force. $Q_i = B(q_i)\left(\tau_i - f(\dot q_i)\right)$ denotes the external force, where $\tau_i$ is the force generated by the actuator and $f(\dot q_i)$ is the friction on the actuator. Then Eq. (7) can be rewritten as

$$M(q_i)\,\ddot q_i + \dot M \dot q_i - \left(\frac{\partial T_i}{\partial q_i}\right)^T + \left(\frac{\partial U_i}{\partial q_i}\right)^T + B(q_i)\,f(\dot q_i) = A(q_i)\,\lambda_i + B(q_i)\,\tau_i. \tag{8}$$

By setting $C(q_i, \dot q_i)\,\dot q_i = \dot M \dot q_i - \left(\frac{\partial T_i}{\partial q_i}\right)^T$, $F(q_i, \dot q_i) = B(q_i)\,f(\dot q_i)$, and $G(q_i) = \left(\frac{\partial U_i}{\partial q_i}\right)^T$, Eq. (8) is transformed into Eq. (3). Notice that the form of $C_{n \times n}$ is not unique; however, with a proper definition of the matrix $C$, $\dot M - 2C$ is skew-symmetric. The $(i, j)$th entry of $C$ is defined as follows [26]:

$$c_{ij} = \sum_{k=1}^{n} c_{ijk}\,\dot q_k, \tag{9}$$

where $\dot q_k$ is the $k$th entry of $\dot q$, and $c_{ijk} = \frac{1}{2}\left(\frac{\partial m_{ij}}{\partial q_k} + \frac{\partial m_{ik}}{\partial q_j} - \frac{\partial m_{jk}}{\partial q_i}\right)$ is defined using the Christoffel symbols of the first kind. The centripetal and Coriolis matrix is then calculated as $C = \begin{bmatrix} 0 & 0 & -md\,\dot\theta_i\cos\theta_i \\ 0 & 0 & -md\,\dot\theta_i\sin\theta_i \\ 0 & 0 & 0 \end{bmatrix}$. Since the vehicle is operating on the ground, the gravity vector $G$ is equal to zero. The friction vector $F$ is assumed to be a nonlinear function of the generalized velocity $u_i$, and is unknown to the controller.
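The Christoffel-symbol construction of $C$ can be checked numerically. The sketch below assumes hypothetical parameter values ($m = 10$, $d = 0.1$, and an arbitrary state); it builds $C$ from Eq. (9) using the analytic derivative of $M$ with respect to $\theta_i$ (the only coordinate $M$ depends on), and evaluates the largest entry of $N + N^T$ with $N = \dot M - 2C$, which vanishes when $N$ is skew-symmetric.

```python
import math


def inertia_dtheta(theta, m, d):
    """Analytic partial derivative of M(q) in Eq. (6) with respect to theta."""
    s, c = math.sin(theta), math.cos(theta)
    return [
        [0.0, 0.0, -m * d * c],
        [0.0, 0.0, -m * d * s],
        [-m * d * c, -m * d * s, 0.0],
    ]


def coriolis(theta, qdot, m, d):
    """Build C from c_ij = sum_k c_ijk * qdot_k using Christoffel symbols."""
    dm = inertia_dtheta(theta, m, d)

    def d_m(i, j, k):
        # d m_ij / d q_k: M depends only on q_3 = theta (index 2).
        return dm[i][j] if k == 2 else 0.0

    c_mat = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                c_ijk = 0.5 * (d_m(i, j, k) + d_m(i, k, j) - d_m(j, k, i))
                c_mat[i][j] += c_ijk * qdot[k]
    return c_mat


# Hypothetical parameters and state, for illustration only.
m, d, theta = 10.0, 0.1, 0.6
qdot = (0.3, -0.2, 0.8)  # (x_dot, y_dot, theta_dot)
C = coriolis(theta, qdot, m, d)
M_dot = [[qdot[2] * e for e in row] for row in inertia_dtheta(theta, m, d)]
N = [[M_dot[i][j] - 2.0 * C[i][j] for j in range(3)] for i in range(3)]
skew_residual = max(abs(N[i][j] + N[j][i]) for i in range(3) for j in range(3))
```

Only the third column of the computed `C` is non-zero, matching the closed-form matrix stated in the text.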

To eliminate the nonholonomic constraint force $A(q_i)\lambda_i$ from Eq. (3), we left-multiply the equation by $J^T(q_i)$ and substitute $\dot q_i = J u_i$ and $\ddot q_i = J \dot u_i + \dot J u_i$ from Eq. (2), which yields

$$J^T M J\,\dot u_i + J^T\left(M \dot J + C J\right) u_i + J^T F + J^T G = J^T B\,\tau_i + J^T A\,\lambda_i. \tag{10}$$

From Eqs. (1) and (2), we have $J^T A = 0_{2 \times 1}$, so the dynamic equation of $u_i$ simplifies to

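$$\bar M(q_i)\,\dot u_i + \bar C(u_i)\,u_i + \bar F(u_i) + \bar G(q_i) = \bar\tau_i, \tag{11}$$

where

$$\bar M = J^T M J = \begin{bmatrix} m & 0 \\ 0 & md^2 + I \end{bmatrix}, \quad \bar C = J^T\left(M \dot J + C J\right) = \begin{bmatrix} 0 & -md\,\dot\theta_i \\ md\,\dot\theta_i & 0 \end{bmatrix}, \quad \bar F = J^T F, \quad \bar G = J^T G = 0_{2 \times 1},$$

$$\bar\tau_i = \begin{bmatrix} \bar\tau_{vi} \\ \bar\tau_{\omega i} \end{bmatrix} = J^T B\,\tau_i = \begin{bmatrix} 1/r & 1/r \\ R/r & -R/r \end{bmatrix} \tau_i.$$

The degree of freedom of the vehicle dynamics is now reduced to two. Since $J^T B$ is of full rank, for any transformed torque input $\bar\tau_i$ there exists a unique corresponding actual torque input $\tau_i \in \mathbb{R}^2$ applied on each wheel.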
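The reduction and the torque mapping can be sanity-checked with a short sketch. All numerical values (mass, inertia, offset $d$, wheel radius $r$, half-track $R$) are hypothetical. The first function forms $J^T M J$ by direct multiplication; the last two map between wheel torques $\tau_i$ and the transformed input $\bar\tau_i$, using the closed-form inverse of the $2 \times 2$ matrix $J^T B$.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def reduced_inertia(theta, m, d, inertia_c):
    """Compute M_bar = J^T M J from the full inertia matrix and Jacobian."""
    s, c = math.sin(theta), math.cos(theta)
    full = [
        [m, 0.0, -m * d * s],
        [0.0, m, m * d * c],
        [-m * d * s, m * d * c, m * d * d + inertia_c],
    ]
    jac = [[c, 0.0], [s, 0.0], [0.0, 1.0]]
    return matmul(transpose(jac), matmul(full, jac))


def reduced_torque(tau, r, half_track):
    """tau_bar = J^T B tau with J^T B = [[1/r, 1/r], [R/r, -R/r]]."""
    tau_1, tau_2 = tau
    return ((tau_1 + tau_2) / r, half_track * (tau_1 - tau_2) / r)


def wheel_torques(tau_bar, r, half_track):
    """Invert J^T B to recover the torque on each wheel."""
    tau_v, tau_w = tau_bar
    return (0.5 * r * (tau_v + tau_w / half_track),
            0.5 * r * (tau_v - tau_w / half_track))


# Hypothetical vehicle parameters, for illustration only.
M_bar = reduced_inertia(theta=1.1, m=10.0, d=0.1, inertia_c=0.5)
tau = wheel_torques((4.0, -1.0), r=0.05, half_track=0.2)
tau_bar_back = reduced_torque(tau, r=0.05, half_track=0.2)
```

The computed `M_bar` is diagonal and independent of the heading, and the round trip through `wheel_torques` and `reduced_torque` returns the original transformed input, which is the uniqueness statement made in the text.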

The main challenges in controlling the system are that (i) direct measurement of the linear and angular velocities is not feasible, and (ii) the system parameter matrices $\bar C$ and $\bar F$ are unknown to the controller.

Based on the above system setup, we are ready to formulate the objective of this chapter. Consider a group of $n$ homogeneous unicycle-type vehicles, where the kinematics and dynamics of each vehicle agent are described by Eqs. (2) and (11), respectively. The communication graph of these $n$ vehicles is denoted as $G$. Regarding this MAS, we have the following assumption.

Assumption 1. The graph G is undirected and connected.

The objective of this chapter is to design an output-feedback adaptive learning control law for each vehicle agent in the MAS, such that

State estimation: The unmeasured generalized velocities $u_i = [v_i \ \omega_i]^T$ can be estimated by a high-gain observer using the measurement of the generalized coordinates $q_i = [x_i \ y_i \ \theta_i]^T$.
Trajectory tracking: Each vehicle in the MAS will track its desired reference trajectory, which is quantified by $(x_{ri}(t), y_{ri}(t), \theta_{ri}(t))$; i.e., $\lim_{t \to \infty} (x_i(t) - x_{ri}(t)) = 0$, $\lim_{t \to \infty} (y_i(t) - y_{ri}(t)) = 0$, $\lim_{t \to \infty} (\theta_i(t) - \theta_{ri}(t)) = 0$.
Cooperative Learning: The unknown homogeneous dynamics of all the vehicles can be locally accurately identified along the union of the trajectories experienced by all vehicle agents in the MAS.
Experience based control: The identified/learned knowledge from the cooperative learning phase can be re-utilized by each local vehicle to perform stable trajectory tracking with improved control performance.

In order to apply the deterministic learning theory, we have the following assumption on the reference trajectories.

Assumption 2. The reference trajectories $x_{ri}(t)$, $y_{ri}(t)$, $\theta_{ri}(t)$ for all $i = 1, \ldots, n$ are recurrent.

3. Main results

3.1 High-gain observer design

In mobile robotics control, the position of the vehicle can be easily obtained in real time using GPS signals or camera positioning, while the direct measurement of the velocities is much more difficult. For the control and system estimation purposes, the velocities of the vehicle are required for the controller. To this end, we follow the high-gain observer design method in [8, 9], and introduce a high-gain observer to estimate the velocities using robot positions. First, we define two new variables as follows

$$\begin{aligned} p_{x_i} &= x_i\cos\theta_i + y_i\sin\theta_i \\ p_{y_i} &= y_i\cos\theta_i - x_i\sin\theta_i \end{aligned} \tag{12}$$

Notice that the operation above can be considered as projecting the vehicle position onto a frame whose origin is fixed to the origin of the ground coordinates and whose axes are parallel to the body-fixed frame of the vehicle. The coordinates of the vehicle in this rotating frame are $(p_{x_i}, p_{y_i})$, and hence $p_{x_i}$ and $p_{y_i}$ can be calculated from the measurement of the position and the orientation. The rotation rate of this frame equals the angular velocity of the vehicle, $\dot\theta_i = \omega_i$. Based on this, we design the high-gain observer for $\omega_i$ as

$$\begin{aligned} \dot{\hat\theta}_i &= \hat\omega_i + \frac{l_1}{\delta}\left(\theta_i - \hat\theta_i\right) \\ \dot{\hat\omega}_i &= \frac{l_2}{\delta^2}\left(\theta_i - \hat\theta_i\right) \end{aligned} \tag{13}$$

in which $\delta$ is a small positive scalar to be designed, and $l_1$ and $l_2$ are parameters to be chosen such that $\begin{bmatrix} -l_1 & 1 \\ -l_2 & 0 \end{bmatrix}$ is Hurwitz. The time derivatives of the coordinates defined in Eq. (12) are given by $\dot p_{x_i} = v_i + p_{y_i}\omega_i$ and $\dot p_{y_i} = -p_{x_i}\omega_i$, so we design the high-gain observer for $v_i$ as

$$\begin{aligned} \dot{\hat p}_{x_i} &= \hat v_i + p_{y_i}\hat\omega_i + \frac{l_1}{\delta}\left(p_{x_i} - \hat p_{x_i}\right) \\ \dot{\hat v}_i &= \frac{l_2}{\delta^2}\left(p_{x_i} - \hat p_{x_i}\right) \end{aligned} \tag{14}$$

To limit peaking with this high-gain observer, and in turn improve the transient response, the parameter $\delta$ cannot be chosen too small [9]. With a globally bounded control, decreasing $\delta$ does not induce peaking in the state variables of the system, but how far $\delta$ can be decreased is limited by practical factors such as measurement noise and sampling rate [7, 27]. According to [8], it can be shown that the estimation error between the actual and estimated velocities of the $i$th vehicle, $z_i = u_i - \hat u_i$, converges to zero; the detailed proof is omitted here due to space limitations.
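A minimal sketch of the heading observer in Eq. (13) is given below. It is an illustration under stated assumptions, not the chapter's simulation: the measured heading is a hypothetical signal $\theta(t) = \sin t$ (so the true angular velocity is $\cos t$), the gains $l_1 = 2$, $l_2 = 1$, $\delta = 0.01$ are arbitrary choices that make the matrix above Hurwitz, and integration uses forward Euler with a step small enough for the fast observer poles.

```python
import math


def simulate_observer(delta=0.01, l1=2.0, l2=1.0, dt=1e-3, t_end=5.0):
    """Forward-Euler simulation of the heading observer in Eq. (13)."""
    theta_hat, omega_hat = 0.0, 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        theta = math.sin(k * dt)  # hypothetical measured heading
        err = theta - theta_hat
        theta_hat_dot = omega_hat + (l1 / delta) * err
        omega_hat_dot = (l2 / delta ** 2) * err
        theta_hat += dt * theta_hat_dot
        omega_hat += dt * omega_hat_dot
    omega_true = math.cos(steps * dt)  # true angular velocity at the final time
    return omega_hat, omega_true


omega_hat, omega_true = simulate_observer()
estimation_error = abs(omega_hat - omega_true)
```

In this sketch the residual error scales with $\delta$ and with how fast the true velocity changes, which is consistent with the trade-off discussed above: a smaller $\delta$ tightens the estimate but amplifies measurement noise in practice.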

3.2 Controller design and tracking convergence analysis

After obtaining the linear and angular velocities from the high-gain observer, we now proceed to the trajectory tracking. First, we define the tracking error $\tilde q_i$ by projecting $q_{ri} - q_i$ onto the body frame of the $i$th vehicle, with the $x$ axis pointing to the front and the $y$ axis to the left of the vehicle, as shown in Figure 2.

Figure 2.
Projecting tracking error onto the body-fixed frame.

$$\tilde q_i = \begin{bmatrix} \tilde x_i \\ \tilde y_i \\ \tilde\theta_i \end{bmatrix} = \begin{bmatrix} \cos\theta_i & \sin\theta_i & 0 \\ -\sin\theta_i & \cos\theta_i & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_{ri} - x_i \\ y_{ri} - y_i \\ \theta_{ri} - \theta_i \end{bmatrix}. \tag{15}$$

Using the constraint Eq. (1) and the kinematics Eq. (2), the derivative of the tracking error is

$$\begin{aligned} \dot{\tilde x}_i &= v_{ri}\cos\tilde\theta_i + \omega_i\tilde y_i - v_i \\ \dot{\tilde y}_i &= v_{ri}\sin\tilde\theta_i - \omega_i\tilde x_i \\ \dot{\tilde\theta}_i &= \omega_{ri} - \omega_i \end{aligned} \tag{16}$$

where $v_i$ and $\omega_i$ are the linear and angular velocities of the $i$th vehicle, respectively.

In order to utilize the backstepping control theory, we treat $v_i$ and $\omega_i$ in Eq. (16) as virtual inputs. Following the methodology of [28], we can design a stabilizing virtual controller as

$$u_{ci} = \begin{bmatrix} v_{ci} \\ \omega_{ci} \end{bmatrix} = \begin{bmatrix} v_{ri}\cos\tilde\theta_i + K_x\tilde x_i \\ \omega_{ri} + v_{ri}K_y\tilde y_i + K_\theta\sin\tilde\theta_i \end{bmatrix}, \tag{17}$$

in which $K_x$, $K_y$, and $K_\theta$ are all positive constants. It can be shown that this virtual velocity controller stabilizes the closed-loop system Eq. (16) kinematically when $v_i$ and $\omega_i$ are replaced with $v_{ci}$ and $\omega_{ci}$, respectively. To this end, we define the following Lyapunov function for the $i$th vehicle

$$V_{1i} = \frac{\tilde x_i^2}{2} + \frac{\tilde y_i^2}{2} + \frac{1 - \cos\tilde\theta_i}{K_y} \tag{18}$$

and the derivative of $V_{1i}$ is

$$\begin{aligned} \dot V_{1i} &= \tilde x_i\dot{\tilde x}_i + \tilde y_i\dot{\tilde y}_i + \frac{\sin\tilde\theta_i}{K_y}\dot{\tilde\theta}_i \\ &= \tilde x_i\left(v_{ri}\cos\tilde\theta_i + \omega_i\tilde y_i - v_{ci}\right) + \tilde y_i\left(v_{ri}\sin\tilde\theta_i - \omega_i\tilde x_i\right) + \frac{\sin\tilde\theta_i}{K_y}\left(\omega_{ri} - \omega_{ci}\right) \\ &= \tilde x_i\left(\omega_i\tilde y_i - K_x\tilde x_i\right) + \tilde y_i\left(v_{ri}\sin\tilde\theta_i - \omega_i\tilde x_i\right) + \frac{\sin\tilde\theta_i}{K_y}\left(-v_{ri}K_y\tilde y_i - K_\theta\sin\tilde\theta_i\right) \\ &= -K_x\tilde x_i^2 - \frac{K_\theta}{K_y}\sin^2\tilde\theta_i \leq 0. \end{aligned} \tag{19}$$

Since $\dot V_{1i}$ is negative semi-definite, we can conclude that this closed-loop system is stable, i.e., the tracking error $\tilde q_i$ of the $i$th vehicle is bounded.

Remark 1. In addition to the stability conclusion above, we can also conclude asymptotic stability by finding the invariant set of $\dot V_{1i} = 0$. Setting $\dot V_{1i} = 0$ gives $\tilde x_i = 0$ and $\sin\tilde\theta_i = 0$. Applying this result to Eqs. (16) and (17), the invariant set equals $\{\tilde x_i = 0,\ \tilde y_i = 0,\ \sin\tilde\theta_i = 0\} \cup \{\tilde x_i = 0,\ \sin\tilde\theta_i = 0,\ \tilde y_i \neq 0,\ v_{ri} = 0,\ \omega_{ri} = 0\}$. Under Assumption 2, the velocity of the reference cannot be constant over time, so the only invariant subset of $\dot V_{1i} = 0$ is the origin $\tilde q_i = 0$. Therefore, we can conclude that the closed-loop system Eqs. (16) and (17) is asymptotically stable [29].
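The kinematic loop formed by Eqs. (15) to (17) can be sketched as follows. This is an illustrative simulation under assumptions that are not taken from the chapter: a circular reference with constant $v_r = 1$ and $\omega_r = 0.5$, gains $K_x = 2$, $K_y = 4$, $K_\theta = 2$, a small initial offset, and forward-Euler integration with the virtual velocities applied directly as the vehicle velocities (i.e., perfect velocity tracking).

```python
import math


def tracking_error(q, q_ref):
    """Eq. (15): reference-minus-actual pose expressed in the body frame."""
    x, y, th = q
    xr, yr, thr = q_ref
    c, s = math.cos(th), math.sin(th)
    dx, dy = xr - x, yr - y
    return (c * dx + s * dy, -s * dx + c * dy, thr - th)


def virtual_control(err, v_ref, w_ref, kx, ky, kth):
    """Eq. (17): virtual linear and angular velocity commands."""
    ex, ey, eth = err
    v_c = v_ref * math.cos(eth) + kx * ex
    w_c = w_ref + v_ref * ky * ey + kth * math.sin(eth)
    return v_c, w_c


def lyapunov(err, ky):
    """Eq. (18): V_1 evaluated at a given tracking error."""
    ex, ey, eth = err
    return 0.5 * ex ** 2 + 0.5 * ey ** 2 + (1.0 - math.cos(eth)) / ky


def simulate(t_end=30.0, dt=1e-3, kx=2.0, ky=4.0, kth=2.0):
    v_ref, w_ref = 1.0, 0.5   # hypothetical circular reference
    q_ref = [0.0, 0.0, 0.0]
    q = [0.5, -0.5, 0.3]      # hypothetical initial vehicle pose
    v_start = lyapunov(tracking_error(q, q_ref), ky)
    for _ in range(int(round(t_end / dt))):
        err = tracking_error(q, q_ref)
        v, w = virtual_control(err, v_ref, w_ref, kx, ky, kth)
        # Integrate the unicycle kinematics of Eq. (2) for vehicle and reference.
        for state, vel, rate in ((q, v, w), (q_ref, v_ref, w_ref)):
            th = state[2]
            state[0] += dt * vel * math.cos(th)
            state[1] += dt * vel * math.sin(th)
            state[2] += dt * rate
    err = tracking_error(q, q_ref)
    return v_start, lyapunov(err, ky), err


v_start, v_end, final_err = simulate()
```

Because the reference speed is non-zero in this sketch, the second component of the invariant set in Remark 1 is empty and the error is driven to the origin.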

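With the idea of backstepping control, we then derive the transformed torque input $\bar\tau_i$ for the $i$th vehicle with the following steps. By defining the error between the virtual controller $u_{ci}$ and the actual velocity $u_i$ as $\tilde u_i = [\tilde v_i \ \tilde\omega_i]^T = u_{ci} - u_i$, we can rewrite Eq. (16) in terms of $\tilde v_i$ and $\tilde\omega_i$ as

$$\begin{aligned} \dot{\tilde x}_i &= v_{ri}\cos\tilde\theta_i + \omega_i\tilde y_i - v_{ci} + \tilde v_i = -K_x\tilde x_i + \omega_i\tilde y_i + \tilde v_i \\ \dot{\tilde y}_i &= -\omega_i\tilde x_i + v_{ri}\sin\tilde\theta_i \\ \dot{\tilde\theta}_i &= \omega_{ri} - \omega_{ci} + \tilde\omega_i = -v_{ri}K_y\tilde y_i - K_\theta\sin\tilde\theta_i + \tilde\omega_i \end{aligned} \tag{20}$$

Then we define a new Lyapunov function $V_{2i} = V_{1i} + \frac{\tilde u_i^T \bar M \tilde u_i}{2}$ for the closed-loop system Eq. (20), whose derivative can be written as

$$\begin{aligned} \dot V_{2i} &= \tilde x_i\dot{\tilde x}_i + \tilde y_i\dot{\tilde y}_i + \frac{\sin\tilde\theta_i}{K_y}\dot{\tilde\theta}_i + \tilde u_i^T\bar M\dot{\tilde u}_i \\ &= \tilde x_i\left(-K_x\tilde x_i + \omega_i\tilde y_i + \tilde v_i\right) + \tilde y_i\left(-\omega_i\tilde x_i + v_{ri}\sin\tilde\theta_i\right) + \frac{\sin\tilde\theta_i}{K_y}\left(-v_{ri}K_y\tilde y_i - K_\theta\sin\tilde\theta_i + \tilde\omega_i\right) + \tilde u_i^T\bar M\dot{\tilde u}_i \\ &= -K_x\tilde x_i^2 - \frac{K_\theta}{K_y}\sin^2\tilde\theta_i + \tilde u_i^T\left(\begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix} + \bar M\dot{\tilde u}_i\right). \end{aligned} \tag{21}$$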

To make the system stable, the term $\tilde u_i^T\left(\begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix} + \bar M\dot{\tilde u}_i\right)$ needs to be negative definite. From the definition of $\tilde u_i$ and Eq. (11), we have

$$\bar M\dot{\tilde u}_i = \bar M\dot u_{ci} - \bar M\dot u_i = \bar M\dot u_{ci} + \bar C u_i + \bar F - \bar\tau_i. \tag{22}$$

Motivated by the results of [9], it is easy to show that this term is negative definite if $\bar\tau_i$ is designed as

$$\bar\tau_i = \bar M\dot u_{ci} + \bar C u_i + \bar F + K_u\tilde u_i + \begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix}, \tag{23}$$

where $K_u$ is a positive constant. Since the actual linear and angular velocities of the vehicle are not measured, we use $\hat v_i$ and $\hat\omega_i$ generated by the high-gain observer Eqs. (13) and (14) to replace $v_i$ and $\omega_i$ in Eq. (23). From the discussion in the previous subsection, the convergence of the velocity estimates is guaranteed.

In Eq. (23), $\bar C(u_i)$ and $\bar F(u_i)$ are unknown to the controller. To overcome this issue, an RBFNN is used to approximate this nonlinear uncertain term, i.e.,

$$H(X_i) = \bar C(u_i)\,u_i + \bar F(u_i) = W^{*T} S(X_i) + \epsilon_i, \tag{24}$$

in which $S(X_i)$ is the vector of RBFs with the variable (RBFNN input) $X_i = u_i$, $W^*$ is the common ideal estimation weight of this RBFNN, and $\epsilon_i$ is the ideal estimation error, which can be made arbitrarily small given a sufficiently large number of neurons. Consequently, we propose the implementable controller for the $i$th vehicle as follows

$$\bar\tau_i = \bar M\dot u_{ci} + \hat W_i^T S(X_i) + K_u\begin{bmatrix} v_{ci} - \hat v_i \\ \omega_{ci} - \hat\omega_i \end{bmatrix} + \begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix}. \tag{25}$$

For the NN weights used in Eq. (25), we propose an online NN weight updating law as follows

$$\dot{\hat W}_i = \Gamma S(X_i)\,\tilde u_i^T - \gamma\hat W_i - \beta\sum_{j=1}^{n} a_{ij}\left(\hat W_i - \hat W_j\right), \tag{26}$$

where $\Gamma$, $\gamma$, and $\beta$ are positive constants.
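The structure of the updating law can be sketched with a simple Euler discretization. The sketch stores each $\hat W_i$ as an $N \times 2$ nested list (one column per velocity channel) and isolates the consensus term by setting the regressors, the velocity errors, and the leakage gain to zero; these choices, the two-agent graph, and the step size are assumptions made only to expose how the $\beta$ term pulls neighboring weights together.

```python
def weight_rate(i, weights, regressors, vel_errors, adj, gain, leak, coupling):
    """Right-hand side of Eq. (26) for agent i; weights[i] is an N x 2 list."""
    rate = []
    for k, row in enumerate(weights[i]):
        new_row = []
        for c, w in enumerate(row):
            learn = gain * regressors[i][k] * vel_errors[i][c]
            consensus = sum(adj[i][j] * (w - weights[j][k][c])
                            for j in range(len(weights)))
            new_row.append(learn - leak * w - coupling * consensus)
        rate.append(new_row)
    return rate


def euler_step(weights, regressors, vel_errors, adj, gain, leak, coupling, dt):
    """Advance all agents' weights by one forward-Euler step."""
    rates = [weight_rate(i, weights, regressors, vel_errors, adj,
                         gain, leak, coupling)
             for i in range(len(weights))]
    return [[[w + dt * r for w, r in zip(w_row, r_row)]
             for w_row, r_row in zip(w_i, r_i)]
            for w_i, r_i in zip(weights, rates)]


# Hypothetical two-agent example with N = 2 neurons per agent.
adj = [[0, 1], [1, 0]]
weights = [[[1.0, 0.0], [0.0, 2.0]], [[3.0, 0.0], [0.0, 0.0]]]
regressors = [[0.0, 0.0], [0.0, 0.0]]   # S(X_i) set to zero to isolate consensus
vel_errors = [[0.0, 0.0], [0.0, 0.0]]   # u_tilde_i set to zero
for _ in range(2000):
    weights = euler_step(weights, regressors, vel_errors, adj,
                         gain=1.0, leak=0.0, coupling=1.0, dt=0.01)
```

With learning and leakage switched off, the two agents' weights meet at their initial average. In the full law, the $\Gamma S(X_i)\tilde u_i^T$ term additionally drives each agent's weights using data from its own trajectory, so the agreed value reflects the trajectories of all agents.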

Theorem 1. Consider the closed-loop system consisting of the $n$ vehicles in the MAS described by Eqs. (2) and (11), the reference trajectory $q_{ri}(t)$, the high-gain observer Eqs. (13) and (14), the adaptive NN controller Eq. (25) with the virtual velocity Eq. (17), and the online weight updating law Eq. (26). Under Assumptions 1 and 2, for any bounded initial condition of all the vehicles and $\hat W_i(0) = 0$, the tracking error $\tilde q_i$ converges asymptotically to a small neighborhood around zero for all vehicle agents in the MAS.

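Proof: We first derive the error dynamics of the velocity between $u_{ci}$ and $u_i$ using Eqs. (22), (24), and (25):

$$\dot{\tilde u}_i = \bar M^{-1}\left(\tilde W_i^T S(X_i) + \epsilon_i - K_u\begin{bmatrix} v_{ci} - \hat v_i \\ \omega_{ci} - \hat\omega_i \end{bmatrix} - \begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix}\right), \tag{27}$$

where $\epsilon_i = [\epsilon_{vi} \ \epsilon_{\omega i}]^T$ and $\tilde W_i = W^* - \hat W_i$. Notice that the convergence of $\hat u_i$ to $u_i$ is guaranteed by the high-gain observer. Then we derive the error dynamics of the NN weights as follows

$$\dot{\tilde W}_i = -\dot{\hat W}_i = -\Gamma S(X_i)\,\tilde u_i^T + \gamma\hat W_i + \beta\sum_{j=1}^{n} a_{ij}\left(\hat W_i - \hat W_j\right). \tag{28}$$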

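For the closed-loop system given by Eqs. (20), (27), and (28), we can build a positive definite function $V$ as

$$V = \sum_{i=1}^{n}\left[\frac{\tilde x_i^2}{2} + \frac{\tilde y_i^2}{2} + \frac{1 - \cos\tilde\theta_i}{K_y} + \frac{\tilde u_i^T\bar M\tilde u_i}{2} + \frac{\operatorname{trace}\left(\tilde W_i^T\tilde W_i\right)}{2\Gamma}\right], \tag{29}$$

whose derivative is equal to

$$\dot V = \sum_{i=1}^{n}\left[\tilde x_i\dot{\tilde x}_i + \tilde y_i\dot{\tilde y}_i + \frac{\sin\tilde\theta_i}{K_y}\dot{\tilde\theta}_i + \tilde u_i^T\bar M\dot{\tilde u}_i + \frac{\operatorname{trace}\left(\tilde W_i^T\dot{\tilde W}_i\right)}{\Gamma}\right]. \tag{30}$$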

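By using Eqs. (27) and (28), together with $\hat W_i - \hat W_j = -(\tilde W_i - \tilde W_j)$, the equation above is equivalent to

$$\begin{aligned} \dot V &= \sum_{i=1}^{n}\Bigg\{\tilde x_i\left(\tilde v_i + \omega_i\tilde y_i - K_x\tilde x_i\right) + \tilde y_i\left(v_{ri}\sin\tilde\theta_i - \omega_i\tilde x_i\right) + \frac{\sin\tilde\theta_i}{K_y}\left(\tilde\omega_i - v_{ri}K_y\tilde y_i - K_\theta\sin\tilde\theta_i\right) \\ &\qquad + \tilde u_i^T\left[\tilde W_i^T S(X_i) + \epsilon_i - K_u\tilde u_i - \begin{bmatrix} \tilde x_i \\ \frac{\sin\tilde\theta_i}{K_y} \end{bmatrix}\right] + \operatorname{trace}\left(\tilde W_i^T\left[-S(X_i)\,\tilde u_i^T + \frac{\gamma}{\Gamma}\hat W_i + \frac{\beta}{\Gamma}\sum_{j=1}^{n} a_{ij}\left(\hat W_i - \hat W_j\right)\right]\right)\Bigg\} \\ &= \sum_{i=1}^{n}\Bigg\{-K_x\tilde x_i^2 - \frac{K_\theta}{K_y}\sin^2\tilde\theta_i - K_u\tilde u_i^T\tilde u_i + \tilde u_i^T\epsilon_i + \tilde u_i^T\left[\tilde W_i^T S(X_i)\right] - \operatorname{trace}\left(\left[\tilde W_i^T S(X_i)\right]\tilde u_i^T\right) + \operatorname{trace}\left(\frac{\gamma}{\Gamma}\tilde W_i^T\hat W_i\right)\Bigg\} \\ &\qquad - \operatorname{trace}\left(\sum_{i=1}^{n}\frac{\beta}{\Gamma}\tilde W_i^T\sum_{j=1}^{n} a_{ij}\left(\tilde W_i - \tilde W_j\right)\right) \\ &= \sum_{i=1}^{n}\Bigg\{-K_x\tilde x_i^2 - \frac{K_\theta}{K_y}\sin^2\tilde\theta_i - K_u\tilde u_i^T\tilde u_i + \tilde u_i^T\epsilon_i + \frac{\gamma}{\Gamma}\operatorname{trace}\left(\tilde W_i^T\hat W_i\right)\Bigg\} - \frac{\beta}{\Gamma}\operatorname{trace}\left(\tilde W^T\left(L \otimes I\right)\tilde W\right), \end{aligned} \tag{31}$$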

where $L$ is the Laplacian matrix of $G$, and $\tilde W = [\tilde W_1^T \ \cdots \ \tilde W_n^T]^T$. Since $\beta$ and $\Gamma$ are positive and $L$ is positive semi-definite, we have $\frac{\beta}{\Gamma}\operatorname{trace}\left(\tilde W^T (L \otimes I)\tilde W\right) \geq 0$. Notice that the estimation error can be made arbitrarily small with a sufficiently large number of neurons, and $\gamma$ is the leakage term, chosen as a small positive constant. Therefore, we can conclude that the closed-loop system Eqs. (20), (27), and (28) is stable, i.e., $\dot V \leq 0$, if the following condition holds

$$K_x\tilde x_i^2 + \frac{K_\theta}{K_y}\sin^2\tilde\theta_i + K_u\tilde u_i^T\tilde u_i \geq \tilde u_i^T\epsilon_i + \frac{\gamma}{\Gamma}\operatorname{trace}\left(\tilde W_i^T\hat W_i\right). \tag{32}$$

Hence, the closed-loop system is stable, and all tracking errors are bounded. Since all variables in Eq. (31) are continuous (i.e., $\ddot V$ is bounded), Barbalat's Lemma [30] gives $\lim_{t \to \infty}\dot V = 0$, which implies that the tracking error $\tilde q_i$ of every agent converges to a small neighborhood of zero, whose size depends on the norm of $\tilde u_i^T\epsilon_i + \frac{\gamma}{\Gamma}\operatorname{trace}\left(\tilde W_i^T\hat W_i\right)$. Q.E.D.

3.3 Consensus convergence of NN weights

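In addition to the tracking convergence shown in the previous subsection, we show in this subsection that all vehicles in the system are able to learn the unknown vehicle dynamics along the union trajectory (denoted as $\cup_{i=1}^{n}\zeta_i(X_i(t))$) experienced by all vehicles.

By defining $\tilde v = [\tilde v_1 \ \ldots \ \tilde v_n]^T$, $\tilde\omega = [\tilde\omega_1 \ \ldots \ \tilde\omega_n]^T$, $\tilde W_v = [\tilde W_{1,1} \ \ldots \ \tilde W_{n,1}]^T$, and $\tilde W_\omega = [\tilde W_{1,2} \ \ldots \ \tilde W_{n,2}]^T$, we combine the error dynamics in Eqs. (27) and (28) for all vehicles into the following form:

$$\begin{bmatrix} \dot{\tilde v} \\ \dot{\tilde\omega} \\ \dot{\tilde W}_v \\ \dot{\tilde W}_\omega \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} \tilde v \\ \tilde\omega \\ \tilde W_v \\ \tilde W_\omega \end{bmatrix} + E \tag{33}$$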

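in which

$$A_{2n \times 2n} = \begin{bmatrix} -\frac{K_u}{m}I_n & 0 \\ 0 & -\frac{K_u}{I}I_n \end{bmatrix}, \quad B_{2n \times 2nN} = \begin{bmatrix} \frac{S^T}{m} & 0 \\ 0 & \frac{S^T}{I} \end{bmatrix}, \quad C_{2nN \times 2n} = \begin{bmatrix} -\Gamma S & 0 \\ 0 & -\Gamma S \end{bmatrix}, \quad D_{2nN \times 2nN} = \begin{bmatrix} -\beta\left(L \otimes I_N\right) & 0 \\ 0 & -\beta\left(L \otimes I_N\right) \end{bmatrix},$$

where $S = \operatorname{diag}\left(S(X_1), S(X_2), \ldots, S(X_n)\right)$, and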

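$$E_{(2n + 2nN) \times 1} = \begin{bmatrix} E_1 \\ E_2 \\ E_3 \\ E_4 \end{bmatrix}, \quad E_1 = \frac{1}{m}\begin{bmatrix} \epsilon_{v1} - \tilde x_1 \\ \vdots \\ \epsilon_{vn} - \tilde x_n \end{bmatrix}, \quad E_2 = \frac{1}{I}\begin{bmatrix} \epsilon_{\omega 1} - \frac{\sin\tilde\theta_1}{K_y} \\ \vdots \\ \epsilon_{\omega n} - \frac{\sin\tilde\theta_n}{K_y} \end{bmatrix}, \quad E_3 = \gamma m\begin{bmatrix} \hat W_{1,1} \\ \vdots \\ \hat W_{n,1} \end{bmatrix}, \quad E_4 = \gamma m\begin{bmatrix} \hat W_{1,2} \\ \vdots \\ \hat W_{n,2} \end{bmatrix}.$$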

As shown in Theorem 1, the tracking error $\tilde q_i$ converges to a small neighborhood of zero for all vehicle agents in the MAS. Furthermore, the ideal estimation errors $\epsilon_{vi}$ and $\epsilon_{\omega i}$ can be made arbitrarily small given a sufficient number of RBF neurons, and $\gamma$ is chosen to be a small positive constant; therefore, we can conclude that the norm of $E$ in Eq. (33) is small. In the following theorem, we show that $\hat W_i = [\hat W_{i,1} \ \hat W_{i,2}]$ converges to a small neighborhood of the common ideal weight $W^*$ for all $i = 1, \ldots, n$ under Assumptions 1 and 2.

Before proceeding further, we denote the system trajectory of the $i$th vehicle as $\zeta_i$ for all $i = 1, \ldots, n$. Using the same notation as [14], $(\cdot)_\zeta$ and $(\cdot)_{\bar\zeta}$ represent the parts of $(\cdot)$ related to the region close to and away from the trajectory $\zeta$, respectively.

Theorem 2. Consider the error dynamics Eq. (33). Under Assumptions 1 and 2, for any bounded initial condition of all the vehicles and $\hat W_i(0) = 0$, along the union of the system trajectories $\cup_{i=1}^{n}\zeta_i(X_i(t))$, all local estimated neural weights $\hat W_{\zeta_i}$ used in Eqs. (25) and (26) converge to a small neighborhood of their common ideal value $W_\zeta^*$, and locally accurate identification of the nonlinear uncertain dynamics $H(X(t))$ can be obtained by $\hat W_i^T S(X)$ as well as $\bar W_i^T S(X)$ for all $X \in \cup_{i=1}^{n}\zeta_i(X_i(t))$, where

$$\bar W_i = \operatorname{mean}_{t_{a_i} \leq t \leq t_{b_i}}\hat W_i(t) \tag{34}$$

with $[t_{a_i}, t_{b_i}]$ ($t_{b_i} > t_{a_i} > T_i$) being a time segment after the transient period of tracking control.
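For sampled data, the averaging in Eq. (34) reduces to a windowed mean over the logged weight vectors. The sketch below assumes weights are logged as flat lists at known time stamps; the function name, the sample values, and the window are hypothetical.

```python
def mean_weights(times, weight_history, t_a, t_b):
    """Average sampled weight vectors over t_a <= t <= t_b, as in Eq. (34)."""
    window = [w for t, w in zip(times, weight_history) if t_a <= t <= t_b]
    if not window:
        raise ValueError("no samples inside the averaging window")
    count = len(window)
    return [sum(w[k] for w in window) / count for k in range(len(window[0]))]


# Hypothetical log: the first two samples belong to the transient period.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
history = [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
w_bar = mean_weights(times, history, t_a=2.0, t_b=4.0)
```

Choosing the window after the transient matters: including early samples would bias the stored constant weights toward the zero initial condition.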

Proof: According to [14], if the nominal part of the closed-loop system in Eq. (33) is uniformly locally exponentially stable (ULES), then $\tilde v$, $\tilde\omega$, $\tilde W_v$, and $\tilde W_\omega$ converge to a small neighborhood of the origin, whose size depends on the value of $\|E\|$.

The problem now reduces to proving ULES of the nominal part of system Eq. (33). To this end, we resort to Lemma 4 in [31], which states that if Assumptions 1 and 2 therein are satisfied, and the associated vector $S_\zeta(X_i)$ is PE for all $i = 1, \ldots, n$, then the nominal part of Eq. (33) is ULES. Assumption 1 therein is automatically verified since $S$ is bounded, and Assumption 2 therein also holds if we set the counterparts $P = \Gamma\begin{bmatrix} m & 0 \\ 0 & I \end{bmatrix}$ and $Q = -2\Gamma\begin{bmatrix} K_u I_n & 0 \\ 0 & K_u I_n \end{bmatrix}$. Furthermore, the PE condition of $S_\zeta(X_i)$ is also met if $X_i$ of the learning task is recurrent [14], which is guaranteed by Assumption 2 and the results of Theorem 1. Therefore, we conclude that $\tilde v$, $\tilde\omega$, $\tilde W_v$, and $\tilde W_\omega$ converge to a small neighborhood of the origin, whose size depends on the small value of $\|E\|$.

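Similar to [24], the convergence of $\hat W_{\zeta i}$ to a small neighborhood of $W_\zeta^*$ implies that for all $X \in \cup_{i=1}^n \zeta_i(X_i(t))$, we have

$$H(X) = W_\zeta^{*T} S_\zeta(X) + \epsilon_\zeta = \hat W_{\zeta i}^T S_\zeta(X) + \tilde W_{\zeta i}^T S_\zeta(X) + \epsilon_{\zeta i} = \hat W_{\zeta i}^T S_\zeta(X) + \epsilon_{1\zeta i} \tag{35}$$

where $\epsilon_{1\zeta i} = \tilde W_{\zeta i}^T S_\zeta(X) + \epsilon_{\zeta i}$ is close to $\epsilon_{\zeta i}$ due to the convergence of $\tilde W_{\zeta i}$. With $\bar W_i$ defined in Eq. (34), Eq. (35) can be rewritten as

$$H(X) = \hat W_{\zeta i}^T S_\zeta(X) + \epsilon_{1\zeta i} = \bar W_{\zeta i}^T S_\zeta(X) + \epsilon_{2\zeta i} \tag{36}$$

where $\bar W_{\zeta i} = [w_{1\zeta}, \cdots, w_{k\zeta}]^T$ is a subvector of $\bar W_i$ and $\epsilon_{2\zeta i}$ is the error of using $\bar W_{\zeta i}^T S_\zeta(X)$ as the system approximation. After the transient process, the difference $\|\epsilon_{1\zeta i}\| - \|\epsilon_{2\zeta i}\|$ is small for all $i = 1, \cdots, n$.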

On the other hand, due to the localization property of Gaussian RBFs, both $S_{\bar\zeta}(X)$ and $\bar W_{\bar\zeta}^T S_{\bar\zeta}(X)$ are very small. Hence, along the union trajectory $\cup_{i=1}^n \zeta_i(X_i(t))$, the entire constant RBF network $\bar W^T S(X)$ can be used to approximate the nonlinear uncertain dynamics, as shown by the following equivalent expressions

$$\begin{aligned} H(X) &= W_\zeta^{*T} S_\zeta(X) + \epsilon_\zeta \\ H(X) &= \hat W_{\zeta i}^T S_\zeta(X) + \hat W_{\bar\zeta i}^T S_{\bar\zeta}(X) + \epsilon_{1i} = \hat W_i^T S(X) + \epsilon_{1i} \\ H(X) &= \bar W_{\zeta i}^T S_\zeta(X) + \bar W_{\bar\zeta i}^T S_{\bar\zeta}(X) + \epsilon_{2i} = \bar W_i^T S(X) + \epsilon_{2i} \end{aligned} \tag{37}$$

where $\|\epsilon_{1i}\| - \|\epsilon_{1\zeta i}\|$ and $\|\epsilon_{2i}\| - \|\epsilon_{2\zeta i}\|$ are small for all $i = 1, \cdots, n$. Therefore, the conclusion of Theorem 2 can be drawn. Q.E.D.
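The localization property used in the proof can be illustrated numerically. The sketch below assumes the common Gaussian form $s(X) = \exp(-\|X - c\|^2 / (2\sigma^2))$ and a $5 \times 5$ grid of centers on $[0, 4] \times [0, 4]$ with $\sigma = 0.7$, matching the values used in the simulation section. The split radius of 1.5 and all function names are hypothetical choices made only for this illustration.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Assumed Gaussian form: exp(-||x - c||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def grid_centers(lo, hi, per_axis):
    """Evenly spaced centers on the square [lo, hi] x [lo, hi]."""
    step = (hi - lo) / (per_axis - 1)
    axis = [lo + k * step for k in range(per_axis)]
    return [(a, b) for a in axis for b in axis]


def split_activations(x, centers, sigma, radius):
    """Activations of neurons within `radius` of x (near) and beyond it (far)."""
    near, far = [], []
    for c in centers:
        s = gaussian_rbf(x, c, sigma)
        (near if math.dist(x, c) <= radius else far).append(s)
    return near, far


centers = grid_centers(0.0, 4.0, 5)  # 25 centers with spacing 1.0
near, far = split_activations((1.0, 1.0), centers, sigma=0.7, radius=1.5)
print(len(near), max(near), len(far), max(far))
```

For an input on the grid point $(1, 1)$, the far neurons contribute activations that are small compared with the near ones, which is why the part of the network away from the trajectory has little effect on the approximation along it.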

3.4 Experience-based trajectory tracking control

In this section, building on the learning results of the previous subsections, we propose an experience-based trajectory tracking control method that reuses the learned knowledge, such that the controller is able to drive each vehicle to follow any reference trajectory experienced by any vehicle in the learning stage.

To this end, we replace the NN weight $\hat W_i$ in Eq. (25) by the converged constant NN weight $\bar W_i$ for the $i$th vehicle. The experience-based controller for the $i$th vehicle is then constructed as

$$\bar\tau_i = \bar M \dot u_{ci} + \bar W_i^T S(X_i) + K_u \begin{bmatrix} v_{ci} - \hat v_i \\ \omega_{ci} - \hat\omega_i \end{bmatrix} + \begin{bmatrix} \tilde x_i \\ \dfrac{\sin\tilde\theta_i}{K_y} \end{bmatrix}, \tag{38}$$

in which $\dot u_{ci}$ is the derivative of the virtual velocity controller from Eq. (17), and $\bar W_i$ is obtained from Eq. (34) for the $i$th vehicle. The system model Eqs. (2) and (11) and the high-gain observer design Eqs. (13) and (14) remain unchanged.
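A minimal sketch of how Eq. (38) could be evaluated at one time instant is given below. It assumes that $\bar M$ is supplied as a $2 \times 2$ matrix, that $\bar W_i$ is stored as an $N \times 2$ array whose columns correspond to the two output channels, and that $K_u$ and $K_y$ are scalars. The function name and all numerical values are placeholders and are not taken from the simulation.

```python
import math


def experience_based_torque(M_bar, du_c, W_bar, S, u_c, u_hat,
                            x_err, theta_err, K_u, K_y):
    """Evaluate Eq. (38) term by term for a single vehicle.

    M_bar : 2x2 nested list, du_c : derivative of the virtual velocity,
    W_bar : N x 2 constant weights, S : length-N RBF regressor,
    u_c / u_hat : commanded and observer-estimated [v, omega].
    """
    inertia = [sum(M_bar[r][c] * du_c[c] for c in range(2)) for r in range(2)]
    # W_bar^T S(X): one scalar per output channel.
    nn = [sum(W_bar[k][j] * S[k] for k in range(len(S))) for j in range(2)]
    feedback = [K_u * (u_c[j] - u_hat[j]) for j in range(2)]
    kinematic = [x_err, math.sin(theta_err) / K_y]
    return [inertia[j] + nn[j] + feedback[j] + kinematic[j] for j in range(2)]


# Placeholder numbers, chosen only to exercise the function.
tau = experience_based_torque(
    M_bar=[[2.0, 0.0], [0.0, 0.2]], du_c=[1.0, 0.5],
    W_bar=[[1.0, 0.0], [0.0, 2.0]], S=[0.5, 0.25],
    u_c=[1.0, 1.0], u_hat=[0.5, 1.0],
    x_err=0.1, theta_err=0.0, K_u=2.0, K_y=1.0,
)
print(tau)
```

Because $\bar W_i$ is constant, no weight update law runs alongside this controller, which is the source of the reduced computational load compared with the learning phase.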

Theorem 3. Consider the closed-loop system consisting of Eqs. (2) and (11), the reference trajectory $q_{ri} \in \cup_{j=1}^n q_j(t)$, the high-gain observer Eqs. (13) and (14), and the experience-based controller Eq. (38) with the virtual velocity Eq. (17). For any bounded initial condition, the tracking error $\tilde q_i$ converges asymptotically to a small neighborhood of zero.

Proof: Similar to the proof of Theorem 1, defining $\tilde q_i$ and $\tilde u_i$ as the position and velocity errors of the $i$th vehicle with respect to its reference trajectory, the error dynamics of the $i$th vehicle are

$$\begin{aligned} \dot{\tilde x}_i &= v_{ri}\cos\tilde\theta_i + \omega_i \tilde y_i - v_i = \tilde v_i + \omega_i \tilde y_i - K_x \tilde x_i \\ \dot{\tilde y}_i &= v_{ri}\sin\tilde\theta_i - \omega_i \tilde x_i \\ \dot{\tilde\theta}_i &= \omega_{ri} - \omega_i = \tilde\omega_i - v_{ri} K_y \tilde y_i - K_\theta \sin\tilde\theta_i \\ \dot{\tilde u}_i &= \bar M^{-1}\left( H(X_i) - \bar W_i^T S(X_i) - K_u \begin{bmatrix} v_{ci} - \hat v_i \\ \omega_{ci} - \hat\omega_i \end{bmatrix} - \begin{bmatrix} \tilde x_i \\ \dfrac{\sin\tilde\theta_i}{K_y} \end{bmatrix} \right) \end{aligned} \tag{39}$$

With the same high-gain observer design used in the learning-based tracking, the convergence of $\hat u_i$ to $u_i$ is also guaranteed. For the closed-loop system shown above, we can build a positive definite function as

$$V_i = \frac{\tilde x_i^2}{2} + \frac{\tilde y_i^2}{2} + \frac{1 - \cos\tilde\theta_i}{K_y} + \frac{\tilde u_i^T \bar M \tilde u_i}{2} \tag{40}$$

and the derivative of $V_i$ is

$$\begin{aligned} \dot V_i &= \tilde x_i \dot{\tilde x}_i + \tilde y_i \dot{\tilde y}_i + \frac{\sin\tilde\theta_i}{K_y}\dot{\tilde\theta}_i + \tilde u_i^T \bar M \dot{\tilde u}_i \\ &= \tilde x_i\left(\tilde v_i + \omega_i \tilde y_i - K_x \tilde x_i\right) + \tilde y_i\left(v_{ri}\sin\tilde\theta_i - \omega_i \tilde x_i\right) + \frac{\sin\tilde\theta_i}{K_y}\left(\tilde\omega_i - v_{ri} K_y \tilde y_i - K_\theta \sin\tilde\theta_i\right) + \tilde u_i^T\left(\epsilon_{2i} - K_u \tilde u_i - \begin{bmatrix} \tilde x_i \\ \dfrac{\sin\tilde\theta_i}{K_y} \end{bmatrix}\right) \\ &= -K_x \tilde x_i^2 - \frac{K_\theta}{K_y}\sin^2\tilde\theta_i - K_u \tilde u_i^T \tilde u_i + \tilde u_i^T \epsilon_{2i} \end{aligned} \tag{41}$$

where $\epsilon_{2i} = H(X_i) - \bar W_i^T S(X_i)$. Following arguments similar to those in the proof of Theorem 1, given positive $K_x$, $K_y$, $K_\theta$, and $K_u$, the Lyapunov function $V_i$ is positive definite and $\dot V_i$ is negative semi-definite in the region $K_x \tilde x_i^2 + \frac{K_\theta}{K_y}\sin^2\tilde\theta_i + K_u \tilde u_i^T \tilde u_i \ge \tilde u_i^T \bar\epsilon_i$. As in the proof of Theorem 1, Barbalat's Lemma then gives $\lim_{t\to\infty} \dot V_i = 0$, and the tracking errors converge to a small neighborhood of zero. Q.E.D.
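The cancellation of the cross terms between the second and third lines of Eq. (41) can be checked numerically. The sketch below evaluates both expressions at randomly drawn points and reports the largest mismatch. It assumes that the observer has converged, so that $\hat u_i$ is replaced by $u_i$; the function names, sampling ranges, and seed are arbitrary choices for the illustration.

```python
import math
import random


def v_dot_expanded(x, y, th, u, eps, v_r, w, gains):
    """Second line of Eq. (41), before cancelling cross terms."""
    Kx, Ky, Kth, Ku = gains
    v_t, w_t = u  # velocity errors (tilde v, tilde omega)
    x_dot = v_t + w * y - Kx * x
    y_dot = v_r * math.sin(th) - w * x
    th_dot = w_t - v_r * Ky * y - Kth * math.sin(th)
    # M_bar * d(u_tilde)/dt = eps - Ku * u_tilde - [x, sin(th) / Ky]
    mu0 = eps[0] - Ku * v_t - x
    mu1 = eps[1] - Ku * w_t - math.sin(th) / Ky
    return (x * x_dot + y * y_dot + math.sin(th) / Ky * th_dot
            + v_t * mu0 + w_t * mu1)


def v_dot_simplified(x, th, u, eps, gains):
    """Last line of Eq. (41)."""
    Kx, Ky, Kth, Ku = gains
    v_t, w_t = u
    return (-Kx * x ** 2 - Kth / Ky * math.sin(th) ** 2
            - Ku * (v_t ** 2 + w_t ** 2) + v_t * eps[0] + w_t * eps[1])


def max_mismatch(trials, seed=0):
    """Largest absolute difference between the two forms over random points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x, y, th, v_r, w = (rng.uniform(-2.0, 2.0) for _ in range(5))
        u = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        eps = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
        gains = tuple(rng.uniform(0.1, 3.0) for _ in range(4))
        diff = abs(v_dot_expanded(x, y, th, u, eps, v_r, w, gains)
                   - v_dot_simplified(x, th, u, eps, gains))
        worst = max(worst, diff)
    return worst


print(max_mismatch(1000))
```

A mismatch at the level of floating-point round-off indicates that the terms in $\tilde x_i \tilde v_i$, $\omega_i \tilde x_i \tilde y_i$, $v_{ri}\tilde y_i \sin\tilde\theta_i$, and $\tilde\omega_i \sin\tilde\theta_i / K_y$ cancel exactly, leaving only the three negative terms and the residual $\tilde u_i^T \epsilon_{2i}$.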

4. Simulation studies

Consider four identical vehicles whose unknown friction vector is assumed to be a nonlinear function of $v$ and $\omega$,

$$\bar F = \begin{bmatrix} 0.1\, m v_i + 0.05\, m v_i^2 \\ 0.2\, I \omega_i + 0.1\, I \omega_i^2 \end{bmatrix}.$$

Since the vehicles are assumed to operate on a horizontal plane, the gravitational vector $\bar G$ is equal to zero. The physical parameters of the vehicles are $m = 2$ kg, $I = 0.2$ kg·m², $R = 0.15$ m, $r = 0.05$ m, and $d = 0.1$ m. The reference trajectories of the four vehicles are given by

$$\begin{aligned} x_{r1} &= -\sin t, & y_{r1} &= 2\cos t \\ x_{r2} &= 2\cos t, & y_{r2} &= \sin t \\ x_{r3} &= -2\sin t, & y_{r3} &= 3\cos t \\ x_{r4} &= 3\cos t, & y_{r4} &= 2\sin t \end{aligned} \tag{42}$$

and for all vehicles, the orientations of the reference trajectories and the reference velocities satisfy

$$\tan\theta_{ri} = \frac{\dot y_{ri}}{\dot x_{ri}}, \quad v_{ri} = \sqrt{\dot x_{ri}^2 + \dot y_{ri}^2}, \quad \omega_{ri} = \frac{\dot x_{ri}\ddot y_{ri} - \ddot x_{ri}\dot y_{ri}}{\dot x_{ri}^2 + \dot y_{ri}^2}. \tag{43}$$
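As an illustration of Eq. (43), the sketch below computes $\theta_{ri}$, $v_{ri}$, and $\omega_{ri}$ from the first and second time derivatives of a planar reference path, using the first trajectory of Eq. (42) as input. Resolving the orientation with `atan2` to pick the quadrant is our assumption, since Eq. (43) specifies only $\tan\theta_{ri}$; the function names are hypothetical.

```python
import math


def reference_signals(xd, yd, xdd, ydd):
    """Orientation, linear speed and angular speed from path derivatives."""
    speed_sq = xd * xd + yd * yd
    if speed_sq == 0.0:
        raise ValueError("orientation is undefined at zero reference speed")
    theta_r = math.atan2(yd, xd)  # quadrant-aware solution of tan(theta) = yd / xd
    v_r = math.sqrt(speed_sq)
    omega_r = (xd * ydd - xdd * yd) / speed_sq
    return theta_r, v_r, omega_r


def vehicle1_derivatives(t):
    """Derivatives of x_r1 = -sin t, y_r1 = 2 cos t from Eq. (42)."""
    xd, yd = -math.cos(t), -2.0 * math.sin(t)
    xdd, ydd = math.sin(t), -2.0 * math.cos(t)
    return xd, yd, xdd, ydd


theta_r, v_r, omega_r = reference_signals(*vehicle1_derivatives(1.0))
print(theta_r, v_r, omega_r)
```

Because the reference paths in Eq. (42) are ellipses, the reference speed never vanishes, so the division in $\omega_{ri}$ is well defined along each of them.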

The parameters of the observer Eqs. (13) and (14) are $\delta = 0.01$ and $l_1 = l_2 = 1$. The parameters of the controller Eq. (25) with Eq. (17) are $K_x = K_y = K_\theta = 1$ and $K_u = 2$. The parameters of Eq. (26) are $\Gamma = 10$, $\gamma = 0.001$, and $\beta = 10$. For each $i = 1, 2, 3, 4$, since $X_i = [v_i, \omega_i]^T$, we construct the Gaussian RBFNN $\hat W_i^T S(X_i)$ using $N = 5 \times 5 = 25$ neuron nodes, with the centers evenly placed over the state space $[0, 4] \times [0, 4]$ and the standard deviation of the Gaussian function equal to $0.7$. The initial positions of the vehicles are set at the origin with zero velocities, and the initial weights of the RBFNNs are also set to zero. The connection between the four vehicles is shown in Figure 3, and the Laplacian matrix $L$ associated with the graph $G$ is

$$L = \begin{bmatrix} 2 & -1 & 0 & -1 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ -1 & 0 & -1 & 2 \end{bmatrix}. \tag{44}$$

Figure 3.
Connection between four vehicles.
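The structure of Eq. (44) can be checked with a short sketch. The off-diagonal entries of $L$ correspond to an undirected ring 1-2-3-4-1; the edge list below is inferred from those entries rather than read from Figure 3, and the helper names are hypothetical. Vehicle indices are shifted to start at zero.

```python
def laplacian_from_edges(n, edges):
    """Graph Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1
        L[j][i] -= 1
        L[i][i] += 1
        L[j][j] += 1
    return L


def is_connected(L):
    """Breadth-first search over the nonzero off-diagonal entries of L."""
    n = len(L)
    seen, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(n):
            if j != i and L[i][j] != 0 and j not in seen:
                seen.add(j)
                frontier.append(j)
    return len(seen) == n


# Ring topology inferred from the off-diagonal entries of Eq. (44).
RING_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
L_EQ44 = [[2, -1, 0, -1], [-1, 2, -1, 0], [0, -1, 2, -1], [-1, 0, -1, 2]]

L = laplacian_from_edges(4, RING_EDGES)
print(L == L_EQ44, all(sum(row) == 0 for row in L), is_connected(L))
```

Each vehicle therefore exchanges weight estimates with exactly two neighbors, and the connectedness of the ring lets information from any vehicle reach all others through the consensus term.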

The simulation results are as follows. Figure 4a shows that the observer errors converge to a small neighborhood of zero within a very short time, and Figure 4b shows that all tracking errors $\tilde x_i$ and $\tilde y_i$ converge to zero. Figures 5a–f show that each vehicle (blue triangle) tracks its own reference trajectory (red solid circle) in the 2-D plane. Figure 6b shows that the NN weights of all vehicle agents converge to the same constant, and Figure 6a shows that the RBFNNs of all vehicles accurately estimate the unknown dynamics, as the estimation errors converge to a small neighborhood of zero.

Figure 4.
Observer errors and tracking errors using observer-based controller. (a) Observer errors using observer (13) and (14). (b) Tracking errors using controller (25) with (17) and (26).

Figure 5.
Snapshots of trajectory tracking using controller Eq. (25) with Eqs. (17) and (26). (a) time at 0 seconds. (b) time at 1 second. (c) time at 4 seconds. (d) time at 9 seconds. (e) time at 16 seconds. (f) time at 25 seconds.

Figure 6.
Estimation errors and NN weight convergence. (a) Estimation errors using controller (25) with (17) and (26). (b) Weight vector 1-norm of $W_v$ and $W_\omega$.

To demonstrate the result of Theorem 3, namely that after the learning process each vehicle can use the learned knowledge to follow any reference trajectory experienced by any vehicle in the learning stage, we implement the experience-based controller Eq. (38) with the same parameters as in the learning simulation, such that vehicle 1 follows the reference trajectory of vehicle 3, vehicle 2 follows the reference trajectory of vehicle 1, and vehicle 3 follows the reference trajectory of vehicle 2. The initial positions of the vehicles are set at the origin, with all velocities equal to zero.

The simulation results are as follows. Figure 7a shows that the observer errors converge to a small neighborhood of zero within a very short time. Figures 8a–c show that each vehicle (blue triangle) tracks its assigned reference trajectory (red solid circle), and Figure 7b shows that all tracking errors $\tilde x_i$ and $\tilde y_i$ converge to zero.

Figure 7.
Observer errors and tracking errors using observer-based controller. (a) Observer errors using observer (13) and (14). (b) Tracking errors using controller (38) with (17).

Figure 8.
Snapshots of trajectory tracking using controller Eq. (38) with Eq. (17). (a) time at 0 seconds. (b) time at 4 seconds. (c) time at 16 seconds.

5. Conclusion

In this chapter, a high-gain observer-based CDL control algorithm has been proposed to estimate the unmodeled nonlinear dynamics of a group of homogeneous unicycle-type vehicles while tracking their reference trajectories. It has been shown in this chapter that the state estimation, trajectory tracking, and consensus learning are all achieved using the proposed algorithm. To be more specific, any vehicle in the system is able to learn the unmodeled dynamics along the union of trajectories experienced by all vehicles with the state variables provided by measurements and observer estimations. In addition, we have also shown that with the converged NN weight, this knowledge can be applied on the vehicle to track any experienced trajectory with reduced computational complexity. Simulation results have been provided to demonstrate the effectiveness of this proposed algorithm.

References

1. Yu X, Liu L, Feng G. Trajectory tracking for nonholonomic vehicles with velocity constraints. In: IFAC-Papers Online. Vol. 48, No. 11. 2015. pp. 918-923
2. Chen X, Jia Y. Simple tracking controller for unicycle-type mobile robots with velocity and torque constraints. Transactions of the Institute of Measurement and Control. 2015;37(2):211-218
3. Seyboth GS, Wu J, Qin J, Yu C, Allgöwer F. Collective circular motion of unicycle type vehicles with nonidentical constant velocities. IEEE Transactions on Control of Network Systems. 2014;1(2):167-176
4. Dong X, Yuan C, Wu F. Cooperative deterministic learning-based trajectory tracking for a group of unicycle-type vehicles. In: ASME 2018 Dynamic Systems and Control Conference. American Society of Mechanical Engineers. 2018. p. V003T30A006
5. Luenberger DG. Observing the state of a linear system. IEEE Transactions on Military Electronics. 1964;8(2):74-80
6. Luenberger D. An introduction to observers. IEEE Transactions on Automatic Control. 1971;16(6):596-602
7. Lee KW, Khalil HK. Adaptive output feedback control of robot manipulators using high-gain observer. International Journal of Control. 1997;67(6):869-886
8. Khalil HK. High-gain observers in nonlinear feedback control. In: 2008 International Conference on Control, Automation and Systems. IEEE. 2008. pp. xlvii-xlvii
9. Zeng W, Wang Q, Liu F, Wang Y. Learning from adaptive neural network output feedback control of a unicycle-type mobile robot. ISA Transactions. 2016;61:337-347
10. Boker A, Yuan C. High-gain observer-based distributed tracking control of heterogeneous nonlinear multi-agent systems. In: 2018 37th Chinese Control Conference (CCC). IEEE. 2018. pp. 6639-6644
11. Rossomando FG, Soria CM. Identification and control of nonlinear dynamics of a mobile robot in discrete time using an adaptive technique based on neural PID. Neural Computing and Applications. 2015;26(5):1179-1191
12. Miao Z, Wang Y. Adaptive control for simultaneous stabilization and tracking of unicycle mobile robots. Asian Journal of Control. 2015;17(6):2277-2288
13. Fierro R, Lewis FL. Control of a nonholonomic mobile robot using neural networks. IEEE Transactions on Neural Networks. 1998;9(4):589-600
14. Wang C, Hill DJ. Deterministic Learning Theory for Identification, Recognition, and Control. Vol. 32. Boca Raton, FL: CRC Press Taylor & Francis Group; 2009
15. Cai X, de Queiroz M. Adaptive rigidity-based formation control for multirobotic vehicles with dynamics. IEEE Transactions on Control Systems Technology. 2015;23(1):389-396
16. Yuan C. Distributed adaptive switching consensus control of heterogeneous multi-agent systems with switched leader dynamics. Nonlinear Analysis: Hybrid Systems. 2017;26:274-283
17. Yuan C, Licht S, He H. Formation learning control of multiple autonomous underwater vehicles with heterogeneous nonlinear uncertain dynamics. IEEE Transactions on Cybernetics. 2017;99:1-15
18. Yuan C, Zeng W, Dai S. Distributed model reference adaptive containment control of heterogeneous uncertain multi-agent systems. ISA Transactions. 2019;86:73-86
19. Yuan C, He H, Wang C. Cooperative deterministic learning-based formation control for a group of nonlinear uncertain mechanical systems. IEEE Transactions on Industrial Informatics. 2019;15(1):319-333
20. Stegagno P, Yuan C. Distributed cooperative adaptive state estimation and system identification for multi-agent systems. IET Control Theory and Applications. 2019;13(1):815-822
21. Agaev R, Chebotarev P. The matrix of maximum out forests of a digraph and its applications. arXiv preprint math/0602059; 2006
22. Park J, Sandberg IW. Universal approximation using radial-basis-function networks. Neural Computation. 1991;3(2):246-257
23. Buhmann MD. Radial Basis Functions: Theory and Implementations. Vol. 12. Cambridge, UK: Cambridge University Press; 2003
24. Wang C, Hill DJ. Learning from neural control. IEEE Transactions on Neural Networks. 2006;17(1):130-146
25. Fierro R, Lewis FL. Control of a nonholonomic mobile robot: Backstepping kinematics into dynamics. In: Proceedings of the 34th IEEE Conference on Decision and Control. Vol. 4; IEEE. 1995. pp. 3805-3810
26. Siciliano B, Sciavicco L, Villani L, Oriolo G. Robotics: Modelling, Planning and Control. London, United Kingdom: Springer Science and Business Media; 2009
27. Oh S, Khalil HK. Nonlinear output-feedback tracking using high-gain observer and variable structure control. Automatica. 1997;33(10):1845-1856
28. Kanayama Y, Kimura Y, Miyazaki F, Noguchi T. A stable tracking control method for an autonomous mobile robot. In: Proceedings 1990 IEEE International Conference on Robotics and Automation; IEEE. 1990. pp. 384-389
29. Ioannou PA, Sun J. Robust Adaptive Control. Courier Corporation; 2012
30. Barbalat I. Systemes d’équations différentielles d’oscillations non linéaires. Reviews Mathematics Pures Applied. 1959;4(2):267-270
31. Chen W, Wen C, Hua S, Sun C. Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative PE condition via consensus. IEEE Transactions on Automatic Control. 2014;59(1):91-106

Sections

Author information

1.Introduction
2.Preliminaries and problem statement
3.Main results
4.Simulation studies
5.Conclusion

References

Publish with IntechOpen

Next chapter

Multiagent Systems for 3D Reconstruction Applications

By Metehan Aydın, Erkan Bostancı, Mehmet Serdar Güzel and Nadia Kanwal

855 downloads | 4 cites

[1] 1. Yu X, Liu L, Feng G. Trajectory tracking for nonholonomic vehicles with velocity constraints. In: IFAC-Papers Online. Vol. 48, No. 11. 2015. pp. 918-923

[2] 2. Chen X, Jia Y. Simple tracking controller for unicycle-type mobile robots with velocity and torque constraints. Transactions of the Institute of Measurement and Control. 2015;37(2):211-218

[3] 3. Seyboth GS, Wu J, Qin J, Yu C, Allgöwer F. Collective circular motion of unicycle type vehicles with nonidentical constant velocities. IEEE Transactions on control of Network Systems. 2014;1(2):167-176

[4] 4. Dong X, Yuan C, Wu F. Cooperative deterministic learning-based trajectory tracking for a group of unicycle-type vehicles. In: ASME 2018 Dynamic Systems and Control Conference. 1em plus 0.5em minus 0.4em; American Society of Mechanical Engineers. 2018. p. V003T30A006

[5] 5. Luenberger DG. Observing the state of a linear system. IEEE transactions on military electronics. 1964;8(2):74-80

[6] 6. Luenberger D. An introduction to observers. IEEE Transactions on Automatic Control. 1971;16(6):596-602

[7] 7. Lee KW, Khalil HK. Adaptive output feedback control of robot manipulators using high-gain observer. International Journal of Control. 1997;67(6):869-886

[8] 8. Khalil HK. High-gain observers in nonlinear feedback control. In: 2008 International Conference on Control, Automation and Systems. 1em plus 0.5em minus 0.4em; IEEE. 2008. pp. xlvii-xlvii

[9] 9. Zeng W, Wang Q, Liu F, Wang Y. Learning from adaptive neural network output feedback control of a unicycle-type mobile robot. ISA Transactions. 2016;61:337-347

[10] 10. Boker A, Yuan C. High-gain observer-based distributed tracking control of heterogeneous nonlinear multi-agent systems. In: 2018 37th Chinese Control Conference (CCC). 1em plus 0.5em minus 0.4em; IEEE. 2018. pp. 6639-6644

[11] 11. Rossomando FG, Soria CM. Identification and control of nonlinear dynamics of a mobile robot in discrete time using an adaptive technique based on neural pid. Neural Computing and Applications. 2015;26(5):1179-1191

[12] 12. Miao Z, Wang Y. Adaptive control for simultaneous stabilization and tracking of unicycle mobile robots. Asian Journal of Control. 2015;17(6):2277-2288

[13] 13. Fierro R, Lewis FL. Control of a nonholonomic mobile robot using neural networks. IEEE Transactions on Neural Networks. 1998;9(4):589-600

[14] 14. Wang C, Hill DJ. Deterministic Learning Theory for Identification, Recognition, and Control. Vol. 32. Boca Ration, FL: CRC Press Taylor & Francis Group; 2009

[15] 15. Cai X, de Queiroz M. Adaptive rigidity-based formation control for multirobotic vehicles with dynamics. IEEE Transactions on Control Systems Technology. 2015;23(1):389-396

[16] 16. Yuan C. Distributed adaptive switching consensus control of heterogeneous multi-agent systems with switched leader dynamics. Nonlinear Analysis: Hybrid Systems. 2017;26:274-283

[17] 17. Yuan C, Licht S, He H. Formation learning control of multiple autonomous underwater vehicles with heterogeneous nonlinear uncertain dynamics. IEEE transactions on cybernetics. 2017;99:1-15

[18] 18. Yuan C, Zeng W, Dai S. Distributed model reference adaptive containment control of heterogeneous uncertain multi-agent systems. ISA Transactions. 2019;86:73-86

[19] 19. Yuan C, He H, Wang C. Cooperative deterministic learning-based formation control for a group of nonlinear uncertain mechanical systems. IEEE Transactions on Industrial Informatics. 2019;15(1):319-333

[20] 20. Stegagno P, Yuan C. Distributed cooperative adaptive state estimation and system identification for multi-agent systems. IET Control Theory and Applications. 2019;13(1):815-822

[21] 21. Agaev R, Chebotarev P. The matrix of maximum out forests of a digraph and its applications. arXiv preprint math/0602059; 2006

[22] 22. Park J, Sandberg IW. Universal approximation using radial-basis-function networks. Neural Computation. 1991;3(2):246-257

[23] 23. Buhmann MD. Radial Basis Functions: Theory and Implementations. Vol. 12. Cambridge, UK: Cambridge University Press; 2003

[24] 24. Wang C, Hill DJ. Learning from neural control. IEEE Transactions on Neural Networks. 2006;17(1):130-146

[25] 25. Fierro R, Lewis FL. Control of a nonholonomic mobile robot: Backstepping kinematics into dynamics. In: Proceedings of the 34^th IEEE Conference on Decision and Control. Vol. 4; IEEE. 1995. pp. 3805-3810

[26] 26. Siciliano B, Sciavicco L, Villani L, Oriolo G. Robotics: Modelling, Planning and Control. London, United Kingdom: Springer Science and Business Media; 2009

[27] 27. Oh S, Khalil HK. Nonlinear output-feedback tracking using high-gain observer and variable structure control. Automatica. 1997;33(10):1845-1856

[28] 28. Kanayama Y, Kimura Y, Miyazaki F, Noguchi T. A stable tracking control method for an autonomous mobile robot. In: Proceedings 1990 IEEE International Conference on Robotics and Automation; IEEE. 1990. pp. 384-389

[29] 29. Ioannou PA, Sun J. Robust Adaptive Control. 1em plus 0.5em minus 0.4em; Courier Corporation; 2012

[30] 30. Barbalat I. Systemes d’équations différentielles d’oscillations non linéaires. Reviews Mathematics Pures Applied. 1959;4(2):267-270

[31] 31. Chen W, Wen C, Hua S, Sun C. Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative pe condition via consensus. IEEE Transactions on Automatic Control. 2014;59(1):91-106

Cooperative Adaptive Learning Control for a Group of Nonholonomic UGVs by Output Feedback

Multi Agent Systems - Strategies and Applications

Abstract

Keywords

Author Information

Xiaonan Dong*

Paolo Stegagno

Chengzhi Yuan*

Wei Zeng

1. Introduction

2. Preliminaries and problem statement

2.1 Graph theory

2.2 Localized RBF neural networks and deterministic learning

2.3 Vehicle model and problem statement

Figure 1.

3. Main results

3.1 High-gain observer design

3.2 Controller design and tracking convergence analysis

Figure 2.

3.3 Consensus convergence of NN weights

3.4 Experience-based trajectory tracking control

4. Simulation studies

Figure 3.

Figure 4.

Figure 5.

Figure 6.

Figure 7.

Figure 8.

5. Conclusion

References

Multiagent Systems for 3D Reconstruction Applications

Your cart

Cooperative Adaptive Learning Control for a Group of Nonholonomic UGVs by Output Feedback

Multi Agent Systems - Strategies and Applications

Abstract

Keywords

Author Information

Xiaonan Dong*

Paolo Stegagno

Chengzhi Yuan*

Wei Zeng

1. Introduction

2. Preliminaries and problem statement

2.1 Graph theory

2.2 Localized RBF neural networks and deterministic learning

2.3 Vehicle model and problem statement

Figure 1.

3. Main results

3.1 High-gain observer design

3.2 Controller design and tracking convergence analysis

Figure 2.

3.3 Consensus convergence of NN weights

3.4 Experience-based trajectory tracking control

4. Simulation studies

Figure 3.

Figure 4.

Figure 5.

Figure 6.

Figure 7.

Figure 8.

5. Conclusion

References

Continue reading from the same book

Multi Agent Systems

Your cart