Summary of case study results.
This chapter studies a deterministic approach to transient trajectory generation and control as applied to the forced Van der Pol oscillatory system. This type of system tends towards a strongly nonlinear system, which can be considered chaotic. A classical tuning method, targeted exponential weighting, and isolated trajectory fractionalization trajectory generation methods are examined. Illustrating the given deterministic approach via the Van der Pol system highlights the potentially iterative nature of deterministic methods, and that traditional optimal linear time-invariant control techniques are unable to perform as desired whereas even an idealized nonlinear feedforward control significantly outperforms at the steady-state. It will be shown that utilizing a-priori knowledge of the system dynamics will enable the isolated trajectory fractionalization method to minimize the nonlinear transient effects due to miss-modeled or unmodeled plant dynamics, and that this benefit can be coupled with the targeted exponential weighting approach for greatly decreased trajectory tracking error on the order of a 92% reduction of the objective cost function in the presented case study based on the forced Van der Pol system.
- Van der Pol
- trajectory generation
- path planning
- nonlinear transients
- control systems
- nonlinear dynamic system
- aerospace engineering
- intelligent systems
- dynamic inversion
- disturbance modeling
- phase portrait tuning
Transient trajectory generation research is an interesting area, and often finds itself in a highly nonlinear environment. Similar to trajectory optimization problems [1, 2, 3, 4, 5, 6, 7, 8, 9], the goal of designing transient trajectories is to reach the desired steady-state trajectory more quickly while simultaneously minimizing and unwanted micro-transients along the way. Micro-transients can be considered as a subcategory of transient trajectories. Transient trajectories are smaller trajectories that connect two steady-state trajectories and can be quite volatile and full of non-linear micro transients that may push the system towards instability. Section 1 will introduce the reader to basic control theory, and the Van der Pol oscillator which will be used to illustrate transient trajectory generation approaches. Section 2 will describe a classical tuning method, targeted exponential weighting, and isolated trajectory fractionalization trajectory generation methods and present the corresponding results of each method. Section 3 will finish off the chapter by summarizing the results in table format.
1.1 Basic control theory
An introduction to control theory will be conducted to set the understanding baseline in common controls. This refresher will start with state-space nomenclature and then quickly step through the most common feedback control, feedforward control, observers, and adaptive designs. This will be the ground work that is used to later illustrate to the trajectory generation techniques and the inherent complexity in nonlinear dynamical systems such as the Van der Pol oscillator. Any dynamical system can be represented in state-space matrix form  to provide a solution formed as shown in Eq. (1):
where A is a matrix representing the system states (also referred to as the plant). B is an input matrix, C is the output matrix, and D is the feedback/feedforward control input matrix. A symbolic representation of a feedback control loop is illustrated in Figure 1.
The input to Figure 1 is nominally a desired position (output) from a user which may be converted to a usable command in the generate trajectory block. For the initial output the input will be fed directly to the plant for a resulting position. If this is not the desired position, then the feedback controller will calculate that error between the desired and resulting position and produced an appropriately weighted control signal to adjust the input. This will happen cyclically until the error approaches zero, and is typically referred to as a zero-seeking negative-feedback approach . A common way to represent the interaction between the input and output of a system is as a transfer function , and is usually presented in the Laplacian domain as shown as H in Eq. (2). Also known as the S-domain, this can be an easier format to mathematically manipulate the system variables primarily due to the fact that the system blocks interact in a convolutional way in the time-domain whereas in the S-domain one can just perform multiplication and achieve equivalent results .
The PID controller is the staple of many feedback control systems , and can be formulated such that it is a zero-seeking architecture by calculating the difference between the measured output of the system and the desired output as shown in Figure 2 and Eq. (4). The error signal is then fed into the proportional, integral, and derivative blocks, and summed together to provide an additional control input to the plant. The proportional block assigns a constant gain (Kp) to the error signal, the integral block calculates the integral of the error signal and applies a different gain (Ki). There are many different ways and variations of implementing a PID controller such as sometimes only utilizing one or two components as shown in Eq. (3) as a typical PI controller example.
where q is the desired position and
The third and final component of the PID feedback controller is the derivative block which calculated the derivative of the error signal and assigns a corresponding gain (Kd) to it. If all three components are summed together as shown in Figure 2, it will result in Eq. (5) for the time-domain solution, and Eq. 6 for the S-domain. Eqs. (5) and (6) illustrate both the time-domain equation and the S-domain equation for ease of comparison.
Another common feedback control method extends into the realm of optimal control [1, 2] which is focused primarily at looking at control as a cost minimization problem. A linear quadratic regulator (LQR) is one way to treat a feedback control system as an optimal control problem . One variation based in the continuous time-domain treats the problem with a focus on the infinite steady-state, the infinite-horizon continuous-time LQR. The cost function of this type of solution is identified in Eq. (7).
where Q is the state cost matrix, R is the input cost matrix, and N is the final state cost matrix with a feedback control law of:
and P is found by solving the continuous time Riccati equation :
For a linear system, with a linear response, an appropriately chosen cost function, J, can achieve optimal feedback control parameters and find the feedback control parameters that minimize the control cost J.
When full-state feedback is not present in a system, i.e. the outputs cannot be fully known, the control system will either need to be designed without the need for the missing information or those states will need to be estimated based on the sensed outputs as seen in Figure 3. In this case an estimator is also referred to as a state observer. A state observer’s role is to estimate the internal states of a given system based on the current inputs and outputs. For a more thorough discussion the reader is referred to .
An example of taking the general formulation above and applying it to a specific system is illustrated in Figure 4 for a spacecraft, more specifically a satellite. Here the system dynamics will also include control moment gyros, and other physical restraints/limitations of a real-world plant [12, 13, 14, 15, 16, 17, 18, 19, 20]. Additionally, the control variables here are in angle, angular rate, and angular acceleration in order to apply the correct torque on the motors to affect the desired outcome.
The key takeaway here is that the high-level depiction of the system identified in Figure 4 is not very different from the general case, only that the details within the blocks encompass many more physical variables.
1.2 The Van der Pol oscillator
Balthasar van der Pol (1889–1959) was a Dutch physicist who became interested in the differential equations of coupled electrical systems, which formed into relaxation oscillators, what later became a description of what we now call limit cycles . His initial investigation began with describing the human heartbeat, and later when turning towards the issues of radio communication [23, 24, 25] at the time (the result of deterministic chaos) and is illustrated with the resistor-inductor-capacitor (RLC) circuit shown in Figure 5.
The circuit in Figure 5 represents a schematic for an RLC circuit comprised of a nonlinear resistor (R), an inductor (L), and a capacitor (C). This common circuit can be described by the differential equation  in Eq. (11):
By using the simplification assumption in Eq. (15), one can simulate the VPD as a plant to a controls simulation in MATLAB/Simulink structured around the model presented previously in Figure 3. The natural periodic oscillations in the phase portrait can be seen in Figure 6, which are shown to converge to the same limit cycle from any initial condition nearby (local). This model will be discussed in greater detail in Section 2.
Phase portraits are ways to represent the performance of a system, and to also analyze the expected stability within some local bounds usually determined by the anticipated operational bounds [10, 11, 23]. These portraits display the state of interest with respect to the derivative of that state. Ideally, the goal is such that the state is controllable enough so that the state can be forced to zero within a given time-constant. Sometimes, in a nonlinear system, just being able to prevent the state from growing unbounded (blowing up) and towards an arbitrary asymptotic limit is acceptable .
Using the forced VDP equation, and simplifying it by using the outlined assumptions in Eq. (15) to achieve a perfectly circular limit cycle:
to get Eq. (16):
In order to implement the forcing function onto the VDP system, it is needed to invert the dynamics such that the VDP equation is used as the feedforward control signal, which will feed into the plant. The steady-state results can be seen in Figure 7, with a severe amount to transient phenomenon before reaching the perfectly circular limit-cycle steady-state response.
1.3 What are transient trajectories?
What are transient trajectories? This is a good question, and one with issues and difficulties that the reader may understand without explicitly realizing it. Transient trajectories are those trajectories that transition between the intended steady-state trajectories, and are often short lived [1, 10, 11, 26, 27, 28]. The most common types of transient trajectories are of the variety if initial start-up of a system . For example; let us say we have a satellite pointing at some location (x1, y1) on Earth, and now it is needed to point a different location (x2, y2). The desired control and corresponding control trajectory (if a solely feedback control loop is not implemented) will comprise the transient trajectory. It starts at (x1, y1), ends at (x2, y2), and incorporates every point in the resulting path between those two points that the platform takes to include the multiple controllable states of the system.
Figure 8 shows an example of a desired transient trajectory. In this example a satellite with an electro-optical telescope is being steered by directing the pointing vector of the staring sensor it employs. The sensor starts at a (0°, 0°, 0°) in the standard (roll, pitch, yaw) coordinate system, and then the user desires the sensor to move to (0°, 0°, 30°). The user inputs a command of 30° yaw to the system, and can be interpreted in many ways. The red, yellow, and blue lines represent different variations of the interpretation across the desired angle, angular rate, and angular acceleration. The trajectory in red represents a sinusoidal-based step function that takes on discontinuities in the angular rate and acceleration terms, whereas the trajectories in yellow and blue are entirely continuous.
The effects of transient trajectories can be illustrated via phase portraits like those presented in Figures 6 and 7, with the main concern in affecting the stability of a system of interest. If the transients couple with the potential variations in initial conditions to produce deleterious effects on the stability, it will be illustrated very clearly in a phase portrait as it will likely grow unbounded. Another method on evaluating the performance of a system when evaluating the effects of transients is through the use of an object cost function . A common approach is by using the root-mean-square (RMS) value between the desired trajectory and the measured trajectory to gain an RMS error value as shown in Eq. (17).
2. Transient trajectory generation case study
At this point it is relevant to apply the basic components introduced in the previous section to the common nonlinear, and slightly chaotic VDP system to illustrate the benefits and highlight potential areas for further discussion.
Taking the Van der Pol oscillatory system as described by Eq. (16), and modeling it as the system plant we get the MATLAB/Simulink model in Figure 9. This model receives a control signal, and generates a new position. Inverting the dynamics result in a similarly defined model for a feedforward controller identified in Figure 10. Here, the model receives a defined trajectory vector comprised of the position, velocity, and acceleration (
The plant dynamics and the feedforward controller form the heart of the simulation as the rest of the system parameters will change relative to the different approaches. Both of these models are incorporated in the high-level system diagram illustrated in Figure 11. The high-level diagram also shows the user input starting from the right-hand-side, which is fed in multiple trajectory generation functions. The output of the desired trajectory function is then fed into both the feedback and the feedforward controllers. Here, the output of the controllers can be turned on or off as desired to evaluate the performance of the controllers. For deterministic control algorithms it is necessary to assume either full-state feedback, or non-stochastic signals on the output.
2.1 Classical tuning method
One can apply classical tuning methods to the feedback controllers such as PID, and LQR to treat this nonlinear system as a linear system. The PID controller in Figure 2 can serve as a foundation for multiple variants of a linear feedback controller by modifying the gain coefficients Ki, Kd, and Kp. Due to the ineffectiveness of linear control algorithms on the highly nonlinear VDP plant, the LQR feedback controller will be the only approach illustrated. Calculating the LQR coefficients are somewhat trivial in MATLAB/Simulink if the system description is in a compatible form and the MATLAB function call can be used: [K, S,e] = LQR (A,B,Q,R,N) , which gives the relevant Kd, and Kp values. In this instance, the system is described in state-space as shown in Eq. (18):
to get a resulting control law of:
or Eq. (22) when coupling the ideal forced VDP feedforward controller with LQR:
The resulting phase portrait with only incorporating the LQR gains implemented in Eq. (19) are displayed in the left side of Figure 12. It can be seen that the stability is affected such that the system steadily grows unbounded and not to an asymptotic limit which is desired for illustrating stability. Investigating the nearby initial conditions does not identify any condition that produces a stable system.
The right side of Figure 12 illustrates similar results from coupling the ideal nonlinear feedforward controller with the LQR feedback gains. Again, it can be seen that the system grows unbounded with no identified initial conditions providing stability. It can also be seen that the trends in the trajectories are influenced by both the LQR feedback only system and the ideal feedforward controller, although the feedforward controller in not robust enough to bring the system back to an asymptotic limit cycle.
2.2 Isolated trajectory fractionalization
With the results identified in Section 2.1 using linear feedback methods to compensate for trajectory tracking error of the nonlinear VDP system, it is necessary to try something different. One way is to incorporate isolated trajectory fractionalization (ITF). This train of thought assumes the ability of the transient trajectory to be fractured into sub components which can be elegantly stitched together to form the desired transient trajectory. There are many ways to stitch trajectories utilizing the mathematical principles of spline interpolation, Chebyshev polynomials, Lagrange polynomials, and the Runge Kutta methods  for example. Here we will illustrate the micro-transient generation to get a better performing transient trajectory via the 8-term Fourier Series fitting method. Under the theory that any periodic signal can be broken up to an infinite series of orthogonal basis functions based on a combination of sines and cosines  as defined by Eq. (21):
where Tn is the resulting sum of the functions used to replicate the originally desired trajectory. a0 is a steady state offset, while ai and bi are the corresponding constant coefficients for each orthogonal basis set.
In the case of the sinusoidally forced VDP system illustrated in Figure 7, the unwanted transient response is primarily within the first few cycles of oscillations then after the sharp initial transients the tracking error slowly reduces over time until the trajectory is a perfect match to the desired trajectory. One way to split this desired trajectory is to heuristically describe the sinusoid function that will smoothly and quickly travel between the initial conditions and the desired steady state response. Taking advantage of the a-priori knowledge of how sinusoidal function performs with the VDP system a simple sine function with a phase shift is chosen as described by Eq. (22):
where is a scaling coefficient proportional to the desired oscillatory radius in the phase plane. And the resulting derivative is:
Next, by using the Fourier decomposition described in Eq. (21), the resulting trajectory can be planned as illustrated in Figure 13. This initial attempt is fairly close but still allows for large residual error. Figure 13 displays the desired total trajectories for
To allow the fitting algorithm more flexibility, a large section isolated between the desired trajectories is introduced. The missing data is strategically placed such that the points where the 2nd derivative is zero. This allows for an extremely smooth function across the 1st derivative. Any sharp discontinuities can introduce large unwanted transients. The next iteration in the isolated trajectory fractionalization routine is illustrated in Figure 14.
The Fourier fit in Figure 14 was built with the following parameters:
where is the time step, is the initial time, is the end time of the first trajectory fraction, is the start of the 2nd trajectory fraction, and is the final time. The resulting parameters allow the fitting algorithm 1500 (1.5 second) time steps to find a sm0oth fit. The resulting trajectories form the desired phase plane trajectory being fed into the feedforward controller is illustrated on the left in Figure 15. The right side of Figure 15 illustrates the actual results on the output of the VDP system.
Utilizing the tracking error values along the entire transient trajectory an objective cost function can be defined and evaluated. In the case of the VDP system, the goal is to match the actual trajectory with the desired trajectory with respect to the phase plane. Therefore the parameters of interest are
where RMSe is the RMS value of the error along the entire trajectory on each component. This provides a single metric to evaluate how well the actual trajectory fits to the desired trajectory. Once evaluated, a value of J = 1.0604 is achieved using the isolated trajectory fractionalization technique, whereas the objective cost function for the sinusoidal forced VDP system is J = 4.9603. This shows a 78% reduction in trajectory tracking error via this technique.
2.3 Targeted exponential weighting
2.3.1 1-term targeted exponential weighting
The targeted exponential weighting method is another technique to deterministically modify transient trajectories, and is ideally applied to the extrema in the transients via a-priori knowledge of the system characteristics. By taking advantage of the mathematical approach of exponential decay (or growth), one can modify an existing control trajectory via the cyclical parameter tuning process shown in Figure 17.
The parameter tuning process outlined in Figure 17 starts with a-priori information of the system’s characteristic behavior, this is then used to tailor a trajectory for the system. The trajectory is fed through the feedforward controller/plant. The output is then evaluated along the trajectory and compared to the desired. If the error is greater than a given tolerance value then the new information gained from the input parameters are compared to what was previously known and a new set of parameters are generated for the next evaluation. This is repeated until the error is within the given tolerance or until a pseudo-global minimum is found.
Considering the forced Van der Pol system, again it is now known that the initial conditions give rise to the largest perturbations to the ideal trajectory of a perfect circle in the phase plane. With this in mind the targeted exponential weighting technique will be targeted for those initial transient behaviors in order to minimize their impact.
For the forced VDP system under investigation, the initial configuration will take the original forcing function, xd = Asin(t), as the ideal trajectory for the position. This will give the desired steady state response. Next, the derivative will be evaluated to get
It can be seen in Figure 18 that the most severe trajectory transients are in the initial states of
This approach takes the ideal forcing function for the Van der Pol oscillator, Asin(⍵ t), and by modifying the nominal derivative, ⍵ Acos(⍵ t), via an exponential dithering method (1 − e−Bt) manipulates the targeted trajectory section (in this case where t < 2) in order to attempt to minimize the unwanted micro-transients. Generating the three trajectories in
The benefits by utilizing the targeted exponential weighting method are immediately noticeable when comparing the original phase portrait of the forced VDP system in Figure 7, and the results using the isolated trajectory fractionalization method in Figure 15. Additionally, the objective cost function, J, decreases from J = 4.9603, to J = 1.0604, to J = 0.6328 for an overall reduction of 87% from implementing target exponential weighting algorithm. The next step will be to implement the same methodology on
2.3.2 2-term targeted exponential weighting
Using the same methodology in the 1-term method previously described, the exponential dithering will be applied to the 2nd derivative, . Starting with Eq. (29) where is of the form: A(1 − e−Bt)cos(t). The direct derivative is of the form: −Ae−Bt((e−Bt − 1)sin(t) − Bcos(t)). And making the assumption from a heuristic approach that in the dithering sine term to get Eq. (30).
Generating the three trajectories in
The 2-term targeted exponential weighting method does indeed show an improvement over the 1-term variation. The resulting improvement based on the objective cost function shows an extra 5% decrease in trajectory tracking error to get J = 0.3758, which results in an overall reduction in tracking error on the order of 92% over the baseline forced Van der Pol system.
This chapter presented a small set of deterministic approaches to transient trajectory generation with particular interest in minimizing unwanted micro-transients that may cause havoc on a control system performance. Beginning with a brief introduction to control theory terminology which introduced state-space notation, transfer functions, feedback controllers, and the concept of observers for estimating unknown system states. From there the Van der Pol oscillatory equation was introduced and presented as a system under test to apply the deterministic trajectory generation techniques to.
Using the Van der Pol oscillator, feedforward controllers were introduced through the forced Van der Pol system and initial results of system performance were discussed and evaluated through the use of phase portraits. The micro-transients in the baseline case, the forced Van der Pol system using a forcing function in the form of Asin(t), were illustrated and compared to three different approaches. The first approach utilized a common feedback technique for linear system, the linear quadratic regulator (LQR). This was shown to be unstable. The next approach utilized the isolated trajectory fractionalization technique which segmented the desired trajectory into sub-trajectories, and isolated the transition points and applying a Fourier fitting routine to stich the desired trajectories back together. This approach led to a reduction in trajectory error (as evaluated by an objective cost function) by 78%.
The final method was split into two slightly different variants. The first one presented was the 1-term targeted exponential weighting method. This method utilized a-priori knowledge of where the largest micro-transient response and applied dithering techniques to minimize those unwanted transients in the derivative of the desired trajectory. The second variation, the 2-term targeted exponential weighting method, applied a similar approach to the 2nd derivative of the desired trajectory. The resulting improvement over the baseline case was an 87, and 92% trajectory tracking error reduction.
For ease of comparison, the results are summarized in Table 1.
The author would like to acknowledge the technical guidance and leadership of Dr. Tim Sands, and the support of Air Force Research Laboratory during the research and writing of this chapter.
Conflict of interest
The authors declare no conflict of interest. The views expressed in this paper are those of the author, and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government.