Iterative Learning - MPC: An Alternative Strategy

A repetitive system is one that continuously repeats a finite-duration procedure (operation) over time. This kind of system can be found in several industrial fields such as robot manipulation (Tan, Huang, Lee & Tay, 2003), injection molding (Yao, Gao & Allgower, 2008), batch processes (Bonvin et al., 2006; Lee & Lee, 1999; 2003) and semiconductor processes (Moyne, Castillo & Hurwitz, 2003). Because of this repetitive characteristic, these systems have two count indexes or time scales: one for the time running within the interval each operation lasts, and the other for the number of operations or repetitions in the continuous sequence. Consequently, a control strategy for repetitive systems must account for two different objectives: short-term disturbance rejection during a finite-duration single operation in the continuous sequence (this frequently means the tracking of a predetermined optimal trajectory), and long-term disturbance rejection from operation to operation (i.e., considering each operation as a single point of a continuous process). Since the continuous process essentially repeats the operations (assuming that long-term disturbances are negligible), the key point in developing a control strategy that accounts for the second objective is to use the information from previous operations to improve the tracking performance of the future sequence.

Fig. 1. Feedback-based ILC diagram. Here, continuous lines denote the sequence used during the i-th trial and dashed lines denote the sequence that will be used in the next iteration.
The purpose of an ILC algorithm is then to find a unique input sequence u_∞ which minimizes the tracking error.
The ILC formulation uses an iterative updating formula for the control sequence. This formula can be categorized according to how the information from previous iterations is used. Thus, (Norrlöf, 2000), among other authors, gives the following definition. DEFINITION 0.1. An ILC updating formula that only uses measurements from the previous iteration is called a first-order ILC. On the other hand, when the ILC updating formula uses measurements from more than the previous iteration, it is called a high-order ILC.
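To make Definition 0.1 concrete, the sketch below implements a first-order ILC update of the common form u_{i+1}(k) = u_i(k) + L·e_i(k) on a scalar first-order plant; the plant, the learning gain L and the horizon are illustrative assumptions, not taken from the chapter.

```python
# First-order ILC sketch: the trial-(i+1) input is corrected using the
# tracking error measured during trial i only (illustrative plant and gains).
def run_trial(u, a=0.3, b=0.5, x0=0.0):
    """Simulate one finite-duration trial of x+ = a*x + b*u, y = x."""
    x, y = x0, []
    for uk in u:
        x = a * x + b * uk
        y.append(x)
    return y

def first_order_ilc(u_prev, e_prev, L=0.8):
    """u_{i+1}(k) = u_i(k) + L * e_i(k): uses the previous trial only."""
    return [uk + L * ek for uk, ek in zip(u_prev, e_prev)]

T_f = 20
y_ref = [1.0] * T_f
u = [0.0] * T_f                      # same initial input guess every run
peak_errors = []
for i in range(30):                  # iteration (run) index i
    y = run_trial(u)                 # time index k runs inside the trial
    e = [r - yk for r, yk in zip(y_ref, y)]
    peak_errors.append(max(abs(ek) for ek in e))
    u = first_order_ilc(u, e)
```

For this scalar case the update contracts the error whenever |1 − L·b| < 1, which the values above satisfy, so the peak tracking error shrinks from one trial to the next.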
The ILC approach relies on the following postulates: 1. Every iteration ends at a fixed discrete time of duration T_f.

2. The plant dynamics are invariant throughout the iterations.
3. The reference or set-point, y r , is given a priori.
4. For each trial or run the initial states are the same, that is, x_i(0) = x_0(0), i ≥ 0.
5. The plant output y(k) is measurable.
6. There exists a unique input sequence, u ∞ , that yields the plant output sequence, y, with a minimum tracking error with respect to the set-point, e ∞ .
Regarding the last postulate, we now present the key concept of perfect control. It is interesting to note that the impossibility of achieving discrete perfect control, at least for discrete nominal non-delayed linear models, is exclusively related to the input and/or state limits, which are always present in real systems and should be consistent with the control problem constraints. In this regard, a system with slow dynamics might require large input values and input increments to track an abrupt output reference change, thereby activating the constraints. If we assume a non-delayed linear model without model mismatch, the perfect control sequence can be found as the solution of an unconstrained open-loop optimization problem. For the constrained case, on the other hand, the best possible input sequence, i.e., u_∞, is obtained from the same problem subject to the input limits collected in the set U, which will be discussed later.
A non-evident consequence of the theoretical concept of perfect control is that only a controller that takes the input constraints into account can actually approach perfect control, i.e., approximate it up to the point where some of the constraints become active. A controller which does not account for constraints can keep the system away from those limits only by means of a conservative tuning. This fact opens the possibility of applying a constrained Model Predictive Control (MPC) strategy to this kind of problem.
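The role of the input limits can be seen in a toy example: for a scalar plant x⁺ = a·x + b·u with y = x (an illustrative stand-in for the models discussed here), the perfect control input is obtained by model inversion, and saturating that input at a limit leaves a residual tracking error exactly when the reference change is too abrupt.

```python
# Perfect control by model inversion for x+ = a*x + b*u, y = x (illustrative):
# u_k = (y_ref[k] - a*x_k)/b gives exact tracking one step ahead, but an
# input limit forces a residual error whenever the inversion exceeds it.
def perfect_control(y_ref, a=0.9, b=0.5, x0=0.0, u_max=None):
    x, errors = x0, []
    for r in y_ref:
        u = (r - a * x) / b                      # unconstrained perfect input
        if u_max is not None:
            u = max(-u_max, min(u_max, u))       # input limit activates here
        x = a * x + b * u
        errors.append(abs(r - x))
    return errors

y_ref = [1.0] * 10                   # abrupt step demands a large first input
e_unc = perfect_control(y_ref)               # no limits: exact tracking
e_sat = perfect_control(y_ref, u_max=1.0)    # limit active at the step
```

With the limit, the first required input (2.0) is clipped to 1.0, so the first error is 0.5; once the constraint deactivates, exact tracking is recovered.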

MPC overview
As was already said, a promising strategy for achieving good performance in an iterative learning scheme is constrained model predictive control, or receding horizon control. This strategy solves, at each time step, an optimization problem to obtain the control action to be applied to the system. The optimization attempts to minimize the difference between the desired variable trajectories and a model-based forecast of the system variables, subject to the variable constraints (Camacho & Bordons, 2009). So, the first stage in designing an MPC is to choose a model. Here, the linear model is given by

x_{k+1} = A x_k + B u_k,
y_k = C x_k + d_k,
d_{k+1} = d_k,

where x_k ∈ R^n is the state at time k, u_k ∈ R^m is the manipulated input, y_k ∈ R^l is the controlled output, A, B and C are matrices of appropriate dimension, and d_k ∈ R^l is an integrating output disturbance (González, Adam & Marchetti, 2008).
Furthermore, and as a part of the system description, input (and possibly input increment) constraints of the form u_k ∈ U are considered, where the set U collects the input sequence limits. A simplified version of the optimization problem that a typical stable MPC solves on-line (at each time k) penalizes the stage cost ℓ(e, u) := ||e||²_Q + ||u||²_R and the terminal cost F(e) := ||e||²_P, where the matrices Q and R are such that Q > 0 and R ≥ 0. Furthermore, a terminal constraint of the form x_{k+N|k} ∈ Ω, where Ω is a specific set, is usually included to assure stability. In this general context, some conditions should be fulfilled by the different "components" of the formulation (i.e., the terminal penalization matrix P, the terminal set Ω, etc.) to achieve closed-loop stability and recursive feasibility² (Rawlings & Mayne, 2009). In the next sections, this basic formulation will be modified to account for learning properties in the context of repetitive systems.
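A minimal receding-horizon sketch for a scalar instance of this setup follows; the box-constrained quadratic program is solved by plain projected gradient descent purely to keep the example self-contained (a practical implementation would call a QP solver), and the plant parameters, weights, horizon and limits are all illustrative assumptions.

```python
# Receding-horizon MPC sketch for x+ = a*x + b*u, y = x, with |u| <= u_max.
# At each time k a finite-horizon cost sum Q*e^2 + R*u^2 is minimized over
# the input sequence; only the first optimal input is applied (receding).
def mpc_action(x0, y_ref, N, a=0.9, b=0.5, Q=1.0, R=0.01,
               u_max=1.0, iters=400, step=0.1):
    u = [0.0] * N
    for _ in range(iters):           # projected gradient on the box-QP
        xs, x = [], x0               # forward-simulate the predictions
        for uk in u:
            x = a * x + b * uk
            xs.append(x)
        # gradient of sum_j Q*(x_j - r_j)^2 + R*u_j^2 w.r.t. each u_m
        g = [2 * R * um for um in u]
        for j, xj in enumerate(xs):
            ej = xj - y_ref[j]
            for m in range(j + 1):
                g[m] += 2 * Q * ej * (a ** (j - m)) * b
        u = [max(-u_max, min(u_max, um - step * gm))
             for um, gm in zip(u, g)]
    return u[0]                      # apply only the first input move

x, N = 0.0, 5
ref = [1.0] * 30
traj = []
for k in range(25):
    uk = mpc_action(x, ref[k:k + N], N)
    x = 0.9 * x + 0.5 * uk           # plant matches the nominal model here
    traj.append(x)
```

The first move saturates at the input limit (the constraint is active while the error is large) and the closed loop then settles close to the reference; with a terminal cost and terminal set added, this becomes the stable formulation discussed above.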

Problem index definition
As was previously stated, the control strategy proposed in this chapter consists of a basic MPC with learning properties. To clarify the notation used along the chapter (which comes from the ILC and the MPC literature), we start by defining the following index variables: • i: the iteration or run index, where i = 0 is the first run. It goes from 0 to ∞.
² Recursive feasibility refers to the guarantee that, once a feasible initial condition is provided, the controller will guide the system through a feasible path.
• k: the discrete time within a single run. For a given run, it goes from 0 to T_f − 1 (that is, T_f time instants).
• j: the discrete time for the MPC predictions. For a given run i and a given time instant k, it goes from 0 to H = T_f − k. To clearly state that j represents the time of a prediction made at a given time instant k, the notation k + j|k, which is usual in the MPC literature, will be used.
The control objective for an individual run i is to find an input sequence that yields an output sequence as close as possible to an output reference trajectory. Furthermore, assume that for a given run i there exists an input reference sequence (an input candidate), u^i_r, and that the output disturbance profile, d^i, is known. During the learning process the disturbance profile is assumed to remain unchanged for several operations. Furthermore, the value u^{i,r}_{T_f−1} represents a stationary input value, consistent with the final value of the output reference.

Convergence analysis
In the context of repetitive systems, we will consider two convergence analyses: DEFINITION 0.3 (Intra-run convergence). It concerns the decrease of a Lyapunov function (associated with the output error) along the run time k, that is, V(y^i_{k+1} − y^r_{k+1}) ≤ V(y^i_k − y^r_k) for k = 1, . . . , T_f − 1, for every single run. If the execution of the control algorithm goes beyond T_f, with k → ∞, and the output reference remains constant at the final reference value (y^r_k = y^r_{T_f} for T_f ≤ k < ∞), then intra-run convergence concerns the convergence of the output to the final value of the output reference trajectory (y^i_k → y^r_{T_f} as k → ∞). This convergence was proved in (González et al., 2009a) and is presented in this chapter.
DEFINITION 0.4 (Inter-run convergence). It concerns the convergence of the output trajectory to the complete reference trajectory from one run to the next, that is, considering the output of a given run as a vector of T_f components, y^i → y^r as i → ∞.

Frontiers in Advanced Control Systems

Basic formulation
In this subsection, a first approach to a new MPC design, which includes learning properties, is presented. It will be assumed that an appropriate input reference sequence u^i_r is available (otherwise, it is possible to use a null constant value), and that the disturbance d^i_k as well as the states x^i_{k|k} are estimated. Given that the operation lasts T_f time instants, a shrinking output horizon is assumed here, defined as the distance between the current time k and the final time T_f, i.e., H = T_f − k (see Figure 2). Under these assumptions, the optimization problem P1 to be solved at time k, as part of the single run i, uses an (also shrinking) control horizon N_s given by N_s = min(H, N), where N is the fixed control horizon introduced before (it is in fact a controller parameter). Notice that predictions with indexes given by k + H|k, which are equivalent to T_f|k, are in fact predictions for a fixed future time (in the sense that the horizon does not recede). Because this formulation contains some new concepts, a few remarks are needed to clarify the key points: Remark 0.1. In the i-th operation, T_f optimization problems P1 must be solved (from k = 0 to k = T_f − 1). Each problem gives an optimal input sequence u^{i,opt}_{k+j|k}, for j = 0, · · · , H − 1, and, following the typical MPC policy, only the first input of the sequence, u^{i,opt}_{k|k}, is applied to the system.
Remark 0.2. The decision variables u^i_{k+j|k} are a correction to the input reference sequence u^{i,r}_{k+j} (see Equation (12)), attempting to improve the closed-loop predicted performance. u^{i,r}_{k+j} can be seen as the control action of an underlying stabilizing controller acting along the whole output horizon, which can be corrected, if necessary, by the control actions u^i_{k+j|k}. Besides, because of constraints (13), u^i_{k+j|k} is different from zero only in the first N_s steps (or predictions), and so the optimization problem P1 has N_s decision variables (see Figure 2). All along every single run, the input and output references, u^{i,r}_{k+j} and y^r_{k+j}, as well as the disturbance d^i_{k+j}, may be interpreted as a set of fixed parameters.
Remark 0.3. The convergence analysis for the operation sequence assumes that, once the disturbance appears, it remains unchanged for the operations that follow. In this way the cost remains bounded although it represents an infinite summation; this happens because the model used to compute the predictions leads to a final input (and state) that matches the final values of the references.
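The horizon bookkeeping in Problem P1 and Remark 0.2 can be sketched in a few lines (T_f and N below are arbitrary illustrative values):

```python
# Shrinking horizons: at time k the output horizon is H = T_f - k (predictions
# always reach the fixed final time T_f), while the control horizon
# N_s = min(H, N) gives the number of free correction moves u[k+j|k].
T_f, N = 10, 4
Ns_list = []
for k in range(T_f):
    H = T_f - k                  # output horizon shrinks as k advances
    N_s = min(H, N)              # control horizon saturates at the parameter N
    corrections = [0.0] * N_s    # decision variables added to u_ref[k+j]
    Ns_list.append(N_s)
```

Until k = T_f − N the controller keeps its full N decision variables; afterwards the count shrinks, reaching a single free move at the final step.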

Decreasing properties of the closed-loop cost for a single run
The concept of stability for a finite-duration process is different from the traditional one since, except for some special cases such as finite-time escape, boundedness of the disturbance effect is trivially guaranteed. In (Srinivasan & Bonvin, 2007), the authors define a quantitative concept of stability through a variability index: the induced norm of the variation around a reference (state) trajectory caused by a variation in the initial condition. Here, we will show two controller properties (Theorem 0.1): 1) the optimal IHMPC cost monotonically decreases with respect to time k, and 2) if the control algorithm execution goes beyond T_f with k → ∞, and the output reference remains constant at the final reference value (y^r_k = y^r_{T_f} for k ≥ T_f), then the IHMPC cost goes to zero as k → ∞, which implies that y^i_k → y^r_{T_f} as k → ∞.
Theorem 0.1 (intra-run convergence). Assume that the disturbance remains constant from one run to the next. Then, for the system (3)-(5) and the constraint (6), by using the control law derived from the on-line execution of problem P1 in a shrinking-horizon manner, the cost is decreasing, that is,

Furthermore, the last cost of a given operation i can be written explicitly and, since current and one-step predictions coincide with the actual values, it follows that this cost depends only on actual variables. Proof. See the Appendix.
Remark 0.4. The cost V^{i,opt}_k of Problem P1 is not a strict Lyapunov function, because the output horizon is not fixed and so V^{i,opt}_k(e^i_k) changes as k increases (in fact, as k increases the cost becomes less demanding because the output horizon is smaller). However, if a virtual infinite output horizon for predictions is defined, and stationary values of the output and input references are assumed for k ≥ T_f, the problem can be seen as one with a fixed (infinite) output horizon. In this way V^{i,opt}_k(e^i_k) becomes a Lyapunov function, since it is an implicit function of the actual output error e^i_k. To make the terminal cost the infinite tail of the output predictions, it must be defined accordingly.

Discussion about the stability of the closed-loop cost for a single run
Theorem 0.1, together with the assumptions of Remark 0.4, shows convergence characteristics of the Lyapunov function defined by the IHMPC strategy. These concepts can be extended to determine a variability index in order to establish a quantitative concept of stability (β-stability), as highlighted in (Srinivasan & Bonvin, 2007). To formulate this extension, MPC stability conditions (rather than convergence conditions) must be established, following the stability results presented in (Scokaert et al., 1997). An extension of this remark is shown below.
First, we will recall the following exponential stability results.
Theorem 0.2 ((Scokaert et al., 1997)). Assume for simplicity that a state reference x^r_k is provided, such that y^r_k = C x^r_k, for k = 0, . . . , T_f − 1, and that no disturbance is present. If there exist constants a_x, a_u, b_u, c_x, c_u and d_x such that the stage cost ℓ(x, u), the terminal cost F(x), and the model matrices A, B and C in Problem P1 fulfill conditions (15)-(16), then the closed-loop system is exponentially stable. Proof. The proof of this theorem can be seen in (Scokaert et al., 1997).
Condition (15) is easy to verify in terms of the eigenvalues of matrices Q and R. Condition (16), which is related to the Lipschitz continuity of the input, holds true under certain regularity conditions of the optimization problem. Now, we define the following variability index, which is an induced norm similar to the one presented in (Srinivasan & Bonvin, 2007), computed for a small value of δ > 0. With this definition, the concept of β-stability for finite-duration systems is as follows.
DEFINITION 0.5 ((Scokaert et al., 1997)). The closed-loop system obtained with the proposed IHMPC controller is intra-run β-stable around the state trajectory x^r_k if there exists δ > 0 such that ξ ≤ β. Theorem 0.3 (quantitative β-stability). Assume for simplicity that a state reference x^r_k is provided, such that y^r_k = C x^r_k, k = 0, . . . , T_f − 1, and that no disturbance is present. If there exist constants a_x, a_u, b_u, c_x, c_u and d_x as in Theorem 0.2, then the closed-loop system obtained with system (3)-(5) and the proposed IHMPC control law is intra-run β-stable around the state trajectory x^r_k. Proof. See the Appendix.
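The variability index can be estimated empirically. The sketch below does so for a scalar linear closed loop x⁺ = (a − bK)x, where the plant and the feedback gain K are illustrative assumptions standing in for the IHMPC loop:

```python
# Empirical variability index (sketch): induced norm of the trajectory
# perturbation caused by perturbing the initial condition by delta.
def variability_index(a, b, K, delta=1e-3, T_f=50):
    a_cl = a - b * K                 # closed-loop pole of the illustrative loop
    x_nom, x_pert, xi = 0.0, delta, 0.0
    for _ in range(T_f):
        x_nom = a_cl * x_nom         # nominal (reference) trajectory
        x_pert = a_cl * x_pert       # trajectory from the perturbed x0
        xi = max(xi, abs(x_pert - x_nom) / delta)
    return xi
```

For a stable scalar loop (|a − bK| < 1) the index equals |a − bK|, so the loop is β-stable around the reference trajectory for any β ≥ |a − bK|.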

IHMPC with learning properties
In the last section we studied the single-operation control problem, where we assumed that an input reference is available and the output disturbance is known. However, one alternative is to define the input reference and disturbance as the input and disturbance obtained during the last operation (i.e., the last implemented input and the last estimated disturbance, beginning with a constant sequence and a zero value, respectively). In this way,

a dual MPC with learning properties, accounting for the operation-sequence control, can be derived. The details of this development are presented next.

Additional MPC constraints to induce learning properties
For a given operation i, consider the problem P1 with the following additional constraints, where d̂^{i−1}_{k+j} represents the disturbance estimation. The first constraint requires updating the input reference for operation i with the last optimal sequence executed in operation i − 1 (i.e., u^i_r = u^{i−1}, for i = 1, 2, · · · , with an initial value given by u^0_r := [G^{−1} y^r_{T_f} · · · G^{−1} y^r_{T_f}]). The second one updates the disturbance profile for operation i with the last estimated sequence in operation i − 1 (i.e., d^i = d̂^{i−1}, for i = 1, 2, · · · , with an initial value given by d^0 = [0 · · · 0]).
Besides, notice that the elements of the vector of differences between two consecutive control trajectories, δ^i = u^i − u^{i−1}, are the collection of first control moves of the solutions of each optimization problem P1, for k = 0, · · · , T_f − 1.
Remark 0.5. The input reference update, together with the correction presented in Remark 0.2, has the following consequence: the learning procedure is not achieved by correcting the implemented input action with past information but by correcting the predicted input sequence with the past input profile, which represents here the learning parameter. In this way better output forecasts are made, because the optimization cost has predetermined input information. Figure 3 shows the difference between these two learning procedures. Remark 0.6. The proposed disturbance update implies that the profile estimated by the observer at operation i − 1 is not used at operation i − 1, but at operation i. This disturbance update works properly when the disturbance remains unmodified for several operations, i.e., when permanent disturbances, or model mismatch, are considered. If the disturbance changes substantially from one operation to the next (that is, if the disturbance magnitude or the time instant at which the disturbance enters the system changes), it is possible to use an additional "current" disturbance correction. This correction is then added to the permanent disturbance profile at each time k of operation i.
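Putting the two updates together, the sketch below runs a one-step-ahead constrained MPC within each run and applies the inter-run updates (input reference := previously implemented input; disturbance estimate := profile observed in the previous run) on a scalar plant with a constant unknown output disturbance. The scalar model, the horizon-one controller, the weights and the limits are illustrative simplifications of Problem P2, not the chapter's exact formulation.

```python
# Learning MPC sketch: within run i, a one-step MPC computes a correction du
# around the input reference; between runs, the input reference is set to the
# previously implemented input and the disturbance profile to the previously
# estimated one. Plant: x+ = a*x + b*u + d, y = x (illustrative values).
def learning_mpc(runs=15, T_f=25, a=0.9, b=0.5, d=0.2,
                 Q=10.0, R=1.0, u_max=2.0):
    y_ref = [1.0] * T_f
    u_ref = [0.0] * T_f              # initial input reference (constant)
    d_hat = [0.0] * T_f              # initial disturbance estimate (zero)
    run_errors = []
    for i in range(runs):
        x, u_impl, d_est, peak = 0.0, [], [], 0.0
        for k in range(T_f):
            # one-step MPC: minimize Q*(y+ - r)^2 + R*du^2 over du, with
            # the nominal model y+ = a*x + b*(u_ref + du) + d_hat
            free = y_ref[k] - a * x - b * u_ref[k] - d_hat[k]
            du = Q * b * free / (Q * b * b + R)
            u = max(-u_max, min(u_max, u_ref[k] + du))
            y = a * x + b * u + d    # true plant: disturbance d is unknown
            d_est.append(y - (a * x + b * u))   # observed output mismatch
            peak = max(peak, abs(y_ref[k] - y))
            x = y
            u_impl.append(u)
        run_errors.append(peak)
        u_ref, d_hat = u_impl, d_est  # learning updates between runs
    return run_errors

errors = learning_mpc()
```

With the control penalty R > 0, a single run leaves a residual error, but the run-to-run reference update drives the implemented input toward the perfect control input, so the peak error shrinks across runs.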


MPC formulation and proposed run cost
Let us consider the following optimization problem, P2, subject to (3)-(13) and (21). Run-to-run convergence means that both the output error trajectory e^i and the difference between two consecutive implemented inputs, δ^i = u^i − u^{i−1}, converge to zero as i → ∞. Following the Iterative Learning Control nomenclature, this means that the implemented input, u^i, converges to the perfect control input u_perf.
To show this convergence, we will define a cost associated with each run, which penalizes the output error. As was said, T_f MPC optimization problems are solved at each run i, that is, from k = 0 to k = T_f − 1. So, a candidate to describe the run cost, given in (22), is the sum over k of V^{i,opt}_k, the optimal cost of the on-line MPC optimization problem at time k, corresponding to run i.
Notice that, once the optimization problem P2 is solved and an optimal input sequence is obtained, this MPC cost is a function only of e^{i,opt}_{k|k} = y^{i,opt}_{k|k} − y^r_k = e^i_k. Therefore, it makes sense to use (22) to define a run cost, since it represents a (finite) sum of positive penalizations of the current output error, i.e., a positive function of e^i. However, since the new run index is made of output predictions rather than actual errors, some care must be taken. Firstly, as occurs with usual indexes, we should demonstrate that null output error vectors produce null costs (which is not trivial because of the predictions). Then, we should demonstrate that the perfect control input corresponds to a null cost. These two properties, together with an additional one, are presented in the next subsection.

Some properties of the formulation
One interesting point is to answer what happens if the MPC controller receives as input reference trajectory the perfect control sequence presented in the first section. The simplest answer is to associate this situation with a null MPC cost. However, since the proposed MPC controller does not add the input reference (given by the past control profile) to the implemented inputs but to the predicted ones, some care must be taken. Property 0.1, below, assures that for this input reference the MPC cost is null. Without loss of generality we consider in what follows that no disturbances enter the system.
Proof. It follows from Property 0.1 and Property 0.2.

Main convergence result
Now, we are ready to establish the run-to-run convergence with the following theorem.

Theorem 0.4 (inter-run convergence). For the system (3)-(5), by using the control law derived from the on-line execution of problem P2 in a shrinking-horizon manner, together with the learning updating (21), and assuming that a feasible perfect control input trajectory exists, the output error trajectory e^i converges to zero as i → ∞. In addition, δ^i converges to zero as i → ∞, which means that the reference trajectory u^i_r converges to u_perf.
Remark 0.7. In most real systems a perfect control input trajectory cannot be reached (which represents a system limitation rather than a controller limitation). In this case, the costs V^{i,opt}_k will converge to a non-null finite value as i → ∞, and then, since the operation cost J^i is decreasing (see the previous proof), it will converge to the smallest possible value. Given that, as was already said, the impossibility of reaching perfect control is exclusively related to the input and/or state limits (which should be consistent with the control problem constraints), the proposed strategy will find the best approximation to perfect control, which constitutes an important advantage of the method.
Remark 0.8. In the same way that the intra-run convergence can be extended to determine a variability index in order to establish a quantitative concept of stability (β-stability) for finite-duration systems (Theorem 0.3), the inter-run convergence can be extended to establish stability conditions similar to the ones presented in (Srinivasan & Bonvin, 2007).

Illustrative examples
Example 1. In order to evaluate the proposed controller performance, we first consider a linear system (Lee & Lee, 1997) given by G(s) = 1/(15s² + 8s + 1). The MPC parameters were tuned as Q = 1500, R = 0.5 and T = 1. Figure 4 shows the performance obtained in the controlled variable, which is indistinguishable from the reference. Given that the problem assumes that no information about the input reference is available, the implemented input sequence and the MPC correction are equal.
The MPC cost function is shown in Fig. 5. According to the proof of Theorem 0.1 (nominal case), this cost function is monotonically decreasing.
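The Example 1 plant can be reproduced approximately; one possible discretization is forward Euler with T = 1 (the chapter does not state which discretization was used, so this scheme is an assumption):

```python
# Forward-Euler discretization (sketch) of G(s) = 1/(15 s^2 + 8 s + 1),
# i.e. 15*y'' + 8*y' + y = u, with sampling time T = 1.
def simulate(u_seq, T=1.0):
    y, v, out = 0.0, 0.0, []         # v approximates dy/dt
    for u in u_seq:
        # simultaneous update: right-hand sides use the previous (y, v)
        y, v = y + T * v, v + T * (u - 8.0 * v - y) / 15.0
        out.append(y)
    return out

step = simulate([1.0] * 200)         # unit-step response of the model
```

The unit DC gain of G(s) shows up as the step response settling at 1 without overshoot, and the discretized poles (0.8 and 2/3 for T = 1) confirm the scheme is stable at this step size.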
Example 2. Consider now a nonlinear batch reactor where an exothermic and irreversible chemical reaction takes place (Lee & Lee, 1997). The idea is to control the reactor temperature by manipulating the inlet coolant temperature. Furthermore, the manipulated variable has minimum and maximum constraints given by Tc_min ≤ Tc ≤ Tc_max, where Tc_min = −25 [°C], Tc_max = 25 [°C], and Tc is written as a deviation variable. In addition, to show how the MPC controller works, it is assumed that prior information about the cooling jacket temperature (u = Tc) is available.
Here the proposed MPC was implemented and the MPC parameters were tuned as Q = 1000, R = 5 and T = 1 [min]. The nominal linear model used for predictions is the one proposed by (Adam, 2007). Figure 6 shows both the reference and the temperature of the batch reactor, expressed as deviation variables. Furthermore, the manipulated variable and the correction made by the MPC are shown.
Notice that: 1) the cooling jacket temperature reaches the maximum value and, as a consequence, the input constraint becomes active in the time interval from 41 to 46 minutes; 2) similarly, when the cooling jacket temperature reaches the minimum value, the other constraint becomes active in the time interval from 72 to 73 minutes; 3) the performance is quite satisfactory even though the problem is considerably nonlinear; and 4) given that prior information about the cooling jacket temperature is assumed to be available, the MPC correction is somewhat smaller than the implemented input (Fig. 6). Fig. 6. Temperature reference and controlled temperature of the batch reactor. Also shown are the cooling jacket temperature (u) and the MPC correction.
Example 3. In order to evaluate the proposed controller performance under model mismatch, we assume a true process given by G(s) = 1/(15s² + 8s + 1) and a nominal model given by G(s) = 0.8/(12s² + 7s + 1) (Lee et al., 2000; Lee & Lee, 1997). The sampling time adopted to develop the discrete state-space model is T = 1 and the final batch time is given by T_f = 90T. The proposed strategy achieves a good control performance in the first two or three iterations, with a rather reduced control horizon. The controller parameters are as follows: Q = 1500, R = 0.05, N = 5. Figure 7 shows the output response together with the output reference, and the implemented inputs and corrections, for the first and third iteration. At the first iteration, since the input reference is a constant null value, the implemented input and the correction are the same, and the output performance is quite poor (mainly because of the model mismatch). At the third iteration, however, given that a disturbance state is estimated from the previous run, the output response and the output reference are indistinguishable. As expected, the batch error is reduced drastically from run 1 to run 3, while the MPC cost is decreasing (as established in Theorem 0.1) for each run (Fig. 8a).
Notice that the MPC cost is normalized by its maximal value,

where V¹_max ≈ 1×10⁶ for the cost and ≈ 286.5 for the error norm. This shows that the run cost J^i decreases from one run to the next, as stated in Theorem 0.4. Finally, Fig. 8b shows the normalized norm of the error corresponding to each run.

Conclusion
In this chapter a different formulation of a stable IHMPC with learning properties applied to batch processes was presented. For the case in which the process parameters remain unmodified for several batch runs, the formulation allows a repetitive learning algorithm, which updates the control variable sequence to achieve nominal perfect control performance. Two extensions of the present work can be considered. The simpler one is the extension to linear time-variant (LTV) models, which would allow a better representation of the nonlinear behavior of batch processes. A second extension is to consider the robust case (e.g., by incorporating multi-model uncertainty into the MPC formulation). These two issues will be studied in future works.

Appendix

Proof of Theorem 0.1
Let u^{i,opt}_{k−1} and x^{i,opt}_{k−1} be the optimal input and state sequences that solve problem P1 at time k − 1, with k = 1, · · · , T_f − N (that is, the last N optimization problems of a given run i are not considered), and let V^{i,opt}_{k−1} be the corresponding cost. Notice that at time k − 1, H = T_f − k + 1, since H is a shrinking horizon. Now, let u^{i,feas}_k := [u^{i,opt}_{k|k−1}, . . . , u^{i,opt}_{k+N_s−2|k−1}, 0, . . . , 0] be a feasible solution to problem P1 at time k. Since no new input is injected into the system from time k − 1 to time k, and no unknown disturbance is considered, the state predicted at time k using the feasible input sequence coincides with the one predicted at time k − 1. Then, the cost at time k corresponding to the feasible solution u^{i,feas}_k is the tail of the cost at time k − 1. This means that the optimal cost at time k, which is not greater than the feasible one at the same time, satisfies V^{i,opt}_k ≤ V^{i,opt}_{k−1} − ℓ(e^{i,opt}_{k−1|k−1}, u^{i,opt}_{k−1|k−1}). Finally, notice that e^{i,opt}_{k−1|k−1} and u^{i,opt}_{k−1|k−1} represent actual (not only predicted) variables. This shows that, whenever the output error is different from zero, the cost decreases as time k increases.
Finally, the decreasing property for k = T f − N + 1, · · · , T f − 1, and the last part of the theorem, can be proved following similar steps as before (i.e., finding a feasible solution).

Proof of Theorem 0.3
Proof. From the recursive use of (25), together with (15), (19) and (20), we obtain, for k = 0, . . . , T_f − 2, a bound on the closed-loop trajectory. Therefore, since γ‖x_0‖^σ is a lower bound of V^{i,opt}_0 (that is, γ‖x_0‖^σ ≤ V^{i,opt}_0), the stated bound follows.

Proof of Property 0.1
Proof. (⇐) Let us assume that V^{i,opt}_k = 0, for k = 0, . . . , T_f − 1. Then, the optimal predicted output error and input are given by e^{i,opt}_{k+j|k} = 0, for j = 0, . . . , T_f, and u^{i,opt}_{k+j|k} = 0, for j = 0, . . . , T_f − 1, respectively. If e^{i,opt}_{k+j|k} and u^{i,opt}_{k+j|k} are null simultaneously, it follows that u^{i,r}_k = u^{perf}_k for k = 0, . . . , T_f − 1, since this is the only input sequence that produces a null predicted output error (otherwise, the optimization would necessarily find an equilibrium such that e^{i,opt}_{k+1|k} > 0 and u^{i,opt}_{k|k} > 0, provided that Q > 0 and R > 0 by hypothesis). Consequently, u^i_r = u_perf. (⇒) Let us assume that u^{i,r}_k = u^{perf}_k. Because of the definition of the perfect control input, the optimization problem without any input correction produces a sequence of null output error predictions, e^i_{k|k} = 0. Consequently, the optimal sequence of decision variables (predicted inputs) is u^{i,opt}_{k+j|k} = 0 for k = 0, . . . , T_f − 1 and j = 0, . . . , T_f − 1, since no correction is needed to achieve a null predicted output error. This means that V^{i,opt}_k = 0 for k = 0, . . . , T_f − 1.

Proof of Property 0.2
Proof. (⇒) Let us assume that e^i = 0. This means that e^{i,opt}_{k|k} = 0, for k = 0, . . . , T_f. Now, assume that the input reference vector is different from the perfect control input, u^i_r ≠ u_perf, and consider the output error predictions necessary to compute the MPC cost V^i_k:

e^i_{k|k} = 0,
e^i_{k+1|k} = C x^i_{k+1|k} − y^r_{k+1} = C(A x^i_{k|k} + B u^{i,r}_k + B u^i_{k|k}) − y^r_{k+1}, . . .

Since u^i_r is not the perfect control input, C(A x^i_{k|k} + B u^{i,r}_k) − y^r_{k+1} ≠ 0. Consequently (assuming that CB is invertible), the input u^{i,*}_{k|k} necessary to make e^{i,opt}_{k+1|k} = 0 is given by u^{i,*}_{k|k} = (CB)^{−1}(y^r_{k+1} − CA x^i_{k|k}) − u^{i,r}_k, which is a non-null value. However, the optimization will necessarily find an equilibrium solution such that e^{i,opt}_{k+1|k} > 0 and u^{i,opt}_{k|k} < u^{i,*}_{k|k}, since Q > 0 and R > 0 by hypothesis. This implies that³ e^{i,opt}_{k+1|k} = e^{i,opt}_{k+1|k+1} ≠ 0, contradicting the initial assumption of null output error.
From this reasoning, applied to the subsequent output errors, it follows that the only possible input reference achieving e^i = 0 is the perfect control input (u^i_r = u_perf). In this case, it follows that V^{i,opt}_k = 0, for k = 0, . . . , T_f (Property 0.1), and so J^i = 0.
³ Note that for the nominal case e^i_{k+1|k+1} = e^i_{k+1|k}.