Algorithms for LQR via Static Output Feedback for Discrete-Time LTI Systems

This chapter presents randomized and deterministic algorithms for the problem of LQR optimal control via static output feedback (SOF) for discrete-time systems. The randomized algorithm is based on a recently introduced randomized optimization method, the Ray-Shooting Method, which efficiently solves the global minimization problem of continuous functions over compact, non-convex, and unconnected regions. The randomized algorithm presented here has a proof of convergence in probability to the global optimum. The deterministic algorithm is based on the gradient method and thus can be proved to converge only to a local optimum. A comparison between the algorithms is provided, as well as the performance of the hybrid algorithm.


Introduction
The application of static output feedbacks (SOFs) for linear-quadratic regulators (LQR) is very attractive, since they are cheap and reliable and their implementation is simple and direct, because their components have a direct physical interpretation in terms of sensor amplification rates and actuator activation power. Moreover, the long-term memory of dynamic feedbacks is useless for systems subject to random disturbances, fast dynamic loadings, or random bursts and impulses, and the application of state feedbacks is not always possible due to the unavailability of full-state measurements (see, e.g., [1]). Also, the use of SOF avoids the need to reconstruct the state by a Kalman filter or by any other state reconstructor.
On the other hand, in practical applications, the entries of the needed SOFs are bounded, and since the problem of SOFs with interval-constrained entries is NP-hard (see [2,3]), one cannot expect the existence of a deterministic efficient (i.e., polynomial-time) algorithm for the problem. Randomized algorithms are thus natural solutions. The probabilistic and randomized methods for the constrained SOF problem and robust stabilization via SOFs (among other hard problems) are discussed in [4][5][6][7]. For a survey of the SOF problem see [8], and for a recent survey of the robust SOF problem see [9].

Preliminaries
Let a discrete-time system be given by

x_{k+1} = A x_k + B u_k,  y_k = C x_k,  (1)

where A ∈ ℝ^{p×p}, B ∈ ℝ^{p×q}, C ∈ ℝ^{r×p}, and x_0 ∈ ℝ^p. Let the LQR cost functional be defined by

J(x_0, K) = Σ_{k=0}^∞ (x_k^T Q x_k + u_k^T R u_k),  (2)

where Q > 0 and R > 0. Let u_k = −K y_k be the SOF, and let A_cl(K) := A − BKC denote the closed-loop matrix. Let 𝔻 denote the open unit disk, let 0 < α < 1, and let 𝔻_α denote the set of all z ∈ 𝔻 with |z| < 1 − α (where |z| is the absolute value of z). For a square matrix Z, we denote by σ(Z) its spectrum. For any rectangular matrix Z, we denote by Z^+ its Moore-Penrose pseudo-inverse. By ‖Z‖_F = trace(Z^T Z)^{1/2} we denote the Frobenius norm of Z, and by ‖Z‖ = (max σ(Z^T Z))^{1/2} we denote the spectral norm of Z. By L_Z = I − Z^+ Z and R_Z = I − Z Z^+ we denote the (left and right) orthogonal projections onto the spaces Ker(Z) and Ker(Z^+), respectively. For a topological space X and a subset U ⊂ X, we denote by int(U) the interior of U, i.e., the largest open set included in U. By Ū we denote the closure of U, i.e., the smallest closed set containing U, and by ∂U = Ū − int(U) we denote the boundary of U. Let S^{q×r} denote the set of all matrices K ∈ ℝ^{q×r} such that σ(A_cl) ⊂ 𝔻 (i.e., stable in the discrete-time sense), and let S_α^{q×r} denote the set of all matrices K ∈ ℝ^{q×r} such that σ(A_cl) ⊂ 𝔻_α. If the latter is nonempty, we say that A_cl is α-stable, and we call α the degree of stability. Let K ∈ S_α^{q×r} be given. Substitution of the SOF u_k = −K y_k = −K C x_k into (2) yields

J(x_0, K) = Σ_{k=0}^∞ x_k^T (Q + C^T K^T R K C) x_k.  (3)

Since Q + C^T K^T R K C > 0 and A_cl(K) is stable, it follows that the Stein equation

A_cl(K)^T P A_cl(K) − P + Q + C^T K^T R K C = 0  (4)

has a unique solution P > 0, given by

P = Σ_{k=0}^∞ (A_cl(K)^T)^k (Q + C^T K^T R K C) A_cl(K)^k.  (5)

Substitution of (4) into (3) yields J(x_0, K) = x_0^T P(K) x_0. Thus, when x_0 is known, we search for K ∈ S_α^{q×r} that minimizes the functional

J(x_0, K) = x_0^T P(K) x_0.  (6)

Let

σ_max(K) := ‖P(K)‖.  (7)

Thus, when x_0 is unknown, we search for K ∈ S_α^{q×r} such that σ_max(K) = ‖P(K)‖ is minimal.
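For concreteness, the quantities above can be evaluated numerically. The following sketch (assuming NumPy and SciPy; the helper names are ours, not the chapter's) solves the Stein equation (4) and computes J(x_0, K) and σ_max(K):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def closed_loop(A, B, C, K):
    """A_cl(K) = A - B K C for the SOF u_k = -K y_k."""
    return A - B @ K @ C

def lqr_cost_matrix(A, B, C, K, Q, R):
    """Solve the Stein equation A_cl^T P A_cl - P + Q + C^T K^T R K C = 0.

    Returns the unique P > 0 when A_cl(K) is stable (spectral radius < 1).
    """
    Acl = closed_loop(A, B, C, K)
    W = Q + C.T @ K.T @ R @ K @ C
    # solve_discrete_lyapunov(a, q) solves a X a^H - X + q = 0, so pass A_cl^T.
    return solve_discrete_lyapunov(Acl.T, W)

def lqr_cost(A, B, C, K, Q, R, x0):
    """J(x0, K) = x0^T P(K) x0, the LQR cost when x0 is known."""
    P = lqr_cost_matrix(A, B, C, K, Q, R)
    return float(x0.T @ P @ x0)

def sigma_max(A, B, C, K, Q, R):
    """sigma_max(K) = ||P(K)|| (spectral norm), the objective when x0 is unknown."""
    P = lqr_cost_matrix(A, B, C, K, Q, R)
    return np.linalg.norm(P, 2)
```

For a scalar sanity check, with A = 0.5, B = C = Q = R = 1 and K = 0, equation (4) reads 0.25 P − P + 1 = 0, so P = 4/3 and σ_max(0) = 4/3.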
Note that if λ is an eigenvalue of A_cl(K) and v is a corresponding eigenvector, then (4) yields (1 − |λ|²) v* P v = v* (Q + C^T K^T R K C) v > 0, and thus minimizing σ_max(K) results in eigenvalues that get closer to the boundary of 𝔻. Since α, the degree of stability, is important for a satisfactory decay rate of the state to 0 and for disturbance rejection, we let the user of the algorithms determine α. Note that too high a value of α might result in the nonexistence of any SOF for the system, or complicate the search for a starting SOF. Higher values of α result in higher optimal values of the LQR functional, i.e., higher energy consumption for decaying the disturbance x_0 to 0. The functionals J(x_0, K) and σ_max(K) are generally not convex, since their domain of definition S_α^{q×r} (and therefore its closure S̄_α^{q×r}) is generally non-convex. Necessary conditions for optimality for continuous-time systems were given as three quadratic matrix equations (QMEs) in [15][16][17][18]. Necessary and sufficient conditions for optimality for continuous-time systems, based on linear matrix inequalities (LMIs), were given in [19][20][21]. However, algorithms based on these formulations are generally not guaranteed to converge, seemingly because of the non-convexity of the coupled matrix equations or inequalities, and when they converge, it is to a local optimum only.
In the sequel, we will use the following lemmas, given here without proofs.
Lemma 2.1. Let A ∈ ℝ^{p×q} and B ∈ ℝ^{p×r}. The equation AX = B has a solution iff A A^+ B = B, in which case the general solution is X = A^+ B + L_A Z, where Z is arbitrary. Similarly, the equation XA = B has a solution iff B A^+ A = B, in which case the general solution is X = B A^+ + Z R_A, where Z is arbitrary.

Lemma 2.2. We have:

1. Let A, B, X be matrices with sizes p × q, r × p, q × r, respectively. Then ∂/∂X trace(AXB) = A^T B^T.

2. Let A, B, C, X be matrices with sizes p × q, r × r, q × p, r × q, respectively. Then ∂/∂X trace(A X^T B X C) = B X C A + B^T X A^T C^T.

The randomized Ray-Shooting Method-based algorithm
The Ray-Shooting Method works as follows, for general function minimization: let f(x) ≥ 0 be a continuous function defined over some compact set X ⊂ ℝ^n. Let ϵ > 0 be given, and assume that we want to compute x* ∈ X such that y* := f(x*) is within ϵ of the global minimum of f over X. Choose x_0 ∈ X and set y_0 := f(x_0). Let D(0) := {(x, y) : x ∈ X, 0 ≤ y ≤ y_0} be the cylinder enclosed between X and the level y_0, let S(0) := {(x, y) ∈ D(0) : f(x) ≤ y} be the portion of D(0) above the graph of f, let S(ϵ) := {(x, y) ∈ S(0) : y ≤ y* + ϵ}, and let L(0) := {(x, y) : x ∈ X, y = 0, or x ∈ ∂X, 0 ≤ y ≤ y_0}. Let z_0 := (x_0, y_0), and note that z_0 ∈ S(0). Then, we choose w_0 in L(0) randomly, according to some distribution, and we define the ray z(t) := (1 − t) z_0 + t w_0, 0 ≤ t ≤ 1. We scan the ray and choose the largest 0 ≤ t_0 ≤ 1 such that (1 − t_0) z_0 + t_0 w_0 ∈ S(0) (actually, we scan the ray from t = 1 in equally spaced points and take the first t for which this happens). We define z_1 := (1 − t_0) z_0 + t_0 w_0 and update the sets S(0), D(0), and L(0) by replacing y_0 with y_1, where (x_1, y_1) = z_1. Let S(1), D(1), and L(1) denote the updated sets. We continue the process similarly from z_1 ∈ S(1), and we define a sequence z_n ∈ S(n), n = 0, 1, …. Note that S(ϵ) ⊂ S(n+1) ⊂ S(n) for any n = 0, 1, …, unless z_n ∈ S(ϵ) for some n (in which case the process is stopped). One can show that the sequence {z_n}_{n=0}^∞ converges (in probability) to a point in S(ϵ). Note that rays shot from points of local minimum have positive probability of hitting S(ϵ) (under the following mild assumption), because any global minimum is visible from any local minimum. Moreover, for a given level of certainty, we hit S(ϵ) in a finite number of iterations (see Remark 3.1 below). Practically, we may stop the algorithm if no improvement is detected within a window of 20% of the allowed number of iterations. The function need not be smooth or even continuous. It only needs to be well defined and measurable over the compact domain X, and S(ϵ) should have non-negligible measure (i.e., some positive volume).
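A minimal sketch of this scheme for a generic nonnegative function over a box follows. It is a simplification, and an assumption of ours, not the chapter's implementation: the target point w is drawn uniformly below the current level rather than exactly on L(n), and all names are hypothetical.

```python
import numpy as np

def ray_shooting_minimize(f, lo, hi, iters=300, samples=25, seed=0):
    """Simplified Ray-Shooting sketch over the box [lo, hi]^n.

    State is a point z = (x, y) with f(x) <= y.  We shoot a ray from z toward
    a random point w below the current level and keep the point of the ray
    farthest toward w that still lies in S (above the graph of f).
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x_cur = rng.uniform(lo, hi)
    y_cur = f(x_cur)
    x_best, f_best = x_cur, y_cur
    for _ in range(iters):
        wx = rng.uniform(lo, hi)            # random target below the level
        wy = rng.uniform(0.0, y_cur)
        # scan from t = 1 (the target) back toward z; take the first point in S
        for t in np.linspace(1.0, 0.0, samples, endpoint=False):
            x = (1 - t) * x_cur + t * wx
            y = (1 - t) * y_cur + t * wy
            fx = f(x)
            if fx <= y:                     # (x, y) lies above the graph of f
                x_cur, y_cur = x, y
                if fx < f_best:
                    x_best, f_best = x, fx
                break
    return x_best, f_best
```

The level y_cur is non-increasing by construction, mirroring the nested sets S(n+1) ⊂ S(n) of the text.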
Obviously, global minimum points belong to the boundary of the search space S(0), and such points are precisely where the distance between the compact sets X × {0} and S(0) in ℝ^{n+1} is attained. This is essential for the efficiency of the Ray-Shooting Method, although we raised the search-space dimension from n to n + 1.
In order to apply the Ray-Shooting Method to the LQR via SOF problem, we need the following definitions: assume that K(0) ∈ int(S_α) was found by the RS algorithm (see [10]) or by any other method (see [22][23][24]). Let h > 0, and let U(0) be a unit vector (actually a matrix, but we consider here the space of matrices as a normed vector space) with respect to the Frobenius norm. The intersection of S̄_α with the closed ball B̄(K(0), h) is compact (but generally not convex), and we wish to minimize the continuous function σ_max(K) (or the continuous function J(x_0, K), when x_0 is known) over this compact set. The Ray-Shooting Algorithm 1 for the LQR via SOF problem works as follows: we start with a point K(0) ∈ int(S_α), found by the RS algorithm (see [10]). Assuming that K* ∈ D(0), the inner loop (j = 1, …, n) uses the Ray-Shooting Method in order to find an approximation of the global minimum of the function over the cone D(0). The proof of convergence in probability of the inner loop and its complexity (under the above-mentioned assumption) can be found in [10] (see also [11]). In the inner loop, we choose a search direction by choosing a point F in R_∞(ϵ), the base of the cone D(0). Next, in the innermost loop (k = 0, …, s), we scan the ray K(t) := (1 − t) K(0) + tF and record the best controller on it. Repeating this sufficiently many times, we reach K* (or an ϵ-neighborhood of it) with high probability, under the assumption that K* ∈ D(0) (see Remark 3.1).
The reasoning behind the Ray-Shooting Method is that sampling the whole search space would amount to the probabilistic method, which is doomed to the "curse of dimensionality"; this is exactly what the method tries to avoid. The avoidance is achieved by slicing the search space into covering cones (m is the number of cones allowed), because any point in a cone is visible from its vertex. At each cone we shoot rays (n is the number of rays per cone) from its vertex toward its base, where each ray is sampled from its head toward its tail, while updating the best point found so far. Note that the global minimum of σ_max(K) over any compact subset of S_α is achieved on the boundary of the related portion of the epigraph of σ_max(K). Therefore, we can break the innermost loop at the first moment we find an improvement in σ_max(K). This bypasses the need to sample the whole search space (although we raise the search-space dimension by 1) and explains the efficiency of the Ray-Shooting Method in finding a global optimum. Another advantage of the Ray-Shooting Method, specific to the problem of LQR via SOF, is that the search is concentrated in the parameter space (the qr-dimensional space where K resides) and not in the certificate space (the p²-dimensional space where the Lyapunov matrices P reside). Thus, the method avoids the need to solve any Riccati equations, LMIs, or BMIs, which can make a crucial difference for large-scale systems (i.e., where p² ≫ qr).

Algorithm 1. The Ray-Shooting Algorithm for LQR via SOF for discrete-time systems.
Require: An algorithm for deciding α-stability, an algorithm for computing σ_max(K), and algorithms for general linear-algebra operations.
choose F ∈ R_∞(ϵ), uniformly at random

Remark 3.1. In [12] it is shown that, with a suitable choice of the parameters m, n, s, the total number of arithmetic operations of the RS algorithm that guarantees hitting S(ϵ) with probability at least 1 − δ is polynomial, for systems with q ≤ q_0, r ≤ r_0 for fixed q_0, r_0, where r_ϵ is the radius of the base of a cone with height ϵ that has the same volume as S_α(0)(ϵ); see [10][11][12]. This is a polynomial-time algorithm by restricting the input and by regarding r_∞/r_ϵ as the size of the problem.
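The loop structure of Algorithm 1 can be sketched compactly as follows. This is our own simplified variant, under stated assumptions: the cone base R_∞(ϵ) is replaced by a random point on the sphere of radius h around the current center, re-centering at the best point found mimics lines 24-26 of the listing, and all function names are ours.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def sigma_max_sof(A, B, C, K, Q, R, alpha=1e-3):
    """||P(K)|| if A - B K C is alpha-stable, else +inf."""
    Acl = A - B @ K @ C
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0 - alpha:
        return np.inf
    P = solve_discrete_lyapunov(Acl.T, Q + C.T @ K.T @ R @ K @ C)
    return np.linalg.norm(P, 2)

def rs_sof(A, B, C, Q, R, K0, h=2.0, m=10, n=20, s=20, seed=0):
    """Simplified ray-shooting search for an LQR-optimal SOF near K0."""
    rng = np.random.default_rng(seed)
    K_best, v_best = K0, sigma_max_sof(A, B, C, K0, Q, R)
    K_center = K0
    for _ in range(m):                      # cones
        for _ in range(n):                  # rays per cone
            U = rng.standard_normal(K0.shape)
            F = K_center + h * U / np.linalg.norm(U)   # random target at distance h
            for t in np.linspace(1.0 / s, 1.0, s):     # sample the ray K(t)
                K = (1 - t) * K_center + t * F
                v = sigma_max_sof(A, B, C, K, Q, R)
                if v < v_best:              # record the best feasible controller
                    K_best, v_best = K, v
        K_center = K_best                   # re-center at the best point so far
    return K_best, v_best
```

On the scalar system A = 1.2, B = C = 1 with Q = R = 1 (where SOF coincides with state feedback), the minimum of σ_max over stabilizing gains equals the DARE value, approximately 1.9523, which the sketch recovers.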

The deterministic algorithm
The deterministic algorithm we introduce here as Algorithm 2 (which we call the MC algorithm) generalizes the algorithm of Daniel D. Moerder and Anthony A. Calise (see [15]) to the case of discrete-time systems. To the best of our knowledge, this is the best algorithm for LQR via SOF published so far, in terms of rate of convergence (to a local minimum).
Here, we wish to minimize the LQR functional

J(x_0, K) = trace(P x_0 x_0^T)  (8)

under the constraints

A_cl(K)^T P A_cl(K) − P + Q + C^T K^T R K C = 0,  P > 0.  (9)

Since Y^T = Y, there exists an orthogonal matrix U such that Ŷ = U^T Y U is diagonal. Now, minimizing (8) under the constraints (9) is equivalent to minimizing

L(K, P, S) = trace(P x_0 x_0^T) + trace(S (A_cl(K)^T P A_cl(K) − P + Q + C^T K^T R K C))  (10)

under the constraint P > 0, where the entries Ŝ_{i,i} are the Lagrange multipliers. We have L(K, P, S) = J(x_0, K) whenever (9) holds, for any K, any P > 0, and any S such that S^T = S. The necessary conditions for optimality are

∂L/∂S = A_cl(K)^T P A_cl(K) − P + Q + C^T K^T R K C = 0,
∂L/∂P = A_cl(K) S A_cl(K)^T − S + x_0 x_0^T = 0,

from which S = Σ_{k=0}^∞ A_cl(K)^k x_0 x_0^T (A_cl(K)^T)^k, where the last passage is justified because σ(A_cl) ⊂ 𝔻. Note that this, together with the stability of A_cl, implies that S ≥ 0. We also have

∂L/∂K = 2((R + B^T P B) K C S C^T − B^T P A S C^T).

Therefore, if C S C^T is invertible, then

K = (R + B^T P B)^{−1} B^T P A S C^T (C S C^T)^{−1}.  (11)

Otherwise, if

B^T P A S C^T (C S C^T)^+ (C S C^T) = B^T P A S C^T,  (12)

which is equivalent to the solvability of (R + B^T P B) K C S C^T = B^T P A S C^T, then

K = (R + B^T P B)^{−1} B^T P A S C^T (C S C^T)^+ + Z R_{C S C^T},

where Z is an arbitrary q × r matrix (and we may take Z = 0, unless some other constraints on K are needed). Note that if condition (12) does not hold, then ∂L/∂K ≠ 0. We conclude with the following theorem:

Theorem 4.1. Assume that L(K, P, S) given by (10) is minimized locally at some point K*, P* > 0, and S* such that S*^T = S*. Then

A_cl(K*)^T P* A_cl(K*) − P* + Q + C^T K*^T R K* C = 0,
A_cl(K*) S* A_cl(K*)^T − S* + x_0 x_0^T = 0,
K* = (R + B^T P* B)^{−1} B^T P* A S* C^T (C S* C^T)^+ + Z* R_{C S* C^T},

where A_cl(K*) = A − B K* C.

Proof:
Since L(K*, P*, S*) is minimal in some neighborhood of (K*, P*, S*), it follows that ∂L/∂K(K*, P*, S*) = 0, ∂L/∂P(K*, P*, S*) = 0, and ∂L/∂S(K*, P*, S*) = 0. Since σ(A_cl(K*)) ⊂ 𝔻, it follows that I_p ⊗ I_p − A_cl(K*)^T ⊗ A_cl(K*)^T is invertible, and therefore the first two equations of the theorem have unique solutions P* and S*. The expression for K* follows from Lemma 2.1, where Z* is some q × r matrix. ■

Note that the equations are coupled tightly, in the sense that P* and S* need K*, while K* needs P* and S*. Note also the cubic dependencies (which can be made quadratic by introducing new variables). These make the related QMEs non-convex and, therefore, hard to compute.

Remark 4.1. When x_0 is unknown, it is customary to assume that x_0 is uniformly distributed on the unit sphere, which implies that E[x_0 x_0^T] = (1/p) I_p, where E[•] is the expectation operator (the scalar factor 1/p does not affect the minimizer). Thus, changing the problem to that of minimizing E[J(x_0, K)] amounts to replacing x_0 x_0^T with a multiple of I_p in the equation defining S*. Therefore, only a minor change in Algorithm 2 is needed.

Remark 4.2. The convergence of Algorithm 2 to a local minimum can be proved similarly to the proof appearing in [15], under the assumptions that S_α^{q×r} is nonempty and that (Q^{1/2}, A) is detectable (here, we do not need the latter condition because of the assumption that Q > 0). The convergence can actually be proved for the more general problem that adds ‖K‖_F² = trace(K^T K) to the LQR functional, thus minimizing also the Frobenius norm of K. In this context, note that adding ‖K‖² = max σ(K^T K) to the LQR functional instead would break the argument, and a more general proof would be needed, because the proof appearing in [15] demands a C¹-smooth function of K, while ‖K‖² = σ_max(K^T K) is continuous but not everywhere differentiable. The RS algorithm can use any continuous function of K and can also deal with sparse SOFs for LQR and with regional pole-placement SOFs for LQR.

Example 4.1. In the following simple example, we illustrate the notions appearing in the definition of the RS algorithm, and we demonstrate its operation.
Consider the unstable system, where we look for an SOF K stabilizing the system while reducing the LQR functional. Applying the Y. Bistritz stability criterion (see [25]) yields a set of expressions, where υ is the number of sign variations in the set. According to the Bistritz criterion, the system is stable if and only if υ = 0. We conclude that S is the set of all K = [k_1 k_2] such that

(5/2)k_1 − k_2 − 3/2 > 0,  −(3/2)k_1 + 2k_2 + 2 > 0,  (1/2)k_1 − 3k_2 + 3/2 > 0,

or

(5/2)k_1 − k_2 − 3/2 < 0,  −(3/2)k_1 + 2k_2 + 2 < 0,  (1/2)k_1 − 3k_2 + 3/2 < 0,

where the last branch is empty (which could have made the set non-convex). The set S appears in Figure 1 as the blue region, where the golden star is the analytic globally optimal solution K* = [1.09473459  0.36138828] (computed by the related discrete algebraic Riccati equation).
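Benchmarks of this kind can be computed with SciPy's discrete algebraic Riccati solver. The sketch below uses a stand-in second-order unstable system; the matrices are illustrative assumptions of ours, not the example's data.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Placeholder unstable second-order system (illustrative values only).
A = np.array([[2.0, 1.0],
              [0.0, 0.5]])
B = np.array([[1.0],
              [1.0]])
Q = np.eye(2)
R = np.eye(1)

# Optimal LQR state feedback u_k = -F x_k from the DARE:
# P = A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A + Q
P = solve_discrete_are(A, B, Q, R)
F = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# The closed loop A - B F is stable (spectral radius < 1).
rho = np.max(np.abs(np.linalg.eigvals(A - B @ F)))
```

When C = I (full state measurement), this F is the globally optimal "SOF" and P gives the optimal cost, which is how the golden-star benchmark of the example is obtained.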

Algorithm 2. The MC algorithm for LQR via SOF for discrete-time systems.
Require: An algorithm for deciding α-stability, an algorithm for computing σ_max(K), and algorithms for general linear-algebra operations.
Input: 0 < ϵ ≤ 1/2, 0 < α < 1, integers m, s > 0, a controllable pair (A, B), and …

In Figure 1, we can see how the RS algorithm works: we fix α = 10⁻³, ϵ = 10⁻¹⁶, r_∞ = 2, h = 2, and we set m = 1, n = 5, s = 10 for a single iteration, where the single cone is sampled along 5 rays and each ray is sampled 10 times. The sampled points are circled, where red circles indicate infeasible or non-improving points and black circles indicate improving points. The green star is the initial point K(0) found by the Ray-Shooting algorithm for minimal-norm SOF. The bold black circle is the boundary of the closed ball B̄(K(0), h). We choose U(0) randomly to define the search direction, and we set L(0) to be the point where the direction meets the boundary of the ball. L is the tangent line to the ball at L(0), and R_∞(ϵ) is the segment of width 2r_∞ on the line, inflated by ϵ. The search cone D(0) is just the portion of the blue region inside the triangle, and we can see that the assumption that K* ∈ D(0) is in force. For the current problem ⌈e√(2πqr)⌉ = 10, and therefore, by making 10 iterations, K* will be inside some triangle almost surely.
The algorithm chooses F in the base of the triangle and defines K(t) to be the ray from K(0) to F. The ray is sampled at 10 equally spaced points, and the best feasible point is recorded.
In Figure 2, we can see that 5 iterations suffice to include K* in some triangle and to find improving points very close to K*. In Figure 3, we can see that when we allow 20 iterations, after 10 iterations the center K(0) is switched to the best point found so far (see lines 24-26 in Algorithm 1). This is done in order to raise the probability of hitting K* or its ϵ-neighborhood, and as we can see, the final best point (green star) is very close to K* (Figure 4).
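For comparison, the MC iteration of Algorithm 2 can be sketched in a self-contained way, under our reading of the optimality conditions of Theorem 4.1; the damping factor gamma and all helper names are assumptions of ours, not the chapter's listing.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def mc_step(A, B, C, K, Q, R, X0):
    """One update toward the stationarity conditions of Theorem 4.1:
    solve the two Stein equations for P and S, then recompute K."""
    Acl = A - B @ K @ C
    # A_cl^T P A_cl - P + Q + C^T K^T R K C = 0
    P = solve_discrete_lyapunov(Acl.T, Q + C.T @ K.T @ R @ K @ C)
    # A_cl S A_cl^T - S + X0 = 0
    S = solve_discrete_lyapunov(Acl, X0)
    M = R + B.T @ P @ B
    K_new = np.linalg.solve(M, B.T @ P @ A @ S @ C.T) @ np.linalg.pinv(C @ S @ C.T)
    return K_new, P

def mc_algorithm(A, B, C, Q, R, K0, steps=200, gamma=0.2):
    """Damped fixed-point iteration; damping keeps A_cl stable along the way."""
    K = K0
    X0 = np.eye(A.shape[0])   # x0 unknown: E[x0 x0^T] proportional to I_p
    for _ in range(steps):
        K_new, _ = mc_step(A, B, C, K, Q, R, X0)
        K_try = K + gamma * (K_new - K)
        if np.max(np.abs(np.linalg.eigvals(A - B @ K_try @ C))) < 1.0:
            K = K_try
    return K
```

With C = I, the stationary point coincides with the optimal LQR state feedback from the DARE, which gives a convenient correctness check.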
The results of the algorithm for 1, 5, and 20 iterations are as follows. Note that σ_max(K*) = 5.9551, and note the huge variations the function σ_max(K) exhibits. For m = 1, the MC algorithm makes no improvement, while the RS and RS + MC algorithms attain values very close to the globally optimal value, with a slightly better value for RS + MC than for RS.
For m = 5 we had …

In Figure 5, the initial-condition response of the open-loop system is given. One can see the unstable mode related to the unstable eigenvalue 2. In Figures 6-8, the initial-condition responses of the closed-loop systems are given; each shows the response with the SOF computed by the respective algorithm (blue) compared with the globally optimal response (red).

Experiments
In the following experiments, we applied Algorithms 1 and 2 to systems taken from the libraries [26][27][28]. The systems given in these libraries are real-life continuous-time systems. In order to get related discrete-time systems, we sampled the systems using the Tustin method with sampling period T_s = 0.01 [sec]. We took only the systems for which the RS algorithm succeeded in finding SOFs for the continuous-time systems (see [10], Table 8, p. 231). In order to initialize the MC algorithm, we also used the RS algorithm to find a starting α-stabilizing SOF. In all the experiments, we used m = 2⌈e√(2πqr)⌉, n = 100, s = 100, h = 100, r_∞ = 100, ϵ = 10⁻¹⁶ for the RS algorithm, and m = 200⌈e√(2πqr)⌉, s = 100 for the MC algorithm, in order to get the same total number of iterations and the same number s = 100 of iterations for the local search. We took Q = I_p, R = I_q in all cases.
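The discretization step can be reproduced with SciPy's `cont2discrete` (`method='bilinear'` is the Tustin transform). The continuous-time system below is an illustrative stand-in, not one of the benchmark models.

```python
import numpy as np
from scipy.signal import cont2discrete

# Continuous-time model with one unstable mode (illustrative values only).
Ac = np.array([[0.0, 1.0],
               [2.0, -0.1]])
Bc = np.array([[0.0],
               [1.0]])
Cc = np.array([[1.0, 0.0]])
Dc = np.zeros((1, 1))

Ts = 0.01  # sampling period [sec], as in the experiments
# Tustin: Ad = (I - Ts/2 Ac)^{-1} (I + Ts/2 Ac)
Ad, Bd, Cd, Dd, _ = cont2discrete((Ac, Bc, Cc, Dc), Ts, method='bilinear')
```

The Tustin map sends the open right half-plane to the exterior of the unit disk, so an unstable continuous-time system remains unstable after sampling, which is why the discrete benchmarks still require stabilization.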
The stability-margin column of Table 1 refers to the 0 < α < 1 for which the absolute value of every eigenvalue of the closed loop is less than 1 − α. The values of α in Table 1 are the largest 0 < α < 1 for which the RS algorithm succeeded in finding a starting SOF K(0). As we saw above, it is worth searching for a starting point K(0) that maximizes 0 < α < 1. This can be achieved efficiently by running a binary search on 0 < α < 1 and using the RS algorithm as an oracle. Note that the RS CPU time appearing in the fourth column of Table 1 refers to running the RS algorithm for the known optimal value of 0 < α < 1. The RS algorithm is sufficiently fast also for this purpose, but other algorithms such as HIFOO (see [24]) and HINFSTRUCT (see [29]) can be applied in order to get a starting SOF. The advantage of using the RS algorithm is that it finds a starting SOF with relatively small norm.

Let σ_max(F) denote the functional (7) for the system (A, B, I_p), where A − BF is stable, i.e., F ∈ S^{q×p}. Let P(F) denote the Lyapunov matrix (5) for the system (A, B, I_p) with F as above. Let σ_max(K) denote the functional (7) for the system (A, B, C) with K ∈ S^{q×r} and related Lyapunov matrix P = P(K) given by (5). Now, if A − BKC is stable for some K, then A − BF is stable for F = KC (but there might exist F such that A − BF is stable that cannot be written as KC for any q × r matrix K). Therefore, σ_max(F*) ≤ σ_max(K*), where F* is an optimal LQR state-feedback (SF) controller for the system (A, B, I_p). We conclude that σ_max(F*) ≤ σ_max(K*) ≤ σ_max(K_best). Thus, σ_max(F*) is a lower bound for σ_max(K_best) and can serve as a good estimator for it, in order to quantify the convergence of the algorithm to the global minimum (as is evident from Table 1 in many cases) and in order to stop the algorithm earlier, since σ_max(F*) can be calculated in advance. The lower bound appears in the last column of Table 1.
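The binary search on α can be sketched as follows; the oracle interface `find_sof` is hypothetical, and any α-stabilization routine, such as the RS algorithm, can play its role.

```python
def max_alpha(find_sof, tol=1e-3):
    """Binary search for the largest degree of stability alpha for which the
    oracle find_sof(alpha) returns a feasible SOF (or None if infeasible).

    Returns (alpha, K) for the best feasible alpha found, or None.
    Assumes feasibility is monotone: if alpha is feasible, so is any smaller one.
    """
    lo, hi = 0.0, 1.0
    best = None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        K = find_sof(mid)
        if K is not None:
            best, lo = (mid, K), mid   # feasible: try a larger alpha
        else:
            hi = mid                   # infeasible: shrink the interval
    return best
```

Each oracle call is one full stabilization run, so about log2(1/tol), roughly 10, runs suffice for three decimal digits of α.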
For all the systems, we had (A, B) and (A^T, C^T) controllable, except for ROC1 and ROC2. All the systems are unstable, except for AC6, AC15, and NN16, which are stable but not α-stable, for the α given in the stability-margin column.

Conclusions from the experiments
Regarding the experimental results in Table 2 and the comparison between the RS algorithm and the MC algorithm, we conclude:

1. The RS algorithm performs an order of magnitude better than the MC algorithm for the systems AC1, AC11, HE1, HE4, ROC1, ROC4, TF1, and NN5.
2. The MC algorithm performs an order of magnitude better than the RS algorithm for the systems AC5 and DIS5.
3. The MC algorithm performs slightly better than the RS algorithm for the systems HE3 and NN16.
4. The MC algorithm achieved a value of σ_max(K_best) very close to the lower bound for the systems AC6, HE3, DIS5, NN5, and NN16.
As expected, the MC algorithm seems to perform better locally, while the RS algorithm seems to perform better globally. Thus, in practice, the best approach is to apply the RS algorithm in order to find a close neighborhood of a global minimum and then to apply the MC algorithm to the result for the local optimization, as is evident from the performance of the RS + MC algorithm.

Some specific results
Example 5.1. For the HE4 system with

Concluding remarks
The Ray-Shooting Method is a powerful tool, since it practically solves the problem of LQR via SOF for real-life discrete-time LTI systems. The proposed hybrid algorithm RS + MC has good performance in terms of run-time, in terms of the quality of controllers (by reducing the starting point's LQR functional value and by reducing the controller norm), and in terms of the success rate in finding a starting point feasible with respect to the needed α-stability. The RS + MC algorithm has a proof of convergence in probability to a global minimum (as is evident from the experiments). This enlarges the practicality and scope of the Ray-Shooting Method in solving hard-complexity control problems, and we expect further results in this direction.