Distributionally Robust Optimization

This chapter presents a class of distributionally robust optimization problems in which a decision-maker has to choose an action in an uncertain environment. The decision-maker has a continuous action space and aims to learn her optimal strategy. The true distribution of the uncertainty is unknown to the decision-maker. This chapter provides alternative ways to select a distribution based on empirical observations of the decision-maker. This leads to a distributionally robust optimization problem. Simple algorithms, whose dynamics are inspired by gradient flows, are proposed to find local optima. The method is extended to a class of optimization problems with orthogonal constraints and coupled constraints over the simplex set and polytopes. The designed dynamics do not use the projection operator and are able to satisfy both upper- and lower-bound constraints. The convergence rate of the algorithm to a generalized evolutionarily stable strategy is derived using a mean regret estimate. Illustrative examples are provided.


Introduction
Robust optimization can be defined as the process of determining the best or most effective result, using a quantitative measurement system, under worst-case uncertain functions or parameters. The optimization may occur in terms of best robust design, net cash flows, profits, costs, benefit/cost ratio, quality-of-experience, satisfaction, end-to-end delay, completion time, etc. Other measurement units may be used, such as units of production or production time, and optimization may then aim at maximizing production units or minimizing processing time.

Consider a decision-maker with a payoff function $r(a, \omega)$ that depends on an action $a \in A$ and on an uncertain parameter $\omega \in \Omega$. The decision-maker runs several trials and obtains statistical realizations of $\omega$ from measurements. The measurement data can be noisy, imperfect and erroneous. An empirical distribution (or histogram) $\hat m$ is then built from the realizations of $\omega$. However, $\hat m$ is not the true distribution of the random variable $\omega$, and $\hat m$ may not be a reliable measure due to statistical, bias, measurement, observation or computational errors. Therefore, the decision-maker is facing a risk. The risk-sensitive decision-maker should choose an action that improves the performance of $\mathbb{E}_m[r(a, \omega)]$ among alternative distributions $m$ within a certain level of deviation $r > 0$ from the distribution $\hat m$. The distributionally robust optimization problem is therefore formulated as

$$\sup_{a \in A} \; \inf_{m \in B_r(\hat m)} \; \mathbb{E}_{\omega \sim m}[r(a, \omega)], \qquad (1)$$

where $B_r(\hat m)$ is the uncertainty set of alternative admissible distributions within a certain radius $r > 0$ of $\hat m$. Two distributional uncertainty sets are presented: the $f$-divergence and the Wasserstein metric, defined below.
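To make the formulation concrete, here is a small numerical sketch of problem (1) in which every ingredient is hypothetical: the payoff is $r(a, \omega) = -(a - \omega)^2$, the empirical measure is a sample histogram, and the uncertainty set is simply a hand-built finite family of candidate measures. The sup-inf is approximated by grid search.

```python
import numpy as np

# A toy instance of problem (1) (all ingredients hypothetical): the payoff is
# r(a, w) = -(a - w)^2, m_hat is represented by a sample, and a small finite
# family of shifted samples plays the role of the uncertainty set B_r(m_hat).
rng = np.random.default_rng(0)

def r(a, omega):
    return -(a - omega) ** 2

samples = rng.normal(1.0, 0.3, size=1000)          # noisy realizations of omega

# m_hat itself plus two mean-shifted variants stand in for B_r(m_hat).
candidates = [samples, samples + 0.1, samples - 0.1]

# Approximate sup_a inf_m E_m[r(a, omega)] by a grid search over actions.
actions = np.linspace(-1.0, 3.0, 401)
worst_case = np.array([min(r(a, c).mean() for c in candidates) for a in actions])
a_star = actions[worst_case.argmax()]
```

Because the shifts are symmetric around the sample mean, the worst-case optimal action here coincides (up to grid resolution) with the empirical-mean action, while the worst-case value is lower than the nominal one.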

f-divergence
We introduce the notion of $f$-divergence, which will be used to quantify the discrepancy between probability distributions.

Definition 1. Let $m$ and $\hat m$ be two probability measures over $\Omega$ such that $m$ is absolutely continuous with respect to $\hat m$. Let $f$ be a convex function with $f(1) = 0$. Then the $f$-divergence between $m$ and $\hat m$ is defined as

$$D_f(m \,\|\, \hat m) = \int_\Omega f\!\left(\frac{dm}{d\hat m}\right) d\hat m,$$

where $\frac{dm}{d\hat m}$ is the Radon-Nikodym derivative of the measure $m$ with respect to the measure $\hat m$. By Jensen's inequality,

$$D_f(m \,\|\, \hat m) \ge f\!\left(\int_\Omega \frac{dm}{d\hat m}\, d\hat m\right) = f(1) = 0.$$

Thus $D_f(m \,\|\, \hat m) \ge 0$ for any such convex function $f$. Note, however, that the $f$-divergence $D_f(m \,\|\, \hat m)$ is not a distance (for example, it does not satisfy the symmetry property). Here the distributional uncertainty set imposed on the alternative distribution $m$ is

$$B_r(\hat m) = \big\{ m \ge 0 : \; D_f(m \,\|\, \hat m) \le r \big\}.$$

From the notion of $f$-divergence one can derive the following important special cases:
• the $\alpha$-divergence, obtained for $f_\alpha(t) = \frac{t^\alpha - \alpha t + \alpha - 1}{\alpha(\alpha - 1)}$;
• in particular, the Kullback-Leibler divergence (or relative entropy) is retrieved as $\alpha$ goes to 1.

Distributionally Robust Optimization. http://dx.doi.org/10.5772/intechopen.76686
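A minimal discrete-case sketch of Definition 1, using the KL generator $f(t) = t \log t$ and the $\alpha$-divergence generator above (the normalization of $f_\alpha$ is the one assumed in this chapter's convention): for probability vectors, the $\alpha$-divergence approaches the KL divergence as $\alpha \to 1$.

```python
import numpy as np

def f_divergence(m, m_hat, f):
    # Discrete case of Definition 1: D_f(m || m_hat) = sum_i m_hat_i * f(m_i / m_hat_i).
    m, m_hat = np.asarray(m, float), np.asarray(m_hat, float)
    return float(np.sum(m_hat * f(m / m_hat)))

# KL divergence generator: f(t) = t*log(t), with f(1) = 0.
kl = lambda t: t * np.log(t)

def f_alpha(t, alpha):
    # alpha-divergence generator (assumed normalization, f_alpha(1) = 0).
    return (t**alpha - alpha * t + alpha - 1) / (alpha * (alpha - 1))

m_hat = np.array([0.25, 0.25, 0.25, 0.25])
m = np.array([0.4, 0.3, 0.2, 0.1])

d_kl = f_divergence(m, m_hat, kl)
d_a = f_divergence(m, m_hat, lambda t: f_alpha(t, 1.001))  # ~ KL as alpha -> 1
```

The nonnegativity guaranteed by Jensen's inequality, the vanishing at $m = \hat m$, and the $\alpha \to 1$ limit can all be checked directly on these vectors.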

Wasserstein metric
The Wasserstein metric between two probability distributions $m$ and $\hat m$ is defined as follows.

Definition 2. For $m, \hat m \in \mathcal{P}(\Omega)$, let $\Pi(m, \hat m)$ be the set of all couplings between $m$ and $\hat m$, that is, the set of probability measures $\pi$ on $\Omega \times \Omega$ whose first and second marginals are $m$ and $\hat m$, respectively. The $L^\theta$-Wasserstein distance is

$$W_\theta(m, \hat m) = \left( \inf_{\pi \in \Pi(m, \hat m)} \int_{\Omega \times \Omega} d(\omega, \omega')^\theta \, \pi(d\omega, d\omega') \right)^{1/\theta}.$$

It is well known that for every $\theta \ge 1$, $W_\theta(m, \hat m)$ is a true distance in the sense that it satisfies the following three axioms:
• positive-definiteness,
• the symmetry property,
• the triangle inequality.
Note that $m$ is not necessarily absolutely continuous with respect to $\hat m$. The distributional uncertainty/constraint set is now the set of all probability distributions within an $L^\theta$-Wasserstein distance $r$ of $\hat m$:

$$\hat B_r(\hat m) = \big\{ m \in \mathcal{P}(\Omega) : \; W_\theta(m, \hat m) \le r \big\}.$$

Note that if $\hat m$ is a random measure (obtained from sampled realizations), we use the expected value of the Wasserstein metric.
Example 2. The $L^\theta$-Wasserstein distance between two Dirac measures $\delta_{\omega_0}$ and $\delta_{\omega_0'}$ is $W_\theta(\delta_{\omega_0}, \delta_{\omega_0'}) = d(\omega_0, \omega_0')$. More generally, for $K \ge 2$, the $L^2$-Wasserstein distance between the empirical measures $\frac{1}{K}\sum_{k=1}^K \delta_{\omega_k}$ and $\frac{1}{K}\sum_{k=1}^K \delta_{\omega_k'}$ is obtained by optimally coupling the two samples.

We have defined $B_r(\hat m)$ and $\hat B_r(\hat m)$. The goal now is to solve (1) under both the $f$-divergence and the Wasserstein metric. One of the difficulties of the problem is the curse of dimensionality: the distributionally robust optimization problem (1) is an infinite-dimensional robust optimization problem because $B_r$ is of infinite dimension. Below we show that (1) can be transformed into an optimization of the form sup-inf-sup. The latter problem has three alternating terms, and solving it requires a triality theory.
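On the real line, with equal sample sizes and equal weights, the optimal coupling simply matches sorted samples, which gives a compact way to evaluate $W_\theta$ between empirical measures (a sketch of the setting of Example 2; the 1-D sorting formula is specific to this equal-weight case):

```python
import numpy as np

def wasserstein_empirical(x, y, theta=2):
    # For two equal-size empirical measures (1/K)*sum_k delta_{x_k} on the real
    # line, the optimal coupling matches sorted samples, so
    #   W_theta^theta = (1/K) * sum_k |x_(k) - y_(k)|^theta.
    x, y = np.sort(x), np.sort(y)
    return float(np.mean(np.abs(x - y) ** theta)) ** (1.0 / theta)

# Two Dirac measures: the distance reduces to d(omega_0, omega_0') (Example 2).
assert wasserstein_empirical(np.array([1.0]), np.array([3.5])) == 2.5
```

The three metric axioms of Definition 2 can be checked numerically on small samples.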

Triality theory
We first present the duality gap and develop a triality theory to solve equivalent formulations of (1). Consider uncoupled domains $A_i$, $i \in \{1, 2, 3\}$. For a general function $r_2$, one has

$$\sup_{a_2 \in A_2} \inf_{a_1 \in A_1} r_2(a_1, a_2) \le \inf_{a_1 \in A_1} \sup_{a_2 \in A_2} r_2(a_1, a_2),$$

and the difference between the two sides is called the duality gap. As is widely known in duality theory, by Sion's theorem [1] (an extension of von Neumann's minimax theorem) the duality gap vanishes, for example, for a convex-concave function, and the value is achieved by a saddle point in the case of non-empty convex compact domains.
Triality theory focuses on optimization problems of the forms sup-inf-sup or inf-sup-inf. The term triality is used here because there are three alternating terms in these optimizations.
Proposition 1. Let $(a_1, a_2, a_3) \mapsto r_3(a_1, a_2, a_3) \in \mathbb{R}$ be a function defined on the product space $\prod_{i=1}^3 A_i$. Then the following inequalities hold:

$$\sup_{a_2} \inf_{(a_1, a_3)} r_3 \;\le\; \inf_{a_3} \sup_{a_2} \inf_{a_1} r_3 \;\le\; \inf_{(a_1, a_3)} \sup_{a_2} r_3, \qquad (3)$$

and similarly

$$\sup_{(a_2, a_3)} \inf_{a_1} r_3 \;\le\; \sup_{a_3} \inf_{a_1} \sup_{a_2} r_3 \;\le\; \inf_{a_1} \sup_{(a_2, a_3)} r_3. \qquad (4)$$

Proof. Define $\hat g(a_2, a_3) := \inf_{a_1 \in A_1} r_3(a_1, a_2, a_3)$. Thus, for all $a_2, a_3$, one has $\hat g(a_2, a_3) \le r_3(a_1, a_2, a_3)$. It follows that, for any $a_1, a_3$,

$$\sup_{a_2} \hat g(a_2, a_3) \le \sup_{a_2} r_3(a_1, a_2, a_3).$$

Using the definition of $\hat g$ and taking the infimum over $a_1$ yields, for every $a_3$,

$$\sup_{a_2} \inf_{a_1} r_3(a_1, a_2, a_3) \le \inf_{a_1} \sup_{a_2} r_3(a_1, a_2, a_3). \qquad (5)$$

Now we use two operations on the variable $a_3$:

• Taking the infimum of inequality (5) over $a_3$ yields

$$\inf_{a_3} \sup_{a_2} \inf_{a_1} r_3 \le \inf_{(a_1, a_3)} \sup_{a_2} r_3,$$

which proves the second part of the inequalities (3). The first part of the inequalities (3) follows immediately from (5) applied to $\hat g$, with $a_3$ playing the role of the inner variable.

• Taking the supremum of inequality (5) over $a_3$ yields

$$\sup_{(a_2, a_3)} \inf_{a_1} r_3 \le \sup_{a_3} \inf_{a_1} \sup_{a_2} r_3,$$

which proves the first part of the inequalities (4). The second part of the inequalities (4) follows immediately from (5), with $(a_2, a_3)$ playing the role of the maximizing variable.

This completes the proof.
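The inequality chains of Proposition 1 can be checked numerically on finite grids, where sup and inf become max and min over tensor axes (a sketch with a randomly generated $r_3$):

```python
import numpy as np

rng = np.random.default_rng(1)
# r3(a1, a2, a3) evaluated on finite grids, stored as an array indexed (a1, a2, a3).
r3 = rng.normal(size=(6, 7, 8))

# Chain (4): sup_{a2,a3} inf_{a1} <= sup_{a3} inf_{a1} sup_{a2} <= inf_{a1} sup_{a2,a3}
lhs4 = r3.min(axis=0).max()
mid4 = r3.max(axis=1).min(axis=0).max()
rhs4 = r3.max(axis=(1, 2)).min()

# Chain (3): sup_{a2} inf_{a1,a3} <= inf_{a3} sup_{a2} inf_{a1} <= inf_{a1,a3} sup_{a2}
lhs3 = r3.min(axis=(0, 2)).max()
mid3 = r3.min(axis=0).max(axis=0).min()
rhs3 = r3.max(axis=1).min()
```

Both chains hold for an arbitrary array, exactly as the proposition asserts for arbitrary functions.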

Equivalent formulations
Below we explain how the dimensionality of problem (1) can be significantly reduced by means of the triality inequalities of Proposition 1.

f-divergence
Interestingly, the distributionally robust optimization problem (1) under $f$-divergence is equivalent to a finite-dimensional stochastic optimization problem (when $A$ is of finite dimension). A full understanding of problem (6) requires a triality theory (not a duality theory). The use of triality theory leads to the following equation:

$$\sup_{a \in A} \; \inf_{m \in B_r(\hat m)} \; \mathbb{E}_{\omega \sim m}[r(a, \omega)] \;=\; \sup_{a \in A, \; \lambda \ge 0, \; \mu \in \mathbb{R}} \; \mathbb{E}_{\omega \sim \hat m}\, h(a, \lambda, \mu, \omega), \qquad (7)$$

where $h$ is the integrand

$$h(a, \lambda, \mu, \omega) = -\lambda\big(r + f(1)\big) - \mu - \lambda f^*\!\left(-\frac{r(a, \omega) + \mu}{\lambda}\right),$$

and $f^*$ is the Legendre-Fenchel transform of $f$, defined by $f^*(\xi) = \sup_{t \ge 0}\{\xi t - f(t)\}$. Note that the right-hand side of (7) is of dimension $n + 2$ (the action $a \in \mathbb{R}^n$ plus the two scalar multipliers $\lambda$ and $\mu$), which reduces considerably the dimensionality of the original problem (1).
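For the KL divergence ($f(t) = t \log t$, $f^*(\xi) = e^{\xi - 1}$), optimizing out $\mu$ in the dual reduction collapses the inner problem to a one-dimensional search over $\lambda$; the closed form below is the standard KL-ball dual, shown here as a sketch on a finite sample space:

```python
import numpy as np

# Numerical sketch of the dual reduction (7) for the KL divergence
# (f(t) = t*log t, f*(xi) = e^{xi-1}); with mu optimized out, the inner
# problem collapses to the standard one-dimensional dual
#   inf_{KL(m||m_hat)<=rho} E_m[Z] = sup_{lam>0} { -lam*rho - lam*log E_m_hat[e^{-Z/lam}] }.
def kl_dro_value(z, m_hat, rho, lams=np.logspace(-2, 3, 2000)):
    def lam_obj(lam):
        u = -z / lam
        u_max = u.max()                      # log-sum-exp shift for stability
        lse = u_max + np.log(np.sum(m_hat * np.exp(u - u_max)))
        return -lam * rho - lam * lse
    return max(lam_obj(lam) for lam in lams)

z = np.array([1.0, 2.0, 4.0, 8.0])           # payoffs r(a, omega_i) for a fixed action
m_hat = np.full(4, 0.25)                     # empirical distribution

nominal = float(m_hat @ z)                   # 3.75
robust = kl_dro_value(z, m_hat, rho=0.1)     # worst case over the KL ball
```

As expected, the robust value sits between the worst realization and the nominal expectation, recovers the nominal value as the radius shrinks to zero, and decreases as the radius grows.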

Wasserstein metric
Similarly, the distributionally robust optimization problem under the Wasserstein metric is equivalent to a finite-dimensional stochastic optimization problem (when $A$ is a set of finite dimension). If the function $\omega \mapsto r(a, \omega)$ is upper semi-continuous and $(\Omega, d)$ is a Polish space, then the Wasserstein distributionally robust optimization problem is equivalent to

$$\sup_{a \in A, \; \lambda \ge 0} \; \Big\{ -\lambda r^\theta + \mathbb{E}_{\omega \sim \hat m} \inf_{\omega' \in \Omega} \big[ r(a, \omega') + \lambda\, d(\omega', \omega)^\theta \big] \Big\}. \qquad (9)$$

The next subsection presents algorithms for computing a distributionally robust solution from the equivalent formulations above.
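On a finite sample space the inner infimum of (9) is a minimum over finitely many points, so the reduction can be evaluated directly (a sketch with an assumed setup: $\Omega = \{\omega_0, \dots, \omega_3\}$, ground metric $d(i, j) = |\omega_i - \omega_j|$, and $\theta = 1$):

```python
import numpy as np

# Sketch of the Wasserstein dual (9) on a finite sample space:
#   inf_{W_1(m, m_hat) <= r} E_m[Z]
#     = sup_{lam >= 0} { -lam*r + E_{i~m_hat}[ min_j ( Z_j + lam*d(i, j) ) ] }
omegas = np.array([0.0, 1.0, 2.0, 3.0])
z = np.array([1.0, 2.0, 4.0, 8.0])            # payoffs r(a, omega_j), a fixed
m_hat = np.full(4, 0.25)

d = np.abs(omegas[:, None] - omegas[None, :]) # d[i, j] = |omega_i - omega_j|

def w1_dro_value(r, lams=np.linspace(0.0, 20.0, 2001)):
    best = -np.inf
    for lam in lams:
        inner = (z[None, :] + lam * d).min(axis=1)   # min_j Z_j + lam*d(i, j), per i
        best = max(best, -lam * r + float(m_hat @ inner))
    return best

nominal = float(m_hat @ z)                    # 3.75; recovered at radius r = 0
```

The dual objective is piecewise linear and concave in $\lambda$, so the grid search over $\lambda$ is exact up to the grid resolution.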

Learning algorithms
Learning algorithms are crucial for finding approximate solutions to optimization and control problems. They are widely used for seeking roots/kernels of a function and for finding feasible solutions to variational inequalities. Practically, a learning algorithm generates a trajectory (or a set of trajectories) toward a potential approximate solution. Selecting a learning algorithm with specific properties, such as better accuracy, more stability, fewer oscillations and quick convergence, is a challenging task [2-5]. From the calculus-of-variations point of view, however, a learning algorithm generates curves. Therefore, selecting one algorithm among others leads to an optimal control problem on the space of curves. Hence, it is natural to use optimal control theory to derive faster algorithms for a family of curves. Bregman-based algorithms and a risk-aware version of them are introduced below to meet specific properties.

We start by introducing the Bregman divergence: for a strictly convex differentiable function $g$, the Bregman divergence between $a$ and $b$ is $d_g(a, b) = g(a) - g(b) - \langle \nabla g(b), a - b \rangle \ge 0$. We are now ready to define algorithms for solving the right-hand sides of (7) and (9). One of the key approaches for quantifying the error of an algorithm with respect to the distributionally robust optimum is the so-called average regret: when the regret vanishes, one gets close to a distributionally robust optimum.

Definition 4. The average regret of an algorithm generating the trajectory $a(t) = \tilde a(t)$ over a window $[t_0, T]$, $t_0 > 0$, is the time-averaged optimality gap $\frac{1}{T - t_0}\int_{t_0}^T \mathbb{E}_{\hat m}\big[h(a^*, \omega) - h(a(t), \omega)\big]\, dt$, where $a(0) = a_0$ is the initial point of the algorithm and $g$ is a strictly convex function of $a$. Let $a(t)$ be the solution to (10); its average regret decays like $d_g(a^*, a_0)/t$ averaged over the window.

Let $a(t)$ instead be the solution to (13). Then the average regret within $[t_0, T]$, $t_0 > 0$, is bounded above by

$$\frac{c_0}{T - t_0} \int_{t_0}^T e^{-\beta(s)}\, ds,$$

where $c_0 := d_g(a^*, a(t_0))$ (in the numerical illustration below, the initial gap is $c_0 = 25$). The advantage of algorithms (10) and (13) is that they do not require computing the Hessian of $\mathbb{E}_{\hat m}[h(a, \omega)]$, as is the case in the Newton scheme. As a corollary of Proposition 2, the regret vanishes as $T$ grows; thus it is a no-regret algorithm. However, algorithm (10) may not be sufficiently fast. Algorithm (13) provides a higher-order convergence rate through a careful design of $(\alpha, \beta)$; the average regret decays very quickly to zero [7]. However, it may generate an oscillatory trajectory with a large magnitude. The next subsection presents risk-aware algorithms that reduce the oscillatory phase of the trajectory.
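A forward-Euler discretization of a Bregman gradient flow gives a minimal working sketch of this family of algorithms. With the quadratic generator $g(a) = \tfrac{1}{2}\|a\|^2$ the flow reduces to plain gradient ascent; the integrand $h(a, \omega) = -(a - \omega)^2$ below is hypothetical.

```python
import numpy as np

# Euler discretization of a Bregman (mirror) gradient flow
#   d/dt grad_g(a) = grad_a E_m_hat[h(a, omega)]       (a sketch of (10))
# with g(a) = 0.5*||a||^2, which reduces to plain gradient ascent.
rng = np.random.default_rng(2)
omega = rng.normal(1.0, 0.2, size=500)           # samples defining m_hat

def grad(a):
    # h(a, w) = -(a - w)^2 (hypothetical concave integrand); gradient in a.
    return float(np.mean(-2.0 * (a - omega)))

a, dt = 5.0, 0.05
for _ in range(400):
    a += dt * grad(a)

a_star = float(omega.mean())                      # unique maximizer of E_m_hat[h]
```

No Hessian is ever formed, matching the point made above about avoiding the Newton scheme; the iterate contracts geometrically toward the empirical maximizer.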

Risk-aware Bregman learning algorithm
In order to reduce the oscillatory phase, we introduce a risk-aware Bregman learning algorithm [7], a speed-up-and-average version of (13) built on the mean dynamics $m$ of $a$, given by $m(t) = \frac{1}{t}\int_0^t a(s)\, ds$: the trajectory of (13) generates the mean dynamics (15).

Proof. We use the average relation $m(t) = \frac{1}{t}\int_0^t a(s)\, ds$, where $a$ solves Eq. (13). From the definition of $m$, and by L'Hopital's rule, $m(0) = a(0)$. Moreover, $m(t)$ and $a(t)$ are related through

$$a(t) = m(t) + t\, \dot m(t), \qquad \dot a(t) = 2\, \dot m(t) + t\, \ddot m(t).$$

Substituting these relations into Eq. (13) yields the mean dynamics (15). This completes the proof.
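The smoothing effect of the averaging step can be illustrated on a synthetic oscillatory trajectory (a hypothetical stand-in for the output of (13), not the actual Bregman dynamics): the running average $m(t) = \frac{1}{t}\int_0^t a(s)\, ds$ sharply reduces the oscillation amplitude while keeping the same limit.

```python
import numpy as np

# Averaging m(t) = (1/t) * integral_0^t a(s) ds applied to a synthetic
# damped-oscillation trajectory around a_star.
t = np.linspace(1e-3, 20.0, 20000)
a_star = 2.0
a = a_star + np.exp(-0.1 * t) * np.cos(5.0 * t)   # oscillatory trajectory

dt = t[1] - t[0]
m = np.cumsum(a) * dt / t                          # cumulative (left-endpoint) mean

tail = t > 10.0
amp_raw = np.max(np.abs(a[tail] - a_star))         # raw oscillation amplitude
amp_avg = np.max(np.abs(m[tail] - a_star))         # averaged amplitude
```

On the tail of the horizon the averaged trajectory is more than an order of magnitude closer to the limit than the raw one, which is the qualitative behavior claimed for the risk-aware dynamics (15).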

Optimization Algorithms -Examples
The risk-aware Bregman dynamics (15) generates a less oscillatory trajectory due to its averaging nature. The next result provides an accuracy bound for (15).

The distribution of the coefficient $\omega$ is unknown, but a sampled empirical measure $\hat m$, close to the uniform distribution on $[0, 1]$, is built from $10^4$ samples. We illustrate the quick convergence of the algorithm in a basic example and plot in Figure 2 the trajectories under the standard gradient flow, the Bregman dynamics, and the risk-aware Bregman dynamics (15). In particular, we observe that the risk-aware Bregman dynamics (15) provides a satisfactory value very quickly. In this particular setup, the accuracy reached by the risk-aware Bregman algorithm (15) at $t = 0.5$ requires four times as long ($t = 2$) for the standard Bregman algorithm to match, and forty times as long ($t = 20$) for gradient ascent. We also observe that the risk-aware Bregman algorithm is less oscillatory, its amplitude decaying much faster than that of the risk-neutral algorithm.

Constrained distributionally robust optimization
In the constrained case, i.e., when $A$ is a strict subset of $\mathbb{R}^{n+2}$, algorithms (10) and (13) present some drawbacks: the trajectory $a(t)$ may not be feasible, i.e., $a(t) \notin A \times \mathbb{R}_+ \times \mathbb{R}$, even when it starts in $A$. In order to design feasible trajectories, the projected gradient has been widely studied in the literature. However, a projection onto $A$ at each time $t$ involves additional optimization problems, and the computation of the projected gradient adds extra complexity to the algorithm. We restrict our attention to the following constraints:

$$A = \Big\{ a \in \mathbb{R}^n : \; a_l \in [\underline a_l, \bar a_l], \; l \in \{1, \dots, n\}, \;\; \sum_{l=1}^n c_l a_l \le b \Big\}.$$

We impose the following feasibility condition: $\underline a_l < \bar a_l$, $l \in \{1, \dots, n\}$, $c_l > 0$, $\sum_{l=1}^n c_l \underline a_l < b$. Under this setting, the constraint set $A$ is non-empty, convex and compact.
We propose a method to compute a constrained solution with full support (whenever it exists). We do not use the projection operator. Instead, we transform the domain via $[\underline a_l, \bar a_l] = \xi([0, 1])$, where $\xi(x_l) = \bar a_l x_l + \underline a_l (1 - x_l)$. $\xi$ is a one-to-one mapping, and the algorithm (18) generates a trajectory $a(t)$ that satisfies the constraint.
Algorithm 4. The constrained learning pseudocode is as follows.

Proof. It suffices to check that, for $b \le \min_l c_l(\bar a_l - \underline a_l)$, the vector $z$ defined by $z_l = \frac{e^{y_l}}{\sum_{k=1}^n e^{y_k}}$ solves the replicator equation. Note that the dynamics of $x$ in Eq. (19) is a constrained replicator dynamics [8], widely used in evolutionary game dynamics. This observation establishes a relationship between optimization and game dynamics, and explains that the replicator dynamics is the gradient flow of the expected payoff under the simplex constraint.
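The projection-free idea can be sketched in a few lines (payoff, bounds, and step size below are hypothetical): box constraints are absorbed by the change of variables $a_l = \xi(x_l)$, while the replicator update itself keeps $x$ on the simplex, so no projection step is ever needed.

```python
import numpy as np

# Projection-free constraint handling (a sketch): a_l = xi(x_l) maps [0, 1]
# onto [a_lo_l, a_hi_l], and the replicator step preserves the simplex.
a_lo = np.array([0.1, 0.2, 0.1])
a_hi = np.array([1.0, 1.5, 2.0])

def payoff_grad(a):
    # Gradient of the concave payoff sum_l (c_l*a_l - a_l^2/2), c = (1, 2, 0.5).
    return np.array([1.0, 2.0, 0.5]) - a

x = np.full(3, 1.0 / 3.0)                   # interior initial point
dt = 0.01
for _ in range(5000):
    a = a_lo + (a_hi - a_lo) * x            # a = xi(x), always in the box
    fit = payoff_grad(a) * (a_hi - a_lo)    # chain rule through xi
    x = x + dt * x * (fit - x @ fit)        # replicator dynamics step
a = a_lo + (a_hi - a_lo) * x
```

The replicator step changes the simplex sum by $dt\,(x \cdot \mathrm{fit})(1 - \sum_l x_l)$, which is zero whenever $x$ starts on the simplex, so feasibility of $a(t)$ holds along the whole trajectory rather than being restored by projection.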

The next example illustrates a constrained distributionally robust optimization in wireless communication networks.
Example 5 (Wireless communication). Consider a power allocation problem over $n$ medium access channels. The signal-to-interference-plus-noise ratio (SINR) on channel $l$ is

$$\mathrm{SINR}_l = \frac{a_l\, |\omega_{ll}|^2}{\big(e^2 + \| s_t(l) - s_r(l) \|^2\big)^{o/2}\,\big( N_0(s_r(l)) + I_l(s_r(l)) \big)},$$

where:
• $N_0 > 0$ is the background noise;
• $I_l \ge 0$ is the interference on channel $l$, typically modeled as a weighted sum of the powers received from the other transmitters;
• $e > 0$ is the height of the transmitter antenna;
• $\omega_{ll}$ is the channel state at $l$; the channel state is unknown, and its true distribution is also unknown;
• $s_r(l)$ is the location of the receiver of $l$;
• $s_t(l)$ is the location of the transmitter of $l$;
• $o \in \{2, 3, 4\}$ is the pathloss exponent;
• $a_l$ is the power allocated to channel $l$; it is assumed to lie between $\underline a_l \ge 0$ and $\bar a_l$, with $0 \le \underline a_l < \bar a_l < +\infty$. Moreover, a total power budget constraint is imposed: $\sum_{l=1}^n a_l \le \bar a$, where $\bar a > \sum_{l=1}^n \underline a_l \ge 0$.

It is worth mentioning that the action constraints of the power allocation problem are similar to the ones analyzed in Section 3. The admissible action space is

$$A = \Big\{ a : \; a_l \in [\underline a_l, \bar a_l], \;\; \sum_{l=1}^n a_l \le \bar a \Big\}.$$

Clearly, $A$ is a non-empty convex compact set. The payoff function is the sum-rate $r(a, \omega) = \sum_{l=1}^n W_l \log(1 + \mathrm{SINR}_l)$, where $W_l > 0$. The mapping $(a, \omega) \mapsto r(a, \omega)$ is continuously differentiable.
• Robust optimization is too conservative. Part of the robust optimization problem [9, 7] consists of choosing the channel gain $|\omega_{ll}|^2 \in [0, \bar\omega_{ll}]$, where the bound $\bar\omega_{ll}$ needs to be carefully designed. However, the worst case is achieved when the channel gain is zero, $|\omega_{ll}|^2 = 0$, and hence the robust performance is zero. This is too conservative, as many realizations of the channel give better performance than zero. Another way is to re-design the bounds $\underline\omega_{ll}$ and $\bar\omega_{ll}$. But $\underline\omega_{ll} > 0$ means that very low channel gains are not allowed, which may be too optimistic. Below we use the distributionally robust optimization approach, which eliminates this design issue.
• Distributionally robust optimization. By means of a training sequence or a channel estimation method, a certain (statistical) distribution $\hat m$ is derived. However, $\hat m$ cannot be considered the true distribution of the channel state, due to estimation error; the true distribution of $\omega$ is unknown. Based on this observation, an uncertainty set $B_r(\hat m)$ with radius $r \ge 0$ is constructed for alternative distribution candidates. Note that $r = 0$ means that $B_0(\hat m) = \{\hat m\}$. The distributionally robust optimization problem is $\sup_a \inf_{m \in B_r(\hat m)} \mathbb{E}_m[r(a, \omega)]$. In the presence of interference, the function $r(a, \omega)$ is not necessarily concave in $a$; in the absence of interference, the problem becomes concave.

Distributed optimization
Recall that the projection operator onto a convex and closed set $A$ is uniquely determined by

$$\mathrm{proj}_A(z) = \arg\min_{a \in A} \| a - z \|^2,$$

or equivalently, adding the term $\langle a, b - a \rangle$ to both sides of the optimality condition, by the variational inequality $\langle z - \mathrm{proj}_A(z), \; b - \mathrm{proj}_A(z) \rangle \le 0$ for all $b \in A$.
As a consequence we can derive the following existence result.
The function $\varrho$ is the revision protocol, which describes how virtual agents make decisions. The revision protocol $\varrho$ takes a population state $a$, the corresponding fitness $\nabla \mathbb{E}_{\hat m} h$, and the adjacency matrix $\Lambda$, and returns a matrix. Let $\varrho_{lk}(a, h, \Lambda)$ be the switching rate from the $l$-th to the $k$-th component. The virtual agents selecting strategy $l \in L$ have an incentive to migrate to strategy $k \in L$ only if $\varrho_{lk}(a, h, \Lambda) > 0$, and it is also possible to design switching rates depending on the topology describing the migration constraints, i.e., $\lambda_{lk} = 0 \Rightarrow \varrho_{lk}(a, h, \Lambda) = 0$. The distributed distributionally robust optimization problem consists in performing the optimization above over a distributed network subject to communication restrictions. We construct a distributed distributionally robust game dynamics to perform this task; it emerges from the combination of the (robust) fitness $h$ and the constrained switching rates $\varrho$. The evolution of the proportion $a_l$ is given by the distributed distributionally robust mean dynamics (22). Since the distributionally robust function $h$ is obtained from the payoff function $r$ by means of triality theory, the dynamics (22) seeks a distributed distributionally robust solution.
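A sketch of a graph-constrained pairwise-comparison revision protocol (the fitness function, graph, and rates below are illustrative, not the chapter's exact dynamics (22)): switching mass flows only along links of $\Lambda$, and total mass is conserved by construction.

```python
import numpy as np

# Graph-constrained revision protocol: agents may only switch between
# strategies joined by a link of the adjacency matrix Lam (path graph here).
n = 4
Lam = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], float)

def fitness(a):
    return np.array([1.0, 3.0, 2.0, 0.5]) - a   # hypothetical fitness

a = np.full(n, 0.25)
dt = 0.01
for _ in range(4000):
    f = fitness(a)
    # rho[l, k] = Lam[l, k] * a_l * [f_k - f_l]_+ : switch only toward
    # strictly better strategies, and only across existing links.
    rho = Lam * a[:, None] * np.maximum(f[None, :] - f[:, None], 0.0)
    a = a + dt * (rho.sum(axis=0) - rho.sum(axis=1))   # inflow minus outflow
```

Because every unit of outflow from one strategy is inflow to another, the population mass $\sum_l a_l$ is invariant, and the mass eventually concentrates on the highest-fitness strategy reachable through the graph.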
Algorithm 5. The distributed distributionally robust mean dynamics pseudocode is as follows, where $r : \mathbb{R}^n \to \mathbb{R}$ is concave and the parameters are possibly uncertain. The fitness functions for the corresponding full potential game are given by $f_l(a) = -2 a_l c_{2l} - c_{1l}$, for all $l \in L$. Figure 3 presents the evolution of the generated power, the fitness functions corresponding to the marginal costs, and the total cost. For the first scenario, the evolutionary game dynamics converge to a standard evolutionarily stable state, in which $f(a^\star) = \bar c\, \mathbf{1}_n$ for some constant $\bar c$. In contrast, for the second scenario, the dynamics converge to a constrained evolutionarily stable state.

Extension to multiple decision-makers
Consider a constrained game $G$ in strategic form given by:
• $\mathcal{P} = \{1, \dots, P\}$ is the set of players; the cardinality of $\mathcal{P}$ is $P \ge 2$.
• Player $p$ has a decision space $A_p \subset \mathbb{R}^{n_p}$, $n_p \ge 1$. Players are coupled through their actions and their payoffs. The set of all feasible action profiles is $A \subset \mathbb{R}^n$, with $n = \sum_{p \in \mathcal{P}} n_p$. Player $p$ can choose an action $a_p$ in the set $A_p(a_{-p}) = \{ a_p \in A_p : (a_p, a_{-p}) \in A \}$.
• Player $p$ has a payoff function $r_p : A \to \mathbb{R}$.

We restrict our attention to box constraints $a_{pl} \in [\underline a_{pl}, \bar a_{pl}]$ together with coupled budget constraints. We propose a method to compute a constrained equilibrium with full support (whenever it exists). We do not use the projection operator. Instead, we transform the domain via $[\underline a_{pl}, \bar a_{pl}] = \xi([0, 1])$, where $\xi(x_{pl}) = \bar a_{pl} x_{pl} + \underline a_{pl}(1 - x_{pl})$. $\xi$ is a one-to-one mapping with $x_{pl} = \xi^{-1}(a_{pl})$, and the induced dynamics generate a trajectory that satisfies the constraint of player $p$ at any time $t$.

Notes
The work in [10] provides a nice intuitive introduction to robust optimization, emphasizing the parallel with static optimization. Another nice treatment [11], focusing on the robust empirical risk minimization problem, is designed to give calibrated confidence intervals on performance and to provide optimal tradeoffs between bias and variance [12, 13]. $f$-divergence-based performance evaluations are conducted in [11, 14, 15]. The connection between risk-sensitivity measures, such as the exponentiated payoff, and distributional robustness can be found in [16]. Distributionally robust optimization and learning are extended to multiple strategic decision-making problems, i.e., distributionally robust games, in [17, 18].

Figure 1
Figure 1 illustrates the advantage of algorithm (13) compared with the gradient flow (10). It plots the regret bound $\frac{c_0}{T - t_0} \int_{t_0}^T e^{-\beta(s)}\, ds$ for $\beta(s) = s$, together with the gradient-flow bound $d_g(a^*, a_0)\, \frac{\log(T/t_0)}{T - t_0}$.


Proposition 9.
Let the set of virtual population states $A$ be non-empty, convex and compact, and let the mapping $b \mapsto \mathbb{E}_{\hat m}[\nabla h(b, \omega)]$ be continuous. Then there exists at least one equilibrium in $A$.

Proof. This is a direct application of the Brouwer-Schauder fixed-point theorem, which states that if $\phi : A \to A$ is continuous and $A$ is non-empty, convex and compact, then $\phi$ has at least one fixed point in $A$. Here we choose $\phi(a) = \mathrm{proj}_A\big[a + \eta\, \mathbb{E}_{\hat m} \nabla h(a, \omega)\big]$. Clearly $\phi(A) \subseteq A$, and $\phi$ is continuous on $A$ since both $b \mapsto \mathbb{E}_{\hat m}[\nabla h(b, \omega)]$ and $b \mapsto \mathrm{proj}_A[b]$ are continuous. The announced result follows. This completes the proof. Note that we do not need sophisticated set-valued fixed-point theory to obtain this result.

Definition 8. The virtual population state $a$ is evolutionarily stable if $a \in A$ and, for any alternative deviant state $b \ne a$, there is an invasion barrier $\epsilon_b > 0$ such that $\langle a - b, \; \mathbb{E}_{\hat m} \nabla h(a + \epsilon(b - a), \omega) \rangle > 0$ for all $\epsilon \in (0, \epsilon_b)$.

For the multi-player setting, player $p$'s decision space is $A_p = \{ a_p \in \mathbb{R}^{n_p} : a_{pl} \in [\underline a_{pl}, \bar a_{pl}], \; l \in \{1, \dots, n_p\} \}$, with the feasibility condition: $\underline a_{pl} < \bar a_{pl}$, $l \in \{1, \dots, n_p\}$, $c_{pl} > 0$, $\sum_{l=1}^{n_p} c_{pl} \underline a_{pl} < b_p$, $c_p \in \mathbb{R}^{n_p}_{>0}$ and $\sum_{p \in \mathcal{P}} \langle c_p, \underline a_p \rangle < b$. Under this condition the constraint set $A$ is non-empty, convex and compact.
Assume that $a^*$ is a feasible action profile, i.e., $a^* \in A$. Consider the continuous-time analogue of the Armijo gradient flow [6], driven by the gradient of $-\mathbb{E}_{\hat m}[h(a(t), \omega)]$.

2.4.1. Armijo gradient flow
Algorithm 1. The Armijo gradient pseudocode is as follows:
1: procedure ARMIJO-GRADIENT($a(0)$, $\epsilon$, $T$, $g$, $\hat m$, $h$)  ⊳ the Armijo gradient starting from $a(0)$ within $[0, T]$
3: while regret > $\epsilon$ and $t \le T$ do  ⊳ we have the answer if the regret is 0
   ⊳ get $a(t)$ and the regret
8: end procedure

Proposition 2. Let $a \mapsto \mathbb{E}_{\hat m}[h(a, \omega)] : \mathbb{R}^{n+2} \to \mathbb{R}$ be a concave function with a unique global maximizer $a^*$. Then the gradient flow satisfies $\mathbb{E}_{\hat m}[h(a^*, \omega) - h(a(t), \omega)] \le d_g(a^*, a(0))/t$.

Proof sketch. Define $W(a(t)) = t\, \mathbb{E}_{\hat m}[h(a^*, \omega) - h(a(t), \omega)] + d_g(a^*, a(t))$; along the path of the gradient flow, $\frac{d}{dt} W(a(t)) \le 0$, where the last inequality is by convexity of $g$. This decreasing property implies $0 \le W(a(t)) \le W(a(0)) = d_g(a^*, a(0))$. In particular, $0 \le t\, \mathbb{E}_{\hat m}[h(a^*, \omega) - h(a(t), \omega)] \le W(a(0)) < +\infty$, so the error to the value $\mathbb{E}_{\hat m}[h(a^*, \omega)]$ decays at rate $1/t$. Note that the above regret bound is established without assuming strong convexity of $a \mapsto -\mathbb{E}_{\hat m}[h(a, \omega)]$; no Lipschitz continuity of the gradient is assumed either.

Table 1.
Convergence rate under different sets of functions.

Let $a(t)$ be the trajectory generated by the Bregman algorithm starting from $a_0$ at time $t_0$. Under the assumptions above, the error generated by the algorithm is at most (14), which means that it takes at most $T_\delta = \beta^{-1}\log(c_0/\delta)$ to be within a $\delta$-ball of the value $\mathbb{E}_{\hat m}[h(a^*, \omega)]$. The Legendre-Fenchel transform of $f(t) = t\log t$ is $f^*(\xi) = e^{\xi - 1}$. Let $(a_1, a_2) \mapsto g(a) = \|a\|_2^2$.

Definition 5 (Convergence time). Let $\delta > 0$ and $t_\delta = \inf\{ t > t_0 : \mathbb{E}_{\hat m}[h(a^*, \omega) - h(a(t), \omega)] \le \delta \}$.

Proposition 6. For every $\delta > 0$, $t_\delta \le T_\delta$.

Proof. The proof is immediate. For $\delta > 0$, the average regret bound of Proposition 5,

$$\frac{c_0}{T - t_0} \int_{t_0}^T e^{-\beta(s)}\, ds \le \delta, \qquad (17)$$

provides the announced convergence time bound. This completes the proof. See Table 1 for detailed parametric functions in the bound $T_\delta$.

Figure 2. Gradient ascent vs. risk-aware Bregman dynamics for $r = -(1 + \sum_{k=1}^2 \omega_k^2 a_k^2)$.

This section presents distributed distributionally robust optimization problems over a directed graph. A large number of virtual agents can potentially choose a node (vertex), subject to constraints. The vector $a$ represents the population state; since $a$ has $n$ components, the graph has $n$ vertices. The interactions between virtual agents are interpreted as the connections of the graph. Suppose that the current interactions are represented by a directed graph $G = (L, E)$, where $E \subseteq L^2$ is the set of links representing the possible interactions among the proportions of agents: if $(l, k) \in E$, then the $l$-th component of $a$ can interact with the $k$-th component of $a$. In other words, $(l, k) \in E$ means that virtual agents selecting strategy $l \in L$ can migrate to strategy $k \in L$. Moreover, $\Lambda \in \{0, 1\}^{n \times n}$ is the adjacency matrix of the graph $G$, whose entries are $\lambda_{lk} = 1$ if $(l, k) \in E$, and $\lambda_{lk} = 0$ otherwise.

Proposition 8.
Let the set of virtual population states $A$ be non-empty, convex and compact, and let $b \mapsto \mathbb{E}_{\hat m}[\nabla h(b, \omega)]$ be continuous. Then the following conditions are equivalent.

Example 6. Consider a power system composed of 10 generators, i.e., $L = \{1, \dots, 10\}$. Let $a_l \in \mathbb{R}_+$ be the power generated by generator $l \in L$. Each power generation should satisfy the physical and/or operational constraints $a_l \in [\underline a_l, \bar a_l]$, for all $l \in L$. It is desired to satisfy the power demand $d \in \mathbb{R}$, i.e., it is necessary to guarantee that $\sum_{l \in L} a_l = d$: the supply meets the demand. The objective is to minimize the total quadratic generation cost over all the generators.

1: procedure POPULATION-INSPIRED-ALGORITHM($a(0)$, $\epsilon$, $T$, $\varrho$, $g$, $\hat m$, $h$, $\Lambda$)
3: while regret > $\epsilon$ and $t \le T$ do  ⊳ we have the answer if the regret is 0

The scenarios are:
1. $\underline a = 0_n$ and $\bar a = d\, 1_n$;
2. $\underline a_l = 0$ for all $l \in L \setminus \{9, 10\}$, $\underline a_9 = 1.1$, and $\underline a_{10} = 1$; $\bar a_l = d$ for all $l \in L \setminus \{1, 2\}$, $\bar a_1 = 3$, and $\bar a_2 = 2.5$;
3. Case 1 constraints, with interaction restricted to the cycle graph $G = (L, E)$ with set of links $E = \bigcup_{l \in L \setminus \{n\}} \{(l, l+1)\} \cup \{(n, 1)\}$;
4. Case 2 constraints, with interaction restricted as in Case 3.