Bilevel Disjunctive Optimization on Affine Manifolds

Bilevel optimization is a special kind of optimization in which one problem is embedded within another. The outer optimization task is commonly referred to as the upper-level (or leader's) optimization problem, and the inner optimization task as the lower-level (or follower's) optimization problem. These problems involve two kinds of variables, upper-level variables and lower-level variables, and the two levels have their own objectives and constraints. Bilevel optimization was first formalized in the field of game theory by the German economist H. von Stackelberg, whose 1934 book described this hierarchical problem. Nowadays, bilevel optimization problems appear in a number of real-world settings: transportation, economics, decision science, business, engineering, and so on. In this chapter, we provide a general formulation for the bilevel disjunctive optimization problem on affine manifolds. Topics: affine convex functions, optimization with auto-parallel restrictions, affine convexity of posynomial functions, the bilevel disjunctive problem and an algorithm for it, models of bilevel disjunctive programming problems, and properties of minimum functions.

The function f is: (1) linear affine if its restriction $f(x(t))$ to each auto-parallel $x(t)$ satisfies $f(x(t)) = at + b$, for some numbers a, b that may depend on $x(t)$; (2) affine convex if its restriction $f(x(t))$ is convex on each auto-parallel $x(t)$.

Proof. For given Γ, if we consider $\partial_i \partial_j f = \Gamma^h_{ij}\,\partial_h f$ as a PDE system (a particular case of a Frobenius-Mayer system of PDEs) with $\frac{1}{2}n(n+1)$ equations and the unknown function f, then we need the complete integrability conditions, which involve the curvature tensor of Γ.

Corollary 1.1 If there exist n linear affine functions $f^l$, $l = 1, \dots, n$, on $(M, \Gamma)$, whose $df^l$ are linearly independent, then Γ is flat, that is, $R^h_{\ ikj} = 0$.
Of course, this only means the curvature tensor is zero on the topologically trivial region we used to set up our co-vector fields $df^l(x)$. But we can always cover any manifold by an atlas of topologically trivial regions, so this allows us to deduce that the curvature tensor vanishes throughout the manifold.

Remark 1.1 There is actually no need to extend $df^l(x)$ to the entire manifold. If this could be done, then the $df^l(x)$ would be everywhere-nonzero co-vector fields; but there are topologies, for example $S^2$, for which we know such things do not exist. Therefore, there are topological manifolds on which we are forced to work only on topologically trivial regions.
The following theorem is well known [16, 17, 19, 23]. Due to its importance, we now offer new proofs (based on catastrophe theory, on decomposing a tensor into a specific product, and on using slackness variables).
(1) If f is regular or has only one minimum point, then there exists a connection Γ such that f is affine convex.
(2) If f has a maximum point x 0 , then there is no connection Γ making f affine convex throughout.
Proof. For the Hessian $(\mathrm{Hess}_\Gamma f)_{ij}$ to be positive semidefinite, we need n conditions in the form of inequalities and equalities. The number of unknowns $\Gamma^h_{ij}$ is $\frac{1}{2}n^2(n+1)$. The inequalities can be replaced by equalities using slackness variables.
The first central idea of the proof is to use catastrophe theory, since almost all families $f(x; c)$, $x = (x^1, \dots, x^n) \in \mathbb{R}^n$, $c = (c^1, \dots, c^m) \in \mathbb{R}^m$, of real differentiable functions with $m \le 4$ parameters are structurally stable and are equivalent, in the vicinity of any point, to one of the canonical forms listed in [15]. We eliminate the cases with a maximum point, that is, the Morse 0-saddle and the saddle point. Around each critical point (in a chart), the canonical form $f(x; c)$ is affine convex with respect to an appropriate locally defined linear connection that can be found easily. Using changes of coordinates and a partition of unity, we glue all these local connections into a global one, making $f(x; c)$ affine convex on M.
At any critical point $x_0$, the affine Hessian $\mathrm{Hess}_\Gamma f$ reduces to the Euclidean Hessian. Then the maximum-point condition or the saddle condition contradicts the affine convexity condition.
A direct proof based on the decomposition of a tensor: Let $(M, \Gamma)$ be an affine manifold and suppose f has no critical points (is regular). If the function f is not convex with respect to Γ, we look for a new connection $\bar\Gamma$ for which $(\mathrm{Hess}_{\bar\Gamma} f)_{ij} = \sigma_{ij}(x)$, where $\sigma_{ij}(x)$ is a positive semi-definite tensor. A very particular solution is the decomposition $\bar\Gamma^h_{ij} = (\partial_i \partial_j f - \sigma_{ij})\,\frac{a^h}{a^k \partial_k f}$, where the vector field a has the property $a^k\,\partial_k f \neq 0$ (such a field exists because f is regular).

Remark 1.2 The connection $\bar\Gamma^h_{ij}$ depends strongly on both the function f and the tensor field $\sigma_{ij}$.
Suppose f has a minimum point $x_0$. In this case, observe that we must have $\partial_h f(x_0) = 0$. Can we apply the previous reasoning for $x \neq x_0$ and then extend the obtained connection by continuity? The answer is generally negative. Indeed, computing the limit of the connection, we cannot plug in the point $x_0$ because we get $\frac{0}{0}$, an indeterminate form.
To obtain a contradiction, we fix an auto-parallel $\gamma(t)$, $t \in [0, \varepsilon)$, starting from the minimum point with initial velocity v. But the resulting limit depends on the direction v (it takes different values along two different auto-parallels).
In some particular cases, we can eliminate the dependence on the vector v; suitable supplementary conditions are sufficient to do this. Under such a particular condition ensuring independence of v, we can show that connections of the previous type can be built that are valid everywhere.

Illustrative examples
Let us illustrate the previous statements by the following examples.
Example 1.1 (no critical points). Consider a function f whose partial derivatives are $\frac{\partial f}{\partial x} = 3x^2 + 3$ and $\frac{\partial f}{\partial y} = 3y^2 + 3$; then f has no critical point. Moreover, the Euclidean Hessian of f is not positive semi-definite everywhere. Let us carry out the above construction for $\sigma_{ij}(x, y) = \delta_{ij}$. Taking $a^1 = a^2 = 1$, we obtain a connection, which is not unique.
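A minimal numerical sketch of this construction. Two assumptions not stated explicitly above: we take $f(x, y) = x^3 + 3x + y^3 + 3y$, the function (up to an additive constant) with the stated partial derivatives, and we read the decomposition as $\Gamma^h_{ij} = (\partial_i \partial_j f - \sigma_{ij})\,a^h / (a^k \partial_k f)$:

```python
# Numerical check (a sketch, under the assumptions stated above) that the
# constructed connection makes the affine Hessian equal to sigma = identity.

def grad_f(x, y):
    # stated partial derivatives: f_x = 3x^2 + 3, f_y = 3y^2 + 3 (never zero)
    return (3 * x**2 + 3, 3 * y**2 + 3)

def hess_f(x, y):
    # Euclidean Hessian diag(6x, 6y): indefinite when x and y have opposite signs
    return [[6 * x, 0.0], [0.0, 6 * y]]

def affine_hessian(x, y, a=(1.0, 1.0)):
    """(Hess_Gamma f)_ij = d_i d_j f - Gamma^h_ij d_h f, with sigma_ij = delta_ij."""
    g = grad_f(x, y)
    H = hess_f(x, y)
    denom = a[0] * g[0] + a[1] * g[1]      # a^k d_k f, strictly positive here
    sigma = [[1.0, 0.0], [0.0, 1.0]]
    # connection coefficients Gamma[h][i][j] = (H_ij - sigma_ij) a^h / denom
    Gamma = [[[(H[i][j] - sigma[i][j]) * a[h] / denom
               for j in range(2)] for i in range(2)] for h in range(2)]
    return [[H[i][j] - sum(Gamma[h][i][j] * g[h] for h in range(2))
             for j in range(2)] for i in range(2)]
```

At points where the Euclidean Hessian is indefinite (for instance x < 0 < y), the affine Hessian is still the identity, which is the point of the construction.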
Example 1.2 (for one minimum point). Let us consider a function f with a unique critical point (0, 0), which is a minimum. However, the Euclidean Hessian of f is not positive semi-definite everywhere. We apply the previous reasoning away from the minimum point. The next example shows what happens if we step outside the hypotheses of the previous theorem.
Let us consider the ODE of auto-parallels on $(\mathbb{R}, \Gamma)$. The solutions involve $t_1$, $t_2$, the real solutions of $-2 + t^2 - ct = 0$. These curves are extended at t = 0 by continuity. The manifold $(\mathbb{R}, \Gamma)$ is not auto-parallelly complete. Since the image $x(\mathbb{R})$ is not a "segment", the conclusion of the previous theorem is not guaranteed for the function f.

Remark 1.3 For $n \ge 2$, there exist $C^1$ functions $\varphi: \mathbb{R}^n \to \mathbb{R}$ which have two minimum points without having any other extremum point; such functions can be constructed explicitly. The restriction is a difference of two affine convex functions (see Section 2).

Optimizations with auto-parallel restrictions

2.1. Direct theory
The auto-parallel curves x(t) on the affine manifold $(M, \Gamma)$ are solutions of the second-order ODE system $\ddot{x}^h(t) + \Gamma^h_{ij}(x(t))\,\dot{x}^i(t)\,\dot{x}^j(t) = 0$. If, at a critical point $x_0$ of the restriction, the second derivative $\frac{d^2}{dt^2} f(x(t))$ is strictly positive (negative), then $x_0$ is a minimum (maximum) point of f constrained by the auto-parallel system.
On the unit sphere $S^2$, with coordinates $(\theta, \varphi)$, the nonzero connection components are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$, and all the other Γs are equal to zero. We can show that the apparent singularity at θ = 0, π can be removed by a better choice of coordinates at the poles of the sphere. Thus, the above affine connection extends to the whole sphere.
The second-order system defining the auto-parallel curves (geodesics) on $S^2$ is $\ddot\theta - \sin\theta\cos\theta\,\dot\varphi^2 = 0$, $\ddot\varphi + 2\cot\theta\,\dot\theta\,\dot\varphi = 0$. The solutions are great circles on the sphere; for example, $\theta = \alpha t + \beta$ and φ = const.
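As a sanity check, the auto-parallel system can be integrated numerically; the equations $\ddot\theta = \sin\theta\cos\theta\,\dot\varphi^2$, $\ddot\varphi = -2\cot\theta\,\dot\theta\,\dot\varphi$ are taken here as the standard unit-sphere system. Starting on the equator with purely $\varphi$ velocity, the curve stays on the equator, a great circle:

```python
import math

def geodesic_rhs(state):
    # unit-sphere auto-parallel system in coordinates (theta, phi):
    # theta'' = sin(theta) cos(theta) phi'^2,  phi'' = -2 cot(theta) theta' phi'
    th, ph, dth, dph = state
    return (dth, dph,
            math.sin(th) * math.cos(th) * dph**2,
            -2.0 * (math.cos(th) / math.sin(th)) * dth * dph)

def rk4(state, t_end, h=1e-3):
    """Classical 4th-order Runge-Kutta integration of the auto-parallel ODE."""
    t = 0.0
    while t < t_end - 1e-12:
        step = min(h, t_end - t)
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(tuple(s + 0.5 * step * k for s, k in zip(state, k1)))
        k3 = geodesic_rhs(tuple(s + 0.5 * step * k for s, k in zip(state, k2)))
        k4 = geodesic_rhs(tuple(s + step * k for s, k in zip(state, k3)))
        state = tuple(s + step / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        t += step
    return state

# start on the equator (theta = pi/2) with unit phi-velocity
th, ph, dth, dph = rk4((math.pi / 2, 0.0, 0.0, 1.0), 1.0)
```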
We compute the curvature tensor R of the unit sphere $S^2$. Since there are only two independent coordinates, all the nonzero components of the curvature tensor R are given by $R^\varphi_{\ \theta\theta\varphi} = -1$ and those obtained from it by the symmetries of the curvature tensor; the other components are 0.
Consequently, the critical points are of two types. At the critical points $(\theta_0, \varphi)$ or $(\theta_1, \varphi)$, the Hessian of f is positive or negative semi-definite, respectively. On the other hand, along the directions $\xi^1 = 0$, $\xi^2 \neq 0$, such a point is a minimum point of f along each auto-parallel starting from the given point and tangent to $(\xi^1 = 0,\ \xi^2 \neq 0)$.

2.2. Theory via the associated spray
This point of view regarding extrema comes from the paper [22]. The second-order system of auto-parallels induces a spray (a special vector field) $Y(x, y)$. The solutions $\gamma(t) = (x(t), y(t)): I \to D$ of class $C^2$ are called field lines of Y. They depend on the initial condition $\gamma(t)|_{t = t_0} = (x_0, y_0)$, and therefore the notation $\gamma(t; x_0, y_0)$ is more suggestive.
If this quantity (the second derivative of f along the field line) is strictly positive (negative), then $(x_0, y_0)$ is a minimum (maximum) point of f constrained by the spray.
Affine convexity of posynomial functions

Theorem 3.1 Each posynomial function is affine convex with respect to some affine connection.

Proof. A posynomial function has the form
$f(x) = \sum_{k=1}^{K} c_k\,(x^1)^{a_{1k}} (x^2)^{a_{2k}} \cdots (x^n)^{a_{nk}}, \quad x \in \mathbb{R}^n_{++},$
where all the coefficients $c_k$ are positive real numbers and the exponents $a_{ik}$ are real numbers. Let us consider auto-parallel curves joining the points $a = (a^1, \dots, a^n)$ and $b = (b^1, \dots, b^n)$, which fix, for example, the affine connection. Along such a curve, each term in the sum becomes a convex function of t.

Remark 3.1 Posynomial functions belong to the class of functions satisfying the statement "the product of two convex functions is convex".
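A numeric illustration of the proof idea. Two assumptions: the posynomial and the endpoints below are sample data, and the auto-parallel joining a and b is taken in the log-affine form $x^i(t) = (a^i)^{1-t}(b^i)^t$, along which each monomial becomes $c_k e^{p_k + q_k t}$ and is therefore convex:

```python
import math

def posynomial(x, coeffs, exponents):
    """f(x) = sum_k c_k * prod_i x_i^{a_ik}, with c_k > 0, on R^n_{++}."""
    return sum(c * math.prod(xi**a for xi, a in zip(x, aks))
               for c, aks in zip(coeffs, exponents))

def log_affine_curve(a, b, t):
    # assumed curve joining a and b: x^i(t) = (a^i)^(1-t) * (b^i)^t;
    # each monomial along it becomes c_k * exp(p_k + q_k t), convex in t
    return [ai**(1 - t) * bi**t for ai, bi in zip(a, b)]

# sample posynomial f(x, y) = 2 x^0.5 y^-1 + 3 x^-2 y^3 and sample endpoints
coeffs = [2.0, 3.0]
exponents = [(0.5, -1.0), (-2.0, 3.0)]
a, b = (1.0, 2.0), (4.0, 0.5)

g = [posynomial(log_affine_curve(a, b, k / 100.0), coeffs, exponents)
     for k in range(101)]
# discrete convexity of t -> f(x(t)): second differences are nonnegative
second_diffs = [g[k - 1] - 2.0 * g[k] + g[k + 1] for k in range(1, 100)]
```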
Corollary 3.1 Each signomial function is a difference of two affine convex posynomials, with respect to some affine connection.

Proof. A signomial function has the same form as a posynomial,
$g(x) = \sum_{k=1}^{K} c_k\,(x^1)^{a_{1k}} \cdots (x^n)^{a_{nk}},$
where all the exponents $a_{ik}$ are real numbers but the coefficients $c_k$ may be either positive or negative. Without loss of generality, suppose that $c_k > 0$ for $k = 1, \dots, k_0$ and $c_k < 0$ for $k = k_0 + 1, \dots, K$. We use the decomposition $g = g_+ - g_-$, where $g_+$ collects the terms with positive coefficients and $g_-$ the terms with negative coefficients (with the signs reversed), both posynomials; we then apply the Theorem and the implication $u''(t) \ge v''(t) \Rightarrow u - v$ convex. □

Corollary 3.2 (1) The polynomial functions with positive coefficients, restricted to $\mathbb{R}^n_{++}$, are affine convex functions.
(2) The polynomial functions with positive and negative terms, restricted to $\mathbb{R}^n_{++}$, are differences of two affine convex functions.
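The decomposition used in the proof of Corollary 3.1 can be sketched directly; the sample signomial below is an assumption:

```python
import math

def split_signomial(coeffs, exponents):
    """Split sum_k c_k x^{a_k} (c_k of any sign) into two posynomial term
       lists f_plus, f_minus with f = f_plus - f_minus."""
    plus = [(c, a) for c, a in zip(coeffs, exponents) if c > 0]
    minus = [(-c, a) for c, a in zip(coeffs, exponents) if c < 0]
    return plus, minus

def evaluate(terms, x):
    # evaluate a term list at x in R^n_{++}
    return sum(c * math.prod(xi**ai for xi, ai in zip(x, a)) for c, a in terms)

# assumed sample signomial f(x, y) = 5 x^2 y - 4 x^-1 y^0.5 + x y^-2
coeffs = [5.0, -4.0, 1.0]
exponents = [(2.0, 1.0), (-1.0, 0.5), (1.0, -2.0)]
plus, minus = split_signomial(coeffs, exponents)

x = (1.5, 0.8)
total = evaluate(plus, x) - evaluate(minus, x)
```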
Proudnikov [18] gives necessary and sufficient conditions for representing a Lipschitz multivariable function as a difference of two convex functions. An algorithm and a geometric interpretation of this representation are also given. The outcome of this algorithm is a sequence of pairs of convex functions that converges uniformly to a pair of convex functions when the conditions of the formulated theorems are satisfied.

Bilevel disjunctive problem
Let $(M_1, {}^1\Gamma)$, the leader decision affine manifold, and $(M_2, {}^2\Gamma)$, the follower decision affine manifold, be two connected affine manifolds of dimensions $n_1$ and $n_2$, respectively. Moreover, $(M_2, {}^2\Gamma)$ is supposed to be complete. Let also $f: M_1 \times M_2 \to \mathbb{R}$ be the leader objective function, and let $F = (F_1, \dots, F_r): M_1 \times M_2 \to \mathbb{R}^r$ be the follower multiobjective function.
The components $F_i: M_1 \times M_2 \to \mathbb{R}$ are (possibly) conflicting objective functions.
A bilevel optimization problem means a decision of the leader with regard to a multiobjective optimum of the follower (in fact, a constrained optimization problem whose constraints are themselves obtained from optimization problems). For details, see [5, 10, 12].
Let $x \in M_1$, $y \in M_2$ be generic points. In this chapter, the disjunctive solution set of a follower multiobjective optimization problem is defined by (1) the set-valued function $\psi(x) = \mathrm{Argmin}_{y \in M_2} F(x, y)$, where
$\mathrm{Argmin}_{y \in M_2} F(x, y) := \bigcup_{i=1}^{r} \mathrm{Argmin}_{y \in M_2} F_i(x, y),$
or (2) the set-valued function $\psi(x) = \mathrm{Argmax}_{y \in M_2} F(x, y)$, where
$\mathrm{Argmax}_{y \in M_2} F(x, y) := \bigcup_{i=1}^{r} \mathrm{Argmax}_{y \in M_2} F_i(x, y).$
We deal with two bilevel problems: (1) The optimistic bilevel disjunctive problem
$\min_{x \in M_1}\ \min_{y \in \psi(x)} f(x, y).$
In this case, the follower cooperates with the leader; that is, for each $x \in M_1$, the follower chooses among all its disjunctive solutions (its best responses) one which is best for the leader (assuming that such a solution exists).
(2) The pessimistic bilevel disjunctive problem
$\min_{x \in M_1}\ \max_{y \in \psi(x)} f(x, y).$
In this case, there is no cooperation between the leader and the follower, and the leader expects the worst scenario; that is, for each $x \in M_1$, the follower may choose among all its disjunctive solutions (its best responses) one which is unfavorable for the leader.
So, a general optimization problem becomes a pessimistic bilevel problem.

Proof. Let us consider the multi-functions $\psi_i(x) = \mathrm{Argmin}_{y \in M_2} F_i(x, y)$, $i = 1, \dots, r$.

Proof. Under our hypotheses, the set $\psi(x)$ is nonvoid for every x, and compactness ensures the existence of $\min_x f(x, \psi(x))$.
In the next theorem, we shall use the Value Function Method (or Utility Function Method). □

Proof. Let $\min_y L(x, y) = L(x, y^*)$. Suppose that for each $i = 1, \dots, k$ we had $\min_y F_i(x, y) < F_i(x, y^*)$. Then $y^*$ would not be a minimum point of the partial function $y \mapsto L(x, y)$. Hence, there exists an index i such that $y^* \in \psi_i(x)$. □

Boundedness of f implies that the bilevel problem has a solution once it is well-posed; that the problem is well-posed is shown in the first part of the proof.
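A toy check of the proof's conclusion, assuming the utility function has the form $L(x, y) = \min_i F_i(x, y)$ (consistent with the argument above), with x fixed and y discretized on a grid; the sample objectives are assumptions:

```python
# A minimizer y* of L(y) = min(F1(y), F2(y)) must solve at least one of the
# follower problems, i.e. y* lies in psi_1 ∪ psi_2. F1, F2 are sample data.
ys = [k / 100.0 for k in range(-300, 301)]

def F1(y): return (y - 1.0)**2          # assumed sample objective
def F2(y): return (y + 2.0)**2 + 0.5    # assumed sample objective

def L(y): return min(F1(y), F2(y))      # min-type utility function

y_star = min(ys, key=L)                 # minimizer of the scalarization
argmin_F1 = min(ys, key=F1)             # psi_1 (unique here)
argmin_F2 = min(ys, key=F2)             # psi_2 (unique here)
```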

Bilevel disjunctive programming algorithm
An important concept for making wise tradeoffs among competing objectives is bilevel disjunctive programming optimality on affine manifolds, introduced in this chapter.
We now present an exact algorithm for obtaining the bilevel disjunctive solutions of the multiobjective optimization problem.
Step 1: Solve the follower problems $\psi_i(x) = \mathrm{Argmin}_{y \in M_2} F_i(x, y)$, $i = 1, \dots, r$.

Step 2: Let $\psi(x) = \bigcup_{i=1}^{r} \psi_i(x)$ be the subset of $M_2$ representing the mapping of optimal solutions of the follower multiobjective function.
Step 3: Solve the leader's program $\min_x \{ f(x, y) : y \in \psi(x) \}$. From a numerical point of view, we can use the Newton algorithm for optimization on affine manifolds, which is given in [19].
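A minimal grid-search sketch of Steps 1-3 on a toy problem (all objective functions and grids below are assumptions, not data from the text); it computes both the optimistic and the pessimistic values:

```python
# Grid-search sketch of the bilevel disjunctive algorithm on a toy instance.
xs = [i / 10.0 for i in range(-20, 21)]
ys = [j / 10.0 for j in range(-20, 21)]

def F1(x, y): return (y - x)**2           # follower objective 1 (assumed)
def F2(x, y): return (y + x)**2           # follower objective 2 (assumed)
def f(x, y):  return (y - 1.0)**2 + x**2  # leader objective (assumed)

def argmins(h, x, tol=1e-9):
    # Step 1: grid Argmin of y -> h(x, y)
    best = min(h(x, y) for y in ys)
    return {y for y in ys if h(x, y) <= best + tol}

def psi(x):
    # Step 2: disjunctive union of the follower solution sets
    return argmins(F1, x) | argmins(F2, x)

# Step 3: leader's program over the graph of psi, in both regimes
optimistic  = min(min(f(x, y) for y in psi(x)) for x in xs)
pessimistic = min(max(f(x, y) for y in psi(x)) for x in xs)
```

The pessimistic value is never below the optimistic one; on this toy instance the two regimes pick different leader decisions.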

Models of bilevel disjunctive programming problems
The manifold M is understood from the context. The connection $\Gamma^h_{ij}$ can be realized in each case by imposing convexity conditions.

Example 5.1 Let us solve the problem (see [7], p. 7; [9]). Both the lower- and the upper-level optimization tasks have two objectives each. For a fixed y value, the feasible region of the lower-level problem is the area inside a circle with center at the origin ($x_1 = x_2 = 0$) and radius equal to y. The Pareto-optimal set of the lower-level optimization task, for a fixed y, is the bottom-left quarter of this circle. The linear constraint in the upper-level optimization task does not allow the entire quarter circle to be feasible for some y. Thus, at most a couple of points from the quarter circle belong to the Pareto-optimal set of the overall problem. Eichfelder [8] reported the Pareto-optimal set of solutions; the Pareto-optimal front in the $F_1$-$F_2$ space can be written in parametric form.

Example 5.2 Consider the bilevel programming problem where the set-valued function is
$\psi(x) = \mathrm{Argmin}_y \{\, xy : -x - 1 \le y \le -x + 1 \,\}.$
Explicitly, the minimizer is $y = -x - 1$ for $x > 0$, $y = -x + 1$ for $x < 0$, and the whole interval $[-1, 1]$ for $x = 0$, which splits the analysis into the regions where the corresponding functions are defined.
Taking into account $(-2x - 1)^2 + x^2 > 0$ and $(-2x + 1)^2 + x^2 > 0$, it follows that $(x^\circ, y^\circ) = (0, 0)$ is the unique optimistic optimal solution of the problem. Now, if the leader is not exact enough in choosing his solution, then the real outcome of the problem has an objective function value above 1, which is far away from the optimistic optimal value zero.
If we consider only $\psi_1(x)$ as active, then the unique optimal solution (0, 0) is maintained. If $\psi_2(x)$ is active, then the optimal solution is (0, 1).
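The set-valued map $\psi(x) = \mathrm{Argmin}_y \{xy : -x - 1 \le y \le -x + 1\}$ of Example 5.2 can be checked numerically (the grid resolution is an assumption):

```python
def psi(x, grid_n=2001, tol=1e-9):
    """Argmin of y -> x*y over the interval [-x-1, -x+1], on a uniform grid."""
    lo, hi = -x - 1.0, -x + 1.0
    ys = [lo + (hi - lo) * k / (grid_n - 1) for k in range(grid_n)]
    best = min(x * y for y in ys)
    return [y for y in ys if x * y <= best + tol]

# x > 0 drives the follower to the left endpoint y = -x - 1, x < 0 to the
# right endpoint y = -x + 1, and at x = 0 every feasible y is optimal
left  = psi(1.0)
right = psi(-1.0)
free  = psi(0.0)
```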

Properties of minimum functions
Let $(M_1, {}^1\Gamma)$, the leader decision affine manifold, and $(M_2, {}^2\Gamma)$, the follower decision affine manifold, be two connected affine manifolds of dimensions $n_1$ and $n_2$, respectively. Starting from a function of two vector variables $\varphi: M_1 \times M_2 \to \mathbb{R}$, $(x, y) \mapsto \varphi(x, y)$, and taking the infimum over one variable, say y, we build the function $m(x) = \inf_y \varphi(x, y)$, which is called a minimum function.
A minimum function is usually specified by a point-to-set mapping a from the manifold $M_1$ into the subsets of the manifold $M_2$ and by a functional $\varphi(x, y)$ on $M_1 \times M_2$, namely $m(x) = \inf_{y \in a(x)} \varphi(x, y)$. In this context, some differential properties of such functions were previously examined in [4]. Now we add new properties related to monotonicity and convexity.
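As a small illustration of the convexity idea, here is a numeric sketch (the sample functional φ and the grids are assumptions) of the known fact that joint convexity of φ makes the minimum function m convex:

```python
# Minimum function m(x) = min_y phi(x, y) for a jointly convex sample phi;
# convexity of m is checked via nonnegative discrete second differences.
ys = [j / 100.0 for j in range(-500, 501)]

def phi(x, y):
    return (x - y)**2 + y**2        # jointly convex in (x, y)

def m(x):
    return min(phi(x, y) for y in ys)

xs = [i / 10.0 for i in range(-20, 21)]
values = [m(x) for x in xs]
# discrete convexity of m on the grid
second_diffs = [values[k - 1] - 2.0 * values[k] + values[k + 1]
                for k in range(1, len(values) - 1)]
```

For this φ the infimum is attained at y = x/2, so m(x) = x²/2, which the grid values reproduce.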