## Abstract

Bilevel optimization is a special kind of optimization in which one problem is embedded within another. The outer task is commonly referred to as the upper-level optimization task, and the inner task as the lower-level optimization task; these problems involve two kinds of variables, upper-level variables and lower-level variables. Bilevel optimization first appeared in the field of game theory, when the German economist Heinrich von Stackelberg published a book (1934) describing this hierarchical problem. Bilevel optimization problems are now commonly found in a number of real-world settings: transportation, economics, decision science, business, engineering, and so on. In this chapter, we provide a general formulation for the bilevel disjunctive optimization problem on affine manifolds. Such a problem contains two levels of optimization tasks, one nested within the other: the outer problem is commonly referred to as the leader's (upper-level) optimization problem, and the inner problem is known as the follower's (lower-level) optimization problem. The two levels have their own objectives and constraints. Topics covered include affine convex functions, optimization with auto-parallel restrictions, affine convexity of posynomial functions, the bilevel disjunctive problem and an algorithm for it, models of bilevel disjunctive programming problems, and properties of minimum functions.

### Keywords

- convex programming
- affine manifolds
- optimization along curves
- bilevel disjunctive optimization
- minimum functions
- Mathematics Subject Classification 2010: 90C25, 90C29, 90C30

## 1. Affine convex functions

In optimization problems [16, 17, 19, 23, 24, 25, 26, 27], one can use an *affine manifold* as a pair

They are used for defining the convexity of subsets in

**Definition 1.1** *An affine manifold* *is called autoparallely complete if any auto-parallel* *starting at* *is defined for all values of the parameter*

**Theorem 1.1** [1] *Let* *be a (Hausdorff, connected, smooth) compact* *-manifold endowed with an affine connection* *and let* *. If the holonomy group* *(regarded as a subgroup of the group* *of all the linear automorphisms of the tangent space* *) has compact closure, then* *is autoparallely complete.*

Let

**Definition 1.2** *A* *function* *is called:*

*(1) linear affine with respect to* *if* *, throughout;*

*(2) affine convex (convex with respect to* *) if* *(positive semidefinite), throughout.*

The function

**Theorem 1.2** *If there exists a linear affine nonconstant function* *on* *, then the curvature tensor field* *is in*

*Proof.* For given

as a PDEs system (a particular case of a Frobenius-Mayer system of PDEs) with

Since,

it follows

**Corollary 1.1** *If there exists* *linear affine functions* *on* *, whose* *are linearly independent, then* *is flat, that is,*

Of course, this only means that the curvature tensor vanishes on the topologically trivial region we used to set up our co-vector fields.

**Remark 1.1** *There is actually no need to extend* *to the entire manifold. If this could be done, then* *would now be everywhere nonzero co-vector fields; but there are topologies, for example,* *, for which we know such things do not exist. Therefore, there are topological manifolds for which we are forced to work on topologically trivial regions.*

The following theorem is well known [16, 17, 19, 23]. Due to its importance, we now offer new proofs (based on catastrophe theory, on decomposing a tensor into a specific product, and on using slackness variables).

**Theorem 1.3** *Let* *be a* *function.*

*(1) If* *is regular or has only one minimum point, then there exists a connection* *such that* *is affine convex.*

*(2) If* *has a maximum point* *, then there is no connection* *making* *affine convex throughout.*

*Proof.* For the Hessian

The first central idea of the proof is to use catastrophe theory, since almost all families

We eliminate the case with a maximum point, that is, Morse

At any critical point

**A direct proof based on decomposition of a tensor:** Let

Suppose

where

and the tensor

**Remark 1.2** *The connection* *is strongly dependent on both the function* *and the tensor field*

Suppose

Here we cannot plug in the point

To contradict, we fix an auto-parallel

But this result depends on the direction

In some particular cases, we can eliminate the dependence on the vector

are sufficient to do this.

A particular condition for independence on

Under this particular condition, we can show how to build connections of the previous type that are valid everywhere.

### 1.1. Illuminating through examples

Let us illuminate our previous statements with the following examples.

**Example 1.1** *(for the first part of the theorem) Let us consider the function* *and* *. Then* *and* *has no critical point. Moreover, the Euclidean Hessian of* *is not positive semi-definite overall. Let us make the above construction for* *. Taking* *, we obtain the connection*

*that is not unique.*

**Example 1.2** *(for one minimum point) Let us consider the function* *and* *. Then* *and* *has a unique critical minimum point* *. However, the Euclidean Hessian of* *is not positive semi-definite overall. We repeat the previous reasoning for* *. Hence we obtain*

*Observe that* *. Hence take*

The next example shows what happens if we step outside the conditions of the previous theorem.

**Example 1.3** *Let us take the function* *, where the critical point* *is an inflection point. We take* *, which is not defined at the critical point* *, but the relation of convexity is realized by prolongation,*

*Let us consider the ODE of auto-parallels*

*The solutions*

*are auto-parallels on* *, where* *are real solutions of* *. These curves are extended at* *by continuity. The manifold* *is not auto-parallely complete. Since the image* *is not a “segment”, the function* *is not globally convex.*

**Remark 1.3** *For* *, there exist* *functions* *which have two minimum points without having any other extremum point. For example,*

*has two (global) minimum points*

*The restriction*

*is difference of two affine convex functions (see Section 2).*

Our chapter also draws on ideas from: [3] (convex mappings between Riemannian manifolds), [7] (geometric modeling in probability and statistics), [13] (arc length in metric and Finsler manifolds), [14] (applications of the Hahn-Banach principle to moment and optimization problems), [21] (geodesic connectedness of semi-Riemannian manifolds), and [28] (tangent and cotangent bundles). For algorithms, we recommend the paper [20] (sequential and parallel algorithms).
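The explicit two-minima function of Remark 1.3 is elided above, so here is a minimal numerical sketch of the phenomenon using a classical example of our own choosing (not necessarily the chapter's): f(x, y) = (x²y − x − 1)² + (x² − 1)² is smooth on the plane, and a short gradient computation shows its only critical points are (1, 2) and (−1, 0), both global minima, with no other extremum.

```python
# Hypothetical illustration of Remark 1.3 (the chapter's own example is elided):
# a smooth function on R^2 with exactly two critical points, both global minima.
def f(x, y):
    return (x * x * y - x - 1) ** 2 + (x * x - 1) ** 2

def grad(x, y):
    # Hand-computed gradient of f.
    gx = 2 * (x * x * y - x - 1) * (2 * x * y - 1) + 2 * (x * x - 1) * (2 * x)
    gy = 2 * (x * x * y - x - 1) * (x * x)
    return gx, gy

# Setting gy = 0 forces x^2 y = x + 1 (x = 0 is impossible since then gx = 2),
# and then gx = 0 forces x^2 = 1; hence the only critical points are:
minima = [(1.0, 2.0), (-1.0, 0.0)]
for (x, y) in minima:
    assert f(x, y) == 0.0                  # both are global minima (f >= 0)
    assert grad(x, y) == (0.0, 0.0)        # both are critical points
```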

## 2. Optimizations with autoparallel restrictions

### 2.1. Direct theory

The auto-parallel curves

Obviously, the complete notation is

**Definition 2.1** *Let* *be open and connected and* *a* *function. The point* *is called minimum (maximum) point of* *conditioned by the auto-parallel system, together with initial conditions, if for the maximal solution* *, there exists a neighborhood* *of* *such that*

**Theorem 2.1** *If* *is an extremum point of* *conditioned by the previous second order system, then*

**Definition 2.2** *The points* *which are solutions of the equation* *are called critical points of* *conditioned by the previous spray.*

**Theorem 2.2** *If* *is a conditioned critical point of the function* *of class* *constrained by the previous auto-parallel system and if the number*

*is strictly positive (negative), then* *is a minimum (maximum) point of* *constrained by the auto-parallel system.*

**Example 2.1** *We compute the Christoffel symbols on the unit sphere* *, using spherical coordinates* *and the Riemannian metric*

*When* *, we find*

*and all the other* *s are equal to zero. We can show that the apparent singularity at* *can be removed by a better choice of coordinates at the poles of the sphere. Thus, the above affine connection extends to the whole sphere.*

*The second order system defining auto-parallel curves (geodesics) on* *is*

*The solutions are great circles on the sphere. For example,* *and* *= const.*

*We compute the curvature tensor* *of the unit sphere* *. Since there are only two independent coordinates, all the non-zero components of curvature tensor* *are given by* *, where* *. We get* *and the other components are*

*Let* *be the maximal auto-parallel which satisfies* *. We wish to compute* *with the restriction*

*Since* *, the critical point condition* *becomes* *. Consequently, the critical points are either* *, or* *, or*

*The components of the Hessian of* *are*

*At the critical points* *or* *, the Hessian of* *is positive or negative semi-definite. On the other hand, along* *, we find* *. Consequently, each point* *is a minimum point of* *along each auto-parallel starting from the given point and tangent to*
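The Christoffel symbols quoted in Example 2.1 are elided above, but they can be recovered numerically. The sketch below assumes the standard round metric g = diag(1, sin²θ) on the unit sphere in spherical coordinates (θ, φ) and evaluates the textbook formula Γᵏᵢⱼ = ½ gᵏˡ(∂ᵢgⱼˡ + ∂ⱼgᵢˡ − ∂ˡgᵢⱼ) by finite differences, confirming the classical values Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ.

```python
import math

def metric(theta, phi):
    # Round metric of the unit sphere in spherical coordinates (theta, phi).
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def christoffel(theta, phi, h=1e-6):
    pts = (theta, phi)
    def g(q):
        return metric(*q)
    def dg(l):
        # Central finite difference of the metric w.r.t. coordinate l.
        qp, qm = list(pts), list(pts)
        qp[l] += h; qm[l] -= h
        gp, gm = g(qp), g(qm)
        return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
                for i in range(2)]
    d = [dg(0), dg(1)]
    ginv = [[1.0, 0.0], [0.0, 1.0 / math.sin(theta) ** 2]]
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for k in range(2):
        for i in range(2):
            for j in range(2):
                s = 0.0
                for l in range(2):
                    s += 0.5 * ginv[k][l] * (d[i][j][l] + d[j][i][l] - d[l][i][j])
                gamma[k][i][j] = s
    return gamma

G = christoffel(1.0, 0.5)
# Classical nonzero symbols: Γ^θ_φφ = -sin θ cos θ, Γ^φ_θφ = Γ^φ_φθ = cot θ.
assert abs(G[0][1][1] - (-math.sin(1.0) * math.cos(1.0))) < 1e-5
assert abs(G[1][0][1] - (math.cos(1.0) / math.sin(1.0))) < 1e-5
```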

### 2.2. Theory via the associated spray

This point of view regarding extrema comes from paper [22].

The second order system of auto-parallels induces a spray (special vector field)

The solutions

**Definition 2.3** *Let* *be open and connected and* *a* *function. The point* *is called minimum (maximum) point of* *conditioned by the previous spray, if for the maximal field line* *, there exists a neighborhood* *of* *such that*

**Theorem 2.3** *If* *is an extremum point of* *conditioned by the previous spray, then* *is a point where* *is in*

**Definition 2.4** *The points* *which are solutions of the equation*

*are called critical points of* *conditioned by the previous spray.*

**Theorem 2.4** *If* *is a conditioned critical point of the function* *of class* *constrained by the previous spray and if the number*

*is strictly positive (negative), then* *is a minimum (maximum) point of* *constrained by the spray.*

**Example 2.2** *We consider the Volterra-Hamilton ODE system* [2].

*which models production in a Gause-Witt* *-species evolving in* *: (1) competition if* *and (2) parasitism if*

*Changing the real parameter* *into an affine parameter* *, we find the connection with constant coefficients*

*Let* *be the maximal field line which satisfies* *. We wish to compute* *with the restriction*

*We apply the previous theory. Introduce the vector field*

*We set the critical point condition* *. Since* *, we obtain the relation* *, that is, the critical point set is a conic in*

*Since* *, the sufficiency condition is reduced to* *, that is,*

*This last relation is equivalent either to*

*or to*

*Each critical point satisfying one of the last two conditions is a maximum point.*

## 3. Affine convexity of posynomial functions

For the general theory regarding geometric programming (based on posynomial, signomial functions, etc.), see [11].

**Theorem 3.1** *Each posynomial function is affine convex, with respect to some affine connection.*

*Proof.* A posynomial function has the form

where all the coefficients

joining the points

It follows

One term in this sum is of the form

**Remark 3.1** *Posynomial functions belong to the class of functions satisfying the statement "the product of two convex functions is convex".*
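A related classical fact from geometric programming, which can be checked numerically, is that under the substitution xᵢ = exp(yᵢ) a posynomial becomes a convex function of y (each term is the exponential of an affine function). The coefficients and exponents in the sketch below are illustrative, not taken from the chapter.

```python
import math, random

# Illustrative posynomial: all coefficients positive, arbitrary real exponents.
def posynomial(x1, x2):
    return 3.0 * x1 ** 2 * x2 ** -1.5 + 5.0 * x1 ** 0.5 * x2

def g(y1, y2):
    # The posynomial in log coordinates x_i = exp(y_i).
    return posynomial(math.exp(y1), math.exp(y2))

# Midpoint-convexity spot check: g(mid) <= (g(a) + g(b)) / 2 for random pairs.
random.seed(0)
for _ in range(1000):
    a = (random.uniform(-2, 2), random.uniform(-2, 2))
    b = (random.uniform(-2, 2), random.uniform(-2, 2))
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    assert g(*mid) <= 0.5 * (g(*a) + g(*b)) + 1e-9
```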

**Corollary 3.1** *Each signomial function is difference of two affine convex posynomials, with respect to some affine connection.*

*Proof.* A signomial function has the form

where all the exponents

we apply the Theorem and the implication

**Corollary 3.2** *(1) The polynomial functions with positive coefficients, restricted to* *, are affine convex functions.*

*(2) The polynomial functions with positive and negative terms, restricted to* *, are differences of two affine convex functions.*

Proudnikov [18] gives necessary and sufficient conditions for representing a Lipschitz function of several variables as a difference of two convex functions. An algorithm and a geometric interpretation of this representation are also given. The outcome of this algorithm is a sequence of pairs of convex functions that converges uniformly to a pair of convex functions when the conditions of the formulated theorems are satisfied.
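A difference-of-convex (DC) decomposition in the spirit of [18] is easy to exhibit on an interval (this particular split is our own illustration): on [−1, 1] the nonconvex function f(x) = x³ can be written as f = g − h with g(x) = x³ + 3x² and h(x) = 3x², both convex there since g″(x) = 6x + 6 ≥ 0 and h″ = 6.

```python
# DC decomposition of f(x) = x^3 on [-1, 1]: f = g - h with g, h convex there.
def g(x): return x ** 3 + 3 * x ** 2
def h(x): return 3 * x ** 2

xs = [i / 100 for i in range(-100, 99)]
step = 0.01
for x in xs:
    # Nonnegative discrete second differences certify convexity on the grid.
    assert g(x - step) - 2 * g(x) + g(x + step) >= -1e-12
    assert h(x - step) - 2 * h(x) + h(x + step) >= -1e-12
    # The decomposition reproduces x^3 exactly.
    assert abs((g(x) - h(x)) - x ** 3) < 1e-12
```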

## 4. Bilevel disjunctive problem

Let *leader decision affine manifold*, and *follower decision affine manifold*, be two connected affine manifolds of dimension *leader objective function*, and let *follower multiobjective function*.

The components

A *bilevel optimization problem* means a decision of leader with regard to a multi-objective optimum of the follower (in fact, a constrained optimization problem whose constraints are obtained from optimization problems). For details, see [5, 10, 12].

The *disjunctive solution set of a follower multiobjective optimization problem* is defined by

(1) the set-valued function

where

or

(2) the set-valued function

where

We deal with two bilevel problems:

(1) The *optimistic bilevel disjunctive problem*

In this case, the follower cooperates with the leader; that is, for each

(2) The *pessimistic bilevel disjunctive problem*

In this case, there is no cooperation between the leader and the follower, and the leader expects the worst scenario; that is, for each

So, a general optimization problem becomes a pessimistic bilevel problem.
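A discretized toy problem (entirely our own illustration, on real intervals rather than manifolds) makes the optimistic/pessimistic distinction concrete: when the follower's argmin set is not a singleton, the two bilevel problems can have different values.

```python
# Follower: minimize f(x, y) = x*y over y in [0, 1] (grid).
# Leader:   F(x, y) = x^2 + y, with leader decisions x in {-1, 0, 1}.
ys = [i / 100 for i in range(101)]
xs = [-1.0, 0.0, 1.0]

def follower_argmin(x):
    best = min(x * y for y in ys)
    return [y for y in ys if abs(x * y - best) < 1e-12]

F = lambda x, y: x * x + y

# Optimistic: the follower cooperates, picking the best y from its argmin set.
optimistic = min(min(F(x, y) for y in follower_argmin(x)) for x in xs)
# Pessimistic: the leader expects the worst y from the follower's argmin set.
pessimistic = min(max(F(x, y) for y in follower_argmin(x)) for x in xs)

# At x = 0 the follower is indifferent (argmin = [0, 1]), creating a gap.
assert optimistic == 0.0   # x = 0, cooperative follower picks y = 0
assert pessimistic == 1.0  # worst case at x = 0 is y = 1; x = 1 also gives 1
```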

**Theorem 4.1** *The value*

*exists if and only if, for an index* *, the minimum* *exists and, for each* *, either* *exists or* *. In this case,*

*coincides with the minimum of those minima that exist.*

*Proof.* Let us consider the multi-functions

Taking minimum of minima that exist, we find

**Theorem 4.2** *Suppose* *is a compact manifold. If for each* *, at least one partial function* *is affine convex and has a critical point, then the problem* *has a solution.*

*Proof.* In our hypothesis, the set

□

In the next theorem, we shall use the Value Function Method (also called the Utility Function Method).

**Theorem 4.3** *If a* *increasing scalarization partial function*

*has a minimum, then there exists an index* *such that* *. Moreover, if* *is bounded, then the bilevel problem*

*has a solution.*

*Proof.* Let

Boundedness of

### 4.1. Bilevel disjunctive programming algorithm

An important concept for making wise trade-offs among competing objectives is bilevel disjunctive programming optimality on affine manifolds, introduced in this chapter.

We now present an exact algorithm for obtaining the bilevel disjunctive solutions of the multi-objective optimization problem.

**Step 1**: Solve

Let

**Step 2**: Build the mapping

**Step 3**: Solve the following leader’s program

From a numerical point of view, one can use the Newton algorithm for optimization on affine manifolds given in [19].
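The three steps above can be sketched on a discretized toy problem (all names and data below are illustrative assumptions, with two follower objectives and the disjunctive solution set taken as the union of their argmin sets):

```python
xs = [i / 10 for i in range(11)]          # leader grid on [0, 1]
ys = [i / 10 for i in range(11)]          # follower grid on [0, 1]
f1 = lambda x, y: (y - x) ** 2            # follower objective 1
f2 = lambda x, y: (y - 1 + x) ** 2        # follower objective 2
F  = lambda x, y: x + y                   # leader objective

def argmins(f, x):
    # Step 1: solve one follower problem for a fixed leader decision x.
    best = min(f(x, y) for y in ys)
    return {y for y in ys if abs(f(x, y) - best) < 1e-12}

# Step 2: build the disjunctive set-valued map Y(x) = argmin f1 ∪ argmin f2.
Y = {x: argmins(f1, x) | argmins(f2, x) for x in xs}

# Step 3: solve the leader's program over the graph of Y (optimistic version).
best_x, best_y = min(((x, y) for x in xs for y in Y[x]), key=lambda p: F(*p))
assert (best_x, best_y) == (0.0, 0.0)
```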

## 5. Models of bilevel disjunctive programming problems

The manifold

**Example 5.1** *Let us solve the problem (see* [7]*, p. 7;* [9]*):*

*subject to*

*Both the lower and the upper level optimization tasks have two objectives each. For a fixed* *value, the feasible region of the lower-level problem is the area inside a circle with center at origin* *and radius equal to* *. The Pareto-optimal set for the lower-level optimization task, preserving a fixed* *, is the bottom-left quarter of the circle,*

*The linear constraint in the upper-level optimization task does not allow the entire quarter circle to be feasible for some* *. Thus, at most a couple of points from the quarter circle belong to the Pareto-optimal set of the overall problem. Eichfelder [8] reported the following Pareto-optimal set of solutions*

*The Pareto-optimal front in* *space can be written in parametric form*

**Example 5.2** *Consider the bilevel programming problem*

*where the set-valued function is*

*Explicitly,*

*Since* *, we get*

*on the regions where the functions are defined.*

*Taking into account* *and* *, it follows that* *is the unique optimistic optimal solution of the problem. Now, if the leader is not exact enough in choosing his solution, then the real outcome of the problem has an objective function value above* *, which is far from the optimistic optimal value of zero.*

**Example 5.3** *Let* *and a Pareto disjunctive problem*

*Then a bilevel disjunctive programming problem of the following form appears:*

*This problem is interesting except in the case* *. If* *and* *are convex functions, then*

*To write an example, we use*

*and we consider a bilevel disjunctive programming problem of the form*

*with*

*where*

*The objective* *and the multi-function* *produce a multi-function*

*In context, we find the inferior envelope*

*and then*

*Since* *, the unique optimal solution is*

*If we consider only* *as active, then the unique optimal solution* *is maintained. If* *is active, then the optimal solution is*

## 6. Properties of minimum functions

Let *leader decision affine manifold*, and *follower decision affine manifold*, be two connected affine manifolds of dimension

and taking the infimum over one variable, say

which is called *minimum function*.

A *minimum function* is usually specified by a pointwise mapping
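The defining formulas are elided above, but the construction is easy to sketch numerically. In the code below, K = [0, 1] plays the role of the compact factor and f is an assumed sample function whose partial derivative in the first variable is 1 + y > 0, so the monotonicity of the minimum function predicted by Theorem 6.1 is visible.

```python
# Minimum function φ(x) = min_{y in K} f(x, y) over a compact grid K = [0, 1].
ys = [i / 200 for i in range(201)]
f = lambda x, y: (y - 0.5) ** 2 + x * (1 + y)   # ∂f/∂x = 1 + y > 0 everywhere
phi = lambda x: min(f(x, y) for y in ys)

# Sample φ on [0, 1] and check it is strictly increasing, as Theorem 6.1 says.
values = [phi(i / 50) for i in range(51)]
assert all(a < b for a, b in zip(values, values[1:]))
```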

First we give a new proof of Brian White’s theorem (see his *Mean Curvature Flow* notes, p. 7, online, 2017).

**Theorem 6.1** *Suppose that* *is compact,* *and* *. Let* *. If, for each* *with* *, we have* *, then* *is an increasing function.*

*Proof.* We shall prove the statement in three steps.

(1) *If* *is continuous, then* *is (uniformly) continuous.*

Indeed,

On one hand, if we put

Hence

On the other hand,

Hence

Finally,

(2) *Let us fix* *. If* *and* *, then there exists* *such that* *, for any*

Suppose

If

(3) *is an increasing function.*

Let

**Remark** The third step shows that a function having properties (1) and (2) is increasing. For this, continuity is essential; property (2) alone is not enough. For example, the function defined by

**Remark** Suppose that

Consequently,

□

**Example 6.1** *The single-time perspective of a function* *is the function* *. The single-time perspective* *is convex if* *is convex.*

*The single-time perspective is an example verifying Theorem 6.1. Indeed, the critical point condition for* *, in* *, gives* *, where* *is a critical point of* *. Consequently,* *. On the other hand, at the minimum point, we have* *. Then* *is increasing if* *, as in* *Theorem 6.1*.
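The classical fact behind Example 6.1, that the perspective g(t, x) = t f(x/t), t > 0, of a convex f is jointly convex, can be spot-checked numerically. Here f(x) = x² (our choice for illustration), so g(t, x) = x²/t, a standard convex function on t > 0.

```python
import random

f = lambda x: x * x                 # a convex base function
g = lambda t, x: t * f(x / t)       # its perspective, g(t, x) = x^2 / t

# Midpoint-convexity spot check over random pairs with t bounded away from 0.
random.seed(1)
for _ in range(1000):
    t1, x1 = random.uniform(0.1, 3), random.uniform(-3, 3)
    t2, x2 = random.uniform(0.1, 3), random.uniform(-3, 3)
    tm, xm = (t1 + t2) / 2, (x1 + x2) / 2
    assert g(tm, xm) <= 0.5 * (g(t1, x1) + g(t2, x2)) + 1e-9
```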

**Theorem 6.2** *Suppose that* *is compact and* *. Let* *. If, for each* *with* *, we have* *, then* *is a partially increasing function.*

*Proof.* Suppose that

Consequently,

**Theorem 6.3** *Suppose that* *is compact and* *. Let* *. If, for each* *with* *, we have* *, then* *is an affine concave function.*

*Proof.* Without loss of generality, we work in the Euclidean case. Suppose that

Taking the partial derivative with respect to

On the other hand

□
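The mechanism behind Theorem 6.3 is that a pointwise infimum of concave functions is concave. A numerical sketch with assumed data: if f(·, y) is concave in the first variable for every y, then φ(x) = minᵧ f(x, y) inherits concavity.

```python
# f(., y) is concave for each fixed y, so φ(x) = min_y f(x, y) is concave.
ys = [i / 20 for i in range(21)]
f = lambda x, y: -((x - y) ** 2)          # concave in x for every fixed y
phi = lambda x: min(f(x, y) for y in ys)

# Midpoint-concavity check of φ on a grid in [0, 1].
xs = [i / 50 for i in range(51)]
for a in xs:
    for b in xs:
        m = (a + b) / 2
        assert phi(m) >= 0.5 * (phi(a) + phi(b)) - 1e-9
```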

**Theorem 6.4** *Let* *be a* *function and*

*If the set* *is affine convex and* *is affine convex, then* *is affine convex.*

*Proof.* Suppose