Open access peer-reviewed chapter

# Stochastic Leader-Follower Differential Game with Asymmetric Information

By Jingtao Shi

Submitted: December 18th 2017. Reviewed: February 14th 2018. Published: September 26th 2018.

DOI: 10.5772/intechopen.75413

## Abstract

In this chapter, we discuss a leader-follower (also called Stackelberg) stochastic differential game with asymmetric information. Here the word "asymmetric" means that the information available to the follower is some sub-σ-algebra of that available to the leader, though the two players play different roles, as in the classical literature. The Stackelberg equilibrium is represented by stochastic versions of Pontryagin's maximum principle and a verification theorem with partial information. As an application, a linear-quadratic (LQ) leader-follower stochastic differential game with asymmetric information is studied. If a certain system of Riccati equations is solvable, the Stackelberg equilibrium admits a state feedback representation.

### Keywords

• backward stochastic differential equation (BSDE)
• asymmetric information
• stochastic filtering
• Stackelberg equilibrium

## 1. Introduction

Throughout this chapter, we denote by $\mathbb{R}^n$ the Euclidean space of $n$-dimensional vectors, by $\mathbb{R}^{n\times d}$ the space of $n\times d$ matrices, and by $\mathcal{S}^n$ the space of $n\times n$ symmetric matrices. $\langle\cdot,\cdot\rangle$ and $|\cdot|$ denote the scalar product and the norm in Euclidean space, respectively. The superscript $\top$ denotes the transpose of a matrix. For a differentiable function $f$, $f_x$ and $f_{xx}$ denote its first- and second-order partial derivatives with respect to $x$.

### 1.1. Motivation

In practice, there are many problems which motivate us to study the leader-follower stochastic differential games with asymmetric information. Here we present two examples.

Example 1.1: (Continuous-time principal-agent problem) The principal contracts with the agent to manage a production process, whose cumulative proceeds (or output) $Y_t$ evolve on $[0,T]$ as follows:

$$dY_t=Be_t\,dt+\sigma\,dW_t+\tilde\sigma\,d\tilde W_t,\qquad Y_0\in\mathbb{R},\tag{1}$$

where $e_t\in\mathbb{R}$ is the agent's effort choice, $B$ represents the productivity of effort, and there are two additive shocks (due to the two independent Brownian motions $W,\tilde W$) to the output. The proceeds of the production add to the principal's asset $y_t$, which earns a risk-free return $r$, and out of which he pays the agent $s_t\in\mathbb{R}$ and withdraws his own consumption $d_t\in\mathbb{R}$. Thus the principal's asset evolves as

$$dy_t=\big[ry_t+Be_t-s_t-d_t\big]dt+\sigma\,dW_t+\tilde\sigma\,d\tilde W_t,\qquad y_0\in\mathbb{R},\tag{2}$$

where $y_0$ is the initial asset. The agent has his own wealth $m_t$, out of which he consumes $c_t$; thus

$$dm_t=\big[rm_t+s_t-c_t\big]dt+\bar\sigma\,dW_t+\tilde{\bar\sigma}\,d\tilde W_t,\qquad m_0\in\mathbb{R}.\tag{3}$$

Thus, the agent earns the same rate of return $r$ on his savings, receives the income flow $s_t$, and draws down wealth to consume. In the above, $\sigma,\tilde\sigma,\bar\sigma,\tilde{\bar\sigma}$ are all constants. At the terminal time $T$, the principal makes a final payment $s_T$, and the agent chooses consumption based on this payment and his terminal wealth $m_T$. In the above, we restrict $y_t,s_t,d_t$ to be nonnegative.

We consider an optimal implementable contract problem in the so-called "hidden savings" information structure (Williams [1]; see also Williams [2]). In this problem, the principal can observe his asset $y_t$ and the agent's initial wealth $m_0$, but cannot monitor the agent's effort $e_t$, consumption $c_t$, or wealth $m_t$ for $t>0$. The principal must provide incentives for the agent to put forth the desired amount of effort. For any $(s_t,d_t)$, the agent first chooses his effort $e_t$ and consumption $c_t$ such that his exponential preference

$$J_1(e,c,s,d)=\mathbb{E}\left[\int_0^T-e^{-\rho t}\exp\Big(-\lambda\Big(c_t-\frac12 e_t^2\Big)\Big)dt-e^{-\rho T}\exp\big(-\lambda(s_T+m_T)\big)\right]\tag{4}$$

is maximized. Here $\rho>0$ is the discount rate and $\lambda>0$ denotes the risk-aversion parameter. The pair $(e_t,c_t)$ is called an implementable contract if it meets the actions recommended by the principal, which are based on the principal's observable wealth $y_t$. Then, the principal selects his payment $s_t$ and consumption $d_t$ to maximize his exponential preference

$$J_2(e,c,s,d)=\mathbb{E}\left[\int_0^T-e^{-\rho t}\exp(-\lambda d_t)\,dt-e^{-\rho T}\exp\big(-\lambda(y_T-s_T)\big)\right].\tag{5}$$

Let $\mathcal{F}_t$ denote the $\sigma$-algebra generated by the Brownian motions $(W_s,\tilde W_s)$, $0\le s\le t$. Intuitively, $\mathcal{F}_t$ contains all the information up to time $t$. Let $\mathcal{G}_{1,t}$ contain the information available to the agent, and $\mathcal{G}_{2,t}$ the information available to the principal, up to time $t$, respectively. Moreover, $\mathcal{G}_{1,t}\subseteq\mathcal{G}_{2,t}$. In the game problem, the agent first solves the following optimization problem:

$$J_1(e^*,c^*,s,d)=\max_{e,c}J_1(e,c,s,d),\tag{6}$$

where $(e,c)$ is a $\mathcal{G}_{1,t}$-adapted process pair. Then the principal solves the following optimization problem:

$$J_2(e^*,c^*,s^*,d^*)=\max_{s,d}J_2(e^*,c^*,s,d),\tag{7}$$

where $(s,d)$ is a $\mathcal{G}_{2,t}$-adapted process pair. This formulates a stochastic Stackelberg differential game with asymmetric information. In this setting, the agent is the follower and the principal is the leader. Any process quadruple $(e^*,c^*,s^*,d^*)$ satisfying the above two equalities is called a Stackelberg equilibrium. In Williams [1], a solvable continuous-time principal-agent model is considered under three information structures (full information, hidden actions, and hidden savings), and the corresponding optimal contract problems are solved explicitly. But that framework does not cover our model.
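The linear wealth dynamics (1)-(3) are straightforward to simulate. The following Euler–Maruyama sketch (all parameter values and the constant open-loop controls $e,s,c,d$ are hypothetical, chosen only for illustration — the optimal contract itself is not computed here) shows how the two common shocks $W,\tilde W$ drive both the principal's asset and the agent's wealth:

```python
import numpy as np

# Euler-Maruyama simulation of the principal's asset (2) and the agent's
# wealth (3) under constant controls.  All parameter values are hypothetical.
rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 200, 5000
dt = T / n_steps
r, B = 0.03, 0.5                  # risk-free rate, productivity of effort
sigma, sigma_t = 0.2, 0.1         # shocks (sigma, sigma~) on the output
sigma_b, sigma_bt = 0.15, 0.05    # shocks (sigma-bar, sigma-bar~) on the agent
e, s, c, d = 1.0, 0.3, 0.2, 0.1   # constant effort, payment, consumptions

y = np.full(n_paths, 1.0)         # principal's asset y_t
m = np.full(n_paths, 0.5)         # agent's wealth m_t
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    dWt = rng.normal(0.0, np.sqrt(dt), n_paths)   # second Brownian motion
    y += (r * y + B * e - s - d) * dt + sigma * dW + sigma_t * dWt
    m += (r * m + s - c) * dt + sigma_b * dW + sigma_bt * dWt

# Sample means should track the deterministic mean ODEs of (2)-(3).
print(y.mean(), m.mean())
```

Because both equations are linear with constant controls, the sample means should stay close to the solutions of the corresponding mean ODEs, which is a cheap sanity check on the discretization.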

Example 1.2: (Continuous-time manufacturer-newsvendor problem) Let $D$ be the demand rate for a product in the market, which satisfies

$$dD_t=(a-\mu D_t)dt+\sigma\,dW_t+\tilde\sigma\,d\tilde W_t,\qquad D_0=d_0\in\mathbb{R},\tag{8}$$

where $a,\mu,\sigma,\tilde\sigma$ are constants. Suppose that the market consists of a manufacturer selling the product to end users through a retailer. At time $t$, the retailer chooses an order rate $q_t$ for the product and decides its retail price $R_t$, and is offered a wholesale price $w_t$ by the manufacturer. We assume that items can be salvaged at unit price $S\ge 0$, and that items cannot be stored, that is, they must be sold instantly or salvaged. The retailer then obtains an expected profit

$$J_1(q,R,w)=\mathbb{E}\int_0^T\Big[(R_t-S)\min(D_t,q_t)-(w_t-S)q_t\Big]dt.\tag{9}$$

When the manufacturer has a fixed production cost per unit $M\ge 0$, he obtains an expected profit

$$J_2(q,R,w)=\mathbb{E}\int_0^T\Big[(w_t-M)q_t-S\max(q_t-D_t,0)\Big]dt.\tag{10}$$

In the above, we assume that $S<M\le w_t\le R_t$.

Let $\mathcal{F}_t$ denote the $\sigma$-algebra generated by $(W_s,\tilde W_s)$, $0\le s\le t$, which contains all the information up to time $t$. At time $t$, the information sets $\mathcal{G}_{1,t},\mathcal{G}_{2,t}$ available to the retailer and the manufacturer, respectively, are both sub-$\sigma$-algebras of $\mathcal{F}_t$. Moreover, $\mathcal{G}_{1,t}\subseteq\mathcal{G}_{2,t}$. This can be explained from the practical point of view. Specifically, the manufacturer chooses a wholesale price $w_t$ at time $t$, which is a $\mathcal{G}_{2,t}$-adapted stochastic process, and the retailer chooses an order rate $q_t$ and a retail price $R_t$ at time $t$, which are $\mathcal{G}_{1,t}$-adapted stochastic processes. For any $w$, to select a $\mathcal{G}_{1,t}$-adapted process pair $(q_w,R_w)$ for the retailer such that

$$J_1(q,R,w)\le J_1(q_w,R_w,w)=\max_{q,R}J_1(q,R,w),\tag{11}$$

and then to select a $\mathcal{G}_{2,t}$-adapted process $w^*$ for the manufacturer such that

$$J_2(q_w,R_w,w)\le J_2\big(q_{w^*},R_{w^*},w^*\big)=\max_{w}J_2(q_w,R_w,w),\tag{12}$$

formulates a leader-follower stochastic differential game with asymmetric information. In this setting, the manufacturer is the leader and the retailer is the follower. Any process triple $(q_{w^*},R_{w^*},w^*)$ satisfying the above is called a Stackelberg equilibrium. In Øksendal et al. [3], a time-dependent newsvendor problem with time-delayed information is solved, based on a stochastic differential game (with jump-diffusion) approach. But it does not cover our model.
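As a quick numerical illustration of Eqs. (8)-(10), the sketch below estimates both expected profits by Monte Carlo under constant (hence trivially adapted) controls $q,R,w$. All parameter values are hypothetical and no optimization is performed:

```python
import numpy as np

# Monte Carlo estimate of the retailer's profit (9) and the manufacturer's
# profit (10), with demand following the mean-reverting dynamics (8).
rng = np.random.default_rng(1)
T, n_steps, n_paths = 1.0, 200, 20000
dt = T / n_steps
a, mu = 2.0, 1.0                  # demand reverts toward a/mu = 2
sigma, sigma_t = 0.3, 0.2
S, M = 0.5, 1.0                   # salvage value, unit production cost
q, R, w = 2.0, 3.0, 1.5           # constant controls with S < M <= w <= R

D = np.full(n_paths, 2.0)         # demand rate, started at its long-run mean
J1 = np.zeros(n_paths)            # retailer's running profit
J2 = np.zeros(n_paths)            # manufacturer's running profit
for _ in range(n_steps):
    J1 += ((R - S) * np.minimum(D, q) - (w - S) * q) * dt
    J2 += ((w - M) * q - S * np.maximum(q - D, 0.0)) * dt
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    dWt = rng.normal(0.0, np.sqrt(dt), n_paths)
    D += (a - mu * D) * dt + sigma * dW + sigma_t * dWt

print(J1.mean(), J2.mean())
```

With these (hypothetical) numbers the order rate equals mean demand, so the concavity of $\min(D_t,q)$ pulls the retailer's profit slightly below its deterministic value — the kind of effect the equilibrium analysis has to account for.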

### 1.2. Problem formulation

Motivated by the examples above, in this chapter we study leader-follower stochastic differential games with asymmetric information. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a complete probability space, on which $(W,\tilde W)$ is a standard $\mathbb{R}^2$-valued Brownian motion; let $(\mathcal{F}_t)_{0\le t\le T}$ be its natural augmented filtration with $\mathcal{F}_T=\mathcal{F}$, where $T>0$ is a finite time horizon. Let the state satisfy the stochastic differential equation (SDE)

$$\begin{cases}dx^{u_1,u_2}_t=b\big(t,x^{u_1,u_2}_t,u_1(t),u_2(t)\big)dt+\sigma\big(t,x^{u_1,u_2}_t,u_1(t),u_2(t)\big)dW_t+\tilde\sigma\big(t,x^{u_1,u_2}_t,u_1(t),u_2(t)\big)d\tilde W_t,\\ x^{u_1,u_2}_0=x_0,\end{cases}\tag{13}$$

where $u_1$ and $u_2$ are the control processes taken by the two players in the game, labeled 1 (the follower) and 2 (the leader), with values in nonempty convex sets $U_1\subseteq\mathbb{R}$ and $U_2\subseteq\mathbb{R}$, respectively. $x^{u_1,u_2}$, the solution to SDE (13) with values in $\mathbb{R}$, is the state process with initial state $x_0\in\mathbb{R}$. Here $b(t,x,u_1,u_2),\sigma(t,x,u_1,u_2),\tilde\sigma(t,x,u_1,u_2):\Omega\times[0,T]\times\mathbb{R}\times U_1\times U_2\to\mathbb{R}$ are given $\mathcal{F}_t$-adapted processes, for each $(x,u_1,u_2)$.

Let us now explain the asymmetric-information character between the follower (player 1) and the leader (player 2). The information available to player 1 at time $t$ is given by some sub-$\sigma$-algebra $\mathcal{G}_{1,t}\subseteq\mathcal{G}_{2,t}$, where $\mathcal{G}_{2,t}$ is the information available to the leader. We assume in this and the next section that $\mathcal{G}_{1,t}\subseteq\mathcal{G}_{2,t}\subseteq\mathcal{F}_t$. We define the admissible control sets $\mathcal{U}_1$ and $\mathcal{U}_2$ of the follower and the leader as the sets of $\mathcal{G}_{1,t}$-adapted processes valued in $U_1$ and $\mathcal{G}_{2,t}$-adapted processes valued in $U_2$, respectively.

The game is initiated with the announcement of the leader's control $u_2\in\mathcal{U}_2$. Knowing this, the follower chooses a $\mathcal{G}_{1,t}$-adapted control $u_1^*=u_1^*(u_2)$ to minimize his cost functional

$$J_1(u_1,u_2)=\mathbb{E}\left[\int_0^Tg_1\big(t,x^{u_1,u_2}_t,u_1(t),u_2(t)\big)dt+G_1\big(x^{u_1,u_2}_T\big)\right].\tag{15}$$

Here $g_1(t,x,u_1,u_2):\Omega\times[0,T]\times\mathbb{R}\times U_1\times U_2\to\mathbb{R}$ is an $\mathcal{F}_t$-adapted process, and $G_1(x):\Omega\times\mathbb{R}\to\mathbb{R}$ is an $\mathcal{F}_T$-measurable random variable, for each $(x,u_1,u_2)$. Thus the follower faces a stochastic optimal control problem with partial information.

SOCPF. For any $u_2\in\mathcal{U}_2$ chosen by the leader, find a $\mathcal{G}_{1,t}$-adapted control $u_1^*=u_1^*(u_2)\in\mathcal{U}_1$ such that

$$J_1(u_1^*,u_2)\equiv J_1\big(u_1^*(u_2),u_2\big)=\inf_{u_1\in\mathcal{U}_1}J_1(u_1,u_2),\tag{16}$$

subject to Eqs. (13) and (15). Such a $u_1^*=u_1^*(u_2)$ is called an optimal control, and the corresponding solution $x^{u_1^*,u_2}$ to Eq. (13) is called an optimal state.

In the next step, knowing that the follower will take such an optimal control $u_1^*=u_1^*(u_2)$, the leader chooses a $\mathcal{G}_{2,t}$-adapted control $u_2$ to minimize his cost functional

$$J_2(u_1^*,u_2)=\mathbb{E}\left[\int_0^Tg_2\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)dt+G_2\big(x^{u_1^*,u_2}_T\big)\right].\tag{17}$$

Here $g_2(t,x,u_1,u_2):\Omega\times[0,T]\times\mathbb{R}\times U_1\times U_2\to\mathbb{R}$ is an $\mathcal{F}_t$-adapted process and $G_2(x):\Omega\times\mathbb{R}\to\mathbb{R}$ is an $\mathcal{F}_T$-measurable random variable, for each $(x,u_1,u_2)$. Thus the leader also faces a stochastic optimal control problem with partial information.

SOCPL. Find a $\mathcal{G}_{2,t}$-adapted control $u_2^*\in\mathcal{U}_2$ such that

$$J_2(u_1^*,u_2^*)=J_2\big(u_1^*(u_2^*),u_2^*\big)=\inf_{u_2\in\mathcal{U}_2}J_2\big(u_1^*(u_2),u_2\big),\tag{18}$$

subject to Eqs. (13) and (17). Such a $u_2^*$ is called an optimal control, and the corresponding solution $x^*\equiv x^{u_1^*,u_2^*}$ to Eq. (13) is called an optimal state. We will restate the problem of the leader in more detail in the next section. We refer to the problem above as a leader-follower stochastic differential game with asymmetric information. If there exists a control process pair $(u_1^*,u_2^*)=\big(u_1^*(u_2^*),u_2^*\big)$ satisfying Eqs. (16) and (18), we call it a Stackelberg equilibrium.

In this chapter, we impose the following assumptions.

(A1.1) For each $\omega\in\Omega$, the functions $b,\sigma,\tilde\sigma,g_1$ are twice continuously differentiable in $(x,u_1,u_2)$; the function $g_2$ is continuously differentiable in $(x,u_1,u_2)$, and $G_1,G_2$ are continuously differentiable in $x$. Moreover, for each $\omega\in\Omega$ and any $(t,x,u_1,u_2)\in[0,T]\times\mathbb{R}\times U_1\times U_2$, there exists a constant $C>0$ such that

$$\big(1+|x|+|u_1|+|u_2|\big)^{-1}\big(|\phi|+|\phi_x|+|\phi_{u_1}|+|\phi_{u_2}|\big)(t,x,u_1,u_2)+\big(|\phi_{xx}|+|\phi_{u_1u_1}|+|\phi_{u_2u_2}|\big)(t,x,u_1,u_2)\le C,\tag{19}$$

for $\phi=b,\sigma,\tilde\sigma$, and

$$\begin{aligned}
&\big(1+|x|^2\big)^{-1}|G_1(x)|+\big(1+|x|\big)^{-1}|G_{1x}(x)|+\big(1+|x|^2\big)^{-1}|G_2(x)|+\big(1+|x|\big)^{-1}|G_{2x}(x)|\le C,\\
&\big(1+|x|^2+|u_1|^2+|u_2|^2\big)^{-1}|g_1(t,x,u_1,u_2)|+\big(1+|x|+|u_1|+|u_2|\big)^{-1}\big(|g_{1x}|+|g_{1u_1}|+|g_{1u_2}|\big)(t,x,u_1,u_2)\\
&\qquad+\big(|g_{1xx}|+|g_{1u_1u_1}|+|g_{1u_2u_2}|\big)(t,x,u_1,u_2)\le C,\\
&\big(1+|x|^2+|u_1|^2+|u_2|^2\big)^{-1}|g_2(t,x,u_1,u_2)|+\big(1+|x|+|u_1|+|u_2|\big)^{-1}\big(|g_{2x}|+|g_{2u_1}|+|g_{2u_2}|\big)(t,x,u_1,u_2)\le C.
\end{aligned}\tag{20}$$

### 1.3. Literature review and contributions of this chapter

Differential games were initiated by Isaacs [4]; they are powerful in modeling dynamic systems in which more than one decision-maker is involved. Differential games have been researched by many scholars and have been applied in biology, economics, and finance. Stochastic differential games are differential games for stochastic systems involving noise terms. See Basar and Olsder [5] for more information about differential games. Recent developments in stochastic differential games can be found in Hamadène [6], Wu [7], An and Øksendal [8], Wang and Yu [9, 10], and the references therein.

The leader-follower stochastic differential game is the stochastic, dynamic formulation of the Stackelberg game, which was introduced by Stackelberg [11] in 1934, where the concept of a hierarchical solution was defined for markets in which some firms have power of domination over others. This solution concept is now known as the Stackelberg equilibrium, which, in the context of two-person nonzero-sum games, involves players with asymmetric roles: one leader and one follower. A pioneering study of stochastic Stackelberg differential games is Basar [12]. Specifically, a leader-follower stochastic differential game begins with the follower aiming to minimize his cost functional in response to the leader's decision over the whole duration of the game. Anticipating the follower's optimal decision as a function of his entire strategy, the leader selects an optimal strategy in advance to minimize his own cost functional, based on the stochastic Hamiltonian system satisfied by the follower's optimal decision. The pair consisting of the leader's optimal strategy and the follower's optimal response is known as the Stackelberg equilibrium.

A linear-quadratic (LQ) leader-follower stochastic differential game was studied by Yong [13] in 2002. There, the coefficients of the system and the cost functionals are random, the diffusion term of the state equation contains the controls, and the weight matrices for the controls in the cost functionals are not necessarily positive definite. The related Riccati equations are derived to give a state feedback representation of the Stackelberg equilibrium in a nonanticipating way. Bensoussan et al. [14] obtained global maximum principles for both open-loop and closed-loop stochastic Stackelberg differential games, where the diffusion term does not contain the controls.

In this chapter, we study a leader-follower stochastic differential game with asymmetric information. Our work distinguishes itself from those mentioned above in the following aspects. (1) In our framework, the information available to the follower is a sub-σ-algebra of that available to the leader. Moreover, both information filtrations could themselves be sub-σ-algebras of the complete information filtration naturally generated by the random noise source. This gives a new explanation for the asymmetric-information feature between the follower and the leader, and endows our problem formulation with more practical meaning in reality. (2) Our work is set in the context of partial information, which is different from that of partial observation (see, e.g., Wang et al. [15]) but related to An and Øksendal [8], Huang et al. [16], and Wang and Yu [10]. (3) An important class of LQ leader-follower stochastic differential games with asymmetric information is proposed and then completely solved; it is a natural generalization of that in Yong [13]. It consists of a stochastic optimal control problem of an SDE with partial information for the follower, followed by a stochastic optimal control problem of a forward-backward stochastic differential equation (FBSDE) with complete information for the leader. This problem is new in differential game theory and has considerable impact in both theoretical analysis and practical applications, although it presents intrinsic mathematical difficulties. (4) The Stackelberg equilibrium of this LQ problem is characterized in terms of forward-backward stochastic differential filtering equations (FBSDFEs), which arise naturally in our setup. These FBSDFEs are new and different from those in [10, 16]. (5) The Stackelberg equilibrium of this LQ problem is given explicitly, with the help of some new Riccati equations.

The rest of this chapter is organized as follows. In Section 2, we solve our problem to find the Stackelberg equilibrium. In Section 3, we apply our theoretical results to an LQ problem. Finally, Section 4 gives some concluding remarks.

## 2. Stackelberg equilibrium

### 2.1. The Follower’s problem

In this subsection, we first solve the SOCPF. For any chosen $u_2\in\mathcal{U}_2$, let $u_1^*$ be an optimal control for the follower and let $x^{u_1^*,u_2}$ be the corresponding optimal state. Define the Hamiltonian function $H_1:\Omega\times[0,T]\times\mathbb{R}\times U_1\times U_2\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ as

$$H_1(t,x,u_1,u_2,q,k,\tilde k)=qb(t,x,u_1,u_2)+k\sigma(t,x,u_1,u_2)+\tilde k\tilde\sigma(t,x,u_1,u_2)-g_1(t,x,u_1,u_2).\tag{21}$$

The adjoint process triple $(q,k,\tilde k)$ satisfies the backward SDE (BSDE)

$$\begin{cases}dq_t=-\big[b_x\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)q_t+\sigma_x\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)k_t+\tilde\sigma_x\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)\tilde k_t-g_{1x}\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)\big]dt+k_t\,dW_t+\tilde k_t\,d\tilde W_t,\\ q_T=-G_{1x}\big(x^{u_1^*,u_2}_T\big).\end{cases}\tag{22}$$

Proposition 2.1 Let (A1.1) hold. For any given $u_2\in\mathcal{U}_2$, let $u_1^*$ be the optimal control for the SOCPF, and let $x^{u_1^*,u_2}$ be the corresponding optimal state. Let $(q,k,\tilde k)$ be the adjoint process triple. Then

$$\mathbb{E}\Big[H_{1u_1}\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t),q_t,k_t,\tilde k_t\big)\big(u_1-u_1^*(t)\big)\,\Big|\,\mathcal{G}_{1,t}\Big]\le 0,\quad\text{a.e. }t\in[0,T],\ \text{a.s.},\tag{23}$$

holds for any $u_1\in U_1$.

Proof Similar to the proof of Theorem 2.1 of [10], we can get the result.

Proposition 2.2 Let (A1.1) hold. For any given $u_2$, let $u_1^*\in\mathcal{U}_1$ and let $x^{u_1^*,u_2}$ be the corresponding state, with adjoint process triple $(q,k,\tilde k)$. Suppose that for each $(t,\omega)\in[0,T]\times\Omega$, $H_1(t,\cdot,\cdot,u_2(t),q_t,k_t,\tilde k_t)$ is concave, $G_1$ is convex, and

$$\mathbb{E}\Big[H_1\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t),q_t,k_t,\tilde k_t\big)\,\Big|\,\mathcal{G}_{1,t}\Big]=\max_{u_1\in U_1}\mathbb{E}\Big[H_1\big(t,x^{u_1^*,u_2}_t,u_1,u_2(t),q_t,k_t,\tilde k_t\big)\,\Big|\,\mathcal{G}_{1,t}\Big],\tag{24}$$

holds for a.e. $t\in[0,T]$, a.s. Then $u_1^*$ is an optimal control for the SOCPF.

Proof Similar to the proof of Theorem 2.3 of [10], we can obtain the result.

### 2.2. The Leader’s problem

In this subsection, we first state the SOCPL; then we give the corresponding maximum principle and verification theorem. For any $u_2\in\mathcal{U}_2$, by Eq. (23), we assume that a functional $u_1^*(t)=u_1^*\big(t,\hat x^{u_1^*,\hat u_2}_t,\hat u_2(t),\hat q_t,\hat k_t,\hat{\tilde k}_t\big)$ is uniquely defined, where

$$\hat x^{u_1^*,\hat u_2}_t\triangleq\mathbb{E}\big[x^{u_1^*,u_2}_t\big|\mathcal{G}_{1,t}\big],\quad\hat u_2(t)\triangleq\mathbb{E}\big[u_2(t)\big|\mathcal{G}_{1,t}\big],\quad\hat q_t\triangleq\mathbb{E}\big[q_t\big|\mathcal{G}_{1,t}\big],\quad\hat k_t\triangleq\mathbb{E}\big[k_t\big|\mathcal{G}_{1,t}\big],\quad\hat{\tilde k}_t\triangleq\mathbb{E}\big[\tilde k_t\big|\mathcal{G}_{1,t}\big].\tag{25}$$

For notational simplicity, we write $x^{u_2}\equiv x^{u_1^*,u_2}$ and define $\phi^L$ on $\Omega\times[0,T]\times\mathbb{R}\times U_2$ by $\phi^L\big(t,x^{u_2}_t,u_2(t)\big)\triangleq\phi\big(t,x^{u_1^*,u_2}_t,u_1^*\big(t,\hat x^{u_1^*,\hat u_2}_t,\hat u_2(t),\hat q_t,\hat k_t,\hat{\tilde k}_t\big),u_2(t)\big)$, for $\phi=b,\sigma,\tilde\sigma,g_1$, respectively. Substituting this $u_1^*$ into Eq. (22), the leader is faced with the controlled FBSDE system

$$\begin{cases}dx^{u_2}_t=b^L\big(t,x^{u_2}_t,u_2(t)\big)dt+\sigma^L\big(t,x^{u_2}_t,u_2(t)\big)dW_t+\tilde\sigma^L\big(t,x^{u_2}_t,u_2(t)\big)d\tilde W_t,\\ dq_t=-\big[b^L_x\big(t,x^{u_2}_t,u_2(t)\big)q_t+\sigma^L_x\big(t,x^{u_2}_t,u_2(t)\big)k_t+\tilde\sigma^L_x\big(t,x^{u_2}_t,u_2(t)\big)\tilde k_t-g^L_{1x}\big(t,x^{u_2}_t,u_2(t)\big)\big]dt+k_t\,dW_t+\tilde k_t\,d\tilde W_t,\\ x^{u_2}_0=x_0,\qquad q_T=-G_{1x}\big(x^{u_2}_T\big).\end{cases}\tag{26}$$

Note that Eq. (26) is a controlled conditional mean-field FBSDE, which is now regarded as the "state" equation of the leader. That is to say, the state of the leader is the quadruple $(x^{u_2},q,k,\tilde k)$.

Remark 2.1 The equality $u_1^*(t)=u_1^*\big(t,\hat x^{u_1^*,\hat u_2}_t,\hat u_2(t),\hat q_t,\hat k_t,\hat{\tilde k}_t\big)$ does not hold in general. However, it is satisfied in the LQ case, and we will make this point clear in the next section.

Define

$$J_2^L(u_2)\triangleq J_2(u_1^*,u_2)=\mathbb{E}\left[\int_0^Tg_2\big(t,x^{u_1^*,u_2}_t,u_1^*(t),u_2(t)\big)dt+G_2\big(x^{u_1^*,u_2}_T\big)\right]=\mathbb{E}\left[\int_0^Tg_2^L\big(t,x^{u_2}_t,u_2(t)\big)dt+G_2\big(x^{u_2}_T\big)\right],\tag{27}$$

where $g_2^L:\Omega\times[0,T]\times\mathbb{R}\times U_2\to\mathbb{R}$. Note that the cost functional of the leader is also of conditional mean-field type. We propose the leader's stochastic optimal control problem with partial information as follows.

SOCPL. Find a $\mathcal{G}_{2,t}$-adapted control $u_2^*\in\mathcal{U}_2$ such that

$$J_2^L(u_2^*)=\inf_{u_2\in\mathcal{U}_2}J_2^L(u_2),\tag{28}$$

subject to Eqs. (26) and (27). Such a $u_2^*$ is called an optimal control, and the corresponding solution $x^*\equiv x^{u_2^*}$ to Eq. (26) is called an optimal state process for the leader.

Let $u_2^*$ be an optimal control for the leader, and let the corresponding state $(x^*,q,k,\tilde k)$ be the solution to Eq. (26). Define the Hamiltonian function of the leader $H_2:\Omega\times[0,T]\times\mathbb{R}\times U_2\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ as

$$\begin{aligned}H_2(t,x,u_2,q,k,\tilde k,y,z,\tilde z,p)=\ &yb^L(t,x,u_2)+z\sigma^L(t,x,u_2)+\tilde z\tilde\sigma^L(t,x,u_2)+g_2^L(t,x,u_2)\\&-p\big[b^L_x(t,x,u_2)q+\sigma^L_x(t,x,u_2)k+\tilde\sigma^L_x(t,x,u_2)\tilde k-g^L_{1x}(t,x,u_2)\big].\end{aligned}\tag{29}$$

Let $\phi^L(t)\equiv\phi^L\big(t,x^*_t,\hat x^*_t,u_2^*(t),\hat u_2^*(t)\big)$ for $\phi=b,\sigma,\tilde\sigma,g_1,g_2$ and all their derivatives. Suppose that $(y,z,\tilde z,p)\in\mathbb{R}\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}$ is the unique $\mathcal{F}_t$-adapted solution to the adjoint conditional mean-field FBSDE of the leader

$$\begin{cases}
dp_t=\big\{b^L_x(t)p_t+\mathbb{E}\big[b^L_{\hat x}(t)p_t\big|\mathcal{G}_{1,t}\big]\big\}dt+\big\{\sigma^L_x(t)p_t+\mathbb{E}\big[\sigma^L_{\hat x}(t)p_t\big|\mathcal{G}_{1,t}\big]\big\}dW_t+\big\{\tilde\sigma^L_x(t)p_t+\mathbb{E}\big[\tilde\sigma^L_{\hat x}(t)p_t\big|\mathcal{G}_{1,t}\big]\big\}d\tilde W_t,\qquad p_0=0,\\
dy_t=-\big\{b^L_x(t)y_t+\mathbb{E}\big[b^L_{\hat x}(t)y_t\big|\mathcal{G}_{1,t}\big]+\sigma^L_x(t)z_t+\mathbb{E}\big[\sigma^L_{\hat x}(t)z_t\big|\mathcal{G}_{1,t}\big]+\tilde\sigma^L_x(t)\tilde z_t+\mathbb{E}\big[\tilde\sigma^L_{\hat x}(t)\tilde z_t\big|\mathcal{G}_{1,t}\big]\\
\qquad\qquad-b^L_{xx}(t)q_tp_t-\mathbb{E}\big[b^L_{\hat x\hat x}(t)q_tp_t\big|\mathcal{G}_{1,t}\big]-\sigma^L_{xx}(t)k_tp_t-\mathbb{E}\big[\sigma^L_{\hat x\hat x}(t)k_tp_t\big|\mathcal{G}_{1,t}\big]-\tilde\sigma^L_{xx}(t)\tilde k_tp_t-\mathbb{E}\big[\tilde\sigma^L_{\hat x\hat x}(t)\tilde k_tp_t\big|\mathcal{G}_{1,t}\big]\\
\qquad\qquad+g^L_{1xx}(t)p_t+\mathbb{E}\big[g^L_{1\hat x\hat x}(t)p_t\big|\mathcal{G}_{1,t}\big]+g^L_{2x}(t)+\mathbb{E}\big[g^L_{2\hat x}(t)\big|\mathcal{G}_{1,t}\big]\big\}dt+z_t\,dW_t+\tilde z_t\,d\tilde W_t,\qquad y_T=-G_{1xx}(x^*_T)p_T+G_{2x}(x^*_T).
\end{cases}\tag{30}$$

Now, we have the following two results.

Proposition 2.3 Let (A1.1) hold. Let $u_2^*\in\mathcal{U}_2$ be an optimal control for the SOCPL and let $(x^*,q,k,\tilde k)$ be the optimal state. Let $(y,z,\tilde z,p)$ be the adjoint quadruple. Then

$$\mathbb{E}\Big[H_{2u_2}\big(t,x^*_t,u_2^*(t),q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)\big(u_2-u_2^*(t)\big)+\mathbb{E}\big[H_{2\hat u_2}\big(t,x^*_t,u_2^*(t),q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)\big|\mathcal{G}_{1,t}\big]\big(\hat u_2-\hat u_2^*(t)\big)\,\Big|\,\mathcal{G}_{2,t}\Big]\ge 0,\quad\text{a.e. }t\in[0,T],\ \text{a.s.},\ \text{for any }u_2\in U_2.\tag{31}$$

Proof The maximum condition Eq. (31) can be derived by convex variation and adjoint techniques, as in Andersson and Djehiche [17]. We omit the details to save space. See also Li [18], Yong [19], and the references therein for mean-field stochastic optimal control problems.

Proposition 2.4 Let (A1.1) hold. Let $u_2^*\in\mathcal{U}_2$ and let $(x^*,q,k,\tilde k)$ be the corresponding state, with $G_{1xx}(x)\equiv G_1\in\mathcal{S}^n$. Let $(y,z,\tilde z,p)$ be the adjoint quadruple. Suppose that for each $(t,\omega)\in[0,T]\times\Omega$, $H_2(t,\cdot,\cdot,\ldots,y_t,z_t,\tilde z_t,p_t)$ and $G_2$ are convex, and

$$\begin{aligned}&\mathbb{E}\Big[H_2\big(t,x^*_t,u_2^*(t),q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)+\mathbb{E}\big[H_2\big(t,x^*_t,u_2^*(t),q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)\big|\mathcal{G}_{1,t}\big]\Big|\mathcal{G}_{2,t}\Big]\\
&=\min_{u_2\in U_2}\mathbb{E}\Big[H_2\big(t,x^*_t,u_2,q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)+\mathbb{E}\big[H_2\big(t,x^*_t,u_2,q_t,k_t,\tilde k_t,y_t,z_t,\tilde z_t,p_t\big)\big|\mathcal{G}_{1,t}\big]\Big|\mathcal{G}_{2,t}\Big],\quad\text{a.e. }t\in[0,T],\ \text{a.s.}\end{aligned}\tag{32}$$

Then $u_2^*$ is an optimal control for the SOCPL.

Proof This follows similar to Shi [20]. We omit the details for simplicity.

## 3. Applications to LQ case

In order to illustrate the theoretical results of Section 2, we study an LQ leader-follower stochastic differential game with asymmetric information. In this section, we let $\mathcal{G}_{1,t}\triangleq\sigma\{\tilde W_s,\,0\le s\le t\}$ and $\mathcal{G}_{2,t}=\mathcal{F}_t$. This game is a special case of the one in Section 2, but the resulting deduction is technically demanding. We split this section into two subsections, dealing with the problems of the follower and the leader, respectively.

### 3.1. Problem of the follower

Suppose that the state $x^{u_1,u_2}\in\mathbb{R}$ satisfies the linear SDE

$$\begin{cases}dx^{u_1,u_2}_t=\big[Ax^{u_1,u_2}_t+B_1u_1(t)+B_2u_2(t)\big]dt+\big[Cx^{u_1,u_2}_t+D_1u_1(t)+D_2u_2(t)\big]dW_t+\big[\tilde Cx^{u_1,u_2}_t+\tilde D_1u_1(t)+\tilde D_2u_2(t)\big]d\tilde W_t,\\ x^{u_1,u_2}_0=x_0.\end{cases}\tag{33}$$

Here, $u_1$ is the follower's control process and $u_2$ is the leader's control process, both taking values in $\mathbb{R}$; $A,C,\tilde C,B_1,D_1,\tilde D_1,B_2,D_2,\tilde D_2$ are constants. In the first step, for an announced $u_2$, the follower chooses a $\mathcal{G}_{1,t}$-adapted, square-integrable control $u_1$ to minimize the cost functional

$$J_1(u_1,u_2)=\frac12\mathbb{E}\left[\int_0^T\big(Q_1|x^{u_1,u_2}_t|^2+N_1|u_1(t)|^2\big)dt+G_1|x^{u_1,u_2}_T|^2\right].\tag{34}$$

In the second step, knowing that the follower would take $u_1^*$, the leader chooses an $\mathcal{F}_t$-adapted, square-integrable control $u_2$ to minimize

$$J_2(u_1^*,u_2)=\frac12\mathbb{E}\left[\int_0^T\big(Q_2|x^{u_1^*,u_2}_t|^2+N_2|u_2(t)|^2\big)dt+G_2|x^{u_1^*,u_2}_T|^2\right],\tag{35}$$

where $Q_1,Q_2,G_1,G_2\ge 0$, $N_1\ge 0$, $N_2>0$ are constants. This is an LQ leader-follower stochastic differential game with asymmetric information. We wish to find its Stackelberg equilibrium $(u_1^*,u_2^*)$.
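Before turning to the game itself, one can sanity-check the linear state equation (33) numerically: under deterministic controls, the mean $m(t)=\mathbb{E}[x_t]$ satisfies the ODE $\dot m=Am+B_1u_1+B_2u_2$, so a Monte Carlo average of Euler–Maruyama paths should track that ODE. A minimal sketch with hypothetical constants:

```python
import numpy as np

# Euler-Maruyama paths of the linear SDE (33) with constant controls,
# compared against the mean ODE m' = A m + B1 u1 + B2 u2 on the same grid.
# All coefficient values are hypothetical.
rng = np.random.default_rng(2)
A, B1, B2 = 0.2, 1.0, 0.5
C, D1, D2 = 0.1, 0.2, 0.1
Ct, D1t, D2t = 0.15, 0.1, 0.2     # C~, D1~, D2~
u1, u2, x0 = 0.3, -0.2, 1.0       # constant (deterministic) controls
T, n_steps, n_paths = 1.0, 200, 20000
dt = T / n_steps

x = np.full(n_paths, x0)
m = x0
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    dWt = rng.normal(0.0, np.sqrt(dt), n_paths)
    x += (A * x + B1 * u1 + B2 * u2) * dt \
         + (C * x + D1 * u1 + D2 * u2) * dW \
         + (Ct * x + D1t * u1 + D2t * u2) * dWt
    m += (A * m + B1 * u1 + B2 * u2) * dt   # mean ODE, same Euler grid

print(x.mean(), m)
```

Since the drift is linear, the Euler scheme is exact in expectation step by step, so the two printed numbers should agree up to Monte Carlo error.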

Define the Hamiltonian function of the follower as

$$H_1(t,x,u_1,u_2,q,k,\tilde k)=q\big(Ax+B_1u_1+B_2u_2\big)+k\big(Cx+D_1u_1+D_2u_2\big)+\tilde k\big(\tilde Cx+\tilde D_1u_1+\tilde D_2u_2\big)-\frac12Q_1x^2-\frac12N_1u_1^2.\tag{36}$$

For a given control $u_2$, suppose that there exists a $\mathcal{G}_{1,t}$-adapted optimal control $u_1^*$ of the follower, and that the corresponding optimal state is $x^{u_1^*,u_2}$. By Proposition 2.1, Eq. (36) yields

$$0=N_1u_1^*(t)-B_1\hat q_t-D_1\hat k_t-\tilde D_1\hat{\tilde k}_t,\tag{37}$$

where the $\mathcal{F}_t$-adapted process triple $(q,k,\tilde k)\in\mathbb{R}\times\mathbb{R}\times\mathbb{R}$ satisfies the BSDE

$$\begin{cases}dq_t=-\big[Aq_t+Ck_t+\tilde C\tilde k_t-Q_1x^{u_1^*,u_2}_t\big]dt+k_t\,dW_t+\tilde k_t\,d\tilde W_t,\\ q_T=-G_1x^{u_1^*,u_2}_T.\end{cases}\tag{38}$$

We wish to obtain a state feedback representation of $u_1^*$. Noting the terminal condition of Eq. (38) and the appearance of $u_2$, we set

$$q_t=-\big(P_tx^{u_1^*,u_2}_t+\varphi_t\big),\qquad t\in[0,T],\tag{39}$$

for some deterministic, differentiable $\mathbb{R}$-valued function $P_t$ and some $\mathbb{R}$-valued, $\mathcal{F}_t$-adapted process $\varphi$ satisfying the BSDE

$$d\varphi_t=\alpha_t\,dt+\beta_t\,d\tilde W_t,\qquad\varphi_T=0.\tag{40}$$

In the above equation, $\alpha\in\mathbb{R}$ and $\beta\in\mathbb{R}$ are $\mathcal{F}_t$-adapted processes, to be determined later. Now, applying Itô's formula to Eq. (39), we have

$$\begin{aligned}dq_t=\ &-\big[\dot P_tx^{u_1^*,u_2}_t+P_t\big(Ax^{u_1^*,u_2}_t+B_1u_1^*(t)+B_2u_2(t)\big)+\alpha_t\big]dt-P_t\big(Cx^{u_1^*,u_2}_t+D_1u_1^*(t)+D_2u_2(t)\big)dW_t\\&-\big[P_t\big(\tilde Cx^{u_1^*,u_2}_t+\tilde D_1u_1^*(t)+\tilde D_2u_2(t)\big)+\beta_t\big]d\tilde W_t.\end{aligned}\tag{41}$$

Comparing Eq. (41) with Eq. (38), we arrive at

$$k_t=-P_t\big(Cx^{u_1^*,u_2}_t+D_1u_1^*(t)+D_2u_2(t)\big),\qquad\tilde k_t=-P_t\big(\tilde Cx^{u_1^*,u_2}_t+\tilde D_1u_1^*(t)+\tilde D_2u_2(t)\big)-\beta_t,\tag{42}$$

and

$$\alpha_t=-\big[\dot P_t+2AP_t+Q_1\big]x^{u_1^*,u_2}_t-A\varphi_t-P_tB_1u_1^*(t)-P_tB_2u_2(t)+Ck_t+\tilde C\tilde k_t,\tag{43}$$

respectively. Taking $\mathbb{E}[\,\cdot\,|\mathcal{G}_{1,t}]$ on both sides of Eqs. (39) and (42), we get

$$\hat q_t=-\big(P_t\hat x^{u_1^*,\hat u_2}_t+\hat\varphi_t\big),\tag{44}$$

and

$$\hat k_t=-P_t\big(C\hat x^{u_1^*,\hat u_2}_t+D_1u_1^*(t)+D_2\hat u_2(t)\big),\qquad\hat{\tilde k}_t=-P_t\big(\tilde C\hat x^{u_1^*,\hat u_2}_t+\tilde D_1u_1^*(t)+\tilde D_2\hat u_2(t)\big)-\hat\beta_t,\tag{45}$$

respectively. Applying Lemma 5.4 in [21] to Eqs. (33) and (38), corresponding to $u_1^*$, we derive the optimal filtering equation

$$\begin{cases}d\hat x^{u_1^*,\hat u_2}_t=\big[A\hat x^{u_1^*,\hat u_2}_t+B_1u_1^*(t)+B_2\hat u_2(t)\big]dt+\big[\tilde C\hat x^{u_1^*,\hat u_2}_t+\tilde D_1u_1^*(t)+\tilde D_2\hat u_2(t)\big]d\tilde W_t,\\ d\hat q_t=-\big[A\hat q_t+C\hat k_t+\tilde C\hat{\tilde k}_t-Q_1\hat x^{u_1^*,\hat u_2}_t\big]dt+\hat{\tilde k}_t\,d\tilde W_t,\\ \hat x^{u_1^*,\hat u_2}_0=x_0,\qquad\hat q_T=-G_1\hat x^{u_1^*,\hat u_2}_T.\end{cases}\tag{46}$$

Note that Eq. (46) is not a classical FBSDFE, since the generator of the BSDE depends on an additional process $\hat k$. For given $u_2$, it is important to know whether Eq. (46) admits a unique $\mathcal{G}_{1,t}$-adapted solution $(\hat x^{u_1^*,\hat u_2},\hat q,\hat k,\hat{\tilde k})$. We will clarify this shortly. To this end, first, by Eq. (37) and supposing that

$$\textbf{(A2.1)}\qquad\tilde N_1(t)\triangleq N_1+D_1^2P_t+\tilde D_1^2P_t>0,\qquad t\in[0,T],$$

we immediately arrive at

$$u_1^*(t)=-\tilde N_1^{-1}(t)\big[\tilde S_1(t)\hat x^{u_1^*,\hat u_2}_t+\tilde S(t)\hat u_2(t)+B_1\hat\varphi_t+\tilde D_1\hat\beta_t\big],\tag{47}$$

where $\tilde S_1(t)\triangleq\big(B_1+CD_1+\tilde C\tilde D_1\big)P_t$ and $\tilde S(t)\triangleq\big(D_1D_2+\tilde D_1\tilde D_2\big)P_t$. Substituting Eq. (47) into Eq. (43), we can obtain that if the Riccati equation

$$\begin{cases}\dot P_t+\big(2A+C^2+\tilde C^2\big)P_t-\big(B_1+CD_1+\tilde C\tilde D_1\big)^2\big(N_1+D_1^2P_t+\tilde D_1^2P_t\big)^{-1}P_t^2+Q_1=0,\\ P_T=G_1,\end{cases}\tag{48}$$

admits a unique differentiable solution $P_t$, then

$$\begin{aligned}\alpha_t=\ &-\tilde S_1^2(t)\tilde N_1^{-1}(t)x^{u_1^*,u_2}_t+\tilde S_1^2(t)\tilde N_1^{-1}(t)\hat x^{u_1^*,\hat u_2}_t-A\varphi_t+\tilde S_1(t)\tilde N_1^{-1}(t)B_1\hat\varphi_t\\&-\tilde S_2(t)u_2(t)+\tilde S_1(t)\tilde N_1^{-1}(t)\tilde S(t)\hat u_2(t)-\tilde C\beta_t+\tilde S_1(t)\tilde N_1^{-1}(t)\tilde D_1\hat\beta_t,\end{aligned}\tag{49}$$

where $\tilde S_2(t)\triangleq\big(B_2+CD_2+\tilde C\tilde D_2\big)P_t$. By (A2.1), we know that Eq. (48) admits a unique solution $P_t>0$ by standard Riccati equation theory [22]. In particular, if $\tilde C=\tilde D_1=0$, Eq. (48) reduces to

$$\dot P_t+\big(2A+C^2\big)P_t-\big(B_1+CD_1\big)^2\big(N_1+D_1^2P_t\big)^{-1}P_t^2+Q_1=0,\qquad P_T=G_1,\qquad N_1+D_1^2P_t>0,\tag{50}$$

which recovers the standard one in [22]. With Eq. (49), the BSDE (40) takes the form

$$\begin{aligned}d\varphi_t=\ &-\Big[\tilde S_1^2(t)\tilde N_1^{-1}(t)x^{u_1^*,u_2}_t-\tilde S_1^2(t)\tilde N_1^{-1}(t)\hat x^{u_1^*,\hat u_2}_t+A\varphi_t-\tilde S_1(t)\tilde N_1^{-1}(t)B_1\hat\varphi_t+\tilde C\beta_t-\tilde S_1(t)\tilde N_1^{-1}(t)\tilde D_1\hat\beta_t\\&\quad+\tilde S_2(t)u_2(t)-\tilde S_1(t)\tilde N_1^{-1}(t)\tilde S(t)\hat u_2(t)\Big]dt+\beta_t\,d\tilde W_t,\qquad\varphi_T=0.\end{aligned}\tag{51}$$

Moreover, for given $u_2$, plugging Eq. (47) into the forward equation of Eq. (46), and letting

$$\begin{aligned}&\tilde A(t)\triangleq A-B_1\tilde N_1^{-1}(t)\tilde S_1(t),\quad\tilde{\tilde C}(t)\triangleq\tilde C-\tilde D_1\tilde N_1^{-1}(t)\tilde S_1(t),\quad\tilde B_2(t)\triangleq B_2-B_1\tilde N_1^{-1}(t)\tilde S(t),\quad\tilde F_1(t)\triangleq-B_1\tilde N_1^{-1}(t)B_1,\\&\tilde B_1(t)\triangleq-B_1\tilde N_1^{-1}(t)\tilde D_1,\quad\tilde F_3(t)\triangleq-\tilde D_1\tilde N_1^{-1}(t)\tilde D_1,\quad\tilde{\tilde D}_2(t)\triangleq\tilde D_2-\tilde D_1\tilde N_1^{-1}(t)\tilde S(t),\end{aligned}\tag{52}$$

we have

$$d\hat x^{u_1^*,\hat u_2}_t=\big[\tilde A(t)\hat x^{u_1^*,\hat u_2}_t+\tilde F_1(t)\hat\varphi_t+\tilde B_1(t)\hat\beta_t+\tilde B_2(t)\hat u_2(t)\big]dt+\big[\tilde{\tilde C}(t)\hat x^{u_1^*,\hat u_2}_t+\tilde B_1(t)\hat\varphi_t+\tilde F_3(t)\hat\beta_t+\tilde{\tilde D}_2(t)\hat u_2(t)\big]d\tilde W_t,\qquad\hat x^{u_1^*,\hat u_2}_0=x_0,\tag{53}$$

which admits a unique $\mathcal{G}_{1,t}$-adapted solution $\hat x^{u_1^*,\hat u_2}$, for given $(\hat\varphi,\hat\beta)$. Applying Lemma 5.4 in [21] to Eq. (51) again, we have

$$d\hat\varphi_t=-\big[\tilde A(t)\hat\varphi_t+\tilde{\tilde C}(t)\hat\beta_t+\tilde F_4(t)\hat u_2(t)\big]dt+\hat\beta_t\,d\tilde W_t,\qquad\hat\varphi_T=0,\tag{54}$$

where $\tilde F_4(t)\triangleq\tilde S_2(t)-\tilde S_1(t)\tilde N_1^{-1}(t)\tilde S(t)$. For given $\hat u_2$, Eq. (54) admits a unique solution $(\hat\varphi,\hat\beta)$ by standard BSDE theory. Putting Eqs. (53) and (54) together, we get

$$\begin{cases}d\hat x^{u_1^*,\hat u_2}_t=\big[\tilde A(t)\hat x^{u_1^*,\hat u_2}_t+\tilde F_1(t)\hat\varphi_t+\tilde B_1(t)\hat\beta_t+\tilde B_2(t)\hat u_2(t)\big]dt+\big[\tilde{\tilde C}(t)\hat x^{u_1^*,\hat u_2}_t+\tilde B_1(t)\hat\varphi_t+\tilde F_3(t)\hat\beta_t+\tilde{\tilde D}_2(t)\hat u_2(t)\big]d\tilde W_t,\\ d\hat\varphi_t=-\big[\tilde A(t)\hat\varphi_t+\tilde{\tilde C}(t)\hat\beta_t+\tilde F_4(t)\hat u_2(t)\big]dt+\hat\beta_t\,d\tilde W_t,\\ \hat x^{u_1^*,\hat u_2}_0=x_0,\qquad\hat\varphi_T=0,\end{cases}\tag{55}$$

which admits a unique $\mathcal{G}_{1,t}$-adapted solution $(\hat x^{u_1^*,\hat u_2},\hat\varphi,\hat\beta)$. By Eqs. (55), (44), (45), and (47), we then obtain the (unique) solvability of Eq. (46). Moreover, we can check that the convexity/concavity conditions of Proposition 2.2 hold, so $u_1^*$ given by Eq. (47) is indeed optimal. We summarize the above procedure in the following theorem.

Theorem 3.1 Let (A2.1) hold and let $P_t$ satisfy Eq. (48). For a chosen $u_2$ of the leader, $u_1^*$ given by Eq. (47) is the optimal control of the follower, where $(\hat x^{u_1^*,\hat u_2},\hat\varphi,\hat\beta)$ is the unique $\mathcal{G}_{1,t}$-adapted solution to Eq. (55).
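A simple consequence of Eq. (54) is worth recording: if $\hat u_2$ is deterministic, the BSDE degenerates — $\hat\beta\equiv 0$ and $\hat\varphi$ solves the backward ODE $\dot{\hat\varphi}_t=-\big(\tilde A(t)\hat\varphi_t+\tilde F_4(t)\hat u_2(t)\big)$ with $\hat\varphi_T=0$. The sketch below integrates this ODE with constant (hypothetical) coefficients and checks it against the closed-form solution:

```python
# Backward ODE for phi-hat from Eq. (54) when u2-hat is deterministic:
#   d(phi)/dt = -(At * phi + F4 * u2hat),  phi(T) = 0  (and beta-hat = 0).
# At stands for A~(t) and F4 for F~4(t), taken constant and hypothetical.
At, F4 = -0.3, 0.4
u2hat = 1.0                        # deterministic leader control
T, n_steps = 1.0, 2000
dt = T / n_steps

phi = 0.0                          # terminal condition phi(T) = 0
for _ in range(n_steps):
    dphi = -(At * phi + F4 * u2hat)
    phi -= dphi * dt               # step backward: t -> t - dt

# Closed form: phi(t) = (F4*u2hat/At) * (exp(-At*(t-T)) - 1); phi(0) ~ 0.3456
print(phi)
```

This degenerate case is useful as a unit test for any solver of the full filtering system (55), since the forward component then decouples from the martingale part.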

### 3.2. Problem of the leader

Since the leader knows that the follower will take $u_1^*$ given by Eq. (47), the state equation of the leader reads

$$\begin{cases}
dx^{u_2}_t=\big[Ax^{u_2}_t+\big(\tilde A(t)-A\big)\hat x^{\hat u_2}_t+\tilde F_1(t)\hat\varphi_t+\tilde B_1(t)\hat\beta_t+B_2u_2(t)+\big(\tilde B_2(t)-B_2\big)\hat u_2(t)\big]dt\\
\qquad+\big[Cx^{u_2}_t+\tilde F_5(t)\hat x^{\hat u_2}_t+\tilde{\tilde B}_1(t)\hat\varphi_t+\tilde{\tilde D}_1(t)\hat\beta_t+D_2u_2(t)+\tilde F_2(t)\hat u_2(t)\big]dW_t\\
\qquad+\big[\tilde Cx^{u_2}_t+\big(\tilde{\tilde C}(t)-\tilde C\big)\hat x^{\hat u_2}_t+\tilde B_1(t)\hat\varphi_t+\tilde F_3(t)\hat\beta_t+\tilde D_2u_2(t)+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\hat u_2(t)\big]d\tilde W_t,\\
d\hat\varphi_t=-\big[\tilde A(t)\hat\varphi_t+\tilde{\tilde C}(t)\hat\beta_t+\tilde F_4(t)\hat u_2(t)\big]dt+\hat\beta_t\,d\tilde W_t,\\
x^{u_2}_0=x_0,\qquad\hat\varphi_T=0,
\end{cases}\tag{56}$$

where $x^{u_2}\equiv x^{u_1^*,u_2}$, $\hat x^{\hat u_2}\equiv\hat x^{u_1^*,\hat u_2}$, and $\tilde{\tilde B}_1(t)\triangleq-B_1\tilde N_1^{-1}(t)D_1$, $\tilde{\tilde D}_1(t)\triangleq-D_1\tilde N_1^{-1}(t)\tilde D_1$, $\tilde F_5(t)\triangleq-D_1\tilde N_1^{-1}(t)\tilde S_1(t)$, $\tilde F_2(t)\triangleq-D_1\tilde N_1^{-1}(t)\tilde S(t)$. Noting that Eq. (56) is a decoupled conditional mean-field FBSDE, its solvability for an $\mathcal{F}_t$-adapted solution $(x^{u_2},\hat\varphi,\hat\beta)$ is easily guaranteed.

The problem of the leader is to choose an $\mathcal{F}_t$-adapted optimal control $u_2^*$ such that the cost functional

$$J_2(u_2)=\frac12\mathbb{E}\left[\int_0^T\big(Q_2|x^{u_2}_t|^2+N_2|u_2(t)|^2\big)dt+G_2|x^{u_2}_T|^2\right]\tag{57}$$

is minimized. Define the Hamiltonian function of the leader as

$$\begin{aligned}
H_2\big(t,x^{u_2},u_2,\hat\varphi,\hat\beta,y,z,\tilde z,p\big)=\ &\frac12\big(Q_2|x^{u_2}|^2+N_2|u_2|^2\big)+y\big[Ax^{u_2}+\big(\tilde A(t)-A\big)\hat x^{\hat u_2}+\tilde F_1(t)\hat\varphi+\tilde B_1(t)\hat\beta+B_2u_2+\big(\tilde B_2(t)-B_2\big)\hat u_2\big]\\
&+p\big[\tilde A(t)\hat\varphi+\tilde{\tilde C}(t)\hat\beta+\tilde F_4(t)\hat u_2\big]+z\big[Cx^{u_2}+\tilde F_5(t)\hat x^{\hat u_2}+\tilde{\tilde B}_1(t)\hat\varphi+\tilde{\tilde D}_1(t)\hat\beta+D_2u_2+\tilde F_2(t)\hat u_2\big]\\
&+\tilde z\big[\tilde Cx^{u_2}+\big(\tilde{\tilde C}(t)-\tilde C\big)\hat x^{\hat u_2}+\tilde B_1(t)\hat\varphi+\tilde F_3(t)\hat\beta+\tilde D_2u_2+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\hat u_2\big].
\end{aligned}\tag{58}$$

Suppose that there exists an $\mathcal{F}_t$-adapted optimal control $u_2^*$ of the leader, and that the corresponding optimal state is $(x^*,\hat\varphi,\hat\beta)\equiv(x^{u_2^*},\hat\varphi,\hat\beta)$. Then by Propositions 2.3 and 2.4, Eq. (58) yields

$$0=N_2u_2^*(t)+\tilde F_4(t)\hat p_t+B_2y_t+\big(\tilde B_2(t)-B_2\big)\hat y_t+D_2z_t+\tilde F_2(t)\hat z_t+\tilde D_2\tilde z_t+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\hat{\tilde z}_t,\tag{59}$$

where the $\mathcal{F}_t$-adapted quadruple $(p,y,z,\tilde z)$ satisfies

$$\begin{cases}
dp_t=\big[\tilde A(t)p_t+\tilde F_1(t)y_t+\tilde{\tilde B}_1(t)z_t+\tilde B_1(t)\tilde z_t\big]dt+\big[\tilde{\tilde C}(t)p_t+\tilde B_1(t)y_t+\tilde{\tilde D}_1(t)z_t+\tilde F_3(t)\tilde z_t\big]d\tilde W_t,\qquad p_0=0,\\
dy_t=-\big[Ay_t+\big(\tilde A(t)-A\big)\hat y_t+Cz_t+\tilde F_5(t)\hat z_t+\tilde C\tilde z_t+\big(\tilde{\tilde C}(t)-\tilde C\big)\hat{\tilde z}_t+Q_2x^*_t\big]dt+z_t\,dW_t+\tilde z_t\,d\tilde W_t,\qquad y_T=G_2x^*_T.
\end{cases}\tag{60}$$

In fact, the problem of the leader can also be solved by a direct calculation of the derivative of the cost functional. Without loss of generality, let $x_0=0$, and consider the perturbation $u_2^*+\epsilon u_2$ for $\epsilon>0$ sufficiently small, with $u_2\in\mathcal{U}_2$. Then it is easy to see from the linearity of Eqs. (56) and (60) that the perturbed solution to Eq. (56) is $x^*+\epsilon x^{u_2}$. We first have

$$\tilde J(\epsilon)\triangleq J_2\big(u_2^*+\epsilon u_2\big)=\frac12\mathbb{E}\int_0^T\Big[Q_2\big(x^*_t+\epsilon x^{u_2}_t\big)^2+N_2\big(u_2^*(t)+\epsilon u_2(t)\big)^2\Big]dt+\frac12\mathbb{E}\Big[G_2\big(x^*_T+\epsilon x^{u_2}_T\big)^2\Big].\tag{61}$$

Hence

$$0=\frac{\partial\tilde J(\epsilon)}{\partial\epsilon}\bigg|_{\epsilon=0}=\mathbb{E}\int_0^T\big[Q_2x^*_tx^{u_2}_t+N_2u_2^*(t)u_2(t)\big]dt+\mathbb{E}\big[G_2x^*_Tx^{u_2}_T\big].\tag{62}$$

Noting the terminal condition $y_T=G_2x^*_T$ in Eq. (60), this is equivalent to

$$0=\mathbb{E}\int_0^T\big[Q_2x^*_tx^{u_2}_t+N_2u_2^*(t)u_2(t)\big]dt+\mathbb{E}\big[y_Tx^{u_2}_T\big].\tag{63}$$

Applying Itô's formula to $x^{u_2}_ty_t-p_t\hat\varphi_t$, and noting Eqs. (56) and (60), we derive

$$\begin{aligned}
0=\ &\mathbb{E}\int_0^T\big[Q_2x^*_t+Ay_t+Cz_t+\tilde C\tilde z_t\big]x^{u_2}_t\,dt+\mathbb{E}\int_0^T\big[\big(\tilde A(t)-A\big)y_t+\tilde F_5(t)z_t+\big(\tilde{\tilde C}(t)-\tilde C\big)\tilde z_t\big]\hat x^{\hat u_2}_t\,dt\\
&+\mathbb{E}\int_0^T\big[N_2u_2^*(t)+B_2y_t+D_2z_t+\tilde D_2\tilde z_t\big]u_2(t)\,dt+\mathbb{E}\int_0^T\big[\big(\tilde B_2(t)-B_2\big)y_t+\tilde F_2(t)z_t+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\tilde z_t\big]\hat u_2(t)\,dt\\
&-\mathbb{E}\int_0^T\big[Q_2x^*_t+Ay_t+\big(\tilde A(t)-A\big)\hat y_t+Cz_t+\tilde C\tilde z_t+\tilde F_5(t)\hat z_t+\big(\tilde{\tilde C}(t)-\tilde C\big)\hat{\tilde z}_t\big]x^{u_2}_t\,dt\\
&+\mathbb{E}\int_0^T\big[\tilde F_1(t)y_t+\tilde{\tilde B}_1(t)z_t+\tilde B_1(t)\tilde z_t\big]\hat\varphi_t\,dt+\mathbb{E}\int_0^T\big[\tilde B_1(t)y_t+\tilde{\tilde D}_1(t)z_t+\tilde F_3(t)\tilde z_t\big]\hat\beta_t\,dt\\
&+\mathbb{E}\int_0^Tp_t\big[\tilde A(t)\hat\varphi_t+\tilde{\tilde C}(t)\hat\beta_t\big]dt-\mathbb{E}\int_0^T\hat\varphi_t\big[\tilde A(t)p_t+\tilde F_1(t)y_t+\tilde{\tilde B}_1(t)z_t+\tilde B_1(t)\tilde z_t\big]dt\\
&-\mathbb{E}\int_0^T\hat\beta_t\big[\tilde{\tilde C}(t)p_t+\tilde B_1(t)y_t+\tilde{\tilde D}_1(t)z_t+\tilde F_3(t)\tilde z_t\big]dt+\mathbb{E}\int_0^Tp_t\tilde F_4(t)\hat u_2(t)\,dt\\
=\ &\mathbb{E}\int_0^T\big[N_2u_2^*(t)+B_2y_t+D_2z_t+\tilde D_2\tilde z_t\big]u_2(t)\,dt+\mathbb{E}\int_0^T\big[\big(\tilde B_2(t)-B_2\big)y_t+\tilde F_2(t)z_t+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\tilde z_t\big]\hat u_2(t)\,dt+\mathbb{E}\int_0^Tp_t\tilde F_4(t)\hat u_2(t)\,dt\\
=\ &\mathbb{E}\int_0^T\big[N_2u_2^*(t)+\tilde F_4(t)\hat p_t+B_2y_t+\big(\tilde B_2(t)-B_2\big)\hat y_t+D_2z_t+\tilde F_2(t)\hat z_t+\tilde D_2\tilde z_t+\big(\tilde{\tilde D}_2(t)-\tilde D_2\big)\hat{\tilde z}_t\big]u_2(t)\,dt.
\end{aligned}\tag{64}$$

This implies Eq. (59).

In the following, we wish to obtain a "nonanticipating" representation for the optimal controls $u_2^*$ and $u_1^*$. To this end, let us regard $(x^*,p)^\top$ as the optimal state, put

$$X=\begin{pmatrix}x^*\\ p\end{pmatrix},\qquad Y=\begin{pmatrix}y\\ \hat\varphi\end{pmatrix},\qquad Z=\begin{pmatrix}z\\ 0\end{pmatrix},\qquad\tilde Z=\begin{pmatrix}\tilde z\\ \hat\beta\end{pmatrix},\tag{65}$$

and (suppressing some $t$ below)

$$\begin{aligned}
&A_1\triangleq\begin{pmatrix}A&0\\0&\tilde A(t)\end{pmatrix},\quad A_2\triangleq\begin{pmatrix}\tilde A(t)-A&0\\0&0\end{pmatrix},\quad\tilde B_1\triangleq\begin{pmatrix}0&\tilde B_1(t)\\ \tilde B_1(t)&0\end{pmatrix},\quad\tilde{\tilde B}_1\triangleq\begin{pmatrix}0&0\\ \tilde{\tilde B}_1(t)&0\end{pmatrix},\quad B_2\triangleq\begin{pmatrix}B_2\\0\end{pmatrix},\quad\tilde B_2\triangleq\begin{pmatrix}\tilde B_2(t)-B_2\\0\end{pmatrix},\\
&C_1\triangleq\begin{pmatrix}C&0\\0&0\end{pmatrix},\quad\tilde C_1\triangleq\begin{pmatrix}\tilde C&0\\0&\tilde{\tilde C}(t)\end{pmatrix},\quad\tilde C_2\triangleq\begin{pmatrix}\tilde{\tilde C}(t)-\tilde C&0\\0&0\end{pmatrix},\quad\tilde{\tilde D}_1\triangleq\begin{pmatrix}0&\tilde{\tilde D}_1(t)\\0&0\end{pmatrix},\quad\tilde D_2\triangleq\begin{pmatrix}\tilde D_2\\0\end{pmatrix},\quad\tilde{\tilde D}_2\triangleq\begin{pmatrix}\tilde{\tilde D}_2(t)-\tilde D_2\\0\end{pmatrix},\\
&D_2\triangleq\begin{pmatrix}D_2\\0\end{pmatrix},\quad G_2\triangleq\begin{pmatrix}G_2&0\\0&0\end{pmatrix},\quad\tilde F_1\triangleq\begin{pmatrix}0&\tilde F_1(t)\\ \tilde F_1(t)&0\end{pmatrix},\quad\tilde F_2\triangleq\begin{pmatrix}\tilde F_2(t)\\0\end{pmatrix},\quad X_0\triangleq\begin{pmatrix}x_0\\0\end{pmatrix},\quad\tilde F_3\triangleq\begin{pmatrix}0&\tilde F_3(t)\\ \tilde F_3(t)&0\end{pmatrix},\\
&\tilde F_4\triangleq\begin{pmatrix}0\\ \tilde F_4(t)\end{pmatrix},\quad\tilde F_5\triangleq\begin{pmatrix}\tilde F_5(t)&0\\0&0\end{pmatrix},\quad Q_2\triangleq\begin{pmatrix}Q_2&0\\0&0\end{pmatrix}.
\end{aligned}\tag{66}$$

With these notations, Eqs. (56) and (60) are rewritten as

$$\begin{cases}
dX_t=\big[A_1X_t+A_2\hat X_t+\tilde F_1Y_t+\tilde{\tilde B}_1Z_t+\tilde B_1\tilde Z_t+B_2u_2^*(t)+\tilde B_2\hat u_2^*(t)\big]dt+\big[C_1X_t+\tilde F_5\hat X_t+\tilde{\tilde B}_1^\top Y_t+\tilde{\tilde D}_1\tilde Z_t+D_2u_2^*(t)+\tilde F_2\hat u_2^*(t)\big]dW_t\\
\qquad+\big[\tilde C_1X_t+\tilde C_2\hat X_t+\tilde B_1^\top Y_t+\tilde{\tilde D}_1^\top Z_t+\tilde F_3\tilde Z_t+\tilde D_2u_2^*(t)+\tilde{\tilde D}_2\hat u_2^*(t)\big]d\tilde W_t,\\
dY_t=-\big[Q_2X_t+A_1^\top Y_t+A_2^\top\hat Y_t+C_1^\top Z_t+\tilde F_5^\top\hat Z_t+\tilde C_1^\top\tilde Z_t+\tilde C_2^\top\hat{\tilde Z}_t+\tilde F_4\hat u_2^*(t)\big]dt+Z_t\,dW_t+\tilde Z_t\,d\tilde W_t,\\
X_0=X_0,\qquad Y_T=G_2X_T.
\end{cases}\tag{67}$$

Noting Eq. (59), we have

$$\begin{aligned}
u_2^*(t)&=-N_2^{-1}\big[\tilde F_4^\top\hat X_t+B_2^\top Y_t+\tilde B_2^\top\hat Y_t+D_2^\top Z_t+\tilde F_2^\top\hat Z_t+\tilde D_2^\top\tilde Z_t+\tilde{\tilde D}_2^\top\hat{\tilde Z}_t\big],\\
\hat u_2^*(t)&=-N_2^{-1}\big[\tilde F_4^\top\hat X_t+\big(B_2+\tilde B_2\big)^\top\hat Y_t+\big(D_2+\tilde F_2\big)^\top\hat Z_t+\big(\tilde D_2+\tilde{\tilde D}_2\big)^\top\hat{\tilde Z}_t\big].
\end{aligned}\tag{68}$$

Inserting Eq. (68) into Eq. (67), we get

$$\begin{cases}
dX_t=\big[A_1X_t+\bar A_2\hat X_t+\bar{\tilde F}_1Y_t+\bar B_2\hat Y_t+B_3Z_t+\bar{\bar B}_2\hat Z_t+\bar{\tilde B}_1\tilde Z_t+\bar{\tilde B}_2\hat{\tilde Z}_t\big]dt\\
\qquad+\big[C_1X_t+\bar{\tilde F}_5\hat X_t+B_3^\top Y_t+\bar D_2\hat Y_t+\tilde{\tilde D}_2Z_t+\bar{\bar D}_2\hat Z_t+D_3\tilde Z_t+\bar{\tilde D}_2\hat{\tilde Z}_t\big]dW_t\\
\qquad+\big[\tilde C_1X_t+\bar{\tilde C}_2\hat X_t+\bar{\tilde B}_1^\top Y_t+\bar D_3\hat Y_t+D_3^\top Z_t+\bar{\bar D}_3\hat Z_t+\bar{\tilde F}_3\tilde Z_t+\bar{\tilde D}_3\hat{\tilde Z}_t\big]d\tilde W_t,\\
dY_t=-\big[Q_2X_t+\bar{\tilde F}_4\hat X_t+A_1^\top Y_t+\bar A_2^\top\hat Y_t+C_1^\top Z_t+\bar{\tilde F}_5^\top\hat Z_t+\tilde C_1^\top\tilde Z_t+\bar{\tilde C}_2^\top\hat{\tilde Z}_t\big]dt+Z_t\,dW_t+\tilde Z_t\,d\tilde W_t,\\
X_0=X_0,\qquad Y_T=G_2X_T,
\end{cases}\tag{69}$$

where

$$\begin{aligned}
&\bar A_2\triangleq A_2-\big(B_2+\tilde B_2\big)N_2^{-1}\tilde F_4^\top,\quad\bar{\tilde B}_1\triangleq\tilde B_1-B_2N_2^{-1}\tilde D_2^\top,\quad\bar B_2\triangleq-B_2N_2^{-1}\tilde B_2^\top-\tilde B_2N_2^{-1}\big(B_2+\tilde B_2\big)^\top,\\
&\bar{\bar B}_2\triangleq-B_2N_2^{-1}\tilde F_2^\top-\tilde B_2N_2^{-1}\big(D_2+\tilde F_2\big)^\top,\quad\bar{\tilde B}_2\triangleq-B_2N_2^{-1}\tilde{\tilde D}_2^\top-\tilde B_2N_2^{-1}\big(\tilde D_2+\tilde{\tilde D}_2\big)^\top,\quad B_3\triangleq\tilde{\tilde B}_1-B_2N_2^{-1}D_2^\top,\\
&\bar{\tilde C}_2\triangleq\tilde C_2-\big(\tilde D_2+\tilde{\tilde D}_2\big)N_2^{-1}\tilde F_4^\top,\quad\bar{\tilde D}_2\triangleq-D_2N_2^{-1}\tilde{\tilde D}_2^\top-\tilde F_2N_2^{-1}\big(\tilde D_2+\tilde{\tilde D}_2\big)^\top,\quad\tilde{\tilde D}_2\triangleq-D_2N_2^{-1}D_2^\top,\\
&\bar D_2\triangleq-D_2N_2^{-1}\tilde B_2^\top-\tilde F_2N_2^{-1}\big(B_2+\tilde B_2\big)^\top,\quad\bar{\bar D}_2\triangleq-D_2N_2^{-1}\tilde F_2^\top-\tilde F_2N_2^{-1}\big(D_2+\tilde F_2\big)^\top,\quad D_3\triangleq\tilde{\tilde D}_1-D_2N_2^{-1}\tilde D_2^\top,\\
&\bar D_3\triangleq-\tilde D_2N_2^{-1}\tilde B_2^\top-\tilde{\tilde D}_2N_2^{-1}\big(B_2+\tilde B_2\big)^\top,\quad\bar{\bar D}_3\triangleq-\tilde D_2N_2^{-1}\tilde F_2^\top-\tilde{\tilde D}_2N_2^{-1}\big(D_2+\tilde F_2\big)^\top,\quad\bar{\tilde D}_3\triangleq-\tilde D_2N_2^{-1}\tilde{\tilde D}_2^\top-\tilde{\tilde D}_2N_2^{-1}\big(\tilde D_2+\tilde{\tilde D}_2\big)^\top,\\
&\bar{\tilde F}_1\triangleq\tilde F_1-B_2N_2^{-1}B_2^\top,\quad\bar{\tilde F}_3\triangleq\tilde F_3-\tilde D_2N_2^{-1}\tilde D_2^\top,\quad\bar{\tilde F}_4\triangleq-\tilde F_4N_2^{-1}\tilde F_4^\top,\quad\bar{\tilde F}_5\triangleq\tilde F_5-\big(D_2+\tilde F_2\big)N_2^{-1}\tilde F_4^\top.
\end{aligned}\tag{70}$$

We need to decouple Eq. (69). Similar to Eq. (39), put

$$Y_t=P_1(t)X_t+P_2(t)\hat X_t,\qquad t\in[0,T],\tag{71}$$

where $P_1(t),P_2(t)$ are differentiable, deterministic $2\times2$ matrix-valued functions with $P_1(T)=G_2$, $P_2(T)=0$. Applying Lemma 5.4 in [21] to the forward equation in Eq. (69), we obtain

$$\begin{aligned}d\hat X_t=\ &\big[\big(A_1+\bar A_2\big)\hat X_t+\big(\bar{\tilde F}_1+\bar B_2\big)\hat Y_t+\big(B_3+\bar{\bar B}_2\big)\hat Z_t+\big(\bar{\tilde B}_1+\bar{\tilde B}_2\big)\hat{\tilde Z}_t\big]dt\\&+\big[\big(\tilde C_1+\bar{\tilde C}_2\big)\hat X_t+\big(\bar{\tilde B}_1^\top+\bar D_3\big)\hat Y_t+\big(D_3^\top+\bar{\bar D}_3\big)\hat Z_t+\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\hat{\tilde Z}_t\big]d\tilde W_t,\qquad\hat X_0=X_0.\end{aligned}\tag{72}$$

Applying Itô's formula to Eq. (71), we get

$$\begin{aligned}
dY_t=\ &\Big\{\big[\dot P_1+P_1A_1+P_1\bar{\tilde F}_1P_1\big]X_t+\big[\dot P_2+P_1\bar A_2+P_1\bar B_2P_1+P_2\big(A_1+\bar A_2\big)+P_2\big(\bar{\tilde F}_1+\bar B_2\big)P_1+P_1\big(\bar{\tilde F}_1+\bar B_2\big)P_2+P_2\big(\bar{\tilde F}_1+\bar B_2\big)P_2\big]\hat X_t\\
&\ +P_1B_3Z_t+P_1\bar{\tilde B}_1\tilde Z_t+\big[P_1\bar{\bar B}_2+P_2\big(B_3+\bar{\bar B}_2\big)\big]\hat Z_t+\big[P_1\bar{\tilde B}_2+P_2\big(\bar{\tilde B}_1+\bar{\tilde B}_2\big)\big]\hat{\tilde Z}_t\Big\}dt\\
&+\Big\{\big[P_1C_1+P_1B_3^\top P_1\big]X_t+\big[P_1\bar{\tilde F}_5+P_1B_3^\top P_2+P_1\bar D_2\big(P_1+P_2\big)\big]\hat X_t+P_1\tilde{\tilde D}_2Z_t+P_1\bar{\bar D}_2\hat Z_t+P_1D_3\tilde Z_t+P_1\bar{\tilde D}_2\hat{\tilde Z}_t\Big\}dW_t\\
&+\Big\{\big[P_1\tilde C_1+P_1\bar{\tilde B}_1^\top P_1\big]X_t+\big[P_1\bar{\tilde C}_2+P_2\big(\tilde C_1+\bar{\tilde C}_2\big)+P_2\big(\bar{\tilde B}_1^\top+\bar D_3\big)P_1+P_1\big(\bar{\tilde B}_1^\top+\bar D_3\big)P_2+P_2\big(\bar{\tilde B}_1^\top+\bar D_3\big)P_2+P_1\bar D_3P_1\big]\hat X_t\\
&\ +P_1D_3^\top Z_t+P_1\bar{\tilde F}_3\tilde Z_t+\big[P_1\bar{\bar D}_3^\top+P_2\big(D_3^\top+\bar{\bar D}_3\big)\big]\hat Z_t+\big[P_1\bar{\tilde D}_3^\top+P_2\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\big]\hat{\tilde Z}_t\Big\}d\tilde W_t\\
=\ &-\big[\big(Q_2+A_1^\top P_1\big)X_t+\big(\bar{\tilde F}_4+\bar A_2^\top P_1+A_1^\top P_2+\bar A_2^\top P_2\big)\hat X_t+C_1^\top Z_t+\bar{\tilde F}_5^\top\hat Z_t+\tilde C_1^\top\tilde Z_t+\bar{\tilde C}_2^\top\hat{\tilde Z}_t\big]dt+Z_t\,dW_t+\tilde Z_t\,d\tilde W_t.
\end{aligned}\tag{73}$$

Comparing the $dW_t$ and $d\tilde W_t$ terms on both sides of Eq. (73), we have

$$\begin{aligned}
Z_t=\ &P_1\big(C_1+B_3^\top P_1\big)X_t+\big[P_1\bar{\tilde F}_5+P_1B_3^\top P_2+P_1\bar D_2\big(P_1+P_2\big)\big]\hat X_t+P_1\tilde{\tilde D}_2Z_t+P_1\bar{\bar D}_2\hat Z_t+P_1D_3\tilde Z_t+P_1\bar{\tilde D}_2\hat{\tilde Z}_t,\\
\tilde Z_t=\ &P_1\big(\tilde C_1+\bar{\tilde B}_1^\top P_1\big)X_t+\big[P_2\big(\tilde C_1+\bar{\tilde C}_2\big)+P_1\bar{\tilde C}_2+\big(P_2\big(\bar{\tilde B}_1^\top+\bar D_3\big)+P_1\bar D_3\big)P_1+\big(P_1+P_2\big)\big(\bar{\tilde B}_1^\top+\bar D_3\big)P_2\big]\hat X_t\\
&\ +P_1D_3^\top Z_t+\big[P_1\bar{\bar D}_3^\top+P_2\big(D_3^\top+\bar{\bar D}_3\big)\big]\hat Z_t+P_1\bar{\tilde F}_3\tilde Z_t+\big[P_1\bar{\tilde D}_3^\top+P_2\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\big]\hat{\tilde Z}_t.
\end{aligned}\tag{74}$$

Taking $\mathbb{E}[\,\cdot\,|\mathcal{G}_{1,t}]$, we derive

$$\begin{aligned}
\hat Z_t&=\big[P_1\big(C_1+\bar{\tilde F}_5\big)+P_1\big(B_3^\top+\bar D_2\big)\big(P_1+P_2\big)\big]\hat X_t+P_1\big(\tilde{\tilde D}_2+\bar{\bar D}_2\big)\hat Z_t+P_1\big(D_3+\bar{\tilde D}_2\big)\hat{\tilde Z}_t,\\
\hat{\tilde Z}_t&=\big[\big(P_1+P_2\big)\big(\tilde C_1+\bar{\tilde C}_2\big)+\big(P_1+P_2\big)\big(\bar{\tilde B}_1^\top+\bar D_3\big)\big(P_1+P_2\big)\big]\hat X_t+\big(P_1+P_2\big)\big(D_3^\top+\bar{\bar D}_3\big)\hat Z_t+\big(P_1+P_2\big)\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\hat{\tilde Z}_t.
\end{aligned}\tag{75}$$

Supposing that ($I_2$ denotes the $2\times2$ identity matrix)

$$\textbf{(A2.2)}\qquad\tilde N_2^{-1}\triangleq\big[I_2-P_1\big(\tilde{\tilde D}_2+\bar{\bar D}_2\big)\big]^{-1}\quad\text{and}\quad\tilde{\tilde N}_2^{-1}\triangleq\big[I_2-\big(P_1+P_2\big)\big(D_3^\top+\bar{\bar D}_3\big)\tilde N_2^{-1}P_1\big(D_3+\bar{\tilde D}_2\big)-\big(P_1+P_2\big)\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\big]^{-1}\quad\text{exist},\tag{76}$$

we get

$$\hat Z_t=\Sigma_0(P_1,P_2)\hat X_t,\qquad\hat{\tilde Z}_t=\tilde\Sigma_0(P_1,P_2)\hat X_t,\tag{77}$$

where

$$\begin{aligned}
\tilde\Sigma_0(P_1,P_2)&\triangleq\tilde{\tilde N}_2^{-1}\big(P_1+P_2\big)\Big[\tilde C_1+\bar{\tilde C}_2+\big(\bar{\tilde B}_1^\top+\bar D_3\big)\big(P_1+P_2\big)+\big(D_3^\top+\bar{\bar D}_3\big)\tilde N_2^{-1}P_1\big(C_1+\bar{\tilde F}_5+\big(B_3^\top+\bar D_2\big)\big(P_1+P_2\big)\big)\Big],\\
\Sigma_0(P_1,P_2)&\triangleq\tilde N_2^{-1}\Big[P_1\big(D_3+\bar{\tilde D}_2\big)\tilde\Sigma_0(P_1,P_2)+P_1\big(C_1+\bar{\tilde F}_5+\big(B_3^\top+\bar D_2\big)\big(P_1+P_2\big)\big)\Big].
\end{aligned}\tag{78}$$

Inserting Eq. (77) into Eq. (74), we have

$$\begin{aligned}
Z_t=\ &P_1\big(C_1+B_3^\top P_1\big)X_t+P_1\big[\bar{\tilde F}_5+B_3^\top P_2+\bar D_2\big(P_1+P_2\big)+\bar{\bar D}_2\Sigma_0(P_1,P_2)+\bar{\tilde D}_2\tilde\Sigma_0(P_1,P_2)\big]\hat X_t+P_1\tilde{\tilde D}_2Z_t+P_1D_3\tilde Z_t,\\
\tilde Z_t=\ &P_1\big(\tilde C_1+\bar{\tilde B}_1^\top P_1\big)X_t+\Big[P_2\big(\tilde C_1+\bar{\tilde C}_2\big)+P_1\bar{\tilde C}_2+\big(P_2\big(\bar{\tilde B}_1^\top+\bar D_3\big)+P_1\bar D_3\big)P_1+\big(P_1+P_2\big)\big(\bar{\tilde B}_1^\top+\bar D_3\big)P_2\\
&\ +\big(P_1\bar{\bar D}_3^\top+P_2\big(D_3^\top+\bar{\bar D}_3\big)\big)\Sigma_0(P_1,P_2)+\big(P_1\bar{\tilde D}_3^\top+P_2\big(\bar{\tilde F}_3+\bar{\tilde D}_3\big)\big)\tilde\Sigma_0(P_1,P_2)\Big]\hat X_t+P_1D_3^\top Z_t+P_1\bar{\tilde F}_3\tilde Z_t.
\end{aligned}\tag{79}$$

Supposing that

$$\textbf{(A2.3)}\qquad\begin{aligned}&\bar N_2^{-1}\triangleq\big[I_2-P_1\tilde{\tilde D}_2\big]^{-1}=\big[I_2+P_1D_2N_2^{-1}D_2^\top\big]^{-1}\quad\text{and}\\&\bar{\bar N}_2^{-1}\triangleq\big[I_2-P_1D_3^\top\bar N_2^{-1}P_1D_3-P_1\bar{\tilde F}_3\big]^{-1}=\Big[I_2-P_1\big(\tilde{\tilde D}_1-D_2N_2^{-1}\tilde D_2^\top\big)^\top\big(I_2+P_1D_2N_2^{-1}D_2^\top\big)^{-1}P_1\big(\tilde{\tilde D}_1-D_2N_2^{-1}\tilde D_2^\top\big)-P_1\big(\tilde F_3-\tilde D_2N_2^{-1}\tilde D_2^\top\big)\Big]^{-1}\quad\text{exist},\end{aligned}\tag{80}$$

we get

$$
Z_t=\Sigma_1(P_1,P_2)X_t+\Sigma_2(P_1,P_2)\hat X_t,\qquad \tilde Z_t=\tilde\Sigma_1(P_1,P_2)X_t+\tilde\Sigma_2(P_1,P_2)\hat X_t,\tag{81}
$$

where

$$
\begin{aligned}
\Sigma_1(P_1,P_2)\triangleq{}&\bar N_2^{-1}P_1\big[C_1+B_3^{\top}P_1+D_3\,\tilde\Sigma_1(P_1,P_2)\big],\\
\tilde\Sigma_1(P_1,P_2)\triangleq{}&\bar{\bar N}_2^{-1}P_1\big[\tilde C_1+\bar{\tilde B}_1^{\top}P_1+D_3^{\top}\bar N_2^{-1}P_1\big(C_1+B_3^{\top}P_1\big)\big],\\
\Sigma_2(P_1,P_2)\triangleq{}&\bar N_2^{-1}\Big\{P_1\big[\bar{\tilde F}_5+B_3^{\top}P_2+\bar D_2(P_1+P_2)+\bar{\bar D}_2\Sigma_0(P_1,P_2)+\bar{\tilde D}_2\tilde\Sigma_0(P_1,P_2)\big]+P_1D_3\,\tilde\Sigma_2(P_1,P_2)\Big\},\\
\tilde\Sigma_2(P_1,P_2)\triangleq{}&\bar{\bar N}_2^{-1}\Big\{P_2(\tilde C_1+\bar{\tilde C}_2)+P_1\bar{\tilde C}_2+P_2(\bar{\tilde B}_1^{\top}+\bar D_3)P_1+P_1(\bar{\tilde B}_1^{\top}+\bar D_3)P_2+P_1\bar D_3P_1+P_2(\bar{\tilde B}_1^{\top}+\bar D_3)P_2\\
&\quad+\big[P_1\bar{\bar D}_3^{\top}+P_2(D_3^{\top}+\bar{\bar D}_3)-P_1\bar{\bar D}_2\big]\Sigma_0(P_1,P_2)+\big[P_1\bar{\tilde D}_3^{\top}+P_2(\bar{\tilde F}_3+\bar{\tilde D}_3)-P_1\bar{\tilde D}_2\big]\tilde\Sigma_0(P_1,P_2)\\
&\quad+P_1D_3^{\top}\bar N_2^{-1}P_1\big[\bar{\tilde F}_5+B_3^{\top}P_2+\bar D_2(P_1+P_2)+\bar{\bar D}_2\Sigma_0(P_1,P_2)+\bar{\tilde D}_2\tilde\Sigma_0(P_1,P_2)\big]\Big\}.
\end{aligned}\tag{82}
$$

Comparing the coefficients of $dt$ on both sides of Eq. (73) and inserting Eqs. (77) and (81), we get

$$
\begin{aligned}
0={}&\dot P_1+P_1A_1+A_1^{\top}P_1+P_1\bar{\tilde F}_1P_1+Q_2+\big(C_1^{\top}+P_1B_3\big)\Sigma_1(P_1,P_2)+\big(\tilde C_1^{\top}+P_1\bar{\tilde B}_1\big)\tilde\Sigma_1(P_1,P_2),\\
0={}&\dot P_2+P_2(A_1+\bar A_2)+(A_1+\bar A_2)^{\top}P_2+P_2(\bar{\tilde F}_1+\bar B_2)P_1+P_1(\bar{\tilde F}_1+\bar B_2)P_2+P_2(\bar{\tilde F}_1+\bar B_2)P_2\\
&+P_1\bar A_2+\bar A_2^{\top}P_1+P_1\bar B_2P_1+\bar{\tilde F}_4+\big(C_1^{\top}+P_1B_3\big)\Sigma_2(P_1,P_2)+\big(\tilde C_1^{\top}+P_1\bar{\tilde B}_1\big)\tilde\Sigma_2(P_1,P_2)\\
&+\big[\bar{\tilde F}_5^{\top}+P_1\bar{\bar B}_2+P_2(B_3+\bar{\bar B}_2)\big]\Sigma_0(P_1,P_2)+\big[\bar{\tilde C}_2^{\top}+P_1\bar{\tilde B}_2+P_2(\bar{\tilde B}_1+\bar{\tilde B}_2)\big]\tilde\Sigma_0(P_1,P_2),\\
&P_1(T)=G_2,\qquad P_2(T)=0.
\end{aligned}\tag{83}
$$

Note that the system of Riccati equations (83) is nonstandard, and its solvability remains open; for technical reasons we cannot establish it at present. In some special cases, however, $P_1(t)$ and $P_2(t)$ are not coupled: one can first solve the equation for $P_1(t)$, and then that for $P_2(t)$, by standard Riccati equation theory. We omit the details due to space limitations and will consider the general solvability of Eq. (83) in future work.
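In the decoupled special case just mentioned, each equation of (83) reduces to a standard matrix Riccati ODE integrated backward from its terminal condition. A minimal numerical sketch (all coefficients $A$, $Q$, $R$, $G$ below are generic placeholders standing in for the model's data, with $R$ playing the role of the quadratic term):

```python
import numpy as np

def solve_riccati_ode(A, Q, R, G, T=1.0, steps=2000):
    """Integrate dP/dt + P A + A^T P + P R P + Q = 0 backward from P(T) = G
    with a classical 4th-order Runge-Kutta scheme."""
    h = T / steps
    def f(P):  # right-hand side of dP/dt = -(P A + A^T P + P R P + Q)
        return -(P @ A + A.T @ P + P @ R @ P + Q)
    P = G.copy()
    for _ in range(steps):  # march from t = T down to t = 0
        k1 = f(P); k2 = f(P - 0.5 * h * k1)
        k3 = f(P - 0.5 * h * k2); k4 = f(P - h * k3)
        P = P - (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return P  # approximation of P(0)

# placeholder data: symmetric Q, R, G keep P symmetric along the backward flow
n = 2
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
Q = np.eye(n); R = -0.2 * np.eye(n); G = np.eye(n)
P0 = solve_riccati_ode(A, Q, R, G)
assert np.allclose(P0, P0.T, atol=1e-8)  # symmetry is preserved
```

Solving for $P_1$ this way and then feeding it into the (now linear-in-$P_2$) second equation mirrors the sequential procedure described above.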

Substituting Eqs. (77) and (81) into Eq. (68), we obtain

$$
\begin{aligned}
u_2(t)={}&N_2^{-1}\Big\{\big[B_2^{\top}P_1+D_2^{\top}\Sigma_1(P_1,P_2)+\tilde D_2^{\top}\tilde\Sigma_1(P_1,P_2)\big]X_t\\
&+\big[\tilde F_4^{\top}+B_2^{\top}P_2+\tilde B_2^{\top}(P_1+P_2)+D_2^{\top}\Sigma_2(P_1,P_2)+\tilde F_2^{\top}\Sigma_0(P_1,P_2)+\tilde D_2^{\top}\tilde\Sigma_2(P_1,P_2)+\tilde{\tilde D}_2^{\top}\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\Big\},
\end{aligned}\tag{84}
$$

and the optimal “state” $X=(x,p)^{\top}$ of the leader satisfies

$$
\begin{aligned}
dX_t={}&\Big\{\big[A_1+\bar{\tilde F}_1P_1+B_3\Sigma_1(P_1,P_2)+\bar{\tilde B}_1\tilde\Sigma_1(P_1,P_2)\big]X_t\\
&+\big[\bar A_2+\bar{\tilde F}_1P_2+\bar B_2(P_1+P_2)+B_3\Sigma_2(P_1,P_2)+\bar{\bar B}_2\Sigma_0(P_1,P_2)+\bar{\tilde B}_1\tilde\Sigma_2(P_1,P_2)+\bar{\tilde B}_2\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\Big\}dt\\
&+\Big\{\big[C_1+B_3^{\top}P_1+\tilde{\tilde D}_2\Sigma_1(P_1,P_2)+D_3\tilde\Sigma_1(P_1,P_2)\big]X_t\\
&+\big[\bar{\tilde F}_5+B_3^{\top}P_2+\bar D_2(P_1+P_2)+\tilde{\tilde D}_2\Sigma_2(P_1,P_2)+\bar{\bar D}_2\Sigma_0(P_1,P_2)+D_3\tilde\Sigma_2(P_1,P_2)+\bar{\tilde D}_2\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\Big\}dW_t\\
&+\Big\{\big[\tilde C_1+\bar{\tilde B}_1^{\top}P_1+D_3^{\top}\Sigma_1(P_1,P_2)+\bar{\tilde F}_3\tilde\Sigma_1(P_1,P_2)\big]X_t\\
&+\big[\bar{\tilde C}_2+\bar{\tilde B}_1^{\top}P_2+\bar D_3(P_1+P_2)+D_3^{\top}\Sigma_2(P_1,P_2)+\bar{\bar D}_3^{\top}\Sigma_0(P_1,P_2)+\bar{\tilde F}_3\tilde\Sigma_2(P_1,P_2)+\bar{\tilde D}_3\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\Big\}d\tilde W_t,\\
&X_0=X_0,
\end{aligned}\tag{85}
$$

where $\hat X$ is governed by

$$
\begin{aligned}
d\hat X_t={}&\big[A_1+\bar A_2+(\bar{\tilde F}_1+\bar B_2)(P_1+P_2)+B_3\Sigma_1(P_1,P_2)+\bar{\tilde B}_1\tilde\Sigma_1(P_1,P_2)\\
&\quad+B_3\Sigma_2(P_1,P_2)+\bar{\bar B}_2\Sigma_0(P_1,P_2)+\bar{\tilde B}_1\tilde\Sigma_2(P_1,P_2)+\bar{\tilde B}_2\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\,dt\\
&+\big[\tilde C_1+\bar{\tilde C}_2+(\bar{\tilde B}_1^{\top}+\bar D_3)(P_1+P_2)+D_3^{\top}\Sigma_1(P_1,P_2)+\bar{\tilde F}_3\tilde\Sigma_1(P_1,P_2)\\
&\quad+D_3^{\top}\Sigma_2(P_1,P_2)+\bar{\bar D}_3^{\top}\Sigma_0(P_1,P_2)+\bar{\tilde F}_3\tilde\Sigma_2(P_1,P_2)+\bar{\tilde D}_3\tilde\Sigma_0(P_1,P_2)\big]\hat X_t\,d\tilde W_t,\qquad \hat X_0=X_0.
\end{aligned}\tag{86}
$$
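Observe that Eq. (86) is a linear SDE in $\hat X$ driven by $\tilde W$ alone, so once the gains are assembled into one drift matrix and one diffusion matrix, sample paths can be generated by the Euler-Maruyama scheme. A sketch with placeholder matrices $F$ and $H$ standing in for the two bracketed coefficients of Eq. (86) (not computed from the model's data):

```python
import numpy as np

def euler_maruyama_linear(F, H, X0, T=1.0, steps=1000, seed=0):
    """Simulate dX = F X dt + H X dW on [0, T] by the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    h = T / steps
    X = np.array(X0, dtype=float)
    path = [X.copy()]
    for _ in range(steps):
        dW = np.sqrt(h) * rng.standard_normal()  # scalar Brownian increment
        X = X + F @ X * h + H @ X * dW
        path.append(X.copy())
    return np.array(path)

# placeholder matrices standing in for the bracketed coefficients of Eq. (86)
F = np.array([[-0.5, 0.1], [0.0, -0.8]])   # closed-loop drift
H = np.array([[0.2, 0.0], [0.0, 0.1]])     # closed-loop diffusion
path = euler_maruyama_linear(F, H, X0=[1.0, 1.0])
assert path.shape == (1001, 2) and np.isfinite(path).all()
```

With $\hat X$ simulated this way, Eq. (85) for $X$ can then be advanced along the same time grid using both Brownian increments.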

We summarize the above analysis in the following theorem.

Theorem 3.2 Let (A2.1)–(A2.3) hold, let $(P_1(t),P_2(t))$ satisfy Eq. (83), let $\hat X$ be the $\mathcal G_{1,t}$-adapted solution to Eq. (86), and let $X$ be the $\mathcal F_t$-adapted solution to Eq. (85). Define $Y$ and $(Z,\tilde Z)$ by Eqs. (71) and (81), respectively. Then Eq. (69) holds, and $u_2$ given by Eq. (84) is a feedback optimal control of the leader.

Finally, the optimal control $u_1$ of the follower can also be represented in a nonanticipating way. In fact, by Eq. (47), and noting Eqs. (68), (71), and (77), we have

$$
\begin{aligned}
u_1(t)={}&\tilde N_1^{-1}(t)\big[\tilde S_1^{\top}(t)\hat x_t+\tilde S(t)\hat u_2(t)+B_1^{\top}\hat\varphi_t+\tilde D_1^{\top}\hat\beta_t\big]\\
={}&\tilde N_1^{-1}(t)\big[\big(\tilde S_1^{\top}(t)\ \ 0\big)\hat X_t+\tilde S(t)\hat u_2(t)+\big(0\ \ B_1^{\top}\big)\hat Y_t+\big(0\ \ \tilde D_1^{\top}\big)\hat{\tilde Z}_t\big]\\
={}&\tilde N_1^{-1}(t)\Big\{\big(\tilde S_1^{\top}(t)\ \ 0\big)+\tilde S(t)N_2^{-1}\big[\tilde F_4^{\top}+(B_2+\tilde B_2)^{\top}(P_1+P_2)+(D_2+\tilde F_2)^{\top}\Sigma_0(P_1,P_2)+(\tilde D_2+\tilde{\tilde D}_2)^{\top}\tilde\Sigma_0(P_1,P_2)\big]\\
&+\big(0\ \ B_1^{\top}\big)(P_1+P_2)+\big(0\ \ \tilde D_1^{\top}\big)\tilde\Sigma_0(P_1,P_2)\Big\}\hat X_t,
\end{aligned}\tag{87}
$$

which is adapted to the follower's available information, as desired.

Remark 3.3 When we consider the complete information case, that is, when $\tilde W$ disappears and $\mathcal G_{1,t}=\mathcal F_t$, Theorems 3.1 and 3.2 coincide with Theorems 2.3 and 3.3 in Yong [13].

## 4. Concluding remarks

In this chapter, we have studied a leader-follower stochastic differential game with asymmetric information. This kind of game problem possesses several attractive features. First, the problem has the Stackelberg feature, meaning that the two players play different roles during the game. Thus the usual approaches to game problems, such as those in [6, 7, 8, 10], where the two players act in equivalent roles, do not apply. Second, the game has asymmetric information between the two players, which was not considered in [3, 13, 14]. Specifically, the information available to the follower is based on some sub-$\sigma$-algebra of that available to the leader. Stochastic filtering techniques are introduced to compute the optimal filtering estimates of the corresponding adjoint processes, which constitute the solution to some forward-backward stochastic differential filtering equation (FBSDFE). Third, the Stackelberg equilibrium is represented in state feedback form for the LQ problem under appropriate assumptions. Some new conditional mean-field FBSDEs and a system of Riccati equations are introduced to deal with the leader's LQ problem.

In principle, Theorems 3.1 and 3.2 provide a useful tool for seeking Stackelberg equilibria. As a first step in this direction, we apply our results to the LQ problem to obtain explicit solutions. We hope to return to the more general case in our future research. It is also worthwhile to study the closed-loop Stackelberg equilibrium for our problem, as well as the solvability of the system of Riccati equations. These challenging topics will be considered in our future work.

## Acknowledgments

Jingtao Shi would like to thank the book editor for his/her comments and suggestions. Jingtao Shi also would like to thank Professor Guangchen Wang from Shandong University and Professor Jie Xiong from Southern University of Science and Technology, for their effort and discussion during the writing of this chapter.

## Notes

The main content of this chapter is drawn from the following two published papers: (1) Shi, J.T., Wang, G.C., & Xiong, J. (2016). Leader-follower stochastic differential games with asymmetric information and applications. Automatica, Vol. 63, 60–73. (2) Shi, J.T., Wang, G.C., & Xiong, J. (2017). Linear-quadratic stochastic Stackelberg differential game with asymmetric information. Science China Information Sciences, Vol. 60, 092202:1–15.


© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## How to cite and reference

### Cite this chapter

Jingtao Shi (September 26th 2018). Stochastic Leader-Follower Differential Game with Asymmetric Information, Game Theory - Applications in Logistics and Economy, Danijela Tuljak-Suban, IntechOpen, DOI: 10.5772/intechopen.75413.
