Open access peer-reviewed chapter

Recent Advances in Nonlinear Filtering with a Financial Application to Derivatives Hedging under Incomplete Information

By Claudia Ceci and Katia Colaneri

Submitted: November 15th 2016; Reviewed: June 8th 2017; Published: November 2nd 2017

DOI: 10.5772/intechopen.70060

Abstract

In this chapter, we present some recent results about nonlinear filtering for jump diffusion signal and observation driven by correlated Brownian motions having common jump times. We provide the Kushner-Stratonovich and the Zakai equation for the normalized and the unnormalized filter, respectively. Moreover, we give conditions under which pathwise uniqueness for the solutions of both equations holds. Finally, we study an application of nonlinear filtering to the financial problem of derivatives hedging in an incomplete market with partial observation. Precisely, we consider the risk-minimizing hedging approach. In this framework, we compute the optimal hedging strategy for an informed investor and a partially informed one and compare the total expected squared costs of the strategies.

Keywords

  • nonlinear filtering
  • jump diffusions
  • risk minimization
  • Galtchouk-Kunita-Watanabe decomposition
  • partial information

1. Introduction

Bayesian inference and stochastic filtering are closely related, since in both approaches one wants to estimate quantities that are not directly observable. However, while in Bayesian inference all sources of uncertainty are treated as random variables, stochastic filtering deals with stochastic processes. It also covers many situations, from the linear to the nonlinear case, with various types of noise.

The objective of this chapter is to present nonlinear filtering results for Markovian partially observable systems where the state and the observation processes are described by jump diffusions with correlated Brownian motions and common jump times. We also aim to apply this theory to the financial problem of derivatives hedging for a trader who has limited information on the market.

A filtering model is characterized by a signal process, denoted by $X$, which cannot be observed directly, and an observation process, denoted by $Y$, whose dynamics depend on $X$. The natural filtration of $Y$, $\mathbb{F}^Y=\{\mathcal{F}^Y_t,\ t\in[0,T]\}$, represents the available information. The goal of solving a filtering problem is to determine the best estimate of the signal $X_t$ from the knowledge of $\mathcal{F}^Y_t$. As in optimal Bayesian filtering, we seek the best estimate of the signal according to the minimum mean-squared error criterion, which corresponds to computing the posterior distribution of $X_t$ given the available observations up to time $t$.

Historically, the first example of a continuous-time filtering problem is the well-known Kalman-Bucy filter, which concerns the case where $Y$ gives the observation of $X$ in additive Gaussian noise and both processes $X$ and $Y$ are modeled by linear stochastic differential equations. In this case, one ends up with a filter having a finite-dimensional realization. Since then, the problem has been extended in many directions. To start, a number of authors, including Refs. [1-3], studied the nonlinear case in the setting of additive Gaussian noise. Other references in a similar framework are given, for instance, by Refs. [4-8]. Subsequently, the case of counting process or marked point process observations has also been considered (see Refs. [9-14] and references therein). More recent literature covers the case of mixed-type observations (marked point processes and diffusions or jump-diffusion processes); see, for example, Refs. [15-18].

There are two major approaches to nonlinear filtering problems: the innovations method and the reference probability method. The latter is usually employed when it is possible to find an equivalent probability measure that makes the state $X$ and the observations $Y$ independent. This technique may be problematic when, for instance, signal and observation are correlated and have common jump times. Therefore, in this chapter, we use the innovations approach, which circumvents the technical issues arising in the reference probability method. By characterizing the innovation process and applying a martingale representation theorem, we derive the dynamics of the filter as the solution of the Kushner-Stratonovich equation, which is a nonlinear stochastic partial integro-differential equation. By considering the unnormalized version of the filter, it is possible to simplify this equation and make it linear. The resulting equation is called the Zakai equation, and due to its linear nature, it is of particular interest in many applications. We also compute the dynamics of the unnormalized filter, and we investigate pathwise uniqueness for the solutions of both equations. Normalized and unnormalized filters are probability measure-valued and finite measure-valued processes, respectively, and are therefore in general infinite-dimensional. For this reason, various recursive algorithms for statistical inference have been developed to address this intractability, such as the extended Kalman filter, statistical linearization, or particle filters. These algorithms aim to estimate both state and parameters. For parameter estimation, we also mention the expectation maximization (EM) algorithm, which makes it possible to estimate parameters in models with incomplete data; see, for example, Ref. [19].

The success of filtering theory over the years is due to its use in a great variety of problems arising from many disciplines such as engineering, information sciences, and mathematical finance. Specifically, in this chapter, we have a financial application in view. In real financial markets, it is reasonable to assume that investors cannot fully know all the stochastic factors that may influence the prices of traded assets, since these factors are usually associated with economic quantities which are hard to observe. Filtering theory represents a way to measure, in some sense, this uncertainty. A substantial part of the literature over the last years has considered stochastic factor models under partial information for analyzing various financial problems, such as pricing and hedging of derivatives, optimal investment, credit risk, and insurance modeling. A list, by no means exhaustive, is given by Refs. [15, 16, 20-26].

In the following, we consider the problem of a trader who wants to determine the hedging strategy for a European-type contingent claim with maturity $T$ in an incomplete financial market where the investment possibilities are given by a riskless asset, assumed to be the numéraire, and a risky asset with price dynamics given by a geometric jump diffusion, modeled by the process $Y$. We assume that the drift, as well as the intensity and the jump size distribution of the price process, is influenced by an unobservable stochastic factor $X$, modeled as a correlated jump diffusion with common jump times. By common jump times, we intend to take into account catastrophic events which affect both the asset price and the hidden state variable driving its dynamics. The agent knows the asset prices, since they are publicly available, and trades on the market by using the available information $\mathbb{F}^Y$.

Partial information easily leads to incomplete financial markets, as the number of random sources is clearly larger than the number of tradeable risky assets. Therefore, the existence of a self-financing strategy that replicates the payoff of the given contingent claim at maturity is not guaranteed. Here, we assume that the risky asset price is modeled under a martingale measure, and we choose the risk-minimization approach as hedging criterion; see, for example, Refs. [27, 28].

According to this method, the optimal hedging strategy is the one that perfectly replicates the claim at maturity and has minimum cost in the mean-square sense. Equivalently, we say that it minimizes the associated risk defined as the conditional expected value of the squared future costs, given the available information (see Refs. [28, 29] and references therein).

The risk-minimizing hedging strategy under restricted information is closely related to the Galtchouk-Kunita-Watanabe decomposition of the random variable representing the payoff of the contingent claim in a partial information setting. Here, we provide a characterization of the risk-minimizing strategy under partial information via this orthogonal decomposition and obtain a representation in terms of the corresponding risk-minimizing hedging strategy under full information (see, e.g., Refs. [29, 30]) via predictable projections on the available information flow by means of the filter. Finally, we investigate the difference between the expected total risks associated with the optimal hedging strategies under full and partial information.

The chapter has the following structure. In Section 2, we introduce the general framework. In Section 3, we study the filtering equations. In particular, we derive the dynamics for both normalized and unnormalized filters, and we investigate uniqueness of the solutions of the Kushner-Stratonovich and the Zakai equation. In Section 4, we analyze a financial application to risk minimization by computing the optimal hedging strategies for a European-type contingent claim under full and partial information and providing a comparison between the corresponding expected squared total costs.

2. The setting

We consider a pair of stochastic processes $(X,Y)$, with values in $\mathbb{R}\times\mathbb{R}$ and càdlàg trajectories, on a complete filtered probability space $(\Omega,\mathcal{F},\mathbb{F},P)$, where $\mathbb{F}=\{\mathcal{F}_t,\ t\in[0,T]\}$ is a filtration satisfying the usual conditions of right continuity and completeness, and $T$ is a fixed time horizon. The pair $(X,Y)$ represents a partially observable system, where $X$ is a signal process that describes a phenomenon which is not directly observable, and $Y$ gives the observation of $X$; it is modeled by a process correlated with the signal, possibly having jump times in common with it.

Remark 1. In view of the financial application discussed in Section 4, $Y$ represents the price of some risky asset, while $X$ is an unknown stochastic factor, which may describe the activity of other markets, macroeconomic factors, or microstructure rules that influence the dynamics of the stock price process.

We define the observed history as the natural filtration of the observation process $Y$, that is, $\mathbb{F}^Y=\{\mathcal{F}^Y_t\}_{t\in[0,T]}$, where $\mathcal{F}^Y_t:=\sigma(Y_s,\ 0\le s\le t)$. The $\sigma$-algebra $\mathcal{F}^Y_t$ can be interpreted as the information available from observations up to time $t$. We aim to compute the best estimate of the signal $X$ from the available information, in the quadratic sense. In other words, this corresponds to determining the filter, which provides the conditional distribution of $X_t$ given $\mathcal{F}^Y_t$, for every $t\in[0,T]$.

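Let $\mathcal{M}(\mathbb{R})$ be the space of finite measures over $\mathbb{R}$ and $\mathcal{P}(\mathbb{R})$ the subspace of probability measures over $\mathbb{R}$. Given $\mu\in\mathcal{M}(\mathbb{R})$, for any bounded measurable function $f$, we write

$$\mu(f)=\int_{\mathbb{R}} f(x)\,\mu(dx). \tag{1}$$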

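Definition 2. The filter is the $\mathbb{F}^Y$-càdlàg process $\pi$ taking values in $\mathcal{P}(\mathbb{R})$ defined by

$$\pi_t(f):=\mathbb{E}\big[f(t,X_t)\,\big|\,\mathcal{F}^Y_t\big]=\int_{\mathbb{R}} f(t,x)\,\pi_t(dx), \tag{2}$$

for all bounded and measurable functions $f(t,x)$ on $[0,T]\times\mathbb{R}$.

In the sequel, we denote by $\pi_{t^-}$ the left version of the filter, and for all functions $F(t,x,y)$ such that $\mathbb{E}|F(t,X_t,Y_t)|<\infty$ (resp. $\mathbb{E}|F(t,X_{t^-},Y_{t^-})|<\infty$) for every $t\in[0,T]$, we use the notation $\pi_t(F):=\pi_t(F(t,\cdot,Y_t))$ (resp. $\pi_{t^-}(F):=\pi_{t^-}(F(t,\cdot,Y_{t^-}))$).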

In this paper, we wish to consider the filtering problem for a partially observable system (X, Y) described by the following pair of stochastic differential equations:

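$$
\begin{cases}
dX_t=b_0(t,X_t)\,dt+\sigma_0(t,X_t)\,dW^0_t+\displaystyle\int_Z K_0(t,X_{t^-};\zeta)\,N(dt,d\zeta), & X_0=x_0\in\mathbb{R},\\[4pt]
dY_t=b_1(t,X_t,Y_t)\,dt+\sigma_1(t,Y_t)\,dW^1_t+\displaystyle\int_Z K_1(t,X_{t^-},Y_{t^-};\zeta)\,N(dt,d\zeta), & Y_0=y_0\in\mathbb{R},
\end{cases} \tag{3}
$$

where $W^0$ and $W^1$ are correlated $(\mathbb{F},P)$-Brownian motions with correlation coefficient $\rho\in[-1,1]$, and $N(dt,d\zeta)$ is a Poisson random measure on $\mathbb{R}^+\times Z$ with intensity measure $\nu(d\zeta)\,dt$, where $\nu$ is a $\sigma$-finite measure on a measurable space $(Z,\mathcal{Z})$. Here, $b_0,b_1,\sigma_0,\sigma_1,K_0$, and $K_1$ are $\mathbb{R}$-valued measurable functions of their arguments. In particular, $\sigma_0(t,x)$ and $\sigma_1(t,y)$ are strictly positive for every $(t,x,y)\in[0,T]\times\mathbb{R}^2$.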

For the rest of the paper, we assume that strong existence and uniqueness for system Eq. (3) holds. Sufficient conditions are collected, for instance, in Ref. [18, Appendix]. These assumptions also imply Markovianity for the pair (X, Y).
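To make the structure of system (3) concrete, the following is a minimal Euler-type simulation sketch. Everything specific in it is an assumption made for illustration only: the coefficient functions, the finite measure $\nu$ (total mass `jump_rate`, marks uniform on $[-1,1]$), the function name `simulate_system`, and the approximation of at most one atom of $N$ per time step. It shows the two modeling features discussed above: Brownian increments with correlation $\rho$, and a single Poisson random measure that makes $X$ and $Y$ jump at the same times.

```python
import math
import random


def simulate_system(T=1.0, n_steps=1000, rho=0.5, jump_rate=2.0, seed=0):
    """Euler sketch for a toy instance of system (3).

    Returns (times, X, Y, jump_steps), where jump_steps lists the grid
    indices at which an atom of the Poisson random measure was drawn.
    """
    rng = random.Random(seed)
    dt = T / n_steps

    # Hypothetical coefficients, chosen only for illustration.
    def b0(t, x): return -x                        # signal drift
    def s0(t, x): return 0.3                       # signal volatility
    def b1(t, x, y): return 0.1 * x                # observation drift depends on x
    def s1(t, y): return 0.2                       # observation volatility: no x
    def K0(t, x, z): return 0.5 * z                # signal jump size for mark z
    def K1(t, x, y, z): return z * (1.0 + abs(x))  # observation jump size

    x, y = 0.0, 1.0
    times, X, Y, jump_steps = [0.0], [x], [y], []
    for k in range(n_steps):
        t = k * dt
        # Brownian increments with correlation rho.
        g0, g1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dW0 = math.sqrt(dt) * g0
        dW1 = math.sqrt(dt) * (rho * g0 + math.sqrt(1.0 - rho ** 2) * g1)
        dx = b0(t, x) * dt + s0(t, x) * dW0
        dy = b1(t, x, y) * dt + s1(t, y) * dW1
        # At most one atom per step (Bernoulli approximation of N).
        if rng.random() < jump_rate * dt:
            z = rng.uniform(-1.0, 1.0)
            dx += K0(t, x, z)        # both jump sizes use pre-jump values
            dy += K1(t, x, y, z)     # and the same mark: a common jump time
            jump_steps.append(k + 1)
        x, y = x + dx, y + dy
        times.append((k + 1) * dt)
        X.append(x)
        Y.append(y)
    return times, X, Y, jump_steps
```

A call such as `times, X, Y, jumps = simulate_system(seed=1)` returns the two paths on a common grid. In a filtering experiment only `Y` would be handed to the estimator, while `X` is kept aside as ground truth.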

Remark 3. Note that the quadratic variation process of $Y$ defined by

$$[Y]_t=Y_t^2-2\int_0^t Y_{u^-}\,dY_u,\quad t\in[0,T], \tag{4}$$

is $\mathbb{F}^Y$-adapted and $[Y]_t=\int_0^t\sigma_1^2(u,Y_u)\,du+\sum_{u\le t}(\Delta Y_u)^2$, where $\Delta Y_t:=Y_t-Y_{t^-}$. Therefore, it is natural to assume that the signal $X$ does not affect the diffusion coefficient in the dynamics of $Y$. If $Y$ describes the price of a risky asset, this implies that the volatility of the stock price does not depend on the stochastic factor $X$.

The jump component of Y can be described in terms of the following integer-valued random measure on [0, T] × R:

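$$m(dt,dz)=\sum_{s:\,\Delta Y_s\neq 0}\delta_{\{s,\Delta Y_s\}}(dt,dz), \tag{5}$$

where $\delta_a$ denotes the Dirac measure at point $a$. Note that the following equality holds:

$$\int_0^t\int_{\mathbb{R}} z\,m(ds,dz)=\int_0^t\int_Z K_1(s,X_{s^-},Y_{s^-};\zeta)\,N(ds,d\zeta). \tag{6}$$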

For all $t\in[0,T]$ and all $A\in\mathcal{B}(\mathbb{R})$, we define the following sets:

$$d^0(t,x):=\{\zeta\in Z: K_0(t,x;\zeta)\neq 0\},\qquad d^1(t,x,y):=\{\zeta\in Z: K_1(t,x,y;\zeta)\neq 0\}, \tag{7}$$
$$d^A(t,x,y):=\{\zeta\in Z: K_1(t,x,y;\zeta)\in A\setminus\{0\}\}\subseteq d^1(t,x,y), \tag{8}$$
$$D_t^A:=d^A(t,X_{t^-},Y_{t^-})\subseteq D_t:=d^1(t,X_{t^-},Y_{t^-}),\qquad D_t^0:=d^0(t,X_{t^-}). \tag{9}$$

Typically, we have $D_t^0\cap D_t\neq\emptyset$ $P$-a.s., which means that state and observation may have common jump times. This feature is particularly meaningful in financial applications to model catastrophic events that produce jumps in both the stock price and the underlying stochastic factor that influences its dynamics.

To ensure existence of the first moment for the pair (X, Y) and non-explosiveness for the jump process governing the dynamics of X and Y, we make the following assumption:

Assumption 4.

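$$\mathbb{E}\left[\int_0^T\Big\{|b_0(t,X_t)|+\sigma_0^2(t,X_t)+\int_Z|K_0(t,X_t;\zeta)|\,\nu(d\zeta)\Big\}\,dt\right]<\infty, \tag{10}$$
$$\mathbb{E}\left[\int_0^T\Big\{|b_1(t,X_t,Y_t)|+\sigma_1^2(t,Y_t)+\int_Z|K_1(t,X_t,Y_t;\zeta)|\,\nu(d\zeta)\Big\}\,dt\right]<\infty, \tag{11}$$
$$\mathbb{E}\left[\int_0^T\nu(D_t^0\cup D_t)\,dt\right]<\infty. \tag{12}$$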

Denote by $\eta^P(dt,dz)$ the $(\mathbb{F},P)$-compensator of $m(dt,dz)$ (see, e.g., Refs. [9, 31] for the definition).

Then, in Ref. [14, Proposition 2.2], it is proved that

$$\eta^P(dt,dz)=\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},dz)\,dt, \tag{13}$$

where

$$\lambda(t,x,y)\,\varphi(t,x,y,dz)=\int_{d^1(t,x,y)}\delta_{K_1(t,x,y;\zeta)}(dz)\,\nu(d\zeta) \tag{14}$$

and in particular $\lambda(t,x,y)=\nu(d^1(t,x,y))$.

Remark 5. Let us observe that both local jump characteristics $(\lambda(t,X_{t^-},Y_{t^-}),\varphi(t,X_{t^-},Y_{t^-},dz))$ depend on $X$ and that, for all $A\in\mathcal{B}(\mathbb{R})$, $\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},A)=\nu(D_t^A)$ provides the $(\mathbb{F},P)$-intensity of the point process $N_t(A):=m((0,t]\times A)$. Accordingly, the process $\lambda(t,X_{t^-},Y_{t^-})=\nu(D_t)$ is the $(\mathbb{F},P)$-intensity of the point process $N_t(\mathbb{R})$, which counts the total number of jumps of $Y$ until time $t$.
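As an illustration of Eq. (14), the sketch below computes the local characteristics when the mark space $Z$ is finite, so that the integral against $\nu$ becomes a sum. The function name `local_characteristics`, the three-point mark space, and the jump-size function `K1` are hypothetical choices made only for this example.

```python
def local_characteristics(nu, K1, t, x, y):
    """Return (lam, phi) as in Eq. (14) for a finite mark space.

    nu: dict mapping each mark zeta to its weight nu({zeta}).
    K1: jump-size function K1(t, x, y, zeta) of the observation.
    phi is returned as a dict {jump size z: probability of that size}.
    """
    mass_by_size = {}
    for zeta, weight in nu.items():
        z = K1(t, x, y, zeta)
        if z != 0:  # zeta belongs to the set d^1(t, x, y)
            mass_by_size[z] = mass_by_size.get(z, 0.0) + weight
    lam = sum(mass_by_size.values())  # lambda = nu(d^1(t, x, y))
    phi = {z: w / lam for z, w in mass_by_size.items()} if lam > 0 else {}
    return lam, phi


# Hypothetical example: mark "a" never moves Y, mark "b" moves Y only
# when the signal x is nonzero, and mark "c" always moves Y.
nu = {"a": 1.0, "b": 2.0, "c": 0.5}


def K1(t, x, y, zeta):
    return {"a": 0.0, "b": 0.1 * x, "c": -0.2}[zeta]
```

Evaluating `local_characteristics(nu, K1, 0.0, 1.0, 1.0)` and `local_characteristics(nu, K1, 0.0, 0.0, 1.0)` shows how both the intensity and the jump-size distribution change with the hidden state $x$, which is the dependence pointed out in Remark 5.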

2.1. The innovation process

To derive the filtering equation, we use the innovations approach. This method requires introducing a pair $(I,m^\pi)$, called the innovation process, consisting of an $(\mathbb{F}^Y,P)$-Brownian motion and an $(\mathbb{F}^Y,P)$-compensated jump measure that drive the dynamics of the filter. The innovation process is also the building block of $(\mathbb{F}^Y,P)$-martingales.

To introduce the first component of the innovation process, we assume that

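$$\mathbb{E}\left[\exp\left\{\frac12\int_0^T\left(\frac{b_1(t,X_t,Y_t)}{\sigma_1(t,Y_t)}\right)^2dt\right\}\right]<\infty, \tag{15}$$

and define

$$I_t:=W^1_t+\int_0^t\left(\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}-\frac{\pi_s(b_1)}{\sigma_1(s,Y_s)}\right)ds,\quad t\in[0,T]. \tag{16}$$

The process $I$ is an $(\mathbb{F}^Y,P)$-Brownian motion (see, e.g., Ref. [4]), and the $(\mathbb{F}^Y,P)$-compensated jump martingale measure is given by

$$m^\pi(dt,dz)=m(dt,dz)-\pi_{t^-}(\lambda\varphi(dz))\,dt, \tag{17}$$

see, e.g., Ref. [14]. The following theorem provides a characterization of $(\mathbb{F}^Y,P)$-martingales in terms of the innovation process.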

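Theorem 6 (A martingale representation theorem). Under Assumption 4 and the integrability condition Eq. (15), every $(\mathbb{F}^Y,P)$-local martingale $M$ admits the following decomposition:

$$M_t=M_0+\int_0^t\int_{\mathbb{R}} w_s(z)\,m^\pi(ds,dz)+\int_0^t h_s\,dI_s,\quad t\in[0,T], \tag{18}$$

where $w(z)=\{w_t(z),\ t\in[0,T]\}$ is an $\mathbb{F}^Y$-predictable process indexed by $z$, and $h=\{h_t,\ t\in[0,T]\}$ is an $\mathbb{F}^Y$-adapted process such that

$$\int_0^T\int_{\mathbb{R}}|w_t(z)|\,\pi_{t^-}(\lambda\varphi(dz))\,dt<\infty,\qquad \int_0^T h_t^2\,dt<\infty\quad P\text{-a.s.} \tag{19}$$

Proof. The proof is given in Ref. [17, Proposition 2.4]. Note that here condition (15) implies that $\mathbb{E}\left[\int_0^T\left(\frac{b_1(t,X_t,Y_t)}{\sigma_1(t,Y_t)}\right)^2dt\right]<\infty$, and also that the process $L$ defined by

$$L_t=\exp\left(-\int_0^t\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\,dW^1_s-\frac12\int_0^t\left(\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\right)^2ds\right), \tag{20}$$

for every $t\in[0,T]$, is an $(\mathbb{F},P)$-martingale.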

3. The filtering equations

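Theorem 7 (The Kushner-Stratonovich equation). Under Assumption 4 and condition (15), the filter $\pi$ solves the following Kushner-Stratonovich equation: for every $f\in C_b^{1,2}([0,T]\times\mathbb{R})$,

$$\pi_t(f)=f(0,x_0)+\int_0^t\pi_s(\mathcal{L}^Xf)\,ds+\int_0^t\int_{\mathbb{R}} w^\pi_s(f,z)\,m^\pi(ds,dz)+\int_0^t h^\pi_s(f)\,dI_s,\quad t\in[0,T], \tag{21}$$

where

$$w^\pi_t(f,z)=\frac{d\pi_{t^-}(\lambda\varphi f)}{d\pi_{t^-}(\lambda\varphi)}(z)-\pi_{t^-}(f)+\frac{d\pi_{t^-}(\bar{\mathcal{L}}f)}{d\pi_{t^-}(\lambda\varphi)}(z), \tag{22}$$
$$h^\pi_t(f)=\sigma_1^{-1}(t,Y_t)\big[\pi_t(b_1f)-\pi_t(b_1)\pi_t(f)\big]+\rho\,\pi_t\Big(\sigma_0\frac{\partial f}{\partial x}\Big). \tag{23}$$

Here, by $\frac{d\pi_{t^-}(\lambda\varphi f)}{d\pi_{t^-}(\lambda\varphi)}(z)$ and $\frac{d\pi_{t^-}(\bar{\mathcal{L}}f)}{d\pi_{t^-}(\lambda\varphi)}(z)$, we mean the Radon-Nikodym derivatives of the measures $\pi_{t^-}(\lambda f\varphi(dz))$ and $\pi_{t^-}(\bar{\mathcal{L}}f)(dz)$ with respect to $\pi_{t^-}(\lambda\varphi(dz))$. Moreover, the operator $\bar{\mathcal{L}}$, defined by $\bar{\mathcal{L}}_tf(dz):=\bar{\mathcal{L}}f(t,\cdot,Y_{t^-},dz)$ and such that for every $A\in\mathcal{B}(\mathbb{R})$

$$\bar{\mathcal{L}}f(t,x,y,A)=\int_{d^A(t,x,y)}\big[f(t,x+K_0(t,x;\zeta))-f(t,x)\big]\,\nu(d\zeta), \tag{24}$$

takes into account common jump times between the signal $X$ and the observation $Y$.

Finally, the operator $\mathcal{L}^X$ given by

$$\mathcal{L}^Xf(t,x)=\frac{\partial f}{\partial t}+b_0(t,x)\frac{\partial f}{\partial x}+\frac12\sigma_0^2(t,x)\frac{\partial^2f}{\partial x^2}+\int_Z\big\{f(t,x+K_0(t,x;\zeta))-f(t,x)\big\}\,\nu(d\zeta) \tag{25}$$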

denotes the generator of the Markov process X.

Proof. The theorem is proved in Ref. [17, Theorem 3.1].

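Example 8 (Observation dynamics driven by independent point processes with unobservable intensities). In the sequel, we provide an example where the Kushner-Stratonovich equation simplifies and the Radon-Nikodym derivatives appearing in the dynamics of $\pi(f)$ reduce to ratios. Suppose that there exists a finite set of measurable functions $K_1^i(t,y)\neq 0$ for all $(t,y)\in[0,T]\times\mathbb{R}$, for $i\in\{1,\dots,n\}$, such that the dynamics of $Y$ is given by

$$dY_t=b_1(t,X_t,Y_t)\,dt+\sigma_1(t,Y_t)\,dW^1_t+\sum_{i=1}^nK_1^i(t,Y_{t^-})\,dN^i_t,\quad Y_0=y_0\in\mathbb{R}, \tag{26}$$

where the $N^i$ are independent counting processes with $(\mathbb{F},P)$-intensities $\lambda_i(t,X_{t^-},Y_{t^-})$.

For simplicity, in this example, we assume that $X$ and $Y$ have no common jump times. Then, the filtering Eq. (21) reads as

$$
\begin{aligned}
\pi_t(f)={}&f(0,x_0)+\int_0^t\pi_s(\mathcal{L}^Xf)\,ds+\int_0^t\Big\{\sigma_1(s,Y_s)^{-1}\big[\pi_s(b_1f)-\pi_s(b_1)\pi_s(f)\big]+\rho\,\pi_s\Big(\sigma_0\frac{\partial f}{\partial x}\Big)\Big\}\,dI_s\\
&+\sum_{i=1}^n\int_0^t\mathbf{1}_{\{\pi_{s^-}(\lambda_i)>0\}}\frac{\pi_{s^-}(\lambda_if)-\pi_{s^-}(f)\,\pi_{s^-}(\lambda_i)}{\pi_{s^-}(\lambda_i)}\big(dN^i_s-\pi_{s^-}(\lambda_i)\,ds\big),\quad t\in[0,T].
\end{aligned} \tag{27}
$$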

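Note that Eq. (21) has an equivalent expression in terms of the operator $\mathcal{L}_0^X$, given by

$$
\begin{aligned}
\mathcal{L}_0^Xf(t,x,y)&=\mathcal{L}^Xf(t,x)-\bar{\mathcal{L}}f(t,x,y,\mathbb{R})\\
&=\frac{\partial f}{\partial t}(t,x)+b_0(t,x)\frac{\partial f}{\partial x}+\frac12\sigma_0^2(t,x)\frac{\partial^2f}{\partial x^2}+\int_{d^1(t,x,y)^c}\big\{f(t,x+K_0(t,x;\zeta))-f(t,x)\big\}\,\nu(d\zeta),
\end{aligned} \tag{28}
$$

where $d^1(t,x,y)^c=\{\zeta\in Z: K_1(t,x,y;\zeta)=0\}$. Indeed, we get

$$d\pi_t(f)=\big\{\pi_t(\mathcal{L}_0^Xf)+\pi_t(f)\pi_t(\lambda)-\pi_t(\lambda f)\big\}\,dt+h^\pi_t(f)\,dI_t+\int_{\mathbb{R}} w^\pi_t(f,z)\,m(dt,dz). \tag{29}$$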
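The passage from Eq. (21) to Eq. (29) can be sketched as follows. The only step is to integrate the jump integrand $w^\pi$ of Eq. (22) against the compensator appearing in Eq. (17) and to move the result into the drift. The computation below is an outline under the standing assumptions, and it uses the fact that $\pi_{t^-}$ and $\pi_t$ differ for at most countably many $t$, so they can be interchanged inside $dt$-integrals.

```latex
\[
\begin{aligned}
\int_{\mathbb{R}} w^\pi_t(f,z)\,\pi_{t^-}(\lambda\varphi(dz))
 &= \int_{\mathbb{R}}\pi_{t^-}(\lambda f\varphi(dz))
    -\pi_{t^-}(f)\int_{\mathbb{R}}\pi_{t^-}(\lambda\varphi(dz))
    +\int_{\mathbb{R}}\pi_{t^-}(\bar{\mathcal{L}}f)(dz) \\
 &= \pi_{t^-}(\lambda f)-\pi_{t^-}(f)\,\pi_{t^-}(\lambda)
    +\pi_{t^-}\big(\bar{\mathcal{L}}f(t,\cdot,Y_{t^-},\mathbb{R})\big), \\[6pt]
\pi_t(\mathcal{L}^Xf)
 -\Big[\pi_t(\lambda f)-\pi_t(f)\,\pi_t(\lambda)
       +\pi_t\big(\bar{\mathcal{L}}f(t,\cdot,Y_t,\mathbb{R})\big)\Big]
 &= \pi_t(\mathcal{L}_0^Xf)+\pi_t(f)\,\pi_t(\lambda)-\pi_t(\lambda f).
\end{aligned}
\]
```

The last line uses the definition $\mathcal{L}_0^Xf=\mathcal{L}^Xf-\bar{\mathcal{L}}f(\cdot,\mathbb{R})$ from Eq. (28), and what remains of the jump term is the integral of $w^\pi$ against the uncompensated measure $m(dt,dz)$.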

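Moreover, the filter has a natural recursive structure. To show this, define the sequence $\{T_n,Z_n\}_{n\in\mathbb{N}}$ of jump times and jump sizes of $Y$, that is, $Z_n=Y_{T_n}-Y_{T_n^-}$. These are observable data. Then, between two consecutive jump times the filter is governed by a diffusion process, that is, for $t\in(T_n\wedge T,\,T_{n+1}\wedge T)$,

$$\pi_t(f)=\pi_{T_n}(f)+\int_{T_n}^t\big\{\pi_s(\mathcal{L}_0^Xf)+\pi_s(f)\pi_s(\lambda)-\pi_s(\lambda f)\big\}\,ds+\int_{T_n}^t h^\pi_s(f)\,dI_s, \tag{30}$$

and at any jump time $T_n$ occurring before time $T$, it is given by

$$\pi_{T_n}(f)=\frac{d\pi_{T_n^-}(\lambda\varphi f)}{d\pi_{T_n^-}(\lambda\varphi)}(Z_n)+\frac{d\pi_{T_n^-}(\bar{\mathcal{L}}f)}{d\pi_{T_n^-}(\lambda\varphi)}(Z_n), \tag{31}$$

which implies that $\pi_{T_n}(f)$ is completely determined by the observed data $(T_n,Z_n)$ and the knowledge of $\pi_t(f)$ in the time interval $[T_{n-1},T_n)$, since $\pi_{T_n^-}(f)=\lim_{t\uparrow T_n}\pi_t(f)$.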
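The recursion in Eqs. (30) and (31) can be made concrete in a deliberately simplified special case, which is an assumption of this sketch and not the general setting of the chapter: the signal is a time-constant random variable taking finitely many values, $Y$ has unit jumps with intensity $\lambda_i$ in state $i$, the drift $b_1$ does not depend on $X$ (so the $dI$ term vanishes), and there are no common jumps. Taking $f$ as the indicator of state $i$, Eq. (30) reduces to $d\pi_t(i)=\pi_t(i)\,(\pi_t(\lambda)-\lambda_i)\,dt$ and Eq. (31) to $\pi_{T_n}(i)=\lambda_i\pi_{T_n^-}(i)/\pi_{T_n^-}(\lambda)$. The function names are illustrative.

```python
import math


def step_between_jumps(pi, lam, dt):
    """One Euler step of the drift in Eq. (30) for the toy model."""
    pi_lam = sum(p * l for p, l in zip(pi, lam))      # pi_t(lambda)
    new = [p + p * (pi_lam - l) * dt for p, l in zip(pi, lam)]
    total = sum(new)                                  # guard against rounding
    return [p / total for p in new]


def update_at_jump(pi, lam):
    """Jump update of Eq. (31): pi(i) <- lam_i * pi(i) / pi(lambda)."""
    pi_lam = sum(p * l for p, l in zip(pi, lam))
    return [p * l / pi_lam for p, l in zip(pi, lam)]


def run_filter(pi0, lam, jump_times, horizon, dt=1e-3):
    """Alternate the flow between jumps and the update at each jump time.

    jump_times are the observed jump times of Y, assumed to lie in (0, horizon).
    """
    pi, t = list(pi0), 0.0
    for T_n in sorted(jump_times) + [None]:
        end = horizon if T_n is None else T_n
        while t < end - 1e-12:
            h = min(dt, end - t)
            pi = step_between_jumps(pi, lam, h)
            t += h
        if T_n is not None:
            pi = update_at_jump(pi, lam)
    return pi


def closed_form(pi0, lam, n_jumps, horizon):
    """Bayes posterior for this toy model, used only as a cross-check."""
    w = [p * l ** n_jumps * math.exp(-l * horizon) for p, l in zip(pi0, lam)]
    total = sum(w)
    return [v / total for v in w]
```

For instance, `run_filter([0.5, 0.5], [1.0, 3.0], [0.4, 1.1, 1.5], 2.0)` can be compared with `closed_form([0.5, 0.5], [1.0, 3.0], 3, 2.0)`; the two agree up to the discretization error of the Euler step. In the general model no such closed form is available, which is why the filter has to be propagated recursively.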

Note that the Kushner-Stratonovich equation is an infinite-dimensional nonlinear stochastic differential equation. Often, it is possible to characterize the filter in terms of a simpler equation, known as the Zakai equation, which provides the dynamics of the unnormalized version of the filter. Although the Zakai equation is still infinite-dimensional, it has the advantage of being linear.

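The idea for obtaining the dynamics of the unnormalized filter consists of performing an equivalent change of probability measure defined by

$$\left.\frac{dP_0}{dP}\right|_{\mathcal{F}_t}=Z_t,\quad t\in[0,T], \tag{32}$$

for a suitable strictly positive $(\mathbb{F},P)$-martingale $Z$, in such a way that the so-called unnormalized filter $p$ is the $\mathcal{M}(\mathbb{R})$-valued process defined by

$$p_t(f):=\mathbb{E}^0\big[Z_t^{-1}f(t,X_t)\,\big|\,\mathcal{F}^Y_t\big],\quad t\in[0,T]. \tag{33}$$

Remark 9. By the Kallianpur-Striebel formula, we get that

$$\pi_t(f)=\frac{\mathbb{E}^0\big[f(t,X_t)Z_t^{-1}\,\big|\,\mathcal{F}^Y_t\big]}{\mathbb{E}^0\big[Z_t^{-1}\,\big|\,\mathcal{F}^Y_t\big]}=\frac{p_t(f)}{p_t(1)},\quad t\in[0,T], \tag{34}$$

where $p_t(1):=\mathbb{E}^0\big[Z_t^{-1}\,\big|\,\mathcal{F}^Y_t\big]$. This provides the relation between the filter and its unnormalized version.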

In order to compute the Zakai equation, we make the following assumption.

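Assumption 10. Suppose that there exists a transition function $\eta^0(t,y,dz)$ such that the $(\mathbb{F}^Y,P)$-predictable measure $\eta^0(t,Y_{t^-},dz)$ is equivalent to $\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},dz)$ and

$$\mathbb{E}\left[\int_0^T\eta^0(t,Y_{t^-},\mathbb{R})\,dt\right]<\infty. \tag{35}$$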

Remark 11. In Ref. [18], a weaker assumption is considered. That condition allows one to introduce an equivalent probability measure on $(\Omega,\mathcal{F}^Y_T)$ which is not necessarily the restriction to $\mathcal{F}^Y_T$ of an equivalent probability measure on $(\Omega,\mathcal{F}_T)$.

Remark 12. In the context of Example 8, Assumption 10 is satisfied if, for instance, $\lambda_i(t,X_{t^-},Y_{t^-})>0$ $P$-a.s. for every $t\in[0,T]$.

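Assumption 10 equivalently means that there exists an $(\mathbb{F},P)$-predictable process $\Psi(t,X_{t^-},Y_{t^-},z)$ such that

$$\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},dz)\,dt=\big(1+\Psi(t,X_{t^-},Y_{t^-},z)\big)\,\eta^0(t,Y_{t^-},dz)\,dt \tag{36}$$

and $1+\Psi(t,X_{t^-},Y_{t^-},z)>0$ $P$-a.s. for every $t\in[0,T]$, $z\in\mathbb{R}$. Setting

$$U(t,z):=\frac{1}{1+\Psi(t,X_{t^-},Y_{t^-},z)}-1, \tag{37}$$

we also assume that the following integrability condition holds:

$$\mathbb{E}\left[\exp\left\{\frac12\int_0^T\left(\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\right)^2ds+\int_0^T\int_{\mathbb{R}}U^2(s,z)\,\lambda(s,X_{s^-},Y_{s^-})\,\varphi(s,X_{s^-},Y_{s^-},dz)\,ds\right\}\right]<\infty. \tag{38}$$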

The subsequent proposition provides a useful version of the Girsanov theorem that fits our setting.

Proposition 13. Let Assumptions 4 and 10 and condition (38) hold, and define the process

$$Z_t:=\mathcal{E}\left(-\int_0^t\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\,dW^1_s+\int_0^t\int_{\mathbb{R}}U(s,z)\big(m(ds,dz)-\lambda(s,X_{s^-},Y_{s^-})\,\varphi(s,X_{s^-},Y_{s^-},dz)\,ds\big)\right)$$

for every $t\in[0,T]$, where $\mathcal{E}(M)$ denotes the Doléans-Dade exponential of a martingale $M$. Then, $Z$ is a strictly positive $(\mathbb{F},P)$-martingale. Let $P_0$ be the probability measure equivalent to $P$ given by

$$\left.\frac{dP_0}{dP}\right|_{\mathcal{F}_t}=Z_t,\quad t\in[0,T]. \tag{39}$$

Then, the process

$$\widetilde{W}^1_t:=W^1_t+\int_0^t\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\,ds,\quad t\in[0,T], \tag{40}$$

is an $(\mathbb{F},P_0)$-Brownian motion, and the $(\mathbb{F},P_0)$-predictable projection of the integer-valued random measure $m(dt,dz)$ is given by $\eta^0(t,Y_{t^-},dz)\,dt$.

Proof. Ref. [32, Theorem 9] ensures that $Z$ is a martingale under Assumptions 4 and 10 and the integrability condition Eq. (38). Then the proof follows by Ref. [31, Chapter III, Theorem 3.24].

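Note that, by Eq. (16), the process $\widetilde{W}^1$ can also be written as

$$\widetilde{W}^1_t=I_t+\int_0^t\pi_s\Big(\frac{b_1}{\sigma_1}\Big)\,ds,\quad t\in[0,T], \tag{41}$$

which implies that $\widetilde{W}^1$ is also an $(\mathbb{F}^Y,P_0)$-Brownian motion. Moreover, since $\eta^0(t,Y_{t^-},dz)$ is $\mathbb{F}^Y$-predictable, it provides the $(\mathbb{F}^Y,P_0)$-predictable projection of the measure $m(dt,dz)$, and the observation process $Y$ satisfies $dY_t=\sigma_1(t,Y_t)\,d\widetilde{W}^1_t+\int_{\mathbb{R}}z\,m(dt,dz)$. In particular, $\eta^0_t(\mathbb{R}):=\eta^0(t,Y_{t^-},\mathbb{R})$ is the $(\mathbb{F}^Y,P_0)$-intensity of the point process which counts the total number of jumps of $Y$ until time $t$.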

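Theorem 14 (The Zakai equation). Under Assumptions 4 and 10 and condition (38), let $P_0$ be the probability measure defined in Proposition 13. For every $f\in C_b^{1,2}([0,T]\times\mathbb{R})$, the unnormalized filter defined in Eq. (33) satisfies the equation

$$
\begin{aligned}
dp_t(f)={}&\big\{p_t(\mathcal{L}_0^Xf)-p_t(\lambda f)+\eta^0_t(\mathbb{R})\,p_t(f)\big\}\,dt+\Big\{\frac{p_t(b_1f)}{\sigma_1(t,Y_t)}+\rho\,p_t\Big(\sigma_0\frac{\partial f}{\partial x}\Big)\Big\}\,d\widetilde{W}^1_t\\
&+\int_{\mathbb{R}}\Big\{p_{t^-}(f\Psi)(z)+\frac{dp_{t^-}(\bar{\mathcal{L}}f)}{d\eta^0_t}(z)\Big\}\,m(dt,dz).
\end{aligned} \tag{42}
$$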

See Ref. [18, Theorem 3.6] for the proof.
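The linearity of Eq. (42) and the normalization in Eq. (34) can be illustrated on a toy case. The assumptions of this sketch, none of which come from the chapter, are: a time-constant signal with finitely many values, an observation with unit jumps and intensity $\lambda_i$ in state $i$, a drift $b_1$ that does not depend on $X$, no common jumps, and the reference intensity $\eta^0\equiv 1$, so that $1+\Psi=\lambda_i$. With $f$ the indicator of state $i$, Eq. (42) then reduces to $dp_t(i)=(1-\lambda_i)\,p_t(i)\,dt$ between jumps and $p_{T_n}(i)=\lambda_i\,p_{T_n^-}(i)$ at a jump.

```python
import math


def zakai_weights(p0, lam, jump_times, horizon):
    """Unnormalized filter for the toy model with reference intensity 1.

    jump_times are the observed jump times of Y, assumed to lie in (0, horizon).
    """
    p, t = list(p0), 0.0
    for T_n in sorted(jump_times):
        # Linear flow between jumps, dp_i = (1 - lam_i) p_i dt, solved exactly.
        p = [v * math.exp((1.0 - l) * (T_n - t)) for v, l in zip(p, lam)]
        # Linear jump update: p_i <- lam_i * p_i.
        p = [v * l for v, l in zip(p, lam)]
        t = T_n
    return [v * math.exp((1.0 - l) * (horizon - t)) for v, l in zip(p, lam)]


def normalize(p):
    """Kallianpur-Striebel normalization: pi_t(i) = p_t(i) / p_t(1)."""
    total = sum(p)
    return [v / total for v in p]
```

Because every step is linear in `p`, multiplying the initial weights by a constant multiplies the output by the same constant, and `normalize` removes it again. This is the practical appeal of working with the unnormalized filter: the nonlinearity of the Kushner-Stratonovich equation is confined to a single division at the end.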

3.1. Uniqueness of the filtering equations

In this section, we show pathwise uniqueness for the solution of the Kushner-Stratonovich and the Zakai equations. The first result provides the equivalence of uniqueness of the solutions to the filtering Eqs. (21) and (42).

Theorem 15. Let Assumptions 4 and 10 and condition (38) hold.

  1. Assume that strong uniqueness for the solution to the Zakai equation holds, and let $\mu$ be a $\mathcal{P}(\mathbb{R})$-valued process which is a strong solution of the Kushner-Stratonovich equation. Then $\mu_t=\pi_t$ $P$-a.s. for all $t\in[0,T]$.

  2. Conversely, suppose that pathwise uniqueness for the solution of the Kushner-Stratonovich equation holds, and let $\xi$ be an $\mathcal{M}(\mathbb{R})$-valued process which is a strong solution of the Zakai equation. Then $\xi_t=p_t$ $P$-a.s. for all $t\in[0,T]$.

Proof. The proof follows by Ref. [18, Theorems 4.5 and 4.6]. Here, note that Assumption 10 implies that the measures $\mu_{t^-}(\lambda\varphi(dz))$ and $\pi_{t^-}(\lambda\varphi(dz))$ are equivalent.

Finally, strong uniqueness for the solution of both filtering equations is established in the subsequent theorems.

Theorem 16. Let (X, Y) be the partially observed system defined in Eq. (3), and assume in addition to Assumptions 4 and 10 and condition (15) that

$$\sup_{t,x,y}\int_Z\big\{|K_0(t,x;\zeta)|+|K_1(t,x,y;\zeta)|\big\}\,\nu(d\zeta)<\infty. \tag{43}$$

Let $\mu$ be a strong solution of the Kushner-Stratonovich equation. Then $\mu_t=\pi_t$ $P$-a.s. for every $t\in[0,T]$.

Proof. See Ref. [17, Theorem 3.3].

Theorem 17. Let $(X,Y)$ be the partially observed system in Eq. (3). Under Assumptions 4 and 10 and conditions (38) and (43), let $\xi$ be a strong solution to the Zakai equation. Then $\xi_t=p_t$ $P$-a.s. for every $t\in[0,T]$.

Proof. The proof follows by Ref. [18, Theorem 4.7], after noticing that under Assumption 10 the measures $\xi_{t^-}(\lambda\varphi(dz))$ and $p_{t^-}(\lambda\varphi(dz))$ are equivalent.

4. A financial application to risk minimization

In the current section, we focus on a financial application. We consider a simple financial market where agents may invest in a risky asset whose price is described by the process $Y$ given in Eq. (3) and in a riskless asset with price process $B$. Without loss of generality, we assume that $B_t=1$ for every $t\in[0,T]$. We also assume throughout the section the following dynamics for the process $Y$:

$$dY_t=Y_{t^-}\left(\sigma(t,Y_t)\,dW^1_t+\int_ZK(t,X_{t^-},Y_{t^-};\zeta)\big(N(dt,d\zeta)-\nu(d\zeta)\,dt\big)\right),\quad Y_0=y_0\in\mathbb{R}^+, \tag{44}$$

for some functions $\sigma(t,y)$ and $K(t,x,y;\zeta)$ such that $\sigma(t,y)>0$ and $K(t,x,y;\zeta)>-1$.

This choice for the dynamics of $Y$ has a double advantage. On the one hand, the geometric form, together with the condition $K(t,x,y;\zeta)>-1$, guarantees nonnegativity, which is desirable when dealing with prices. On the other hand, we model $Y$ directly under a martingale measure, and by Assumption 18 it turns out to be a square integrable $(\mathbb{F},P)$-martingale.

Considering Eq. (44) corresponds to taking, in system (3),

$$b_1(t,x,y)=-y\int_ZK(t,x,y;\zeta)\,\nu(d\zeta),\qquad\sigma_1(t,y)=y\,\sigma(t,y),\qquad K_1(t,x,y;\zeta)=y\,K(t,x,y;\zeta). \tag{45}$$
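The identification in Eq. (45) can be checked by expanding Eq. (44) and matching it term by term with the second equation of system (3). The lines below are a sketch of that matching; in the $dt$ and $dW^1$ terms the left limits can be replaced by the current values, since the two differ only at countably many times.

```latex
\[
\begin{aligned}
dY_t &= \underbrace{-\,Y_{t}\int_Z K(t,X_{t},Y_{t};\zeta)\,\nu(d\zeta)}_{b_1(t,X_t,Y_t)}\,dt
      + \underbrace{Y_{t}\,\sigma(t,Y_t)}_{\sigma_1(t,Y_t)}\,dW^1_t
      + \int_Z \underbrace{Y_{t^-}K(t,X_{t^-},Y_{t^-};\zeta)}_{K_1(t,X_{t^-},Y_{t^-};\zeta)}\,N(dt,d\zeta).
\end{aligned}
\]
```

In other words, the drift $b_1$ is exactly minus the compensator of the jump part, which is why no drift remains once the jump integral is written in compensated form and why $Y$ is a local martingale under $P$.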

In addition, we make the following assumption.

Assumption 18.

$$0<c_1<\sigma(t,y)<c_2,\qquad|K(t,x,y;\zeta)|<c_3,\qquad\nu(D_t)<c_4, \tag{46}$$

for every $(t,x,y)\in[0,T]\times\mathbb{R}\times\mathbb{R}^+$, $\zeta\in Z$ and for some positive constants $c_1,c_2,c_3,c_4$.

Remark 19. In the sequel, it might be useful to specify the dynamics of $Y$ also in terms of the jump measure $m(dt,dz)$. Recalling Eqs. (6) and (14), we have

$$dY_t=Y_t\,\sigma(t,Y_t)\,dW^1_t+\int_{\mathbb{R}}z\,\big(m(dt,dz)-\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},dz)\,dt\big). \tag{47}$$

The stochastic factor X which affects intensity and jump size distribution of Y may represent the state of the economy and is not directly observable by market agents. This is a typical situation arising in real financial markets.

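The filtration $\mathbb{F}^Y$ models the information available to investors. Since $Y$ is $\mathbb{F}^Y$-adapted, it is in particular an $(\mathbb{F}^Y,P)$-martingale with the following decomposition:

$$Y_t=y_0+\int_0^tY_s\,\sigma(s,Y_s)\,dI_s+\int_0^t\int_{\mathbb{R}}z\,\big(m(ds,dz)-\pi_{s^-}(\lambda\varphi(dz))\,ds\big),\quad t\in[0,T]. \tag{48}$$

By Eqs. (14) and (45), in this setting the first component of the innovation process $I$ defined in Eq. (16) is given by $I_t=W^1_t-\int_0^t\frac{1}{Y_s\sigma(s,Y_s)}\int_{\mathbb{R}}z\,\big(\lambda(s,X_s,Y_s)\,\varphi(s,X_s,Y_s,dz)-\pi_s(\lambda\varphi(dz))\big)\,ds$.

Suppose that we are given a European-type contingent claim whose final payoff is a square integrable $\mathcal{F}^Y_T$-measurable random variable $\xi$, that is, $\xi\in L^2(\mathcal{F}^Y_T)$, where

$$L^2(\mathcal{F}^Y_T):=\big\{\text{random variables }\Gamma\in\mathcal{F}^Y_T:\ \mathbb{E}[\Gamma^2]<\infty\big\}. \tag{49}$$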

The objective of the agent is to find the optimal hedging strategy for this derivative. Since the number of random sources exceeds the number of tradeable risky assets, the market is incomplete. It is well known that in this setting, perfect replication by self-financing strategies is not feasible. Hence, we suppose that the investor intends to pursue the risk-minimization approach. Risk minimization is a quadratic hedging method that allows determining a dynamic investment strategy that replicates the claim perfectly with minimal cost. Let us properly introduce the objects of interest. We start with the following notation. For any pair of $\mathbb{F}$-adapted (respectively, $\mathbb{F}^Y$-adapted) processes $\Psi_1,\Psi_2$, we write $\langle\Psi_1,\Psi_2\rangle^{\mathbb{F}}$ for the predictable covariation computed with respect to the filtration $\mathbb{F}$ (respectively, $\langle\Psi_1,\Psi_2\rangle^{\mathbb{F}^Y}$ for the predictable covariation computed with respect to the filtration $\mathbb{F}^Y$). Note that

$$
\begin{aligned}
\langle Y\rangle^{\mathbb{F}}_t&=\int_0^tY_s^2\left(\sigma^2(s,Y_s)+\int_ZK^2(s,X_s,Y_s;\zeta)\,\nu(d\zeta)\right)ds\\
&=\int_0^t\left(Y_s^2\sigma^2(s,Y_s)+\int_{\mathbb{R}}z^2\,\lambda(s,X_s,Y_s)\,\varphi(s,X_s,Y_s,dz)\right)ds,\quad t\in[0,T],
\end{aligned} \tag{50}
$$

and since $Y$ is also $\mathbb{F}^Y$-adapted, we also have

$$\langle Y\rangle^{\mathbb{F}^Y}_t=\int_0^t\left(Y_s^2\sigma^2(s,Y_s)+\int_{\mathbb{R}}z^2\,\pi_{s^-}(\lambda\varphi(dz))\right)ds,\quad t\in[0,T]. \tag{51}$$

We stress that, due to the presence of a jump component, the predictable quadratic variations of $Y$ with respect to the filtrations $\mathbb{F}$ and $\mathbb{F}^Y$ are different.

Now we introduce a technical definition of two spaces, $\Theta(\mathbb{F})$ and $\Theta(\mathbb{F}^Y)$.

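Definition 20. The space $\Theta(\mathbb{F}^Y)$ (respectively, $\Theta(\mathbb{F})$) is the space of all $\mathbb{F}^Y$-predictable (respectively, $\mathbb{F}$-predictable) processes $\theta$ such that

$$\mathbb{E}\left[\int_0^T\theta_u^2\,d\langle Y\rangle^{\mathbb{F}^Y}_u\right]<\infty\quad\left(\text{respectively, }\mathbb{E}\left[\int_0^T\theta_u^2\,d\langle Y\rangle^{\mathbb{F}}_u\right]<\infty\right). \tag{52}$$

We observe that for every $\theta\in\Theta(\mathbb{F}^Y)$, thanks to $\mathbb{F}^Y$-predictability, we have

$$\mathbb{E}\left[\int_0^T\theta_u^2\,d\langle Y\rangle^{\mathbb{F}}_u\right]=\mathbb{E}\left[\int_0^T\theta_u^2\,d\langle Y\rangle^{\mathbb{F}^Y}_u\right]<\infty, \tag{53}$$

which implies that $\Theta(\mathbb{F}^Y)\subseteq\Theta(\mathbb{F})$.

Since we have two different levels of information, represented by the filtrations $\mathbb{F}$ and $\mathbb{F}^Y$, we may define two classes of admissible strategies.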

Definition 21. An $\mathbb{F}^Y$-strategy (respectively, $\mathbb{F}$-strategy) is a pair $\psi=(\theta,\eta)$ of stochastic processes, where $\theta$ represents the amount invested in the risky asset and $\eta$ is the amount invested in the riskless asset, such that $\theta\in\Theta(\mathbb{F}^Y)$ (respectively, $\theta\in\Theta(\mathbb{F})$) and $\eta$ is $\mathbb{F}^Y$-adapted (respectively, $\mathbb{F}$-adapted).

This definition reflects the fact that an investor's choices should be adapted to her/his knowledge of the market. The value of a strategy $\psi=(\theta,\eta)$ is given by

$$V_t(\psi)=\theta_tY_t+\eta_t,\quad t\in[0,T], \tag{54}$$

and its cost is described by the process

$$C_t(\psi)=V_t(\psi)-\int_0^t\theta_u\,dY_u,\quad t\in[0,T]. \tag{55}$$

In other words, the cost of a strategy is the difference between the value process and the gain process. For a self-financing strategy, the value and the gain processes coincide, up to the initial wealth $V_0$, and therefore the cost is constant and equal to $C_t=V_0$, for every $t\in[0,T]$. We continue by defining the risk process, in the partial information setting.
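The value and cost processes of Eqs. (54) and (55) have a simple discrete-time analogue, sketched below. The time grid, the function names, and the numerical values in the usage note are assumptions made for illustration; predictability is mimicked by applying the position `theta[k]` to the price increment `Y[k] - Y[k-1]`.

```python
def value_and_cost(theta, eta, Y):
    """Discrete-time analogue of Eqs. (54)-(55); returns the lists (V, C)."""
    V = [th * y + e for th, e, y in zip(theta, eta, Y)]  # V_k = theta_k Y_k + eta_k
    C, gains = [], 0.0
    for k in range(len(Y)):
        if k > 0:
            gains += theta[k] * (Y[k] - Y[k - 1])        # trading gains up to k
        C.append(V[k] - gains)                           # C_k = V_k - gains_k
    return V, C


def self_financing_eta(theta, Y, v0):
    """Riskless holdings that make the strategy self-financing from wealth v0."""
    eta, gains = [], 0.0
    for k in range(len(Y)):
        if k > 0:
            gains += theta[k] * (Y[k] - Y[k - 1])
        eta.append(v0 + gains - theta[k] * Y[k])         # forces V_k = v0 + gains_k
    return eta
```

For example, with `Y = [100.0, 102.0, 99.0, 101.0]`, `theta = [0.5, 0.5, 0.8, 0.3]`, and `eta = self_financing_eta(theta, Y, 10.0)`, the cost returned by `value_and_cost` stays at the initial wealth at every date. Any other choice of `eta` produces a non-constant cost, and it is the fluctuation of this cost that the risk-minimization criterion penalizes.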

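Definition 22. Given an $\mathbb{F}^Y$-strategy (respectively, an $\mathbb{F}$-strategy) $\psi=(\theta,\eta)$, we denote by $R^{\mathbb{F}^Y}(\psi)$ (respectively, $R^{\mathbb{F}}(\psi)$) the associated risk process defined as

$$R^{\mathbb{F}^Y}_t(\psi):=\mathbb{E}\big[(C_T(\psi)-C_t(\psi))^2\,\big|\,\mathcal{F}^Y_t\big]\quad\Big(\text{respectively, }R^{\mathbb{F}}_t(\psi):=\mathbb{E}\big[(C_T(\psi)-C_t(\psi))^2\,\big|\,\mathcal{F}_t\big]\Big), \tag{56}$$

for every $t\in[0,T]$.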

Then, we have the following definition of risk-minimizing strategy under partial information.

Definition 23. An $\mathbb{F}^Y$-strategy $\psi$ is risk minimizing if

  1. $V_T(\psi)=\xi$,

  2. for any other $\mathbb{F}^Y$-strategy $\tilde\psi$ we have $R^{\mathbb{F}^Y}_t(\psi)\le R^{\mathbb{F}^Y}_t(\tilde\psi)$, for every $t\in[0,T]$.

The corresponding definitions of risk process and risk-minimizing strategy under full information can be obtained by replacing $\mathbb{F}^Y$ and $R^{\mathbb{F}^Y}_t$ with $\mathbb{F}$ and $R^{\mathbb{F}}_t$ in Definition 23. To distinguish between them when necessary, we use the terms $\mathbb{F}^Y$-risk-minimizing strategy and $\mathbb{F}$-risk-minimizing strategy. Criterion 2 in Definition 23 can also be written as

$$\min_{\psi\in\Theta(\mathbb{F}^Y)}\mathbb{E}\big[(C_T(\psi)-C_t(\psi))^2\big],\quad t\in[0,T], \tag{57}$$

which intuitively means that a strategy is risk minimizing if it minimizes the variance of the cost. This equivalent definition makes it possible to obtain a useful property of risk-minimizing strategies, which turn out to be self-financing on average, that is, the cost process $C$ is a martingale and therefore has constant expectation (see, e.g., Ref. [27, Lemma 2] or [28, Lemma 2.3]).

In the sequel, we aim to characterize the optimal hedging strategy for the contingent claim $\xi$ under full and partial information, that is, the $\mathbb{F}$- and the $\mathbb{F}^Y$-risk-minimizing strategies. To this end, we introduce two orthogonal decompositions known as the Galtchouk-Kunita-Watanabe decompositions under full and partial information (see, e.g., [30]). To better understand the relevance of these decompositions, we assume for a moment completeness of the market and full information. Then, it is well known that for every European-type contingent claim with final payoff $\xi$, there exists a self-financing strategy $\psi=(\theta,\eta)$ such that

$$\xi=V_0+\int_0^T\theta_u\,dY_u,\quad P\text{-a.s.}, \tag{58}$$

that is, a replicating portfolio is uniquely determined by the initial wealth and the investment in the risky asset. When the market is incomplete, decomposition Eq. (58) does not hold in general. Intuitively, this implies that we might expect additional terms in Eq. (58), and according to the risk-minimization criterion, these additional terms need to be such that the final cost does not deviate too much from the average cost, in the quadratic sense. Specifically, we have the following decomposition of the random variable $\xi$:

$$\xi=V_0+\int_0^T\theta_u\,dY_u+G_T,\quad P\text{-a.s.}, \tag{59}$$

where $G_T$ is the value at time $T$ of a suitable process $G$. The minimality criterion requires that $G$ is a martingale orthogonal to $Y$. We refer the reader to Ref. [28] for a detailed survey. Under suitable hypotheses, the above decomposition takes the name of Galtchouk-Kunita-Watanabe decomposition.

Now we wish to be more formal, and we introduce the following definitions:

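Consider a random variable $\xi\in L^2(\mathcal{F}_T^Y)$. Since $\mathcal{F}_T^Y\subseteq\mathcal{F}_T$, we can define the following decompositions for $\xi$.

Definition 24. a. The Galtchouk-Kunita-Watanabe decomposition of $\xi\in L^2(\mathcal{F}_T^Y)$ with respect to $Y$ and $\mathbb{F}$ is given by

$$\xi = U_0^{\mathbb{F}} + \int_0^T \theta_u^{\mathbb{F}}\,dY_u + G_T^{\mathbb{F}}\quad P\text{-a.s.}, \tag{60}$$

where $U_0^{\mathbb{F}}\in L^2(\mathcal{F}_0)$, $\theta^{\mathbb{F}}\in\Theta(\mathbb{F})$ and $G^{\mathbb{F}}$ is a square integrable $(\mathbb{F},P)$-martingale, with $G_0^{\mathbb{F}}=0$, orthogonal to $Y$, that is, $\langle G^{\mathbb{F}},Y\rangle_t^{\mathbb{F}}=0$ for every $t\in[0,T]$.

b. The Galtchouk-Kunita-Watanabe decomposition of $\xi\in L^2(\mathcal{F}_T^Y)$ with respect to $Y$ and $\mathbb{F}^Y$ is given by

$$\xi = U_0^{\mathbb{F}^Y} + \int_0^T \theta_u^{\mathbb{F}^Y}\,dY_u + G_T^{\mathbb{F}^Y}\quad P\text{-a.s.}, \tag{61}$$

where $U_0^{\mathbb{F}^Y}\in L^2(\mathcal{F}_0^Y)$, $\theta^{\mathbb{F}^Y}\in\Theta(\mathbb{F}^Y)$ and $G^{\mathbb{F}^Y}$ is a square integrable $(\mathbb{F}^Y,P)$-martingale, with $G_0^{\mathbb{F}^Y}=0$, strongly orthogonal to $Y$, that is, $\langle G^{\mathbb{F}^Y},Y\rangle_t^{\mathbb{F}^Y}=0$ for every $t\in[0,T]$.

In the sequel, we refer to Eqs. (60) and (61) as the Galtchouk-Kunita-Watanabe decompositions under full information and under partial information, respectively. Since $Y$ is a square integrable martingale with respect to both filtrations $\mathbb{F}$ and $\mathbb{F}^Y$, decompositions Eqs. (60) and (61) exist.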

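The next proposition provides a relation between the integrands $\theta^{\mathbb{F}}$ and $\theta^{\mathbb{F}^Y}$ of decompositions Eqs. (60) and (61) in terms of predictable projections. For any $(\mathbb{F},P)$-predictable process $A$ of finite variation, we denote by $A^{p,\mathbb{F}^Y}$ its $(\mathbb{F}^Y,P)$-dual-predictable projection (see the Notes at the end of this section).

Proposition 25. The integrands in decompositions Eqs. (60) and (61) satisfy the following relation:

$$\theta_t^{\mathbb{F}^Y} = \frac{d\left(\int_0^t \theta_u^{\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right)^{p,\mathbb{F}^Y}}{d\langle Y\rangle_t^{p,\mathbb{F}^Y}},\quad t\in[0,T]. \tag{62}$$

Here, $\langle Y\rangle^{p,\mathbb{F}^Y}$ denotes the $(\mathbb{F}^Y,P)$-dual-predictable projection of $\langle Y\rangle^{\mathbb{F}}$ and it is given by

$$\langle Y\rangle_t^{p,\mathbb{F}^Y} = \langle Y\rangle_t^{\mathbb{F}^Y} = \int_0^t Y_s^2\,\sigma^2(s,Y_s)\,ds + \int_0^t\int_{\mathbb{R}} z^2\,\pi_s(\lambda\varphi(dz))\,ds,\quad t\in[0,T]. \tag{63}$$

Proof. First note that the $(\mathbb{F}^Y,P)$-dual-predictable projection of the process $\langle Y\rangle^{\mathbb{F}}$ coincides with the predictable quadratic variation of the process $Y$ itself, computed with respect to its internal filtration and given in Eq. (51), since for any bounded $(\mathbb{F}^Y,P)$-predictable process $\varphi$, we have that $\mathbb{E}\left[\int_0^T\varphi_t\,d\langle Y\rangle_t^{\mathbb{F}}\right]=\mathbb{E}\left[\int_0^T\varphi_t\,d\langle Y\rangle_t^{\mathbb{F}^Y}\right]$. This proves Eq. (63).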

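Let

$$\theta_t := \frac{d\left(\int_0^t \theta_u^{\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right)^{p,\mathbb{F}^Y}}{d\langle Y\rangle_t^{p,\mathbb{F}^Y}},\quad t\in[0,T]. \tag{64}$$

By the Galtchouk-Kunita-Watanabe decomposition Eq. (60), we can write

$$\xi = U_0^{\mathbb{F}} + \int_0^T \theta_u\,dY_u + G_T^{\mathbb{F}} + \tilde{G}_T\quad P\text{-a.s.}, \tag{65}$$

where $\tilde{G}_t := \int_0^t\left(\theta_u^{\mathbb{F}}-\theta_u\right)dY_u$, for every $t\in[0,T]$. We observe that for every $\mathbb{F}^Y$-predictable process $\varphi$ the following holds:

$$\mathbb{E}\left[\int_0^T\varphi_u\theta_u\,d\langle Y\rangle_u^{\mathbb{F}}\right] = \mathbb{E}\left[\int_0^T\varphi_u\theta_u\,d\langle Y\rangle_u^{\mathbb{F}^Y}\right] = \mathbb{E}\left[\int_0^T\varphi_u\left(\theta_u^{\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right)^{p,\mathbb{F}^Y}\right] = \mathbb{E}\left[\int_0^T\varphi_u\theta_u^{\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right]. \tag{66}$$

By choosing $\varphi=\theta$ and applying the Cauchy-Schwarz inequality, we obtain

$$\mathbb{E}\left[\int_0^T(\theta_u)^2\,d\langle Y\rangle_u^{\mathbb{F}^Y}\right] \le \mathbb{E}\left[\int_0^T\left(\theta_u^{\mathbb{F}}\right)^2 d\langle Y\rangle_u^{\mathbb{F}}\right] < \infty. \tag{67}$$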
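The passage from Eq. (66) to Eq. (67) can be spelled out as follows. This is only a sketch of the intermediate step: it assumes that the choice $\varphi=\theta$ is justified (for instance by a localization argument, which is not detailed here) and that the left-hand side is strictly positive, the case where it vanishes being trivial.

```latex
% Step 1: Eq. (66) with \varphi = \theta (first and last terms).
\mathbb{E}\left[\int_0^T \theta_u^2 \, d\langle Y\rangle_u^{\mathbb{F}^Y}\right]
  = \mathbb{E}\left[\int_0^T \theta_u\,\theta_u^{\mathbb{F}} \, d\langle Y\rangle_u^{\mathbb{F}}\right]
% Step 2: Cauchy-Schwarz with respect to the measure dP \otimes d\langle Y\rangle^{\mathbb{F}}.
  \le \left(\mathbb{E}\left[\int_0^T \theta_u^2 \, d\langle Y\rangle_u^{\mathbb{F}}\right]\right)^{1/2}
      \left(\mathbb{E}\left[\int_0^T \left(\theta_u^{\mathbb{F}}\right)^2 d\langle Y\rangle_u^{\mathbb{F}}\right]\right)^{1/2}
% Step 3: \theta^2 is \mathbb{F}^Y-predictable, so by the definition of the dual-predictable projection
% the first factor can be written with d\langle Y\rangle^{\mathbb{F}^Y} in place of d\langle Y\rangle^{\mathbb{F}}.
  = \left(\mathbb{E}\left[\int_0^T \theta_u^2 \, d\langle Y\rangle_u^{\mathbb{F}^Y}\right]\right)^{1/2}
      \left(\mathbb{E}\left[\int_0^T \left(\theta_u^{\mathbb{F}}\right)^2 d\langle Y\rangle_u^{\mathbb{F}}\right]\right)^{1/2}.
```

Dividing both sides by the first factor and squaring gives Eq. (67); finiteness of the right-hand side follows from $\theta^{\mathbb{F}}\in\Theta(\mathbb{F})$.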

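This implies that $\theta\in\Theta(\mathbb{F}^Y)\subseteq\Theta(\mathbb{F})$ and that $\tilde{G}$ is an $(\mathbb{F},P)$-martingale. Taking the conditional expectation with respect to $\mathcal{F}_T^Y$ in Eq. (65) leads to

$$\xi = \mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_T^Y\right] + \int_0^T\theta_u\,dY_u + \mathbb{E}\left[G_T^{\mathbb{F}}+\tilde{G}_T\,\middle|\,\mathcal{F}_T^Y\right] = \mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_0^Y\right] + \int_0^T\theta_u\,dY_u + \hat{G}_T^{\mathbb{F}^Y}\quad P\text{-a.s.}, \tag{68}$$

where

$$\hat{G}_t^{\mathbb{F}^Y} := \mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_t^Y\right] - \mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_0^Y\right] + \mathbb{E}\left[G_T^{\mathbb{F}}\,\middle|\,\mathcal{F}_t^Y\right] + \mathbb{E}\left[\tilde{G}_T\,\middle|\,\mathcal{F}_t^Y\right],\quad t\in[0,T], \tag{69}$$

which provides the Galtchouk-Kunita-Watanabe decomposition Eq. (61) if we can show that the $(\mathbb{F}^Y,P)$-martingale $\hat{G}^{\mathbb{F}^Y}$ is strongly orthogonal to $Y$, that is, if for any bounded $(\mathbb{F}^Y,P)$-predictable process $\varphi$ the following holds:

$$\mathbb{E}\left[\hat{G}_T^{\mathbb{F}^Y}\int_0^T\varphi_u\,dY_u\right] = 0. \tag{70}$$

Note that orthogonality of the term $\mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_t^Y\right] - \mathbb{E}\left[U_0^{\mathbb{F}}\,\middle|\,\mathcal{F}_0^Y\right] + \mathbb{E}\left[G_T^{\mathbb{F}}\,\middle|\,\mathcal{F}_t^Y\right]$ follows from the orthogonality of $G^{\mathbb{F}}$ and $Y$. Moreover, we have

$$\mathbb{E}\left[\mathbb{E}\left[\tilde{G}_T\,\middle|\,\mathcal{F}_T^Y\right]\int_0^T\varphi_u\,dY_u\right] = \mathbb{E}\left[\tilde{G}_T\int_0^T\varphi_u\,dY_u\right] = \mathbb{E}\left[\int_0^T\varphi_u\left(\theta_u^{\mathbb{F}}-\theta_u\right)d\langle Y\rangle_u^{\mathbb{F}}\right], \tag{71}$$

and by Eq. (64)

$$\mathbb{E}\left[\int_0^T\varphi_u\theta_u\,d\langle Y\rangle_u^{\mathbb{F}}\right] = \mathbb{E}\left[\int_0^T\varphi_u\theta_u\,d\langle Y\rangle_u^{\mathbb{F}^Y}\right] = \mathbb{E}\left[\int_0^T\varphi_u\,d\left(\int_0^u\theta_r^{\mathbb{F}}\,d\langle Y\rangle_r^{\mathbb{F}}\right)^{p,\mathbb{F}^Y}\right] = \mathbb{E}\left[\int_0^T\varphi_u\theta_u^{\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right], \tag{72}$$

so that the right-hand side of Eq. (71) vanishes, which proves strong orthogonality.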
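Relation Eq. (62) can be checked by hand in a discrete one-period analogue, sketched below. Everything in the example is a hypothetical illustration rather than the chapter's model: a hidden state `x` is known to the informed investor at time 0, the conditional second moment of the price move plays the role of $d\langle Y\rangle^{\mathbb{F}}$, and averaging over `x` plays the role of the dual-predictable projection. The sketch compares the partial-information integrand computed directly with the one obtained by projecting the full-information integrand as in Eq. (62).

```python
# Hypothetical one-period model: hidden state x, then a price move dY given x.
# Each entry is x: (probability of x, {dY: conditional probability}).
# E[dY | x] = 0 for both states, so Y is a martingale in both filtrations.
model = {
    0: (0.5, {-10.0: 0.25, 0.0: 0.5, 10.0: 0.25}),
    1: (0.5, {-30.0: 0.25, 0.0: 0.5, 30.0: 0.25}),
}
y0, strike = 100.0, 110.0


def payoff(dy):
    """Call payoff, a function of the observable terminal price only."""
    return max(y0 + dy - strike, 0.0)


def full_info(model):
    """Per state x: (conditional variance v(x), full-information integrand)."""
    out = {}
    for x, (_, law) in model.items():
        v = sum(p * dy * dy for dy, p in law.items())
        cov = sum(p * payoff(dy) * dy for dy, p in law.items())
        out[x] = (v, cov / v)
    return out


def theta_partial_direct(model):
    """Regression of the payoff on dY without knowledge of x."""
    num = sum(q * p * payoff(dy) * dy
              for q, law in model.values() for dy, p in law.items())
    den = sum(q * p * dy * dy
              for q, law in model.values() for dy, p in law.items())
    return num / den


def theta_partial_projected(model):
    """Discrete analogue of Eq. (62): E[theta_F * v] / E[v]."""
    info = full_info(model)
    num = sum(q * info[x][1] * info[x][0] for x, (q, _) in model.items())
    den = sum(q * info[x][0] for x, (q, _) in model.items())
    return num / den


direct = theta_partial_direct(model)
projected = theta_partial_projected(model)
naive_average = sum(q * full_info(model)[x][1] for x, (q, _) in model.items())
```

In this toy example `direct` and `projected` coincide, while `naive_average` (the plain mean of the full-information integrand) differs from them: the projection in Eq. (62) weights the full-information strategy by the quadratic variation, not uniformly.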

Theorem 26 shows the relation between the Galtchouk-Kunita-Watanabe decompositions and the optimal strategies under full and partial information.

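Theorem 26. i. Every contingent claim $\xi\in L^2(\mathcal{F}_T^Y,P)$ admits a unique $\mathbb{F}$-risk-minimizing strategy $\psi^{*,\mathbb{F}}=(\theta^{*,\mathbb{F}},\eta^{*,\mathbb{F}})$, explicitly given by

$$\theta^{*,\mathbb{F}}=\theta^{\mathbb{F}},\qquad \eta^{*,\mathbb{F}}=V(\psi^{*,\mathbb{F}})-\theta^{*,\mathbb{F}}\,Y, \tag{73}$$

where $V_t(\psi^{*,\mathbb{F}})=\mathbb{E}[\xi\,|\,\mathcal{F}_t]$ for every $t\in[0,T]$, with minimal cost

$$C_t(\psi^{*,\mathbb{F}})=U_0^{\mathbb{F}}+G_t^{\mathbb{F}},\quad t\in[0,T]. \tag{74}$$

Here, $\theta^{\mathbb{F}}$, $U_0^{\mathbb{F}}$, and $G^{\mathbb{F}}$ are given in Definition 24 part a.

ii. Moreover, it also admits a unique $\mathbb{F}^Y$-risk-minimizing strategy $\psi^{*,\mathbb{F}^Y}=(\theta^{*,\mathbb{F}^Y},\eta^{*,\mathbb{F}^Y})$, explicitly given by

$$\theta^{*,\mathbb{F}^Y}=\theta^{\mathbb{F}^Y},\qquad \eta^{*,\mathbb{F}^Y}=V(\psi^{*,\mathbb{F}^Y})-\theta^{*,\mathbb{F}^Y}\,Y, \tag{75}$$

where $V_t(\psi^{*,\mathbb{F}^Y})=\mathbb{E}[\xi\,|\,\mathcal{F}_t^Y]$ for every $t\in[0,T]$, with minimal cost

$$C_t(\psi^{*,\mathbb{F}^Y})=U_0^{\mathbb{F}^Y}+G_t^{\mathbb{F}^Y},\quad t\in[0,T], \tag{76}$$

and $\theta^{\mathbb{F}^Y}$, $U_0^{\mathbb{F}^Y}$ and $G^{\mathbb{F}^Y}$ are given in Definition 24 part b.

Proof. The proof of part i. is given, for example, in Ref. [28, Theorem 2.4]. For part ii., note that, using the martingale representation of $Y$ with respect to its internal filtration given in Eq. (48) and the fact that $\xi\in L^2(\mathcal{F}_T^Y)$, the partial information case can be reduced to the full information one, so that [28, Theorem 2.4] applies again. □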

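Proposition 25 helps us in the computation of the optimal strategy under partial information. Indeed, it is sufficient to compute the corresponding strategy $\theta^{*,\mathbb{F}}$ under full information and the Radon-Nikodym derivative given in Eq. (62). To get more explicit representations, we assume that the payoff of the contingent claim has the form $\xi=H(T,Y_T)$, for some function $H:[0,T]\times\mathbb{R}^+\to\mathbb{R}$. Let $\mathcal{L}^{X,Y}$ denote the Markov generator of the pair $(X,Y)$, that is

$$\mathcal{L}^{X,Y}f(t,x,y) = \frac{\partial f}{\partial t} + b_0(t,x)\frac{\partial f}{\partial x} + b_1(t,x,y)\frac{\partial f}{\partial y} + \frac{1}{2}\sigma_0^2(t,x)\frac{\partial^2 f}{\partial x^2} + \rho\,y\,\sigma_0(t,x)\,\sigma(t,y)\frac{\partial^2 f}{\partial x\,\partial y} + \frac{1}{2}y^2\sigma^2(t,y)\frac{\partial^2 f}{\partial y^2} + \int_Z \Delta f(t,x,y;\zeta)\,\nu(d\zeta) \tag{77}$$

for every $f\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$, where

$$\Delta f(t,x,y;\zeta) := f\big(t,\,x+K_0(t,x;\zeta),\,y(1+K(t,x,y;\zeta))\big) - f(t,x,y). \tag{78}$$

By the Markov property, we have that for any $t\in[0,T]$ there exists a measurable function $h(t,x,y)$ such that

$$h(t,X_t,Y_t) = \mathbb{E}\left[H(T,Y_T)\,\middle|\,\mathcal{F}_t\right]. \tag{79}$$

If the function $h$ is sufficiently regular, for instance $h\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$, we can apply Itô's formula and get that

$$h(t,X_t,Y_t) = h(0,X_0,Y_0) + \int_0^t \mathcal{L}^{X,Y}h(s,X_s,Y_s)\,ds + M_t^h, \tag{80}$$

where $M^h$ is the $(\mathbb{F},P)$-martingale given by

$$M_t^h = \int_0^t \frac{\partial h}{\partial x}(s,X_s,Y_s)\,\sigma_0(s,X_s)\,dW_s^0 + \int_0^t \frac{\partial h}{\partial y}(s,X_s,Y_s)\,Y_s\,\sigma(s,Y_s)\,dW_s^1 + \int_0^t\int_Z \Delta h(s,X_s,Y_s;\zeta)\,\big(N(ds,d\zeta)-\nu(d\zeta)\,ds\big). \tag{81}$$

By Eq. (79), the process $\{h(t,X_t,Y_t),\ t\in[0,T]\}$ is an $(\mathbb{F},P)$-martingale. Then, the finite variation term vanishes, which means that the function $h$ satisfies $\mathcal{L}^{X,Y}h(t,X_t,Y_t)=0$, $P$-a.s. and for almost every $t\in[0,T]$. The next proposition provides the first components of the risk-minimizing strategies under full and partial information.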

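Proposition 27. Assume $h\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$. Then the first components $\theta^{*,\mathbb{F}}$ and $\theta^{*,\mathbb{F}^Y}$ of the risk-minimizing strategies under full and partial information are given by

$$\theta_t^{*,\mathbb{F}} = \frac{g(t,X_t,Y_t)}{Y_t^2\,\sigma^2(t,Y_t) + \int_{\mathbb{R}} z^2\,\lambda(t,X_t,Y_t)\,\varphi(t,X_t,Y_t,dz)},\quad t\in[0,T], \tag{82}$$

$$\theta_t^{*,\mathbb{F}^Y} = \frac{\pi_t(g)}{Y_t^2\,\sigma^2(t,Y_t) + \int_{\mathbb{R}} z^2\,\pi_t(\lambda\varphi(dz))},\quad t\in[0,T], \tag{83}$$

respectively, where the function $g(t,x,y)$ is

$$g(t,x,y) = \rho\,\sigma_0(t,x)\,y\,\sigma(t,y)\frac{\partial h}{\partial x} + y^2\sigma^2(t,y)\frac{\partial h}{\partial y} + \int_Z y\,K(t,x,y;\zeta)\,\Delta h(t,x,y;\zeta)\,\nu(d\zeta). \tag{84}$$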
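As an illustration of how these formulas simplify, consider the special case without jumps. This is an assumption made here for illustration only: $K\equiv 0$, so that the $\nu(d\zeta)$-integral in the definition of $g$ and the $z^2$-integrals in the denominators vanish, and $\rho$ is treated as a constant, as the notation suggests. The sketch below also uses that $Y_t$ is observed, so functions of $(t,Y_t)$ can be taken outside the filter $\pi_t$.

```latex
% Special case K \equiv 0 (no jumps); illustrative assumption, not a result stated in the chapter.
g(t,x,y) = \rho\,\sigma_0(t,x)\,y\,\sigma(t,y)\,\frac{\partial h}{\partial x}
         + y^2\sigma^2(t,y)\,\frac{\partial h}{\partial y},

% Full information: divide g by Y_t^2 \sigma^2(t,Y_t).
\theta^{*,\mathbb{F}}_t
  = \frac{\partial h}{\partial y}(t,X_t,Y_t)
  + \frac{\rho\,\sigma_0(t,X_t)}{Y_t\,\sigma(t,Y_t)}\,\frac{\partial h}{\partial x}(t,X_t,Y_t),

% Partial information: Y_t is \mathcal{F}^Y_t-measurable, so it factors out of \pi_t.
\theta^{*,\mathbb{F}^Y}_t
  = \pi_t\!\left(\frac{\partial h}{\partial y}\right)
  + \frac{\rho}{Y_t\,\sigma(t,Y_t)}\,\pi_t\!\left(\sigma_0\,\frac{\partial h}{\partial x}\right).
```

Under these assumptions the full-information strategy is a delta hedge corrected by a term that accounts for the correlation with the unobserved factor, and the partial-information strategy is obtained by filtering each term.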

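Proof. Consider decomposition Eq. (60) for $\xi=H(T,Y_T)$. Then, conditioning on $\mathcal{F}_t$, we get

$$h(t,X_t,Y_t) = U_0^{\mathbb{F}} + \int_0^t\theta_s^{*,\mathbb{F}}\,dY_s + G_t^{\mathbb{F}}. \tag{85}$$

Taking the predictable covariation with $Y$, computed with respect to $\mathbb{F}$, we obtain

$$\langle h(\cdot,X,Y),Y\rangle_t^{\mathbb{F}} = \int_0^t\theta_s^{*,\mathbb{F}}\,d\langle Y\rangle_s^{\mathbb{F}}. \tag{86}$$

On the other hand, $h(t,X_t,Y_t)=h(0,X_0,Y_0)+M_t^h$; then, taking Eqs. (81) and (44) into account, we get that

$$\langle h(\cdot,X,Y),Y\rangle_t^{\mathbb{F}} = \int_0^t g(s,X_s,Y_s)\,ds, \tag{87}$$

where $g(t,x,y)$ is given in Eq. (84). Hence, by Eqs. (50) and (87), we may represent $\theta^{*,\mathbb{F}}$ as

$$\theta_t^{*,\mathbb{F}} = \frac{d\langle h(\cdot,X,Y),Y\rangle_t^{\mathbb{F}}}{d\langle Y\rangle_t^{\mathbb{F}}} = \frac{g(t,X_t,Y_t)}{Y_t^2\,\sigma^2(t,Y_t) + \int_{\mathbb{R}} z^2\,\lambda(t,X_t,Y_t)\,\varphi(t,X_t,Y_t,dz)}. \tag{88}$$

Finally, by Eq. (51) and the identity

$$\left(\int_0^t\theta_u^{*,\mathbb{F}}\,d\langle Y\rangle_u^{\mathbb{F}}\right)^{p,\mathbb{F}^Y} = \left(\int_0^t g(s,X_s,Y_s)\,ds\right)^{p,\mathbb{F}^Y} = \int_0^t\pi_s(g)\,ds, \tag{89}$$

applying Eq. (62) yields representation Eq. (83). □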
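A quick numerical sanity check of the condition $\mathcal{L}^{X,Y}h=0$ and of the full-information formula is possible in the simplest special case. The sketch below assumes, purely for illustration, no jumps, no dependence on the hidden factor $X$, a constant volatility `sigma`, and a call payoff with a hypothetical strike; none of these choices comes from the chapter. In that case $Y$ is a driftless geometric Brownian motion, $h$ has the familiar closed form with zero interest rate, the generator reduces to $\partial_t h + \tfrac12 y^2\sigma^2\partial_{yy}h$, and Eq. (82) reduces to $\theta^{*,\mathbb{F}}=\partial h/\partial y$.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(t, y, strike=100.0, sigma=0.2, T=1.0):
    tau = T - t
    return (math.log(y / strike) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))


def h(t, y, strike=100.0, sigma=0.2, T=1.0):
    """h(t, y) = E[(Y_T - strike)^+ | Y_t = y] for dY = Y * sigma * dW (no drift)."""
    tau = T - t
    a = d1(t, y, strike, sigma, T)
    return y * norm_cdf(a) - strike * norm_cdf(a - sigma * math.sqrt(tau))


def derivatives(t, y, dt=1e-4, dy=1e-2):
    """Central finite differences for h_t, h_y and h_yy."""
    h_t = (h(t + dt, y) - h(t - dt, y)) / (2.0 * dt)
    h_y = (h(t, y + dy) - h(t, y - dy)) / (2.0 * dy)
    h_yy = (h(t, y + dy) - 2.0 * h(t, y) + h(t, y - dy)) / dy ** 2
    return h_t, h_y, h_yy


sigma = 0.2
t, y = 0.3, 105.0
h_t, h_y, h_yy = derivatives(t, y)

# Generator applied to h in this special case; should be close to zero.
residual = h_t + 0.5 * y ** 2 * sigma ** 2 * h_yy

# g without the x-derivative and jump terms, then theta as g over y^2 sigma^2.
g = y ** 2 * sigma ** 2 * h_y
theta_full = g / (y ** 2 * sigma ** 2)
```

Here `residual` is zero up to discretization error and `theta_full` agrees with the closed-form delta `norm_cdf(d1(t, y))`. With a hidden factor or jumps, the extra terms of $g$ would have to be added and, under partial information, replaced by their filtered counterparts.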

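Our ultimate objective in this section is to investigate the relation between the costs of the $\mathbb{F}$-optimal strategy and the $\mathbb{F}^Y$-optimal strategy, or equivalently the associated risk processes.

It clearly holds that $\theta^{*,\mathbb{F}^Y}\in\Theta(\mathbb{F})$, and then the $\mathbb{F}^Y$-risk-minimizing strategy is also an $\mathbb{F}$-strategy. Considering the corresponding risks, we have

$$\begin{aligned}
\mathbb{E}\left[\left(C_T(\psi^{*,\mathbb{F}^Y})-C_t(\psi^{*,\mathbb{F}^Y})\right)^2\,\middle|\,\mathcal{F}_t^Y\right]
&= \mathbb{E}\left[\mathbb{E}\left[\left(C_T(\psi^{*,\mathbb{F}^Y})-C_t(\psi^{*,\mathbb{F}^Y})\right)^2\,\middle|\,\mathcal{F}_t\right]\,\middle|\,\mathcal{F}_t^Y\right]\\
&\ge \mathbb{E}\left[\mathbb{E}\left[\left(C_T(\psi^{*,\mathbb{F}})-C_t(\psi^{*,\mathbb{F}})\right)^2\,\middle|\,\mathcal{F}_t\right]\,\middle|\,\mathcal{F}_t^Y\right]\\
&= \mathbb{E}\left[\left(C_T(\psi^{*,\mathbb{F}})-C_t(\psi^{*,\mathbb{F}})\right)^2\,\middle|\,\mathcal{F}_t^Y\right],
\end{aligned} \tag{90}$$

and then $\mathbb{E}\left[R_t^{\mathbb{F}}(\psi^{*,\mathbb{F}})\right]\le\mathbb{E}\left[R_t^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})\right]$, for every $t\in[0,T]$. In the remaining part of the chapter, we assume that $\mathcal{F}_0^Y=\mathcal{F}_0=\{\Omega,\emptyset\}$, and we wish to measure the difference in the total risk taken by an informed investor, endowed with the filtration $\mathbb{F}$, and a partially informed investor, whose information is described by $\mathbb{F}^Y$. Precisely, we compute the difference $R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}})$. By decompositions Eqs. (60) and (61), we have that $C_T(\psi^{*,\mathbb{F}})-C_0(\psi^{*,\mathbb{F}})=G_T^{\mathbb{F}}$ and $C_T(\psi^{*,\mathbb{F}^Y})-C_0(\psi^{*,\mathbb{F}^Y})=G_T^{\mathbb{F}^Y}$ and also

$$G_T^{\mathbb{F}^Y} = U_0^{\mathbb{F}} - U_0^{\mathbb{F}^Y} + \int_0^T\left(\theta_r^{*,\mathbb{F}}-\theta_r^{*,\mathbb{F}^Y}\right)dY_r + G_T^{\mathbb{F}}. \tag{91}$$

Since $\mathcal{F}_0^Y=\mathcal{F}_0=\{\Omega,\emptyset\}$, both initial values are deterministic and $U_0^{\mathbb{F}}=U_0^{\mathbb{F}^Y}=\mathbb{E}[\xi]$. Then computing the square of $G_T^{\mathbb{F}^Y}$ and taking the expectation we get

$$\mathbb{E}\left[\left(G_T^{\mathbb{F}^Y}\right)^2\right] = \mathbb{E}\left[\left(G_T^{\mathbb{F}}\right)^2\right] + \mathbb{E}\left[\left(\int_0^T\left(\theta_r^{*,\mathbb{F}}-\theta_r^{*,\mathbb{F}^Y}\right)dY_r\right)^2\right] + 2\,\mathbb{E}\left[G_T^{\mathbb{F}}\int_0^T\left(\theta_r^{*,\mathbb{F}}-\theta_r^{*,\mathbb{F}^Y}\right)dY_r\right]. \tag{92}$$

It follows from the Itô isometry and the fact that $G^{\mathbb{F}}$ is orthogonal to $Y$ that

$$\mathbb{E}\left[\left(G_T^{\mathbb{F}^Y}\right)^2\right] = \mathbb{E}\left[\left(G_T^{\mathbb{F}}\right)^2\right] + \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}}-\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right]. \tag{93}$$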

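Then the difference that we want to evaluate becomes

$$\begin{aligned}
R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}})
&= \mathbb{E}\left[\left(G_T^{\mathbb{F}^Y}\right)^2\right]-\mathbb{E}\left[\left(G_T^{\mathbb{F}}\right)^2\right]
= \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}}-\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right]\\
&= \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right] + \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right] - 2\,\mathbb{E}\left[\int_0^T\theta_r^{*,\mathbb{F}}\,\theta_r^{*,\mathbb{F}^Y}\,d\langle Y\rangle_r^{\mathbb{F}}\right].
\end{aligned} \tag{94}$$

Using Eq. (62) and the definition of $\mathbb{F}^Y$-dual-predictable projections, we have that

$$\mathbb{E}\left[\int_0^t\theta_r^{*,\mathbb{F}^Y}\,\theta_r^{*,\mathbb{F}}\,d\langle Y\rangle_r^{\mathbb{F}}\right] = \mathbb{E}\left[\int_0^t\left(\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}^Y}\right] = \mathbb{E}\left[\int_0^t\left(\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right],\quad t\in[0,T], \tag{95}$$

which implies

$$R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}}) = \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}}\right)^2 d\langle Y\rangle_r^{\mathbb{F}}\right] - \mathbb{E}\left[\int_0^T\left(\theta_r^{*,\mathbb{F}^Y}\right)^2 d\langle Y\rangle_r^{\mathbb{F}^Y}\right]. \tag{96}$$

Plugging in the expressions for the optimal strategies given in Eqs. (82) and (83), respectively, and denoting by $\Sigma(t,X_t,Y_t):=Y_t^2\,\sigma^2(t,Y_t)+\int_{\mathbb{R}}z^2\,\lambda(t,X_t,Y_t)\,\varphi(t,X_t,Y_t,dz)$ the density of $\langle Y\rangle^{\mathbb{F}}$ that appears in the denominator of Eq. (82), we have

$$R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}}) = \mathbb{E}\left[\int_0^T\left(\frac{g^2(t,X_t,Y_t)}{\Sigma(t,X_t,Y_t)}-\frac{\pi_t^2(g)}{\pi_t(\Sigma)}\right)dt\right] \le C\,\mathbb{E}\left[\int_0^T\left(g^2(t,X_t,Y_t)-\pi_t^2(g)\right)dt\right] = C\,\mathbb{E}\left[\int_0^T\left(g(t,X_t,Y_t)-\pi_t(g)\right)^2 dt\right] \tag{97}$$

for some $C>0$, where the inequality follows by Assumption 18, and in the last equality, we used $\mathbb{E}\left[\int_0^T 2\,g(t,X_t,Y_t)\,\pi_t(g)\,dt\right]=\mathbb{E}\left[\int_0^T 2\,\pi_t(g)^2\,dt\right]$.

We can conclude that the difference between the total risks taken by a partially informed investor and an informed one admits an upper bound that is proportional to the mean-squared error between the process $\{g(t,X_t,Y_t),\ t\in[0,T]\}$ and its filtered estimate $\pi(g)=\{\pi_t(g),\ t\in[0,T]\}$.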
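The last equality in Eq. (97) rests on the tower property: since $\pi_t(g)$ is the conditional expectation of $g$ given the observations, $\mathbb{E}[g\,\pi_t(g)]=\mathbb{E}[\pi_t(g)^2]$, and hence $\mathbb{E}[(g-\pi_t(g))^2]=\mathbb{E}[g^2]-\mathbb{E}[\pi_t(g)^2]$. The following minimal sketch verifies this at a single time point on a finite probability space. The joint law of the hidden state `x` and the observation `y`, as well as the function `g`, are hypothetical and chosen only for illustration.

```python
# Hypothetical joint law of a hidden state x and an observation y (sums to 1).
joint = {
    (0, 0): 0.10, (1, 0): 0.25, (2, 0): 0.15,
    (0, 1): 0.20, (1, 1): 0.05, (2, 1): 0.25,
}


def g(x, y):
    """Arbitrary test function of the hidden state and the observation."""
    return (x + 1.0) * (y + 0.5)


def conditional_mean(joint, func):
    """Return {y: E[func(X, Y) | Y = y]}, a discrete stand-in for the filter."""
    mass, weighted = {}, {}
    for (x, y), p in joint.items():
        mass[y] = mass.get(y, 0.0) + p
        weighted[y] = weighted.get(y, 0.0) + p * func(x, y)
    return {y: weighted[y] / mass[y] for y in mass}


def expectation(joint, func):
    """Expectation of func(X, Y) under the joint law."""
    return sum(p * func(x, y) for (x, y), p in joint.items())


pi_g = conditional_mean(joint, g)
cross = expectation(joint, lambda x, y: g(x, y) * pi_g[y])      # E[g * pi(g)]
pi_sq = expectation(joint, lambda x, y: pi_g[y] ** 2)           # E[pi(g)^2]
g_sq = expectation(joint, lambda x, y: g(x, y) ** 2)            # E[g^2]
mse = expectation(joint, lambda x, y: (g(x, y) - pi_g[y]) ** 2) # filtering error
```

In this example `cross` equals `pi_sq`, and `mse` equals `g_sq - pi_sq`, which is the pointwise version of the identity used in Eq. (97). The bound therefore vanishes exactly when $g$ is already a function of the observations.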

Notes

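  • We call $(\mathbb{F}^Y,P)$-dual-predictable projection of a process $A$ the $\mathbb{F}^Y$-predictable finite variation process $A^{p,\mathbb{F}^Y}$ such that, for any bounded $\mathbb{F}^Y$-predictable process $\varphi$, we have $\mathbb{E}\left[\int_0^T\varphi_s\,dA_s\right]=\mathbb{E}\left[\int_0^T\varphi_s\,dA_s^{p,\mathbb{F}^Y}\right]$.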

How to cite and reference


Claudia Ceci and Katia Colaneri (November 2nd 2017). Recent Advances in Nonlinear Filtering with a Financial Application to Derivatives Hedging under Incomplete Information, Bayesian Inference, Javier Prieto Tejedor, IntechOpen, DOI: 10.5772/intechopen.70060. Available from:
