Open access peer-reviewed chapter

Recent Advances in Nonlinear Filtering with a Financial Application to Derivatives Hedging under Incomplete Information

Written By

Claudia Ceci and Katia Colaneri

Submitted: November 15th, 2016 Reviewed: June 8th, 2017 Published: November 2nd, 2017

DOI: 10.5772/intechopen.70060



In this chapter, we present some recent results about nonlinear filtering for jump diffusion signal and observation driven by correlated Brownian motions having common jump times. We provide the Kushner-Stratonovich and the Zakai equation for the normalized and the unnormalized filter, respectively. Moreover, we give conditions under which pathwise uniqueness for the solutions of both equations holds. Finally, we study an application of nonlinear filtering to the financial problem of derivatives hedging in an incomplete market with partial observation. Precisely, we consider the risk-minimizing hedging approach. In this framework, we compute the optimal hedging strategy for an informed investor and a partially informed one and compare the total expected squared costs of the strategies.


  • nonlinear filtering
  • jump diffusions
  • risk minimization
  • Galtchouk-Kunita-Watanabe decomposition
  • partial information

1. Introduction

Bayesian inference and stochastic filtering are closely related, since in both approaches one wants to estimate quantities that are not directly observable. However, while in Bayesian inference all sources of uncertainty are treated as random variables, stochastic filtering deals with stochastic processes. It also covers many situations, from the linear to the nonlinear case, with various types of noise.

The objective of this chapter is to present nonlinear filtering results for Markovian partially observable systems where the state and the observation processes are described by jump diffusions with correlated Brownian motions and common jump times. We also aim to apply this theory to the financial problem of derivatives hedging for a trader who has limited information on the market.

A filtering model is characterized by a signal process, denoted by X, which cannot be observed directly, and an observation process, denoted by Y, whose dynamics depend on X. The natural filtration of Y, $\mathbb{F}^Y=\{\mathcal{F}_t^Y,\ t\in[0,T]\}$, represents the available information. The goal of solving a filtering problem is to determine the best estimate of the signal $X_t$ from the knowledge of $\mathcal{F}_t^Y$. As in optimal Bayesian filtering, we seek the best estimate of the signal according to the minimum mean-squared error criterion, which corresponds to computing the posterior distribution of $X_t$ given the available observations up to time t.
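To make the minimum mean-squared error criterion concrete, the following minimal sketch works in a static, finite setting rather than in the continuous-time model of this chapter. The joint law `joint`, the helper names, and the competing estimator are hypothetical and chosen only for illustration. The sketch computes the posterior distribution of X given Y and checks that its mean has a smaller mean-squared error than a naive estimator that simply reports the observation.

```python
from fractions import Fraction

# Hypothetical joint law of a hidden signal X and an observation Y (toy example).
# Keys are (x, y) pairs; values are probabilities summing to one.
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(5, 10),
}

def posterior(y):
    """Conditional law of X given Y = y (the static analogue of the filter)."""
    norm = sum(p for (_, yy), p in joint.items() if yy == y)
    return {x: p / norm for (x, yy), p in joint.items() if yy == y}

def conditional_mean(y):
    """E[X | Y = y], computed from the posterior distribution."""
    return sum(x * p for x, p in posterior(y).items())

def mse(estimator):
    """Mean-squared error E[(X - estimator(Y))^2] under the joint law."""
    return sum(p * (x - estimator(y)) ** 2 for (x, y), p in joint.items())

best = mse(conditional_mean)          # error of the conditional mean
naive = mse(lambda y: Fraction(y))    # error when the observation itself is used
```

Exact rational arithmetic is used so that the comparison between the two errors does not depend on floating-point rounding.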

Historically, the first example of a continuous-time filtering problem is the well-known Kalman-Bucy filter, which concerns the case where Y gives the observation of X in additive Gaussian noise and both processes X and Y are modeled by linear stochastic differential equations. In this case, one ends up with a filter having a finite-dimensional realization. Since then, the problem has been extended in many directions. To start, a number of authors, including Refs. [1–3], studied the nonlinear case in the setting of additive Gaussian noise. Other references in a similar framework are given, for instance, by Refs. [4–8]. Subsequently, the case of counting process or marked point process observations has also been considered (see Refs. [9–14] and references therein). A more recent literature covers the case of mixed-type observations (marked point processes and diffusions or jump-diffusion processes); see, for example, Refs. [15–18].

There are two major approaches to nonlinear filtering problems: the innovations method and the reference probability method. The latter is usually employed when it is possible to find an equivalent probability measure that makes the state X and the observations Y independent. This technique may be problematic when, for instance, signal and observation are correlated and have common jump times. Therefore, in this chapter, we use the innovations approach, which allows us to circumvent the technical issues arising in the reference probability method. By characterizing the innovation process and applying a martingale representation theorem, we derive the dynamics of the filter as the solution of the Kushner-Stratonovich equation, which is a nonlinear stochastic partial integro-differential equation. By considering the unnormalized version of the filter, it is possible to simplify this equation and make it linear. The resulting equation is called the Zakai equation, and due to its linear nature, it is of particular interest in many applications. We also compute the dynamics of the unnormalized filter, and we investigate pathwise uniqueness for the solutions of both equations. Normalized and unnormalized filters are probability measure-valued and finite measure-valued processes, respectively, and are therefore in general infinite-dimensional. For this reason, various recursive algorithms for statistical inference have been introduced to address this intractability, such as the extended Kalman filter, statistical linearization, or particle filters. These algorithms aim to estimate both state and parameters. For parameter estimation, we also mention the expectation maximization (EM) algorithm, which makes it possible to estimate parameters in models with incomplete data; see, for example, Ref. [19].

The success of filtering theory over the years is due to its use in a great variety of problems arising in many disciplines, such as engineering, information sciences, and mathematical finance. Specifically, in this chapter, we have a financial application in view. In real financial markets, it is reasonable to assume that investors cannot fully know all the stochastic factors that may influence the prices of traded assets, since these factors are usually associated with economic quantities that are hard to observe. Filtering theory provides a way to measure, in some sense, this uncertainty. A substantial part of the literature over recent years has considered stochastic factor models under partial information for analyzing various financial problems, such as pricing and hedging of derivatives, optimal investment, credit risk, and insurance modeling. A list, by no means exhaustive, is given by Refs. [15, 16, 20–26].

In the following, we consider the problem of a trader who wants to determine the hedging strategy for a European-type contingent claim with maturity T in an incomplete financial market where the investment possibilities are given by a riskless asset, assumed to be the numéraire, and a risky asset with price dynamics given by a geometric jump diffusion, modeled by the process Y. We assume that the drift, as well as the intensity and the jump size distribution of the price process, is influenced by an unobservable stochastic factor X, modeled as a correlated jump diffusion with common jump times. By allowing common jump times, we intend to take into account catastrophic events which affect both the asset price and the hidden state variable driving its dynamics. The agent knows the asset prices, since they are publicly available, and trades on the market by using the available information $\mathbb{F}^Y$.

Partial information easily leads to incomplete financial markets, since the number of random sources is larger than the number of tradeable risky assets. Therefore, the existence of a self-financing strategy that replicates the payoff of the given contingent claim at maturity is not guaranteed. Here, we assume that the risky asset price is modeled under a martingale measure, and we choose the risk-minimization approach as hedging criterion; see, for example, Refs. [27, 28].

According to this method, the optimal hedging strategy is the one that perfectly replicates the claim at maturity and has minimum cost in the mean-square sense. Equivalently, we say that it minimizes the associated risk defined as the conditional expected value of the squared future costs, given the available information (see Refs. [28, 29] and references therein).

The risk-minimizing hedging strategy under restricted information is closely related to the Galtchouk-Kunita-Watanabe decomposition of the random variable representing the payoff of the contingent claim in a partial information setting. Here, we provide a characterization of the risk-minimizing strategy under partial information via this orthogonal decomposition and obtain a representation in terms of the corresponding risk-minimizing hedging strategy under full information (see, e.g., Refs. [29, 30]) via predictable projections on the available information flow by means of the filter. Finally, we investigate the difference of expected total risks associated with the optimal hedging strategies under full and partial information.

The chapter has the following structure. In Section 2, we introduce the general framework. In Section 3, we study the filtering equations. In particular, we derive the dynamics for both normalized and unnormalized filters, and we investigate uniqueness of the solutions of the Kushner-Stratonovich and the Zakai equation. In Section 4, we analyze a financial application to risk minimization by computing the optimal hedging strategies for a European-type contingent claim under full and partial information and providing a comparison between the corresponding expected squared total costs.


2. The setting

We consider a pair of stochastic processes (X, Y), with values in $\mathbb{R}\times\mathbb{R}$ and càdlàg trajectories, on a complete filtered probability space $(\Omega,\mathcal{F},\mathbb{F},P)$, where $\mathbb{F}=\{\mathcal{F}_t,\ t\in[0,T]\}$ is a filtration satisfying the usual conditions of right continuity and completeness, and T is a fixed time horizon. The pair (X, Y) represents a partially observable system, where X is a signal process that describes a phenomenon which is not directly observable, and Y gives the observation of X; Y is modeled by a process correlated with the signal, possibly having common jump times with it.

Remark 1. In view of the financial application discussed in Section 4, Y represents the price of some risky asset, while X is an unknown stochastic factor, which may describe the activity of other markets, macroeconomic factors, or microstructure rules that influence the dynamics of the stock price process.

We define the observed history as the natural filtration of the observation process Y, that is, $\mathbb{F}^Y=\{\mathcal{F}_t^Y\}_{t\in[0,T]}$, where $\mathcal{F}_t^Y:=\sigma(Y_s,\ 0\le s\le t)$. The σ-algebra $\mathcal{F}_t^Y$ can be interpreted as the information available from observations up to time t. We aim to compute the best estimate of the signal X from the available information, in the quadratic sense. In other terms, this corresponds to determining the filter, which provides the conditional distribution of $X_t$ given $\mathcal{F}_t^Y$, for every t ∈ [0, T].

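Let $\mathcal{M}(\mathbb{R})$ be the space of finite measures over $\mathbb{R}$ and $\mathcal{P}(\mathbb{R})$ the subspace of probability measures over $\mathbb{R}$. Given $\mu\in\mathcal{M}(\mathbb{R})$, for any bounded measurable function f, we write

$$\mu(f):=\int_{\mathbb{R}}f(x)\,\mu(dx).$$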

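Definition 2. The filter is the $\mathbb{F}^Y$-càdlàg process π taking values in $\mathcal{P}(\mathbb{R})$ defined by

$$\pi_t(f):=E\big[f(t,X_t)\,\big|\,\mathcal{F}_t^Y\big]=\int_{\mathbb{R}}f(t,x)\,\pi_t(dx),$$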

for all bounded and measurable functions f(t, x) on [0, T] × R.

In the sequel, we denote by $\pi_{t^-}$ the left version of the filter, and for all functions F(t, x, y) such that $E|F(t,X_t,Y_t)|<\infty$ (resp. $E|F(t,X_{t^-},Y_{t^-})|<\infty$) for every t ∈ [0,T], we use the notation $\pi_t(F):=\pi_t(F(t,\cdot,Y_t))$ (resp. $\pi_{t^-}(F):=\pi_{t^-}(F(t,\cdot,Y_{t^-}))$).

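In this paper, we wish to consider the filtering problem for a partially observable system (X, Y) described by the following pair of stochastic differential equations:

$$\begin{cases} dX_t=b_0(t,X_t)\,dt+\sigma_0(t,X_t)\,dW_t^0+\int_Z K_0(t,X_{t^-};\zeta)\,N(dt,d\zeta),\\ dY_t=b_1(t,X_t,Y_t)\,dt+\sigma_1(t,Y_t)\,dW_t^1+\int_Z K_1(t,X_{t^-},Y_{t^-};\zeta)\,N(dt,d\zeta), \end{cases} \qquad (3)$$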

where $W^0$ and $W^1$ are correlated $(\mathbb{F},P)$-Brownian motions with correlation coefficient ρ ∈ [−1,1] and $N(dt,d\zeta)$ is a Poisson random measure on $\mathbb{R}_+\times Z$ with intensity $\nu(d\zeta)dt$, where ν is a σ-finite measure on a measurable space $(Z,\mathcal{Z})$. Here, $b_0,b_1,\sigma_0,\sigma_1,K_0$, and $K_1$ are $\mathbb{R}$-valued measurable functions of their arguments. In particular, $\sigma_0(t,x)$ and $\sigma_1(t,y)$ are strictly positive for every $(t,x,y)\in[0,T]\times\mathbb{R}^2$.

For the rest of the paper, we assume that strong existence and uniqueness for system Eq. (3) holds. Sufficient conditions are collected, for instance, in Ref. [18, Appendix]. These assumptions also imply Markovianity for the pair (X, Y).

Remark 3. Note that the quadratic variation process of Y defined by


is $\mathbb{F}^Y$-adapted and $[Y]_t=\int_0^t\sigma_1^2(u,Y_u)\,du+\sum_{u\le t}(\Delta Y_u)^2$, where $\Delta Y_t:=Y_t-Y_{t^-}$. Therefore, it is natural to assume that the signal X does not affect the diffusion coefficient in the dynamics of Y. If Y describes the price of a risky asset, this implies that the volatility of the stock price does not depend on the stochastic factor X.
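The observability of the quadratic variation can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the coefficient `sigma1`, the single deterministic jump, the zero drift, and the Euler discretization are hypothetical choices and not part of the model above. It compares the sum of squared increments of a simulated path, which an observer of Y can compute, with the Riemann sum of the squared diffusion coefficient plus the squared jump.

```python
import math
import random

def sigma1(t, y):
    """Hypothetical diffusion coefficient; it depends on (t, y) only, not on the signal."""
    return 0.2 + 0.1 * math.cos(y)

def simulate_and_compare(n_steps=20000, horizon=1.0, jump_time=0.5,
                         jump_size=0.5, seed=0):
    """Return (realized quadratic variation, integral of sigma1^2 plus squared jumps)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    jump_step = int(jump_time / dt)
    y = 1.0
    realized = 0.0    # sum of squared increments of the observed path
    predicted = 0.0   # Riemann sum of sigma1^2 plus squared jump sizes
    for k in range(n_steps):
        s = sigma1(k * dt, y)
        dy = s * rng.gauss(0.0, math.sqrt(dt))   # Euler step, drift set to zero
        predicted += s * s * dt
        if k == jump_step:                        # one jump at a fixed time (assumption)
            dy += jump_size
            predicted += jump_size ** 2
        realized += dy * dy
        y += dy
    return realized, predicted

realized_qv, predicted_qv = simulate_and_compare()
```

On a fine grid the two quantities should be close, which is consistent with the diffusion coefficient of Y being a function of the observed path only.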

The jump component of Y can be described in terms of the following integer-valued random measure on [0, T] × $\mathbb{R}$:

$$m(dt,dz)=\sum_{s:\,\Delta Y_s\neq 0}\delta_{(s,\Delta Y_s)}(dt,dz),$$

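where $\delta_a$ denotes the Dirac measure at point a. Note that the following equality holds:

$$\int_0^t\int_{\mathbb{R}}z\,m(ds,dz)=\int_0^t\int_Z K_1(s,X_{s^-},Y_{s^-};\zeta)\,N(ds,d\zeta).$$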

For all t ∈ [0, T] and for all $A\in\mathcal{B}(\mathbb{R})$, we define the following sets:

$$d^A(t,x,y):=\{\zeta\in Z:\ K_1(t,x,y;\zeta)\in A\setminus\{0\}\}\subseteq d^1(t,x,y). \qquad (8)$$

Typically, we have $D_t^0\cap D_t\neq\emptyset$ P-a.s., which means that state and observation may have common jump times. This characteristic is particularly meaningful in financial applications to model catastrophic events that produce jumps in both the stock price and the underlying stochastic factor that influences its dynamics.

To ensure existence of the first moment for the pair (X, Y) and non-explosiveness for the jump process governing the dynamics of X and Y, we make the following assumption:

Assumption 4.


Denote by $\eta^P(dt,dz)$ the $(\mathbb{F},P)$-compensator of $m(dt,dz)$ (see, e.g., Refs. [9, 31] for the definition).

Then, in Ref. [14, Proposition 2.2], it is proved that




and in particular $\lambda(t,x,y)=\nu(d^1(t,x,y))$.

Remark 5. Let us observe that both local jump characteristics $(\lambda(t,X_{t^-},Y_{t^-}),\varphi(t,X_{t^-},Y_{t^-},dz))$ depend on X and, for all $A\in\mathcal{B}(\mathbb{R})$, $\lambda(t,X_{t^-},Y_{t^-})\varphi(t,X_{t^-},Y_{t^-},A)=\nu(D_t^A)$ provides the $(\mathbb{F},P)$-intensity of the point process $N_t(A):=m((0,t]\times A)$. Accordingly, the process $\lambda(t,X_{t^-},Y_{t^-})=\nu(D_t)$ is the $(\mathbb{F},P)$-intensity of the point process $N_t(\mathbb{R})$, which counts the total number of jumps of Y until time t.

2.1. The innovation process

To derive the filtering equation, we use the innovations approach. This method requires introducing a pair $(I,m^\pi)$, called the innovation process, consisting of the $(\mathbb{F}^Y,P)$-Brownian motion and the $(\mathbb{F}^Y,P)$-compensated jump measure that drive the dynamics of the filter. The innovation also represents the building block of $(\mathbb{F}^Y,P)$-martingales.

To introduce the first component of the innovation process, we assume that


and define


The process I is an $(\mathbb{F}^Y,P)$-Brownian motion (see, e.g., Ref. [4]) and the $(\mathbb{F}^Y,P)$-compensated jump martingale measure is given by


See, e.g., Ref. [14]. The following theorem provides a characterization of $(\mathbb{F}^Y,P)$-martingales in terms of the innovation process.

Theorem 6 (A martingale representation theorem). Under Assumption 4 and the integrability condition (15), every $(\mathbb{F}^Y,P)$-local martingale M admits the following decomposition:


where $w(z)=\{w_t(z),\ t\in[0,T]\}$ is an $\mathbb{F}^Y$-predictable process indexed by z, and $h=\{h_t,\ t\in[0,T]\}$ is an $\mathbb{F}^Y$-adapted process such that


Proof. The proof is given in Ref. [17, Proposition 2.4]. Note that here condition (15) implies that $E\Big[\int_0^T\Big(\frac{b_1(t,X_t,Y_t)}{\sigma_1(t,Y_t)}\Big)^2dt\Big]<\infty$, and also that the process L defined by


for every $t\in[0,T]$, is an $(\mathbb{F},P)$-martingale.


3. The filtering equations

Theorem 7 (The Kushner-Stratonovich equation). Under Assumption 4 and condition (15), the filter π solves the following Kushner-Stratonovich equation, that is, for every $f\in C_b^{1,2}([0,T]\times\mathbb{R})$:




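Here, by $\frac{d\pi_{t^-}(\lambda\varphi f)}{d\pi_{t^-}(\lambda\varphi)}(z)$ and $\frac{d\pi_{t^-}(\bar{\mathcal{L}}f)}{d\pi_{t^-}(\lambda\varphi)}(z)$, we mean the Radon-Nikodym derivatives of the measures $\pi_{t^-}(\lambda f\varphi(dz))$ and $\pi_{t^-}(\bar{\mathcal{L}}f)(dz)$ with respect to $\pi_{t^-}(\lambda\varphi(dz))$. Moreover, the operator $\bar{\mathcal{L}}$, with $\bar{\mathcal{L}}_tf(dz):=\bar{\mathcal{L}}f(\cdot,Y_{t^-},dz)$, defined for every $A\in\mathcal{B}(\mathbb{R})$ by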


takes into account common jump times between the signal X and the observation Y.

Finally, the operator $\mathcal{L}^X$ given by


denotes the generator of the Markov process X.

Proof. The theorem is proved in Ref. [17, Theorem 3.1].

Example 8 (Observation dynamics driven by independent point processes with unobservable intensities). In the sequel, we provide an example where the Kushner-Stratonovich equation simplifies and the Radon-Nikodym derivatives appearing in the dynamics of π(f) reduce to ratios. Suppose that there exists a finite set of measurable functions $K_1^i(t,y)\neq 0$ for all $(t,y)\in[0,T]\times\mathbb{R}$, for $i\in\{1,\dots,n\}$, such that the dynamics of Y is given by


where $N^i$ are independent counting processes with $(\mathbb{F},P)$-intensities $\lambda_i(t,X_{t^-},Y_{t^-})$.

For simplicity, in this example, we assume that X and Y have no common jump times. Then, the filtering Eq. (21) reads as


Note that Eq. (21) has an equivalent expression in terms of the operator $\mathcal{L}_0^X$, given by


where $d^1(t,x,y)^c=\{\zeta\in Z:\ K_1(t,x,y;\zeta)=0\}$. Indeed, we get


Moreover, the filter has a natural recursive structure. To show this, define the sequence $\{T_n,Z_n\}_{n\in\mathbb{N}}$ of jump times and jump sizes of Y, that is, $Z_n=Y_{T_n}-Y_{T_n^-}$. These are observable data. Then, between two consecutive jump times, the filter is governed by a diffusion process, that is, for $t\in(T_n\wedge T,\,T_{n+1}\wedge T)$


and at any jump time Tn occurring before time T, it is given by


which implies that $\pi_{T_n}(f)$ is completely determined by the observed data $(T_n,Z_n)$ and the knowledge of $\pi_t(f)$ in the time interval $[T_{n-1},T_n)$, since $\pi_{T_n^-}(f)=\lim_{t\to T_n^-}\pi_t(f)$.
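The recursive structure described above can be sketched in a much simpler special case than the one treated in this example. The code below assumes, purely for illustration, a finite-state Markov chain signal with generator matrix `Q`, a single counting process observation with state-dependent intensities `lam`, no Brownian component in the observation, and no common jumps. Under these assumptions the filter between jumps follows a deterministic ordinary differential equation (instead of a diffusion), approximated here by Euler steps, and at each observed jump time it is updated by a ratio. All names and numerical values are hypothetical.

```python
def filter_step(pi, Q, lam, dt):
    """One Euler step of the normalized filter between two jump times of Y."""
    n = len(pi)
    lam_hat = sum(pi[i] * lam[i] for i in range(n))         # filtered intensity pi(lam)
    new = []
    for i in range(n):
        gen = sum(pi[j] * Q[j][i] for j in range(n))        # generator term (pi Q)_i
        new.append(pi[i] + (gen - pi[i] * (lam[i] - lam_hat)) * dt)
    return new

def filter_jump(pi, lam):
    """Update at an observed jump time: pi_i -> pi_i * lam_i / pi(lam)."""
    lam_hat = sum(p * l for p, l in zip(pi, lam))
    return [p * l / lam_hat for p, l in zip(pi, lam)]

def run_filter(pi0, Q, lam, jump_times, horizon, dt=1e-3):
    """Alternate Euler steps and jump updates along the observed jump times."""
    pi = list(pi0)
    jumps = sorted(jump_times)
    idx = 0
    for k in range(int(round(horizon / dt))):
        pi = filter_step(pi, Q, lam, dt)
        t = (k + 1) * dt
        while idx < len(jumps) and jumps[idx] <= t:
            pi = filter_jump(pi, lam)
            idx += 1
    return pi

# Hypothetical two-state example: state 1 has the higher jump intensity.
Q = [[-1.0, 1.0], [2.0, -2.0]]
lam = [1.0, 5.0]
no_jumps = run_filter([0.5, 0.5], Q, lam, jump_times=[], horizon=0.3)
with_jumps = run_filter([0.5, 0.5], Q, lam, jump_times=[0.1, 0.15, 0.2], horizon=0.3)
```

In this toy setting, observing several jumps shifts the conditional probability toward the high-intensity state, while observing none shifts it toward the low-intensity state.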

Note that the Kushner-Stratonovich equation is an infinite-dimensional nonlinear stochastic differential equation. Often, it is possible to characterize the filter in terms of a simpler equation, known as the Zakai equation, which provides the dynamics of the unnormalized version of the filter. Although the Zakai equation is still infinite-dimensional, it has the advantage of being linear.

The idea for getting the dynamics of the unnormalized filter consists of performing an equivalent change of probability measure defined by

$$\frac{dP^0}{dP}\Big|_{\mathcal{F}_t}=Z_t,\qquad t\in[0,T],$$

for a suitable strictly positive $(\mathbb{F},P)$-martingale Z, in such a way that the so-called unnormalized filter p is the $\mathcal{M}(\mathbb{R})$-valued process defined by

$$p_t(f):=E^0\big[Z_t^{-1}f(t,X_t)\,\big|\,\mathcal{F}_t^Y\big].$$

Remark 9. By the Kallianpur-Striebel formula, we get that

$$\pi_t(f)=\frac{p_t(f)}{p_t(1)},$$

where $p_t(1):=E^0\big[Z_t^{-1}\,\big|\,\mathcal{F}_t^Y\big]$. This provides the relation between the filter and its unnormalized version.

In order to compute the Zakai equation, we make the following assumption.

Assumption 10. Suppose that there exists a transition function $\eta^0(t,y,dz)$ such that the $(\mathbb{F}^Y,P)$-predictable measure $\eta^0(t,Y_{t^-},dz)$ is equivalent to $\lambda(t,X_{t^-},Y_{t^-})\varphi(t,X_{t^-},Y_{t^-},dz)$ and


Remark 11. In Ref. [18], a weaker assumption is considered. That condition makes it possible to introduce an equivalent probability measure on $(\Omega,\mathcal{F}_T^Y)$ which is not necessarily the restriction to $\mathcal{F}_T^Y$ of an equivalent probability measure on $(\Omega,\mathcal{F}_T)$.

Remark 12. In the context of Example 8, Assumption 10 is satisfied if, for instance, $\lambda_i(t,X_{t^-},Y_{t^-})>0$ P-a.s. for every $t\in[0,T]$.

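Assumption 10 equivalently means that there exists an $(\mathbb{F},P)$-predictable process $\Psi(t,X_{t^-},Y_{t^-},z)$ such that

$$\lambda(t,X_{t^-},Y_{t^-})\,\varphi(t,X_{t^-},Y_{t^-},dz)\,dt=\big(1+\Psi(t,X_{t^-},Y_{t^-},z)\big)\,\eta^0(t,Y_{t^-},dz)\,dt$$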

and $1+\Psi(t,X_{t^-},Y_{t^-},z)>0$ P-a.s. for every $t\in[0,T]$, $z\in\mathbb{R}$. Setting


we also assume that the following integrability condition holds:


The following proposition provides a version of the Girsanov theorem that fits our setting.

Proposition 13. Let Assumptions 4 and 10 and condition (38) hold, and define the process

$$Z_t:=\mathcal{E}\Big(-\int_0^t\frac{b_1(s,X_s,Y_s)}{\sigma_1(s,Y_s)}\,dW_s^1+\int_0^t\int_{\mathbb{R}}U(s,z)\big(m(ds,dz)-\lambda(s,X_{s^-},Y_{s^-})\varphi(s,X_{s^-},Y_{s^-},dz)\,ds\big)\Big),$$

for every $t\in[0,T]$, where $\mathcal{E}(M)$ denotes the Doléans-Dade exponential of a martingale M. Then, Z is a strictly positive $(\mathbb{F},P)$-martingale. Let $P^0$ be the probability measure equivalent to P given by


Then, the process


is an $(\mathbb{F},P^0)$-Brownian motion, and the $(\mathbb{F},P^0)$-predictable projection of the integer-valued random measure $m(dt,dz)$ is given by $\eta^0(t,Y_{t^-},dz)\,dt$.

Proof. Ref. [32, Theorem 9] ensures that Z is a martingale under Assumptions 4 and 10 and the integrability condition (38). The result then follows from Ref. [31, Chapter III, Theorem 3.24].

Note that, by Eq. (16), the process $\widetilde{W}^1$ can also be written as


which implies that $\widetilde{W}^1$ is also an $(\mathbb{F}^Y,P^0)$-Brownian motion. Moreover, since $\eta^0(t,Y_{t^-},dz)$ is $\mathbb{F}^Y$-predictable, it provides the $(\mathbb{F}^Y,P^0)$-predictable projection of the measure $m(dt,dz)$, and the observation process Y satisfies $dY_t=\sigma_1(t,Y_t)\,d\widetilde{W}_t^1+\int_{\mathbb{R}}z\,m(dt,dz)$. In particular, $\eta_t^0(\mathbb{R}):=\eta^0(t,Y_{t^-},\mathbb{R})$ is the $(\mathbb{F}^Y,P^0)$-intensity of the point process which counts the total number of jumps of Y until time t.

Theorem 14 (The Zakai equation). Under Assumptions 4 and 10 and condition (38), let $P^0$ be the probability measure defined in Proposition 13. For every $f\in C_b^{1,2}([0,T]\times\mathbb{R})$, the unnormalized filter defined in Eq. (33) satisfies the equation

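$$dp_t(f)=\big\{p_t(\mathcal{L}_0^Xf)-p_t(\lambda f)+\eta_t^0(\mathbb{R})\,p_t(f)\big\}\,dt+\Big\{\frac{p_t(b_1f)}{\sigma_1(t,Y_t)}+\rho\,p_t\Big(\sigma_0\frac{\partial f}{\partial x}\Big)\Big\}\,d\widetilde{W}_t^1+\int_{\mathbb{R}}\Big\{p_{t^-}(f\Psi)(z)+\frac{dp_{t^-}(\bar{\mathcal{L}}f)}{d\eta_t^0}(z)\Big\}\,m(dt,dz). \qquad (42)$$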

See Ref. [18, Theorem 3.6] for the proof.
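The linearity of the Zakai equation and the normalization in the Kallianpur-Striebel formula can be illustrated in a simplified special case. The sketch below assumes a finite-state Markov chain signal with generator matrix `Q`, a single counting process observation with intensities `lam`, a constant reference intensity `eta0`, no Brownian component, and no common jumps, so that the terms driven by the Brownian motion and by the common-jump operator are absent. These are assumptions made for illustration, and all names and values are hypothetical. Between jumps the unnormalized weights evolve linearly with drift given by the generator term minus `(lam - eta0)` times the weight, and at a jump each weight is multiplied by `lam / eta0`.

```python
def zakai_step(p, Q, lam, eta0, dt):
    """One Euler step of the unnormalized weights between jumps (linear in p)."""
    n = len(p)
    return [
        p[i] + (sum(p[j] * Q[j][i] for j in range(n)) - (lam[i] - eta0) * p[i]) * dt
        for i in range(n)
    ]

def zakai_jump(p, lam, eta0):
    """At a jump: p_i -> p_i * (1 + Psi_i), where 1 + Psi_i = lam_i / eta0."""
    return [p[i] * lam[i] / eta0 for i in range(len(p))]

def run_zakai(p0, Q, lam, eta0, jump_times, horizon, dt=1e-3):
    """Propagate the unnormalized weights along the observed jump times."""
    p = list(p0)
    jumps = sorted(jump_times)
    idx = 0
    for k in range(int(round(horizon / dt))):
        p = zakai_step(p, Q, lam, eta0, dt)
        t = (k + 1) * dt
        while idx < len(jumps) and jumps[idx] <= t:
            p = zakai_jump(p, lam, eta0)
            idx += 1
    return p

def normalize(p):
    """Normalization step: divide the weights by their total mass."""
    total = sum(p)
    return [w / total for w in p]

# Hypothetical two-state example; linearity means solutions can be superposed.
Q = [[-1.0, 1.0], [2.0, -2.0]]
lam = [1.0, 5.0]
obs = [0.1, 0.15, 0.2]
part_a = run_zakai([0.3, 0.0], Q, lam, 1.0, obs, horizon=0.3)
part_b = run_zakai([0.0, 0.7], Q, lam, 1.0, obs, horizon=0.3)
whole = run_zakai([0.3, 0.7], Q, lam, 1.0, obs, horizon=0.3)
posterior = normalize(whole)
```

Because every update is linear in the weights, the solution started from the sum of two initial conditions coincides with the sum of the two solutions, a property that the normalized filter does not share.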

3.1. Uniqueness of the filtering equations

In this section, we show pathwise uniqueness for the solution of the Kushner-Stratonovich and the Zakai equations. The first result provides the equivalence of uniqueness of the solutions to the filtering Eqs. (21) and (42).

Theorem 15. Let Assumptions 4 and 10 and condition (38) hold.

  1. Assume strong uniqueness for the solution to the Zakai equation, and let μ be a $\mathcal{P}(\mathbb{R})$-valued process which is a strong solution of the Kushner-Stratonovich equation. Then $\mu_t=\pi_t$ P-a.s. for all t ∈ [0, T].

  2. Conversely, suppose that pathwise uniqueness for the solution of the Kushner-Stratonovich equation holds, and let ξ be an $\mathcal{M}(\mathbb{R})$-valued process which is a strong solution of the Zakai equation. Then $\xi_t=p_t$ P-a.s. for all $t\in[0,T]$.

Proof. The proof follows by Ref. [18, Theorems 4.5 and 4.6]. Here, note that Assumption 10 implies that the measures μt(λφ(dz)) and πt(λφ(dz)) are equivalent.

Finally, strong uniqueness for the solution of both filtering equations is established in the subsequent theorems.

Theorem 16. Let (X, Y) be the partially observed system defined in Eq. (3), and assume in addition to Assumptions 4 and 10 and condition (15) that


Let μ be a strong solution of the Kushner-Stratonovich equation. Then $\mu_t=\pi_t$ P-a.s. for every $t\in[0,T]$.

Proof. See Ref. [17, Theorem 3.3].

Theorem 17. Let (X, Y) be the partially observed system in Eq. (3). Under Assumptions 4 and 10 and conditions (38) and (43), let ξ be a strong solution to the Zakai equation. Then $\xi_t=p_t$ P-a.s. for every $t\in[0,T]$.

Proof. The proof follows by Ref. [18, Theorem 4.7], after noticing that under Assumption 10 the measures ξt(λφ(dz)) and pt(λφ(dz)) are equivalent.


4. A financial application to risk minimization

In the current section, we focus on a financial application. We consider a simple financial market where agents may invest in a risky asset whose price is described by the process Y given in Eq. (3) and a riskless asset with price process B. Without loss of generality, we assume that $B_t=1$ for every $t\in[0,T]$. We also assume throughout the section the following dynamics for the process Y:


for some functions $\sigma(t,y)$ and $K(t,x,y;\zeta)$ such that $\sigma(t,y)>0$ and $K(t,x,y;\zeta)>-1$.

This choice for the dynamics of Y has a double advantage. On the one hand, assuming a geometric form, together with the condition $K(t,x,y;\zeta)>-1$, guarantees nonnegativity, which is desirable when modeling prices. On the other hand, we are modeling Y directly under a martingale measure, and by Assumption 18, it turns out to be a square integrable $(\mathbb{F},P)$-martingale.

Considering Eq. (44) corresponds to taking in system (3)


In addition, we make the following assumption.

Assumption 18.


for every $(t,x,y)\in[0,T]\times\mathbb{R}\times\mathbb{R}_+$, $\zeta\in Z$, and for some positive constants $c_1,c_2,c_3,c_4$.

Remark 19. In the sequel, it might be useful to specify the dynamics of Y also in terms of the jump measure m(dt,dz). Recalling Eqs. (6) and (14), we have


The stochastic factor X which affects intensity and jump size distribution of Y may represent the state of the economy and is not directly observable by market agents. This is a typical situation arising in real financial markets.

We model by $\mathbb{F}^Y$ the information available to investors. Since Y is $\mathbb{F}^Y$-adapted, it is in particular an $(\mathbb{F}^Y,P)$-martingale with the following decomposition:


By Eqs. (14) and (45), in this setting the first component of the innovation process I defined in Eq. (16) is given by

$$I_t=W_t^1+\int_0^t\frac{1}{Y_s\,\sigma(s,Y_s)}\int_{\mathbb{R}}z\,\big(\lambda(s,X_s,Y_s)\varphi(s,X_s,Y_s,dz)-\pi_s(\lambda\varphi(dz))\big)\,ds.$$

Suppose that we are given a European-type contingent claim whose final payoff is a square integrable $\mathcal{F}_T^Y$-measurable random variable ξ, that is, $\xi\in L^2(\mathcal{F}_T^Y)$, where


The objective of the agent is to find the optimal hedging strategy for this derivative. Since the number of random sources exceeds the number of tradeable risky assets, the market is incomplete. It is well known that in this setting, perfect replication by self-financing strategies is not feasible. We therefore suppose that the investor pursues the risk-minimization approach. Risk minimization is a quadratic hedging method that determines a dynamic investment strategy that replicates the claim perfectly with minimal cost. Let us properly introduce the objects of interest. We start with the following notation. For any pair of $\mathbb{F}$-adapted (respectively, $\mathbb{F}^Y$-adapted) processes $\Psi_1,\Psi_2$, we denote by $\langle\Psi_1,\Psi_2\rangle^{\mathbb{F}}$ the predictable covariation computed with respect to the filtration $\mathbb{F}$ (respectively, by $\langle\Psi_1,\Psi_2\rangle^{\mathbb{F}^Y}$ the predictable covariation computed with respect to the filtration $\mathbb{F}^Y$). Note that


and since Y is also $\mathbb{F}^Y$-adapted, we also have


We stress that, due to the presence of a jump component, the predictable quadratic variations of Y with respect to the filtrations $\mathbb{F}$ and $\mathbb{F}^Y$ are different.

Now we introduce a technical definition of two spaces, $\Theta(\mathbb{F})$ and $\Theta(\mathbb{F}^Y)$.

Definition 20. The space $\Theta(\mathbb{F}^Y)$ (respectively, $\Theta(\mathbb{F})$) is the space of all $\mathbb{F}^Y$-predictable (respectively, $\mathbb{F}$-predictable) processes θ such that


We observe that for every $\theta\in\Theta(\mathbb{F}^Y)$, thanks to $\mathbb{F}^Y$-predictability, we have


which implies that $\Theta(\mathbb{F}^Y)\subseteq\Theta(\mathbb{F})$.

Since we have two different levels of information, represented by the filtrations $\mathbb{F}$ and $\mathbb{F}^Y$, we may define two classes of admissible strategies.

Definition 21. An $\mathbb{F}^Y$-strategy (respectively, $\mathbb{F}$-strategy) is a pair ψ = (θ, η) of stochastic processes, where θ represents the amount invested in the risky asset and η is the amount invested in the riskless asset, such that $\theta\in\Theta(\mathbb{F}^Y)$ (respectively, $\theta\in\Theta(\mathbb{F})$) and η is $\mathbb{F}^Y$-adapted (respectively, $\mathbb{F}$-adapted).

This definition reflects the fact that the investor's choices should be adapted to her/his knowledge of the market. The value of a strategy ψ = (θ, η) is given by

$$V_t(\psi):=\theta_tY_t+\eta_t,\qquad t\in[0,T],$$

and its cost is described by the process

$$C_t(\psi):=V_t(\psi)-\int_0^t\theta_u\,dY_u,\qquad t\in[0,T].$$

In other terms, the cost of a strategy is the difference between the value process and the gain process. For a self-financing strategy, the value and the gain processes coincide, up to the initial wealth $V_0$, and therefore the cost is constant and equal to $C_t=V_0$, for every $t\in[0,T]$. We continue by defining the risk process in the partial information setting.

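Definition 22. Given an $\mathbb{F}^Y$-strategy (respectively, an $\mathbb{F}$-strategy) ψ = (θ, η), we denote by $R^{\mathbb{F}^Y}(\psi)$ (respectively, $R^{\mathbb{F}}(\psi)$) the associated risk process defined as

$$R_t^{\mathbb{F}^Y}(\psi):=E\big[(C_T(\psi)-C_t(\psi))^2\,\big|\,\mathcal{F}_t^Y\big]\quad\Big(\text{respectively, }R_t^{\mathbb{F}}(\psi):=E\big[(C_T(\psi)-C_t(\psi))^2\,\big|\,\mathcal{F}_t\big]\Big), \qquad (56)$$

for every $t\in[0,T]$.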
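The notions of value, gain, and cost can be illustrated in discrete time. The sketch below is a simplified analogue, not the continuous-time setting of this section: it assumes a hypothetical price path `prices`, a position `theta[k]` held over the period from k−1 to k, a bank account holding `eta[k]`, and a riskless asset normalized to 1. It computes the cost as value minus cumulated gains and checks that a self-financing strategy has constant cost equal to its initial wealth.

```python
def gains(theta, prices):
    """Cumulated trading gains: G_k = sum over j <= k of theta_j * (Y_j - Y_{j-1})."""
    g = [0.0]
    for k in range(1, len(prices)):
        g.append(g[-1] + theta[k] * (prices[k] - prices[k - 1]))
    return g

def cost(theta, eta, prices):
    """Cost process: C_k = V_k - G_k, with value V_k = theta_k * Y_k + eta_k."""
    g = gains(theta, prices)
    return [theta[k] * prices[k] + eta[k] - g[k] for k in range(len(prices))]

def self_financing_eta(theta, prices, v0):
    """Bank account that makes the strategy self-financing with initial wealth v0."""
    g = gains(theta, prices)
    return [v0 + g[k] - theta[k] * prices[k] for k in range(len(prices))]

# Hypothetical price path and risky positions.
prices = [100.0, 102.0, 99.0, 105.0]
theta = [0.0, 0.5, 0.2, 0.8]

# Self-financing strategy: the cost stays at the initial wealth.
sf_costs = cost(theta, self_financing_eta(theta, prices, 10.0), prices)

# Buying one share at time 0 with an empty bank account requires external funding.
funded_costs = cost([0.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], prices)
```

In the second strategy the cost jumps from 0 to 100 when the share is bought, which is the external cash injection the cost process is meant to record.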

Then, we have the following definition of risk-minimizing strategy under partial information.

Definition 23. An $\mathbb{F}^Y$-strategy ψ is risk minimizing if

  1. $V_T(\psi)=\xi$,

  2. for any other $\mathbb{F}^Y$-strategy $\tilde\psi$ we have $R_t^{\mathbb{F}^Y}(\psi)\le R_t^{\mathbb{F}^Y}(\tilde\psi)$, for every $t\in[0,T]$.

The corresponding definitions of risk process and risk-minimizing strategy under full information can be obtained by replacing $\mathbb{F}^Y$ and $R_t^{\mathbb{F}^Y}$ with $\mathbb{F}$ and $R_t^{\mathbb{F}}$ in Definition 23. To distinguish them when necessary, we use the terms $\mathbb{F}^Y$-risk-minimizing strategy and $\mathbb{F}$-risk-minimizing strategy. The second criterion in Definition 23 can also be written as


which intuitively means that a strategy is risk minimizing if it minimizes the variance of the cost. This equivalent definition yields a useful property of risk-minimizing strategies: they turn out to be self-financing on average, that is, the cost process C is a martingale and therefore has constant expectation (see, e.g., Ref. [27, Lemma 2] or Ref. [28, Lemma 2.3]).

In the sequel, we aim to characterize the optimal hedging strategy for the contingent claim ξ under full and partial information, that is, the $\mathbb{F}$- and the $\mathbb{F}^Y$-risk-minimizing strategies. To this end, we introduce two orthogonal decompositions known as the Galtchouk-Kunita-Watanabe decompositions under full and partial information (see, e.g., Ref. [30]). To better understand the relevance of these decompositions, we assume for a moment completeness of the market and full information. Then, it is well known that for every European-type contingent claim with final payoff ξ, there exists a self-financing strategy ψ = (θ, η) such that

$$\xi=V_0+\int_0^T\theta_u\,dY_u,$$

that is, a replicating portfolio is uniquely determined by the initial wealth and the investment in the risky asset. When the market is incomplete, decomposition (58) does not hold in general. Intuitively, this implies that we might expect additional terms in Eq. (58), and according to the risk-minimization criterion, these additional terms need to be such that the final cost does not deviate too much from the average cost, in the quadratic sense. Specifically, we have the following decomposition of the random variable ξ:

$$\xi=V_0+\int_0^T\theta_u\,dY_u+G_T,$$

where $G_T$ is the value at time T of a suitable process G. The minimality criterion requires that G is a martingale orthogonal to Y. We refer the reader to Ref. [28] for a detailed survey. Under suitable hypotheses, the above decomposition is known as the Galtchouk-Kunita-Watanabe decomposition.

Now we wish to be more formal, and we introduce the following definitions:

Consider a random variable $\xi\in L^2(\mathcal{F}_T^Y)$. Since $\mathcal{F}_T^Y\subseteq\mathcal{F}_T$, we can define the following decompositions for ξ.

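Definition 24. a. The Galtchouk-Kunita-Watanabe decomposition of $\xi\in L^2(\mathcal{F}_T^Y)$ with respect to Y and $\mathbb{F}$ is given by

$$\xi=U_0^{\mathbb{F}}+\int_0^T\theta_u^{\mathbb{F}}\,dY_u+G_T^{\mathbb{F}}\qquad P\text{-a.s.},$$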

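where $U_0^{\mathbb{F}}\in L^2(\mathcal{F}_0)$, $\theta^{\mathbb{F}}\in\Theta(\mathbb{F})$, and $G^{\mathbb{F}}$ is a square integrable $(\mathbb{F},P)$-martingale, with $G_0^{\mathbb{F}}=0$, orthogonal to Y, that is, $\langle G^{\mathbb{F}},Y\rangle_t^{\mathbb{F}}=0$ for every $t\in[0,T]$.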

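b. The Galtchouk-Kunita-Watanabe decomposition of $\xi\in L^2(\mathcal{F}_T^Y)$ with respect to Y and $\mathbb{F}^Y$ is given by

$$\xi=U_0^{\mathbb{F}^Y}+\int_0^T\theta_u^{\mathbb{F}^Y}\,dY_u+G_T^{\mathbb{F}^Y}\qquad P\text{-a.s.},$$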

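where $U_0^{\mathbb{F}^Y}\in L^2(\mathcal{F}_0^Y)$, $\theta^{\mathbb{F}^Y}\in\Theta(\mathbb{F}^Y)$, and $G^{\mathbb{F}^Y}$ is a square integrable $(\mathbb{F}^Y,P)$-martingale, with $G_0^{\mathbb{F}^Y}=0$, strongly orthogonal to Y, that is, $\langle G^{\mathbb{F}^Y},Y\rangle_t^{\mathbb{F}^Y}=0$ for every $t\in[0,T]$.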

In the sequel, we refer to Eqs. (60) and (61) as the Galtchouk-Kunita-Watanabe decompositions under full information and under partial information, respectively. Since Y is a square integrable martingale with respect to both filtrations $\mathbb{F}$ and $\mathbb{F}^Y$, decompositions (60) and (61) exist.
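A one-period analogue may help to see what such an orthogonal decomposition looks like. The sketch below assumes a hypothetical trinomial market with a trivial initial σ-algebra, a price that is a martingale by construction, and a call-type payoff; none of these numbers come from the chapter. It computes the constant, the integrand, and the residual of the decomposition by projecting the payoff on the price increment, and the residual is then orthogonal to the price increment and has zero mean.

```python
from fractions import Fraction

# Hypothetical one-period trinomial market: Y_0 = 100 and Y_1 takes three values.
y0 = Fraction(100)
outcomes = [Fraction(90), Fraction(100), Fraction(120)]
probs = [Fraction(1, 3), Fraction(1, 2), Fraction(1, 6)]   # chosen so that E[Y_1] = Y_0

def expect(values):
    """Expectation of a random variable given by its values on the three outcomes."""
    return sum(p * v for p, v in zip(probs, values))

dY = [y - y0 for y in outcomes]                       # price increment
xi = [max(y - y0, Fraction(0)) for y in outcomes]     # call-type payoff (assumption)

u0 = expect(xi)                                       # constant term of the decomposition
# Integrand: projection coefficient of xi on dY (dY has zero mean by construction).
theta = expect([x * d for x, d in zip(xi, dY)]) / expect([d * d for d in dY])
# Residual term: what is left after the constant and the hedgeable part.
G = [x - u0 - theta * d for x, d in zip(xi, dY)]
```

With three outcomes and a single risky asset the market is incomplete, so the residual is not identically zero; it represents the part of the payoff that cannot be hedged by trading in Y.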

The next proposition provides a relation between the integrands $\theta^{\mathbb{F}}$ and $\theta^{\mathbb{F}^Y}$ of decompositions (60) and (61) in terms of predictable projections. For any $(\mathbb{F},P)$-predictable process A of finite variation, we denote by $A^{p,\mathbb{F}^Y}$ its $(\mathbb{F}^Y,P)$-dual-predictable projection.

Proposition 25. The integrands in decompositions Eqs. (60) and (61) satisfy the following relation:


Here, $\langle Y\rangle^{p,\mathbb{F}^Y}$ denotes the $(\mathbb{F}^Y,P)$-dual-predictable projection of $\langle Y\rangle^{\mathbb{F}}$ and it is given by


Proof. First note that the $(\mathbb{F}^Y,P)$-dual-predictable projection of the process $\langle Y\rangle^{\mathbb{F}}$ coincides with the predictable quadratic variation of the process Y itself, computed with respect to its internal filtration, given in Eq. (51), since for any bounded $(\mathbb{F}^Y,P)$-predictable process φ, we have $E\big[\int_0^T\varphi_t\,d\langle Y\rangle_t^{\mathbb{F}}\big]=E\big[\int_0^T\varphi_t\,d\langle Y\rangle_t^{\mathbb{F}^Y}\big]$. This proves Eq. (63).



By the Galtchouk-Kunita-Watanabe decomposition Eq. (60), we can write


where $\tilde G_t:=\int_0^t(\theta_u^{\mathbb{F}}-\theta_u)\,dY_u$, for every $t\in[0,T]$. We observe that for every $\mathbb{F}^Y$-predictable process φ the following holds:


By choosing φ = θ and applying the Cauchy-Schwarz inequality, we obtain


This implies that $\theta\in\Theta(\mathbb{F}^Y)\subseteq\Theta(\mathbb{F})$ and that $\tilde G$ is an $(\mathbb{F},P)$-martingale. Taking the conditional expectation with respect to $\mathcal{F}_T^Y$ in Eq. (65) leads to




which provides the Galtchouk-Kunita-Watanabe decomposition (61) if we can show that the $(\mathbb{F}^Y,P)$-martingale $\hat G^{\mathbb{F}^Y}$ is strongly orthogonal to Y, that is, if for any bounded $(\mathbb{F}^Y,P)$-predictable process φ the following holds:


Note that orthogonality of the term $E[U_0^{\mathbb{F}}|\mathcal{F}_t^Y]-E[U_0^{\mathbb{F}}|\mathcal{F}_0^Y]+E[G_T^{\mathbb{F}}|\mathcal{F}_t^Y]$ follows from the orthogonality of $G^{\mathbb{F}}$ and Y. Moreover, we have


and by Eq. (64)


which proves strong orthogonality.

Theorem 26 shows the relation between the Galtchouk-Kunita-Watanabe decompositions and the optimal strategies under full and partial information.

Theorem 26. i. Every contingent claim $\xi\in L^2(\mathcal{F}_T^Y,P)$ admits a unique $\mathbb{F}$-risk-minimizing strategy $\psi^{*,\mathbb{F}}=(\theta^{*,\mathbb{F}},\eta^{*,\mathbb{F}})$, explicitly given by


where $V_t(\psi^{*,\mathbb{F}})=E[\xi|\mathcal{F}_t]$ for every $t\in[0,T]$, with minimal cost


Here, $\theta^{\mathbb{F}}$, $U_0^{\mathbb{F}}$, and $G^{\mathbb{F}}$ are given in Definition 24 part a.

ii. Moreover, it also admits a unique $\mathbb{F}^Y$-risk-minimizing strategy $\psi^{*,\mathbb{F}^Y}=(\theta^{*,\mathbb{F}^Y},\eta^{*,\mathbb{F}^Y})$, explicitly given by


where $V_t(\psi^{*,\mathbb{F}^Y})=E[\xi|\mathcal{F}_t^Y]$ for every $t\in[0,T]$, with minimal cost


and $\theta^{\mathbb{F}^Y}$, $U_0^{\mathbb{F}^Y}$ and $G^{\mathbb{F}^Y}$ are given in Definition 24 part b.

Proof. The proof of part i. is given, for example, in Ref. [28, Theorem 2.4]. For part ii., note that, using the martingale representation of $Y$ with respect to its internal filtration given in Eq. (48) and the fact that $\xi\in L^2(\mathcal{F}_T^Y)$, it is possible to reduce the partial information case to the full information one and apply [28, Theorem 2.4] again. □
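For orientation, the risk-minimizing strategy in the martingale setting of [28, Theorem 2.4] has the following well-known structure, written here in the notation of decomposition Eq. (60). This is a sketch of the standard result and not a verbatim copy of the displayed formulas of Theorem 26; the cost process symbol $C_t$ follows its usage later in this section.

```latex
\begin{align*}
\theta_t^{*,\mathbb{F}} &= \theta_t^{\mathbb{F}}, \qquad
\eta_t^{*,\mathbb{F}} = V_t(\psi^{*,\mathbb{F}}) - \theta_t^{\mathbb{F}}\,Y_t, \\
V_t(\psi^{*,\mathbb{F}}) &= E[\xi \mid \mathcal{F}_t]
  = U_0^{\mathbb{F}} + \int_0^t \theta_u^{\mathbb{F}}\,dY_u + G_t^{\mathbb{F}}, \\
C_t(\psi^{*,\mathbb{F}}) &= V_t(\psi^{*,\mathbb{F}})
  - \int_0^t \theta_u^{\mathbb{F}}\,dY_u
  = U_0^{\mathbb{F}} + G_t^{\mathbb{F}} .
\end{align*}
```

Under partial information the same structure would hold with $\mathbb{F}$ replaced by $\mathbb{F}^Y$ and with the terms of decomposition Eq. (61) in place of those of Eq. (60).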

Proposition 25 helps us in the computation of the optimal strategy under partial information. Indeed, it is sufficient to compute the corresponding strategy $\theta^{*,\mathbb{F}}$ under full information and the Radon-Nikodym derivative given in Eq. (62). To get more explicit representations, we assume that the payoff of the contingent claim has the form $\xi=H(T,Y_T)$, for some function $H:[0,T]\times\mathbb{R}^+\to\mathbb{R}$. Let $\mathcal{L}^{X,Y}$ denote the Markov generator of the pair $(X,Y)$, that is


for every $f\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$, where


By the Markov property, we have that for any $t\in[0,T]$ there exists a measurable function $h(t,x,y)$ such that


If the function $h$ is sufficiently regular, for instance $h\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$, we can apply Itô's formula and get that


where $M^h$ is the $(\mathbb{F},P)$-martingale given by


By Eq. (79), the process $\{h(t,X_t,Y_t),\ t\in[0,T]\}$ is an $(\mathbb{F},P)$-martingale. Then, the finite variation term vanishes, which means that the function $h$ satisfies $\mathcal{L}^{X,Y}h(t,X_t,Y_t)=0$, $P$-a.s. and for almost every $t\in[0,T]$. The next proposition provides the risk-minimizing strategy under partial information.

Proposition 27. Assume $h\in C_b^{1,2,2}([0,T]\times\mathbb{R}\times\mathbb{R}^+)$. Then the first components $\theta^{*,\mathbb{F}}$ and $\theta^{*,\mathbb{F}^Y}$ of the risk-minimizing strategies under full and partial information are given by


respectively, where the function g(t, x, y) is


Proof. Consider decomposition Eq. (60) for $\xi=H(T,Y_T)$. Then, conditioning on $\mathcal{F}_t$, we get


Taking the predictable covariation with $Y$, computed with respect to the filtration $\mathbb{F}$, we obtain


On the other hand, $h(t,X_t,Y_t)=M_t^h$; then, taking Eqs. (81) and (44) into account, we get that


where $g(t,x,y)$ is given in Eq. (84). Hence, by Eqs. (50) and (87), we may represent $\theta^{*,\mathbb{F}}$ as


Note that by Eq. (51) and


applying Eq. (62) we get representation Eq. (83).

Our ultimate objective in this section is to investigate the relation between the costs of the $\mathbb{F}$-optimal strategy and the $\mathbb{F}^Y$-optimal strategy, or equivalently between the associated risk processes.

It clearly holds that $\theta^{*,\mathbb{F}^Y}\in\Theta(\mathbb{F})$, and then the $\mathbb{F}^Y$-risk-minimizing strategy is also an $\mathbb{F}$-strategy. Considering the corresponding risks, we have


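and then $E[R_t^{\mathbb{F}}(\psi^{*,\mathbb{F}})]\le E[R_t^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})]$, for every $t\in[0,T]$. In the remaining part of the chapter, we assume that $\mathcal{F}_0^Y=\mathcal{F}_0=\{\Omega,\emptyset\}$, and we wish to measure the difference in the total risk taken by an informed investor, endowed with the filtration $\mathbb{F}$, and a partially informed investor, whose information is described by $\mathbb{F}^Y$. Precisely, we compute the difference $R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}})$. By decompositions Eqs. (60) and (61), we have that $C_T(\psi^{*,\mathbb{F}})-C_0(\psi^{*,\mathbb{F}})=G_T^{\mathbb{F}}$ and $C_T(\psi^{*,\mathbb{F}^Y})-C_0(\psi^{*,\mathbb{F}^Y})=G_T^{\mathbb{F}^Y}$ and also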


since $\mathcal{F}_0^Y=\mathcal{F}_0=\{\Omega,\emptyset\}$ gives $U_0^{\mathbb{F}}=U_0^{\mathbb{F}^Y}$. Then, computing the square of $G_T^{\mathbb{F}^Y}$ and taking the expectation, we get


It follows from the Itô isometry and the fact that $G^{\mathbb{F}}$ is orthogonal to $Y$ that


Then the difference that we want to evaluate becomes


Using Eq. (62) and the definition of $\mathbb{F}^Y$-dual-predictable projections, we have that


which implies


Plugging in the expressions for the optimal strategies given in Eqs. (82) and (83), respectively, and denoting $\Sigma(t,X_t,Y_t):=Y_t^2\left(\sigma^2(t,Y_t)+\int_Z z^2\lambda(t,X_t,Y_t)\,\phi(t,X_t,Y_t,dz)\right)$, we have


for some $C>0$, where the inequality follows from Assumption 18, and in the last equality we used $E\left[\int_0^T 2g(t,X_t,Y_t)\,\pi_t(g)\,dt\right]=E\left[\int_0^T 2\pi_t(g)^2\,dt\right]$.
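The identity used in the last equality is an instance of the tower property of conditional expectation. A short sketch, assuming that $\pi_t(g)=E[g(t,X_t,Y_t)\mid\mathcal{F}_t^Y]$ denotes the filter and writing $g_t:=g(t,X_t,Y_t)$ as a shorthand introduced here for readability:

```latex
\begin{align*}
E[g_t\,\pi_t(g)]
  &= E\big[\,E[g_t\,\pi_t(g)\mid\mathcal{F}_t^Y]\,\big]
   = E\big[\,\pi_t(g)\,E[g_t\mid\mathcal{F}_t^Y]\,\big]
   = E[\pi_t(g)^2], \\
E\big[(g_t-\pi_t(g))^2\big]
  &= E[g_t^2] - 2E[g_t\,\pi_t(g)] + E[\pi_t(g)^2]
   = E[g_t^2] - E[\pi_t(g)^2].
\end{align*}
```

Integrating over $[0,T]$ and exchanging expectation and time integral (which is justified when $g$ is bounded) gives the time-integrated form used in the text.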

In conclusion, we have obtained an upper bound for the difference between the total risks taken by a partially informed investor and an informed one. This bound is proportional to the mean-squared error between the process $\{g(t,X_t,Y_t),\ t\in[0,T]\}$ and its filtered estimate $\pi(g)=\{\pi_t(g),\ t\in[0,T]\}$.
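The first steps leading to this bound can be summarized as follows. This is a sketch reconstructed from the statements made in the text (equality of the initial values $U_0^{\mathbb{F}}=U_0^{\mathbb{F}^Y}$, orthogonality of $G^{\mathbb{F}}$ and $Y$, and the Itô isometry); it is not a verbatim copy of the chapter's displayed equations.

```latex
\begin{align*}
G_T^{\mathbb{F}^Y}
  &= G_T^{\mathbb{F}}
   + \int_0^T\big(\theta_t^{\mathbb{F}}-\theta_t^{\mathbb{F}^Y}\big)\,dY_t
  && \text{(subtracting Eq. (61) from Eq. (60))} \\
E\big[(G_T^{\mathbb{F}^Y})^2\big]
  &= E\big[(G_T^{\mathbb{F}})^2\big]
   + E\left[\int_0^T\big(\theta_t^{\mathbb{F}}-\theta_t^{\mathbb{F}^Y}\big)^2
       \,d\langle Y\rangle_t^{\mathbb{F}}\right]
  && \text{(the cross term vanishes by orthogonality)} \\
R_0^{\mathbb{F}^Y}(\psi^{*,\mathbb{F}^Y})-R_0^{\mathbb{F}}(\psi^{*,\mathbb{F}})
  &= E\left[\int_0^T\big(\theta_t^{\mathbb{F}}-\theta_t^{\mathbb{F}^Y}\big)^2
       \,d\langle Y\rangle_t^{\mathbb{F}}\right]
  && \text{(since } \mathcal{F}_0 \text{ is trivial and } C_T-C_0=G_T\text{)}.
\end{align*}
```

The difference in total risk is thus the $\langle Y\rangle^{\mathbb{F}}$-weighted squared distance between the two hedging ratios, which is the quantity that the filtering error of $g$ controls.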


References

1. Kushner H. On the differential equations satisfied by conditional probability densities of Markov processes, with applications. Journal of the Society for Industrial and Applied Mathematics, Series A: Control. 1964;2(1):106-119
2. Kushner H. Dynamical equations for optimal nonlinear filtering. Journal of Differential Equations. 1967;3(2):179-190
3. Zakai M. On the optimal filtering of diffusion processes. Probability Theory and Related Fields. 1969;11(3):230-243
4. Liptser RS, Shiryaev A. Statistics of Random Processes I. Berlin Heidelberg: Springer-Verlag; 1977
5. Kallianpur G. Stochastic Filtering Theory. New York: Springer-Verlag; 1980
6. Elliott RJ. Stochastic Calculus and Applications. Berlin Heidelberg New York: Springer; 1982
7. Kurtz TG, Ocone D. Unique characterization of conditional distributions in nonlinear filtering. The Annals of Probability. 1988;16:80-107
8. Bhatt AG, Kallianpur G, Karandikar RL. Uniqueness and robustness of solution of measure-valued equations of nonlinear filtering. The Annals of Probability. 1995;23(4):1895-1938
9. Brémaud P. Point Processes and Queues. New York: Springer-Verlag; 1980
10. Kliemann WH, Koch G, Marchetti F. On the unnormalized solution of the filtering problem with counting process observations. IEEE Transactions on Information Theory. 1990;36:1415-1425
11. Ceci C, Gerardi A. Filtering of a Markov jump process with counting observations. Applied Mathematics & Optimization. 2000;42:1-18
12. Frey R, Runggaldier W. A nonlinear filtering approach to volatility estimation with a view towards high frequency data. International Journal of Theoretical and Applied Finance. 2001;4(2):199-210
13. Ceci C, Gerardi A. A model for high frequency data under partial information: A filtering approach. International Journal of Theoretical and Applied Finance. 2006;9(4):1-22
14. Ceci C. Risk minimizing hedging for a partially observed high frequency data model. Stochastics: An International Journal of Probability and Stochastic Processes. 2006;78(1):13-31
15. Frey R, Runggaldier W. Pricing credit derivatives under incomplete information: A nonlinear-filtering approach. Finance and Stochastics. 2010;14:495-526
16. Frey R, Schmidt T. Pricing and hedging of credit derivatives via the innovation approach to nonlinear filtering. Finance and Stochastics. 2011;16(1):105-133
17. Ceci C, Colaneri K. Nonlinear filtering for jump diffusion observations. Advances in Applied Probability. 2012;44(3):678-701
18. Ceci C, Colaneri K. The Zakai equation of nonlinear filtering for jump-diffusion observations: Existence and uniqueness. Applied Mathematics & Optimization. 2014;69(1):47-82
19. Elliott R, Malcolm W. Discrete-time expectation maximization algorithms for Markov-modulated Poisson processes. IEEE Transactions on Automatic Control. 2008;53(1):247-256
20. Björk T, Davis M, Landén C. Optimal investment under partial information. Mathematical Methods of Operations Research. 2010;71(2):371-399
21. Ceci C, Colaneri K, Cretarola A. Local risk-minimization under restricted information to asset prices. Electronic Journal of Probability. 2015;20(96):1-30
22. Ceci C, Colaneri K, Cretarola A. Hedging of unit-linked life insurance contracts with unobservable mortality hazard rate via local risk-minimization. Insurance: Mathematics and Economics. 2015;60:47-60
23. Ceci C, Gerardi A. Pricing for geometric marked point processes under partial information: entropy approach. International Journal of Theoretical and Applied Finance. 2009;12:179-207
24. Frey R. Risk minimization with incomplete information in a model for high-frequency data. Mathematical Finance. 2000;10(2):215-222
25. Nagai H, Peng S. Risk-sensitive dynamic portfolio optimization with partial information on infinite time horizon. Annals of Applied Probability. 2000;12:173-195
26. Bäuerle N, Rieder U. Portfolio optimization with jumps and unobservable intensity process. Mathematical Finance. 2007;17(2):205-224
27. Föllmer H, Sondermann D. Hedging of non redundant contingent claims. In: Hildenbrand W, Mas-Colell A, editors. Contributions to Mathematical Economics. Amsterdam New York Oxford Tokyo: North Holland; 1986. pp. 205-223
28. Schweizer M. A guided tour through quadratic hedging approaches. In: Jouini E, Cvitanic J, Musiela M, editors. Option Pricing, Interest Rates and Risk Management. Cambridge: Cambridge University Press; 2001. pp. 538-574
29. Schweizer M. Risk minimizing hedging strategies under partial information. Mathematical Finance. 1994;4:327-342
30. Ceci C, Cretarola A, Russo F. GKW representation theorem under restricted information. An application to risk-minimization. Stochastics and Dynamics. 2014;14(2):1350019 (p. 23)
31. Jacod J, Shiryaev A. Limit Theorems for Stochastic Processes. 2nd ed. Berlin: Springer; 2003
32. Protter P, Shimbo K. No arbitrage and general semimartingales. In: Ethier SN, Feng J, Stockbridge RH, editors. Markov Processes and Related Topics: A Festschrift for Thomas G. Kurtz. Beachwood, Ohio, USA: Institute of Mathematical Statistics; 2008. pp. 267-283


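¹ We call $(\mathbb{F}^Y,P)$-dual-predictable projection of a process $A$ the $\mathbb{F}^Y$-predictable finite variation process $A^{p,\mathbb{F}^Y}$ such that, for any bounded $\mathbb{F}^Y$-predictable process $\varphi$, we have $E\left[\int_0^T\varphi_s\,dA_s\right]=E\left[\int_0^T\varphi_s\,dA_s^{p,\mathbb{F}^Y}\right]$.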
