Open access

Fractional Brownian Motions in Financial Models and Their Monte Carlo Simulation

Written By

Masaaki Kijima and Chun Ming Tam

Submitted: 05 August 2012 Published: 06 March 2013

DOI: 10.5772/53568

From the Edited Volume

Theory and Applications of Monte Carlo Simulations

Edited by Victor (Wai Kin) Chan

Chapter metrics overview

4,230 Chapter Downloads

View Full Metrics

1. Introduction

Fractional Brownian motion (fBm) was first introduced within a Hilbert space framework by Kolmogorov [1], and further studied and coined the name ‘fractional Brownian motion’ in the 1968 paper by Mandelbrot and Van Ness [2]. It has been widely used in various scientific fields, most notability in hydrology as first suggested in [3]. It also plays an important role in communication technology by enriching the queuing theory in terms of simulating real network traffic.

In recent years, it has been steadily gaining attention in the area of finance, as it appears that the traditional stochastic volatility model driven by ordinary Brownian motion implies geometric time-decay of the volatility smile, which is much faster than in what we see in real market data (which has a hyperbolic decay). Such stylized feature is called the volatility persistence, see [4] for detail discussion about the statistical test for the existence of long-range memory (author has also proposed a robust extension of the R / S statistics for that particular purpose). Several modeling approaches have been suggested capturing this persistence in conditional variance either via a unit-root or long memory process. In order to keep the pricing-framework largely intact, it is more interesting to study the long memory process, and fBm has a particular good match due to its similarity to the ordinary Brownian motion and its Gaussian properties.

In this chapter, we will outline several approaches to simulate fractional Brownian motion with H > 1 / 2 , including the exact methods and approximate methods, where the Hurst Index H is a parameter used in literature to generalize Brownian motion into fractional Brownian motion, first made popular by Benoit Mandelbrot, which we will give a detail definition in Definition 1.1. We also provide a brief introduction of the truncated fractional Brownian motion (long-memory model in continuous time) as proposed in [5,6].

1.1. Financial motivation and backgrounds

Numerous empirical studies have pointed out that, in options markets, the implied volatility back-out from the Black-Scholes equation displays volatility skews or smiles; the smile effect, which is well known to practitioners, refers to the U-shape price distortion on the implied volatility surface.

In Hull and White [7] and Scott [8], they have proposed this feature of volatility to be captured by stochastic regime, known as the stochastic volatility model:

{ d S ( t ) S ( t ) = μ ( t , S t ) d t + σ ( t ) d B 1 ( t ) d ( ln σ ( t ) ) = k ( θ ln ( σ ( t ) ) d t + ν d B 2 ( t ) ) E1

Here, S t is the asset price and σ ( t ) is the instantaneous volatility at time t, and { B 1 ( t ) , B 2 ( t ) } are ordinary standard Brownian motions. Hull and White [7] have shown that, the price of European option at time t of exercise date T is the conditional expectation of the Black Scholes option pricing formula where the constant volatility from the original formula is replaced by the quadratic average over the period [ t , T ] :

σ t , T 2 = 1 T - t t T σ 2 u d u   E2

Such models are successful at capturing the symmetric smile and skewness of the implied volatility by imposing relations between the driving Brownian motions in (1) (symmetric smile is explained by independence while skewness can be explained by linear correlation).

Due to the temporal aggregation effect evident in (2), however, the smile effect deteriorates along with time-to-maturity since the temporal aggregation gradually erases the conditional heteroskedasticity; in the standard stochastic volatility setup, this particular decay is much faster than what is observed in market data. The phenomenon of slow decaying volatility smile is known as the volatility persistence (long-range dependence of volatility process). This phenomenon is particularly poignant for high frequency data, for which the conditional variance process displays near unit-root behavior.

Furthermore, we emphasize the existence of such phenomenon collaborated by large quantities of researches, pointing out that the conditional volatility of asset returns also displays long range dependence: [9-12] have discussed extensively the evidence of such phenomenon in empirical data. Bayesian estimation in [13] of stochastic volatility models shows similar patterns of persistence.

Motivated by this inadequacy, long-memory process was deemed more appropriate enrichment for this purpose. Hence, fractional Brownian motion is a prime candidate among all long-memory process given its tractability and similarity with the ordinary Brownian motion: both the fractional Brownian motion and ordinary Brownian motion are self-similar with similar Gaussian structure. For discussions of estimation and evidence of the long-range dependence in conditional variance of asset returns, the reader is referred to [10] and section 3.1 of [14].

Now, we provide a formal definition of fractional Brownian motion (fBm). We adopt the definition as given in [15].

Definition 1.1 Let H ( 0,1 ) be a constant. A fractional Brownian motion { B H t } t 0 with Hurst index H is a continuous and centered Gaussian process with covariance function

E [ B H ( t ) B H ( s ) ] = 1 2 ( t 2 H + s 2 H | t s | 2 H ) E3

In particular, for H = 1 2 , it reduces to the ordinary Brownian motion with E [ B H ( t ) B H ( s ) ] = min ( t , s ) .

From equation (3) we have the following properties:

  1. B H 0 = 0 and E B H t = 0   ,   t 0 .

  2. B H ( ) has stationary increment: B H t + s - B H ( s ) has the same distribution as B H ( t ) for any s , t 0 .

  3. B H is self-similar, meaning that B H T t has the same distribution law as T H B H t .

  4. B H ( ) is a Gaussian process with the variance E B H t 2 = t 2 H ,   t 0 .

  5. B H has continuous trajectories.

Fractional Brownian motions are divided into three very different categories: H < 1 2 ,   H = 1 2 ,   H > 1 2 . This is of particular importance because there is a deterministic difference between the case of H < 1 2 and H > 1 2 , as we introduce the mathematical notion of long-range dependence.

Definition 1.2 (Long-range dependence) A stationary sequence X n n N exhibits long-range dependence if the autocovariance function γ n c o v X k , X k + n satisfies

lim n γ n c n - α = 1   E4

This can be written as n ~ n - α ,

In this case, for some constants c and α ( 0,1 ) , the dependence between X k and X k + n decays slowly as n and

n = 1 γ n =

If we set X k = B H k - B H k - 1 and X k + n = B H k + n - B H k + n - 1 and apply equation (3), we have

γ H n = 1 2 n + 1 2 H + n - 1 2 H - 2 n 2 H

where γ H n =   c o v B H k - B H k - 1 , B H k + n - B H k + n - 1 . In particular,

lim n γ H n H 2 H - 1 n 2 H - 2 = 1

Therefore, we conclude that: for H > 1 / 2 , n = 1 γ H n = , and for H < 1 / 2 , n = 1 γ H n < . Hence, only in the case of H > 1 / 2 , fractional Brownian motions display long-memory dependence.

As pointed out in [16], large lags difference between γ ( ) may be difficult to estimate in practice, so that models with long-range dependence are often formulated in terms of self-similarity. Self-similarity allows us to extrapolate across time scales and deduce long time behavior from short time behavior, which is more readily observed.

Because we are interested in capturing the long-memory phenomenon observed in financial markets, the rest of this chapter will only concern the case of H > 1 / 2 .

1.2. Stochastic integral representation

In the original paper [2], Mandelbrot and van Ness represent the fBm in stochastic integral with respect to the ordinary Brownian motion:

B H t = 1 Γ H + 1 2 - 0 t - s H - 1 2 - - s H - 1 2 d B s + 0 t t - s H - 1 2   d B s   E5

where Γ ( ) is the gamma function.

They have also included an alternative representation of the fBm which is the basis of the model in [5,6]:

B ^ H t = 0 t t - s H - 1 2 Γ H + 1 2 d B s   E6

This version of fBm is ‘truncated’ in the sense that the integration from negative infinity to zero in equation (5) is truncated. We will refer to the model (6) as the ‘truncated fractional Brownian motion’ in the rest of this chapter. As pointed out in [2], the representation (6) was first proposed by Paul Lévy to define the fBm by the Riemann-Liouville fractional integral, while the original integral in equation (5) is the Weyl fractional integral.

The definition (6) of fBm is in general not asymptotically covariance-stationary, even though it retains self-similarity. For further technical discussion and rigorous definition of the truncated fractional Brownian motion, we refer the reader to [5].

Given the difference, if one employs analytical tools such as Malliavin calculus should be applied with care and note the differences between the two versions of fBM. [17] offers an in-depth discussion of the theoretical differences.

Advertisement

2. Fractional brownian motions in financial models

We first look at several examples that utilize the fractional Brownian motions in the realm of financial modeling.

2.1. Asset price model

In the previous section, we mention that the motivation of fBms in finance models is to capture the long-range dependence in the volatility process. However, it is worth discussing the possibility of applying it to the asset process itself.

In practice, it is considered that an asset process driven by fBm will result in arbitrage. The idea is that, since for H 1 / 2 , B H ( t ) is not a semimartingale in continuous time, the asset process described by B H ( t ) violates the NFLVR (no free lunch with vanishing risk), a weaker version of arbitrage, and thus doesn’t admit an equivalent martingale measure according to Theorem 7.2 of [18]. Such findings and construction of arbitrage can be found in Rogers [19].

In contrast, Cheridito [20] has given multiple classes of trading strategies that allow various level of arbitrages (NFLVR, arbitrage in the classical sense and strong arbitrage) under fBm driven assets, and shown that the arbitrages are all eliminated if the inter-transaction time is not zero, i.e., the classes of strategies become arbitrage-free. Such assumption is reasonable in practice, given the physical limit and non-zero transaction cost. For more information on arbitrage theory and its generalization, the reader is referred to [18-20].

2.2. Stochastic volatility model

As we have mentioned in the introductory section, the main motivation for fBm is to capture the volatility persistence, the stylized feature observed in empirical data. There are several prominent models involving a volatility process with fBm. Here, we just outline several of them.

2.2.1. Long-memory continuous model

In [5, 6], Comte and Renault consider the following stochastic volatility model driven by fBm:

{ d S ( t ) S ( t ) = r d t + σ ( t ) d B ( t ) σ ( t ) = σ 0 e x ( t ) d x ( t ) = k x ( t ) d t + ν d B ^ H ( t ) E7

where the log-volatility term follows a fractional-OU process driven by the truncated fBm (6). The asset price process follows a geometric Brownian motion with volatility persistence.

Although Mandelbrot [2] deemed it as signifying the origin too heavily, the model (7) is easier and more robust than the ordinary fBms from the perspective of simulation. A simulation example is explored in Section 5.

2.2.2. Affine fractional stochastic volatility model

In [21], Comte et al. assume that the squared-volatility process follows

σ 2 t = θ + X α t   E8

where X α t is defined by the fractional integral:

X α t = - t t - s H - 1 / 2 Γ H + 1 / 2 x s d s   E9

Similar treatment of the fractional integration is outlined in [5]. The affine structure in (8) is similar to the one originally studied by Duffie et al. [22].

The affine structure is adopted for the extra tractability, and thus better suited for practical option pricing and hedging than the original idea (7). In fact, Comte et al. [21] have shown that this model can better depict the difference between the short and long memory properties in the resulting option prices.

2.2.3. Martingale expansion

Fukasawa [23] adopts and expands the asymptotic expansion technique first proposed by Yoshida [24] of European option prices around the Black-Scholes equation by means of perturbation technique and partial Malliavin calculus. It is shown that the logarithm of the asset process can be expressed as

l n S t = Z t = r t - 1 2 0 t g Y s n 2 d s + 0 t g Y s n θ d W s ' + 1 - θ 2 d W s

with

Y s n = y + ϵ n W s H ,   W t H = 0 t K H t , s d W s '

Here, r is the riskless rate, θ ( - 1,1 ) , y∈ℝ is a constant, ( W , W ' ) is a 2-dimensional standard (independent) Brownian motion, ϵ n 0 is a deterministic sequence for n , and g ( ) is the stochastic volatility process which is an adapted process for the minimal filtration.

Note that W t H is a fractional Brownian motion with Hurst index H, where K H t ( t , s ) is the kernel of the stochastic integral representation over a finite interval of Brownian motion. According to [15], pertaining to our interest, for the case of H > 1 / 2 , the kernel has the following expression:

K H t , s = c H s 1 / 2 - H s t u - s H - 3 / 2 u H - 1 / 2 d u

where

c H = H 2 H - 1 β 2 - 2 H , H - 1 / 2 1 / 2

Then, according to Corollary (2.6) of Fukasawa [23], the implied volatility can be expressed as

σ { 1 ϵ n 2 ρ 13 d 2 } + o ( ϵ n ) = a T H 1 / 2 log ( K / S ) + σ + b T H + 1 / 2 + o ( ϵ n ) E10

where d 2 is the typical argument in the N ( d 2 ) of the Black-Scholes formula, and

ρ 13 = 2 θ g ' y c H ' T H g y   ,   σ = g y ,   a = θ g ' y c H ' σ ϵ n ,   b = - a r - σ 2 2 E11

Equation (10) can be seen as an approximation ϵ n is the model parameter calibrated from market data. This shape of this approximation remains largely stationary, to check the accuracy of the approximation, we have no means but performing Monte Carlo simulation of fBm.

Advertisement

3. Simulation with exact methods

First, we look at methods that completely capture the covariance structure and true realization of the fractional Brownian motion (fBm) or fractional Gaussian noise (fGn). Any method described in this section has their starting point at the covariance matrix. The approximate scheme we see later is merely numerically close to the value of fBm (or fGn) or asymptotically coincides with it. The collection of algorithm in this section is not meant to be exhaustive. For more algorithm and discussion, see [25].

3.1. Hosking method

The Hosking method utilizes the well-known conditional distribution of the multivariate Gaussian distribution on a recursive scheme to generate samples based on the explicit covariance structure. This method generates a general stationary Gaussian process with given covariance structure, not limited to generating fBms.

More specifically, this algorithm generates an fGn sequence { Z k } and fBm is recovered by accumulative sum. That is, the distribution of Z n + 1 conditioned on the previous realization Z n , Z 1 , Z 0 can be explicitly computed.

Denote γ ( k ) as the autocovariance function of the zero-mean process:

γ ( k ) : = E ( X n X n + k )

where we assume for convenience that γ 0 = 1 .   For n , k = 0,1 , 2 , we have the following recursive relationship for the n + 1 × ( n + 1 ) autocovariance matrix Γ n = γ i - j i , j = 0 , , n :

Γ n + 1 = 1 c n ' c n Γ n   E12
= Γ ( n ) F ( n ) c ( n ) c ( n ) ' F ( n ) 1   E13

where c n is the ( n + 1 ) -column vector with elements c n k = γ k + 1 ,   k = 0 , , n and F n = 1 i = n - j i , j = 0 , , n is the n + 1 × ( n + 1 ) ‘mirrored’ identity matrix

F n = 0 0 1 0 1 0 1 0 0 0

Theorem 3.1 (Multivariate Gaussian distribution) Any multivariate Gaussian random vector z can be partitioned into z 1 and z 2 with the mean vector and covariance matrix with the corresponding partition:

μ = μ 1 μ 2   Σ = Σ 11 Σ 12 Σ 21 Σ 22   E14

The distribution of z 1 conditioned on z 2 = a is a multivariate normal z 1 z 2 = a ~ N ( μ ¯ , Σ ¯ ) with

μ ¯ = μ 1 + Σ 12 Σ 22 - 1 a - μ 2 E15
Σ ¯ = Σ 11 - Σ 12 Σ 22 - 1 Σ 21   E16

If we substitute equation (12) into the partition in (14) with Σ 11 = 1 ,   μ = 0 , we have the following expression for the conditional distribution:

μ n + 1 = E ( Z n + 1 | Z n , , Z 0 ) = c ( n ) ' Γ ( n ) 1 ( Z n Z 1 Z 0 ) E17
σ n + 1 2 = V a r Z n + 1 Z n , , Z 0 = 1 - c ( n ) ' Γ n - 1 c ( n )   E18

With Z 0 ~ N ( 0,1 ) , subsequently X n + 1 for n = 0,1 , can be generated.

Taking the inverse of Γ at every step is computational expensive; the algorithm proposed by Hosking [26] computes the inverse Γ n + 1 - 1 recursively. The next result is due to Dieker [25].

Proposition 3.1 (Hosking algorithm for simulating fGn) Define d n = Γ n - 1 c ( n ) , and applying the blockwise method of inversion on equation (13):

Γ n + 1 = 1 σ n 2 σ n 2 Γ n - 1 + F n d n d ( n ) ' F ( n ) - F n d ( n ) - d ( n ) ' F ( n ) 1   E19

where σ n + 1 2 satisfies the recursion

σ n + 1 2 = σ n 2 - γ n + 1 - τ n - 1 2 σ n 2   E20

with τ n d n ' F n c n = c n ' F n d ( n ) . Also, the recursion for d n + 1 = Γ n + 1 - 1 c ( n + 1 ) is obtained as

d n + 1 = d n - ϕ n F n d ( n ) ϕ n

where

ϕ n = γ n + 2 - τ n σ n 2

With μ 1 = γ 1 Z 0 ,   σ 1 2 = 1 - γ 1 2 ,   τ 0 = γ 1 2 , μ n + 1 , σ n + 1 2 , τ n + 1 can be readily computed, and fractional Brownian motion is recovered by the cumulative summation.

This algorithm is also applicable to non-stationary processes (see [27] for details). Even though this algorithm is very simple and easy to understand and sample paths can be generated on-the-fly, the complexity of this algorithm is of O ( N 2 ) and computational (as well as memory) expense of this algorithm grows at a prohibitive speed.

3.2. Cholesky method

Given that we are dealing with the covariance structure in matrix form, it is natural to go with the Cholesky decomposition: decomposing the covariance matrix into the product of a lower triangular matrix and its conjugate-transpose Γ n = L n L n * . If the covariance matrix is proven to be positive-definite (the situation will be addressed in the next subsection), L ( n ) will have real entries and Γ n = L n L n ' .

Suppose that in matrix form the n + 1 × ( n + 1 ) product is given by

γ ( 0 ) γ ( 1 ) γ ( 2 ) γ ( n ) γ ( 1 ) γ ( 0 ) γ ( 1 ) γ ( n - 1 ) γ 2 γ ( 1 ) γ ( 0 ) γ ( n - 2 ) γ ( n ) γ ( n - 1 ) γ ( n - 2 ) γ ( 0 ) = l 00 0 0 0 l 10 l 11 0 0 l 20 l 21 l 22 0 l n 0 l n 1 l n 2 l n n × l 00 l 10 l 20 l n 0 0 l 11 l 21 l n 1 0 0 l 22 l n 2 0 0 0 0 l n n

It is easy to see that l 00 2 = γ 0 and that l 10 l 00 = γ ( 1 ) and l 10 2 + l 11 2 = γ ( 0 ) on i = 1 (2nd row). For i 1 , the entries of the lower triangular matrix can be determined by

l i , 0 = γ i l 0,0

l i , j = 1 l j , j γ i - j - k = 0 j - 1 l i , k l j , k ,   0 < j n

l i , i 2 = γ 0 - k = 0 i - 1 l i , k 2

Given independent, identically distributed (i.i.d.) standard normal random variables V i i = 0 , , n + 1 , the fGn sequence is generated by

Z n + 1 = k = 0 n + 1 l n + 1 , k V k

Or in matrix form, we have Z ( n ) = L ( n ) V ( n ) . If Γ ( n ) is assumed to be positive-definite, the non-negativity of l i , i 2 is guaranteed and L ( n ) is guaranteed to be real. The covariance structure of the process is captured, since

C o v Z ( n ) = C o v L n V n = L n C o v V n L n ' = L n L n ' = Γ n   E21

Even though the Cholesky method is easy to understand and implement, the computation time is O ( N 3 ) , which renders this scheme extremely uneconomical in practice. To resolve this problem, we will proceed to another exact method. The idea is similar to retain the same relation as equation (21), but with a different decomposition.

3.3. Fast fourier transform method

As we have seen from the last section, using the Cholesky decomposition seems to be the most straightforward idea to simulate Gaussian process with a given covariance structure; but, it also is the most rudimentary and thus slow. In order to improve upon the speed, the idea of utilizing the fast Fourier transform (FFT) was proposed by Davies and Harte [28] and further generalized by Dietrich and Newsam [29].

Similar to the idea before, this method tries to find a decomposition of the covariance matrix as Γ = G G ' and the sample is generated by y = G x for given standard normal random variable x. Then, on the given covariance structure, we have

C o v y = C o v G x = G   C o v x G ' = G G ' = Γ

The idea is to 'embed' the original covariance matrix a circulant matrix in order to carry out the FFT. Before we outline the idea, we shall give out some detail of the linkage between Fourier transform and the circulant matrix.

Definition 3.1 (Circulant matrix) Circulant matrix is a special case of the Toeplitz matrix where each row vector is shifted to the right (the last element is shifted back to the beginning of the row). In matrix form, an n-by-n circulant matrix can be written as

C = c 0 c n - 1 c n - 2 c 1 c 1 c 0 c n - 1 c 2 c 2 c 1 c 0 c 3 c n - 1 c n - 2 c 1 c 0

Remark: As one can see, the first row/column completely describes the whole matrix, and it can be put more succinctly in the following form:

c j , k = c j - k m o d   n ,   where   0 j , k n - 1

Note that the indices range from 0 to n-1 instead of the usual convention that ranges from 1 to n.

Definition 3.2 (Generating circulant matrix) We define an n-by-n generating circulant matrix by

G = 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0

By a simple calculation, we can see that the ‘square’ of the generating circulant matrix is given by

G 2 = 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0

From the point of view of row and column operation of the matrix, this can be seen as each row of the matrix being shifted one element forward, where the bumped element is replaced to the end of the row (it can also be thought of as the whole row is shifted down and the bumped row is placed back on top, but this is irrelevant to our interest). Arbitrary power can be deduced accordingly; this operation has a cycle of n iterations.

The generating circulant matrix is served as our building block. Looking back at our original circulant matrix, we have a corresponding polynomial

p x = c 0 + c 1 x + c 2 x 2 + + c n - 1 x n - 1   E22

Then, the original circulant matrix C can be expressed as

C = c 0 c n - 1 c n - 2 c 2 c 1 c 1 c 0 c n - 1 c 3 c 2 c 2 c 1 c 0 c 3 c 3 c 2 c 1 c n - 1 c n - 1 c n - 2 c n - 3 c 1 c 0

C = p G = c 0 1 + c 1 G + c 2 G 2 + + c n - 1 G n - 1   E23

This can be verified by doing the row-operation of arbitrary power on G as shown above. It can be shown that each operation is one-element sub-diagonal compared to the previous power.

Definition 3.3 (Fourier matrix) The Fourier matrix is introduced as

F = 1 1 1 1 1 1 ξ ξ 2 ξ n - 2 ξ n - 1 1 ξ 2 ξ 2 × 2 ξ n - 2 1 ξ 3 ξ 3 × 2 ξ 2 1 ξ n - 1 ξ n - 2 ξ 2 ξ

Here, we define the n-th unity root as ω = e 2 π i 1 n , and ξ = ω ¯ = e - 2 π i 1 n is the conjugate of the unity root.

The Fourier matrix can be defined using the positive argument ω instead of ξ . Also, as we will see later, some definition includes the normalizing scalar 1 n (or 1 n ). This is analogous to the continuous counterpart of the Fourier integral definition F f = - x t e - 2 π i f t d t or - x t e 2 π i f t d t , as long as the duality is uphold by the opposite sign in the exponential power in the inverse Fourier transform. This duality will be restated in the diagonalization representation of the circulant matrix later.

Proposition 3.2 If 1 n normalizes the Fourier matrix, then 1 n F is a unitary matrix. It is symmetric (i.e., F T = F ), and the inverse of the Fourier matrix is given by

F - 1 = n n F - 1 = 1 n 1 n F - 1 = 1 n 1 n F ¯ T = 1 n F ¯ = 1 n 1 1 1 1 1 1 ω ω 2 ω n - 2 ω n - 1 1 ω 2 ω 2 × 2 ω n - 2 1 ω 3 ω 3 × 2 ω 2 1 ω n - 1 ω n - 2 ω 2 ω

Proposition 3.3 If we multiply the Fourier matrix with the generating circulant matrix, we have

F G = 1 1 1 1 1 1 ξ ξ 2 ξ n - 2 ξ n - 1 1 ξ 2 ξ 2 × 2 ξ n - 2 1 ξ 3 ξ 3 × 2 ξ 2 1 ξ n - 1 ξ n - 2 ξ 2 ξ 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 = 1 1 1 1 1 ξ ξ 2 ξ n - 2 ξ n - 1 1 ξ 2 ξ 2 × 2 ξ n - 2 1 ξ 3 ξ 3 × 2 1 ξ 2 1 ξ n - 1 ξ n - 2 ξ 2 ξ 1

This is the same as shifting (rotating) the first column to the back of the matrix, and is also equivalent to multiplying the first row with ξ 0 , the 2nd row with ξ 1 , etc. In matrix operation, it can be seen as

F G = ξ 0 0 0 0 0 0 ξ 1 0 0 0 0 0 ξ 2 0 0 0 0 0 0 0 0 0 ξ n - 1 1 1 1 1 1 1 ξ ξ 2 ξ n - 2 ξ n - 1 1 ξ 2 ξ 2 × 2 ξ n - 2 1 ξ 3 ξ 3 × 2 ξ 2 1 ξ n - 1 ξ n - 2 ξ 2 ξ = Λ F

where Λ is the diagonal matrix with the k-th diagonal Λ k = ξ k ,   f o r   0 k n - 1 . It follows that

F G F - 1 = Λ   E24

That is, the Fourier matrix diagonalizes the generating circulant matrix with eigenvalues ξ k 0 k n - 1 .

Theorem 3.2 The circulant matrix is decomposable by the Fourier matrix, i.e.   C = F - 1 Λ F with eigenvalue matrix Λ = p ξ k k = 0 n - 1 . Also, with equation (23), the diagonalization of C can be written as

F C F ( 1 ) = F ( c 0 1 + c 1 G + c 2 G 2 + + c ( n 1 ) G n 1 ) F 1 = c 0 1 + c 1 ( F G F 1 ) + c 2 ( F G F 1 ) 2 + + c n 1 ( F G F 1 ) n 1 = ( p ( 1 ) 0 0 0 0 0 p ( ξ ) 0 0 0 0 0 p ( ξ 2 ) 0 0 0 0 0 0 0 0 0 p ( ξ n 1 ) )

Note that F G F - 1 2 = F G F - 1 F G F - 1 = F G G F - 1 = F G 2 F - 1 . The other powers can be deduced iteratively.

This theorem gives us the fundamental theoretical framework to build up the FFT exact simulation of fBms. The basic idea of the simulation is to embed the covariance matrix into a bigger circulant matrix to carry out the discrete Fourier transform as outlined above (with technique of FFT). Such technique is called Circulant Embedding Method (CEM), and is outlined in Dietrich and Newsam [29] and Perrin et al. [30].

Suppose that we need sample size of N (N should be a power of 2, i.e. N = 2 g for some g N for the sake of convenience when facilitating FFT). Generate the N-by-N covariance matrix Γ with entries Γ j , k = γ ( j - k ) , where γ is the covariance function given in the definition of fractional Gaussian noise (fGn), by

Γ = γ ( 0 ) γ ( 1 ) γ ( N - 1 ) γ ( 1 ) γ ( 0 ) γ ( N - 2 ) γ ( N - 1 ) γ ( N - 2 ) γ ( 0 )

The technique to simulate fGn with FFT is called the Circulant Embedding Method (CEM), generalized by Davies and Harte [28], and consists of embedding this covariance matrix into a bigger M-by-M (with M = 2N) circulant covariance matrix C such as

C = γ ( 0 ) γ ( 1 ) γ ( N - 1 ) 0 γ ( N - 1 ) γ ( N - 2 ) γ ( 2 ) γ ( 1 ) γ ( 1 ) γ ( 0 ) γ ( N - 2 ) γ ( N - 1 ) 0 γ ( N - 1 ) γ ( 3 ) γ ( 2 ) γ ( N - 1 ) γ ( N - 2 ) γ ( 0 ) γ ( 1 ) γ ( 2 ) γ ( 3 ) γ ( N - 1 ) 0 0 γ ( N - 1 ) γ ( 1 ) γ ( 0 ) γ ( 1 ) γ ( 2 ) γ ( N - 2 ) γ ( N - 1 ) γ ( N - 1 ) 0 γ ( 2 ) γ ( 1 ) γ ( 0 ) γ ( 1 ) γ ( N - 3 ) γ ( N - 2 ) γ ( 1 ) γ ( 2 ) 0 γ ( N - 1 ) γ ( N - 2 ) γ ( N - 3 ) γ ( 1 ) γ ( 0 )

where the covariance matrix is embedded on the top left hand corner. It is sufficient to point out that

C 0 , k = γ k ,   k = 0 , , N - 1   γ 2 N - k ,   k = N + 2 , , 2 N - 1  

Remark: As Perrin et al. [30] have pointed out, the size M can be M 2 ( N - 1 ) , and the case M = 2 ( N - 1 ) is minimal embedding. For any other choice of M , the choice of C 0 , N , ,   C 0 , M - N + 1 is arbitrary and can be conveniently chosen as long as the symmetry of the matrix is upheld; more zeros can be padded if M is bigger to make C circulant. For the rest of the chapter, we will concern ourselves with the case M = 2N.

From Theorem 3.2, we know that, given any circulant matrix, it can be decomposed as C = Q Λ Q * , where

Q j , k = 1 2 N exp - 2 π i j k 2 N ,   f o r   j , k = 0 , , 2 N - 1   E25

The matrix Λ is the diagonal matrix with eigenvalues

λ k = j = 0 2 N - 1 c 0 , j exp 2 π i j k 2 N ,   f o r   j , k = 0 , , 2 N - 1   E26

This differs slightly from the previous definition, but similar to the continuous counterpart; the sign of the exponential power in the Fourier transform is just conventional difference. The approach is identical as long as the duality is maintained. That is, if written in the form of C = Q Λ Q * , the sign of the exponential power of the component in Q and Λ should be opposite. In the case of the previous theorem where C = F - 1 Λ F , it is easy to check that F - 1   and Λ ( ξ ) indeed have the opposite sign in the exponential power.

It should be noted that C is not guaranteed to be positive-definite. Davies and Harte [28] suggest setting zero every negative value that may appear in Λ . In Perrin et al. [30], they prove that the circulant covariance matrix for fGn is always non-negative definite, so the approach is feasible without any modification. The reader is referred to Dietrich and Newsam [29] and Wood and Chan [31] for more detail on dealing with this issue.

Assuming that C is positive definite and symmetric, the eigenvalues are positive and real. The ‘square root’ of C is readily formed, S = Q Λ 1 / 2 Q * , where Λ 1 / 2 is the diagonal matrix with eigenvalues 1 , λ 1 , , λ 2 N - 1 . It is easy to check that S S * = S S ' = C . So, S has the desired properties we look for.

Theorem 3.3 (Simulation of fGn with FFT) The simulation of the sample path of fGn, we are going to simulate   y = S V , consists of the following steps:

  1. Compute the eigenvalues λ k k = 0 , , 2 N - 1 from equation (26) by means of FFT. This will reduce the computational time from O ( N 2 ) to O ( N l o g N ) .

  2. Calculate W = Q * V .

  3. Generate two standard normal random variables W 0 = V 0 1 and W N = V N 1

  4. For 1 j < N , generate two standard normal random variables V j 1 and V j 2 and let

  W j = 1 2   V j 1   + i   V j 2  

W 2 N - j = 1 2   V j 1   - i   V j 2  

  1. Compute Z = Q Λ 1 / 2 W . This can be seen as another Fourier transform of the vector Λ 1 / 2 W :

  2. Z k = 1 2 N j = 0 2 N - 1 λ j W j e x p ( - 2 π i j k 2 N )

  3. It is identical to carry out FFT on the following sequence:

w j = { λ 0 2 N V 0 ( 1 ) j = 0 ; λ j 4 N ( V j ( 1 ) + i V j ( 2 ) ) j = 1 , , N 1 ; λ N 2 N V N ( 1 ) j = N ; λ j 4 N ( V 2 N j ( 1 ) i V 2 N j ( 2 ) ) j = N + 1 , , 2 N 1 ;

Due to the symmetric nature of the sequence, the Fourier sum of { w j }   = Z k k = 0 2 N - 1 is real. The first N samples have the desired covariance structure. But, since the 2nd half of samples (N…2N-1) are not independent of the first N samples, this sample cannot be used.

  1. Recover fBm from the recursive relationship:

  2. B H 0 = 0 ;   B H i = B H i - 1 + Z i - 1   f o r   1 i N

Advertisement

4. Approximate methods

As we have seen in the previous section, the exact methods all take on the covariance structure matrix as starting point and try to reproduce the covariance structure by different decomposition. That can be time and resource consuming, so rather it is preferable to have approximation of the fractional Brownian motion that permits robust simulation.

In this section, we will start with the Mandelbrot representation due to historical reason and move onto several other methods that provide us with better understanding of the process and increasing robustness.

4.1. Mandelbrot representation

Recalling from Section 2.3, fractional Brownian motion permits a stochastic integral representation. To approximate equation (5), it is natural to truncate the lower limit from negative infinity to some point, say at –b:

B ~ H n = C H k = - b 0 n - k H - 1 2 - - k H - 1 2 B 1 k + k = 0 n n - k H - 1 2 B 2 k   E27

Note that the C H is not the same constant term as in equation (5), because one has to re-calculate the normalizing factor due to the truncation. As pointed out in [25], the recommended choice of b is N 3 / 2 . Even though this is a straightforward way to generate fractional Brownian motion, it is rather inefficient. It is included in this section for the sake of completeness.

4.2. Euler hypergeometric integral

fBm permits the stochastic integral form involving the Euler hypergeometric integral:

B H t = 0 t K H t , s d B s   E28

where B ( s ) is the standard Brownian motion and

K H t , s = t - s H - 1 2 Γ H + 1 2 F 2,1 H - 1 2 ; 1 2 - H ; H + 1 2 ; 1 - t s   E29

is the hypergeometric function, which can be readily computed by most mathematical packages. By discretizing (28), at each time index t j , we have

B H t j = n T i = 0 j - 1 t i t i + 1 K H t j , s d s δ B i   E30

where δ B i = T n Δ B i and Δ B i is drawn according to the standard normal distribution. This means that δ B i i = 1 n is the increments of a scaled Brownian motion on [ 0 , T ] with quadratic variation   T . The inner integral can be computed by the Gaussian quadrature efficiently.

4.3. Construction by correlated random walk

This particular algorithm proposed in [32] of constructing fBm relies on the process of correlated random walks and summation over generated paths. This is similar to the generation of ordinary Brownian method through summation of the sample paths of normal random walk, which is related to the central limit theorem.

Definition 4.1 (Correlated Random Walk) For any p [ 0,1 ] , denote X n p as the correlated random walk with persistence index p . It consists of a jump on each time-step with jump size of either +1 or -1 such that:

  • X 0 p = 0 ,   P X 1 p = - 1 = 1 2 , P X 1 p = + 1 = 1 2

  • n 1 , ϵ n p X n p - X n - 1 p which equals either + 1 or - 1

  • n 1 , P ϵ n + 1 p = ϵ n p σ X k p ,   0 k n = p

Theorem 4.1 For any m 1 , n 0 , we have

E [ ϵ m p ϵ m + n p ] = ( 2 p 1 ) n E31

In order to add additional randomness into the correlated random walks, we replace the constant persistence index p with a random variable   μ , and we denote the resulting correlated random walk as X n μ . Or, to put it more formally, denote by P p the law of X p for a given persistence index p . Now, consider a probability measure μ on [ 0,1 ] , which we call the corresponding probability law P μ , the annealed law of the correlated walk associated to μ . Note that P μ 0 1 P p d μ ( p ) .

Proposition 4.1 For all m 1 ,   n 0 , we have

E [ ϵ m μ ϵ m + m μ ] = 0 1 ( 2 p 1 ) n d μ ( p ) E32

The next result is due to Enriquez [32]. The proof is based on Lemma 5.1 of Taqqu [33].

Theorem 4.2 For 1 2 < H < 1 , denote by μ H the probability on [ 1 2 , 1 ] with density 1 - H 2 3 - 2 H 1 - p 1 - 2 H . Let X μ H , i i 1 be a sequence of independent processes with probability law P μ H . Then,

L D lim N L lim N c H X N t μ H , 1 + + X N t μ H , M N H M = B H t   E33

where c H = H 2 H - 1 Γ 3 - 2 H , L stands for the convergence in law, and L D means the convergence in the sense of weak convergence in the Skorohod topology on D [ 0,1 ] , the space of cadlag functions on [ 0,1 ] . Here, [ ] is the floor function and rounds the argument to the closest integer, M is the number of trajectories of correlated random walks and N is number of time steps.

Remark: For H=1/2, there is a similar expression (see [32]). The order of limit in equation (33) is important, because if reversed, the sum would result in 0. Theorem 4.2 is mostly for theoretical construction.

In [32], the above theorem is further simplified from double summations into a single summation by applying Berry-Essen bound: As long as M N is of order O N 2 - 2 H ,

L D lim N c H X N t μ H , 1 + + X N t μ H , M ( N ) N H M ( N ) = B H t   E34

In practice, any probability measure with moment equivalent to 1 n 2 - 2 H L ( n ) , where L is a slowly varying function, will be used. This could be shown by Karamata’s theorem, for which further elaboration is found in [33]. In [32], three families of equivalent measures are provided, and specifically the 2nd family of the measures μ H , k ' k > 0 is most appropriate for simulation purpose: For H > 1 2 , μ H , k ' has the density of 1 - 1 - U 1 k 1 2 - 2 H 2   with the corresponding normalizing factor c H , k ' = c H k . U is a standard uniform random variable. The error given by the Berry-Essen bound for H > 1 2   is given by

0.65 × D H N 1 - H / k M   E35

where D H = 6 2 H - 1 ( H + 1 ) ( 2 H + 1 ) . Here, k serves as a control variable of order k N = o ( N ) , and the error term contains 1 k   which can be seen as a way to restrict error with the price of distortion of the covariance relation in X N , though it is advisable to keep   k 1 . For more discussion on the choice of k, we refer to Section 4 of [32].

Theorem 4.3 (Simulation of fBm with correlated random walk) Simulating fBm with correlated random walk for the case of H > 1 2 consists of the following steps:

  1. Calculate M(N) by the tolerable error level from equation (35). Calculate N T , where is the floor function, and create time-index { t i } : { 1,2 , ,   N T } .

  2. Simulate M independent copies of μ H , k j j = 1 M = 1 - 1 - U 1 k 1 2 - 2 H 2 for M trajectories.

  3. Simulate M copies of correlated random walks:

    • If t i = 1 , ϵ 1 j = 2 * B e r n o u l l i 1 2 - 1 ,   X 1 j = ϵ 1 j

    • If t i > 1 , ϵ t j j = ϵ t j - 1 j * 2 * B e r n o u l l i μ H , k j - 1 ,   X t j j = X t j - 1 j + ϵ t j j

  4. At each t j , calculate

    • B H t j = c H ' X N t j μ H , 1 + + X N t j μ H , M N H M

Remark: For any given time-horizon T, it is easier to simulate the path of B H 1 with given resolution N and scale it to arrive at B H T = T H B H ( 1 ) .

This algorithm is interesting from a theoretical point of view, since it gives us a construction of fractional Brownian motion reflecting its ordinary Brownian motion counterpart with the help of central limit theorem. But, in practice, it might not be fast enough for simulation purpose that requires large number of simulated paths.

4.4. Conditional random midpoint displacement

This algorithm is put forth by Norros et al. [34] and uses the similar approach to compute the conditional distribution of the fractional Gaussian noises as we have seen in the Hosking algorithm in Section 3.1. The difference is that this algorithm does not completely capture the covariance of all the sample points, instead it chooses certain number of points of the generated samples and uses a different ordering to do conditioning on (recall that, in the Hosking method, the conditioning is done by chronological order).

Again we are interested in the stationary fractional Gaussian noises and will back out the fractional Brownian motion on the time interval [ 0,1 ] , which can be scaled back according to self-similarity relationship. We will first outline the idea of bisection method, which will be expanded into the conditional mid-point replacement scheme later on.

4.4.1. Bisection scheme and basic algorithm

In this section we adopt the notation in Norros et al. [34]: Z ( t ) is the fractional Brownian motion, and X i , j is the fractional Gaussian noise of a certain interval j in a given level i .

The idea is to simulate Z t on the interval [ 0,1 ] . First, given Z 0 = 0 and Z ( 1 ) with the standard normal distribution of N ( 0,1 ) , we compute the conditional distribution of Z 1 2 Z 0 , Z 1 ~ N 1 2 Z 1 , 2 - 2 H - 1 4 . The bisection involves the indices i and j, where i indicates the ‘level’ and j for the ‘position. Let

X i , j = Z j 2 - i - Z j - 1 2 - i , for i = 0,1 , 2 , . ,   j = 1 , 2 i

It is easy to see that, at any given level i , the interval [0,1] is divided into 2 i sub-intervals.

If we denote ( i - 1 ) th level as the ‘mother-level’, it will be divided twice finer in the next level. So given any interval on the mother level, it is easy to observe the relationship

X i , 2 j - 1 + X i , 2 j = X i - 1 , j   E36

Because of equation (36), it is enough to just generate X i , j for odd number j . So, let us proceed from left to right, assuming that the sample points X i , 1 , X i , 2 k   have already been generated k 0,1 , , 2 i - 1 - 1 . For the point X i , 2 k + 1 , we have

X i , 2 k + 1 = e i , k X i , max 2 k - m + 1,1 , , X i , 2 k , X i - 1 , k + 1 , , X i - 1 , min k + n , 2 i - 1 ' + v i , k U i , k   E37

where U i , k are i.i.d. standard Gaussian random variables i = 0,1 , ;   k = 0,1 , , 2 i - 1 - 1 . Equation (37) can be rewritten as

X i , 2 k + 1 = e ( i , k ) [ X i , max ( 2 k m + 1 , 1 ) , , X i , 2 k , X i 1 , k + 1 , , X i 1 , min ( k + n , 2 i 1 ) ] ' + v ( i , k ) U i , k = E [ X i , 2 k + 1 | X i , max ( 2 k m + 1 , 1 ) , , X i , 2 k , X i 1 , k + 1 , , X i 1 , min ( k + n , 2 i 1 ) ] E38
v i , k = V a r X i , 2 k + 1 X i , max 2 k - m + 1,1 , , X i , 2 k , X i - 1 , k + 1 , , X i - 1 , min k + n , 2 i - 1   E39

As mentioned before, this scheme conditions on a fixed number of past samples instead of the whole past, where the two integers ( m 0 ,   n 1 ) indicate the number of the intervals the expectation and variance are conditioned on, and is called RMD(m,n). Looking at ( 38 ) and ( 39 ) X i , max 2 k - m + 1,1 , , X i , 2 k ” indicates that there are at most m neighboring increments to the left of the interval in question ( X i , 2 k + 1 ) , and “ X i - 1 , k + 1 , , X i - 1 , min k + n , 2 i - 1 ” indicates that there are at most n neighboring increments to the right of the interval.

Denote by Γ i k the covariance matrix with X i , 2 k + 1 as the first entry, and X i , max 2 k - m + 1,1 , , X i , 2 k , X i - 1 , k + 1 , , X i - 1 , min k + n , 2 i - 1 as the rest of the entries. Then, we have

Γ i k = C o v X i , 2 k + 1 , X i , max 2 k - m + 1,1 , , X i , 2 k , X i - 1 , k + 1 , , X i - 1 , min k + n , 2 i - 1

where

C o v x 1 , x 2 , , x n = C o v ( x 1 , x 1 ) C o v ( x 1 , x 2 ) C o v ( x 1 , x n ) C o v ( x 2 , x 1 ) C o v ( x 2 , x 2 ) C o v ( x 2 , x n ) C o v ( x n , x 1 ) C o v ( x n , x 2 ) C o v ( x n , x n )

Similar to (14)-(16), we can partition Γ i k as

Γ i k = V a r ( X i , 2 k + 1 ) Γ i k ( 1,2 ) Γ i k ( 2,1 ) Γ i k ( 2,2 )

Note that Γ i k ( 1,2 ) = Γ i k 2,1 ' . Hence, we have

e i , k = Γ i k 1,2 Γ i k 2,2 - 1 E40
v i , k = Γ i k Γ i k ( 2,2 )   E41

By the stationarity of the increment of Z and by self-similarity, e i , k is independent of i and k when 2 k m and k 2 i - 1 - n (meaning that the sequence is not truncated by max and min ), and it only depends on i only when 2 i < m + 2 n .

4.4.2. On-the-fly RMD(m,n) generation

Norros et al. [34] further propose that, instead of having the previous level (i-1) completely generated first, partition and conditioning can be done ‘on-the-fly’, meaning that we can have multiple unfinished levels going at the same time. Unlike the previous RMD(m,n) scheme, the level here is defined differently.

First define the ‘resolution’ by δ , as the smallest interval that we will be dealing with in this scheme. Note that this is different from the previous sub-section where at the i-th level δ = 2 - i , which can be bisected finer indefinitely. In the on-the-fly RMD scheme, the minimum interval length is defined as δ .

At each level i , the interval [ 0,2 i δ ] is split finely into interval with length of δ and X i , k samples are generated on each point until all points within the interval are filled. Then, the trace is expanded to the next level i + 1 , the interval [ 0,2 i + 1 ] , and this procedure can be considered as ‘enlargement’. So, instead of having a pre-determined time-horizon and zooming in with twice-finer resolution in the original RMD scheme, the new RMD scheme has a pre-determined resolution and expand twice-fold the horizon at each level.

Within the same interval [ 0,2 i ] , the following rules are applied for the inner intervals:

  1. Must have n-1 (or all) right-neighboring intervals (of which, have the same length as the mother-interval).

  2. Also, there should be m (or all) left-neighboring intervals (of which, have the same length as the interval being considered).

  3. If there are intervals that satisfy both the conditions above, choose the one as left as possible.

  4. When all intervals are filled out, expand to the next level by ‘enlargement’.

Here we use

Y i , j = Z j 2 i δ - Z j - 1 2 i δ ,   i = 0,1 , ,   j = 1,2 ,   E42

instead of X i , j to avoid confusion with the original RMD scheme notation.

Similar to the ordinary RMD, we have the following equations from the conditional distribution of multivariate Gaussian processes: The enlargement stage is defined by

Y i , 1 = E [ Y i , 1 | Y i 1 , 1 ] + V a r [ Y i , 1 | Y i 1 , 1 ] U i , 1 = e ( i , 1 ) Y i 1 , 1 + v ( i , 1 ) U i , 1 E43

Inner interval points are constructed similar to the ordinary RMD scheme (the right-neighboring intervals are of level i+1 instead of i-1) as

Y i , 2 k + 1 = e i , k Y i , max 2 k - m + 1,1 , , Y i , 2 k , Y i + 1 , k + 1 , , Y i + 1 , min k + n , N i + 1 ' + v i , k U i , k   E44

where N i + 1 is the last generated increment on the (i+1)-th level. The order of splitting for the on-the-fly scheme is given in Figure 1, where its ordinary RMD counterpart’s splitting order is also given.

Norros et al. [34] have done an extensive comparison between on-the-fly RMD schemes in terms of accuracy and robustness compared to the FFT and aggregate methods. On-the-fly RMD and FFT are significantly faster than the aggregate method, and on-the-fly RMD can generate samples with no fixed time-horizon, while for FFT the whole trace has to be generated before it can be used. So, RMD seems superior in terms of flexibility.

4.5. Spectral method

In this subsection, we will look into the spectral method of approximating the fractional Gaussian noises, which has the origin from spectral analysis in physics: A time-domain can be transformed into a frequency-domain without loss of information through Fourier transform. With the typical Fourier-time series, the original input is deterministic and transformed into the spectral density that represents the magnitude of different frequencies in the frequency domain. It is possible to extend this approach to analyzing stochastic processes. Though it is impossible to study all realization, it is possible to analyze in a probabilistic/distribution sense by observing the expected frequency information contained in the autocovariance function.

Figure 1.

The order of splits by the ordinary RMD(m,n) scheme (top) and the on-the-fly RMD(m,n) scheme (bottom). Note that, for the on-the-fly scheme, the order changes according to the choice of n.

Spectral density is computed for frequencies, - π λ π , as

f λ = j = - γ j exp i j λ   E45

The γ ( ) here is the autocovariance function, which can be recovered by the inverse formula

γ j = 1 2 π - π π f ( λ ) e x p ( - i j λ )   E46

The spectral density of the fGn can be approximated according to [25] and [35] as

f λ = 2 sin π H Γ 2 H + 1 1 - c o s λ λ - 2 H - 1 + B λ , H   E47

where

B λ , H = j = 1 2 π j + λ - 2 H - 1 + 2 π j - λ - 2 H - 1  

Note that the domain is only - π λ π , since any frequency higher would only correspond to amplitude between our desired sample points.

The problem with the above expression is that there is no known form for B ( λ , H ) , Paxson [35] proposes the following scheme for B ( λ , H ) :

B