## 1. Introduction

One challenge posed by financial markets is to correctly forecast the volatility of financial securities, which is a crucial variable in trading and risk management of derivative securities. Dynamic hedging is very sensitive to volatility forecast and good hedges require accurate estimate of volatility. Implied volatilities, generated from option markets, can be particularly useful in such contents as they are forward-looking measures of the market's expected volatility during the remaining life of an option [1, 2]. Since there is no explicit formula available to compute directly the implied volatility, the latter can be obtained by inverting the option pricing model. On the contrary, the genetic programming offers explicit formulas which can compute directly the implied volatility. This volatility forecasting method should be free of strong assumptions regarding underlying price dynamics and more flexible than parametric methods. This paper proposes a non parametric approach based on genetic programming to improve the accuracy of the implied volatility forecast and consequently the dynamic hedging.

Genetic Programming [3] is an optimization technique which extends the basic genetic algorithms [4] to process non-linear problem structure. In genetic programming, solutions are represented as tree structures that can vary in size and shape, rather than fixed length character strings as in genetic algorithms. This means that genetic programming can be used to perform optimization at a structural level. In the standard genetic programming, the entire population of function-trees is evaluated against the entire training data set, so the number of function-tree evaluations carried out per generation is directly proportional to both the population size and the size of the training set. Genetic programming can encounter the problem of managing training sets which are too large to fit into the memory of computers, and then the realization of predictors. In machine learning, the practiced solution to learn large data set is the application of resampling techniques, such as, bagging [5], boosting [6] and arcing [7]. However, these techniques require that the entire data sets be stored in the main memory. When applied to large data sets, this approach could be impractical. In this paper, we proposed to split data into smaller subsets. First, the genetic programming is run separately on all training sub-samples. Such approach is called static training-subset selection method [8]; it might provide local solutions not adaptive to the entire enlarged data set. Alternatively, a dynamic training approach is developed. It allows genetic programming to learn simultaneously on all training sub-samples and it implies a new parameter added to the basic genetic programming algorithm which is the number of generations to change sample. This approach lightens the training task for the genetic programming and favors the discovery of solutions that are more robust across different learning data samples and seem to have better generalization ability. Comparison between generated models using static and dynamic selection methods reveals that, the dynamic approach improves the forecasting performance of the generated models using genetic programming. The best forecasting implied volatility models are selected according to total MSE criterion. They are used to compute hedge factors and implement dynamic hedging strategies. According to the average hedging errors, the genetic programming presented accurate hedging performance compared to that of Black-Scholes model.

The rest of the paper is organized as follows: section 2 provides background information regarding related works in forecasting volatility and dynamic hedging, section 3 describes research design and methodology used in this paper, section 4 reports experimental results and finally section 5 concludes.

## 2. Related works

Traditional parametric methods have limited success in estimating and forecasting volatility as they are dependent on restrictive assumptions and difficult to estimate. Several machine learning techniques have been recently used to overcome these difficulties such as artificial neural networks and evolutionary computation algorithms. In particular, genetic programming has been successfully applied to forecast financial time series [9,10].

This paper makes an initial attempt to test whether the hedger can benefit more by using generated genetic programming implied volatilities instead of Black-Scholes implied volatilities in conducting dynamic hedging strategies.

Changes in asset prices is not the only risk faced by market participants, instantaneous changes in market implied volatility can also bring a hedging portfolio significantly out of balance. Extensive research during the last two decades has demonstrated that the volatility of stocks is not constant over time [11]. The Autoregressive Conditional Heteroskedasticiy (ARCH ) and the Generalized ARCH (GARCH) models are introduced [12,13] to describe the evolution of the asset price’s volatility in discrete time. Econometric tests of these models clearly reject the hypothesis of constant volatility and find evidence of volatility clustering over time. In the financial literature, stochastic volatility models have been proposed to model these effects in a continuous-time setting [14-17]. Although these models improve the benchmark Black-Scholes model, they are complex because they require strong assumptions and computational effort to estimate parameters and stochastic process. As mentioned in [18], traditional financial engineering methods based on parametric models such as the GARCH model family, seem to have difficulty to improve the accuracy in volatility forecasting due to their rigid as well as linear structure. Using its basic and flexible tree-structured representation, genetic programming is capable of solving non-linear problems. In the context of forecasting volatility, most of research papers have focused on forecasting historical volatility based on past returns in different markets. Using historical returns of Nikkei 225 and S&P500 indices, Chen and Yeh [19] have applied a recursive genetic programming approach to estimate volatility by simultaneously detecting and adapting to structural changes. Results have shown that the recursive genetic programming is a promising tool for the study of structural changes. Using high frequency foreign exchange USD-CHF and USD-JPY time series, Zumbach et al. [20] have compared the genetic programming forecasting accuracy to that of historical volatilities, the GARCH (1,1), FIGARCH and HARCH models. According to the root-mean squared errors, the generated genetic programming volatility models did consistently outperform the benchmarks. Similarly, Neely and Weller [21] have tested the forecasting performance of genetic programming for USD-DEM and USD-YEN daily exchange rates against that of GARCH (1,1) model and a related RiskMetrics volatility forecast over different time horizons, using various accuracy criteria. While the genetic programming rules did not usually match the GARCH (1,1) or RiskMetrics models' MSE or

Option contracts prices are affected by new information and changes in expectations as much as they are by changes in the value of the underlying index. If traders have perfect foresight on forward volatility, then dynamic hedging would be essentially riskless. In practice, continuous hedging is impossible, but the convexity of option contract allows for adjustments in the exposure to higher-order sensitivities of the model, such as gamma, vega, etc. Most of the existing literature on hedging a target contract using other exchange-traded options focuses on static strategies, motivated at least in part by the desire to avoid the high costs of frequent trading. The goal of static hedging is to construct a buy-and-hold portfolio of exchange traded claims that perfectly replicates the payoff of a given over-the-counter product [24,25]. The static hedging strategy does not require any rebalancing and therefore, it does not incur significant transaction costs. Unfortunately, the odds of coming up with a perfect static hedge for a given over-the-counter claim are small, given the limited number of exchange listed option contracts with sufficient trading volume. In other words, the static hedge can only be efficient if traded options are available with sufficiently similar maturity and moneyness as the over-the-counter product that has to be hedged. Under a stochastic volatility, a perfect hedge can in principle be constructed with a dynamically rebalanced portfolio consisting of the underlying and one additional option. In practice, the dynamic replication strategy for European options will only be perfect if all of the assumptions underlying the Black-Scholes formula hold. For general contingent claims on a stock, under market frictions, the delta might still be used as first-order approximation to set up a riskless portfolio. However, if the volatility of the underlying stock varies stochastically, then the delta hedging method might fail severely. A simple method to limit the volatility risk is to consider the volatility sensitivity vega of the contract. The portfolio will have to be rebalanced frequently to ensure delta-vega neutrality. With transaction costs, frequent rebalancing might result in considerable losses. In practice, investors can rebalance their portfolios only at discrete intervals of time to reduce transactions costs.

Non parametric hedging strategies as an alternative to the existing parametric model based- strategies, have been proposed [26,27]. Those studies estimated pricing formulas by nonparametric or semi-parametric statistical methods such as neural networks and kernel regression, and they measured their performance in terms of delta-hedging. Few researches have focused on the dynamic hedging using genetic programming, however. Chen et al. [28] have applied genetic programming to price and hedge S&P500 index options. By distinguishing the case in-the-money from the case out-of-the-money, the performance of genetic programming is compared with the Black-Scholes model in terms of hedging accuracy. Based on the post-sample performance, it is found that in approximately 20% of the 97 test paths, genetic programming has lower tracking error than the Black-Scholes formula.

Based on the literature survey, one can conclude that the genetic programming could be used to efficiently forecast volatility and implement accurate dynamic hedging strategies, which opens up an alternative path besides other data-based approaches.

## 3. Research design and methodology

Accurate volatility forecasting is an essential element in conducting good dynamic hedging strategies. The first thrust of this paper deals with generation of implied volatility from option markets using static and dynamic training of genetic programming, respectively. While the static training [8] is characterized by training the genetic programming independently on a single Sub-sample, the dynamic training allows the genetic programming to train on the entire data sub samples simultaneously rather than just a single subset by changing the training Sub-sample during the run process. This permits to improve the robustness of genetic programming to generate general models adaptive to all training samples. The second thrust of this paper is to study the accuracy of the generated genetic programming implied volatility models in terms of dynamic hedging. Since the true volatility is unobservable, it is impossible to assess the accuracy of any particular model; forecasts can only be related to realized volatility. In this paper, we assume that the implied volatility is a reasonable proxy for realized volatility, to generate forecasting implied volatility models using genetic programming and then to analyze the implications of this predictability for hedging purposes.

Figure 1 illustrates the operational procedure to implement the proposed approach.

The operational procedure consists of the following steps: The first step is devoted for the data division schemes. The second step deals with the implementation of genetic programming[1] - (GP), the application of training subset selection methods and the selection of the best forecasting implied volatility models. The last step is dedicated to dynamic hedging results.

### 3.1. Data division schemes

Data used in this study consist of daily prices for the European-style S&P 500 index calls and puts options traded on the Chicago Board of Options Exchange from 02 January to 29 August 2003. The data base include the time of the quote, the expiration date, the exercise price and the daily bid and ask quotes for call and put options. Similar information for the underlying S&P 500 index is also available on a daily basis. S&P500 index options are among the most actively traded financial derivatives in the world. The minimum tick for series trading below 3 is 1/16 and for all other series 1/8. Strike price intervals are 5 points, and 25 points for far months. The expiration months are three near term months followed by three additional months from the March quarterly cycle (March, June, September, and December). Following a standard practice, we used the average of an option’s bid and ask price as a stand-in for the market value of the option. The risk free interest rate is approximated by using 3 month US Treasury bill rates. It is assumed that there are no transaction costs and no dividend.

To reduce the likelihood of errors, data screening procedures are used [29,30]. We apply four exclusion filters to construct the final option sample. First, as implied volatilities of short-term options are very sensitive to small errors in the option price and may convey liquidity-related biases, options with time to maturity less than 10 days are excluded. Second, options with low quotes are eliminated to mitigate the impact of price discreteness on option valuation. Third, deep-in-the-money and deep-out-of-the money option prices are also excluded due to the lack of trading volume. Finally, option prices not satisfying the arbitrage restriction [31],

The final sample contains 6670 daily option quotes, with at-the-money (ATM), in-the-money (ITM) and out-of-the money (OTM) options respectively taking up 37%, 34% and 29% of the total sample.

In this paper, two data division schemes are used. The full sample is sorted first, by time series (TS) and second by moneyness-time to maturity (MTM). For time series, data are divided into 10 successive samples (S_{1}, S_{2}…S_{10}), each contains 667 daily observations. The first nine samples are used as training sub-samples. For moneyness-time to maturity, data are divided into nine classes with respect to moneyness and time to maturity criteria. According to moneyness criterion: A call option is said out-of-the money (OTM) if_{i} is divided on training set C_{i}^{L} and test set C_{i}^{T}, which produces respectively nine training and nine test MTM sub-classes. Figure 2 illustrates the two division schemes.

### 3.2. Implied volatility forecasting using genetic programming:

This subsection describes the design of genetic programming and the experiments accomplished using the genetic programming method to forecast implied volatility. In the first experiment, the genetic programming is trained using static training-subset selection method; in the second one, we used dynamic training-subset selection methods. We describe training and test samples used in these experiments.

#### 3.2.1. The design of genetic programming:

Our genetic programming software is referred to as symbolic regression written in C++ language. It is designed to find a function that relates a set of inputs to an output without making any assumptions about the structure of that function. Symbolic regression was one of the earliest applications of genetic programming [3], and has continued to be widely studied [32-35]. The following pseudo code describes the genetic programming's algorithm structure used in this paper.

The genetic programming’s algorithm structure consists of the following steps: nodes definition, initialization, fitness evaluation, selection, genetic operators (crossover and mutation) and termination condition.

*Nodes Definition*: The nodes in the tree structure of genetic programming can be classified into terminal (leaf) nodes and function (non-terminal) nodes. The terminal and function sets used are described in Table 1.

The terminal set includes the inputs variables, notably, the option price divided by strike price (

Individuals are encoded as LISP S-expressions which can also be depicted as a parse tree. The search space for genetic programming is the space of all possible parse trees that can be recursively created from the terminal and function sets.

Initialization: The generated genetic programming volatility models are performed using a ramped half and half as initialization method [3]. This method involves generating an equal number of trees using a maximum initial depth that ranges from 2 to 6, as specified in Table 2. For each level of depth, 50% of the initial trees are generated via the full method and the other 50% are generated via the grow method. In the full method, the initial trees have the property that every path from root to endpoint is of maximum depth. In the grow method, initial trees can be of various depths, subject to the constraint that they do not exceed the maximum depth.

*Fitness function:* The fitness function assigned to a particular individual in the population must reflect how closely the output of an individual program comes to the target function. In this paper, the Black-Scholes implied volatility

The generated genetic programming trees provide at each time t the forecast value

Where, N is the number of data sample.

*Selection:* Based on the fitness criterion, the selection of the individuals for reproduction is done with the tournament selection algorithm. A group of individuals is selected from the population with a uniform random probability distribution. The fitness values of each member of this group are compared and the actual best is selected. The size of the group is given by the tournament size which is equal to 4, as indicated in Table 2.

*Genetic operators*: Crossover and mutation are the two basic operators which are applied to the selected individuals in order to generate new individuals for the next generation. As described in Figure 4, the subtree crossover creates new offspring trees from two selected parents by exchanging their sub-trees. As indicated in Table 2, the crossover operator is used to generate about 60% of the individuals in the population. The maximum tree size (measured by depth) allowed after the crossover is 17. This is a popular number used to limit the size of tree [3]. It is large enough to accommodate complicated formulas and works in practice.

The mutation operator randomly changes a tree by randomly altering nodes or sub-trees to create a new offspring. Often multiple types of mutation are beneficially used simultaneously [37,38]. In this paper, three mutation operators are used simultaneously, they are described below:

Branch (or subtree) mutation operator randomly selects an internal node in the tree, and then it replaces the subtree rooted at that node with a new randomly-generated subtree [3].

Point mutation operator consists of replacing a single node in a tree with another randomly-generated node of the same arity [39].

Expansion mutation operator randomly selects a terminal node in the tree, and then replaces it with a new randomly-generated subtree.

As indicated in Table 2, Branch mutation is applied with a rate of 20%; Point and Expansion mutations are applied with a rate of 10%, respectively.

*Replacement:* The method of replacing parents for the next generation is comma replacement strategy [40], which selects the best offspring to replace the parents. It assumes that offspring size is higher than parents' size. If µ is the population size and λ is the number of the new individuals (which can be larger than µ), the population is constructed using the best µ out of the λ new individuals.

*Termination criterion*: The stopping criterion is the maximum number of generations. It is fixed at 400 and 1000 for static and dynamic training- subset selection, respectively. In the dynamic training- subset selection approach, the maximum number of generations is increased to allow the genetic programming to train on the maximum of samples simultaneously. The number of generations to change sample varied between 20 and 100 generations.

The implementation of genetic programming involves a series of trial and error experiments to determine the optimal set of genetic parameters which is listed in Table 2. By varying genetic parameters, each program is run ten times with ten different random seeds. The choice of the best genetic program is made according to the mean and median of Mean Squared Errors (MSE) for training and testing sets.

#### 3.2.2. Dynamic training-subset selection method

As data are divided in several sub-samples, the genetic programming is trained, first, independently on each sub-sample relative to each data division scheme (algorithm 1). This approach is called static training-subset selection method [8]. Second, the genetic programming is trained simultaneously on the entire data sub-samples relative to each data division scheme, rather than just a single subset by changing the training sub-sample during the run process. This approach is called dynamic training-subset selection method. The main goal of this method is to make genetic programming adaptive to all training samples and able to generate general models and solutions that are more robust across different learning data samples. In the context of evolutionary algorithms, there are at least three approaches for defining the frequency of resampling [41]. The first approach called “individual-wise” consists of extracting a new sample of data instances from the training set for each individual of the population. As a result, different individuals will probably be evaluated on different data samples, which cast some doubts on the fairness of the selection procedure of the evolutionary algorithm. The second approach called “run-wise” consists of extracting a single fixed sample of data instances from the training set used to evaluate the fitness of all individuals throughout the evolutionary run, which will probably reduce significantly the robustness and predictive accuracy of the evolutionary algorithm. The third approach called “generation-wise” consists of extracting a single fixed sample of data instances from the training set at each generation, and all individuals of that generation will have their fitness evaluated on that data sample. This method avoids the disadvantages of the two previous approaches, and as such seems more effective. In particular, an individual will only survive for several generations if it has a good predictive accuracy across different data samples. The dynamic approach proposed in this study differs from the three previous approaches as it doesn’t extract a fixed sample of data instances from the training set, but selects it from the whole sub-samples data which are already built up and use it to evaluate the fitness of all individuals when the generations’ number to change sample is reached. In this paper, we proposed four dynamic training-subset selection methods: *Random Subset Selection* method (*RSS*), *Sequential Subset Selection* method (*SSS*), *Adaptive-Sequential Subset Selection* method (ASSS) and *Adaptive-Random Subset Selection* method (*ARSS*). The *RSS* and *SSS* allow the genetic programming to learn on all training samples in turn (*SSS*) or randomly (*RSS*). However, with these methods, there is no certainty that genetic programming will focus on the samples which are difficult to learn. Then, the *ASSS* and the *ARSS,* which are variants of* the adaptive subset selection (ASS),* are introduced to focus the genetic programming’s attention onto the difficult samples i.e. having the greatest MSE and then to improve the learning algorithm.

Dynamic subset selection is easily added to the basic GP algorithm with no additional computational cost according to the static subset selection.

Let S be the set of training samples S_{i (i=1…k)}, where k is the total number of samples. A selection probability P (S_{i}) is allocated to each sample S_{i} from S. The training sample *S _{i}* is changed each g generations (g is the number of generations to change sample) according to this selection probability and the dynamic training-subset selection method used. Once a new training sample is selected, the best individuals are used as population for the next training samples. This procedure is repeated until the maximum number of generations is reached. This permits genetic programming to adapt its generating process to changing data in response to feedback from the fitness function which is the mean squared error computed as in static approach. By the end of the evolution, only individuals with the desirable characteristics that are well adapted to the environmental changes will survive.

It selects randomly the training samples with replacement. At each g generations, all the samples from S have the same probability to be selected as the current training sample: P (S_{i}) =1/k, 1≤ i ≤ k. This method differs from that proposed by Gathercole and Ross [42] as random selection concerns training samples which are already constructed according to data division scheme, rather than data instances.

As selection of training samples is random, the performance of the current population changes with the training sample used for evolving the genetic program. Figure 8 illustrates an example of the best fitness (MSE) curve along evolution using RSS method. With the sample change, the MSE may increase, but it is improved during the following generations, the time that the population adapts itself to the new environment.

Figure 8 shows that some training samples could be duplicated, but some others could be eliminated.

It selects all the training samples in the order. If, at generation g-1, the current training sample is S_{i,} then at generation g: P (S_{j}) = 1, with j= i+1 if i<k, or j=1 if i=k.

As illustrated in Figure 9, all the learning subsets are used during the evolution in an iterative way.

Instead of selecting a training subset data in a random or sequential way, one can use an adaptive approach to dynamically select difficult training subsets data which are frequently misclassified. This approach is inspired from the dynamic subset selection method proposed by Gathercole and Ross [42] which is based on the idea of dynamically selecting instances, not training samples, which are difficult and/or have not been selected for several generations. Selection is made according to a weight computed proportionally to the sample's average fitness. Each g generations, the weights are updated as follows:

Where, M is the size of S_{i} (

At each g generations, training samples are re-ordered, so that the most difficult training samples, which have higher weights, will be moved to the beginning of the ordered training list, and the easiest training samples, which have smaller weights, will be moved to the end of the ordered training list.

1. Adaptive-Sequential training-Subset Selection method (ASSS):

It uses the following procedure (step 1 to step 3):

Step 1. Let the first generation t be set to 0. Each training sample is assigned an equal weight, i.e., W(S_{i}) = 1 for 1≤ i ≤ k.

Step 2. The probability P (S_{i}) that a training sample S_{i} is selected to be included in the training set and evolve genetic programming is determined using the Roulette wheel selection scheme.

Where, the summation is over all training samples.

Moreover, the probability P (S_{i}) is positively related to the fitness of the parse tree generated relative to the corresponding training sample.

Where,

Compute a fitness function which is the mean squared error for each individual in the training sample and then the average fitness. Update the weights:

Step 3. t=t+g. If t<T (T is the total number of generations), then go to step 2.

As illustrated in Figure 10, selection of training samples is made in the order for the first t generations using the SSS method. Some training samples could be duplicated to improve the genetic programming learning. Later, samples are selected for the next run according to the adaptive approach based on the re-ordering procedure.

2. Adaptive-Random training-Subset Selection method (ARSS):

The ARSS method uses the same procedure as the ASSS method, except that the initial weights are generated randomly in the start of running, rather than initialized with a constant: For t=0,

#### 3.2.2. Training and test samples

Different forecasting genetic programming volatility models are estimated from the training set and judged upon their performance on the test set. Table 3 summarizes the training and test data samples used for static and dynamic training-subset selection methods, respectively.

In static training-subset selection approach, first, the genetic program is trained separately on each of the first nine TS sub-samples (S_{1},…, S_{9}) using ten different seeds and is tested on the subset data from the immediately following date (S_{2},…, S_{10}). Second, using the same genetic parameters and random seeds applied for TS data, the genetic programming is trained separately on each of the first nine MTM sub-classes (C_{1}^{L},…, C_{9}^{L}) and is tested on the second nine MTM sub-classes (C_{1}^{T},…, C_{9}^{T}).

In dynamic training-subset selection approach, first, the genetic program is trained on the first nine TS sub-samples simultaneously (S_{1},…, S_{9}) using ten different seeds and it is tested only on the tenth sub-sample data (S_{10}). Second, the genetic programming is trained on the first nine MTM sub-classes simultaneously (C_{1}^{L},…, C_{9}^{L}) and it is tested on the second nine MTM sub-classes regrouped in one test sample data (C_{1}^{T} + C_{2}^{T} …+ C_{9}^{T}). Third, the genetic programming is trained on both the nine TS sub-samples and the nine MTM sub-classes simultaneously (S_{1}, …, S_{9} ; C_{1}^{L}, …, C_{9}^{L} ) and it is tested on one test sample data composed of the TS and MTM test data (S_{10} + C_{1}^{T} + C_{2}^{T} …+ C_{9}^{T}).

Based on the training and test MSE, the best generated genetic programming volatility models relative to static and dynamic training-subset selection methods respectively are selected. These models are then compared with each other according to the MSE total and the best ones are used to implement the dynamic hedging strategies as described in the following section.

### 3.3. Dynamic hedging

To assess the accuracy of selected generated genetic programming volatility models in hedging with respect to Black-Scholes model, three dynamic hedging strategies are employed, notably, delta-neutral, delta-gamma neutral and delta-vega neutral strategies.

For delta hedging, at date zero, a delta hedge portfolio consisting of a short position in one call (or put) option and a long (short) position in the underlying index is formed. At any time t, the value of the delta hedge portfolio

Where,

The portfolio is assumed self-financed, so the initial value of the hedge portfolio at the beginning of the hedge horizon is zero:

A dynamic trading strategy is performed in underlying and bond to hedge the option during the hedge horizon. The portfolio rebalancing takes place at intervals of length

The delta hedge error is defined as the absolute value of the delta hedge portfolio at the end of the hedge horizon of the option,

For delta-gamma hedging, a new position in a traded option is required. Then, the delta-gamma hedge portfolio is formed with:

Where,

Where, the values of

At the beginning of the hedge horizon, the value of the hedge portfolio is zero:

At each rebalancing time

The delta-gamma hedge error is defined as the absolute value of the delta-gamma hedge portfolio at the end of the hedge horizon of the option,

For delta-vega hedging, a new position in a traded option is required as in the delta-gamma hedging. The proportions of the underlying

Where,

As in delta-gamma hedging, at each rebalancing time

35 option contracts are used as hedging options and 35 other contracts which depend on the same underlying, with the same maturity but different strike prices are used as additional options. Contracts used to implement the hedging strategies are divided according to moneyness and time to maturity criteria, which produces nine classes.

The delta, gamma and vega hedge factors are computed using the Black-Scholes formula by taking the derivative of the option value with respect to index price, the derivative of delta with respect to index price and the derivative of the option value with respect to volatility respectively. For the genetic programming models, the hedge ratios are computed using the same formulas replacing the Black-Scholes implied volatilities with the generated genetic programming volatilities. Two rebalancing frequencies are considered: 1-day and 7 days revision.

The average hedging error is used as performance measure. For a particular moneyness- time to maturity class, the tracking error is given by:

Where, n is the number of options corresponding to a particular moneyness-time to maturity class and

## 4. Result analysis and empirical findings

### 4.1. Selection of the best genetic programming-implied volatility forecasting models

Selection of the best generated genetic programming volatility model, relative to each training set, for TS, MTM, and both TS and MTM classifications, is made according to the training and test MSE. For static training-subset selection method, nine generated genetic programming volatility models are selected for TS (M1S1…M9S9) and similarly nine generated genetic programming volatility models are selected for MTM classification (M1C1…M9C9). The performance of these models is compared according to the MSE Total, computed using the same formula as the basic MSE for the enlarged data sample.

Table 4 reports the MSE total and the standard deviation (in parentheses) of the generated genetic programming volatility models, using static training-subset selection method, relative to the TS samples and the MTM classes.

Table 4 shows that, the generated genetic programming volatility models M4S4, M4C4 and M6C6 present the smallest MSE on the enlarged sample for TS and MTM samples respectively. Comparison between these models reveals that the TS model M4S4 seems to be more performing than MTM models M4C4 and M6C6 for the enlarged sample. Furthermore, results show that the performance of TS models is more uniform than that of MTM models. MTM models are not able to fit appropriately the entire data sample as well as the TS models as they have large Total MSE. Indeed, the MSE total exceed 1 with some MTM classes, however it does not reach 0.006 for all TS samples. Figure 12 describes the evolution's pattern of the squared errors given by TS models and MTM models for all observations in the enlarged data sample. Some extreme MSE values for MTM data are not shown in this figure.

It appears throughout Figure 12 that, the TS models are adaptive not only to training samples, but also to the enlarged sample. In contrast, the MTM models such as M1C1 are adaptive to training classes, but not all to the enlarged sample. A first plausible explanation of these unsatisfied results is an insufficient search intensity inducing difficulty to obtain general model suitable for the entire benchmark input data. To enhance exploration intensity during learning and thus improve the genetic programming performance, we introduced to the evolution procedure the dynamic subset selection, which aims to obtain a general model that can be adaptive to both TS and MTM classes simultaneously.

For dynamic training-subset selection methods (RSS, SSS, ASSS and ARSS), four generated genetic programming volatility models are selected for TS classification (MSR, MSS, MSAS and MSAR). Similarly, four generated genetic programming volatility models are selected for MTM classification (MCR, MCS, MCAS and MCAR) and four generated genetic programming volatility models are selected for global classification, both TS and MTM classes (MGR, MGS, MGAS and MGAR). Table 5 reports the best generated genetic programming volatility models, using dynamic training-subset selection, relative to TS samples, MTM classes and both TS and MTM data.

Based on the MSE total as performance criterion, the generated genetic programming volatility models MSS, MCAR and MGAR are selected. They seem to be more accurate in forecasting implied volatility than the other models because they have the smallest MSE in enlarged sample. However, the MTM model MCAR and the global model MGAR outperform the TS model MSS. Figure 13 describes the evolution's pattern of the squared errors for these generated volatility models.

Figure 13 shows that almost all models relative to each data's group are performing on the enlarged sample and present forecasting errors which are small and very closed. Forecasting errors are higher for the MTM classes than for the TS samples and both TS and MTM samples. Comparison between models generated using static training-subset selection method (Figure 12) and dynamic training-subset selection methods (Figure 13) respectively, reveals that the amplitude of forecasting errors relative to TS and MTM classes respectively is lower for the models generated using dynamic training-subset selection methods than for the models generated using static training-subset selection method. Actually, the quality of the generated genetic programming forecasting models has been improved with the dynamic training, in particular for MTM classes.

The best generated genetic programming volatility models selected, relative to dynamic training-subset selection method, are compared to the best generated genetic programming volatility model, relative to static training-subset selection method. Results are reported in Table 6.

Comparison between models reveals that the best models generated respectively by static (M4S4) and dynamic selection methods (MCAR and MGAR) present total MSE small and very close. While the generated genetic programming volatility models M4S4 and MCAR have total MSE smaller than the MGAR model, the latest seems to be more accurate in forecasting implied volatility than the other models. This can be explained by the fact that, on one hand, the difference between forecasting errors is small, and on the other hand, the MGAR model is more general than MCAR and M4S4 models because it is adaptive to all TS and MTM classes simultaneously. In fact, the MGAR model, generated using ARSS method, is trained on all TS and MTM classes simultaneously. Whereas, the MCAR model, generated using ARSS method, is trained only on MTM classes simultaneously; and the M4S4 model, generated using static training-subset selection method, is trained separately on each subset of TS.

As the adaptive-random training subset selection method is considered the best one to generate implied volatility model for call options, it is applied to put options. The decoding of volatility forecasting formulas generated for call and put options as well as their forecasting errors are reported in Table 7.

A detailed examination of the formulas in Table 7 shows that the implied volatilities generated by genetic programming are function of all the inputs used, namely the option price divided by strike price (

### 4.2. Dynamic hedging results:

The performance of the best genetic programming forecasting models is compared to the Black-Scholes model in delta, gamma and vega hedging strategies. Table 8 reports the average hedging errors for call options using Black-Scholes (BS) and genetic programming (GP) models, at the 1-day and 7-days rebalancing frequencies. Values in bold correspond to the GP hedging errors which are less than the BS ones.

Results in Table 8 show that the delta hedging performance improves for out-of-the money call options at longer maturities, for at-the-money call options at medium maturities and for in-the money call options at shorter maturities, regardless of the model used at daily hedge revision frequency. The best delta hedging performance is achieved using in-the-money short term call options for all MTM classes, regardless of the option model used.

The delta-gamma hedging performance improves for all moneyness classes of call options at longer maturities, regardless of the model used at daily hedge frequency (except in-the-money call options using the genetic programming model). The best delta-gamma hedging performance is achieved, for BS model, using at-the-money long term call options for all MTM classes. However, the best delta-gamma hedging performance is achieved, for genetic programming model, using in-the-money short term call options for all MTM classes.

The delta-vega hedging performance improves for out-of-the money and in-the-money call options at longer maturities and for at-the-money call options at shorter maturities, regarding BS model at daily hedge revision frequency. However, the delta-vega hedging performance improves for out-of-the money call options at shorter maturities, for at-the-money call options at medium maturities and for in-the money call options at longer maturities, regarding genetic programming model at daily hedge revision frequency. The best delta-vega hedging performance is achieved, for BS model, using out-of-the-money long term call options for all moneyness and time to maturity classes. However, the best delta-gamma hedging performance is achieved, for genetic programming model, using at-the-money medium term call options for all MTM classes.

The percentage of cases where the hedging error of the genetic programming model is less than the BS hedging error is around 59%. In particular, the performance of genetic programming model is better than the BS model on in-the-money call options class. Further, the total of hedging errors relative to genetic programming model is about 21 percent slightly lower than 19 percent relative to BS model. Table 9 displays the average hedge errors for put options using BS and genetic programming models, at the 1-day and 7-days rebalancing frequencies. Values in bold correspond to the genetic programming hedging errors which are less than the BS ones.

Results in Table 9 show that the delta-gamma hedging performance improves for all moneyness classes of put options (except in-the-money put options) at longer maturities, regarding BS model at daily hedge frequency. However, the delta-gamma hedging performance improves for in-the money put options and at-the-money put options at medium maturities and for out-of-the money put options at longer maturities, regarding genetic programming model at daily hedge revision frequency. The best delta-gamma hedging performance is achieved, for BS model, using at-the-money long term put options for all MTM classes. However, the best delta-gamma hedging performance is achieved, for genetic programming model, using out-of-the-money long term put options for all MTM classes.

The delta-vega hedging performance improves for BS using at-the-money and out-of-the-money put options at longer maturities and in-the-money put options at shorter maturities, at daily hedge revision frequency. However, the delta-vega hedging performance improves for all moneyness classes of put options (except in-the-money put options) at longer maturities, regarding genetic programming model at daily hedge frequency. The best delta-vega hedging performance is achieved, for BS model, using out-of-the-money long term put options for all MTM classes. However, the best delta-vega hedging performance is achieved, for genetic programming model, using at-the-money long term put options for all MTM classes.

The percentage of cases where the hedging error of the genetic programming model is less than the BS hedging error is around 57%. In particular, the performance of genetic programming model is better than the BS model on at-the-money put options class. But, the total of hedging errors relative to genetic programming model is about 50 percent slightly higher than 46 percent relative to BS model.

In summary, the genetic programming model is more accurate in all hedging strategies than the BS model, for in-the-money call options and at-the-money put options. The performance of genetic programming is pronounced essentially in terms of delta hedging for call and put options. The percentage of cases where the delta hedging error of the genetic programming model is less than the BS delta hedging error is 100% for out-of-the money and in-the-money call options as well as for at-the-money and out-of-the-money put options. The percentage of cases where the delta-vega hedging error of the genetic programming model is less than the BS delta-vega hedging error is 100% for in-the-money call options as well as for at-the-money put options. The percentage of cases where the delta-gamma hedging error of the genetic programming model is less than the BS delta-gamma hedging error is 100% for at-the-money put options.

Furthermore, results exhibit that as the rebalancing frequency changes from 1-day to 7-days revision, as the hedging errors increase and vice versa. The option value is a nonlinear function of the underlying, therefore, hedging is instantaneous and hedging with discrete rebalancing gives rise to error. Frequent rebalancing can be impractical due to transactions costs. In the literature, consequences of discrete time hedging have been considered usually in conjunction with the existence of transaction costs, that’s why hedgers would like to trade at least frequently as possible. Pioneered by Leland [43], asymptotic approaches are used as well [44-46]. For most MTM classes, delta-gamma and delta-vega hedging strategies are shown to perform better in dynamic hedging when compared with delta hedging strategy, regardless of the model used. The delta-gamma strategy enables the performance of a discrete rebalanced hedging to be improved. The delta-vega strategy corrects partly for the risk of a randomly changing volatility.

## 5. Conclusion

This paper is concerned with improving the dynamic hedging accuracy using generated genetic programming implied volatilities. Firstly, genetic programming is used to predict implied volatility from index option prices. Dynamic training-subset selection methods are applied to improve the robustness of genetic programming to generate general forecasting implied volatility models relative to static training-subset selection method. Secondly, the implied volatilities derived are used in dynamic hedging strategies and the performance of genetic programming is compared to that of Black-Scholes in terms of delta, gamma and vega hedging.

Results show that the dynamic training of genetic programming yields better results than those obtained from static training with fixed samples, especially when applied on time series and moneyness-time to maturity samples simultaneously. Based on the MSE total as performance criterion, three generated genetic programming volatility models are selected M4S4, MCAR and MGAR. However, the MGAR seems to be more accurate in forecasting implied volatility than MCAR and M4S4 models because it is more general and adaptive to all time series and moneyness-time to maturity classes simultaneously.

The main conclusion concerns the importance of implied volatility forecasting in conducting hedging strategies. Genetic programming forecasting volatility makes hedge performances higher than those obtained in the Black-Scholes world. The best genetic programming hedging performance is achieved for in-the-money call options and at-the-money put options in all hedging strategies. The percentage of cases where the hedging error of the genetic programming model is less than the Black-Scholes hedging error is around 59% for calls and 57% for puts. The performance of genetic programming is pronounced essentially in terms of delta hedging for call and put options. The percentage of cases where the delta hedging error of the genetic programming model is less than the Black-Scholes delta hedging error is 100% for out-of-the money and in-the-money call options as well as for at-the-money and out-of-the-money put options. The percentage of cases where the delta-vega hedging error of the genetic programming model is less than the Black-Scholes delta-vega hedging error is 100% for in-the-money call options as well as for at-the-money put options. The percentage of cases where the delta-gamma hedging error of the genetic programming model is less than the Black-Scholes delta-gamma hedging error is 100% for at-the-money put options.

Finally, improving the accuracy of implied volatility forecasting using genetic programming can lead to well hedged options portfolios relative to the conventional parametric models.

Our results suggest some interesting issues for further investigation. First, the genetic programming can be used to hedge options contracts using implied volatility of other models than Black-Scholes model, notably stochastic volatility models and models with jump, as a proxy for genetic programming volatility forecasting. Further, the hedge factors can be computed numerically not analytically. Second, this work can be reexamined using data from individual stock options, American style index options, options on futures, currency and commodity options. Third, as the genetic programming can incorporate known analytical approximations in the solution method, parametric models such as GARCH models can be used as a parameter in the genetic programming to build the forecasting volatility model and the hedging strategies. Finally, the genetic programming can be extended to allow for dynamic parameter choices including the form and the rates of genetic operators, the form and pressure of selection mechanism, the form of replacement strategy and the size of population. This dynamic genetic programming method can improve the performance without extra calculation costs. We believe these extensions are of interest for application and will be object of our future works.