*CV* lower bound.

## Abstract

For the simulated annual losses of a reinsurance contract, we find an inequality relating their standard deviation to their mean, $\sigma_f \ge m_f\sqrt{\mu(A^C)/\mu(A)}$, where the coefficient in the inequality is the square root of the ratio of the number of zero-loss years to the number of nonzero-loss years. The largest such coefficient is also proved to be the universal upper bound of the *CV*. As direct applications of this inequality, we obtain bounds for other risk measures of a reinsurance contract: the TVaR (the average of the annual losses that are larger than a given loss), the probability of attaching (the annual loss being greater than a given attachment loss), and the probability of exceeding (the annual loss limit). These bounds in turn reveal the upper limit of what the simulation approach can achieve.

### Keywords

- reinsurance contract
- simulation
- standard deviation
- coefficient of variation
- inequality
- ratio distribution
- model risk

## 1. Introduction

In the reinsurance industry, simulated losses from catastrophe events are combined with the financial terms of a reinsurance contract to calculate the contract expected annual loss, the standard deviation of the expected annual loss, and quantiles of the losses (such as the AEP, Aggregate Exceedance Probability, or the TVaR, Tail Value at Risk; see Ref. [1]). These numbers are in turn used for pricing or risk management of the contract. There are two kinds of model risk in this approach: independent simulations may give different results without any model change, and simulations vary before and after a model change, such as an update of the yearly catastrophe event sets or of the parameters. Empirically, the distribution of the contract expected annual loss may be closer to a Beckmann, MaxStable, Gamma, Inverse Gaussian, or even a Lognormal Distribution than to a Normal Distribution. However, since the mean annual loss is the average of a large number of losses, and especially for simplicity, we assume that it obeys a Normal Distribution. To quantify the model risk, we can then use the Hinkley formula [2] or the Marsaglia formula [3] to calculate the probability that two simulations have expected annual losses deviating from each other by more than, say, 50%. In our model risk quantification context and in the simplest scenario, their formulas depend on only two factors: the correlation coefficient *ρ* of the distribution in the two simulations, and the coefficient of variation (*CV*) of the distribution, that is, the ratio of the standard deviation to the mean. Because of their carefully selected financial terms, most reinsurance contracts have zero loss in many, or even the majority, of the simulated years. This calls for a study of the range or bounds of the *CV* of such rarely paying contracts.

## 2. Results

### 2.1. CV range

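Starting from Hölder's inequality:

$$\int_\Omega |fg|\,d\mu \le \left(\int_\Omega |f|^p\,d\mu\right)^{1/p}\left(\int_\Omega |g|^q\,d\mu\right)^{1/q}, \qquad \frac{1}{p}+\frac{1}{q}=1. \tag{1}$$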

Suppose *f* in the formula is the contract annual loss with its mean *mf* deducted, that is, it is of the form *f* − *mf*, and *p* = *q* = 2. Let *g* be some nonnegative weight function on the discrete probability space Ω of the simulated years, for example, the set {1, 2, …, 100,000} for 100,000 years of simulation, each element with a probability of 1e-5 (a typical setting in practice). Then we get:

$$\sigma_f\,\sqrt{\int_\Omega g^2\,d\mu} \;\ge\; \int_\Omega |f-m_f|\,g\,d\mu \tag{2}$$

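Suppose *f* is nonnegative and zero outside a subset *A* of Ω, and *g* is a constant *a* on *A* and a constant *b* on the complement $A^C$: $f|_{A^C} = 0$, $f|_A > 0$, $g|_{A^C} = b \ge 0$, $g|_A = a \ge 0$. Then we can deduce that:

$$\sigma_f \ge \frac{(a+b)\,m_f\,\mu(A^C)}{\sqrt{a^2\mu(A)+b^2\mu(A^C)}} \tag{3}$$

due to

$$\int_{A^C} |f-m_f|\,g\,d\mu = b\,m_f\,\mu(A^C)$$

and

$$\int_{A} |f-m_f|\,g\,d\mu \ge a\left|\int_A (f-m_f)\,d\mu\right| = a\,\bigl(m_f - m_f\,\mu(A)\bigr) = a\,m_f\,\mu(A^C).$$

The maximum of the right-hand side of Eq. (3) is achieved by

$$\frac{a}{b} = \frac{\mu(A^C)}{\mu(A)},$$

which gives

$$\sigma_f \ge m_f\sqrt{\frac{\mu(A^C)}{\mu(A)}}. \tag{8}$$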

As a corollary, if we use the 2-sigma or 3-sigma rule for the confidence interval of the estimate of the mean, then for contracts that have zero loss in most years this interval will be very large. For example, if only 100 of the 100,000 simulated years have nonzero losses, then we get [math]::sqrt((1e5-100)/100) = 31.6069612585582:

$$\sigma_f \ge 31.6069612585582\,m_f$$

And if 1000 years have nonzero losses, we get the constant [math]::sqrt((1e5-1000)/1000) = 9.9498743710662:

$$\sigma_f \ge 9.9498743710662\,m_f$$

Checking against some concrete examples: for one contract we see 2220 nonzero loss records. Supposing all of them are in different years, we get the ratio [math]::sqrt((1e5-2220)/2220) = 6.63664411016931, while the simulation gives the mean loss *mf* = 3848 and the standard deviation *σf* = 37,149, so that 37,149/3848 = 9.65410602910603 *>* 6.63664411016931.

Another contract has 41,143 distinct nonzero-loss years out of the 100,000 simulated years, with mean loss *mf* = 1,874,487 and standard deviation *σf* = 2,357,787, so that *σf*/*mf* = 1.25783054243641 *>* [math]::sqrt((1e5-41143)/41143) = 1.19605481319037.

From these examples, we see that our lower bound for the *CV* is relatively close to the simulated value.

Typically, more than half of the contracts may have fewer than 10,000 nonzero-loss years, for which we see that *σf* ≥ 3*mf*, since [math]::sqrt((1e5-1e4)/1e4) = 3. More than 80% of the contracts may have fewer than 50,000 nonzero-loss years, for which there is the simple inequality *σf* ≥ *mf*, since [math]::sqrt((1e5-5e4)/5e4) = 1. These bounds are collected in Table 1.

Nonzero-loss years (of 100,000 simulated) | *CV* lower bound |
---|---|
1 | 316.226184874055 |
10 | 99.9949998749938 |
100 | 31.6069612585582 |
1000 | 9.9498743710662 |
2000 | 7 |
5000 | 4.35889894354067 |
10,000 | 3 |
20,000 | 2 |
30,000 | 1.52752523165195 |
40,000 | 1.22474487139159 |
50,000 | 1 |
80,000 | 0.5 |
90,000 | 0.333333333333333 |
99,000 | 0.100503781525921 |
99,900 | 0.0316385998584166 |
99,990 | 0.0100005000375031 |
99,999 | 0.00316229347167527 |
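The entries of this table follow from a single square root. The snippet below is a minimal sketch of that calculation, assuming equally likely simulated years; the function name `cv_lower_bound` and its default of 100,000 years are illustrative choices, not part of the original work.

```python
import math


def cv_lower_bound(nonzero_years: int, total_years: int = 100_000) -> float:
    """Lower bound on CV = sigma_f / m_f when only `nonzero_years` of
    `total_years` equally likely simulated years carry a nonzero loss."""
    if not 0 < nonzero_years <= total_years:
        raise ValueError("nonzero_years must be between 1 and total_years")
    zero_years = total_years - nonzero_years
    return math.sqrt(zero_years / nonzero_years)


# Print a few rows of the table.
for k in (1, 100, 1000, 10_000, 50_000, 99_999):
    print(f"{k:>7,d}  {cv_lower_bound(k):.12g}")

# The first concrete contract quoted in the text: 2220 nonzero-loss years,
# simulated mean 3848 and standard deviation 37,149.
print(cv_lower_bound(2220), 37149 / 3848)
```

Comparing a simulated *CV* against this number is a cheap sanity check: a simulated value below the bound would indicate a counting or aggregation problem rather than a modeling result.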

The inequality Eq. (8) can explain the observation that, when we sort the contracts by the mean annual loss, the lower quarter of the contracts may deviate by tens of percent between different simulations, since a smaller mean loss usually corresponds to fewer years of nonzero losses and hence a higher *CV* (more explanation in the following section).

To get an upper bound for *CV*, suppose the total number of simulated years is *n* and each year *i* has the loss *xi* ≥ 0. Then:

For the expression *x*_{1}, we can know that it is decreasing in

For the extreme case when only one year has nonzero losses, we have thus verified that the coefficient [math]::sqrt((1e5-1)/1) = 316.226184874055 is exact, that is, we have:

The overall upper bound of *CV* is approached when we let all but one of the *xi* be arbitrarily close to 0.
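The upper bound can also be seen in one line. The following is a sketch for the equal-probability case (each year with probability $1/n$), written in notation assumed here rather than taken from the original derivation:

$$
CV^2 \;=\; \frac{\frac{1}{n}\sum_i x_i^2}{\left(\frac{1}{n}\sum_i x_i\right)^2} - 1 \;=\; \frac{n\sum_i x_i^2}{\left(\sum_i x_i\right)^2} - 1 \;\le\; n - 1,
\qquad\text{since}\quad \sum_i x_i^2 \le \Bigl(\sum_i x_i\Bigr)^2 \ \text{ for } x_i \ge 0 .
$$

Equality requires all cross terms $x_i x_j$ with $i \ne j$ to vanish, that is, a single year with nonzero loss. For $n$ = 100,000 this gives $CV \le \sqrt{99{,}999} \approx 316.23$.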

From the fact that the minimum of the expression in Eq. (11) is attained at

More generally, if the years *i* have possibly unequal probabilities *pi* of occurrence (such as when using importance sampling), then:

The minimum value of the right side is attained when

We summarize our deduction and discussion into the following:

**Theorem 1.** *For reinsurance contract simulated annual losses f, the standard deviation σf with respect to the mean mf is bounded below by:*

$$\sigma_f \ge m_f\sqrt{\frac{\mu(A^C)}{\mu(A)}} \tag{16}$$

*where $\mu(A^C)$ and $\mu(A)$ are the measures of the zero-loss years and the nonzero-loss years, respectively. The lower bound is attained,*

$$\sigma_f = m_f\sqrt{\frac{\mu(A^C)}{\mu(A)}} \tag{17}$$

*if and only if all the nonzero losses are of the same value. The standard deviation σf with respect to the mean mf is bounded above by:*

$$\sigma_f \le m_f\sqrt{\frac{1-\min_i p_i}{\min_i p_i}} \tag{18}$$

*where $p_i$ is the probability of occurrence of year i. The upper bound is attained if and only if the year with the smallest occurrence probability is the only year of nonzero losses. And when only year i has nonzero losses:*

$$\sigma_f = m_f\sqrt{\frac{1-p_i}{p_i}}$$
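A quick numerical check of the two bounds in Theorem 1 can be sketched as follows. It assumes the lower bound is the square root of the ratio of zero-loss measure to nonzero-loss measure (as stated in the abstract) and the upper bound is the same expression evaluated at the smallest year probability; the helper names and the synthetic loss vector are hypothetical.

```python
import math
import random


def weighted_mean_std(losses, probs):
    """Mean and standard deviation of `losses` under year probabilities `probs`."""
    mean = sum(p * x for x, p in zip(losses, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(losses, probs))
    return mean, math.sqrt(var)


def theorem1_bounds(losses, probs):
    """Return (lower bound, CV, upper bound) for nonnegative annual losses."""
    mean, std = weighted_mean_std(losses, probs)
    if mean <= 0.0:
        raise ValueError("at least one year must have a nonzero loss")
    mu_a = sum(p for x, p in zip(losses, probs) if x > 0)  # measure of nonzero-loss years
    lower = math.sqrt((1.0 - mu_a) / mu_a)
    p_min = min(probs)
    upper = math.sqrt((1.0 - p_min) / p_min)
    return lower, std / mean, upper


# Synthetic example: 1000 years with unequal probabilities, about 5% of them paying.
rng = random.Random(0)
raw = [rng.uniform(0.5, 1.5) for _ in range(1000)]
probs = [w / sum(raw) for w in raw]
losses = [rng.lognormvariate(0.0, 1.0) if rng.random() < 0.05 else 0.0 for _ in range(1000)]
lower, cv, upper = theorem1_bounds(losses, probs)
print(lower, cv, upper)
```

When all nonzero losses are equal, the computed *CV* coincides with the lower bound, which is the equality case of the theorem.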

For contracts whose losses are not necessarily nonnegative (such as contracts with complex layer structures and hedging designs), and for contracts whose losses concentrate significantly at the upper bound (due to the limit and the annual limit), replacing *f* by *f* − *m* or *M* − *f*, where *m* and *M* are the minimum and the maximum annual loss, we get from the theorem the following lower bounds:

**Corollary 1.** *For arbitrary reinsurance contract simulated annual losses f, the standard deviation σf with respect to the mean mf, the minimum annual loss m, and the maximum annual loss M is bounded below by:*

$$\sigma_f \ge (m_f - m)\sqrt{\frac{\mu(L^C)}{\mu(L)}}, \tag{20}$$

$$\sigma_f \ge (M - m_f)\sqrt{\frac{\mu(U^C)}{\mu(U)}},$$

*where $\mu(L^C)$ and $\mu(L)$ are the measures of the minimum-loss years and the not-minimum-loss years, and $\mu(U^C)$ and $\mu(U)$ are the measures of the maximum-loss years and the not-maximum-loss years, respectively. The equalities hold if and only if f is a bivalued distribution.*

From Theorem 1, we can get an upper bound for the average annual loss on an arbitrary subset of the years:

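**Corollary 2.**^{1} *For a nonnegative random variable f on a probability space Ω and an arbitrary subset B* ⊂ *Ω, the average of f on B is bounded above in terms of the standard deviation σf and the mean mf by:*

$$\frac{1}{\mu(B)}\int_B f\,d\mu \;\le\; m_f + \sigma_f\sqrt{\frac{\mu(B^C)}{\mu(B)}} \tag{22}$$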

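**Proof:** Define two functions *f*_{1} and *f*_{2} from *f* as the restrictions of *f* to the subsets *B* and $B^C$: $f_1|_{B^C} = 0$, $f_1|_B = f|_B$, $f_2|_{B^C} = f|_{B^C}$, $f_2|_B = 0$. Then we have $f = f_1 + f_2$ and $f_1 f_2 = 0$. The standard deviation satisfies:

$$\sigma_f^2 = \sigma_{f_1}^2 + \sigma_{f_2}^2 - 2\,m_{f_1} m_{f_2}.$$

The domain with zero value for *f*_{1} includes the set $B^C$, and the domain with zero value for *f*_{2} includes the set *B*. Hence, by Theorem 1:

$$\sigma_f^2 \;\ge\; m_{f_1}^2\,\frac{\mu(B^C)}{\mu(B)} + m_{f_2}^2\,\frac{\mu(B)}{\mu(B^C)} - 2\,m_{f_1} m_{f_2} \;=\; \left(m_{f_1}\sqrt{\frac{\mu(B^C)}{\mu(B)}} - m_{f_2}\sqrt{\frac{\mu(B)}{\mu(B^C)}}\right)^2 \tag{25}$$

The inequality Eq. (22) is arrived at from the fact that $m_f = m_{f_1} + m_{f_2}$ and

$$m_{f_1} = \int_B f\,d\mu.$$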

If we let the subset *B* be {*x* | *f*(*x*) > 0}, then Eq. (22) becomes Eq. (16). If we let the subset *B* be {*x* | *CDFf*(*x*) ≥ *q*}, 0 ≤ *q* ≤ 1, we get the so-called AEP TVaR upper bound for the given quantile *q* or return period: *TVaR*(*q*) ≤ *TVaR* upper bound (now simply $m_f + \sigma_f\sqrt{q/(1-q)}$). The following table lists this bound for commonly used return periods.

Return period (years) | Quantile | TVaR upper bound |
---|---|---|
100,000 | 0.99999 | 316.226184874055 *σf* + *mf* |
10,000 | 0.9999 | 99.9949998749938 *σf* + *mf* |
5000 | 0.9998 | 70.7036066972541 *σf* + *mf* |
1000 | 0.999 | 31.6069612585582 *σf* + *mf* |
500 | 0.998 | 22.3383079036887 *σf* + *mf* |
250 | 0.996 | 15.7797338380595 *σf* + *mf* |
200 | 0.995 | 14.1067359796659 *σf* + *mf* |
100 | 0.99 | 9.9498743710662 *σf* + *mf* |
50 | 0.98 | 7 *σf* + *mf* |
30 | 0.966666666666667 | 5.3851648071345 *σf* + *mf* |
25 | 0.96 | 4.89897948556636 *σf* + *mf* |
20 | 0.95 | 4.35889894354067 *σf* + *mf* |
10 | 0.9 | 3 *σf* + *mf* |
5 | 0.8 | 2 *σf* + *mf* |
4 | 0.75 | 1.73205080756888 *σf* + *mf* |
2 | 0.5 | 1 *σf* + *mf* |
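The coefficients in this table match $\sqrt{q/(1-q)}$ for each quantile *q*, so the bound can be compared with a simulated TVaR directly. The following is a minimal sketch under that reading, with equally likely years; the function names and the synthetic loss sample are assumptions made for illustration only.

```python
import math
import random
import statistics


def tvar_upper_bound(q: float, mean: float, std: float) -> float:
    """Upper bound mean + std * sqrt(q / (1 - q)) on the TVaR at quantile q."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    return mean + std * math.sqrt(q / (1.0 - q))


def empirical_tvar(losses, q: float) -> float:
    """Average of the largest (1 - q) fraction of equally likely annual losses."""
    ordered = sorted(losses, reverse=True)
    k = max(1, round((1.0 - q) * len(ordered)))
    return sum(ordered[:k]) / k


# Synthetic annual losses: about 30% of 10,000 years pay an exponential loss.
rng = random.Random(42)
losses = [rng.expovariate(1.0) if rng.random() < 0.3 else 0.0 for _ in range(10_000)]
mean = statistics.fmean(losses)
std = statistics.pstdev(losses)  # population standard deviation, as on a probability space

for q in (0.5, 0.8, 0.9, 0.99):
    print(q, empirical_tvar(losses, q), tvar_upper_bound(q, mean, std))
```

The population standard deviation (`statistics.pstdev`) is used on purpose, because the bound is stated for the simulated years as a probability space rather than as a sample from a larger population.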

A numerical example shows that our *TVaR* upper bound is relatively close in the quantile range [0.8, 0.9], with the theoretical upper bound deviating from the simulated value by less than 20%, no matter what the distribution of the annual loss is.

Noticing that the measure of the nonzero-loss years is also called the probability of attaching in insurance, we can rearrange the terms in the formula Eq. (8) to get a lower bound for the probability of attaching:

**Corollary 3.** *For reinsurance contract simulated annual losses f, the probability of attaching, ProbA* ≡ *Prob*{*f >* 0}*, with respect to the CV is bounded below by:*

$$ProbA \ge \frac{1}{1 + CV^2}$$

As an application of Corollary 3, we see that if *CV* ≤ 3, then *ProbA* ≥ 0.1, so the 0.9 quantile of *f* is larger than zero. Equivalently, if the 0.9 quantile of *f* (the so-called *AEP* in insurance) is zero, we know that *CV >* 3; we will then be less prone to think that those zero quantiles are due to simulation inaccuracy. The *CV* bounds for commonly used *AEP*, related to probability of attaching by the formula

*CV* upper bound | *ProbA* lower bound | Beginning quantile with nonzero loss | Return period (years) |
---|---|---|---|
99.9949998749938 | 0.0001 | 0.9999 | 10,000 |
9.9498743710662 | 0.01 | 0.99 | 100 |
3 | 0.1 | 0.9 | 10 |
2 | 0.2 | 0.8 | 5 |
1.73205080756888 | 0.25 | 0.75 | 4 |
1 | 0.5 | 0.5 | 2 |
0.577350269189626 | 0.75 | 0.25 | 1.33333333333333 |
0.5 | 0.8 | 0.2 | 1.25 |
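The rows of this table are consistent with rearranging the *CV* lower bound into *ProbA* ≥ 1/(1 + *CV*²), with the quantile column equal to 1 − *ProbA* and the return period equal to 1/*ProbA*. The sketch below reproduces a few rows under that assumption; the function name is hypothetical.

```python
def prob_attach_lower_bound(cv: float) -> float:
    """Lower bound on Prob{f > 0} implied by a given CV = sigma_f / m_f."""
    if cv < 0.0:
        raise ValueError("cv must be nonnegative")
    return 1.0 / (1.0 + cv * cv)


for cv in (3.0, 2.0, 1.0, 0.5):
    p = prob_attach_lower_bound(cv)
    # 1 - p is the quantile at or before which nonzero losses must begin;
    # 1 / p is the corresponding return period in years.
    print(f"CV <= {cv}: ProbA >= {p:.4g}, quantile {1.0 - p:.4g}, return period {1.0 / p:.4g}")
```

Read in the other direction, a contract whose loss is still zero at the quantile shown must have a *CV* above the value in the first column.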

Similarly, from Corollary 2, we can easily rearrange terms to get an upper bound for the probability of exceeding a given loss, which is known as Cantelli's inequality in the literature:

**Corollary 4.** *For reinsurance contract simulated annual losses f, the probability of exceeding a given loss x, ProbE* ≡ *Prob*{*f* ≥ *x*}*, with respect to the mean loss mf and the standard deviation σf, when x* ≥ *mf, is bounded above by:*

$$ProbE \le \frac{\sigma_f^2}{\sigma_f^2 + (x - m_f)^2}$$

*Specifically, if* *, then:*

This bound gives a limitation on simulation with a given number *N* of simulated years where each year have equal probability of occurrence *mf* and given standard deviation *σf* (to be shown in Lemma 1). In other words, if

**Lemma 1.** *For reinsurance contract simulated annual losses f that are bounded above by M, with a given mean loss mf and a given standard deviation σf, we must have:*

$$M \ge m_f + \frac{\sigma_f^2}{m_f} = m_f\,(1 + CV^2)$$

*On the other hand, with the given maximum loss M and mean loss mf, the standard deviation σf must satisfy:*

$$\sigma_f \le \sqrt{m_f\,(M - m_f)}$$

*The maximum standard deviation given mf and M is attained only by a bivalued distribution of values either* 0 *or M, with probability $1 - m_f/M$ and $m_f/M$, respectively, whose CV is then $\sqrt{M/m_f - 1}$. Similarly, the minimal exposure given mf and σf is attained only by a bivalued distribution of values either* 0 *or $m_f + \sigma_f^2/m_f$, with probability $\sigma_f^2/(m_f^2 + \sigma_f^2)$ and $m_f^2/(m_f^2 + \sigma_f^2)$, respectively, whose CV is then $\sigma_f/m_f$.*

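Proof:

Let *g* be a random variable with values in the interval [0,1]. So:

$$\sigma_g^2 = \int_\Omega g^2\,d\mu - m_g^2 \;\le\; \int_\Omega g\,d\mu - m_g^2 = m_g\,(1 - m_g) \tag{34}$$

Taking $g = f/M$ gives $\sigma_f^2 \le m_f\,(M - m_f)$, which proves both of our inequalities. Without loss of generality, suppose any nonempty subset of Ω has nonzero measure; then equality holds in Eq. (34) and its subsequent inequalities if and only if *g* = 0 or *g* = 1.□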
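The lemma can be illustrated numerically. The sketch below assumes the standard bound for a variable confined to [0, *M*], namely σf² ≤ *mf*(*M* − *mf*), equivalently *M* ≥ *mf* + σf²/*mf*; this reading of the lemma, the function names, and the two small loss vectors are assumptions made for illustration.

```python
import math
import statistics


def max_std_given_mean_and_cap(mean: float, cap: float) -> float:
    """Largest standard deviation compatible with losses in [0, cap] and the given mean."""
    return math.sqrt(mean * (cap - mean))


def min_cap_given_mean_and_std(mean: float, std: float) -> float:
    """Smallest maximum loss compatible with the given mean and standard deviation."""
    return mean + std * std / mean


# Bivalued case: 4 of 100 equally likely years lose the full cap M = 100.
bivalued = [0.0] * 96 + [100.0] * 4
m_biv = statistics.fmean(bivalued)
s_biv = statistics.pstdev(bivalued)
print(s_biv, max_std_given_mean_and_cap(m_biv, 100.0))   # the two values coincide
print(min_cap_given_mean_and_std(m_biv, s_biv))          # recovers the cap

# A spread-out case with the same cap stays strictly inside both bounds.
spread = [0.0] * 90 + [10.0 * k for k in range(1, 11)]
m_spr = statistics.fmean(spread)
s_spr = statistics.pstdev(spread)
print(s_spr, max_std_given_mean_and_cap(m_spr, 100.0))
print(min_cap_given_mean_and_std(m_spr, s_spr))
```

The bivalued case shows why raising the cap alone does not help: the probability of the top value shrinks in proportion, and it cannot go below the smallest year probability.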

Because of the probability of *M*, we cannot solve the limitation on *CV* by increasing *M*. The only solution is then to increase *N* or to use unequal probabilities (please refer to Eq. (18)); otherwise we may have to settle for matching only the mean loss and accept a reduced simulated standard deviation.

By examining the proof of Corollary 2 and Theorem 1 Eq. (17), forcing the inequality in Eq. (25) to be an equality, we can prove that:

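**Corollary 5.** *For a nonnegative random variable f on a probability space Ω and an arbitrary subset B ⊂ Ω, assuming $0 < \mu(B) < 1$, the average $\frac{1}{\mu(B)}\int_B f\,d\mu$ attains its upper bound with respect to the standard deviation σf and the mean mf:*

$$\frac{1}{\mu(B)}\int_B f\,d\mu = m_f + \sigma_f\sqrt{\frac{\mu(B^C)}{\mu(B)}} \tag{39}$$

*if and only if f is constant on B and constant on $B^C$, with the value on B not smaller than the value on $B^C$.*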

So the maximum *TVaR* distribution is bivalued. This corollary provides a guide for implementing the simulation of relatively high *CV* distributions: for *CV* close to *AEP* distribution which are bound by *TVaR’s* bound and attain the same upper bound given in Eq. (39). Both conclusions give a clue to a risk measure of the maximally likely or best compromise quantile, obtained by comparing the simulated *TVaR* or *AEP* with the theoretical bound for the best match, but that is the topic of a different research.

### 2.2. Simulation deviation

Typical correlation coefficients *ρ* from yearly model updates range from 0.27 to 0.96, and the *CV* range is 0.003–316, as we calculated in Table 1. With these parameters we used the Mathematica Manipulate function to explore the probability that the ratio of two simulated annual mean losses is within the range 0.5–1.5, assuming the annual mean loss obeys the Normal Distribution. We find that the probability is small when *ρ* is close to 0 and decreases as *CV* increases, but stabilizes for *CV* ≥ 7. For example, with *ρ* = 0.822434, for almost half of the contracts the simulated annual mean losses are within 50% of each other with probability 0.459; that is, with probability 0.541 we will see two simulations whose simulated annual mean loss increases or decreases by more than 50% (Figures 1 and 2). These factors should be considered in model risk management or individual contract evaluation.

The Mathematica code for the plots in Figures 1 and 2 is in Appendix A. The two plots are identical even though the two formulas are quite different, and we do not know whether they can be proved analytically to be equivalent; our plots are a numerical validation of both formulas.
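As an independent cross-check of the closed-form ratio formulas, the same probability can be estimated by Monte Carlo. The sketch below is not the authors' method: it assumes two normally distributed simulated mean losses with a common mean (scaled to 1), a common *CV*, and correlation *ρ*, and all names and default arguments are illustrative.

```python
import math
import random


def prob_ratio_within(cv, rho, lo=0.5, hi=1.5, n=100_000, seed=0):
    """Monte Carlo estimate of Prob{lo <= X / Y <= hi} for correlated normal
    X and Y, both with mean 1 and standard deviation cv, correlation rho."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + s * rng.gauss(0.0, 1.0)  # correlated with z1 by construction
        x = 1.0 + cv * z1
        y = 1.0 + cv * z2
        if y != 0.0 and lo <= x / y <= hi:
            hits += 1
    return hits / n


for cv in (0.3, 1.0, 3.0, 7.0):
    print(cv, prob_ratio_within(cv, 0.822434))
```

With the default 100,000 draws the sampling error is roughly a few tenths of a percent, which is enough to compare against a plotted curve but not to resolve its fine structure.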

## 3. Discussion

The lower bound for the *CV* of the reinsurance contract annual loss is established. The largest of those bounds is also proved to be the upper bound for all *CV*. Applying this range information to the ratio distribution, we get the theoretical probability that different simulations will have simulated mean annual losses deviating from each other by less than a given percentage, under the Normal Distribution assumption for the mean annual loss. We think this assumption can be removed by using more suitable distributions with numerical methods, and that the resulting probability may not be too different. A numerical study of a typical example case confirmed this, and showed that the “Normal approximation” gives a probability only a few percent (2–5%) less than that from more suitable distributions, which do not have an explicit formula for the probability.

As the starting point and the application of the *CV* range, the ratio distribution and the model risk quantification results we get may be only rudimentarily correct because of other factors, such as the distribution modeling, the dependence modeling, and dependence on parameters other than the *CV* and *ρ*; but our *CV* inequality itself is mathematically sound.

The less general upper bound

Using the same Hölder's inequality and a calculus technique, which may not have a simple elementary-inequality counterpart, we can prove a more complex formula:

**Theorem 2.** *For reinsurance contract simulated annual losses f, the standard deviation σf with respect to the mean mf is bounded below by:*

*when μ(m) + μ(M) < 1, where σf is the standard deviation, mf is the mean, m is the minimum annual loss, M is the maximum annual loss, μ(m) denotes the measure of the minimum-loss years, and μ(M) denotes the measure of the maximum-loss years.*

Proof:

In the inequality Eq. (2), we divide Ω into three subsets and let the nonnegative function *g* be constant on each of the three sets:

Then:

We get:

Using the negative form of the inequality

Define:

Then

The derivative

If *B* ≤ 0, then *F*(*t*) take the maximum *B >* 0, then *F*(*t*) increase on (0,

Apply the same argument to

(*M* − *mf*)*μ*(*M*) *>* 0 since *μ*(*m*) + *μ*(*M*) *<* 1, we have if (*mf* − *m*)*μ*(*m*) − (*M* − *mf*)*μ*(*M*) *>* 0, then

If (*mf* − *m*)*μ*(*m*) − (*M* − *mf*)*μ*(*M*) = 0, then

If (*mf* − *m*)*μ*(*m*) − (*M* − *mf*)*μ*(*M*) *<* 0, then, using the inequality Eq. (50), we can follow the same steps to arrive at the same form of the maximum formula. We thus proved the maximal

We can also prove by calculus that:

**Theorem 3.** *In the terminology of* *Theorem 2, if m* = 0,

Proof:

Define

*t*≥ 0.

Then *F*(*t*) is continuous at 0 and

The derivative of *F*(*t*) is

*F*(*t*) ≥ 0 for any *t* ≥ 0.□

Theorem 3 can be combined with the following form of the Hölder’s inequality:

to give an alternative proof of Theorem 2 and then Eq. (16) (or directly for Eq. (20) by using the set {*f > m*}).

So there is a more complex but better lower bound, Eq. (40), and an empirical study shows that when *μ*(*m*) *>* 0.86, both bounds are close to the true *σf* to within 86%, with the simple form Eq. (8) being 3–4% lower than the complex form Eq. (40). Even though the complex form Eq. (40) is generally valid for any discrete random variable, it may not be as easily applicable as the simple form Eq. (8) when we need a fast first approximation, and is hence of less practical interest.

With numerical simulation, we can get *σf* and *CV* directly, so these formulas may seem not to be useful for the numerical results. But since each simulation may arrive at a different value, knowing their approximate value a priori provides a check for any possible problem in the simulation process. Our inequalities also reveal that the *CV* is intrinsically related to important characteristics of the value distribution of the annual loss random variable. This essential role of the *CV* is also confirmed by other studies, such as the correlation and cluster analysis of these random variables.

## 4. Conclusions

Lower bounds for the reinsurance contract annual loss standard deviation involving the counts of zero-loss years are obtained, which imply a general upper bound for the annual loss *TVaR* or *AEP* with no mention of zero-loss years. Alternative forms of these bounds give inequalities for the probability of attaching and the probability of exceeding. These bounds can explain the difficulties or instabilities observed in numerical simulations, show that the major reason for the limitation of the simulation is a high *CV*, and give clues to alternative solutions.

## Acknowledgments

This research is supported by Validus Research Inc.

## Conflict of interest

The authors declare no conflict of interest.

## Thanks

The author thanks Nancy Wang for checking against a C++ application that validated the practical usefulness of our inequality.

## Appendix A

    (* Ratio density from the Hinkley formula [2]; c is the CV, p is the correlation coefficient *)
    hinkley[x_?NumericQ, c_, p_] := 1/Sqrt[2 Pi]/c (x + 1) (1 - p)/(x^2 - 2 p x + 1)^1.5 Exp[-1/2/c^2 (x - 1)^2/(x^2 - 2 p x + 1)] (CDF[NormalDistribution[0, 1], 1/c Sqrt[(1 - p)/(1 + p)] (x + 1)/Sqrt[x^2 - 2 p x + 1]] - CDF[NormalDistribution[0, 1], -1/c Sqrt[(1 - p)/(1 + p)] (x + 1)/Sqrt[x^2 - 2 p x + 1]]) + 1/Pi Sqrt[1 - p^2]/(x^2 - 2 p x + 1) Exp[-1/c^2/(1 + p)];

    (* Cumulative distribution of the ratio *)
    H[x_, c_, p_] := Integrate[hinkley[y, c, p], {y, -Infinity, x}];

    (* Probability that the ratio lies in 0.5-1.5, plotted against the CV *)
    LogLinearPlot[Evaluate[H[1.5, x, 0.822434] - H[0.5, x, 0.822434]], {x, 0.1, 10}, PlotRange -> All, GridLines -> Full, GridLinesStyle -> Directive[Gray, Dashed], Mesh -> Automatic, ImageSize -> Full, Frame -> True];

    (* The same probability from the Marsaglia formula [3] *)
    Clear[q, marsaglia, M, MA]
    Off[NIntegrate::inumr]

    q[t_, p_, c_] := q[t, p, c] = With[{}, 1.0/c (1.0 + (1.0 - p) t/Sqrt[1.0 - p^2])/Sqrt[1.0 + t^2]]

    marsaglia[t_, p_, c_] := marsaglia[t, p, c] = With[{}, Exp[-1.0/(1.0 + p)/c^2]/Pi/(1.0 + t^2) (1.0 + q[t, p, c] Exp[q[t, p, c]^2/2.0] Evaluate[Integrate[Exp[-y^2/2.0], {y, 0.0, q[t, p, c]}]])]

    M[v_, p_, c_] := M[v, p, c] = With[{}, CDF[NormalDistribution[0, 1], 1/c (1 - p)/Sqrt[1 - p^2]] + CDF[NormalDistribution[0, 1], 1/c] - 2 CDF[NormalDistribution[0, 1], 1/c (1 - p)/Sqrt[1 - p^2]] CDF[NormalDistribution[0, 1], 1/c] + Evaluate[NIntegrate[marsaglia[u, p, c], {u, 0.0, v}]]]

    MA[v_, p_, c_] := MA[v, p, c] = With[{}, M[(v - p)/Sqrt[1.0 - p^2], p, c]];

    DistributeDefinitions[q, marsaglia, M, MA];

    LogLinearPlot[Evaluate[MA[1.5, 0.822434, x] - MA[0.5, 0.822434, x]], {x, 0.1, 10}, PlotRange -> All, GridLines -> Full, GridLinesStyle -> Directive[Gray, Dashed], Mesh -> Automatic, ImageSize -> Full, Frame -> True]

## Notes

- The nonnegativity condition can be relaxed to *f* being bounded below.