Open access peer-reviewed chapter - ONLINE FIRST

# From Asymptotic Normality to Heavy-Tailedness via Limit Theorems for Random Sums and Statistics with Random Sample Sizes

By Victor Korolev and Alexander Zeifman

Submitted: April 22nd 2019; Reviewed: September 10th 2019; Published: October 22nd 2019

DOI: 10.5772/intechopen.89659

## Abstract

This chapter contains a possible explanation of the emergence of heavy-tailed distributions observed in practice instead of the expected normal laws. The bases for this explanation are limit theorems for random sums and statistics constructed from samples with random sizes. As examples of the application of general theorems, conditions are presented for the convergence of the distributions of random sums of independent random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions. Also, conditions are presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate Student distributions. The joint asymptotic behavior of sample quantiles is also considered.

### Keywords

• random sum
• random sample size
• multivariate normal mixtures
• heavy-tailed distributions
• multivariate stable distribution
• multivariate Linnik distribution
• Mittag-Leffler distribution
• multivariate Student distribution
• sample quantiles
• AMS 2000 Subject Classification: 60F05, 60G50, 60G55, 62E20, 62G30

## 1. Introduction

In many situations related to experimental data analysis, one often comes across the following phenomenon: although conventional reasoning based on the central limit theorem of probability theory concludes that the expected distribution of observations should be normal, the statistical procedures instead expose a noticeable non-normality of real distributions. Moreover, as a rule, the observed non-normal distributions are more leptokurtic than the normal law, having sharper vertices and heavier tails. These situations are typical in financial data analysis (see, e.g., Chapter 4 in  or Chapter 8 in  and references therein), in experimental physics (see, e.g., ), and in other fields dealing with statistical analysis of experimental data. Many attempts have been made to explain this heavy-tailedness. The most significant theoretical breakthrough is usually associated with the results of B. Mandelbrot and others who proposed, instead of the standard central limit theorem, to use reasoning based on limit theorems for sums of random summands with infinite variances (see, e.g., ), resulting in non-normal stable laws as heavy-tailed models of the distributions of experimental data. However, first, in most cases the key assumption within this approach, the infiniteness of the variances of elementary summands, can hardly be believed to hold in practice and, second, although more heavy-tailed than the normal law, the real distributions often turn out to be more light-tailed than the stable laws.

In this work, in order to give a more realistic explanation of the observed non-normality of the distributions of real data, an alternative approach based on limit theorems for statistics constructed from samples with random sizes is developed. Within this approach, it becomes possible to obtain arbitrarily heavy tails of the data distributions without assuming the non-existence of the moments of the observed characteristics.

This work was inspired by the publication of the paper  in which, based on the results of , a particular case of random sums was considered. One more reason for writing this work was the recent publication , the authors of which reproduced some results of [8, 9] without citing these earlier papers.

Here we give a more general description of the transformation of the limit distribution of a sum of independent random variables or another statistic (i.e., of a measurable function of a sample) under the replacement of the non-random number of summands or the sample size by a random variable. General limit theorems are proved (Section 3). Section 4 contains some comments on heavy-tailedness of scale mixtures of normal distributions. As examples of the application of general theorems, conditions are presented for the convergence of the distributions of random sums of independent random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions (Section 5). Also, conditions are presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate Student distributions (Section 6).

In Section 7, the joint asymptotic behavior of sample quantiles is considered. In applied research related to risk analysis, a characteristic such as VaR (Value-at-Risk) is very popular. Formally, VaR is a certain quantile of the observed risky value. Therefore, the joint asymptotic behavior of sample quantiles in samples with random sizes is considered in detail in Section 7 as one more example of the application of the general theorem proved in Section 3. In this section, we show how the proposed technique can be applied to the continuous-time case assuming that the sample size increases in time following a Cox process. One more interpretation of this setting is related to an important case where the sample size has the mixed Poisson distribution.

In classical problems of mathematical statistics, the size of the available sample, that is, the number of available observations, is traditionally assumed to be deterministic. In the asymptotic settings, it plays the role of an infinitely increasing known parameter. At the same time, in practice very often the data to be analyzed are collected or registered during a certain period of time, and the flow of informative events, each of which brings the next observation, forms a random point process. Therefore, the number of available observations is unknown till the end of the process of their registration and also must be treated as a (random) observation. For example, this is so in insurance statistics where, during different accounting periods, different numbers of insurance events (insurance claims and/or insurance contracts) occur and in high-frequency financial statistics where the number of events in a limit order book during a time unit essentially depends on the intensity of order flows. Moreover, contemporary statistical procedures of insurance and financial mathematics do take this circumstance into consideration as one of the possible ways of dealing with heavy tails. However, in other fields such as medical statistics or quality control, this approach has not become conventional; yet, the number of patients with a certain disease varies from month to month due to seasonal factors or from year to year due to some epidemic reasons, and the number of failed items varies from lot to lot. In these cases, the number of available observations as well as the observations themselves are unknown beforehand and should be treated as random to avoid underestimation of risks or error probabilities.

Therefore, it is quite reasonable to study the asymptotic behavior of general statistics constructed from samples with random sizes for the purpose of construction of suitable and reasonable asymptotic approximations. As this is so, to obtain non-trivial asymptotic distributions in limit theorems of probability theory and mathematical statistics, an appropriate centering and normalization of random variables and vectors under consideration must be used. It should be especially noted that to obtain reasonable approximation to the distribution of the basic statistics, both centering and normalizing values should be non-random. Otherwise, the approximate distribution becomes random itself and, for example, the problem of evaluation of quantiles or significance levels becomes senseless.

In asymptotic settings, statistics constructed from samples with random sizes are special cases of random sequences with random indices. The randomness of indices usually leads to the limit distributions for the corresponding random sequences being heavy-tailed even in the situations where the distributions of non-randomly indexed random sequences are asymptotically normal (see, e.g., [2, 8, 10]).

Many authors noted that the asymptotic properties of statistics constructed from samples with random sizes differ from those of the asymptotically normal statistics in the classical sense. To illustrate this, we will repeatedly cite  where the following example is given. Let $X_{(1)},\ldots,X_{(n)}$ be order statistics constructed from the sample $X_1,\ldots,X_n$. It is well known (see, e.g., ) that in the standard situation the sample median is asymptotically normal. At the same time, in  it was demonstrated that if the sample size $N_n$ has the geometric distribution with expectation $n$, then the normalized sample median $\sqrt{n}\,\bigl(X_{([N_n/2]+1)}-\operatorname{med}X_1\bigr)$ has the limit distribution function

$$\Psi(x)=\frac12\Bigl(1+\frac{x}{\sqrt{2+x^2}}\Bigr) \tag{1}$$

(the Student distribution with two degrees of freedom) which has such heavy tails that its moments of orders $\delta\ge2$ do not exist. In general, as it was shown in , if a statistic that is asymptotically normal in the traditional sense is constructed on the basis of a sample with random size having negative binomial distribution, then instead of the expected normal law, the Student distribution with power-type decreasing heavy tails appears as an asymptotic law for this statistic.
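To make the phrase "power-type decreasing heavy tails" concrete, the following minimal sketch evaluates the distribution function (1) and compares its upper tail with the standard normal tail. The function names `student2_cdf` and `normal_tail` are illustrative choices and do not come from the chapter; the comparison with $1/(2x^2)$ follows from a Taylor expansion of (1) for large $x$.

```python
import math


def student2_cdf(x: float) -> float:
    """Distribution function (1): Student law with two degrees of freedom."""
    return 0.5 * (1.0 + x / math.sqrt(2.0 + x * x))


def normal_tail(x: float) -> float:
    """P(X > x) for a standard normal random variable X."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


# The Student tail decays like 1 / (2 x^2), the normal tail like exp(-x^2 / 2).
for x in (2.0, 5.0, 10.0):
    t_tail = 1.0 - student2_cdf(x)
    print(f"x={x:5.1f}  Student tail={t_tail:.3e}  "
          f"1/(2x^2)={1.0 / (2.0 * x * x):.3e}  "
          f"normal tail={normal_tail(x):.3e}")
```

Because the tail behaves like $1/(2x^2)$, the integral defining the second moment diverges, which agrees with the statement that moments of orders $\delta\ge2$ do not exist.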

## 2. Notation and definitions: auxiliary results

Let $r\in\mathbb{N}$. We will consider random elements taking values in the $r$-dimensional Euclidean space $\mathbb{R}^r$.

Assume that all the random variables and random vectors are defined on one and the same probability space $(\Omega,\mathcal{A},\mathsf{P})$. By the measurability of a random field, we will mean its measurability as a function of two variates, an elementary outcome and a parameter, with respect to the Cartesian product of the $\sigma$-algebra $\mathcal{A}$ and the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^r)$ of subsets of $\mathbb{R}^r$.

The distribution of a random vector $\xi$ with respect to the measure $\mathsf{P}$ will be denoted $\mathcal{L}(\xi)$. The weak convergence, the coincidence of distributions, and the convergence in probability with respect to a specified probability measure will be denoted by the symbols $\Longrightarrow$, $\stackrel{d}{=}$, and $\stackrel{P}{\longrightarrow}$, respectively.

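Let $\Sigma$ be a positive definite matrix. The normal distribution in $\mathbb{R}^r$ with zero vector of expectations and covariance matrix $\Sigma$ will be denoted $\Phi_\Sigma$. This distribution is defined by its density

$$\phi(x)=\frac{\exp\bigl\{-\frac12x^{\top}\Sigma^{-1}x\bigr\}}{(2\pi)^{r/2}|\Sigma|^{1/2}},\quad x\in\mathbb{R}^r.$$

The characteristic function $f^{(Y)}(t)$ of a random vector $Y$ such that $\mathcal{L}(Y)=\Phi_\Sigma$ has the form

$$f^{(Y)}(t)\equiv\mathsf{E}\exp\{it^{\top}Y\}=\exp\bigl\{-\tfrac12t^{\top}\Sigma t\bigr\},\quad t\in\mathbb{R}^r. \tag{2}$$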

Consider a sequence $\{S_n\}_{n\ge1}$ of random elements taking values in $\mathbb{R}^r$. Let $\Xi(\mathbb{R}^r)$ be the set of all nonsingular linear operators acting from $\mathbb{R}^r$ to $\mathbb{R}^r$. The identity operator acting from $\mathbb{R}^r$ to $\mathbb{R}^r$ will be denoted $I_r$. Assume that there exist sequences $\{B_n\}_{n\ge1}$ of operators from $\Xi(\mathbb{R}^r)$ and $\{a_n\}_{n\ge1}$ of elements from $\mathbb{R}^r$ such that

$$Y_n\equiv B_n^{-1}(S_n-a_n)\Longrightarrow Y\quad(n\to\infty), \tag{3}$$

where $Y$ is a random element whose distribution with respect to $\mathsf{P}$ will be denoted $H$, $H=\mathcal{L}(Y)$.

Along with $\{S_n\}_{n\ge1}$, consider a sequence of integer-valued positive random variables $\{N_n\}_{n\ge1}$ such that for each $n\ge1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k\ge1}$. Let $c_n\in\mathbb{R}^r$, $D_n\in\Xi(\mathbb{R}^r)$, $n\ge1$. Now, we will formulate sufficient conditions for the weak convergence of the distributions of the random elements $Z_n=D_n^{-1}(S_{N_n}-c_n)$ as $n\to\infty$.

For $g\in\mathbb{R}^r$, denote $W_n(g)=D_n^{-1}(B_{N_n}g+a_{N_n}-c_n)$. In [13, 14], the following theorem was proved, which establishes sufficient conditions of the weak convergence of multivariate random sequences with independent random indices under operator normalization.

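Theorem 1 . Let $\|D_n^{-1}\|\to0$ as $n\to\infty$ and let the sequence of random variables $\{\|D_n^{-1}B_{N_n}\|\}_{n\ge1}$ be tight. Assume that there exist a random element $Y$ with distribution $H$ and an $r$-dimensional random field $W(g)$, $g\in\mathbb{R}^r$, such that (3) holds and

$$W_n(g)\Longrightarrow W(g)\quad(n\to\infty)$$

for $H$-almost all $g\in\mathbb{R}^r$. Then the random field $W(g)$ is measurable, linearly depends on $g$ and

$$Z_n\Longrightarrow W(Y)\quad(n\to\infty),$$

where the random field $W(\cdot)$ and the random element $Y$ are independent.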

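Now, consider an auxiliary statement dealing with the identifiability of a special family of mixtures of multivariate normal distributions. Let $U$ be a nonnegative random variable. The symbol $\mathsf{E}\Phi_{U\Sigma}(\cdot)$ will denote the distribution which for each Borel set $A$ in $\mathbb{R}^r$ is defined as

$$\mathsf{E}\Phi_{U\Sigma}(A)=\int_0^{\infty}\Phi_{u\Sigma}(A)\,d\mathsf{P}(U<u).$$

Let $\mathcal{U}$ be the set of all nonnegative random variables.

It is easy to see that if $Y$ is a random vector such that $\mathcal{L}(Y)=\Phi_\Sigma$ independent of $U$, then $\mathsf{E}\Phi_{U\Sigma}=\mathcal{L}(\sqrt{U}\,Y)$.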

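Lemma 1. Whatever nonsingular covariance matrix $\Sigma$ is, the family of distributions $\{\mathsf{E}\Phi_{U\Sigma}(\cdot):U\in\mathcal{U}\}$ is identifiable in the sense that if $U_1\in\mathcal{U}$, $U_2\in\mathcal{U}$, and

$$\mathsf{E}\Phi_{U_1\Sigma}(A)=\mathsf{E}\Phi_{U_2\Sigma}(A) \tag{4}$$

for any set $A\in\mathcal{B}(\mathbb{R}^r)$, then $U_1\stackrel{d}{=}U_2$.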

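The proof of this lemma is very simple. If $U\in\mathcal{U}$, then the characteristic function $v_U(t)$ corresponding to the distribution $\mathsf{E}\Phi_{U\Sigma}$ has the form

$$v_U(t)=\int_0^{\infty}\exp\bigl\{-\tfrac12t^{\top}(u\Sigma)t\bigr\}\,d\mathsf{P}(U<u)=\int_0^{\infty}\exp\bigl\{-\tfrac12u\,t^{\top}\Sigma t\bigr\}\,d\mathsf{P}(U<u)=\int_0^{\infty}\exp\{-us\}\,d\mathsf{P}(U<u),\quad s=\tfrac12t^{\top}\Sigma t,\quad t\in\mathbb{R}^r. \tag{5}$$

But on the right-hand side of (5), there is the Laplace-Stieltjes transform of the random variable $U$. From (4), it follows that $v_{U_1}(t)\equiv v_{U_2}(t)$, whence by virtue of (5) the Laplace-Stieltjes transforms of the random variables $U_1$ and $U_2$ coincide, whence, in turn, it follows that $U_1\stackrel{d}{=}U_2$. The lemma is proved.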

Remark 1. When proving Lemma 1, we established a simple but useful by-product result: if $\psi(s)$ is the Laplace-Stieltjes transform of the random variable $U$, then the characteristic function $v_U(t)$ corresponding to the distribution $\mathsf{E}\Phi_{U\Sigma}$ has the form

$$v_U(t)=\psi\bigl(\tfrac12t^{\top}\Sigma t\bigr),\quad t\in\mathbb{R}^r. \tag{6}$$
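The identity in Remark 1 can be checked numerically in a simple special case. The sketch below assumes, purely for illustration, that $r=1$, $\Sigma=1$ and that $U$ has the standard exponential distribution, so that $\psi(s)=1/(1+s)$; it then compares a midpoint-rule evaluation of the integral in (5) with $\psi(t^2/2)$. The function names and the quadrature settings are hypothetical choices, not part of the chapter.

```python
import math


def laplace_transform_exp1(s: float) -> float:
    """Laplace-Stieltjes transform of U ~ Exp(1): psi(s) = 1 / (1 + s)."""
    return 1.0 / (1.0 + s)


def mixture_cf_by_quadrature(t: float, sigma2: float = 1.0,
                             upper: float = 60.0,
                             steps: int = 200_000) -> float:
    """Midpoint rule for int_0^upper exp(-u * sigma2 * t^2 / 2) * exp(-u) du.

    The factor exp(-u) is the density of the assumed mixing law Exp(1);
    the part of the integral beyond `upper` is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.exp(-0.5 * u * sigma2 * t * t) * math.exp(-u)
    return total * h


for t in (0.0, 0.5, 1.0, 3.0):
    lhs = mixture_cf_by_quadrature(t)
    rhs = laplace_transform_exp1(0.5 * t * t)
    print(f"t={t:3.1f}  quadrature={lhs:.6f}  psi(t^2/2)={rhs:.6f}")
```

In this special case the mixture is a Laplace distribution, whose characteristic function $1/(1+t^2/2)$ is recovered on both sides.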

## 3. General theorems

First, consider the case where the random vectors $\{S_n\}_{n\ge1}$ are formed as growing sums of independent random vectors. Namely, let $X_1,X_2,\ldots$ be independent $\mathbb{R}^r$-valued random vectors, and for $n\in\mathbb{N}$ let

$$S_n=X_1+\ldots+X_n.$$

Consider a sequence of integer-valued positive random variables $\{N_n\}_{n\ge1}$ such that for each $n\ge1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k\ge1}$. Let $\{b_n\}_{n\ge1}$ be an infinitely increasing sequence of positive numbers such that

$$\mathcal{L}\Bigl(\frac{S_n}{\sqrt{b_n}}\Bigr)\Longrightarrow\Phi_\Sigma \tag{7}$$

as $n\to\infty$, where $\Sigma$ is some positive definite matrix.

Let $\{d_n\}_{n\ge1}$ be an infinitely increasing sequence of positive numbers. As $Z_n$ take the scalar normalized random vector

$$Z_n=\frac{S_{N_n}}{\sqrt{d_n}}.$$

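Theorem 2. Let $N_n\to\infty$ in probability as $n\to\infty$. Assume that the random variables $X_1,X_2,\ldots$ satisfy condition (7) with an asymptotic covariance matrix $\Sigma$. Then a distribution $F$ such that

$$\mathcal{L}(Z_n)\Longrightarrow F\quad(n\to\infty), \tag{8}$$

exists if and only if there exists a distribution function $V(x)$ satisfying the conditions

(i) $V(x)=0$ for $x<0$;

(ii) for any $A\in\mathcal{B}(\mathbb{R}^r)$,

$$F(A)=\int_0^{\infty}\Phi_{u\Sigma}(A)\,dV(u);$$

(iii) $\mathsf{P}(b_{N_n}<d_nx)\Longrightarrow V(x)$, $n\to\infty$.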

Proof. The “if” part. We will essentially exploit Theorem 1. For each $n\ge1$, set $a_n=c_n=0$, $B_n=\sqrt{b_n}\,I_r$, $D_n=\sqrt{d_n}\,I_r$. For the convenience of notation, introduce a random variable $U$ with the distribution function $V(x)$. Note that the conditions of the theorem guarantee the tightness of the sequence of random variables

$$\|D_n^{-1}B_{N_n}\|=\sqrt{\frac{b_{N_n}}{d_n}},\quad n=1,2,\ldots,$$

implied by its weak convergence to the random variable $\sqrt{U}$. Further, in the case under consideration, we have $W_n(g)=\sqrt{b_{N_n}/d_n}\cdot g$, $g\in\mathbb{R}^r$. Therefore, the condition $b_{N_n}/d_n\Longrightarrow U$ implies $W_n(g)\Longrightarrow\sqrt{U}\,g$ for all $g\in\mathbb{R}^r$. Condition (7) means that in the case under consideration, $H=\Phi_\Sigma$. Hence, by Theorem 1, $Z_n\Longrightarrow\sqrt{U}\,Y$ where $Y$ is a random element with the distribution $\Phi_\Sigma$ independent of the random variable $U$. It is easy to see that the distribution of the random element $\sqrt{U}\,Y$ coincides with $\mathsf{E}\Phi_{U\Sigma}$ where the matrix $\Sigma$ satisfies (7).

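The “only if” part. Let condition (8) hold. Make sure that the sequence $\{\|D_n^{-1}B_{N_n}\|\}_{n\ge1}$ is tight. Let $Y$ be a random element with the distribution $\Phi_\Sigma$. There exist $\delta>0$ and $R>0$ such that

$$\mathsf{P}(\|Y\|>R)>\delta. \tag{9}$$

For $R$ specified above and an arbitrary $x>0$, we have

$$\begin{aligned}
\mathsf{P}(\|Z_n\|>x)&\ge\mathsf{P}\Bigl(\Bigl\|\frac{S_{N_n}}{\sqrt{d_n}}\Bigr\|>x;\ \Bigl\|\frac{S_{N_n}}{\sqrt{b_{N_n}}}\Bigr\|>R\Bigr)\\
&=\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>x\cdot\Bigl\|\frac{S_{N_n}}{\sqrt{b_{N_n}}}\Bigr\|^{-1};\ \Bigl\|\frac{S_{N_n}}{\sqrt{b_{N_n}}}\Bigr\|>R\Bigr)\\
&\ge\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>\frac{x}{R};\ \Bigl\|\frac{S_{N_n}}{\sqrt{b_{N_n}}}\Bigr\|>R\Bigr)\\
&=\sum_{k=1}^{\infty}\mathsf{P}(N_n=k)\,\mathsf{P}\Bigl(\sqrt{\frac{b_k}{d_n}}>\frac{x}{R};\ \Bigl\|\frac{S_k}{\sqrt{b_k}}\Bigr\|>R\Bigr)\\
&=\sum_{k=1}^{\infty}\mathsf{P}(N_n=k)\,\mathsf{P}\Bigl(\sqrt{\frac{b_k}{d_n}}>\frac{x}{R}\Bigr)\,\mathsf{P}\Bigl(\Bigl\|\frac{S_k}{\sqrt{b_k}}\Bigr\|>R\Bigr)
\end{aligned} \tag{10}$$

(the last equality holds since any constant is independent of any random variable). Since by (7) the convergence $S_k/\sqrt{b_k}\Longrightarrow Y$ takes place as $k\to\infty$, from (9) it follows that there exists a number $k_0=k_0(R,\delta)$ such that

$$\mathsf{P}\Bigl(\Bigl\|\frac{S_k}{\sqrt{b_k}}\Bigr\|>R\Bigr)>\frac{\delta}{2}$$

for all $k>k_0$. Therefore, continuing (10) we obtain

$$\begin{aligned}
\mathsf{P}(\|Z_n\|>x)&\ge\frac{\delta}{2}\sum_{k=k_0+1}^{\infty}\mathsf{P}(N_n=k)\,\mathsf{P}\Bigl(\sqrt{\frac{b_k}{d_n}}>\frac{x}{R}\Bigr)\\
&=\frac{\delta}{2}\Bigl[\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>\frac{x}{R}\Bigr)-\sum_{k=1}^{k_0}\mathsf{P}(N_n=k)\,\mathsf{P}\Bigl(\sqrt{\frac{b_k}{d_n}}>\frac{x}{R}\Bigr)\Bigr]\\
&\ge\frac{\delta}{2}\Bigl[\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>\frac{x}{R}\Bigr)-\mathsf{P}(N_n\le k_0)\Bigr].
\end{aligned}$$

Hence,

$$\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>\frac{x}{R}\Bigr)\le\frac{2}{\delta}\,\mathsf{P}(\|Z_n\|>x)+\mathsf{P}(N_n\le k_0). \tag{11}$$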

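From the condition $N_n\stackrel{P}{\longrightarrow}\infty$ as $n\to\infty$, it follows that for any $\epsilon>0$ there exists an $n_0=n_0(\epsilon)$ such that $\mathsf{P}(N_n\le k_0)<\epsilon$ for all $n\ge n_0$. Therefore, with the account of the tightness of the sequence $\{Z_n\}_{n\ge1}$ that follows from its weak convergence to the random element $Z$ with $\mathcal{L}(Z)=F$ implied by (8), relation (11) implies

$$\lim_{x\to\infty}\sup_{n\ge n_0(\epsilon)}\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>\frac{x}{R}\Bigr)\le\epsilon, \tag{12}$$

whatever $\epsilon>0$ is. Now assume that the sequence

$$\|D_n^{-1}B_{N_n}\|=\sqrt{\frac{b_{N_n}}{d_n}},\quad n=1,2,\ldots$$

is not tight. In that case, there exist an $\alpha>0$ and sequences $\mathcal{N}$ of natural and $\{x_n\}_{n\in\mathcal{N}}$ of real numbers satisfying the conditions $x_n\uparrow\infty$ ($n\to\infty$, $n\in\mathcal{N}$) and

$$\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>x_n\Bigr)>\alpha,\quad n\in\mathcal{N}. \tag{13}$$

But, according to (12), for any $\epsilon>0$ there exist $M=M(\epsilon)$ and $n_0=n_0(\epsilon)$ such that

$$\sup_{n\ge n_0(\epsilon)}\mathsf{P}\Bigl(\sqrt{\frac{b_{N_n}}{d_n}}>M(\epsilon)\Bigr)\le2\epsilon. \tag{14}$$

Choose $\epsilon<\alpha/2$ where $\alpha$ is the number from (13). Then for all $n\in\mathcal{N}$ large enough, in accordance with (13), the inequality opposite to (14) must hold. The obtained contradiction by the Prokhorov theorem proves the tightness of the sequence $\{\|D_n^{-1}B_{N_n}\|\}_{n\ge1}$ or, which in this case is the same, of the sequence $\{b_{N_n}/d_n\}_{n\ge1}$.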

Introduce the set $\mathcal{W}(Z)$ containing all nonnegative random variables $U$ such that $\mathsf{P}(Z\in A)=\mathsf{E}\Phi_{U\Sigma}(A)$ for any $A\in\mathcal{B}(\mathbb{R}^r)$. Let $L(\cdot,\cdot)$ be any probability metric that metrizes weak convergence in the space of random variables, or, which is the same in this context, in the space of distribution functions, say, the Lévy metric or the smoothed Kolmogorov distance. If $X_1$ and $X_2$ are random variables with the distribution functions $F_1$ and $F_2$ respectively, then we identify $L(X_1,X_2)$ and $L(F_1,F_2)$. Show that there exists a sequence of random variables $\{U_n\}_{n\ge1}$, $U_n\in\mathcal{W}(Z)$, such that

$$L\Bigl(\frac{b_{N_n}}{d_n},U_n\Bigr)\to0\quad(n\to\infty). \tag{15}$$

Denote

$$\beta_n=\inf\Bigl\{L\Bigl(\frac{b_{N_n}}{d_n},U\Bigr):\ U\in\mathcal{W}(Z)\Bigr\}.$$

Prove that $\beta_n\to0$ as $n\to\infty$. Assume the contrary. In that case, $\beta_n\ge\delta$ for some $\delta>0$ and all $n$ from some subsequence $\mathcal{N}$ of natural numbers. Choose a subsequence $\mathcal{N}_1\subseteq\mathcal{N}$ so that the sequence $\{b_{N_n}/d_n\}_{n\in\mathcal{N}_1}$ weakly converges to a random variable $U$ (this is possible due to the tightness of the family $\{b_{N_n}/d_n\}_{n\ge1}$ established above). But then $W_n(g)\Longrightarrow\sqrt{U}\,g$ ($n\to\infty$, $n\in\mathcal{N}_1$) for any $g\in\mathbb{R}^r$. Applying Theorem 1 to $n\in\mathcal{N}_1$ with condition (7) playing the role of condition (3), we make sure that $U\in\mathcal{W}(Z)$, since condition (8) provides the coincidence of the limits of all weakly convergent subsequences. So, we arrive at a contradiction with the assumption that $\beta_n\ge\delta$ for all $n\in\mathcal{N}_1$. Hence, $\beta_n\to0$ as $n\to\infty$.

For any $n=1,2,\ldots$, choose a random variable $U_n$ from $\mathcal{W}(Z)$ satisfying the condition

$$L\Bigl(\frac{b_{N_n}}{d_n},U_n\Bigr)\le\beta_n+\frac1n.$$

This sequence obviously satisfies condition (15). Now consider the structure of the set $\mathcal{W}(Z)$. This set contains all the random variables defining the family of special mixtures of multivariate normal laws considered in Lemma 1, according to which this family is identifiable. So, whatever a random element $Z$ is, the set $\mathcal{W}(Z)$ contains at most one element. Therefore, actually condition (15) is equivalent to

$$\frac{b_{N_n}}{d_n}\Longrightarrow U\quad(n\to\infty),$$

that is, to condition (iii) of the theorem. The theorem is proved.

Corollary 1. Under the conditions of Theorem 2, non-randomly normalized random sums $S_{N_n}/\sqrt{d_n}$ are asymptotically normal with some covariance matrix $\Sigma'$ if and only if there exists a number $c>0$ such that

$$\frac{b_{N_n}}{d_n}\Longrightarrow c\quad(n\to\infty).$$

Moreover, in this case, $\Sigma'=c\Sigma$.

This statement immediately follows from Theorem 2 with the account of Lemma 1.
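The contrast between Theorem 2 and Corollary 1 can be seen in a small Monte Carlo experiment. The sketch below makes several illustrative assumptions that are not taken from the chapter: $r=1$, summands uniform on $[-1,1]$, $b_n=d_n=n$, and $N_n$ geometric with mean $n$, so that $b_{N_n}/d_n$ converges weakly to a standard exponential random variable rather than to a constant. The empirical kurtosis is used only as a rough indicator: it should be close to 3 for the non-random sample size and close to 6 (the Laplace value) for the geometric one.

```python
import math
import random


def geometric(p: float, rng: random.Random) -> int:
    """Sample N on {1, 2, ...} with P(N = k) = p * (1 - p)**(k - 1)."""
    u = 1.0 - rng.random()          # u lies in (0, 1], so log(u) is finite
    return 1 + int(math.log(u) / math.log(1.0 - p))


def centered_uniform_sum(k: int, rng: random.Random) -> float:
    """Sum of k i.i.d. summands uniform on [-1, 1] (finite variance 1/3)."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(k))


def kurtosis(sample: list) -> float:
    """Empirical fourth standardized moment (equals 3 for the normal law)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    return m4 / (m2 * m2)


def simulate(n: int = 50, reps: int = 20_000, seed: int = 1):
    """Return the kurtosis of S_n / sqrt(n) and of S_{N_n} / sqrt(n)."""
    rng = random.Random(seed)
    scale = math.sqrt(n)
    fixed = [centered_uniform_sum(n, rng) / scale for _ in range(reps)]
    rand = [centered_uniform_sum(geometric(1.0 / n, rng), rng) / scale
            for _ in range(reps)]
    return kurtosis(fixed), kurtosis(rand)


k_fixed, k_random = simulate()
print(f"kurtosis, fixed n:     {k_fixed:.2f}")
print(f"kurtosis, geometric N: {k_random:.2f}")
```

All summands here have finite variance, so the heavier tails in the second case come entirely from the randomness of the number of summands.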

Now consider a formally more general setting.

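Let $N_1,N_2,\ldots$ and $W_1,W_2,\ldots$ be random variables and random vectors, respectively, such that for each $n\ge1$ the random variable $N_n$ takes only natural values and is independent of the sequence $W_1,W_2,\ldots$. Let

$$T_n=T_n(W_1,\ldots,W_n)=\bigl(T_{n,1}(W_1,\ldots,W_n),\ldots,T_{n,r}(W_1,\ldots,W_n)\bigr)^{\top}$$

be a statistic taking values in $\mathbb{R}^r$, $r\ge1$. For each $n\ge1$ define a random vector (random element) $T_{N_n}$ by setting

$$T_{N_n}(\omega)=T_{N_n(\omega)}\bigl(W_1(\omega),\ldots,W_{N_n(\omega)}(\omega)\bigr)$$

for every elementary outcome $\omega\in\Omega$.

We shall say that a statistic $T_n$ is asymptotically normal with the asymptotic covariance matrix $\Sigma$ if there exists a non-random $r$-dimensional vector $t$ such that

$$\mathcal{L}\bigl(\sqrt{n}\,(T_n-t)\bigr)\Longrightarrow\Phi_\Sigma\quad(n\to\infty). \tag{16}$$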

Examples of asymptotically normal statistics are well known. Under certain conditions, the property of asymptotic normality is inherent in maximum likelihood estimators, sample moments, sample quantiles, etc.

Our nearest aim is to describe the asymptotic behavior of the random elements TNn, that is, of statistics constructed from samples with random sizes Nn.

Again let $\{d_n\}_{n\ge1}$ be an infinitely increasing sequence of positive numbers. Now set

$$Z_n=\sqrt{d_n}\,(T_{N_n}-t).$$

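Theorem 3. Let $N_n\to\infty$ in probability as $n\to\infty$. Assume that a statistic $T_n$ is asymptotically normal in the sense of (16) with an asymptotic covariance matrix $\Sigma$. Then a distribution $F$ such that

$$\mathcal{L}(Z_n)\Longrightarrow F\quad(n\to\infty),$$

exists if and only if there exists a distribution function $V(x)$ satisfying the conditions

(i) $V(x)=0$ for $x<0$;

(ii) for any $A\in\mathcal{B}(\mathbb{R}^r)$,

$$F(A)=\int_0^{\infty}\Phi_{u^{-1}\Sigma}(A)\,dV(u);$$

(iii) $\mathsf{P}(N_n<d_nx)\Longrightarrow V(x)$, $n\to\infty$.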

The proof of Theorem 3 relies on Theorem 1 with (16) playing the role of (3) and Lemma 1 and differs from the proof of Theorem 2 only in that $b_{N_n}/d_n$ is replaced by $d_n/N_n$.

Corollary 2. Under the conditions of Theorem 3 the statistic $T_{N_n}$ is asymptotically normal with some covariance matrix $\Sigma'$ if and only if there exists a number $c>0$ such that

$$\frac{N_n}{d_n}\Longrightarrow c\quad(n\to\infty).$$

Moreover, in this case, $\Sigma'=c^{-1}\Sigma$.

This statement immediately follows from Theorem 3 with the account of Lemma 1.

## 4. Some remarks on the heavy-tailedness of scale mixtures of normals

The one-dimensional marginals of the multivariate limit law in Theorems 2 and 3 are scale mixtures of normals with zero means of the form $\mathsf{E}\Phi(x/U)$, $x\in\mathbb{R}$, where $\Phi(x)$ is the standard normal distribution function and $U$ is a nonnegative random variable. It turns out, although it is far from evident, that these distributions are always leptokurtic, having a sharper vertex and heavier tails than the normal law itself.

It is easy to see that

$$\mathsf{E}\Phi(x/U)=\mathsf{P}(XU<x),\quad x\in\mathbb{R},$$

where $X$ is a standard normal variable independent of $U$. First, as a measure of leptokurtosity, consider the excess coefficient which is traditionally used in (descriptive) statistics. Recall that for a random variable $Y$ with $\mathsf{E}Y^4<\infty$, the excess coefficient (kurtosis) $\kappa(Y)$ is defined as

$$\kappa(Y)=\mathsf{E}\Bigl(\frac{Y-\mathsf{E}Y}{\sqrt{\mathsf{D}Y}}\Bigr)^4.$$

If $\mathsf{P}(X<x)=\Phi(x)$, then $\kappa(X)=3$. Densities with sharper vertices (and, respectively, with heavier tails) than the normal density have $\kappa>3$, and $\kappa<3$ for densities with flatter vertices.

Lemma 2. Let $X$ and $U$ be independent random variables with finite fourth moments; moreover, let $\mathsf{E}X=0$ and $\mathsf{P}(U\ge0)=1$. Then

$$\kappa(XU)\ge\kappa(X).$$

Furthermore, $\kappa(XU)=\kappa(X)$ if and only if $\mathsf{P}(U=\mathrm{const})=1$.

For the proof see .

So, if $X$ is a standard normal random variable and $U$ is a nonnegative random variable with $\mathsf{E}U^4<\infty$ independent of $X$, then $\kappa(XU)\ge3$ and $\kappa(XU)=3$ if and only if $U$ is non-random.

Using the Jensen inequality, we can easily obtain one more inequality directly connecting the tails of the normal mixtures with the tails of the normal distribution.

Lemma 3. Assume that the random variable $U$ satisfies the normalization condition $\mathsf{E}U^{-1}=1$. Then

$$1-\mathsf{E}\Phi(x/U)\ge1-\Phi(x),\quad x>0.$$

From Lemma 3, it follows that if $X$ is the standard normal random variable and $U$ is a nonnegative random variable independent of $X$ with $\mathsf{E}U^{-1}=1$, then for any $x\ge0$

$$\mathsf{P}(|XU|\ge x)\ge\mathsf{P}(|X|\ge x)=2[1-\Phi(x)],$$

that is, scale mixtures of normal laws are always more leptokurtic and have heavier tails than normal laws themselves.
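Lemmas 2 and 3 can be illustrated with exact arithmetic for a simple mixing law. The sketch below assumes a hypothetical two-point random variable $U$ taking the value $0.5$ with probability $1/3$ and the value $2$ with probability $2/3$, chosen only because it satisfies the normalization $\mathsf{E}U^{-1}=1$; the names `U_VALUES`, `U_PROBS` and the helper functions are illustrative.

```python
import math


def Phi(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical two-point mixing law: U = 0.5 w.p. 1/3 and U = 2 w.p. 2/3,
# so that E[1/U] = (1/3) * 2 + (2/3) * 0.5 = 1 as Lemma 3 requires.
U_VALUES = (0.5, 2.0)
U_PROBS = (1.0 / 3.0, 2.0 / 3.0)


def mixture_two_sided_tail(x: float) -> float:
    """P(|X * U| >= x) = 2 * (1 - E Phi(x / U)) for standard normal X."""
    return sum(p * 2.0 * (1.0 - Phi(x / u))
               for u, p in zip(U_VALUES, U_PROBS))


def normal_two_sided_tail(x: float) -> float:
    """P(|X| >= x) = 2 * (1 - Phi(x))."""
    return 2.0 * (1.0 - Phi(x))


def mixture_kurtosis() -> float:
    """kappa(XU) = 3 * E[U^4] / (E[U^2])^2 when X is standard normal."""
    eu2 = sum(p * u ** 2 for u, p in zip(U_VALUES, U_PROBS))
    eu4 = sum(p * u ** 4 for u, p in zip(U_VALUES, U_PROBS))
    return 3.0 * eu4 / (eu2 * eu2)


for x in (0.5, 1.0, 2.0, 4.0):
    print(f"x={x}: mixture tail={mixture_two_sided_tail(x):.5f}  "
          f"normal tail={normal_two_sided_tail(x):.5f}")
print(f"kurtosis of XU: {mixture_kurtosis():.4f} (normal law: 3)")
```

For this particular $U$ the kurtosis works out to about 4.24, and the mixture tail dominates the normal tail at every tested point, as the two lemmas predict.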

The class of scale mixtures of normal laws is very rich and involves distributions with various rates of tail decay. For example, this class contains Student distributions with arbitrary (not necessarily integer) number of degrees of freedom (the Cauchy distribution included), symmetric stable distributions (see the “multiplication theorem” 3.3.1 in ), symmetric fractional stable distributions (see ), symmetrized gamma distributions with arbitrary shape and scale parameters (see ), and symmetrized Weibull distributions with shape parameters belonging to the interval $(0,1]$ (see [17, 18]). As an example, in Section 6, we will discuss the conditions for the convergence of the distributions of the statistics constructed from samples with random sizes to the multivariate Student distribution.

## 5. Convergence of the distributions of random sums of random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions

### 5.1 Convergence of the distributions of random sums of random vectors to multivariate stable laws

Let $\Sigma$ be a positive definite $r\times r$-matrix, $\alpha\in(0,2]$. A random vector $Z_{\alpha,\Sigma}$ is said to have the (centered) elliptically contoured stable distribution $G_{\alpha,\Sigma}$ with characteristic exponent $\alpha$, if its characteristic function $g_{\alpha,\Sigma}(t)$ has the form

$$g_{\alpha,\Sigma}(t)\equiv\mathsf{E}\exp\{it^{\top}Z_{\alpha,\Sigma}\}=\exp\{-(t^{\top}\Sigma t)^{\alpha/2}\},\quad t\in\mathbb{R}^r.$$

Univariate stable distributions are popular examples of heavy-tailed distributions. Their moments of orders $\delta\ge\alpha$ do not exist (the only exception is the normal law corresponding to $\alpha=2$). Stable laws and only they can be limit distributions for sums of a non-random number of independent identically distributed random variables with infinite variance under linear normalization. Here it will be shown that they also can be limiting for random sums of random vectors with finite covariance matrices. The result of this subsection generalizes the main theorem of  to a multivariate case.

By $\zeta_\alpha$, we will denote a positive random variable with the one-sided stable distribution corresponding to the characteristic function

$$g_\alpha(t)=\exp\bigl\{-|t|^{\alpha}\exp\{-\tfrac12i\pi\alpha\,\mathrm{sign}\,t\}\bigr\},\quad t\in\mathbb{R},$$

with $0<\alpha\le1$ (for more details see  or ).

Let $\alpha\in(0,2]$. It is known that, if $Y$ is a random vector such that $\mathcal{L}(Y)=\Phi_\Sigma$ independent of the random variable $\zeta_{\alpha/2}$, then

$$Z_{\alpha,\Sigma}\stackrel{d}{=}\sqrt{\zeta_{\alpha/2}}\,Y \tag{17}$$

(see Proposition 2.5.2 in ). In other words,

$$G_{\alpha,\Sigma}=\mathsf{E}\Phi_{\zeta_{\alpha/2}\Sigma}. \tag{18}$$

As in Section 3, let $X_1,X_2,\ldots$ be independent $\mathbb{R}^r$-valued random vectors. For $n\in\mathbb{N}$, denote $S_n=X_1+\ldots+X_n$. Consider a sequence of integer-valued positive random variables $\{N_n\}_{n\ge1}$ such that for each $n\ge1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k\ge1}$. Let $\{b_n\}_{n\ge1}$ be an infinitely increasing sequence of positive numbers providing convergence (7) with some positive definite matrix $\Sigma$.

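Theorem 4. Let $N_n\to\infty$ in probability as $n\to\infty$. Assume that the random variables $X_1,X_2,\ldots$ satisfy condition (7) with an asymptotic covariance matrix $\Sigma$. Then

$$\mathcal{L}\Bigl(\frac{S_{N_n}}{\sqrt{d_n}}\Bigr)\Longrightarrow G_{\alpha,\Sigma}\quad(n\to\infty)$$

with some infinitely increasing sequence of positive numbers $\{d_n\}_{n\ge1}$ and some $\alpha\in(0,2]$, if and only if

$$\frac{b_{N_n}}{d_n}\Longrightarrow\zeta_{\alpha/2}$$

as $n\to\infty$.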

Proof. This theorem is a direct consequence of Theorem 2 with the account of relations (17) and (18).

### 5.2 Convergence of the distributions of random sums of random vectors with finite covariance matrices to multivariate elliptically contoured Linnik distributions

In 1953, Yu. V. Linnik  introduced the class of univariate symmetric probability distributions defined by the characteristic functions

$$f_\alpha^{L}(t)=\frac{1}{1+|t|^{\alpha}},\quad t\in\mathbb{R},$$

where $\alpha\in(0,2]$. Later, the distributions of this class were called Linnik distributions  or $\alpha$-Laplace distributions . Here the first term will be used since it has become conventional. With $\alpha=2$, the Linnik distribution turns into the Laplace distribution corresponding to the density

$$f^{\Lambda}(x)=\tfrac12e^{-|x|},\quad x\in\mathbb{R}.$$

A random variable with the Linnik distribution with parameter $\alpha$ will be denoted $L_{1,\alpha}$.

The Linnik distributions possess many interesting analytic properties (see, e.g., [17, 18] and the references therein) but, perhaps, most often Linnik distributions are recalled as examples of geometric stable distributions often used as heavy-tailed models of some statistical regularities in financial data [23, 24].

The multivariate Linnik distribution was introduced by D. N. Anderson in  where it was proved that the function

$$f_{\alpha,\Sigma}^{L}(t)=\frac{1}{1+(t^{\top}\Sigma t)^{\alpha/2}},\quad t\in\mathbb{R}^r,\quad\alpha\in(0,2], \tag{19}$$

is the characteristic function of an $r$-variate probability distribution, where $\Sigma$ is a positive definite $r\times r$-matrix. In , the distribution corresponding to the characteristic function (19) was called the $r$-variate Linnik distribution. For the properties of the multivariate Linnik distributions, see [25, 26].

The $r$-variate Linnik distribution can also be defined in another way. For this purpose, recall that the distribution of a nonnegative random variable $M_\delta$ whose Laplace transform is

$$\psi_\delta(s)\equiv\mathsf{E}e^{-sM_\delta}=\frac{1}{1+s^{\delta}},\quad s\ge0, \tag{20}$$

where $0<\delta\le1$, is called the Mittag-Leffler distribution. It is another example of heavy-tailed geometrically stable distributions; for more details see, for example, [17, 18] and the references therein. The Mittag-Leffler distributions are of serious theoretical interest in the problems related to thinned (or rarefied) homogeneous flows of events such as renewal processes or anomalous diffusion or relaxation phenomena, see [27, 28] and the references therein. In , it was demonstrated that

$$L_{1,\alpha}\stackrel{d}{=}Y_1\sqrt{2M_{\alpha/2}}, \tag{21}$$

where $Y_1$ is a random variable with the standard univariate normal distribution independent of the random variable $M_{\alpha/2}$ with the Mittag-Leffler distribution with parameter $\alpha/2$.

Now let $Y$ be a random vector such that $\mathcal{L}(Y)=\Phi_\Sigma$, where $\Sigma$ is a positive definite $r\times r$-matrix, independent of the random variable $M_{\alpha/2}$. By analogy with (21), introduce the random vector $L_{r,\alpha,\Sigma}$ as

$$L_{r,\alpha,\Sigma}=\sqrt{2M_{\alpha/2}}\,Y.$$

Then, in accordance with what has been said in Section 2,

$$\mathcal{L}(L_{r,\alpha,\Sigma})=\mathsf{E}\Phi_{2M_{\alpha/2}\Sigma}. \tag{22}$$

The distribution (22) will be called the (centered) elliptically contoured multivariate Linnik distribution.

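Using Remark 1, we can easily make sure that the two definitions of the multivariate Linnik distribution coincide. Indeed, with the account of (20), according to Remark 1, the characteristic function of the random vector $L_{r,\alpha,\Sigma}$ defined by (22) has the form

$$\mathsf{E}\exp\{it^{\top}L_{r,\alpha,\Sigma}\}=\psi_{\alpha/2}(t^{\top}\Sigma t)=\frac{1}{1+(t^{\top}\Sigma t)^{\alpha/2}}=f_{\alpha,\Sigma}^{L}(t),\quad t\in\mathbb{R}^r,$$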

that coincides with Anderson’s definition (19).
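The coincidence of the two definitions can be checked by simulation in the simplest case. The sketch below assumes $r=1$, $\Sigma=1$ and $\alpha=2$, for which the Mittag-Leffler law with parameter $\alpha/2=1$ has Laplace transform $1/(1+s)$ and is therefore the standard exponential distribution; sampling Mittag-Leffler variables for $\alpha<2$ is deliberately left out. The function names, the seed and the sample size are illustrative choices.

```python
import math
import random


def sample_linnik_alpha2(rng: random.Random) -> float:
    """One draw of Y_1 * sqrt(2 * M_1), where M_1 ~ Exp(1) (case alpha = 2)."""
    m = rng.expovariate(1.0)         # Mittag-Leffler law with delta = 1
    y = rng.gauss(0.0, 1.0)          # standard normal Y_1, independent of m
    return y * math.sqrt(2.0 * m)


def empirical_cf(sample: list, t: float) -> float:
    """Real part of the empirical characteristic function at the point t."""
    return sum(math.cos(t * v) for v in sample) / len(sample)


rng = random.Random(2019)
draws = [sample_linnik_alpha2(rng) for _ in range(100_000)]
for t in (0.5, 1.0, 2.0):
    print(f"t={t}: empirical={empirical_cf(draws, t):.4f}  "
          f"1/(1+t^2)={1.0 / (1.0 + t * t):.4f}")
```

The target $1/(1+t^2)$ is the characteristic function (19) with $\alpha=2$, that is, the Laplace case; the imaginary part is omitted because the distribution is symmetric.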

Our definition (22) together with Theorem 2 opens the way to formulate a theorem stating that the multivariate Linnik distribution can not only be limiting for geometric random sums of independent identically distributed random vectors with infinite second moments , but it can also be limiting for random sums of independent random vectors with finite covariance matrices.

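Theorem 5. Let $N_n\to\infty$ in probability as $n\to\infty$. Assume that the random variables $X_1,X_2,\ldots$ satisfy condition (7) with an asymptotic covariance matrix $\Sigma$. Then

$$\mathcal{L}\Bigl(\frac{S_{N_n}}{\sqrt{d_n}}\Bigr)\Longrightarrow\mathcal{L}(L_{r,\alpha,\Sigma})\quad(n\to\infty)$$

with some infinitely increasing sequence of positive numbers $\{d_n\}_{n\ge1}$ and some $\alpha\in(0,2]$, if and only if

$$\frac{b_{N_n}}{d_n}\Longrightarrow2M_{\alpha/2}$$

as $n\to\infty$.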

Proof. This theorem is a direct consequence of Theorem 2 with the account of relation (22).

## 6. Convergence of the distributions of asymptotically normal statistics to the multivariate Student distribution

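The multivariate Student distribution is described, for example, in  (also see ). Consider an $r$-dimensional normal random vector $Y$ with zero vector of expectations and covariance matrix $\Sigma$. Assume that a random variable $W_\gamma$ has the chi-square distribution with parameter (the “number of degrees of freedom”) $\gamma>0$ (not necessarily integer) and is independent of $Y$. The distribution $P_{\gamma,\Sigma}$ of the random vector

$$Q_{\gamma,\Sigma}=\sqrt{\gamma/W_\gamma}\,Y \tag{23}$$

is called the multivariate Student distribution (with parameters $\gamma$ and $\Sigma$). For any $x\in\mathbb{R}^r$ the distribution density of $Q_{\gamma,\Sigma}$ has the form

$$p_{\gamma,\Sigma}(x)=\frac{\Gamma\bigl((r+\gamma)/2\bigr)}{|\Sigma|^{1/2}\,\Gamma(\gamma/2)\,(\pi\gamma)^{r/2}}\cdot\frac{1}{\bigl(1+\frac1\gamma x^{\top}\Sigma^{-1}x\bigr)^{(r+\gamma)/2}}.$$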

According to Theorem 3, the multivariate Student distribution is the resulting transformation of the limit distribution of an asymptotically normal (in the sense of (16)) statistic under the replacement of the sample size by a random variable whose asymptotic distribution is chi-square. Consider this case in more detail.

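Let $G_{m,m}(x)$ be the gamma-distribution function with the shape parameter coinciding with the scale parameter and equal to $m$:

$$G_{m,m}(x)=\begin{cases}0&\text{if }x\le0,\\[4pt]\dfrac{m^m}{\Gamma(m)}\displaystyle\int_0^xe^{-my}y^{m-1}\,dy&\text{if }x>0.\end{cases}$$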

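Theorem 6. Let $\gamma>0$ be arbitrary, $\Sigma$ be a positive definite matrix and let $\{d_n\}_{n\ge1}$ be an infinitely increasing sequence of positive numbers. Assume that $N_n\to\infty$ in probability as $n\to\infty$. Let a statistic $T_n$ be asymptotically normal in the sense of (16). Then the convergence

$$\mathcal{L}\bigl(\sqrt{d_n}\,(T_{N_n}-t)\bigr)\Longrightarrow P_{\gamma,\Sigma}\quad(n\to\infty),$$

takes place if and only if

$$\mathsf{P}(N_n<d_nx)\Longrightarrow G_{\gamma/2,\gamma/2}(x),\quad n\to\infty,$$

where $G_{\gamma/2,\gamma/2}(x)$ is the gamma-distribution function with coinciding shape and scale parameters equal to $\gamma/2$.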

Proof. This statement is a direct consequence of Theorem 3, representation (23) and Lemma 1.
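The link between representation (23) and the distribution function (1) from the Introduction can be checked by simulation. The sketch below assumes $r=1$, $\Sigma=1$ and $\gamma=2$, draws the chi-square variable as a gamma variable with shape $\gamma/2$ and scale $2$, and compares the empirical distribution function of $\sqrt{\gamma/W_\gamma}\,Y$ with $\Psi(x)$. The function names, the seed and the sample size are illustrative choices.

```python
import math
import random


def student2_cdf(x: float) -> float:
    """Distribution function (1): Student law with two degrees of freedom."""
    return 0.5 * (1.0 + x / math.sqrt(2.0 + x * x))


def sample_student(gamma_df: float, rng: random.Random) -> float:
    """One draw of sqrt(gamma / W_gamma) * Y from representation (23), r = 1."""
    # Chi-square with gamma_df degrees of freedom: gamma law, shape
    # gamma_df / 2 and scale 2 (random.gammavariate takes shape, scale).
    w = rng.gammavariate(gamma_df / 2.0, 2.0)
    y = rng.gauss(0.0, 1.0)
    return math.sqrt(gamma_df / w) * y


def empirical_cdf(sample: list, x: float) -> float:
    """Fraction of sample points strictly below x."""
    return sum(1 for v in sample if v < x) / len(sample)


rng = random.Random(7)
draws = [sample_student(2.0, rng) for _ in range(100_000)]
for x in (-3.0, -1.0, 0.5, 2.0):
    print(f"x={x:5.1f}  empirical={empirical_cdf(draws, x):.4f}  "
          f"Psi(x)={student2_cdf(x):.4f}")
```

Since $W_\gamma/\gamma$ has the distribution function $G_{\gamma/2,\gamma/2}$, the same draws can be read as $Y/\sqrt{U}$ with $U$ distributed according to the limit law of $N_n/d_n$ in Theorem 6.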

Let $N_{p,m}$ be a random variable with the negative binomial distribution

$$\mathsf{P}(N_{p,m}=k)=C_{m+k-2}^{k-1}\,p^m(1-p)^{k-1},\quad k=1,2,\ldots \tag{24}$$

Here $m>0$ and $p\in(0,1)$ are parameters; for non-integer $m$, the quantity $C_{m+k-2}^{k-1}$ is defined as

$$C_{m+k-2}^{k-1}=\frac{\Gamma(m+k-1)}{(k-1)!\,\Gamma(m)}.$$

In particular, for $m=1$, relation (24) determines the geometric distribution. It is well known that

$$\mathsf{E}N_{p,m}=\frac{m(1-p)+p}{p},$$

so that $\mathsf{E}N_{p,m}\to\infty$ as $p\to0$.

As is known, the negative binomial distribution with natural $m$ admits an illustrative interpretation in terms of Bernoulli trials. Namely, the random variable with distribution (24) is the number of the Bernoulli trials held up to the $m$th failure, if the probability of the success in a trial is $1-p$.

Lemma 4. For any fixed $m>0$

$$\lim_{p\to0}\sup_{x\in\mathbb{R}}\Bigl|\mathsf{P}\Bigl(\frac{N_{p,m}}{\mathsf{E}N_{p,m}}<x\Bigr)-G_{m,m}(x)\Bigr|=0,$$

where $G_{m,m}(x)$ is the gamma-distribution function with the shape parameter coinciding with the scale parameter and equal to $m$.

The proof is a simple exercise on characteristic functions; for more details, see .
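The convergence stated in Lemma 4 can be observed numerically. The following sketch, restricted for simplicity to integer $m$ (so that $G_{m,m}$ has the closed Erlang form), accumulates the probabilities (24) by a recurrence and evaluates the uniform distance at the jump points of the step function. The function names and the truncation parameter `span` are illustrative assumptions.

```python
import math


def erlang_cdf(x: float, m: int) -> float:
    """G_{m,m}(x) for integer m: gamma law with shape m and rate m."""
    if x <= 0.0:
        return 0.0
    mx = m * x
    partial = sum(mx ** j / math.factorial(j) for j in range(m))
    return 1.0 - math.exp(-mx) * partial


def sup_distance(p: float, m: int, span: float = 40.0) -> float:
    """Approximate sup_x |P(N / EN < x) - G_{m,m}(x)| for N with law (24).

    The step function only needs to be compared with G at its jump points
    k / EN; the search is truncated at k = span * EN, beyond which both
    distribution functions are numerically equal to one.
    """
    mean = (m * (1.0 - p) + p) / p
    k_max = int(span * mean)
    pmf = p ** m                     # P(N = 1)
    cdf_prev, worst = 0.0, 0.0       # cdf_prev = P(N <= k - 1) = P(N < k)
    for k in range(1, k_max + 1):
        g = erlang_cdf(k / mean, m)
        cdf = cdf_prev + pmf
        worst = max(worst, abs(cdf_prev - g), abs(cdf - g))
        cdf_prev = cdf
        pmf *= (m + k - 1) / k * (1.0 - p)   # P(N = k + 1) from P(N = k)
    return worst


for m in (1, 2):
    for p in (0.1, 0.01, 0.001):
        print(f"m={m}  p={p:<6}  sup-distance={sup_distance(p, m):.5f}")
```

For $m=1$ the distance is close to $p$ itself, which reflects the familiar approximation of the geometric law by the exponential one.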

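Corollary 3. Let $m>0$ be arbitrary. Assume that for each $n\ge1$ the random variable $N_n$ has the negative binomial distribution with parameters $p=\frac1n$ and $m$. Let a statistic $T_n$ be asymptotically normal in the sense of (16). Then

$$\mathcal{L}\bigl(\sqrt{mn}\,(T_{N_n}-t)\bigr)\Longrightarrow P_{2m,\Sigma}\quad(n\to\infty),$$

where $P_{2m,\Sigma}$ is the $r$-variate Student distribution with parameters $\gamma=2m$ and $\Sigma$.

Proof. By Lemma 4 we have

$$\frac{N_n}{nm}=\frac{N_n}{\mathsf{E}N_n}\cdot\frac{\mathsf{E}N_n}{nm}=\frac{N_n}{\mathsf{E}N_n}\cdot\frac{m(n-1)+1}{mn}=\frac{N_n}{\mathsf{E}N_n}\Bigl(1+O\Bigl(\frac1n\Bigr)\Bigr)\Longrightarrow U_m$$

as $n\to\infty$, where $U_m$ is the random variable having the gamma-distribution function with coinciding shape and scale parameters equal to $m$. Now the desired assertion directly follows from Theorem 6.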

Remark 2. The $r$-variate Cauchy distribution ($\gamma=1$) appears in the situation described in Corollary 3 when the sample size $N_n$ has the negative binomial distribution with the parameters $p=\frac1n$, $m=\frac12$, and $n$ is large.

Remark 3. If the sample size $N_n$ has the negative binomial distribution with the parameters $p=\frac1n$, $m=1$ (that is, the geometric distribution with the parameter $p=\frac1n$), then, as $n\to\infty$, we obtain the limit $r$-variate Student distribution with parameters $\gamma=2$ and $\Sigma$. Moreover, if $\Sigma=I_r$ (that is, the $r$-variate Student distribution is spherically symmetric), then its one-dimensional marginals have the form (1). As we have already noted, distribution (1) was apparently first introduced as a limit distribution for the sample median in a sample with geometrically distributed random size in . It is worth noting that in the cited paper , distribution (1) was not identified as the Student distribution with two degrees of freedom.
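The geometric case of Remark 3 is easy to reproduce in a small experiment. The sketch below takes, as an assumed example, the sample mean of standard normal observations in the role of the statistic $T_n$ (so that $r=1$, $t=0$, $\Sigma=1$) and a geometric sample size with $p=\frac1n$. The limit law is then the Student distribution with two degrees of freedom, whose distribution function has the known closed form $\frac12+\frac{x}{2\sqrt{2+x^2}}$. All names, the seed, and the value $n=1000$ are illustrative.

```python
import math
import random


def geometric_size(p, rng):
    """Sample N with P(N = k) = p (1 - p)^(k - 1), k = 1, 2, ..., by inversion."""
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    return int(math.log(u) / math.log1p(-p)) + 1


def student2_cdf(x):
    """Distribution function of the Student law with two degrees of freedom."""
    return 0.5 + x / (2.0 * math.sqrt(2.0 + x * x))


def normalized_mean_with_random_size(n, rng):
    """sqrt(n) times the mean of N_n standard normal observations, N_n geometric(1/n)."""
    size = geometric_size(1.0 / n, rng)
    # The mean of `size` independent N(0, 1) variables is exactly N(0, 1 / size).
    sample_mean = rng.gauss(0.0, 1.0) / math.sqrt(size)
    return math.sqrt(n) * sample_mean


rng = random.Random(2019)  # fixed seed, chosen arbitrarily for reproducibility
values = [normalized_mean_with_random_size(1000, rng) for _ in range(100_000)]
empirical = sum(1 for v in values if v <= 1.0) / len(values)
print(f"empirical P(Z <= 1) = {empirical:.4f}")
print(f"Student(2) value    = {student2_cdf(1.0):.4f}")
```

With a non-random sample size the same frequency would be close to the standard normal value $\Phi(1)\approx0.84$; the random size moves it toward the heavier-tailed Student value.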

Thus, the main conclusion of this section can be formulated as follows. If the number of random factors that determine the observed value of a random variable is itself random, with a distribution that can be approximated by the gamma distribution with coinciding shape and scale parameters (e.g., is negative binomial with probability of success close to one, see Lemma 4), then those functions of the random factors that are regarded as asymptotically normal in the classical situation are actually asymptotically Student with considerably heavier tails. Gamma models and negative binomial models are widely applicable: for instance, the negative binomial distribution is a mixed Poisson distribution with a gamma mixing distribution, a fact widely used in insurance. Hence the Student distribution can be used in descriptive statistics as a rather reasonable heavy-tailed asymptotic approximation.

## 7. The asymptotic distribution of sample quantiles in samples with sizes generated by a Cox process

Sometimes, when the performance of a technical or financial system is analyzed, a forecast of its main characteristics is made on the basis of data accumulated during a certain period of the functioning of the system. As a rule, data are accumulated as a result of some “informative events” that occur during this period. For example, inference concerning the distribution of insurance claims, which is very important for the estimation of, say, the ruin probability of an insurance company, is usually performed on the basis of the statistic $W_1,W_2,\ldots,W_{N(T)}$ of the values of insurance claims that arrived within a certain time interval $[0,T]$ (here $N(T)$ denotes the number of claims that arrived during the time interval $[0,T]$). Moreover, this inference is typically used for the prediction of the value of the ruin probability for the next period $[T,2T]$. But it is obvious (at least in the example above) that the observed number of informative events that occurred during the time interval $[0,T]$ is actually a realization of a random variable, because the number of insurance claims arriving within this interval follows a stochastic counting process. If the random character of the number of available observations is not taken into consideration, then all that can be done is a conditional forecast. To obtain a complete prediction that accounts for the randomness of the number of “informative events,” we should use results similar to Theorems 2 and 3. One rather realistic and general assumption concerning $N(t)$, the number of observations accumulated by the time $t$, is that $N(t)$ is a Cox process. In this section, as an example, we will consider the asymptotic behavior of sample quantiles constructed from a sample whose size is determined by a Cox process. As we have already noted in the introduction, this problem is very important for the proper application of such risk measures as VaR (Value-at-Risk) in, say, financial engineering.

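Let $W_1,\ldots,W_n$, $n\ge1$, be independent identically distributed random variables with common distribution density $p(x)$ and $W_{(1)},\ldots,W_{(n)}$ be the corresponding order statistics, $W_{(1)}\le W_{(2)}\le\ldots\le W_{(n)}$. Let $r\in\mathbb{N}$ and $\lambda_1,\ldots,\lambda_r$ be some numbers such that $0<\lambda_1<\lambda_2<\ldots<\lambda_r<1$. The quantiles of orders $\lambda_1,\ldots,\lambda_r$ of the random variable $W_1$ will be denoted $\xi_{\lambda_i}$, $i=1,\ldots,r$. The sample quantiles of orders $\lambda_1,\ldots,\lambda_r$ are the random variables $W_{([\lambda_in]+1)}$, $i=1,\ldots,r$, with $[a]$ denoting the integer part of a number $a$. The following result due to Mosteller  (also see , Section 9.2) is classical. Denote

$$Y_{n,j}=\sqrt{n}\,\bigl(W_{([\lambda_jn]+1)}-\xi_{\lambda_j}\bigr),\quad j=1,\ldots,r.$$

Theorem 7 . If $p(x)$ is differentiable in some neighborhoods of the quantiles $\xi_{\lambda_i}$ and $p(\xi_{\lambda_i})\ne0$, $i=1,\ldots,r$, then, as $n\to\infty$, the joint distribution of the normalized sample quantiles $Y_{n,1},\ldots,Y_{n,r}$ weakly converges to the $r$-variate normal distribution with zero vector of expectations and covariance matrix $\Sigma=(\sigma_{ij})$,

$$\sigma_{ij}=\frac{\lambda_i(1-\lambda_j)}{p(\xi_{\lambda_i})\,p(\xi_{\lambda_j})},\quad i\le j.$$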
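To make the covariance matrix of Theorem 7 concrete, the following sketch assembles $\Sigma=(\sigma_{ij})$ with $\sigma_{ij}=\lambda_i(1-\lambda_j)/\bigl(p(\xi_{\lambda_i})p(\xi_{\lambda_j})\bigr)$, $i\le j$, from a quantile function and a density supplied by the caller. The function name and the two example laws (uniform on $(0,1)$ and standard normal) are assumptions chosen for illustration.

```python
from statistics import NormalDist


def quantile_covariance(lambdas, quantile, density):
    """Sigma of Theorem 7 for increasing orders lambdas[0] < ... < lambdas[r - 1]."""
    xs = [quantile(lam) for lam in lambdas]      # the quantiles xi_{lambda_i}
    r = len(lambdas)
    sigma = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i, r):
            value = lambdas[i] * (1.0 - lambdas[j]) / (density(xs[i]) * density(xs[j]))
            sigma[i][j] = sigma[j][i] = value    # fill both halves: Sigma is symmetric
    return sigma


# Uniform law on (0, 1): xi_lambda = lambda and p(x) = 1 on the support.
uniform_sigma = quantile_covariance([0.25, 0.5, 0.75], lambda lam: lam, lambda x: 1.0)

# Standard normal law: the sample median has asymptotic variance pi / 2.
std_normal = NormalDist()
normal_sigma = quantile_covariance([0.5], std_normal.inv_cdf, std_normal.pdf)

for row in uniform_sigma:
    print(row)
print(normal_sigma[0][0])
```

In the uniform case the entries reduce to $\lambda_i(1-\lambda_j)$, which makes the positive correlation between neighboring sample quantiles directly visible.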

To take into account the randomness of the sample size, consider the sequence $W_1,W_2,\ldots$ of independent identically distributed random variables with common distribution density $p(x)$.

Let $N(t)$, $t\ge0$, be a Cox process controlled by a process $\Lambda(t)$. Recall the definition of a Cox process. Let $N_1(t)$, $t\ge0$, be a standard Poisson process (i.e., a homogeneous Poisson process with unit intensity). Let $\Lambda(t)$, $t\ge0$, be a random process with non-decreasing right-continuous trajectories, $\Lambda(0)=0$, $\mathsf{P}(\Lambda(t)<\infty)=1$ for all $t>0$. Assume that the processes $\Lambda(t)$ and $N_1(t)$ are independent. Set

$$N(t)=N_1(\Lambda(t)),\quad t\ge0.$$

The process $N(t)$ is called a doubly stochastic Poisson process (or a Cox process) controlled by the process $\Lambda(t)$. The one-dimensional distributions of a Cox process are mixed Poisson. For example, if $\Lambda(t)$ has the gamma distribution, then $N(t)$ has the negative binomial distribution.
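The construction $N(t)=N_1(\Lambda(t))$ translates directly into a simulation: draw the value of the controlling process and count the arrivals of a unit-rate Poisson process up to that value. The sketch below does this for a fixed $t$ with a gamma-distributed $\Lambda(t)$ and compares the empirical mean and the empirical probability of zero with the mixed Poisson values $a\theta$ and $(1+\theta)^{-a}$, where $a$ is the shape and $\theta$ the scale of the gamma law. Note that the resulting negative binomial law lives on $\{0,1,2,\ldots\}$, that is, it is shifted by one relative to (24). All names, parameter values, and the seed are illustrative assumptions.

```python
import random


def standard_poisson_count(horizon, rng):
    """N_1(horizon): number of arrivals of a unit-rate Poisson process in [0, horizon]."""
    count, clock = 0, rng.expovariate(1.0)   # first arrival time
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(1.0)        # independent exponential(1) gaps
    return count


def cox_count_gamma(shape, scale, rng):
    """N = N_1(Lambda) with Lambda ~ gamma(shape, scale) drawn independently of N_1."""
    lam = rng.gammavariate(shape, scale)
    return standard_poisson_count(lam, rng)


rng = random.Random(2019)  # fixed seed, chosen arbitrarily for reproducibility
shape, scale = 2.0, 1.5
counts = [cox_count_gamma(shape, scale, rng) for _ in range(50_000)]
empirical_mean = sum(counts) / len(counts)
empirical_p0 = counts.count(0) / len(counts)
theoretical_p0 = (1.0 / (1.0 + scale)) ** shape

print(f"mean: empirical {empirical_mean:.3f}, mixed Poisson value {shape * scale:.3f}")
print(f"P(N = 0): empirical {empirical_p0:.4f}, mixed Poisson value {theoretical_p0:.4f}")
```

Counting exponential gaps is slow for large values of $\Lambda(t)$; it is used here only because it mirrors the definition literally.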

Cox processes are widely used as models of inhomogeneous chaotic flows of events, see, for example, .

Assume that all the involved random variables and processes are independent. In this section, under the assumption that $\Lambda(t)\to\infty$ in probability, the asymptotics of the joint distribution of the random variables $W_{([\lambda_iN(t)]+1)}$, $i=1,\ldots,r$, is considered as $t\to\infty$.

As we have already noted, it was B. V. Gnedenko who drew attention to the essential distinction between the asymptotic properties of sample quantiles constructed from samples with random sizes and the analogous properties of sample quantiles in the standard situation. Briefly recall the history of the problem under consideration. B. V. Gnedenko, S. Stomatovič, and A. Shukri  obtained sufficient conditions for the convergence of the distribution of the sample median constructed from a sample of random size. In the candidate (PhD) thesis of A. K. Shukri, these conditions were extended to quantiles of arbitrary orders. In , necessary and sufficient conditions for the weak convergence of the one-dimensional distributions of sample quantiles in samples with random sizes were obtained.

Our aim here is to give necessary and sufficient conditions for the weak convergence of the joint distributions of sample quantiles constructed from samples with random sizes driven by a Cox process and to describe the $r$-variate limit distributions emerging here, thus extending Mosteller’s Theorem 7 to samples with random sizes. The results of this section extend those of  to the continuous-time case.

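Lemma 5. Let $N(t)$ be a Cox process controlled by the process $\Lambda(t)$. Then $N(t)\xrightarrow{P}\infty$ $(t\to\infty)$ if and only if $\Lambda(t)\xrightarrow{P}\infty$ $(t\to\infty)$.

Lemma 6. Let $N(t)$ be a Cox process controlled by the process $\Lambda(t)$. Let $d(t)>0$ be a function such that $d(t)\to\infty$ $(t\to\infty)$. Then the following conditions are equivalent:

1. One-dimensional distributions of the normalized Cox process weakly converge to the distribution of some random variable $Z$ as $t\to\infty$:

$$\frac{N(t)}{d(t)}\Longrightarrow Z\quad(t\to\infty).$$

2. One-dimensional distributions of the controlling process $\Lambda(t)$, appropriately normalized, converge to the same distribution:

$$\frac{\Lambda(t)}{d(t)}\Longrightarrow Z\quad(t\to\infty).$$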

For the proof of Lemmas 5 and 6 see .

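Now we proceed to the main results of this section. In addition to the notation introduced above, for positive integer $n$ set $Q_j(n)=W_{([\lambda_jn]+1)}$, $j=1,\ldots,r$, $Q(n)=(Q_1(n),\ldots,Q_r(n))$, $\xi=(\xi_{\lambda_1},\ldots,\xi_{\lambda_r})$. Let $d(t)$ be an infinitely increasing positive function. Set

$$Z(t)=\sqrt{d(t)}\,\bigl(Q(N(t))-\xi\bigr).$$

Theorem 8. Let $\Lambda(t)\xrightarrow{P}\infty$ as $t\to\infty$. If $p(x)$ is differentiable in neighborhoods of the quantiles $\xi_{\lambda_i}$ and $p(\xi_{\lambda_i})\ne0$, $i=1,\ldots,r$, then the convergence

$$Z(t)\Longrightarrow Z\quad(t\to\infty)$$

to some random vector $Z$ takes place if and only if there exists a nonnegative random variable $U$ such that

$$\mathsf{P}(Z\in A)=\mathsf{E}\,\Phi_{U^{-1}\Sigma}(A),\quad A\in\mathcal{B}(\mathbb{R}^r),$$

where $\Sigma=(\sigma_{ij})$,

$$\sigma_{ij}=\frac{\lambda_i(1-\lambda_j)}{p(\xi_{\lambda_i})\,p(\xi_{\lambda_j})},\quad i\le j,$$

and

$$\frac{\Lambda(t)}{d(t)}\Longrightarrow U\quad(t\to\infty).$$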

The proof is a simple combination of Lemmas 1, 5, and 6 and Theorem 3.

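Corollary 4. Under the conditions of Theorem 8, the joint distribution of the normalized sample quantiles $\sqrt{d(t)}\,\bigl(W_{([\lambda_jN(t)]+1)}-\xi_{\lambda_j}\bigr)$, $j=1,\ldots,r$, weakly converges to the $r$-variate normal law with zero expectation and covariance matrix $\Sigma$ if and only if

$$\frac{\Lambda(t)}{d(t)}\Longrightarrow 1\quad(t\to\infty).$$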

This statement immediately follows from Theorem 8 with the account of Lemma 1.

Corollary 5. Under the conditions of Theorem 8, the joint distribution of the normalized sample quantiles $\sqrt{d(t)}\,\bigl(W_{([\lambda_jN(t)]+1)}-\xi_{\lambda_j}\bigr)$, $j=1,\ldots,r$, weakly converges to the $r$-variate Student distribution with parameters $\gamma>0$ and $\Sigma$ defined in Theorem 7 if and only if

$$\mathsf{P}(\Lambda(t)<xd(t))\Longrightarrow G_{\gamma/2,\gamma/2}(x)\quad(t\to\infty),$$

where $G_{\gamma/2,\gamma/2}(x)$ is the gamma-distribution function with coinciding shape and scale parameters equal to $\gamma/2$.
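Corollary 5 can be explored in a small experiment for a single quantile. The sketch below assumes, purely for illustration, uniform observations on $(0,1)$, the median ($\lambda=\frac12$, so $\xi_\lambda=\frac12$ and $\sigma_{11}=\frac14$), $d(t)=t$, and the controlling process $\Lambda(t)=tU$ with $U$ having the distribution function $G_{1,1}$ (so $\gamma=2$). It uses the standard fact that the $k$-th order statistic of $n$ uniform variables has the beta distribution with parameters $k$ and $n-k+1$. The limit is then $\frac12$ times a Student variable with two degrees of freedom, for which $\mathsf{P}(|T|>x)=1-x/\sqrt{2+x^2}$. All names, the value $t=200$, and the seed are hypothetical choices.

```python
import math
import random


def standard_poisson_count(horizon, rng):
    """N_1(horizon): number of arrivals of a unit-rate Poisson process in [0, horizon]."""
    count, clock = 0, rng.expovariate(1.0)
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def normalized_median(t, gamma_df, rng):
    """sqrt(t) * (sample median - 1/2) for a uniform sample of Cox-distributed size."""
    shape = gamma_df / 2.0
    u = rng.gammavariate(shape, 1.0 / shape)   # U with shape = rate = gamma_df / 2
    n = standard_poisson_count(t * u, rng)     # N(t) = N_1(Lambda(t)), Lambda(t) = t U
    if n == 0:
        return None                            # empty sample: no quantile is defined
    k = n // 2 + 1                             # index [n / 2] + 1 of the sample median
    # The k-th order statistic of n uniform(0, 1) variables is Beta(k, n - k + 1).
    median = rng.betavariate(k, n - k + 1)
    return math.sqrt(t) * (median - 0.5)


rng = random.Random(89659)  # fixed seed, chosen arbitrarily for reproducibility
draws = [normalized_median(200.0, 2.0, rng) for _ in range(5000)]
values = [z for z in draws if z is not None]
empirical_tail = sum(1 for z in values if abs(z) > 1.5) / len(values)
student_tail = 1.0 - 3.0 / math.sqrt(11.0)     # P(|T_2| > 3), since 1.5 / sigma = 3

print(f"empirical P(|Z| > 1.5) = {empirical_tail:.4f}")
print(f"Student(2) limit value = {student_tail:.4f}")
```

For comparison, the normal limit of Corollary 4 with the same $\sigma_{11}=\frac14$ would give a tail probability of about $0.003$ at the same threshold, so the heavy-tailed effect of the random sample size is clearly visible even at a moderate $t$.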

Let $0<\lambda<1$ and let $\xi_\lambda$ be the $\lambda$-quantile of the random variable $W_1$. As above, the standard normal distribution function will be denoted $\Phi(x)$.

## 8. Conclusion

The purpose of the chapter was to give a possible explanation of the emergence of heavy-tailed distributions that are often observed in practice instead of the expected normal laws. As the base for this explanation, limit theorems for random sums and statistics constructed from samples with random sizes were considered. Within this approach, it becomes possible to obtain arbitrarily heavy tails of the data distributions without assuming the non-existence of the moments of the observed characteristics. Some comments were made on the heavy-tailedness of scale mixtures of normal distributions. Two general theorems presenting necessary and sufficient conditions for the convergence of the distributions of random sums of random vectors and multivariate statistics constructed from samples with random sizes were proved. As examples of the application of these general theorems, conditions were presented for the convergence of the distributions of random sums of independent random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions. An alternative definition of the latter was proposed. Also, conditions were presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate elliptically contoured Student distributions when the sample size is replaced by a random variable. The joint asymptotic behavior of sample quantiles in samples with random sizes was considered. Special attention was paid to the continuous-time case assuming that the sample size increases in time following a Cox process resulting in the sample size having the mixed Poisson distribution.

## Acknowledgments

Supported by Russian Science Foundation, project 18-11-00155.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## How to cite and reference

### Cite this chapter

Victor Korolev and Alexander Zeifman (October 22nd 2019). From Asymptotic Normality to Heavy-Tailedness via Limit Theorems for Random Sums and Statistics with Random Sample Sizes [Online First], IntechOpen, DOI: 10.5772/intechopen.89659. Available from: