From Asymptotic Normality to Heavy-Tailedness via Limit Theorems for Random Sums and Statistics with Random Sample Sizes

This chapter offers a possible explanation of the emergence of heavy-tailed distributions observed in practice instead of the expected normal laws. The explanation rests on limit theorems for random sums and for statistics constructed from samples with random sizes. As examples of the application of the general theorems, conditions are presented for the convergence of the distributions of random sums of independent random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions. Conditions are also presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate Student distributions. The joint asymptotic behavior of sample quantiles is considered as well.


Introduction
In many situations related to the analysis of experimental data, one often comes across the following phenomenon: although conventional reasoning based on the central limit theorem of probability theory suggests that the expected distribution of observations should be normal, statistical procedures expose noticeable non-normality of real distributions. Moreover, as a rule, the observed non-normal distributions are more leptokurtic than the normal law, having sharper peaks and heavier tails. Such situations are typical in financial data analysis (see, e.g., Chapter 4 in [1] or Chapter 8 in [2] and the references therein), in experimental physics (see, e.g., [3]), and in other fields dealing with the statistical analysis of experimental data. Many attempts have been undertaken to explain this heavy-tailedness. The most significant theoretical breakthrough is usually associated with the results of B. Mandelbrot and others, who proposed to replace reasoning based on the standard central limit theorem with reasoning based on limit theorems for sums of random summands with infinite variances (see, e.g., [4]), resulting in non-normal stable laws as heavy-tailed models of the distributions of experimental data. However, first, in most cases the key assumption of this approach, the infiniteness of the variances of elementary summands, can hardly be believed to hold in practice and, second, although more heavy-tailed than the normal law, the real distributions often turn out to be more light-tailed than the stable laws.
In this work, in order to give a more realistic explanation of the observed non-normality of the distributions of real data, an alternative approach based on limit theorems for statistics constructed from samples with random sizes is developed. Within this approach, it becomes possible to obtain arbitrarily heavy tails of the data distributions without assuming the non-existence of the moments of the observed characteristics.
This work was inspired by the publication of the paper [5] in which, based on the results of [6], a particular case of random sums was considered. One more reason for writing this work was the recent publication [7], the authors of which reproduced some results of [8,9] without citing these earlier papers.
Here we give a more general description of the transformation of the limit distribution of a sum of independent random variables or another statistic (i.e., of a measurable function of a sample) under the replacement of the non-random number of summands or the sample size by a random variable. General limit theorems are proved (Section 3). Section 4 contains some comments on heavy-tailedness of scale mixtures of normal distributions. As examples of the application of general theorems, conditions are presented for the convergence of the distributions of random sums of independent random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions (Section 5). Also, conditions are presented for the convergence of the distributions of asymptotically normal (in the traditional sense) statistics to multivariate Student distributions (Section 6).
In Section 7, the joint asymptotic behavior of sample quantiles is considered. In applied research related to risk analysis, the risk measure VaR (Value-at-Risk) is very popular. Formally, VaR is a certain quantile of the observed risky value. Therefore, the joint asymptotic behavior of sample quantiles in samples with random sizes is considered in detail in Section 7 as one more example of the application of the general theorem proved in Section 3. In this section, we show how the proposed technique can be applied to the continuous-time case by assuming that the sample size increases in time following a Cox process. One more interpretation of this setting is related to an important case where the sample size has a mixed Poisson distribution.
In classical problems of mathematical statistics, the size of the available sample, that is, the number of available observations, is traditionally assumed to be deterministic. In asymptotic settings, it plays the role of an infinitely increasing known parameter. At the same time, in practice the data to be analyzed are very often collected or registered during a certain period of time, and the flow of informative events, each of which brings a new observation, forms a random point process. Therefore, the number of available observations is unknown until the end of the registration process and must also be treated as a (random) observation. For example, this is so in insurance statistics, where during different accounting periods different numbers of insurance events (insurance claims and/or insurance contracts) occur, and in high-frequency financial statistics, where the number of events in a limit order book during a time unit essentially depends on the intensity of order flows. Moreover, contemporary statistical procedures of insurance and financial mathematics do take this circumstance into consideration as one of the possible ways of dealing with heavy tails. However, in other fields, such as medical statistics or quality control, this approach has not yet become conventional, although the number of patients with a certain disease varies from month to month due to seasonal factors or from year to year due to epidemics, and the number of failed items varies from lot to lot. In these cases, the number of available observations, as well as the observations themselves, is unknown beforehand and should be treated as random to avoid underestimation of risks or error probabilities. Therefore, it is quite reasonable to study the asymptotic behavior of general statistics constructed from samples with random sizes for the purpose of constructing suitable and reasonable asymptotic approximations.
At the same time, to obtain non-trivial asymptotic distributions in limit theorems of probability theory and mathematical statistics, an appropriate centering and normalization of the random variables and vectors under consideration must be used. It should be especially noted that to obtain a reasonable approximation to the distribution of the basic statistics, both centering and normalizing values should be non-random. Otherwise, the approximating distribution becomes random itself and, for example, the problem of evaluation of quantiles or significance levels becomes senseless.
In asymptotic settings, statistics constructed from samples with random sizes are special cases of random sequences with random indices. The randomness of indices usually leads to the limit distributions for the corresponding random sequences being heavy-tailed even in the situations where the distributions of non-randomly indexed random sequences are asymptotically normal (see, e.g., [2,8,10]).
Many authors have noted that the asymptotic properties of statistics constructed from samples with random sizes differ from those of asymptotically normal statistics in the classical sense. To illustrate this, we will repeatedly cite [11], where the following example is given. Let $X_{(1)}, \dots, X_{(n)}$ be the order statistics constructed from the sample $X_1, \dots, X_n$. It is well known (see, e.g., [12]) that in the standard situation the sample median is asymptotically normal. At the same time, in [11] it was demonstrated that if the sample size $N_n$ has the geometric distribution with expectation $n$, then the distribution of the appropriately normalized sample median converges not to the normal law but to the Student distribution with two degrees of freedom, which has such heavy tails that its moments of orders $\delta \ge 2$ do not exist. In general, as was shown in [8], if a statistic that is asymptotically normal in the traditional sense is constructed on the basis of a sample with random size having the negative binomial distribution, then instead of the expected normal law, the Student distribution with power-type decreasing heavy tails appears as the asymptotic law for this statistic.
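This effect is easy to reproduce numerically. The following stdlib-only Python sketch (the sample size $n = 100$, the threshold $2.5$, the seed, and the use of normal observations are all illustrative choices, not taken from [11]) compares the tail frequency of the normalized sample median under a fixed size $n$ with that under a geometrically distributed size with mean $n$:

```python
import math
import random

random.seed(1)

n = 100       # mean sample size (illustrative)
reps = 10000  # Monte Carlo replications
p = 1.0 / n

def geometric(p):
    # trials up to the first success: P(N = k) = (1 - p)**(k - 1) * p, mean 1/p
    return 1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))

def norm_median(size):
    # normalized sample median: sqrt(n) * median of `size` N(0, 1) observations
    sample = sorted(random.gauss(0.0, 1.0) for _ in range(size))
    return math.sqrt(n) * sample[size // 2]

tail_fixed = sum(abs(norm_median(n)) > 2.5 for _ in range(reps)) / reps
tail_random = sum(abs(norm_median(geometric(p))) > 2.5 for _ in range(reps)) / reps
print(tail_fixed, tail_random)
```

Under the fixed size, the tail frequency stays near the light normal tail; under the geometric size it is several times larger, in line with the heavy Student-type limit.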

Notation, definitions, and auxiliary results
Let r ∈ . We will consider random elements taking values in the r-dimensional Euclidean space  r .
Assume that all the random variables and random vectors are defined on one and the same probability space $(\Omega, \mathcal{A}, \mathsf{P})$. By the measurability of a random field, we will mean its measurability as a function of two variables, an elementary outcome and a parameter, with respect to the Cartesian product of the $\sigma$-algebra $\mathcal{A}$ and the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^r)$ of subsets of $\mathbb{R}^r$. The distribution of a random vector $\xi$ with respect to the measure $\mathsf{P}$ will be denoted $\mathcal{L}(\xi)$. The weak convergence, the coincidence of distributions, and the convergence in probability with respect to a specified probability measure will be denoted by the symbols $\Rightarrow$, $\stackrel{d}{=}$, and $\stackrel{\mathsf{P}}{\longrightarrow}$, respectively.
Let $\Sigma$ be a positive definite matrix. The normal distribution in $\mathbb{R}^r$ with zero vector of expectations and covariance matrix $\Sigma$ will be denoted $\Phi_\Sigma$. This distribution is defined by its density

$$\phi(x) = \frac{\exp\{-\tfrac12 x^\top \Sigma^{-1} x\}}{(2\pi)^{r/2}\,|\Sigma|^{1/2}}, \qquad x \in \mathbb{R}^r.$$

The characteristic function of $\Phi_\Sigma$ is $\exp\{-\tfrac12 t^\top \Sigma t\}$, $t \in \mathbb{R}^r$.

Consider a sequence $\{S_n\}_{n \ge 1}$ of random elements taking values in $\mathbb{R}^r$. Let $\Xi(\mathbb{R}^r)$ be the set of all nonsingular linear operators acting from $\mathbb{R}^r$ to $\mathbb{R}^r$. The identity operator acting from $\mathbb{R}^r$ to $\mathbb{R}^r$ will be denoted $I_r$. Assume that there exist sequences $\{B_n\}_{n \ge 1}$ of operators from $\Xi(\mathbb{R}^r)$ and $\{a_n\}_{n \ge 1}$ of elements from $\mathbb{R}^r$ such that

$$B_n^{-1}(S_n - a_n) \Rightarrow Y \quad (n \to \infty), \tag{3}$$

where $Y$ is a random element whose distribution with respect to $\mathsf{P}$ will be denoted $H$, $H = \mathcal{L}(Y)$.

Along with $\{S_n\}_{n \ge 1}$, consider a sequence of integer-valued positive random variables $\{N_n\}_{n \ge 1}$ such that for each $n \ge 1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k \ge 1}$. Let $c_n \in \mathbb{R}^r$, $D_n \in \Xi(\mathbb{R}^r)$, $n \ge 1$. Now, we will formulate sufficient conditions for the weak convergence of the distributions of the random elements $Z_n = D_n^{-1}(S_{N_n} - c_n)$ as $n \to \infty$. For $g \in \mathbb{R}^r$, denote $W_n(g) = D_n^{-1}(B_{N_n} g + a_{N_n} - c_n)$. In [13,14], the following theorem was proved, which establishes sufficient conditions for the weak convergence of multivariate random sequences with independent random indices under operator normalization.
Theorem 1 [14]. Let $\|D_n^{-1}\| \to \infty$ as $n \to \infty$ and let the sequence of random variables $\{\|D_n^{-1} B_{N_n}\|\}_{n \ge 1}$ be tight. Assume that there exist a random element $Y$ with distribution $H$ and an $r$-dimensional random field $W(g)$, $g \in \mathbb{R}^r$, such that (3) holds and

$$W_n(g) \Rightarrow W(g) \quad (n \to \infty)$$

for $H$-almost all $g \in \mathbb{R}^r$. Then the random field $W(g)$ is measurable, depends on $g$ linearly, and

$$Z_n \Rightarrow W(Y) \quad (n \to \infty),$$

where the random field $W(\cdot)$ and the random element $Y$ are independent.

Now, consider an auxiliary statement dealing with the identifiability of a special family of mixtures of multivariate normal distributions. Let $U$ be a nonnegative random variable. The symbol $\mathsf{E}\Phi_{U\Sigma}(\cdot)$ will denote the distribution which for each Borel set $A$ in $\mathbb{R}^r$ is defined as

$$\mathsf{E}\Phi_{U\Sigma}(A) = \int_0^\infty \Phi_{u\Sigma}(A)\, d\mathsf{P}(U < u).$$

Let $\mathcal{U}$ be the set of all nonnegative random variables.
Lemma 1. Whatever positive definite matrix $\Sigma$ is, the family of distributions $\{\mathsf{E}\Phi_{U\Sigma}(\cdot) : U \in \mathcal{U}\}$ is identifiable in the sense that if $U_1, U_2 \in \mathcal{U}$ and $\mathsf{E}\Phi_{U_1\Sigma}(A) = \mathsf{E}\Phi_{U_2\Sigma}(A)$ for all $A \in \mathcal{B}(\mathbb{R}^r)$, then $U_1 \stackrel{d}{=} U_2$.

The proof of this lemma is very simple. The characteristic function corresponding to the distribution $\mathsf{E}\Phi_{U\Sigma}(\cdot)$ equals

$$\int_{\mathbb{R}^r} e^{i t^\top x}\, \mathsf{E}\Phi_{U\Sigma}(dx) = \mathsf{E}\exp\{-\tfrac12 U\, t^\top \Sigma t\} = \psi(\tfrac12 t^\top \Sigma t), \qquad t \in \mathbb{R}^r, \tag{5}$$

but on the right-hand side of (5), there is the Laplace–Stieltjes transform $\psi(s) = \mathsf{E}e^{-sU}$ of the random variable $U$ evaluated at the point $s = \tfrac12 t^\top \Sigma t$. If the distributions $\mathsf{E}\Phi_{U_1\Sigma}(\cdot)$ and $\mathsf{E}\Phi_{U_2\Sigma}(\cdot)$ coincide, then so do their characteristic functions, whence by virtue of (5) the Laplace–Stieltjes transforms of the random variables $U_1$ and $U_2$ coincide, whence, in turn, it follows that $U_1 \stackrel{d}{=} U_2$. The lemma is proved.

Remark 1. When proving Lemma 1, we established a simple but useful by-product result: if $\psi(s)$ is the Laplace–Stieltjes transform of the random variable $U$, then the characteristic function of the distribution $\mathsf{E}\Phi_{U\Sigma}(\cdot)$ is $\psi(\tfrac12 t^\top \Sigma t)$, $t \in \mathbb{R}^r$.

General theorems
First, consider the case where the random vectors $\{S_n\}_{n \ge 1}$ are formed as growing sums of independent random variables. Namely, let $X_1, X_2, \dots$ be independent $\mathbb{R}^r$-valued random vectors, and for $n \in \mathbb{N}$ let $S_n = X_1 + \dots + X_n$. Consider a sequence of integer-valued positive random variables $\{N_n\}_{n \ge 1}$ such that for each $n \ge 1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k \ge 1}$. Let $\{b_n\}_{n \ge 1}$ be an infinitely increasing sequence of positive numbers such that

$$\mathcal{L}\bigl(S_n/\sqrt{b_n}\bigr) \Rightarrow \Phi_\Sigma \tag{6}$$

as $n \to \infty$, where $\Sigma$ is some positive definite matrix.
Let $\{d_n\}_{n \ge 1}$ be an infinitely increasing sequence of positive numbers, and as $Z_n$ take the scalar-normalized random vector $Z_n = S_{N_n}/\sqrt{d_n}$.

Theorem 2. Let $N_n \to \infty$ in probability as $n \to \infty$. Assume that the random vectors $X_1, X_2, \dots$ satisfy condition (6) with an asymptotic covariance matrix $\Sigma$. Then a distribution $F$ such that

$$\mathcal{L}(Z_n) \Rightarrow F \quad (n \to \infty) \tag{8}$$

exists if and only if there exists a distribution function $V(x)$ satisfying the conditions:
(i) $V(x) = 0$ for $x < 0$;
(ii) $F(A) = \mathsf{E}\Phi_{U\Sigma}(A)$, $A \in \mathcal{B}(\mathbb{R}^r)$, where $U$ is a random variable with the distribution function $V(x)$;
(iii) $b_{N_n}/d_n \Rightarrow U$ as $n \to \infty$.

Proof. The "if" part. We will essentially exploit Theorem 1. For each $n \ge 1$, set $a_n = c_n = 0$, $B_n = \sqrt{b_n}\, I_r$, $D_n = \sqrt{d_n}\, I_r$. For the convenience of notation, introduce a random variable $U$ with the distribution function $V(x)$. Note that the conditions of the theorem guarantee the tightness of the sequence of random variables $\{\sqrt{b_{N_n}/d_n}\}_{n \ge 1}$, implied by its weak convergence to the random variable $\sqrt{U}$. Further, in the case under consideration, we have $W_n(g) = \sqrt{b_{N_n}/d_n}\cdot g$, $g \in \mathbb{R}^r$; therefore, condition (iii) implies $W_n(g) \Rightarrow \sqrt{U}\, g$ for all $g \in \mathbb{R}^r$. Condition (6) plays the role of condition (3), so Theorem 1 yields $Z_n \Rightarrow \sqrt{U}\, Y$, where $Y$ is a random element with the distribution $\Phi_\Sigma$ independent of the random variable $U$. It is easy to see that the distribution of the random element $\sqrt{U}\, Y$ coincides with $\mathsf{E}\Phi_{U\Sigma}(\cdot)$, where the matrix $\Sigma$ satisfies (6).

The "only if" part. Let condition (8) hold. First, make sure that the sequence $\{b_{N_n}/d_n\}_{n \ge 1}$ is tight. Since the matrix $\Sigma$ is positive definite, we can choose an $R > 0$ such that $\Phi_\Sigma(\{z : \|z\| \le R\}) \le \tfrac12$; then, by (6), there exists a $k_0$ such that $\mathsf{P}(\|S_k\| \le R\sqrt{b_k}) \le \tfrac34$ for all $k > k_0$. Fix an arbitrary $\epsilon > 0$. The weak convergence (8) implies the tightness of the sequence $\{Z_n\}_{n \ge 1}$, so there exists an $M > 0$ such that $\mathsf{P}(\|Z_n\| > M) \le \epsilon$ for all $n \ge 1$, while from the condition $N_n \stackrel{\mathsf{P}}{\longrightarrow} \infty$ it follows that there exists an $n_0$ such that $\mathsf{P}(N_n \le k_0) \le \epsilon$ for all $n \ge n_0$. For $x > (M/R)^2$ and $n \ge n_0$, we have

$$\mathsf{P}\Bigl(\frac{b_{N_n}}{d_n} > x\Bigr) \le \mathsf{P}(\|Z_n\| > M) + \mathsf{P}(N_n \le k_0) + \sum_{k > k_0:\ b_k > x d_n} \mathsf{P}(N_n = k)\, \mathsf{P}\bigl(\|S_k\| \le M\sqrt{d_n}\bigr)$$

(here the representation of the last term as a sum uses the independence of $N_n$ and $\{S_k\}_{k \ge 1}$: any constant is independent of any random variable). But if $b_k > x d_n$, then $M\sqrt{d_n} < (M/\sqrt{x})\sqrt{b_k} \le R\sqrt{b_k}$, so that each probability in the sum is at most $\tfrac34$, and the sum itself is at most $\tfrac34 \mathsf{P}(b_{N_n}/d_n > x)$. Therefore, $\mathsf{P}(b_{N_n}/d_n > x) \le 8\epsilon$ for all $x > (M/R)^2$ and $n \ge n_0$, whatever $\epsilon > 0$ is. Now assume that the sequence $\{b_{N_n}/d_n\}_{n \ge 1}$ is not tight.
In that case, there exist an $\alpha > 0$, a sequence $\mathcal{N}$ of natural numbers, and a sequence $\{x_n\}_{n \in \mathcal{N}}$ of real numbers satisfying the conditions $x_n \uparrow \infty$ $(n \to \infty,\ n \in \mathcal{N})$ and

$$\mathsf{P}\Bigl(\frac{b_{N_n}}{d_n} > x_n\Bigr) \ge \alpha, \qquad n \in \mathcal{N}.$$

But, choosing $\epsilon < \alpha/8$ in the bound established above, we obtain that for all $n \in \mathcal{N}$ large enough the opposite inequality must hold. The obtained contradiction proves the tightness of the sequence $\{\|D_n^{-1} B_{N_n}\|\}_{n \ge 1}$ or, which in this case is the same, of the sequence $\{b_{N_n}/d_n\}_{n \ge 1}$ (by the Prokhorov theorem, this sequence is then weakly relatively compact).
Introduce the set $\mathcal{W}(Z)$ containing all nonnegative random variables $U$ such that $\mathsf{P}(Z \in A) = \mathsf{E}\Phi_{U\Sigma}(A)$ for any $A \in \mathcal{B}(\mathbb{R}^r)$. Let $L(\cdot, \cdot)$ be any probability metric that metrizes weak convergence in the space of random variables or, which is the same in this context, in the space of distribution functions, say, the Lévy metric or the smoothed Kolmogorov distance. If $X_1$ and $X_2$ are random variables with the distribution functions $F_1$ and $F_2$, respectively, then we identify $L(X_1, X_2)$ and $L(F_1, F_2)$. Show that there exists a sequence of random variables $\{U_n\}_{n \ge 1}$, $U_n \in \mathcal{W}(Z)$, such that $L(b_{N_n}/d_n, U_n) \to 0$ as $n \to \infty$. Denote $\beta_n = \inf\{L(b_{N_n}/d_n, U) : U \in \mathcal{W}(Z)\}$. Prove that $\beta_n \to 0$ as $n \to \infty$. Assume the contrary. In that case, $\beta_n \ge \delta$ for some $\delta > 0$ and all $n$ from some subsequence $\mathcal{N}$ of natural numbers. Choose a subsequence $\mathcal{N}_1 \subseteq \mathcal{N}$ so that the sequence $\{b_{N_n}/d_n\}_{n \in \mathcal{N}_1}$ weakly converges to some random variable $U$ (this is possible due to the tightness of the family $\{b_{N_n}/d_n\}_{n \ge 1}$ established above). But then $W_n(g) \Rightarrow \sqrt{U}\, g$ $(n \to \infty,\ n \in \mathcal{N}_1)$ for any $g \in \mathbb{R}^r$. Applying Theorem 1 along $n \in \mathcal{N}_1$, with condition (6) playing the role of condition (3), we make sure that $U \in \mathcal{W}(Z)$, since condition (8) provides the coincidence of the limits of all weakly convergent subsequences. So, we arrive at a contradiction to the assumption that $\beta_n \ge \delta$ for all $n \in \mathcal{N}_1$. Hence, $\beta_n \to 0$ as $n \to \infty$.
For each $n = 1, 2, \dots$, choose a random variable $U_n$ from $\mathcal{W}(Z)$ satisfying the condition $L(b_{N_n}/d_n, U_n) \le \beta_n + \tfrac1n$. This sequence obviously satisfies $L(b_{N_n}/d_n, U_n) \to 0$ as $n \to \infty$. Now consider the structure of the set $\mathcal{W}(Z)$. This set contains all the random variables defining the family of special mixtures of multivariate normal laws considered in Lemma 1, according to which this family is identifiable. So, whatever the random element $Z$ is, the set $\mathcal{W}(Z)$ contains at most one element. Therefore, the condition $L(b_{N_n}/d_n, U_n) \to 0$ is actually equivalent to $b_{N_n}/d_n \Rightarrow U$ as $n \to \infty$, that is, to condition (iii) of the theorem. The theorem is proved.

Corollary 1. Under the conditions of Theorem 2, the non-randomly normalized random sums $S_{N_n}/\sqrt{d_n}$ are asymptotically normal with some covariance matrix $\Sigma_0$ if and only if there exists a number $c > 0$ such that $b_{N_n}/d_n \to c$ in probability as $n \to \infty$. Moreover, in this case, $\Sigma_0 = c\Sigma$. This statement immediately follows from Theorem 2 with the account of Lemma 1.

Now consider a formally more general setting. Let $N_1, N_2, \dots$ and $W_1, W_2, \dots$ be random variables and random vectors, respectively, such that for each $n \ge 1$ the random variable $N_n$ takes only natural values and is independent of the sequence $W_1, W_2, \dots$. Let

$$T_n = T_n(W_1, \dots, W_n)$$

be a statistic taking values in $\mathbb{R}^r$, $r \ge 1$. For each $n \ge 1$, define the random vector (random element) $T_{N_n}$ by setting $T_{N_n}(\omega) = T_{N_n(\omega)}\bigl(W_1(\omega), \dots, W_{N_n(\omega)}(\omega)\bigr)$ for every elementary outcome $\omega \in \Omega$.
We shall say that a statistic $T_n$ is asymptotically normal with the asymptotic covariance matrix $\Sigma$ if there exists a non-random $r$-dimensional vector $t$ such that

$$\mathcal{L}\bigl(\sqrt{n}\,(T_n - t)\bigr) \Rightarrow \Phi_\Sigma \quad (n \to \infty). \tag{16}$$

Examples of asymptotically normal statistics are well known. Under certain conditions, the property of asymptotic normality is inherent in maximum likelihood estimators, sample moments, sample quantiles, etc.
Our nearest aim is to describe the asymptotic behavior of the random elements T N n , that is, of statistics constructed from samples with random sizes N n .
Again let $\{d_n\}_{n \ge 1}$ be an infinitely increasing sequence of positive numbers. Now set $Z_n = \sqrt{d_n}\,(T_{N_n} - t)$.

Theorem 3. Let $N_n \to \infty$ in probability as $n \to \infty$. Assume that the statistic $T_n$ is asymptotically normal in the sense of (16) with an asymptotic covariance matrix $\Sigma$. Then a distribution $F$ such that

$$\mathcal{L}(Z_n) \Rightarrow F \quad (n \to \infty)$$

exists if and only if there exists a distribution function $V(x)$ satisfying the conditions:
(i) $V(x) = 0$ for $x < 0$;
(ii) $F(A) = \mathsf{E}\Phi_{U\Sigma}(A)$, $A \in \mathcal{B}(\mathbb{R}^r)$, where $U$ is a random variable with the distribution function $V(x)$;
(iii) $d_n/N_n \Rightarrow U$ as $n \to \infty$.
The proof of Theorem 3 relies on Theorem 1, with (16) playing the role of (3), and on Lemma 1, and differs from the proof of Theorem 2 only in that the ratio $b_{N_n}/d_n$ is replaced by $d_n/N_n$.
Corollary 2. Under the conditions of Theorem 3, the statistic $T_{N_n}$ is asymptotically normal with some covariance matrix $\Sigma_0$ if and only if there exists a number $c > 0$ such that $N_n/d_n \to c$ in probability as $n \to \infty$. Moreover, in this case, $\Sigma_0 = c^{-1}\Sigma$. This statement immediately follows from Theorem 3 with the account of Lemma 1.

Some remarks on the heavy-tailedness of scale mixtures of normals
The one-dimensional marginals of the multivariate limit laws in Theorems 2 and 3 are scale mixtures of normals with zero means, that is, distributions of the form $\mathsf{E}\Phi(x/U)$, $x \in \mathbb{R}$, where $\Phi(x)$ is the standard normal distribution function and $U$ is a nonnegative random variable. It turns out, although this is by no means evident, that these distributions are always leptokurtic, having a sharper peak and heavier tails than the normal law itself.
It is easy to see that if $Z$ has the distribution function $\mathsf{E}\Phi(x/U)$, then $Z \stackrel{d}{=} X \cdot U$, where $X$ is a standard normal random variable independent of $U$. First, as a measure of leptokurtosity, consider the excess (kurtosis) coefficient traditionally used in descriptive statistics. Recall that for a random variable $Y$ with $\mathsf{E}Y^4 < \infty$, the excess coefficient is defined as

$$\kappa(Y) = \frac{\mathsf{E}(Y - \mathsf{E}Y)^4}{(\mathsf{D}Y)^2}.$$

For the normal law, $\kappa = 3$. Densities with sharper peaks (and, respectively, heavier tails) than the normal density have $\kappa > 3$, whereas $\kappa < 3$ corresponds to densities with flatter peaks.
Lemma 2. Let $X$ and $U$ be independent random variables with finite fourth moments; moreover, let $\mathsf{E}X = 0$ and $\mathsf{P}(U \ge 0) = 1$. Then $\kappa(X \cdot U) \ge \kappa(X)$. For the proof, see [10]. So, if $X$ is a standard normal random variable and $U$ is a nonnegative random variable with $\mathsf{E}U^4 < \infty$ independent of $X$, then $\kappa(X \cdot U) \ge 3$, and $\kappa(X \cdot U) = 3$ if and only if $U$ is non-random.
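Since for a standard normal $X$ independent of $U$ one has $\mathsf{E}(XU)^4 = 3\,\mathsf{E}U^4$ and $\mathsf{D}(XU) = \mathsf{E}U^2$, the excess coefficient of the mixture is $\kappa(X \cdot U) = 3\,\mathsf{E}U^4/(\mathsf{E}U^2)^2 \ge 3$ by the Cauchy–Schwarz inequality. A tiny deterministic check (the two-point mixing law is an arbitrary illustrative choice):

```python
# U takes the values 1 and 2 with probability 1/2 each (illustrative choice)
values, probs = [1.0, 2.0], [0.5, 0.5]

EU2 = sum(p * u ** 2 for u, p in zip(values, probs))  # E U^2 = 2.5
EU4 = sum(p * u ** 4 for u, p in zip(values, probs))  # E U^4 = 8.5

# excess coefficient of X * U for a standard normal X independent of U:
# E(XU)^4 = 3 * EU4 and D(XU) = EU2, hence kappa = 3 * EU4 / EU2**2
kappa = 3.0 * EU4 / EU2 ** 2
print(kappa)  # 4.08 > 3: the scale mixture is leptokurtic
```

Any non-degenerate $U$ gives a value strictly above the normal benchmark 3.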
Using the Jensen inequality, we can easily obtain one more inequality directly connecting the tails of the normal mixtures with the tails of the normal distribution.
Lemma 3. Assume that the random variable $U$ satisfies the normalization condition $\mathsf{E}U^{-1} = 1$. Then

$$\mathsf{E}\Phi(x/U) \le \Phi(x), \qquad x \ge 0.$$

From Lemma 3, it follows that if $X$ is the standard normal random variable and $U$ is a nonnegative random variable independent of $X$ with $\mathsf{E}U^{-1} = 1$, then for any $x > 0$

$$\mathsf{P}(|X \cdot U| > x) \ge \mathsf{P}(|X| > x),$$

that is, scale mixtures of normal laws are always more leptokurtic and have heavier tails than the normal laws themselves.
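Lemma 3 is equally easy to check numerically: $\mathsf{P}(|X \cdot U| > x) = \mathsf{E}\,\mathrm{erfc}\bigl((x/U)/\sqrt{2}\bigr)$ for a standard normal $X$. The sketch below takes a two-point $U$ with $\mathsf{E}U^{-1} = 1$ (an arbitrary illustrative choice) and compares the mixture tail with the normal tail on a grid of points:

```python
import math

# two-point mixing variable with E[1/U] = 1: U = 2/3 or 2, each w.p. 1/2
values, probs = [2.0 / 3.0, 2.0], [0.5, 0.5]
assert abs(sum(p / u for u, p in zip(values, probs)) - 1.0) < 1e-12

def normal_tail(x):
    # P(|X| > x) for a standard normal X
    return math.erfc(x / math.sqrt(2.0))

def mixture_tail(x):
    # P(|X * U| > x) = E P(|X| > x / U)
    return sum(p * normal_tail(x / u) for u, p in zip(values, probs))

for x in [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]:
    assert mixture_tail(x) >= normal_tail(x)  # Lemma 3: heavier tails
print("mixture tail dominates the normal tail on the whole grid")
```

The dominance is strict for every $x > 0$ whenever $U$ is non-degenerate, by the strict convexity used in the Jensen argument.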
The class of scale mixtures of normal laws is very rich and contains distributions with widely varying tail behavior. For example, this class contains the Student distributions with arbitrary (not necessarily integer) numbers of degrees of freedom (including the Cauchy distribution), symmetric stable distributions (see the "multiplication theorem" 3.3.1 in [15]), symmetric fractional stable distributions (see [16]), symmetrized gamma distributions with arbitrary shape and scale parameters (see [10]), and symmetrized Weibull distributions with shape parameters belonging to the interval $(0, 1]$ (see [17,18]). As an example, in the next section, we will discuss the conditions for the convergence of the distributions of statistics constructed from samples with random sizes to the multivariate Student distribution.

Convergence of the distributions of random sums of random vectors with finite covariance matrices to multivariate elliptically contoured stable and Linnik distributions

Convergence of the distributions of random sums of random vectors to multivariate stable laws
Let $\Sigma$ be a positive definite $(r \times r)$-matrix and let $\alpha \in (0, 2]$. A random vector $Z_{\alpha,\Sigma}$ is said to have the (centered) elliptically contoured stable distribution $G_{\alpha,\Sigma}$ with characteristic exponent $\alpha$ if its characteristic function $g_{\alpha,\Sigma}(t)$ has the form

$$g_{\alpha,\Sigma}(t) = \exp\{-(t^\top \Sigma t)^{\alpha/2}\}, \qquad t \in \mathbb{R}^r.$$

Univariate stable distributions are popular examples of heavy-tailed distributions. Their moments of orders $\delta \ge \alpha$ do not exist (the only exception is the normal law corresponding to $\alpha = 2$). Stable laws, and only they, can be limit distributions for sums of a non-random number of independent identically distributed random variables with infinite variances under linear normalization. Here it will be shown that they can also be limiting for random sums of random vectors with finite covariance matrices. The result of this subsection generalizes the main theorem of [19] to the multivariate case.
By $\zeta_\alpha$, we will denote a positive random variable with the one-sided stable distribution corresponding to the characteristic function

$$\mathfrak{g}_\alpha(t) = \exp\bigl\{-|t|^\alpha \exp\bigl\{-\tfrac12 i\pi\alpha\,\mathrm{sign}\,t\bigr\}\bigr\}, \qquad t \in \mathbb{R},$$

with $0 < \alpha \le 1$ (for more details, see [15] or [4]).
The Laplace–Stieltjes transform of $\zeta_\alpha$ is

$$\mathsf{E}\exp\{-s\zeta_\alpha\} = \exp\{-s^\alpha\}, \qquad s \ge 0 \tag{17}$$

(see Proposition 2.5.2 in [4]). In other words, if $Y$ is a random vector with $\mathcal{L}(Y) = \Phi_\Sigma$ independent of $\zeta_{\alpha/2}$, then

$$Z_{\alpha,\Sigma} \stackrel{d}{=} \sqrt{2\zeta_{\alpha/2}}\cdot Y, \quad \text{that is,} \quad G_{\alpha,\Sigma}(A) = \mathsf{E}\Phi_{2\zeta_{\alpha/2}\Sigma}(A), \quad A \in \mathcal{B}(\mathbb{R}^r). \tag{18}$$

As in Section 3, let $X_1, X_2, \dots$ be independent $\mathbb{R}^r$-valued random vectors. For $n \in \mathbb{N}$, denote $S_n = X_1 + \dots + X_n$. Consider a sequence of integer-valued positive random variables $\{N_n\}_{n \ge 1}$ such that for each $n \ge 1$ the random variable $N_n$ is independent of the sequence $\{S_k\}_{k \ge 1}$. Let $\{b_n\}_{n \ge 1}$ be an infinitely increasing sequence of positive numbers providing convergence (6) with some positive definite matrix $\Sigma$.

Theorem 4. Let $N_n \to \infty$ in probability as $n \to \infty$. Assume that the random vectors $X_1, X_2, \dots$ satisfy condition (6) with an asymptotic covariance matrix $\Sigma$. Then

$$\mathcal{L}\bigl(S_{N_n}/\sqrt{d_n}\bigr) \Rightarrow G_{\alpha,\Sigma} \quad (n \to \infty)$$

with some infinitely increasing sequence of positive numbers $\{d_n\}_{n \ge 1}$ and some $\alpha \in (0, 2]$, if and only if

$$\frac{b_{N_n}}{d_n} \Rightarrow 2\zeta_{\alpha/2} \quad (n \to \infty).$$

Proof. This theorem is a direct consequence of Theorem 2 with the account of relations (17) and (18).
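For completeness, the mixture representation behind relation (18) can be checked directly on characteristic functions: conditioning on $\zeta_{\alpha/2}$ and using its Laplace–Stieltjes transform (17) (with the normalization $g_{\alpha,\Sigma}(t) = \exp\{-(t^\top\Sigma t)^{\alpha/2}\}$ adopted in this subsection),

```latex
\mathsf{E}\exp\bigl\{i\,t^{\top}\sqrt{2\zeta_{\alpha/2}}\,Y\bigr\}
  = \mathsf{E}\exp\bigl\{-\zeta_{\alpha/2}\,t^{\top}\Sigma t\bigr\}
  = \exp\bigl\{-\bigl(t^{\top}\Sigma t\bigr)^{\alpha/2}\bigr\}
  = g_{\alpha,\Sigma}(t),
  \qquad t \in \mathbb{R}^{r},
```

so that $\sqrt{2\zeta_{\alpha/2}}\,Y$ indeed has the elliptically contoured stable law, i.e., $G_{\alpha,\Sigma}(\cdot) = \mathsf{E}\Phi_{2\zeta_{\alpha/2}\Sigma}(\cdot)$.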

Convergence of the distributions of random sums of random vectors with finite covariance matrices to multivariate elliptically contoured Linnik distributions
In 1953, Yu. V. Linnik [20] introduced the class of univariate symmetric probability distributions defined by the characteristic functions

$$f_\alpha^L(t) = \frac{1}{1 + |t|^\alpha}, \qquad t \in \mathbb{R},$$

where $\alpha \in (0, 2]$. Later, the distributions of this class were called Linnik distributions [21] or $\alpha$-Laplace distributions [22]. Here the first term will be used since it has become conventional. With $\alpha = 2$, the Linnik distribution turns into the Laplace distribution corresponding to the density

$$f^\Lambda(x) = \tfrac12 e^{-|x|}, \qquad x \in \mathbb{R}.$$

A random variable with the Linnik distribution with parameter $\alpha$ will be denoted $L_{1,\alpha}$.
The Linnik distributions possess many interesting analytic properties (see, e.g., [17,18] and the references therein) but, perhaps, most often Linnik distributions are recalled as examples of geometric stable distributions often used as heavy-tailed models of some statistical regularities in financial data [23,24].
The multivariate Linnik distribution was introduced by D. N. Anderson in [25], where it was proved that the function

$$f_{\alpha,\Sigma}^L(t) = \frac{1}{1 + (t^\top \Sigma t)^{\alpha/2}}, \qquad t \in \mathbb{R}^r, \tag{19}$$

is the characteristic function of an $r$-variate probability distribution, where $\Sigma$ is a positive definite $(r \times r)$-matrix. In [25], the distribution corresponding to the characteristic function (19) was called the $r$-variate Linnik distribution. For the properties of the multivariate Linnik distributions, see [25,26].

The $r$-variate Linnik distribution can also be defined in another way. For this purpose, recall that the distribution of a nonnegative random variable $M_\delta$ whose Laplace transform is

$$\mathsf{E}\exp\{-s M_\delta\} = \frac{1}{1 + s^\delta}, \qquad s \ge 0, \tag{20}$$

where $0 < \delta \le 1$, is called the Mittag-Leffler distribution. It is another example of a heavy-tailed geometrically stable distribution; for more details, see, for example, [17,18] and the references therein. The Mittag-Leffler distributions are of serious theoretical interest in problems related to thinned (or rarefied) homogeneous flows of events such as renewal processes or anomalous diffusion or relaxation phenomena; see [27,28] and the references therein.

In [18], it was demonstrated that

$$L_{1,\alpha} \stackrel{d}{=} \sqrt{2M_{\alpha/2}}\cdot Y_1, \tag{21}$$

where $Y_1$ is a random variable with the standard univariate normal distribution independent of the random variable $M_{\alpha/2}$ with the Mittag-Leffler distribution with parameter $\alpha/2$. Now let $Y$ be a random vector such that $\mathcal{L}(Y) = \Phi_\Sigma$, where $\Sigma$ is a positive definite $(r \times r)$-matrix, independent of the random variable $M_{\alpha/2}$. By analogy with (21), introduce the random vector $L_{r,\alpha,\Sigma}$ as

$$L_{r,\alpha,\Sigma} = \sqrt{2M_{\alpha/2}}\cdot Y. \tag{22}$$

Then, in accordance with what has been said in Section 2, $\mathcal{L}(L_{r,\alpha,\Sigma}) = \mathsf{E}\Phi_{2M_{\alpha/2}\Sigma}(\cdot)$. The distribution (22) will be called the (centered) elliptically contoured multivariate Linnik distribution.
Using Remark 1, we can easily make sure that the two definitions of the multivariate Linnik distribution coincide. Indeed, with the account of (20), according to Remark 1, the characteristic function of the random vector $L_{r,\alpha,\Sigma}$ defined by (22) has the form

$$\mathsf{E}\exp\{i t^\top L_{r,\alpha,\Sigma}\} = \mathsf{E}\exp\{-\tfrac12 \cdot 2M_{\alpha/2}\, t^\top \Sigma t\} = \frac{1}{1 + (t^\top \Sigma t)^{\alpha/2}}, \qquad t \in \mathbb{R}^r,$$

which coincides with Anderson's definition (19). Our definition (22), together with Theorem 2, opens the way to a theorem stating that the multivariate Linnik distribution can be limiting not only for geometric random sums of independent identically distributed random vectors with infinite second moments [29] but also for random sums of independent random vectors with finite covariance matrices.

Theorem 5. Let $N_n \to \infty$ in probability as $n \to \infty$. Assume that the random vectors $X_1, X_2, \dots$ satisfy condition (6) with an asymptotic covariance matrix $\Sigma$. Then

$$\mathcal{L}\bigl(S_{N_n}/\sqrt{d_n}\bigr) \Rightarrow \mathcal{L}(L_{r,\alpha,\Sigma}) \quad (n \to \infty)$$

with some infinitely increasing sequence of positive numbers $\{d_n\}_{n \ge 1}$ and some $\alpha \in (0, 2]$, if and only if

$$\frac{b_{N_n}}{d_n} \Rightarrow 2M_{\alpha/2}$$

as $n \to \infty$.
Proof. This theorem is a direct consequence of Theorem 2 with the account of relation (22).
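The case $\alpha = 2$ (the Laplace limit law) is easy to observe in simulation: if the number of summands is geometric with mean $n$, then $N_n/n$ converges to the exponential law, and the normalized random sum is approximately Laplace, with excess coefficient near 6 instead of the normal 3. A stdlib-only Python sketch (Rademacher summands, $n = 100$, and the seed are illustrative choices):

```python
import math
import random

random.seed(2)

n = 100       # mean of the geometric number of summands (illustrative)
reps = 10000  # Monte Carlo replications
p = 1.0 / n

def geometric(p):
    # trials up to the first success, mean 1/p
    return 1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))

def random_sum():
    # S_N / sqrt(n) with N geometric (mean n) and Rademacher summands
    return sum(1.0 if random.random() < 0.5 else -1.0
               for _ in range(geometric(p))) / math.sqrt(n)

z = [random_sum() for _ in range(reps)]
m2 = sum(t * t for t in z) / reps
m4 = sum(t ** 4 for t in z) / reps
kappa = m4 / m2 ** 2
print(kappa)  # near 6 (Laplace), far from the normal value 3
```

The observed excess coefficient matches the mixture formula $3\,\mathsf{E}U^2/(\mathsf{E}U)^2$ with $U$ exponential.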

Convergence of the distributions of asymptotically normal statistics to the multivariate Student distribution
The multivariate Student distribution is described, for example, in [30] (also see [31]). Consider an $r$-dimensional normal random vector $Y$ with zero vector of expectations and covariance matrix $\Sigma$. Assume that a random variable $W_\gamma$ has the chi-square distribution with parameter (the "number of degrees of freedom") $\gamma > 0$ (not necessarily integer) and is independent of $Y$. The distribution $P_{\gamma,\Sigma}$ of the random vector

$$Z = \frac{Y}{\sqrt{W_\gamma/\gamma}} \tag{23}$$

is called the multivariate Student distribution (with parameters $\gamma$ and $\Sigma$). For any $x \in \mathbb{R}^r$, the distribution density of $Z$ has the form

$$p_{\gamma,\Sigma}(x) = \frac{\Gamma\bigl(\frac{\gamma+r}{2}\bigr)}{\Gamma\bigl(\frac{\gamma}{2}\bigr)\,(\pi\gamma)^{r/2}\,|\Sigma|^{1/2}} \Bigl(1 + \frac{x^\top \Sigma^{-1} x}{\gamma}\Bigr)^{-(\gamma+r)/2}.$$

According to Theorem 3, the multivariate Student distribution is the result of the transformation of the limit distribution of an asymptotically normal (in the sense of (16)) statistic under the replacement of the sample size by a random variable whose asymptotic distribution is, up to normalization, chi-square. Consider this case in more detail.
Let $G_{m,m}(x)$ be the gamma-distribution function with the shape parameter coinciding with the scale parameter and equal to $m$:

$$G_{m,m}(x) = \frac{m^m}{\Gamma(m)} \int_0^x e^{-my} y^{m-1}\, dy \ \text{ if } x > 0, \qquad G_{m,m}(x) = 0 \ \text{ if } x \le 0.$$

Theorem 6. Let $\gamma > 0$ be arbitrary, let $\Sigma$ be a positive definite matrix, and let $\{d_n\}_{n \ge 1}$ be an infinitely increasing sequence of positive numbers. Assume that $N_n \to \infty$ in probability as $n \to \infty$ and that the statistic $T_n$ is asymptotically normal in the sense of (16). Then the convergence

$$\mathcal{L}\bigl(\sqrt{d_n}\,(T_{N_n} - t)\bigr) \Rightarrow P_{\gamma,\Sigma} \quad (n \to \infty)$$

takes place if and only if

$$\mathsf{P}(N_n < d_n x) \Rightarrow G_{\gamma/2,\gamma/2}(x) \quad (n \to \infty),$$

where $G_{\gamma/2,\gamma/2}(x)$ is the gamma-distribution function with coinciding shape and scale parameters equal to $\gamma/2$.
Proof. This statement is a direct consequence of Theorem 3, representation (23), and Lemma 1.

Let $N_{p,m}$ be a random variable with the negative binomial distribution

$$\mathsf{P}(N_{p,m} = k) = C_{m+k-2}^{k-1}\, p^m (1-p)^{k-1}, \qquad k = 1, 2, \dots \tag{24}$$

Here $m > 0$ and $p \in (0, 1)$ are parameters; for non-integer $m$, the quantity $C_{m+k-2}^{k-1}$ is defined as

$$C_{m+k-2}^{k-1} = \frac{\Gamma(m+k-1)}{(k-1)!\,\Gamma(m)}.$$

In particular, for $m = 1$, relation (24) determines the geometric distribution. It is well known that

$$\mathsf{E}N_{p,m} = 1 + \frac{m(1-p)}{p},$$

so that $\mathsf{E}N_{p,m} \sim m/p$ as $p \to 0$. As is known, the negative binomial distribution with natural $m$ admits an illustrative interpretation in terms of Bernoulli trials: up to a unit shift, the random variable with distribution (24) counts the successes in a sequence of Bernoulli trials held up to the $m$th failure, if the probability of success in a single trial is $1 - p$.
Lemma 4 [8]. For each $n \ge 1$, let the random variable $N_n$ have the negative binomial distribution (24) with parameters $p = \frac1n$ and $m > 0$. Then, as $n \to \infty$,

$$\mathsf{P}(N_n < mnx) \Rightarrow G_{m,m}(x),$$

where $G_{m,m}(x)$ is the gamma-distribution function with the shape parameter coinciding with the scale parameter and equal to $m$.
The proof is a simple exercise on characteristic functions; for more details, see [8].
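For integer $m$, Lemma 4 admits a quick Monte Carlo sanity check: per the parameterization (24), $N$ can be represented as 1 plus a sum of $m$ zero-based geometric variables, and with $p = 1/n$ the ratio $N/(mn)$ should then be approximately gamma with coinciding shape and scale parameters $m$, that is, with mean 1 and variance $1/m$. A stdlib-only sketch of this reading of the lemma (all numeric parameters are illustrative):

```python
import math
import random

random.seed(3)

m, n, reps = 3, 400, 20000  # integer shape m, p = 1/n (illustrative)
p = 1.0 / n

def neg_binomial(m, p):
    # N per (24): 1 + the number of successes before the m-th failure,
    # i.e., 1 plus a sum of m zero-based geometric(p) variables
    total = 1
    for _ in range(m):
        total += int(math.log(1.0 - random.random()) / math.log(1.0 - p))
    return total

ratios = [neg_binomial(m, p) / (m * n) for _ in range(reps)]
mean = sum(ratios) / reps
var = sum((r - mean) ** 2 for r in ratios) / reps
print(mean, var)  # approximately 1 and 1/m, as for Gamma(m, m)
```

Increasing $n$ tightens the agreement with the limiting gamma law.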
Corollary 3. Let $m > 0$ be arbitrary. Assume that for each $n \ge 1$ the random variable $N_n$ has the negative binomial distribution (24) with parameters $p = \frac1n$ and $m$. Let the statistic $T_n$ be asymptotically normal in the sense of (16). Then

$$\mathcal{L}\bigl(\sqrt{mn}\,(T_{N_n} - t)\bigr) \Rightarrow P_{2m,\Sigma} \quad (n \to \infty),$$

where $P_{2m,\Sigma}$ is the $r$-variate Student distribution with parameters $\gamma = 2m$ and $\Sigma$.
Proof. By Lemma 4, we have $N_n/(mn) \Rightarrow U_m$ as $n \to \infty$, where $U_m$ is a random variable with the gamma-distribution function with coinciding shape and scale parameters equal to $m$, that is, with $G_{\gamma/2,\gamma/2}$ for $\gamma = 2m$. Now the desired assertion directly follows from Theorem 6 applied with $d_n = mn$.
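For $m = 1$ (a geometric sample size, $d_n = n$) the corollary predicts the Student $t(2)$ limit for, say, the sample mean of centered unit-variance observations. Since the $t(2)$ distribution function is $F(x) = \frac12 + x/(2\sqrt{2 + x^2})$, we get $\mathsf{P}(|T| > 2) = 1 - 2/\sqrt{6} \approx 0.18$, against $\approx 0.046$ for a normal limit. A stdlib-only check (normal observations, $n = 200$, and the seed are illustrative choices):

```python
import math
import random

random.seed(4)

n = 200       # mean sample size (illustrative)
reps = 10000  # Monte Carlo replications
p = 1.0 / n

def geometric(p):
    # trials up to the first success, mean 1/p
    return 1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))

def normalized_mean():
    # sqrt(n) * (sample mean of N standard normal observations), N geometric
    N = geometric(p)
    return math.sqrt(n) * sum(random.gauss(0.0, 1.0) for _ in range(N)) / N

tail = sum(abs(normalized_mean()) > 2.0 for _ in range(reps)) / reps
print(tail)  # near 1 - 2/sqrt(6) ≈ 0.18, not the normal ≈ 0.046
```

The simulated tail frequency sits near the heavy Student value, roughly four times the normal one.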
Remark 2. The $r$-variate Cauchy distribution ($\gamma = 1$) appears in the situation described in Corollary 3 when the sample size $N_n$ has the negative binomial distribution with the parameters $p = \frac1n$, $m = \frac12$, and $n$ is large.

Remark 3. In the case where the sample size $N_n$ has the negative binomial distribution with the parameters $p = \frac1n$ and $m = 1$ (that is, the geometric distribution with the parameter $p = \frac1n$), as $n \to \infty$ we obtain the limit $r$-variate Student distribution with parameters $\gamma = 2$ and $\Sigma$. Moreover, if $\Sigma = I_r$ (so that the $r$-variate Student distribution is spherically symmetric), then its one-dimensional marginals have the form (1). As we have already noted, distribution (1) was apparently first introduced as a limit distribution for the sample median in a sample with geometrically distributed random size in [11]. It is worth noticing that in the cited paper [11], distribution (1) was not identified as the Student distribution with two degrees of freedom.
Thus, the main conclusion of this section can be formulated as follows. If the number of random factors that determine the observed value of a random variable is itself random with a distribution that can be approximated by the gamma distribution with coinciding shape and scale parameters (e.g., is negative binomial with probability of success close to one, see Lemma 4), then those functions of the random factors that are regarded as asymptotically normal in the classical situation are actually asymptotically Student with considerably heavier tails. Hence, since gamma models and/or negative binomial models are widely applicable (to confirm this, it may be noted that the negative binomial distribution is a mixed Poisson distribution with gamma mixing distribution, a fact widely used in insurance), the Student distribution can be used in descriptive statistics as a rather reasonable heavy-tailed asymptotic approximation.

The asymptotic distribution of sample quantiles in samples with sizes generated by a Cox process
Sometimes, when the performance of a technical or financial system is analyzed, a forecast of its main characteristics is made on the basis of data accumulated during a certain period of the functioning of the system. As a rule, data are accumulated as a result of some "informative events" that occur during this period. For example, inference concerning the distribution of insurance claims, which is very important for the estimation of, say, the ruin probability of an insurance company, is usually performed on the basis of the sample W_1, W_2, …, W_{N(T)} of the values of insurance claims that arrived within a certain time interval [0, T] (here N(T) denotes the number of claims that arrived during the time interval [0, T]). Moreover, this inference is typically used for the prediction of the value of the ruin probability for the next period [T, 2T]. But it is obvious (at least in the example above) that the observed number of informative events that occurred during the time interval [0, T] is actually a realization of a random variable, because the number of insurance claims arriving within this interval follows a stochastic counting process. If the random character of the number of available observations is not taken into consideration, then all that can be done is a conditional forecast. To obtain a complete prediction that takes account of the randomness of the number of "informative events," we should use results similar to Theorems 2 and 3. One rather realistic and general assumption concerning N(t), the number of observations accumulated by time t, is that N(t) is a Cox process. In this section, as an example, we will consider the asymptotic behavior of sample quantiles constructed from a sample whose size is determined by a Cox process. As we have already noted in the introduction, this problem is very important for the proper application of such risk measures as VaR (Value-at-Risk) in, say, financial engineering.
Our aim here is to give necessary and sufficient conditions for the weak convergence of the joint distributions of sample quantiles constructed from samples with random sizes driven by a Cox process and to describe the r-variate limit distributions emerging here, thus extending Mosteller's Theorem 4 to samples with random sizes. The results of this section extend those of [36]. In particular, the following two conditions play a key role:

1. One-dimensional distributions of the normalized Cox process weakly converge to the distribution of some random variable Z as t → ∞.
2. One-dimensional distributions of the controlling process Λ(t), appropriately normalized, converge to the same distribution.

For the proof of Lemmas 5 and 6 see [37]. Now we proceed to the main results of this section. In addition to the notation introduced above, for a positive integer n set Q_j(n) = W_{([λ_j n] + 1)}, j = 1, …, r, Q(n) = (Q_1(n), …, Q_r(n)), and ξ = (ξ_{λ_1}, …, ξ_{λ_r}). Let d(t) be an infinitely increasing positive function.

Theorem 8. Let Λ(t) → ∞ in probability as t → ∞. If p(x) is differentiable in neighborhoods of the quantiles ξ_{λ_i} and p(ξ_{λ_i}) ≠ 0, i = 1, …, r, then the convergence of the appropriately normalized vector of sample quantiles to some random vector Z takes place if and only if there exists a nonnegative random variable U such that