Open access peer-reviewed chapter

Gamma-Kumaraswamy Distribution in Reliability Analysis: Properties and Applications

Written By

Indranil Ghosh and Gholamhossein G. Hamedani

Submitted: 14 April 2016 Reviewed: 09 November 2016 Published: 26 April 2017

DOI: 10.5772/66821

From the Edited Volume

Advances in Statistical Methodologies and Their Application to Real Problems

Edited by Tsukasa Hokimoto

Abstract

In this chapter, a new generalization of the Kumaraswamy distribution, namely the gamma-Kumaraswamy distribution, is defined and studied. Several distributional properties are discussed, including limiting behavior, mode, quantiles, moments, skewness, kurtosis, Shannon’s entropy, and order statistics. Under the classical framework, maximum likelihood estimation is proposed for inference on this distribution. To exhibit the utility of the model, we report the results of fitting the gamma-Kumaraswamy distribution to two real data sets.

Keywords

• gamma-Kumaraswamy distribution
• Renyi’s entropy
• reliability parameter
• stochastic ordering
• characterizations

1. Introduction

The generalization of a distribution by mixing it with another distribution has, over the years, provided a mathematically grounded way to model a wide variety of random phenomena. These generalized distributions are effective and flexible models for analyzing and interpreting random durations in a possibly heterogeneous population. In many situations, observed data may be assumed to have come from such a mixture population of two or more distributions.

The two-parameter gamma and the two-parameter Kumaraswamy distributions are among the most popular distributions for analyzing lifetime data. The gamma distribution is well known and has several desirable properties [1].

A serious limitation of the gamma distribution, however, is that its distribution function (or survival function) is not available in closed form when the shape parameter is not an integer, so numerical methods are required to evaluate these quantities. As a consequence, the gamma distribution is less attractive than the Kumaraswamy distribution [2], which has a tractable distribution function, survival function, and hazard function. In this chapter, we consider a four-parameter gamma-Kumaraswamy distribution. It has many properties that are quite similar to those of a gamma distribution, but it has an explicit expression for the distribution function and the survival function. The major motivation of this chapter is to introduce a new family of distributions, to compare this family with the Kumaraswamy and gamma families, and to provide the practitioner with an additional option, in the hope that it may give a better fit than a gamma or Kumaraswamy family in certain situations. The gamma-Kumaraswamy distribution is a generalization of the Kumaraswamy distribution that can exhibit various shapes (Figure 1). This gives it more flexibility than the Kumaraswamy distribution in modeling different data sets. The property of left-skewness is a rare characteristic, as it is not enjoyed by several generalizations of the Kumaraswamy distribution. Our proposed model differs from that of Ref. [3], where the authors proposed a generalized gamma-generated distribution with an extra positive parameter for any continuous baseline distribution G.

The rest of the chapter is organized as follows. In Section 2, we propose the gamma-Kumaraswamy distribution [GK(α, β, a, b)]. In Section 3, we study various properties of the GK(α, β, a, b), including the limiting behavior, transformation, and the mode. In Section 4, the moment generating function, the moments, the mean deviations from the mean and the median, and Renyi’s entropy are studied. In Section 5, we consider the maximum likelihood estimation of the GK(α, β, a, b). In Section 6, we provide an expression for the reliability parameter for two independent GK(α, β, a, b) random variables with different choices of the parameters α and β but a fixed choice of the two shape parameters of the Kumaraswamy distribution. In Section 7, we discuss the moment generating function of the r-th order statistic and the limiting distribution of the sample minimum and the sample maximum for a random sample of size n drawn from GK(α, β, a, b). An application of GK(α, β, a, b) is discussed in Section 8. Certain characterizations of GK(α, β, a, b) are presented in Section 9. In Section 10, some concluding remarks are made.

2. The gamma-Kumaraswamy distribution

We consider the gamma-X class of distributions, whose parent density is

$$f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, \frac{g(x)}{\bar{G}^{2}(x)} \exp\left(-\frac{G(x)}{\beta \bar{G}(x)}\right) \left(\frac{G(x)}{\bar{G}(x)}\right)^{\alpha-1}, \quad x > 0, \tag{1}$$

where α and β are positive parameters, $g(x)$ [$G(x)$] is the density function [cumulative distribution function] of the random variable X, and $\bar{G}(x)$ is the survival function of the associated random variable X.

If X has density Eq. (1), then the random variable $W = G(X)/\bar{G}(X)$ has a gamma distribution with parameters α and β. The converse is also true. Here, we take G(·) to be the cdf of a Kumaraswamy distribution with parameters a and b, that is, $G(x) = 1-(1-x^{a})^{b}$ for $0 < x < 1$. Then, the cdf of the gamma-Kumaraswamy (hereafter GK) distribution reduces to

$$F(x) = \int_{0}^{\frac{1-(1-x^{a})^{b}}{(1-x^{a})^{b}}} \frac{e^{-w/\beta}\, w^{\alpha-1}}{\Gamma(\alpha)\beta^{\alpha}}\, dw = \gamma_{1}\left(\alpha, \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right), \quad 0 < x < 1, \tag{2}$$

where $\gamma_{1}(\alpha, z) = \Gamma(\alpha, z)/\Gamma(\alpha)$ is the regularized incomplete gamma function, with $\Gamma(\alpha, x) = \int_{0}^{x} u^{\alpha-1} e^{-u}\, du$. The density and hazard functions corresponding to Eq. (2) are given, respectively, by

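$$f(x) = \frac{ab\, \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)}{\Gamma(\alpha)\beta^{\alpha}}\, \frac{1}{(1-x^{a})^{2}} \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\alpha-1} x^{a-1}, \quad 0 < x < 1, \tag{3}$$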

and

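$$h_{F}(x) = \frac{\left((1-x^{a})^{-b}-1\right)^{\alpha-1} ab\, x^{a-1} (1-x^{a})^{b-1} \exp\left(-\beta^{-1}\left((1-x^{a})^{-b}-1\right)\right)}{\beta^{\alpha} (1-x^{a})^{b+1} \left(1-\Gamma\left(\alpha, \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)\right)}. \tag{4}$$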

The percentile function of the GK distribution: The p-th percentile $x_{p}$ is defined by $F(x_{p}) = p$. From Eq. (2), we have $\gamma_{1}\left(\alpha, \frac{1-(1-x_{p}^{a})^{b}}{\beta(1-x_{p}^{a})^{b}}\right) = p$. Define $Z_{p} = \frac{1-(1-x_{p}^{a})^{b}}{\beta(1-x_{p}^{a})^{b}}$; then $Z_{p} = \gamma_{1}^{-1}(\alpha, p)$, where $\gamma_{1}^{-1}$ is the inverse of the regularized incomplete gamma function. Hence, $x_{p} = \left(1-(1+\beta Z_{p})^{-1/b}\right)^{1/a}$.
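The inversion behind the percentile formula can be written out step by step. The following is a worked sketch that uses only the definitions in Eq. (2), together with the identity $\frac{1-(1-x^{a})^{b}}{(1-x^{a})^{b}} = (1-x^{a})^{-b}-1$.

```latex
\begin{aligned}
F(x_p) = p
  &\iff \gamma_1(\alpha, Z_p) = p
  \iff Z_p = \gamma_1^{-1}(\alpha, p), \\
\beta Z_p
  &= \frac{1-(1-x_p^{a})^{b}}{(1-x_p^{a})^{b}}
   = (1-x_p^{a})^{-b} - 1, \\
(1-x_p^{a})^{-b} = 1 + \beta Z_p
  &\;\Longrightarrow\; 1 - x_p^{a} = (1+\beta Z_p)^{-1/b}, \\
x_p &= \left(1 - (1+\beta Z_p)^{-1/b}\right)^{1/a}.
\end{aligned}
```

Because $Z_p \ge 0$, the quantity $(1+\beta Z_p)^{-1/b}$ lies in $(0, 1]$, so $x_p$ always falls in $[0, 1)$, as it should for a distribution supported on the unit interval.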

In the density Eq. (3), a, b, and α are shape parameters and β is the scale parameter. It can be immediately verified that Eq. (3) is a density function. Plots of the GK density and survival rate function for selected parameter values are given in Figures 1 and 2, respectively.

If X ~ GK(a, b, α, β), then the survival function of X is

$$S(x) = 1-\gamma_{1}\left(\alpha, \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right). \tag{5}$$
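Eqs. (2) and (5) can be evaluated numerically with nothing more than a routine for the regularized incomplete gamma function. The block below is a minimal sketch using only the Python standard library; the function names `reg_lower_gamma`, `gk_w`, `gk_cdf`, and `gk_sf` are illustrative choices, not part of the chapter, and the power-series evaluation with a crude cutoff at z > 700 assumes moderate values of α.

```python
import math


def reg_lower_gamma(alpha: float, z: float, tol: float = 1e-15,
                    max_terms: int = 100_000) -> float:
    """Regularized lower incomplete gamma function gamma_1(alpha, z).

    Uses the power series
        gamma(alpha, z) = z^alpha e^{-z} sum_k z^k / (alpha (alpha+1) ... (alpha+k)).
    """
    if z <= 0.0:
        return 0.0
    if z > 700.0:
        # Crude guard against overflow; assumes alpha is far below 700,
        # in which case the value is 1 to machine precision.
        return 1.0
    term = 1.0 / alpha            # k = 0 term of the series
    total = term
    for k in range(1, max_terms):
        term *= z / (alpha + k)
        total += term
        if term < tol * total:
            break
    # Prefactor z^alpha e^{-z} / Gamma(alpha), assembled in log space.
    log_pref = alpha * math.log(z) - z - math.lgamma(alpha)
    return min(1.0, math.exp(log_pref + math.log(total)))


def gk_w(x: float, a: float, b: float) -> float:
    """W(x) = (1 - (1 - x^a)^b) / (1 - x^a)^b = (1 - x^a)^(-b) - 1."""
    return (1.0 - x ** a) ** (-b) - 1.0


def gk_cdf(x: float, alpha: float, beta: float, a: float, b: float) -> float:
    """GK cdf of Eq. (2): gamma_1(alpha, W(x) / beta) on 0 < x < 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0 or x ** a >= 1.0:   # second test guards against rounding
        return 1.0
    return reg_lower_gamma(alpha, gk_w(x, a, b) / beta)


def gk_sf(x: float, alpha: float, beta: float, a: float, b: float) -> float:
    """GK survival function of Eq. (5)."""
    return 1.0 - gk_cdf(x, alpha, beta, a, b)
```

For α = 1 the regularized incomplete gamma function reduces to $1-e^{-z}$, which gives a simple closed-form check of the routine.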

We simulate the GK distribution by solving the nonlinear equation

$$(1-u)-\gamma_{1}\left(\alpha, \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) = 0, \tag{6}$$

where u has the uniform (0,1) distribution. Some facts regarding the GK distribution are as follows:

• If X ~ GK(a, b, α, β), then $X^{m}$ ~ GK(a/m, b, α, β) for m > 0.

• Also, we have the following important result: If X ~ GK(1, b, α, β), then $X^{1/a}$ ~ GK(a, b, α, β) for a > 0.

• The GK distribution does not possess the reproductive property. In other words, if $X_{1}$ ~ GK($a_{1}, b_{1}, \alpha_{1}, \beta_{1}$) and $X_{2}$ ~ GK($a_{2}, b_{2}, \alpha_{2}, \beta_{2}$), then the distribution of the sum $S = X_{1} + X_{2}$ is not a GK distribution.

The first result shows that the GK distribution is closed under power transformation, an important property for information analysis. The second result is equally important because it provides a simple way to generate random variables following the GK distribution.
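As an alternative to solving Eq. (6) numerically, GK variates can be drawn through the gamma representation stated at the start of this section: if W ~ Gamma(α, β), then inverting $W = (1-X^{a})^{-b}-1$ gives $X = (1-(1+W)^{-1/b})^{1/a}$. The block below is a minimal sketch of that idea using only the Python standard library; the function name `sample_gk` and its `seed` argument are illustrative assumptions.

```python
import random


def sample_gk(alpha: float, beta: float, a: float, b: float,
              size: int, seed=None) -> list:
    """Draw `size` GK(alpha, beta, a, b) variates via the gamma representation.

    W ~ Gamma(shape=alpha, scale=beta) is mapped to
    X = (1 - (1 + W)^(-1/b))^(1/a), which inverts W = (1 - X^a)^(-b) - 1.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        w = rng.gammavariate(alpha, beta)   # second argument is the scale
        draws.append((1.0 - (1.0 + w) ** (-1.0 / b)) ** (1.0 / a))
    return draws
```

The final exponent 1/a is the power transformation of the second bullet above: calling the function with a = 1 and raising the result to the power 1/a yields the same draws.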

3. Properties of GK distribution

The following lemma establishes the relation between the GK(α, β, a, b) distribution and the gamma distribution.

Lemma 1. (Transformation): If a random variable X follows the GK(α, β, a, b) distribution, then $Y = \frac{1-(1-X^{a})^{b}}{(1-X^{a})^{b}}$ follows a gamma distribution with parameters α and β.

Proof. The proof follows immediately by using the transformation technique. □

The limiting behaviors of the GK pdf and its hazard function are given in the following theorem.

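Theorem 1. The limits of the GK density function, $f(x)$, and the hazard function, $h_{F}(x)$, are given by

$$\lim_{x \to 0^{+}} f(x) = \lim_{x \to 0^{+}} h_{F}(x) = \begin{cases} 0, & a > 1,\ b > 1,\ \alpha > 1, \\ \infty, & \min\{a, b\} < 1,\ \alpha < 1, \end{cases} \tag{7}$$

$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} h_{F}(x) = \begin{cases} 0, & b > 0,\ \alpha < 1, \\ \infty, & b < 0,\ \alpha > 1. \end{cases} \tag{8}$$

Proof. Straightforward and hence omitted. □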

Theorem 2. The mode of the GK distribution is the solution of the equation $k(x) = 0$, where

$$k(x) = (a-1) - 2x^{a}(1-x^{a}) + ab\, x^{a} (1-x^{a})^{-b} \left(-\beta^{-1} + \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{-1}\right). \tag{9}$$

Proof. The derivative of $f(x)$ in Eq. (3) can be written as

$$\frac{\partial}{\partial x} f(x) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\, ab\, x^{a-2} (1-x^{a})^{-2} \exp\left(-\beta^{-1}\left((1-x^{a})^{-b}-1\right)\right) \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\alpha-1} k(x). \tag{10}$$

The critical values of Eq. (10) are the solutions of $k(x) = 0$. □

Next, we discuss the IFR and/or DFR property of the hazard function for the GK distribution. For this, we use the result of Lemma 1. According to Lemma 1, if X ~ GK(a, b, α, β), then $Y = \frac{1-(1-X^{a})^{b}}{(1-X^{a})^{b}}$ ~ Gamma(α, β). For the random variable Y, the reciprocal of the hazard rate function can be written as

$$\frac{1}{r(t)} = \frac{1-F(t)}{f(t)} = \frac{\int_{t}^{\infty} \frac{1}{\beta^{\alpha}\Gamma(\alpha)} w^{\alpha-1} \exp(-w/\beta)\, dw}{\frac{1}{\beta^{\alpha}\Gamma(\alpha)} t^{\alpha-1} \exp(-t/\beta)} = \int_{t}^{\infty} \left(\frac{w}{t}\right)^{\alpha-1} \exp\left(-\frac{1}{\beta}(w-t)\right) dw = \int_{0}^{\infty} \left(1+\frac{u}{t}\right)^{\alpha-1} \exp\left(-\frac{u}{\beta}\right) du. \tag{11}$$

Therefore, $r(t) = \left(\int_{0}^{\infty} \left(1+\frac{u}{t}\right)^{\alpha-1} \exp\left(-\frac{u}{\beta}\right) du\right)^{-1}$. If $\alpha > 1$, $\left(1+\frac{u}{t}\right)^{\alpha-1}$ is decreasing in t, so r(t) is increasing and Y has an increasing failure rate (IFR). If $0 < \alpha < 1$, then $\left(1+\frac{u}{t}\right)^{\alpha-1}$ is increasing in t, so r(t) is decreasing and Y has a decreasing failure rate (DFR). Now, since X is a one-to-one function of Y, the hazard rate function of X will also follow the exact pattern.

Let X and Y be two random variables. X is said to be stochastically greater than or equal to Y, denoted by $X \ge_{st} Y$, if $P(X > x) \ge P(Y > x)$ for all x in the support set of X.

Theorem 3. Suppose X ~ GK($a_{1}, b_{1}, \alpha, \beta_{1}$) and Y ~ GK($a_{2}, b_{2}, \alpha, \beta_{2}$). If $\beta_{1} > \beta_{2}$, $a_{1} > a_{2}$, and $b_{1} < b_{2}$, then $X \ge_{st} Y$ for integer values of $a_{1}$ and $a_{2}$.

Proof. At first, we note that the incomplete gamma function $\Gamma(\alpha, x)$ is an increasing function of x for fixed α. For any real number $x \in (0, 1)$, $\beta_{1} > \beta_{2}$, $a_{1} > a_{2}$, and $b_{1} < b_{2}$, we have

$$\beta_{1}^{-1}\left((1-x^{a_{1}})^{-b_{1}}-1\right) \le \beta_{2}^{-1}\left((1-x^{a_{2}})^{-b_{2}}-1\right). \tag{12}$$

This implies that $\Gamma\left(\alpha, \beta_{1}^{-1}\left((1-x^{a_{1}})^{-b_{1}}-1\right)\right) \le \Gamma\left(\alpha, \beta_{2}^{-1}\left((1-x^{a_{2}})^{-b_{2}}-1\right)\right)$. Equivalently, it implies that $P(X > x) \ge P(Y > x)$, and this completes the proof. □

Note: For fractional choices of $a_{1}$ and $a_{2}$, the reverse of the above inequality will hold.

4. Moments and mean deviations

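For any $r \ge 1$,

$$\begin{aligned} E(X^{r}) &= \int_{0}^{1} x^{r} f(x)\, dx \\ &= \frac{1}{\Gamma(\alpha)} \int_{0}^{\infty} \exp(-u)\, u^{\alpha-1} \left(1-(1+u\beta)^{-1/b}\right)^{r/a} du \quad \left(\text{on substitution } u = \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} (-1)^{j} \binom{r/a}{j} \int_{0}^{\infty} \exp(-u)\, u^{\alpha-1} (1+u\beta)^{-j/b}\, du \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j+k} \binom{r/a}{j} \binom{j/b+k-1}{k} \int_{0}^{\infty} \exp(-u)\, \beta^{k} u^{\alpha+k-1}\, du \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j+k} \beta^{k} \binom{r/a}{j} \binom{j/b+k-1}{k} \Gamma(\alpha+k). \end{aligned} \tag{13}$$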
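The second line of Eq. (13) expresses $E(X^{r})$ as an expectation over a Gamma(α, 1) variable, which makes a quick Monte Carlo check possible without evaluating the double series. The block below is a minimal sketch under that reading; the function name `gk_raw_moment_mc` and its default draw count and seed are illustrative assumptions.

```python
import random


def gk_raw_moment_mc(r: float, alpha: float, beta: float, a: float, b: float,
                     n_draws: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E(X^r) from the integral form in Eq. (13).

    With U ~ Gamma(alpha, 1), the integrand weight exp(-u) u^(alpha-1) / Gamma(alpha)
    is the density of U, so E(X^r) = E[(1 - (1 + beta U)^(-1/b))^(r/a)].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = rng.gammavariate(alpha, 1.0)
        total += (1.0 - (1.0 + beta * u) ** (-1.0 / b)) ** (r / a)
    return total / n_draws
```

For α = β = a = b = 1 and r = 1, the expectation is $E[U/(1+U)]$ with U standard exponential, approximately 0.4037, which the estimate should reproduce to within Monte Carlo error.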

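Upper bounds for the r-th order moment: Since $\binom{n}{k} \le \frac{n^{k}}{k!}$ for $1 \le k \le n$, from Eq. (13), one can write $E(X^{r}) \le \left(\left(\frac{r}{a}\right)\left(\frac{j}{b}+k-1\right)\right) + \frac{\beta^{k}}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j+k} \frac{(r/a)^{j}}{j!} \left(\frac{(j/b+k-1)^{k}}{k!}\right) \Gamma(\alpha+k)$, provided r/a and j/b + k − 1 are both integers. Applying successively the generalized series expansion of $\left(1-(1+\beta u)^{-1/b}\right)^{j/a}$, the characteristic function for X ~ GK(a, b, α, β) is given by [from Eq. (3)]

$$\begin{aligned} \varphi_{X}(t) &= \int_{0}^{1} e^{itx} f(x)\, dx \\ &= \frac{1}{\Gamma(\alpha)} \int_{0}^{\infty} u^{\alpha-1} e^{-u} \exp\left(it\left(1-(1+\beta u)^{-1/b}\right)^{1/a}\right) du \quad \left(\text{on substitution } u = \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \int_{0}^{\infty} \frac{\left(it\left(1-(1+\beta u)^{-1/b}\right)^{1/a}\right)^{j}}{j!}\, u^{\alpha-1} e^{-u}\, du \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \sum_{k_{1}=0}^{\infty} \sum_{k_{2}=0}^{\infty} (-1)^{k_{1}+k_{2}} \beta^{k_{2}} \frac{(it)^{j}}{j!} \binom{j/a}{k_{1}} \binom{k_{1}/b}{k_{2}} \Gamma(\alpha+k_{2}). \end{aligned} \tag{14}$$

If j/a and $k_{1}/b$ are integers, then in Eq. (14) the second and third summations stop at j/a and $k_{1}/b$, respectively.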

If we denote the median by T, then the mean deviation from the mean, $D(\mu)$, and the mean deviation from the median, $D(T)$, can be written as

$$D(\mu) = E|X-\mu| = 2\mu F(\mu) - 2\int_{0}^{\mu} x f(x)\, dx, \tag{15}$$

$$D(T) = E|X-T| = \mu - 2\int_{0}^{T} x f(x)\, dx. \tag{16}$$

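Now, consider

$$I_{t} = \int_{0}^{t} x f(x)\, dx = \int_{0}^{t} x\, \frac{ab\, \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)}{\Gamma(\alpha)\beta^{\alpha}} \times \frac{1}{(1-x^{a})^{2}} \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\alpha-1} x^{a-1}\, dx. \tag{17}$$

Using the substitution $u = \frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}$ in Eq. (17), we obtain

$$\begin{aligned} I_{t} &= \frac{1}{\Gamma(\alpha)} \int_{0}^{\frac{1-(1-t^{a})^{b}}{\beta(1-t^{a})^{b}}} \left(1-(1+u\beta)^{-1/b}\right)^{1/a} u^{\alpha-1} e^{-u}\, du \\ &= \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j+k} \beta^{k} \binom{1/a}{j} \binom{j/b+k-1}{k} \Gamma\left(\alpha+k, \frac{1-(1-t^{a})^{b}}{\beta(1-t^{a})^{b}}\right), \end{aligned} \tag{18}$$

where the binomial series expansion is applied successively, as in Eq. (13).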

By using Eqs. (2) and (18), the mean deviation from the mean and the mean deviation from the median are, respectively, given by

$$D(\mu) = 2\mu\, \frac{\Gamma\left(\alpha, \frac{1-(1-\mu^{a})^{b}}{\beta(1-\mu^{a})^{b}}\right)}{\Gamma(\alpha)} - 2I_{\mu}, \qquad D(T) = \mu - 2I_{T}. \tag{19}$$

4.1. Entropy

One useful measure of diversity for a probability model is given by Renyi’s entropy. It is defined as $I_{R}(\rho) = (1-\rho)^{-1} \log\left(\int f^{\rho}(x)\, dx\right)$, where $\rho > 0$ and $\rho \ne 1$. If a random variable X has a GK distribution, then we have

$$f^{\rho}(x) = \left(\frac{ab}{\Gamma(\alpha)\beta^{\alpha}}\right)^{\rho} \exp\left(-\frac{\rho\left(1-(1-x^{a})^{b}\right)}{\beta(1-x^{a})^{b}}\right) \times \frac{1}{(1-x^{a})^{2\rho}} \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\rho(\alpha-1)} x^{\rho(a-1)}. \tag{20}$$

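Next, consider the integral

$$\int_{0}^{1} f^{\rho}(x)\, dx = \left(\frac{ab}{\Gamma(\alpha)\beta^{\alpha}}\right)^{\rho} \rho^{-1} \int_{0}^{\infty} u^{\alpha-1} \exp(-u) \left(1-(1+\beta u^{1/\rho})^{-1/b}\right)^{1/a} du \quad \left(\text{on substitution } u = \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\rho}\right). \tag{21}$$

Now, using successive application of the generalized binomial expansion, we can write

$$\left(1-(1+\beta u^{1/\rho})^{-1/b}\right)^{1/a} = (-1)^{\rho-1} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j} \binom{\rho-1}{j} \binom{1/b+k-1}{k} \beta^{k} u^{k/\rho}. \tag{22}$$

Hence, the integral in Eq. (21) reduces to

$$\int_{0}^{1} f^{\rho}(x)\, dx = \left(\frac{ab}{\Gamma(\alpha)\beta^{\alpha}}\right)^{\rho} \rho^{-(\rho\alpha+k)} (-1)^{\rho-1} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{j} \binom{\rho-1}{j} \binom{1/b+k-1}{k} \beta^{k}\, \Gamma(\rho\alpha+k) = \delta(\rho, \alpha, \beta, a, b), \text{ say.} \tag{23}$$

Therefore, the expression for Renyi’s entropy is

$$I_{R}(\rho) = (1-\rho)^{-1} \log\left(\delta(\rho, \alpha, \beta, a, b)\right). \tag{24}$$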

5. Maximum likelihood estimation

In this section, we address the parameter estimation of the GK(α, β, a, b) distribution under the classical setup. Let $X_{1}, X_{2}, \ldots, X_{n}$ be a random sample of size n drawn from the density Eq. (3). The log-likelihood function is given by

$$\ell = -n\alpha \log\beta - n\log\Gamma(\alpha) + n\log a + n\log b + (a-1)\sum_{i=1}^{n} \log X_{i} - \sum_{i=1}^{n} \frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}} - 2\sum_{i=1}^{n} \log(1-X_{i}^{a}) + (\alpha-1)\sum_{i=1}^{n} \log\left(\frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}}\right). \tag{25}$$

The derivatives of Eq. (25) with respect to α, β, a, and b are given by

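$$\frac{\partial \ell}{\partial \alpha} = -n\log\beta - n\Psi(\alpha) + \sum_{i=1}^{n} \log\left(\frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}}\right), \tag{26}$$

where $\Psi(\alpha) = \frac{\partial}{\partial \alpha} \log\Gamma(\alpha)$,

$$\frac{\partial \ell}{\partial \beta} = -\frac{n\alpha}{\beta} + \beta^{-2} \sum_{i=1}^{n} \left(\frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}} - (\alpha-1)\log\left(\frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}}\right)\right), \tag{27}$$

$$\frac{\partial \ell}{\partial a} = \frac{n}{a} + \sum_{i=1}^{n} \log X_{i} + 2\sum_{i=1}^{n} X_{i}^{a}(1-X_{i}^{a})^{-1} \log X_{i} + \frac{b(\alpha-1)}{\beta} \sum_{i=1}^{n} \left(\frac{1-(1-X_{i}^{a})^{b}}{\beta(1-X_{i}^{a})^{b}}\right)^{-1} \frac{X_{i}^{a}\log X_{i}}{(1-X_{i}^{a})^{b+1}} - \frac{1}{\beta} \sum_{i=1}^{n} \frac{X_{i}^{a}\log X_{i}}{(1-X_{i}^{a})^{b+1}}, \tag{28}$$

$$\frac{\partial \ell}{\partial b} = \frac{n}{b} + \frac{1}{\beta}\left(1 + \sum_{i=1}^{n} \log(1-X_{i}^{a}) \left(1 - \left(\frac{\alpha-1}{\beta}\right) \frac{1-(1-X_{i}^{a})^{b}}{(1-X_{i}^{a})^{b}}\right)\right). \tag{29}$$

The MLEs $\hat{\alpha}$, $\hat{\beta}$, $\hat{a}$, and $\hat{b}$ are obtained by setting Eqs. (26)–(29) to zero and solving them simultaneously.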

To estimate the model parameters, numerical iterative techniques must be used to solve these equations. The global maximum of the log-likelihood may be investigated by trying different starting values for the parameters. The information matrix is required for interval estimation. The elements of the 4 × 4 total observed information matrix (used because expected values are difficult to calculate), $J(\theta) = \{J_{r,s}(\theta)\}$ for $r, s = \alpha, \beta, a, b$, can be obtained from the authors upon request, where $\theta = (\alpha, \beta, a, b)$. Under the regularity conditions, the asymptotic distribution of $(\hat{\theta}-\theta)$ is $N_{4}(0, K(\theta)^{-1})$, where $K(\theta) = E(J(\theta))$ is the expected information matrix, and $J(\hat{\theta})^{-1}$ is the inverse of the observed information matrix evaluated at $\hat{\theta}$. The multivariate normal $N_{4}(0, K(\theta)^{-1})$ distribution can be used to construct approximate confidence intervals for the individual parameters.
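For numerical work, a log-likelihood routine can be handed to any general-purpose optimizer. The block below is a minimal sketch that builds the log-density by change of variables from the gamma law of $W = (1-X^{a})^{-b}-1$ stated in Section 2, with the Jacobian $dW/dx = ab\, x^{a-1}(1-x^{a})^{-(b+1)}$ computed inside the sketch (the same factor that appears in Eq. (42)). This is the sketch's own derivation rather than a transcription of Eq. (25), and the function name `gk_loglik` is an illustrative assumption.

```python
import math


def gk_loglik(data, alpha: float, beta: float, a: float, b: float) -> float:
    """Log-likelihood of a sample on (0, 1) under the GK model.

    Assumption of this sketch: the density of X is obtained from
    W = (1 - X^a)^(-b) - 1 ~ Gamma(shape=alpha, scale=beta) by change of
    variables, i.e. log f(x) = log f_W(w(x)) + log |dw/dx|.
    Returns -inf outside the parameter space or the support.
    """
    if min(alpha, beta, a, b) <= 0.0:
        return -math.inf
    total = 0.0
    for x in data:
        if not 0.0 < x < 1.0:
            return -math.inf
        xa = x ** a
        if xa >= 1.0:                       # rounding guard
            return -math.inf
        log1m = math.log1p(-xa)             # log(1 - x^a)
        try:
            w = math.expm1(-b * log1m)      # (1 - x^a)^(-b) - 1
        except OverflowError:
            return -math.inf
        if w <= 0.0:
            return -math.inf
        # Gamma(alpha, beta) log-density evaluated at w.
        total += ((alpha - 1.0) * math.log(w) - w / beta
                  - alpha * math.log(beta) - math.lgamma(alpha))
        # log |dw/dx| = log(a b) + (a - 1) log x - (b + 1) log(1 - x^a).
        total += math.log(a * b) + (a - 1.0) * math.log(x) - (b + 1.0) * log1m
    return total
```

Returning `-inf` for invalid arguments lets an unconstrained optimizer reject infeasible trial points; maximizing this function from several starting values follows the strategy described above.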

5.1. Simulation study

In order to assess the performance of the MLEs, a small simulation study is performed using the statistical software R through the package stats4, command mle. The number of Monte Carlo replications was 20,000. For maximizing the log-likelihood function, we use the MaxBFGS subroutine with analytical derivatives. The estimates were evaluated through the following quantities for each sample size: the empirical biases and mean squared errors (MSEs), calculated from the Monte Carlo replications. The MLEs are determined for each simulated data set, say, $(\hat{\alpha}_{i}, \hat{\beta}_{i}, \hat{a}_{i}, \hat{b}_{i})$ for $i = 1, 2, \ldots, 20{,}000$, and the biases and MSEs are computed by

$$\mathrm{bias}_{h}(n) = \frac{1}{20000} \sum_{i=1}^{20000} (\hat{h}_{i} - h), \tag{30}$$

and

$$\mathrm{MSE}_{h}(n) = \frac{1}{20000} \sum_{i=1}^{20000} (\hat{h}_{i} - h)^{2}, \tag{31}$$

for $h = \alpha, \beta, a, b$. We consider the sample sizes n = 100, 200, and 500 and different values for the parameters. The empirical results are given in Table 1. The figures in Table 1 indicate that the estimates are quite stable and, more importantly, are close to the true values for these sample sizes. Furthermore, as the sample size increases, the MSEs decrease as expected.
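Eqs. (30) and (31) are plain averages over the Monte Carlo replicates. The block below is a minimal sketch of that computation for a single parameter; the function name `bias_and_mse` is an illustrative assumption, and the list of estimates would come from whatever fitting routine is used in each replication.

```python
from statistics import fmean


def bias_and_mse(estimates, true_value: float):
    """Empirical bias and MSE of Eqs. (30)-(31) for one parameter.

    `estimates` holds one estimate per Monte Carlo replication.
    """
    errors = [est - true_value for est in estimates]
    bias = fmean(errors)
    mse = fmean([err * err for err in errors])
    return bias, mse
```

For example, `bias_and_mse([1.0, 2.0, 3.0], 2.0)` gives a bias of 0 and an MSE of 2/3.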

| n | α | β | a | b | Bias α̂ | Bias β̂ | Bias â | Bias b̂ | MSE α̂ | MSE β̂ | MSE â | MSE b̂ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.5 | 0.5 | 2 | 4 | −0.417 | −0.419 | 0.355 | −0.393 | 0.051 | 0.046 | 0.053 | 0.97 |
| | 0.5 | 0.5 | 3 | 5 | −0.773 | 0.324 | −0.214 | −0.342 | 0.018 | 0.042 | 0.098 | 0.626 |
| | 0.7 | 0.8 | 4 | 3 | 0.489 | −0.246 | −0.623 | 0.482 | 0.015 | 0.121 | 0.106 | 0.167 |
| | 0.9 | 0.7 | 6 | 4 | 0.188 | 0.979 | −0.509 | 0.056 | 0.048 | 0.022 | 0.044 | 0.114 |
| | 1 | 1.5 | 0.9 | 0.6 | 0.178 | −0.498 | −0.429 | −0.545 | 0.427 | 0.028 | 0.092 | 0.495 |
| | 1.5 | 2 | 0.6 | 0.8 | −0.084 | −0.363 | −0.405 | −0.220 | 0.953 | 0.018 | 0.073 | 0.572 |
| 200 | 0.5 | 0.5 | 2 | 4 | 0.072 | 0.361 | 0.049 | 0.073 | 0.022 | 0.023 | 0.024 | 0.313 |
| | 0.5 | 0.5 | 3 | 5 | 0.518 | 0.184 | 0.084 | 0.115 | 0.008 | 0.022 | 0.045 | 0.578 |
| | 0.7 | 0.8 | 4 | 3 | 0.316 | 0.159 | 0.050 | −0.329 | 0.006 | 0.059 | 0.044 | 0.158 |
| | 0.9 | 0.7 | 6 | 4 | 0.137 | −0.049 | 0.131 | −0.032 | 0.018 | 0.010 | 0.020 | 0.095 |
| | 1 | 1.5 | 0.9 | 0.6 | 0.125 | 0.475 | 0.086 | 0.242 | 0.064 | 0.013 | 0.028 | 0.147 |
| | 1.5 | 2 | 0.6 | 0.8 | 0.034 | 0.173 | 0.224 | −0.150 | 0.401 | 0.008 | 0.036 | 0.432 |
| 500 | 0.5 | 0.5 | 2 | 4 | −0.046 | −0.028 | −0.036 | −0.047 | 0.009 | 0.011 | 0.011 | 0.084 |
| | 0.5 | 0.5 | 3 | 5 | −0.051 | −0.111 | −0.035 | 0.002 | 0.004 | 0.010 | 0.022 | 0.018 |
| | 0.7 | 0.8 | 4 | 3 | −0.0730 | −0.052 | 0.046 | −0.022 | 0.004 | 0.031 | 0.024 | 0.036 |
| | 0.9 | 0.7 | 6 | 4 | −0.102 | −0.02 | −0.098 | −0.023 | 0.008 | 0.005 | 0.010 | 0.021 |
| | 1 | 1.5 | 0.9 | 0.6 | −0.078 | −0.052 | −0.017 | 0.003 | 0.027 | 0.007 | 0.012 | 0.015 |
| | 1.5 | 2 | 0.6 | 0.8 | 0.007 | −0.069 | 0.066 | −0.085 | 0.136 | 0.004 | 0.015 | 0.013 |

Table 1.

Bias and MSE of the estimates under the maximum likelihood method.

6. Reliability parameter

The reliability parameter R is defined as $R = P(X > Y)$, where X and Y are independent random variables. For a detailed study of the possible applications of the reliability parameter, the interested reader is referred to Refs. [4, 5]. If X and Y are two continuous and independent random variables with cdfs $F_{1}(x)$ and $F_{2}(y)$ and pdfs $f_{1}(x)$ and $f_{2}(y)$, respectively, then the reliability parameter R can be written as

$$R = P(X > Y) = \int_{-\infty}^{\infty} F_{2}(t) f_{1}(t)\, dt. \tag{32}$$

Theorem 4. Let X ~ GK($a, b, \alpha_{1}, \beta_{1}$) and Y ~ GK($a, b, \alpha_{2}, \beta_{2}$); then

$$R = \sum_{p=0}^{\infty} \frac{(-1)^{p}}{p!\,(\alpha_{2}+p)\,\Gamma(\alpha_{1})} \left(\frac{\beta_{1}}{\beta_{2}}\right)^{p+\alpha_{2}} \Gamma(\alpha_{1}+\alpha_{2}+p). \tag{33}$$

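Proof: From Eqs. (2) and (3), we have

$$R = \int_{0}^{1} \gamma_{1}\left(\alpha_{2}, \frac{1-(1-t^{a})^{b}}{\beta_{2}(1-t^{a})^{b}}\right) \frac{ab\, \exp\left(-\frac{1-(1-t^{a})^{b}}{\beta_{1}(1-t^{a})^{b}}\right)}{\Gamma(\alpha_{1})\beta_{1}^{\alpha_{1}}} \times \frac{1}{(1-t^{a})^{2}} \left(\frac{1-(1-t^{a})^{b}}{\beta_{1}(1-t^{a})^{b}}\right)^{\alpha_{1}-1} t^{a-1}\, dt. \tag{34}$$

Using the series expansion for the incomplete gamma function, $\gamma_{1}(k, x) = x^{k} \sum_{p=0}^{\infty} \frac{(-x)^{p}}{p!\,(k+p)}$, and the substitution $u = \frac{1-(1-t^{a})^{b}}{\beta_{1}(1-t^{a})^{b}}$, Eq. (34) reduces to

$$R = \sum_{p=0}^{\infty} \frac{(-1)^{p}}{p!\,(\alpha_{2}+p)\,\Gamma(\alpha_{1})} \int_{0}^{\infty} \left(\frac{u\beta_{1}}{\beta_{2}}\right)^{p+\alpha_{2}} u^{\alpha_{1}-1} \exp(-u)\, du = \sum_{p=0}^{\infty} \frac{(-1)^{p}}{p!\,(\alpha_{2}+p)\,\Gamma(\alpha_{1})} \left(\frac{\beta_{1}}{\beta_{2}}\right)^{p+\alpha_{2}} \Gamma(\alpha_{1}+\alpha_{2}+p). \tag{35}$$

Hence the proof. □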
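The reliability parameter can also be approximated by simulation. Because X and Y share the same a and b, both are the same increasing function of their underlying gamma variables, so $X > Y$ exactly when $W_{1} > W_{2}$ with $W_{i}$ ~ Gamma($\alpha_{i}, \beta_{i}$). The block below is a minimal Monte Carlo sketch built on that observation; the function name `reliability_mc` and its defaults are illustrative assumptions, and it is offered as a cross-check rather than as an evaluation of the series in Eq. (33).

```python
import random


def reliability_mc(alpha1: float, beta1: float, alpha2: float, beta2: float,
                   n_draws: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of R = P(X > Y) for GK variables sharing a and b.

    X > Y holds exactly when W1 > W2, where Wi ~ Gamma(shape=alpha_i,
    scale=beta_i), so the Kumaraswamy parameters a and b are not needed.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        w1 = rng.gammavariate(alpha1, beta1)
        w2 = rng.gammavariate(alpha2, beta2)
        hits += w1 > w2
    return hits / n_draws
```

When $\alpha_{1} = \alpha_{2} = 1$, both gamma variables are exponential and $R = \beta_{1}/(\beta_{1}+\beta_{2})$; for $\beta_{1} < \beta_{2}$ the series in Eq. (33) sums to the same value, which gives a convenient reference point.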

7. Order statistics

Here, we derive the general r-th order statistic and the large sample distribution of the sample minimum and the sample maximum based on a random sample of size n from the GK(α, β, a, b) distribution. The corresponding density function of the r-th order statistic, X r : n , from Eq. (3) will be

$$f_{X_{r:n}}(x) = \frac{1}{B(r, n-r+1)} (F(x))^{r-1} (1-F(x))^{n-r} f(x) = \frac{f(x)}{B(r, n-r+1)} \sum_{j=0}^{r-1} (-1)^{j} \binom{r-1}{j} \left(\frac{\Gamma\left(\alpha, \beta^{-1}\frac{1-(1-x^{a})^{b}}{(1-x^{a})^{b}}\right)}{\Gamma(\alpha)}\right)^{n-r+j} \times I(0 < x < 1). \tag{36}$$

Using the series expression for the incomplete gamma function, $\gamma_{1}(\alpha, x) = \sum_{k=0}^{\infty} \frac{e^{-x} x^{\alpha+k}}{\alpha(\alpha+1)\cdots(\alpha+k)}$, the pdf of $X_{r:n}$ can be written as

$$\begin{aligned} f_{r:n}(x) &= \frac{1}{B(r, n-r+1)} f(x) \sum_{j=0}^{r-1} (-1)^{j} \binom{r-1}{j} \left(\sum_{k=0}^{\infty} \frac{\exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) \left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)^{\alpha+k}}{\Gamma(\alpha)\, \alpha(\alpha+1)\cdots(\alpha+k)}\right)^{n-r+j} \\ &= \frac{f(x)}{B(r, n-r+1)} \sum_{j=0}^{r-1} \sum_{k_{1}, \ldots, k_{n-r+j}=0}^{\infty} (-1)^{j+s_{k}} \binom{r-1}{j} \exp\left(-(n-r+j)\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) \times \frac{\left(\frac{1-(1-x^{a})^{b}}{(1-x^{a})^{b}}\right)^{s_{k}+(n-r+j)\alpha}}{(\Gamma(\alpha))^{n-r+j}\, \beta^{s_{k}+(n-r+j)\alpha}\, p_{k}} \\ &= \frac{1}{B(r, n-r+1)} \sum_{j=0}^{n-r} \sum_{k_{1}, \ldots, k_{n-r+j}=0}^{\infty} (-1)^{j+s_{k}} \binom{r-1}{j} \times \frac{\Gamma\left(s_{k}+(r+j)\alpha\right)}{(\Gamma(\alpha))^{n-r+j}\, p_{k}}\, f\left(x \mid s_{k}+(n-r+j)\alpha, \beta, a, b\right), \end{aligned} \tag{37}$$

where $s_{k} = \sum_{i=1}^{n-r+j} k_{i}$ and $p_{k} = \prod_{i=1}^{n-r+j} (k_{i}+\alpha)$.

From Eq. (37), it is interesting to note that the pdf of the r-th order statistic $X_{r:n}$ can be expressed as an infinite sum of GK pdfs.

8. Application

Here, we consider two well-known illustrative data sets to show the efficacy of the GK distribution. For details on these two data sets, see Refs. [6, 7]. The second data set, given in Table 2, is from Ref. [8]; it represents the fatigue life of 6061-T6 aluminum coupons cut parallel with the direction of rolling and oscillated at 18 cycles per second. The GK distribution is fitted to the first data set, and the result is compared with the Kumaraswamy, gamma-uniform [9], and beta-Pareto [10] distributions. These results are reported in Table 3. The results show that the gamma-uniform and GK distributions provide an adequate fit to the data. Figure 3 displays the empirical and the fitted cumulative distribution functions and supports the results in Table 3. A close look at Figure 3 indicates that the GK distribution provides a better fit to the left tail than the gamma-uniform distribution. This is because the GK distribution can have a longer left tail (Figure 3).

70 90 96 97 99 100 103 104 104 105
107 108 108 108 109 109 112 112 113 114
114 114 116 119 120 120 120 121 121 123
124 124 124 124 124 128 128 129 129 130
130 130 131 131 131 131 131 132 132 132
133 134 134 134 134 134 136 136 137 138
142 142 144 144 145 146 148 148 149 151
151 152 155 156 157 157 157 157 158 159
162 163 163 164 166 166 168 170 174 196
212

Table 2.

Fatigue life of 6061-T6 aluminum data.

| Distribution | Kumaraswamy | Gamma-uniform | Beta-Pareto | Gamma-Kumaraswamy |
|---|---|---|---|---|
| Parameter estimates | â = 0.653 | α̂ = 7.528 | ĉ = 5.048 | α̂ = 7.891 |
| | b̂ = 1.1182 | β̂ = 2.731 | β̂ = 0.401 | β̂ = 0.785 |
| | | â = 6.49 | θ̂ = 6.417 | â = 5.352 |
| | | b̂ = 0.932 | | b̂ = 1.735 |
| Log likelihood | −162.34 | −116.58 | −113.36 | −113.25 |
| AIC | 217.38 | 119.45 | 167.78 | 107.85 |
| AICC | 218.36 | 120.37 | 168.91 | 108.43 |
| CAIC | 218.36 | 120.37 | 168.91 | 108.43 |
| A₀* | 12.164 | 0.5951 | 1.3125 | 0.4282 |
| W₀* | 2.8937 | 0.0931 | 0.1317 | 0.04893 |
| K-S | 0.5290 | 0.09521 | 0.4245 | 0.0492 |
| K-S p-value | 0.0000 | 0.9140 | 0.8374 | 0.9978 |

Table 3.

Goodness of fit of deep-groove ball bearings data.

In addition, to check the goodness-of-fit of all statistical models, several other goodness-of-fit statistics are used; they are computed with the computational package Mathematica. The MLEs are computed using the NMaximize technique, and the goodness-of-fit measures include the log-likelihood function evaluated at the MLEs ($\ell$), the Akaike information criterion (AIC), the corrected Akaike information criterion (AICC), the consistent Akaike information criterion (CAIC), the Anderson-Darling ($A_{0}^{*}$), the Cramer-von Mises ($W_{0}^{*}$), and the Kolmogorov-Smirnov (K-S) statistics with their p values to compare the fitted models. These statistics are used to evaluate how closely a specific distribution with cdf (2) fits the corresponding empirical distribution for a given data set. The distribution with a better fit than the others is the one with the smallest statistics and the largest p value. Equivalently, the distribution that attains the smallest value of each of these criteria (i.e., AIC, AICC, K-S, etc.) is the most suitable one. The mathematical expressions of these statistics are given by

• $AIC = -2\ell(\hat{\theta}) + 2q$

• $AICC = AIC + \frac{2q(q+1)}{n-q-1}$

• $CAIC = -2\ell(\hat{\theta}) + \frac{2qn}{n-q-1}$

• $A_{0}^{*} = \left(\frac{2.25}{n^{2}} + \frac{0.75}{n} + 1\right)\left(-n - \frac{1}{n}\sum_{i=1}^{n} (2i-1)\log\left(z_{i}(1-z_{n-i+1})\right)\right)$

• $W_{0}^{*} = \left(\frac{0.5}{n} + 1\right)\left[\sum_{i=1}^{n}\left(z_{i} - \frac{2i-1}{2n}\right)^{2} + \frac{1}{12n}\right]$

• $KS = \max_{i}\left(\frac{i}{n} - z_{i},\; z_{i} - \frac{i-1}{n}\right)$,

where $\ell(\hat{\theta})$ denotes the log-likelihood function evaluated at the maximum likelihood estimates, q is the number of parameters, n is the sample size, and $z_{i} = \mathrm{cdf}(y_{i})$, the $y_{i}$’s being the ordered observations.
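The information criteria and the K-S statistic listed above are simple to compute once the maximized log-likelihood and the fitted cdf values are available. The block below is a minimal sketch of those formulas; the function names `information_criteria` and `ks_statistic` are illustrative assumptions, and the Anderson-Darling and Cramer-von Mises statistics are left out for brevity.

```python
def information_criteria(loglik: float, q: int, n: int):
    """AIC, AICC and CAIC as defined in the list above.

    loglik is the maximized log-likelihood, q the number of parameters,
    n the sample size (n > q + 1 is required).
    """
    aic = -2.0 * loglik + 2.0 * q
    aicc = aic + 2.0 * q * (q + 1) / (n - q - 1)
    caic = -2.0 * loglik + 2.0 * q * n / (n - q - 1)
    return aic, aicc, caic


def ks_statistic(z) -> float:
    """K-S statistic max_i(i/n - z_i, z_i - (i-1)/n) from fitted cdf values.

    z holds the fitted cdf evaluated at the observations; it is sorted here
    so that z_i corresponds to the i-th ordered observation.
    """
    z = sorted(z)
    n = len(z)
    # enumerate() is zero-based, so i + 1 plays the role of the index i.
    return max(max((i + 1) / n - zi, zi - i / n) for i, zi in enumerate(z))
```

With these definitions, AICC and CAIC coincide algebraically, since $2q + \frac{2q(q+1)}{n-q-1} = \frac{2qn}{n-q-1}$; this matches the identical AICC and CAIC rows in Tables 3 and 4.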

Mahmoudi [8] proposed a five-parameter beta generalized Pareto distribution, fitted it to the data in Table 2, and compared the result with the beta-Pareto and other known distributions. The results of fitting the beta generalized Pareto and beta-Pareto distributions from Ref. [8] are reported in Table 4 along with the results of fitting the Pareto (IV) and GK distributions to the data. The K-S value from Table 4 indicates that the GK distribution provides the best fit. The fact that the GK distribution has fewer parameters than the beta generalized Pareto and beta-Pareto distributions gives it an additional advantage over them. Figure 4 displays the empirical and the fitted cumulative distribution functions. This figure supports the results in Table 4.

| Distribution | Kumaraswamy | Beta-Pareto | Beta generalized Pareto | Gamma-Kumaraswamy |
|---|---|---|---|---|
| Parameter estimates | δ̂ = 0.235 | α̂ = 485.470 | α̂ = 12.112 | α̂ = 4.87 |
| | γ̂ = 2.4926 | β̂ = 162.060 | β̂ = 1.702 | β̂ = 3.352 |
| | | k̂ = 0.3943 | μ̂ = 40.564 | â = 1.1722 |
| | | θ̂ = 3.910 | k̂ = 0.273 | b̂ = 2.0154 |
| | | | θ̂ = 54.837 | |
| Log likelihood | −572.39 | −517.33 | −457.85 | −417.36 |
| AIC | 1018.83 | 925.30 | 925.70 | 878.45 |
| AICC | 1018.956 | 926.01 | 926.84 | 879.63 |
| CAIC | 1018.956 | 926.01 | 926.84 | 879.63 |
| A₀* | 4.1745 | 1.0083 | 0.6584 | 0.4921 |
| W₀* | 0.6739 | 0.2827 | 0.1195 | 0.0822 |
| K-S | 0.4723 | 0.091 | 0.070 | 0.064 |
| K-S p-value | 0.000 | 0.248 | 0.537 | 0.736 |

Table 4.

Parameter estimates for the fatigue life of 6061-T6 aluminum coupons data.

9. Characterization of GK distribution

In this section, we present characterizations of the GK distribution in terms of the ratio of two truncated moments. For previous work in this direction, we refer the interested reader to Glänzel [11–14] and Hamedani [15–17]. For our characterization results, we employ a theorem due to Glänzel [11]; see that reference for further details. The advantage of the characterizations given here is that the cdf F need not have a closed form. We present here a corollary as a direct application of the theorem discussed in detail in Ref. [11].

Corollary 1. Let $X : \Omega \to (0, 1)$ be a continuous random variable and let $h(x) = \beta^{\alpha-1} (1-x^{a})^{b(\alpha-2)+1} \left[1-(1-x^{a})^{b}\right]^{1-\alpha}$ and $g(x) = h(x) \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)$ for $x \in (0, 1)$. Then X has pdf (3) if and only if the function η defined in Theorem 5 has the form

$$\eta(x) = \frac{1}{2} \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right), \quad 0 < x < 1. \tag{38}$$

Proof. Let X have pdf (3); then

$$(1-F(x))\, E[h(X) \mid X \ge x] = \frac{1}{\beta^{\alpha-1}\Gamma(\alpha)} \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right), \quad 0 < x < 1, \tag{39}$$

and

$$(1-F(x))\, E[g(X) \mid X \ge x] = \frac{1}{2\beta^{\alpha-1}\Gamma(\alpha)} \exp\left\{-2\left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right)\right\}, \quad 0 < x < 1, \tag{40}$$

and finally

$$\eta(x) h(x) - g(x) = -\frac{h(x)}{2} \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) < 0, \quad \text{for } 0 < x < 1. \tag{41}$$

Conversely, if η is given as above, then

$$s'(x) = \frac{\eta'(x) h(x)}{\eta(x) h(x) - g(x)} = \frac{ab}{\beta}\, x^{a-1} (1-x^{a})^{-(b+1)}, \quad 0 < x < 1, \tag{42}$$

and hence

$$s(x) = \frac{1}{\beta(1-x^{a})^{b}}, \quad 0 < x < 1. \tag{43}$$

Now, in view of Theorem 5, X has pdf (3). □

Corollary 2. Let $X : \Omega \to (0, 1)$ be a continuous random variable and let h(x) be as in Corollary 1. Then, X has pdf (3) if and only if there exist functions g and η defined in Theorem 5 satisfying the differential equation

$$\frac{\eta'(x) h(x)}{\eta(x) h(x) - g(x)} = \frac{ab}{\beta}\, x^{a-1} (1-x^{a})^{-(b+1)}, \quad 0 < x < 1. \tag{44}$$

Remarks 1. (a) The general solution of the differential equation in Corollary 2 is

$$\eta(x) = \exp\left(\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) \left[-\int \frac{ab}{\beta}\, x^{a-1} (1-x^{a})^{-(b+1)} \exp\left(-\frac{1-(1-x^{a})^{b}}{\beta(1-x^{a})^{b}}\right) (h(x))^{-1} g(x)\, dx + D\right], \tag{45}$$

for 0 < x < 1, where D is a constant. One set of appropriate functions is given in Corollary 1 with D = 0.

(b) Clearly, there are other triplets of functions (h, g, η) satisfying the conditions of Theorem 5. We presented one such triplet in Corollary 1.

10. Concluding remarks

A special case of the gamma-generated family of distributions, the gamma-Kumaraswamy distribution, is defined and studied. Various properties of the gamma-Kumaraswamy distribution are investigated, including moments, hazard function, and reliability parameter. The new model includes as special sub-models the gamma and Kumaraswamy distribution. Also, we provide various characterizations of the gamma-Kumaraswamy distribution. An application to a real data set shows that the fit of the new model is superior to the fits of its main sub-models. As future work related to this univariate GK model, we will consider the following:

• A natural bivariate extension to the model in Eq. (1) would be

$$f(x, y) \propto \frac{g(x, y)}{\Gamma(\alpha)\beta^{\alpha}\, \bar{G}^{2}(x, y)} \exp\left(-\frac{G(x, y)}{\beta \bar{G}(x, y)}\right) \left(\frac{G(x, y)}{\beta \bar{G}(x, y)}\right)^{\alpha-1}, \quad x > 0,\ y > 0. \tag{46}$$

In this case, exact evaluation of the normalizing constant would be difficult to obtain, even for a simple analytic expression of a baseline bivariate distribution function, G(x, y). Numerical methods such as Monte Carlo methods of integration might be useful here. We will study and discuss structural properties of such a bivariate GK model.

• Extension of the proposed univariate GK model to multivariate GK models, together with a discussion of the associated inferential issues. It is worth mentioning that classical methods of estimation, such as maximum likelihood, might not be a good strategy because of the enormous number of model parameters. An appropriate Bayesian inference might be the only remedy. In that case, we will separately study two different cases of estimation: (a) with non-informative priors and (b) with full conditional conjugate priors (Gibbs sampling). Since the GK distribution is in the one-parameter exponential family, a reasonable choice of priors for α and β might well be gamma priors with an appropriate choice of hyper-parameters. For the parameters that come from the baseline G(·) distribution function, a data-driven prior approach will be more suitable.

• A discrete analog of the univariate GK model with a possible application in modeling rare events.

• Construction of a new class of GK mixture models by adopting Marshall-Olkin method of obtaining new distribution.

References

1. Johnson, N.L., Kotz, S., and Balakrishnan, N.: Continuous Univariate Distributions. John Wiley and Sons; New York, 1994. Vol. 2.
2. Kumaraswamy, P.: Generalized probability density-function for double-bounded random-processes. Journal of Hydrology, 1980; 462: 79–88.
3. Zografos, K., and Balakrishnan, N.: On families of beta and generalized gamma-generated distributions and associated inference. Statistical Methodology, 2009; 4: 344–362.
4. Hall, I.J.: Approximate one-sided tolerance limits for the difference or sum of two independent normal variates. Journal of Quality Technology, 1984; 16: 15–19.
5. Weerahandi, S., and Johnson, R.A.: Testing reliability in a stress-strength model when X and Y are normally distributed. Technometrics, 1992; 38: 83–91.
6. Lieblein, J., and Zelen, M.: Statistical investigation of the fatigue life of deep-groove ball bearings. Journal of Research of the National Bureau of Standards, 1956; 57: 273–316.
7. Alzaatreh, A., and Ghosh, I.: A study of the Gamma-Pareto (IV) distribution and its applications. Communications in Statistics – Theory and Methods, 2016; 45: 636–654, doi:10.1080/03610926.2013.834453
8. Mahmoudi, E.: The beta generalized Pareto distribution with application to lifetime data. Mathematics and Computers in Simulation, 2011; 81: 2414–2430.
9. Torabi, H., and Hedesh, N.M.: The gamma-uniform distribution and its applications. Kybernetika-Praha, 2011; 48: 16–30.
10. Akinsete, A., Famoye, F., and Lee, C.: The beta-Pareto distribution. Statistics, 2008; 42: 547–563.
11. Glänzel, W.: A characterization theorem based on truncated moments and its application to some distribution families. Mathematical Statistics and Probability Theory, 1987: 75–84.
12. Glänzel, W.: Some consequences of a characterization theorem based on truncated moments. Statistics, 1990; 21: 613–618.
13. Glänzel, W., Telcs, A., and Schubert, A.: Characterization by truncated moments and its application to Pearson-type distributions. Zeitschrift fur Wahrscheinlichkeitstheorie und verwandte Gebiete, 1984; 66: 173–183.
14. Glänzel, W., and Hamedani, G.G.: Characterizations of univariate continuous distributions. Studia Scientiarum Mathematicarum Hungarica, 2001; 37: 83–118.
15. Hamedani, G.G.: Characterizations of univariate continuous distributions. Studia Scientiarum Mathematicarum Hungarica, 2002; 39: 407–424.
16. Hamedani, G.G.: Characterizations of univariate continuous distributions. Studia Scientiarum Mathematicarum Hungarica, 2006; 43: 361–385.
17. Hamedani, G.G.: Characterizations of continuous univariate distributions based on the truncated moments of functions of order statistics. Studia Scientiarum Mathematicarum Hungarica, 2010; 47: 462–484.
