Gamma-Kumaraswamy Distribution in Reliability Analysis: Properties and Applications

In this chapter, a new generalization of the Kumaraswamy distribution, namely the gamma-Kumaraswamy distribution, is defined and studied. Several distributional properties are discussed, including limiting behavior, mode, quantiles, moments, skewness, kurtosis, Shannon's entropy, and order statistics. Under the classical approach, the method of maximum likelihood estimation is proposed for inference on this distribution. We report the results of fitting the gamma-Kumaraswamy distribution to two real data sets to exhibit the utility of this model.


Introduction
Generalizing a distribution by mixing it with another distribution has, over the years, provided a mathematically based way to model a wide variety of random phenomena statistically. These generalized distributions are effective and flexible models for analyzing and interpreting random durations in a possibly heterogeneous population. In many situations, observed data may be assumed to have come from such a mixture population of two or more distributions.
The two-parameter gamma and the two-parameter Kumaraswamy distributions are among the most popular distributions for analyzing lifetime data. The gamma distribution is well known, and it has several desirable properties [1].
A serious limitation of the gamma distribution, however, is that the distribution function (or survival function) is not available in closed form if the shape parameter is not an integer, so numerical methods are required to evaluate these quantities. As a consequence, this distribution is less attractive than that of Ref. [2], which has a tractable distribution function, survival function, and hazard function. In this chapter, we consider a four-parameter gamma-Kumaraswamy distribution. It has many properties that are quite similar to those of a gamma distribution, but it has an explicit expression for the distribution function and the survival function.
The major motivation of this chapter is to introduce a new family of distributions, to make a comparative study of this family with respect to the Kumaraswamy family and the gamma family, and to provide the practitioner with an additional option, in the hope that it may give a 'better fit' than a gamma family or a Kumaraswamy family in certain situations. It is worth noting that the gamma-Kumaraswamy distribution is a generalization of the Kumaraswamy distribution that can exhibit various shapes (Figure 1). This gives the gamma-Kumaraswamy distribution more flexibility than the Kumaraswamy distribution in modeling different data sets. The property of left-skewness is a rare characteristic, as it is not enjoyed by several generalizations of the Kumaraswamy distribution. Our proposed model is different from that of Ref. [3], where the authors proposed a generalized gamma-generated distribution with an extra positive parameter for any continuous baseline G distribution.
The rest of the chapter is organized as follows. In Section 2, we propose the gamma-Kumaraswamy distribution [GK(α, β, a, b)]. In Section 3, we study various properties of the GK(α, β, a, b), including the limiting behavior, transformation, and the mode. In Section 4, the moment generating function, the moments, the mean deviations from the mean and the median, and Rényi's entropy are studied. In Section 5, we consider the maximum likelihood estimation of the GK(α, β, a, b). In Section 6, we provide an expression for the reliability parameter for two independent GK(α, β, a, b) variables with different choices of the parameters α and β but a fixed choice of the two shape parameters of the Kumaraswamy distribution. In Section 7, we discuss the moment generating function of the r-th order statistic and the limiting distribution of the sample minimum and the sample maximum for a random sample of size n drawn from GK(α, β, a, b). An application of GK(α, β, a, b) is discussed in Section 8. Certain characterizations of GK(α, β, a, b) are presented in Section 9. In Section 10, some concluding remarks are made.

The gamma-Kumaraswamy distribution
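We consider the following gamma-X class of distributions, for which the parent model is
$$f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,\frac{g(x)}{\bar{G}^{2}(x)}\left(\frac{G(x)}{\bar{G}(x)}\right)^{\alpha-1}\exp\left(-\frac{G(x)}{\beta\,\bar{G}(x)}\right), \quad (1)$$
where α, β are positive parameters. Also, $g(x)$ [$G(x)$] is the density function [cumulative distribution function] of the random variable X. Furthermore, $\bar{G}(x) = 1 - G(x)$ is the survival function of the associated random variable X.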
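If X has density Eq. (1), then the random variable $W = G(X)/\bar{G}(X)$ has a gamma distribution with parameters α, β. The converse is also true. Here, we consider G(.) to be the cdf of a Kumaraswamy distribution with parameters a, b, that is, $G(x) = 1 - (1 - x^{a})^{b}$ for $0 < x < 1$. Then, the cdf of the gamma-Kumaraswamy (hereafter GK) distribution reduces to
$$F(x) = \gamma_1\left(\alpha; \beta^{-1}\left[(1 - x^{a})^{-b} - 1\right]\right), \quad 0 < x < 1, \quad (2)$$
where $\gamma_1(\alpha; z) = \Gamma(\alpha; z)/\Gamma(\alpha)$, with $\Gamma(\alpha; z) = \int_0^{z} t^{\alpha-1} e^{-t}\,dt$, is the regularized incomplete gamma function. So the density and hazard functions corresponding to Eq. (2) are given, respectively, by
$$f(x) = \frac{a b\, x^{a-1}}{\Gamma(\alpha)\beta^{\alpha}(1 - x^{a})^{b+1}}\left[(1 - x^{a})^{-b} - 1\right]^{\alpha-1}\exp\left(-\beta^{-1}\left[(1 - x^{a})^{-b} - 1\right]\right), \quad 0 < x < 1, \quad (3)$$
and
$$h_F(x) = \frac{f(x)}{1 - F(x)}.$$
The percentile function for the GK distribution: The p-th percentile $x_p$ is defined by $F(x_p) = p$. From Eq. (2), we have
$$x_p = \left[1 - \left(1 + \beta\,\gamma_1^{-1}(\alpha; p)\right)^{-1/b}\right]^{1/a},$$
where $\gamma_1^{-1}(\alpha; \cdot)$ denotes the inverse of $\gamma_1(\alpha; \cdot)$. In the density equation (3), a, b, and α are shape parameters and β is the scale parameter. It can be immediately verified that Eq. (3) is a density function. Plots of the GK density and survival rate function for selected parameter values are given in Figures 1 and 2, respectively.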
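As a minimal sketch of these definitions, the Python code below evaluates the GK cdf, density, and percentile function. It assumes the Kumaraswamy baseline $G(x) = 1 - (1 - x^a)^b$ on (0, 1), so that $G/\bar{G} = (1 - x^a)^{-b} - 1$, and it treats β as the gamma scale parameter. The function names and the hand-rolled series for the regularized incomplete gamma function are illustrative choices, not part of the chapter; a library routine such as SciPy's `scipy.special.gammainc` would normally replace `reg_lower_gamma`.

```python
import math

def reg_lower_gamma(alpha, z):
    """Regularized lower incomplete gamma, Gamma(alpha; z) / Gamma(alpha).

    Uses the series e^{-z} z^alpha * sum_k z^k / (alpha (alpha+1) ... (alpha+k)).
    """
    if z <= 0.0:
        return 0.0
    if z > 700.0:
        # Crude cutoff (assumption): for moderate alpha the value is 1 to
        # double precision, and this keeps the series from overflowing.
        return 1.0
    term = total = 1.0 / alpha
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= z / (alpha + k)
        total += term
    log_value = -z + alpha * math.log(z) - math.lgamma(alpha) + math.log(total)
    return min(1.0, math.exp(log_value))

def gk_cdf(x, alpha, beta, a, b):
    """Assumed cdf: gamma_1(alpha; [(1 - x^a)^(-b) - 1] / beta) on (0, 1)."""
    if x <= 0.0:
        return 0.0
    xa = x ** a
    if xa >= 1.0:
        return 1.0
    expo = -b * math.log1p(-xa)          # log of (1 - x^a)^(-b)
    if expo > 700.0:
        return 1.0
    return reg_lower_gamma(alpha, math.expm1(expo) / beta)

def gk_pdf(x, alpha, beta, a, b):
    """Assumed density, evaluated on the log scale for interior points."""
    if not 0.0 < x < 1.0:
        return 0.0
    log_sf = math.log1p(-(x ** a))       # log(1 - x^a)
    w = math.expm1(-b * log_sf)          # odds G / (1 - G)
    log_f = (math.log(a * b) + (a - 1.0) * math.log(x) - (b + 1.0) * log_sf
             + (alpha - 1.0) * math.log(w) - w / beta
             - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_f)

def gk_ppf(p, alpha, beta, a, b):
    """Percentile x_p = [1 - (1 + beta * z_p)^(-1/b)]^(1/a), 0 < p < 1,
    where z_p solves gamma_1(alpha; z_p) = p (found here by bisection)."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(alpha, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(alpha, mid) < p:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return (-math.expm1(-math.log1p(beta * z) / b)) ** (1.0 / a)

# Example with illustrative parameter values (alpha, beta, a, b).
params = (2.0, 0.5, 2.0, 3.0)
median = gk_ppf(0.5, *params)
print(median, gk_cdf(median, *params), gk_pdf(median, *params))
```

Computing the percentile through the gamma scale first, and only then mapping back to (0, 1), mirrors the closed-form structure of the percentile function and avoids a second root search in x.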
• The GK distribution does not possess the reproductive property. In other words, if $X_1 \sim GK(a_1, b_1, \alpha_1, \beta_1)$ and $X_2 \sim GK(a_2, b_2, \alpha_2, \beta_2)$, then the distribution of the sum $S = X_1 + X_2$ will not be a GK distribution.
The first result gives an important property of the GK distribution for information analysis, namely that this distribution is closed under power transformation. The latter result is equally important because it provides a simple way to generate random variables following the GK distribution.
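The generation recipe can be sketched in a few lines of Python. The sketch assumes that a GK variate is obtained from a gamma variate W (shape α, scale β) through $X = [1 - (1 + W)^{-1/b}]^{1/a}$, which is the transformation implied by $W = G(X)/\bar{G}(X)$ with a Kumaraswamy baseline; the function name `gk_sample` and the parameter values are hypothetical.

```python
import math
import random

def gk_sample(n, alpha, beta, a, b, seed=None):
    """Draw n GK variates by transforming gamma(shape=alpha, scale=beta) draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        w = rng.gammavariate(alpha, beta)              # W ~ gamma(alpha, beta)
        one_minus = -math.expm1(-math.log1p(w) / b)    # 1 - (1 + W)^(-1/b)
        draws.append(one_minus ** (1.0 / a))
    return draws

# Sanity check for alpha = 1, where W is exponential and the cdf reduces to
# F(x) = 1 - exp(-[(1 - x^a)^(-b) - 1] / beta).
sample = gk_sample(20000, alpha=1.0, beta=1.0, a=2.0, b=2.0, seed=1)
x0 = 0.5
empirical = sum(x <= x0 for x in sample) / len(sample)
exact = 1.0 - math.exp(-((1.0 - x0 ** 2.0) ** (-2.0) - 1.0) / 1.0)
print(empirical, exact)
```

The α = 1 case is used for the check only because its cdf needs no incomplete gamma function; the sampler itself works for any positive α.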

Properties of GK distribution
The following lemma establishes the relation between the GK(α, β, a, b) distribution and the gamma distribution.
Lemma 1. (Transformation): If a random variable X follows a gamma distribution with parameters α and β, then $Y = [1 - (1 + X)^{-1/b}]^{1/a}$ follows the GK(α, β, a, b) distribution.
Proof. The proof follows immediately by using the transformation technique. □
The limiting behaviors of the GK pdf and its hazard function are given in the following theorem.
Theorem 1. The limits of the GK density function, f(x), and the hazard function, $h_F(x)$, are given by
Proof. Straightforward and hence omitted. □
Theorem 2. The mode of the GK distribution is the solution of the equation $k(x) = 0$, where
Proof. The derivative of f(x) in Eq. (3) can be written as The critical values of Eq. (10) are the solutions of $k(x) = 0$. □
Next, we discuss the IFR and/or DFR property of the hazard function for the GK distribution. For this, we use the result of Lemma 1. According to Lemma 1, if X ~ GK(a, b, α, β), then $Y = (1 - X^{a})^{-b} - 1$ follows a gamma distribution with parameters α and β. In such a case, for the random variable Y, the hazard rate function r(t) can be written as
$$[r(t)]^{-1} = \int_0^{\infty}\left(1 + \frac{u}{t}\right)^{\alpha-1} e^{-u/\beta}\,du.$$
Therefore, if α > 1, then $(1 + u/t)^{\alpha-1}$ is decreasing in t; hence r(t) is increasing, and Y has an IFR. If 0 < α < 1, then $(1 + u/t)^{\alpha-1}$ is increasing in t, so r(t) decreases, and hence Y has a DFR. Now, since X is a one-to-one function of Y, the hazard rate function of X will also follow the same pattern.
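Returning to Theorem 2, the mode can also be located numerically. The sketch below does not use the chapter's function k(x); instead it assumes the density $f(x) \propto x^{a-1}(1 - x^a)^{-(b+1)} w^{\alpha-1} e^{-w/\beta}$ with $w = (1 - x^a)^{-b} - 1$, differentiates $\log f$ directly, and finds a sign change of that derivative by bisection. The bracket [0.05, 0.95] and the parameter values are illustrative, and a single interior mode is assumed.

```python
import math

def gk_score_x(x, alpha, beta, a, b):
    """d/dx log f(x) for the assumed GK density; zeros are critical points of f."""
    sf = 1.0 - x ** a                                  # 1 - x^a
    w = sf ** (-b) - 1.0                               # odds G / (1 - G)
    dw = a * b * x ** (a - 1.0) * sf ** (-b - 1.0)     # derivative of w in x
    return ((a - 1.0) / x
            + (b + 1.0) * a * x ** (a - 1.0) / sf
            + (alpha - 1.0) * dw / w
            - dw / beta)

def gk_mode(alpha, beta, a, b, lo=0.05, hi=0.95, iters=100):
    """Bisection on the score; requires a + to - sign change on [lo, hi]."""
    s_lo = gk_score_x(lo, alpha, beta, a, b)
    s_hi = gk_score_x(hi, alpha, beta, a, b)
    if not (s_lo > 0.0 > s_hi):
        raise ValueError("score does not change sign from + to - on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gk_score_x(mid, alpha, beta, a, b) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mode = gk_mode(alpha=2.0, beta=0.5, a=2.0, b=3.0)
print(mode)
```

When aα ≤ 1 the density does not vanish at the origin, so the sign check at the left end of the bracket may fail; the function then raises instead of returning a misleading value.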
Let X and Y be two random variables. X is said to be stochastically greater than or equal to Y, denoted by $X \ge_{st} Y$, if $P(X > x) \ge P(Y > x)$ for all x in the support set of X.
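Theorem 3. Suppose $X \sim GK(a_1, b_1, \alpha, \beta_1)$ and $Y \sim GK(a_2, b_2, \alpha, \beta_2)$. If $\beta_1 > \beta_2$, $a_1 > a_2$, and $b_1 < b_2$, then $X \ge_{st} Y$ for integer values of $a_1$ and $a_2$.
Proof. At first, we note that the incomplete gamma function $\Gamma(\alpha; x)$ is an increasing function of x for fixed α. For any real number $x \in (0, 1)$, $\beta_1 > \beta_2$, $a_1 > a_2$, and $b_1 < b_2$, we have
$$\beta_1^{-1}\left[(1 - x^{a_1})^{-b_1} - 1\right] \le \beta_2^{-1}\left[(1 - x^{a_2})^{-b_2} - 1\right].$$
This implies that $\Gamma\left(\alpha; \beta_1^{-1}\left[(1 - x^{a_1})^{-b_1} - 1\right]\right) \le \Gamma\left(\alpha; \beta_2^{-1}\left[(1 - x^{a_2})^{-b_2} - 1\right]\right)$, that is, $P(X > x) \ge P(Y > x)$, and this completes the proof. □
Note: For fractional choices of $a_1$ and $a_2$, the reverse of the above inequality will hold.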

Moments and mean deviations
For any r ≥ 1, Upper bounds for the r-th order moment: Since $\binom{n}{k} \le \frac{n^{k}}{k!}$ for $1 \le k \le n$, from Eq. (13), one can write $\Gamma(\alpha + k)$, provided $r/a$ and $j/b + k - 1$ are both integers. Employing successively the generalized series expansion of $\left[1 - (1 + \beta u)^{-1/b}\right]^{j/a}$, the characteristic function for $X \sim GK(a, b, \alpha, \theta)$ will be given by [from Eq. (3)] If $j/a$ and $k_1/b$ are integers, then in Eq. (14), the second and third summations will stop at $j/a$ and $k_1/b$, respectively.
If we denote the median by T, then the mean deviation from the mean, $D(\mu)$, and the mean deviation from the median, $D(T)$, can be written as Now, considering (17), we obtain where we successively used the binomial series expansion.
By using Eqs. (2) and (18), the mean deviation from the mean and the mean deviation from the median are, respectively, given by
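The moments and the mean deviation from the mean can also be checked by direct numerical integration, which is useful when the series expressions are awkward to evaluate. The sketch below assumes the GK density $f(x) = a b x^{a-1}(1 - x^a)^{-(b+1)} w^{\alpha-1} e^{-w/\beta} / (\Gamma(\alpha)\beta^{\alpha})$ with $w = (1 - x^a)^{-b} - 1$, and it uses the standard identity $D(\mu) = 2\mu F(\mu) - 2\int_0^{\mu} x f(x)\,dx$, which may or may not be the form used in the chapter. The midpoint rule, the function names, and the parameter values are illustrative.

```python
import math

def gk_pdf(x, alpha, beta, a, b):
    """Assumed GK density on (0, 1), evaluated on the log scale."""
    log_sf = math.log1p(-(x ** a))       # log(1 - x^a)
    w = math.expm1(-b * log_sf)          # odds G / (1 - G)
    log_f = (math.log(a * b) + (a - 1.0) * math.log(x) - (b + 1.0) * log_sf
             + (alpha - 1.0) * math.log(w) - w / beta
             - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_f)

def integrate(func, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]; never evaluates the endpoints."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

def gk_moment(r, alpha, beta, a, b):
    """r-th raw moment E[X^r] by quadrature."""
    return integrate(lambda x: x ** r * gk_pdf(x, alpha, beta, a, b), 0.0, 1.0)

def gk_mean_deviation(alpha, beta, a, b):
    """Return (mu, D(mu) by direct integration, D(mu) by the identity)."""
    f = lambda x: gk_pdf(x, alpha, beta, a, b)
    mu = gk_moment(1, alpha, beta, a, b)
    direct = integrate(lambda x: abs(x - mu) * f(x), 0.0, 1.0)
    identity = (2.0 * mu * integrate(f, 0.0, mu)
                - 2.0 * integrate(lambda x: x * f(x), 0.0, mu))
    return mu, direct, identity

params = (2.0, 0.5, 2.0, 3.0)            # (alpha, beta, a, b), illustrative
print(gk_moment(0, *params))             # total mass, should be close to 1
print(gk_mean_deviation(*params))
```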

Entropy
One useful measure of diversity for a probability model is given by Rényi's entropy. It is defined as $I_R(\rho) = (1 - \rho)^{-1}\log\left(\int f^{\rho}(x)\,dx\right)$, where ρ > 0 and ρ ≠ 1. If a random variable X has a GK distribution, then we have Next, consider the integral Now, using successive application of the generalized binomial expansion, we can write Hence, the integral in Eq. (21) reduces to Therefore, the expression for the Rényi entropy will be
$$I_R(\rho) = (1 - \rho)^{-1}\log\delta(\rho; \alpha, \beta, a, b). \quad (24)$$
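As a numerical cross-check of the entropy expression, the sketch below evaluates $I_R(\rho) = (1 - \rho)^{-1}\log\int_0^1 f^{\rho}(x)\,dx$ by a midpoint rule. It assumes the GK density with Kumaraswamy odds $w = (1 - x^a)^{-b} - 1$ and gamma scale β; it does not implement the series δ(ρ; α, β, a, b), and the function names and parameter values are illustrative.

```python
import math

def gk_logpdf(x, alpha, beta, a, b):
    """Log of the assumed GK density at an interior point 0 < x < 1."""
    log_sf = math.log1p(-(x ** a))       # log(1 - x^a)
    w = math.expm1(-b * log_sf)          # odds G / (1 - G)
    return (math.log(a * b) + (a - 1.0) * math.log(x) - (b + 1.0) * log_sf
            + (alpha - 1.0) * math.log(w) - w / beta
            - math.lgamma(alpha) - alpha * math.log(beta))

def renyi_entropy(rho, alpha, beta, a, b, n=20000):
    """Renyi entropy of order rho (rho > 0, rho != 1) by the midpoint rule."""
    if rho <= 0.0 or rho == 1.0:
        raise ValueError("rho must be positive and different from 1")
    h = 1.0 / n
    integral = h * sum(
        math.exp(rho * gk_logpdf((i + 0.5) * h, alpha, beta, a, b))
        for i in range(n)
    )
    return math.log(integral) / (1.0 - rho)

params = (2.0, 0.5, 2.0, 3.0)            # (alpha, beta, a, b), illustrative
print(renyi_entropy(0.5, *params), renyi_entropy(2.0, *params))
```

Two general facts give a quick plausibility check on the output: the Rényi entropy is non-increasing in ρ, and for a density supported on (0, 1) it cannot exceed 0, the value attained by the uniform distribution.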

Maximum likelihood estimation
In this section, we address the parameter estimation of the GK(α, β, a, b) under the classical setup. Let $X_1, X_2, \ldots, X_n$ be a random sample of size n drawn from the density Eq. (3). The log-likelihood function is given by The derivatives of the log-likelihood with respect to α, β, a, and b are given by where $\Psi(\alpha) = \frac{\partial}{\partial\alpha}\log\Gamma(\alpha)$. The multivariate normal $N_4(0, K(\theta)^{-1})$ distribution can be used to construct approximate confidence intervals for the individual parameters.
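To make the estimation step concrete, the sketch below writes out a log-likelihood under the assumption that the density is $f(x) = a b x^{a-1}(1 - x^a)^{-(b+1)} w^{\alpha-1} e^{-w/\beta} / (\Gamma(\alpha)\beta^{\alpha})$ with $w = (1 - x^a)^{-b} - 1$. Under that assumption, setting the β-derivative to zero with (α, a, b) held fixed gives the closed form $\hat{\beta} = \sum_i w_i / (n\alpha)$. This is only a partial illustration: the chapter estimates all four parameters jointly, which requires a numerical optimizer. The function names and the simulated data are hypothetical.

```python
import math
import random

def gk_loglik(xs, alpha, beta, a, b):
    """Log-likelihood of a sample xs in (0, 1) under the assumed GK density."""
    n = len(xs)
    total = n * (math.log(a * b) - math.lgamma(alpha) - alpha * math.log(beta))
    for x in xs:
        log_sf = math.log1p(-(x ** a))   # log(1 - x^a)
        w = math.expm1(-b * log_sf)      # odds G / (1 - G)
        total += ((a - 1.0) * math.log(x) - (b + 1.0) * log_sf
                  + (alpha - 1.0) * math.log(w) - w / beta)
    return total

def beta_hat(xs, alpha, a, b):
    """Root of the beta score with (alpha, a, b) fixed: sum(w_i) / (n * alpha)."""
    ws = [math.expm1(-b * math.log1p(-(x ** a))) for x in xs]
    return sum(ws) / (len(ws) * alpha)

# Simulated data via the gamma transformation (illustrative true values).
rng = random.Random(7)
alpha0, beta0, a0, b0 = 2.0, 0.5, 2.0, 3.0
xs = [(-math.expm1(-math.log1p(rng.gammavariate(alpha0, beta0)) / b0)) ** (1.0 / a0)
      for _ in range(2000)]

bh = beta_hat(xs, alpha0, a0, b0)
print(bh, gk_loglik(xs, alpha0, bh, a0, b0))
```

For the full four-parameter problem, the negated `gk_loglik` would be passed to a general-purpose optimizer, with the parameters kept positive (for example by optimizing over their logarithms).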

Simulation study
In order to assess the performance of the MLEs, a small simulation study is performed using the statistical software R through the mle command of the stats4 package. The number of Monte Carlo replications was 20,000. For maximizing the log-likelihood function, we use the MaxBFGS subroutine with analytical derivatives. The estimates were evaluated using the following quantities for each sample size: the empirical biases and the empirical mean squared errors (MSEs), calculated in R from the Monte Carlo replications. The MLEs are determined for each simulated data set, say, $(\hat{\alpha}_i, \hat{\beta}_i, \hat{a}_i, \hat{b}_i)$ for $i = 1, 2, \ldots, 20{,}000$, and the biases and MSEs are computed by
$$\mathrm{bias}_h(n) = \frac{1}{20000}\sum_{i=1}^{20000}\left(\hat{h}_i - h\right)$$
and
$$\mathrm{MSE}_h(n) = \frac{1}{20000}\sum_{i=1}^{20000}\left(\hat{h}_i - h\right)^{2}$$
for h = α, β, a, b. We consider the sample sizes n = 100, 200, and 500 and different values for the parameters. The empirical results are given in Table 1. The figures in Table 1 indicate that the estimates are quite stable and, more importantly, are close to the true values for these sample sizes. Furthermore, as the sample size increases, the MSEs decrease, as expected.
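The bias and MSE computation can be illustrated on a deliberately reduced version of this experiment. The Python sketch below is not the chapter's R study: it assumes that (α, a, b) are known and estimates only β through the closed form $\hat{\beta} = \sum_i w_i / (n\alpha)$, with $w_i = (1 - x_i^a)^{-b} - 1$, so that no optimizer is needed. The number of replications, the sample sizes, and the parameter values are illustrative.

```python
import math
import random

def simulate_beta_hat(n, alpha, beta, a, b, rng):
    """Simulate one GK sample of size n and return beta_hat = sum(w) / (n * alpha)."""
    total_w = 0.0
    for _ in range(n):
        w = rng.gammavariate(alpha, beta)                     # gamma draw
        x = (-math.expm1(-math.log1p(w) / b)) ** (1.0 / a)    # GK variate
        total_w += math.expm1(-b * math.log1p(-(x ** a)))     # back to odds scale
    return total_w / (n * alpha)

def bias_and_mse(n, reps, alpha, beta, a, b, seed=0):
    """Empirical bias and MSE of beta_hat over reps Monte Carlo replications."""
    rng = random.Random(seed)
    estimates = [simulate_beta_hat(n, alpha, beta, a, b, rng) for _ in range(reps)]
    bias = sum(e - beta for e in estimates) / reps
    mse = sum((e - beta) ** 2 for e in estimates) / reps
    return bias, mse

for n in (50, 500):
    print(n, bias_and_mse(n, reps=300, alpha=2.0, beta=0.5, a=2.0, b=3.0))
```

Even in this reduced setting the MSE shrinks as n grows, which is the qualitative pattern reported for the full four-parameter study.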

Reliability parameter
The reliability parameter R is defined as $R = P(X > Y)$, where X and Y are independent random variables. For a detailed study of the possible applications of the reliability parameter, the interested reader is referred to Refs. [4,5]. If X and Y are two continuous and independent random variables with cdfs $F_1(x)$ and $F_2(y)$ and pdfs $f_1(x)$ and $f_2(y)$, respectively, then the reliability parameter R can be written as
$$R = P(X > Y) = \int_{-\infty}^{\infty} f_1(x) F_2(x)\,dx.$$
Theorem 4. Let $X \sim GK(a, b, \alpha_1, \beta_1)$ and $Y \sim GK(a, b, \alpha_2, \beta_2)$, then $\Gamma(\alpha_1 + \alpha_2 + p)$.
Proof: From Eqs. (2) and (3) Using the series expansion for the incomplete gamma function, $\Gamma(k; x) = x^{k}\sum_{p=0}^{\infty}\frac{(-x)^{p}}{p!\,(k + p)}$, and using the substitution $\Gamma(\alpha_1 + \alpha_2 + p)$. Hence the proof. □
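A numerical illustration of the reliability parameter is sketched below. Because X and Y share the same Kumaraswamy parameters (a, b), and the map from the gamma scale to (0, 1) is increasing, $P(X > Y)$ equals $P(W_1 > W_2)$ for independent gamma variables $W_i$ with shape $\alpha_i$ and scale $\beta_i$. The series in `reliability_series` is a reconstruction obtained by integrating the incomplete gamma series term by term, namely $R = \frac{1}{\Gamma(\alpha_1)\Gamma(\alpha_2)}\sum_{p \ge 0}\frac{(-1)^p}{p!\,(\alpha_2 + p)}\left(\frac{\beta_1}{\beta_2}\right)^{\alpha_2 + p}\Gamma(\alpha_1 + \alpha_2 + p)$; it is an assumption consistent with the fragments of Theorem 4, not a quotation of it, and by the ratio test it converges only when $\beta_1 < \beta_2$.

```python
import math
import random

def reliability_series(alpha1, beta1, alpha2, beta2, terms=200):
    """Reconstructed series for R = P(X > Y); assumes beta1 < beta2."""
    ratio = beta1 / beta2
    if ratio >= 1.0:
        raise ValueError("the series is only used here for beta1 < beta2")
    total = 0.0
    for p in range(terms):
        log_mag = (math.lgamma(alpha1 + alpha2 + p)
                   - math.lgamma(alpha1) - math.lgamma(alpha2)
                   - math.lgamma(p + 1.0) - math.log(alpha2 + p)
                   + (alpha2 + p) * math.log(ratio))
        total += (-1) ** p * math.exp(log_mag)
    return total

def reliability_mc(alpha1, beta1, alpha2, beta2, n=20000, seed=0):
    """Monte Carlo estimate of P(W1 > W2) on the gamma scale."""
    rng = random.Random(seed)
    hits = sum(rng.gammavariate(alpha1, beta1) > rng.gammavariate(alpha2, beta2)
               for _ in range(n))
    return hits / n

print(reliability_series(2.0, 0.5, 3.0, 1.0), reliability_mc(2.0, 0.5, 3.0, 1.0))
```

For $\alpha_1 = \alpha_2 = 1$ both gamma variables are exponential and the series collapses to the geometric sum $\beta_1/(\beta_1 + \beta_2)$, which gives a convenient closed-form check.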

Order statistics
Here, we derive the density of the general r-th order statistic and the large-sample distribution of the sample minimum and the sample maximum based on a random sample of size n from the GK(α, β, a, b) distribution. The corresponding density function of the r-th order statistic, $X_{r:n}$, from Eq. Using the series expression for the incomplete gamma function, $\Gamma(\alpha; x) = \sum_{k=0}^{\infty}\frac{e^{-x} x^{\alpha + k}}{\alpha(\alpha + 1)\cdots(\alpha + k)}$, the pdf of $X_{r:n}$ can be written as $(-1)^{j + s_k}\binom{r-1}{j}\cdot\frac{\Gamma(s_k + (r + j)\alpha)}{(\Gamma(\alpha))^{n-r+j}\,p_k}\,f(x \mid s_k + (n - r + j)\alpha, \beta, a, b)$, where $s_k = \sum_{i=1}^{n-r+j} k_i$ and $p_k = \prod_{i=1}^{n-r+j}(k_i + \alpha)$. From Eq. (37), it is interesting to note that the pdf of the r-th order statistic $X_{r:n}$ can be expressed as an infinite sum of GK pdfs.

Application
Here, we consider two well-known illustrative data sets to show the efficacy of the GK distribution. For details on these two data sets, see Refs. [6,7]. The second data set, in Table 2, is from Ref. [8], and it represents the fatigue life of 6061-T6 aluminum coupons cut parallel with the direction of rolling and oscillated at 18 cycles per second. The GK distribution is fitted to the first data set, and the result is compared with the Kumaraswamy, gamma-uniform [9], and beta-Pareto [10] distributions. These results are reported in Table 3. The results show that the gamma-uniform and GK distributions provide an adequate fit to the data. Figure 3 displays the empirical and the fitted cumulative distribution functions. This figure supports the results in Table 3. A close look at Figure 3 indicates that the GK distribution provides a better fit to the left tail than the gamma-uniform distribution. This is due to the fact that the GK distribution can have a longer left tail (Figure 3).
Table 2. Fatigue life of 6061-T6 aluminum coupons [8].
 70  90  96  97  99 100 103 104 104 105
107 108 108 108 109 109 112 112 113 114
114 114 116 119 120 120 120 121 121 123
124 124 124 124 124 128 128 129 129 130
130 130 131 131 131 131 131 132 132 132
133 134 134 134 134 134 136 136 137 138
142 142 144 144 145 146 148 148 149 151
151 152 155 156 157 157 157 157 158 159
162 163 163 164 166 166 168 170 174 196 212
In addition, to check the goodness-of-fit of all statistical models, several other goodness-of-fit statistics are used, computed with the computational package Mathematica. The MLEs are computed using the NMaximize technique, together with goodness-of-fit measures including the log-likelihood function evaluated at the MLEs (ℓ), the Akaike information criterion (AIC), the corrected Akaike information criterion (AICC), the consistent Akaike information criterion (CAIC), and the Anderson-Darling (A*), Cramér-von Mises (W*), and Kolmogorov-Smirnov (K-S) statistics with their p values, to compare the fitted models. These statistics are used to evaluate how closely a specific distribution with cdf (2) fits the corresponding empirical distribution for a given data set. The distribution with a better fit than the others will be the one having the smallest statistics and the largest p value. Alternatively, the distribution for which one obtains the smallest of each of these criteria (i.e., AIC, AICC, K-S, etc.) will be the most suitable one. The mathematical equations of those statistics are given by where $\ell(\theta)$ denotes the log-likelihood function evaluated at the maximum likelihood estimates, q is the number of parameters, n is the sample size, and $z_i = \mathrm{cdf}(y_i)$, the $y_i$'s being the ordered observations.
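Three of these criteria are simple enough to sketch directly. The Python functions below use the standard textbook definitions $\mathrm{AIC} = -2\ell + 2q$, $\mathrm{AICC} = \mathrm{AIC} + 2q(q + 1)/(n - q - 1)$, and $\text{K-S} = \max_i \max\{i/n - z_i,\; z_i - (i - 1)/n\}$; these are assumed to match the chapter's equations, which are not reproduced here. CAIC, A*, and W* are left out because their exact forms vary between authors. The numerical inputs are made up for illustration and are not fitted values from the chapter.

```python
def aic(loglik, q):
    """Akaike information criterion for log-likelihood loglik and q parameters."""
    return -2.0 * loglik + 2.0 * q

def aicc(loglik, q, n):
    """Small-sample corrected AIC for sample size n (requires n > q + 1)."""
    return aic(loglik, q) + 2.0 * q * (q + 1.0) / (n - q - 1.0)

def ks_statistic(z):
    """Kolmogorov-Smirnov distance from fitted cdf values z_i = cdf(y_i)."""
    z = sorted(z)
    n = len(z)
    return max(max((i + 1) / n - zi, zi - i / n) for i, zi in enumerate(z))

# Illustrative inputs only.
print(aic(-100.0, 4), aicc(-100.0, 4, 50), ks_statistic([0.1, 0.4, 0.7]))
```

In practice `z` would hold the fitted GK cdf evaluated at each observation, after rescaling the data to the unit interval if necessary.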
Lieblein and Zelen [6] proposed a five-parameter beta generalized Pareto distribution, fitted it to the data in Table 4, and compared the result with beta-Pareto and other known distributions. The results of fitting the beta generalized Pareto and beta-Pareto distributions from Ref. [8] are reported in Table 4, along with the results of fitting the Pareto (IV) and GK distributions to the data. The K-S value from Table 4 indicates that the GK distribution provides the best fit. The fact that the GK distribution has fewer parameters than the beta generalized Pareto and beta-Pareto distributions gives it an extra advantage over them. Figure 4 displays the empirical and the fitted cumulative distribution functions. This figure supports the results in Table 4.

Characterization of GK distribution
In this section, we present characterizations of the GK distribution in terms of the ratio of two truncated moments. For previous work in this direction, we refer interested readers to Glänzel [11-14] and Hamedani [15-17]. For our characterization results, we employ a theorem due to Ref. [11]; see that reference for further details. The advantage of the characterizations given here is that the cdf F need not have a closed form. We present here a corollary as a direct application of the theorem discussed in detail in Ref. [11].
Corollary 1. Let $X : \Omega \to (0, 1)$ be a continuous random variable and let $\beta(1 - x^{a})^{b}$ for $x \in (0, 1)$. Then X has pdf (3) if and only if the function η defined in Theorem 5 has the form
Proof. Let X have pdf (3), then $1 - F(x)$

Concluding remarks
The proposed model includes as special sub-models the gamma and Kumaraswamy distributions. Also, we provide various characterizations of the gamma-Kumaraswamy distribution. An application to a real data set shows that the fit of the new model is superior to the fits of its main sub-models. As future work related to this univariate GK model, we will consider the following:
• A natural bivariate extension to the model in Eq. (1) would be
$$f(x, y) \propto \frac{g(x, y)}{\Gamma(\alpha)\beta^{\alpha}\bar{G}^{2}(x, y)}\exp\left(-\frac{G(x, y)}{\beta\,\bar{G}(x, y)}\right)\left(\frac{G(x, y)}{\beta\,\bar{G}(x, y)}\right)^{\alpha-1}, \quad x > 0,\; y > 0.$$
In this case, exact evaluation of the normalizing constant would be difficult to obtain, even for a simple analytic expression of a baseline bivariate distribution function, G(x, y). Numerical methods such as Monte Carlo integration might be useful here. We will study and discuss structural properties of such a bivariate GK model.
• Extension of the proposed univariate GK model to multivariate GK models, with a discussion of the associated inferential issues. It is worth mentioning that classical methods of estimation, such as maximum likelihood, might not be a good strategy because of the enormous number of model parameters. An appropriate Bayesian inference might be the only remedy. In that case, we will separately study two different cases of estimation: (a) with non-informative priors and (b) with full conditional conjugate priors (Gibbs sampling). Since the GK distribution is in the one-parameter exponential family, a reasonable choice of priors for α and β might well be gamma priors with an appropriate choice of hyper-parameters. For prior choices of the parameters that come from the baseline G(.) distribution function, a data-driven prior approach will be more suitable.
• A discrete analog of the univariate GK model with a possible application in modeling rare events.
• Construction of a new class of GK mixture models by adopting the Marshall-Olkin method of obtaining new distributions.