Estimation of Measles Immunization Coverage in Guwahati by Ranked Set Sampling

To study the efficacy of ranked set sampling (RSS) as an alternative procedure for estimating the proportion of children aged 12-23 months immunized against measles, a study was conducted in slum and non-slum regions of Guwahati, the capital of Assam, India. The RSS-based approach, under both perfect and imperfect ranking, is compared with its counterpart, simple random sampling (SRS). The results reveal that the estimates based on RSS with set size 4 are very close to the Census report for Assam (2012) and have less variability than the SRS estimator. RSS-based estimates for different choices of the ranking parameter (ρ) are more accurate, more precise and more efficient than those from the SRS procedure, which suggests that RSS is better than classical SRS.


Introduction
In public health studies, the measles virus is regarded as highly contagious and responsible for serious disease. According to the medical dictionary, the measles virus infects the lungs of children, which can cause pneumonia, and in older children it can cause inflammation of the brain, called encephalitis, which can lead to seizures and brain damage [1]. As a preventive measure, vaccination is given in early childhood so that the child acquires immunity against the measles virus. According to the Integrated Child Development Services Program (ICDS) in India, a child should have received the basic vaccines (BCG, polio, DPT and measles) by 12-23 months of age.
In this regard, the World Health Organization (WHO) launched the Expanded Program on Immunization (EPI) in May 1974 to immunize children around the world. Since then, it has been widely used to assess coverage. India launched its national vaccination program, entitled the Expanded Programme of Immunization, in 1978 with the introduction of the BCG, OPV, DPT and anti-typhoid-paratyphoid vaccines. The EPI was renamed and relaunched, with a major change in the list of vaccinations, as the Universal Vaccination Program on 19 November 1985. The measles vaccine [2] has been added to the latest schedule.
The most popular sampling procedures for studying vaccination coverage, viz., lot quality assurance sampling (LQAS) [3,4], systematic sampling, cluster sampling [5] and stratified sampling [6], are based on simple random sampling (SRS) at either the first or a subsequent stage. In many practical situations, obtaining the actual measurement of an observation is neither easy nor economical, whereas ranking a small set of units using auxiliary information is relatively easy, economical and reliable. McIntyre [7] introduced ranked set sampling (RSS) in 1952 as an alternative to standard SRS that is highly beneficial for estimating population parameters. In the RSS procedure, a set of units is randomly drawn from the population and the selected units are ranked by judgment or by other means that do not require actual measurement. Only the unit with the lowest rank in this set is measured. Next, a second set of units of the same size is drawn and ranked as before, and the unit with the second lowest rank is measured. This procedure of ranking and measuring is continued until there are as many observations as the size of the set. The entire procedure is regarded as a cycle. Cycles are repeated until the desired sample size is obtained for analysis.
Based on real-life primary data, the present study compares RSS with SRS for estimating the proportion of children aged 12-23 months in slum and non-slum households of Guwahati, the capital city of the state of Assam, India, who are not immunized against measles. Information on a total of 500 households (260 slum and 240 non-slum) was obtained after ranking by the mother's age (15-49 years, recorded in months), which served as the auxiliary variable. Data of the same size were obtained by following the SRS procedure, in order to evaluate the performance and effectiveness of the RSS estimator relative to the SRS estimator.

Sampling design
The study population is a representative cross-sectional sample of children aged 12-23 months born to mothers aged 15-49 years in Guwahati City, India. Five hundred households with children aged from 6 months to 5 years were identified for the present study, following both the SRS and RSS procedures. Following the SRS technique, a sample of 250 households was obtained from both slum and non-slum regions. Among the 500 households selected under RSS, 260 were in the slum region and the remaining 240 were in non-slum parts of Guwahati City. The quantity of interest is the proportion of children in Guwahati City who are not immunized against measles. It is assumed that whether a child receives the vaccination usually depends on the mother's awareness of immunization: the younger the mother of a child aged 12-23 months, the lower her awareness is expected to be. Therefore, the age of the mother (in months) was used as the ranking variable in RSS. Information on the children was obtained through face-to-face interviews with the mothers, who were selected through RSS conducted in Guwahati City.
The observations were divided into m sets of size four ($s = 4$) each. The observations under the RSS procedure are obtained through the following steps.

1. A simple random sample of $s^2$ units is selected from the target population and is randomly divided into $s$ sets, each with $s$ units.
2. In each set, the units are ranked according to the age of the mother, giving ranks (1, 2, 3, 4). Since ties in the mothers' ages are very likely, tied observations are ordered systematically in sequence, as explained by Terpstra and Nelson (2005).
3. From the first set, the unit corresponding to the mother with the lowest age (in months) is selected. From the second set, the unit corresponding to the mother with the second lowest age is selected, and so on. Finally, from the $s$th set, the unit corresponding to the mother with the highest age is selected. The other $s(s-1)$ sampled units are discarded from the data set.
4. Steps 1-3, called a cycle, are repeated $m$ times to obtain a ranked set sample of size $ms$.
Corresponding to each selected mother, information is collected on whether or not her child has received the measles vaccination.
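The sampling steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `ranked_set_sample`, the synthetic population and the choice of m = 5 cycles are assumptions made for the example, and ties are simply left in their drawn order (Python's stable sort) rather than handled by the Terpstra and Nelson rule.

```python
import random

def ranked_set_sample(population, s, m, rank_key, rng):
    """Draw a balanced ranked set sample of size m*s as (rank, unit) pairs."""
    sample = []
    for _ in range(m):                            # step 4: repeat the cycle m times
        units = rng.sample(population, s * s)     # step 1: s^2 units, in random order
        for i in range(s):
            one_set = units[i * s:(i + 1) * s]    # the i-th set of s units
            ranked = sorted(one_set, key=rank_key)  # step 2: rank by the auxiliary variable
            sample.append((i + 1, ranked[i]))     # step 3: keep the unit with rank i+1
    return sample

# Hypothetical population: (mother's age in months, child vaccinated 0/1).
rng = random.Random(0)
population = [(rng.randint(15 * 12, 49 * 12), rng.randint(0, 1)) for _ in range(2000)]

rss = ranked_set_sample(population, s=4, m=5, rank_key=lambda u: u[0], rng=rng)
```

Each cycle contributes exactly one measured unit per rank, so the sample holds m units for each of the s ranking classes.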
Suppose X is the binary response that takes value "1" if the child is vaccinated with measles vaccine and "0" otherwise.
Let $X_{RSS} = \{X_{[i]j} : i = 1, 2, \cdots, s;\ j = 1, 2, \cdots, m\}$ represent a ranked set sample of size $ms$, where $X_{[i]j}$ takes the value "1" or "0" according as the $j$th child in the $i$th ranking class is vaccinated or not. By virtue of ranked set sampling, all the $X_{[i]j}$'s are independently distributed. Here, for any $i$ from 1 to $s$, $X_{[i]j}$ can be regarded as the $i$th order statistic corresponding to a simple random sample of $s$ observations, say $(X_1, X_2, \cdots, X_s)$, on $X$. Obviously, the $X_i$'s have the common probability mass function (p.m.f.)
$$f(x; p) = p^{x}(1-p)^{1-x}, \quad x = 0, 1,$$
where $p$ is the probability that a child in the population is vaccinated with measles vaccine. Now we have, for any $i$, $1 \le i \le s$,
$$p_{[i]} = P\left(X_{[i]1} = 1\right) = \sum_{k=s-i+1}^{s} \binom{s}{k} p^{k}(1-p)^{s-k}.$$
Obviously, $p_{[i]}$ is the proportion, in the $i$th class, of children who received the vaccination and $p$ is the overall proportion of children receiving the vaccine in the entire target population. Here it can be easily shown that
$$p = \frac{1}{s}\sum_{i=1}^{s} p_{[i]}. \qquad (2)$$
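As a numerical check on the relation between the class proportions and the overall proportion, the sketch below uses the standard fact that the $i$th order statistic of $s$ Bernoulli draws equals 1 exactly when at least $s - i + 1$ of the draws equal 1. The function name and the value p = 0.8 are illustrative assumptions, not quantities taken from the study.

```python
from math import comb

def order_stat_success_prob(i, s, p):
    """P(i-th order statistic of s independent Bernoulli(p) draws equals 1)."""
    # The i-th smallest value is 1 exactly when at least s - i + 1 draws are 1.
    return sum(comb(s, k) * p**k * (1 - p)**(s - k)
               for k in range(s - i + 1, s + 1))

s, p = 4, 0.8   # p = 0.8 is an illustrative value only
probs = [order_stat_success_prob(i, s, p) for i in range(1, s + 1)]
mean_prob = sum(probs) / s   # should reproduce p up to rounding error
```

The class probabilities increase with the rank $i$, and their average returns the overall proportion $p$.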

Estimation of parameter p
For a dichotomous population, estimation of the population proportion $p$ based on ranked set samples has already been investigated [8][9][10][11][12][13][14]. A method for estimating $p$ using RSS in situations where the binary variable is obtained from a continuous variable is given in [15]. Let $x_{[i]j}$ denote the observed value of $X_{[i]j}$. The joint p.m.f. of $X_{[i]j}$, $j = 1, 2, \cdots, m$, is given by
$$\prod_{j=1}^{m} p_{[i]}^{x_{[i]j}}\left(1-p_{[i]}\right)^{1-x_{[i]j}} = p_{[i]}^{z_i}\left(1-p_{[i]}\right)^{m-z_i}, \qquad (3)$$
where $z_i = \sum_{j=1}^{m} x_{[i]j}$ is the number of vaccinated children observed in the $i$th ranking class of the given ranked set sample. Obviously, $Z_i \sim \text{Binomial}(m, p_{[i]})$, independently for all $i = 1, 2, \cdots, s$. Then the joint p.m.f. of the whole sample $X_{RSS}$ is of the form
$$L(p) = \prod_{i=1}^{s} p_{[i]}^{z_i}\left(1-p_{[i]}\right)^{m-z_i}.$$
Applying the standard maximum likelihood (ML) principle, the ML estimate of $p$ under ranked set sampling is given by
$$\hat{p} = \arg\max_{0 \le p \le 1} L(p).$$
Given the RSS data, the form of the likelihood function of $p$ is complicated and hence the MLE of $p$ is difficult to obtain directly. Alternatively, for $i = 1, 2, \cdots, s$, one can separately derive the ML estimate, say $\hat{p}_{[i]}$, of $p_{[i]}$ based on the likelihood function (3), and then an estimate of $p$ can be formulated by using the relation (2) as
$$\hat{p}_{RSS} = \frac{1}{s}\sum_{i=1}^{s} \hat{p}_{[i]}. \qquad (5)$$
Here, it can be shown that, for each $i = 1, 2, \cdots, s$, $Z_i/m$ is the maximum likelihood estimator of $p_{[i]}$ and hence we get
$$\hat{p}_{RSS} = \frac{1}{ms}\sum_{i=1}^{s} Z_i = \bar{X}_{RSS},$$
where $\bar{X}_{RSS}$ is the overall mean of the ranked set sample. Let $Y_1, Y_2, \cdots, Y_n$ be the observations drawn according to the SRS design. Then the corresponding unbiased estimator of $p$ is
$$\hat{p}_{SRS} = \frac{1}{n}\sum_{i=1}^{n} Y_i.$$
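A minimal sketch of the two estimators follows, assuming the RSS data have already been reduced to the per-class counts $Z_i$ of vaccinated children. The function names and the counts are hypothetical and serve only to show that averaging the class-wise estimates $Z_i/m$ is the same as taking the overall sample mean.

```python
def p_hat_rss(z, m):
    """RSS estimate of p from per-rank counts z[i] of vaccinated children."""
    s = len(z)
    per_rank = [zi / m for zi in z]    # class-wise ML estimates Z_i / m
    return sum(per_rank) / s           # average over the s ranking classes

def p_hat_srs(y):
    """SRS estimate of p: the sample proportion of ones."""
    return sum(y) / len(y)

z = [3, 4, 5, 5]                 # hypothetical counts for s = 4 classes, m = 5 cycles
estimate = p_hat_rss(z, m=5)     # agrees with sum(z) / (m * s)
srs_estimate = p_hat_srs([1, 0, 1, 1, 1, 0, 1, 1, 1, 1])   # hypothetical SRS responses
```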

Effect of imperfect ranking
The discussion in the previous sections is based on the assumption that the ranking procedure produces the correct order statistics. However, a perfect ranking mechanism is very rare in practice, and some error in judgment ranking is inevitable. It is therefore necessary to study how robust the proposed procedure is when the rankings are not perfect. Estimation of $p$ under perfect and imperfect unbalanced RSS is discussed in [13,16,17].
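Let $X_{[i]}$ and $X_{(i)}$ denote, respectively, the $i$th judgment order statistic and the $i$th true order statistic when a set of $s$ units is ordered. In the presence of ranking error, $X_{[i]}$ is not necessarily equal to $X_{(i)}$. Let $\pi_{ij}$ denote the probability that the $i$th judgment order statistic actually has true rank $j$, for $i = 1(1)s$, $j = 1(1)s$. Assume that the $\pi_{ij}$'s satisfy the conditions
$$\sum_{i=1}^{s} \pi_{ij} = \sum_{j=1}^{s} \pi_{ij} = 1, \qquad \pi_{ij} = \pi_{ji}.$$
That is, $\pi = (\pi_{ij})$ is a doubly stochastic symmetric matrix of order $s \times s$. Under this assumption the distribution of $X_{[i]}$ changes to Binomial$(1, p^{*}_{[i]})$, for
$$p^{*}_{[i]} = \sum_{j=1}^{s} \pi_{ij}\, p_{[j]}, \quad i = 1(1)s,$$
where $p_{[j]}$ is the success probability of the $j$th true order statistic. Equivalently, in matrix form, $(p^{*}_{[1]}, \cdots, p^{*}_{[s]})' = \pi\,(p_{[1]}, \cdots, p_{[s]})'$. For the present purpose we take the particular form of $\pi$ given in [18,19], which is indexed by a parameter $\rho$. Here $\rho = 1$ corresponds to the case of perfect ranking. Under the above probability model for imperfect ranking, some consequential facts are justified below.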

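1. $\frac{1}{s}\sum_{i=1}^{s} p^{*}_{[i]} = p$, that is, the present imperfect ranking mechanism is consistent.
Justification: Since each column of $\pi$ sums to one,
$$\frac{1}{s}\sum_{i=1}^{s} p^{*}_{[i]} = \frac{1}{s}\sum_{j=1}^{s} p_{[j]} \sum_{i=1}^{s} \pi_{ij} = \frac{1}{s}\sum_{j=1}^{s} p_{[j]} = p.$$
2. For equal sample size, i.e., $n = ms$, one gets $Var(\hat{p}_{RSS}) \le Var(\hat{p}_{SRS})$ under ranked set sampling in the presence of ranking error.
Justification: Under ranking error,
$$Var(\hat{p}_{RSS}) = \frac{1}{ms^{2}}\sum_{i=1}^{s} p^{*}_{[i]}\left(1 - p^{*}_{[i]}\right).$$
Now, from the definition of the $p_{[i]}$'s, $\{p_{[i]}\}$ is a non-decreasing sequence and subsequently, for the above choice of the $\pi$-matrix, among the two sequences $\{a_i\}_{i=1(1)s}$ and $\{b_i\}_{i=1(1)s}$, with $a_i = p^{*}_{[i]}$ and $b_i = 1 - p^{*}_{[i]}$, one is non-decreasing and the other is non-increasing. So, from Chebyshev's inequality for the $a_i$'s and $b_i$'s we have
$$\frac{1}{s}\sum_{i=1}^{s} a_i b_i \le \left(\frac{1}{s}\sum_{i=1}^{s} a_i\right)\left(\frac{1}{s}\sum_{i=1}^{s} b_i\right) = p(1-p).$$
As we know that the variance of $\hat{p}$ in SRS is $\frac{p(1-p)}{n}$, the required justification follows from the fact that
$$Var(\hat{p}_{RSS}) = \frac{1}{ms}\cdot\frac{1}{s}\sum_{i=1}^{s} a_i b_i \le \frac{p(1-p)}{ms} = Var(\hat{p}_{SRS}).$$
The justification in the case of perfect ranking follows automatically by taking $\rho = 1$ in the above proof.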
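The two facts can be checked numerically under an explicitly assumed ranking-error matrix. The sketch below takes $\pi$ with $\rho$ on the diagonal and $(1-\rho)/(s-1)$ elsewhere, which is one simple doubly stochastic symmetric choice and not necessarily the form used in [18,19]. The values p = 0.8 and m = 60 are hypothetical, and the RSS variance is computed as the variance of the mean of independent Bernoulli variables with success probabilities $\sum_j \pi_{ij} p_{[j]}$.

```python
from math import comb

def order_stat_probs(s, p):
    """Success probabilities of the true order statistics of s Bernoulli(p) draws."""
    return [sum(comb(s, k) * p**k * (1 - p)**(s - k)
                for k in range(s - i + 1, s + 1))
            for i in range(1, s + 1)]

def ranking_matrix(s, rho):
    """ASSUMED illustrative form: rho on the diagonal, (1 - rho)/(s - 1) elsewhere."""
    off = (1 - rho) / (s - 1)
    return [[rho if i == j else off for j in range(s)] for i in range(s)]

def judgment_probs(pi, probs):
    """Success probability of each judgment class: sum_j pi[i][j] * probs[j]."""
    return [sum(w * q for w, q in zip(row, probs)) for row in pi]

def var_rss(p_star, m):
    """Variance of the RSS sample mean with m cycles and s = len(p_star) classes."""
    s = len(p_star)
    return sum(q * (1 - q) for q in p_star) / (m * s * s)

s, m, p = 4, 60, 0.8                 # hypothetical design values
var_srs = p * (1 - p) / (m * s)      # SRS variance for the same sample size n = m*s
checks = {}
for rho in (0.2, 0.6, 0.9, 1.0):
    p_star = judgment_probs(ranking_matrix(s, rho), order_stat_probs(s, p))
    checks[rho] = (sum(p_star) / s, var_rss(p_star, m), var_srs)
```

For every value of $\rho$ tried, the class probabilities average to $p$ and the RSS variance does not exceed the SRS variance, in line with facts 1 and 2.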

Comparison between $\hat{p}_{RSS}$ and $\hat{p}_{SRS}$
It is easy to argue that the ML estimates $\hat{p}_{[1]}, \hat{p}_{[2]}, \cdots, \hat{p}_{[s]}$ are statistically independent, as the variables $Z_i$ are independently distributed. Again, substituting the value $Z_i/m$ of $\hat{p}_{[i]}$ in Eq. (5), the estimate $\hat{p}_{RSS}$ can be shown to be identical with the overall mean of the given ranked set sample. It is also readily verified that $\hat{p}_{SRS}$ is an unbiased estimator of $p$.
The effectiveness and efficiency of the estimators based on simple random samples and ranked set samples are compared using two criteria, viz., relative precision (RP) and relative saving (RS). The expressions for RP, RS and the MSE of an estimator are
$$RP = \frac{MSE(\hat{p}_{SRS})}{MSE(\hat{p}_{RSS})}, \qquad RS = 1 - \frac{1}{RP}, \qquad MSE(\hat{p}) = Var(\hat{p}) + \left[Bias(\hat{p})\right]^{2}.$$

Result and discussion
Table 1 shows that the estimates of the proportion of measles-immunized children in Assam under SRS and RSS are very different. The RSS-based estimates, 0.80 and 0.92, are very close to the values in the Census report for Assam (2012) [20], which are 0.84 (rural) and 0.90 (urban), and have less variability than the SRS estimates, which are far from these reference values. The RSS estimate is found to be 58% and 142% more precise than the SRS estimate for the slum and non-slum regions, respectively. Here, a smaller value of ρ represents a higher ranking error in RSS. Even under imperfect ranking, for the choices ρ = 0.2, 0.6 and 0.9, the RSS estimates are 48%, 50% and 56% more precise than SRS for the slum region, and 139%, 140% and 141% more precise for the non-slum region, respectively. RSS also shows a saving of 37% (59%) under perfect ranking and a minimum of 32% (58%) under imperfect ranking in the slum (non-slum) region as compared to SRS.
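The savings quoted above can be cross-checked against the precision gains. The sketch below assumes the usual relation RS = 1 − 1/RP and reads a gain of, say, 58% as RP = 1.58. Both readings are assumptions, but under them the reported figures agree with one another.

```python
def relative_saving(rp):
    """Relative saving implied by a relative precision rp (assumes RS = 1 - 1/RP)."""
    return 1 - 1 / rp

# Precision gains of RSS over SRS as reported in the text (fractions, not percent).
perfect_gain = {"slum": 0.58, "non-slum": 1.42}
min_imperfect_gain = {"slum": 0.48, "non-slum": 1.39}

rs_perfect = {k: round(100 * relative_saving(1 + g)) for k, g in perfect_gain.items()}
rs_imperfect = {k: round(100 * relative_saving(1 + g)) for k, g in min_imperfect_gain.items()}
```

Rounded to whole percentages, these reproduce the 37% (59%) and 32% (58%) savings reported for the slum (non-slum) region.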

Conclusion
The present study reveals that RSS-based estimates, under both perfect and imperfect ranking, perform better than SRS-based estimates. In the context of estimating measles immunization coverage in the slum and non-slum regions of Assam, RSS-based estimates for different choices of the ranking accuracy (ρ) are more accurate, more precise and more efficient than those from the SRS procedure, which suggests that RSS is better than classical SRS. Therefore, based on the results obtained, the RSS procedure can be recommended for epidemiological applications and other health-related studies, where it can help in planning for a healthy and disease-free environment.