Machine Learning Algorithm-Based Contraceptive Practice among Ever-Married Women in Bangladesh: A Hierarchical Machine Learning Classification Approach

Iqramul Haq; Md. Ismail Hossain; Md. Moshiur Rahman; Md. Injamul Haq Methun; Ashis Talukder; Md. Jakaria Habib; Md. Sanwar Hossain

doi:10.5772/intechopen.103187

Abstract

Contraception enables women to exercise their human right to choose the number and spacing of their children. The present study identified the best model selection procedure and predicted contraceptive practice among women aged 15–49 years in the context of Bangladesh. The required information was taken from a well-known nationally representative secondary dataset, the Bangladesh Demographic and Health Survey (BDHS), 2014. To identify the best model, we applied a hierarchical logistic regression classifier in the machine learning process. Seven well-known ML algorithms, namely logistic regression (LR), random forest (RF), naïve Bayes (NB), least absolute shrinkage and selection operator (LASSO), classification trees (CT), AdaBoost, and neural network (NN), were applied to predict contraceptive practice. The validity computation findings showed that the highest accuracy, 79.34%, was achieved by the NN method. According to the values obtained from the ROC, NN (AUC = 86.90%) is considered the best method for this study. Moreover, NN (Cohen’s kappa statistic = 0.5626) shows the strongest discriminative ability. Based on our research, we suggest using the artificial neural network technique to predict contraceptive use among Bangladeshi women. Our results can help researchers who are trying to predict contraceptive practice.

Keywords

contraceptive
machine learning algorithms
LASSO
NN
hierarchical

Author Information

Iqramul Haq*
- Department of Agricultural Statistics, Sher-e-Bangla Agricultural University, Bangladesh
Md. Ismail Hossain
- Department of Statistics, Jagannath University, Bangladesh
Md. Moshiur Rahman
- Department of Pharmacology and Toxicology, Sher-e-Bangla Agricultural University, Bangladesh
Md. Injamul Haq Methun
- Department of Statistics, Tejgaon College, Bangladesh
Ashis Talukder
- Statistics Discipline, Khulna University, Bangladesh
Md. Jakaria Habib
- Department of Statistics, Jagannath University, Bangladesh
Md. Sanwar Hossain
- Department of Statistics, Jagannath University, Bangladesh

*Address all correspondence to: iqramul.haq@sau.edu.bd

1. Introduction

Family planning is indispensable in facilitating the prosperity and autonomy of women, their families, and their communities. Contraceptive choices, maternal and newborn health care, sexually transmitted infections, and sexual health are the main concepts of reproductive health [1]. In 2001, states agreed that target 5b of the Millennium Development Goals (MDGs) would call for universal access to reproductive health by 2015. At the end of the MDG period, global contraceptive prevalence was reported to be 64% (41% in low-income countries) and the global unmet need for family planning 12% (22% in low-income countries). Sustainable Development Goals (SDGs) targets 3.7 and 5.6 call for universal access to sexual and reproductive health care services and to sexual and reproductive health and reproductive rights, respectively [2, 3].

It has been estimated that the increase in contraceptive use has reduced maternal mortality globally by 30% [4]. Contraceptive use prevents unintended pregnancies, supports pregnancy spacing, and reduces high-risk pregnancies [5, 6, 7]. Current studies show that contraceptive use could avert nearly 230 million births every year by preventing unwanted pregnancies [8]. As a result, the use of contraception improves the health of women and their children [6, 9]. However, the prevalence of contraceptive practice varies between 11.3% and 72.1% across countries: Mozambique, 11.3%; Ghana, 21.5%; Bangladesh (modern method), 54.0%; and Sweden, 72.1% [9, 10, 11, 12].

Previous research has shown that various variables are significantly associated with contraceptive use, such as maternal age, maternal and husband’s educational level, wealth status, maternal age at first marriage, and so on [11, 13]. Through the promotion of family planning, appropriate diagnostics, and interventions, the prevalence of contraceptive use is increasing. Popular statistical methods (binary logistic regression) have been applied to determine important indicators of contraceptive use among women. The main goal of the present study, however, is to predict contraceptive practice among women aged between 15 and 49 in Bangladesh. Machine learning is a scientific method that can build models for prediction purposes. According to the research, traditional statistical procedures have been shown to be less effective for this form of modeling, whereas machine learning approaches have long been shown to be more successful and promising in handling a variety of complicated and nonlinear problems [14, 15, 16].

However, few studies have explored machine learning methods to develop predictive models of contraceptive use. Therefore, in this study, various well-known machine learning algorithms were applied to predict contraceptive practice among 15–49-year-old women in Bangladesh. Before prediction, we applied a hierarchical logistic regression classifier within the machine learning approach to select potential risk factors associated with the contraceptive practice of women. To the best of our knowledge, this is among the first applications of a machine learning classifier approach to contraceptive practice in the Bangladeshi context, and it may assist future data scientists.

2. Methods

2.1 Data source

In this study, the necessary information has been extracted from a representative secondary national data set, the Bangladesh Demographic and Health Survey (BDHS), 2014. This survey was carried out through a joint effort of the National Institute of Population Research and Training (Bangladesh), Mitra Associates (Bangladesh), and ICF International (USA).

The complete list of enumeration areas (EAs) covering the entire country, provided by the Bangladesh Bureau of Statistics (BBS) for the 2011 population and housing census of the People’s Republic of Bangladesh, served as the sampling frame for the 2014 BDHS. An EA was a geographical zone with an average of 120 households. The survey used a two-stage stratified sampling process that includes information on the EA region, residence (urban or rural), and the number of residential households counted. Viable interviews were conducted in 98% of the selected households (out of 17,989 total). For this study, 17,863 ever-married women aged 15–49 years were included in the final analysis. The detailed sampling procedure of the 2014 BDHS is described in the final published report of the survey [17].

2.2 Dependent variable

Since the main purpose of this study was to predict contraception practice among women aged 15–49 years, the response variable was “current contraception use”, which was classified as “Yes or No”. If the respondent currently utilizes a contraceptive method, she falls into the “Yes” group, otherwise, she falls into the “No” group.

2.3 Independent variables

In addition to the response variable, a set of 21 demographic and socioeconomic risk factors associated with contraceptive practice was included in the analysis as predictor variables. Several studies found that demographic and socioeconomic characteristics such as current age, division, religion, residence, respondent’s working status, FP media exposure, age at first marriage, currently breastfeeding, wealth status, women’s education, husband’s education, children ever born, number of living children, ideal number of children, fertility preference, marital status, and decision making for using contraception are potential risk factors that determine contraception practice among women [10, 11, 18, 19, 20, 21, 22, 23, 24]. The independent variables and their measures are listed in Table 1.

No.	Variables	Measures
1	Women current age (years)	15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49
2	Division	Barisal, Chittagong, Dhaka, Khulna, Rajshahi, Rangpur, Sylhet
3	Religion	Islam, other
4	Sex of household head	Male, female
5	Residence	Urban, rural
6	Respondent working status	No, yes
7	Family planning (FP) media exposure	No, yes
8	Age at first marriage	<18, 18+
9	Currently breastfeeding	No, yes
10	Currently amenorrhoeic	No, yes
11	Currently abstaining	No, yes
12	Wealth status	Poor, middle, rich
13	Women education	No education, primary education, secondary+
14	Husband education	No education, primary education, secondary+
15	Sexually transmitted infection (STI)	No, yes
16	Children ever born	0–1, 2–3, 4+
17	Number of living children	None, 1–2, 3+
18	Ideal number of children	0–1, 2–3, 4+
19	Fertility preference	No more, have another, undecided, declared infecund, sterilized
20	Marital Status	Married, others
21	Decision making for using contraception	Respondent, others

Table 1.

Description of independent variables.

2.4 Statistical analysis

The frequency distribution was used to describe the background characteristics of the respondents. In this study, we developed a hierarchical logistic regression classifier within a machine learning approach to select potential risk factors related to the contraceptive practice of women in Bangladesh, using the largest value of AUC (p < 0.05) as the selection criterion. Hierarchical learning, which is inspired by human learning, is one of the procedures for enhancing the performance of machine learning [25]. The DeLong test is widely used to compare two AUCs [26]. The model that was significant and had the largest AUC value was considered the final model in this analysis. The steps are depicted in Figure 1.

Figure 1.
Flow diagram for hierarchical logistic regression classifier in the machine learning process.
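
The chapter does not show how the DeLong comparison was computed, so the following is only a minimal sketch of the standard two-sided DeLong test for two correlated AUCs, written in plain Python rather than the R software the authors report using. The function names (`delong_test`, `_components`, `_cov`, `_psi`) and the toy scores are hypothetical and chosen for illustration; both score vectors are assumed to come from two models evaluated on the same cases, with labels coded 1 (using) and 0 (not using).

```python
import math

def _psi(x, y):
    """Mann-Whitney kernel: 1 if the positive outscores the negative, 0.5 on a tie."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0

def _components(scores, labels):
    """Return (AUC, V10, V01): the AUC and its per-case structural components."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(scores_a, scores_b, labels):
    """Two-sided DeLong test; returns (auc_a, auc_b, z, p_value)."""
    auc_a, v10_a, v01_a = _components(scores_a, labels)
    auc_b, v10_b, v01_b = _components(scores_b, labels)
    m, n = len(v10_a), len(v01_a)  # numbers of positive and negative cases
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var <= 0:
        # Degenerate case (e.g. identical rankings): the statistic is undefined,
        # so this sketch reports no evidence of a difference.
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value

# Toy example with made-up scores (not BDHS data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1]
model_b = [0.6, 0.3, 0.7, 0.2, 0.5, 0.4, 0.8, 0.1]
print(delong_test(model_a, model_b, labels))
```

The pairwise kernel makes this version quadratic in the sample size, which is acceptable for a sketch but slow on a survey of this size; production implementations typically use a rank-based formulation.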

To meet the objective of the study, we fitted a sequence of models using the hierarchical logistic regression classifier in the machine learning process, where the full model is denoted by Mᵢ (where i = 21). The steps are described below:

Step 1: Consider the jth model, Mⱼ (j = 1, 2, 3, …, i), which consists of j predictors. The initial model, Model 1, is M₁ (j = 1); fit M₁ using the machine learning logistic classifier (MLLC).

Step 2: Add one variable to the previous model to define Mⱼ₊₁, and fit Mⱼ₊₁ using the MLLC approach.

Step 3: Identify the best model using DeLong’s test, taking the model with the larger area under the curve at the 5% level of significance.

Step 4: If the AUC of Mⱼ₊₁ is larger than that of Mⱼ at the 5% level of significance, then Mⱼ₊₁ has a significantly different AUC from Mⱼ (p < 0.05). In this case, Mⱼ₊₁ is taken as the best model; otherwise, Mⱼ is retained.

Step 5: The process is repeated successively until the desired number of risk factors/features is identified.
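
The selection loop in Steps 1 to 5 can be sketched as follows. This is an illustrative Python outline, not the authors' code: `hierarchical_select` and the `p_value_for` hook are hypothetical names, and in a real analysis the hook would fit the current and the enlarged logistic models and return the DeLong p-value. Here, purely to show the decision rule, the hook looks up the p-values reported in Table 4 of this chapter, with the variables in the order of Table 3.

```python
def hierarchical_select(variables, p_value_for, alpha=0.05):
    """Add variables one at a time, keeping each only if the AUC change is significant.

    p_value_for(selected, candidate) is a caller-supplied (hypothetical) hook that
    returns the DeLong p-value for the current model versus the enlarged model.
    """
    selected = [variables[0]]            # M1 starts with a single predictor
    rejected = []
    for candidate in variables[1:]:
        if p_value_for(selected, candidate) < alpha:
            selected.append(candidate)   # the enlarged model becomes the current model
        else:
            rejected.append(candidate)   # the current model is retained
    return selected, rejected

# p-values as reported in Table 4, keyed by the variable added at each step.
reported_p = {
    "division": 0.000, "religion": 0.031, "sex of household head": 0.000,
    "residence": 0.041, "respondent working status": 0.025,
    "FP media exposure": 0.545, "age at first marriage": 0.008,
    "currently breastfeeding": 0.000, "currently amenorrhoeic": 0.000,
    "currently abstaining": 0.000, "wealth status": 0.472,
    "women education": 0.029, "husband education": 0.030,
    "sexually transmitted infection (STI)": 0.033, "children ever born": 0.000,
    "number of living children": 0.000, "ideal number of children": 0.000,
    "fertility preference": 0.000, "marital status": 0.000,
    "decision making for using contraception": 0.000,
}
order = ["respondent age"] + list(reported_p)
selected, rejected = hierarchical_select(order, lambda sel, cand: reported_p[cand])
print(len(selected), rejected)
```

Because each comparison is made against the most recently accepted model, the result depends on the order in which variables are offered; the chapter describes the starting variable as arbitrary.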

After selecting the final model, we applied seven popular machine learning classifiers to predict contraceptive practice among ever-married women aged 15–49 in Bangladesh: Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), Least Absolute Shrinkage and Selection Operator (LASSO), Classification Trees (CT), AdaBoost, and Neural Network (NN). A detailed description of the algorithms used is available in the literature [27, 28, 29, 30, 31, 32].

The Statistical Package for Social Science (SPSS) version 25 and R version 4.0.0 software were used for data management and analysis.

2.5 Proposed approach

Data from ever-married women aged 15–49 were used in this study; only women meeting this criterion were considered for the final analysis. Data preparation began with identifying missing data in the overall dataset. It is well known that the main drawback of missing information in a dataset is reduced statistical power (because it reduces the number of samples n, the estimates will have larger standard errors). Numerous methods are now available for handling missing values, including direct deletion, mode imputation, hot-deck imputation, and so on [33]. A lower threshold of 5% missingness has been suggested in the literature [34]. Because this study had a low rate of missing values, we used the direct deletion method: records with missing values were removed, and the analysis was conducted on the remaining complete data set. The usual next step after missing value processing is to normalize/standardize the variables, which is useful when the data distribution is unknown. However, normalization is not required for every machine learning approach, particularly when the data are categorical, as they are here. Finally, all machine learning classifiers included in this study were trained on 70% of the respondents in each group (training data set, n = 12,504) and evaluated on the remaining 30% (test data set, n = 5358). We performed 10-fold cross-validation on the training set and estimated performance on the test set. The development of the seven machine learning classifiers is depicted in Figure 2.

Figure 2.
Flow chart of the development of the seven machine learning classifiers.
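
The preparation steps described above (direct deletion, a 70/30 split within each outcome group, and 10-fold cross-validation on the training portion) can be illustrated with a short standard-library Python sketch. The function names, the record layout, the random seed, and the toy records are all assumptions made for illustration; the authors' own R and SPSS workflow is not shown in the chapter.

```python
import random

def drop_incomplete(records):
    """Direct (listwise) deletion: keep only records with no missing field."""
    return [r for r in records if all(v is not None for v in r.values())]

def stratified_split(records, target, train_frac=0.7, seed=1):
    """Split into train/test, sampling train_frac within each outcome class."""
    rng = random.Random(seed)
    by_class = {}
    for r in records:
        by_class.setdefault(r[target], []).append(r)
    train, test = [], []
    for rows in by_class.values():
        rows = rows[:]                      # copy so the caller's list is untouched
        rng.shuffle(rows)
        cut = round(train_frac * len(rows))
        train.extend(rows[:cut])
        test.extend(rows[cut:])
    return train, test

def k_fold_indices(n, k=10, seed=1):
    """Yield (train_idx, validation_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i in range(k):
        rest = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield rest, folds[i]

# Toy records (not BDHS data): 100 women, one with a missing predictor.
records = [{"age_group": "25-29", "use": "yes" if i % 5 < 3 else "no"}
           for i in range(100)]
records[0]["age_group"] = None
complete = drop_incomplete(records)
train, test = stratified_split(complete, target="use")
print(len(complete), len(train), len(test))
for fit_idx, val_idx in k_fold_indices(len(train)):
    pass  # fit a classifier on fit_idx and score it on val_idx here
```

Splitting within each outcome class keeps the proportion of users and non-users similar in the training and test sets, which matches the "70% of the respondents in each group" wording above.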

2.6 Model evaluation

We used the following criteria to evaluate the ML algorithms’ performance: confusion matrix, receiver operating characteristic (ROC), and the area under that curve (AUC). Generally, a confusion matrix has four possible prediction outcomes: TP = true positives, TN = true negatives, FP = false positives, and FN = false negatives. Several performance measures, including accuracy, precision, recall, and the F1 score, are usually calculated from these four outcomes to assess the classifier. The ROC curves were calculated from the predicted outcomes and the true outcomes. To examine the ML algorithms’ discriminating power, the AUC of the ROC was averaged for the test data sets [35]. Theoretically, the AUC lies between 0 and 1, with 1 being the value for an ideal classifier. Since the usual lower bound for random classification is 0.5, a classifier with an AUC greater than 0.5 has at least some capacity to separate cases from non-cases [36]. In addition to these measures, we also used Cohen’s kappa statistic, which measures the agreement between two raters beyond what would be expected by chance. Here it is calculated from the predicted and the actual classifications in a data set. The maximum value of Cohen’s kappa statistic is 1, which indicates perfect agreement.
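
As a worked illustration of how these measures follow from the four confusion-matrix counts, the short Python sketch below computes accuracy, precision, recall, specificity, F1, and Cohen's kappa. The function name `classification_metrics` and the counts in the example are hypothetical; they are not taken from the BDHS analysis.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common performance measures from confusion-matrix counts."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)          # share of predicted users who are users
    recall = tp / (tp + fn)             # share of actual users who are detected
    specificity = tn / (tn + fp)        # share of actual non-users who are detected
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance_yes = ((tp + fp) / n) * ((tp + fn) / n)
    chance_no = ((fn + tn) / n) * ((fp + tn) / n)
    expected = chance_yes + chance_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy, "precision": precision, "recall": recall,
        "specificity": specificity, "f1": f1, "kappa": kappa,
    }

# Made-up counts for illustration only.
for name, value in classification_metrics(tp=80, tn=70, fp=30, fn=20).items():
    print(f"{name:12s}{value:.4f}")
```

With these made-up counts the observed agreement is 0.75 and the chance agreement is 0.50, so kappa is 0.50: the classifier closes half of the gap between chance and perfect agreement.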

3. Results

3.1 Sociodemographic characteristics of women

Table 2 shows the percentage distribution of women in Bangladesh according to selected socio-demographic characteristics. The largest age group (19%) is women between the ages of 25 and 29. The largest share of women (35%) are from the Dhaka division, and most are Muslim (90%), live in male-headed households (89%), and live in rural areas (72%). In terms of working status, slightly more than two-thirds (67%) of women are not currently involved in any kind of income-generating activity, and 80% of them do not have any FP media exposure. The majority of women (77%) married before their 18th birthday, and 79% were not breastfeeding their children at the time of the survey. The findings also show that most women are not amenorrhoeic (96%) or abstaining (97%). In terms of wealth status, 42% of the women were from rich families. Nearly half of the women (46%) had secondary or higher education, as did 44% of the husbands. The number of women who knew about sexually transmitted infections (STIs) was found to be 67%. The largest group of women (46%) have had 2–3 children, while 53% have 1–2 living children. The ideal number of children was 2–3 for 86% of women, and more than half (57%) were not interested in having another child. The vast majority of women are currently married (94%), and only 9% can make the decision to use a contraceptive method on their own. Regarding contraception use, according to the 2014 BDHS, 58.9% of women were using a method.

Characteristics	Sample women (No.)	Sample women (%)
Women current age (years)
15–19	2029	11.4
20–24	3224	18.0
25–29	3390	19.0
30–34	3047	17.1
35–39	2315	13.0
40–44	2092	11.7
45–49	1766	9.9
Division
Barisal	1111	6.2
Chittagong	3301	18.5
Dhaka	6223	34.8
Khulna	1838	10.3
Rajshahi	2103	11.8
Rangpur	2056	11.5
Sylhet	1232	6.9
Religion
Islam	16,096	90.1
Other	1767	9.9
Sex of household head
Male	15,854	88.8
Female	2009	11.2
Residence
Urban	5047	28.3
Rural	12,816	71.7
Respondent working status
No	11,947	66.9
Yes	5912	33.1
Family planning (FP) media exposure
No	14,316	80.1
Yes	3547	19.9
Age at first marriage
<18	13,657	76.5
18+	4206	23.5
Currently breastfeeding
No	14,033	78.6
Yes	3830	21.4
Currently amenorrhoeic
No	17,054	95.5
Yes	809	4.5
Currently abstaining
No	17,341	97.1
Yes	522	2.9
Wealth status
Poor	6767	37.9
Middle	3560	19.9
Rich	7536	42.2
Women education
No education	4455	24.9
Primary	5209	29.2
Secondary +	8199	45.9
Husband education
No education	5189	29.0
Primary	4289	27.3
Secondary+	7795	43.6
Sexually transmitted infection (STI)
No	11,947	66.9
Yes	5912	33.1
Children ever born
0–1	5670	31.7
2–3	8139	45.6
4+	4054	22.7
Number of living children
None	1814	10.2
1–2	9478	53.1
3+	6571	36.8
Ideal number of children
0–1	1127	6.3
2–3	15,308	85.7
4+	1429	8.0
Fertility preference
No more	9555	56.7
Have another	5293	31.4
Undecided	462	2.7
Declared infecund	561	3.3
Sterilized	986	5.8
Marital Status
Married	16,858	94.4
Others	1005	5.6
Decision making for using contraception
Respondent	1515	8.5
Others	16,348	91.5
Contraception use status
Using	10,527	58.9
Not Using	7336	41.1

Table 2.

Percentage distribution of ever-married women aged between 15 and 49 by selected socio-demographic characteristics.

3.2 Create model

In the initial step of the analysis, we applied hierarchical logistic regression to select the final model. Each step adds one potential risk factor (variable) to the previous model, and the result is treated as a new model (Table 3). For example, the initial model M1 contained respondent age (an arbitrary choice), and M1 + division was considered as M2. Similarly, we formed each further model by adding one variable to the previous model until the desired number of models was reached. The details are presented in Table 3.

Model	Model
M₁ = respondent age	M₁₂ = M₁₁ + wealth status
M₂ = M₁ + division	M₁₃ = M₁₂ + women education
M₃ = M₂ + religion	M₁₄ = M₁₃ + husband education
M₄ = M₃ + Sex of household head	M₁₅ = M₁₄ + sexually transmitted infection (STI)
M₅ = M₄ + residence	M₁₆ = M₁₅ + children ever born
M₆ = M₅ + respondent working status	M₁₇ = M₁₆ + number of living children
M₇ = M₆ + FP media exposure	M₁₈ = M₁₇ + ideal number of children
M₈ = M₇ + age at first marriage	M₁₉ = M₁₈ + fertility preference
M₉ = M₈ + currently breastfeeding	M₂₀ = M₁₉ + marital status
M₁₀ = M₉ + currently amenorrhoeic	M₂₁ = M₂₀ + decision making for using contraception
M₁₁ = M₁₀ + currently abstaining

Table 3.

Models created using the hierarchical approach.

3.3 Best model selection

All models were statistically significant (p < 0.05) except models M7 and M12. Based on the DeLong test, we excluded two variables (FP media exposure and wealth status) from our final analysis. The remaining significant variables were considered risk factors for predicting contraceptive practice among women aged 15–49 years in Bangladesh. As shown in Table 4, Model M21 was the final model, and its selected risk factors were used for the final analysis. The details of the best model selection procedure are given in Table 4.

Model	AUC	DeLong’s test for AUC (p-value)	Decision	Model selection
M₁	0.629	−9.26 (0.000)	M₂ has a significantly different AUC from M₁	M₂ is selected
M₂	0.660	−9.26 (0.000)	M₂ has a significantly different AUC from M₁	M₂ is selected
M₃	0.662	−2.16 (0.031)	M₃ had a significantly different AUC from M₂	M₃ is selected
M₄	0.713	−16.21 (0.000)	M₄ had a significantly different AUC from M₃	M₄ is selected
M₅	0.714	−2.03 (0.041)	M₅ had a significantly different AUC from M₄	M₅ is selected
M₆	0.715	−2.24 (0.025)	M₆ had a significantly different AUC from M₅	M₆ is selected
M₇	0.716	−0.61 (0.545)	M₇ did not have a significantly different AUC from M₆	M₇ is not selected
M₈	0.716	−2.63 (0.008)	M₈ had a significantly different AUC from M₆	M₈ is selected
M₉	0.723	−4.34 (0.000)	M₉ had a significantly different AUC from M₈	M₉ is selected
M₁₀	0.762	−14.38 (0.000)	M₁₀ had a significantly different AUC from M₉	M₁₀ is selected
M₁₁	0.773	−8.05 (0.000)	M₁₁ had a significantly different AUC from M₁₀	M₁₁ is selected
M₁₂	0.773	−0.72 (0.472)	M₁₂ did not have a significantly different AUC from M₁₁	M₁₂ is not selected
M₁₃	0.774	−2.22 (0.029)	M₁₃ had a significantly different AUC from M₁₁	M₁₃ is selected
M₁₄	0.775	−2.17 (0.030)	M₁₄ had a significantly different AUC from M₁₃	M₁₄ is selected
M₁₅	0.776	−2.13 (0.033)	M₁₅ had a significantly different AUC from M₁₄	M₁₅ is selected
M₁₆	0.799	−11.81 (0.000)	M₁₆ had a significantly different AUC from M₁₅	M₁₆ is selected
M₁₇	0.813	−9.26 (0.000)	M₁₇ had a significantly different AUC from M₁₆	M₁₇ is selected
M₁₈	0.816	−4.74 (0.000)	M₁₈ had a significantly different AUC from M₁₇	M₁₈ is selected
M₁₉	0.828	−11.45 (0.000)	M₁₉ had a significantly different AUC from M₁₈	M₁₉ is selected
M₂₀	0.847	−14.69 (0.000)	M₂₀ had a significantly different AUC from M₁₉	M₂₀ is selected
M₂₁	0.866	−21.75 (0.000)	M₂₁ had a significantly different AUC from M₂₀	M₂₁ is selected

Table 4.

Best model selection based on DeLong’s test.

3.4 Performance parameter of machine learning algorithms

This study used seven different machine learning algorithms to classify contraceptive practice among married women, using both a training dataset and an experimental/test dataset. Performance parameters (such as accuracy, precision, recall, F1, specificity, and AUC value) were used to compare the predictive performance of these algorithms. In addition, Cohen’s kappa statistic was used to assess the discriminative accuracy of each algorithm. The prediction results with performance parameters for each algorithm are shown in Table 5 and Figure 3.

Model name	Accuracy (95% CI)	Cohen’s kappa	Precision	Recall	F1	AUC	Specificity
LR	78.52 (77.39, 79.61)	0.5559	81.23	82.39	81.81	86.57	73.03
RF	77.57 (76.43, 78.68)	0.5288	78.32	85.35	81.69	84.07	66.53
NB	76.56 (75.40, 77.69)	0.4995	75.73	88.32	81.54	84.17	59.90
LASSO	79.08 (77.96, 80.16)	0.5601	79.39	86.85	82.96	86.59	68.06
CT	78.57 (77.45, 79.67)	0.5464	78.16	88.06	82.81	85.59	65.13
AdaBoost	78.50 (77.37, 79.59)	0.5523	80.20	84.08	82.10	86.15	70.59
NN	79.34 (78.23, 80.42)	0.5626	78.71	88.76	83.44	86.90	65.99

Table 5.

Performance evaluation for seven ML algorithms (test data set).

Figure 3.
Area under curve of all seven machine learning classifiers.

Table 5 shows that the logistic regression classifier had an accuracy of 78.52%. The precision and recall of the fitted model were 81.23% and 82.39%, respectively, while the F1 score was 81.81%. The area under the curve (AUC) was 86.57%. The random forest achieved an accuracy of 77.57%, with a precision, recall, and F1 score of 78.32%, 85.35%, and 81.69%, respectively; its AUC was 84.07%. The naïve Bayes classifier had an accuracy of 76.56%, with a precision of 75.73% and a recall of 88.32%; its F1 score and AUC value were 81.54% and 84.17%, respectively. Using Least Absolute Shrinkage and Selection Operator (LASSO) analysis, the accuracy in the test data set was 79.08%, with precision and recall of 79.39% and 86.85%, respectively, and an F1 score of 82.96%. On the test data, the classification tree method showed 78.57% accuracy in predicting contraceptive practice among married women, with a precision of 78.16%, a recall of 88.06%, an F1 score of 82.81%, and an AUC value of 85.59%. For AdaBoost, the values in Table 5 are 78.50% (accuracy), 80.20% (precision), 84.08% (recall), 82.10% (F1 score), and 86.15% (AUC). Finally, we used an artificial neural network and obtained an accuracy of 79.34%, with precision, recall, F1 score, and AUC of 78.71%, 88.76%, 83.44%, and 86.90%, respectively. Among the seven classifiers, NN gave the best performance in terms of both accuracy and AUC, with a Cohen’s kappa value of 0.5626.
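
Because the F1 score is the harmonic mean of precision and recall, the F1 column of Table 5 can be cross-checked from the other two columns. The sketch below does this in Python using the precision, recall, and F1 values exactly as printed in Table 5; the helper name `f1_from` is hypothetical, and small differences are expected because the published figures are rounded to two decimals.

```python
# (precision, recall, reported F1) in percent, as printed in Table 5.
table5 = {
    "LR": (81.23, 82.39, 81.81),
    "RF": (78.32, 85.35, 81.69),
    "NB": (75.73, 88.32, 81.54),
    "LASSO": (79.39, 86.85, 82.96),
    "CT": (78.16, 88.06, 82.81),
    "AdaBoost": (80.20, 84.08, 82.10),
    "NN": (78.71, 88.76, 83.44),
}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

for name, (precision, recall, reported) in table5.items():
    recomputed = f1_from(precision, recall)
    print(f"{name:9s} recomputed F1 = {recomputed:.2f}  reported F1 = {reported:.2f}")
```

A check of this kind is a quick way to catch transcription slips when metric values are copied between a table and running text.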

Figure 4 is a violin plot of the accuracy of the seven classifiers across the 10 cross-validation folds. The shaded areas show the distribution of the accuracy values for each classifier. NN provided the highest mean accuracy, followed by LASSO and AdaBoost. Unlike a boxplot, the violin plot displays the entire distribution of the 10-fold accuracy.

Figure 4.
Violin plots of the 10-fold cross-validation.

4. Discussion

To our knowledge, this is the first study to use a hierarchical logistic classifier in a machine learning approach to contraceptive practice. The predictive performance of the hierarchical logistic classifier was then compared with that of six other machine learning algorithms. In this study, the use of contraception among ever-married women in Bangladesh was predicted using sociodemographic factors. This study can provide policymakers and academics with a starting point for examining key patterns in a larger framework and for designing meaningful interventions.

The study found that the prevalence of contraception was almost 59% in Bangladesh. The prevalence rate of contraceptives in India is 54%, while the rates were 47%, 34%, and 65%, respectively, for Nepal, Pakistan, and Sri Lanka [37, 38]. As the government of Bangladesh is committed to the London Summit on Family Planning to improve contraceptive access and use among impoverished people in both urban and rural areas [39], the findings of this study will provide grounding direction for the increase in the prevalence of contraception.

In this study, we used hierarchical LR, RF, NB, LASSO, CT, AdaBoost, and NN machine learning techniques to predict contraceptive practice among ever-married women in Bangladesh. The aim of the current analysis was to evaluate which technique performed best in terms of the accurate prediction rate of contraceptive use on the 2014 BDHS data set. Moreover, we found no evidence of a previous scientific study that used a hierarchical logistic classifier together with several supervised learning algorithms. In this study, 70% of the respondents were used for model tuning, which was performed using 10-fold cross-validation on the training dataset, and the remaining 30% were used to check model performance. Cross-validation is the approach most commonly used to evaluate model performance [40]. The prediction of contraceptive use was assessed by comparing performance parameters (such as accuracy, precision, recall, F1, and AUC value) across the seven machine learning classifiers. Cohen’s kappa, which measures the agreement between the predicted and actual classifications in the dataset, was also used to assess model quality. Among the models used, the neural network outperformed the others with an accuracy of 79.34%. In terms of Cohen’s kappa, the neural network also provided the best predictive performance (Cohen’s κ = 0.5626). This indicates that the neural network achieved better performance than LR, RF, LASSO, NB, CT, and AdaBoost. Hailemariam et al. proposed a J48 decision tree that performed better than naïve Bayes in predicting contraceptive practice in Ethiopian women [41]; however, they did not use a neural network in their study [41]. In a data mining study in India, the CART model produced satisfactory results for finding the predictors of contraception use among married women [42]. Vaz and colleagues, however, found that the random forest model was the most accurate model for predicting women’s fertile periods [43]. According to a study conducted in Nigeria, machine learning algorithms can be quite helpful in predicting infertility in women [44].

5. Conclusions

In this paper, we investigated the hierarchical logistic regression classifier in machine learning approaches to identify potential risk factors related to the contraceptive practice of women in Bangladesh. In summary, all of the selected covariates except FP media exposure and wealth status were significant determinants of contraceptive practice according to the hierarchical logistic regression classifier based on the DeLong test. We then compared seven supervised machine learning algorithms to predict contraceptive practice among ever-married women aged between 15 and 49 years in Bangladesh. The NN model exhibited the best results based on the performance parameters, with an accuracy of 79.34%, a precision of 78.71%, a recall of 88.76%, an F1 score of 83.44%, and an AUC value of 86.90%. Among the seven algorithms, the NN model performs the best in terms of accuracy, Cohen’s kappa statistic, and area under the curve (AUC). This study recommends the use of the NN model, and policymakers should consider supporting the continuation of this work in the future.

Acknowledgments

Special thanks go to the Demographic and Health Surveys program for enabling us to use Bangladesh Demographic and Health Survey data for our study from https://dhsprogram.com/data/.

Conflicts of interest

The authors declare that they have no competing interests.

Funding

This study did not receive funding.

Data availability

This study analyzed secondary data, which are available at “https://dhsprogram.com/data/”.

References

1. United Nations Population Fund. Sexual and Reproductive Health for all: Reducing Poverty, Advancing Development and Protecting Human Rights. New York, New York, United States: United Nations Population Fund; 2010
2. United Nations. Transforming our World: The 2030 Agenda for Sustainable Development United Nations. 2015. Available from: https://sustainabledevelopment.un.org/content/documents/21252030%20Agenda%20for%20Sustainable%20Development%20web.pdf
3. World Health Organization. Health-Related Millennium Development Goals. 2015. Available from: https://www.who.int/gho/publications/world_health_statistics/EN_WHS2015_Part1.pdf?ua=1
4. Cleland J, Conde-Agudelo A, Peterson H, Ross J, Tsui A. Contraception and health. The Lancet. 2012;380(9837):149-156. DOI: 10.1016/s0140-6736(12)60609-6
5. Ahmed S, Li Q, Liu L, Tsui AO. Maternal deaths averted by contraceptive use: An analysis of 172 countries. The Lancet. 2012;380(9837):111-125. DOI: 10.1016/S0140-6736(12)60478-4
6. Brunner Huber LR, Smith K, Sha W, Vick T. Interbirth interval and pregnancy complications and outcomes: Findings from the pregnancy risk assessment monitoring system. Journal of Midwifery & Women’s Health. 2018;63(4):436-445. DOI: 10.1111/jmwh.12745
7. Darroch J, Singh S. Estimating Unintended Pregnancies Averted from Couple-Years of Protection (CYP). 2011. Available from: https://www.guttmacher.org/sites/default/files/page_files/guttmacher-cyp-memo.pdf
8. Liu L, Becker S, Tsui A, Ahmed S. Three methods of estimating births averted nationally by contraception. Population Studies. 2008;62(2):191-210. DOI: 10.1080/00324720801897796
9. Yazdkhasti M, Pourreza A, Pirak A, Abdi F. Unintended pregnancy and its adverse social and economic consequences on health system: A narrative review article. Iranian Journal of Public Health. 2015;44(1):12-21
10. Aviisah PA, Dery S, Atsu BK, Yawson A, Alotaibi RM, Rezk HR, et al. Modern contraceptive use among women of reproductive age in Ghana: Analysis of the 2003–2014 Ghana demographic and health surveys. BMC Women’s Health. 2018;18(1):1-10. DOI: 10.1186/s12905-018-0634-9
11. Haq I, Sakib S, Talukder A. Sociodemographic factors on contraceptive use among ever-married women of reproductive age: Evidence from three demographic and health surveys in Bangladesh. Medical Sciences. 2017;5(4):31. DOI: 10.3390/medsci5040031
12. Kopp Kallner H, Thunell L, Brynhildsen J, Lindeberg M, Gemzell Danielsson K. Use of contraception and attitudes towards contraceptive use in Swedish women—A Nationwide survey. PLoS One. 2015;10(5):e0125990. DOI: 10.1371/journal.pone.0125990
13. Mandiwa C, Namondwe B, Makwinja A, Zamawe C. Factors associated with contraceptive use among young women in Malawi: Analysis of the 2015–16 Malawi demographic and health survey data. Contraception and Reproductive Medicine. 2018;3(1):12-19. DOI: 10.1186/s40834-018-0065-x
14. Moazenzadeh R, Mohammadi B, Shamshirband S, Chau K. Coupling a firefly algorithm with support vector regression to predict evaporation in northern Iran. Engineering Applications of Computational Fluid Mechanics. 2018;12(1):584-597. DOI: 10.1080/19942060.2018.1482476
15. Mousa SR, Bakhit PR, Osman OA, Ishak S. A comparative analysis of tree-based ensemble methods for detecting imminent lane change maneuvers in connected vehicle environments. Transportation Research Record: Journal of the Transportation Research Board. 2018;2672(42):268-279. DOI: 10.1177/0361198118780204
16. Zhang Y, Haghani A. A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies. 2015;58:308-324. DOI: 10.1016/j.trc.2015.02.019
17. NIPORT, Mitra and Associates, & ICF International. Bangladesh Demographic and Health Survey 2014. Bangladesh: NIPORT, Mitra and Associates, and ICF International; 2016
18. Johnson EO. Determinants of modern contraceptive uptake among Nigerian women: Evidence from the National Demographic and health survey. African Journal of Reproductive Health. 2017;21(3):89-95. DOI: 10.29063/ajrh2017/v21i3.8
19. Gebre MN, Edossa ZK. Modern contraceptive utilization and associated factors among reproductive-age women in Ethiopia: Evidence from 2016 Ethiopia demographic and health survey. BMC Women’s Health. 2020;20(1):1-14. DOI: 10.1186/s12905-020-00923-9
20. Islam AZ, Mondal MNI, Khatun ML, Rahman MM, Islam MR, Mostofa MG, et al. Prevalence and determinants of contraceptive use among employed and unemployed women in Bangladesh. International Journal of MCH and AIDS. 2016;5(2):92-102. DOI: 10.21106/ijma.83
21. Kidayi PL, Msuya S, Todd J, Mtuya CC, Mtuy T, Mahande MJ. Determinants of modern contraceptive use among women of reproductive age in Tanzania: Evidence from Tanzania demographic and health survey data. Advances in Sexual Medicine. 2015;05(03):43-52. DOI: 10.4236/asm.2015.53006
22. Solanke BL. Factors influencing contraceptive use and non-use among women of advanced reproductive age in Nigeria. Journal of Health, Population and Nutrition. 2017;36(1):1-14. DOI: 10.1186/s41043-016-0077-6
23. Sridhar A, Salcedo J. Optimizing maternal and neonatal outcomes with postpartum contraception: Impact on breastfeeding and birth spacing. Maternal Health, Neonatology and Perinatology. 2017;3(1):1-10. DOI: 10.1186/s40748-016-0040-y
24. Vu LTH, Oh J, Bui QT-T, Le AT-K. Use of modern contraceptives among married women in Vietnam: A multilevel analysis using the multiple indicator cluster survey (2011) and the Vietnam population and housing census (2009). Global Health Action. 2016;9(1):29574. DOI: 10.3402/gha.v9.29574
25. Zhang L, Zhang B. Hierarchical machine learning–a learning methodology inspired by human intelligence. In: International Conference on Rough Sets and Knowledge Technology. Berlin, Heidelberg: Springer; 2006. pp. 28-30. DOI: 10.1007/11795131_3
26. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics. 1988;44(3):837-845. DOI: 10.2307/2531595
27. Anuse A, Vyas V. A novel training algorithm for convolutional neural network. Complex & Intelligent Systems. 2016;2(3):221-234. DOI: 10.1007/s40747-016-0024-6
28. Buntine W. Learning classification trees. Statistics and Computing. 1992;2(2):63-73. DOI: 10.1007/bf01889584
29. Jang W, Lee JK, Lee J, Han SH. Naïve Bayesian classifier for selecting good/bad projects during the early stage of international construction bidding decisions. Mathematical Problems in Engineering. 2015;2015:1-12. DOI: 10.1155/2015/830781
30. Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition. 2020;78:110861. DOI: 10.1016/j.nut.2020.110861
31. Vasquez MM, Hu C, Roe DJ, Chen Z, Halonen M, Guerra S. Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: Simulation and application. BMC Medical Research Methodology. 2016;16(1):154-172. DOI: 10.1186/s12874-016-0254-8
32. Wu P, Zhao H. Some analysis and research of the AdaBoost algorithm. In: International Conference on Intelligent Computing and Information Science. Berlin, Heidelberg: Springer; 2011. pp. 1-5
33. Xu X, Xia L, Zhang Q, Wu S, Wu M, Liu H. The ability of different imputation methods for missing values in mental measurement questionnaires. BMC Medical Research Methodology. 2020;20(1):1-9. DOI: 10.1186/s12874-020-00932-0
34. Madley-Dowd P, Hughes R, Tilling K, Heron J. The proportion of missing data should not be used to guide decisions on multiple imputation. Journal of Clinical Epidemiology. 2019;110:63-73. DOI: 10.1016/j.jclinepi.2019.02.016
35. Liu B, Fang L, Liu F, Wang X, Chou K-C. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. Journal of Biomolecular Structure & Dynamics. 2016;34(1):223-235. DOI: 10.1080/07391102.2015.1014422
36. Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18-22. Available from: https://cogns.northwestern.edu/cbmg/LiawAndWiener2002.pdf
37. Family Planning. India: Commitment Maker since 2012. 2018. Available from: https://www.familyplanning2020.org/india
38. The World Bank. Contraceptive Prevalence, Any Methods (% of Women Ages 15–49) Data. 2019. Available from: https://data.worldbank.org/indicator/SP.DYN.CONU.ZS%20(2019)
39. Huda FA, Robertson Y, Chowdhuri S, Sarker BK, Reichenbach L, Somrongthong R. Contraceptive practices among married women of reproductive age in Bangladesh: A review of the evidence. Reproductive Health. 2017;14(1):69-77. DOI: 10.1186/s12978-017-0333-2
40. Cawley GC, Talbot NLC. Gene selection in cancer classification using sparse logistic regression with Bayesian regularization. Bioinformatics. 2006;22(19):2348-2355. DOI: 10.1093/bioinformatics/btl386
41. Hailemariam T, Gebregiorgis A, Meshesha M, Mekonnen W. Application of data mining to predict the likelihood of contraceptive method use among women aged 15-49 case of 2005 demographic health survey data collected by central statistics agency, Addis Ababa, Ethiopia. Journal of Health & Medical Informatics. 2017;8(3):274-279. DOI: 10.4172/2157-7420.1000274
42. Chaurasia AR. Contraceptive use in India: A data mining approach. International Journal of Population Research. 2014;2014:1-11. DOI: 10.1155/2014/821436
43. Vaz F, Silva RR, Bernardino J. Using data mining in a mobile application for the calculation of the female fertile period. In: Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. Setúbal, Portugal: SciTePress; 2018. DOI: 10.5220/0007228603590366
44. Balogun JA, Egejuru N, Idowu P. Comparative analysis of predictive models for the likelihood of infertility in women using supervised machine learning techniques. Computer Reviews Journal. 2018;2(1):313-330

[1] 1. United Nations Population Fund. Sexual and Reproductive Health for all: Reducing Poverty, Advancing Development and Protecting Human Rights. New York, New York, United States: United Nations Population Fund; 2010

[2] 2. United Nations. Transforming our World: The 2030 Agenda for Sustainable Development United Nations. 2015. Available from: https://sustainabledevelopment.un.org/content/documents/21252030%20Agenda%20for%20Sustainable%20Development%20web.pdf

[3] 3. World Health Organization. Health-Related Millennium Development Goals. 2015 . Available from: https://www.who.int/gho/publications/world_health_statistics/EN_WHS2015_Part1.pdf?ua=1

[4] 4. Cleland J, Conde-Agudelo A, Peterson H, Ross J, Tsui A. Contraception and health. The Lancet. 2012;380(9837):149-156. DOI: 10.1016/s0140-6736(12)60609-6

[5] 5. Ahmed S, Li Q, Liu L, Tsui AO. Maternal deaths averted by contraceptive use: An analysis of 172 countries. The Lancet. 2012;80(9837):111-125. DOI: 10.1016/S0140-6736(12)60478-4

[6] 6. Brunner Huber LR, Smith K, Sha W, Vick T. Interbirth interval and pregnancy complications and outcomes: Findings from the pregnancy risk assessment monitoring system. Journal of Midwifery & Women’s Health. 2018;63(4):436-445. DOI: 10.1111/jmwh.12745

[7] 7. Darroch J. Singh S. Estimating Unintended Pregnancies Averted from Couple-Years of Protection (CYP). 2011. Available from: https://www.guttmacher.org/sites/default/files/page_files/guttmacher-cyp-memo.pdf

[8] 8. Liu L, Becker S, Tsui A, Ahmed S. Three methods of estimating births averted nationally by contraception. Population Studies. 2008;62(2):191-210. DOI: 10.1080/00324720801897796

[9] 9. Yazdkhasti M, Pourreza A, Pirak A, Abdi F. Unintended pregnancy and its adverse social and economic consequences on health system: A narrative review article. Iranian Journal of Public Health. 2015;44(1):12-21

[10] 10. Aviisah PA, Dery S, Atsu BK, Yawson A, Alotaibi RM, Rezk HR, et al. Modern contraceptive use among women of reproductive age in Ghana: Analysis of the 2003–2014 Ghana demographic and health surveys. BMC Women’s Health. 2018;18(1):1-10. DOI: 10.1186/s12905-018-0634-9

[11] 11. Haq I, Sakib S, Talukder A. Sociodemographic factors on contraceptive use among ever-married women of reproductive age: Evidence from three demographic and health surveys in Bangladesh. Medical Science. 2017;5(4):31. DOI: 10.3390/medsci5040031

[12] 12. Kopp Kallner H, Thunell L, Brynhildsen J, Lindeberg M, Gemzell Danielsson K. Use of contraception and attitudes towards contraceptive use in Swedish women—A Nationwide survey. PLoS One. 2015;10(5):e0125990. DOI: 10.1371/journal.pone.0125990

[13] 13. Mandiwa C, Namondwe B, Makwinja A, Zamawe C. Factors associated with contraceptive use among young women in Malawi: Analysis of the 2015–16 Malawi demographic and health survey data. Contraception and Reproductive Medicine. 2018;3(1):12-19. DOI: 10.1186/s40834-018-0065-x

[14] 14. Moazenzadeh R, Mohammadi B, Shamshirband S, Chau K. Coupling a firefly algorithm with support vector regression to predict evaporation in northern Iran. Engineering Applications of Computational Fluid Mechanics. 2018;12(1):584-597. DOI: 10.1080/19942060.2018.1482476

[15] 15. Mousa SR, Bakhit PR, Osman OA, Ishak S. A comparative analysis of tree-based ensemble methods for detecting imminent lane change maneuvers in connected vehicle environments. Transportation Research Record: Journal of the Transportation Research Board. 2018;2672(42):268-279. DOI: 10.1177/0361198118780204

[16] 16. Zhang Y, Haghani A. A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies. 2015;58:308-324. DOI: 10.1016/j.trc.2015.02.019

[17] 17. NIPORT, Mitra and Associates, & ICF International. Bangladesh Demographic and Health Survey 2014. Bangladesh: NIPORT, Mitra and Associates, and ICF International; 2016

[18] 18. Johnson EO. Determinants of modern contraceptive uptake among Nigerian women: Evidence from the National Demographic and health survey. African Journal of Reproductive Health. 2017;21(3):89-95. DOI: 10.29063/ajrh2017/v21i3.8

[19] 19. Gebre MN, Edossa ZK. Modern contraceptive utilization and associated factors among reproductive-age women in Ethiopia: Evidence from 2016 Ethiopia demographic and health survey. BMC Women’s Health. 2020;20(1):1-14. DOI: 10.1186/s12905-020-00923-9

[20] 20. Islam AZ, Mondal MNI, Khatun ML, Rahman MM, Islam MR, Mostofa MG, et al. Prevalence and determinants of contraceptive use among employed and unemployed women in Bangladesh. International Journal of MCH and AIDS. 2016;5(2):92-102. DOI: 10.21106/ijma.83

[21] 21. Kidayi PL, Msuya S, Todd J, Mtuya CC, Mtuy T, Mahande MJ. Determinants of modern contraceptive use among women of reproductive age in Tanzania: Evidence from Tanzania demographic and health survey data. Advances in Sexual Medicine. 2015;05(03):43-52. DOI: 10.4236/asm.2015.53006

[22] 22. Solanke BL. Factors influencing contraceptive use and non-use among women of advanced reproductive age in Nigeria. Journal of Health, Population and Nutrition. 2017;36(1):1-14. DOI: 10.1186/s41043-016-0077-6

[23] 23. Sridhar A, Salcedo J. Optimizing maternal and neonatal outcomes with postpartum contraception: Impact on breastfeeding and birth spacing. Maternal Health, Neonatology and Perinatology. 2017;3(1):1-10. DOI: 10.1186/s40748-016-0040-y

[24] 24. Vu LTH, Oh J, Bui QT-T, Le AT-K. Use of modern contraceptives among married women in Vietnam: A multilevel analysis using the multiple indicator cluster survey (2011) and the Vietnam population and housing census (2009). Global Health Action. 2016;9(1):29574. DOI: 10.3402/gha.v9.29574

[25] 25. Zhang L, Zhang B. Hierarchical machine learning–a learning methodology inspired by human intelligence. In: International Conference on Rough Sets and Knowledge Technology. Berlin, Heidelberg: Springer; 2006. pp. 28-30. DOI: 10.1007/11795131_3

[26] 26. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics. 1988;44(3):837. DOI: 10.2307/2531595

[27] 27. Anuse A, Vyas V. A novel training algorithm for convolutional neural network. Complex & Intelligent Systems. 2016;2(3):221-234. DOI: 10.1007/s40747-016-0024-6

[28] 28. Buntine W. Learning classification trees. Statistics and Computing. 1992;2(2):63-73. DOI: 10.1007/bf01889584

[29] 29. Jang W, Lee JK, Lee J, Han SH. Naïve Bayesian classifier for selecting good/bad projects during the early stage of international construction bidding decisions. Mathematical Problems in Engineering. 2015;2015:1-12. DOI: 10.1155/2015/830781

[30] 30. Talukder A, Ahammed B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition. 2020;78:110861. DOI: 10.1016/j.nut.2020.110861

[31] 31. Vasquez MM, Hu C, Roe DJ, Chen Z, Halonen M, Guerra S. Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: Simulation and application. BMC Medical Research Methodology. 2016;16(1):154-172. DOI: 10.1186/s12874-016-0254-8

[32] 32. Wu P, Zhao H. Some analysis and research of the AdaBoost algorithm. In: International Conference on Intelligent Computing and Information Science. Berlin, Heidelberg: Springer; 2011. pp. 1-5

[33] 33. Xu X, Xia L, Zhang Q, Wu S, Wu M, Liu H. The ability of different imputation methods for missing values in mental measurement questionnaires. BMC Medical Research Methodology. 2020;20(1):1-9. DOI: 10.1186/s12874-020-00932-0

[34] 34. Madley-Dowd P, Hughes R, Tilling K, Heron J. The proportion of missing data should not be used to guide decisions on multiple imputation. Journal of Clinical Epidemiology. 2019;110:63-73. DOI: 10.1016/j.jclinepi.2019.02.016

[35] 35. Liu B, Fang L, Liu F, Wang X, Chou K-C. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. Journal of Biomolecular Structure & Dynamics. 2016;34(1):223-235. DOI: 10.1080/07391102.2015.1014422

[36] 36. Liaw A, Wiener M. Classification and regression by random Forest. R News. 2002;2(3):18-22. Available from: https://cogns.northwestern.edu/cbmg/LiawAndWiener2002.pdf

[37] 37. Family Planning. India: Commitment Maker since 2012. 2018. Available from: https://www.familyplanning2020.org/india

[38] 38. The World Bank. Contraceptive Prevalence, Any Methods (% of Women Ages 15–49) Data. 2019. Available from: https://data.worldbank.org/indicator/SP.DYN.CONU.ZS%20(2019)

[39] 39. Huda FA, Robertson Y, Chowdhuri S, Sarker BK, Reichenbach L, Somrongthong R. Contraceptive practices among married women of reproductive age in Bangladesh: A review of the evidence. Reproductive Health. 2017;14(1):69-77. DOI: 10.1186/s12978-017-0333-2

[40] 40. Cawley GC, Talbot NLC. Gene selection in cancer classification using sparse logistic regression with Bayesian regularization. Bioinformatics. 2006;22(19):2348-2355. DOI: 10.1093/bioinformatics/btl386

[41] 41. Hailemariam T, Gebregiorgis A, Meshesha M, Mekonnen W. Application of data mining to predict the likelihood of contraceptive method use among women aged 15-49 case of 2005 demographic health survey data collected by central statistics agency, Addis Ababa, Ethiopia. Journal of Health & Medical Informatics. 2017;8(3):274-279. DOI: 10.4172/2157-7420.1000274

[42] 42. Chaurasia AR. Contraceptive use in India: A data mining approach. International Journal of Population Research. 2014;2014:1-11. DOI: 10.1155/2014/821436

[43] 43. Vaz F, Silva RR, Bernardino J. Using data mining in a mobile application for the calculation of the female fertile period. In: Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. Setúbal, Portugal: SciTePress; 2018. DOI: 10.5220/0007228603590366

[44] 44. Balogun JA, Egejuru N, Idowu P. Comparative analysis of predictive models for the likelihood of infertility in women using supervised machine learning techniques. Computer Reviews Journal. 2018;2(1):313-330

Artificial Intelligence Annual Volume 2022