Characteristics of included studies.
Abstract
Data mining algorithms have been used to reveal the factors that can enhance live body weight and egg weight in chicken breeding. This work systematically reviewed the published articles on the use of data mining algorithms in chicken breeding. ScienceDirect, Web of Science, PubMed and Google Scholar were searched using combinations of the keywords chicken or chicken breeding, data mining algorithm or decision tree, body weight and egg weight. Of the 120 articles retrieved, 8 were included. The 8 included articles were published from 2016 to 2021, and most originated from South Africa (n = 3), followed by Turkey (n = 2). CHAID was the most used data mining algorithm (n = 5), followed by CART (n = 4). Of the 8 included articles, 6 used the coefficient of determination (R²) as the selection criterion, and CART was found to be the best model, followed by CHAID. It is concluded that CART, followed by CHAID, are the recommended data mining algorithms that might be used for improving egg production and growth performance of chickens.
Keywords
- body weight
- coefficient of determination
- chicken breeding
- data mining algorithm
- egg weight
1. Introduction
Chicken breeding focuses on improving production traits, including growth performance, carcass characteristics and egg production. Several studies have sought to improve growth performance [1, 2, 3] and egg production [1, 4, 5, 6, 7] using different data mining algorithms. Data mining algorithms are nonparametric methods that are superior and simpler for the statistical analysis of complex data sets [3]. Moreover, Gevrekçi and Takma [5] reported that they are computer-based procedures that detect evidence from data, remove multicollinearity and can handle large data sets. The data mining algorithms commonly used to estimate chicken live body weight are classification and regression tree (CART) and artificial neural network (ANN) in the Sasso breed [1], exhaustive chi-square automatic interaction detector (exhaustive CHAID) and chi-square automatic interaction detector (CHAID) in the Hy-line Silver Brown and Potchefstroom Koekoek chicken breeds [3], and multivariate adaptive regression splines (MARS) in Hy-line Silver Brown chickens [2]. For egg traits, the algorithms used include CHAID in Hy-line Silver Brown and Boschveld layers [8], CHAID and CART in indigenous chickens of Zambia [7], k-nearest neighbor (KNN), linear discriminant analysis (LDA) and support vector machine (SVM) in Beijing You Chicken and Dwarf Beijing You Chicken [6], and CHAID and ridge regression (RR) in White layer hybrids [4]. CHAID and CART are the algorithms commonly used to improve egg production [5].
To the best of the authors' knowledge, there is no systematic review on the use of data mining algorithms in chicken breeding. To close this knowledge gap, the objective of this work was to systematically review the information on the use of data mining algorithms in chicken breeding. This book chapter will help chicken breeders and researchers to identify the data mining algorithms that might be used for estimating live body weight and egg weight.
2. Methods and materials
2.1 Eligibility criteria
The Population, Exposure and Outcomes (PEO) components were identified as outlined by Bettany-Saltikov [9]. “Chicken” was defined as the population of the study, “Data mining algorithm or decision tree” as the exposure, and “Egg weight” and “Body weight” as the outcomes. A preliminary search of the PEO components on Google Scholar, Web of Science, PubMed and ScienceDirect was performed before deciding to conduct the study.
2.2 Search strategy
A scientific publication search was performed independently by two investigators (Kwena Mokoena and Thobela Louis Tyasi) in Google Scholar, Web of Science, PubMed and ScienceDirect, covering articles available up to 10 November 2023. The search used the following combination of keywords: “Chicken” or “Chicken breeding”, “Data mining algorithm” or “Decision tree”, “Body weight”, and “Egg weight”.
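The keyword combination above can be written as a single Boolean query. The sketch below is illustrative only: the function name `build_query` and the AND/OR grouping (in particular, treating the two outcome terms as alternatives) are assumptions, since the chapter does not give the literal search strings entered in each database.

```python
def build_query(groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Keyword groups taken from the search strategy. Grouping "Body weight" and
# "Egg weight" as alternatives is an assumption, not stated in the chapter.
KEYWORD_GROUPS = [
    ["Chicken", "Chicken breeding"],
    ["Data mining algorithm", "Decision tree"],
    ["Body weight", "Egg weight"],
]

query = build_query(KEYWORD_GROUPS)
print(query)
```

Each database has its own query syntax, so a string built this way would likely need adapting before use.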
2.3 Inclusion criteria
Retrieved articles were screened for eligibility against several standards and were considered for inclusion if they met the following criteria:
- Chicken
- Data mining algorithm or decision tree
- Egg weight
- Live body weight
2.4 Exclusion criteria
Retrieved articles were excluded for the following reasons:
- Records irrelevant to data mining algorithms, egg weight, carcass weight and body weight
- Studies published as abstracts without full text
- Duplicated records
- Studies not on chickens
- Articles with no original data available in the publication and whose authors could not be contacted
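The inclusion and exclusion criteria can be read as a simple screening rule applied to each retrieved record. The following is a minimal sketch under stated assumptions: the `Record` fields, the `is_eligible` function and the example records are all hypothetical, and duplicates are detected by title only, which is cruder than a real screening process.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields; the chapter does not define a data schema.
    title: str
    on_chickens: bool
    uses_data_mining: bool      # data mining algorithm or decision tree
    reports_weight_trait: bool  # egg weight or live body weight
    full_text_available: bool
    has_original_data: bool


def is_eligible(record, seen_titles):
    """Return True if the record passes the inclusion and exclusion checks."""
    key = record.title.strip().lower()
    if key in seen_titles:  # duplicated record
        return False
    seen_titles.add(key)
    return (record.on_chickens
            and record.uses_data_mining
            and record.reports_weight_trait
            and record.full_text_available
            and record.has_original_data)


# Made-up records, for illustration only.
seen = set()
records = [
    Record("Study A", True, True, True, True, True),
    Record("Study A", True, True, True, True, True),   # duplicate
    Record("Study B", False, True, True, True, True),  # not on chickens
]
eligible = [r.title for r in records if is_eligible(r, seen)]
print(eligible)
```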
2.5 Data extraction
The data for the current study were extracted independently by Kwena Mokoena and Thobela Louis Tyasi, and agreement was reached on all sections. The following information was obtained from each article:
- First author
- Year of publication
- Number of eggs weighed
- Chicken breed
- Data mining algorithm or decision tree
- Dependent variables (egg weight and live body weight)
2.6 Ethical considerations
In carrying out this work, all authors took care to avoid plagiarism, fabrication and data falsification.
3. Results
3.1 Searched results
One hundred and twenty (n = 120) articles were retrieved from the publication search, of which twenty-five (n = 25) were duplicates and were removed. As a result, ninety-five (n = 95) articles were considered for title and abstract screening, after which seventy-two (n = 72) articles were eliminated. Twenty-three (n = 23) articles were considered for full-text review, and fifteen (n = 15) of these were eliminated for the reasons stated in Figure 1. A total of eight (n = 8) articles qualified for inclusion in the study.
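The counts reported in this selection flow can be checked with simple arithmetic. The short sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Counts as reported in the search results.
retrieved = 120
duplicates = 25
excluded_title_abstract = 72
excluded_full_text = 15

screened = retrieved - duplicates               # title and abstract screening
full_text = screened - excluded_title_abstract  # full-text review
included = full_text - excluded_full_text       # final inclusion

print(screened, full_text, included)
```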
3.2 Characterization of included articles
Table 1 shows the eight articles that met the inclusion criteria. The results indicated that [2, 3, 4, 5, 8] used commercial chicken breeds and their eggs. The study with the largest sample size of chickens was [8], while the study with the largest sample size of chicken eggs was [4]. The most common chicken breed was the Hy-line Silver Brown layer [2, 3, 5, 8].
Authors | Years | Country | Breeds | Data mining algorithm |
---|---|---|---|---|
Gevrekçi and Takma | 2018 | Turkey | — | Classification and regression tree (CART), and chi-square automatic interaction detector (CHAID) |
Dong et al. | 2021 | China | Beijing You Chicken and Dwarf Beijing You Chicken | k-nearest neighbor (KNN), linear discriminant analysis (LDA) and Support vector machine (SVM) |
Liswaniso et al. | 2020 | Zambia | Indigenous chicken of Zambia | Classification and regression tree (CART), and chi-square automatic interaction detector (CHAID) |
Okoro et al. | 2017 | South Africa | Hy-line Silver Brown and Boschveld layers | chi-square automatic interaction detector (CHAID) |
Orhan et al. | 2016 | Turkey | White layer hybrids | chi-square automatic interaction detector (CHAID) and ridge regression (RR) |
Tyasi et al. | 2020 | South Africa | Hy-line Silver Brown | Multivariate adaptive regression splines (MARS) |
Tyasi et al. | 2021 | South Africa | Hy-line Silver Brown and Potchefstroom Koekoek | Classification and regression tree (CART), Chi-square automatic interaction detector (CHAID) and exhaustive chi-square automatic interaction detector (exhaustive CHAID) |
Yakubu and Madaki | 2017 | Nigeria | Sasso breed | Artificial neural network (ANN), Classification and regression tree (CART) |
3.3 Publication by year
Figure 2 indicates the year of publication of the included articles. The years 2017 [1, 8], 2020 [2, 7] and 2021 [3, 6] each had the highest number of articles published (n = 2). The years 2016 [4] and 2018 [5] had the fewest, with one article each.
3.4 Publication by country
The origin of the included articles is presented in Figure 3. Of the eight articles included in the review, South Africa [2, 3, 8] had the largest number (n = 3), followed by Turkey [4, 5] with two articles. Nigeria [1], Zambia [7] and China [6] each contributed one article.
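The counts by year and by country can be reproduced directly from Table 1. The sketch below transcribes the year and country columns of that table; the variable names are illustrative.

```python
from collections import Counter

# (authors, year, country) as listed in Table 1.
STUDIES = [
    ("Gevrekçi and Takma", 2018, "Turkey"),
    ("Dong et al.", 2021, "China"),
    ("Liswaniso et al.", 2020, "Zambia"),
    ("Okoro et al.", 2017, "South Africa"),
    ("Orhan et al.", 2016, "Turkey"),
    ("Tyasi et al.", 2020, "South Africa"),
    ("Tyasi et al.", 2021, "South Africa"),
    ("Yakubu and Madaki", 2017, "Nigeria"),
]

by_year = Counter(year for _, year, _ in STUDIES)
by_country = Counter(country for _, _, country in STUDIES)

print(sorted(by_year.items()))
print(by_country.most_common())
```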
3.5 Publications by data mining algorithms
Figure 4 displays the number of published articles by data mining algorithm. CHAID was the most commonly used data mining algorithm (n = 5), followed by CART (n = 4). KNN, LDA, SVM, RR, MARS, exhaustive CHAID and ANN were each used in one article (n = 1).
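These frequencies follow from the algorithm column of Table 1. A minimal sketch of the tally, with the per-study algorithm lists transcribed from that table and illustrative variable names:

```python
from collections import Counter

# Algorithms used by each included study, as listed in Table 1.
ALGORITHMS_BY_STUDY = {
    "Gevrekçi and Takma 2018": ["CART", "CHAID"],
    "Dong et al. 2021": ["KNN", "LDA", "SVM"],
    "Liswaniso et al. 2020": ["CART", "CHAID"],
    "Okoro et al. 2017": ["CHAID"],
    "Orhan et al. 2016": ["CHAID", "RR"],
    "Tyasi et al. 2020": ["MARS"],
    "Tyasi et al. 2021": ["CART", "CHAID", "Exhaustive CHAID"],
    "Yakubu and Madaki 2017": ["ANN", "CART"],
}

# Count in how many studies each algorithm appears.
usage = Counter(
    algorithm
    for algorithms in ALGORITHMS_BY_STUDY.values()
    for algorithm in algorithms
)

for algorithm, count in usage.most_common():
    print(f"{algorithm}: {count}")
```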
3.6 Predictive performance of data mining algorithms
Table 2 displays the predictive performance of the data mining algorithms used in the included articles. Of the eight included articles, only seven reported goodness of fit. Of these seven, six used the coefficient of determination (R²) as the selection criterion, while only one study used CV, RAE, MAD and RMSE [5]. Among the six studies that used R² as the selection criterion, CART was found to be the best model, followed by CHAID. The study of Gevrekçi and Takma [5] indicated that CHAID was the best data mining algorithm.
Author and year | Dependent variable | Goodness of fit criteria | CART | CHAID | Exhaustive CHAID | RR | MARS | ANN | KNN | LDA | SVM |
---|---|---|---|---|---|---|---|---|---|---|---|
Gevrekçi and Takma, 2018 | Egg production | CV% | 10.57 | 9.32 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Gevrekçi and Takma, 2018 | Egg production | RAE | 0.0024 | 0.0021 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Gevrekçi and Takma, 2018 | Egg production | MAD | 8.85 | 7.56 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Gevrekçi and Takma, 2018 | Egg production | RMSE | 11.25 | 9.93 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Dong et al. 2021 | Egg discrimination (fatty acid) | R² | N/A | N/A | N/A | N/A | N/A | N/A | 91.7% | 83.3% | 91.7% |
Dong et al. 2021 | Egg discrimination (flavor characteristics) | R² | N/A | N/A | N/A | N/A | N/A | N/A | 50% | N/A | 16.7% |
Liswaniso et al. 2020 | Egg weight | R² | 59.3% | 82.3% | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Orhan et al. 2016 | Egg weight | R² | N/A | 99.98% | N/A | 93.15% | N/A | N/A | N/A | N/A | N/A |
Tyasi et al. 2020 | Body weight | R² | N/A | N/A | N/A | N/A | 100% | N/A | N/A | N/A | N/A |
Yakubu and Madaki, 2017 | Body weight (deep litter) | R² | 93.4% | N/A | N/A | N/A | N/A | 87% | N/A | N/A | N/A |
Yakubu and Madaki, 2017 | Body weight (battery cage) | R² | 93.4% | N/A | N/A | N/A | N/A | 99% | N/A | N/A | N/A |
Tyasi et al. 2021 | Body weight | R² | 83.2% | 65.9% | 64.1% | N/A | N/A | N/A | N/A | N/A | N/A |
Okoro et al. 2017 | Egg size and performance | R² | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
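
For readers less familiar with the two most common criteria in Table 2, the standard textbook definitions are given below. The notation is an assumption introduced here: y_i is the observed value, ŷ_i the predicted value, ȳ the mean of the observed values and n the number of observations. The included studies may use variants (for example, an adjusted R²), and CV, RAE and MAD are omitted because their exact definitions are not given in this chapter.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```

A higher R² and a lower RMSE both indicate a closer fit between predicted and observed values.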
4. Discussion
This review was conducted to identify, from the 8 included articles, the data mining algorithm best suited for use in chicken breeding. CHAID was the most used data mining algorithm (5/8), followed by CART (4/8). In terms of predictive performance, six of the included articles used the coefficient of determination (R²) as the selection criterion, whereas CV, RAE, MAD and RMSE were used by only one article [5]. This suggests that R² is the most widely used goodness-of-fit criterion for selecting the best model. Among the six articles that used R² as the selection criterion, CART was found to be the best model, followed by CHAID. The CART algorithm is a machine learning technique used to build a decision tree [2]. The study of Gevrekçi and Takma [5] indicated that CHAID was the best data mining algorithm. Liswaniso et al. [7] used different data mining algorithms to predict egg weight as the dependent variable from egg characteristics and found that CHAID was the best model. Similarly, Tyasi et al. [3] found that the CHAID model was the best at predicting the body weights of Hy-line Silver Brown commercial layers and Potchefstroom Koekoek indigenous chickens raised in South Africa. To the best of the authors’ knowledge, this is the first systematic review of the use of data mining algorithms in chicken breeding. Hence, the findings cannot be compared with those of earlier reviews. The implication of this work is that the CART method might be used to predict live body weight and egg weight for improving growth performance and egg production in different countries. The strength of the review is that no similar study has been done.
The contribution of this systematic review is the finding that, among the statistical techniques commonly used to predict egg weight and live body weight in chickens, CART is the best model for use in chicken breeding. However, more studies are needed to confirm and extend these results.
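The per-study comparison discussed in this section can be made explicit from the R² values in Table 2. The sketch below is one possible way to do so, not the procedure used by the authors: it is restricted to the body weight and egg weight rows that compare at least two models, and the names `R2_BY_STUDY` and `best_model` are illustrative.

```python
# R² values (%) copied from Table 2; only models with a reported value are listed.
R2_BY_STUDY = {
    "Liswaniso et al. 2020 (egg weight)": {"CART": 59.3, "CHAID": 82.3},
    "Orhan et al. 2016 (egg weight)": {"CHAID": 99.98, "RR": 93.15},
    "Yakubu and Madaki 2017 (body weight, deep litter)": {"CART": 93.4, "ANN": 87.0},
    "Yakubu and Madaki 2017 (body weight, battery cage)": {"CART": 93.4, "ANN": 99.0},
    "Tyasi et al. 2021 (body weight)": {
        "CART": 83.2,
        "CHAID": 65.9,
        "Exhaustive CHAID": 64.1,
    },
}


def best_model(scores):
    """Return the name of the model with the highest R² within one study."""
    return max(scores, key=scores.get)


best = {study: best_model(scores) for study, scores in R2_BY_STUDY.items()}
for study, model in best.items():
    print(f"{study}: {model}")
```

Because the studies differ in breed, trait and sample size, R² values are only directly comparable within a study, which is why the comparison is made row by row.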
5. Conclusion
The current systematic review was conducted to identify the best data mining algorithm that chicken breeders might use to identify the factors for improving live body weight and egg weight. The included articles identified factors that can be used as selection criteria during breeding to improve egg production and growth performance. The included articles used different data mining algorithms, including classification and regression tree (CART), artificial neural network (ANN), chi-square automatic interaction detector (CHAID), exhaustive chi-square automatic interaction detector (exhaustive CHAID), multivariate adaptive regression splines (MARS), k-nearest neighbor (KNN), linear discriminant analysis (LDA), support vector machine (SVM) and ridge regression (RR). They used goodness-of-fit criteria such as the coefficient of determination and root mean square error to select the best data mining algorithm. This systematic review concludes that CART was the best data mining algorithm for use in chicken breeding, followed by CHAID. Furthermore, researchers should use the CART and CHAID methods in chicken breeding to predict egg weight and live body weight.
References
1. Yakubu A, Madaki J. Modelling growth of dual-purpose Sasso hens in the tropics using different algorithms. Journal Genetic Biology. 2017; 1 (1):1-9
2. Tyasi TL, Makgowo KM, Mokoena K, Rashijane LT, Mathapo MC, Danguru LW, et al. Multivariate adaptive regression splines data mining algorithm for prediction of body weight of Hy-line silver Brown commercial layer chicken breed. Advances in Animal and Veterinary Sciences. 2020; 8 (8):794-799. DOI: 10.17582/journal.aavs/2020/8.8.794.799
3. Tyasi TL, Eyduran E, Celik S. Comparison of tree-based regression tree methods for predicting live body weight from morphological traits in Hy-line silver brown commercial layer and indigenous Potchefstroom Koekoek breeds raised in South Africa. Tropical Animal Health and Production. 2021; 53 (7):1-8. DOI: 10.1007/s11250-020-02443-y
4. Orhan H, Eyduran E, Tatliye A, Saygici H. Prediction of egg weight from egg quality characteristics via ridge regression and regression tree methods. 2016; 45 (7):380-385
5. Gevrekci Y, Takma C. A comparative study for egg production in layers by decision tree analysis. Pakistan Journal of Zoology. 2018; 50 (2):437-444. DOI: 10.17582/journal.pjz/2018.50.2.437.444
6. Dong X, Gao L, Zhang H, Wang J, Qiu K, Qi G, et al. Discriminating eggs from two local breeds based on fatty acid profile and flavor characteristics combined with classification algorithms. Food Science of Animal Resources. 2021; 41 (6):936-949. DOI: 10.5851/kosfa.2021.e47
7. Liswaniso S, Qin N, Tyasi TL, Chimbaka IM, Sun X, Xu R. Use of data mining algorithms Chaid and CART in predicting egg weight from egg quality traits of indigenous free-range chickens in Zambia. Advances in Animal and Veterinary Sciences. 2020; 9 (2):215-220. DOI: 10.17582/journal.aavs/2021/9.2.215.220
8. Okoro VMO, Ravhuhali KE, Mapholi TH, Mbajiorgu EF, Mbajiorgu AC. Comparison of commercial and locally developed layers performance and egg size prediction using regression tree methods. The Journal of Applied Poultry Research. 2017; 26 :477-484
9. Bettany-Saltikov J. Learning how to undertake a systematic review: Part 2. Nursing Standard. 2010; 24 :47-56