Open access peer-reviewed chapter

Genomic Selection: A Faster Strategy for Plant Breeding

Written By

Gizachew Haile Gidamo

Submitted: 14 April 2022 Reviewed: 12 May 2022 Published: 18 August 2022

DOI: 10.5772/intechopen.105398

From the Edited Volume

Case Studies of Breeding Strategies in Major Plant Species

Edited by Haiping Wang

Abstract

Many agronomic traits, such as grain yield, are controlled by polygenes with minor effects and epistatic interaction. Genomic selection (GS) uses genome-wide markers to predict a genomic estimated breeding value (GEBV) that is used to select favorable individuals. GS involves three essential steps: prediction model training, prediction of breeding value, and selection of favorable individuals based on the predicted GEBV. Prediction accuracy is evaluated using either the correlation between predicted GEBVs and empirically estimated (observed) values or cross-validation. Factors such as marker diversity and density, size and composition of the training population, number of QTL, and heritability affect GS accuracy. GS has potential applications in hybrid breeding, germplasm enhancement, and yield-related breeding programs. GS is therefore a promising strategy for rapidly improving genetic gain per unit time for quantitative traits with low heritability in breeding programs.

Keywords

  • genomic selection
  • training population
  • breeding population
  • prediction accuracies
  • plant breeding

1. Introduction

Since the 1990s, when promising results from gene tagging and QTL mapping led to the development of marker-assisted selection (MAS), MAS has become a popular strategy in plant breeding. Marker-assisted selection and molecular breeding have been applied in many plant breeding programs to identify the key genes underlying desirable traits in gene pools and to transfer them. However, MAS has shown some flaws, such as extensive selection schemes and the inability to capture “minor” gene effects when searching for significant marker-QTL associations. As a result, improving traits with complex inheritance, such as grain yield and abiotic stress tolerance, using MAS is difficult.

Genomic selection, also known as genome-wide selection, is a strategy that employs genotypic data from across the entire genome to predict any trait accurately, allowing for the selection of favorable individuals [1]. The most suitable individual is chosen based on its genomic estimated breeding value (GEBV). Breeding values are a popular and widely used measure in the animal breeding industry. Breeding values are defined as the “sum” of the estimated genetic deviation and the weighted total of the estimated breed effect [2], and are predicted using phenotypic data from family pedigrees based on the additive infinitesimal model. The success of selection in animal breeding, particularly in cattle and pigs, was aided by the infinitesimal genetic model and quantitative genetics. In the estimation of breeding value in animal breeding, best linear unbiased prediction (BLUP) and Bayesian frameworks are often utilized. Following the introduction of genome sequencing in several model animals, a novel selection method based on GEBVs, genomic selection, was developed [1]. In this chapter, the principles of genomic selection and their application as a faster strategy for plant breeding are presented.

1.1 Phases of genomic selection

For large effect alleles, molecular marker technology has aided QTL identification, marker-assisted introgression, and selection, but not for low heritability quantitative traits, which still require considerable field testing. For such traits, loci are difficult to identify and their effects are difficult to estimate. Genomic selection uses new statistical methods that account for this uncertainty to make the best predictions. When classical marker-assisted selection and genomic selection are compared, the core framework is similar, including both breeding and training phases [3, 4]. In genomic selection, there are three crucial processes [3]:

  1. Prediction model training and validation. A subset of lines from the population under selection forms the training set. The training population is made up of germplasm that has both phenotypic and genome-wide marker data and is used to estimate marker effects and cross-validate results (Figure 1). These data are used to develop a statistical model that links variation at the genotyped marker loci to variation in individual phenotypes. The statistical analysis of the training set estimates allele effects at all loci simultaneously. A second group of lines, known as selection candidates (breeding materials), is genotyped in order to calculate their GEBVs.

  2. Prediction of breeding value. The genomic estimated breeding value (GEBV) is calculated by combining the estimated marker effects with marker data from a single cross. For an individual, the GEBV is the sum of all marker effects incorporated in the model. These GEBVs are used to make selections. The line is the unit of evaluation as long as phenotype is the selection criterion. When allele effect prediction is used as the selection criterion, the allele becomes the unit of evaluation. The maximum GS accuracies are achieved with a large training population. The training population spans numerous generations of training and comprises parents or very recent ancestors of the population under selection. The way markers are used for effect estimation is one of the key differences between traditional MAS and genomic selection. In traditional MAS, only significant markers are used for effect estimation during QTL identification and selection, and the non-significant markers are excluded. In genomic selection, however, all of the markers are used in order to capture the whole additive genetic variance. As a result, small-effect QTL can be captured more precisely for use in breeding programs.

  3. Selection based on predicted GEBV. The single cross is subjected to GEBV-based selection. GS made it possible to calculate breeding value directly from genotype rather than phenotype. When limited seed production prevents the application of selection and recombination in the F1 generation, a modified recurrent selection method using GS among F2 individuals is proposed in crop plants.

Figure 1.

Phases of genomic selection. Genome wide genotypic and phenotypic information from training population allows GS model optimization and breeding value estimation.
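
The three phases can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method of any cited study: marker genotypes are coded −1/0/1, marker effects are purely additive, the model is a ridge-penalized regression with an arbitrary penalty `lam`, and all data are simulated. The function names (`solve`, `train_marker_effects`, `gebv`, `select_top`) are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def train_marker_effects(genotypes, phenotypes, lam):
    """Phase 1: estimate all marker effects at once (ridge penalty lam is assumed)."""
    n, p = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    xtx = [[sum(g[j] * g[k] for g in genotypes) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(g[j] * (y - mean_y) for g, y in zip(genotypes, phenotypes))
           for j in range(p)]
    return mean_y, solve(xtx, xty)

def gebv(genotype, effects):
    """Phase 2: the GEBV is the sum of the marker effects an individual carries."""
    return sum(x * b for x, b in zip(genotype, effects))

def select_top(candidates, effects, k):
    """Phase 3: indices of the k selection candidates with the highest GEBV."""
    scores = [gebv(g, effects) for g in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)[:k]

# Simulated toy data (hypothetical): 30 training lines, 10 candidates, 50 markers.
rng = random.Random(1)
n_train, n_cand, n_markers = 30, 10, 50
true_effects = [rng.gauss(0, 1) for _ in range(n_markers)]

def random_line():
    return [rng.choice([-1, 0, 1]) for _ in range(n_markers)]

training = [random_line() for _ in range(n_train)]
# Simulated phenotype = true genetic value + environmental noise.
phenotypes = [gebv(g, true_effects) + rng.gauss(0, 1) for g in training]
candidates = [random_line() for _ in range(n_cand)]  # genotyped, not phenotyped

mean_y, effects = train_marker_effects(training, phenotypes, lam=5.0)
selected = select_top(candidates, effects, k=3)
print("selected candidates:", selected)
```

Note that the candidates are ranked from genotype alone; their phenotypes are never used, which is the point of the third phase.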

2. Statistical methods used in genomic selection

Finding an association between genetic elements and the trait of interest is the selection framework common to pedigree-based phenotypic selection, classic marker-assisted selection, and genomic selection. Typical QTL mapping and MAS either overestimate marker effects or ignore them. The desire to use high-density genotyping technology for complex trait prediction led to the creation of genomic selection. According to Meuwissen et al. [1], avoiding marker selection minimizes bias during effect estimation and genetic value computation. However, fitting all markers produces more predictors (large p) than observations (small n). When there are not enough degrees of freedom, the ordinary least squares estimator cannot estimate all of the predictors’ effects, and the model is over-fitted. As a result, the prediction performance of ordinary least squares suffers. To address this issue, various genomic selection models have been developed, including shrinkage methods, variable selection models, kernel approaches, and dimension reduction methods.
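
The degrees-of-freedom problem can be seen in a toy case. The sketch below assumes an invented data set of two individuals and three markers (the genotype codes, phenotypes, and the function name `sum_squared_error` are illustrative and not taken from the chapter). It shows that two different sets of marker effects fit the phenotypes equally well, so least squares alone cannot choose between them.

```python
def sum_squared_error(genotypes, phenotypes, effects):
    """Sum over individuals of (observed - predicted)^2 for given marker effects."""
    total = 0.0
    for row, y in zip(genotypes, phenotypes):
        predicted = sum(x * b for x, b in zip(row, effects))
        total += (y - predicted) ** 2
    return total

# Hypothetical toy data: n = 2 individuals, p = 3 markers (more predictors than data).
genotypes = [[1, 0, 1],
             [0, 1, 1]]
phenotypes = [1.0, 2.0]

# Two different effect vectors that both reproduce the phenotypes exactly.
effects_a = [1.0, 2.0, 0.0]
effects_b = [0.0, 1.0, 1.0]

sse_a = sum_squared_error(genotypes, phenotypes, effects_a)
sse_b = sum_squared_error(genotypes, phenotypes, effects_b)
print(sse_a, sse_b)  # both are 0.0: the least-squares solution is not unique
```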

2.1 The basic genetic model and variance decomposition

The basic genetic model relates the phenotype (P) of an individual to the sum of its genetic value (G) and the residual environmental effect (E), assuming that only the effects of genetic factors are inherited by the next generation. The genetic value includes additive, dominance, and epistatic effects. It is mathematically denoted as:

$$P = G + E \tag{1}$$

In absence of G and E interaction, the covariance between G and E becomes zero. Therefore, the phenotypic variance V (P) can be expressed as [2]:

$$V(P) = V(G) + V(E) + 2\,\mathrm{Cov}(G, E) \tag{2}$$
$$V(P) = V(G) + V(E) + 0 = V(G) + V(E) \tag{3}$$

The GEBV is generally equal to G.
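
The variance decomposition can be checked numerically. The following is a minimal sketch on simulated values, assuming G and E are drawn independently from normal distributions with arbitrary standard deviations; the helper names `mean` and `cov` are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(42)
g = [rng.gauss(0, 2) for _ in range(5000)]   # simulated genetic values
e = [rng.gauss(0, 1) for _ in range(5000)]   # simulated environmental effects
p = [gi + ei for gi, ei in zip(g, e)]        # phenotype, P = G + E

v_p, v_g, v_e, cov_ge = cov(p, p), cov(g, g), cov(e, e), cov(g, e)

# V(P) = V(G) + V(E) + 2 Cov(G, E) holds exactly for sample moments;
# Cov(G, E) is close to zero because G and E were simulated independently.
print(v_p, v_g + v_e + 2 * cov_ge, cov_ge)
```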

2.2 Heritability

The fraction of phenotypic variation (V(P)) owing to variation in genetic value (V(G)) is known as heritability. It assesses how well a population’s phenotypic characteristics are passed on to the following generation. There are two ways to explain heritability: broad and narrow sense approaches. The fraction of phenotypic variation owing to genetic value is captured by broad-sense heritability ($H^2$). It concentrates on all genetic influences, including additive, dominance, and epistatic effects. Therefore, it can be mathematically represented as

$$H^2 = \frac{V(G)}{V(P)} \tag{4}$$

Narrow-sense heritability ($h^2$), in contrast, captures only the proportion of phenotypic variation that is due to the additive genetic effect, V(A), with everything else collected in a residual variance denoted V(ɛ). It is represented by $h^2 = V(A)/V(P)$. Therefore, for $h^2$, the genetic model can be rewritten as:

$$V(P) = V(A) + V(\varepsilon) \tag{5}$$

where V(ɛ) represents the variance of the residual effects that are not included in the additive genetic effect (A), such as the dominance and epistatic effects.

Narrow-sense heritability is the most important in plant selection because the additive variance accounts for nearly all (close to 100%) of the genetic variance that affects response to selection. Meuwissen et al. [1] suggested that V(A) can be broken down into the effects of individual DNA markers, V(A1), V(A2), V(A3), and so on. This made it easier to calculate the breeding value of a plant using markers that cover the complete genome.
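
As a small worked example, the two heritabilities can be computed from variance components. The component values below are invented purely for illustration, and the function names are hypothetical.

```python
def broad_sense_heritability(v_g, v_p):
    """H^2 = V(G) / V(P): share of phenotypic variance due to all genetic effects."""
    return v_g / v_p

def narrow_sense_heritability(v_a, v_p):
    """h^2 = V(A) / V(P): share of phenotypic variance due to additive effects."""
    return v_a / v_p

# Hypothetical variance components: additive, dominance, epistatic, environmental.
v_a, v_d, v_i, v_e = 4.0, 1.0, 0.5, 4.5
v_g = v_a + v_d + v_i          # total genetic variance
v_p = v_g + v_e                # phenotypic variance, assuming Cov(G, E) = 0
v_residual = v_p - v_a         # V(epsilon): everything that is not additive

big_h2 = broad_sense_heritability(v_g, v_p)
small_h2 = narrow_sense_heritability(v_a, v_p)
print(big_h2, small_h2)
```

With these assumed components the narrow-sense value is smaller than the broad-sense value, because dominance and epistatic variance count toward V(G) but not toward V(A).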

2.3 Breeding value

In animal breeding, the word “breeding value” refers to how many beneficial genes one animal passes on to its progeny. The genotype value and the breeding value can be equivalent. However, owing to dominance or epistatic situations, this is not always the case. Alleles at the loci that affect phenotype are heritable. Knowing the effect of an allele in a population can assist in predicting the progeny’s phenotype. The deciding variables of a given trait in a population are allele frequencies and the effect of each genotype that includes the allele. It is also referred to as the allele’s average effect.

An individual’s breeding value is the sum of the average effects of all the alleles the individual carries [3]. An AB heterozygote, for example, has a breeding value of +3 if an A allele is worth +5 and a B allele is worth −2. Using the concept of narrow-sense heritability ($h^2$), breeding value (BV) can alternatively be described as the population mean phenotypic value plus the heritable part of the deviation of an individual’s phenotypic value from that mean. This can be expressed numerically as:

$$BV = m_0 + h^2 (y_i - m_0) = m_0 + (y_i - m_0)\,\frac{V(A)}{V(P)} \tag{6}$$

where $y_i$ is the phenotypic value of individual i (i = 1, 2, ..., n) and $m_0$ denotes the population’s mean phenotypic value. Estimate of breeding value (EBV) is a term used to describe a breeding value that is estimated based on heredity. Genomic selection, on the other hand, employs genome-wide markers to evaluate genotype effect and breeding value, resulting in GEBVs (genomic estimate of breeding values) [2].
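
Both descriptions of breeding value can be written out directly. The sketch below reuses the AB heterozygote example from the text; the phenotypic value, population mean, and heritability passed to Eq. (6) are invented for illustration, and the function names are hypothetical.

```python
def breeding_value_from_alleles(alleles, average_effects):
    """Breeding value as the sum of the average effects of the alleles carried."""
    return sum(average_effects[a] for a in alleles)

def breeding_value_from_phenotype(y_i, m0, h2):
    """Eq. (6): BV = m0 + h2 * (y_i - m0)."""
    return m0 + h2 * (y_i - m0)

# Example from the text: A is worth +5, B is worth -2, so AB has a value of 3.
bv_ab = breeding_value_from_alleles("AB", {"A": 5, "B": -2})

# Hypothetical numbers: phenotype 120, population mean 100, h2 = 0.4.
bv_eq6 = breeding_value_from_phenotype(y_i=120.0, m0=100.0, h2=0.4)
print(bv_ab, bv_eq6)
```

In the second case only 40 percent of the 20-unit phenotypic deviation is treated as heritable, so the value lies 8 units above the population mean.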

3. Models

3.1 The linear model

A linear model or its extension can be used to describe the relationship between phenotype and genotype. Consider a training population of N individuals genotyped at M biallelic markers, and take the pairs of observed phenotype and genotype at marker 1 for each individual, $(y_i, x_{1i})$, i.e. $(y_1, x_{11}), (y_2, x_{12}), \ldots, (y_N, x_{1N})$. The phenotypes of the N individuals are assumed to be normally distributed, with each individual receiving an additional phenotypic contribution of $\beta_1$ depending on its marker genotype. The phenotype ($y_i$) can be modeled using the genetic value $g_i = x_{1i}\beta_1$ as a parametric regression on the marker covariate $x_{1i}$ as follows: $y_i = \beta_0 + x_{1i}\beta_1 + \varepsilon_i$, where $\beta_0$ is the intercept (overall mean), $\beta_1$ is the marker effect (regression coefficient), and $x_{1i}$ is the genotype value of marker 1 for individual i. The values of $\beta_0$ and $\beta_1$ are the parameters that need to be determined, and $\varepsilon_i$ is an error term that is usually assumed to have a normal distribution with a mean of zero. The unknown parameters are determined by least-squares estimation: the line is fitted to the phenotypes so that the sum of $\varepsilon_i^2$, that is, the error function $E = \sum_i (y_i - \beta_0 - x_{1i}\beta_1)^2$, is minimized. However, applying the model when the number of markers (M) is large relative to the number of individuals (N) results in overfitting. To avoid overfitting, a penalty term is introduced in the error function, i.e.,

$$E = \sum_{i=1}^{N} \Bigl( y_i - \sum_{j=0}^{M} x_{ji}\beta_j \Bigr)^2 + \lambda \sum_{j=0}^{M} |\beta_j|^q \tag{7}$$

where the effect of the penalty term is controlled by λ.
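
The penalized error function of Eq. (7) can be evaluated directly for a candidate set of marker effects. The sketch below is illustrative only: the data are invented, the function name `penalized_error` is hypothetical, and each genotype row is assumed to start with the dummy value 1 so that the first coefficient acts as the intercept. Setting q = 2 gives a ridge-type penalty and q = 1 a LASSO-type penalty.

```python
def penalized_error(genotypes, phenotypes, beta, lam, q):
    """Eq. (7): residual sum of squares plus lam * sum(|beta_j|^q).

    As in Eq. (7), the penalty runs over all coefficients, intercept included.
    """
    rss = 0.0
    for row, y in zip(genotypes, phenotypes):
        predicted = sum(x * b for x, b in zip(row, beta))
        rss += (y - predicted) ** 2
    penalty = sum(abs(b) ** q for b in beta)
    return rss + lam * penalty

# Hypothetical toy data: 3 individuals, dummy column of 1s plus 2 markers.
genotypes = [[1, 1, 0],
             [1, 0, 1],
             [1, -1, 1]]
phenotypes = [2.0, 1.0, 0.0]
beta = [1.0, 0.5, -0.5]

e_ols = penalized_error(genotypes, phenotypes, beta, lam=0.0, q=2)    # no penalty
e_ridge = penalized_error(genotypes, phenotypes, beta, lam=2.0, q=2)  # q = 2
e_lasso = penalized_error(genotypes, phenotypes, beta, lam=2.0, q=1)  # q = 1
print(e_ols, e_ridge, e_lasso)
```

Larger values of λ make large coefficients more costly, which is how the penalty shrinks marker effects toward zero.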

To incorporate genome-wide markers in the model, the above formula can be extended into a multiple linear regression model, which gives the following formula [2]:

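$$y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \cdots + x_{Mi}\beta_M + \varepsilon_i \tag{8}$$
$$y_i = \beta_0 + \sum_{j=1}^{M} x_{ji}\beta_j + \varepsilon_i \tag{9}$$

where $y_i$ is the phenotypic value of individual i and $x_{ji}$ is the genotype value of the jth marker in the ith individual. The coefficient $\beta_j$ is the effect of marker j on the phenotype, that is, the regression of $y_i$ on the jth marker covariate $x_{ji}$, and $\varepsilon_i$ is the random error [2]. Setting the dummy variable $x_{0i} = 1$ absorbs the intercept $\beta_0$ into the sum, so that the index j can run from 0 to M. Similarly, the coefficients are determined by minimizing the error function,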

$$E = \sum_{i=1}^{N} \Bigl( y_i - \sum_{j=0}^{M} x_{ji}\beta_j \Bigr)^2 \tag{10}$$

In genomic selection, the focus is given to calculations of the genome enhanced breeding value rather than the exact location of the QTL; therefore, using the link function of linear model assumption, which provides relationship between linear predictor and the mean of the distribution function and error variance of regression, it can be rewritten as [2]:

$$y_i = \sum_{j=0}^{M} x_{ji}\beta_j + \varepsilon_i \tag{11}$$

A number of models, including random regression best linear unbiased prediction (RR-BLUP), least absolute shrinkage and selection operator (LASSO), reproducing kernel Hilbert spaces (RKHS) and support vector machine regression, Bayesian methods, and a collaborative filtering recommender system [5], have been developed using the above fundamental concepts. Most GS models aim to minimize a cost function of this kind [6].

3.2 Evaluating genomic prediction accuracy

Candidates for selection have no phenotypic information. As a result, their GEBV predictive performance may be evaluated using either a group of validation individuals with highly accurate EBVs and many progenies or cross validation. Both methods necessitate a reference population that contains both marker genotypes and phenotypic information.

3.2.1 Correlation studies between GEBV and observed EBV value

The correlation r(GEBV:EBV) between the predicted GEBVs and the empirically determined (observed) breeding values is used to assess the prediction accuracy of the GEBVs. The EBV can be produced in a number of ways, the most basic of which is as a phenotypic mean. This correlation provides a direct link between GEBV prediction accuracy and selection response, as well as a rough estimate of selection accuracy. Other statistics, such as mean-square error (MSE), are occasionally used. Strictly, genomic selection accuracy is the correlation between GEBV and true breeding value (TBV), that is, r(GEBV:TBV). Because only r(GEBV:EBV) can be measured, this value must be converted to an estimate of r(GEBV:TBV). To do so, the following relationship is assumed:

$$r_{\mathrm{GEBV:EBV}} = r_{\mathrm{GEBV:TBV}} \times r_{\mathrm{EBV:TBV}} \tag{12}$$

This assumption is accurate if the TBV is the only component that the GEBV and the EBV have in common. In other words,

$$\mathrm{GEBV} = \mathrm{TBV} + e_1 \tag{13}$$

and

$$\mathrm{EBV} = \mathrm{TBV} + e_2 \tag{14}$$

where $e_1$ and $e_2$ are uncorrelated residual errors, the assumption holds. If the training and validation data were obtained in the same environment, the assumption may be violated. In that case, genotype by environment (GxE) interaction would generate a common error component in both GEBV and EBV, biasing their correlation upward. To obtain accurate estimates of GEBV prediction accuracy, training and validation data should be collected in different environments. The r(EBV:TBV) correction, that is, dividing r(GEBV:EBV) by r(EBV:TBV), accounts for the fact that the EBV in the validation set is not error-free. Within the validation set, r(EBV:TBV) equals the square root of heritability (h) when the EBVs are phenotypes [7].
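
The conversion can be sketched as follows. This is a minimal illustration assuming the EBVs are phenotypes, so that r(EBV:TBV) is the square root of heritability; the GEBV and EBV lists, the heritability value, and the function names are all invented for the example.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def accuracy_against_tbv(r_gebv_ebv, h2):
    """Rearranged Eq. (12): r(GEBV:TBV) = r(GEBV:EBV) / sqrt(h2)."""
    return r_gebv_ebv / math.sqrt(h2)

# Hypothetical validation set: predicted GEBVs and observed phenotypes (EBVs).
gebvs = [0.8, -0.3, 1.1, 0.2, -0.9, 0.4]
ebvs = [1.5, 0.2, 0.6, -0.4, -1.2, 1.0]

r_observed = pearson(gebvs, ebvs)
r_corrected = accuracy_against_tbv(r_observed, h2=0.49)
print(r_observed, r_corrected)
```

Because the square root of heritability is below 1, the corrected accuracy is always larger in magnitude than the observed correlation; for example, an observed correlation of 0.35 with an assumed heritability of 0.49 corresponds to an accuracy of 0.5.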

3.2.2 Evaluating GEBV accuracy through cross validation (CV)

Cross validation (CV) is used in GS research to evaluate GEBV accuracy on empirical data. In cross validation, the reference population is divided into subsets, such as a training set and a validation/testing set. The validation individuals and the selection candidates must have similar genetic backgrounds and relationships to the reference population, so that the accuracies achieved for selection candidates resemble those estimated using the reference population. The size of the subsets affects the accuracy estimate; larger subsets usually result in lower sampling variance of the correlation between predicted and observed values [8].

The number of observations in each set varies, but a fivefold CV is frequently employed, in which the data set is divided into five sets at random, four of which are combined to form the training set, while the remaining set is designated as the validation set. Each subset of the data is used as a validation set once. The model’s accuracy should be evaluated before it is applied to the breeding population. To do so, the majority of the training population is utilized to build a prediction model, which is then used to estimate the genomic estimated breeding values of the remaining individuals in the training population based solely on genotypic data. This allows researchers to “test” and develop the prediction model to ensure that its prediction accuracy is high enough for future predictions to be trusted. Once validated, the model is frequently used to calculate GEBVs of lines for which genotypic but not phenotypic information is available [8, 9].
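
A fivefold scheme of this kind can be sketched as below. The cross-validation loop is generic; the prediction model plugged into it (`fit_marginal`, one simple regression per marker) is only a deliberately simple stand-in and not RR-BLUP or any model from the cited studies. All names and the simulated data are assumptions made for illustration.

```python
import random

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variation."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return 0.0
    return sab / (saa * sbb) ** 0.5

def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_marginal(genotypes, phenotypes):
    """Stand-in model: estimate each marker effect by its own simple regression."""
    n, p = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    means, effects = [], []
    for j in range(p):
        col = [g[j] for g in genotypes]
        mx = sum(col) / n
        sxx = sum((x - mx) ** 2 for x in col)
        sxy = sum((x - mx) * (y - mean_y) for x, y in zip(col, phenotypes))
        means.append(mx)
        effects.append(sxy / sxx if sxx > 0 else 0.0)
    return mean_y, means, effects

def predict_marginal(model, genotype):
    """Predict a phenotype from genotype alone using the fitted stand-in model."""
    mean_y, means, effects = model
    return mean_y + sum(b * (x - m) for x, m, b in zip(genotype, means, effects))

def cross_validate(genotypes, phenotypes, fit, predict, k=5, seed=0):
    """Each fold is the validation set once; the other k-1 folds train the model."""
    n = len(phenotypes)
    accuracies = []
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        model = fit([genotypes[i] for i in train], [phenotypes[i] for i in train])
        predicted = [predict(model, genotypes[i]) for i in held_out]
        observed = [phenotypes[i] for i in held_out]
        accuracies.append(pearson(predicted, observed))
    return accuracies

# Simulated toy reference population: 60 lines, 20 markers coded 0/1/2.
rng = random.Random(7)
true_effects = [rng.gauss(0, 1) for _ in range(20)]
genotypes = [[rng.choice([0, 1, 2]) for _ in range(20)] for _ in range(60)]
phenotypes = [sum(x * b for x, b in zip(g, true_effects)) + rng.gauss(0, 1)
              for g in genotypes]

accuracies = cross_validate(genotypes, phenotypes, fit_marginal, predict_marginal, k=5)
print(accuracies, sum(accuracies) / len(accuracies))
```

Passing `fit` and `predict` as arguments keeps the validation logic separate from the model, so a different GS model can be evaluated with the same folds.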

3.3 Factors affecting genomic selection accuracy

The response of genomic selection is the result of numerous elements that contribute to the accuracy of GEBV estimation. These components are intricately linked in a comprehensive and complex way. The extent and distribution of linkage disequilibrium between individuals, as well as model performance, sample size and relatedness, marker density, gene effect, heritability, and genetic design are all factors to consider.

  1. Marker density

    Marker density and TP sizes required for satisfactory accuracy are heavily influenced by factors such as effective population size and QTL number. The minimum number of markers needed to cover the complete genome is determined from LD decay, so that at least one marker is in LD with each gene region. Prediction is better when LD is extensive and markers are dense [10]. However, unless the marker density is extremely low, marker density has minimal effect on prediction accuracy within families. Furthermore, some GS models, such as Bayes B, do not require particularly dense markers for good breeding value prediction. The required marker density also depends on the type of marker. For example, bi-allelic markers like SNPs require two to three times the density of multi-allelic markers like SSRs [3, 4, 11].

  2. Size and composition of training population

    GS accuracy is affected by the size of the training population. VanRaden et al. [12] found that the relationship between accuracy and training population (TP) size was nearly linear up to the largest size examined. In other words, the maximum GS accuracies were achieved when the training population was large. Furthermore, population structure, training population age, and multiple generations of training all have an impact on accuracy. Training populations made up of parents or near ancestors, older lines, related lines, and multiple generations of training give good accuracy [4, 10]. Additionally, using a pooled training set of heterotic groups could improve accuracy [13]. As a result, the parents or recent ancestors of the population under selection can be used as the training population over repeated generations of training to achieve high accuracy.

    To maintain accuracy when using landrace or exotic germplasm in GS, very high marker density and a large training population size are required [1]. In addition, unrelatedness within the training population and the use of single crosses cause marker effects to become inconsistent. Differences in alleles, allele frequencies, genetic background, or epistatic interactions may lead to erroneous estimates of marker effects and GEBVs [14].

  3. Number of QTL

    The number of QTL and trait heritability determine the appropriate marker density and training population size. Even traits with low heritability can be accurately predicted given a large training population [15]. For this prediction, a model like BLUP can be employed, which captures many small-effect QTL that may not be in LD with the marker.

  4. Heritability

    Low heritability of a trait is associated with lower GEBV accuracies. For low heritability traits (particularly for $h^2$ < 0.4), high accuracy can only be maintained by utilizing a large training population with many phenotypic records [10, 15]. Consider a population with an effective size of 1000 individuals and a target accuracy of 0.70. If the heritability, $h^2$, is 0.2, the training population (TP) size is expected to need to be 9000, whereas if $h^2$ is 0.50, a TP size of fewer than 3000 will be required. Across varied population sizes, QTL numbers, and heritabilities, responses to genomic selection were 18–43 percent higher than responses to marker-assisted recurrent selection (MARS) [16].

  5. Linkage disequilibrium (LD)

    LD refers to the nonrandom association of alleles at different loci. The marker density required, and hence GS accuracy, can be estimated from the rate of LD decay across the genome. It has been found that for high heritability traits, an average LD value ($r^2$) between adjacent markers of 0.15 is sufficient, but increasing the $r^2$ value to 0.2 enhances GEBV prediction accuracy for low heritability traits.

  6. Model used

    Several published GS studies have compared the accuracy of various statistical models. In many of them, the differences in prediction accuracy were negligible. However, some studies have found that the prediction accuracies of various models vary greatly, as seen in rice hybrid breeding [17, 18, 19, 20].
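
The LD value r² mentioned under linkage disequilibrium can be computed for a pair of biallelic markers from phased haplotypes. The sketch below uses the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B; the haplotype samples and the function name `ld_r2` are invented for illustration, with alleles coded 0/1.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes given as (a, b) pairs."""
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n      # frequency of allele 1 at locus A
    p_b = sum(h[1] for h in haplotypes) / n      # frequency of allele 1 at locus B
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                         # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom

# Hypothetical samples of eight or four haplotypes each.
complete_ld = [(1, 1)] * 4 + [(0, 0)] * 4                       # alleles always together
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]                        # alleles independent
partial_ld = [(1, 1)] * 3 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 3  # in between

print(ld_r2(complete_ld), ld_r2(no_ld), ld_r2(partial_ld))
```

Averaging such values over adjacent marker pairs gives the kind of genome-wide r² summary that the thresholds of 0.15 and 0.2 refer to.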

3.4 Approaches to improve genetic gain and GS accuracies

3.4.1 Using biparental populations

With no group structure, biparental populations have a high level of LD between markers and trait alleles. By using full-sib families, three short GS cycles can be completed in one year. According to studies in maize, rapid selection cycles (C) improved genetic gain per unit time in biparental populations [21]. Under drought conditions, maize hybrids generated from cycle C3 yielded 7.3 percent more than those from C0, according to Beyene et al. [21]. In winter wheat biparental populations, Lozada et al. [4] found a 10% increase in response to selection using genomic selection relative to phenotypic selection. Similar studies have been conducted in oat by Asoro et al. [22] and in wheat by Rutkoski et al. [23].

3.4.2 Using multi-parental populations and multi-environment models

In the CIMMYT maize breeding program, multi-parental crosses were made in diallel fashion to form cycle 0 for rapid-cycle GS. With two selection cycles each year and experiments at two locations, the study indicated an improvement in genetic gain, predicting a yield gain of 0.1 t/ha per year over a period of 4.5 years [24]. However, when comparing the C4 cycle to the C0 cycle, a decrease in genetic diversity was seen [24].

3.4.3 Combining GS with high-throughput phenotyping

In addition to genotyping data, accurate phenotypic data are required for genomic prediction model training to achieve the desired accuracy. A number of high-throughput phenotyping technologies have been built for large-scale, accurate collection of phenotypic data in the field. These platforms are based on imaging and remote or proximal sensing technologies. Three types of technologies are in use for high-throughput phenotyping: infrared thermometry and thermal imaging; visible/near-infrared spectroradiometry; and red, green, and blue (RGB) color digital photography. Their deployment is determined by the trait of interest and the experimental design in the field. Data from these technologies can be used as the primary input for model training. By applying high-throughput phenotyping, such as canopy hyperspectral reflectance, to a large number of breeding lines, it is feasible to quantify high-density phenotypes over time and space using remote or proximal sensing. This can improve the precision and intensity of selection, as well as the selection response, while lowering phenotyping costs. Lozada et al. [4] have shown in wheat that combining GS with high-throughput phenotyping results in the highest accuracy for grain yield. The advantage of this imaging approach is that, during early-generation testing, large numbers of lines can be screened at low cost for complex phenotypic expression and for secondary traits that are genetically correlated with grain yield. Juliana et al. [25] report that by utilizing high-throughput phenotyping, they were able to achieve a 60 percent increase in genetic gain for wheat yield and secondary characteristics.

3.4.4 Using historical data

Predicting the performance of new lines can be done by using phenotypic data from relatives and ancestors for model training that accounts for GxE interaction in multi-location research [26]. Historical data from breeding programs can be used effectively to increase genomic selection accuracy, particularly when the training set is adjusted to include only the most informative individuals from the target testing set [27].

3.4.5 Genotype imputation

For genotyping, genomic selection uses a large sample size and a dense marker set. In such data sets, missing data is a problem. Missing data can be dealt with in one of four ways: (1) repeating genotyping in missing regions, (2) adapting analysis methods to accommodate missing data, (3) eliminating SNPs and/or samples with missing data, or (4) inferring the missing data (imputation).

Imputation of genotypes is useful in a variety of situations. First, genotyping by sequencing, which is regarded as a low-cost genotyping method, typically yields a large number of markers at a low cost, but with a high proportion of missing data due to the low sequencing depth. After imputation, the data set is complete and ready for further analysis [28]. Second, by combining low density genotyping with a closely related reference panel genotyped at high density, imputation can enable GEBV prediction without a significant loss of accuracy. Using this in silico genotyping technique, low density genotyping in GS can be done without sacrificing accuracy. Imputation, on the other hand, may introduce biases and inaccuracies [29].

Haplotype tagging is the simplest technique for genotype imputation [30]. In this strategy, tag SNPs are chosen from the reference panel so that the majority of known (common) SNPs have an $r^2$ of at least 0.8 with a tag SNP. To identify shared haplotypes, the reference panel haplotypes are compared to the genotyped markers. The missing genotypes are then filled in by copying alleles found in a matching reference haplotype (the FILLIN method) [29].
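
The idea of copying alleles from a matching reference haplotype can be illustrated with a toy sketch. This is not the FILLIN implementation or any published algorithm: it simply picks the reference haplotype that agrees with the target at the most observed markers and copies its alleles into the gaps. The function name, the 0/1 allele coding, the use of `None` for missing calls, and the data are all assumptions.

```python
def impute_from_reference(target, reference_haplotypes):
    """Fill missing alleles (None) by copying from the best-matching reference."""
    def matches(ref):
        # Count agreements at markers where the target has an observed allele.
        return sum(1 for t, r in zip(target, ref) if t is not None and t == r)

    best = max(reference_haplotypes, key=matches)  # first one wins on ties
    return [r if t is None else t for t, r in zip(target, best)]

# Hypothetical reference panel of three haplotypes over five markers.
reference = [
    [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1],
]
# Target haplotype with two missing calls.
target = [0, 1, None, 0, None]

imputed = impute_from_reference(target, reference)
print(imputed)  # [0, 1, 1, 0, 1]
```

Observed alleles are never overwritten; only the missing positions take values from the reference, which is also why a poorly matching panel can introduce the biases noted earlier.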

For imputation, a number of statistical approaches have been developed. These include the expectation maximization, Bayesian, LinkImpute, LD k-nearest neighbor imputation (LD KNNi) and entropy methods. These methods integrate models of recombination by partitioning markers into haplotype blocks. The tree-based imputation infers on the basis of perfect phylogeny and pairwise haplotype dissimilarity rather than haplotype structure [31].

3.5 Applications of genomic selection

3.5.1 GS for breeding of quality traits and yield

Grain yield is a crucial economic trait that has been studied in most GS studies of crops such as wheat. Grain yield is a complex quantitative trait that is affected by interactions between genes and environments and is regulated by many genes with small effects. GS has been shown to be important in grain yield studies in cereals. Prediction accuracies have been improved by including GxE effects in models [32]. Furthermore, GS has helped reduce the cost of phenotyping for malt quality in barley breeding [33].

3.5.2 GS and breeding for disease resistance

In terms of boosting intricate quantitative disease resistance, the GS method has a lot of potential for crop breeders. Pathogens find it difficult to overcome quantitative disease resistance because it is governed by a large number of genes with minor effects. Wheat rust, fusarium head blight, and rice blast resistance are three of the most well-studied diseases using the GS approach [18, 34].

3.5.3 GS for germplasm enhancement

Gene bank accessions are a rich source of alleles for cultivar development. Identification of these alleles is costly and time-consuming, and it necessitates extensive pre-breeding operations. Germplasm enhancement initiatives can begin with landraces by crossing them with elite testers. High genome-enabled prediction accuracy may be attained with GS, which may aid breeders in introducing valuable genetic variants. This supports the use of GS to introduce landrace accessions into elite germplasm and create gene pools and populations suited for pre-breeding and germplasm improvement [35].

3.5.4 GS for hybrid breeding

In hybrid breeding, parental selection is a critical issue. Using whole-genome markers in GS, the performance of prospective crosses among a set of genotyped parents can be predicted from a small number of crosses evaluated in the field. This lowers the expense of making and field testing all possible hybrids. GS can also be used to predict hybrid performance and assist in hybrid selection. The predicted hybrids can be tested in the field and released as superior hybrids if they pass the test. Only a few papers report the use of genomic selection for hybrid breeding in maize and rice [17, 20].

4. Conclusion

Traditional cultivar development in plant breeding requires an understanding of biological function before phenotypes can be created. GS enables low-cost breeding without the mapping and characterization of genes/QTLs otherwise needed to gain functional data. GS can result in high genetic gain per unit time in crop breeding programs when GEBV accuracy is enhanced by employing dense markers, increasing training population size and trait heritability, adopting a good GEBV prediction model, and using imputation techniques.

Acknowledgments

I acknowledge the support from the Department of Biotechnology, Addis Ababa Science and Technology University, Addis Ababa, Ethiopia.

Nomenclature

MAS: Marker-assisted selection
GS: Genomic selection
BV: Breeding value
GEBV: Genomic estimated breeding value
TBV: True breeding value
CV: Cross-validation
TP: Training population
BP: Breeding population
LD: Linkage disequilibrium
C: Selection cycle

References

  1. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic values using genome-wide dense marker maps. Genetics. 2001;157:1819-1829. DOI: 10.1093/genetics/157.4.1819
  2. Nakaya A, Isobe S. Will genomic selection be a practical method for plant breeding? Annals of Botany. 2012;110:1303-1316. DOI: 10.1093/aob/mcs109
  3. Jannink J, Lorenz AJ, Iwata H. Genomic selection in plant breeding: From theory to practice. Briefings in Functional Genomics. 2010;9:166-177. DOI: 10.1093/bfgp/elq001
  4. Lozada DN, Mason RE, Sarinelli JM, Brown-Guedira G. Accuracy of genomic selection for grain yield and agronomic traits in soft red winter wheat. BMC Genetics. 2019;20:82. DOI: 10.1186/s12863-019-0785-1
  5. Lozada DN, Carter AH. Genomic selection in winter wheat breeding using a recommender approach. Genes. 2020;11:779. DOI: 10.3390/genes11070779
  6. Heslot N, Yang H-S, Sorrells ME, Jannink JL. Genomic selection in plant breeding: A comparison of models. Crop Science. 2012;52:146-160. DOI: 10.2135/cropsci2011.06.0297
  7. Storlie E, Charmet G. Genomic selection accuracy using historical data generated in a wheat breeding program. The Plant Genome. 2013;6:1-9. DOI: 10.3835/plantgenome2013.01.0001
  8. Zhao Y, Gowda M, Liu W, Würschum T, Maurer HP, Longin FH, et al. Accuracy of genomic selection in European maize elite breeding populations. Theoretical and Applied Genetics. 2012;124:769-776. DOI: 10.1007/s00122-011-1745-y
  9. Heffner EL, Jannink JL, Sorrells ME. Genomic selection accuracy using multifamily prediction models in a wheat breeding program. The Plant Genome. 2011;4:65-74. DOI: 10.3835/plantgenome2010.12.0029
  10. Liu X, Wang H, Wang H, Guo Z, Xu X, Liu J, et al. Factors affecting genomic selection revealed by empirical evidence in maize. The Crop Journal. 2018;6:341-352. DOI: 10.1016/j.cj.2018.03.005
  11. Solberg TR, Sonesson AK, Woolliams JA, Meuwissen TH. Genomic selection using different marker types and densities. Journal of Animal Science. 2008;86:2447-2454. DOI: 10.2527/jas.2007-0010
  12. VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, et al. Reliability of genomic predictions for North American Holstein bulls (invited review). Journal of Dairy Science. 2009;92:16-24. DOI: 10.3168/jds.2008-1514
  13. Technow F, Bürger A, Melchinger AE. Genomic prediction of northern corn leaf blight resistance in maize with combined or separated training sets for heterotic groups. G3: Genes, Genomes, Genetics. 2013;3:197-203. DOI: 10.1534/g3.112.004630
  14. Bernardo R. Molecular markers and selection for complex traits in plants: Learning from the last 20 years. Crop Science. 2008;48:1649-1664. DOI: 10.2135/cropsci2008.03.0131
  15. Hayes B, Bowman P, Chamberlain A, Goddard M. Invited review: Genomic selection in dairy cattle: Progress and challenges. Journal of Dairy Science. 2009;92:433-443. DOI: 10.3168/jds.2008-1646
  16. Bernardo R, Yu JM. Prospects for genomewide selection for quantitative traits in maize. Crop Science. 2007;47:1082-1090. DOI: 10.2135/cropsci2006.11.0690
  17. Cui Y, Li R, Li G, Zhang F, Zhu T, Zhang Q, et al. Hybrid breeding of rice via genomic selection. Plant Biotechnology Journal. 2020;18:57-67. DOI: 10.1111/pbi.13170
  18. Huang M, Balimponya EG, Mgonja EM, McHale LK, Luzi-Kihupi A, Wang GL, et al. Use of genomic selection in breeding rice (Oryza sativa L.) for resistance to rice blast (Magnaporthe oryzae). Molecular Breeding. 2019;39:114. DOI: 10.1007/s11032-019-1023-2
  19. Islam MS, Fang DD, Jenkins JN, Guo J, McCarty JC, Jones DC. Evaluation of genomic selection methods for predicting fiber quality traits in upland cotton. Molecular Genetics and Genomics. 2020;295:67-79. DOI: 10.1007/s00438-019-01599-z
  20. Xu Y, Wang X, Ding X, Zheng X, Yang Z, Xu C, et al. Genomic selection of agronomic traits in hybrid rice using an NCII population. Rice. 2018;11:32. DOI: 10.1186/s12284-018-0223-4
  21. Beyene Y, Semagn K, Mugo S, Tarekegne A, Babu R, Meisel B, et al. Genetic gains in grain yield through genomic selection in eight bi-parental maize populations under drought stress. Crop Science. 2015;55:154-163. DOI: 10.2135/cropsci2014.07.0460
  22. Asoro FG, Newell MA, Beavis WD, Scott MP, Tinker NA, Jannink JL. Genomic marker-assisted and pedigree-BLUP selection methods for beta-glucan concentration in elite oat. Crop Science. 2013;53:1894-1906. DOI: 10.2135/cropsci2012.09.0526
  23. Rutkoski J, Singh RP, Huerta-Espino J, Bhavani S, Poland J, Jannink JL, et al. Genetic gain from phenotypic and genomic selection for quantitative resistance to stem rust of wheat. The Plant Genome. 2015;8:1-10. DOI: 10.3835/plantgenome2014.10.0074
  24. Zhang X, Pérez-Rodríguez P, Burgueño J, Olsen M, Buckler E, Atlin G, et al. Rapid cycling genomic selection in a multi-parental tropical maize population. G3: Genes, Genomes, Genetics. 2017;7:1-12. DOI: 10.1534/g3.117.043141
  25. Juliana P, Montesinos-López OA, Crossa J, Mondal S, González Pérez L, Poland J, et al. Integrating genomic-enabled prediction and high-throughput phenotyping in breeding for climate-resilient bread wheat. Theoretical and Applied Genetics. 2019;132:177-194. DOI: 10.1007/s00122-018-3206-3
  26. Dawson JC, Endelman JB, Heslot N, Crossa J, Poland J, Dreisigacker S, et al. The use of unbalanced historical data for genomic selection in an international wheat breeding program. Field Crops Research. 2013;154:12-22. DOI: 10.1016/j.fcr.2013.07.020
  27. Atanda SA, Olsen M, Burgueño J, Crossa J, Dzidzienyo D, Beyene Y, et al. Maximizing efficiency of genomic selection in CIMMYT’s tropical maize breeding program. Theoretical and Applied Genetics. 2021;134:279-294. DOI: 10.1007/s00122-020-03696-9
  28. Rutkoski JE, Poland J, Jannink JL, Sorrells ME. Imputation of unordered markers and the impact on genomic selection accuracy. G3: Genes, Genomes, Genetics. 2013;3:427-439. DOI: 10.1534/g3.112.005363
  29. Li Y, Willer C, Sanna S, Abecasis G. Genotype imputation. Annual Review of Genomics and Human Genetics. 2009;10:387-406. DOI: 10.1146/annurev.genom.9.081307.164242
  30. Johnson GCL, Esposito L, Barratt BJ, Smith AN, Heward J, Genova GD, et al. Haplotype tagging for the identification of common disease genes. Nature Genetics. 2001;29:233-237. DOI: 10.1038/ng1001-233
  31. Roberts A, McMillan L, Wang W, Parker J, Rusyn I, Threadgill D. Inferring missing genotypes in large SNP panels using fast nearest-neighbour searches over sliding windows. Bioinformatics. 2007;23:401-407. DOI: 10.1093/bioinformatics/btm220
  32. Juliana P, Singh RP, Braun HJ, Huerta-Espino J, Crespo-Herrera L, Govindan V, et al. Genomic selection for grain yield in the CIMMYT wheat breeding program: Status and perspectives. Frontiers in Plant Science. 2020;11:564183. DOI: 10.6084/m9.figshare.12350000.v1
  33. Schmidt M, Kollers S, Maasberg-Prelle A, Großer J, Schinkel B, Tomerius A, et al. Prediction of malting quality traits in barley based on genome-wide marker data to assess the potential of genomic selection. Theoretical and Applied Genetics. 2015. DOI: 10.1007/s00122-015-2639-1
  34. Ornella L, Singh S, Perez P, Burgueño J, Singh R, Tapia E, et al. Genomic prediction of genetic values for resistance to wheat rusts. The Plant Genome. 2012;5:136-148. DOI: 10.3835/plantgenome2012.07.0017
  35. Gorjanc G, Jenko J, Hearne SJ, Hickey JM. Initiating maize pre-breeding programs using genomic selection to harness polygenic variation from landrace populations. BMC Genomics. 2016;17:30. DOI: 10.1186/s12864-015-2345-z
