Application of Microsatellites in Genetic Diversity Analysis and Heterotic Grouping of Sorghum and Maize

Sorghum and maize are major cereal crops worldwide and key food security crops in Sub-Saharan Africa. The difference in their mating systems, with maize predominantly cross-fertilizing and sorghum predominantly self-fertilizing, is reflected in differences in visible phenotypic and genotypic variation and dictates the level of genetic variation present in the two crops. Conventionally, heterotic group assignment is based on phenotypic values estimated through combining ability and heterosis analyses. However, phenotypic evaluation methods are limited by the influence of the environment and may not reflect the heterotic pattern of the lines accurately. Therefore, more effective and complementary methods have been proposed for heterotic grouping of candidate lines. Estimation of molecular-based genetic distance has proven to be a useful tool to describe existing heterotic groups, to identify new heterotic groups, and to assign inbreds to heterotic groups. Among the molecular markers, microsatellite markers have proved to be a powerful tool for analyzing genetic diversity and for classifying inbred lines into heterotic groups. Therefore, the aim of this chapter is to elucidate the use of microsatellite markers in genetic diversity analysis and heterotic grouping of sorghum and maize.


Introduction
Maize and sorghum have been more widely evaluated in genetic and cytogenetic studies than other cereal crops. Maize is one of the domesticated crop species with the highest level of molecular polymorphism. Nucleotide diversity of more than 5% has been reported at some loci of the maize genome [1], and this is reflected in the high genetic variability of maize. The molecular diversity of maize is approximately 3- to 10-fold higher than that of any other domesticated grass species [2]. Several factors have been suggested as reasons for the diversity in maize, including: (1) differences in the growing environments, cultivation geared toward various production systems, and varied consumption preferences [3] that influenced breeding of maize varieties to serve diverse human needs worldwide; (2) a high level of cross-fertilization and independent assortment of genes that led to considerable gene transfer between populations, including wild relatives; (3) the presence of duplications and recombination of genes leading to creation of mutations and ultimately phenotypic variability [4]; and (4) the existence of transposons and retro-transposable genetic elements leading to marked genetic variation among maize populations [5].
Similarly, sorghum is one of the most genetically diverse self-fertilizing crops. Early domestication and selection of sorghum in response to environmental factors and human needs resulted in the wide variability. The environmental factors included day length, altitude, temperature, rainfall, and soil characteristics. Humans usually required a large panicle, a nonshattering habit, large grain, tall plant height, and early maturity. The greater genetic diversity is, therefore, partly due to the diverse physical environments and partly due to the interaction of man with the environment [6]. As a result, the new and stable sorghum biotypes that have emerged can be attributed to selection, adaptation, intercrossing, and the movement of plant material from place to place. Introduction of new genotypes that have evolved in other places may result in intercrossing with the native genetic resources leading to the development of new biotypes. This movement and evolution of germplasm gave rise to five major sorghum races: bicolor, caudatum, guinea, kafir, and durra [7].
Morpho-agronomic characters of crop plants have traditionally been used for assessment of genetic variability. These characters reflect genetic variations that are manifested as visible morphological traits [8]. However, assessments based on these characters are neither efficient nor reliable because they are strongly affected by environmental factors. Other genetic variations are compositional or chemical and require various tests for evaluation [9]. Isozymes [10] and seed storage proteins [11] were the most widely used biochemical markers. Since the late 1980s, analyses of seed storage proteins using various electrophoretic methods [12] and reversed-phase high-performance liquid chromatography (RP-HPLC) [13] have been developed and are considered effective methods for cultivar identification. However, the usefulness of these types of markers is often limited by low polymorphism.
The application of DNA molecular markers, as compared to morphological and biochemical markers, overcomes the problem of low polymorphism. DNA markers are highly informative and have facilitated the identification of agronomic traits in wild, traditional, and improved germplasm through the dissection of quantitative traits [14]. DNA-based molecular markers are independent of environmental factors. They are fast, efficient, and robust, and they reveal genetic differences more clearly than phenotypic markers [15]. Several DNA marker technologies are available for determining genetic variations. Nevertheless, selection of the best marker system depends on the target species, the aim of the marker analysis, and the resource capacity [14]. PCR-based markers are widely preferred for genotype characterization in diverse crop species, including sorghum and maize, as they are relatively simple to use, nondestructive, and require only a small quantity of DNA, thus permitting many reactions from a single sample [16]. In addition, genetic distance (GD) estimates using molecular markers are reportedly helpful to identify the best parent combinations for new pedigree starts and to assign lines into heterotic groups [17,18].
Molecular markers, such as restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLPs), and simple sequence repeats (SSRs) or microsatellites, have been proposed as tools not only to evaluate breeding lines, hybrids, and cultivars [19] but also to facilitate the monitoring of introgression, mapping of quantitative trait loci (QTLs), and the assessment of genetic diversity [20,21] in various crops, including sorghum and maize. SSR markers have been widely applied for the assessment of genetic diversity and characterization of germplasm [22][23][24], identification and fingerprinting of genotypes [14], estimation of genetic distances between and within populations [22], and assignment of inbred lines into heterotic groups [25,26]. SSR data from a number of loci have the potential to provide unique allelic profiles or DNA fingerprints for precisely establishing genotypic identity. They also have greater discriminatory power than RFLP markers and can exhibit genetic relations that are reflective of the pedigree of the inbred lines [27]. Genotyping of inbred lines using SSRs is a reliable way of germplasm characterization which, together with morphological descriptions, leads to unambiguous differentiation of genotypes that can be utilized for a hybrid breeding program [28]. Therefore, SSR markers are an efficient marker of choice because they provide informative multiallelic loci, are highly reproducible, have great power of genotypic differentiation, and are relatively simple to use [29].
Classification of the available complementary inbreds into distinct heterotic groups is crucial to the development of superior hybrids and to developing genetic pools and breeding populations for designed breeding and genetic analyses. Exploitation of heterosis and the utilization of heterotic groups and their patterns are well established and developed in maize [30]. However, efforts to determine heterotic groups in sorghum have not been successful in clearly delineating any patterns [31]. The phenomenon of heterosis between genetically distant or unrelated genotypes has been widely reported. A heterotic group is a group of related or unrelated genotypes displaying similar combining ability effects and providing a heterotic response when crossed to another genetically distinct and complementary group [32]. Classification of inbred lines into heterotic groups based on phenotypic values could be inaccurate due to the influence of the environment and may not truly reflect the heterotic pattern of the lines. Therefore, more effective methods have been proposed for genetic grouping of candidate lines, including the use of molecular markers, line by tester analysis, and diallel crosses, among others. Recently, genetic distance has been used in numerous crop plants as an index of genetic relatedness and as a tool for defining potential heterotic groups. Simple sequence repeats (SSRs) have proved to be a powerful tool for analyzing genetic diversity and for classifying inbred lines. Therefore, the aim of this chapter is to elucidate the use of SSR markers in the heterotic grouping of the two model crops using experimental data.

Determination of genetic diversity using SSRs in sorghum
Assessment of genetic variability in crops has a strong impact on crop improvement programs and conservation of genetic resources [33]. SSR markers appear to be particularly useful for measuring diversity, for assigning genotypes to heterotic groups, and for genetic fingerprinting [34]. The study reported by our group [26], involving 36 sorghum lines, provided clear genetic differentiation using 30 SSR markers. The 32 lowland sorghum lines from Ethiopia were crossed with the four cytoplasmic male-sterile (CMS) lines using a line × tester mating design. The 128 single-cross hybrids, along with the parental genotypes plus four checks, were evaluated under rainfed and irrigated conditions. A 12 × 14 incomplete block design (alpha lattice), with three replications, was used for the evaluation of the hybrids and check varieties. To determine the magnitude of heterosis and combining ability effects, the maintainer lines were used in place of their male-sterile counterparts. The interrow spacing was 0.75 m and the intrarow spacing was 0.30 m. Each genotype was planted in three rows, each 3 m long. A 1 m pathway separated the plots. Two sorghum seeds were planted per hill, and two weeks after emergence the seedlings were thinned to one healthy and vigorous plant per hill.
Performance data on 128 F1 hybrids generated from these parents were used for this study.
Grain yield data were recorded on both rainfed and irrigated plots. Best linear unbiased estimates (BLUEs) were made from the grain yield performance of 128 hybrids. The BLUEs of hybrid performance were calculated using trial data from two environments (rainfed and irrigated) using Genstat for Windows 17th Edition [35]. The BLUEs were then used to calculate general combining ability (GCA), specific combining ability (SCA) effects, and the level of heterosis. Heterotic group's specific and general combining ability (HSGCA) was computed as the sum of GCA and SCA. A phylogenetic tree was constructed from the genetic distance matrix and HSGCA value using the neighbor-joining method implemented in DARwin software ver 5.0 [36].
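The combining ability calculation described above can be sketched in a few lines. This is a minimal illustration, not the Genstat analysis used in the study: it assumes a complete line × tester table of hybrid means (for example, BLUEs), uses the usual mean-based estimates of GCA and SCA, and takes the GCA term in HSGCA to be that of the line, which is an assumption about the definition. The function name `combining_ability` and the genotype labels and yield values in the example are hypothetical.

```python
from statistics import mean


def combining_ability(yields):
    """Estimate GCA, SCA and HSGCA from a complete line x tester table.

    yields maps (line, tester) -> hybrid mean, e.g. a BLUE for grain yield.
    HSGCA is taken here as line GCA + SCA (an assumption), which equals
    the hybrid mean minus the mean of all hybrids sharing that tester.
    """
    lines = sorted({line for line, _ in yields})
    testers = sorted({tester for _, tester in yields})

    # The mean-based estimates below require every line x tester cell.
    for line in lines:
        for tester in testers:
            if (line, tester) not in yields:
                raise ValueError(f"missing hybrid: {line} x {tester}")

    grand = mean(yields.values())
    line_mean = {l: mean(yields[(l, t)] for t in testers) for l in lines}
    tester_mean = {t: mean(yields[(l, t)] for l in lines) for t in testers}

    gca_line = {l: line_mean[l] - grand for l in lines}
    gca_tester = {t: tester_mean[t] - grand for t in testers}
    sca = {
        (l, t): yields[(l, t)] - line_mean[l] - tester_mean[t] + grand
        for (l, t) in yields
    }
    hsgca = {(l, t): gca_line[l] + sca[(l, t)] for (l, t) in yields}
    return gca_line, gca_tester, sca, hsgca


if __name__ == "__main__":
    # Hypothetical hybrid means (t/ha) for two R lines and two testers.
    example = {
        ("R1", "A1"): 5.0, ("R1", "A2"): 5.0,
        ("R2", "A1"): 1.0, ("R2", "A2"): 5.0,
    }
    gca_line, gca_tester, sca, hsgca = combining_ability(example)
    for cross in sorted(hsgca):
        print(cross, round(sca[cross], 2), round(hsgca[cross], 2))
```

Lines with a high positive HSGCA against one tester and a low or negative value against another are the candidates that such an analysis would place in the group opposite the first tester.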
This study detected a total of 203 putative alleles, and the number of alleles per locus was highly variable, ranging from 2 (mSbCIR223, Xcup61, and Xtxp040) to 15 (Xtxp145), with a mean of 6.8 per locus (Table 2). The mean number of alleles was slightly higher than those reported by Folkertsma et al. [37] and Ganapathy et al. [25] but lower than those of Wang et al. [38] and Mutegi et al. [34]. The higher level of allelic diversity of the SSR loci examined in this study was probably associated with the wide range of genetic diversity represented in the sorghum R lines sampled. The results of a χ² test showed significant differences in major allele frequencies, with a mean major allele frequency of 0.50. This result is in agreement with the results of Wang et al. [38]. A total of 60 rare alleles, those occurring at a frequency of ≤5%, were detected by the 30 SSR markers. The detection of a significant number of rare alleles could be attributed to the high genetic diversity within the sorghum lines. Polymorphism information content (PIC) values ranged from 0.15 (mSbCIR223) to 0.90 (Xtxp145), with a mean of 0.63 (Table 2). High PIC values have been reported by others [22,39]. Among the tested SSRs, 26 markers (87%) had PIC values greater than 0.5, indicating their usefulness in discriminating between the genotypes. Observed heterozygosity (Ho) ranged from 0.0 to 0.03, with a mean of 0.01, indicating that the test lines used in the present study were genetically pure lines that had been maintained by continued self-fertilization. The mean expected heterozygosity (He) was 0.64, with the maximum and minimum He values recorded for Xtxp145 (0.91) and mSbCIR223 (0.15), respectively. Expected heterozygosity was therefore high for the test materials, suggesting that 64% of individuals would be expected to be heterozygous at a given locus under random mating. This can be explained by the relatively high outcrossing rate (5%-50%) observed in sorghum [40].
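The per-locus statistics reported above (number of alleles, major allele frequency, rare alleles, He, Ho, and PIC) can all be derived from allele frequencies. The sketch below is illustrative only and is not the software used in the study: it assumes diploid genotypes coded as pairs of allele sizes, computes He as gene diversity without a small-sample correction, and uses the PIC formula of Botstein et al. The function name `locus_summary` and the example genotypes are hypothetical.

```python
from collections import Counter


def locus_summary(genotypes, rare_cutoff=0.05):
    """Summarize one SSR locus.

    genotypes is a list of (allele, allele) pairs, one pair per line.
    He  = 1 - sum(p_i^2)                              (gene diversity)
    PIC = He - sum over i<j of 2 * p_i^2 * p_j^2      (Botstein et al.)
    Ho  = proportion of lines carrying two different alleles
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")

    alleles = [a for pair in genotypes for a in pair]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = {allele: n / total for allele, n in counts.items()}
    p = list(freqs.values())

    he = 1.0 - sum(x * x for x in p)
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)

    return {
        "n_alleles": len(freqs),
        "major_allele_freq": max(p),
        "rare_alleles": sorted(a for a, f in freqs.items() if f <= rare_cutoff),
        "He": he,
        "Ho": ho,
        "PIC": he - cross_term,
    }


if __name__ == "__main__":
    # Hypothetical locus scored on four homozygous inbred lines.
    calls = [(100, 100), (100, 100), (104, 104), (104, 104)]
    for key, value in locus_summary(calls).items():
        print(key, value)
```

For inbred lines, Ho close to zero alongside a high He is the expected pattern: the diversity sits between lines rather than within them.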
The genetic distance between the lines ranged from 0.40 to 0.80, with overall mean of 0.63 [26].
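The distance measure behind these values is not restated here, so the following is only a sketch of one common choice for SSR data, the proportion-of-shared-alleles distance, and should not be read as the method used in [26]. It assumes diploid genotypes listed in the same locus order for both lines, with `None` marking a missing call; the function name and the example genotypes are hypothetical.

```python
from collections import Counter


def shared_allele_distance(line_a, line_b):
    """Proportion-of-shared-alleles distance between two diploid lines.

    line_a and line_b are lists of (allele, allele) pairs in the same
    locus order; None marks a missing call and that locus is skipped.
    The distance is 1 minus the mean fraction of alleles shared per locus.
    """
    shared_total = 0.0
    loci_used = 0
    for geno_a, geno_b in zip(line_a, line_b):
        if geno_a is None or geno_b is None:
            continue
        # Multiset intersection counts alleles present in both genotypes.
        shared = sum((Counter(geno_a) & Counter(geno_b)).values())
        shared_total += shared / 2
        loci_used += 1
    if loci_used == 0:
        raise ValueError("no loci scored in both lines")
    return 1.0 - shared_total / loci_used


if __name__ == "__main__":
    # Hypothetical three-locus profiles for two lines.
    a = [(100, 100), (200, 200), (150, 150)]
    b = [(100, 100), (204, 204), (150, 152)]
    print(shared_allele_distance(a, b))
```

Applying such a function to every pair of lines yields the distance matrix from which a neighbor-joining tree can then be built.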

Determination of genetic diversity using SSRs in maize
Comparing two marker systems (SSRs and RAPDs), researchers [23] reported that the RAPDs produced several polymorphic bands, although the resolution power of the agarose gel electrophoresis was not good enough to allow the bands of both marker systems to be seen clearly. In the study by Demissew et al. [23], the 25 RAPD markers yielded a total of 31 alleles, with an average of 1.24 alleles per locus. Only 7.5% of the RAPD primers exhibited polymorphic bands, while the majority of the markers were monomorphic. The results were consistent with the findings of Asif et al. [41]. The usefulness of a given marker in characterizing genotypes is determined by the level of polymorphism it can detect and its potential to distinguish individuals. A higher PIC value was observed for the SSR markers than for the RAPD markers, reflecting the better discriminating power of SSR markers, which makes them ideal for fingerprinting maize lines, as was reported by Liu et al. [42]. Garcia et al. [43] also found that the mean polymorphism information content of RFLP and SSR markers was higher than that of RAPD and AFLP markers.
In another study, a total of 98 alleles, with a mean of 3.9 alleles per marker, were detected across 30 quality protein maize (QPM) and 6 non-QPM maize inbred lines using 25 SSR markers [24] (Table 3). The number of alleles detected in this study was in agreement with other studies [44]. Beyene et al. [45] genotyped 62 traditional Ethiopian highland maize accessions with 20 SSRs and reported a total of 98 alleles and a mean of 4.9 alleles per marker. Legesse et al. [20] reported an average of 3.9 alleles per marker by genotyping 56 highland and mid-altitude non-QPM inbred lines using 27 SSRs. Krishna et al. [15] reported a mean of 4.1 alleles using 48 SSR loci and 63 QPM inbred lines. The mean numbers of alleles in these studies were, however, lower than the 5.4 and 6.4 alleles previously reported by Wu et al. [46] and Yao et al. [47], respectively, but higher than the 3.3 alleles reported by Kassahun and Prasanna [48] and the 2.4-3.4 alleles reported by Babu et al. [49,50]. The differences in mean numbers of alleles among different studies could be attributed to the type of germplasm, sample size, and repeat length of the SSRs used [24].
According to the PIC guideline of Botstein et al. [51], 14 markers from Demissew et al. [24] were reasonably informative (0.30 < PIC < 0.50) and the remaining 11 markers were highly informative (PIC > 0.50). The values were comparable with previous reports by Dhliwayo et al. [52] and Mahar et al. [53] but lower than those reported by Krishna et al. [15]. The smaller PIC values may have been due to the presence of relatively few dinucleotide repeat SSR markers [24], as opposed to the greater number of dinucleotide repeats used in other studies [49,50], or to the presence of little genetic variability among the genotypes used in that particular study [52].

Population structure and heterotic grouping in sorghum
In sorghum, a predominantly self-pollinated crop, the exploitation of heterosis began in the USA in the 1950s. There have been few studies on the mechanism of heterosis, heterotic grouping, and the use of molecular markers as selection criteria for parents in sorghum when compared to other crops such as maize [54]. Heterosis in sorghum has been reported in the form of increased grain, hastened flowering and maturity, increased height, and larger stems and panicles [54]. Enhanced grain yield was reported by Kambal and Webster [55] to be a product of an increased number of seeds per panicle and increased seed weight. Hybrid sorghum cultivars have been demonstrated to be more productive than pure line varieties [56]. Significant heterosis for grain yield and other agronomic traits has been reported in sorghum [57]. It has also been reported that F1 hybrids have greater buffering capacity across variable environments than pure lines in sorghum [58]. Consequently, breeding hybrid cultivars is a better option than breeding pure line varieties for improving sorghum grain yield.
SSR marker data have frequently been used as a tool to examine the dynamics of differentiation and population structures within germplasm collections [34,38]. Cluster analysis using neighbor-joining tree analysis and structure analysis can estimate the number of subpopulations and the genetic relatedness among assessed genotypes. The study by Amelework et al. [22] investigated the extent of genetic differentiation, population structure, and patterns of relationship among 200 sorghum landraces collected from lowland agro-ecology. The results obtained from both model-based population structure analysis and neighbor-joining tree analysis revealed two groups. The two distinct subgroups resulted from farmers' selection for adaptation to the two main seasons. The results obtained from these two separate analyses support each other, with a small discrepancy between groupings. Out of the 200 landraces, 32 genotypes were selected, based on a prior study, for their relatively better yield performance and better adaptability in a moisture stress environment. They were kept homogeneous through continued selfing and selection.
Estimation of molecular-based genetic distance has proven to be a useful way to describe existing heterotic groups, to identify new heterotic groups, and to assign inbreds of unknown genetic origin to established heterotic groups [25]. The cluster analysis carried out on the 32 lines and 4 A/B female lines, based on SSR markers, revealed three distinct groups among the 36 parental genotypes [26]. Cluster I consisted of a large number of landraces (15 genotypes). This cluster included R lines such as 242039B, 244733, 242036, 244735A, 73059, and 214855 that showed the highest significant HSGCA in cross-combination with ICSA 749 and ICSA 756. This group was dominated by late-flowering and high-biomass lines. The high biomass was, in turn, expressed as a large number of leaves and greater leaf width per plant. The second cluster was composed of 11 landraces and 1 CMS line. All the R lines clustered in this group, except 244727, showed positive HSGCA in cross-combination with ICSA 756. Cluster II was dominated by early-flowering genotypes with small panicles and high 100-seed weight. It was reported that heterosis in sorghum is expressed as a high plant or crop growth rate as compared with the parents [59]. The third cluster (III) was composed of six landraces and three CMS lines. This cluster included three R lines (75454, 239208, and 242049A) with high and positive HSGCA in cross-combination with ICSA 743 and ICSA 756.
Heterotic groups comprise sets of genotypes that perform well when crossed with genotypes from a different heterotic group [30]. Heterotic groups in sorghum have been defined by the milo-kafir cytoplasmic genetic male-sterility system, where lines are grouped either as A/B-lines or R-lines [25]. The independent cluster analysis carried out based on HSGCA values for grain yield under irrigated and rainfed conditions revealed three heterotic patterns based on the distribution of the 32 lines across environments (Figure 1A and B). In this study, R and B lines did not show distinct heterotic grouping. The groupings that appeared were mainly based on the female parents. For example, in Figure 1A, the genotypes assigned to the first cluster (blue) had high and positive HSGCA values in cross-combination with ICSA 743. The second group (purple) was composed of 15 genotypes that showed positive HSGCA values in cross-combination with ICSA 749. The third group (blue) was mainly represented by genotypes that revealed positive HSGCA values in cross-combination with ICSA 756.
The extent of genetic diversity between the two parents has been proposed as a possible predictor of heterosis [60]. Although it has been suggested that the genetic distance between parents is positively correlated with heterosis of F1 hybrids, a strong association has rarely been observed between heterosis and genetic distance between parents [61]. However, studies in different crops have shown moderate to strong correlation between combining ability and per se performance [62]. Even though this method is extensively used for prediction of heterosis, it is hypothetical and relies heavily on field evaluation. In the study of Amelework et al. [26], there were significant variations for grain yield, SCA, HSGCA, and mid- and better-parent heterosis among the 128 F1 hybrids and 36 parental lines. However, the results of the correlation analysis revealed that SSR-based genetic distance had no significant association with any of the grouping methods across environments (Table 4). Better-parent heterosis (BPH) under irrigated conditions had no significant correlation with SCA and HSGCA under both irrigated and rainfed conditions. In contrast, both mid- and better-parent heterosis under rainfed conditions showed significant association with SCA, HSGCA, and mid-parent heterosis (MPH) under irrigation and across the two environments. The lack of significant association between genetic distance and other hybrid performance indicators in this study is also supported by other studies. In studies on rice [63], wheat [64], and grain sorghum [65], there were also nonsignificant relationships between whole genome-based genetic distance and hybrid performance. However, Boppenmaier et al. [66] and Mosar and Lee [67] reported significant relationships between genetic distance and hybrid performance in maize and oats, respectively.
The prediction power of genetic distance has been inconsistent in many studies using different species and different germplasm [68]. This may be because of the peculiarities of many agronomic traits and lack of common phenotypic assaying methods across environments.
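The heterosis estimates and the correlation test discussed in this section follow standard definitions, which the sketch below makes explicit. It is an illustration under stated assumptions rather than the analysis pipeline of [26]: heterosis is expressed as a percentage of the parental reference, the better parent is taken to be the higher-yielding one (appropriate for grain yield), and association is measured with Pearson's r. All function names and the numbers in the example are hypothetical.

```python
from math import sqrt


def mid_parent_heterosis(f1, parent1, parent2):
    """MPH (%) = (F1 - MP) / MP * 100, where MP is the parental mean."""
    mid_parent = (parent1 + parent2) / 2
    return (f1 - mid_parent) / mid_parent * 100


def better_parent_heterosis(f1, parent1, parent2):
    """BPH (%) = (F1 - BP) / BP * 100, with BP the higher-valued parent."""
    better_parent = max(parent1, parent2)
    return (f1 - better_parent) / better_parent * 100


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical grain yields (t/ha) for one hybrid and its parents.
    print(mid_parent_heterosis(6.0, 4.0, 2.0))
    print(better_parent_heterosis(6.0, 4.0, 2.0))

    # Hypothetical genetic distances and MPH values for four hybrids.
    distances = [0.45, 0.55, 0.65, 0.75]
    mph_values = [30.0, 10.0, 40.0, 20.0]
    print(pearson_r(distances, mph_values))
```

A correlation near zero between parental distance and heterosis, as reported for the sorghum hybrids above, would mean that distance alone carries little information for choosing crosses.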

Use of SSR markers in delineation of maize population structures and heterotic groups
Genetic distance estimates are indicators of the presence or absence of relationships among genotypes. The estimates can be made using various types of molecular markers. Heterotic group assignment is often made through combining ability experiments. In addition, several authors have suggested the use of molecular markers in heterotic grouping [17,18]. A comparison of SSR and SNP markers was carried out by Hamblin et al. [69] to characterize maize inbred lines and to elucidate the population structure and the genetic relationships among individuals. The authors reported that SSRs performed better than SNPs at clustering the test germplasm into populations and provided more resolution in measuring genetic distance.
A study by Demissew et al. [24] examined the extent of genetic differentiation, population structure, and patterns of relationship among 36 maize inbred lines developed from CIMMYT source germplasm (Table 1). This study used 25 SSRs and applied model-based population structure, neighbor-joining cluster, and principal coordinate analyses. All these multivariate methods revealed the presence of two to three primary cluster groups, which was in general agreement with prior pedigree information and partly with the putative heterotic groups. The model-based population structure analysis in the same study assigned about half of the inbred lines into the putative heterotic groups previously defined by breeders. There were 17, 14, and 5 inbred lines in cluster groups I, II, and III, respectively (Figure 2). Cluster Group I was dominated by six lines from the Ecuador heterotic group, four from the Kitale group, two from the Pool 9A group, three previously uncategorized lines, and two CMLs (CML144 and CML491). Out of the 17 lines in Group I, 8 were converted to QPM using CML176 as donor, whereas only 3 were converted to QPM using CML144 as donor. Three lines in Group I were non-QPM counterparts. A mid-altitude line (F7215Q), which was converted into QPM using CML159 as donor parent, was also found in this group. Similarly, Group II was dominated by five lines extracted from the Kitale heterotic group, four from Ecuador, four from Pool 9A, and one previously uncategorized line. Six lines in Group II were converted to QPM using CML144. Five lines were converted using CML176, and the remaining three lines were again non-QPM counterparts. The other mid-altitude line (142-1eQ), which was converted into QPM using CML176 as donor parent, was also found in this group.
Cluster Group III included two previously uncategorized lines with CML144 used for their conversion to QPM, one line from Kitale with CML144 again used as the QPM donor, one from Pool 9A where CML176 was the QPM donor, and CML176 itself. However, in the report of Bantte and Prasanna [70], CML176 and CML144 were categorized together into one cluster group. Such incongruities with the results of other investigators in assigning inbred lines into heterotic groups may occur due to errors in seed handling or pollination [71]. They may also be caused by differential selection of the lines in different environments or by genetic drift and mutation [27].
The inconsistent results in identifying heterotic pools following phenotypic evaluations during the initial phase of development of the inbred lines might have contributed to the failure of the SSR markers to categorize the remaining 50% of the inbred lines into the known heterotic groups [24]. Partial or unclear heterotic patterns were previously reported by Semagn et al. [72] in tropical and subtropical CIMMYT maize inbred lines. It was also noted from the present study that prior conversions of conventional maize inbred lines into QPM counterparts were not done systematically leading to disruption of the original heterotic system. The inbred lines from the three known heterotic groups (Kitale, Ecuador, and Pool 9A) were spread throughout the three genetic clusters (Figure 2).
The conversions had been done using phenotypic selection without monitoring the genetic backgrounds using molecular markers. Consequently, recombinants were selected and only a small portion of the genome of the recurrent parent was recovered. This suggested the need to use marker-assisted backcrossing (MAB) or marker-assisted selection (MAS) in the development of QPM lines through backcross procedures. Marker-assisted breeding and/or MAS can be used to facilitate background selection and to avoid disruption of newly established heterotic groups. Furthermore, earlier phenotypic selection methods used by CIMMYT could have contributed to the lack of genetic information and to the partial success of the SSR markers in recognizing all the available heterotic groups. In the early 1990s, broad-based genetic pools and populations were utilized by CIMMYT breeders to develop inbred lines and open-pollinated varieties (OPVs). Consequently, the classification of CIMMYT populations and inbred lines into heterotic groups through various mating designs has been intensified to exploit hybrid technologies using different representative testers. However, it is not easy to cluster inbred lines into their respective heterotic groups if they are extracted from a similar genetic pool or source population without considering the origin or heterotic pattern of the inbred lines [73]. Therefore, many generations of reciprocal recurrent selection may be necessary before the lines from each heterotic group begin to diverge significantly [74].

Genetic purity analysis of maize lines using SSRs and implications for heterotic grouping
In a previous study by Demissew et al. [23], the genetic variability of quality protein maize (QPM) inbred lines was investigated using SSR and RAPD markers. A single SSR amplification product (allele) per locus was expected from all the inbred lines given the high level of expected homozygosity. However, "double bands" were detected using SSR markers, which would have been masked had only RAPD markers been used in that study. The "double bands", or SSR heterozygosity, indicated that some of the QPM inbred lines were not homozygous at the specific locus. This genetic background is not expected for inbred lines, given that these individuals are a product of continuous and controlled selfing yielding high levels of homozygosity. The SSR markers used in the present study facilitated differentiation of homozygous and heterozygous genotypes in the tested inbred lines sourced from the same genetic pool. The SSR profile observed in this study concurs with the reports of Bantte and Prasanna [70]. A study by Shehata et al. [75] used SSRs to analyze molecular diversity and heterozygosity. The authors reported that different seed sources of the same inbreds were an important source of genetic variation. Also, a limited amount of genetic variability can be expected within inbred lines derived from the same genetic source, which suggests the danger of ignoring this variability when sampling inbred lines, even those developed through continued selfing. This is not uncommon in cross-fertilizing crops such as maize, where a wide range of genetic variability is expected due to random crosses or mutational events over time [76].
In a related study conducted by Demissew et al. [24], the genetic purity and classification of maize inbred lines were tested using SSR markers. The authors reported 4.0-16.7% heterozygosity among the tested inbred lines, which was higher than the value expected after four generations of continuous selfing. In another study (B. Tadesse, unpublished), a total of 88 maize inbred lines were genotyped using a subset of 191 SNPs identified for routine quality control analysis [77]. The results showed that nearly 78% of the inbred lines had high levels of heterozygosity. Factors such as seed admixture, pollen contamination, mislabeling of seed sources, and mixing of different seed stocks for planting are reported to be sources of heterozygosity in inbred lines (K. Semagn, unpublished). The study by Warburton et al. [73] reported that bulking during maintenance breeding, seed regeneration, and contamination with seeds or pollen of other samples could possibly cause small changes in allelic frequencies. However, high levels of heterozygosity can significantly change phenotypic uniformity, heterotic patterns, and hence the performance of hybrids. These may result in the distribution of mixed hybrids lacking proper genetic identity. Consequently, additional generations of selfing are essential for all lines with high levels of heterozygosity. The levels of homozygosity should be monitored frequently, especially in QPM materials, because opaque2 is a recessive gene that is liable to contamination. For new pedigree starts, such problems could be minimized by implementing routine quality control genotyping using a subset of informative markers at different stages in a breeding program [77].
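A routine purity check of this kind can be reduced to comparing the observed heterozygosity of each line with the level expected after a given number of selfing generations. The sketch below is a simplified illustration, not the quality control protocol of [77]: it assumes that heterozygosity halves with each generation of selfing and that the starting material was fully heterozygous, so the expected value after four generations is 0.5^4 = 6.25%. The function names, the line labels, and the marker calls in the example are hypothetical.

```python
def observed_heterozygosity(calls):
    """Fraction of scored loci at which a line carries two different alleles.

    calls is a list of (allele, allele) pairs; None marks a missing call.
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no scored loci")
    return sum(1 for a, b in scored if a != b) / len(scored)


def expected_residual_heterozygosity(generations, start=1.0):
    """Expected heterozygosity after selfing, halving each generation.

    start=1.0 assumes a fully heterozygous starting plant (an assumption).
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return start * 0.5 ** generations


def flag_for_selfing(lines, generations):
    """Return names of lines whose observed heterozygosity exceeds expectation.

    lines maps a line name to its list of genotype calls.
    """
    threshold = expected_residual_heterozygosity(generations)
    return sorted(
        name
        for name, calls in lines.items()
        if observed_heterozygosity(calls) > threshold
    )


if __name__ == "__main__":
    homo, het = (100, 100), (100, 104)
    # Hypothetical profiles: one heterozygous locus out of 25, and 4 out of 24.
    panel = {
        "line_A": [het] + [homo] * 24,
        "line_B": [het] * 4 + [homo] * 20,
    }
    print(expected_residual_heterozygosity(4))
    print(flag_for_selfing(panel, generations=4))
```

Lines flagged by such a check are the ones for which additional generations of selfing would be considered before they are used as hybrid parents.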

Conclusions
SSRs have proved to be a valuable tool for diversity analysis and for assigning inbred lines into heterotic groups in both sorghum and maize [22][23][24][26]. SSRs have greater discriminatory power than RAPD markers and can identify genetic relations that are reflective of the pedigree of the inbred lines. SSR markers were also found to be useful in studying the genetic purity and the level of heterozygosity in inbred lines. Genotyping of inbred lines using SSRs is a reliable way of germplasm characterization which, together with morphological descriptions, leads to unambiguous differentiation of genotypes that can be utilized for hybrid breeding programs.
Heterotic groups in sorghum have been defined either as A/B-lines or R-lines. However, recent molecular marker-based diversity studies that utilize more detailed analyses have indicated the existence of a more complex system of genetic relationships among elite parental lines. In this study, although no significant association between genetic distance and hybrid performance was observed, some patterns were detected in the distribution of sorghum genotypes. The challenge of using SSR markers as a tool for heterotic grouping in sorghum is that the genetic distance estimates can be affected by several factors, such as the distribution of markers in the genome, the number of markers used, and the nature of the evolutionary mechanism underlying the variation measured. Additionally, the basic assumption for molecular diversity to predict hybrid performance is the existence of high levels of gametic phase linkage disequilibrium between yield quantitative trait loci and marker alleles. QTLs influencing heterosis in grain yield are located in certain chromosomal regions, which are unevenly distributed over the genome. Therefore, future research should focus on the combined use of field-based progeny tests for yield and yield components and molecular-based distance measurements to improve breeding efficiency. To improve the prediction efficiency of molecular markers, the diversity of individual linkage groups should be dissected.