Genetic Characterization of Global Rice Germplasm for Sustainable Agriculture

This book is devoted to food production and the problems associated with satisfying food needs in different parts of the world. The emerging food crisis calls for the development of sustainable food production, and the quality and safety of the food produced should be guaranteed. The book contains thirteen chapters and is divided into two sections. The first section, titled "Sustainable food production: Case studies", addresses social issues arising from food insufficiency in third world countries. Case studies of semi-arid Africa, the Caribbean and Jamaica, Burkina Faso, Nigeria, the Pacific Islands, Mexico and Brazil are discussed. The second section, titled "Scientific Methods for Improving Food Quality and Safety", covers methods for the control and avoidance of food contaminants. The six chapters in this section cover the substitution of chemical treatments with physical ones, rapid analytical methods for the control of contaminants, problems in animal husbandry related to dairy production and hormones in food-producing animals, and approaches and tasks in maize and rice production.

enhancement (Bretting, 2007). The NPGS is a cooperative effort by public and private organizations to preserve the genetic diversity of plants. Crop Germplasm Committees (CGC), representing the federal, state, and private sectors in various scientific disciplines, determine the set of descriptors to be managed by GRIN for most crops. The Rice CGC has requested 42 descriptors plus panicle and kernel images to characterize the collection (Rice Descriptors, 2011). The USDA-ARS Dale Bumpers National Rice Research Center (DBNRRC) coordinates germplasm activities for rice, including evaluating the collection for the 42 descriptors and continually updating the GRIN database. Furthermore, the DBNRRC manages the Genetic Stocks - Oryza (GSOR) collection, which includes more than 30,000 accessions of genetic materials donated by national and international research programs (GSOR, 2011).
Comprehensive evaluation of the collection for such a large number of descriptors has been hindered by the sheer number of accessions, particularly those involving grain quality and resistances to biotic and abiotic stresses which require sophisticated instruments and significant resources. It also is difficult to characterize such a large collection using molecular means. For practical evaluation and effective management of large collections in crops, the core collection concept was proposed in the 1980s (Brown, 1989).

USDA rice core collection
A core collection is a subset of a large germplasm collection that contains chosen accessions capturing most of the genetic variability within the entire gene bank (Brown, 1989). With the strategy of comprehensive evaluation and accurate analysis of the core collection, the genetic diversity of the collection can be assessed, genetic distances among the accessions can be estimated for identification of special divergent subpopulations, genetic gaps of the existing collection can be identified for planning acquisition strategies and joint analysis of phenotype and genotype can be conducted for molecular understanding of the collection (Steiner et al., 2001). These analyses can help users effectively find the traits in which they are interested along with molecular information. The information is useful for determining strategies for transferring desirable traits found in the collection into new commercial cultivars.

Establishment of the core collection
The USDA rice core subset (RCS) or collection was assembled by sampling from over 18,000 accessions in the working collection of the NSGC in 1998 and 2002 (Yan et al., 2007). A method of stratification by country followed by random sampling was applied by: 1) recording the number of accessions from each country or region of origin; 2) calculating the logarithm (log) of the number of accessions from each country or region of origin; 3) randomly choosing the accessions within each country or region based on the relative log numbers, with a minimum of one accession per country or region; and 4) removing obvious duplications by plant introduction (PI) number and cultivar name. In addition to the stratified sampling, additional emphasis was placed on some newly introduced Chinese germplasm and newly released accessions from quarantine programs. The resultant RCS consists of 1,794 entries from 112 countries and represents approximately 10% of the rice whole collection (RWC).
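The stratified sampling steps 1) to 3) above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used to build the RCS: the function names, the use of base-10 logarithms, the rounding rule, and the toy country counts are all assumptions, and step 4) (duplicate removal by PI number and cultivar name) is not covered.

```python
import math
import random


def allocate_by_log(counts, target_total):
    """Allocate a sample size to each country in proportion to log10(count).

    counts: dict mapping country -> number of accessions (hypothetical data).
    target_total: approximate size of the subset before the minimum-of-one rule.
    """
    logs = {country: math.log10(n) for country, n in counts.items()}
    total_log = sum(logs.values())
    allocation = {}
    for country, n in counts.items():
        share = logs[country] / total_log if total_log > 0 else 0.0
        k = max(1, round(share * target_total))  # at least one per country
        allocation[country] = min(k, n)          # cannot draw more than exist
    return allocation


def stratified_sample(accessions_by_country, target_total, seed=0):
    """Randomly draw the allocated number of accessions within each country."""
    rng = random.Random(seed)
    counts = {c: len(a) for c, a in accessions_by_country.items()}
    allocation = allocate_by_log(counts, target_total)
    return {c: rng.sample(accessions_by_country[c], allocation[c]) for c in counts}


# Toy example with made-up country sizes
toy_counts = {"A": 1000, "B": 100, "C": 1}
toy_pools = {c: [f"{c}{i}" for i in range(n)] for c, n in toy_counts.items()}
toy_allocation = allocate_by_log(toy_counts, 10)
toy_subset = stratified_sample(toy_pools, 10)
```

Because the allocation follows the logarithm of the counts rather than the counts themselves, a country with few accessions is sampled more heavily than under proportional sampling, and the minimum of one entry keeps every country represented.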

Evaluation of the core collection
The RCS was evaluated at Stuttgart, Arkansas in 2002. Seeds of each accession were visually purified by seed shape and hull color as described in the GRIN before planting in a plot consisting of two rows, 0.3 m apart and 1.4 m long, using a Hege 500 planter. Plots were separated by 0.9 m to avoid biological and mechanical contamination. A permanent flood was established after 67 kg ha⁻¹ of nitrogen as urea was applied at about the 5-leaf stage.
Agronomic descriptors were recorded in the field using standard criteria described in the GRIN. Rough or paddy rice is the mature rice grain as harvested, and becomes brown rice when the hulls are removed. Rough and brown rice samples were analyzed on an automated grain image analyzer (GrainCheck 2312; Foss Tecator AB, Hoganas, Sweden) to determine rice kernel dimensions (length, width and length/width ratio), hull and seed pericarp (bran) colorations, and 1000-grain weight. Samples were milled for determination of apparent amylose content (Pérez and Juliano, 1978; Webb, 1972) and alkali spreading value (ASV) (Little et al., 1958). Fourteen important traits were selected for comparison with the whole collection.

Comparative study of the RCS with RWC
Statistical analysis was conducted using the univariate and correlation procedures of SAS statistical software, Version 9.1.3 (SAS Institute, 2004). Frequency distributions for each of the 14 traits were determined using Microsoft Office Excel software. Frequency refers to how often data values occur within each interval of an Excel bins array, that is, the array of data intervals into which the data values are grouped. For example, days to flower had a bins array of 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180 and 190 (Fig. 1), so that all accessions flowering in 36 to 45 days were grouped in bin 40. Frequencies (%) of the respective bins were 0.02, 0.05, 1.15, 2.91, 7.54, 16.01, 20.33, 21.16, 14.91, 6.65, 4.07, 2.29, 1.83, 0.48, 0.52 and 0.10 among 15,097 accessions in the RWC, and 0, 0.24, 1.26, 4.56, 10.43, 23.38, 27.40, 13.73, 9.53, 3.54, 2.82, 1.50, 0.96, 0.48, 0.18 and 0 among the 1,668 RCS entries that headed in the field (the others failed to head). Paired frequencies of the RWC and the RCS in each bin were used for correlation analysis, which measures the correspondence between the two collections. The RCS data of 1,794 accessions came from the field evaluation described above, and the RWC data of ~15,000 accessions were extracted from the GRIN. Traits with no unit are categorical, and their category classifications are explained in the GRIN, e.g. awn type: 0, absent; 1, short and partly awned; 5, short and fully awned; 7, long and partly awned; and 9, long and fully awned (Rice Descriptors, 2011).
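The binning and paired-frequency correlation described above can be illustrated with the days-to-flower frequencies quoted in the text. The sketch below is an assumption-laden stand-in for the SAS and Excel workflow: the function names are hypothetical, the bin rule is inferred from the statement that 36 to 45 days falls in bin 40, and the Pearson coefficient is computed for this single trait only, so it need not equal the value reported across all 14 traits.

```python
import math


def bin_label(days, width=10):
    """Return the bin label for an integer value, so that 36-45 maps to 40."""
    return ((days + width // 2 - 1) // width) * width


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Bin frequencies (%) for days to flower, copied from the text (bins 40 ... 190)
rwc_freq = [0.02, 0.05, 1.15, 2.91, 7.54, 16.01, 20.33, 21.16,
            14.91, 6.65, 4.07, 2.29, 1.83, 0.48, 0.52, 0.10]
rcs_freq = [0, 0.24, 1.26, 4.56, 10.43, 23.38, 27.40, 13.73,
            9.53, 3.54, 2.82, 1.50, 0.96, 0.48, 0.18, 0]

# Correlation of paired bin frequencies for this one trait
r_days_to_flower = pearson_r(rwc_freq, rcs_freq)
print(f"r = {r_days_to_flower:.2f}, r^2 = {r_days_to_flower ** 2:.2f}")
```

Pooling the paired frequencies of all bins from all 14 traits into two long vectors and applying the same function would give the overall correlation between the two collections.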

Frequency analysis of 14 traits shows that the RCS represents the RWC well
The frequency distributions of the 14 traits in the RCS and the RWC are displayed in Fig. 1 (Yan et al., 2007). Taken together, the 14 traits had a high correlation of distribution frequency (r = 0.94, P < 0.0001) between the RCS and RWC, resulting in a coefficient of determination (r²) of 0.88. The high correlation of the RCS with the RWC demonstrates that a stratified set of 10% of the accessions can be effectively used to assess the variability in the whole rice collection, with the RCS frequencies accounting for 88% of the variation in the RWC frequencies. The correlation analysis validates the RCS as representative of the RWC for genetic assessment of global rice germplasm.

The RCS improves genetic characterization of germplasm collection
In an effort to better characterize genetic diversity of the rice collection, the RCS with 10% of over 18,000 accessions in the whole collection is a reasonable size for replicated evaluations.

Genotyping and statistical analysis
Total genomic DNA was extracted using a rapid alkali extraction procedure (Xin et al., 2003) from a bulk of five plants derived from a single plant selected to represent each accession in the core collection. Seventy-two molecular markers (71 SSRs and one indel), covering the entire rice genome with an average of approximately one marker per 30 cM, were used to genotype the 1,794 accessions. PCR amplification of the markers followed the procedure described by Agrama et al. (2009). DNA samples were separated on an ABI Prism 3730 DNA analyzer according to the manufacturer's instructions (Applied Biosystems, Foster City, CA, USA). Fragments were sized and binned into alleles using GeneMapper v. 3.7 software.
The 112 countries or districts from which the 1,794 accessions originated were classified into 14 geographic regions according to the groupings of the United Nations Statistics Division (UNSD, 2009). Each accession was plotted on the global map using its latitude and longitude coordinates according to the GRIN passport database. The map was built using the 'prcomp' procedure in the statistics module (version 2.8.1) of the R statistical package, including the 'spatial', 'maps' and 'fields' packages (Venables and Ripley, 1998; Venables et al., 2008).
PowerMarker software (Liu and Muse, 2005) was used to calculate allele frequencies and polymorphism information content (PIC) values (Botstein et al., 1980) for each marker, region and country. Analysis of molecular variance (AMOVA; Excoffier et al., 1992) was conducted for variance components within and among regions and countries of origin, respectively, using ARLEQUIN 3.0 software (Schneider et al., 2000). Significance of variance components was tested using a non-parametric procedure based on 1,000 random permutations of individuals using the software ARLEQUIN 3.0 (Schneider et al., 2000). Genetic diversity was estimated using Nei diversity index for each accession according to Lynch and Milligan (1994). Geographical distribution of diversity index represented by Kriging methods was globally mapped using the R-script (François et al., 2008).
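For readers unfamiliar with the two marker-level statistics named above, the following sketch computes them from a list of observed alleles at one locus. It assumes the standard textbook forms, gene diversity as 1 − Σp² and PIC as 1 − Σp² − ΣΣ2p²ᵢp²ⱼ (Botstein et al., 1980); it is not PowerMarker's implementation, and the function names and the toy allele sizes are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele in a list of allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def gene_diversity(freqs):
    """Gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross


# Toy locus with two equally frequent alleles (made-up fragment sizes)
toy_alleles = [152, 152, 160, 160]
toy_freqs = allele_frequencies(toy_alleles)
toy_diversity = gene_diversity(toy_freqs)  # 0.5
toy_pic = pic(toy_freqs)                   # 0.375
```

PIC is never larger than gene diversity, and the two converge as the number of alleles at a locus grows, which is why multi-allelic SSR markers tend to show high values of both.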
Genetic relationships among accessions represented by regions and countries were determined by the unweighted pair-group method with an arithmetic mean (UPGMA) analysis based on Nei (Nei, 1973) genetic similarity estimated using the 72 markers. The UPGMA trees were constructed from 1,000 bootstrap replicates using the software PowerMarker (Liu and Muse, 2005) and drawn with MEGA v. 3.1 (Kumar et al., 2004). The number of alleles, which are private to a population and do not exist in other populations, is especially informative when populations are studied with highly variable multi-allelic markers, such as SSRs (Szpiech et al., 2008). The average number of private alleles per locus for core accessions originating in each of 14 geographic regions was estimated using ADZE (Allelic Diversity AnalyZEr) software (Szpiech et al., 2008) with the 72 molecular markers.
The 1,794 core accessions were introduced from 112 countries and distributed among 14 worldwide geographic regions, with the most countries in Africa and the fewest in North America. AMOVA showed that the majority (89%) of the total genetic variance was attributed to differences within regions and the rest (11%) was due to variance among regions (Table 2). Likewise, when countries were taken into account, 82% of the total variation was due to differences within countries, and the remaining portion of the variance was shared equally between the among-region and among-country components. Genetic variation was significantly differentiated among regions (Φst = 0.10, P < 0.001) and among countries (Φst = 0.12, P < 0.001), and very highly and significantly differentiated within countries (Φst = 0.85, P < 0.001).

Genetic diversity and genetic relationships among geographic regions
Rice accessions collected from Southern Asia had the greatest number of alleles per locus, followed by Africa, Southeast Asia, China, South America, South Pacific and Central America, while those from Western and Eastern Europe, North America and Central Asia had the fewest (Table 1). As demonstrated by the PIC value, the accessions derived from Southeast Asia had the greatest diversity, followed by Southern Asia, South Pacific, Africa, Middle East, South America and Oceania, while those from Western and Eastern Europe and North America had the lowest diversity. Visualized by the Nei genetic diversity index on the world map using the Kriging method, germplasm accessions collected from Southern Asia, Southeast Asia, Central America and Africa were the most diverse, while those from North Pacific, Oceania, Western and Eastern Europe and North America had the lowest diversity (Fig. 2).
Germplasm accessions introduced from Southern Asia had the most private alleles per locus, followed by Africa, Central America, Southeast Asia, South Pacific, China, Oceania and Middle East, while those from Eastern Europe, Central Asia, North and South America and Western Europe had the fewest private alleles per locus (Fig. 3).
Three main clusters resulted from the UPGMA analysis based on Nei (Nei, 1973) genetic similarity (Fig. 4). In cluster 1, germplasm accessions from South America were most closely related to those from Central America, and then to Africa, Oceania and North America. Two sub-groups of originating regions were evident in cluster 2, with Eastern Europe and Western Europe in sub-group 1 and Central Asia, Middle East and North Pacific in sub-group 2. In cluster 3, germplasm accessions originating in Southeast Asia were closest to those from the South Pacific, and then to China and Southern Asia. Cluster 1 was closer to cluster 2 than to cluster 3.

Genetic diversity and genetic relationships among countries
Among the 78 countries from which five or more accessions were introduced, Myanmar was the most diverse, as indicated by the highest PIC (0.65). Cluster analysis of the 78 countries with five or more accessions in the core collection formed five distinctive groups (Fig. 5). Two countries, each with five accessions, were independent of these clusters: Haiti in Central America fell between Clusters 4 and 5, while Guinea-Bissau in Africa fell between Clusters 1 and 5. The vast diversity found in the USDA global rice collection is an important genetic resource that can effectively support breeding programs in the U.S. and worldwide.

Statistical analysis
Genotypic data for the 71 SSR markers plus one indel marker in the core collection, together with 23 reference cultivars, were first used to determine the putative number of populations. Genetic structure was inferred using the admixture model-based clustering algorithms implemented in TESS v. 2.1. TESS implements a Bayesian clustering algorithm for spatial population genetics. Multi-locus genotypes were analyzed with TESS using the Markov chain Monte Carlo (MCMC) method, with the F-model and a parameter value of 0.6, where a value of 0.0 corresponds to a non-informative spatial prior. To estimate the number (K) of ancestral genetic populations and the ancestry membership proportions of each individual in the cluster analysis, the algorithm was run 100 times for each K value from 2 to 15, each run with a total of 70,000 sweeps and 50,000 burn-in sweeps. For each run we computed the Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002), a model-complexity penalized measure of how well the model fits the data. The putative number of clusters was taken as the K at which the DIC values were the smallest and the estimates of data likelihood were the highest in 10% of the runs. Similarity coefficients between runs and the average matrix of ancestry membership were calculated using CLUMPP v. 1.1 (Jakobsson and Rosenberg, 2007).
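One way to read the DIC-based choice of K described above is sketched below. The rule used here, averaging the lowest-DIC 10% of runs for each K and picking the K with the smallest average, is an assumption about how the criterion might be applied; it is not taken from TESS itself. The function names and the DIC values are invented for illustration.

```python
def mean_of_best_runs(dic_values, fraction=0.10):
    """Average DIC of the best (lowest-DIC) fraction of runs for one K."""
    ranked = sorted(dic_values)
    keep = max(1, int(len(ranked) * fraction))
    best = ranked[:keep]
    return sum(best) / len(best)


def choose_k(dic_by_k, fraction=0.10):
    """Return the K with the smallest mean DIC over its best runs, plus all scores."""
    scores = {k: mean_of_best_runs(v, fraction) for k, v in dic_by_k.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Made-up DIC values: 100 runs for each of three candidate K values
toy_dic = {
    2: [1000 + i for i in range(100)],
    3: [900 + i for i in range(100)],
    4: [950 + i for i in range(100)],
}
toy_best_k, toy_scores = choose_k(toy_dic)
```

In practice the averaged DIC is usually plotted against K and the point where the curve flattens is inspected as well, since the smallest value alone can favor an over-fitted K.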
Each accession in the core collection was assigned to a specific cluster or population (K) resulting from the cluster analysis using TESS. The sub-species ancestry of each K was inferred from the reference cultivars for indica, AUS, aromatic, temperate japonica, and tropical japonica rices. Analysis of molecular variance (AMOVA; Excoffier et al., 1992) was used to calculate variance components within and among the populations obtained from TESS in the collection. Estimation of variance components was performed using the software ARLEQUIN 3.0 (Schneider et al., 2000). The AMOVA-derived ΦST (Weir and Cockerham, 1984) is analogous to Wright's F statistics, differing only in its assumption of heterozygosity (Paun et al., 2006). ΦST provides an effective estimate of the amount of genetic divergence or structuring among populations (Excoffier et al., 1992). Significance of variance components was tested using a non-parametric procedure based on 1,000 random permutations of individuals. The computer package ARLEQUIN was used to estimate pairwise FST (Goudet, 1995) for the populations obtained from TESS.
Multivariate analysis such as principal component analysis (PCA) provides techniques for classifying the inter-relationships of measured variables. Multivariate geo-statistical methods combine the advantages of geo-statistical techniques and multivariate analysis while incorporating spatial or temporal correlations and multivariate relationships to detect and map different sources of spatial variation on different scales (Goovaerts, 1992; Wackernagel, 1994). Geographical spatial interpolation of principal coordinates of latitude and longitude and admixture ancestry matrix coefficients (Ks) calculated in TESS for each accession were represented by the kriging method (François et al., 2008) as implemented in the R statistical packages 'spatial', 'maps' and 'fields' (Venables and Ripley, 1998; Venables et al., 2008) for visualizing their distribution on the world map.
Principal components analysis (PCA) was conducted using GenAlex 6.1 (Peakall and Smouse, 2006) software to structure the core collection genotyped by 72 molecular markers, and to generate a PC-matrix. Geo-statistical and geographic analysis was based on CNT coordinates of latitude and longitude where a core accession originated, using the R statistical packages. Polymorphism information content (PIC) and number of alleles per locus in each sub-species population were estimated using PowerMarker software (Liu and Muse, 2005). The number of distinct alleles in each population and the number of alleles private to each population, that is, not found in other populations, were calculated using the ADZE program (Allelic Diversity AnalyZEr; Szpiech et al., 2008). ADZE uses the rarefaction method to standardize unequal numbers of accessions to the same sample size, equal to the smallest number of accessions across the populations.
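The rarefaction idea can be made concrete for the count of distinct alleles at a single locus. The sketch below uses the usual hypergeometric expectation, in which an allele observed c times among n gene copies is expected to be present in a random subsample of size g with probability 1 − C(n − c, g) / C(n, g). It is a simplified illustration, not ADZE's code; the function name and the toy counts are hypothetical, and private alleles are not handled.

```python
from math import comb


def rarefied_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of size g.

    allele_counts: observed count of each allele at one locus in one population.
    g: standardized sample size (must not exceed the total count).
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("standardized sample size exceeds the sample")
    total = comb(n, g)
    # comb(n - c, g) is 0 when g > n - c, i.e. the allele cannot be missed
    return sum(1.0 - comb(n - c, g) / total for c in allele_counts)


# Toy locus: two alleles, each observed five times (made-up counts)
toy_counts_locus = [5, 5]
richness_full = rarefied_richness(toy_counts_locus, 10)  # 2.0
richness_one = rarefied_richness(toy_counts_locus, 1)    # 1.0
```

Setting g to the size of the smallest population puts all populations on the same footing, so that a large population such as IND is not credited with extra alleles merely because more accessions were sampled.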

Number of populations and ancestry determination
Structural analysis resulted in the lowest Deviance Information Criterion (DIC) or highest log likelihood scores when the putative number (K) of populations was set at five, and the ancestry coefficient of each accession in each K was estimated accordingly (Fig. 6). Similarly, principal coordinate (PC) analysis of Nei's genetic distance (Nei, 1973; 1978) classified the core accessions into five clusters by PC1 and PC2, which together accounted for 71% of the total variance (Fig. 7). Both structure and PC analyses indicated that five populations sufficiently explained the genetic diversity in the core collection. Analysis of molecular variance (AMOVA) showed that 38% of the variance was due to genetic differentiation among the populations (Table 3). The remaining 62% of the variance was due to the differences within the populations. The variances among and within the populations were highly significant (P<0.001).
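As a hedged illustration of how such variance percentages relate to a fixation index, assume a simple two-level partition with an among-population variance component and a within-population variance component (the symbols below are introduced here for illustration only). Under that assumption, the 38% and 62% shares quoted above would correspond to:

```latex
\Phi_{ST} \;=\; \frac{\sigma^2_{a}}{\sigma^2_{a} + \sigma^2_{w}}
\;=\; \frac{38}{38 + 62} \;=\; 0.38
```

Here σ²ₐ is the variance among populations and σ²_w the variance within populations; a hierarchical design with additional levels would partition the total differently.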

Among 40 reference cultivars, 20 known tropical japonica (TRJ) cultivars were classified in K1, four known temperate japonica (TEJ) in K2, eight known indica (IND) in K3, three known AUS in K4 and five known aromatic (ARO) in K5, indicating the corresponding ancestry of each population. Based on the references, each accession was assigned to a single population when its inferred ancestry estimate was 0.6 or larger, and was regarded as an admixture between populations when its estimate was less than 0.6. Admixture was labeled according to the proportions of the estimates, e.g. GSOR 310002 was assigned TEJ-TRJ because of its estimate 0.5227 in K2 and 0.4770
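The 0.6 threshold rule can be written as a small function. This is a sketch under stated assumptions: the admixture label is formed here by joining the two largest ancestry components in descending order, which reproduces the TEJ-TRJ example but is only a guess at the general labeling rule. The assignment of the 0.4770 estimate to TRJ is inferred from the TEJ-TRJ label, and the three small filler values are hypothetical.

```python
def assign_population(ancestry, threshold=0.6):
    """Assign an accession to one population, or label it as an admixture.

    ancestry: dict mapping population label -> inferred ancestry estimate.
    """
    ranked = sorted(ancestry.items(), key=lambda item: item[1], reverse=True)
    top_label, top_value = ranked[0]
    if top_value >= threshold:
        return top_label
    # Assumption: an admixture is named after its two largest components
    second_label = ranked[1][0]
    return f"{top_label}-{second_label}"


# GSOR 310002: 0.5227 (TEJ) and 0.4770 (TRJ, inferred) come from the text;
# the remaining values are made-up fillers so that the estimates sum to 1
gsor_310002 = {"TRJ": 0.4770, "TEJ": 0.5227, "IND": 0.0001, "AUS": 0.0001, "ARO": 0.0001}
gsor_label = assign_population(gsor_310002)
```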

Genetic relationship and global distribution of ancestry populations
All pair-wise estimates of FST using AMOVA for the populations were highly significant, ranging from 0.240 to 0.517 (Table 4). IND was equally distant from ARO and AUS, but more distant from TEJ and TRJ. AUS and IND were the most differentiated from TEJ. However, TEJ, TRJ and ARO were close to each other in comparison with the others. These relationships were consistent with the structure revealed by the PCA (Fig. 7). Among the 421 accessions of TRJ rice in the core collection, the majority were collected from Africa (23%) and South America (21%), followed by Central America (15%), North America (13%), South Pacific (6%), and Southeast Asia and Oceania (5% each) (Fig. 8A). North America had 75 accessions in total, of which 55 were grouped in TRJ; this was the highest percentage (73%) among the 14 regions, followed by Central America (56%), Africa (49%) and South America (41%). Among the 112 countries, the U.S. in North America had the highest percentage (92%) of accessions grouped in TRJ, followed by Côte d'Ivoire and Zaire (91%) in Africa and Puerto Rico (72%) in Central America.

Genetic diversity of the populations
The average number of alleles per locus was highest in IND, followed by AUS, ARO, TRJ and TEJ (Fig. 9). IND had 45% more alleles per locus than TEJ. ARO had the highest polymorphism information content (PIC), followed by AUS, IND, TRJ and TEJ. The PIC value of TEJ was 72% less than that of ARO. After correction for differences in sample size, AUS had the most distinct (Fig. 10A) and private (Fig. 10B) alleles per locus. Although IND and ARO had the same number of distinct alleles per locus, ranking next to AUS, IND had many more private alleles per locus than ARO. TEJ had the fewest distinct and private alleles per locus among the populations.
Genetic characterization of the USDA rice world collection for genetic structure, diversity, and differentiation will help in designing crossing strategies that avoid sterility during gene transfer and exchange in breeding programs and genetic studies. It will thus better serve the global rice community in the improvement of cultivars and hybrids, because this collection is internationally available, free of charge and without restrictions for research purposes. Seed may be requested from GRIN (GRIN, 2011) for the whole collection, and from GSOR (GSOR, 2011) for the core collection.

USDA rice mini-core collection
Development of core collections is an effective tool to extensively characterize large germplasm collections, and the utilization of a mini-core sub-sampling strategy further increases the effectiveness of genetic diversity analysis at detailed phenotype and molecular levels (Agrama et al., 2009). Using the advanced M strategy, Kim et al. (2007) presented PowerCore software that possesses the power to represent all the alleles identified by molecular markers and classes of the phenotypic observations in the development of core collections.

Phenotypic and genotypic data used to develop the USDA rice mini-core collection
Data for 26 phenotypic traits, 69 SSRs and one indel marker generated from the 1,794 accessions in the USDA rice core collection at Stuttgart, Arkansas, USA were used to develop the mini-core. The phenotypic traits included 13 for morphology, two for cooking quality, 10 for rice blast disease resistance ratings from individual races of Magnaporthe oryzae Cav., and one for the physiological disease straighthead. Field evaluations of blast were conducted at the University of Arkansas Experiment Station, Pine Tree, AR following inoculation with a mixture of the most prevalent races (IB-1, IB-49, IC-17, IE-1, IE-1K, IG-1 and IH-1) found in the southern US rice production region, using the method described by Lee et al. (2003). In the greenhouse, seven blast races (IB-1, IB-33, IB-49, IC-17, IE-1K, IG-1, and IH-1) were individually inoculated and rated on a scale from 0 (no lesions) to 9 (dead).

Sampling strategy and representation analysis
Sampling of the core collection was performed with the PowerCore software in an effort to maximize both the number of observed alleles at SSR loci and the number of phenotypic trait classes, using the advanced M (maximization) strategy implemented through a modified heuristic algorithm (Agrama et al., 2009). The phenotypic traits were automatically classified into different categories or classes by the PowerCore program based on Sturges' rule, k = 1 + log2(n), where k is the number of classes and n is the number of observed accessions (Kim et al., 2007).
The resulting mini-core was compared with the original core collection to assess its homogeneity. The Nei genetic diversity index (Nei, 1973) was estimated for each molecular marker in both the core and mini-core collections. Chi-squared (χ²) tests were used to test the similarity in number of marker alleles and frequency distribution of accessions. Homogeneity was further evaluated for the 26 phenotypic traits using the Newman-Keuls test for means, the Levene test (Levene, 1960) for variances, and the mean difference (MD%), variance difference (VD%), coincidence rate of range (CR%) and variable rate of the coefficient of variation (VR%) according to Hu et al. (2000). Coverage of all the phenotypic traits in the original core collection was estimated in the mini-core as proposed by Kim et al. (2007):
$$\mathrm{Coverage}\,(\%) = \frac{1}{m}\sum_{j=1}^{m}\frac{D_c}{D_e}\times 100$$
where Dc is the number of classes occupied in the mini-core and De is the number of classes occupied in the original core collection for each trait, and m is the number of traits, which is 26 in this case.
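As a quick worked example of Sturges' rule, the sketch below evaluates 1 + log2(n) for the 1,794 accessions of the core collection. Whether PowerCore rounds the result up, down, or to the nearest integer is not stated in the text, so the unrounded value is returned; the function name is hypothetical.

```python
import math


def sturges_classes(n):
    """Number of classes suggested by Sturges' rule, 1 + log2(n), unrounded."""
    return 1.0 + math.log2(n)


# For the 1,794 core accessions the rule gives roughly 11.8 classes
classes_for_core = sturges_classes(1794)
print(f"{classes_for_core:.1f}")
```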
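The Coverage statistic can be sketched from the definitions of Dc, De and m given above, taking it as the mean over traits of Dc/De expressed as a percentage. That form is an assumption based on those definitions rather than a transcription of the PowerCore source, and the function name and class counts below are made up.

```python
def coverage_percent(classes_in_minicore, classes_in_core):
    """Mean over traits of (classes occupied in mini-core / classes in core) x 100.

    Both arguments are sequences with one entry per trait (Dc and De).
    """
    if len(classes_in_minicore) != len(classes_in_core):
        raise ValueError("one Dc and one De value are needed per trait")
    m = len(classes_in_core)
    ratio_sum = sum(dc / de for dc, de in zip(classes_in_minicore, classes_in_core))
    return 100.0 * ratio_sum / m


# Toy example with two traits (hypothetical class counts)
full_coverage = coverage_percent([12, 5], [12, 5])     # every class retained
partial_coverage = coverage_percent([6, 5], [12, 5])   # half the classes of trait 1 lost
```

A value of 100% therefore means that every class occupied in the core for every trait is also occupied by at least one mini-core entry.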

Distribution frequency of accessions in the core and mini-core collections
The heuristic search based on the 26 phenotypic traits and the 70 markers sampled 217 accessions (12.1%) out of 1,794 accessions in the core collection. The 217 mini-core entries originated from 76 countries covering all the 15 geographic regions (Table 5). Five regions, Subcontinent, South Pacific, Southeast Asia, Africa and China accounted for the majority, 63.6% of the mini-core entries, while the fewest entries came from three regions, Australia, Mideast and North America, accounting for 5.5%. Two accessions in the mini-core are of unknown origin.
The similarity of distribution frequencies between the core and mini-core collections for each of the 15 regions was tested using χ² with one degree of freedom (Table 5). All 15 regions had non-significant χ² values ranging from 0.095 to 0.996 with probability (P) from 0.303 to 0.758, which indicated a homogeneous distribution between the two collections.
Table 5. Distribution frequency comparison of origin of accessions between the USDA rice core and mini-core collections among 15 geographical regions. * χ² values with one degree of freedom and the corresponding probability (P).
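The link between a χ² value with one degree of freedom and its probability can be checked with the standard identity P = erfc(√(χ²/2)). The sketch below is offered only as an illustration of that relation, not as the authors' procedure; the function name is hypothetical. Applied to the smallest χ² quoted above (0.095), it returns a probability of about 0.758, the largest P reported.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail probability of a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Smallest chi-squared value quoted in the text
p_smallest_chi2 = chi2_p_value_df1(0.095)
print(f"P = {p_smallest_chi2:.3f}")
```

A region is judged homogeneous between the two collections when this probability exceeds the chosen significance level, conventionally 0.05, which corresponds to a χ² of about 3.84.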

Phenotypic diversity in the core and mini-core collections
Comparative analysis of the ranges, means and variances for the 26 phenotypic traits demonstrated that the mini-core covered the full range of variation for each trait. The Newman-Keuls test results indicated homogeneity of means between the core collection and the mini-core for 22 traits (85%). Sixteen (62%) of the traits had homogeneous variances as revealed by Levene's test. Among the 10 traits having heterogeneous variances, five morphological traits and amylose content had greater variances in the mini-core than in the core collection. However, hull cover and color, and two disease traits had smaller variances.
The mean difference percentage (MD%), the variance difference percentage (VD%), the coincidence rate (CR%) and the variable rate (VR%) are designed to compare the properties of a core collection with those of its initial collection. Over all 26 phenotypic traits, the MD% was 6.3%, far less than the significance level of 20%. The VD% was 16.5%, also less than the significance level of 20%, and six traits had much greater variances in the mini-core than in the core collection (Table 6). The VR% compares the coefficient of variation values and determines how well the variance is represented in the mini-core. A VR of more than 100% is required for a core collection to be representative of its original collection (Hu et al., 2000). The mini-core had a VR of 102.7% relative to its originating core, indicating good representation.
Table 6. Comparison of range, mean and variance between the USDA rice core collection and the mini-core for 26 phenotypic traits. Levene's test (Lev) for homogeneity between the USDA rice core collection and mini-core; * and ** significant at 0.05 and 0.01 probability, respectively. ² Categorical data as described in the GRIN (GRIN, 2011).
The coincidence rate (CR%) indicates whether the distribution ranges of each trait in the mini-core are well represented when compared to the core collection. The resulting CR over the 26 traits was 97.5%, indicating homogeneous distribution ranges of the phenotypic traits because it was larger than the recommended 80% (Kim et al., 2007). The calculated Coverage value for the resulting mini-core was 100%, suggesting there is full coverage of all the diversity present in each class of phenotypic traits in the USDA rice core collection.
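The coincidence rate and variable rate can be sketched as follows, assuming the commonly cited forms in which CR% is the mean over traits of the ratio of ranges (mini-core over core) and VR% is the mean over traits of the ratio of coefficients of variation, each multiplied by 100. These forms are assumptions and should be checked against Hu et al. (2000) before use; the function names and the toy trait values are hypothetical.

```python
import statistics


def coincidence_rate(core_traits, mini_traits):
    """CR%: mean ratio of trait ranges (mini-core / core) x 100."""
    ratios = [
        (max(mini) - min(mini)) / (max(core) - min(core))
        for core, mini in zip(core_traits, mini_traits)
    ]
    return 100.0 * sum(ratios) / len(ratios)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def variable_rate(core_traits, mini_traits):
    """VR%: mean ratio of coefficients of variation (mini-core / core) x 100."""
    ratios = [
        coefficient_of_variation(mini) / coefficient_of_variation(core)
        for core, mini in zip(core_traits, mini_traits)
    ]
    return 100.0 * sum(ratios) / len(ratios)


# One toy trait: the mini-core keeps both extremes of the core (made-up values)
toy_core = [[1.0, 2.0, 3.0, 4.0, 5.0]]
toy_mini = [[1.0, 3.0, 5.0]]
toy_cr = coincidence_rate(toy_core, toy_mini)
toy_vr = variable_rate(toy_core, toy_mini)
```

In the toy data the subset retains the smallest and largest values, so the range is fully preserved, and dropping intermediate values raises the coefficient of variation above that of the full set. This is the same pattern as a VR above 100% for a mini-core selected to maximize diversity.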

Molecular diversity in the core and mini-core collections
Both the USDA rice core collection and the mini-core contained the same total number of polymorphic alleles (962) produced by the 70 markers, with an average of 14 alleles per locus, ranging from two for RM338 to 37 for RM11229 (Fig. 7A). Total alleles per locus ranged from 2 to 9 for 24 markers, from 10 to 19 for 32 markers and from 20 to 37 for 14 markers. The Nei genetic diversity index values reveal the allelic richness and evenness in the population. Distributions of the Nei indices among the 70 markers were very similar between the core and mini-core collections (Fig. 7B). The core collection had an average Nei diversity index of 0.72, with a minimum of 0.24 for AP5625-1 and a maximum of 0.94 for RM11229 and RM302, while in the mini-core the average was 0.76, with a minimum of 0.37 for RM338 and AP5625-1 and a maximum of 0.95 for RM11229 and RM302. This minor difference in molecular diversity was not statistically significant. Similarly, none of the 70 markers had a significantly different Nei diversity index between the core and mini-core collections, as indicated by the χ² test with values ranging from 0.000 to 0.022 and probabilities ranging from 0.882 to 0.999. More than 60% of the markers had a diversity index higher than 0.60, indicating high diversity across the markers (Fig. 7).
Fig. 7. Distribution of number of alleles per locus (A) and Nei diversity index (B) among the 70 DNA markers in the USDA rice core collection (Core) and mini-core (Mini-core). The markers were placed according to their position within the rice genome.

Use of the USDA rice mini-core collection for mining valuable genes
As demonstrated both phenotypically and genotypically, the USDA rice mini-core collection of 217 entries is a good representative of the core of 1,794 entries as well as of the entire global rice genebank of more than 18,000 accessions in the US (Yan et al., 2007; Agrama et al., 2009). The vast genetic diversity implies a richness of valuable genes that could be extracted for cultivar improvement (Li et al., 2010). The manageable number of entries in the mini-core allows extensive phenotyping and genotyping for mining valuable genes. Phenotyping could be performed in replicated tests and at multiple locations for traits that are largely affected by the environment, such as yield, and for traits that require large amounts of resources, such as biotic and abiotic stresses. Genotyping could be done genome-wide with a high density of molecular markers such as simple sequence repeats (SSR) or single nucleotide polymorphisms (SNP), or by sequencing the entire genome. Reliable phenotyping and dense genome-wide genotyping will improve the efficiency and accuracy of mining valuable genes for a globally sustainable agriculture. The core and mini-core collections are managed by the Genetic Stock Oryza Collection (GSOR, 2011) at the USDA-ARS Dale Bumpers National Rice Research Center and are available to the global research community.