SSR marker loci significantly associated with lint yield traits and their explained proportion of phenotypic variation in 3 different environments. LY: lint yield (g/plant); SY: seed cotton yield (g/plant); BN: bolls per plant; BW: boll weight (g); LP: lint percentage (%); LI: lint index (g/100 seeds); SI: seed index (g/100 seeds); E1: Jiangpu in 2009; E2: Dafeng in 2010; E3: Zhengzhou in 2010
Cotton (Gossypium spp.) is the most important natural textile fiber source globally. The worldwide economic impact of the cotton industry is estimated at approximately $500 billion per year, with an annual utilization of about 27 million metric tons of cotton fiber. The tetraploid species Gossypium hirsutum L. (n=26, AD genome), commonly referred to as Upland cotton, accounts for 95% of the world’s cotton production. Most objectives in cotton breeding, such as yield, fiber quality, and biotic- and abiotic-stress tolerance, are complex traits controlled by a large number of quantitative trait loci (QTLs). It is becoming progressively more difficult to improve these traits using conventional breeding methods because of their complex architecture and inheritance. Fortunately, developments in applied genomics research have provided alternative tools to improve efficiency in plant breeding programs. Molecular markers tightly linked to the target genes or QTLs can be used for marker-assisted selection (MAS) and/or genomic selection (GS) [4-5]. In the past two decades, the availability of abundant molecular markers has made tagging QTLs harboring functional genes through family-based linkage mapping a routine process, and a large number of QTLs for fiber quality properties [6-40], yield and its components [6, 8, 10-11, 14, 18-19, 27, 30-32, 36, 40-45], nematode resistance [46-49], Verticillium wilt resistance [50-53] and Fusarium wilt resistance [54-60] have been identified in cotton.
However, approximately 80% of the previously reported QTLs could not be confirmed in subsequent studies, and few have actually been applied in breeding programs [61-62]. This may be because most QTLs were population-specific: the genetic variation detected in a single bi-parental population was either not shared with other genetic populations, or shared but fixed in the parental lines. In addition, the limited number of recombination events in most populations used for linkage mapping makes it difficult to map QTLs at high resolution, which severely limits their application in breeding programs. With the potential to exploit all recombination events that occurred in the evolutionary history of natural populations, linkage disequilibrium (LD) based association mapping (AM) has become a powerful approach for dissecting complex traits and identifying causal variation with modest effects on target traits in many plant species [3, 63], including cotton [64-73]. Although AM has been used successfully to detect QTLs underlying quantitative traits in some crops, from a breeding standpoint detecting associated loci is just the first step; analyzing the genetic effects of alleles and identifying favorable alleles is more beneficial for target trait improvement. For example, Breseghello and Sorrells identified several potentially beneficial alleles for kernel size and milling quality by comparing the average phenotypic value of accessions carrying specific alleles with that of accessions carrying null alleles in a soft winter wheat population; Jia et al. identified some putative resistant alleles for sheath blight (ShB) resistance in a rice panel composed of 217 accessions from the USDA core collection, and found that the number of putative resistant alleles present in an entry was highly and significantly correlated with a decrease in ShB rating.
China is the world’s largest cotton-growing nation, but it is not a region of cotton domestication. Most Upland cotton cultivars developed in China were derived from a few germplasm resources such as Deltapine (DPL), Stoneville (STV), Foster, King and Uganda, all of which were introduced from abroad. More than 8800 accessions have been collected, developed, and maintained in the Chinese cotton germplasm collection. Among these, G. hirsutum varieties and breeding lines comprise about 83% of the total. Current and obsolete cultivars have been and continue to be the main resources for cotton breeding programs in China. Dissecting the genetic basis of the main breeding-objective traits will greatly benefit germplasm evaluation and future molecular breeding. In this chapter, we review QTLs underlying yield and its components, fiber quality properties and Fusarium wilt resistance detected by association mapping, together with the favorable alleles identified, in two Chinese Upland cotton AM panels [71-73]. We hope that these results will provide useful information for further understanding the genetic basis of these traits and will facilitate future breeding by MAS in Upland cotton.
2. Favorable QTL alleles for yield and its components
In recent years, demand for cotton fiber in the world market has increased dramatically, while cotton acreage has declined worldwide over the past few years, mainly because of strong competition from other crops as well as production costs. Improving the lint yield of Upland cotton cultivars will therefore remain critical for meeting worldwide demand and maintaining profitability for cotton growers. A total of 356 representative Upland cotton cultivars and breeding lines were selected from the cotton germplasm collection of the Cotton Research Institute, Chinese Academy of Agricultural Sciences (CRI-CAAS), and from the collection in our laboratory, and were assembled into an AM panel. The population consisted of 348 cultivars developed in China, seven introduced from the U.S., including the genetic standard line TM-1, and one introduced from Uganda. According to their release year, the 348 Chinese cultivars could be divided into the following six groups: I (1930–1960, 26 lines); II (1961–1970, 26 lines); III (1971–1980, 39 lines); IV (1981–1990, 83 lines); V (1991–2000, 125 lines); and VI (2000–2005, 49 lines). The cultivars introduced from abroad, DPL 15, DPL 16, STV 2B, King, Foster 6 and Uganda 3, were used as a check (CK) group for genetic diversity and allele transmission evaluation, because they had been used as the main founder parents in China’s Upland cotton breeding programs. Yield and its components were evaluated in three diverse environments, including lint yield per plant (LY), seed cotton yield per plant (SY), bolls per plant (BN), boll weight (BW), lint percentage (LP), lint index (LI) and seed index (SI). Three hundred and eighty-one pairs of SSR primers that amplify loci evenly covering the tetraploid cotton genome were selected to genotype the 356 accessions.
The Bayesian model-based program STRUCTURE 2.3 was used to infer the population structure using 66 unlinked or weakly linked SSR markers as described in reference . The length of the burn-in period and the number of Markov Chain Monte Carlo replications after burn-in were both set to 100,000, with an admixture model and correlated allele frequencies. Five independent runs were performed for each hypothetical number of subpopulations (k), ranging from 1 to 10. However, the LnP(D) value kept increasing with k and did not show any peak (Figure 1). Thus the ad hoc statistic Δk was used to estimate the number of subpopulations. The Δk value was much higher at k=2 than at k=3-10 (Figure 1), suggesting that the total panel could be divided into 2 major subpopulations. Based on k=2, each accession was assigned to the subpopulation (P1 or P2) for which its membership value (Q value) was >0.5, and the population structure matrix (Q) was generated for further association mapping. The P1 group contained 115 accessions, including 63 cultivars from the Yellow River cotton-growing region, 46 lines from the North and Northwest China regions, and 6 cultivars from the Yangtze River region. The P2 group consisted of 241 accessions, including 116 lines from the Yellow River region, 107 lines from the Yangtze River region, 10 lines from the North and Northwest China regions, and 8 lines introduced from abroad (see reference  for details). The software SPAGeDi was used to calculate pair-wise relatedness. Of the kinship coefficient values, 86.85% were less than 0.05, 8.56% ranged from 0.05 to 0.10, and the remaining 4.59% showed various degrees of genetic relatedness. Based on the results of the relatedness analysis, a K matrix was constructed for further association mapping.
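The Δk step described above can be illustrated with a minimal Python sketch. It assumes the commonly used form of the statistic (the absolute second-order difference of the mean LnP(D) across neighbouring k values, divided by the standard deviation of LnP(D) over the runs at k); the function name `delta_k` and the toy LnP(D) values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def delta_k(lnp_runs):
    """Evanno-style delta K from STRUCTURE LnP(D) values.

    lnp_runs maps each K to the list of LnP(D) values from its independent runs.
    Returns {K: delta K} for every K whose neighbours K-1 and K+1 were also run.
    """
    avg = {k: mean(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # the second difference needs both neighbours
        spread = stdev(lnp_runs[k])
        if spread == 0:
            continue  # delta K is undefined when all runs agree exactly
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / spread
    return result

# Hypothetical LnP(D) values (three runs per K); LnP(D) keeps rising with K,
# as in the text, yet delta K still singles out one value of K.
toy_runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-880, -890, -870],
    4: [-870, -860, -880],
}
toy_delta = delta_k(toy_runs)
best_k = max(toy_delta, key=toy_delta.get)
print(best_k)  # 2
```

Because the statistic needs both neighbours, it cannot be evaluated at the smallest or largest k tested, which is why the comparison in the text covers k=2 against k=3-10.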
Marker-trait association analysis was performed with the MLM model, considering both kinship (K) and population structure (Q), implemented in TASSEL software. At the α=0.01 (-logP=2) level, a total of 195 significant associations were detected between 82 SSR markers and 7 lint yield traits. Among these, most of the associations (125 of 195) were detected in only one environment, and the proportion of phenotypic variation explained by markers ranged from 1.52% to 9.40%, with an average of 3.70% (see reference  for details). After Bonferroni correction, 55 associations were found to be significant (P≤0.05/145, -logP≥3.46) between 26 SSR markers and 7 lint yield traits (Table 1). Most of these associations could be detected in more than one environment, and the proportion of phenotypic variation explained by markers ranged from 1.63% to 9.40%, with an average of 4.51%. The numbers of SSR markers associated with LY, SY, BN, BW, LP, LI and SI were 9, 4, 6, 4, 14, 17 and 1, respectively. Seventeen loci were co-associated with two or more different traits (Table 1); for example, NAU3269 (Chr. 5) and NAU3100 (Chr. 23) were simultaneously associated with LY, SY, BN, LP, and LI, and most of the lint yield-associated loci were associated with at least one of its components. These associations coincided with the phenotypic correlations among these traits. This might result from pleiotropy of a single causal gene or tight linkage of multiple causal genes. We found that 10 of the 14 markers associated with LP were detected in all three environments, which was consistent with the phenotypic evaluation, in which LP had the highest broad-sense heritability ($h_B^2$ = 75.77%). The phenotype of a complex trait often results from the combined action of multiple genes and environmental factors; only traits with high heritability can be detected stably across environments. The resulting stably associated markers should be useful for breeding cotton with broad adaptability to different environments.
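The Bonferroni threshold quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the divisor 145 is taken from the text as the number of tests, and the helper name `bonferroni_neg_log10` is hypothetical.

```python
import math

def bonferroni_neg_log10(alpha, n_tests):
    """Return -log10 of the Bonferroni-corrected significance level alpha / n_tests."""
    return -math.log10(alpha / n_tests)

# Threshold used for the 356-accession panel (0.05 / 145).
print(round(bonferroni_neg_log10(0.05, 145), 2))  # 3.46
```

An association passes the corrected threshold when its own -logP value is at least this number.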
To identify favorable alleles, the phenotypic effect of each allele was estimated by comparing the average phenotypic value of the accessions carrying that allele with the average over all accessions:
$$a_i = \frac{\sum_{j} x_{ij}}{n_i} - \frac{\sum_{k} N_k}{n_k}$$
where $a_i$ is the phenotypic effect of the ith allele; $x_{ij}$ is the phenotypic value of the jth accession carrying the ith allele; $n_i$ is the number of accessions carrying the ith allele; $N_k$ is the phenotypic value of the kth accession among all accessions; and $n_k$ is the total number of accessions. If $a_i$ > 0, the allele is considered to have a positive effect; if $a_i$ < 0, it is considered a negative allele. The favorable alleles were then identified according to the breeding objective of each target trait [69, 71]. Phenotypic effects of each QTL allele for the 41 associated loci detected in more than one environment were measured, and 5, 2, 3, 4, 12, 14 and 1 favorable alleles for LY, SY, BN, BW, LP, LI and SI were identified, respectively. Phenotypic effects and representative accessions for each favorable allele are shown in Table 2. Among the favorable alleles, NAU3100-2 had the largest positive phenotypic effect for LY and SY, increasing them by 3.61 g and 7.27 g, respectively; NAU6584-2, NAU3398-2, NAU5166-2 and NAU3917-2 increased BN, BW, LP and LI by 0.89, 0.42 g, 4.93% and 0.94 g, respectively; while NAU493-1 decreased SI by 0.17 g. Lint yield of cotton is the result of a series of components and their interactions, such as boll number, boll weight, lint percentage, lint index, and seed index. Developing potentially high-yielding cultivars thus relies to some extent on selecting the appropriate yield components. As some of the QTLs were associated with more than one yield component, favorable alleles must be treated with caution. Positively co-associated genetic loci could simultaneously improve multiple target traits, while negative linkages must be broken.
|Traits||Favorable allele||ai||Accessions||Representative accessions|
|LY||NAU3269-2||0.27||133||Simian3, Zhongmiansuo9, Huakangmian1|
|LY||JESPR204-1||0.70||314||Simian3, Zhongmiansuo9, P164-2|
|LY||TMK19-2||1.02||240||Simian3, Zhongmiansuo9, P164-2|
|LY||NAU3100-2||3.61||87||Simian3, Zhongmiansuo9, Lumianyan16|
|LY||NAU2776-1||0.85||151||Zhongmiansuo9, P164-2, Lumianyan16|
|SY||NAU3269-2||0.42||133||Zhongmiansuo9, Zhongmiansuo19, Simian3|
|SY||NAU3100-2||7.27||87||Zhongmiansuo9, Han4849, Lumianyan16|
|BN||NAU6584-2||0.89||217||Lumianyan16, Zhongmiansuo44, Zhongmiansuo9|
|BN||NAU3269-2||0.08||133||Zhongmiansuo9, Wanmian73-10, Simian3|
|BN||TMK19-2||0.45||235||Zhongmiansuo44, Zhongmiansuo9, Wanmian17|
|BW||BNL1414-2||0.18||93||Zhongmiansuo18, Zhongmiansuo5, I40005|
|BW||NAU4047-2||0.03||221||Zhongmiansuo18, Zhongmiansuo5, I40005|
|BW||NAU3398-2||0.42||22||Zhongmiansuo5, I40005, Hua101|
|BW||JESPR208-2||0.20||86||Zhongmiansuo18, Zhongmiansuo5, I40005|
|LP||NAU3269-2||0.23||133||Simian3, Ekangmian9, Huakangmian1|
|LP||NAU5166-2||4.93||8||Simian3, Huakangmian1, Sumian4|
|LP||NAU2508-2||0.36||113||Nannongzao, 86-1, Yu668|
|LP||NAU980-3||2.79||9||Ekangmian6, Emian16, Ekangmian10|
|LP||JESPR135-1||0.13||343||XiangSC-24, Simian3, Ekangmian9|
|LP||JESPR204-1||0.36||309||XiangSC-24, Simian3, Ekangmian9|
|LP||BNL3590-1||0.26||327||XiangSC-24, Simian3, Ekangmian9|
|LP||TMK19-2||0.58||235||XiangSC-24, Simian3, Huakangmian1|
|LP||NAU3100-2||1.55||86||Simian3, Ekangmian9, Nannongzao|
|LP||BNL1404-1||0.13||343||XiangSC-24, Simian3, Ekangmian9|
|LP||Gh508-1||0.07||347||XiangSC-24, Simian3, Ekangmian9|
|LP||NAU2361-3||0.75||73||Ekangmian9, Yu668, Yumian21|
|LI||NAU3269-2||0.01||133||Huakangmian1, I40005, Ekangmian9|
|LI||NAU980-3||0.84||9||I40005, Zhongmiansuo5, Hua101|
|LI||JESPR135-1||0.02||343||Huakangmian1, Emian23, I40005|
|LI||Gh369-3||0.11||8||Emian16, Ekangmian8, Yumian20|
|LI||NAU3398-2||0.80||22||Huakangmian1, I40005, Zhongmiansuo5|
|LI||CIR246-3||0.63||16||Hua101, Zhongmiansuo41, Yumian9|
|LI||BNL3590-1||0.06||327||Huakangmian1, Emian23, I40005|
|LI||NAU2233-1||0.01||212||Huakangmian1, Emian23, I40005|
|LI||TMK19-2||0.10||235||Huakangmian1, Emian23, I40005|
|LI||NAU3100-2||0.38||86||I40005, Zhongmiansuo5, Hua101|
|LI||NAU3917-2||0.94||6||Huakangmian1, Simian4, Sumian9|
|LI||BNL1404-1||0.03||343||Huakangmian1, Emian23, I40005|
|LI||Gh508-1||0.01||347||Huakangmian1, Emian23, I40005|
|LI||NAU2361-3||0.28||73||Emian23, I40005, Zhongmiansuo5|
|SI||NAU493-1||-0.17||230||Chaoyangmian1, Xuzhou1818, XiangSC-24|
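The allele effects (ai) listed in the table above follow from the comparison described in the text: the mean phenotype of the accessions carrying an allele minus the mean over all accessions. The Python sketch below illustrates that calculation under simplifying assumptions: each accession is given a single allele label per locus, and the accession names, allele labels and phenotypic values are hypothetical.

```python
def allele_effects(phenotypes, genotypes):
    """Estimate the phenotypic effect of each allele at one locus.

    phenotypes: dict accession -> phenotypic value
    genotypes:  dict accession -> allele label at the locus (None if missing)
    Returns dict allele -> (mean of carriers) - (mean of all phenotyped accessions).
    """
    overall_mean = sum(phenotypes.values()) / len(phenotypes)
    values_by_allele = {}
    for accession, allele in genotypes.items():
        if allele is None or accession not in phenotypes:
            continue  # skip missing genotype or phenotype data
        values_by_allele.setdefault(allele, []).append(phenotypes[accession])
    return {
        allele: sum(values) / len(values) - overall_mean
        for allele, values in values_by_allele.items()
    }

# Hypothetical lint percentage values (%) for four accessions.
toy_phenotypes = {"acc1": 40.0, "acc2": 44.0, "acc3": 36.0, "acc4": 40.0}
toy_genotypes = {"acc1": "1", "acc2": "2", "acc3": "1", "acc4": "2"}
effects = allele_effects(toy_phenotypes, toy_genotypes)
print(effects)  # {'1': -2.0, '2': 2.0}
```

Whether a positive or a negative effect counts as favorable depends on the breeding objective for the trait, as with the negative value reported for SI.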
Allele frequencies of the 23 favorable alleles in the CK group and the six groups of historically released Chinese cultivars are summarized in Table 3. Based on allele frequencies across the different groups, these favorable alleles could be categorized into three classes. Alleles in the first class, such as JESPR135-1, BNL1404-1 and Gh508-1, were present in the founder cultivars and at high frequency in all groups; they might have been passed down stably from the original parents and were almost fixed in modern cultivars by selection. Alleles in the second class, such as NAU3269-2, BNL1414-2, NAU3100-2 and JESPR208-2, were present in the founder cultivars but at moderate to low frequency in most groups, and appear to have been underutilized in modern breeding programs. Those in the third class, such as NAU5166-2, NAU980-3, Gh369-3 and CIR246-3, were absent from the founder cultivars and present at low frequency in modern cultivars; they might have come from other original parents or could have been generated by mutation and/or recombination. Favorable alleles, especially those of the latter two classes, should have great potential in the future genetic improvement of Upland cotton. We suggest that a multi-parent population be constructed using cultivars that possess most of the favorable alleles and that, in the meantime, a ranking system for MAS or genomic selection be developed based on the results of AM. Favorable alleles that were passed down from the founder parents and have been almost fixed in modern cultivars form the basis of the lint yield of Chinese Upland cotton, and should be treated as fundamental elements in order to reject deleterious alleles at the corresponding loci. Alleles either absent from the founder cultivars or present at moderate to low frequencies in most cultivar groups have been underutilized in modern breeding programs, and should be regarded as essential elements for increasing lint yield potential.
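The three-class grouping described above can be expressed as a simple rule on allele frequencies. The Python sketch below is only illustrative: the text gives no numeric cut-offs, so the `high` threshold of 0.8, the function name `classify_allele` and the example frequencies are all assumptions.

```python
def classify_allele(founder_freq, group_freqs, high=0.8):
    """Assign a favorable allele to one of the three classes described in the text.

    founder_freq: allele frequency in the founder (CK) group
    group_freqs:  allele frequencies in the released-cultivar groups
    high:         assumed cut-off for "high frequency" (not given in the source)
    """
    if founder_freq == 0:
        return 3  # absent from the founder cultivars
    if all(freq >= high for freq in group_freqs):
        return 1  # present in founders and nearly fixed in every group
    return 2      # present in founders but at moderate to low frequency

# Hypothetical frequencies for three alleles.
print(classify_allele(1.0, [0.95, 0.97, 0.99]))  # 1
print(classify_allele(0.5, [0.30, 0.40, 0.20]))  # 2
print(classify_allele(0.0, [0.02, 0.03, 0.05]))  # 3
```

A real classification would need thresholds chosen from the observed frequency distribution in Table 3 rather than a fixed value.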
3. Favorable QTL alleles for fiber quality traits
As spinning speeds increase, the demand for higher cotton fiber quality is rising rapidly [9, 17]. It is important to elucidate the molecular genetics of Upland cotton fiber quality, because such information would enable the improvement of cotton cultivars by pyramiding favorable alleles for fiber quality traits. We performed marker-trait association mapping for fiber quality traits using the MLM model implemented in TASSEL with the panel mentioned above. At the α=0.01 (-logP=2) level, a total of 59 significant associations were detected between 41 SSR markers and 5 fiber quality traits (Table 4). Almost all of these associations were detected in only one environment, and the proportion of phenotypic variation explained by markers ranged from 1.84% to 7.12%, with an average of 3.53% (Mei et al. unpublished data). When a more stringent threshold with Bonferroni correction was adopted, only 9 associations were significant (Table 4). This result might be caused by the low diversity of fiber quality properties in the 356-accession panel. Although the yield potential of Chinese cultivars is equal to or slightly higher than that of cultivars developed in the United States or Australia, the fiber quality of Chinese cultivars is not as good as that of American or Australian cultivars [73, 84]. The narrow variation in fiber quality among Chinese cultivars severely limited the power to detect marker-trait associations.
|Traits||Marker loci||Chr.||Position||-LogP||R2 (%)|
To capture as much genetic diversity in fiber quality as possible, a previously assembled collection of 99 Upland cotton cultivars and breeding lines was used for further association mapping. The collection included 63 cultivars and 36 breeding lines with elite fiber quality traits (13 of which were PD lines introduced from the United States), and fiber properties including fiber length, strength, and fineness were evaluated in 2004 and 2007. A total of 260 SSR markers, including markers linked to fiber properties reported in previous studies [17-19], were used to screen the 99 accessions, of which 97 showed polymorphisms. Marker-trait association analysis was performed with the MLM model implemented in TASSEL. At the α=0.01 (-logP=2) level, a total of 51 significant associations were detected between 33 SSR markers and 3 fiber quality traits, and 7 of these associations could be detected in both years. The proportion of phenotypic variation explained by markers ranged from 7.76% to 23.99%, with an average of 14.07%. After Bonferroni correction, 17 associations were still significant (P≤0.05/97, -logP≥3.29). Compared with the results from the 356-accession panel, both the number of significant associations after correction and the proportion of phenotypic variation explained by markers were higher in this panel.
Phenotypic effects of each QTL allele for the 33 significantly associated loci were measured, and 21, 24 and 7 favorable alleles for fiber length, strength and fineness were identified, respectively. Phenotypic effects and representative materials for each favorable allele are shown in Table 6. The number of favorable alleles possessed by each of the 99 accessions varied widely, with an average of 14.84 and a range of 7 to 21 (see reference  for details). Most of the representative materials were fiber-elite lines collected from China or introduced from the United States, and some of them, such as Yumian 1 and 7235, were introgression lines that contained more than half of the favorable alleles. For example, 7235 was developed from several hybridizations with multiple parents, such as a G. anomalum introgression line, Acala 3080 and PD4381 [85, 86]; Yumian 1 is an introgression line with G. barbadense, G. arboreum, and G. raimondii as putative donors, and is characterized by high lint yield and high fiber strength. These fiber-elite materials had been used for tagging QTLs underlying fiber quality traits in previous family-based linkage mapping studies [13, 17-19, 24-25, 39]. With the genetic information provided by this and previous studies, MAS should be implemented in future breeding programs.
|Traits||Favorable allele||ai||Accessions||Representative materials|
|Fiber length||NAU934-4||1.0498||17||I-62434, 7235, I-62478|
|Fiber length||NAU2354-2||0.3662||10||I-62478, I-62479, PD94045|
|Fiber length||BNL1317-3||0.8522||21||I-62434, 7235, I-62431|
|Fiber length||NAU445-3||0.7926||31||I-62434, 7235, I-62431|
|Fiber length||JESPR295-2||0.8233||31||I-62434, 7235, I-62478|
|Fiber length||JESPR153-3||0.8457||16||I-62434, I-62431, Yumian1|
|Fiber length||NAU1200-1||0.5373||49||I-62434, I-62478, I-62431|
|Fiber length||NAU1102-3||0.4564||22||I-62431, I-62479, I-62429|
|Fiber length||TMO06-3||1.3109||19||I-62434, 7235, Yumian1|
|Fiber length||NAU2443-2||0.8435||17||I-62434, 7235, I-62431|
|Fiber length||NAU1004-2||0.9346||14||I-62434, 7235, TM-1|
|Fiber strength||NAU934-4||2.6759||17||I-62429, I-62478, PD93019|
|Fiber strength||NAU474-2||1.0231||31||PD94042, I-62478, PD93019|
|Fiber strength||BNL1317-3||2.3136||21||I-62429, 7235, I-62433|
|Fiber strength||NAU2508-1||3.7061||8||7235, PD6992, HS427|
|Fiber strength||NAU445-3||0.8072||31||I-62429, 7235, I-62433|
|Fiber strength||JESPR295-2||2.1342||31||I-62429, I-62478, 7235|
|Fiber strength||JESPR153-3||2.8625||16||I-62429, I-62433, I-62434|
|Fiber strength||NAU1200-1||0.3169||49||I-62429, PD94042, I-62478|
|Fiber strength||NAU1043-2||0.9147||29||I-62478, PD93019, Yuwu19|
|Fiber strength||BNL1395-1||2.2677||12||I-62478, 7235, Yumian1|
|Fiber strength||BNL1122-1||2.2677||12||I-62478, 7235, Yumian1|
|Fiber strength||NAU1369-2||1.4296||15||I-62429, I-62433, Yumian1|
|Fiber strength||NAU1322-2||1.7557||17||I-62429, 7235, I-62433|
|Fiber strength||TMO06-3||3.1629||19||I-62429, 7235, I-62433|
|Fiber strength||NAU2443-2||3.0749||17||I-62429, 7235, Yumian1|
|Fiber strength||BNL2634-3||0.8204||26||I-62429, I-62433, Yumian1|
|Fiber strength||NAU780-1||0.4073||71||I-62429, I-62478, Yumian1|
|Fiber strength||NAU816-3||1.6559||29||I-62429, PD94042, I-62478|
|Fiber fineness||NAU934-4||-0.3064||17||Emian11, Emian6, CRI18|
|Fiber fineness||BNL1317-3||-0.3617||21||Jinmian2, Jinmian9, Jimian11|
|Fiber fineness||NAU1162-2||-0.2338||13||Sumian5, CRI16, Yu668|
|Fiber fineness||JESPR295-3||-0.2148||9||Qiannong465, Cangzhou7315-38, Jin185|
|Fiber fineness||BNL3436-3||-0.1132||60||Xiangmian11, Jinmian2, Qiannong465|
|Fiber fineness||TMO06-3||-0.3587||19||Shiduan5, Tai8033-2, Sumian5|
|Fiber fineness||NAU1004-2||-0.1530||14||Qiannong465, Wanmian73-10, Qinli514|
4. Favorable QTL alleles for FOV race 7 resistance
Fusarium wilt (FW), caused by Fusarium oxysporum f. sp. vasinfectum (FOV), is a widespread disease that causes huge losses in cotton (Gossypium spp.) production worldwide [87-90]. Once established in soil, FOV can survive in the field for several years as chlamydospores, even in the absence of a host, and is nearly impossible to eliminate. Although both chemical controls and cultural practices have been employed to protect plants from damage, the most effective and efficient control is provided by host resistance [87, 89]. In the past several years, eight races (races 1–8) of FOV have been identified worldwide using both cotton and non-cotton differential hosts. Three FOV races (races 3, 7 and 8) have been found in China; race 7 possesses the highest virulence and is the most widely distributed. Recently, DNA-based techniques have been employed in conjunction with pathogenicity tests to validate these races and to test new isolates, and highly virulent isolates of FOV have been identified in Australia and the United States [93-94]. Host resistance to FOV races has been widely evaluated in cotton germplasm under both field nursery and greenhouse conditions, and many highly resistant cotton cultivars and breeding lines have been developed through conventional breeding [89, 95-96]. However, little is known about the mechanism and genetic basis of FOV resistance. Some early classical genetic studies suggested that the inheritance of FOV resistance in cotton is determined by a single gene [97-99], while other studies suggested that FOV resistance is controlled by multiple genes [100-102]. Large-scale resistance evaluations in breeding programs are time-consuming and labor-intensive, and it is not easy to obtain the ideal genotype simply through phenotypic selection. The cultivar development process has been slow in the face of the emergence of new, highly virulent FOV isolates [89-90].
To accumulate useful information for understanding the genetic basis of FOV 7 resistance and to identify favorable alleles for future molecular resistance breeding, we performed marker-trait association mapping to detect QTLs underlying FOV 7 resistance in Upland cotton. The 356-accession panel mentioned above was used for association analysis and favorable allele mining. Meanwhile, a composite cross population (CP) with three parents was developed for linkage mapping. Three Upland cotton cultivars, Xuzhou 142, Yumian 21 and Shang 9901, were chosen as parents based on many years of FOV 7 resistance evaluation. Xuzhou 142, an obsolete cultivar selected from STV 2B in the 1970s, with large boll size and high lint percentage, is severely infected by FOV 7 and was selected as the susceptible parent. Yumian 21 was released in 1999 and is currently used as a resistant control in national cotton regional trials because of its extremely high resistance to FOV 7. Shang 9901 is an anonymous breeding line with high yield potential and moderately high resistance to FOV 7. A series of QTLs for FOV 7 resistance was thus detected by joint population-based association mapping and family-based linkage mapping. Marker-trait association mapping was performed with the MLM model implemented in TASSEL software. A total of 27 markers were significantly associated with FOV 7 resistance at the α=0.01 level (-logP=2.0), and were localized to 16 chromosomes (Table 7). Of these loci, 23 were detected under field nursery conditions, 10 were detected in the greenhouse and 6 were detected under both conditions. The proportion of phenotypic variation explained by the markers ranged from 1.48% to 12.99%, with an average of 4.21%. When a more stringent threshold with the Bonferroni correction (P≤0.05/145, -logP≥3.46) was adopted, only 10 associations were significant (Table 7).
Five (qFW-A3-1, qFW-A12-1, qFW-D3-1, qFW-D5-1 and qFW-D8-1) of the 7 QTLs identified by linkage mapping could be detected by association mapping at the α=0.05 level, while only 3 QTLs (qFW-D3-1, qFW-D5-1 and qFW-D8-1) could be detected at the α=0.01 level (see reference  for details).
The phenotypic effects of each QTL allele of the 27 associated loci were estimated in both the greenhouse and field nursery evaluations according to the method mentioned above, and favorable alleles for FOV 7 resistance were thereby identified (Table 8). The phenotypic effects of the 27 loci in the greenhouse and field nursery averaged -1.31 and -3.45, with ranges of -3.81 to -0.01 and -14.88 to -0.20, respectively. Among the favorable alleles, NAU934-2 had the most negative phenotypic effect in the greenhouse assay and was able to decrease the FW disease index (DI) by 3.81, while NAU6966-3 had the most negative phenotypic effect in the field nursery and was able to decrease FW DI by 14.88. The number of favorable alleles possessed by each of the 356 accessions varied widely, with an average of 14.84 and a range of 7 to 21 (see  for details). Pearson correlation analysis between the number of favorable alleles and FW DI showed highly significant negative correlations in both the greenhouse (r=-0.344, P<0.001) and the field nursery (r=-0.488, P<0.001) evaluations. The top 3 resistant accessions for each favorable allele are listed as representative materials in Table 8.
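The correlation between the number of favorable alleles per accession and the disease index can be sketched in Python as follows. This is a plain implementation of the Pearson coefficient without a significance test; the function name `pearson_r` and the allele counts and disease-index values are hypothetical and are not taken from Table 8.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: more favorable alleles, lower Fusarium wilt disease index.
favorable_allele_counts = [7, 10, 14, 18, 21]
disease_index = [60.0, 45.0, 30.0, 20.0, 5.0]
r = pearson_r(favorable_allele_counts, disease_index)
print(r < 0)  # True
```

A negative coefficient, as reported for both the greenhouse and field nursery evaluations, means that accessions carrying more favorable alleles tend to show a lower disease index.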
|Favorable alleles||ag||af||Accessions||Representative materials|
Wang and Roberts identified a major resistance gene (Fov1) against FOV 1 in G. barbadense cv. Pima-S7 and indicated that one or more minor genes in Acala NemX could delay wilt symptoms. Ulloa et al. performed QTL mapping in an F2 (Pima-S7 x Acala NemX) and a recombinant inbred line (RIL; G. hirsutum TM-1 x G. barbadense Pima 3-79) population. The authors detected 6 QTLs (Fov1-C06, Fov1-C08, Fov1-C111, Fov1-C112, Fov1-C16 and Fov1-C19) conferring FOV 1 resistance in different genetic backgrounds. Lopez-Lavalle et al. used an intraspecific cross between the Upland cotton MCU-5 and Siokra 1-4 to detect QTLs conferring resistance to Australian FOV races and found that MCU-5 resistance is complex, with 3 QTLs identified in F3 and 8 in F4. The QTLs were localized to chromosomes A6, D4 and D6. Very recently, Ulloa et al. investigated 3 intraspecific (G. hirsutum x G. hirsutum L. and G. barbadense x G. barbadense L.), 5 interspecific (G. hirsutum x G. barbadense) and one RIL population in 4 greenhouse and 2 field experiments and identified a set of 11 SSR markers across 6 linkage groups/chromosomes (3, 6, 8, 14, 17 and 25) associated with FOV 4 resistance. Integrating the information above with the QTL mapping results for FOV 7 in our study suggests that the inheritance of FOV resistance in cotton is far more complex than has been elucidated to date. Interestingly, the map positions of some QTLs conferring resistance against different FOV races coincided, suggesting that these genes may play the same or similar roles in different host-pathogen and/or environment interactions. While great differences in the resistance of cotton genotypes against FOV races have been demonstrated in many evaluations worldwide, a genotype with a high level of resistance against all FOV races has not been found. Given that FW is the result of interactions among the host, pathogen and environment, many genes must be involved in this process.
The accumulation of resistance alleles, even with relatively minor individual effects, will result in higher levels of resistance. The Upland cotton cultivars from China and Africa have shown more resistance to Australian FOV races. Therefore, the favorable alleles and their typical germplasm resources identified in this study should have great potential for developing highly resistant Upland cotton cultivars in future molecular breeding programs.
5. Discussion and future prospects
Compared to family-based linkage mapping, population-based association mapping offers several advantages, such as: (1) the saving of time and money by using existing populations instead of creating cross-controlled populations, (2) analysis of more than two alleles per locus on average, and (3) high expected resolution owing to a short extent of linkage disequilibrium. All of these make association mapping an increasingly important tool for complex trait dissection [63, 104].
A suitable association mapping panel should embrace as much phenotypic and genotypic diversity as possible. For example, because the 99-accession panel comprised higher diversity in fiber properties than the 356-accession panel and the 81-accession panel, both the number of significant associations detected and the proportion of phenotypic variation explained by markers were improved. However, the presence of high diversity from different genetic origins may induce LD between unlinked loci, and consequently result in spurious marker-trait associations. Although several statistical strategies have been developed to account for population structure and relatedness in order to decrease false positives [106-107], whether these approaches introduce new false negatives remains unclear. Synthetic association populations, such as the multi-parent advanced generation inter-cross (MAGIC) population and the nested association mapping (NAM) population in maize, have been used for genome-wide association studies (GWAS); with these, both high resolution and better control of population structure can be achieved [110-111]. Such populations have been or are being developed in Upland cotton. Two artificially controlled multiple-parent random-mating populations, each comprising more than 800 lines, have been developed in our laboratory, and should provide another theoretically ideal panel for association mapping.
The other inherent constraint limiting the successful use of association mapping is the presence of rare alleles in natural populations. Given that the number of individuals with a specific genotype is quite small, the effect of rare alleles on mapping can go far beyond the effect of small population size. A large population size is considered an important factor for improving QTL detection power in association mapping studies [3, 112]. Many more marker-trait associations for lint yield were detected in our 356-accession panel than in the 81-accession panel between the same markers and target traits at the same significance level. By contrast, family-based linkage mapping can make use of alleles that occur at low frequencies in natural populations, by designing crosses to create artificial populations with inflated frequencies of those alleles. Specifically designed mapping populations such as recombinant inbred lines (RIL) and near isogenic lines (NIL) will therefore remain important. Furthermore, joint linkage and association mapping has been recommended as an alternative approach to overcome some of the inherent limitations of both linkage and association mapping [105, 112], and this approach has proven to be a powerful tool for dissecting the genetic architecture of complex traits [72, 113-118].
Recently, two preliminary maps of the whole-genome scaffolds of G. raimondii (the putative diploid donor for tetraploid species) were separately released by two different groups [119-120], which will facilitate the tetraploid genome sequencing and assembly. Real genome-wide association mapping will be realized in the near future through resequencing or other high-throughput genotyping technologies, which will dramatically accelerate genetic diversity exploitation and favorable allele mining in Upland cotton germplasm resources.
We thank all faculty members and graduate students in the Cotton Research Institute, Nanjing Agricultural University, and the other members of the cotton genomics and breeding communities for their contributions, and apologize for not citing many enlightening papers owing to space limitations. This study was supported by grants from 973 (2011CB109300), 863, Jiangsu Province Key Project (BE2012329), and the Priority Academic Program Development of Jiangsu Higher Education Institutions.