Success stories of association mapping studies in cotton.
The genus Gossypium supplies natural fiber to the textile industry worldwide. Genetic improvement of cotton for traits of interest has lagged because adequate information about fiber production and quality is scarce. DNA markers can overcome the difficulties of selecting for complex traits and may ultimately enable breeding by design. Numerous marker-trait associations have been identified for economically important traits in cotton using linkage analysis. There is now a need to develop high-density genetic maps using next-generation sequencing approaches together with genome-wide association studies (GWAS). Efforts in this direction have begun, and several QTLs for fiber quality, yield traits, plant architecture, stomatal conductance and verticillium wilt resistance have been identified. This chapter describes genetic diversity, QTL mapping, association mapping and QTLs related to fiber quality traits. Integrating genomic approaches with previously described marker strategies will pave the way for increased fiber production.
- association mapping
Cotton (Gossypium spp.) belongs to the family Malvaceae and order Malvales and is the leading source of natural fiber. Worldwide, cotton seed is also an important source of edible oil. Cotton provides raw material for millions of consumers as well as for industrial products throughout the world. The economic impact of cotton in the textile industry continues to grow (presently exceeding 500 billion US$). Cotton is grown in tropical and subtropical regions between 36° South and 46° North latitude. The northern hemisphere accounts for 90% of global cotton production. Planting time in the northern hemisphere coincides with harvest in the southern hemisphere.
Cotton is a warm-climate crop cultivated in nearly 100 countries and is largely grown in Asia, America and Africa. The major emphasis of cotton breeding programs is to improve lint yield and quality. Yield, yield components and fiber quality characters are governed by many genes, and these traits are often inversely related to one another. Fiber quality and other economic characters have not been improved sufficiently through conventional breeding because they are strongly influenced by environmental conditions.
Molecular markers reveal variability at the genotypic level and speed up the breeding process. Genetic maps are constructed from DNA marker data, and quantitative trait loci (QTLs) for traits of interest have been identified. The availability of reference genomes of upland cotton (G. hirsutum L.) and Egyptian cotton (G. barbadense L.) and of draft genomes of G. arboreum L., G. herbaceum L. and G. raimondii has revolutionized ‘omics’ studies. Next-generation, high-throughput sequencing has enabled genotyping at the single-nucleotide level. High uniformity, strength, extensibility and other fiber quality traits are in demand worldwide [5, 6]. Fiber development in cotton is a complex process comprising fiber initiation, elongation (primary wall synthesis), wall thickening (secondary wall synthesis) and desiccation (maturation) [7, 8]. Lint and fuzz cover the seed coat of cotton; lint serves as a natural textile fiber, while fuzz remains on the seed coat after ginning.
2. Evolution of genome size in cotton
Polyploidy is a vital evolutionary process in angiosperms and one of the main factors in creating new plant species [9–11]. Around 70% of existing angiosperms are polyploids, including many of the world's leading crops such as cotton, wheat, potatoes, canola, sugarcane, oats, peanut, tobacco, rose, alfalfa, coffee and banana [11, 12]. Nonetheless, genomic studies in polyploids lag behind those in diploid species because of their polyploid nature. Creating a reference genome for tetraploid cotton is laborious because genomes derived from different species are involved. However, advances in genomic approaches such as quantitative trait locus (QTL) mapping, association mapping, nested association mapping, cloning, genome sequencing, and functional and comparative genomics have laid the foundation for studying such complex organisms and for developing highly saturated genetic maps to trace genome evolution.
In polyploids, after whole genome duplication (WGD), intra- and inter-chromosomal rearrangements have reallocated both large and small segments across the genome over evolutionary time. Genome decomposition has given rise to a set of duplicated DNA segments dispersed among the chromosomes, with all the duplicate pairs exhibiting a similar degree of sequence divergence [11, 12].
The genus Gossypium has a long taxonomic and evolutionary history. Gossypium comprises 52 species, including 46 diploids (2n = 2x = 26) and 5 allotetraploids plus one purported tetraploid species (2n = 4x = 52). Of these, only four species are domesticated: two Old World diploids (G. arboreum L. and G. herbaceum L.) and two New World allopolyploids (G. hirsutum L. and G. barbadense L.), the latter consisting of an At (~1700 Mb) and a Dt (~900 Mb) genome. Together, these four domesticated species account for the production of natural fiber worldwide. G. hirsutum L., also known as upland cotton, dominates world cotton production (>95%). G. barbadense L., known as extra-long staple or sea-island cotton, is grown on 2–3% of the world's cotton area but has a lower yield per hectare than G. hirsutum L. Cultivation of diploid cottons such as G. arboreum L. and G. herbaceum L. is restricted to a few countries, such as Pakistan and India. Diploid species (2n = 26) are grouped into eight genomic groups (A–G and K) based on similarities in chromosome pairing. The eight genomes fall into three clades, as shown in Figure 1: the A, B, E and F genomes, found naturally in Africa and Asia; the D genome clade, indigenous to the Americas; and the C, G and K genomes, found in Australia.
Tetraploid species evolved ~1–2 million years ago (MYA) through hybridization between “A” and “D” genome species, which had diverged from a common ancestor about 4–11 MYA [6, 17]. The “A” and “D” genomes have retained some sequence similarity, resulting in high transferability of markers among Gossypium species [18, 19]. F1s of the most widely cultivated cotton species (G. hirsutum L. and G. barbadense L.) can be crossed further with wild tetraploid species (G. darwinii G. Watt, G. mustelinum Miers ex G. Watt, G. tomentosum Nutt. ex Seem.), producing normal hybrids and some productive offspring.
In recent years, genomic research has emphasized comparative analyses of closely related, homoeologous stretches of genomic sequence in plants such as maize, rice and sorghum [21–23]. Cultivated upland cotton has passed through genetic bottlenecks that have significantly reduced the genetic diversity of the cultivated species, which has compelled geneticists to use populations developed by hybridizing two different species in order to identify a high number of polymorphisms.
2.1. Genetic diversity in cotton
Cotton has a narrow genetic base, which is the main obstacle to sustaining cotton productivity worldwide. Limited genetic diversity and the low efficiency of traditional selection methods have been the major factors slowing cultivar improvement over the last three decades [24–26]. One major reason for the limited variability of cotton cultivars is the use of adapted cotton germplasm in breeding programs. Cotton breeders avoid wild genetic resources because of linkage drag of unwanted characters. Another reason is the lack of innovative tools for mobilizing useful genetic variation from diverse exotic Gossypium species into breeding cultivars. Together, these factors have led to a genetic bottleneck. Understanding the extent of genetic diversity and the relationships among breeding materials could pave the way for precise parental selection and germplasm organization in cotton improvement programs [28–34]. Percent Disagreement Values (PDVs) distance matrices, tree clustering diagrams and neighbor-joining stars are statistical techniques for determining the extent of genetic diversity. Polymorphism Information Content (PIC) is another statistic that can be used to evaluate the polymorphism detected by different techniques, primers or markers [35–38].
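As an illustration of how PIC can be computed for a single marker, the sketch below assumes the widely used formulation PIC = 1 − Σp_i² − ΣΣ 2p_i²p_j², where p_i is the frequency of the i-th allele. The function name and the example frequencies are hypothetical, and the cited studies may have used a variant of this formula.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism Information Content of one marker.

    Assumes the common formulation
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    allele_freqs: frequencies of all alleles at the marker (must sum to 1).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity under random mating
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings that are uninformative about allele transmission
    correction = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


# Hypothetical SSR with three alleles
print(round(pic([0.6, 0.3, 0.1]), 4))
```

A monomorphic marker gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.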
2.2. Genomic studies in cotton
The discoveries made by exploring the genome would set a firm foundation for initiating breeding by design in cotton improvement programs. Over the last two decades, multiple genomic tools have been used to explore the cotton genome. Different types of DNA markers, such as restriction fragment length polymorphism (RFLP) [39, 40], randomly amplified polymorphic DNA (RAPD) [41–48], amplified fragment length polymorphism (AFLP) [27, 49], simple sequence repeats (SSRs) or microsatellites [37, 50, 51] and single nucleotide polymorphisms (SNPs) [52–55], along with physical maps, genetic maps, mapped genes and QTLs, microarrays, gene expression profiling, BAC and BIBAC libraries, QTL fine mapping, resistance gene analogs (RGA), genome sequencing, non-fiber and non-ovule EST development, and association studies for various traits, have been used extensively to understand the cotton genome. Finally, the genome sequence information of G. hirsutum L. and its progenitor species will considerably expedite cotton genomic research toward identifying new genes conferring traits of interest, and will also help in identifying DNA markers linked with traits that can be used in marker-assisted selection (MAS).
2.3. Mapping population
The group of individuals used for determining genetic variation, phylogenetic analysis, developing genetic maps and assigning loci to traits of interest is known as a mapping population, and such populations are vital for mapping. Mapping populations are derived from two parents that contrast for the desired trait. In self-pollinated crops, mapping populations usually include F2 [56, 57], F2:3 [58, 59], recombinant inbred lines (RILs) [60, 61], backcross (BC) [62, 63], backcross inbred lines (BILs), near isogenic lines [65, 66], doubled haploids [67, 68] and chromosome substitution lines (CSILs) [69, 70]. F2, BC, RIL and doubled-haploid populations have been widely used for linkage mapping studies in cotton.
F2 and BC populations are the easiest to develop because they require the least time. The F2 population is well suited to detecting QTLs with additive effects and can also be used to assess dominance. A number of QTLs in cotton have been found using this population [56, 71–73]. Backcross populations can give misleading results when dominance is present, because additive and dominance effects are confounded. Both populations also have drawbacks: (i) because few meioses have occurred, markers located at some distance from a QTL are also detected as associated; (ii) non-allelic interactions cannot be analyzed; and (iii) F2 and BC populations are highly heterozygous and temporary, so they cannot be replicated across locations. In contrast, RILs have undergone several generations of selfing and are highly homozygous. Moreover, RILs yield finely saturated genetic maps because their recombination frequency is high. RILs have been used in cotton to identify QTLs for agronomic and fiber traits [5, 53, 74]. Doubled haploids are completely homozygous, which makes them excellent populations for trait improvement. They can be obtained in less time than RILs and BILs but must be developed under fully sterile conditions and require considerable skill. They make it possible to shorten variety development and to analyze genetic behavior. BILs, RILs and doubled haploids are permanent populations, so QTLs can be analyzed in detail after phenotypic screening, genotyping and genetic map construction.
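The contrast between temporary (F2, BC) and permanent (RIL) populations can be made concrete with the standard expectation that each generation of selfing halves heterozygosity. The sketch below is a simplified illustration that assumes strict selfing, no selection and a locus that is heterozygous in the F1; the function name is hypothetical.

```python
def expected_heterozygosity(generations_of_selfing, initial=1.0):
    """Expected proportion of heterozygous individuals at a segregating locus.

    Assumes strict selfing without selection; heterozygosity is halved
    every generation. initial=1.0 corresponds to the F1.
    """
    if generations_of_selfing < 0:
        raise ValueError("generations_of_selfing must be non-negative")
    return initial * 0.5 ** generations_of_selfing


# F2 is one selfing generation after the F1, F7 is six
for label, n in [("F2", 1), ("F4", 3), ("F7", 6)]:
    print(label, expected_heterozygosity(n))
```

Under these assumptions an F2 is 50% heterozygous at such a locus, whereas an F7 RIL retains under 2%, which is why RILs can be replicated across locations and years.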
2.4. Association mapping
Over the last decade, plant geneticists have become increasingly interested in using nonrandom associations of loci in haplotypes as a powerful high-resolution tool for mapping complex quantitative traits, in comparison with conventional linkage mapping. Associations between chromosomal fragments and phenotype can be determined from genotypic data. Genotypic and phenotypic data are collected from a population of unknown relatedness, and marker-trait associations are then estimated in that population. Association mapping is an open-system approach that yields high-resolution maps, whereas linkage mapping requires fine mapping to get close to the causal loci, and knowing when and where recombination occurred in the genome is difficult. Single-marker analysis, interval mapping, multiple interval mapping and Bayesian interval mapping have been widely used in conventional linkage mapping studies. Association mapping is a powerful approach for mapping genes underlying QTLs using genomic tools together with robust statistical methods. It is an important way to investigate the genetic structure of QTLs and can lay a foundation for studying traits such as insect resistance, disease resistance, earliness and fiber quality. Zhu et al. reviewed the status and prospects of association mapping in comparison with linkage analysis. Recent advances in associating DNA markers with fiber quality traits pave the way to understanding the mechanism of cotton fiber development.
Linkage disequilibrium (LD) mapping, commonly called association mapping, is a method for detecting and locating QTLs based on marker-trait association, and it is a relatively new method for dissecting complex traits. Methods for LD analysis were initially developed for human genetic studies [79, 80] and have since been successfully transferred to crop plants. Association mapping offers a uniquely high-resolution mapping strategy based on historical recombination events at the population scale, which can enable mapping at the gene level in less-studied organisms where conventional QTL mapping would not be practical.
There are several ways to measure LD, but the most popular statistic is r². The underlying Pearson correlation coefficient describes how allelic variation at one locus correlates with allelic variation at another, and r², known as the coefficient of determination, is the square of that coefficient. Overall, r² expresses the proportion of variance in the dependent variable that is explained by the independent variable in a linear regression.
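A minimal sketch of this statistic, assuming phased haplotypes at two biallelic loci coded as 0/1 (the function name and the example data are hypothetical):

```python
def r_squared(locus_a, locus_b):
    """Squared Pearson correlation between allele indicators at two loci.

    locus_a, locus_b: equal-length lists of 0/1 allele codes, one entry
    per phased haplotype (an assumption of this sketch).
    """
    n = len(locus_a)
    if n == 0 or n != len(locus_b):
        raise ValueError("loci must be non-empty and of equal length")
    mean_a = sum(locus_a) / n
    mean_b = sum(locus_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(locus_a, locus_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in locus_a) / n
    var_b = sum((b - mean_b) ** 2 for b in locus_b) / n
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return cov * cov / (var_a * var_b)


# Hypothetical haplotypes: the two loci agree on 7 of 8 chromosomes
print(r_squared([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 0]))
```

Values near 1 indicate that alleles at the two loci are almost always inherited together, and values near 0 indicate near-random association.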
LD is also described by another common statistic, Lewontin’s D, which measures the departure from random association between two loci as the difference between the frequencies of coupling and repulsion gametes. D is the basis for calculating D′ and is obtained for a pair of loci using the formula:

$$D = P_{AB} - P_A P_B$$

where P_AB is the observed frequency of the set of closely linked alleles inherited by the offspring with allele A at the first locus and allele B at the second, P_A is the frequency of allele A at the first locus and P_B is the frequency of allele B at the second. Because D depends on allele frequencies, its value is not a precise measure of the strength of nonrandom association and is difficult to compare between loci. D′ was developed by Lewontin as a measure of LD that is less dependent on allele frequencies. Varshney and Tuberosa reported that the variance of LD estimates was often high, although D′ showed minimal variance when the individuals evaluated came from populations under equilibrium. They also pointed out that population size has a significant impact on association, as D′ can produce problematic results in such studies.
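The relationship between D and D′ can be sketched as follows. The normalization shown is the commonly used one, in which D is divided by the largest value it could take given the allele frequencies. The function names are hypothetical, and the absolute value |D′| is returned, which is one of several reporting conventions.

```python
def lewontin_d(p_ab, p_a, p_b):
    """D = P_AB - P_A * P_B for alleles A and B at two loci."""
    return p_ab - p_a * p_b


def d_prime(p_ab, p_a, p_b):
    """|D'|: D scaled by its maximum attainable magnitude (0 to 1)."""
    d = lewontin_d(p_ab, p_a, p_b)
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


# Hypothetical frequencies: haplotype AB = 0.4, allele A = 0.5, allele B = 0.6
print(lewontin_d(0.4, 0.5, 0.6), d_prime(0.4, 0.5, 0.6))
```

Because D′ is scaled by the frequency-dependent maximum, two locus pairs with very different allele frequencies can be compared more directly than with raw D.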
Ersoz et al. described other approaches, based on kinship, that determine the probability of independence between two loci through individual spreading rather than relying on summary LD statistics. These statistical tools, also known as model-based LD methods, allow the population recombination parameter to be estimated from sequence information under an unbiased equilibrium model [85–87]. In addition, other model-based methods use diverse population structures to calculate LD and to differentiate individuals of different origins.
The applications of association mapping are receiving major attention in genomic studies of quantitative traits in all major crops. However, what association mapping has achieved for crop improvement is not yet comparable to its achievements in human genomics. Over the decades, many QTLs have been identified using bi-parental populations for yield, yield components and other traits of interest [90, 91], but only a few have been used successfully in plant improvement programs. Recent advances in genomic science have provided the opportunity to identify more QTLs through various approaches, including GWAS. Low-cost genotyping methods have likewise helped to identify more QTLs that can be used in breeding programs. A genome-wide association study (GWAS) was conducted for yield components and fiber quality traits on a diversity panel of 103 cotton accessions; it identified 17 SNP associations for fiber length and 50 for micronaire value. In another report, GWAS was conducted on 318 genotypes, and 54.8% of the GWAS-detected alleles were found to have been transferred from three founder parents: Deltapine15, Stoneville 2B and Uganda Mian.
DNA markers linked to QTLs contributing to traits of agronomic importance are invaluable resources for cotton (Gossypium sp.) improvement. Despite the potential diversity in the genus Gossypium, it remains largely underutilized because of photoperiodism barriers and the lack of advanced technologies to deal with these challenges. Linkage disequilibrium (LD) mapping is a powerful tool for dissecting genetic diversity. Abdurakhmonov et al. used association mapping in 285 exotic G. hirsutum L. accessions, comprising 208 landrace accessions and 77 photoperiodic accessions. Significant genetic diversity was found within the exotic germplasm stock. About 11–12% of SSR loci showed significant LD. At the significance threshold r² = 0.1, LD declined within about 10 cM of genetic distance in landraces and 30 cM in varieties. At r² = 0.2, LD extended on average 6–8 cM in cotton varieties and ~1–2 cM in landraces, providing evidence for potential associations for important traits. Significant relatedness and population structure were found in the germplasm. A mixed linear model (MLM) detected between 6 and 13% of SSRs associated with major fiber quality parameters in cotton. The study demonstrated the potential of association mapping in cotton for exploiting new sources of genetic variation.
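The idea of LD "declining" at a given r² threshold can be illustrated with a simple binning procedure: pairwise r² values are grouped by genetic distance, and the first distance bin whose mean r² falls below the threshold is reported. This is a deliberately simplified sketch with hypothetical data. Published analyses usually fit a decay curve rather than using bins, and nothing here reproduces the method of the study summarized above.

```python
def ld_decay_distance(pairs, threshold=0.1, bin_width=5.0):
    """Lower edge (cM) of the first distance bin with mean r^2 below threshold.

    pairs: iterable of (distance_cM, r2) for pairs of linked markers.
    Returns None if no bin falls below the threshold.
    """
    bins = {}
    for distance, r2 in pairs:
        index = int(distance // bin_width)
        bins.setdefault(index, []).append(r2)
    for index in sorted(bins):
        mean_r2 = sum(bins[index]) / len(bins[index])
        if mean_r2 < threshold:
            return index * bin_width
    return None


# Hypothetical marker pairs: (genetic distance in cM, pairwise r^2)
example = [(1, 0.50), (3, 0.40), (6, 0.20), (8, 0.15), (12, 0.05), (14, 0.08)]
print(ld_decay_distance(example, threshold=0.1))
print(ld_decay_distance(example, threshold=0.2))
```

A stricter (higher) r² threshold is reached at a shorter distance, which matches the pattern of smaller LD block estimates at r² = 0.2 than at r² = 0.1.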
The utility of the diploid Asiatic cotton species in breeding programs depends on understanding their ancestry and genetic relatedness. A collection of 56 G. arboreum L. accessions from nine different zones of Asia, Africa and Europe was assessed for eight fiber quality parameters (strength, lint color, lint percentage, micronaire, elongation, maturity, 50% span length and 2.5% span length) and genotyped with 98 microsatellites. The majority of the SSRs were polymorphic. Population structure analysis identified six major clusters of accessions representing distinct geographic regions. Marker-trait associations were assessed using the general linear model method. This study illustrated the potential of association mapping in diploid cotton, because a modest number of SSRs, combined with phenotypic data and rigorous statistical interpretation, identified interesting associations.
The use of marker-assisted breeding (MAB) in cotton improvement is limited compared with other commercial crops because of cotton's narrow genetic base and limited polymorphism. This situation calls for tagging, characterizing and using the naturally existing polymorphisms in Gossypium germplasm collections. Genetic diversity, population structure, the magnitude of LD and association mapping were explored for cotton fiber quality traits in a set of 335 G. hirsutum L. accessions grown in two distinct environments and surveyed with 202 SSRs. Genome-wide LD at r² ≥ 0.1 extended up to 25 cM in the tested accessions. At a threshold of r² ≥ 0.2, however, genome-wide LD was reduced to ~5–6 cM, highlighting the potential of association mapping studies in cotton. Preliminary findings suggested inbreeding, linkage, selection, genetic drift and population stratification as the key LD-generating factors in cotton. Using a unified MLM incorporating kinship, on average ~20 SSR markers were found to be associated with major fiber quality traits in the two environments. These significant associations were further confirmed by permutation-based multiple testing and, with respect to population structure, by applying a linear model and a structured association test. The identified associations provided strong evidence for the use of association mapping in cotton germplasm resources.
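Permutation testing of a marker-trait association can be sketched as below. This is a generic illustration, not the procedure of the study above: it tests one biallelic marker by shuffling phenotypes, ignores population structure and kinship (which an MLM would account for), and uses hypothetical function names and data.

```python
import random


def allele_effect(genotypes, phenotypes):
    """Absolute difference in mean phenotype between the two allele classes."""
    carriers = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    others = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    if not carriers or not others:
        raise ValueError("both allele classes must be present")
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def permutation_p_value(genotypes, phenotypes, n_permutations=999, seed=0):
    """Empirical p-value: share of shuffles with an effect >= the observed one."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = allele_effect(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real genotype-phenotype link
        if allele_effect(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: marker allele (0/1) and fiber length (mm) for ten lines
marker = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
fiber_length = [26, 27, 27, 28, 26, 30, 31, 29, 30, 31]
print(permutation_p_value(marker, fiber_length))
```

When many markers are tested, the same shuffles can be reused to record the largest effect across all markers in each permutation, which gives an experiment-wide threshold.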
In another report, association studies were undertaken to identify SSR markers linked with fiber traits in an exotic germplasm population derived from multiple crosses among tetraploid Gossypium species. After 12 generations of continuous selfing, 260 lines from the species polycross (SP) population were selected for evaluation of fiber-related traits in three environments. A total of 314 polymorphic fragments were amplified with 86 SSRs. The SSRs showing 6% allele frequency were evaluated for associations. A total of 59 markers showed significant (P < 0.05, 0.01 or 0.001) associations with six fiber traits. Structure analysis divided the population into six groups, with allelic frequencies ranging from 0.11 to 0.27. After correction for population structure and kinship, 39 of the 59 marker-trait associations remained significant. Population sub-structure was highly significant for boll weight. The results clearly indicated that marker-trait associations have promising potential for determining the genetics underlying interrelationships among fiber traits.
The discovery of valuable alleles for fiber quality traits, and of novel germplasm with high fiber quality, is important for accelerating breeding progress in lint quality. An association mapping study was conducted for fiber quality traits using 99 G. hirsutum L. accessions of diverse origins. A total of 97 polymorphic microsatellite markers were used, which detected 107 significant marker-trait associations for three fiber quality traits under three diverse environments. Of these, 70 significant marker-trait associations were detected in two or three environments, while 37 were identified in only one environment. Of the 70 marker-trait associations, 52% agreed with earlier reports, indicating the stability of these loci for the target traits. Further, a large number of elite alleles associated with two or three traits were also detected. These results point to the potential of using germplasm for mining elite alleles and using them in breeding to improve lint quality.
Knowledge of population structure and linkage disequilibrium in association mapping studies can help to minimize false positive associations. Association mapping of verticillium wilt resistance in cotton was reported in a panel of 158 cotton genotypes. The germplasm was genotyped with 212 markers covering the whole genome and phenotyped in a disease nursery and by greenhouse screening. In total, 480 alleles were identified, ranging from 2 to 4 alleles per locus. Two major groups and seven subgroups were identified through model-based analysis. The LD level of linked markers was considerably higher than that of unlinked markers, indicating that physical linkage strongly affected LD in this panel, and the LD level increased when the germplasm was divided into groups and subgroups. In total, 42 marker loci were associated with verticillium wilt resistance and were mapped on 15 chromosomes. Of these 42 marker loci, 10 were consistent with already known QTLs, while 32 were new. This study paved the way for marker-assisted selection of verticillium wilt resistance in cotton.
Baytar et al. used SSRs in a germplasm collection of 108 genotypes for association analysis and for analyzing genetic diversity, population stratification and linkage disequilibrium in upland cotton. A total of 967 alleles were used for population structure analysis, which differentiated the collection into four subgroups. Linkage disequilibrium decayed within 20–30 cM (r² ≤ 0.5), and associations with verticillium wilt resistance were tested using a general linear model and a mixed linear model. Overall, 26 markers on 14 chromosomes were associated with the disease at P ≤ 0.05, of which 8 were highly significant (P ≤ 0.01). The phenotypic variation explained by individual markers ranged from 3.2 to 8.2%. The authors suggested that the identified markers, being in accordance with earlier studies, may be a good resource for devising breeding strategies.
The genetic pattern of sympodial branch number, branch length, node of the first fruiting branch and some other branch characters was studied in an association panel of 39 genotypes and 178 F1s under separate ecological conditions, with a view to developing an ideotype for photosynthesis and yield. Twenty QTLs were found for these traits with MLM association analysis, which revealed that the traits had additive, dominance, epistatic and environmental effects, and the phenotypic variation showed that these traits are strongly influenced by genetic factors.
2.5. Fiber quality traits
World cotton consumption increased 2.9% from 2012–2013 (106.4 million bales) to 2013–2014 (109.5 million bales). Cotton is the prime fiber crop of the world and a cash crop of Pakistan, where it contributes significantly to foreign exchange earnings.
Globally, the demand for cotton products is projected to rise 102% from 2000 to 2030. This is likely to occur in a global environment where arable land is shrinking, water supplies are decreasing and the impact of worldwide climate change on cotton production is uncertain. The current rate of genetic gain for lint yield under normal plant densities ranges from 7.1 to 8.7 kg ha⁻¹ year⁻¹. Most of this genetic gain has been achieved with conventional breeding tools and, more recently, biotechnological tools. Conventional breeding alone cannot sustain genetic gain unless it is supplemented with modern genomic tools. Fortunately, cotton genomic research has gained momentum since the introduction of GM cotton. Other approaches, including marker-assisted selection (MAS), can accelerate breeding progress. Limited genetic variation for fiber quality traits within G. hirsutum L. substantially constrains quality improvement [45, 105]. Breeding for better quality lint is a primary objective of most cotton breeding programs worldwide. Conventional breeding has played a key role in improving the yield and fiber quality of upland cotton. The invention and advancement of molecular markers make it possible for plant breeders to improve economic and agronomic traits even more rapidly and precisely. Cotton fiber originates from the seed protodermal cells and is a renewable textile material and the major alternative to man-made fibers.
Cotton genotypes differ significantly in fiber quality traits and lint percentage. Different model systems can be used to discover new genes controlling cotton fiber. Cotton seed hair development strongly resembles Arabidopsis leaf trichome development. Data from different studies of cotton fiber-related genes support this resemblance, confirming the close relationship between cotton seed fibers and Arabidopsis trichomes [110–112].
Researchers working on cotton fiber development have shown that a high-density genetic map of cotton anchored with fiber-related genes can expedite MAS for improving fiber quality traits. With this objective, a genetic map was constructed using simple and complex sequence repeat markers on 183 recombinant inbred lines (RILs) derived from the interspecific cross TM1 (G. hirsutum L.)/Pima 3–79 (G. barbadense L.). The newly developed genetic map comprised 193 loci, including 121 new fiber loci not previously reported. These newly reported fiber loci were mapped to 19 chromosomes and 11 linkage groups spanning 1277 cM, approximately 27% of the total genome. Preliminary QTL analysis suggested that genes for fiber-related traits were present on chromosomes 2, 3, 15 and 18. These newly developed PCR-based SSRs derived from cotton fiber ESTs will open new doors for the development of a high-resolution integrated genetic map of cotton for structural and functional study of the genes that improve fiber quality.
Jamshed et al. screened 28,861 SSRs for polymorphism between the parents 0–153 and sGK9708 and used 851 polymorphic SSRs in a RIL population of 186 individuals to determine genomic regions associated with fiber quality. The genetic map spanned 4110 cM, with an average distance of 5.2 cM between markers, and covered about 93.2% of the G. hirsutum L. genome. Overall, they found 165 QTLs related to fiber, 90 of which were declared common QTLs and will be a good resource for cotton improvement.
In another study, researchers reported that mapping the genes involved in cotton fiber development will expedite the cloning and manipulation of these genes. In this study, seven known fiber mutants were mapped, four dominant (Li1, Li2, N1 and Fbl) and three recessive (n2, sma-4(ha) and sma-4(fz)), in six F2 populations comprising 124 or more plants each. The map positions of the mutants were consistent with previously assigned chromosomes except for n2, which mapped to the homoeolog of the chromosome previously reported. Three mutations (N1, Fbl, n2) with primary effects on fuzz fibers mapped near QTLs that affected lint fiber production in the same populations, which may reflect pleiotropic effects on both fiber types. However, only one mutant, Li1, mapped within the likelihood interval of any of the 191 previously reported lint fiber QTLs discovered in non-mutant crosses, suggesting that these mutations may occur in genes that played early roles in the evolution of cotton fiber and for which new allelic variants are quickly eliminated from improved germplasm. Comparing the genomes of cotton and Arabidopsis opens new avenues for accelerating the genetic dissection of cotton fiber development.
The genetics of fiber traits was determined in a 5 × 5 complete diallel cross. This study reported additive gene action and showed that the fiber quality of a cotton genotype is a composite of different fiber quality traits, the most important of which are fiber length, strength, fineness and uniformity index.
Bhatti examined associations for fiber quality traits in a global germplasm collection of upland cotton using SNPs. Thirty-two QTLs were found to be associated with different fiber traits such as fiber length, fiber strength, uniformity index, micronaire, maturity and fiber elongation.
2.5.1. Ginning out turn percentage
Lint is the lifeline of the textile industry; it earns foreign exchange and thus adds to a country's exchange reserves. Ginning out turn (GOT) percentage plays a key role in lint production. It is a useful index of the performance of a genotype and is defined as the percentage of lint obtained from a sample of seed cotton. Genotypes or cultivars with high GOT are preferred because of their high lint potential. An increase of approximately 1% in GOT would bring about a 3% increase in seed cotton yield. To meet the demand of the textile industry, breeders ought to breed for high lint producing genotypes and cultivars.
The genetic make-up of a cotton plant contributes more to lint than the levels of the macronutrients phosphorus and nitrogen [116, 117]. Motes percentage and GOT are strongly affected by genotype or cultivar and by location. An effect of sowing time on GOT has also been reported in multiple studies [119, 120]. A negative correlation between staple length and GOT has been observed, while GOT is directly associated with seed cotton yield. Non-additive gene action has been reported for GOT, despite its high heritability estimates. Application of farmyard manure can improve fiber yield by improving GOT.
A total of 25 QTLs conferring lint production were reported across the genome, but no chromosome contained more than three QTLs. Qin et al. reported 17 markers associated with lint percentage.
In upland cotton, 12 QTLs associated with lint percentage were observed at the whole-genome level. The authors also suggested that the QTL Gh_A02G1268 is involved in fiber development and that these QTLs can be used for fiber improvement.
Zhang et al. developed chromosome introgression lines using TM-1, TX-256 and TX-1046 and observed five QTLs related to ginning out turn, one of which was stable across multiple environments.
Iqbal and Rahman screened a germplasm collection of 185 genotypes with 95 polymorphic SSRs over three years and at different locations to assess lint percentage. They found that IR-NIBGE-3701 had the maximum GOT percentage, 43.63%. Population structure was examined using STRUCTURE, the unweighted pair group method with arithmetic mean (UPGMA) and principal component analysis, and four clusters were identified. In total, 47 genotypes were found to share common ancestry, and the distance among subgroups ranged from 0.058 to 0.130. Overall, 75 marker-trait associations were detected for fiber quality traits, of which 18 were related to GOT percentage, and MGHES-51 was associated with all traits. They concluded that such QTLs can be used in molecular breeding as a tool to track all quality traits.
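The definition of GOT given above amounts to a simple ratio. A minimal Python sketch follows, assuming lint and seed cotton weights are recorded in the same unit; the function name and the sample weights are hypothetical and not taken from any cited study.

```python
def ginning_out_turn(lint_weight: float, seed_cotton_weight: float) -> float:
    """Return GOT: lint weight as a percentage of the seed cotton sample weight.

    Both weights must be in the same unit (for example grams).
    """
    if seed_cotton_weight <= 0:
        raise ValueError("seed cotton weight must be positive")
    if not 0 <= lint_weight <= seed_cotton_weight:
        raise ValueError("lint weight must lie between 0 and the sample weight")
    return 100.0 * lint_weight / seed_cotton_weight


# Hypothetical sample: 40 g of lint ginned from 100 g of seed cotton.
print(f"GOT = {ginning_out_turn(40.0, 100.0):.1f}%")
```

The range check rejects impossible inputs (more lint than seed cotton), which usually indicates a data entry error rather than a real measurement.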
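What a marker-trait association means in practice can be illustrated with the simplest possible test, a single-marker regression of the trait on allele dosage. The sketch below is an illustration only: it is not the model used in the studies cited here, it ignores the population structure that those studies account for (for example with STRUCTURE or principal components), and the function name and data are hypothetical.

```python
from typing import Sequence, Tuple


def single_marker_regression(dosages: Sequence[float],
                             trait: Sequence[float]) -> Tuple[float, float]:
    """Regress a trait on allele dosage (0, 1 or 2 copies of one allele).

    Returns (slope, r_squared): the estimated effect per allele copy and
    the proportion of phenotypic variance explained by the marker.
    """
    n = len(dosages)
    if n != len(trait) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic marker or constant trait")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Hypothetical data: six genotypes scored at one marker allele, GOT in percent.
dosages = [0, 0, 1, 1, 2, 2]
got_percent = [38.0, 38.5, 40.0, 39.5, 42.0, 41.5]
slope, r2 = single_marker_regression(dosages, got_percent)
print(f"effect per allele copy: {slope:.2f}, variance explained: {r2:.2f}")
```

In a real analysis each marker would be tested in turn, with a significance test and a correction for multiple testing, and covariates for population structure and kinship would be added to reduce false positives.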
2.5.2. Micronaire value
Fiber fineness and maturity are measured together as the micronaire value, which combines the two properties [129, 130]. Micronaire is a measure of the resistance to air flow through plugs of cotton, wool, rayon and glass wool fibers [11, 131, 132]. The optimal range for micronaire is 3.8–4.5. Lint with high micronaire (> 4.5) is considered coarse, which results in fewer fibers in the yarn cross-section. Ultimately, coarse fiber makes relatively weaker yarn. This is one reason why high micronaire cotton is less preferred by spinners, owing to reduced fiber bundle strength. A low micronaire value (< 3.5 μg/inch of lint) is also undesirable because it reflects immature fiber, which may be prone to dye uptake problems, breakage and nep formation.
Biotic and abiotic stresses have a major impact on micronaire value. For example, temperature, plant defoliation [133–135], radiation [136, 137] and water stress significantly affect the micronaire value. Understanding the extent to which these factors affect micronaire is therefore important for choosing cultural practices that produce cotton fiber with a desirable micronaire value.
Various instruments have been developed for accurate measurement of micronaire, such as the areal meter, the Shirley Fineness Maturity Tester (FMT), originally developed by the Shirley Institute and sold since 1998 as the WIRA Electronic Cotton Fineness and Maturity Meter [140, 141], and the Uster Technologies Advanced Fibre Information System (AFIS), which provides a module for direct measurement of individual fiber diameter and gives the degree of fineness [142, 143]. Nowadays, the latest instruments for micronaire measurement, Cottonscan and Siromat, are commercially available in one instrument called Cottonscope.
Fiber quality was assessed in a population developed from contrasting parents, and SSRs were used to determine associations with economic traits. The authors found 131 QTLs for fiber quality, verified them in another RIL population and deduced that 77 QTLs agreed with earlier findings while 54 were unique and will accelerate MAS in cotton.
Said et al. identified 234 QTLs for micronaire value, spread over the entire cotton genome. Most of these micronaire QTLs were on chromosomes 5, 24 and 25.
Zhang et al. mapped QTLs related to fiber length, micronaire and strength using SSRs and SNPs in a RIL population of 196 individuals developed from 0–153 and sGK9708. They identified 25 QTLs on chromosome 25, 17 of which were common to at least two locations. They also detected a specific genomic region for micronaire, COT002-CRI-CRI-SNP68652, which should contribute substantially to fiber quality improvement.
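The thresholds quoted above can be turned into a small screening helper. The sketch below is a minimal Python illustration; the category labels and the function name are hypothetical, and the handling of values between 3.5 and 3.8, which the text does not classify, is an assumption.

```python
def classify_micronaire(value: float) -> str:
    """Classify a micronaire reading using the thresholds quoted in the text."""
    if value <= 0:
        raise ValueError("micronaire must be positive")
    if value < 3.5:
        return "low"            # immature fiber: dye uptake problems, neps
    if value < 3.8:
        return "below optimal"  # assumption: band not classified in the text
    if value <= 4.5:
        return "optimal"        # 3.8 to 4.5 inclusive
    return "high"               # coarse fiber, weaker yarn


# Hypothetical readings for four lint samples.
for reading in (3.2, 3.6, 4.1, 4.9):
    print(reading, classify_micronaire(reading))
```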
2.5.3. Staple length
Staple length is the average length of the longer half of the fibers. Improving it through breeding, including the use of modern genetic tools, is the only effective approach. Heritability is high, ranging from 0.52 to 0.90 for fiber bundle strength and from 0.46 to 0.79 for staple length [146–149]. Previous studies have shown that staple length is strongly influenced by genotype. Present-day cultivated varieties show little variation in staple length, which hampers future breeding progress because of the low genetic diversity available for the trait [25, 150, 151].
A total of 151 staple length QTLs were reported across the genome, on all chromosomes except C2 and C22. Twelve marker-trait associations for staple length were reported in an earlier study. Cuming reported four QTLs for staple length in the genetic mapping of an F2 population in green colored cotton.
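The definition of staple length as the mean of the longer half of the fibers can be sketched directly from a list of individual fiber lengths. This is a simplified, count-based illustration; fiber testing instruments may derive the value differently, and the function name, the rounding rule for odd sample sizes and the sample data are assumptions.

```python
from typing import Sequence


def upper_half_mean_length(lengths: Sequence[float]) -> float:
    """Mean length of the longer half of the fibers in a sample.

    For an odd number of fibers the middle fiber is counted in the
    longer half (an assumption made for this sketch).
    """
    if not lengths:
        raise ValueError("at least one fiber length is required")
    ordered = sorted(lengths, reverse=True)   # longest fibers first
    half = (len(ordered) + 1) // 2            # size of the longer half
    return sum(ordered[:half]) / half


# Hypothetical fiber lengths in millimetres.
sample_mm = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
print(f"upper half mean length: {upper_half_mean_length(sample_mm):.1f} mm")
```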
There is a dire need to broaden the genetic base of cultivated upland cotton for continuous genetic advancement of seed cotton yield and fiber-related traits through introgressing the alleles from G. barbadense L.
Tan et al. used a RIL population and screened SSRs to identify QTLs related to fiber quality. In total, 59 QTLs were found: 15, 10, 9, 10 and 15 for fiber length, uniformity, strength, elongation and micronaire, respectively. They suggested that these QTLs can be used for developing upland cotton cultivars.
An association analysis for fiber quality was conducted at the genome-wide level using the CottonSNP63K array in a germplasm collection of 503 genotypes. The population was differentiated into three subgroups on the basis of 11,975 SNPs, and the genetic structure was found not to follow geographic origin. The authors observed 160 QTLs associated with yield and yield components, involving 324 SNPs.
2.5.4. Fiber bundle strength
Fiber tensile properties include fiber bundle strength and elongation. HVI-based tensile properties are user friendly and provide average estimates for thousands of fibers. Single fiber tensile testing is tedious and thus not routinely practiced, but it has been observed to give better estimates of intrinsic fiber tensile properties. Fiber bundle strength has a greater impact on modern spinning technology than staple length and micronaire value. The negative correlation between fiber bundle strength and lint yield is a major bottleneck in upland cotton breeding programs [158–160]. It means that an increase in fiber bundle strength would not be possible without sacrificing yield.
Good quality fiber that contributes to the production of stronger yarn is highly desirable and has a major impact on efficient fabric production [161, 162]. Fibers with optimal micronaire value, long staple length and high fiber bundle strength are much better suited to textile processing methods, while fibers of short staple length give lower yarn strength, which reduces the efficiency of spinning and ultimately decreases yarn utility. The textile industry requires yarn of high average strength so that it can withstand harsh spinning operations [163, 164].
Said et al. reported 132 fiber bundle strength QTLs, spread over the entire genome with the exception of chromosome 17, which contained none. A total of 12 associations between SSR markers and fiber bundle strength were reported in association mapping studies of G. hirsutum L. collections. The exploration of novel genes in wild germplasm and their introgression into adapted cultivars would pave the way for the genetic improvement of seed cotton yield and fiber bundle strength in G. hirsutum L.
2.5.5. Uniformity index
The fiber uniformity index (uniformity ratio) is the ratio of mean length to upper half mean length (UHML), expressed as a percentage. The uniformity index plays a major role in the cotton spinning industry. It is highly sensitive to environmental changes, particularly in relation to micronaire value [165, 166]. It is therefore highly desirable to improve fiber-related traits, including the uniformity index, to meet the needs of the textile industry, because low length uniformity and high short fiber content are correlated with more manufacturing waste and lower spinning efficiency during yarn production.
Multiple studies have reported the inter-relationship between QTLs associated with three fiber length parameters: average staple length, length uniformity and short fiber content. Many researchers have reported QTLs for length uniformity, which are useful for producing uniform cotton. In total, 91 QTLs were reported across the genome, with the exception of chromosome 11, which contained no fiber uniformity index QTL. Cuming identified two fiber uniformity index QTLs in an F2 intraspecific population of G. hirsutum L. using single-marker analysis. Successes achieved in cotton through genetic mapping are summarized in Table 1.
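The uniformity index can be computed from the two length measurements it relates. The sketch below assumes the usual convention that the index is the mean length expressed as a percentage of the UHML; the function name and the sample values are hypothetical.

```python
def uniformity_index(mean_length: float, uhml: float) -> float:
    """Return mean length as a percentage of upper half mean length (UHML).

    Both lengths must be in the same unit (for example millimetres).
    """
    if uhml <= 0:
        raise ValueError("UHML must be positive")
    if not 0 < mean_length <= uhml:
        raise ValueError("mean length must be positive and not exceed UHML")
    return 100.0 * mean_length / uhml


# Hypothetical sample: mean length 24 mm, UHML 30 mm.
print(f"uniformity index = {uniformity_index(24.0, 30.0):.1f}%")
```

A sample in which every fiber had the same length would score 100; the larger the share of short fibers, the lower the index.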
| Species | Population type | Population size | No. of markers | Marker type | Mapped traits | Ref. |
|---|---|---|---|---|---|---|
| G. hirsutum | Diverse accessions | 285 | 95 | SSRs | Fiber quality | |
| | Diverse accessions | 335 | 202 | SSRs | Fiber quality | |
| | Exotic germplasm | 260 | 86 | SSRs | Fiber traits | |
| G. arboreum | Accessions (9 regions) | 56 | 98 | SSRs | Fiber quality | |
| G. hirsutum | Accessions (global) | 1000 | 100 | SSRs | Yield and fiber | |
| | Wild races and variety accessions | 37 | 23 | SSRs | Fiber traits | |
| Cultivated sp. and wild | Accessions | 8193 | 197 | SSRs | Verticillium and fiber | |
| | Germplasm | 323 | 106 | SSRs | Drought and salt | |
| | Cultivars | 356 | 381 | SSRs | Yield and yield components | |
| G. hirsutum | Diverse accessions | 99 | 97 | SSRs | Fiber quality | |
| G. hirsutum | Variety germplasm | 109 | 98 | SSRs | Salinity | |
| G. hirsutum | Elite cotton cultivars | 180 | 58 | SSRs | Oil and protein | |
| G. hirsutum | Elite germplasm | 158 | 212 | SSRs | Verticillium resistance | |
| G. hirsutum | Cultivars | 134 | 74 | SSRs | Salt tolerance | |
| Upland cotton | Accessions | 355 | 81,675 | SNPs | Fiber traits | |
| | Accessions | 355 and 185 | 81,675 | SNPs | Early maturity | |
| | Inbred lines (native and exotic) | 503 | 179 | SSRs | Fiber quality | |
| | Cotton genotypes | 90 | 95 | SSRs | Drought tolerance | |
| | Germplasm | 200 | 3786 | SNPs | Yield and fiber quality | |
| | Accessions | 395 | 103 and 26,324 | SSRs and SNPs | Seed protein, oil and traits | |
| | Landraces and cultivars | 318 | | | Lint yield and fiber quality | |
| | Diverse accessions | 719 | 10,511 | SNPs | Fiber quality | |
As cotton is the most common source of natural fiber worldwide, there is an urgent need to improve its lint yield and quality by using diverse germplasm resources and high-throughput technologies. Selection for phenotypic traits can be made more efficient using DNA-based markers, and genetic mapping in cotton can lay the foundation for future breeding strategies. Family-based genetic mapping has long been used to locate desirable traits in cotton; however, it may not be as reliable as once thought, since some regions associated with a trait of interest may also be strongly influenced by climatic conditions. Breeders have therefore become more inclined to use the variation hidden in permanent populations such as accessions and landraces in the gene pool. The associations between economic characters and markers discovered using such resources provide genetic mappers with valuable information, since there would be no recombination between the character and the marker. Advances in sequencing technologies will require automation, and efficient automation will make the use of highly informative markers crucial. SNPs are the markers of choice for such genomic studies because they can be developed by several methods. Genotyping-by-sequencing (GBS) is one such method and is distinctive in that it can discover and genotype markers in large populations simultaneously and reproducibly. Taken together, recent developments suggest that the incorporation of various genomic approaches and genotyping will pave the way to increased fiber production and contribute to food security at the global level.