Cotton, a major source of natural fiber and vegetable oil, belongs to the genus Gossypium. The word cotton is derived from the Arabic word “quotn”, and Gossypium from the Arabic word “goz”. The genus Gossypium comprises ~50 species. Five of these are tetraploids (2n = 4x = 52) that evolved about 1–2 million years ago [4, 5, 6] through hybridization of two diploid species whose genomes closely resemble the “A” genome (A2 (G. arboreum) and A1 (G. herbaceum)) and the “D” genome (G. raimondii). Among the tetraploids, G. hirsutum (AD1) and G. barbadense (AD2) are cultivated, while G. tomentosum (AD3), G. mustelinum (AD4), and G. darwinii (AD5) are wild species endemic to Hawaii, Brazil, and the Galapagos Islands, respectively. The remaining 45 species are diploids, each containing one of eight different genomes (A–G and K).
Significant progress toward increasing lint yield and improving lint quality has been made during the last seven decades using conventional and nonconventional breeding approaches. Transgenic cotton conferring resistance to chewing insect pests as well as tolerance to glyphosate not only reduces the cost of production but also increases the yield per unit area in developing countries. Cultivation of transgenic cotton has supported integrated pest management (IPM) programs and reduced the toxic impact of pesticides on the environment. The cultivation of Bt cotton has also been found safe for nontarget organisms. At present, cotton is cultivated on >30 million hectares in 80 countries [7, 8]. The world average cotton yield has fluctuated over the last 3 years, and a significant reduction in yield (~9% in 2015–2016 relative to 2014–2015) was observed. A number of factors, including changing climate, resistance development in target insect pests and weeds, increased heat and drought stress, excessive rains and waterlogging, and the evolution of new disease strains, have contributed to the decline in yield [10, 11]. Another menace is the high input demand for harvesting acceptable yields from Bt cotton. In addition, synthetic fiber is outcompeting natural cotton fiber owing to its lower cost in the international market [3, 10, 11]. Thus, extraordinary efforts are needed to sustain cotton production through 2050.
2. Possibilities toward achieving sustainable cotton production
Bringing sustainability to cotton production is a major challenge for resource-poor farming communities, most of which live in developing countries. Besides releasing cotton varieties that express high yield potential in high-input environments, which only progressive farmers can afford, there is a need to open R&D fronts for developing cotton varieties that can withstand the impact of changing climate and produce sustainable yields in low-input or optimum-input environments. Deployment of high-tech genomic tools in breeding is one approach to initiating breeding by design aimed at the development of resilient cotton cultivars [12, 13].
Efforts to achieve maximum yield potential under optimum-input environments have been unsuccessful largely because of the lack of genetic diversity among the genetic resources used to develop new cotton varieties. A narrow genetic base not only limits future breeding progress but also makes the crop vulnerable to insect pests, diseases, and the negative impact of changing climate. For enhancing the genetic diversity of newly developed cultivars, underutilized genetic resources, landraces, obsolete varieties, old accessions, etc. are golden assets from which cotton breeders can mine genes conferring novel traits. Cotton breeders usually avoid using undomesticated germplasm because linkage drag of undesirable traits present in the wild germplasm hinders breeding progress. For success with the conventional breeding approach, selection of diverse parent genotypes, an appropriate population size, and accurate testing of newly developed lines in a given ecosystem are prerequisites for developing cotton varieties with improved genetics.
After the sequencing of the Arabidopsis genome, a number of crop species, including the cultivated cotton species, have been sequenced. The progenitor species of the tetraploid cultivated cotton, i.e., the A-genome and D-genome species, have also been sequenced. The next step is to understand the extent of genetic diversity residing in these genomes; associating sequence variation in genes with phenotypic diversity or important traits (quality, productivity, resistance to biotic and abiotic stresses, etc.) remains a major challenge. In this regard, massive re-sequencing of thousands of cotton genotypes/accessions, within and across species, would help in elucidating the gene functions underlying important traits as well as the footprints of selection. At present, the function of many cotton genes can also be deduced by comparison with similar genes in Arabidopsis thaliana, which shared a common ancestor with cotton ~83–86 million years ago. Thus, many genes that have counterparts in Arabidopsis can be characterized with varying degrees of success. However, genes conferring traits typical of cotton will have to be characterized by deploying various molecular assays, including finding DNA markers associated with the traits using nested-association mapping (NAM) populations and genome-wide association study (GWAS) analysis, or by using other forward and reverse genetic approaches; virus-induced gene silencing (VIGS) and the CRISPR-Cas system may have a role in the future for assigning functions to different genes [12, 17].
Another important genetic resource is available in the form of genetic maps constructed from several mapping populations developed through interspecific (G. hirsutum/G. barbadense) and intraspecific (G. hirsutum/G. hirsutum) crosses. This information may help in developing cotton cultivars by design. Unlike in rice, deployment of DNA markers in routine cotton breeding is limited to simple traits; marker-based selection for complex traits is yet to be realized even in advanced countries. A large body of mapping data nevertheless exists: ~1000 QTLs for ~40 different traits, over 49 genetic maps, and about 25,000 genetic markers have been reported in different research articles. This information can be used to breed “designer cotton varieties”.
In developing countries, cotton breeders are not using diagnostic markers linked with simple traits because of a lack of resources and trained manpower. It is extremely important to identify DNA markers tightly linked to complex traits, even though the genetic base of the parent genotypes is narrow. In this regard, collaborative efforts among a few labs to search for new DNA markers have been initiated by deploying approaches such as nested-association mapping and association mapping. Such coordinated efforts can succeed if the goal is common, for example, breeding for resistance to lepidopteran insect pests, drought, and salinity. However, a few research issues are typical of a specific region. For example, cotton leaf curl disease infects cotton in Pakistan and is a potential threat to all other cotton-growing countries where whitefly is prevalent. Collaborative and coordinated research programs, including sharing of cotton germplasm for screening in hotspot regions, would help in tackling such issues. USDA screened >5000 cotton accessions in hotspot regions of Pakistan, and more than a dozen asymptomatic cotton genotypes were identified [11, 13, 19]. This information and genetic material are useful for Pakistan and also for the whole cotton-growing community, as the threat is spreading to other countries (it has been reported in India and China). Sharing of this kind helps sustain cotton production. Between 53,000 and 63,946 cotton germplasm accessions are preserved in gene banks of different cotton-growing countries. Sharing these resources is extremely important for characterizing the extent of phenotypic and genotypic diversity present in the genus Gossypium. A high-throughput phenotyping platform is required to study traits precisely in a large number of genotypes/accessions in the shortest possible time.
The use of such automated technologies will expedite progress toward initiating marker-assisted selection as well as elucidating the genetic circuits of simple and complex traits. Duplicated accessions can be discarded to avoid redundancy. Selected genotypes/accessions reflecting the maximum phenotypic diversity can then be explored using genomic tools including SSR-based characterization, genotyping by sequencing, and re-sequencing, together with association mapping analyses such as nested-association mapping (NAM) and genome-wide association study (GWAS) methods.
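As a rough illustration of the redundancy check mentioned above, the following minimal sketch groups accessions that share an identical marker profile. The accession names and SSR allele calls are hypothetical, and exact matching is a simplifying assumption: real genotype data contain missing calls and scoring errors, so a similarity threshold would normally be used instead.

```python
from collections import defaultdict


def group_duplicates(profiles):
    """Group accessions whose marker profiles are exactly identical.

    profiles: dict mapping an accession name to a sequence of allele calls,
    one call per marker, in the same marker order for every accession.
    Returns a list of groups (sorted name lists) with more than one member.
    """
    groups = defaultdict(list)
    for accession, profile in profiles.items():
        # The full allele-call tuple acts as the accession's fingerprint.
        groups[tuple(profile)].append(accession)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical SSR allele sizes (bp) at three markers; not real data.
example_profiles = {
    "ACC-001": (152, 210, 178),
    "ACC-002": (152, 210, 178),  # same fingerprint as ACC-001
    "ACC-003": (156, 210, 178),
    "ACC-004": (152, 214, 180),
}

duplicates = group_duplicates(example_profiles)
print(duplicates)
```

In this toy data set only ACC-001 and ACC-002 share a fingerprint, so one of the two could be flagged as a candidate duplicate for closer inspection before anything is discarded.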
Another important genetic resource is the cultivated diploid species G. arboreum and G. herbaceum. These species evolved in drought-prone regions and are still cultivated on marginal lands of the Indo-Pak region [21, 22]. They have relatively greater root biomass than the cultivated G. hirsutum and thus harbor important genes for high root biomass, including a deep root system that scavenges water from deep soil layers. These two species are also resistant to cotton leaf curl disease and other biotic stresses. Although the genomes of G. arboreum and G. herbaceum have been sequenced, only a limited number of genetic maps based on diploid interspecific populations have been reported. Identifying QTLs associated with traits typical of these diploid species would be another priority research area, as it would enable DNA-based selection procedures that were previously not possible. Negative linkage drag can be avoided in desirable plants of segregating and/or backcross populations by monitoring the introgression of desirable alleles/genes with the associated DNA markers. In most cases, a backcross breeding scheme is deployed to recover the genome of the recurrent genotype while retaining the desirable alleles of the donor genotype. In such a scheme, extensive backcrossing followed by identification of plants containing the desirable alleles using DNA markers would be an important breeding procedure for developing improved cotton germplasm.
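The recovery of the recurrent genome during backcrossing can be made concrete with the standard textbook expectation: on average, the F1 carries half of the recurrent-parent genome and each backcross halves the remaining donor share, giving 1 − (1/2)^(n+1) after n backcrosses. The sketch below is illustrative only; it assumes no selection and ignores linkage, so donor segments around a selected gene (the linkage drag discussed above) persist longer than this average suggests. The function names are hypothetical.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected average recurrent-parent genome fraction after n backcrosses.

    n = 0 is the F1 (fraction 1/2). Assumes no selection and no linkage.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expectation reaches the target."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    n = 0
    while expected_recurrent_fraction(n) < target_fraction:
        n += 1
    return n


for n in range(0, 7):
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: {expected_recurrent_fraction(n):.4f}")
```

Under these assumptions, about six backcrosses are needed before the expected recurrent-parent share exceeds 99%, which is why marker-based selection of individuals with above-average recovery is attractive for shortening the scheme.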
Mutagens have been used to break undesirable linkages between traits and to develop new cotton varieties. Radiation (gamma rays) was used extensively during the early 1960s to induce random changes in chromosome structure. Usually, soaked seed and/or pollen grains are exposed to physical mutagens. These mutagens can be applied to F1/F2 seed developed by crossing two different species, and the best mutant plants can be identified by surveying with DNA markers originating from the adapted species. This technique was largely found not to be worthwhile because it induces several deleterious mutations and unwanted linkages in the newly developed mutated genotypes.
Chemical mutagenesis, in which cottonseed is exposed to a known mutagen such as ethyl methanesulfonate (EMS), is another approach for inducing mutations. It would not only help in assigning functions to different genes but would also add new alleles through the addition, deletion, or replacement of nucleotides in genes, thereby enhancing genetic diversity, a potential buffer against epidemics of insect pests and diseases.
Hybrid vigor, the increased growth of a hybrid over its parent genotypes, has been exploited in corn, resulting in a multifold increase in production worldwide. Such efforts were also extended to vegetable and arable crops including cotton but did not gain the popularity seen in corn. In a few parts of the world, however, cultivation of hybrid cotton has increased lint production. Hybrid cotton is worth exploiting if the hybrid outyields the open-pollinated variety (OPV) by 30%, so that farmers recoup the investment made in procuring costly hybrid seed. In this regard, extensive studies are needed to identify the best combiners. Hybrid cotton breeding is also handicapped by the nonavailability of reliable genetic as well as mechanical means of removing anthers/pollen. Farmers may produce enough hybrid seed to meet their own demand for sowing cotton hybrids instead of OPVs (farmers in a few provinces of China follow this practice) and thus save the money otherwise spent on buying hybrid seed. Training in the production of hybrid cottonseed may be given to farmers by public sector organizations [11, 22].
In most developing countries, for example, in Pakistan, farmers are paid on the total weight of seed cotton (unginned cotton) irrespective of its lint potential, which encourages them to grow varieties producing more seed cotton per hectare rather than more lint per hectare. Wide variation in lint percentage (35–50%) has been found in the cotton germplasm. Thus, emphasis should be placed on breeding for >45% lint recovery, which would add a couple of million bales to total cotton production.
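The effect of lint percentage on lint yield is simple arithmetic: lint yield equals seed cotton yield multiplied by the lint fraction. The sketch below illustrates this with a hypothetical seed cotton yield of 2000 kg/ha and a hypothetical baseline lint percentage of 38%; only the 35–50% range and the >45% target come from the text above.

```python
def lint_yield(seed_cotton_kg_per_ha, lint_percent):
    """Lint yield (kg/ha) from seed cotton yield and lint percentage."""
    if not 0 <= lint_percent <= 100:
        raise ValueError("lint_percent must be between 0 and 100")
    return seed_cotton_kg_per_ha * lint_percent / 100.0


# Hypothetical figures for illustration only.
seed_cotton = 2000.0   # kg of seed cotton per hectare (assumed)
baseline_pct = 38.0    # assumed lint percentage of a current variety
target_pct = 45.0      # breeding target named in the text

baseline = lint_yield(seed_cotton, baseline_pct)
improved = lint_yield(seed_cotton, target_pct)
gain = improved - baseline

print(f"baseline lint: {baseline:.0f} kg/ha")
print(f"improved lint: {improved:.0f} kg/ha")
print(f"gain: {gain:.0f} kg/ha ({100 * gain / baseline:.1f}% more lint)")
```

With the same weight of seed cotton, and hence the same payment under a weight-only pricing system, the higher lint percentage delivers more lint per hectare, which is the incentive gap the paragraph describes.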
Another breeding objective is to improve lint quality, which has always remained a major challenge for breeders because the trait is controlled by thousands of genes. Conventional breeding has achieved limited success, and further improvement is handicapped by the lack of compatible genetic resources. The success of resolving complex traits has been demonstrated in model plant species using modern genomic tools. Currently, efforts are underway to clone QTLs involved in conferring these complex traits in cotton. Genome sequence information of the cultivated species has also unraveled the genetics of fiber initiation, which is relatively simpler than that of the fiber elongation traits. Once the key genes defining lint characters are identified, they can be used to engineer the pathways of diploid cotton species such as G. arboreum, so that fiber production can be sustained in low-input environments. Another immediate priority is to initiate projects for “re-sequencing” representative genotypes of the closely related cotton species using next-generation sequencing approaches. This would be a way forward for associating genetic variations with traits using bioinformatics tools and would add synergy to efforts for sustaining cotton production. These studies would also help in identifying new tissue-specific promoters which, unlike constitutive promoters, may save plant energy.
Protection against insect pests and diseases has been engineered in different crop species including cotton using novel genes (Cry genes) isolated from a soil bacterium. The success of Bt cotton in protecting the crop from bollworms has been demonstrated since its release in 1996. Now, resistance conferred by the Cry1Ac gene has weakened. Pink bollworm infestation on GM cotton in India has also been reported (although scientific evidence is lacking). The potential of minor pests to emerge as major pests is another threat to cotton sustainability. For example, before the cultivation of Bt cotton in Pakistan, mealy bug and dusky bug had never been problematic because they were indirectly controlled by insecticides applied against lepidopteran insect pests; on Bt cotton, however, these two insects have caused infestations in the recent past. This scenario may also arise in other countries. The situation can be mitigated by educating cotton farmers to monitor and control newly emerging pests through measures including the application of insecticides. To date, GM cotton containing only a few genes (largely of the Cry series and genes conferring resistance to glyphosate) has been commercialized [23, 24]. Thus, new genes and/or their transcription factors conferring tolerance to biotic and abiotic stresses from other plant sources, including wild species, can be characterized and then introduced into cotton. These genes would have a high chance of acceptance by the end user. Transgene expression can be improved by designing an effective gene cassette with efficient promoters, followed by identification of the best event with high gene expression.
Identification of new marker genes (e.g., genes encoding fluorescent proteins) to replace conventional marker genes (e.g., antibiotic-resistance genes) would be useful for making GM foods safer. These genes should be tested for an extended period on a number of different model organisms so that the conclusions are acceptable to the public. The protection offered by new genes from alien backgrounds may be supplemented, and the development of resistance delayed, by adopting other strategies, including the development of short-duration varieties and of characters offering a defense umbrella (small leaves, pubescence, light green leaves, etc.) to avoid significant damage by insect pests and diseases. In addition, research on other aspects such as chemical ecology would add synergy to the eco-friendly management of insect pests and diseases and can thus enhance yield.
Genome editing through CRISPR-Cas is an emerging tool that can be used to edit genes to improve or silence their expression. For example, gossypol-free cottonseed can be produced by silencing the genes responsible for gossypol production in seed. A major advantage of this approach is that the function of a gene can be characterized and new cultivars can be developed without introgressing a foreign gene; hence, the technology should be acceptable to countries that are skeptical about GM technology [11, 12, 15, 17]. In summary, the adoption of high-tech management practices, utilization of untapped genetic resources in breeding, cultivation of cotton varieties with excellent genetics, monitoring of the risk and efficacy of transgenes in the ecosystem, and a continued search for new genetic resources would help in sustaining cotton production [11, 12].
Cotton breeding, largely based on recombination genetics, has paved the way for cotton varieties that are heat tolerant, early maturing, and high yielding with improved fiber traits. Historically, cotton breeding has revolved around the development of new cultivars, maintenance of varieties, and their seed production. Phenotype-based selection procedures are now being replaced by DNA-based selection systems (marker-assisted selection), which make it possible to develop cotton varieties with superior genetics in the shortest possible time. The discovery of an enormous number of sequence variations has been greatly facilitated by the advent of next-generation sequencing tools, and functions have been assigned to these variations using advanced bioinformatic tools. These discoveries and the effectiveness of the technologies have made it possible to initiate breeding by design using DNA markers as well as precise genome editing. In contrast to conventional breeding practices, genes (e.g., Cry1Ac) have been transferred from alien sources to improve resilience to chewing insect pests and herbicides. In the present book, research efforts representing a wide range of endeavors being undertaken in different parts of the world are comprehensively discussed. Both editors (Drs. Mehboob-ur-Rahman and Yusuf Zafar) ensured the compilation of high-quality research, opinions, and progress made toward understanding the cotton genome and applying that knowledge to develop resilient cotton varieties, a way to sustain cotton production beyond 2050. Finally, the editors acknowledge the efforts and hard work of the authors in compiling their respective chapters.