Open access peer-reviewed chapter

Bioinformatics Tools and Genomic Resources Available in Understanding the Structure and Function of Gossypium

Written By

Gugulothu Baloji, Lali Lingfa and Shivaji Banoth

Submitted: 18 December 2021 Reviewed: 22 December 2021 Published: 05 October 2022

DOI: 10.5772/intechopen.102355

From the Edited Volume

Cotton

Edited by Ibrokhim Y. Abdurakhmonov

Abstract

Gossypium spp. (cotton) is the world’s most valuable natural fiber crop. The diversity of Gossypium species makes the genus a good model for studying polyploid evolution and domestication. Over the past decade, functional genomics has shifted from a theoretical idea to a well-established scientific discipline. Cotton functional genomics has the potential to expand our understanding of fundamental plant biology, allowing genetic resources to be used more effectively to enhance cotton fiber quality and yield, and genetic data to be used to improve germplasm. This chapter reviews the latest techniques and resources that developments in cotton functional genomics have made available for developing elite cotton genotypes and determining structure. Doing so requires a working understanding of bioinformatics resources, including databases, software solutions and analytical tools. In addition to GenBank and cotton-specific databases such as CottonGen, a wide range of tools for accessing and analyzing genetic and genomic information is discussed. The chapter covers the forms of genetic and genomic data now accessible to the cotton community and the fundamental bioinformatics resources for cotton species, which researchers can use to better understand cotton’s structure and function.

Keywords

  • Gossypium
  • genomics
  • bioinformatics
  • structure
  • gene sequencing

1. Introduction

Cotton is the world’s major natural fiber crop and supplies fiber to one of the largest and most significant industrial sectors, textiles. Its global economic impact is around $500 billion every year. Cotton genetic resources, which comprise germplasm from more than 50 distinct cotton species, have allowed researchers to investigate the development of lint fibers and the role of polyploidy in enhancing lint output. They also help researchers understand how domesticated cotton evolved from its wild relatives. Bio-based alternatives to petroleum-based chemicals are being carefully researched, as are ways of assessing the degree of genetic diversity and exploiting it to increase cotton yield and lint quality [1]. Cotton (Gossypium hirsutum) is valued as a renewable fiber resource and is a cornerstone of the global economy. The plant also serves as a model for a range of biological investigations, including polyploidization [2], single-celled biological processes and genome evolution [3]. Decoding cotton’s genome may improve our understanding of polyploidy and genome size differences within the genus Gossypium, as well as of cotton’s evolution [4].

This chapter summarizes the technological advances achieved during the previous two decades. For example, progress has been made in understanding variation in a range of physiological, biochemical, morphological and genetic features, all of which are explored in this volume. As with other significant crop species, many cutting-edge genomic methods have been used to study the genomes of the genus Gossypium. This work has contributed significantly to laying the groundwork for maintaining lint production around the globe [5], although much remains to be investigated [6]. Cotton’s genetic base has narrowed as a consequence of extensive domestication [7]. Earlier generations of cotton production relied on conventional genetic resources; nevertheless, cotton productivity has been declining over the last several years [8]. It is therefore imperative to develop new genetic resources to meet the challenges of the new millennium.

Genomic methods have been used extensively to enhance fiber characteristics, to produce cotton cultivars that are resistant to insect pests and diseases, and to develop cotton varieties that tolerate abiotic stresses. Genetic modification is widely regarded as having improved cotton genomes [9]. This review of cotton’s genetic resources, both traditional and modern, and of their potential for future use is broad in scope. Better use of existing genetic resources has the potential to alleviate the threats that cotton production now faces.

2. Genomics and genetic diversity of Gossypium spp.

Gossypium spp. belong to the genus Gossypium in the family Malvaceae. The genus contains 50 species: 45 diploids (2n = 2x = 26) and 5 tetraploids (2n = 4x = 52) [10]. The tetraploid species were formed after hybridization of A- and D-genome species around 4–11 million years ago (MYA), and they diverged from their common ancestor about 100,000–200,000 years ago. Because each of the two subgenomes carries a copy of practically every gene, tetraploid cotton contains multiple copies of each gene [11]. Furthermore, these genes are arranged in the same order as in the diploid progenitors [12]. Two tetraploid species, Gossypium barbadense and G. hirsutum, are among the most widely cultivated cottons in the world.

The eight diploid genome groups, designated A through G and K, are defined by chromosomal pairing affinities [13]. The five tetraploid species are designated (AD)1 through (AD)5 according to their genomic constitutions. Phylogenetic analyses place the diploid Gossypium species in two lineages: the lineage of 13 D-genome species and the lineage of 30–32 A-, B-, E-, F-, C-, G- and K-genome species. The polyploid species form a single lineage of five AD-genome species (Figure 1). Four Gossypium species are grown in agriculture: two allotetraploids (G. hirsutum and G. barbadense) and two diploids (Gossypium herbaceum and G. arboreum). G. hirsutum, or “Upland cotton,” is estimated to account for 90% of the world’s cotton production. G. herbaceum (Levant cotton) and G. arboreum (tree cotton) together account for about 2%, while G. barbadense, known as Egyptian, American Pima, Sea Island or extra-long staple cotton, provides 8% [13].

Figure 1.

Phylogeny and evolution of Gossypium spp. [14].

3. Genome sequencing of Gossypium spp.

The advances made possible by genome sequencing are making functional genomics research more efficient. Insect- and herbicide-resistant cotton cultivars have advanced at a rapid pace during the previous two decades [15]. Progress has been slower, however, in the genetic modification of cotton for plant morphology, flowering, fiber quality, yield, and resistance to biotic and abiotic stresses. Cotton genome research has built on the well-characterized whole-genome sequences available for Arabidopsis and rice. When setting out a strategy for cotton genome sequencing, the Cotton Genome Consortium [6] decided to focus first on the simpler diploid genomes and then apply the results to tetraploid cotton. The D-genome species G. raimondii was prioritized for full sequencing as the first milestone toward a complete cotton genome sequence. Paterson [16] and Wang [17] each published a draft genome sequence of G. raimondii in 2012, an obvious first step toward characterizing the larger “A” diploid and “AD” tetraploid cotton genomes. These 2012 drafts marked the starting point of cotton genome sequencing.

Soon afterwards, the same research team published the 1694-Mb genome of G. arboreum, which is presumed to be the donor of the A-subgenome chromosomes in tetraploid cotton [18]. Although the genomes of the two putative progenitors, G. raimondii and G. arboreum, have both been sequenced, it is still uncertain which species gave rise to the tetraploid cotton species hundreds of thousands of years ago [19]. Furthermore, compared with diploid cotton species, G. hirsutum shows significant changes in economically important traits and in plant structure, which indicates that both natural and artificial selection took place during its development. The allotetraploid cotton species therefore had to be sequenced to learn more about the plant’s evolutionary history and fiber biology. Li [20] and Zhang [21] sequenced the allotetraploid G. hirsutum genome, using the genomes of the A and D progenitor species as a preparatory step. Also of note is Sea Island cotton, which is renowned for its durable, excellent fiber and is well suited to the manufacture of high-quality textiles. Liu et al. [22] and Yuan et al. [23] sequenced the genome of G. barbadense and found that it spans 2470 Mb [23].

Experts nevertheless consider that, because of the way they were assembled, several of the recently released reference genome sequences for tetraploid and diploid cotton species are flawed. The draft genomes sequenced and assembled for G. raimondii [16, 17] and for G. hirsutum [19, 20] differ between the independent assemblies in chromosome lengths and in the number of annotated genes. At least on a large scale, such differences plausibly result from assembly errors. The availability of these sequences has led to a large amount of genetic research on diverse cotton species. For the time being, more effort is needed to examine these genome assemblies critically so that they can be thoroughly compared, assessed, and corrected for misassemblies.

Once a reference genome is available, the link between sequence variation and traits can be investigated by re-sequencing. Recent genome-wide re-sequencing studies of 34 [24], 318 [25], 147 [26] and 352 [27] cotton accessions constitute a considerable collection for finding genomic regions that carry signatures of selection in cotton. Cotton molecular breeding has benefited greatly from these studies, which have yielded valuable new genetic resources. Guided by sequencing information, favorable genes associated with high yield, wide adaptability and high fiber quality can be transferred across gene pools to considerably enhance cotton output.

4. Genome database/bioinformatic tools available for Gossypium spp.

Genome sequencing and re-sequencing are only the first step toward identifying the genetic features that underlie the biological behavior of cotton. As in other model crop plants, cotton genomics has revealed the physiologically active states of DNA through studies of epigenetic modifications, high-density genetic maps, fine-mapping and SNP array platforms, and transcript abundance across species and tissues. Before the complete genome sequences of four Gossypium species were published in 2013, genetic research and breeding in cotton were hindered by the lack of ultra-precise genetic maps. Access to large cotton genome linkage maps may enable gene mapping, high-throughput marker development, and gene isolation and cloning [28, 29]. In the previous 10 years, approximately 1075 QTLs from 58 G. hirsutum studies and 1059 QTLs from interspecific G. hirsutum and 9 G. barbadense populations were reported for yield, fiber quality, seed quality, and tolerance to biotic and abiotic stresses. For marker-assisted selection, these QTLs provide only coarse resolution because they lie in large genomic regions that may contain several genes. A large number of markers to choose from is therefore crucial, so that the genes present at the target locus can be cloned more effectively. Many genes and quantitative trait loci (QTLs) have been mapped in cotton, including the glandless gene [30], leaf shape [31], and fiber quality QTLs [32, 33].

In recent years, improved in silico techniques and next-generation sequencing (NGS) have allowed single nucleotide polymorphisms (SNPs) to be discovered across the whole 2.5-Gb allotetraploid cotton genome. The SNP63K cotton array, created by Ashrafi et al. [34], includes assays for 45,104 putative intraspecific and 17,954 putative interspecific SNP markers (63,058 in total). It is a foundational high-throughput genotyping tool and a platform for genetic research on commercially and agronomically relevant traits. Copy number variants (CNVs), which cover a larger proportion of the genome than SNPs, may help to uncover phenotypic variation that SNPs do not capture. Many studies have shown that plant genomes are rich in CNVs, which may affect gene regulation, dosage and gene structure [35]. The vast majority of genes affected by CNVs are linked to important traits. A recent study found that cotton contains 989 CNV-affected genes that influence plant type, cell wall structure, and translational control [26].

A decade ago, transcriptome analysis was identified as the most essential tool for working out how sequence data can be used to gain insight into the activities of individual genes. Whole-genome transcriptome profiling can be achieved with RNA-Seq, which uses high-throughput sequencing to sequence transcripts directly. The recently published transcriptome assembly for the G. hirsutum TM-1 inbred line, together with an assembly of all publicly accessible expressed sequence tags (ESTs), was used as a reference for SNP detection in cotton [34]. RNA-Seq analyses of large-scale gene expression in cotton have also made use of diploid and tetraploid genome sequences and NGS technologies. Transcriptome analysis has been used to study many processes in plants, including leaf senescence [36], fiber development [37], biotic stress [38] and abiotic stress [39]. RNA-Seq still faces obstacles, however, such as library construction and the development of efficient methods for storing and processing vast volumes of data [40]. Once these barriers to widespread use are removed, RNA-Seq is expected to become the primary tool for transcriptome analysis [41].
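As a rough illustration of the processing that large-scale RNA-Seq expression analysis involves, the sketch below normalizes raw read counts to transcripts per million (TPM). TPM is one common normalization and is not named in this chapter; the gene names, read counts and transcript lengths are hypothetical.

```python
def counts_to_tpm(read_counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM).

    read_counts: dict mapping gene name -> number of mapped reads
    lengths_kb:  dict mapping gene name -> transcript length in kilobases
    """
    # Reads per kilobase: correct each count for transcript length.
    rpk = {gene: read_counts[gene] / lengths_kb[gene] for gene in read_counts}
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        raise ValueError("no reads mapped to any gene")
    # Rescale so that the values across all genes sum to one million.
    scale = total_rpk / 1_000_000
    return {gene: value / scale for gene, value in rpk.items()}


# Hypothetical example: three genes with made-up counts and lengths.
example_counts = {"geneA": 100, "geneB": 100, "geneC": 300}
example_lengths_kb = {"geneA": 1.0, "geneB": 2.0, "geneC": 3.0}
tpm = counts_to_tpm(example_counts, example_lengths_kb)
```

Because TPM values sum to the same total in every sample, they can be compared across libraries of different sequencing depth, which is one reason length-aware normalization is usually applied before expression levels are compared.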

In addition to genetic variation, many characteristics of living organisms are influenced by epigenetic modifications. These modifications affect gene expression by changing when, how many and how strongly genes are expressed. Among the epigenetic mechanisms known, DNA methylation [42] has been shown to play a crucial role in crop plant growth and morphological diversity [43]. In cotton, changes in DNA methylation are associated with seasonal variation in fiber production [44] and with differences between tissues [45]. CHH methylation mediated by RNA-directed DNA methylation (RdDM) has been associated with gene activation in ovules, while CHH methylation mediated by chromomethylase 2 (CMT2) has been linked to gene repression during fiber development [46].

Between wild and domesticated cotton, 519 genes have been shown to be epigenetically modified, some of which have been linked to domestication and agronomic traits [47]. This research has improved our understanding of how epigenetic regulation affects many aspects of cotton development and polyploid evolution. To put this knowledge to practical use, we need to know how the methylome changed during evolution and domestication.

4.1 Functional genomics databases for cotton

Genome analysis requires simultaneous access to data on map positions, genome sequence, allelic variation, mRNA and protein expression, and metabolism. With the growth of omics data sets, it is more important than ever to have functional genomics databases that help users to access and display genetic data easily. Resources available for cotton include:

  • CottonGen (https://www.cottongen.org) [48]
  • Cotton Genome Resource Database (CGRD; http://cgrd.hzau.edu.cn/index.php) [49]
  • Database for Co-expression Networks with Function Modules (ccNET; http://structuralbiology.cau.edu.cn/) [50]
  • Joint Genome Institute (JGI; http://jgi.doe.gov) [51]
  • Cotton Genome Database (CottonDB; http://www.cottondb.org) [52]
  • Evolution of Cotton (https://learn.genetics.utah.edu/content/cotton/evolution/) [53]
  • Platform of Functional Genomics Analysis in Gossypium raimondii (GraP; http://structuralbiology.cau.edu.cn/GraP/about.html) [54]
  • Cotton Functional Genomic Database (CottonFGD; https://cottonfgd.org) [55]
  • Cotton Genome Project (CGP; http://cgp.genomics.org.cn/page/species/index.jsp) [56]
  • https://www.cottongen.org/data/markers [57]
  • https://bacpacresources.org/ [58]
  • https://scienceweb.clemson.edu/cugbf/clemson-genomics-and-bioinformatics-courses/ [59]

CottonFGD provides access to most of the sequenced Gossypium genomes and to other plant genomes, together with transcriptome and re-sequencing data. The ccNET database contains 1155 and 1884 functional modules for the diploid G. arboreum and for G. hirsutum, respectively.

5. Advances in cotton genomics research

Genome research has been shown to help maintain and improve the genetics of agricultural plants. Efforts in cotton have therefore focused on creating genetic tools and establishing breeding stocks for genetics and genomics research. The tools available include genomic markers such as simple sequence repeats (SSRs, or microsatellites), random amplified polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), resistance gene analogues (RGA) and sequence-related amplified polymorphism (SRAP). In parallel with genome sequencing, genetic maps, genome-wide bacterial artificial chromosome (BAC) libraries, and integrated physical maps based on plant-transformation-competent binary BAC (BIBAC) clones are being developed. Research on the cotton genome lags behind that on soybean, rice and maize, mostly because less funding has been provided for cotton than for these other important crops. The following sections give an overview of recent significant advances in cotton genomics research [14].

5.1 DNA markers and molecular linkage maps

RFLPs were the first DNA markers to be used in cotton genomic research, as they were in most plant species at the time. It is therefore no surprise that the first genetic linkage map of Gossypium [60] was based on RFLPs; it was built from an interspecific G. hirsutum × G. barbadense F2 population. The map comprised 705 loci organized into 41 linkage groups and spanned a total of 4675 cM. Rong et al. [61] later produced a more comprehensive map containing 2584 loci at 1.74-cM intervals and covering all 13 homeologous chromosomes of cotton, making it the most comprehensive genetic map for the genus to date. A large number of DNA probes from this map were also mapped in crosses between the D-genome diploids G. trilobum × G. raimondii [61] and between the A-genome diploids G. arboreum × G. herbaceum [62]. Comparing the tetraploid A and D subgenomes with the diploid A- and D-genome maps, and transferring these insights across species, has produced important results.
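Map density is usually summarized as the average distance between adjacent loci. The sketch below computes it for the first RFLP map, taking the 705 loci, 41 linkage groups and a total length of 4675 cM (centimorgans, the unit of genetic distance) from the text. The function name is illustrative, and the calculation assumes that a linkage group with k loci contributes k - 1 intervals.

```python
def mean_marker_spacing(map_length_cm, n_loci, n_linkage_groups=1):
    """Average distance in cM between adjacent loci on a linkage map.

    A linkage group with k loci has k - 1 intervals, so the whole map
    has (n_loci - n_linkage_groups) intervals.
    """
    intervals = n_loci - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more loci than linkage groups")
    return map_length_cm / intervals


# First RFLP map described above: 705 loci, 41 linkage groups, 4675 cM.
first_map_spacing = mean_marker_spacing(4675, 705, 41)
print(f"average spacing: {first_map_spacing:.2f} cM")
```

On these figures the first map averages roughly 7 cM between loci, against the 1.74-cM interval reported for the later map of Rong et al., which gives a sense of how much denser the later map is.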

RFLP analysis is time-consuming and requires large quantities of DNA, labor-intensive blot hybridization and autoradiography, and it is now being superseded by DNA marker systems based on the polymerase chain reaction (PCR). The use of PCR-based DNA markers in genetic studies of cotton has led to a broad range of markers for diverse applications. Techniques such as AFLP, RAPD, SRAP and RGA make it possible to scan a large number of DNA loci in a short time, targeting DNA elements that evolve rapidly and are therefore more likely to include loci that vary between genotypes [63]. Using a population from an interspecific cross between Texas Marker-1 (TM-1) and 3-79, Kohel [63] assembled 355 DNA markers into 50 linkage groups covering a total of 4766 cM to construct a genetic map. Brown and Brubaker [64] published a map based on an interspecific Gossypium nelsonii × G. australe population genotyped with AFLPs; this was the first AFLP genetic linkage map for the Gossypium G genome. In a G. australe hexaploid bridging family, AFLPs could be used to identify chromosome-specific molecular markers unique to the G genome and to monitor the transmission frequency of G. australe chromosomes.

A newer class of genetic markers for cotton, simple sequence repeat (SSR) or microsatellite markers, is easier to use and highly polymorphic. This is especially helpful in Upland cotton, which has a low level of intraspecific polymorphism. SSRs are PCR-based markers that are usually co-dominant and widely distributed throughout the genome; because they are defined by flanking primer sequences, they are easily shared between laboratories and readily transferable across populations [65]. According to http://www.cottonmarker.org [60], about 5484 SSRs have been developed in cotton [66].

5.2 Gene and QTL mapping

Although molecular linkage maps have greatly advanced our knowledge of the evolution and organization of cotton genomes, a main motive for building them was to locate the genes that affect qualitative and quantitative traits. When DNA markers are linked to genes for important agronomic traits that are costly or time-consuming to evaluate, using those markers to select desirable progeny in breeding programs is less expensive and more reliable.

5.3 Mapping qualitative traits

Qualitative, or simply inherited (Mendelian), traits are passed from one generation to the next and differ in kind rather than degree. Such traits are generally controlled by a single gene, and the phenotypic variation among the progeny of a segregating population can be divided into discrete classes. Qualitative traits have been described in both diploid (G. arboreum and G. herbaceum) and tetraploid (mostly G. hirsutum and G. barbadense) species [1]. Examples include pollen color, leaf shape, lint color, leaf color, pubescence and bract morphology. Many qualitative traits derive from morphological mutants that arose through natural variation within species, interspecific hybridization, spontaneous mutation or irradiation. Only a few attempts have been made to place qualitative traits on the molecular genetic map. A recent publication [67] reviewed the qualitative traits that have been mapped with molecular markers. Many of these traits were placed on the map mainly as a tool for anchoring linkage groups to the chromosomes assigned on the classical map. The mapped genes associated with cotton quality and productivity include genes for leaf shape and development, fiber production (including fiber strength), disease and insect pest resistance, and fertility restoration [67].
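Because a single-gene trait sorts progeny into discrete classes, its inheritance can be checked against a Mendelian ratio with a chi-square goodness-of-fit test. The sketch below tests a 3:1 ratio in an F2 population; the plant counts are hypothetical and are not taken from any study cited here, and 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected count in each class under the hypothesized ratio.
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRITICAL_VALUE_1DF = 3.841

# Hypothetical F2 counts: 152 plants with the dominant phenotype and
# 48 with the recessive phenotype.
f2_counts = [152, 48]
statistic = chi_square_gof(f2_counts, [3, 1])
fits_3_to_1 = statistic < CRITICAL_VALUE_1DF
print(f"chi-square = {statistic:.3f}, consistent with 3:1: {fits_3_to_1}")
```

A statistic below the critical value means the counts give no reason to reject single-gene inheritance; it does not prove it, and small populations have little power to detect deviations.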

5.4 Mapping quantitative traits

Quantitative traits vary in degree rather than kind. They are usually assumed to result from interactions among several loci, show continuous variation in a segregating population, and are readily influenced by the environment. Over the past decade, the number of DNA markers available for cotton genetic mapping has grown, and with it the discovery of quantitative trait loci (QTLs) has expanded rapidly. QTLs found in cotton include those affecting plant architecture, disease resistance, insect resistance and flowering date, to name a few [14].
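The simplest way to relate a marker to a quantitative trait is single-marker regression: regress the phenotype on the marker genotype and report the share of phenotypic variance explained (R²). The sketch below is a minimal version of that idea, not the method of any study cited here; genotypes are coded as 0, 1 or 2 copies of one allele, and the data are hypothetical.

```python
def variance_explained(genotypes, phenotypes):
    """R-squared from regressing a phenotype on a single marker genotype."""
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching lists with at least three plants")
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    # Sums of squares and cross-products around the means.
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_pp = sum((p - mean_p) ** 2 for p in phenotypes)
    if s_gg == 0 or s_pp == 0:
        raise ValueError("marker or trait does not vary in this sample")
    # For one predictor, R-squared equals the squared correlation.
    return s_gp ** 2 / (s_gg * s_pp)


# Hypothetical F2 plants: marker genotype (allele copies) and fiber strength.
marker = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
strength = [27.1, 28.0, 29.4, 28.8, 30.1, 31.5, 30.9, 27.6, 29.0, 32.2]
r_squared = variance_explained(marker, strength)
print(f"phenotypic variance explained: {r_squared:.0%}")
```

Interval and composite interval mapping refine this approach by using flanking markers and background control, but the quantity reported, the percentage of phenotypic variation explained by a locus, has the same meaning.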

5.5 BAC and BIBAC resources

Large-insert BAC and BIBAC libraries are essential and much sought-after resources for advanced genetics and genomics research, according to a large number of publications [68, 69, 70]. Because their cloned insert DNA is easy to purify, their level of chimerism is low and they are highly stable in the host cell, bacterial artificial chromosomes (BACs) have quickly become a significant component of genome research [71, 72]. Their applications in genomics include gene and QTL mapping [73], whole-genome or chromosome physical mapping [74, 75], large-scale genome sequencing [76, 77], isolation and characterization of structural and regulatory genes [78, 79] and cytologically based gene discovery. BAC libraries have been produced for a range of taxa, including plants, animals, insects and bacteria. They are publicly accessible through the following websites: (i) https://bacpacresources.org/ [58], and (ii) https://scienceweb.clemson.edu/cugbf/clemson-genomics-and-bioinformatics-courses/ [59]. BAC and BIBAC libraries have been developed for several genotypes of Upland cotton (G. hirsutum) to make cotton genome research more efficient. By May 1, 2007, at least six BAC and BIBAC libraries had been constructed and made available to the public. They were built from five genotypes of Upland cotton (Auburn 623, Tamcot HQ95, 0-613-2R, TM-1 and Maxxa) in four BAC vectors and one Agrobacterium-mediated, plant-transformation-competent BIBAC vector, using three restriction enzymes. The average insert size of the libraries ranges from 93 to 175 kb and the genome coverage of each from 2.3 to 8.3× genome equivalents; together they cover more than 21× the haploid genome of polyploid cotton. Other Gossypium species that have been studied include G. raimondii, G. barbadense (Pima S6), G. longicalyx and G. arboreum (AKA8401), among others. These BAC and BIBAC libraries provide crucial resources for advanced genetics and genomics research on cotton.
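Genome coverage of a library follows from the number of clones, the average insert size and the genome size. The sketch below shows that arithmetic together with the standard Clarke-Carbon estimate of the chance that a given sequence is present in the library. The genome size of 2.5 Gb is the allotetraploid figure quoted earlier in this chapter; the 130-kb insert size is a hypothetical value inside the 93 to 175 kb range reported above, and the function names are illustrative.

```python
import math

# Assumption: ~2.5 Gb allotetraploid cotton genome, expressed in kb.
GENOME_SIZE_KB = 2_500_000


def genome_equivalents(n_clones, insert_kb, genome_kb=GENOME_SIZE_KB):
    """Library coverage in genome equivalents (x)."""
    return n_clones * insert_kb / genome_kb


def clones_for_coverage(coverage, insert_kb, genome_kb=GENOME_SIZE_KB):
    """Number of clones needed to reach a target coverage."""
    return math.ceil(coverage * genome_kb / insert_kb)


def prob_sequence_represented(n_clones, insert_kb, genome_kb=GENOME_SIZE_KB):
    """Clarke-Carbon probability that a given sequence is in the library."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


# Hypothetical library with 130-kb average inserts and a 5x target.
n_needed = clones_for_coverage(5, 130)
p_found = prob_sequence_represented(n_needed, 130)
print(f"{n_needed} clones, P(sequence present) = {p_found:.4f}")
```

The probability grows quickly with coverage (approximately 1 - e^(-c) for c genome equivalents), which is why combining several libraries into a deeper collection makes it far more likely that any target gene can be recovered.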

5.6 Microarray

Microarrays have been widely used in genomics research in recent years for gene discovery, mutation screening, gene expression profiling, expression QTL (eQTL) mapping, high-throughput genetic mapping and comparative genome analysis, among other applications. To produce an array, long (70-mer) gene-specific oligonucleotides are printed as array elements on chemically coated glass slides; the slide is then hybridized with one or more fluorescently labeled cDNA or mRNA targets prepared from mRNA extracted from specific tissues, organs or cells. Researchers can thus save time and money by observing the expression of all the genes represented on the microarray in a single hybridization experiment. To advance cotton genomics research, microarrays based on cotton ESTs have been built in various laboratories across the world. The first cotton microarrays [80] were produced from 70-mer oligos designed from unigene ESTs of G. arboreum. Each microarray carries 12,227 elements representing 12,227 non-redundant (NR) fiber ESTs, and each element is replicated twice. Using these microarrays, Arpat et al. [80] found statistically significant differences in gene expression between fibers at 10 days post-anthesis (dpa), the stage of primary cell wall synthesis or elongation, and fibers at 24 dpa, the stage of secondary cell wall deposition (Figure 2). According to the findings, fiber gene expression changes during the transition from primary cell wall biogenesis or elongation to secondary cell wall biogenesis, with 2553 fiber genes possibly down-regulated and 81 strongly up-regulated in this phase.

Figure 2.

Cotton fiber development and corresponding morphogenesis stages. The initiation stage is characterized by the enlargement and protrusion of epidermal cells from the ovular surface; during the elongation stage the cells expand in polar directions at a rate of 2 mm/day; during the secondary cell wall deposition stage cellulose is synthesized rapidly until the fibers contain 90% cellulose; and at the maturation stage minerals accumulate in the fibers and the fibers dehydrate [14].

These findings suggest that the expression of fiber genes is stage-specific or dependent on cell expansion rather than continuous. Most of the genes that were up-regulated during secondary cell wall synthesis relative to primary cell wall biogenesis belonged to three main functional categories: energy and metabolism; cellular organization and biogenesis; and cytoskeleton, the most frequently observed. This is plausibly explained by the large amount of cellulose synthesis and cell wall biogenesis taking place at this stage. The fiber gene microarrays were later upgraded to include about 10,000 additional gene elements derived from fiber and ovary ESTs of the cultivated tetraploid cotton, G. hirsutum [14].
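Comparisons such as 10-dpa versus 24-dpa fibers are commonly summarized as a log2 fold change per gene, with a cutoff deciding which genes count as up- or down-regulated. The sketch below illustrates that step only; it is not the statistical procedure used in the cited study, and the gene names, signal intensities, pseudocount and two-fold cutoff are all hypothetical choices.

```python
import math


def log2_fold_change(signal_early, signal_late, pseudocount=1.0):
    """log2 ratio of late to early signal; the pseudocount avoids log(0)."""
    return math.log2((signal_late + pseudocount) / (signal_early + pseudocount))


def classify_genes(signals, threshold=1.0):
    """Label each gene 'up', 'down' or 'unchanged' between two stages.

    signals:   dict mapping gene -> (early_signal, late_signal)
    threshold: absolute log2 fold change required (1.0 means two-fold)
    """
    calls = {}
    for gene, (early, late) in signals.items():
        change = log2_fold_change(early, late)
        if change >= threshold:
            calls[gene] = "up"
        elif change <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical intensities at the early (10 dpa) and late (24 dpa) stages.
example_signals = {
    "geneA": (99.0, 399.0),
    "geneB": (799.0, 99.0),
    "geneC": (200.0, 220.0),
}
calls = classify_genes(example_signals)
```

A real analysis would add replicate-based statistics and multiple-testing correction before declaring a gene differentially expressed; the fold change alone only ranks the size of the effect.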


6. Application of bioinformatics-genomic tools

Undoubtedly, one of the most important aims of genome research is to apply the genomic resources and tools that have been developed to promote or assist the continued genetic improvement of crops. Advances in genomic resources and technology now make it possible to address many important scientific questions in cotton. Genomic resources and tools can be used to promote or support cotton genetic improvement in a number of ways. Marker-assisted selection (MAS) is one of the most significant and beneficial applications, both now and in the near future. MAS can benefit a breeding program in a variety of circumstances. For example, DNA markers linked to a gene of interest can be used in the early generations of a breeding cycle to increase the efficiency of selection in subsequent generations.

This approach offers substantial benefits when screening for phenotypes for which selection is costly or difficult to perform, for example when many genes or recessive genes are involved, when seasonal or geographical constraints apply, or when the trait is expressed late [81]. Because most cotton genome research over the last decade has been devoted to developing genomic resources and tools, with cotton genetic improvement as the ultimate goal, cotton breeding programs have only recently begun to use MAS.
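The value of a linked marker for MAS depends on how often recombination separates the marker from the gene of interest. The following is a minimal sketch of that relationship, not a method taken from the studies cited here. It assumes a single meiosis, no crossover interference, and hypothetical recombination fractions; the function names are illustrative.

```python
def single_marker_error(r):
    """Chance that a gamete selected on one linked marker lacks the target
    allele, where r is the marker-gene recombination fraction."""
    return r


def flanking_marker_error(r1, r2):
    """Chance that a gamete carrying the donor allele at both flanking markers
    lacks the target allele (a double crossover), assuming no interference."""
    double_crossover = r1 * r2
    no_crossover = (1 - r1) * (1 - r2)
    return double_crossover / (double_crossover + no_crossover)


# Hypothetical recombination fractions, chosen only for illustration.
for r in (0.20, 0.10, 0.01):
    print(
        f"r = {r:.2f}: single marker {single_marker_error(r):.4f}, "
        f"flanking pair {flanking_marker_error(r, r):.4f}"
    )
```

Under these assumptions, tighter linkage or a pair of flanking markers sharply reduces the chance of selecting a plant that has lost the target allele.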

6.1 Fiber quality

Zhang [82] used the Gossypium anomalum introgression line 7235, which has excellent fiber quality attributes, to identify molecular markers linked to fiber strength QTLs. A major QTL, QTLFS1, was detected at the Nanjing and Hainan field sites in China and at the College Station field site in Texas, USA. This QTL was linked to eight markers and explained more than 30% of the phenotypic variation in the study population. QTLFS1 was originally thought to be located on chromosome 10, but further study revealed that it was actually positioned on LGD03 [81]. Guo et al. [83] used a specific SCAR marker, SCAR4311920, for large-scale screening for the presence or absence of this important fiber strength QTL in breeding populations [83, 84, 85]. This QTL, together with the DNA markers closely linked to it, may therefore be useful in developing commercial cultivars with superior fiber length attributes.
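The statement that a QTL accounts for a given share of phenotypic variation can be illustrated with a single-marker regression, in which the coefficient of determination (R²) estimates the fraction of phenotypic variance associated with marker genotype. The sketch below is a simplified illustration under stated assumptions, not the analysis used in the cited studies; the genotype codes (number of donor alleles) and the fiber strength values are hypothetical.

```python
def variance_explained(genotypes, phenotypes):
    """R^2 of a simple linear regression of phenotype on marker genotype code."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: donor-allele count at one marker (0, 1, or 2) and
# fiber strength in arbitrary units for eight plants.
marker_genotypes = [0, 0, 1, 1, 1, 2, 2, 2]
fiber_strength = [26.1, 27.0, 27.4, 28.9, 27.8, 29.5, 28.2, 30.1]

r_squared = variance_explained(marker_genotypes, fiber_strength)
print(f"Share of phenotypic variance associated with the marker: {r_squared:.2f}")
```

A marker that is only loosely linked to the QTL would give a lower R² than the QTL itself explains, which is one reason tightly linked markers are preferred.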

Wang et al. [86] detected a stable fiber length QTL, qFLD2-1, in the Xiangzamian 2 population by evaluating it in four environments at the same time. Because of its high degree of stability, this QTL may be valuable for MAS. Using a detailed RFLP map, Chee and coauthors [87] dissected the molecular basis of the genetic variation governing 15 parameters that reflect fiber length in 3662 BC3F2 plants from 24 independently derived BC3 families with Gossypium barbadense as the donor parent. The finding of many QTLs specific to each trait indicates that, to obtain the largest genetic gain, breeding efforts must target each trait individually. Lacape et al. [88] carried out a QTL analysis of 11 fiber characteristics in BC1, BC2, and BC2S1 backcross generations derived from a cross between G. hirsutum “Guazuncho 2” and G. barbadense “VH8.” They found 15, 12, 21, and 16 QTLs for strength, length, color and fineness, respectively, in at least one population, with the number of QTLs varying from population to population.

The data indicated that the vast majority of QTLs had favorable alleles coming from the G. barbadense parent, and that colocalization of QTLs for different traits was more common than isolated positioning of QTLs for single traits. By considering these QTL-rich chromosomal regions, the authors identified 19 regions on 15 different chromosomes as prospective target regions for marker-assisted introgression. DNA markers linked to G. barbadense QTLs may allow breeders to transfer and retain favorable traits from exotic sources more effectively during cultivar development.

6.2 Cytoplasmic male sterility

In cotton, cytoplasmic male sterility conditioned by the D8 alloplasm (CMS-D8) can be restored to fertility both by the D8 restorer (D8R) and by the D2 restorer (D2R), which was developed for the D2 cytoplasmic male sterile alloplasm (CMS-D2). Zhang and Stewart [89] concluded that the two restorer loci are nonallelic but tightly linked, with an average genetic distance of 0.93 cM between them. The D2 restorer gene has been redesignated Rf1, and the designation Rf2 has been assigned to the D8 restorer gene. Identifying molecular markers closely linked to the restorer genes of cytoplasmic male sterility could facilitate the development of parental lines for hybrid cotton.
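A map distance such as 0.93 cM can be converted to an expected recombination fraction with a mapping function. The sketch below applies the Haldane and Kosambi functions, which are standard textbook choices; the source does not state which mapping function Zhang and Stewart [89] used, so the calculation is illustrative only.

```python
import math


def haldane_recombination_fraction(centimorgans):
    """Haldane mapping function: assumes no crossover interference."""
    morgans = centimorgans / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def kosambi_recombination_fraction(centimorgans):
    """Kosambi mapping function: allows for partial crossover interference."""
    morgans = centimorgans / 100.0
    return 0.5 * math.tanh(2.0 * morgans)


distance_cm = 0.93  # average distance between the two restorer loci
r_haldane = haldane_recombination_fraction(distance_cm)
r_kosambi = kosambi_recombination_fraction(distance_cm)
print(f"Haldane: r = {r_haldane:.5f}")
print(f"Kosambi: r = {r_kosambi:.5f}")
```

At such a short distance both functions give a recombination fraction slightly under 1%, that is, fewer than one recombinant gamete per hundred between the two loci.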

Guo et al. [90] found that one DNA marker, designated OPV-15(300), was closely linked to the fertility-restoring gene Rf1. Three RAPD markers linked to the restorer gene were identified and, more importantly, converted into genome-specific sequence-tagged site (STS) markers. Liu et al. [22] located the Rf1 locus, whose position was previously unknown, on the long arm of chromosome 4 and found that the Rf1 gene is closely linked to two RAPD and three SSR markers. Because these markers are specific to restorers, they should be useful in MAS for developing restorer parental lines. Later, Yin [91] developed a high-resolution genetic map of Rf1 containing 13 markers within a genetic distance of 0.9 cM, which was used to narrow down the location of Rf1. Using a physical map of the Rf1 locus, the researchers delimited the gene’s likely location to at least two bacterial artificial chromosome (BAC) clones, designated 081-05 K and 052-01 N, separated by an interval of approximately 100 kb. Work to isolate the Rf1 gene from cotton is in progress.

6.3 Resistance to diseases and insect pests

Disease resistance is an important consideration in cotton breeding programs. To enable the cloning, characterization and manipulation of genes conferring resistance to diverse pathogens, including fungi, viruses, bacteria and nematodes, researchers identified and characterized the family of NBS-LRR-encoding genes, or resistance gene analogs (RGAs), in the Upland cotton cultivar Auburn 634. Members of the RGA gene family were found on only a small proportion of the AD-genome chromosomes of cotton, members of a subfamily tend to cluster together on the cotton genetic map, and more RGAs were found in the A subgenome than in the D subgenome. Two RGAs comapped with QTLs for cotton bacterial blight resistance identified previously [92]. Because genes of the NBS-LRR family account for approximately 80% of the genes (>40 genes) that have been cloned to date and confirmed to confer resistance to fungi, viruses, bacteria and nematodes, cotton RGAs from this family are valuable tools for cloning, characterizing and manipulating genes for resistance to a variety of pests and pathogens.

The root-knot nematode (RKN) Meloidogyne incognita can significantly reduce cotton yields. Using the resistant G. hirsutum cultivar “Acala NemX,” Wang et al. [93] identified CIR316, an SSR marker on linkage group A03 that is tightly linked to a major RKN resistance gene (rkn1). A parallel study combined bulked segregant analysis with AFLP to find additional rkn1-associated molecular markers [94]. One AFLP marker linked to rkn1, GHACC1, was converted to a CAPS marker. These two markers should be useful for MAS. Shen et al. [95] found that RFLP markers on chromosomes 7 and 11 are associated with RKN resistance in Auburn 634, a source of resistant germplasm different from Acala NemX [96].
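A CAPS (cleaved amplified polymorphic sequence) marker distinguishes alleles by whether a restriction enzyme cuts a PCR amplicon. A minimal sketch of that idea follows. The two sequences are invented for illustration and are not the GHACC1 amplicon, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme, since the source does not name the enzyme involved.

```python
def digest_fragment_lengths(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the position of the cut within the recognition site.
    """
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical amplicons that differ by one base inside the recognition site.
resistant_allele = "ATGCCGTAGAATTCGGCTAAGTCC"
susceptible_allele = "ATGCCGTAGAGTTCGGCTAAGTCC"

ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # EcoRI cuts between G and AATTC

for name, sequence in (
    ("resistant", resistant_allele),
    ("susceptible", susceptible_allele),
):
    fragments = digest_fragment_lengths(sequence, ECORI_SITE, ECORI_CUT_OFFSET)
    print(name, fragments)
```

On a gel, the cut allele would appear as two shorter bands and the uncut allele as a single full-length band, which is what makes a CAPS marker easy to score.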

This association was further confirmed with SSR markers through the detection of a minor and a major dominant QTL on chromosomes 7 and 11. Two SSR markers together accounted for 31% of the variation in galling index: BNL 3661 maps to the short arm of chromosome 14, and BNL 1231 maps to the long arm of chromosome 11. Because RKN resistance is associated with two different chromosomes, it is reasonable to conclude that at least two genes are involved in RKN resistance.

Bacterial blight, caused by Xanthomonas campestris pv. malvacearum (Xcm), is another economically important disease of cotton. Two studies, by Wright et al. and Rungi et al. [92, 97], examined the genomic locations of genes that confer resistance to the pathogen. Both used RFLP markers with known chromosomal positions to search for the resistance genes. Mapping showed an association between a marker for the B12 resistance gene on chromosome 14 and the resistance locus that was initially discovered in African cotton varieties. In addition, AFLP and SSR markers were used to find new markers that may be used to introgress the Xcm resistance gene into G. barbadense through MAS.


7. Conclusion and future prospects

Although cotton genomics research has lagged behind that of rice, maize, wheat, and soybean, a vast amount of genetic information about the cotton plant and its products has become available. Using these resources and methodologies, numerous genes and quantitative trait loci (QTLs) associated with fiber quality, fiber yield, and biotic and abiotic stresses have been identified and mapped, and the biology of cotton, and of plants in general, has been explored. Cotton fiber microarrays for research and development are available from the laboratory of T. A. Wilkins at Texas Tech University, with four arrays printed on a single slide. However, much more effort is needed before these tools and methodologies are fully utilized in cotton genetic improvement and biological research, and before they become readily accessible for applications. Further cotton genome research should emphasize, but not be limited to, the following areas.

Development of whole-genome BAC/BIBAC-based integrated physical maps of cotton. To date, no accurate and reliable whole-genome BAC/BIBAC-based physical/genetic map has been developed for cotton. Maps should be developed for at least two Gossypium species: Upland cotton, which accounts for about 90% of world cotton output, and Gossypium raimondii, which has the smallest genome in the genus and therefore probably the highest gene density. Whole-genome integrated physical/genetic maps have proved to be powerful platforms and “freeways” for most modern genetics and genomics research in model and other species, such as fruit fly, human and mouse [74, 75]. Integrated physical maps would also allow faster and more effective integration of all existing genetic maps, mapped genes and QTLs, and other genetic resources, improving research efficiency and lowering costs.

Fine mapping of QTLs. Although many genes and QTLs related to fiber yield, fiber quality, and biotic and abiotic stresses have been genetically mapped, two issues must be addressed. First, virtually all QTLs were detected using F2, BC1, or other early-generation populations grown in only one or a few environments. Because quantitative traits are highly sensitive to environmental variation, results obtained from early generations in one or a few environments often differ from one study to the next. Second, the genetic distances between most QTLs and their DNA markers are too large for MAS to be successful. Accurate QTL mapping therefore requires large, advanced populations such as recombinant inbred lines (RILs) or doubled haploids (DHs), evaluation in multiple environments, closely linked DNA markers, and comprehensive physical maps. Accurately mapped QTLs and DNA markers well suited to MAS (i.e., tightly linked and user-friendly) are also necessary for the eventual isolation of QTL genes by map-based cloning. Isolated genes are in turn the best candidates for developing MAS markers, because there is no recombination between a gene and a marker derived from it.

Sequencing of one or more key cotton genomes. Whole-genome sequencing is the most effective way to identify and decode all cotton genes, despite its high cost with current sequencing technology. It also yields the most desirable and most detailed integrated physical and genetic map of the cotton genome. Although genome sizes vary widely among Gossypium species, comparative genomics studies show that gene content and gene order are highly conserved among these species [60, 61]. Gossypium raimondii has the smallest genome of all Gossypium species, although it is not cultivated, which makes it an excellent candidate for genome sequencing. Its sequence data could be transferred to the most important cultivated cotton, G. hirsutum: if an integrated physical map of this larger genome is available, the BAC end sequences of that map can serve as anchors.

ESTs from nonfiber and nonovary tissues and from fibers at the secondary cell wall deposition stage. Cotton ESTs are now more plentiful than ever before, but their distribution across tissue types is still rather uneven. Relatively few ESTs come from nonfiber/nonovary tissues or from fibers after 20 dpa, when secondary cell wall deposition is under way; the gap is especially evident for the 15–45-dpa stage. Genes expressed in the first group of tissues do not contribute directly to fiber yield, quality or strength but can still influence yield and quality substantially, whereas genes expressed in the second group have a major direct influence on fiber yield and quality.

Profiling and identification of genes associated with particular biological processes, with an emphasis on genes involved in fiber development. The creation and widespread availability of microarrays based on cDNAs or unigene ESTs have made possible many advances in molecular biology. Unfortunately, cotton research has so far made little progress in these areas. Identifying and characterizing the genes involved in fiber initiation and development, plant growth, and responses to biotic and abiotic stresses would considerably enhance the capacity of breeders to improve cotton genetically.

Cotton breeders would benefit greatly from the ability to translate changes in gene activity or expression across tissues and developmental stages into changes in fiber quality and yield. Cotton genotypes have been used to discover genes involved in fiber initiation [98, 99], expansion [80, 98] and secondary cell wall deposition [80]. However, it is not clear what the upregulation or downregulation of a fiber gene in particular developmental stages and organs means for the final fiber yield or quality. For example, does active expression of a gene during the elongation stage imply longer fibers? More research is needed on using gene expression data in cotton germplasm analysis and breeding programs.


Acknowledgments

The authors are grateful to the Department of Genetics & Biotechnology, Osmania University, and Averinbiotech for their support.


Conflict of interest

The authors declare no conflict of interest.


Acronyms and abbreviations

Cotton FGD

Cotton functional genomic database

Cotton DB

Cotton genome database

CGP

Cotton genome project

CGRD

Cotton genome resource database

JGI

Joint Genome Institute

CMT2

chromomethylase2

CNVs

Copy number variants

CMS-D8

Cytoplasmic male sterility by the D8 alloplasm

D2R

D2 restorer

D8R

D8 restorer

ESTs

Expressed sequence tags

MAS

Marker-assisted selection

NGS

Next-generation sequencing

PCR

Polymerase chain reaction

QTL

Quantitative trait loci

RdDM

RNA-directed DNA methylation

SNP

Single nucleotide polymorphism

TM-1

Texas Marker-1

References

  1. 1. Shaheen T, Tabbasam N, Iqbal MA, Ashraf A, Zafar Y, Paterson AH. Cotton genetic resources. A review. Agronomy for Sustainable Development. 2012;32:419-432. DOI: 10.1007/s13593-011-0051-z
  2. 2. Qin YM, Zhu YX. How cotton fibers elongate: A tale of linear cell growth mode. Current Opinion in Plant Biology. 2011;14:106-111. DOI: 10.1016/j.pbi.2010.09.010
  3. 3. Shan CM, Shangguan XX, Zhao B, Zhang XF, Chao LM, Yang CQ, et al. Control of cotton fibre elongation by a homeodomain transcription factor GhHOX3. Nature Communications. 2014;5:5519. DOI: 10.1038/ncomms6519
  4. 4. Chen ZJ, Scheffler BE, Dennis E, Triplett BA, Zhang T, Guo W, et al. Toward sequencing cotton (Gossypium) genomes. Plant Physiology. 2007;145:1303-1310. DOI: 10.1104/pp.107.107672
  5. 5. Rahman M, Zafar Y, Paterson AH. Gossypium DNA markers types, number and uses. In: Paterson AH, editor. Genomics of Cotton. Berlin: Springer; 2009. pp. 101-139. DOI: 10.1007/978-0-387-70810-2_5
  6. 6. Chen H, Qian N, Guo WZ, Song QP, Li BC, Deng FJ, et al. Using three overlapped RILs to dissect genetically clustered QTL for fiber strength on chro.24 in Upland cotton. Theoretical and Applied Genetics. 2009;119:605-612. DOI: 10.1007/s00122-009-1070-x
  7. 7. Rahman M, Yasmin T, Tabassum N, Ullah I, Asif M, Zafar Y. Studying the extent of genetic diversity among Gossypium arboreum L. genotypes/cultivars using DNA fingerprinting. Genetic Resources and Crop Evolution. 2008;55:331-339. DOI: 10.1007/s10722-007-9238-1
  8. 8. Helms AB. Yield study report. In: Duggar P, Richder D, editors. Proceedings of Beltwide Cotton Production Conference. San Antonio TX: National Cotton Council; 2000. pp. 4-9
  9. 9. Abelson PH. A third technological revolution. Science. 1998;279(5359):2019. DOI: 10.1126/science.279.5359.2019a
  10. 10. Fryxell PA, Craven LA, Stewart JM. A revision of Gossypium sect. Grandicalyx (Malvaceae), including the description of six new species. Systematic Botany. 1992;17:91-114. DOI: 10.2307/2419068
  11. 11. Rong J, Abbey C, Bowers JE, Brubaker CL, Chang C, Chee PW, et al. A 3347-locus genetic recombination map of sequence-tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics. 2004;166:389-417. DOI: 10.1534/genetics.166.1.389
  12. 12. Brubaker CL, Paterson AH, Wendel JF. Comparative genetic mapping of allotetraploid cotton and its diploid progenitors. Genome. 1999;42:184-203. DOI: 10.1139/g98-118
  13. 13. Endrizzi JE, Turcotte EL, Kohel RJ. Qualitative genetics, cytology, and cytogenetics. In: Kohel RJ, Lewis CF, editors. Cotton. Madison, Wisconsin: American Society of Agronomy; 1984. pp. 81-129. DOI: 10.2134/agronmonogr24.c4
  14. 14. Zhang HB, Li Y, Wang B, Chee PW. Recent advances in cotton genomics. International Journal of Plant Genomics. 2008;742304:14. DOI: 10.1155/2008/742304
  15. 15. Yu LH, Wu SJ, Peng YS, Liu RN, Chen X, Zhao P, et al. Arabidopsis EDT1/HDG11 improves drought and salt tolerance in cotton and poplar and increases cotton yield in the field. Plant Biotechnology Journal. 2016;14:72-84. DOI: 10.1111/pbi.12358
  16. 16. Paterson AH, Wendel JF, Gundlach H, Guo H, Jenkins J, Jin D, et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature. 2012;492:423-427. DOI: 10.1038/nature11798
  17. 17. Wang K, Wang Z, Li F, Ye W, Wang J, Song G, et al. The draft genome of a diploid cotton Gossypium raimondii. Nature Genetics. 2012;44:1098-1103. DOI: 10.1038/ng.2371
  18. 18. Li F, Fan G, Wang K, Sun F, Yuan Y, Song G, et al. Genome sequence of the cultivated cotton Gossypium arboreum. Nature Genetics. 2014;46:567-572. DOI: 10.1038/ng.2987
  19. 19. Wendel JF. New world tetraploid cottons contain old world cytoplasm. Proceedings of the National Academy of Sciences. 1989;86:4132-4136. DOI: 10.1073/pnas.86.11.4132
  20. 20. Li F, Fan G, Lu C, Xiao G, Zou C, Kohel RJ, et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nature Biotechnology. 2015;33:524-530. DOI: 10.1038/nbt.3208
  21. 21. Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nature Biotechnology. 2015;33:531-537. DOI: 10.1038/nbt.3207
  22. 22. Liu L, Guo W, Zhu X, Zhang T. Inheritance and fine mapping of fertility restoration for cytoplasmic male sterility in Gossypium hirsutum L. Theoretical and Applied Genetics. 2003;106(3):461-469. DOI: 10.1007/s00122-002-1084-0
  23. 23. Yuan D, Tang Z, Wang M, Gao W, Tu L, Jin X, et al. The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Scientific Reports. 2015;5(1):1-16. DOI: 10.1038/srep17662
  24. 24. Page JT, Liechty ZS, Alexander RH, Clemons K, Hulse-Kemp AM, Ashrafi H, et al. DNA sequence evolution and rare homoeologous conversion in tetraploid cotton. PLoS Genetics. 2016;12:e1006012. DOI: 10.1371/journal.pgen.1006012
  25. 25. Fang L, Wang Q, Hu Y, Jia Y, Chen J, Liu B, et al. Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits. Nature Genetics. 2017;49:1089-1098. DOI: 10.1038/ng.3887
  26. 26. Fang L, Gong H, Hu Y, Liu C, Zhou B, Huang T, et al. Genomic insights into divergence and dual domestication of cultivated allotetraploid cottons. Genome Biology. 2017;18:33. DOI: 10.1186/s13059-017-1167-5
  27. 27. Wang M, Tu L, Lin M, Lin Z, Wang P, Yang Q, et al. Asymmetric sub genome selection and cis-regulatory divergence during cotton domestication. Nature Genetics. 2017;49:579-587. DOI: 10.1038/ng.3807
  28. 28. Li X, Jin X, Wang H, Zhang X, Lin Z. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map. DNA Research. 2016;23:283-293. DOI: 10.1093/dnares/dsw016
  29. 29. Wang S, Chen J, Zhang W, Hu Y, Chang L, Fang L, et al. Sequence-based ultra-dense genetic and physical maps reveal structural variations of allopolyploid cotton genomes. Genome Biology. 2015;16:108. DOI: 10.1186/s13059-015-0678-1
  30. 30. Cheng H, Lu C, Yu JZ, Zou C, Zhang Y, Wang Q, et al. Fine mapping and candidate gene analysis of the dominant glandless gene Gl2 e in cotton (Gossypium spp.). Theoretical and Applied Genetics. 2016;129:1347-1355. DOI: 10.1007/s00122-016-2707-1
  31. 31. Andres RJ, Bowman DT, Kaur B, Kuraparthy V. Mapping and genomic targeting of the major leaf shape gene (L) in Upland cotton (Gossypium hirsutum L.). Theoretical and Applied Genetics. 2014;127:167-177. DOI: 10.1007/s00122-013-2208-4
  32. 32. Fang X, Liu X, Wang X, Wang W, Liu D, Zhang J, et al. Fine-mapping qFS07.1 controlling fiber strength in upland cotton (Gossypium hirsutum L.). Theoretical and Applied Genetics. 2017;130:795-806. DOI: 10.1007/s00122-017-2852-1
  33. 33. Xu P, Gao J, Cao Z, Chee PW, Guo Q, Xu Z, et al. Fine mapping and candidate gene analysis of qFL-chr1, a fiber length QTL in cotton. Theoretical and Applied Genetics. 2017;130:1309-1319. DOI: 10.1007/s00122-017-2890-8
  34. 34. Ashrafi H, Hulse-Kemp AM, Wang F, Yang SS, Guan X, Jones DC, et al. A long-read transcriptome assembly of cotton (Gossypium hirsutum L.) and intraspecific single nucleotide polymorphism discovery. The Plant Genome. 2015;8:1-14. DOI: 10.3835/plantgenome2014.10.0068
  35. 35. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, et al. 1000 genomes project. Mapping copy number variation by population scale genome sequencing. Nature. 2011;470:59-65. DOI: 10.1038/nature09708
  36. 36. Lin M, Pang C, Fan S, Song M, Wei H, Yu S. Global analysis of the Gossypium hirsutum L. Transcriptome during leaf senescence by RNASeq. BMC Plant Biology. 2015;15:43. DOI: 10.1186/s12870-015-0433-5
  37. 37. Islam MS, Thyssen GN, Jenkins JN, Zeng L, Delhom CD, McCarty JC, et al. A MAGIC population-based genome-wide association study reveals functional association of GhRBB1_A07 gene with superior fiber quality in cotton. BMC Genomics. 2016;17:903. DOI: 10.1186/s12864-016-3249-2
  38. 38. Artico S, Ribeiro-Alves M, Oliveira-Neto OB, de Macedo LL, Silveira S, Grossi-de-Sa MF, et al. Transcriptome analysis of Gossypium hirsutum flower buds infested by cotton boll weevil (Anthonomus grandis) larvae. BMC Genomics. 2014;15:854. DOI: 10.1186/1471-2164-15-854
  39. 39. Bowman MJ, Park W, Bauer PJ, Udall JA, Page JT, Raney J, et al. RNA-Seq transcriptome profiling of upland cotton (Gossypium hirsutum L.) root tissue under water-deficit stress. PLoS ONE. 2013;8(12):e82634. DOI: 10.1371/journal.pone.0082634
  40. 40. Wang Z, Gerstein M, Snyder M. RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews. Genetics. 2009;10:57-63. DOI: 10.1038/nrg2484
  41. 41. Zhao S, Fung-Leung WP, Bittner A, Ngo K, Liu X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS One. 2014;9:e78644. DOI: 10.1371/journal.pone.0078644
  42. 42. Phillips T. The role of methylation in gene expression. Nature Education. 2008;1:116
  43. 43. Cubas P, Vincent C, Coen E. An epigenetic mutation responsible for natural variation in floral symmetry. Nature. 1999;401:157-161. DOI: 10.1038/43657
  44. 44. Jin X, Pang Y, Jia F, Xiao G, Li Q, Zhu Y. A potential role for CHH DNA methylation in cotton fiber growth patterns. PLoS One. 2013;8:e60547. DOI: 10.1371/journal.pone.0060547
  45. 45. Osabe K, Clement JD, Bedon F, Pettolino FA, Ziolkowski L, Llewellyn DJ, et al. Genetic and DNA methylation changes in cotton (Gossypium) genotypes and tissues. PLoS One. 2014;9:e86049. DOI: 10.1371/journal.pone.0086049
  46. 46. Song Q, Guan X, Chen ZJ. Dynamic roles for small RNAs and DNA methylation during ovule and fiber development in allotetraploid cotton. PLoS Genetics. 2015;11:e1005724. DOI: 10.1371/journal.pgen.1005724
  47. 47. Song Q, Zhang T, Stelly DM, Chen ZJ. Epigenomic and functional analyses reveal roles of epialleles in the loss of photoperiod sensitivity during domestication of allotetraploid cottons. Genome Biology. 2017;18(1):99. DOI: 10.1186/s13059-017-1229-8
  48. 48. CottonGen. Available from: https://www.cottongen.org [Accessed: December 4, 2021]
  49. 49. Cotton Genome Resource Database. CGRD. Available from: http://cgrd.hzau.edu.cn/index.php [Accessed: December 4, 2021]
  50. 50. Database for Co-expression Networks with Function Modules. Available from: http://structuralbiology.cau.edu.cn/Gossypium/ [Accessed: December 4, 2021]
  51. 51. Joint Genome Institute. Available from: http://jgi.doe.gov [Accessed: December 4, 2021]
  52. 52. Cotton Genome Database. Available from: http://www.cottondb.org [Accessed: December 4, 2021]
  53. 53. Evolution of Cotton. Available from: https://learn.genetics.utah.edu/content/cotton/evolution/ [Accessed: December 4, 2021]
  54. 54. Platform of Functional Genomics Analysis in Gossypium raimondii. Available from: http://structuralbiology.cau.edu.cn/GraP/about.html [Accessed: December 4, 2021]
  55. 55. Cotton Functional Genomic Database. Available from: https://cottonfgd.org [Accessed: December 4, 2021]
  56. 56. Cotton Genome Project. Available from: http://cgp.genomics.org.cn/page/species/index.jsp [Accessed: December 4, 2021]
  57. 57. Abd El-Moghny AM, Santosh HB, Raghavendra KP, Sheeba JA, Singh SB, Kranthi KR. Microsatellite marker based genetic diversity analysis among cotton (Gossypium hirsutum) accessions differing for their response to drought stress. Journal of Plant Biochemistry and Biotechnology. 2017;26(3):366-370
  58. 58. Zeng C, Kouprina N, Zhu B, Cairo A, Hoek M, Cross G, et al. Large-insert BAC/YAC libraries for selective re-isolation of genomic regions by homologous recombination in yeast. Genomics. 2001;77(1-2):27-34. DOI: 10.1006/geno.2001.6616
  59. 59. Decker CJ, Steiner HR, Hoon-Hanks LL, Morrison JH, Haist KC, Stabell AC, et al. dsRNA-Seq: Identification of viral infection by purifying and sequencing dsRNA. Viruses. 2019;11(10):943
  60. 60. Reinisch AJ, Dong JM, Brubaker CL, Stelly DM, Wendel JF, Paterson AH. A detailed RFLP map of cotton, Gossypium hirsutum x Gossypium barbadense: Chromosome organization and evolution in a disomic polyploid genome. Genetics. 1994;138(3):829-847. DOI: 10.1093/genetics/138.3.829
  61. 61. Rong J, Abbey C, Bowers JE, Brubaker CL, Chang C, Chee PW, et al. A 3347-locus genetic recombination map of sequence-tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics. 2004;166:389-417. DOI: 10.1534/genetics.166.1.389
  62. 62. Desai A, Chee PW, Rong J, May OL, Paterson AH. Chromosome structural changes in diploid and tetraploid A genomes of Gossypium. Genome. 2006;49(4):336-345. DOI: 10.1139/g05-116
  63. 63. Kohel RJ, Yu J, Park YH, Lazo GR. Molecular mapping and characterization of traits controlling fiber quality in cotton. Euphytica. 2001;121(2):163-172. DOI: 10.1023/A:1012263413418
  64. 64. Brubaker CL, Brown AHD. The use of multiple alien chromosome addition aneuploids facilitates genetic linkage mapping of the Gossypium G genome. Genome. 2003;46:774-791. DOI: 10.1139/g03-063
  65. 65. SaghaiMaroof MA, Biyashev RM, Yang GP, Zhang Q, Allard RW. Extraordinarily polymorphic microsatellite DNA in barley: Species diversity, chromosomal locations, and population dynamics. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(12):5466-5470. DOI: 10.1073/pnas.91.12.5466
  66. 66. Blenda A, Scheffler J, Scheffler B, Palmer M, Lacape JM, Yu JZ, et al. CMD: A cotton microsatellite database resource for gossypium genomics. BMC Genomics. 2006;7:132. DOI: 10.1186/1471-2164-7-132
  67. Ulloa M, Brubaker C, Chee P. Cotton. In: Kole C, editor. Technical Crops. Genome Mapping and Molecular Breeding in Plants. Vol. 6. Berlin, Heidelberg: Springer; 2000. DOI: 10.1007/978-3-540-34538-1_1
  68. Lichtenzveig J, Scheuring C, Dodge J, Abbo S, Zhang H-B. Construction of BAC and BIBAC libraries and their applications for generation of SSR markers for genome analysis of chickpea, Cicer arietinum L. Theoretical and Applied Genetics. 2005;110(3):492-510
  69. He L, Du C, Li Y, Scheuring C, Zhang HB. Large insert bacterial clone libraries and their applications. In: Liu Z, editor. Aquaculture Genome Technologies. Ames, Iowa, USA: Blackwell; 2007. pp. 215-244
  70. Ren C, Xu ZY, Sun S, Lee MK, Wu C, Scheuring C, Zhang HB. Genomic DNA libraries and physical mapping. In: Meksem K, Kahl G, editors. The Handbook of Plant Genome Mapping: Genetic and Physical Mapping. Weinheim, Germany: Wiley-VCH Verlag GmbH; 2005. pp. 173-213. DOI: 10.1002/3527603514.ch8
  71. Ioannou PA, Amemiya CT, Garnes J, Kroisel PM, Shizuya H, Chen C, et al. A new bacteriophage P1-derived vector for the propagation of large human DNA fragments. Nature Genetics. 1994;6(1):84-89. DOI: 10.1038/ng0194-84
  72. Shizuya H, Birren B, Kim UJ, Mancino V, Slepak T, Tachiiri Y, et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proceedings of the National Academy of Sciences of the United States of America. 1992;89(18):8794-8797. DOI: 10.1073/pnas.89.18.8794
  73. Zhang H-B. Map-based cloning of genes and QTLs. In: Kole C, Abbott A, editors. Plant Molecular Mapping and Breeding. New York, NY, USA: Springer; 2007
  74. Wu C, Sun S, Lee MK, Xu ZY, Ren C, Zhang HB. Whole genome physical mapping: An overview on methods for DNA fingerprinting. In: Meksem K, Kahl G, editors. The Handbook of Plant Genome Mapping: Genetic and Physical Mapping. Weinheim, Germany: Wiley-VCH Verlag GmbH; 2007. pp. 257-283. DOI: 10.1002/3527603514.ch11
  75. Zhang HB, Wing RA. Physical mapping of the rice genome with BACs. Plant Molecular Biology. 1997;35(1-2):115-127
  76. International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature. 2005;436(7052):793-800. DOI: 10.1038/nature03895
  77. Tyler BM, Tripathy S, Zhang X, Dehal P, Jiang RH, Aerts A, et al. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis. Science. 2006;313(5791):1261-1266. DOI: 10.1126/science.1128796
  78. Chen M, SanMiguel P, de Oliveira AC, Woo SS, Zhang H, Wing RA, et al. Microcolinearity in sh2-homologous regions of the maize, rice, and sorghum genomes. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(7):3431-3435. DOI: 10.1073/pnas.94.7.3431
  79. Patocchi A, Vinatzer BA, Gianfranceschi L, Tartarini S, Zhang HB, Sansavini S, et al. Construction of a 550 kb BAC contig spanning the genomic region containing the apple scab resistance gene Vf. Molecular & General Genetics. 1999;262(4-5):884-891. DOI: 10.1007/s004380051154
  80. Arpat AB, Waugh M, Sullivan JP, Gonzales M, Frisch D, Main D, et al. Functional genomics of cell elongation in developing cotton fibers. Plant Molecular Biology. 2004;54(6):911-929. DOI: 10.1007/s11103-004-0392-y
  81. Dreher K, Morris M, Khairallah M, Ribaut JM, Pandey S, Srinivasan G. Is marker-assisted selection cost-effective compared to conventional plant breeding methods? The case of quality protein maize. In: Evenson RE, Santaniello V, Zilberman D, editors. Economic and Social Issues in Agricultural Biotechnology. Wallingford, UK: CABI Publishing; 2002. pp. 203-236
  82. Zhang T, Yuan Y, Yu J, Guo W, Kohel RJ. Molecular tagging of a major QTL for fiber strength in Upland cotton and its marker-assisted selection. Theoretical and Applied Genetics. 2003;106(2):262-268. DOI: 10.1007/s00122-002-1101-3
  83. Guo W, Zhang T, Shen X, Yu JZ, Kohel RJ. Development of SCAR marker linked to a major QTL for high fiber strength and its usage in molecular-marker assisted selection in upland cotton. Crop Science. 2003;43(6):2252-2256. DOI: 10.2135/cropsci2003.2252
  84. Shen X, Guo W, Zhu X, Yuan Y, Yu J, Kohel R, et al. Molecular mapping of QTLs for fiber qualities in three diverse lines in Upland cotton using SSR markers. Molecular Breeding: New Strategies in Plant Improvement. 2005;15(2):169-181
  85. He DH, Lin ZX, Zhang XL, Nie YC, Guo XP, Feng CD, et al. Mapping QTLs of traits contributing to yield and analysis of genetic effects in tetraploid cotton. Euphytica. 2005;144:141-149. DOI: 10.1007/s10681-005-5297-6
  86. Wang BH, Wu YT, Huang NT, Zhu XF, Guo WZ, Zhang TZ. QTL mapping for plant architecture traits in upland cotton using RILs and SSR markers. Yi Chuan Xue Bao. 2006;33(2):161-170. DOI: 10.1016/S0379-4172(06)60035-8
  87. Chee PW, Draye X, Jiang CX, Decanini L, Delmonte TA, Bredhauer R, et al. Molecular dissection of phenotypic variation between Gossypium hirsutum and Gossypium barbadense (cotton) by a backcross-self approach: III. Fiber length. Theoretical and Applied Genetics. 2005;111(4):772-781. DOI: 10.1007/s00122-005-2062-0
  88. Lacape JM, Llewellyn D, Jacobs J, Arioli T, Becker D, Calhoun S, et al. Meta-analysis of cotton fiber quality QTLs across diverse environments in a Gossypium hirsutum x G. barbadense RIL population. BMC Plant Biology. 2010;10:132. DOI: 10.1186/1471-2229-10-132
  89. Zhang JF, Stewart MDJ. Inheritance and genetic relationships of the D8 and D2-2 restorer genes for cotton cytoplasmic male sterility. Crop Science. 2001;41(2):289-294. DOI: 10.2135/cropsci2001.412289x
  90. Guo W, Zhang T, Pan J, Kohel RJ. Identification of RAPD marker linked with fertility-restoring gene of cytoplasmic male sterile lines in upland cotton. Chinese Science Bulletin. 1998;43(1):52-54
  91. Yin J, Guo W, Yang L, Liu L, Zhang T. Physical mapping of the Rf1 fertility-restoring gene to a 100 kb region in cotton. Theoretical and Applied Genetics. 2006;112(7):1318-1325. DOI: 10.1007/s00122-006-0234-1
  92. Wright RJ, Thaxton PM, El-Zik KM, Paterson AH. D-subgenome bias of Xcm resistance genes in tetraploid Gossypium (cotton) suggests that polyploid formation has created novel avenues for evolution. Genetics. 1998;149(4):1987-1996. DOI: 10.1093/genetics/149.4.1987
  93. Wang C, Ulloa M, Roberts PA. Identification and mapping of microsatellite markers linked to a root-knot nematode resistance gene (rkn1) in Acala NemX cotton (Gossypium hirsutum L.). Theoretical and Applied Genetics. 2006;112(4):770-777. DOI: 10.1007/s00122-005-0183-0
  94. Wang C, Roberts PA. Development of AFLP and derived CAPS markers for root-knot nematode resistance in cotton. Euphytica. 2006;152(2):185-196. DOI: 10.1007/s10681-006-9197-1
  95. Shen X, Van Becelaere G, Kumar P, Davis RF, May OL, Chee P. QTL mapping for resistance to root-knot nematodes in the M-120 RNR Upland cotton line (Gossypium hirsutum L.) of the Auburn 623 RNR source. Theoretical and Applied Genetics. 2006;113(8):1539-1549. DOI: 10.1007/s00122-006-0401-4
  96. Shen X, Zhang T, Guo W, Zhu X, Zhang X. Mapping fiber and yield QTLs with main, epistatic, and QTL x environment interaction effects in recombinant inbred lines of Upland cotton. Crop Science. 2006;46(1):61-66. DOI: 10.2135/cropsci2005.0056
  97. Rungis D, Llewellyn D, Dennis ES, Lyon BR. Investigation of the chromosomal location of the bacterial blight resistance gene present in an Australian cotton (Gossypium hirsutum L.) cultivar. Crop & Pasture Science. 2002;53(5):551-560. DOI: 10.1071/AR01121
  98. Shi YH, Zhu SW, Mao XZ, Feng JX, Qin YM, Zhang L, et al. Transcriptome profiling, molecular biological, and physiological studies reveal a major role for ethylene in cotton fiber cell elongation. The Plant Cell. 2006;18(3):651-664. DOI: 10.1105/tpc.105.040303
  99. Wu Y, Machado AC, White RG, Llewellyn DJ, Dennis ES. Expression profiling identifies genes expressed early during lint fibre initiation in cotton. Plant & Cell Physiology. 2006;47(1):107-127. DOI: 10.1093/pcp/pci228
