Cotton is one of the most important crops in the world. The genus Gossypium comprises 50 species at two ploidy levels: diploid (2n = 26) and tetraploid (2n = 52). This diversity makes Gossypium an ideal model for studying the evolution and domestication of polyploids. Studies of the origin and evolution of polyploid cotton species are therefore crucial for understanding the pathways and mechanisms of gene and genome evolution. Studies of polyploidization of the cotton genome will also allow QTLs that determine fiber quality to be localized more accurately. Finally, because cotton fibers are single-celled trichomes originating from epidermal cells, they are one of the most favorable model systems for studying the molecular mechanisms that regulate cell and cell wall elongation, as well as cellulose biosynthesis.
- genome evolution
- cotton fiber
- cell elongation
Currently, the cotton (
As in most plants, the evolution of cotton was characterized by repeated cycles of whole genome duplication [1, 6, 7]. In parallel, cytogenetic and genomic diversity arose as cotton spread across the globe, which ultimately led to the appearance of eight groups of diploid (n = 13) species (genome groups A-G and K) [1, 6]. Although different types of polyploidy exist [1, 6], the most common is allopolyploidy, in which two differentiated genomes, usually from different species, are combined in one cell nucleus as a result of hybridization [1, 6].
Allopolyploid duplication of the genome thus leads to numerous molecular genetic interactions, including interlocus concerted evolution, differing rates of genomic evolution, interlocus transfer of genetic material, and possibly changes in gene expression [1, 6]. In addition, allopolyploidy may have stimulated the morphological, ecological and physiological adaptation of cotton through natural selection acting on the higher level of variability that results from duplication of the gene set [1, 6].
For the same reasons, genome duplication may have created new opportunities for cotton improvement through directional selection [7, 8]. Another important aspect of allopolyploidy is that not every allopolyploid strictly corresponds to a simple sum of the ancestral diploid genomes. In some cases, the fusion of two different genomes is accompanied by significant genomic reorganization and, as a result, non-Mendelian inheritance [7, 9].
In view of the above, this chapter attempts to analyze the consequences of polyploid evolution at the genomic, epigenomic and phenotypic levels.
2. Evolution of
According to molecular genetic data, the evolutionary history of cotton spans about 10–15 million years, after the
The evolution studies of the
After the ancestor of the allotetraploid species arose, the initial stage of divergence gave rise to two evolutionary lineages of cotton with AD genomes: the first includes
One of the important evolutionary events for
Subsequent phylogenetic studies have shown that the trait of prolonged trichome elongation first appeared in the A/F-genomes. This was possibly the reason for the domestication of
Moreover, the domestication of cotton species changed not only the length of the fiber but also its chemical composition: the fiber of wild species contains suberin in addition to cellulose, whereas the fiber of cultivated species consists of cellulose only.
Summarizing the information mentioned above, it should be noted that the
3. Mechanisms of polyploidy
Polyploidization of eukaryotic genomes is an important evolutionary event that has had a significant effect on the evolution of plants, including cotton [14, 15, 16]. Polyploids are divided into two large groups: autopolyploids and allopolyploids [17, 18, 19, 20]. The difference between the two groups lies mainly in the type of hybridization: autopolyploids arise through intraspecific hybridization, whereas allopolyploids arise through interspecific hybridization combined with chromosome doubling [17, 20].
Allopolyploids are in turn of two types: true and segmental. True allopolyploids arise from hybridization of distantly related species, whereas segmental allopolyploids arise from hybridization of closely related species with partially differentiated genomes. Segmental allopolyploids can therefore be considered an intermediate type between true allopolyploids and autopolyploids.
In autopolyploids, the presence of more than two homologous chromosomes in the genome may lead to the formation of multivalents during meiosis, which results in polysomic inheritance of traits. In true allopolyploids, by contrast, bivalents are formed, which leads to disomic inheritance. In segmental allopolyploids, univalent, bivalent and/or multivalent chromosome pairing is observed during meiosis.
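The contrast between polysomic and disomic inheritance can be illustrated with a minimal sketch. It assumes a single locus with two copies of allele A and two of allele a, random chromosome segregation, and no double reduction; the function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import combinations, product


def tetrasomic_gametes(chromosomes):
    """Count 2x gamete types when any two of four homologs may co-segregate.

    Simplifying assumption: random chromosome segregation, no double reduction.
    """
    counts = Counter("".join(sorted(pair)) for pair in combinations(chromosomes, 2))
    return dict(counts)


def disomic_gametes(pair_one, pair_two):
    """Count gamete types when each homologous pair forms its own bivalent,
    so every gamete receives one chromosome from each pair."""
    counts = Counter("".join(sorted(g)) for g in product(pair_one, pair_two))
    return dict(counts)


# Autotetraploid AAaa: all four chromosomes are homologous.
auto = tetrasomic_gametes(["A", "A", "a", "a"])
# Allotetraploid: allele A is fixed in one subgenome, allele a in the other.
allo = disomic_gametes(["A", "A"], ["a", "a"])

print("tetrasomic:", auto)
print("disomic:", allo)
```

Under these assumptions the autotetraploid produces three gamete classes, while the allotetraploid produces only heterozygous gametes, which is one way to picture the fixed heterozygosity of allopolyploids.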
The second mechanism is the fusion of unreduced gametes, the basic factor in the natural emergence of polyploidy. The fusion of unreduced gametes may lead to unilateral polyploidization (fusion with a normally reduced gamete) or bilateral polyploidization (fusion with another unreduced gamete).
Unreduced gametes can form as a result of errors during meiosis. Errors during meiosis I (first division restitution, FDR) can result from failure of chromosome pairing in prophase I (synaptene/pachytene) or failure of homologous chromosomes to separate in anaphase I. Errors during meiosis II (second division restitution, SDR) occur in anaphase II, when sister chromatids fail to separate and segregate. Both FDR and SDR double the chromosome set in gametes, resulting in the formation of dyads or triads.
The consequences of polyploidization differ depending on the mechanism of meiotic restitution. After FDR, the heterozygosity of unreduced gametes remains close to the original (parental) level, whereas SDR reduces the heterozygosity of unreduced gametes. The heterozygosity of the resulting polyploids is of decisive importance both in the struggle for survival and under artificial selection.
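The difference between FDR and SDR can be sketched with a deliberately simplified model. It assumes a single chromosome with no crossovers, so that an FDR gamete carries one chromatid from each homolog and an SDR gamete carries two sister chromatids of the same homolog. The parent genotype and function names are hypothetical.

```python
def heterozygosity(genotype):
    """Fraction of loci whose two alleles differ."""
    return sum(a != b for a, b in genotype) / len(genotype)


def unreduced_gamete(parent, mechanism, retained_homolog=0):
    """Build a 2n gamete for one chromosome under a no-crossover model.

    parent: list of (allele_on_homolog_1, allele_on_homolog_2), one per locus.
    FDR keeps one chromatid from each homolog; SDR keeps two sister
    chromatids of a single homolog (index retained_homolog).
    """
    if mechanism == "FDR":
        return [(a, b) for a, b in parent]
    if mechanism == "SDR":
        return [(locus[retained_homolog], locus[retained_homolog]) for locus in parent]
    raise ValueError(f"unknown mechanism: {mechanism!r}")


# Hypothetical parent chromosome: four loci, three of them heterozygous.
parent = [("A", "a"), ("B", "b"), ("C", "C"), ("d", "D")]

h_parent = heterozygosity(parent)
h_fdr = heterozygosity(unreduced_gamete(parent, "FDR"))
h_sdr = heterozygosity(unreduced_gamete(parent, "SDR"))
print(h_parent, h_fdr, h_sdr)
```

In real meioses crossovers soften this contrast: FDR gametes lose some heterozygosity distal to crossovers and SDR gametes retain some, so the model only shows the direction of the effect, not its magnitude.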
Polyploidy has had a significant effect on evolution and speciation by increasing phenotypic variability, heterosis, and resistance to mutations. In evolutionary terms, allopolyploidization (interspecific hybridization) is favored because of its pronounced heterosis effect, which manifests as increased biomass, growth rate, fertility and stress resistance of the resulting hybrids. Thus, in tetraploid cultivated cotton species (
To sum up, polyploidization is a rather widespread phenomenon in plant evolution (polyploid species make up approximately ¼ of all vascular plant species). The occurrence of polyploidy also brings an evolutionary “benefit” to a species, increasing its chances in the struggle for survival.
4. Genomic consequences of polyploidization
The allopolyploid cotton genome cannot be regarded as a simple sum of the A- and D-genomes. Genome duplication has been shown to lead to various molecular genetic interactions, such as interlocus concerted evolution, different rates of genome evolution, interlocus transfer of genetic material and changes in gene expression [1, 6, 17].
In addition, according to the latest molecular data, tetraploid cotton species are at least paleo-octaploids, and diploid species are paleo-tetraploids. For this reason, cotton may be a good model system for studying the consequences of genome polyploidization [6, 9, 25].
In connection with the above, let us review the changes that occurred after polyploidization of the cotton genome.
4.1 Genome stability
Although diploid Gossypium species share the same basic chromosome number (n = 13), genome size varies widely among species, from ~900 Mb in D-genomes to ~2500 Mb in K-genomes [1, 6, 17]. Moreover, analysis of bivalent formation at meiotic metaphase also suggests that diploid cotton species are actually paleopolyploid organisms. A number of studies have also shown that the ancestor of
In this respect, it should be noted that allopolyploidization of cotton was not characterized only by rearrangements at the chromosome level [1, 6]. This assumption has been confirmed by both classical cytogenetic and molecular genetic data [1, 6]. Cytogenetic data show that chromosomes of the A- and D-genomes form fewer bivalents in hybrids of allotetraploids than in hybrids of diploid species [1, 6]. For example, hybrids of allotetraploids form less than one bivalent per cell at meiotic metaphase, while hybrids of present-day A- and D-genome diploids form, on average, 5.8 and 7.8 bivalents [1, 6].
Additionally, analysis of gene order and synteny in the A- and D-genomes and in the allopolyploid genomes (A versus At and D versus Dt) showed a low level of structural chromosome rearrangements, with retention of collinear linkage groups. Along with this, AFLP analysis of nine artificial allotetraploid and allohexaploid cotton species showed significant additivity of genetic loci [1, 6].
Taken together, these facts suggest that stabilization of the cotton genome after polyploidization reorganized the original genomes to such an extent that they were no longer capable of homeologous pairing [1, 6].
Thus, it can be concluded that the cotton genome is quite stable and that, unlike in some other polyploid plant models, genome stabilization is not achieved through structural rearrangements.
4.2 Mobile elements in genome
As mentioned above, genome size differs significantly among cotton species despite the same basic chromosome number [1, 6, 29]. This may be due to the number of mobile genetic elements (MGEs) in the
Moreover, the analysis of the genomes of
It has also been found that, besides changing genome size in various cotton species, MGEs have affected the expression of genes responsible for fiber development [30, 32]. Thus, in the D-subgenome, researchers observed the insertion of the
It has also been suggested that the silencing of CICR (Chinese Institute of Cotton Research) LTR elements had an appreciable effect on the formation of allotetraploid cotton species, because these MGEs occur frequently in the A-subgenomes and are practically absent from the D-subgenomes.
In summary, the presence of mobile elements in the genome, their polymorphism and their frequency probably had a significant influence on cotton evolution. In addition, MGEs are involved in regulating the activity of genes responsible for fiber quality.
4.3 Asymmetric evolution of the genome
As mentioned above, the trait of extended trichome elongation was probably inherited by the allotetraploid AD-genomes from the A-genome. Further evolution of domesticated tetraploids (
In addition, researchers found a greater total extent of rearrangements in the At-subgenome (372.6 Mb) than in the Dt-subgenome (82.6 Mb) through a comparative study of interchromosomal rearrangements and SNP frequency in
These data also show that, owing to genetic redundancy, allotetraploid genomes are under less pressure from stabilizing selection, and that directional selection for fiber quality has a greater effect on the At-subgenome [31, 38].
The asymmetry of these subgenomes is also evident in the types of mutations occurring in the allotetraploid genomes of
Differences between the subgenomes are also manifested in the frequency and activity of MGEs. Two independent research groups have found that the number of MGEs in the At-subgenome exceeds that in the Dt-subgenome [31, 40]. At the same time, the frequency of LTR-
The asymmetry is also manifested in the unequal expression of At- and Dt-homeologs that regulate fiber development in cotton [31, 41, 42, 43]. The expression level of homeologs of some transcription factors (e.g.,
Moreover, the results obtained using the RNA-seq technology on
Thus, the available data point to asymmetric evolution of the allopolyploid cotton subgenomes, with a shift in dominance towards the A-subgenome.
5. Effects of polyploidy on fiber development
Fiber is one of the key factors in the domestication of four
Cotton fiber is essentially an elongated single cell of the seed epidermis (a trichome) with clearly defined developmental stages: fiber initiation, elongation, secondary cell wall biosynthesis and maturation [33, 46, 47]. It first appeared in ancestral diploid cotton with the A-genome after its divergence from the F-genome [1, 6, 48]. Allotetraploid species (AD genomes) have significantly higher fiber quality, which can be explained by the nucleotypic effect that followed allopolyploidization of the A- and D-genomes [48, 49].
Polyploidization has also increased the number of nuclear genes associated with fiber development. For example, a number of studies have shown the content of Malvaceae-specific genes of
Fiber development in cotton is a complex process that depends on the coordinated action of many genes involved in the biosynthesis of polysaccharides, lipids and phytohormones, in the pro- and antioxidant systems and in calcium homeostasis, as well as transcription factor genes (
The difference of gene expression level between
It was also found that fiber development in tetraploids is determined by gene expression in both the At- and Dt-subgenomes [1, 40, 48, 55]. Although the major genes for fiber quality were introduced into allopolyploids from the A-genome, genes in the Dt-subgenome also have a significant effect on fiber development in tetraploid cotton. For example, on the basis of an integrated genetic and physical map of fiber development genes, several researchers have proposed that transcription factors regulating the expression of fiber genes in the At-subgenome are transcribed from the Dt-subgenome [1, 56].
Along with this, another research group has identified 811 positively selected genes (PSG) in
All of these results were confirmed by studies of the functional enrichment of proteins differentially expressed in cotton fiber. The results of the study of the proteome in
The results obtained by genome sequencing of tetraploid
Thus, all of these data show that hybridization of the A- and D-genomes in allopolyploids had a significant effect on fiber development in cotton, owing both to the nucleotypic effect and to changes and differentiation in the expression of homeologs in the At- and Dt-subgenomes. Apparently, At-genes are associated with fiber development, while Dt-genes regulate the activity of At-genes with respect to fiber quality and determine the ability of allotetraploid cotton to adapt to adverse environmental conditions [8, 42, 44].
6. Differential evolution of subgenomes
Following the fusion of two genomes into a single nucleus through allopolyploidy, some genes are expected to acquire mutations and become pseudogenes, while others may diverge and acquire new functions [17, 18, 19]. A priori, however, these and other phenomena affecting the molecular evolution of genes can be expected to be equally distributed between the two allopolyploid genomes. This leads to a useful null hypothesis, namely that the evolutionary rates of nucleotide substitution will be equivalent for duplicated homeologs [17, 18, 19]. A corollary expectation is that both gene copies accumulate intraspecific diversity at equivalent rates. This expectation does not always hold, however, for example when strong directional selection or strong stabilizing selection acts on one gene copy [17, 18, 19].
Even so, this model can be useful for studying the mechanisms underlying differential evolutionary rates or different levels of diversity. For instance, if one homeolog becomes pseudogenized while the other remains under purifying selection, nucleotide diversity can be expected to increase faster at the former locus than at the latter [15, 19]. Having the duplicated genes in the same nucleus makes it easier to separate potentially important genomic forces from population-level factors that can influence diversity patterns, such as the breeding system or effective population size [15, 19]. Since population-level factors are neutral with respect to the two homeologs, observed differences in diversity are almost certainly associated with genetic or genomic processes [15, 17, 18, 19].
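The null hypothesis of equal substitution rates in the two homeologs can be illustrated with an exact two-sided binomial test: if the rates are equal, each observed substitution is equally likely to fall in the At or the Dt copy. The sketch below is an illustration only, not the method used in the cited studies; the substitution counts are invented for the example and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k out of n trials under success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts of substitutions in the two homeologs of one gene.
at_substitutions = 12
dt_substitutions = 30

p_value = binomial_two_sided_p(at_substitutions, at_substitutions + dt_substitutions)
print(f"two-sided p-value under equal rates: {p_value:.4f}")
```

A small p-value would argue against equal rates for that gene pair; with real data, many gene pairs would be tested and a correction for multiple comparisons would be needed.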
In addition, measuring levels of nucleotide diversity provides a direct test of the null hypothesis that nucleotide substitution rates are equivalent for homeologous genes [1, 17, 18, 19]. If evolutionary forces are equal for duplicated genes, mutations should accumulate randomly with respect to homeolog. In a study of allelic polymorphism, the number of alleles detected should therefore be approximately equal for the two gene copies [1, 17, 18, 19, 58]. This approach was used by researchers in the study of the nucleotide sequences of the alcohol dehydrogenase gene (
Thus, these data suggest an increased rate of evolution of the Dt-subgenome of the allopolyploid
7. Conclusion and future prospects
Summarizing the aforementioned, due to
This chapter presents the results of research on the evolution of
Despite the volume of data obtained, many issues in cotton genomics remain unresolved. For instance, the study of subgenome asymmetry using LTR elements will help to clarify the evolution of
In addition, the issues of sub- and neofunctionalization of duplicated genes remain unclear, as well as the mechanism and relationship of epigenetic regulation in asymmetric expression of homeologous genes.
Continued comparative transcriptomic and proteomic studies will also make it possible to distinguish more accurately between the influence of natural and artificial selection on cultivated cotton species. These studies can also provide a good basis for a more complete characterization of the metabolic pathways underlying fiber formation and development.
Research such as genotyping and more accurate assembly of reference genomes, pan-genomic approaches (sequencing of the gene pool of a population), big data analysis, genome editing, de novo domestication and genomic selection, combined with the available data, will allow more efficient development of new cotton varieties with desired properties, as well as the development of personalized farming technologies for this crop.