Molecular characterization studies of the inversion breakpoints in species of the Drosophila genus.
Chromosomal rearrangements, paracentric inversions in particular, are remarkably abundant in the genus Drosophila Fallén, 1832 (Insecta, Diptera). Since different species of this genus are paradigms for genetic, evolutionary, and population studies, analyses of chromosomal inversion polymorphism have provided basic knowledge for fundamental biological questions. Chromosomal inversions suppress meiotic recombination, and thus natural selection can act to preserve favorable gene complexes. Analyses of natural and laboratory populations show that these polymorphisms provide adaptive advantages to their carriers in relation to diverse factors, such as niche exploitation and climate. In addition, due to their monophyletic origin, they also serve as genetic markers for the construction of unrooted phylogenies. With the increasing mastery of molecular techniques and genome sequencing, questions such as the reuse of breakpoints by different inversions and the mechanisms that give rise to these polymorphisms have been explored in greater detail. These analyses show the presence of regions that are hot spots for breakpoints, fitting the fragile breakage model of chromosomal evolution, as well as the involvement of transposable elements in the origin of chromosomal inversions.
- chromosomal evolution
- chromosomal inversion
- polytene chromosomes
- staggered breaks
- transposable elements
Structural chromosome rearrangements originate from chromosomal breaks at different sites, followed by reconstitution of these breaks in a distinct combination. They involve large quantities of genetic material at the cytological level and can be visualized under light microscopy.
The analysis of different rearrangements in the karyotype of the species of this genus was favored by the presence of polytene chromosomes. These chromosomes are formed in interphase nuclei and are the final product of successive replication cycles without the subsequent separation of the daughter chromatids, resulting in a huge structure that presents natural banding, formed by the precise synapsis of parallel chromomeres of the sister chromatids. It is estimated that the polytene chromosomes found in the salivary glands undergo about 10 replication cycles, generating up to 2^10 = 1024 filaments in each chromosomal pair of a diploid cell, which gives them a unique magnitude for visualization. Tissues and organs containing cells with polytene chromosomes are, in general, involved in intense, short-term secretory functions in a fast-growing context. Another peculiarity of the interphase polytene chromosomes is the absence of segregation after replication; the parental chromosomes remain united and paired in a conformation otherwise seen only in meiosis I of most other organisms.
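The figure of 1024 filaments corresponds to ten successive doublings. A minimal sketch of that arithmetic, assuming every cycle doubles every strand and none is lost (the function name `polytene_strands` is illustrative, not taken from the cited works):

```python
def polytene_strands(cycles: int, start: int = 1) -> int:
    """Strands present after `cycles` rounds of replication without separation."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    # Each round doubles the number of strands that stay bundled together.
    return start * 2 ** cycles


for n in (1, 5, 10):
    print(f"{n:2d} cycles -> {polytene_strands(n)} strands")
```

Real salivary gland nuclei may deviate from this idealized doubling, for example through under-replicated regions, so the value is best read as an upper bound.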
The physical structure of the polytene chromosomes enables the accurate analysis of the different chromosomal rearrangements in Drosophila, particularly inversions, the most frequent rearrangement in the genus. An inversion consists of the simultaneous breakage of a chromosome at two sites and the rejoining of the intervening segment in a 180° inverted order.
In diploid organisms, inversions are classified into two types: paracentric (they do not involve the centromere and occur within a single chromosome arm) and pericentric (they include the centromere and therefore involve both chromosome arms). An inversion is heterozygous when only one of the parental chromosomes carries it; during the pairing of the homologous chromosomes in meiosis I, an inversion loop is then formed, allowing the correct pairing of the homologs. It is homozygous when both parental chromosomes carry the inversion. These chromosomal conformations can be visualized in the Drosophila interphase polytene chromosomes (Figure 1).
Compared to other structural chromosomal rearrangements, chromosomal inversions tend to be better tolerated by the organisms that carry them, since, in theory, they imply neither a gain nor a loss of genomic material. An inversion with a breakpoint within a gene, however, can result in a mutation that is often lethal to the organism. Another consequence of an inversion is the position effect: the change in the position of genes relative to each other and to their regulatory sequences, which alters gene expression and, consequently, the phenotype.
The behavior of a heterozygous inversion, and the consequences it may entail, differ between meiosis and mitosis. In meiosis I, a crossover inside a paracentric inversion loop produces a dicentric chromosome (with two centromeres) and an acentric fragment (without a centromere), resulting in gametes with deletions. In contrast, a crossover within a pericentric inversion loop allows normal segregation of the chromosomes during meiosis I, since the centromere is contained in the inversion, but produces gametes with deletions and duplications at the end of meiosis II. During mitosis, a heterozygous inversion does not pose major difficulties for the course of the cycle, since each chromosome duplicates and the sister chromatids are directed to the resulting daughter cells. Illustrations of these events are found in most genetics textbooks, usually in the chapter on structural chromosomal alterations.
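The outcome of a single crossover inside an inversion loop can be traced with a toy model. The sketch below is an illustration under simplifying assumptions: each chromatid is a string of single-letter loci, `o` marks the centromere, and only the two recombinant chromatids of one exchange are followed. The function names and the example gene orders are hypothetical.

```python
CEN = "o"  # centromere marker


def single_crossover(standard: str, inverted: str, left: str, right: str):
    """Recombinant chromatids from one exchange between loci `left` and `right`.

    `left` and `right` must be adjacent in the standard order and both lie
    inside the inverted segment (so they are adjacent, reversed, in `inverted`).
    """
    i = standard.index(left)
    if standard[i + 1] != right:
        raise ValueError("loci must be adjacent in the standard order")
    j = inverted.index(right)
    if inverted[j + 1] != left:
        raise ValueError("loci must lie inside the inverted segment")
    # Start on the standard chromatid, switch at the exchange point, and
    # continue along the inverted chromatid, which runs the other way there.
    first = standard[: i + 1] + inverted[: j + 1][::-1]
    second = standard[i + 1 :][::-1] + inverted[j + 1 :]
    return first, second


def centromere_status(chromatid: str) -> str:
    """Classify a chromatid by its number of centromeres (0, 1 or 2)."""
    return {0: "acentric", 1: "monocentric", 2: "dicentric"}[chromatid.count(CEN)]


# Paracentric inversion of B-C-D; centromere outside the inverted segment.
for product in single_crossover("oABCDE", "oADCBE", "B", "C"):
    print(product, centromere_status(product))

# Pericentric inversion of B-o-C; centromere inside the inverted segment.
for product in single_crossover("ABoCD", "ACoBD", "B", "o"):
    print(product, centromere_status(product))
```

In the paracentric case the two products are a dicentric chromatid and an acentric fragment; in the pericentric case both products keep one centromere but carry a duplication of one chromosome end and a deletion of the other.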
Species of the genus Drosophila are model organisms for the study of chromosomal inversions, given the high resolution of polytene chromosome analysis, coupled with the fact that more than half of the studied species of Drosophila are naturally polymorphic for inversions. However, given the genomic destabilization and the effects on gamete production that inversions can cause, a high incidence of chromosomal polymorphism is not expected a priori in living organisms. The species of the genus Drosophila present a high rate of paracentric inversions without a major deleterious effect on their reproductive success, due to the presence of defense mechanisms in males and females that prevent the production of gametes bearing unbalanced chromosomal rearrangements.
In females of Drosophila melanogaster Meigen, 1830 carrying a heterozygous inversion, a meiotic mechanism selectively eliminates the recombinant chromatids during the formation of the polar bodies. In this mechanism, the first polar body to be excluded receives one of the balanced chromatids (standard or inverted order). The second polar body eliminated receives the dicentric chromosome. The acentric fragment is not oriented on the meiotic spindle and is later degraded. The remaining nucleus, which is the one effectively fertilized, also carries a balanced chromatid in the standard or inverted order.
The mechanism of protection against the production of inviable gametes in males of D. melanogaster seems to be the suppression of recombination in spermatogenesis . Mutations in genes that affect the segregation of chromosomes that did not undergo meiotic exchange in Drosophila females do not have the same effect in males, suggesting that the exchange is not necessary for the correct segregation of homologous chromosomes in meiosis I in males of this genus .
Aside from the inferred suppression of recombination in males, reports of its occurrence at the meiotic level are present in the literature, evidencing some peculiarities. Among these, the high occurrence in males showing the phenomenon of the hybrid dysgenesis of different species stands out. This phenomenon is also characterized by the presence of high frequencies of inviable offspring, mutations, structural chromosomal alterations, and distortion of the rate of transmission of alleles by one sex .
Another peculiarity is the spontaneous occurrence of recombination in males of species with a high degree of polymorphism for paracentric inversions, such as D. melanogaster, Drosophila ananassae Doleschall, 1858, and D. willistoni.
Despite these exceptions, the many cases of multiple heterozygosity found in many species of Drosophila support the great efficiency of these mechanisms and direct us to other biological aspects of these chromosome rearrangements. The purpose of this chapter is to provide a basic overview of the evolutionary basis of the wide occurrence of paracentric inversions in this genus and of the adaptability that this chromosomal polymorphism confers on its bearers, converging on present-day genomic analyses of the mechanisms that originate these inversions.
2. Population studies of chromosomal inversions in the genus Drosophila
The high polymorphism of chromosomal inversions has been used as a model for different adaptive processes involved in the maintenance of genetic variation. The concerns of Theodosius Dobzhansky and collaborators, more than 80 years ago, gave rise to the early analyses of chromosomal inversions in natural populations of Drosophila persimilis Dobzhansky and Epling, 1944 and Drosophila pseudoobscura Frolova, 1929. Their findings stimulated many of the discoveries that constituted the basis of the modern evolutionary synthesis, which combines Charles Darwin's theory of the evolution of species with Mendelian inheritance and population genetics.
The work of Dobzhansky “Genetics and the Origin of Species”  was a great incentive to the development of experimentation in evolutionary and population genetics.
Several experiments with D. pseudoobscura performed by Dobzhansky and colleagues were the basis for the co-adaptation model of the genes contained in inversions. Dobzhansky established that the reduced recombination in the inversions of this species is able to maintain favorable combinations of genes that are in epistasis with the other gene arrangements prevailing in the population. Therefore, the gene complexes linked within an inversion in the different chromosome types are inherited as blocks and are rarely broken up by meiotic recombination. Thus, heterozygosity would be favored over homozygosity, as predicted by balancing selection [13, 14]. Since then, the chromosomal inversion polymorphism of natural populations of other species has been extensively analyzed and characterized. Indirect, statistically based evidence associating chromosomal inversions with better adaptation of the carrier individuals has also been reported.
Drosophila pseudoobscura has a broad geographic distribution in North America, being found from western Canada through the USA to part of Central America, with a subspecies in Colombia (D. pseudoobscura bogotana) and individuals collected in New Zealand (Oceania). The ST, AR, CH, PP, SC, OL, EP, and TL arrangements found on chromosome 3 of this species are extensively monitored and traditionally show altitudinal clines in their frequencies. Among these, the TL inversion has increased in frequency on the Pacific coast since the 1970s, which seems to be related to environmental changes.
Drosophila subobscura Collin, 1936 is a species with high chromosomal inversion polymorphism. Its rearrangements have traditionally been associated with adaptation to environmental variables. This Palearctic species invaded the American continent in the late 1970s and early 1980s. Studies of inversion frequencies in European, North American, and South American populations show an inverse relation: the frequency of inversions typical of low latitudes (hot climate areas) is increasing, while the frequency of inversions typical of high latitudes (cold climate areas) is decreasing. The chromosomal polymorphism of this species has also been related to environmental contamination by heavy metals.
Drosophila buzzatii Patterson and Wheeler, 1942 belongs to the cactophilic species of the repleta group. It is originally from Southern Latin America, and in the 1970s its occurrence was reported in the Mediterranean region, the Canary Islands, equatorial Africa, and Australia, associated with cactus species of the genus Opuntia, which have been disseminated by human interference. In this species, latitudinal clines in the frequency of some inversions have been inferred for the populations of both the original and the colonized areas. The polymorphism of the second chromosome has been related to fitness traits; the 2j arrangement, for example, has been associated with longer development time and with larval viability.
The Neotropical species Drosophila mediopunctata Dobzhansky and Pavan, 1943 belongs to the tripunctata group (the second largest group of Neotropical species) of the subgenus Drosophila. The acrocentric chromosome II of this species is highly polymorphic, with 17 inversions described, which are distributed in the distal (inversions DA, DI, DS, DP, DV, DR, DL, and DJ) and proximal (inversions PC0, PC1, PC2, PC3, PC4, PC5, PB0, PA0, and PA8) regions. Based on the 72 haplotypes already described for this chromosome, it is possible to infer that the inversions of the distal and proximal regions practically do not overlap and that there is strong linkage disequilibrium between them. Thus, the DA inversion is mostly found in association with the PA0 inversion. In the same way, DI is associated with PB0, and DS, DP, and DV with PC0. It is therefore difficult to find one of these distal inversions not associated with the corresponding proximal inversion. These five haplotypes are the most frequent (>90%) in the natural populations of D. mediopunctata from Southeastern Brazil. Since 1980, the inversions of chromosome II of this species have been analyzed as potential bioindicators of genetic responses to environmental changes under the action of natural selection. Collections conducted from 1986 to 1988 and from 1991 to 2002 in different places of Southeastern and Southern Brazil showed that the DA, DP, and DS inversions vary seasonally in frequency: DA increased in dry and cold periods, and DP and DS in rainy and hot periods. In addition, this pattern is related to altitudinal clines. Later collections (2007–2010) at one of the sampled sites (Itatiaia National Park, Rio de Janeiro, Brazil) allowed the mean frequencies of the distal inversions to be compared with the earlier frequencies for this site. It was observed that the mean frequencies of the DA and DI inversions increased, while those of DS, DP, and DV (associated with higher temperatures) decreased, and that the DA inversion no longer shows a significant correlation with altitude. Considering the climatic changes that occurred in the region of Itatiaia Park during these two decades, this suggests that temperature change has little influence on the seasonal changes in the frequencies of inversions in this species. Climate changes may have affected other genetic or morphological features, which may be more directly related to the inversions of chromosome II of D. mediopunctata [23, 24, 25, 26].
Although several characteristics are indirectly associated with the inversions, little progress has been made in defining the genetic and evolutionary basis of these associations. Direct evidence associating chromosomal inversions with selective pressures has emerged with the advancement of molecular techniques and genome sequencing.
Increasing amounts of data tend to confirm the inhibition of the recombination within the inversion area and also in adjacent areas, which is fundamental to the maintenance of the adaptive role. The patterns of linkage disequilibrium (LD) located within these regions reflect the inversion history and the gene flow since its origin [27, 28, 29, 30, 31].
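Linkage disequilibrium between two biallelic markers, for instance the presence or absence of a distal and a proximal inversion, is commonly summarized by D, |D'| and r². The sketch below shows those standard formulas applied to haplotype counts; the counts are hypothetical and serve only to make the example run.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, |D'|, r^2) from the counts of the four two-locus haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes counted")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second locus
    d = p_AB - p_A * p_B     # deviation from random association
    # Largest |D| compatible with the observed allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts: A = distal inversion present, B = proximal inversion present.
d, d_prime, r2 = linkage_disequilibrium(n_AB=45, n_Ab=5, n_aB=3, n_ab=47)
print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

Values of |D'| and r² close to 1 indicate that the two markers are almost always inherited together, which is the pattern expected when recombination between arrangements is suppressed.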
An example comes from the study of the genetic variation and linkage disequilibrium of the cosmopolitan inversion In(3R)P in two D. melanogaster populations from Australia, one from a tropical region (subdivided into individuals with the inversion and individuals with the standard arrangement) and another from a temperate region (whose individuals carried only the standard arrangement). Since its high frequencies are related to higher temperatures, this inversion is known to be associated with climatic adaptation and with the success of an evolutionarily recent migratory event (about 100 years ago) of this species in Australia. The results of this analysis support the hypothesis that the In(3R)P inversion is associated with the capture of locally adapted alleles, which interact substantially with loci outside the inversion. However, it was not possible to clarify whether these alleles act in an additive or an epistatic mode. Interestingly, high levels of LD within the inverted region are also found in the corresponding genomic region of the individuals that carried the standard arrangement in the tropical population, evidencing selection on such loci. Another result was a high differentiation of the genomic region spanned by the In(3R)P inversion between the tropical and the temperate population.
Despite the confirmed association of chromosomal inversions with the maintenance of combinations of alleles that lie within this region, gene recombination in the inverted region of a chromosome is possible because viable recombinant gametes arise through double meiotic recombination within the inverted region and also in consequence of gene conversion .
Predictions of recombination rates in chromosomes carrying a heterozygous inversion, based on two mathematical models (Poisson and counting) and made by Navarro and collaborators, led to three main points: “(i) the shorter the inversion, the greater the reduction of the double meiotic recombination rate; (ii) in short inversions and in regions around the breakpoint, inversion reduces the rate of recombination but does not have the same capacity to prevent gene conversion; (iii) reduction of the recombination rate is not uniform throughout the chromosome, generally reducing the gene flow between different arrangements to near zero close to the breakpoints, and higher recombination rates are found in the central regions of the inversion.” The inversion also influences recombination events in regions outside its limits. All these findings have implications for analyses that use balancer chromosomes [32, 33, 34].
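Point (i) can be illustrated with a deliberately simplified Poisson calculation. The sketch below is not the model of Navarro and collaborators; it only assumes that the number of exchanges within the inverted segment is Poisson distributed, that the mean is proportional to the genetic length of the inversion, and that interference is absent. Under those assumptions, viable recombinants require an even, nonzero number of exchanges.

```python
import math


def p_exactly_two(mean_exchanges: float) -> float:
    """Poisson probability of exactly two exchanges inside the inversion."""
    m = mean_exchanges
    return math.exp(-m) * m ** 2 / 2.0


def p_even_nonzero(mean_exchanges: float) -> float:
    """Poisson probability of 2, 4, 6, ... exchanges inside the inversion."""
    m = mean_exchanges
    # Sum over even k >= 2 of e^-m m^k / k!  =  e^-m (cosh m - 1)
    return math.exp(-m) * (math.cosh(m) - 1.0)


# Hypothetical inversion lengths in centimorgans; mean exchanges = cM / 100.
for length_cm in (5, 10, 20, 40):
    m = length_cm / 100.0
    print(f"{length_cm:3d} cM: P(double) = {p_exactly_two(m):.5f}, "
          f"P(even, nonzero) = {p_even_nonzero(m):.5f}")
```

For short inversions the probability scales roughly with the square of the length, so halving the inversion reduces double exchanges by about a factor of four.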
It should be noted that a fraction of these chromosomal polymorphisms occurring in the different species is adaptively neutral, and thus suffer less selective pressure (or none), and its fixation, or loss, depends on population size and migration. These inversions can also reach high frequencies through other mechanisms, such as the inversion In(1)Be of the X chromosome of D. melanogaster. This inversion, considered of recent origin, has its maintenance probably due to the distortion of the transmission ratio through males of the species .
Despite the wide acceptance and diffusion of the co-adaptation model of the genes contained in the inversions of Drosophila, alternative hypotheses point to different scenarios for the propagation and distribution of chromosomal inversions in populations, as a result of growing knowledge and the mastery of improved analysis techniques [33, 34, 36].
3. Inversion breakpoints in Drosophila: chromosomal distribution
Parallel to the evolutionary-population studies of chromosomal inversions in Drosophila, the concern about the cause and origin of these polymorphisms in populations was already present.
Krimbas and Powell gave a clear statement of the traditional view on the genesis of inversions: “It is that they are the result of two independent breaks, occurring at the same time, followed by the reconnection of the broken parts of the chromosome in an inverted orientation with respect to neighboring regions. Thus, the multiple overlapping inversions found in many Drosophila species would have occurred sequentially, not due to the simultaneous occurrence of multiple breaks. Regarding tandem inversions (side-by-side inversions), the coincidence of breakpoints is attributed to chance, in events that occurred at different times. The hypothesis of the unique origin of the inversion is reinforced by the rarity of a chromosomal inversion event. It is even rarer that two events originating the same inversion occur spontaneously at the same time in the same chromosome site [37, 38].”
The monophyletic origin of the inversions implies that different rearrangements in the same chromosome can clarify some aspect of the evolutionary history of the analyzed species (or distinct species, when inter-crossings are possible), establishing the inversions as genetic markers for the reconstruction of unrooted phylogenies [39, 40].
“The first genetic dataset used for phylogenetic construction were the inversions of the chromosome 3 of D. pseudoobscura.” For this, the karyotype of a given population of this species was arbitrarily taken as the standard arrangement and named ST. Crosses of males collected in the wild (as well as male offspring of the collected females) with females of the ST lineage revealed the differences in chromosomal arrangement between the populations through the formation of inversion loops in the F1 offspring. This comparative methodology allowed the different triads of overlapping heterozygous inversions to be related in an unrooted phylogenetic tree. Based on this, a hypothetical central arrangement in the phylogeny was inferred, which has never been found in nature in later works. The key assumption of this analysis was that all copies of a particular inversion have a unique origin, the arrangement seen in the individuals of a population being a replica of the single arrangement that arose in the past in a single common ancestor; in other words, its monophyletic origin. Later, molecular phylogenies corroborated the unique origin of the inversions of chromosome 3 in D. pseudoobscura, and the topology of the molecular phylogeny agrees with the topology obtained from the cytogenetic data.
The analysis of the phylogenetic relationships of overlapping inversions takes the most parsimonious route (the one requiring the smallest number of inversions) for the evolutionary inference. Phylogenies were constructed for various species groups, such as melanogaster, cardini, Hawaiian Drosophila, virilis, the fasciola subgroup, and the willistoni subgroup, among others.
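The parsimony criterion can be made concrete by searching for the smallest number of inversions that converts one arrangement into another. The sketch below is a brute-force illustration, not the procedure of the cited studies: arrangements are strings of single-letter bands, band orientation is ignored, and a breadth-first search explores all single-inversion steps. It is only practical for a handful of bands, and the example arrangements are hypothetical.

```python
from collections import deque


def single_inversions(order: str):
    """Yield every arrangement reachable by inverting one segment of >= 2 bands."""
    n = len(order)
    for i in range(n - 1):
        for j in range(i + 1, n):
            yield order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def min_inversions(start: str, target: str) -> int:
    """Fewest inversions turning `start` into `target` (breadth-first search)."""
    if sorted(start) != sorted(target):
        raise ValueError("arrangements must contain the same bands")
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        order, steps = queue.popleft()
        if order == target:
            return steps
        for nxt in single_inversions(order):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    raise RuntimeError("target not reachable")  # cannot happen for permutations


standard = "ABCDEF"
print(min_inversions(standard, "ADCBEF"))  # one inversion (B-D)
print(min_inversions(standard, "ADEBCF"))  # two overlapping inversions
```

Pairwise distances of this kind are the raw material for linking arrangements into an unrooted network in which adjacent arrangements differ by a single inversion.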
According to the traditional view of the genesis of inversions, inversions are distributed randomly along the chromosomes. Sometimes this seems to fit the chromosomal distribution of the arrangements of chromosome 3 of D. pseudoobscura in natural populations, and sometimes it does not. The inversion breakpoints induced by X-rays in Drosophila (and in many other organisms) seem to cluster preferentially in regions closer to the centromere [37, 49]. Added to this postulate is the random breakage model of evolution, which gained prominence in the 1980s with genomic comparisons, mainly between human and mouse, later extended to other mammals. This model, in a simplified way, assumed that the chromosome rearrangements responsible for the breakdown of synteny between these organisms had their breakpoints distributed randomly along the chromosomes [50, 51, 52].
However, increasingly consistent studies showing repeated breaks at the same site for different inversions in a considerable number of species have raised doubts about the randomness of the breakpoint distribution. These sites were named “hot spots” and may involve particular structural instabilities of these regions.
The availability of the complete genomes of human and other mammals exposed the limitations of the random breakage model, which had not considered countless regions of these genomes because they were not yet available. The analysis of 281 syntenic blocks of at least 1 Mb shared between human and mouse revealed 190 additional blocks of less than 1 Mb in size, which were very difficult to identify by alignment and were totally unknown until then. The comparison of the chromosomal rearrangements that occurred during the divergence of the two species showed a large number of breakpoints close to each other. This feature did not fit the random breakage model, so the fragile breakage model was proposed [53, 54].
This model is based on the inference that the breakpoints of chromosome rearrangements occur mainly within fragile sites of the genome (hot spots), in other words, regions prone to breakage. These fragile sites may correspond to regions rich in transposable elements (TEs), to segmental duplications, or to palindromic sequences. “The reuse feature does not imply the use of the same genomic position (at the nucleotide level) repeatedly, but rather that, the breakpoint presents multiple genomic regions that originate chromosomal rearrangements [53, 54].”
Pioneering cytological results on the reuse of breakpoints by different inversions challenged the randomness of these breaks in Drosophila. Cáceres et al. analyzed 86 paracentric (heterozygous and fixed) inversions described for species of the D. buzzatii complex and 18 inversions induced in D. buzzatii by introgression, through crosses with Drosophila koepferae Fontdevila and Wasserman, 1988. The authors found that inversions of intermediate size are the most successful in reaching fixation in these species. They also observed that the distribution of the breakpoints of chromosome 2 inversions of these species, taking into account the location of the band involved in the break, is not random. The authors found up to eight breakpoints at the same band in certain chromosomal segments. Similar results were observed in D. subobscura, Hawaiian Drosophila, and D. willistoni.
Although the intraspecific and interspecific reuse of inversion breakpoints is common and well documented at the cytological level in the genus Drosophila, its characterization at the DNA sequence level is still limited [59, 60, 61]. In silico comparisons of the available whole genomes of different species estimate that breakpoints have been reused between 1.5 and 2.27 times throughout the evolutionary history of the species of this genus.
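One way to judge whether eight breakpoints in a single band are compatible with random breakage is to ask how often such a pile-up would occur by chance. The sketch below assumes, purely for illustration, that all bands are equally likely to break and that breakpoints are independent; the numbers of breakpoints and bands are hypothetical and are not taken from Cáceres et al.

```python
from math import comb


def p_band_at_least(k: int, n_breaks: int, n_bands: int) -> float:
    """P(a given band receives >= k of n_breaks breakpoints) under uniform breakage."""
    p = 1.0 / n_bands
    return sum(
        comb(n_breaks, x) * p ** x * (1.0 - p) ** (n_breaks - x)
        for x in range(k, n_breaks + 1)
    )


def expected_bands_at_least(k: int, n_breaks: int, n_bands: int) -> float:
    """Expected number of bands carrying >= k breakpoints under the same model."""
    return n_bands * p_band_at_least(k, n_breaks, n_bands)


# Hypothetical example: 100 breakpoints scattered over 200 equally sized bands.
print(expected_bands_at_least(2, 100, 200))
print(expected_bands_at_least(8, 100, 200))
```

If the expected number of bands with eight or more breakpoints is far below one, observing such a band argues against the uniform model. Real bands differ in length, so a careful test would weight each band by its size.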
4. Characterization of inversion breakpoints in Drosophila and origin mechanisms
The delimitation and characterization of inversion breakpoints are fundamental to determine the mechanisms that originate them. In Drosophila, two main mechanisms have been highlighted in the origin of chromosomal inversions.
The first mechanism is non-allelic homologous recombination (NAHR, also called ectopic recombination) between repetitive sequences, especially TEs [65, 66, 67]. The molecular machinery used by this mechanism is the same as that of allelic recombination, which is directly involved in genetic recombination in meiosis I. When ectopic recombination occurs between two copies of a repetitive sequence (very similar or identical) that are located at different chromosomal sites and in opposite orientations, the resulting inverted chromosome segment is flanked by two copies of these sequences, which are chimeric due to the exchange between them [66, 67]. The minimum length of sequence identity required for efficient recombination is called the minimal efficient processing segment (MEPS). This parameter is not yet satisfactorily elucidated: in vitro analyses with prokaryotic organisms and mammalian cells infer that the efficient MEPS for NAHR is 50 bp and between 270 and 280 bp, respectively. However, a genomic analysis of NAHR between copies of Ty retrotransposons in Saccharomyces cerevisiae Meyen ex E. C. Hansen, 1883 indicates that the genomic distance between the copies of TEs is more important than the identity between them. Figure 2 shows a schematic of this mechanism. It is important to note that when NAHR involves transposable elements, their target site duplications (TSDs) are also exchanged during recombination, a feature that has been very relevant for the recognition of this mechanism (see Section 4.1). For a long time, TEs were considered junk DNA, and their involvement in the genesis of inversions in the genus Drosophila provides solid support for the participation of these sequences in shaping the genomes of living organisms.
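The signature left by NAHR between inverted TE copies can be followed in a small token model. In the sketch below each chromosome is a list of named blocks, a leading `-` marks reverse orientation, each TE copy is split into a 5′ and a 3′ half at the exchange point, and `tsdA` and `tsdB` stand for the target site duplications of the two copies. All names are illustrative.

```python
def flip(token: str) -> str:
    """Reverse the orientation mark of one block."""
    return token[1:] if token.startswith("-") else "-" + token


def invert(chrom: list, i: int, j: int) -> list:
    """Invert blocks i..j (inclusive): reverse their order and flip each one."""
    return chrom[:i] + [flip(t) for t in reversed(chrom[i:j + 1])] + chrom[j + 1:]


# TE copy 1 in direct orientation, TE copy 2 in reverse orientation.
before = ["L", "tsdA", "TE1_5p", "TE1_3p", "tsdA",
          "M",
          "tsdB", "-TE2_3p", "-TE2_5p", "tsdB", "R"]

# Exchange between the homologous points of the two copies inverts
# everything lying between those points.
i = before.index("TE1_3p")
j = before.index("-TE2_3p")
after = invert(before, i, j)
print(after)
```

After the exchange, each TE copy is chimeric (the 5′ half of one copy joined to the 3′ half of the other) and each copy is flanked by one `tsdA` and one `tsdB` instead of a matching pair, which is the exchanged-TSD pattern used to recognize this mechanism.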
The second mechanism is the erroneous repair, by non-homologous end joining (NHEJ), of the free ends resulting from staggered chromosomal breaks. Because the two single-strand breaks of a staggered break lie physically close to each other, the base pairing between them fails and the chromosomal regions separate. The inversion results from joining the 5′ end of one breakpoint to the 3′ end of the other [60, 70]. Duplicated DNA segments in opposite orientations, delimiting the inverted chromosome segment, are the result of the repair and the main hallmark of this mechanism. In Figure 3 it is possible to see that staggered breaks occurred on both sides, duplicating two sequences that were originally single copy. Based on the same figure, it is also possible to extrapolate to a staggered break on only one side and a simple break on the other. The result is the duplication of just one originally single-copy segment flanking the inversion. These duplicated sequences may involve genes. Gene duplication has been implicated as one of the main sources of genome evolution. The duplicate copy is often free of selective pressure and thus mutates more rapidly than other, essential regions of the genome. This may result in new gene functions, which is considered one of the most important outcomes of these duplication events. Thus, the repair of the free ends of staggered breaks by NHEJ gives rise to two different structural rearrangements: a chromosomal inversion and a duplication. Although in this case the duplications are small compared to the inversions, when they involve genes they can also provide genomic variability in populations and act on adaptive processes, speciation, and chromosome evolution.
The relative contribution of these two mechanisms is not completely clarified, and intriguing questions, such as “whether these mechanisms are generalized among species of the genus and whether there are functional implications through the chromosomal evolution maintained by these inversions”, remain open.
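The duplication signature of staggered breaks can be traced with the same kind of token model. The sketch below assumes that the stretch lying between the two nicks of each staggered break (`a` on the left, `b` on the right) is filled in on both resulting ends before the ends are rejoined; an empty list stands for a simple, blunt break. Function and block names are illustrative.

```python
def flip(token: str) -> str:
    """Reverse the orientation mark of one block."""
    return token[1:] if token.startswith("-") else "-" + token


def reverse_complement(segment: list) -> list:
    """Reverse block order and flip each block, as an inversion does."""
    return [flip(t) for t in reversed(segment)]


def staggered_break_inversion(left, a, middle, b, right):
    """Arrangement after staggered breaks at `a` and `b`, fill-in, and NHEJ.

    After fill-in, the outer ends keep one copy of `a` and `b`, and the
    excised fragment (a + middle + b) carries a second copy of each.
    """
    return left + a + reverse_complement(a + middle + b) + b + right


both_sides = staggered_break_inversion(["L"], ["a"], ["M"], ["b"], ["R"])
one_side = staggered_break_inversion(["L"], ["a"], ["M"], [], ["R"])
blunt = staggered_break_inversion(["L"], [], ["M"], [], ["R"])
print(both_sides)  # a and b each duplicated in opposite orientations
print(one_side)    # only a duplicated
print(blunt)       # simple breaks: inversion without duplication
```

The inverted duplications of `a` and `b` at the two ends of the inverted segment are the mark that distinguishes this origin from NAHR, where the flanking repeats already existed before the inversion.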
Table 1 presents a compilation of the different studies that characterized the inversion breakpoints at the molecular level in different species of the Drosophila genus. As can be seen, the origin of different inversions, besides being via NAHR between TEs and other repetitive sequences, and staggered breaks followed by NHEJ, is also via simple breaks and repair. The breakpoint analysis does not always allow us to infer the probable origin of the inversions, a point that may be related to the antiquity of the inversion genesis, implying a greater amount of modifications in these regions, and loss of the signals that point to their origin mechanisms.
|Species||Chromosomal inversion||Breakpoints description and mechanism of chromosomal inversion genesis|
|D. melanogaster||In(3R)P||Analysis by microdissection and sequencing of the inversion region in the chromosome. Absence of repetitive sequences at the breakpoints .|
|D. melanogaster x|
|Fixed inversion in the X chromosome of D. subobscura||Sequences of approximately 30–50 bp rich in thymines flanking the breakpoints .|
|D. melanogaster||In(2L)t||Analysis of the proximal breakpoint and presence of a TE – LINE .|
|D. buzzatii||2j||Presence of homologous copies of a TE denominated Galileo at the breakpoints, and origin of the inversion by NAHR between inverted copies of this TE .|
|D. buzzatii||2q7||Presence of homologous copies of a TE denominated Galileo at the breakpoints, and origin of the inversion by NAHR between inverted copies of this TE .|
|D. pseudoobscura||Arrowhead||Presence of 128 and 315 bp repetitive motifs in opposite orientation at the breakpoints of the inversion. Origin of the inversion by NAHR between the inverted copies of these repetitions .|
|D. melanogaster||In(3R)Payne||Small duplications in both breakpoints of the inversion .|
|D. melanogaster x D. simulans x||29 inversions||17 (59%) of the inversions presented inverted duplications at the breakpoints, including the In(3R)84F1;93F6–7 inversion, which traditionally differentiates the karyotype of D. melanogaster and D. simulans. Origin of these inversions by the staggered breaks mechanism.|
|D. americana||In(4)a||Repetitive sequences in opposite orientation of a MITE element in both breakpoints of the inversion .|
|D. mojavensis x||Inversion in the X chromosome||Absence of repetitive sequences at the breakpoints of the inversion.|
|D. pseudoobscura x D. persimilis||Inversion in the X and II chromosomes||Tandem repeats of a 319 bp motif at the breakpoints of the inversion in the XR arm of D. persimilis.|
|D. buzzatii||2z3||Presence of homologous copies of the TE GalileoN at the breakpoints and origin of the inversion by NAHR between the inverted copies of this TE .|
|D. buzzatii||5g||Absence of significant repetitive sequences at the breakpoints.|
|D. mojavensis||Xe||Absence of significant repetitive sequences at the breakpoints. Probable origin by single breaks .|
|D. americana x D. virilis||Inversions (In)Xa and (In)5a||Presence of copies of the MITE DAIBAM at the breakpoints of the inversions in D. americana. Origin of the inversions by NAHR between the inverted copies of this TE.|
|D. buzzatii||Inversions 2m and 2n||2m inversion with 13 Kbp duplications in both breakpoints; origin of the inversion by staggered breaks |
|D. melanogaster||Inversions In(2L)t, In(2R)NS, In(3R)K,||Presence of inverted duplications at the breakpoints of the In(2R)NS, In(3R)K, In(3R)P, In(1)A, In(1)Be inversions.|
|D. mojavensis||Inversions 2c, 2f, 2g, 2h, 2q, 2r and 2s||Presence of copies of the TE BuT-5 at both breakpoints of the 2s inversion; origin of this inversion by NAHR between the inverted copies of this TE. Presence of inverted duplications at the breakpoints of the 2h and 2q inversions; origin of these inversions by staggered breaks.|
|D. subobscura||O3||300 bp sequence in both breakpoints; origin of the inversion by staggered breaks .|
|D. subobscura||Inversions E1 and E2||Probable origin of the E1 inversion by staggered breaks and duplication of a region with approximately 400 bp, named β motif; origin of the E2 inversion by NAHR between α motifs (~ 700 bp) .|
|D. subobscura||Inversions E9 and E3||Presence of duplicated region (~8 Kbp) at the breakpoints of the E9 inversion and of the duplicated region (~3.5 Kbp) at the breakpoints of E3 inversion. Origin of the inversions by staggered breaks .|
|D. subobscura||E12||Presence of the Ugt58Fa gene at both breakpoints. Origin of the inversion by staggered breaks.|
|D. subobscura||Inversions O4 and O8||Duplications in both breakpoints of the inversions; origin by staggered breaks .|
4.1. Involvement of the transposable elements at the origin of the inversions: non-allelic homologous recombination
Transposable elements are interesting and dominant components of prokaryote and eukaryote genomes, meaning that the comprehension of their biology is a fundamental subject in genetics. Since their discovery by McClintock, much has been learned regarding the molecular properties of the TEs and their contribution to the genome configuration of living beings.
These elements are classified according to their characteristics and transposition mode. Class I elements, also called retrotransposons, replicate through a “copy and paste” method that involves the production of an RNA intermediate, which is reverse transcribed to DNA and re-inserted in the genome. The retrotransposons subdivide into elements with Long Terminal Repeats (LTRs), for example, the copia and gypsy elements in Drosophila, which are similar to retroviruses; and retrotransposons without LTRs, such as Long Interspersed Nuclear Elements (LINEs) and Short Interspersed Nuclear Elements (SINEs), the latter of which do not encode their own reverse transcriptase. These non-LTR elements are also called retroposons [89, 90].
Class II elements, or DNA transposons, generally move by a “cut and paste” mechanism, in which the elements are physically excised from the genome and inserted into another site. In this case, the number of copies increases when the host repairs the excision site of the DNA transposon during DNA synthesis, or when the TE inserts into a genome site that has not yet been replicated [90, 91]. Among Class II elements there is also a group of non-autonomous elements called MITEs (Miniature Inverted-repeat Transposable Elements). These elements are short sequences with several copies in the genome and without coding capacity, as exemplified by the Mar element, which seems to be restricted to the D. willistoni subgroup.
The TEs of both classes are further classified into Subclass, Order, Superfamily, Family, and Subfamily, based on shared structures and sequence similarities.
Studies associating TEs with the breakpoints of chromosomal rearrangements in the Drosophila genus began mostly with the analysis of lineages presenting the hybrid dysgenesis syndrome. This syndrome is caused by crossing certain lineages of Drosophila and is characterized by high mutation rates in germ cells, causing a high frequency of inviable offspring, recombination in males, mutations, and structural chromosomal abnormalities. The cause of hybrid dysgenesis has been attributed to the activation of several TE families, including the P, I, and hobo elements in D. melanogaster and Penelope, Ulysses, Helena, and Telemac in Drosophila virilis Sturtevant, 1916. Subsequently, studies involving programmed crosses, together with cytogenetic and molecular analyses of the offspring, followed the movement of the TEs involved and the appearance of chromosomal rearrangements associated with this movement.
The association of TE insertions, at the cytological level, with inversion breakpoints in natural populations of Drosophila has also been reported. Notable examples are the analyses of the transposon hobo in D. melanogaster, the P element in D. willistoni, and the retroelements Penelope and Ulysses in the D. virilis species group [98, 99].
The first analysis that directly demonstrated the involvement of a TE at the origin of an inversion in a natural Drosophila population was made by Cáceres et al. This study analyzed the breakpoints of the polymorphic inversion 2j (of the second chromosome) of the species D. buzzatii (subgenus Drosophila, repleta group), which originated from the 2st (standard) chromosome arrangement. For the analysis, the breakpoints of the 2j inversion were delimited by chromosome walking, cloned, and sequenced in two lineages of D. buzzatii that were homozygous for the 2st (lineage st-1) and 2j (lineage j-1) arrangements. For organization purposes, the breakpoints were designated AB and AC (distal breakpoint), and CD and BD (proximal breakpoint), in the 2st and 2j lines, respectively. Sequencing and alignment of these regions in both lineages showed large insertions at the two inversion breakpoints, which were not present in the 2st standard arrangement. The insertion between A and C had 392 bp with long inverted terminal repeats (ITRs) of 106 bp. The insertion between B and D had 4319 bp, with ITRs as those of the 106 and 47 bp AC inserts. The central 180 bp of the AC insert and the BD sequence shared 95% identity but were in opposite orientations. Sequences of 7 bp, separated and inverted, flanked each insert and resembled target site duplications (TSDs), which are the result of the TE insertion event. These characteristics indicated that inversion 2j was generated by intrachromosomal pairing and recombination between two homologous sequences inserted at distant sites and in opposite orientations. The original inserts were homologous over approximately 274 bp, a length sufficient to sustain NAHR in Drosophila. These insertions at the breakpoints of inversion 2j were characterized as copies of a Class II TE, which was named Galileo.
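The NAHR model described for inversion 2j can be pictured with a block-level sketch. This is a simplified illustration under stated assumptions, not a model of the actual Galileo sequences: the labels A, B, C, and D follow the breakpoint designations used above, while `TE`, `TE'`, and the function `nahr_inversion` are hypothetical names, and the chimeric structure that recombination produces inside the two TE copies is ignored.

```python
def flip(block):
    """Toggle the orientation mark: "B" <-> "B'"."""
    return block[:-1] if block.endswith("'") else block + "'"


def invert(segment):
    """Reverse the block order and flip the orientation of every block."""
    return [flip(block) for block in reversed(segment)]


def nahr_inversion(chrom, te="TE"):
    """Invert the segment lying between a TE copy and its inverted copy.

    Assumes exactly one forward copy (te) followed, further along the
    chromosome, by one inverted copy (te + "'").
    """
    i, j = chrom.index(te), chrom.index(te + "'")
    if i >= j:
        raise ValueError("forward copy must precede the inverted copy")
    return chrom[:i + 1] + invert(chrom[i + 1:j]) + chrom[j:]


# Standard arrangement after the two TE insertions: junctions A-B and C-D.
with_insertions = ["A", "TE", "B", "C", "TE'", "D"]
inverted = nahr_inversion(with_insertions)
print(" ".join(inverted))  # A TE C' B' TE' D
```

After recombination the TE copies sit between A and C and between B and D, which matches the AC and BD breakpoint junctions described for the 2j arrangement.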
Subsequently, the Galileo element was classified as a member of the P Superfamily of Class II transposons and subdivided into three subfamilies: GalileoG (Galileo), GalileoN (Newton), and GalileoK (Kepler). This family was also implicated in the origin of two more polymorphic inversions of chromosome 2 of D. buzzatii: 2q7 and 2z3. These studies showed, through cytological, molecular, and in silico analyses, that these inversions originated by NAHR between two copies of the TE Galileo present at their breakpoints.
Still with respect to inversion 2j of D. buzzatii chromosome 2, its effect on the CG13617 gene was analyzed. This gene was chosen because it is very close to the proximal breakpoint of this inversion (12 bp), and embryos of lineages homozygous for the 2j arrangement show expression five times lower than standard lineages without the inversion. Based on the characterization of this region in the D. buzzatii genome and the analysis of mRNA levels, the authors proposed that the TE called Kepler gives rise to an antisense RNA, which forms a complex with the mRNA of the CG13617 gene and inactivates it through post-transcriptional regulation. The Kepler TE is inserted adjacent to the proximal breakpoint in the lineages that carry inversion 2j and is not found in this same region in the lineages without the inversion. The results of this study show a scenario of interaction between the antisense RNA and the CG13617 gene via a position effect. “Thus, the silencing of the CG13617 gene is not due to the influence of the inversion 2j itself, but rather due to the performance of sequences associated with them.”
There are also analyses of the fixed inversions 2m and 2n in D. buzzatii, which are arranged in tandem and share the central breakpoint at the cytological level. The delimitation and molecular characterization of the breakpoints were based on the genomic library of bacterial artificial chromosomes (BACs) and the physical map of this species, and on the genome of the related species Drosophila mojavensis Patterson, 1940, which does not exhibit these inversions. Chromosome walking by in situ hybridization, using BACs as probes, made it possible to establish which clones contained the regions of the three breakpoints in D. buzzatii (breakpoints called AC, BE, and DF, with the direction from left to right running from the telomere to the centromere). These positive BACs had their terminal portions sequenced, and these sequences served as a basis for delimiting the breakpoints (called AB, CD, and EF, although not fully representative of the ancestral karyotype) in the genome of D. mojavensis. Subsequently, probes based on this genome were physically mapped on the polytene chromosomes of D. buzzatii, thus allowing the delimitation of the three breakpoints of the 2m and 2n inversions. The comparison of these regions at the molecular level presented a very complex scenario. Small fragments of the BuT-5 TE were found at both breakpoints of the 2n inversion (breakpoints BE and DF), which may indicate its probable origin by ectopic recombination between these copies. However, due to the age of the inversion, this assumption cannot be strongly supported, since these regions have already undergone many modifications and the TSDs have not been found. On the other hand, the 2m inversion (AC and BE breakpoints) is flanked by ~13 Kbp duplications, which contain the CG4673 gene. Thus, its most probable origin is via staggered breaks followed by NHEJ (see Section 4.2).
There is an extensive analysis of the mechanisms of origin of fixed inversions in Drosophila mojavensis, another representative of the repleta group. This species is the only representative of the mulleri complex that inhabits the Sonoran Desert, one of the most arid known environments, with quite peculiar fauna and flora. The analysis of the chromosomal evolution of D. mojavensis shows 10 inversions fixed along its evolution in relation to the primitive arrangement I of the repleta group: one on chromosome X (Xe), seven on chromosome 2 (2c, 2f, 2g, 2h, 2q, 2r, and 2s), and two on chromosome 3 (3a and 3d) [83, 103]. The molecular characterization of the breakpoints of the seven inversions of chromosome 2 of this species was carried out by end sequencing of chromosome 2 clones from the BAC genomic library of D. buzzatii. Subsequently, these sequences were mapped in the genome of D. mojavensis and compared with the genome of D. virilis (an outgroup species whose karyotype lacks these inversions). The breakpoints of the 2c, 2r, and 2s inversions showed copies of TEs flanking both sides of the inversion. However, the 2s inversion stood out due to the presence of the BuT-5 transposon at its breakpoints. The distal copy had 981 bp delimited by the 9 bp sequences AAGGCAAGT and CTGTATAAT. At the proximal breakpoint, the copy of the BuT-5 TE was a 27 bp fragment, with 12 bp identical to one end and the remaining 15 bp identical to the other end of this TE, delimited by the 9 bp sequences ACTTGCCTT and ATTATACAG. The sequences ACTTGCCTT and CTGTATAAT are the inverted complementary sequences of AAGGCAAGT and ATTATACAG, respectively, and constitute the TSDs derived from the insertion of the element. These characteristics indicate that the 2s inversion of D. mojavensis originated by ectopic recombination between the two copies of the BuT-5 TE. Functional inference of this inversion in the D. mojavensis genome indicates that the proximal copy of the BuT-5 TE acts on the promoter of the Dmoj\CG10375 gene (which is probably related to the Hsp40 gene family). In silico analyses show that the 2s inversion and the proximal copy of the BuT-5 TE increase the expression of this gene and may have direct implications for the regulation of thermotolerance in this species.
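The relationship between the four 9 bp sequences reported for the 2s breakpoints can be checked directly. The short sketch below uses only the sequences quoted above; the helper name `reverse_complement` and the variable names are introduced here for illustration.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# 9 bp sequences delimiting the BuT-5 copies, as quoted in the text.
distal_flanks = ("AAGGCAAGT", "CTGTATAAT")
proximal_flanks = ("ACTTGCCTT", "ATTATACAG")

first_pair_matches = reverse_complement(distal_flanks[0]) == proximal_flanks[0]
second_pair_matches = reverse_complement(proximal_flanks[1]) == distal_flanks[1]
print(first_pair_matches, second_pair_matches)  # True True
```

Both comparisons hold, which is what is expected if the two flanking duplications created by a single insertion were separated and one of each pair was inverted by the rearrangement.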
Another species that clearly presents the involvement of TEs in the genesis of its inversions is Drosophila americana Spencer, 1938 (subgenus Drosophila, virilis group). The neo-X chromosome of this species is derived from a centromeric fusion, segregating in this species, between the X chromosome (Muller element A) and chromosome 4 (Muller element B). This chromosomal fusion is positively correlated with latitude and carries the polymorphic In(4)a inversion. In addition, the arrangement of D. americana chromosome 4 is homosequential to the arrangement of the same chromosome in D. virilis, a related species whose genome has been sequenced. Thus, the D. virilis genome served as the basis for the design of the analysis, together with the construction of a BAC genomic library of D. americana. The analysis of the In(4)a inversion of the neo-X indicated its probable origin by ectopic recombination between two copies of a repetitive MITE element, which is widely dispersed in the genome of D. virilis. These same sequences were not present in the corresponding region in strains of the species without the inversion (as analyzed by PCR) or in D. virilis. The characteristic of this repetitive sequence that supports its identity as a TE is the presence of 240 bp TIRs flanking an internal region of 869 bp. Comparisons of the multiple copies present in the genome of D. virilis with the sequences found at the breakpoints in D. americana indicate that the copy present at the proximal breakpoint is a canonical element, whereas the copy present at the distal breakpoint is a rearranged element. From the functional point of view, the proximal breakpoint of this inversion presents allelic associations consistent with co-adaptation.
Subsequently, low-coverage genome sequencing of two strains of D. americana allowed the analysis of the Xa inversion, fixed in D. americana and absent in D. virilis, and of the polymorphic 5a inversion in D. americana. The alignment of the breakpoints of both inversions between the two species indicated that, in the regions where the alignment was disrupted, there was always a sequence varying between 500 and 1130 bp, present only in the lineages carrying the inversions. By BLASTN, these sequences showed high similarity to an incomplete MITE sequence, with TIRs of 240 bp. In this study, the authors named it DAIBAM (Drosophila americana Inversion Breakpoints Associated MITE). In the Xa inversion, it was possible to find clear TSDs and defective copies of the TE DAIBAM flanking the inversion. In the 5a inversion, the copies of the DAIBAM element flanking the inversion had more than 70% nucleotide similarity. Considering that the TE DAIBAM copies are defective and that the analyzed inversions are old, the authors infer that the data support the origin of inversions Xa and 5a by ectopic recombination between the DAIBAM elements present at the breakpoints of these inversions. It was also found that this element was the same as that present at the breakpoints of inversion In(4)a. Thus, the DAIBAM element is involved in the origin of at least 20% of the inversions occurring in the virilis group [77, 81].
4.2. Inversion origin via staggered breaks and repair by non-homologous end joining
It has now been established that the origin of inversions via staggered breaks followed by repair by NHEJ is prevalent in two chromosomal systems: the fixed chromosomal inversions that differentiate the D. melanogaster karyotype from those of D. simulans Sturtevant, 1919 and D. yakuba Burla, 1954; and the chromosomal polymorphism of the E and O chromosomes of D. subobscura [61, 84, 85, 86, 87].
Drosophila melanogaster, D. simulans, and D. yakuba are members of the D. melanogaster subgroup (Sophophora subgenus). The main karyotypic difference between D. melanogaster and its cryptic species D. simulans is an inversion in the right arm of chromosome 3, called In(3R)84F1;93F6-7. Drosophila yakuba, on the other hand, has at least 28 paracentric inversions differentiating its chromosomes from those of D. melanogaster.
The study of Ranz et al.  analyzed the breakpoints of 29 interspecific inversions in these species through experimental and computational methods.
The analysis of the breakpoints of the In(3R)84F1;93F6-7 inversion showed that the breakpoints were proximally flanked by the CG2708 and CG7918 genes, and distally by CG31176 and CG34034 in D. melanogaster. Among these regions there are expressed sequences, and three of them (HDC14862, pfd800, and HDC12400) are duplicated and in opposite directions at both breakpoints of the inversion in D. melanogaster, with 95% identity between them. These sequences are single copies in D. simulans and D. yakuba, indicating that these duplications are a derived state with respect to the chromosomal arrangement of these species. Comparisons of the 3R chromosomal arm of D. melanogaster, D. simulans, and D. yakuba at the molecular level highlighted a fixed inversion in the latter species, In3R(7), that reuses the breakpoint in the CG7918-CG34034 region, also used by the In(3R)84F1;93F6-7 inversion. At both breakpoints of the In3R(7) inversion, there were two duplicated sequences (CG34034 and CG31286) in opposite orientation.
Due to the presence of inverted duplications associated with the breakpoints of the In(3R)84F1;93F6-7 and In3R(7) inversions, the most parsimonious mechanism for their origin is through staggered breaks, proposed and schematized for the first time in this analysis. These staggered breaks can be isochromatid, occurring during the premeiotic mitosis and involving staggered single-strand breaks; or chromatid, occurring during the meiotic prophase and involving staggered double-strand breaks.
The same in silico study analyzed the breakpoints of 28 paracentric inversions that differentiate the D. melanogaster chromosomes from those of D. yakuba, as well as a pericentric inversion in chromosome 2. The genomic and phylogenetic evidence suggests that, among these 29 inversions, 28 originated in the D. yakuba lineage. The analysis of the inversion breakpoints showed that in approximately 62% of the cases (18 of 29 inversions) there were duplicated sequences that were present as a single copy in the D. melanogaster genome. Sequences of both breakpoints were inverted and duplicated in six of these inversions (as in Figure 3), and the sequence of just one of the breakpoints was duplicated in 12 inversions, which can be explained by several factors, for example, modifications that occurred over time. Most of these duplications (except three) did not prove to be functional. The comparative analysis of these breakpoints among D. yakuba, D. melanogaster, and other species, regarding the occurrence of TEs and their involvement in the origin of these inversions via NAHR, showed little support for this mechanism. It is clear in this analysis that most of the inversions that differentiate the D. melanogaster chromosomes from those of D. yakuba originated by staggered breaks in the latter species (17 of 29 analyzed inversions), which points to a rapid chromosomal evolution in the lineage that leads to D. yakuba.
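The proportions quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names and the helper `percent` are introduced here for illustration.

```python
def percent(part, total):
    """Percentage of total represented by part, rounded to the nearest integer."""
    return round(100 * part / total)


total_inversions = 29
duplicated_at_both_breakpoints = 6
duplicated_at_one_breakpoint = 12
with_duplications = duplicated_at_both_breakpoints + duplicated_at_one_breakpoint
attributed_to_staggered_breaks = 17

print(with_duplications, percent(with_duplications, total_inversions))  # 18 62
print(percent(attributed_to_staggered_breaks, total_inversions))        # 59
```

The 18 inversions with duplications are the sum of the 6 and 12 cases, and the 17 inversions attributed to staggered breaks correspond to the 59% listed in Table 1.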
The polymorphism of the Palearctic species D. subobscura (Sophophora subgenus, obscura group) has been extensively characterized and monitored for more than seven decades, which allows its variation to be associated with climate changes [18, 19, 20, 56]. Its karyotype is composed of six pairs of chromosomes, all of which (except the dot chromosome) show a very high level of inversion polymorphism. This polymorphism is well characterized by the presence of complex chromosomal arrangements, formed by overlapping inversions, with the E and O chromosomes showing the highest occurrence of these arrangements in natural populations.
One of the pioneering analyses in this species involved the characterization of the breakpoints of the O3 inversion, which can be found in the Ost lineages (corresponding to the current standard arrangement of the species). This inversion originated from the extinct ancestral O3 arrangement, which also gave rise to the O3 + 4 arrangement, in which the O4 inversion segregates in different populations. For this analysis, the breakpoints of the O3 inversion in the extinct arrangement, without the O3 inversion, were called AB (proximal breakpoint) and CD (distal breakpoint). The O3 + 4 chromosome arrangement differs from the O3 arrangement by a small inversion at its distal breakpoint (called DC), presenting the same order at the proximal breakpoint (AB). In turn, the Ost chromosomal arrangement differs from O3 by inversion O3 (note that the extinct O3 arrangement does not carry the O3 inversion, which occurs in the Ost chromosome), involving the B and C regions (its breakpoints being then called AC and BD). The analysis combined in situ hybridization and in silico tests, together with knowledge of the location of previously established probes.
New probes were established via comparisons with the available genomes of D. melanogaster and D. pseudoobscura, which made it possible to delimit the genomic region containing the breakpoints of the O3 inversion in the chromosomes of the O3 + 4 arrangement. The subsequent sequencing of this region in the Ost and O3 + 4 lineages allowed comparisons between the breakpoints of the O3 and O3 + 4 inversions, respectively. As a result, it was found that the breakpoints AB and DC in the O3 + 4 inversion comprised two small regions of 309 and 63 bp, respectively. The 63 bp sequence was the same 309 bp sequence, which was deleted at the origin of the O3 + 4 inversion. In turn, the same 309 bp sequence was present at both O3 inversion breakpoints, indicating that this region was duplicated at the origin of this inversion, being present in regions B and C of the breakpoints. The AB and DC regions in the lineage that carries the O3 + 4 inversion showed no similarity to any known TE. This meticulous analysis, based on the absence of TEs and the duplication of the 309 bp fragment, infers that the O3 inversion, present in the chromosomal arrangement Ost, originated by means of staggered double-strand breaks.
Still in the O chromosome of D. subobscura, the breakpoints of the O4 and O8 inversions were delimited, sequenced, and analyzed. Just as the inversion O4 segregates only with the O3 arrangement (giving rise to the complex chromosome arrangement O3 + 4), the inversion O8 segregates only with the arrangement O3 + 4 (giving rise to the chromosomal arrangement O3 + 4 + 8). Comparisons of the O4 inversion breakpoints with the respective regions in the Ost arrangement (without the inversion) revealed Pxd, CG5225, Acf, and Set8 gene fragments at the proximal breakpoint, and CG5225, Pxd, and Acf gene fragments at the distal breakpoint.
In the regions corresponding to the breakpoints in the Ost arrangement, fragments of the Pxd, CG5225, and CG4009 genes were found at the proximal breakpoint. The distal breakpoint of the Ost arrangement encompasses fragments of the Set8 and Acf genes. It is evident that, at the origin of inversion O4, fragments of the Set8 and Acf genes were duplicated at the proximal breakpoint, and fragments of the CG5225 and Pxd genes were duplicated at the distal breakpoint. This scenario fits the origin of inversion O4 by the staggered double-strand break mechanism. The O8 inversion breakpoints in the O3 + 4 arrangement (without the inversion) and in the O3 + 4 + 8 arrangement presented a picture similar to that of the O4 inversion. The presence of the Prosβ2R2 gene at both O8 inversion breakpoints shows that it was duplicated, which fits the origin of the O8 inversion by the staggered double-strand break mechanism. This analysis also found that the genes CG5225 and Prosβ2R2 are involved in multiple rearrangements (duplications and transpositions, in addition to inversions) occurring along the chromosomal evolution of the species of the genus Drosophila.
The D. subobscura species also had the breakpoints of the E1 and E2, E9, E3, and E12 inversions of the acrocentric chromosome E delimited, sequenced, and analyzed. These inversions give rise to the complex arrangements E1 + 2, E1 + 2 + 9, E1 + 2 + 9 + 3, and E1 + 2 + 9 + 12. These chromosome constitutions, besides providing a great system for the analysis of the mechanisms of origin of inversions, also provide a basis for studying the reuse of inversion breakpoints at the molecular level [61, 85, 86, 87].
The E1 and E2 inversions share, cytologically, one of the breakpoints. The comparison of the breakpoints of the standard lineage Est (AB, EF, GH breakpoints) with the E1 + 2 lineage (AG, FB, EH breakpoints) showed two motifs, called α and β, which share a terminal portion named δ, in opposite orientations. The α motif was present at the AB and AG breakpoints with the same orientation, but with inverted orientation at the GH breakpoint (two copies with inverted orientation in the Est chromosome and a single copy in the E1 + 2 chromosome). The β motif was present with the same orientation at the EF and EH breakpoints, and with inverted orientation at the FB breakpoint (a single copy in the Est chromosome and two copies in the E1 + 2 chromosome). The α motif exhibits small fragments similar to the SGM element, whereas the β motif is not similar to any described TE. Based on this scenario, the probable origin of the E1 inversion was inferred to be staggered breaks, which led to the duplication of the β motif present at the FB and EH breakpoints. The origin of the E2 inversion, on the other hand, was inferred to be ectopic recombination between the two α motifs present at the AB and GH breakpoints. Breakpoint reuse was inferred from the presence of 400–700 bp repeats at the breakpoints; however, it was impossible to elucidate which of the two inversions originated first.
The extensive analysis of the classical rearrangements of the E and O chromosomes, mentioned above, showed that, with the exception of the E2 inversion, the chromosomal arrangements originated via the staggered double-strand break mechanism. Thus, D. subobscura resembles D. melanogaster, and both point to a possible predominance of this mechanism in the origin of the inversions of species belonging to the subgenus Sophophora. In addition, the duplicated regions in these events range from a few hundred base pairs to about 8 Kbp (see Table 1), encompassing whole and partial genes in some of these duplications. However, no dosage effect or generation of new transcripts was detected in the analyses [61, 84, 85, 86, 87].
Still considering staggered break mechanism followed by erroneous repair by NHEJ, the molecular characterization of the inversion breakpoints in D. mojavensis indicates that its inversions 2h and 2q originated by this route. The 2h inversion would have originated by staggered single-break at the distal breakpoint in the parental chromosome, resulting in a duplicated region of approximately 7 Kb, encompassing CG1792, Dmoj\GI23402, and pasha genes. This event resulted in the origin of the gene Dmoj\GI23123, located at the proximal breakpoint of inversion 2h. This gene, by similarity, showed a relationship with the pasha gene, and according to the prediction of the modENCODE software, it is also functional. Thus the Dmoj\GI23123 gene originated from the duplication of the pasha gene (the extra copy of the gene giving rise to a new gene) in the event that resulted in the 2h inversion .
Staggered single-break occurred in the two breakpoints of the parental chromosome in the 2q inversion. This event resulted in a duplication of an approximately 4 Kb region containing a partial fragment of the CG1208 gene. The duplication of this gene resulted in the origin of a new gene, called Dmoj\GI22075, at the distal breakpoint of the 2q inversion. The new gene maintained the MFS domain (Major Facilitator Superfamily) as an important feature of the CG1208 gene .
The 2h and 2q inversions of D. mojavensis are pioneer examples of the origin of new genes with possible new functions, via duplication, based on the origin of an inversion by the staggered break mechanism followed by NHEJ .
5. Concluding remarks
Inversions are structural chromosomal alterations that, most of the time, imply neither genetic imbalance nor phenotypic modifications in their carriers. However, one of their characteristics is to be a source of genetic variability on which natural selection acts. Thus, inversions participate in the chromosomal evolution of numerous species, including Homo sapiens. The basic knowledge about the biological influence of inversions is largely based on the analysis of the polytene chromosomes of the Drosophila model organism, and extends to other living beings.
The first works, with descriptive approaches to the frequency of chromosome polymorphism in different natural populations, indirectly indicated that inversions provided advantages to their carriers and raised questions that still guide the analysis of this theme: How does natural selection work on inversions? How do inversions offer greater adaptability to living beings? What is the role of inversions in speciation processes? What are the functional consequences of inversions in living beings? Are inversions randomly distributed on chromosomes? How do inversions originate?
The Drosophila model organism provides knowledge and answers to these questions nowadays, with the availability of complete genomes of different species, improved cytomolecular techniques, as well as a solid knowledge about the cytogenetics of polytene chromosomes.
The molecular characterization of inversion breakpoints tells us about the mechanisms that originate these rearrangements; the genomic composition of the region involved in the inversion (which allows the analysis of nucleotide variation and shows which genes are under selection); the reuse of certain regions for the breakage of different inversions that occurred at different times; the age of the inversion; its monophyletic origin; and possible position effects and their influence on the genes inside and outside the inversion, among others. Valuable understandings emerge, but are still incipient.
These analyses are far from simple, but, with the current resources, we have never had so much opportunity to acquire knowledge. Let us live the new time in science and make the most of the knowledge already established, with the certainty that many other questions will arise.
As the eminent geneticist Michael Ashburner of the University of Cambridge, United Kingdom, put it: “What a wonderful time to be a biologist.”
We are grateful to Mr. Leonardo Lindenmeyer for all the assistance provided with the images of this manuscript. This work was supported by grants and fellowships from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, process 455101/2014-0), and Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS), Brazil.