Incongruence between phylogenetic trees constructed from different gene sequences has troubled practitioners for decades. Paraphyletic or polyphyletic clustering has traditionally been treated as noise that distorts the underlying genealogy. Nevertheless, recent genomic data have provided a first indication that horizontal gene transfer (HGT) in microbes and interspecific hybridization (or polyploidization) in eukaryotes challenge the doctrine of common descent. Due to promiscuous recombination, the initial stages of life would not have had a genealogical history but a common physical one, whose graphic representation is known as evolutionary reticulation. Reticulate evolution in plants has long been recognized, and recent genomic evidence from animals also indicates its widespread occurrence. As evidence for hybridization and polyploidy in eukaryotic taxa mounts, methods to infer reticulate evolutionary histories become essential. Considering the different forms of trans-specific gene transfer and introgression across the tree of life, the origin of a given species may not coincide with the origin of its genes. Accordingly, molecular mutation rates might be erroneous if based on strict genealogical thinking. Given the abundance of new data, it is time to move forward, because a major shift in our understanding of species, speciation and phylogenetics is taking place.
- gene trees
- phylogenetic incongruence
Since Darwin's seminal work, it has been claimed that organic diversity could be represented by a unique branching pattern of inclusive hierarchies depicting genealogical relationships among organisms. This tree of life, based on shared homologies, was considered to reflect nature's genuine attributes, exclusively represented by descent with modification. Nevertheless, there is neither
Traditional phylogenetic analysis applied to animal and plant phyla has stumbled over gross, irreconcilable discrepancies since its onset. Molecular phylogenomics has corrected some of these paradoxes, but what gets clarified at one end gets muddled at another. A paradigmatic example of this is the recent synthesis of animal phylogeny and taxonomy of , plagued with conflicts near the base of
Theoretically, species correspond to independent, reproductively isolated populations, although Darwin recognized interspecific hybridization as a merging process involving two ancestors. The graphical representation of this phenomenon, within an otherwise diverging pattern, is known as reticulate evolution or network evolution; it describes the origination of a lineage through the partial merging of two ancestral lineages. Hybridization has played an important role in genome diversification and in adapting organisms to their environment. Nevertheless, methods for reconstructing reticulate relationships are still in their infancy and have limited applicability. Reticulate evolution in plants has long been recognized, but recent genomic evidence from animals indicates that this phenomenon is much more common than anticipated. As evidence of hybridization in eukaryotic taxa mounts, methods to infer reticulate evolutionary histories become essential. Given the abundance of new data, it is time to move forward, because a major shift in our understanding of species, speciation and phylogenetics is taking place.
Many groups of closely related species, including insects, vertebrates, microbes and plants, have reticulate phylogenies. In microbes, lateral gene transfer is the dominant process that distorts strictly genealogical, tree-like phylogenies. In multicellular eukaryotes, hybridization and introgression among related species are of prime importance. Introgression and reticulation can thus affect all parts of the tree of life, not just the crown species. Accordingly, conceptual issues regarding adaptive evolution, speciation, phylogenetics and comparative genomics must be modified to fit these recent findings. Reticulation is produced by phenomena such as lateral gene transfer, introgressive hybridization and polyploidization. In fact, certain alleles in a gene tree may appear more closely related to alleles from a different species than to other conspecific alleles, thus giving rise to instances of paraphyly or polyphyly. The occurrence of such anomalous clustering in the evolutionary history of species poses serious challenges to practitioners of phylogenetic analysis because it results in genomic regions with locally incongruent genealogies relative to the speciation pattern. Thus, phylogenetic analyses should account for the reticulate component of evolution, especially now that whole genome sequencing provides unprecedented phylogenetic information across the web of life. Here, we present genetic and genomic evidence indicating the evolutionary importance of reticulation in multicellular eukaryotes and summarize the main issues raised by reticulation and their bearing on phylogenetic practice.
2. Horizontal gene transfer (HGT)
HGT, the transfer of genetic material mainly among prokaryotes, can occur via bacterial transformation, conjugation or transduction. It does not involve mitosis or meiosis and does not require immediate ancestry. Bacterial genomes have revealed a complex evolutionary history that, for most genes, cannot be represented by a single strictly bifurcating tree. Comparative analysis of sequenced genomes indicates that lineage-specific gene loss has been common in evolution, thus complicating the notion of a species tree and of a last universal common ancestor, as well as the delimitation of taxonomic units in asexual organisms.
HGT in eukaryotes has been reported in phagotrophic protists and is limited largely to the ancient acquisition of bacterial genes. Nevertheless, standard mitochondrial genes encoding ribosomal and respiratory proteins are subject to evolutionarily frequent horizontal transfer between distantly related flowering plants. These transfers have created a variety of genomic outcomes, including gene duplication, recapture of genes lost through transfer to the nucleus, and chimeric, half-monocot, half-dicot genes.
As a result, intergenomic comparisons show HGT to be a dominant process generating innovations and complex adaptations, such as the acquisition of shade-dwelling habits in ferns. Molecular evidence indicates that the chimeric photoreceptor neochrome was acquired from hornworts, thereby optimizing phototropic responses. HGT involves not only individual genes but also whole chromosomes and even nuclear genomes transferred by asexual means. In the fungi genus
The horizontal transfer of a complete genome, giving rise to a new
Overall, genome-wide quantification of reticulate evolution reveals that ecological specialization somehow restricts intra- and interspecific recombination. Nevertheless, genomic architecture and transposable element content are also central to HGT and to recombination potential. In addition, genomic regions differ in their levels of potential HGT and reticulate evolution, from single genes to whole genomes. Genetic distances, genomic rearrangements and genome synteny all show evidence of HGT and network-like evolution at both whole-genome and core-genome scales. Moreover, proteomic core genes have experienced reticulate evolution of complex traits and played a major causal role in the radiation and adaptation of life on earth.
3. Interspecific hybridization
One potential cause of gene tree/species tree discordance and concomitant polyphyly is the occasional mating (hybridization) between otherwise distinct species. The resulting transfer of parental alleles to hybrid offspring (introgression) introduces variation at rates much higher than mutation.
Thus, significant levels of genomic replacement may accrue over long periods, even at low hybridization rates. This has been recently demonstrated in extant
Hybridization is increasingly being recognized as a widespread process between ecologically and behaviorally divergent animal species. Determining phylogenetic relationships in the presence of hybridization remains a major challenge for evolutionary biologists. If hybridization has occurred among the species of a given taxon, cladistic analysis fails to account for the process involved, since the relationships are not genealogical but reticulate. Since hybridization results in incongruent intersecting data that obscure the underlying hierarchy, the results are always plagued with convergences and parallelisms of no biological relevance.
Recombination is a form of reticulation that mimics the problems derived from hybridization, except that it occurs at the gene level. Recombination can be diagnosed by examining the compatibility of the phylogenetic partitions supported by the polymorphic sites along the sequence. One strategy consists of looking for changes in the most parsimonious topology along sequences, while others use a maximum chi-square test or a maximum-likelihood approach to detect specific incongruent evolutionary patterns. Unfortunately, no general method exists to place a putative hybrid in the appropriate clade.
Introgression (also known as introgressive hybridization or interspecific gene flow) occurs when alleles from one species penetrate the gene pool of another through interspecific mating and the subsequent backcrossing of hybrids into parental populations. When hybridization is symmetrical, the resulting hybrid species might be polyphyletic, as might both parental species. Given that hybrid speciation is often associated with whole genome duplication (polyploidy), knowledge of such traits may strengthen the suspicion of polyphyly derived from hybrid speciation. However, in several cases of putative hybrid speciation, alternative explanations have been difficult to rule out. Because mitochondrial alleles introgress more easily than nuclear ones, their heterospecific plasmidial origin will be detected more frequently. Consequently, mitochondrial gene trees could be particularly susceptible to the effects of introgression and especially misleading in cases where introgressed haplotype lineages become fixed, leaving no hint that they are of heterospecific origin.
The discovery of cytoplasmic introgression and the disparity between rDNA and cpDNA phylogenies in several plant groups reflect past hybridization and subsequent introgression. If an analysis includes hybrids, then no matter where the hybrids are placed, a cladistic method produces only divergently branching phylogenetic patterns and thus can never retrieve the correct phylogeny, yielding confusing and conflicting results.
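The maximum chi-square idea mentioned above can be sketched in a few lines. The following is a minimal illustration, not a published implementation: it compares two aligned sequences, scans every candidate breakpoint, and asks whether differences are distributed unevenly on either side. The function names, the toy sequences and the permutation settings are assumptions made for illustration only.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_chi_square(seq1, seq2):
    """Scan all breakpoints; return (largest chi-square, breakpoint position)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = [int(x != y) for x, y in zip(seq1, seq2)]
    total_diff = sum(diffs)
    n = len(diffs)
    best_value, best_break = 0.0, None
    left_diff = 0
    for k in range(1, n):  # the breakpoint lies between site k-1 and site k
        left_diff += diffs[k - 1]
        left_same = k - left_diff
        right_diff = total_diff - left_diff
        right_same = (n - k) - right_diff
        value = chi_square_2x2(left_diff, left_same, right_diff, right_same)
        if value > best_value:
            best_value, best_break = value, k
    return best_value, best_break


def permutation_p_value(seq1, seq2, n_perm=1000, seed=0):
    """Shuffle alignment columns to obtain a null distribution of the maximum."""
    rng = random.Random(seed)
    observed, _ = max_chi_square(seq1, seq2)
    columns = list(zip(seq1, seq2))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(columns)
        shuffled1, shuffled2 = zip(*columns)
        if max_chi_square(shuffled1, shuffled2)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: identical first half, fully divergent second half.
seq_a = "ACGT" * 10
seq_b = seq_a[:20] + "CATG" * 5
value, breakpoint = max_chi_square(seq_a, seq_b)
print(value, breakpoint, permutation_p_value(seq_a, seq_b))
```

In practice the scan is usually restricted to polymorphic sites of a larger alignment; the toy example uses every site to keep the sketch short.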
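A disparity between two gene trees, such as the rDNA and cpDNA phylogenies mentioned above, can be quantified before it is interpreted. The sketch below counts the clades found in one rooted tree but not in the other (a rooted Robinson-Foulds distance). This metric is brought in here for illustration and is not named in the text; the nested-tuple tree encoding and the taxon labels are hypothetical.

```python
def collect_clades(tree):
    """Return (leaf set, set of clades) for a rooted tree written as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = collect_clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade subtended by this internal node
    return leaves, found


def rooted_rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    leaves_a, clades_a = collect_clades(tree_a)
    leaves_b, clades_b = collect_clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("both trees must contain the same taxa")
    return len(clades_a ^ clades_b)


# Hypothetical example: the nuclear and chloroplast trees disagree on the sister of taxon A.
nuclear_tree = ((("A", "B"), "C"), "D")
chloroplast_tree = ((("A", "C"), "B"), "D")
print(rooted_rf_distance(nuclear_tree, chloroplast_tree))
```

A distance of zero means the two topologies are identical. A nonzero value flags incongruence but does not by itself distinguish hybridization from other causes.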
Polyploidy is a form of interspecific hybridization followed by whole genome duplication (WGD). As the most drastic modification that a cell can experience, it involves rapid and profound nonrandom changes in chromatin composition, segregation patterns and copy number variation of dispersed repetitive DNA [18, 19]. Polyploidy is also instrumental in introgressing alien DNA into breeding lines, enabling the introduction of novel characters, as demonstrated by FISH, GISH and genetic mapping. Its evolutionary role has motivated intense study because duplicated gene pathways provide new opportunities for increased body-plan complexity, organismal differentiation and adaptation by recruitment of new genes to new roles [21, 22]. Polyploidy has played a significant role in the hybrid speciation and adaptive radiation of flowering plants but has been considered irrelevant to mammalian speciation due to severe disruptions in the sex-determination system and dosage compensation mechanism [23, 24]. Recent comparative genomic data have further demonstrated the evolutionary significance of polyploidy by reporting three rounds of WGD (3R hypothesis) in vertebrate evolution and five rounds in flowering plants.
The convergence of distinct lineages through interspecific hybridization (allopolyploidy) and subsequent endoreduplication, which increases ploidy level, is a driving force in the origin of most flowering plant species. Likewise, the grass tribe
Following WGD, duplicated genes show two types of homology: paralogy and orthology. Paralogy refers to genes that are related through a duplication event, whereas orthology is the result of speciation. Consequently, a gene tree based on multigene families in polyploid species is problematic if these two forms of homology are confounded. Due to this limitation, mitochondrial single-copy genes rather than nuclear genes are a more reliable source of allele orthology. A gene tree that includes paralogous alleles may depict polyphyletic species because its topology reflects gene duplication as well as speciation. The cause of this polyphyly may be misinterpreted if the orthology of alleles is assumed. Because mitochondrial loci are single-copy genes rather than members of multigene families, it was long considered safe to assume orthology for alleles amplified with mitochondrial primers. This is a serious phylogenetic challenge considering that most angiosperms are polyploid. If the 3R and 5R hypotheses are scientifically valid, their implication makes the search for common ancestry irrelevant to science. To celebrate the 150 years of Darwin's Origin of Species, the prestigious journal,
In short, gene duplication following polyploidy can give rise to multigene families that correspond to groups of locally distributed, tandemly oriented redundant genes that can subsequently be involved in non-allelic homologous recombination. Duplicated genes can have three different fates. First, both copies can persist, keeping their sequence identity while maintaining a high level of gene expression. A second possibility, known as nonfunctionalization (pseudogenization), occurs when one gene copy is silenced (by physical elimination or methylation). Silenced copies may form pseudogenes, nonfunctional genetic sequences that retain their similarity to one or more paralogs and thereby confound phylogenetic analyses. The third outcome of a duplicated gene is neofunctionalization, a phenomenon that involves functional diversification to a new role or allelic specialization of a previous function. Clearly, these processes of gene evolution, consisting of both gene births and deaths after duplication, interfere with the general assumptions of phylogenetic analysis and blur the end results.
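The distinction between paralogy and orthology drawn in this section can be made concrete on a small gene tree. The sketch below uses a species-overlap heuristic, a simple rule brought in here for illustration rather than a method described in the text. An internal node whose two subtrees share at least one species is treated as a duplication, and gene pairs separated by it are labelled paralogs; otherwise the node is treated as a speciation and the pairs as orthologs. The gene names, the `species_copy` naming convention and the strictly binary tree are all assumptions.

```python
from itertools import product


def species_of(gene_name):
    # Assumed naming convention: "<species>_<copy>", for example "spA_1".
    return gene_name.split("_")[0]


def classify_homologs(gene_tree):
    """Map each gene pair to 'ortholog' or 'paralog' using species overlap."""
    relations = {}

    def visit(node):
        if isinstance(node, str):
            return [node]
        left, right = node  # binary gene tree assumed
        left_genes = visit(left)
        right_genes = visit(right)
        shared = ({species_of(g) for g in left_genes}
                  & {species_of(g) for g in right_genes})
        # Shared species on both sides imply a duplication at this node.
        label = "paralog" if shared else "ortholog"
        for g1, g2 in product(left_genes, right_genes):
            relations[frozenset((g1, g2))] = label
        return left_genes + right_genes

    visit(gene_tree)
    return relations


# Hypothetical gene family: a duplication predates the split of species A and B.
gene_tree = ((("spA_1", "spB_1"), ("spA_2", "spB_2")), "spC_1")
relations = classify_homologs(gene_tree)
print(relations[frozenset(("spA_1", "spB_1"))])  # copies separated by speciation
print(relations[frozenset(("spA_1", "spB_2"))])  # copies separated by duplication
```

If `spA_1` and `spB_2` were mistaken for orthologs, the tree built from them would date the duplication rather than the speciation, which is the confusion described above.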
5. Incomplete lineage sorting
Incomplete lineage sorting occurs when polymorphisms persist between speciation events, so that the true genealogical relationship of a gene or genome region differs from the species branching pattern. Incomplete lineage sorting and introgression are two main causes of discordance between gene trees and species trees of eukaryotic coding sequences. For instance, around 15% of human genes are more closely related to homologs in gorillas than to the chimpanzee sister lineage. This anomaly is probably derived from their reduced ancestral effective population size (
Several analytical methods assume that reticulation events are the sole cause of all incongruence among the gene trees and seek phylogenetic networks to explain all incongruences. Nevertheless, these methods overestimate the degree of reticulation if other causes of incongruence are at play. Indeed, recent studies in the human genome [30, 31] in
Some authors claim that significant steps have been taken to put phylogenetic networks on a par with phylogenetic trees as a model for capturing evolutionary relationships. Nevertheless, despite progress with phylogenetic network inference, how to infer reticulate evolutionary histories while accounting for ILS remains poorly understood. Its limited applicability stems mainly from two major issues: the lack of a general phylogenetic network inference method and the lack of a method to assess the degree of confidence associated with an inference when searching phylogenetic network space. Methods for assessing the complexity of a network and for using the bootstrap to measure branch support of inferred networks have nonetheless been developed.
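To see why incomplete lineage sorting must be accounted for before reticulation is invoked, it helps to compute how much discordance ILS alone predicts. The sketch below uses the standard three-taxon result of the multispecies coalescent. With an internal branch of T coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant topologies has probability (1/3)e^(-T). The conversion T = t / (2Ne) assumes a diploid population of constant effective size and no gene flow; the function name and the parameter values are hypothetical.

```python
import math


def three_taxon_gene_tree_probs(internal_branch_generations, effective_size):
    """Gene tree topology probabilities for three species under ILS only."""
    if internal_branch_generations < 0 or effective_size <= 0:
        raise ValueError("branch length must be >= 0 and effective size > 0")
    # Internal branch length in coalescent units (diploid assumption).
    t = internal_branch_generations / (2.0 * effective_size)
    each_discordant = math.exp(-t) / 3.0
    return {
        "coalescent_units": t,
        "concordant": 1.0 - 2.0 * each_discordant,
        "each_discordant": each_discordant,
    }


# Hypothetical parameter values, chosen only to show the trend.
for generations in (1_000, 20_000, 200_000):
    print(generations, three_taxon_gene_tree_probs(generations, effective_size=10_000))
```

Under ILS alone the two discordant topologies are expected to be equally frequent. A marked asymmetry between them is one of the signals commonly taken to suggest introgression rather than lineage sorting.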
6. Identifying complex patterns of genetic diversity through networks
Branching diagrams dominate phylogenetic thinking. Nevertheless, the genetic patterns of bacterial genome evolution give rise to complex patterns that cannot be accommodated by a tree. The complexity and profound relationships among the three domains of life defy traditional methods. For example, the construction of a web of genetic similarity comprising proteomic data from 14 eukaryotes, 104 prokaryotes, 2389 viruses and 1044 plasmids clearly showed the chimeric origin of eukaryotes. These fusion events between
Reticulate patterns can also stem from inadequate analysis or data processing, misspecification of the model used, or misuse of data or sequence alignments. Even though network analysis allows a drastic reduction in data misinterpretation, it is most important to be aware that genomic hybridization is a more probable explanation for the differences among gene trees.
Interspecific gene exchanges are much more common than previously appreciated. This not only includes hybridizing sister species undergoing genomic introgression but whole groups that exchange adaptive and nonadaptive genomic regions, as exemplified in
The only literature survey dealing with the frequency, causes and consequences of species-level paraphyly and polyphyly indicates that their incidence is taxonomically widespread. Interestingly, almost 25% of the scientific literature surveyed offers no explanation for polyphyletic gene trees. Polyphyly was observed in 15% of species across the cnidarians, mollusks, insects, crustaceans, arachnids and echinoderms, whereas half of the citations dealing with these deviations attribute them to faulty taxonomy. Both introgressive hybridization and incomplete lineage sorting were also invoked in one third of the 2319 species analysed. Inadequate phylogenetic information is invoked in few papers. Consequently, species-level monophyly cannot be assumed as an
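Whether a set of pairwise genetic distances can be accommodated by a tree at all can be checked with the four-point condition. This is a standard criterion brought in here for illustration; it is not a method described in the text. For any four taxa, the three ways of pairing them give three distance sums, and on a tree the two largest sums are equal. The quartet score below is 0 for perfectly tree-like quartets and grows toward 1 as the signal becomes more network-like. The distance values and taxon labels are hypothetical.

```python
from itertools import combinations


def quartet_delta(dist, a, b, c, d):
    """(largest - middle) / (largest - smallest) of the three pairing sums."""
    sums = sorted(
        [dist[a][b] + dist[c][d], dist[a][c] + dist[b][d], dist[a][d] + dist[b][c]],
        reverse=True,
    )
    spread = sums[0] - sums[2]
    if spread == 0:
        return 0.0  # star-like quartet, no conflict
    return (sums[0] - sums[1]) / spread


def mean_delta(dist):
    """Average quartet score over all quartets of taxa in the matrix."""
    taxa = sorted(dist)
    if len(taxa) < 4:
        raise ValueError("at least four taxa are required")
    scores = [quartet_delta(dist, *quartet) for quartet in combinations(taxa, 4)]
    return sum(scores) / len(scores)


def symmetric(pairs):
    """Build a dict-of-dicts distance matrix from {(x, y): distance}."""
    dist = {}
    for (x, y), value in pairs.items():
        dist.setdefault(x, {})[y] = value
        dist.setdefault(y, {})[x] = value
    return dist


# Hypothetical distances: the first set fits the tree ((A,B),(C,D)); the second does not.
tree_like = symmetric({("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 3,
                       ("B", "D"): 3, ("A", "D"): 3, ("B", "C"): 3})
conflicting = symmetric({("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 3,
                         ("B", "D"): 3, ("A", "D"): 2, ("B", "C"): 3})
print(mean_delta(tree_like), mean_delta(conflicting))
```

A nonzero score shows only that the distances are not additive on a tree. As the paragraph above notes, model misspecification or alignment error can produce the same departure, so the score does not by itself demonstrate hybridization.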
The assistance of E. Suárez-Villota is highly appreciated. This contribution was supported with private funds.