Elastin is the extracellular matrix protein that provides large arteries, lung parenchyma and skin with extensibility and elastic recoil. Within these tissues, elastin is found as a polymer of assembled and cross-linked tropoelastin monomers. In addition to specific protein regions supporting the covalent cross-links, tropoelastin is characterized by highly repetitive sequences rich in proline and glycine that make up the so-called hydrophobic domains. These protein segments promote structural flexibility and disordered protein properties, a fundamental aspect in explaining its elastomeric behavior. Unlike other matrix proteins such as collagens or laminins, elastin emerged relatively late in evolution, appearing at the divergence of jawed and jawless fishes; it is therefore present in all species from sharks to humans, but absent in lampreys and other lower chordates and invertebrates. Despite intense investigation of the key aspects of elastin evolution, its origin remains elusive, and an ancestral protein that could have given rise to a primordial elastin is not known. In this chapter, I review the main molecular features of tropoelastin and the available knowledge on its evolutionary history, and propose hypotheses for its origin. Considering the remarkable similarities between the hydrophobic domains of the first recognizable elastin gene, from the cartilaginous fish Callorhinchus milii, and certain fibrillin regions from related fish species, I raise the possibility that fibrillins might have provided protein domains to an ancestral elastin that thereafter underwent significant evolutionary changes to give the elastin forms found today.
- extracellular matrix
Elastin is an extracellular matrix (ECM) component of tissues such as the large arteries and lung parenchyma, among others. Although it is considerably less abundant than other matrix proteins, such as collagens, it strongly influences the biomechanical properties of these tissues, as it is ultimately responsible for their extensibility and elastic recoil. Several aspects make elastin an unusual protein, for example its molecular structure, which is fundamental to understanding its function, or the complexity of the mechanisms giving rise to the formation of elastin-based polymers in the ECM. In the genomics era, in which countless genomes have been decoded and extraordinarily valuable information on phylogenetic relationships has been established, it is remarkable that the evolutionary origin of elastin remains unclear. This looks anomalous considering that its roots are relatively close to us on an evolutionary scale, compared to other cell components that originated at the dawn of life. The main objective of this work is therefore to review the main molecular features of elastin and the current knowledge on its evolutionary history, as well as to explore new avenues to explain its origin. A couple of hypotheses are put on the table to stimulate (and provoke) discussion and further research. In this regard, I must acknowledge the contributions of Professor Fred W. Keeley to the overall understanding of the biology of elastin, and particularly to its evolutionary relationships.
2. Molecular features of tropoelastin
Most ECM components such as collagens or fibronectin are usually large polypeptides with numerous domains that include repetitive motifs allowing multimerization into supramolecular assemblies. Tropoelastin, the monomer making up polymeric elastin, shares some of these characteristics but is nevertheless an unusual ECM protein. While fibril-forming collagens such as types I and II contain almost 1500 residues and fibronectin goes far beyond 2000, tropoelastin rarely exceeds 800 amino acids, with some species such as the dog harboring elastin chains of about 500 residues. Moreover, crystal structures of particular domains of collagens or fibronectin have been determined by X-ray diffraction, whereas those of elastin have persistently remained elusive [4, 5]. This is in fact a consequence of the disordered nature of the elastin polypeptide, a key feature for understanding its elastomeric properties [6, 7]. Analysis of all known elastins shows an alternation between lysine-rich and hydrophobic domains. The former contain the lysine residues destined to take part in the covalent cross-linking catalyzed by members of the lysyl oxidase (LOX) family, and are therefore named cross-linking domains. The latter are consistently rich in hydrophobic residues and have been shown to be essential for the elastomeric behavior. With noticeable variations, these domains contain stretches of the sequence VPGVG in multiple combinations, either forming repeats or included within glycine/proline-rich segments. Molecular dynamics simulations and experimental studies have shown that the hydrophobic motifs support the disordered state by promoting structural flexibility [6, 8, 9]. In fact, this plasticity is likely not limited to the hydrophobic domains but extends to the whole molecule, considering that lysine oxidation and further condensation of cross-linking domains occur in a random manner, as recently reported. The
3. Evolutionary history of elastin
It may easily be inferred that this sophisticated natural material required thousands of millions of years to be shaped by evolution. However, unlike collagens or laminins, whose origin dates back 770–880 million years (Myr) to the emergence of the Metazoa, tropoelastin appeared late on the scene [2, 11]. In fact, it did not even emerge with the blood vascular systems, a feature of vertebrates and many invertebrates, but made its debut 400 Myr ago with the jawed vertebrates, and is therefore absent in jawless fishes such as the lamprey and hagfish. Elastin has been repeatedly invoked as the vascular component that allowed the development of closed circulatory systems. However, these are present in a wide variety of invertebrates including annelids, cephalopods and non-vertebrate chordates. Here, the magic word is blood pressure. It is actually the presence of elastin that led to closed systems exhibiting high pressures (from 30 to 200 mmHg) in jawed vertebrates, in contrast to non-elastin-based systems in invertebrates and lower vertebrates, with blood pressure values ranging from one or a few mmHg up to 20–30 mmHg. Most ancient elastin so far reported includes that from the elasmobranch fish
The search for and identification of elastin sequences in genome databases from different organisms is crucial to delineate its evolutionary history, and a significant number of sequences are known today [2, 14, 15]. Nevertheless, an accurate phylogenetic reconstruction of elastin evolution is still far from complete. Focus has been placed on different parts of the gene, including a central conserved region, the C-terminus, the 3′-untranslated region and a region presumably resulting from exon replication. However, reaching a unified picture has been difficult. Being an intrinsically disordered protein (IDP) does not make things easier, as IDPs lack strict structural constraints and are therefore more permissive to substitutions. With the only restriction that conformational flexibility not be altered, IDPs evolve faster than well-folded proteins, which adds complexity to their phylogenetic analysis. In this respect, the soluble monomer of lamprin, the major non-collagen/non-elastin connective tissue component of the lamprey annular cartilage, contains tandem repeats of the sequence GGLGY that are recognized by anti-elastin antibodies targeting the VPG repeats of elastin, in a remarkable example of evolutionary convergence. In fact, more distant polypeptides such as some insect proteins or spider silks have also acquired these repeats.
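As a purely illustrative aside, the kind of repeat-based comparison discussed in this section can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used to identify the sequences in this chapter: the helper names (`count_motif`, `longest_tandem_run`, `residue_fraction`) and the two toy sequences are hypothetical and were invented for the example. They are not real tropoelastin or lamprin sequences.

```python
import re


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of a motif, including overlapping ones."""
    # A zero-width lookahead lets the regex engine report overlapping matches.
    return len(re.findall(f"(?={re.escape(motif)})", sequence.upper()))


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def residue_fraction(sequence: str, residues: str = "PG") -> float:
    """Fraction of the sequence made up of the given one-letter residues."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(seq.count(r) for r in residues) / len(seq)


# Toy sequences invented for illustration only (assumption, not real data).
toy_elastin_like = "AAKAAKVPGVGVPGVGVPGVGAAKAAK"
toy_lamprin_like = "SSGGLGYGGLGYGGLGYSS"

if __name__ == "__main__":
    for name, seq, motif in [
        ("elastin-like", toy_elastin_like, "VPGVG"),
        ("lamprin-like", toy_lamprin_like, "GGLGY"),
    ]:
        print(
            name,
            "copies:", count_motif(seq, motif),
            "longest tandem run:", longest_tandem_run(seq, motif),
            "Pro+Gly fraction:", round(residue_fraction(seq), 2),
        )
```

Simple counts of this kind only flag compositional similarity. Because short glycine- and proline-rich repeats can arise by convergence, as the lamprin example shows, they cannot by themselves establish homology.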
Despite the difficulties, phylogenetic trees such as that shown in Figure 2 based on the central conserved region have been generated.
4. Hypotheses on the evolutionary origin of elastin
As mentioned above, the genomic roots of tropoelastin trace back to the elephant shark and related species. The recently published, openly accessible whole genome sequences and assemblies of lower vertebrates/chordates have not shown any traceable sign of tropoelastin-related sequences, and the same is true for the genomes of invertebrates. These findings (or the lack of them) raise questions as to the origin of tropoelastin and the existence of an ancestral protein. Here, two main hypotheses are proposed to explain its emergence and further evolution: (1) tropoelastin appeared
4.1 Tropoelastin as a
Following Darwin’s postulates, the general assumption is that new genes evolve from existing ones in an endless, slow-paced journey since the beginning of life. However, recent studies are showing that this has not always been the case and that new genes can arise from the dark depths of the non-coding genome. By gaining the capability of being transcribed and translated, stretches of “junk” DNA can give rise to
4.2 Reorganization or assembly from pre-existing components
It is a recurring theme in the evolutionary history of ECM proteins that the gradual appearance of specific gene families and domains, often in pre-metazoan lineages, later allowed their assembly into matrix components genuine to animals. This has been the case for matrix proteins such as fibrillar or basement membrane collagens, and for matrix-remodeling enzymes like LOX (see Figure 2). The late emergence of tropoelastin does not fit this pattern. As mentioned above, no tropoelastin-related sequence has been found in genomes that predate the elephant shark on the evolutionary scale. Or has it? Before the onset of tropoelastin, microfibrils were largely responsible for tissue elasticity in many species. Extracellular matrix structures such as the mesoglea of the cnidarian jellyfish or the blood vessels of invertebrates are functionally elastic due to microfibrils [25, 26]. These supramolecular structures, visualized as beaded filaments under electron microscopy, contain numerous proteins, with fibrillins being the major constituent. Like many other ECM components, fibrillins, of which three isoforms exist in humans (fibrillin-1, -2 and -3), are multidomain proteins that extend along a large polypeptide sequence of almost 3000 amino acids. The large size and the variety of domains explain the existence of multiple diseases caused by defects in fibrillins, named fibrillinopathies, including various forms of Marfan syndrome, isolated ectopia lentis, kyphoscoliosis, Shprintzen-Goldberg syndrome, and stiff skin syndrome, among others. Epidermal growth factor-like domains (EGF and calcium-binding EGF) dominate the structure, with 46–47 repeats, followed by transforming growth factor (TGF)-β binding protein domains (TB) and hybrid domains, with 8 and 2 repeats, respectively (Figure 3). TB domains are shared with latent TGF-β binding proteins (LTBP) and have served to compute phylogenetic reconstructions for these proteins.
Using this approach, a TB domain-containing protein was identified in cnidarians, dating the emergence of an ancestral fibrillin to 600 Myr ago. This ancestral fibrillin, present not only in cnidarians but also in molluscs, annelids, arthropods, echinoderms, urochordates, cephalochordates and lower vertebrates such as the lamprey, underwent a duplication event at the divergence of jawed and jawless fishes, giving rise to fibrillin-1 and an ancestral fibrillin-2/3. Interestingly, just before this branching, the ancestral fibrillin gained (or reshaped) a domain characterized by a high content of proline and/or glycine, termed the “unique region”, which is claimed to provide flexible behavior and for which a specific function has not yet been demonstrated (see also Figure 3). In fact, when looking carefully at these domains from different species including jawed and jawless fishes, a clear evolution from a short sequence with just a few proline and glycine residues as seen in the ascidian
5. Concluding remarks
Whether elastin evolved as a
Sequences used in this work are:
Elephant shark predicted elastin isoform X1 (
Human fibrillin-1 preproprotein (
Zebrafish fibrillin-1 (
Stickleback fibrillin-1 (
Fugu fibrillin-1 isoform X1 (
Pufferfish fibrillin 1 (
Elephant shark predicted fibrillin-1-like, partial (
Sea lamprey putative fibrillin-1 (
Lancelet putative fibrillin-1 (
We thank M. Mar Alba (Universitat Pompeu Fabra, Barcelona, Spain) for helpful comments.
I acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI).