Open access peer-reviewed chapter

Influence of Repeats in the Protein Chain on its Aggregation Capacity for ALS-Associated Proteins

By Oxana V. Galzitskaya

Submitted: November 19th 2015 Reviewed: March 16th 2016 Published: September 14th 2016

DOI: 10.5772/63104


Diseases associated with pathological, irreversible aggregation of proteins have attracted special attention from researchers throughout the world since the appearance of a new conceptual model based on the capacity of some proteins to self-assemble by the prion mechanism. Along with the prion diseases proper, such as bovine spongiform encephalopathy and Creutzfeldt-Jakob disease in humans, a great number of neurodegenerative disorders have been found to be associated with the formation of aggregates through the prion mechanism. These disorders include Alzheimer’s and Parkinson’s diseases, amyotrophic lateral sclerosis, and Huntington’s disease, as well as mucoviscidosis, some types of diabetes, and hereditary cataracts. The listed diseases are caused by the transition of a “healthy” protein or peptide molecule from the native conformation to a very stable “pathological” form. Molecules in the “pathological” conformation aggregate specifically, forming amyloid fibrils that can propagate indefinitely. An important result of studying the molecular mechanisms of prion diseases and of different proteinopathies associated with the formation of pathological aggregates by the prion mechanism is the discovery of the protein chain regions responsible for aggregation. The ability to regulate the aggregation (fibrillation) of proteins can become a key tool for drug development. Herein, using 29 RNA-binding proteins with prion-like domains as an example, we demonstrate what role amino acid repeats can play in prion-like domains. For these proteins, quite different repeats are found in the disordered part of the protein chain predicted with bioinformatics methods. Ten of the 29 RNA-binding proteins are involved in the development of diseases. The prion-like domains of FUS, TAF15, and EWS are critical for the aggregation of these proteins, which are associated with human neurodegenerative diseases. Proteins of this family are involved not only in neurodegenerative diseases, such as amyotrophic lateral sclerosis (ALS), Huntington’s disease, spinocerebellar ataxia, and dentatorubral-pallidoluysian atrophy, but also in the formation of human myxoid liposarcoma. It can be suggested that the presence of a great number of repeats in the prion-like domains of RNA-binding proteins accelerates the formation of a dynamic beta-structure and of pathological aggregates, which are crucibles of ALS pathogenesis.


  • disordered regions
  • repeats
  • motifs of low complexity
  • Alzheimer’s disease
  • amyotrophic lateral sclerosis (ALS)

1. Introduction

Misfolded proteins that form aggregates are a cause of many diseases. There is no definite answer to the question of what causes the death of cells in which such aggregates have been found: is it a cell defence mechanism or plain death? Normally, cells that accumulate such aggregates undergo programmed cell death, or apoptosis, and the fragments of apoptotic cells are removed by phagocytic cells. However, aggregates such as amyloid fibrils are known to be resistant to the action of different proteases [1], which can impede the effective completion of efferocytosis; as a result, they accumulate in tissues. In any case, there is undoubtedly a close association between the formation of aggregates and the development of many fatal diseases. There are several models describing the process of fibril formation. For example, in order to begin aggregation, proteins should first be unfolded or partially folded [2]. As is known, the generation of fibrils is facilitated by denaturing conditions. At the same time, aggregation of the peptides and proteins involved in the pathogenesis of such types of amyloidosis as type II diabetes, Alzheimer’s, and Parkinson’s diseases does not require preliminary unfolding of the protein molecule. These data, however, rather support the general rule, because under physiological conditions most of these proteins have no definite structure, i.e., they are natively unstructured [3]. Nevertheless, most natively unfolded proteins do not aggregate in vivo [4]. Moreover, unstructured proteins are resistant to denaturing conditions, i.e., to the factors that bring about stress, first of all high temperatures [5]. It was demonstrated that the absence of structure does not correlate with the aggregation capacity [6]. To avoid spontaneous self-assembly of the protein molecule, evolutionary selection has led to an increased content of amino acids such as proline and glycine, which inhibit protein aggregation [7], and to an increased content of charged amino acids [8]. In contrast, globular proteins, which contain a large number of amyloidogenic regions, avoid aggregation by folding rapidly into a globular structure. This shows that protein unfolding is necessary but not sufficient for the formation of amyloid fibrils. Most likely, there must be special solvent-exposed motifs in the amino acid sequence that are more prone to aggregation than other regions of the sequence. Experimental data corroborate the hypothesis that small regions of a protein molecule are responsible for its amyloidogenic behavior [9–11].


2. Role of repeats in protein aggregation

It remains unclear what the mechanism of the earliest stage of initiation of pathological irreversible protein aggregation is and how this process is triggered in healthy organisms. It is supposed that the key role in the development of systemic amyloidosis belongs to the so-called primes, i.e., factors that accelerate pathological aggregation. Such primes can be infectious agents as well as regions of the protein molecule containing motifs of low complexity, especially when these motifs are recurrent. It was shown that the more frequently repeats occur in a protein sequence, the less structured the protein is. Generally, most homorepeats are not structured [12–15]. This, however, is not characteristic of fibrillar proteins [16], whose capacity to aggregate depends strongly on the amino acid sequence of the protein [17–19]. For example, if the sequence identity of immunoglobulin domains is lower than 30–40%, the proteins lose their capacity to co-aggregate [18]. Bioinformatics analysis has established that in a large number of multidomain proteins the sequence identity of their domains is below 40%. This suggests that in this way the domains avoid mutual aggregation.
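The identity threshold mentioned above can be illustrated with a minimal sketch that computes the percent identity of two pre-aligned domain sequences. The function names, the gap convention ("-"), and the choice of denominator (aligned, non-gap positions) are assumptions made for illustration; published identity values depend on the alignment method and on how identity is normalized.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither sequence has a gap.

    Both inputs are assumed to be rows of the same pairwise alignment,
    so they must have equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def below_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 40.0) -> bool:
    """True if the two domains share less than `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) < threshold
```

For two toy ten-residue segments that agree at five positions, `percent_identity` gives 50.0, so `below_identity_threshold` with the default cutoff of 40% returns `False`.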

The existence of repeats in proteins and the clarification of their special roles are in the focus of attention of researchers. The roles of different repeats are being studied actively, including PGMG (GPGM) and PNN repeats in biomineralization by PM27 proteins [20], NPNA (NANP) repeats in the circumsporozoite protein of Plasmodium falciparum, YSPTSPS repeats in RNA polymerase II, PHGGGWGQ repeats in the prion protein, YGHGGG(N) and YNHGGG(G) repeats in glycine-rich plant proteins, PGQGQQ, PGQGQQGQQ, and GYYPTSOQQ repeats in wheat gluten, and the FGGMGGGKGG repeat in abductin of bivalves (Aequipecten). As a rule, the conformational loops formed by these repeats are stabilized upon interaction with different cations; they are characterized by noncovalent interactions, in particular interactions of aromatic groups. It was demonstrated that many tandem repeats add plasticity and mobility to the protein [21]. The leader in the occurrence of repeats in proteins is P. falciparum, where 35% of the 5300 proteins have repeats. Moreover, 24% of P. falciparum proteins have prion-like domains rich in asparagine, whereas flies have only 3.4% of such proteins [22]. The occurrence frequency of asparagine repeats in our HRaP database [14] also shows that P. falciparum is the leader. Of interest is the fact that one of the functions of these homorepeats is connected with the parasitic way of life [14]. For example, the sequence of one P. falciparum protein (ID Q8IKW2_PLAF7 1304 aa 493.P_falciparum) has asparagine homorepeats of different lengths, the maximal length reaching 41 amino acids. The basic functions of this protein are associated with processes such as protein deacylation and chromatin silencing [14].
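As an illustration of how homorepeats such as the asparagine runs mentioned above can be located in a sequence, here is a minimal sketch. It is not the procedure used to build the HRaP database; the function names and the toy sequence are hypothetical.

```python
import re


def homorepeats(seq: str, residue: str, min_len: int = 2):
    """Return (start, length) for every run of `residue` of at least `min_len`."""
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq)]


def longest_homorepeat(seq: str, residue: str) -> int:
    """Length of the longest uninterrupted run of `residue` (0 if absent)."""
    runs = homorepeats(seq, residue, min_len=1)
    return max((length for _, length in runs), default=0)


# Toy sequence, not a real protein.
toy = "MKNNNNSDNNKN"
asparagine_runs = homorepeats(toy, "N")        # runs of two or more N
longest_n_run = longest_homorepeat(toy, "N")
```

Applied to a full-length protein sequence, `longest_homorepeat(sequence, "N")` would report the maximal asparagine run, the quantity quoted in the text for the P. falciparum protein.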

3. Cell stress and generation of stress granules

Cell stress can also be an important factor in the initiation of aggregation, even though each cell has developed an intricate defence system because it is regularly subjected to destructive stress. A striking example of how cells wait out stress is the formation of stress granules (SGs), in which nontranslated mRNA and RNA-binding proteins are assembled into ribonucleoprotein (RNP) complexes in order to halt protein synthesis and thereby save cell energy. In this case, only those proteins that are required for cell survival are synthesized [23]. After termination of the stress, SGs disintegrate rapidly and the “released” mRNA resumes its functioning. However, if for some reason the residence time of such proteins in SGs increases or their concentration in SGs exceeds the norm, disintegration of SGs can be impeded, creating favorable conditions for the generation of a “center of aggregation initiation” that may induce a transition to irreversible pathological protein aggregation [24]. A detailed analysis of the mechanism of assembly and disassembly of SGs can provide new insight into the development of diseases associated with this process and suggest novel therapeutic approaches. Since the main function of SGs is the protection of cells from stress, many investigations are aimed at revealing the factors that shift the balance of reversible aggregation towards pathology after stress termination (Figure 1).

Figure 1.

Schematic representation of different conformational states of self-assembly and disintegration of prion-like domains and a possible transition to irreversible pathologic aggregation. Modified and adapted from Li et al. [25].

It is assumed that RNA-binding proteins can facilitate the formation of SGs through self-assembly of their prion-like domains [26–28]. At any rate, it has been established that it is the unstructured part of RNA-binding proteins that is responsible for the formation of hydrogel and for binding to it [29]. Thus, protein FUS maintained its capacity to form gel even without the C-terminal region that corresponds to the RNA-binding domain and lost this capacity upon removal of the N-terminal unstructured region corresponding to the prion-like domain. The capacity to bind to hydrogel has been established for RNA-binding proteins such as hnRNPA2, RBM3, hnRNPA1, TIA1, CPEB2, FMRP, CIRBP, TDP43, and yeast Sup35. The formation of hydrogel was also demonstrated for the hnRNPA2 protein, and it was shown that hydrogels formed of different proteins retain proteins to different degrees [29].

4. The role of RNA-binding proteins with prion-like domains in diseases

Studies of the human genome have made it possible to identify a set of RNA-binding proteins with a canonical RNA-binding motif. Prion-like domains were predicted for 29 of the 210 RNA-binding proteins considered in connection with human diseases [30]. Ten of these 29 proteins are associated with neurodegenerative diseases, i.e., proteinopathies [31]. In this chapter, we focus on the identification of repeats in the well-studied proteins of the FET/TET family, which are included in this group of 10. To elucidate the mechanism of pathological protein aggregation, we set ourselves the task of determining which repeats of amino acid residues in RNA-binding proteins can be responsible for reversible and irreversible aggregation.

The prion-like domains predicted for TDP-43 and FUS overlap with the region containing a large number of glycine residues. The prion-like domains are critical for the aggregation of these proteins, which are associated with human neurodegenerative diseases [32]. It was found that protein TAF15 is also involved in the development of amyotrophic lateral sclerosis (ALS), or Lou Gehrig’s disease [33]. This protein was first identified as a possible candidate by bioinformatics analysis and was later found in ALS patients. It turned out that its properties are very similar to those of the TDP-43 and FUS proteins involved in this disease. The only difference is that the disease-associated mutants of the protein aggregate more strongly in vitro than the wild-type protein, exert a stronger effect on lifespan, and lead to incorrect localization of the protein in the spinal cord of mammals [33].

Among the 29 candidate RNA-binding proteins with predicted prion-like domains, the first and second places belong to FUS and TAF15, whereas TDP-43 is 10th in this list [34]. It therefore seems expedient to pay attention to the first 10 proteins of the list, in which the third place belongs to protein EWS [35]. These proteins should also be looked for in ALS patients.

5. Members of the FET family: FUS, EWS, TAF15

Members of the FET family are very similar RNA-binding proteins containing the following main domains: a SYGQ-rich N-terminal prion-like domain enriched in uncharged polar amino acids (asparagine, glutamine, serine, and tyrosine) and glycine [36], an RNA-binding motif (RNA-recognition motif, RRM), a “zinc finger” motif, and several glycine-rich RGG domains [37–39]. The three proteins (FUS, EWS, and TAF15) also have a nuclear localization signal (NLS) that is recognized by the nuclear receptor transportin and is responsible for protein transport between the cytoplasm and the nucleus [27, 40–43]. Genes of the FET family are expressed practically everywhere. The family name is formed from the first letters of the names of its three members: proteins FUS/TLS (fused in sarcoma or translocated in liposarcoma), EWS (Ewing’s sarcoma), and TAF15 (TATA-binding protein-associated factor 2N) [47]. Proteins of this family are involved in the regulation of different stages of gene expression, including transcription, pre-mRNA splicing, and mRNA transport (in particular, in neurons), and also take part in DNA repair [44–46]. As a rule, under stress conditions such RNA-binding proteins shift from the nucleus into the cytoplasm and take part in the formation of SGs, after which they return to the nucleus. This cycle is repeated many times during the cell lifetime, and it cannot be excluded that errors may occur in it under such conditions. The mechanism of activity of FET proteins, their cellular localization, and the domains involved in different processes remain to be clarified.

The prion-like domains of these proteins, which are associated with human neurodegenerative diseases, are critical for their aggregation [32]. Proteins of this family are involved not only in neurodegenerative diseases, such as ALS [33], Huntington’s disease, spinocerebellar ataxia, and dentatorubral-pallidoluysian atrophy [48], but also in the development of human myxoid liposarcoma [49–51].

5.1. FUS: Its functions, structural peculiarities of a prion-like domain, mutations associated with diseases

One member of the FET family is the RNA-binding protein FUS, for which a number of mutations associated with ALS have been found [27, 40–43, 52]. Under normal conditions, this protein is localized mostly in the nucleus, whereas under pathological conditions its aggregates accumulate in the cytoplasm. It was demonstrated in several cell models that FUS mislocalized to the cytoplasm accumulated in SGs. This led to the conclusion that a high concentration of the protein facilitates the generation of protein aggregates in FUS pathology [53, 54]. Nonetheless, the exact mechanism by which this transformation occurs remains unclear. It cannot be excluded that prion-like domains drive aggregation in the manner of prions or fibril formation analogous to amyloidogenesis. It is also possible that FUS, accumulated in large amounts, aggregates in the cytoplasm independently of its capacity for sequestration in SGs. The accumulation of the protein can also promote dysfunction of the intracellular protein degradation systems, which frequently accompanies neurodegenerative disorders [55]. It should be noted that in sporadic ALS the control of splicing of a large number of genes is disrupted [56]. The FUS protein also regulates alternative splicing, as a rule binding to transcripts containing large introns. Therefore, accumulation of FUS in the cytoplasm can result in the loss of its function in the nucleus and, consequently, in disruption of the alternative splicing of a number of genes, e.g., those coding for proteins associated with the growth and development of axons and with the cell cytoskeleton [57]. This version agrees with the results obtained in a Drosophila model, in which an increased FUS concentration in the cell was extremely toxic for the organism, which can also cause splicing dysfunction. In addition, enhanced expression of caz, the Drosophila ortholog of fus, caused death at the pupal stage [58].

In transgenic Caenorhabditis elegans worms expressing FUS with serine 57 deleted, paralysis occurred much earlier than in transgenic worms with the full-sized FUS protein [59]. The FUS sequence used corresponded to that of the human protein. This mutation (deletion of serine 57) is associated with the sporadic form of ALS [60]. It has been shown that the prion-like (1–239) and RGG2 (374–422) domains are required for aggregation and toxicity in yeast, though FUS (1–359) is sufficient for aggregation in neuroblastoma cell models [54]. Domain RGG3 does not affect aggregation, and mutations in the region 502–526 lead only to accumulation of the protein in the cytoplasm and its incorporation into SGs. In yeast, FUS accumulated only in the cytoplasm [61]. Some components of SGs, such as translation initiation factors and poly-A-binding proteins, suppress the toxicity of FUS aggregates. It is important that many proteins associated with RNA metabolism can affect the toxicity of FUS aggregates.

To verify the hypothesis that relocation of FUS from the nucleus into the cytoplasm leads to the loss of its function in the nucleus, Murakami et al. created transgenic animals expressing two genes. The product of one gene (fus), labeled at the N-terminus with red fluorescent protein (RFP), could not move into the cytoplasm under stress, whereas the product of the other (mutant GFP-fus-P525L) was uniformly distributed over both the nucleus and the cytoplasm. After heat shock, most of GFP-FUS-P525L moved into SGs, whereas RFP-FUS remained in the nucleus. The damaging effect was the same as in the experiment with expression of GFP-fus-P525L alone. Therefore, the authors concluded that accumulation and aggregation of FUS in the cytoplasm are more neurotoxic than the loss of FUS function in the nucleus [62].

In any case, the formation of such aggregates is a cause of apoptosis, i.e., programmed cell death. Apoptosis is characterized by fragmentation of intracellular components with retention of the integrity of the plasma membrane, which facilitates fast phagocytosis.

5.2. Role of FET proteins in the formation of stress granules

To understand the mechanism of assembly and disassembly of SGs, it is necessary to know which regions of the chain of RNA-binding proteins can perform the function of a prime and what the role of unstructured regions of low complexity is. It is worth noting that in many respects protein FUS is a perfect model for studying the processes involved in the formation of protein aggregates by the prion mechanism. To search for and reveal the properties of prion-like domains, we have chosen the three proteins of the FET/TET family from among the 29 RNA-binding proteins of the human proteome whose structures include prion-like domains [31]. The prediction was made using the algorithm developed by Alberti et al. [63], which is based on the selection of protein regions of 60 amino acid residues similar in amino acid composition to the prion domains of yeast proteins such as Sup35, Ure2p, and Rnq1p [64]. As a rule, these regions are rich in hydrophilic amino acid residues such as glutamine, asparagine, and tyrosine. In the ranking of the 29 candidate RNA-binding proteins with predicted prion-like domains, the first and second places belonged to FUS and TAF15, and the third to protein EWS [35].
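The idea of scanning a sequence in windows of 60 residues for prion-like composition can be illustrated with a deliberately simplified sketch. This is not the algorithm of Alberti et al. [63]; it merely reports, for each window, the fraction of residues belonging to a chosen set (glutamine, asparagine, and tyrosine by default). The function names and the toy sequence are hypothetical.

```python
def window_fractions(seq: str, residues: str = "QNY", window: int = 60):
    """Fraction of `residues` in every window of length `window` along `seq`."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(seq) < window:
        return []
    target = set(residues)
    count = sum(aa in target for aa in seq[:window])
    fractions = [count / window]
    # Slide the window one residue at a time, updating the running count.
    for i in range(window, len(seq)):
        count += (seq[i] in target) - (seq[i - window] in target)
        fractions.append(count / window)
    return fractions


def best_window(seq: str, residues: str = "QNY", window: int = 60):
    """Return (start, fraction) of the first window with the highest fraction."""
    fractions = window_fractions(seq, residues, window)
    if not fractions:
        return None
    start = max(range(len(fractions)), key=fractions.__getitem__)
    return start, fractions[start]


# Toy example with a short window so the result is easy to check by hand.
toy = "A" * 10 + "QN" * 5 + "A" * 10
toy_best = best_window(toy, window=10)
```

For the toy sequence the best ten-residue window starts at position 10 and consists entirely of Q and N. A real predictor would compare the composition with that of known yeast prion domains rather than count a fixed residue set.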

Experiments on the capacity of protein FUS to aggregate demonstrated that upon deletion of most of the predicted prion-like domain the protein did not lose its capacity to self-assemble; however, the aggregates formed were not toxic (Figure 2) [67]. Some components of SGs, such as translation initiation factors and poly-A-binding proteins, suppress the toxicity of FUS aggregates [67]. It is important that proteins associated with RNA metabolism can affect the toxicity of FUS aggregates. The prion-like (1–239) and RGG2 (374–422) domains are also required for aggregation and toxicity in yeast, although FUS (1–359) is sufficient for aggregation in neuroblastoma cell culture [68]. Domain RGG3 does not affect aggregation, and mutations in the region 502–526 result only in accumulation of the protein in the cytoplasm and its incorporation into SGs [24, 69]. It should be mentioned that, in contrast to mammalian cells, in yeast cells protein FUS accumulates largely in the cytoplasm [61]. This may be connected with the fact that the NLS of FUS is not recognized by the nuclear receptors of yeast [65]. In mammalian cells, protein FUS accumulates in the cytoplasm only when it carries mutations that disrupt its transport back into the nucleus [69].

Figure 2.

Effect of different FUS constructs on aggregation and toxicity in yeast and on aggregation and localization in SGs in SH-SY5Y neuroblastoma cell culture [65, 66].

The N-terminal domain of FUS contains 27 triplets of the GYG, GYS, SYG, and SYS types (which can be designated as [G/S]Y[G/S] repeats) [70]. To demonstrate that it is the tyrosine residues that are responsible for the formation of hydrogel, four mutants were obtained with different numbers of tyrosine residues substituted by serine: 5, 9, 15, and all 27. None of the mutants could form hydrogel; however, they differed in their ability to bind to it. The mutants with 5 and 9 substituted residues could bind to hydrogel, whereas the remaining mutants could not [70].
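The [G/S]Y[G/S] notation maps directly onto a regular expression, so the counting of triplets and the tyrosine-to-serine substitutions can be sketched as follows. The helper names and the toy sequence are hypothetical, and replacing the first n tyrosines is an arbitrary choice made for illustration; the text does not specify which tyrosines were replaced in each mutant.

```python
import re

# A lookahead makes the search report overlapping triplets as well.
TRIPLET = re.compile(r"(?=([GS]Y[GS]))")


def find_triplets(seq: str):
    """Return (start, triplet) for every [G/S]Y[G/S] occurrence in `seq`."""
    return [(m.start(), m.group(1)) for m in TRIPLET.finditer(seq)]


def substitute_tyrosines(seq: str, n: int) -> str:
    """Replace the central tyrosine of the first `n` triplets by serine."""
    chars = list(seq)
    tyrosine_positions = [start + 1 for start, _ in find_triplets(seq)]
    for pos in tyrosine_positions[:n]:
        chars[pos] = "S"
    return "".join(chars)


# Toy sequence, not the FUS N-terminal domain.
toy = "AGYGSYSQQSYGA"
toy_triplets = find_triplets(toy)
toy_mutant = substitute_tyrosines(toy, 2)
```

Running `find_triplets` on the N-terminal domain of FUS would be expected to recover the triplets counted in [70], provided the same domain boundaries are used.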

5.3. Disordered regions in proteins of the FET family and search for partners for interactions with these proteins

For all three proteins, the IsUnstruct program [15] predicts the presence of unstructured domains, as a rule, at the N and C termini of the polypeptide chain. Unstructured proteins often play the role of hubs, i.e., have a capacity to concentrate a large number of partners around them (it is accepted that when there are more than five partners, it is a hub) [71]. To what extent is this role validated? To answer this question, it is necessary to determine the presence of functional sites in the protein considered. The search for the number of partners in the STRING database revealed that it exceeds 5 [72] (see Figure 3). Usually upon binding to a partner, natively unfolded proteins can acquire a structure that imparts a certain function to them. In other words, the conformation of unstructured proteins is “dictated” by the interaction partners. This explains their capacity to perform different functions both in the cell and in the extracellular space. For example, the analyzed RNA-binding proteins have regions with large amounts of glycine in addition to large amounts of asparagine, glutamine, and tyrosine, which also facilitates their unfolded state and performance of various functions in the cell because these proteins are involved in the formation of RNP complexes, control of DNA transcription, pre-mRNA splicing, protein posttranslational modification, and many other vital functions [73]. According to the STRING database (version 10), the number of partners is 44 for TAF15, 132 for EWS (Figure 3), and 218 for FUS.
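The hub criterion described above (more than five partners) can be written down as a short sketch. The partner counts are those quoted in the text for STRING version 10, and the score cutoff of 0.4 is the one stated in the caption of Figure 3; the function names and the toy edge list are hypothetical, and no actual STRING query is performed.

```python
HUB_MIN_PARTNERS = 5   # a protein with more than five partners is treated as a hub
SCORE_CUTOFF = 0.4     # interaction probability cutoff given in the Figure 3 caption


def count_partners(edges, cutoff: float = SCORE_CUTOFF) -> int:
    """Count distinct partners whose interaction score reaches the cutoff.

    `edges` is an iterable of (partner_name, score) pairs.
    """
    return len({partner for partner, score in edges if score >= cutoff})


def is_hub(n_partners: int, threshold: int = HUB_MIN_PARTNERS) -> bool:
    """True if the number of partners exceeds the hub threshold."""
    return n_partners > threshold


# Partner counts quoted in the text (STRING, version 10).
reported_partners = {"TAF15": 44, "EWS": 132, "FUS": 218}
hub_status = {name: is_hub(n) for name, n in reported_partners.items()}

# Hypothetical edge list, used only to show the filtering step.
toy_edges = [("P1", 0.9), ("P2", 0.41), ("P3", 0.39), ("P4", 0.4), ("P1", 0.5)]
toy_count = count_partners(toy_edges)
```

With the counts from the text, all three FET proteins satisfy the hub criterion.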

Figure 3.

The list of partners for EWS obtained from the STRING database using information from databases, data on homology, possible co-expression, experimental confirmation of interactions, etc. The threshold for inclusion in the list of partners is an interaction probability of 0.4. The transparency of the edges connecting the vertices of the graph indicates the probability of interaction: the more transparent the edge, the lower the probability of interaction.

These proteins contain many motifs of low complexity. As shown, the portion of proteins containing periodic fragments or homorepeats is an order of magnitude lower in eukaryotic proteomes than in bacterial ones [74, 75]. Proteins with periodic fragments are distributed extremely nonuniformly both over the kingdoms and over the organisms within each kingdom [74, 75]. It is worth underlining that these repeated motifs can be located in a region not predicted as prion-like. It is known that protein Ure2 has regions in the carboxy-terminal domain that affect the capacity of the amino-terminal domain to become prion-like [76]. The available data for FUS allow us to conclude that the presence of both a prion-like domain and a C-terminal region corresponding to the RGG2 region (see Figure 2) is important for pathological aggregation. It is most likely that the RNA-binding domain also contributes to pathological aggregation. The RNA-binding motif (RRM) in proteins FUS and TAF15 is highly identical and is conserved from species to species. In contrast, the RRM in protein EWS differs markedly from that of the other members of the family and is not 100% identical in different species. For example, the following amyloidogenic regions, revealed using the FoldAmyloid program [77], can be indicated in the RNA-binding domain (RRM) of the FET proteins: AIYVQ/ADFFK/MIHIYL/VEWFD for EWS; TIFVQ/INLYT/IDWFD for TAF15; and TIFVQ/INLYT/IDWFD for FUS.

The prediction of unstructured regions in proteins of the FET/TET family using the IsUnstruct program allowed us to identify two unstructured domains at the N and C termini and one structured region corresponding to the RNA-binding domain. Figure 4 shows probability profiles for the amino acid residues of the FET/TET proteins, which make it possible to determine whether they are structured or unstructured. Motifs in the amino acid sequence are shown in different colors. These proteins are characterized by the presence of a large number of homorepeats, in which one amino acid recurs many times. Generally, the longer the repeats, the higher the probability that the aggregated protein containing them is associated with the development of a disease. For example, protein FUS is characterized by five unstructured patterns from the pattern library that we obtained from the Protein Data Bank: GSHM, GGGGSGG, GGGGG, GGSGGGGSGGG, and RGGGGSG. The occurrence of these patterns in the given protein in different organisms (human, monkey, pig, mouse, chicken, and fish) can be found in our HRaP database, which contains data on the occurrence of unstructured patterns and homorepeats in 122 proteomes [14]. For protein TAF15, a glycine-rich recurrent motif is clearly distinguishable at the C-terminus: DRGGGYGG/DRSSGGGYSG/DRGSRGGYGG, which is characteristic of many animals and fish [78]. We identified 22 repeats, whereas UniProt lists 21 repeats at the C-terminus: GRGGRGG/DRGGYGG (Figure 4). As concerns EWS, we observe 14 SYSQAPS repeats in the prion-like domain (the N-terminal part) and 6 repeats (DRGRGGPGG) in the C-terminal part (Figure 4). It should be noted that for FUS, 15 imperfect repeated motifs (QPGQGYSQQSS) are positioned in the prion-like region (the N-terminal part) and four repeats (DDRRGGRGGY) in the C-terminal one (Figure 4).

As is known, protein regions enriched in glycine residues cannot have a rigid spatial structure; therefore, the main function of such a region is determined by the adjacent amino acid residues. It should be noted that in the mentioned proteins with highly toxic aggregation, the glycine repeats adjoin arginine, serine, and tyrosine. Domains rich in glycine and arginine (RGG) are known to be responsible for the interaction of proteins with each other and with RNA. As a rule, these interactions are controlled by methylation of arginine [79], whereas phosphorylation of serine residues affects the direct mutual interaction of prion-like domains [45, 80]. Of interest is the fact that deletion of serine 57 (S57), the mutation associated with the sporadic form of ALS [81], induced paralysis in transgenic C. elegans worms much faster than in transgenic worms with the full-sized human FUS [82]. These data suggest that phosphorylation and dephosphorylation of serine residues are critical not only for the self-assembly of prion domains but also for the disassembly of aggregates when required, e.g., upon termination of the stress action on the cell. It was assumed that the presence of tyrosine residues facilitates the formation of hydrogel. In this connection, it should be mentioned that in spite of the high similarity of these proteins, the cytoplasmic aggregation of FUS is more toxic than that of the other members of the family, which correlates with the longer glycine repeats in the amino acid sequence of FUS [33].

Figure 4.

Probability profiles of amino acid residues in proteins of the FET/TET family (A for FUS, B for TAF15, and C for EWS) according to which it is possible to predict possible formation of the structure or its absence by the IsUnstruct program [15].

5.4. Zinc-finger motif in proteins of the FET family

All proteins of the FET/TET family are characterized by the presence of a zinc-finger motif. The exclusion is the EWS protein in chickens, because here the zinc-finger motif has not been determined [78]. As known, a classic zinc-finger motif forms a loop, where two cysteine residues and two histidine residues bind zinc ions. The main function of a classic zinc-finger motif is the binding of DNA, which corresponds to its structure consisting of two to three beta-sheets in the N-terminal region of the protein and one alpha-helix in its C-terminal region. As for the FET/TET family of proteins, the amino acid sequence of the zinc-finger motif in them differs significantly from the classic consensus motif (Cys-X2–4-Cys-X3-Phe-X5-Leu-X2-His-X3-His) [83]. It should be noticed here that the amino acid sequence of this motif in proteins FUS and EWS is highly similar, which is preserved in all organisms studied by us. Our plots on the prediction of the structure demonstrated quite well the correspondence of this motif and the predicted structure (Figure 4: proteins FUS and EWS, second peak from the bottom, blue). On the contrary, in TAF15, this motif differs somewhat not only from FUS and EWS but also in the organisms studied by us; according to our prediction, it forms no structure (Figure 4). It is important that in proteins of the FET/TET family the zinc-finger motif occurs only once, contrary to the classic variant when it occurs as tandem repeats. As a rule, if the zinc-finger motif occurs once and its sequence differs considerably from the canonical one, the functions of this motif can differ remarkably from the classic motif. For example, it can both be bound to RNA and have no relation to the binding of nucleic acids [83]. The removal of this motif together with the terminal part of the FUS molecule did not affect the protein ability to aggregate and have toxicity either in yeasts or in the cell culture of the SH-SY5Y neuroblastoma [67]. 
Additional studies should be conducted to reveal the functions of the zinc-finger motif in proteins of the FET/TET family.
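
To make the comparison with the classic consensus concrete, the motif quoted above (Cys-X2–4-Cys-X3-Phe-X5-Leu-X2-His-X3-His) can be written as a regular expression and used to scan a sequence. The following is only a minimal sketch and not the prediction method used in this chapter: the function name and the toy sequence are hypothetical, and a strict pattern match of this kind says nothing about whether a divergent, FET-type motif is functional.

```python
import re

# Classic consensus as quoted in the text:
# Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3-His, where X is any residue.
CLASSIC_ZNF_CONSENSUS = re.compile(r"C.{2,4}C.{3}F.{5}L.{2}H.{3}H")


def find_classic_zinc_fingers(sequence):
    """Return (start, end, matched_text) for each non-overlapping consensus match.

    Positions are 0-based; end is exclusive. The name is hypothetical.
    """
    sequence = sequence.upper()
    return [(m.start(), m.end(), m.group())
            for m in CLASSIC_ZNF_CONSENSUS.finditer(sequence)]


# Hypothetical toy sequence constructed to satisfy the consensus;
# it is not taken from any real protein.
toy_sequence = "MKCAACGKAFSRSDELTRHIRTHTG"

for start, end, text in find_classic_zinc_fingers(toy_sequence):
    print(start, end, text)
```

A sequence whose zinc-finger region deviates from the consensus, as described above for the FET/TET family, would simply return an empty list with this strict pattern.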


6. Conclusion: Repeats are a general characteristic of prion-like domains

In this study, we have analyzed RNA-binding proteins with prion-like domains, such as TAF15, FUS, and EWS. Using the FoldAmyloid program [77], we revealed amyloidogenic regions in the RRM domain of all three proteins of the FET family. This allows us to suggest that when RNA binding is disrupted, the proteins can aggregate spontaneously, forming amyloid fibrils. At the same time, FUS can aggregate both upon removal of the RRM and upon point mutations in this domain [65, 66]. Further studies are needed to clarify whether aggregation of FUS in the presence of the RRM avoids the formation of irreversible pathological aggregates and proceeds only by assembly and possible disassembly under favorable conditions, analogous to the functioning of SGs. We have also found that in all members of the FET/TET family the zinc-finger motif differs from the classical one both in its arrangement (it occurs once rather than as tandem repeats) and in its amino acid sequence. Moreover, according to our prediction, in the FUS and TAF15 proteins, it does not even form any structure. On this basis, we propose that in proteins of the FET/TET family the zinc-finger motif is not responsible for DNA binding but performs other functions, the determination of which requires further investigation.

We have also predicted unstructured regions corresponding both to prion-like domains and to additional regions, in which we have revealed several recurrent amino acid motifs (tandem repeats). In addition, these proteins are characterized by homorepeats, in which a single amino acid is repeated many times. The length of homorepeats can affect not only the capacity to aggregate but also the toxicity of the aggregated state. Correspondingly, the larger the number of repeats, the higher the probability that the aggregated protein containing them is associated with the development of disease. We have established that FUS contains five disordered patterns from the pattern library obtained from the Protein Data Bank: GSHM, GGGGSGG, GGGGG, GGSGGGGSGGG, and RGGGGSG. These patterns are not located in the prion-like domain predicted for FUS. They are not characteristic of other members of this family, which are known to be less toxic than FUS. When the larger part of the prion-like domain of FUS was removed (100 of its 165 residues), the protein retained its capacity to aggregate [65] but completely lost the ability to form a gel and to display toxicity. These results allow us to suggest that the presence of short repeats in the unstructured prion-like domain of RNA-binding proteins is required for the fast formation of a dynamic cross-beta structure of SGs [84].
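
As an illustration of how homorepeats and the five disordered patterns listed above could be located in a sequence, the following minimal sketch uses plain string searching. It is not the pipeline used in this study: the function names, the minimum homorepeat length of four residues, and the toy sequence are all assumptions made for the example.

```python
import re

# The five disordered patterns reported in the text for FUS.
DISORDERED_PATTERNS = ["GSHM", "GGGGSGG", "GGGGG", "GGSGGGGSGGG", "RGGGGSG"]


def find_homorepeats(sequence, min_length=4):
    """Return (residue, start, length) for every run of a single residue.

    min_length=4 is an arbitrary threshold assumed for this sketch.
    Positions are 0-based.
    """
    runs = []
    # (.)\1* matches one residue followed by any number of copies of itself.
    for m in re.finditer(r"(.)\1*", sequence):
        if len(m.group()) >= min_length:
            runs.append((m.group(1), m.start(), len(m.group())))
    return runs


def find_pattern_positions(sequence, patterns=DISORDERED_PATTERNS):
    """Map each pattern to the 0-based starts of all occurrences, overlaps included."""
    positions = {}
    for pattern in patterns:
        starts = []
        index = sequence.find(pattern)
        while index != -1:
            starts.append(index)
            # Advance by one residue so overlapping occurrences are also found.
            index = sequence.find(pattern, index + 1)
        positions[pattern] = starts
    return positions


# Hypothetical toy sequence, not a real protein fragment.
toy_sequence = "MASGGGGGSGGQQQQRGGGGSG"

print(find_homorepeats(toy_sequence))
print(find_pattern_positions(toy_sequence))
```

Comparing the reported positions with the boundaries of a predicted prion-like domain would then show whether a given pattern falls inside or outside that domain.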


I thank T.B. Kuvshinkina, E.I. Leonova, I.V. Sokolovsky, and N.V. Dovidchenko for assistance in the chapter preparation. This study was supported by the Russian Science Foundation No. 14-14-00536.

© 2016 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Oxana V. Galzitskaya (September 14th 2016). Influence of Repeats in the Protein Chain on its Aggregation Capacity for ALS-Associated Proteins, Update on Amyotrophic Lateral Sclerosis, Humberto Foyaca Sibat and Lourdes de Fatima Ibañez Valdés, IntechOpen, DOI: 10.5772/63104. Available from:
