Molecular Evolution Within Protease Family C2, or Calpains

Calpains, or intracellular Ca2+-dependent proteases (EC 3.4.22.17, family C2, cysteine protease clan CA), are one of the most important proteolytic systems in the cytosol of eukaryotic and some prokaryotic cells (reviews: Croall & DeMartino, 1991; Goll et al., 2003; Sorimachi et al., 2011a). The calpain family is ancient and diverse: in the structure of these proteins, highly variable modules flank a conserved proteolytic core. Linking the protease catalytic domain to ancillary domains, i.e. specialized functional modules with their own spectrum of non-proteolytic activities and binding partners, extends calpain functions to multiple cellular processes. Calpains are processing proteases that cleave their specific substrates at one or a limited number of sites, modulating substrate structure and activity rather than degrading the substrate. They are versatile proteases that have been implicated in diverse calcium-mediated cellular signaling pathways, such as cytoskeleton remodeling, cell-cycle regulation, differentiation, and cell death (Croall & DeMartino, 1991; Carafoli & Molinari, 1998; Goll et al., 2003; Nemova et al., 2010). Calpains have also been proposed to participate in chromosome rearrangements during mitosis (Schollmeyer, 1988), microtubule assembly and disassembly (Billger et al., 1988), intracellular signaling, motility, and vesicle traffic (Choi et al., 1997; Huttenlocher et al., 1997; Lu et al., 2002; Li & Iyengar, 2002). In mammals and plants, calpains are clearly of critical importance for development (Wang et al., 2003; Dutt et al., 2006; Sorimachi et al., 2010). Furthermore, impaired calpain activity due to mutations or misregulation has been implicated in a variety of pathological conditions, including muscular dystrophy, ischemia, diabetes mellitus, cancer, and neurodegenerative disease (Tidball & Spencer, 2000; Huang & Wang, 2001; Crocker et al., 2003; Mamoune et al., 2003; Suzuki et al., 2004; Zatz & Starling, 2005).
Calpains have attracted much attention because of the recent discovery of correlations between calpain gene mutations and human diseases, together with the elucidation of their three-dimensional structure (Hosfield et al., 1999; Strobl et al., 2000) and Ca2+-induced activation mechanisms (Moldoveanu et al., 2002; 2004). Because these enzymes participate not only in normal intracellular signal transduction cascades but also in various pathological states, calpain research has attracted great interest across the life sciences, in both basic and clinical terms.


Introduction
Calpains are not specific for particular amino acid residues or sequences but recognize bonds between domains. As a consequence, calpains hydrolyze substrate proteins in a limited manner, producing large fragments that retain intact domains. Calpains are therefore regarded as biomodulators of their substrate proteins (Saido et al., 1994; Sorimachi et al., 1997; Carafoli & Molinari, 1998; Suzuki et al., 2004). Not all in vitro calpain substrates are cleaved by calpains in vivo. Certain extracellular proteins, such as fibronectin or factor V, are unlikely to be cleaved by calpains in vivo, because the enzymes are localized exclusively inside the cell except in diseased or injured tissues.

Some well-documented and proposed calpain functions
The wide distribution of this protease family among living organisms indicates evolutionary conservation of essential calpain functions. Physiological cell-based studies show that the calpain system is involved in functions as diverse as cytoskeletal/plasma membrane attachments, cell motility, signal transduction through the activation of some signaling molecules and the assembly of focal adhesions, progression through the cell cycle and regulation of gene expression, some apoptotic pathways, and long-term potentiation (review: Goll et al., 2003).
By modifying integrins, cadherins, and cytoskeletal proteins, calpains may contribute to the developmental regulation of cell adhesion (Franco & Huttenlocher, 2005; Mellgren et al., 2009). Some cases of membrane fusion, as in myoblast differentiation (Kwak et al., 1993), have also been reported to involve calpain activity. In Schistosoma mansoni, calpain may regulate surface membrane biogenesis (Sorimachi et al., 1993). A controversial but attractive topic is the possible involvement of calpain in cell division (Schollmeyer, 1988) and in long-term potentiation (del Cerro et al., 1990; Suzuki et al., 1992).
Non-proteolytic calpain homologues, which carry substitutions in the classical Cys/His/Asn triad or a truncated catalytic core domain, are thought to take part in regulation of the calpain system through non-productive substrate binding as well as inhibitor binding. The existence of calpains with this structural feature indicates that they have specific non-proteolytic functions, the nature of which has not yet been established. Such non-proteolytic function(s) may extend to proteolytically active homologues as well.

Pathological implications of calpains
A number of pathological conditions have been associated with disturbances of the calpain system. Calpain-associated pathologies, so-called "calpainopathies", are related to genetic defects in calpain polypeptides as well as to calpain misregulation caused by loss of Ca2+ homeostasis in the cell.
Mutations in some calpain genes are linked to diseases such as senile diabetes and muscular dystrophy. Disease-related disruptions of the CAPN3 gene are linked to loss of proteolytic activity of the muscle-specific calpain 3a (Zatz & Starling, 2005; Dugues et al., 2006). Specific splice variants of CAPN3 occur in the lens of the eye and are linked to the formation of cataracts (Shih et al., 2006). Limited proteolysis of crystallins by abnormally activated calpain is likely to induce lens protein insolubilization during aging. Variations in the CAPN10 gene may contribute to increased susceptibility to a multifactorial disease, type 2 diabetes (Horikawa et al., 2000; Goll et al., 2003; Turner et al., 2005). Calpain 10 may contribute to stimulated secretion and/or pancreatic cell death (Johnson et al., 2004; Marshall et al., 2005), and thereby be relevant to this disease.
Calpainopathy appears to be caused primarily by compromised protease activity rather than by damaged structural properties. Calpains are believed to be strongly related to certain conditions accompanied by calcium misregulation, including neurodegeneration. Calpains are involved in long-term potentiation, Alzheimer's disease (Suzuki et al., 1992; Saito et al., 1993; Muller et al., 1995) and ischemia (Blomgren et al., 1995; review: Siman et al., 1996). Given the absence of brain-specific calpain species, it is presumably the ubiquitously expressed μ- and/or m-calpains that are involved in brain-specific functions. Neuronal degeneration in the postischemic brain has been shown to involve the activation of various signal transduction-related elements, such as excitatory amino acid receptors, protein kinase C, μ-calpain, and other Ca2+-dependent enzymes (Schmidt-Kastner & Freund, 1991).
Besides the specific roles of tissue-specific calpains in their target organs, conventional calpains tend to be overactivated in muscular dystrophies, cardiomyopathies, traumatic ischemia, and lissencephaly, probably because these diseases compromise intracellular Ca2+ homeostasis. These observations suggest that calpains function as mediators of the pathological process, not necessarily as the ultimate destroyers of cells. The involvement of calpain in pathological states is of particular clinical interest because further research may eventually lead to therapeutic applications.

Catalytic core
Calpain domain II is the proteolytic papain-like domain with the typical catalytic triad Cys/His/Asn (clan CA) (Arthur et al., 1995) ('CysPc', or cd00044, in the NCBI conserved domain database). It shares little sequence homology with other cysteine proteases, however, and is likely to have evolved from a different ancestral gene. The predicted subdomain IIa contains the active-site Cys105, whereas the remaining residues of the catalytic triad (His262 and Asn286) are located in subdomain IIb. Crystallographic data indicate that the catalytic residues of subdomains IIa and IIb are uncoupled in the absence of Ca2+ (distance 10.5 Å) (Hosfield et al., 1999; Strobl et al., 2000), and that Ca2+ binding induces conformational changes in the full-length molecule that allow assembly of the active site. Domain II itself has an affinity for Ca2+ that is essential for enzyme activity, since it contains two Ca2+-binding sites located in peptide loops (Moldoveanu et al., 2004).
Because calpains have a complex domain structure, with highly divergent N- and C-terminal regions flanking the conserved catalytic core, only the catalytic domain II sequence, rather than a complete amino acid alignment, is used to establish protein homology. The degree of similarity required to assign a polypeptide to the calpain family has not yet been determined. Depending on the molecule, amino acid identity in the domain II region varies from less than 30% to more than 75%. For example, a protease from the prokaryote Porphyromonas gingivalis that was recently assigned to the family contains a catalytic domain almost equally similar to μ-calpain on the one hand and to papain on the other, and its enzymatic properties differ drastically from those of μ- and m-calpains (Bourgeau et al., 1992).
Some family members (calpain 6 and demi-calpain of vertebrates, CALPC of Drosophila, some C. elegans homologues, and numerous calpains of kinetoplastid parasites) seem to be catalytically inactive proteases, owing either to substitutions of one or more amino acid residues located in critical active-site regions or to a constitutively truncated catalytic domain.
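The identity percentages quoted for domain II can be made concrete with a minimal sketch. It assumes two sequences that have already been aligned to equal length and counts identical residues over gap-free columns only; the function name, the gap convention, and the toy fragments are illustrative assumptions, not the procedure used in the cited studies.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not real calpain sequences).
toy_a = "GDCWLLAA-IA"
toy_b = "GNCWFLSAKIA"
print(percent_identity(toy_a, toy_b))
```

Whether gap columns are counted in the denominator changes the reported value, which is one reason identity figures from different studies are not always directly comparable.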
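As a rough illustration of how a predicted loss of proteolytic activity can be flagged from sequence, the sketch below checks the three triad positions given in the text (Cys105, His262, Asn286). It assumes the homologue's residues have already been mapped onto that reference numbering; the function names and the example inputs are hypothetical.

```python
from typing import Dict

# Catalytic triad positions and residues as quoted in the text for domain II.
TRIAD_REFERENCE = {105: "C", 262: "H", 286: "N"}


def triad_substitutions(residues: Dict[int, str]) -> Dict[int, str]:
    """Return triad positions whose residue differs from Cys/His/Asn.

    `residues` maps reference positions to one-letter residues of the
    homologue. A position absent from the mapping (for example in a
    truncated catalytic domain) is reported as '-'.
    """
    changed = {}
    for position, expected in TRIAD_REFERENCE.items():
        observed = residues.get(position, "-").upper()
        if observed != expected:
            changed[position] = observed
    return changed


def has_intact_triad(residues: Dict[int, str]) -> bool:
    """True when all three triad residues are present and unsubstituted."""
    return not triad_substitutions(residues)


# Hypothetical inputs for illustration only.
intact = {105: "C", 262: "H", 286: "N"}
substituted = {105: "K", 262: "H", 286: "N"}
truncated = {105: "C"}
for label, example in [("intact", intact), ("substituted", substituted), ("truncated", truncated)]:
    print(label, has_intact_triad(example), triad_substitutions(example))
```

An intact triad is necessary but not sufficient for activity, so a check of this kind can only suggest candidates for the non-proteolytic group.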

Other structural and functional modules
Several functional domains are found at the N- or C-terminus of the protease core. Calpain genes are the products of evolutionary combinations of several ancestral unit genes, that is, genes for the C2-like, penta-EF-hand, C2-containing T, SOH, PBH, calpastatin-like, Zn-finger, and transmembrane domains.
A typical member of the C2 family, calpain 2 (the catalytic 80-kDa subunit of m-calpain), is the prototype used to designate other polypeptides as calpains. Calpain 2 has the classical four-domain architecture, in which domain II is proteolytic (Fig. 1), domain III is a C2-like domain and domain IV is a Ca2+-binding domain of the PEF family. Other calpain homologues combine the "classical" domains with the others listed above. Based on its amino acid sequence, calpain 2 consists of four domains (Suzuki, 1990); according to X-ray diffraction data (Hosfield et al., 1999; Strobl et al., 2000), in the absence of Ca2+ domain II is subdivided into two subdomains carrying different components of the catalytic triad (Cys or His/Asn, respectively), and there is a linker region between domains III and IV.
The N-terminal domain I is only 18-20 amino acid residues long and has no sequence homology with any known polypeptide. This propeptide region is autolyzed upon activation. According to crystallographic data, the propeptide does not sterically block the active site of the enzyme (Hosfield et al., 1999; Strobl et al., 2000), as it does in most protease zymogens; the activation mechanism of calpains is instead recognized as calcium-induced interdomain autoactivation (Goll et al., 2003; Benyamin, 2006).
Domain III couples the catalytic domain II to the Ca2+-binding domain IV and accelerates Ca2+-induced conformational changes (Tompa et al., 2001). The spatial organization of domain III is determined by eight antiparallel β-strands. This domain shows a functional (but low sequence) resemblance to domains of other Ca2+-regulated proteins, such as the conserved domain 2 of protein kinase C, or C2; for this reason, calpain domain III is designated a C2-like domain (or C2-L) (Fig. 1). Functionally, C2 and C2-L domains have an affinity for Ca2+ and are able to associate with internal membrane phospholipids in a Ca2+-dependent manner (Kawasaki & Kawashima, 1996; Hosfield et al., 1999; Tompa et al., 2001). Analysis of the domain III amino acid sequence indicates that it includes two potential Ca2+-binding EF-hand motifs, which are functionally active in the calpain of the flatworm S. mansoni but apparently do not bind Ca2+ in mammalian calpains (Goll et al., 2003).
The sequence of calpain 2 domain IV represents a distinct module in the protein; it is similar to calmodulin (24-44% identical residues for the human molecules) and contains five putative Ca2+-binding EF-hand motifs (Fig. 1), so classical calpains belong to the penta-EF-hand (PEF) protein family. In addition, several alternative mechanisms for binding calcium and associating with membrane phospholipids are found throughout the family. Since several substrates have been shown to bind to domain IV (Noguchi et al., 1997; Shinozaki et al., 1998), this domain is likely to be important for substrate recognition by calpain. In calpains that lack domain IV, this function may be assumed by other protein-interactive domains, such as the C2-containing domain T and, possibly, the SOH and PBH domains (see below).
The SOH domain is a conserved C-terminal module with a putative substrate-recognition function, found in the Drosophila protein SOL (small optic lobes) and in calpain 15, the vertebrate homologue of SOL.
Zn-finger motifs, located in the N-terminal Zn-finger domain of some calpains such as SOL (Kamei et al., 1998), function as intermolecular interaction modules able to bind DNA, RNA, proteins or small molecules.
The T-module, or classic C2 domain, shares both structural and functional similarity with the conserved domain 2 of protein kinase C, or C2. This Ca2+-binding module associates with membranes and is involved in signal transduction and transmembrane traffic (Barnes & Hodgkin, 1996).
The PBH domain, or MIT (microtubule-interacting and transport) domain, is usually found in AAA-ATPases and in some proteins lacking ATPase domains, such as PalB from Aspergillus (Mugita et al., 1997); it is responsible for association with microtubules and for the intracellular traffic of molecules.
The IQ domain, found only in calpain 16 (demi-calpain), mediates interaction with the Ca2+-binding protein calmodulin; calpain 16 has no Ca2+-binding motifs of its own.
Some calpains from protists contain a CSTN domain that is weakly similar to calpastatin; the role of this domain has not yet been elucidated.
Transmembrane domains (usually in tandem) have been found at the N-terminus of the plant calpain, or phytocalpain, and of some ciliate calpains, which are therefore regarded as constitutively membrane-anchored proteins. Although membrane binding is not well substantiated for classical calpains, the predicted transmembrane segments in phytocalpain and some ciliate calpains suggest an evolutionary link between calpain function and membranes.
Domain I_K (K for kinetoplastids) of calpain-like proteins is highly conserved in sequence in one of the two kinetoplastid phyletic branches and contains three main consensus motifs: GLLF/Y toward the N-terminus, WAFYNDT in the center, and VYPxETE toward the C-terminus; its function has not yet been elucidated. Acylation motifs are found only in I_K-containing kinetoplastid proteins. The other kinetoplastid calpain-like proteins have domain I_H (H for heterogeneous), an N-terminal sequence that is highly variable in both intra- and interspecies comparisons (Ersfeld et al., 2005).
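The three consensus motifs described above for the conserved kinetoplastid N-terminal domain can be searched for with simple patterns. The sketch below is a minimal illustration: it reads "F/Y" as either residue and "x" as any single residue, and requires the motifs to occur in N-to-C order. The function name and the toy sequence are hypothetical.

```python
import re

# Consensus motifs from the text, written as regular expressions.
# "F/Y" is read as either residue; "x" is read as any single residue.
CONSENSUS_MOTIFS = [
    ("GLLF/Y", re.compile(r"GLL[FY]")),
    ("WAFYNDT", re.compile(r"WAFYNDT")),
    ("VYPxETE", re.compile(r"VYP.ETE")),
]


def find_consensus_motifs(sequence: str) -> dict:
    """Return {motif name: 0-based start} for motifs found in N-to-C order."""
    hits = {}
    search_from = 0
    upper = sequence.upper()
    for name, pattern in CONSENSUS_MOTIFS:
        match = pattern.search(upper, search_from)
        if match is None:
            continue  # motif absent downstream of the previous hit
        hits[name] = match.start()
        search_from = match.end()
    return hits


# Toy sequence built for illustration (not a real kinetoplastid protein).
toy = "MAGLLFQQQWAFYNDTAAAVYPSETEKK"
print(find_consensus_motifs(toy))
```

A sequence in which all three names appear in the result carries the full motif set in the expected order; real domain assignment would rely on alignment-based profiles rather than exact patterns.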
Transient interaction or co-localization with cellular membranes plays a critical role in calpain activation and in the calpain response to Ca2+ signaling (Cong et al., 1989; Gil-Parrado et al., 2003). Several ancillary domains, or even auxiliary non-calpain polypeptides, are responsible for such interaction: (1) the acylated N-terminal region of calpain-like proteins of the kinetoplastids L. major and T. brucei, which allows association with the cytoplasmic face of membranes and with lipid rafts and contributes to signal transduction (Tull et al., 2004; Croall & Ersfeld, 2007); (2) the Gly cluster (conserved peptide GTAMRILGGVI) located in the N-terminal domain of the small subunit, which forms a membrane-penetrating α-helical structure (Dennison et al., 2005) and provides a mechanism for the binding of calpains 1 and 2 to membranes; (3) for many calpains, the C2-L domain III, which provides an additional or alternative mechanism for membrane association through its phospholipid-binding properties. None of the conventional calpains is acylated; however, some of the established in vivo substrates of mammalian calpains, the cytoskeletal proteins vinculin, spectrin, ankyrin, and band 4.1, are themselves acylated and membrane-associated (Maretzki et al., 1990; Bhatt et al., 2002).

Classification of calpains
Conventionally, calpains are subdivided on the basis of domain composition into two classes, typical and atypical, and on the basis of polypeptide composition into monomeric and oligomeric proteins. Functionally, it is reasonable to separate calpains into ubiquitous and tissue-specific enzymes, as well as into constitutively proteolytically active and inactive ones.

Structural classification
Based on the similarity of their domain organization to that of calpains 1 or 2 and on the presence of the penta-EF-hand motif in domain IV, calpains are divided into two general classes, typical and atypical. Typical calpains have a C-terminal calmodulin-like domain carrying EF-hand motifs, which atypical calpains lack. Atypical calpains are further subdivided into six groups based on homology in the region of the ancillary domains (Fig. 1).
Typical calpains include fifteen known molecules: (1) ten vertebrate proteins, namely calpains 1-3, 8, 9, 11-14, and the μ/m-calpain of chicken and Xenopus; (2) three calpains from Drosophila, CALPA, CALPB, and CALPC; and (3) two trematode proteins, Sm-calpain from Schistosoma mansoni and Sj-calpain from S. japonicum (reviews: Goll et al., 2003; Sorimachi et al., 2011a). The complete amino acid sequences of these proteins are highly conserved, and their folding is described as the "classical" four-domain architecture. Some data suggest that calpains 12 and 14 have a domain IV with degenerate EF-hand motifs that are unlikely to bind calcium (Croall & Ersfeld, 2007); these molecules could therefore be functionally assigned to the atypical group. The interaction of these enzymes with Ca2+ probably involves structures distinct from EF-hand motifs, such as those located in domain III (Hood et al., 2004; Shao et al., 2006; Samanta et al., 2007), or some possible cofactors (Goll et al., 2003). The conserved domains found at the C-terminus in place of the calmodulin-like domain IV allow atypical calpains to be divided into the SOL and PalB subfamilies (Fig. 1); the latter includes calpains 5, 6, 7 and 10 because of the structural similarity and evolutionary relationship of domains III (C2-L) and T (C2).
Another criterion for the structural classification of calpains is their polypeptide composition. While most identified calpains are monomeric proteins, some potentially form hetero- or homooligomers in their native form. Thus μ- and m-calpains form heterodimers in vivo and in vitro with a 28-kDa polypeptide (calpain small subunit, or CSS1, formerly referred to as calpain 4) through the fifth EF-hand motif of domain IV and a similar EF-hand motif in the 28-kDa subunit. This means that this EF-hand does not bind Ca2+ but is involved in dimerization instead. Full proteolytic activity of the digestive tract-specific calpain 9 (nCL-4) also requires dimerization with CSS1 (Lee et al., 1999). No other calpain identified to date binds the small subunit. The presence of CSS1 is of critical importance for regulation by calpastatin, as it provides one of the three binding sites needed for effective calpain/inhibitor interaction. Stomach-specific calpain 8 (nCL-2) exists in both monomeric and homooligomeric forms, but not as a heterodimer with CSS1. Its oligomerization occurs through domains other than the penta-EF-hand domain IV, most probably through domain III, suggesting a novel regulatory system for nCL-2 (Hata et al., 2007). The sensitivity of calpain 8 to calpastatin suggests effective enzyme/inhibitor binding through motifs distinct from those located in the small subunit.
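The typical/atypical rule described in this section reduces to a simple test on domain composition, sketched below under stated assumptions. The domain labels ("CysPc", "C2L", "PEF", "Zn", "SOH", "TM") are shorthand chosen for this illustration, and the example compositions are simplified from the descriptions in the text; the function is not a published classifier.

```python
from typing import Iterable


def classify_by_domains(domains: Iterable[str]) -> str:
    """Classify a calpain as 'typical' or 'atypical' from its domain labels.

    Rule taken from the text: a typical calpain carries the C-terminal
    penta-EF-hand (PEF) domain IV, an atypical one lacks it. The label
    names are assumptions of this sketch.
    """
    domain_set = set(domains)
    if "CysPc" not in domain_set:
        raise ValueError("no calpain catalytic core (CysPc) in domain list")
    return "typical" if "PEF" in domain_set else "atypical"


# Simplified compositions based on the descriptions in the text.
examples = {
    "calpain 2": ["CysPc", "C2L", "PEF"],
    "SOL": ["Zn", "CysPc", "SOH"],
    "phytocalpain": ["TM", "CysPc", "C2L"],
}
for name, composition in examples.items():
    print(name, classify_by_domains(composition))
```

A purely structural rule like this would label calpains 12 and 14 as typical even though their EF-hand motifs may not bind calcium, which is the functional caveat raised above.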

Distributions of calpains in living organisms
Calpain genes are distributed among all kingdoms of life except archaea and viruses. According to the protease database MEROPS, 895 calpain sequences have been identified to date in the organisms studied (Rawlings et al., 2010). Fifteen calpains are now known in human (and in almost all vertebrates). Other organisms have far fewer genes coding for calpain-like proteins. With few exceptions, organisms outside the animal kingdom (fungi, plants, and bacteria) and most protists have only a single calpain gene, if any (Goll et al., 2003), while other cysteine proteases are abundant in lower eukaryotes and parasitic protozoa (Sajid & McKerrow, 2002; Mottram et al., 2003).
An unexpected variety of calpain genes has been found in a number of parasitic kinetoplastids and ciliates (infusoria). The discovery of this surprisingly large family of calpain-like proteins in lower eukaryotes contributes to our understanding of the molecular evolution of this abundant protein family.

Calpains in bacteria
Proteins homologous to calpains have been detected in only a few prokaryotes, whereas 96% of bacteria, including Escherichia coli, and all archaea have no calpain genes. The Tpr protease from Porphyromonas gingivalis illustrates some of the difficulties encountered when molecules are assigned to the calpain family on the basis of cDNA-derived amino acid sequence alone. Although the predicted amino acid sequence of the P. gingivalis enzyme has 53.1% similarity (23.7% identity) to domain IIa/IIb of human µ-calpain, it has nearly the same homology (22.5% identity) with the amino acid sequence of papain; moreover, the expressed protease is not inhibited by leupeptin, an inhibitor of the calpains, shows a distinct substrate specificity, and is inhibited, not activated, by Ca2+ (Bourgeau et al., 1992).

Calpains in fungi
Most sequenced fungal genomes have only a single gene for calpain-like proteins; the exception is Neurospora crassa, in which three genes have been identified (data from the MEROPS database). The highly modular structure of calpains and calpain-like proteins in lower eukaryotes, which combines novel and conserved sequence modules, suggests that they are involved in diverse cellular activities. The PalB protein of Aspergillus is the prototype of the most evolutionarily conserved subfamily of calpains, with known members in vertebrates (human calpain 7, or PalBH), yeasts, fungi, protists, nematodes, and insects (except Drosophila), but not in plants. PalBH homologues commonly contain two C-terminal C2-L domains in tandem and conserved microtubule-interacting and transport (MIT) motifs at the N-terminus.

Calpains in plants
Phytocalpain genes, which encode a highly conserved, unique plant-specific calpain-like molecule (phytocalpain), derive their name from the Zea mays Defective Kernel-1 (DEK1) gene, which was the first to be phenotypically characterized and cloned (Becraft et al., 2002; Lid et al., 2002). The polypeptide encoded by DEK1 consists of a C-terminal intracellular domain with significant sequence homology to domains IIa/IIb and III of calpains 1 or 2, and an extended N-terminal region that is predicted to contain 21 transmembrane domains interrupted by an extracellular loop, together with an extended cytoplasmic juxtamembrane region showing little homology to other proteins (Lid et al., 2002). Highly conserved homologues of DEK1 have been described across the plant kingdom, including basal plants such as Physcomitrella. Interestingly, these genes occur in one copy in the plant genomes sequenced thus far, including that of the model plant Arabidopsis thaliana. Phytocalpain has been shown to be essential for the correct development of both an embryonic epidermal cell layer and the specialized outer layer of the endosperm (aleurone) during seed development in maize and Arabidopsis (Lid et al., 2002; Ahn et al., 2004; Johnson et al., 2005).

Calpains in invertebrates
Screening of sequenced or partially sequenced genomes shows that most unicellular protozoa (Plasmodium falciparum, Theileria annulata, Cryptosporidium parvum, Entamoeba histolytica, etc.) have a single copy of a calpain-coding gene, whereas no calpain-like sequences have been identified in others (for instance, Giardia lamblia) (review: Croall & Ersfeld, 2007).
Uniquely within protozoa, some kinetoplastid parasites and the ciliate Tetrahymena thermophila display an expansion of calpain genes. Fourteen genes encoding calpain-related proteins have been identified in Trypanosoma brucei, 17 in Leishmania major, 15 in T. cruzi (Ersfeld et al., 2005), and 26 in the macronuclear genome of the ciliate T. thermophila (Ersfeld et al., 2005; review: Croall & Ersfeld, 2007). The presence of numerous calpain-like proteins, exceeding the numbers found in most other organisms including vertebrates, and their unique protein architecture point to important trypanosomatid-specific functions that have not yet been elucidated; however, a number of calpain-like proteins are differentially expressed during the life cycle of the parasites (Matthews & Gull, 1994).
Only atypical calpains are found in lower eukaryotes, including protozoa and fungi, as well as in C. elegans, plants, and S. cerevisiae (Sorimachi et al., 2011a), whereas three of the four Drosophila calpains, four S. mansoni calpains and three A. gambiae calpains are typical. All calpains of trypanosomes and one typical calpain of Drosophila lack one or more of the Cys/His/Asn residues at their catalytic site, so they are probably not proteolytic enzymes. It has been predicted that the proteomes of obligate parasites are enriched in longer, conserved proteins of fundamental function (Brocchieri & Karlin, 2005), owing to the elimination of weakly expressed and poorly conserved genes under weak or no selection (Haigh, 1978). Calpain diversity in the genomes of kinetoplastid parasites thus emphasizes the housekeeping functions of calpains, including non-proteolytic ones.
Only a few calpain-like molecules, isolated from Schistosoma mansoni, Drosophila, and crustaceans, have been enzymatically characterized (reviews: Mykles, 1998; Goll et al., 2003; Cantserova et al., 2010; Lysenko et al., n.d.). Some of them, such as the SOL calpain of Drosophila, are closely related to mammalian calpains, and the others are highly distinct from them.
Interestingly, neither calpastatin activity nor a calpastatin-like DNA sequence has yet been detected in invertebrates; a search of the Drosophila genome failed to detect a gene homologous to calpastatin (Laval & Pascal, 2002). The CSS polypeptide has not been found in invertebrates either (Pintér et al., 1992).

Calpain-calpastatin system in vertebrates
Today, fifteen calpain genes (CAPN1-3, CAPN5-16) (Table 1) and calpain-related genes, such as those for calpastatin (CAST) and the calpain small subunits (CSS1 and CSS2), have been identified in the human genome. Two or more of the three well-characterized members of the calpain system, µ-calpain, m-calpain, and calpastatin, have been detected in every vertebrate cell that has been carefully examined for their presence. Different tissues (or cells), however, differ widely in the ratios of the three proteins (Thompson & Goll, 2000). Other calpains are expressed to a greater or lesser extent in a tissue-specific manner (Horikawa et al., 2000).
Chickens and Xenopus laevis have been shown to contain a μ/m-calpain with structural and enzymatic characteristics intermediate between those of the common mammalian calpains; it was recently shown to correspond to mammalian calpain 11 from an evolutionary viewpoint (Macqueen et al., 2010). Fish, owing to polyploidization events in their natural history, have a duplicate set of most of the fifteen mammalian calpains and calpain-related molecules (Lepage & Bruce, 2008; Macqueen et al., 2010).
Only oligomeric calpain molecules are inhibited by calpastatin; μ- and m-calpains have similar susceptibility to calpastatin. Among the other homologues, only calpains 8 and 9 are also inhibited by calpastatin (Lee et al., 1999; Hata et al., 2001). Significantly, higher organisms, along with enzymes having "advanced" characteristics (dimeric structure, tissue-specific patterns of expression, alternative promoters, regulation by calpastatin, etc.), contain almost the whole set of non-classical calpain homologues conserved from lower eukaryotes and invertebrates (Bondareva & Nemova, 2008).

Phylogenetic analysis of calpains
Phylogenetic trees have been constructed for isolated domains (Jékely & Friedrich, 1999) and for the defining catalytic core domain II in conjunction with the most common auxiliary domain III of selected species (Jékely & Friedrich, 1999; Wang et al., 2003). The phylogenetic approach confirms a clear segregation of two main groups of calpains: the first clade includes the EF-hand-containing CAPN genes (Schistosoma, Drosophila A/B and the classic vertebrate calpains) and also the EF-hand-free C. elegans CLP-1, and the second cluster contains the atypical capn5 (TRA-3) and capn6 (Jékely & Friedrich, 1999). This segregation corresponds to the commonly recognized groups of typical and atypical calpains, except for the CLP-1 protease.
Heterodimeric vertebrate calpains (μ- and m-calpains and calpain 9) are grouped together with a reliability of 96%, and it is highly probable that the calpains of S. mansoni and Drosophila also belong to this group. The results of the phylogenetic approach indicate that the scenario of calpain evolution is similar to that proposed for the evolution of other multigene families (actin, troponin, Hox and others) in vertebrates (Holland et al., 1994; Ohta, 1991).

Evolutionary roots of calpains
In 1987, Koichi Suzuki first proposed the hypothesis that calpains are chimeric proteins that originated by the fusion of two genes coding for proteins with completely different functions and origins: a cysteine protease and calmodulin (Suzuki et al., 1987).
Initially it was predicted that the catalytic domain of calpains and other cysteine proteases, such as papain and cathepsins B, H, and L, arose from a single archetypical protease. Based on this presumption, these enzymes were grouped into a unified family of papain peptidases, C1. Following refined structural information, calpains were included in the MEROPS database (Rawlings et al., 2010) as an individual family, C2, of the papain-like clan CA of cysteine peptidases. Despite the common catalytic triad Cys/His/Asn and the papain-like catalytic mechanism (Arthur et al., 1995), calpains share little sequence homology with other cysteine proteases and are likely to have evolved from a different ancestral gene.

Possible scenario of molecular evolution in calpain family
Structural features, together with the organization of mammalian calpain genes, strongly suggest that calpains evolved by the arrangement and restructuring of genes of the ancestral calpain-type cysteine protease with other functional units. The evolutionary history of calpains is a classical example of the evolution of a multigene protein family.

Small-scale or large-scale genome duplications that occurred in different lineages generated additional new genes, the majority of which (about 70-90%) have since been degraded and/or lost ("nonfunctionalization"). The functions of the ancestral gene may be distributed among the duplicates ("subfunctionalization"), or one of the duplicates may retain the ancestral functions while the other acquires a completely new function ("neofunctionalization") (Braasch & Salzburger, 2009). Gene duplication with subsequent divergent resolution, subfunctionalization, and neofunctionalization (the "duplication-diversification hypothesis") is applicable to calpain evolution as well.
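The three fates of a duplicated gene described above can be stated as simple set relations between the functions of the ancestral gene and those of its two copies. The following Python sketch is only an illustration of these definitions, not a method taken from the cited works: the names DuplicateFate and classify_fate and the function labels "f1", "f2", and "f3" are hypothetical, and treating a gene as a plain set of discrete functions is a strong simplifying assumption.

```python
from enum import Enum


class DuplicateFate(Enum):
    """Possible outcomes for a pair of duplicated genes (illustrative labels)."""
    NONFUNCTIONALIZATION = "nonfunctionalization"
    SUBFUNCTIONALIZATION = "subfunctionalization"
    NEOFUNCTIONALIZATION = "neofunctionalization"
    OTHER = "other"  # e.g. both copies still fully redundant


def classify_fate(ancestral, copy_a, copy_b):
    """Classify a duplicate pair from sets of (hypothetical) function labels.

    ancestral -- functions of the gene before duplication
    copy_a, copy_b -- functions of the two present-day copies
    """
    anc, a, b = frozenset(ancestral), frozenset(copy_a), frozenset(copy_b)

    # A copy that has no function left has been degraded or lost.
    if not a or not b:
        return DuplicateFate.NONFUNCTIONALIZATION

    # One copy keeps every ancestral function, the other has gained a new one.
    for keeper, other in ((a, b), (b, a)):
        if anc <= keeper and other - anc:
            return DuplicateFate.NEOFUNCTIONALIZATION

    # The ancestral functions are split: each copy has only part of them,
    # and together they still cover the full ancestral set.
    if a < anc and b < anc and (a | b) == anc:
        return DuplicateFate.SUBFUNCTIONALIZATION

    return DuplicateFate.OTHER


if __name__ == "__main__":
    ancestral = {"f1", "f2"}
    print(classify_fate(ancestral, {"f1", "f2"}, set()).value)
    print(classify_fate(ancestral, {"f1"}, {"f2"}).value)
    print(classify_fate(ancestral, {"f1", "f2"}, {"f3"}).value)
```

Real duplicates rarely fall cleanly into one category, so the sketch returns OTHER for any case the three definitions do not cover, such as two copies that remain fully redundant.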

Gene duplication
Any gene family is the product of gene duplication. The importance of gene duplication in molecular evolution is well established (Nei, 1969; Ohno, 1970). Gene copies can be generated through one of two main mechanisms, namely small-scale or large-scale duplication events, with the most extreme large-scale event being duplication of the entire genome (Hakes et al., 2007).
It has been hypothesized that the highly conserved sequences and protein structures of typical calpain family members arose from multiple tandem duplications of a single ancestral calpain gene, with subsequent functional diversification of the copies (Jékely & Friedrich, 1999; Hata et al., 2001). The topology of the phylogenetic tree of calpains, together with the analysis of the chromosomal localization of calpain genes, sheds light on the timing of two episodes of major calpain gene duplication coinciding with early chordate whole-genome duplications (Holland et al., 1994; Jékely & Friedrich, 1999).
One episode of gene duplication occurred after the plant-animal and fungus-animal lineages separated (Jékely & Friedrich, 1999). This conclusion followed the detection in vertebrates of the atypical T-domain-containing calpains 5 and 6 (Mugita et al., 1997; Dear et al., 1997), which are homologous to C. elegans TRA-3, probably their direct ancestor. Since plants and fungi lack both TRA-3 homologues and typical EF-hand Ca2+-binding calpains (Wolfe et al., 1989; Mewes et al., 1997), it was supposed that the first round of duplication, which led to the separation of TRA-3 and typical calpains, occurred before the divergence of the protostome and deuterostome lineages. The acquisition of the conventional domain IV is considered a late event in calpain evolution that occurred during animal evolution but not in other eukaryotic branches (Ohno et al., 1984).
The origin of the vertebrate-specific calpains is attributed to the second episode of gene duplication, which occurred at early stages of chordate evolution. The mapping of the human genome showed that 15 calpain genes, as well as the gene of the 28 kDa calpain subunit, are grouped on nine chromosomes: 1, 2, 3, 6, 11, 15, 16, 19, and X (Hata et al., 2001; Dear & Boehm, 2001) (Table 1). Tandem chromosomal localization of calpain genes may also suggest their coordinated regulation (e.g., through Ca2+ sensitivity or the involvement of common enhancer elements or locus control regions). The proposed gene duplication events may explain the closer evolutionary relationships within the pairs CAPN2 and CAPN8, CAPN3 and CAPN9, and CAPN1 and the chicken μ/m-calpain gene (Jékely & Friedrich, 1999).
The third predicted round of whole-genome duplication (also known as the 3R duplication) is specific to the teleost fish lineage (Hoegg et al., 2004; Hurley et al., 2007; Braasch & Salzburger, 2009) and endowed teleosts with additional new genes. Thus, both capn1a and capn1b in fish appear to be orthologous to the single CAPN1 gene, and capn2a and capn2b to the calpain 2 gene present in other vertebrate species. Furthermore, both Css1a and Css1b identified in zebrafish exhibit conserved synteny and significant sequence identity to human CSS1, whereas only a single CAST orthologue, which produces multiple transcript variants through alternative splicing, has been found in teleosts (Lepage & Bruce, 2009). The fish-specific proteins of the calpain system most probably arose from the 3R duplication during teleost evolution (Hoegg et al., 2004; Hurley et al., 2007).
The presence in kinetoplastids of five calpain-like proteins with three copies of domain II must reflect a relatively late gene segment duplication in an ancestral species (Ersfeld et al., 2005).

Gene fusion
Protein domains are fundamental and largely independent units of protein structure and function which occur in a number of different combinations or domain architectures. Interestingly, more complex organisms have more complex domain architectures, as well as a greater variety of domain combinations (Chothia et al., 2003; Babushok et al., 2007). It has been proposed that novel combinations of preexisting domains had a major role in the evolution of protein networks and more complex cellular activities (Pawson & Nash, 2003; Peisajovich et al., 2010).
Fusion of interacting single-function proteins into oligodomain units may facilitate the interaction between functional units and diminish the need to produce proteins in greater amounts to achieve appropriate concentrations of their complexes in the cell. It seems plausible that the acquisition of longer, polyfunctional proteins in eukaryote organisms may have evolved concomitant with the acquisition of multi-exon proteins (Brocchieri & Karlin, 2005).
Gene duplication has preceded domain gain in at least 80% of the gain events (Buljan et al., 2010). The genesis of multiple gene variants through small- and large-scale gene duplication, followed by their fusion, exon deletion, insertion of extended sequences, amino acid substitution, etc., is considered the basis of calpain diversity. Novel domain combinations have a major role in evolutionary innovation. Newly formed structures underwent functional differentiation and gave rise to new morphological and cellular characteristics of vertebrates (Ohta, 1991). The major mechanism for gains of new domains in metazoan proteins is likely to be gene fusion through the joining of exons from adjacent genes, possibly mediated by non-allelic homologous recombination (Buljan et al., 2010).
Acquisition of the penta-EF-hand module involved in calcium binding (and, for some calpains, in the formation of heterodimers) seems to be a relatively late event in calpain evolution. Calpain homologues from protozoa, plants, C. elegans, and fungi lack the calmodulin-like domain, and thus it seems likely that the proposed cysteine protease-calmodulin gene fusion leading to the classical calpain structure (Croall & DeMartino, 1991; Goll et al., 2003; Sorimachi et al., 2010) occurred exclusively within the animal lineage. Even similar functional units, such as the EF-hand module, were gained in different ways during evolution. A phylogenetic tree rooted to the calpain-related sequence of the prokaryote Porphyromonas gingivalis and based only on the catalytic core suggests that the EF-hand-containing calpains from animals (C-terminal EF-hands) and Tetrahymena (N-terminal EF-hands) are phylogenetically well separated (Croall & Ersfeld, 2007). This raises the possibility that the acquisition of these motifs occurred through independent gene-fusion events in these groups.
A transmembrane motif is present in calpains from different sources (the ciliate Tetrahymena and plants). Phylogenetic analysis reveals a close relationship between these transmembrane motif-containing calpains, thus raising the possibility of a common origin for these unusual calpains (Croall & Ersfeld, 2007). Lateral gene transfer from a green alga-type endosymbiont of ciliates is one possible mechanism.
Apart from episodes of gene fusion, some episodes of partial sequence deletion have been identified. Calpain 8 of vertebrates lacks the functional exon 10'. The EF-hand module has been lost by some calpain genes during evolution. The calmodulin-like domain is absent from the structure of calpain CLP-1 from C. elegans, but phylogenetic analysis places this protein in the EF-hand Ca2+-binding clade (Jékely & Friedrich, 1999), which suggests that this sequence lost its calmodulin-like domain secondarily.
The small subunit of calpains shares homology only with the EF-hand-containing domain IV of typical calpains. A maximum-likelihood tree resolved the phylogeny of the calpain small subunit and suggested that it derived from the CAPN3 gene; the chromosomal localization of the corresponding calpain genes strongly supports this hypothesis (Jékely & Friedrich, 1999).

Divergent evolution of duplicated calpain genes
The domain structure of calpains arose through several events of gene fusion and duplication, with subsequent loss of functional exons or residues. It is highly probable that the variable C-terminal domains, such as the C2-L domain (domain III), the EF-hand-containing calmodulin-like domain (domain IV), the TRA-3-homologous domain (or C2 domain, T-module), the PalB-homologous domain (PBH), and the SOL-homologous domain (SOH), originated from independent sources.
The splice variant of CAPN3 (muscle-specific calpain 3a) and Drosophila CALPA contain unique inserted regions (Pintér et al., 1992). Insertion of extended sequences led to the formation of new functional associations and substantially changed the characteristics of the protein. For example, three unique fragments in the structure of calpain 3a (NS, novel sequence; IS1, insert 1; and IS2, insert 2) are responsible for its high susceptibility to autoproteolysis, its ability to bind skeletal muscle myofibrils, and its nuclear localization (Kinbara et al., 1998; Goll et al., 2003).
Along with mechanisms of protease invention that are based on gene duplication and divergence, the evolution of calpains has also been driven by exon shuffling and the duplication of protein modules in protease genes to form new architectures. By these mechanisms protease substrate and binding specificities can be altered in an evolutionarily rapid and selectable way that leads to gene-family diversity and results in substrate specificity or diversity and new kinetic, inhibitory, and cell or tissue localization properties (Puente et al., 2003).

Functional diversification of calpains
Following gene duplication, the newly emerging paralogous descendants may undergo functional differentiation (Ohta, 1991). Genetic variants of calpains were subjected to further functional diversification.
The ancillary domains acquired in calpain structure can interact with many subcellular organelles and molecules, including phospholipids, calmodulin, calpastatin, and even each other (as was shown for calpains 8 and 9 and some invertebrate calpains). The structural diversity of calpains, which results from combining the highly conserved catalytic domain with other functional modules, indicates that calpains are involved in a variety of fundamental processes at the cellular level (for example, motility) and the organism level (for example, embryogenesis). Because the substrate specificities and inhibitor sensitivities of calpains may be either overlapping or unique, studies of their individual functions in cellular pathways have to be designed so that the individual enzymes can be distinguished. Nevertheless, the high selectivity and processing mode of their action allow calpains to be considered intracellular modulator proteases.
Genetic and cell-biological models based on targeted and conditional deletion of the CAPN1, CAPN2, and CSS1 genes (review: Sorimachi et al., 2010) showed that calpain 2 is essential for mammalian embryogenesis (Dutt et al., 2006). Splice variants of the CAPN3 gene, which vary in structure and functionality, play tissue-specific roles: calpain 3a in skeletal muscle; calpains Lp82 (or calpain 3b), Lp85, and Lp74 in the crystalline lens; and the others in the retina (Rt88, Rt88', and Rt90) and in the cornea (Cn94) of the eye (Fukiage et al., 2002). Calpain 3a is essential for skeletal muscle integrity; multiple inactivating mutations in CAPN3 therefore appear to be a cause of limb-girdle muscular dystrophy type IIa (review: Zatz & Starling, 2005). A single nucleotide polymorphism in the CAPN10 gene is a hereditary factor of high susceptibility to type 2 diabetes. For calpain 5 (the human orthologue of TRA-3 from C. elegans), an analogous role in mammalian sex determination has been proposed, since expression of the enzyme was found to be higher in testis than in colon, kidney, liver, trachea, and other tissues.
For the family members lacking key catalytic residues, alternative functions await discovery. Inactive homologues have been found to be abundant in some protease families and might have important roles as regulatory or inhibitory molecules, acting as dominant negatives by binding substrates through the inactive catalytic or exosite ancillary domains in nonproductive complexes, or by titrating inhibitors from the milieu to increase the net proteolytic activity (López-Otín & Overall, 2002; Puente et al., 2003; Pils & Schultz, 2004). A recent report describes a role for the non-catalytic calpain 6 in the stabilization of microtubules (Tonami et al., 2007). The expression of non-protease homologues is interesting from an evolutionary viewpoint, as the discovery of their physiological functions may elucidate calpain functions distinct from proteolysis. It has been argued that nonproteolytic calpains are derived from catalytic precursors, because the majority of family members are active (Todd et al., 2002); however, in kinetoplastids the calpain-like proteins with a nonstandard catalytic domain far outnumber the few proteins with the classical Cys/His/Asn triad (Ersfeld et al., 2005).

Evolution of calpain-related genes
Evolutionary achievements in the regulation of calpains involve proteins of other families: calpastatin, the endogenous inhibitor of calpain activity (protease inhibitor family I27), and the regulatory small subunit (PEF protein family). In addition to the CAPN1 and CAPN2 gene products, CSS1 and CAST are regarded as classic/ubiquitous components of the calpain system (Nakamura et al., 1988).
The calpain small subunit is not essential for protease activity (Yoshizawa et al., 1995; Pal et al., 2001) but is indispensable for correct folding of the catalytic subunit (Yoshizawa et al., 1995; Moldoveanu et al., 2002), acting primarily as a chaperone. CSS1 also acts as an adaptor protein that allows heterodimeric calpains to interact with membranes and consequently facilitates their activation. Dissociated CSS1 might have a function distinct from proteolysis after forming a homodimer (Pal et al., 2001); thus, it is required for the induction of senescence (Demarchi et al., 2010), Ca2+-dependent repair of wounded plasma membranes (Mellgren et al., 2009), and macroautophagy (Demarchi et al., 2006).
Calpastatin functions as a major inhibitor with high affinity and strict specificity for calpain (Suzuki et al., 1987; Maki et al., 1990). Monomeric calpains, including calpains 1 and 2 dissociated from the native dimer, are not inhibited and thus escape the regulatory action of calpastatin. Calpastatin displays molecular polymorphism, the biological significance of which is not yet understood, though it seems to be associated to some extent with cellular differentiation (Maki et al., 1990; Lee et al., 1992).
The proteins described here are found only in vertebrates. Thus, both the calpain small subunit gene CSS1 and the calpastatin gene CAST are "vertebrate genes" (Cantserova et al., 2010; Lysenko et al., n.d.).

Conclusion
Molecular evolution within the calpain protease family demonstrates how the acquisition of structural features facilitates spatial and temporal control of protease activity. The evolutionary achievements developed in the course of calpain molecular evolution concern several aspects: (1) regulation of calpain synthesis (alternative splicing, tissue-specific patterns of expression), (2) structural characteristics (oligomeric structure, specialized functional domains and insertion sequences), (3) enzymatic characteristics (limited substrate specificity, increased Ca2+ sensitivity), (4) intracellular behavior (additional Ca2+-binding sites, membrane-interacting modules, wide range of binding partners), and (5) mechanisms of regulation of their activity (endogenous specific inhibitor, calpastatin, activating proteins).
However, the study of calpains is far from complete, and future efforts are needed to determine how the modules associated with the proteolytic core influence its function. There is likely to be interplay between protein-protein interactions, membrane binding, Ca2+ binding and, potentially, posttranslational modifications in the modulation of calpain function (Croall & Ersfeld, 2007). Many calpain proteins remain to be purified and characterized biochemically, so the challenge of identifying their relevant binding partners as well as their specific functional activity remains. Increased knowledge of the structure, function, and regulation of proteases will provide excellent opportunities to design new generations of therapeutic inhibitors, including those based on endogenous protease inhibitors.