Open access peer-reviewed chapter - ONLINE FIRST

Structural and Functional Consequences of the SMA-Linked Missense Mutations of the Survival Motor Neuron Protein: A Brief Update

By Wei Li

Submitted: August 20th 2018; Reviewed: October 5th 2018; Published: November 5th 2018

DOI: 10.5772/intechopen.81887

Abstract

Spinal muscular atrophy (SMA), which is genetically linked to the survival motor neuron 1 gene (SMN1), is an autosomal recessive neuromuscular disease characterised by dysfunctional α-motor neurons. As the product of the SMN1 gene, the survival motor neuron protein (SMN) plays an essential role in the molecular pathogenesis of SMA. On 1 June 2017, a PLoS ONE article reported a set of computational structural analyses illustrating how SMA-linked mutations of SMN1 lead to structurally and functionally deficient variants of SMN. Following that article, this chapter provides a brief update on the structural and functional consequences of the missense mutations of this SMA protein.

Keywords

  • spinal muscular atrophy
  • survival motor neuron protein
  • missense mutation
  • structural consequence(s)
  • functional consequence(s)

1. Setting the scene

On 1 June 2017, PLoS ONE published an original research article (Figure 1) [1] entitled ‘How do SMA-linked mutations of SMN1 lead to structural/functional deficiency of the SMA protein?’. This chapter aims to provide a brief update of that article.

Figure 1.

A flow chart for the computational analysis of the structural/functional consequences of clinically identified, genetic disease-linked missense mutation(s) of key gene(s) and protein(s). In [1], SMA was used as an example of the computational analysis illustrated in this figure (http://biomedical-advances.org/ep-20182-14/).

1.1. The genetics of SMA: a brief introduction

SMA is an autosomal recessive neuromuscular disease characterised by dysfunction of α-motor neurons (in the anterior horn of the spinal cord) and muscular atrophy [2]. SMA is caused by loss (95% of SMA cases) or mutation (5% of SMA cases) of the survival motor neuron 1 gene, SMN1 (telomeric SMN or telSMN; GenBank: U18423), located in the 5q13 region of human chromosome 5 [3]. The same 5q13 region also contains a nearly identical survival motor neuron 2 gene, SMN2 (centromeric SMN or cenSMN; GenBank: NM_022875) [3]. The two genes (SMN1 and SMN2) have been extensively characterised, and their roles in SMA have been reviewed in detail [2, 3, 4, 5, 6, 7, 8].

1.2. The survival motor neuron protein and its role in SMA

The survival motor neuron (SMN) protein is the product of SMN1, the SMA-determining survival motor neuron gene [2, 3], and is therefore also called the SMA protein. The 38-kD SMN is the protein actually affected in SMA [9, 10, 11]. It is a cytoplasmic protein that also occurs in dot-like nuclear structures called gems, which is why SMN was formerly termed Gemin1 [3, 12].

In the molecular pathogenesis of SMA, of particular interest is an exon 7-skipping splicing defect identified in the pre-mRNA editing of the SMN2 gene [5]. Due to this splicing defect, SMN2 predominantly produces exon 7-skipped transcripts, which encode a truncated isoform of the SMN protein (SMNΔ7 or SMN2 with 282 residues), in comparison with the full-length SMN protein with 294 residues (SMN1 or FL-SMN).

In pre-mRNA editing, the spliceosome is the major functional unit, and spliceosomal small nuclear ribonucleoproteins (snRNPs) are essential components of the nuclear pre-mRNA processing machinery [13, 14, 15, 16, 17]. In the pathogenesis of SMA, the SMN protein plays a critical role in pre-mRNA processing, because the biogenesis of spliceosomal snRNPs is promoted by the SMN complex [14, 18, 19], which consists of SMN (Gemin1), Gemin2–8 and UNR-interacting protein (UNRIP) [13, 16, 20]. In the formation of the SMN complex, SMN forms oligomers and interacts directly with Gemin2 via its N-terminus and with spliceosomal (Sm) proteins via its tudor domain [13, 21, 22]. As a key component of the SMN complex, SMN first assembles the essential SMN/Gemin complex, which in turn mediates the formation of the Sm core domain of the spliceosomal snRNPs [13, 21, 22].

2. Structural and functional consequences of the SMA-linked missense mutations of SMN

In general, genetic mutations include missense, nonsense, insertion and deletion mutations. A nonsense mutation is a point mutation in a DNA sequence that results in a premature stop codon (a nonsense codon) in the transcribed mRNA, and thus in a truncated, incomplete and usually functionally deficient protein product. In contrast, a missense mutation substitutes one single amino acid residue, and can therefore provide residue-specific structural insights into the role of that residue in the structure and function of the target protein, provided that the three-dimensional structure of the target protein has been experimentally determined and deposited in the Protein Data Bank. Thus, this chapter focuses on SMA-linked missense mutations of SMN and aims to provide a brief update on their structural and functional consequences, using the set of computational structural analyses described in [1].

2.1. An update of SMA-linked missense mutations of SMN

A set of point mutations (missense and nonsense mutations) have been previously summarised in [1], including A2G [23], nonsense mutation Q15X [24], D30N [25], D44V [25, 26, 27], V94G [28], G95R [25], Y130C [29], nonsense mutation Q157X [30], A188S [31], nonsense mutation W190X [32], nonsense mutation L228X [33], P245L [34], L260S [28], S262G and S262I [4, 25], M263T [32], S266P [29], Y272C [4, 35, 36], H273R [29], T274I [4, 35, 36], G275S [32], G279C and G279V [4, 35, 37, 38]. As of 25 September 2018, eight more missense mutations of SMN were summarised and reported, including A2V, Y109C, Y130C, Y130H, P221L, S230L, P244L and R288S [39].
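The mutation labels listed above follow a simple pattern: a one-letter wild-type residue, a residue number and a one-letter variant, with X marking a stop codon. As a minimal sketch (the function name `classify` and the label subset are illustrative choices, not part of [1] or [39]), such labels can be split and tagged as missense or nonsense as follows:

```python
import re

# Wild-type residue (one letter), 1-based position, variant residue ("X" = stop).
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def classify(mutation):
    """Split a label such as 'D44V' and tag it as missense or nonsense."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {mutation!r}")
    wild_type, position, variant = match.groups()
    kind = "nonsense" if variant == "X" else "missense"
    return wild_type, int(position), variant, kind

# A subset of the labels summarised in [1], used here only as example input.
labels = ["A2G", "Q15X", "D30N", "D44V", "V94G", "G95R", "Y130C",
          "Q157X", "A188S", "W190X", "L228X", "P245L"]
for label in labels:
    print(label, classify(label))
```

Only the labels tagged as missense are relevant to the residue-specific analyses in the rest of this chapter.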

2.2. An update of experimentally determined SMN-related structures

In [1], 11 SMN-related structures were retrieved from the PDB database [40] with two search parameters (text search for: survival motor neuron protein and molecule: survival motor neuron protein). In a new search of the PDB database (accessed 25 September 2018) [40] with the same parameters, 14 PDB entries were retrieved: 1G5V, 1MHN, 2LEH, 4A4E, 4A4G, 4GLI, 4QQ6, 4V98, 5XJL, 5XJQ, 5XJR, 5XJS, 5XJT and 5XJU. Compared with the PDB entries in [1], six new SMN-related structures were deposited in the Protein Data Bank during the past 16 months: 5XJL (superseding 3S6N [41]), 5XJQ [42], 5XJR [42], 5XJS [42], 5XJT [42] and 5XJU [42]. While the six PDB entries contain a set of different yet functionally related protein molecules, including snRNP Sm-D1, snRNP Sm-D2, snRNP E, snRNP F and snRNP G, they also contain a fragment of the survival motor neuron protein (SMN residues 26–62), according to the FASTA-format data of the six PDB entries [42].

2.3. An update of the structural and functional consequences of the missense mutations of SMN

2.3.1. Asp44 in the Gemin2-binding domain of SMN

In light of the six new experimentally determined SMN-related structures (Table 1), this chapter repeats the set of computational structural analyses previously described in detail in [1] in order to update its findings. Two aspartates of SMN (Asp35 and Asp44) stood out in the analysis of both intramolecular and intermolecular salt bridges for this SMA protein, as listed in Table 2.

| PDB ID | Structure title | Method | Release date |
| --- | --- | --- | --- |
| 5XJL | Crystal structure of the Gemin2-binding domain of SMN, Gemin2 in complex with SmD1/D2/F/E/G from human | X-ray | 2 May 2018 |
| 5XJQ | Crystal structure of the Gemin2-binding domain of SMN, Gemin2 in complex with SmD1(1-82)/D2/F/E/G from human | X-ray | 4 July 2018 |
| 5XJR | Crystal structure of the Gemin2-binding domain of SMN, Gemin2dN39 in complex with SmD1(1-82)/D2/F/E/G from human | X-ray | 4 July 2018 |
| 5XJS | Crystal structure of the Gemin2-binding domain of SMN, Gemin2dN39 in complex with SmD1(1-82)/D2/F/E from human | X-ray | 4 July 2018 |
| 5XJT | Crystal structure of the Gemin2-binding domain of SMN, Gemin2 in complex with SmD1(1-82)/D2.R61A/F/E/G from human | X-ray | 4 July 2018 |
| 5XJU | Crystal structure of the Gemin2-binding domain of SMN, Gemin2dN39 in complex with SmD1(1-82)/D2.R61A/F/E/G from human | X-ray | 4 July 2018 |

Table 1.

A list of new (compared with those summarised in [1]) experimentally determined SMN-related structures as of 25 September 2018 [40].

In this table, X-ray represents X-ray crystallography as a biophysical tool for biomolecular structure determination.

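| PDB ID | SBnum | Residue A | Atom A | Residue B | Atom B | Distance (Å) |
| --- | --- | --- | --- | --- | --- | --- |
| 5XJL | 4 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 2.946 (Yellow) |
| 5XJL | 4 | M_ASP_44 | OD1 | 2_ARG_213 | NH2 | 3.579 (Red) |
| 5XJL | 4 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 3.236 (Brown) |
| 5XJL | 4 | M_ASP_44 | OD2 | 2_ARG_213 | NH2 | 3.848 (Blue) |
| 5XJQ | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 2.760 |
| 5XJQ | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH2 | 3.593 |
| 5XJQ | 3 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 2.968 |
| 5XJR | 2 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 2.385 |
| 5XJR | 2 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 2.871 |
| 5XJS | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 3.078 |
| 5XJS | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH2 | 3.670 |
| 5XJS | 3 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 2.631 |
| 5XJT | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 2.335 |
| 5XJT | 3 | M_ASP_44 | OD1 | 2_ARG_213 | NH2 | 3.386 |
| 5XJT | 3 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 3.067 |
| 5XJU | 2 | M_ASP_44 | OD1 | 2_ARG_213 | NH1 | 2.302 |
| 5XJU | 2 | M_ASP_44 | OD2 | 2_ARG_213 | NH1 | 2.989 |
| 5XJQ | 1 | M_ASP_35 | OD1 | M_LYS_41 | NZ | 3.921 |
| 5XJS | 2 | M_ASP_35 | OD1 | M_LYS_41 | NZ | 3.670 |
| 5XJS | 2 | M_ASP_35 | OD2 | M_LYS_41 | NZ | 3.803 |
| 5XJT | 2 | M_ASP_35 | OD1 | M_LYS_41 | NZ | 2.416 |
| 5XJT | 2 | M_ASP_35 | OD2 | M_LYS_41 | NZ | 2.931 |
| 5XJU | 1 | M_ASP_35 | OD1 | M_LYS_41 | NZ | 3.274 |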

Table 2.

A summary of salt bridge analysis of the six new SMN-related structures as of 25 September 2018 [40].

In this table, the residue naming scheme is Chain ID_residue name_residue number, and SBnum represents the number of salt bridges computationally identified between the listed residue pair in each PDB entry. In the top four rows for PDB entry 5XJL, Yellow, Red, Brown and Blue represent the colouring scheme for Figure 2. Distance represents the distance between two oppositely charged groups/atoms in Å.
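The kind of distance-based search that produces a table like Table 2 can be sketched as follows. This is not the pipeline of [1]: the 4.0 Å cutoff and the choice of charged atoms (Asp OD1/OD2, Glu OE1/OE2, Lys NZ, Arg NH1/NH2) are assumptions made for illustration, and `atom_line` is a hypothetical helper that only builds toy ATOM records with made-up coordinates.

```python
import math
from itertools import product

# Assumed charged side-chain atoms; histidine is deliberately left out.
ACIDIC = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASIC = {"LYS": ("NZ",), "ARG": ("NH1", "NH2")}

def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column PDB ATOM record (hypothetical helper for toy input)."""
    return "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}".format(
        serial, name, resname, chain, resseq, x, y, z)

def parse_atoms(pdb_text):
    """Yield (chain, resname, resnum, atom, (x, y, z)) from ATOM records."""
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            yield (line[21], line[17:20].strip(), int(line[22:26]),
                   line[12:16].strip(),
                   (float(line[30:38]), float(line[38:46]), float(line[46:54])))

def salt_bridges(atoms, cutoff=4.0):
    """List acidic-oxygen/basic-nitrogen pairs no further apart than the cutoff (Å)."""
    atoms = list(atoms)
    acid = [a for a in atoms if a[3] in ACIDIC.get(a[1], ())]
    base = [b for b in atoms if b[3] in BASIC.get(b[1], ())]
    hits = []
    for a, b in product(acid, base):
        distance = math.dist(a[4], b[4])
        if distance <= cutoff:
            hits.append((f"{a[0]}_{a[1]}_{a[2]}", a[3],
                         f"{b[0]}_{b[1]}_{b[2]}", b[3], round(distance, 3)))
    return hits

# Toy coordinates, chosen only to mimic the naming scheme of Table 2.
toy = "\n".join([
    atom_line(1, "OD1", "ASP", "M", 44, 0.0, 0.0, 0.0),
    atom_line(2, "NH1", "ARG", "2", 213, 2.946, 0.0, 0.0),
    atom_line(3, "NZ", "LYS", "M", 41, 9.0, 0.0, 0.0),
])
for hit in salt_bridges(parse_atoms(toy)):
    print(hit)
```

Whether two residues belong to the same chain (first field of the residue name) then distinguishes intramolecular from intermolecular salt bridges.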

Asp44 is encoded by exon 2a of SMN1 (the Gemin2-binding domain) and is the site of an SMA-linked Asp44Val (D44V) missense mutation [25], which replaces Asp44’s charged side chain with Val44’s hydrophobic side chain. Of particular functional significance, SMN’s Gemin2-binding activity is completely suppressed by the D44V mutation [41]. Moreover, the snRNP assembly activity of the D44V SMN mutant (SMND44V) is lower than that of wild-type SMN (FL-SMN or SMN1) [27].

In good agreement with the computational analysis in [1], a set of salt bridges was identified between SMN’s Asp44 (M_Asp_44) and Gemin2’s Arg213 (2_Arg_213), as shown in Table 2. In particular, according to the coordinate data in PDB entry 5XJL [42], four intermolecular salt bridges were identified between the buried side chains (Table 3) of these two charged residues, as shown in Figure 2.

| Residue | SASA (Å²) | SASA-intrinsic (Å²) | SASA-ratio |
| --- | --- | --- | --- |
| 2_Arg_213 | 57 | 238.76 | 0.238 |
| M_Asp_44 | 67 | 140.39 | 0.477 |

Table 3.

Solvent accessible surface area (SASA) values of SMN’s Asp44 and Gemin2’s Arg213 (PDB ID: 5XJL) [42].

In this table, SASA is the average SASA value calculated by DSSP [43], SASA-intrinsic is the intrinsic SASA value [44] and SASA-ratio is SASA divided by SASA-intrinsic, each given for SMN’s Asp44 and Gemin2’s Arg213. The residue naming scheme is Chain ID_residue name_residue number.
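The SASA-ratio column is a plain quotient, which a few lines of Python can recompute from the values in Table 3. This is only an arithmetic check; the function name `sasa_ratio` is an illustrative choice, and because the tabulated SASA values appear to be rounded, the recomputed ratios may differ from the table in the last digit.

```python
def sasa_ratio(sasa, intrinsic):
    """Fraction of a residue's intrinsic solvent accessible surface that is exposed."""
    return sasa / intrinsic

# (SASA, SASA-intrinsic) in Å², copied from Table 3.
table3 = {"2_Arg_213": (57.0, 238.76), "M_Asp_44": (67.0, 140.39)}
for residue, (sasa, intrinsic) in table3.items():
    print(f"{residue}: {sasa_ratio(sasa, intrinsic):.3f}")
```

A smaller ratio means a larger share of the residue is shielded from solvent, which is the sense in which the two side chains are described as buried.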

Figure 2.

Four salt bridges formed between the buried side chains of SMN’s Asp44 (M_Asp_44 in red text) and Gemin2’s Arg213 (2_Arg_213 in white text). In this figure, the residue naming scheme is Chain ID_residue name_residue number. Asp44’s side chain oxygens are coloured red, Arg213’s nitrogen atoms are coloured blue and all hydrogen atoms are coloured white. The four dotted lines in four colours represent the four side chain salt bridges formed between the two oppositely charged residues, following the colouring scheme described in Table 2.

Taken together, it is conceivable that the buried side chains of SMN’s Asp44 and Gemin2’s Arg213 form a salt bridge, which constitutes a favourable electrostatic energy contribution to the SMN-Gemin2 complex structural stability [41], and highlights the functionally indispensable roles of the two residues’ charged side chains, considering the experimental observation that the SMN-Gemin2 binding is abrogated by the D44V mutation [41], resulting in a functionally deficient SMA-linked D44V SMN mutant.

In addition to the intermolecular salt bridges between SMN’s Asp44 and Gemin2’s Arg213, a set of intramolecular salt bridges was identified between the side chains of SMN’s Asp35 and Lys41 (Table 2). This was also reported in [1], where 15 salt bridges were identified between these two side chains in the salt bridge analysis of the NMR-determined SMN-Gemin2 complex ensemble (PDB ID: 2LEH) [22, 41]. In SMN, Lys41 is a positively charged residue close to Asp44 in sequence. Unlike the SMA-linked D44V mutation, a Lys41Ala (K41A) mutation (not SMA-linked) does not affect SMN-Gemin2 binding [41]. Thus, again in agreement with [1], the structural analysis highlights that the salt bridges between SMN’s Asp35 and Lys41 are intramolecular, i.e. within the apo SMN protein, rather than intermolecular, i.e. at the SMN-Gemin2 complex interface, which helps to explain why the K41A mutation is not SMA-linked [41].

Overall, the old [1] and the new (this chapter) sets of computational structural analyses agree well for both NMR and X-ray SMN-related structures, reflecting the technical maturity of these two main biophysical tools for biomolecular structure determination. This is worth noting in light of the rapidly growing number of cryo-electron microscopy (cryo-EM) entries deposited in the Electron Microscopy Data Bank (EMDB): cryo-EM still has a long way to go to match NMR spectroscopy and X-ray crystallography in technical maturity, and tools for structural model quality validation are urgently needed [45].

2.3.2. Gly95 in the SMN tudor domain

Although not located in the structurally determined region of the six new structures (Table 1), Gly95 is a residue in the SMN tudor domain and is the site of a Gly95Arg (G95R) mutation [25]. This G95R mutation significantly reduces SMN’s ability to bind Sm proteins, such as Sm-B and Sm-D1 [25], confirming that the tudor domain is the essential binding site of SMN for Sm proteins.

In a further inspection of the computational analysis reported in [1], no salt bridge or hydrogen bond was identified for Gly95. Nonetheless, in the SMN tudor domain NMR ensemble [46], salt bridges were found between the side chains of Asp96 and Lys93: 1 for PDB ID 1G5V [46] with 10 structure models, 18 for PDB ID 4A4E [47] with 20 structure models (Figure 3) and 16 for PDB ID 4A4G [47] with 20 structure models. Similarly, 15 salt bridges were identified between the side chains of Glu147 and Lys97 of SMN (PDB ID: 4A4G [47], with 20 structure models), with a distance of 2.93 ± 0.39 Å between the two oppositely charged groups.
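Counting an interaction across an NMR ensemble presupposes that the ensemble can be handled one structural model at a time. In [1] this was done with a tcl script; the following is only a minimal Python sketch of the same idea, relying on the MODEL and ENDMDL records of the PDB format. The function names and the toy coordinates are hypothetical.

```python
def split_models(pdb_text):
    """Map each MODEL serial number to the coordinate lines of that model."""
    models = {}
    number, lines = None, None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            number, lines = int(line[6:].split()[0]), []
        elif record == "ENDMDL" and lines is not None:
            models[number] = lines
            number, lines = None, None
        elif lines is not None and record in ("ATOM", "HETATM", "TER"):
            lines.append(line)
    return models

def models_satisfying(models, predicate):
    """Return the model numbers whose coordinate lines pass a caller-supplied test."""
    return [n for n, lines in sorted(models.items()) if predicate(lines)]

# Toy two-model ensemble with made-up coordinates.
toy = """MODEL        1
ATOM      1  NZ  LYS A  93       0.000   0.000   0.000
ENDMDL
MODEL        2
ATOM      1  NZ  LYS A  93       1.000   0.000   0.000
ATOM      2  OD1 ASP A  96       3.000   0.000   0.000
ENDMDL
"""
models = split_models(toy)
print(len(models), models_satisfying(models, lambda lines: len(lines) >= 2))
```

In practice the predicate would be a salt bridge or hydrogen bond test applied to each model, so that the number of models showing a given interaction can be tallied.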

Figure 3.

Two salt bridges formed between the side chains of SMN’s Asp96 and Lys93 (shown as sticks here) according to a salt bridge analysis of the third structural model of the NMR ensemble (PDB ID 4A4E) [47]. In this figure, Asp96’s side chain oxygens are coloured red, and Lys93’s side chain nitrogen is coloured blue, while all hydrogen atoms are coloured in white.

Interestingly, Gly95 sits right between two oppositely charged neighbouring residues (Asp96 and Lys93), which are the only two charged residues in the tudor domain in spatial proximity to Gly95. Thus, it is conceivable that the G95R mutation disrupts the Asp96-Lys93 salt bridge and/or builds another (possibly even stronger) one between the side chains of Arg95 and Asp96. Either effect could perturb the structure-stabilising activity of the Asp96-Lys93 salt bridge and/or make it energetically less favourable for Asp96’s side chain to orient towards positively charged side chains in Sm proteins, thereby affecting the binding of SMN to Sm proteins. This potential mechanism of local electrostatic disruption resembles that of the E134K and Q136E mutations of SMN [1], with one difference: the mechanism proposed here for G95R depends on the occurrence of energetically unfavourable electrostatic interaction(s), whereas the mechanism for E134K and Q136E depends on the loss of energetically favourable electrostatic interaction(s) for the local structural stability of the SMN tudor domain, the part of SMN essential for Sm protein binding, which can help explain the reduced Sm core assembly activity of the two SMA-linked mutants SMNE134K and SMNQ136E.

2.3.3. Y109C, Y130C and Y130H in the SMN tudor domain

Among the eight SMA-linked missense mutations reported in [39], only those at Tyr109 and Tyr130 (Y109C, Y130C and Y130H) are located in a structurally determined region of SMN [1], according to the updated list of SMN-related structures as of 25 September 2018. These two residues are not covered by the six new structures, but they are covered by the experimentally determined structures analysed in [1].

Tyr130 is a hydrophobic tudor domain residue affected by the Tyr130Cys (Y130C) mutation [29]. In the computational analysis in [1], no salt bridge or hydrogen bond was identified for Tyr130. Nonetheless, Tyr130 is about 50% buried, with an SASA value of 111.1 ± 4.18 Å² compared with its standard SASA value of 212.7 Å², while Tyr109 is deeply buried, with an SASA value of 61.1 ± 8.43 Å² compared with the same standard value of 212.7 Å². Taken together, the SASA analysis of the three SMA-linked mutations highlights the potential significance of the buried hydrophobic side chains of Tyr109 and Tyr130 in the SMN tudor domain.

Moreover, in the computational analysis in [1], 10 side chain hydrogen bonds (Table 4) were identified between SMN’s Tyr109 and Asp105 in PDB entry 4A4E [47], with donor-acceptor distances (D-A in Table 4) of 2.72 ± 0.06 Å and ADH angles of 14.75 ± 2.93. No salt bridge was identified for Asp105, and no further hydrogen bonds were identified for Tyr109 or Asp105 in any experimentally determined SMN-related structure as of 25 September 2018.

| PDB file | Acceptor (A) | Donor (D) | Hydrogen (H) | D-A (Å) | H-A (Å) | ADH |
| --- | --- | --- | --- | --- | --- | --- |
| 0.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.73 | 1.80 | 13.75 |
| 3.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.69 | 1.77 | 15.61 |
| 4.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.67 | 1.72 | 10.96 |
| 5.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.77 | 1.86 | 17.26 |
| 6.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.71 | 1.78 | 15.09 |
| 8.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.78 | 1.87 | 16.14 |
| 12.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.83 | 1.95 | 20.70 |
| 14.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.71 | 1.78 | 13.76 |
| 18.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.71 | 1.76 | 10.91 |
| 19.pdb | OD2, A_ASP_105 | OH, A_TYR_109 | HH, A_TYR_109 | 2.63 | 1.70 | 13.36 |

Table 4.

The hydrogen bonds formed between the side chains of SMN’s Tyr109 and Asp105 (PDB entry 4A4E).

In this table, each PDB file name corresponds to a single NMR structural model split from the NMR ensemble (PDB entry 4A4E) by a tcl script [1]. The residue naming scheme is Chain ID_residue name_residue number, and ADH represents the angle formed by the acceptor (A), donor (D) and hydrogen (H) atoms.
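The summary statistics quoted for these hydrogen bonds can be recomputed from the ten rows of Table 4. The sketch below assumes that the reported spread is the sample standard deviation (n − 1 in the denominator); the text does not state this, but that choice reproduces the quoted values.

```python
from statistics import mean, stdev

# D-A distances (Å) and ADH angles, copied row by row from Table 4.
d_a = [2.73, 2.69, 2.67, 2.77, 2.71, 2.78, 2.83, 2.71, 2.71, 2.63]
adh = [13.75, 15.61, 10.96, 17.26, 15.09, 16.14, 20.70, 13.76, 10.91, 13.36]

# stdev() is the sample standard deviation (assumed, see above).
print(f"D-A: {mean(d_a):.2f} ± {stdev(d_a):.2f} Å")
print(f"ADH: {mean(adh):.2f} ± {stdev(adh):.2f}")
```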

Taken together, the computational findings here indicate that SMN’s Tyr109 and Asp105 contribute to the structural stability of SMN through hydrogen bonding between their side chains. If Tyr109 is replaced by Cys109, the side chain hydrogen bond (Figure 4, Table 4) will disappear, and the negatively charged side chain of Asp105 will gain more geometric freedom. This could disrupt the intramolecular and/or intermolecular electrostatic interaction network, not to mention the possibility of a disrupted disulphide bonding network within the SMN protein, the SMN complex or even the snRNP assembly, which is critical to ensure that pre-mRNA editing of the SMN1 gene does not go wrong and that its product is the FL-SMN protein rather than its truncated, functionally deficient counterpart.

Figure 4.

The hydrogen bond (Table 4) formed between the side chains of SMN’s Tyr109 and Asp105 in the PDB entry 4A4E [47]. In this figure, SMN’s Tyr109 and Asp105 are shown in sticks, all side chain oxygens are coloured red, and side chain nitrogen is coloured blue, while all hydrogen atoms are coloured in white, and all atoms are labelled with their names nearby. The blue dotted line between OD2 of Asp105 and HH of Tyr109 represents the hydrogen bond formed between SMN’s Tyr109 and Asp105.

2.3.4. A structural analysis of the hydrogen bonds formed within the six new SMN-related structures

In light of the six new experimentally determined SMN-related structures (Table 1), a new hydrogen bonding analysis was conducted following the details in [1], and the results are briefly summarised in Table 5.

| PDB ID | Acceptor (A) | Donor (D) | Hydrogen (H) | D-A (Å) | H-A (Å) | ADH |
| --- | --- | --- | --- | --- | --- | --- |
| 5XJR | OE1, A_GLN_24 | NH2, B_ARG_94 | HH21, B_ARG_94 | 3.00 | 1.99 | 1.79 |
| 5XJR | OD2, B_ASP_104 | NH1, B_ARG_102 | HH12, B_ARG_102 | 2.98 | 2.07 | 20.93 |
| 5XJS | OD1, B_ASP_93 | NE, 2_ARG_235 | HE, 2_ARG_235 | 2.94 | 1.96 | 11.86 |
| 5XJS | OD1, B_ASP_93 | NH2, 2_ARG_239 | HH21, 2_ARG_239 | 2.98 | 1.98 | 5.80 |
| 5XJS | OD2, B_ASP_60 | ND2, B_ASN_64 | HD22, B_ASN_64 | 2.99 | 2.13 | 25.53 |
| 5XJT | OD1, B_ASP_93 | NE, 2_ARG_235 | HE, 2_ARG_235 | 2.65 | 1.75 | 21.43 |
| 5XJU | OD1, B_ASP_93 | NE, 2_ARG_235 | HE, 2_ARG_235 | 2.98 | 2.08 | 23.00 |
| 5XJU | OD2, B_ASP_60 | ND2, B_ASN_64 | HD21, B_ASN_64 | 2.80 | 1.99 | 29.63 |

Table 5.

The hydrogen bonds formed between the residue side chains within the six new experimentally determined SMN-related structures.

In this table, the residue naming scheme is Chain ID_residue name_residue number, and ADH represents the angle formed by the acceptor (A), donor (D) and hydrogen (H) atoms.
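The three geometric columns of such a hydrogen bond table can be derived from the coordinates of the acceptor, donor and hydrogen atoms. The sketch below assumes that ADH is the angle at the donor atom between the donor-acceptor and donor-hydrogen directions, which is consistent with the small angles tabulated but is not stated explicitly in the text. The coordinates are made up, and the 0.96 Å donor-hydrogen bond length is likewise an assumption of the toy example.

```python
import math

def hbond_geometry(acceptor, donor, hydrogen):
    """Return (D-A distance, H-A distance, A-D-H angle in degrees)."""
    d_a = math.dist(donor, acceptor)
    h_a = math.dist(hydrogen, acceptor)
    d_h = math.dist(donor, hydrogen)
    da_vec = [a - d for a, d in zip(acceptor, donor)]
    dh_vec = [h - d for h, d in zip(hydrogen, donor)]
    cosine = sum(p * q for p, q in zip(da_vec, dh_vec)) / (d_a * d_h)
    # Clamp against rounding error before taking the arc cosine.
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
    return d_a, h_a, angle

# Toy geometry: donor at the origin, acceptor 2.73 Å away along x,
# hydrogen 0.96 Å from the donor and rotated 13.75 degrees off the D-A axis.
theta = math.radians(13.75)
donor = (0.0, 0.0, 0.0)
acceptor = (2.73, 0.0, 0.0)
hydrogen = (0.96 * math.cos(theta), 0.96 * math.sin(theta), 0.0)
d_a, h_a, angle = hbond_geometry(acceptor, donor, hydrogen)
print(f"D-A {d_a:.2f} Å, H-A {h_a:.2f} Å, ADH {angle:.2f}")
```

Under this definition, a small ADH value corresponds to a nearly linear donor-hydrogen-acceptor arrangement.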

Table 5 shows the four hydrogen bonds formed between snRNP Sm-D2’s Asp93 and Gemin2’s Arg235 and Arg239. Functionally, Gemin2 is closely linked to SMN (formerly known as Gemin1), and a Gemin1-Gemin2 complex structure has been determined by NMR spectroscopy (PDB ID: 2LEH) [22, 41]. A closer visual inspection of the SMN-related structures (PDB IDs: 5XJS, 5XJT and 5XJU [42], Table 1) is therefore worthwhile.

From Figure 5 (PDB ID: 5XJS), it is clear that the three charged residues (snRNP Sm-D2’s Asp93 and Gemin2’s Arg235 and Arg239) sit right at the structural interface between Sm-D2 (pink) and Gemin2 (green), with their oppositely charged side chains closely facing each other. This is similar to the situation reported in [1], where the deeply buried side chains of SMN’s Lys45 and Asp36 act as two electrostatic clips at the SMN-Gemin2 interface via interactions with both the side chains and the backbone of Gemin2’s Gln105, Gln109, His120, His123 and Trp124.

Figure 5.

Crystal structure of the Gemin2-binding domain of SMN, Gemin2 in complex with SmD1/D2/F/E (PDB ID: 5XJS [42]). In this figure, the whole structure is shown in cartoon and coloured by chain using PyMol [48], where green and pink represent Gemin2 and snRNP Sm-D2, respectively. In this figure, three amino acid residues are shown in sticks and labelled with red and blue texts nearby.

In the subsequent computational salt bridge analysis of the six new SMN-related structures, it turned out that the three charged residues did form salt bridges between their closely facing oppositely charged side chains, as listed in Table 6 below and illustrated in Figure 6.

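| PDB ID | SBnum | Residue A | Atom A | Residue B | Atom B | Distance (Å) |
| --- | --- | --- | --- | --- | --- | --- |
| 5XJL | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH1 | 3.734 |
| 5XJL | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 3.052 |
| 5XJL | 3 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 3.052 |
| 5XJQ | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH1 | 3.817 |
| 5XJQ | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 3.059 |
| 5XJQ | 3 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 3.004 |
| 5XJR | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH1 | 3.811 |
| 5XJR | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 3.022 |
| 5XJR | 3 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 2.938 |
| 5XJS | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH1 | 3.688 |
| 5XJS | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 2.983 |
| 5XJS | 3 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 3.092 |
| 5XJT | 2 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 3.251 |
| 5XJT | 2 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 3.163 |
| 5XJU | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH1 | 3.634 |
| 5XJU | 3 | B_ASP_93 | OD1 | 2_ARG_239 | NH2 | 3.084 |
| 5XJU | 3 | B_ASP_93 | OD2 | 2_ARG_239 | NH2 | 3.089 |
| 5XJL | 2 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.657 |
| 5XJL | 2 | B_ASP_93 | OD2 | 2_ARG_235 | NH2 | 3.475 |
| 5XJQ | 2 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.686 |
| 5XJQ | 2 | B_ASP_93 | OD2 | 2_ARG_235 | NH2 | 3.647 |
| 5XJR | 2 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.847 |
| 5XJR | 2 | B_ASP_93 | OD2 | 2_ARG_235 | NH2 | 3.800 |
| 5XJS | 2 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.548 |
| 5XJS | 2 | B_ASP_93 | OD2 | 2_ARG_235 | NH2 | 3.379 |
| 5XJT | 2 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.258 |
| 5XJT | 2 | B_ASP_93 | OD2 | 2_ARG_235 | NH2 | 3.766 |
| 5XJU | 1 | B_ASP_93 | OD1 | 2_ARG_235 | NH2 | 3.996 |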

Table 6.

A summary of salt bridge analysis of the six new SMN-related structures as of 25 September 2018 [40].

In this table, the residue naming scheme is Chain ID_residue name_residue number, SBnum represents the number of salt bridges computationally identified from the PDB entries listed in this table. Distance represents the distance between two oppositely charged groups/atoms in Å.

Figure 6.

Crystal structure of the Gemin2-binding domain of SMN, Gemin2 in complex with SmD1/D2/F/E (PDB ID: 5XJS) [42]. In this figure, the yellow dotted lines represent two examples of the hydrogen bonds formed between Asp93 and Arg235, while the blue dotted line represents an example of the salt bridge formed between Asp93 and Arg239 (Table 6).

Collectively, snRNP Sm-D2’s Asp93 and Gemin2’s Arg235 and Arg239 are three structurally important residues which help stabilise the structural interface through intermolecular electrostatic interactions, including both salt bridges and also hydrogen bonds, similar to the way SMN’s Asp44, Gemin2’s Arg213 and the two SMN residues (Lys45 and Asp36) play stabilising roles in the SMN-Gemin2 complex structure formation [1].

Considering the close functional relationship between Gemin2 and SMN, a further hydrogen bond and salt bridge analysis was conducted for Arg235 and Arg239 in PDB entry 2LEH [22, 41]. The two arginines did not form any intermolecular electrostatic interaction with SMN, neither salt bridge nor hydrogen bond. Instead, the two arginines of Gemin2 formed only two hydrogen bonds, with Gln272 and His231 of Gemin2, and one stable salt bridge with Asp274 of Gemin2, for which 16 salt bridges were identified across the 32 NMR structural models (Table 7), according to the structural analysis of PDB entry 2LEH [22, 41].

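| PDB file | SBnum | Residue A | Atom A | Residue B | Atom B | Distance (Å) |
| --- | --- | --- | --- | --- | --- | --- |
| 24.pdb | 2 | A_ASP_274 | OD1 | A_ARG_239 | NH2 | 3.797 |
| 24.pdb | 2 | A_ASP_274 | OD2 | A_ARG_239 | NH2 | 2.738 |
| 01.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.577 |
| 02.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.164 |
| 04.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.195 |
| 09.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 2.692 |
| 16.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH1 | 3.871 |
| 17.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH1 | 2.868 |
| 19.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH1 | 3.718 |
| 21.pdb | 2 | A_ASP_274 | OD2 | A_ARG_235 | NH1 | 3.457 |
| 21.pdb | 2 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.429 |
| 23.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.798 |
| 24.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 2.888 |
| 26.pdb | 3 | A_ASP_274 | OD1 | A_ARG_235 | NH2 | 3.770 |
| 26.pdb | 3 | A_ASP_274 | OD2 | A_ARG_235 | NH1 | 3.836 |
| 26.pdb | 3 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 2.442 |
| 31.pdb | 1 | A_ASP_274 | OD2 | A_ARG_235 | NH2 | 3.403 |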

Table 7.

A summary of salt bridge analysis of PDB entry 2LEH [22, 41].

In this table, the residue naming scheme is Chain ID_residue name_residue number, SBnum represents the number of salt bridges computationally identified.

3. Concluding remarks

Given SMN’s critical role in the maturation of snRNPs and in the development of SMA [2, 6, 11], structure-activity relationship (SAR) characterisation of the SMA protein needs to continue. With various biophysical tools available, structure determination and functional characterisation of SMN-related proteins and biological complexes, such as the SMN complex and snRNPs, will undoubtedly continue to advance, which will help further our understanding of SMN’s role in SMA from a molecular structural point of view. In practice, however, advances do not come easily. For instance, although full-length structures of both FL-SMN (with 294 residues) and SMNΔ7 (with 282 residues) were experimentally determined using X-ray crystallography and deposited in the database (PDB IDs: 4NL6 and 4NL7), they were subsequently withdrawn by the author because the sample used for the structure determination was wrong. Otherwise, these two full-length SMN structures would have constituted the very first step towards a comprehensive picture of SMN’s structure and function in the molecular pathogenesis of SMA.

As of 25 September 2018, there is still no full-length structure of SMN (or of the SMN complex or the snRNP assembly) deposited in the wwPDB [40], although it contains six new experimentally determined SMN-related structures in addition to those reported in [1]. In terms of amino acid sequence, those SMN-related structures still cover only SMN fragments, from Gly26 to Lys51 and from Asn84 to Glu147. The remaining regions, whose structures have not yet been determined (referred to as structural gaps below), consist of 204 SMN residues. Sixteen months after the publication of [1], the structural gaps remain: no progress has been made in bridging them in spite of the six newly deposited structures, which again calls, as in [1], for further comprehensive structure determination and functional research on this SMA protein.
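The figure of 204 undetermined residues follows directly from the two structurally determined ranges and the 294-residue length of FL-SMN. A minimal sketch of that arithmetic (the function name `gap_segments` is an illustrative choice) is:

```python
FULL_LENGTH = 294                 # residues in FL-SMN
DETERMINED = [(26, 51), (84, 147)]  # Gly26-Lys51 and Asn84-Glu147, inclusive

def gap_segments(length, determined):
    """Return the inclusive residue ranges not covered by any determined range."""
    gaps, next_start = [], 1
    for start, end in sorted(determined):
        if start > next_start:
            gaps.append((next_start, start - 1))
        next_start = max(next_start, end + 1)
    if next_start <= length:
        gaps.append((next_start, length))
    return gaps

gaps = gap_segments(FULL_LENGTH, DETERMINED)
missing = sum(end - start + 1 for start, end in gaps)
print(gaps, missing)
```

The two determined fragments cover 26 and 64 residues, so 294 − 90 = 204 residues fall into the structural gaps, which include both termini as well as the segment between the two fragments.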

4. A residue-specific distributional analysis of the structural gaps in the Protein Data Bank

As a 38-kD protein, SMN is a small one in terms of molecular weight, in comparison with all proteins whose structures have been deposited in the Protein Data Bank (PDB), a primary database for experimentally determined structures of biological molecules [40]. As discussed above, even for a protein as small as SMN, experimental structure determination is neither simple nor easy, especially when it has to be done in a full-length and gapless manner. Therefore, to test whether any previously unreported residue-specific statistical pattern exists in the structural gaps in the whole Protein Data Bank (accessed 25 September 2018), this chapter presents a residue-specific distributional analysis of structural gaps throughout the PDB.

While the number of experimentally determined protein structures in the PDB keeps increasing, and the number of cryo-EM structures [49] is on the rise, X-ray crystallography and NMR spectroscopy remain to date the two main complementary biophysical tools in structural biology (Table 8), each with strengths and weaknesses [50, 51].

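| Experimental method | Proteins | Nucleic acids | Protein/NA complex | Other | Total |
| --- | --- | --- | --- | --- | --- |
| X-ray | 121,081 | 1958 | 6257 | 10 | 129,306 |
| NMR | 10,848 | 1256 | 250 | 8 | 12,362 |
| Electron microscopy | 1750 | 31 | 623 | 0 | 2404 |
| Other | 244 | 4 | 6 | 13 | 267 |
| Multi-method | 117 | 5 | 2 | 1 | 125 |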

Table 8.

A summary of the number of experimentally determined biomolecular structures in PDB as of 25 September 2018.

In PDB-format data, the atomic coordinates presented in the ATOM records of a PDB file may not exactly match the sequence in the SEQRES records, because some residues that were present in the sample were not observed. Such residues are still usually included in the SEQRES records, since that portion of the chain was present during the experiment. In these cases, a ‘REMARK 465’ entry is included in the header of the PDB file to identify each missing residue. In X-ray crystallography, the ends of chains and mobile loops are often not observed, so their atomic coordinates are not included as ATOM records in the file, which leads to gaps in structures determined by X-ray crystallography. Among currently available biophysical tools, NMR spectroscopy provides unique access to the atomic-level structural dynamics of protein molecules in solution under physiological conditions (such as temperature and pH). This chapter therefore focuses on the structural gaps within protein structures determined by NMR spectroscopy, and aims to test whether any residue-specific statistical pattern exists in them. Here, structural gaps are defined as protein fragments whose residues exist in the originally studied molecule, as shown in the SEQRES records, but not in the observed structure (the atomic coordinates).
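Under this definition, the total and missing residues of a single PDB file can be tallied from its SEQRES and REMARK 465 records. The sketch below is not the in-house script used for this chapter. It assumes that REMARK 465 data rows read "[model] residue chain number", that a residue listed for several NMR models should be counted once, and that only the 20 standard amino acids are of interest; insertion codes and blank chain identifiers are not handled. The toy input is made up.

```python
from collections import Counter

THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}

def total_residues(pdb_text):
    """Count standard residues per one-letter code from SEQRES records."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            # Residue names start at column 20 of a SEQRES record.
            for name in line[19:].split():
                if name in THREE_TO_ONE:
                    counts[THREE_TO_ONE[name]] += 1
    return counts

def missing_residues(pdb_text):
    """Count residues listed in REMARK 465, once per (chain, number)."""
    counts, seen = Counter(), set()
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        if len(fields) not in (3, 4):      # header and free-text rows
            continue
        name, chain, number = fields[-3:]
        if name not in THREE_TO_ONE or not number.lstrip("-").isdigit():
            continue
        if (chain, number) not in seen:    # same residue repeated per model
            seen.add((chain, number))
            counts[THREE_TO_ONE[name]] += 1
    return counts

toy = "\n".join([
    "REMARK 465   M RES C SSSEQI",
    "REMARK 465   1 MET A     1",
    "REMARK 465   1 HIS A     2",
    "REMARK 465   2 MET A     1",
    "REMARK 465   2 HIS A     2",
    "SEQRES {:>3} {} {:>4}  {}".format(1, "A", 5, "MET HIS HIS GLY ALA"),
])
print(total_residues(toy), missing_residues(toy))
```

Summing both counters over all downloaded files would then give per-residue totals of the kind needed for a missing-versus-total ratio.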

As of 20 September 2018, 10,844 NMR-determined protein structures had been deposited in the Protein Data Bank, according to a structure search with two parameters (molecule type = protein, experimental method = NMR). After the 10,844 PDB files were downloaded from the PDB website, the numbers of total and missing amino acid residues were extracted for all proteins with an in-house Python script, as listed in Table 9.

Residue | Missing no. | Total no. | Ratio = Missing no./Total no.
A | 1782 | 75,627 | 0.023
C | 152 | 25,777 | 0.00589673
E | 1811 | 79,729 | 0.022
D | 1390 | 59,908 | 0.023
G | 3321 | 81,347 | 0.041
F | 615 | 38,993 | 0.015
I | 640 | 55,070 | 0.011
H | 5146 | 26,182 | 0.196
K | 1442 | 75,766 | 0.019
M | 1203 | 23,652 | 0.050
L | 1439 | 90,833 | 0.015
N | 831 | 43,080 | 0.019
Q | 1273 | 44,594 | 0.028
P | 1518 | 46,205 | 0.032
S | 3159 | 73,904 | 0.042
R | 1234 | 52,761 | 0.023
T | 1112 | 56,787 | 0.019
W | 138 | 13,211 | 0.010
V | 980 | 70,252 | 0.013
Y | 626 | 32,905 | 0.019
Sum | 29,812 | 1,066,583 | 0.027

Table 9.

The numbers of the total and the missing amino acid residues in NMR-determined protein structures as of 25 September 2018.

In total, the 10,844 protein structures contain 1,066,583 amino acid residues, 2.8% of which (29,812) are missing, i.e. the atomic positions of these 29,812 residues were not experimentally determined by NMR spectroscopy, although the residues were present in the NMR sample during the structure determination process.

From Figure 7, it can be seen that for 19 residues (excluding histidine), the missing ratio is below or close to 5%, whereas the missing ratio for histidine is 19.6%, as shown by the sharp blue peak in Figure 7. In a one-sample t-test of the 19 missing ratios, the hypothesis that their average is 0.0231 was fully acceptable (P = 1). A Chi-square test likewise found the fit between the 19 missing ratios and the red horizontal line in Figure 8 fully acceptable (P = 1).
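The arithmetic behind these figures can be retraced from the counts in Table 9. The sketch below is only an illustration using the Python standard library, not the analysis software used for this chapter; the names `TABLE9` and `one_sample_t` are hypothetical. It recomputes the column sums, the overall missing ratio, and the average of the 19 non-histidine ratios. Note that a one-sample t statistic computed against the sample's own mean is zero by construction, which corresponds to P = 1.

```python
import math
import statistics

# (missing, total) residue counts transcribed from Table 9.
TABLE9 = {
    "A": (1782, 75627), "C": (152, 25777), "E": (1811, 79729),
    "D": (1390, 59908), "G": (3321, 81347), "F": (615, 38993),
    "I": (640, 55070), "H": (5146, 26182), "K": (1442, 75766),
    "M": (1203, 23652), "L": (1439, 90833), "N": (831, 43080),
    "Q": (1273, 44594), "P": (1518, 46205), "S": (3159, 73904),
    "R": (1234, 52761), "T": (1112, 56787), "W": (138, 13211),
    "V": (980, 70252), "Y": (626, 32905),
}

ratios = {res: miss / tot for res, (miss, tot) in TABLE9.items()}
missing_sum = sum(miss for miss, _ in TABLE9.values())
total_sum = sum(tot for _, tot in TABLE9.values())
overall_ratio = missing_sum / total_sum

# The 19 ratios that exclude histidine.
non_his = [ratio for res, ratio in ratios.items() if res != "H"]
mean19 = statistics.mean(non_his)


def one_sample_t(values, mu0):
    """t statistic for H0: the population mean of `values` equals mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error


t_stat = one_sample_t(non_his, mean19)

print(missing_sum, total_sum, round(overall_ratio, 3))  # 29812 1066583 0.028
print(round(mean19, 4), t_stat)                         # 0.0231 0.0
```

Converting a non-zero t statistic into a P value would additionally need the t distribution, which the standard library does not provide.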

Figure 7.

A residue-specific distribution of the missing residues in NMR-determined protein structures as of 25 September 2018. In this figure, the x-axis shows the one-letter codes of the amino acid residues, and the y-axis shows the residue-specific ratio of missing to total residues in those NMR structures. The red vertical line highlights histidine as the one residue with an outstanding missing ratio.

Figure 8.

A residue-specific scatter plot of the missing residues in NMR-determined protein structures as of 25 September 2018. In this figure, the x-axis shows the one-letter codes of the amino acid residues, and the y-axis shows the residue-specific ratio of missing to total residues in those NMR structures. The red horizontal line represents the average missing ratio of the 19 residues other than histidine.

While a missing ratio of around 5% or less is unremarkable here, a missing ratio of 19.6% clearly cannot be ignored, raising one obvious question: what sets histidine apart from the other 19 naturally occurring amino acids in this residue-specific distributional analysis of the structural gaps?

Like the other 19, histidine is a naturally occurring amino acid that is used in the biosynthesis of proteins. Also like the other 19, it contains an amino group (in the protonated -NH3+ form under biological conditions) and a carboxylic acid group (in the deprotonated -COO− form under biological conditions). What distinguishes histidine is its imidazole side chain, which is partially protonated near neutral pH; for this reason, histidine is conventionally classified as a positively charged amino acid at physiological pH (7.4). Among the 20 naturally occurring amino acids, five (Arg, Lys, His, Glu and Asp) possess ionisable side chains that are conventionally regarded as charged. Among the five, histidine is the only one whose side chain has an ionisable imidazole ring (with an intrinsic pKa of 6.04) [52, 53], which can exist in two inter-convertible tautomeric states. According to the classical Henderson-Hasselbalch equation [50], at a pH of 7.0 the imidazole ring is mostly deprotonated (proton occupancy = 9.88%), whereas at a pH of 6.0 it is about half protonated (proton occupancy = 52.30%). The protonated imidazole ring bears two NH bonds and a positive electric charge, which is equally distributed between the two nitrogens. As the pH increases, the imidazole ring loses the positive charge, and the remaining proton of the neutral imidazole ring can reside on either nitrogen, giving rise to the two tautomeric states of the histidine side chain [52, 54, 55].
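The proton occupancies quoted above follow directly from the Henderson-Hasselbalch equation. As a minimal sketch (the function name `protonated_fraction` is an illustrative choice, and a single intrinsic pKa of 6.04 is assumed, ignoring any shift caused by the protein environment), the calculation can be written in Python as:

```python
def protonated_fraction(ph, pka=6.04):
    """Fraction of imidazole rings carrying the extra proton at a given pH."""
    # Henderson-Hasselbalch: pH = pKa + log10([base] / [acid]),
    # so [acid] / ([acid] + [base]) = 1 / (1 + 10 ** (pH - pKa)).
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for ph in (6.0, 7.0, 7.4):
    print(f"pH {ph:.1f}: {100 * protonated_fraction(ph):.2f}% protonated")
```

With pKa = 6.04, this reproduces the 52.30% occupancy at pH 6.0 and the 9.88% occupancy at pH 7.0; the pH 7.4 line is included only to show the same calculation at physiological pH.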

To sum up, the missing ratio of histidine is probably much higher than that of the other 19 residues because of its special side chain, which has distinctive dynamic structural and physicochemical properties (such as stacking interactions [56]) and an imidazole ring that is in constant protonation-deprotonation equilibrium [57] and has two tautomeric states [52, 54, 55]. These properties make its NMR observables (chemical shifts, for instance) difficult to observe and measure experimentally, and difficult for NMR-related software to use in the structural determination of proteins. To address this issue of PDB-wide structural gaps, selective isotope labelling of histidine residues (the side chains in particular) can be a useful approach in biomolecular structure determination by NMR spectroscopy, not only on its own but also in combination with other biophysical tools, and not only for histidine but also for the other 19 amino acids, the fundamental building blocks of life.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Wei Li (November 5th 2018). Structural and Functional Consequences of the SMA-Linked Missense Mutations of the Survival Motor Neuron Protein: A Brief Update [Online First], IntechOpen, DOI: 10.5772/intechopen.81887. Available from:
