Distribution of the different exonic substitutions throughout the 18 exons of the LDLR gene.
Hypercholesterolemia is a major risk factor for atherosclerosis and its premature cardiovascular complications. Hypercholesterolemia can be multifactorial (diet, genetic background...) or - less frequently - monogenic, leading to Autosomal Dominant Hypercholesterolemia (ADH, OMIM #143890). ADH is characterised by a selective elevation of plasmatic Low Density Lipoprotein (LDL) levels, tendinous xanthoma and premature coronary heart disease. ADH has proven to be genetically heterogeneous and associated with defects in at least 3 different genes: LDLR (LDL receptor), APOB (apolipoprotein B) and PCSK9 (proprotein convertase subtilisin-kexin type 9).
Familial hypercholesterolemia (FH, OMIM #606945) is the most frequent form of ADH and is due to mutations within the gene encoding the LDL specific receptor. FH is an autosomal co-dominant trait, with homozygotes being more severely affected than heterozygotes (Goldstein and Brown, 1989). FH is also one of the most common inherited disorders, with an estimated frequency of 1:500 for heterozygotes and 1:1,000,000 for homozygotes in most populations. In certain communities, such as French Canadians (Moorjani et al. 1989), Finns (Koivisto et al. 1992), Afrikaners (Kotze et al. 1989; Leitersdorf et al. 1989), Druze (Landsberger et al. 1992) and Lebanese (Lehrman et al. 1987), the frequency of FH can be as high as 1:67 because of founder effects.
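Assuming Hardy-Weinberg equilibrium (an assumption made here for illustration, not a statement from the cited studies), the homozygote frequency can be derived from the heterozygote frequency of 1:500. The function name below is hypothetical.

```python
import math

def homozygote_frequency(heterozygote_freq: float) -> float:
    """Return q^2 given the heterozygote frequency 2pq (Hardy-Weinberg assumed)."""
    # Solve 2q(1 - q) = heterozygote_freq for the smaller root q.
    q = (1 - math.sqrt(1 - 2 * heterozygote_freq)) / 2
    return q * q

het = 1 / 500
hom = homozygote_frequency(het)
print(f"expected homozygotes: about 1 in {1 / hom:,.0f}")
```

The result is close to one in a million, the order of magnitude usually quoted for homozygous FH.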
2. The LDL receptor
The human low-density lipoprotein receptor mediates the transport of LDL into cells via endocytosis, and thus plays a major role in the clearance of lipoproteins from the blood. In 1973, by studying homozygous patient fibroblasts, Michael S. Brown and Joseph L. Goldstein showed that the deficient protein in Familial Hypercholesterolemia was the LDL receptor (Goldstein and Brown, 1985).
The LDLR gene is localised at 19p13.1-p13.3, spans 45 kb and includes 18 exons (Lindgren et al. 1985; Yamamoto et al. 1984). It is ubiquitously expressed and encodes a glycoprotein of 839 amino acids that is pivotal in cholesterol homeostasis. The correspondence between the 6 functional domains of the protein and the exons of the LDLR gene is now well-established (Figure 1) (See Jeon and Blacklow 2005 for a review).
The signal peptide (21 amino acids) encoded by exon 1 is necessary for transport to the cell membrane and is cleaved during translocation into the endoplasmic reticulum (ER).
The ligand binding domain, encoded by exons 2 to 6, mediates the interaction with lipoproteins. This domain is made of seven modules named LDL receptor type A repeats (LR), which are homologous to sequences of the C9 protein of the complement cascade (Südhof et al. 1985). Each LR module is about 40 residues long, has six conserved cysteine residues, and contains a conserved acidic region near the C-terminus which serves as a calcium-binding site (Yamamoto et al. 1984, Fass et al. 1997). Mutational studies of the seven LR modules of the LDL receptor indicate that modules 3-7 all contribute significantly to the binding of LDL particles (Russell et al. 1989). The LR5 and LR6 modules are essentially structurally independent of each other (North et al. 1999).
The EGF precursor homology domain (400 amino acids encoded by exons 7 to 14) is made of three 40 amino acid repeats homologous to the EGF precursor, and is involved in the dissociation of the receptor and the lipoprotein in the endocytosis machinery. The first two repeats are contiguous and separated from the third by a 280 amino acid sequence that contains five copies of a conserved motif (YWTD), repeated once every 40-60 amino acids. The first epidermal growth factor-like repeat (EGF-A) in the EGF homology domain interacts in a sequence-specific manner with proprotein convertase subtilisin/kexin type 9 (PCSK9) (Zhang et al. 2007, Kwon et al. 2008). PCSK9 post-translationally regulates hepatic LDL receptors by binding to them on the cell surface and leading to their degradation. Gain-of-function mutations in the PCSK9 gene that increase the affinity of PCSK9 for the receptor and raise plasma LDL-cholesterol levels in humans have been reported in association with Autosomal Dominant Hypercholesterolemia (Abifadel et al. 2003, 2009). Loss-of-function mutations in the PCSK9 gene that decrease the affinity of PCSK9 for the receptor have also been reported in association with low plasma levels of LDL (Cohen et al. 2005).
Exon 15 encodes a 58 amino acid sequence that is enriched in serines and threonines, which serve as attachment sites for O-linked sugar chains. The absence of this exon has no significant functional consequence in cultured hamster fibroblasts (Davis et al. 1986).
The 22 amino acid membrane-anchoring domain, encoded by exon 16 and the 5’ end of exon 17, is essential to the attachment of the receptor to the cell membrane.
The 50 amino acid cytoplasmic tail, encoded by the remainder of exon 17 and the 5’ end of exon 18, is involved in the endocytosis of the protein. The NPXY motif was shown to interact with the AP-2 clathrin adaptor and thus is important in the localisation of the receptor in coated pits on the cell surface. The NPXY motif was also shown to interact with the phosphotyrosine binding (PTB) domain of a specific clathrin adaptor protein encoded by the LDLRAP1 gene. Mutations in the LDLRAP1 gene have been reported in Autosomal Recessive Hypercholesterolemia (Garcia et al. 2001, Soutar 2010).
The remainder of exon 18 specifies the 2.6 kb 3’ untranslated region of the mRNA.
In normal fibroblasts, the precursor protein is modified in the ER: the 21 amino acid signal peptide is cleaved and the precursor of 120 kDa is O-glycosylated to give rise to the 160 kDa protein. The resultant mature protein is transported from the Golgi apparatus to the cell surface within 30 minutes. The transmembrane receptor is present at the surface of most cell types and mediates the transport of LDL into cells, via receptor-mediated endocytosis, thus playing a pivotal role in cholesterol homeostasis (Goldstein and Brown, 2009). By endosome acidification, the lipoparticle is dissociated from the receptor, degraded and the receptor recycles back into the membrane.
3. Mutations in the LDLR gene
Mutations involving a small number of nucleotides, from point mutations to small deletions or insertions, account for 90% of all mutations in the LDLR gene, while the remaining are major rearrangements due to unequal recombination between the 30 Alu sequences identified throughout the gene (Hobbs et al. 1990). To date, more than 1400 point mutations and small deletions or insertions associated with FH have been reported in the LDLR gene (
The UMD-LDLR database (
Among these 1404 small DNA variations of the LDLR gene, 58.5% are missense mutations, 21.7% are small deletions or insertions, 10.4% are nonsense and 9.4% are splice site mutations. A large majority of these small DNA variations are single nucleotide substitutions (76.6%, 1076/1404), including 75.1% missense, 13.6% nonsense and 11.3% splice site mutations.
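As a quick arithmetic check, the proportions above can be recomputed from the counts quoted elsewhere in this chapter (821 missense, 305 small deletions or insertions, 146 nonsense and 132 splice site mutations). The dictionary layout is only an illustrative choice.

```python
# Counts of small DNA variations by type, as quoted in the text
counts = {
    "missense": 821,
    "small deletion/insertion": 305,
    "nonsense": 146,
    "splice site": 132,
}

total = sum(counts.values())
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}

for kind, pct in shares.items():
    print(f"{kind}: {counts[kind]}/{total} = {pct}%")
```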
3.1. Missense mutations
Missense mutations are the most numerous of the small DNA variations (58.5%, 821/1404) reported in the LDLR gene in association with Familial Hypercholesterolemia (FH). Like the other small DNA variations in the LDLR gene, missense mutations are widely distributed throughout the whole sequence of the gene (Figure 2). No real mutation hot spot can therefore be defined, which supports the need to scan the whole gene sequence to identify FH-causing mutations in diagnostic procedures.
The CpG dinucleotide has been shown to be a hot spot for mutations in humans because it can undergo oxidative deamination of 5-methyl cytosine (Krawczak et al. 1998). The LDLR gene sequence includes 123 CpG dinucleotides, accounting for 4.8% of the coding sequence. This ratio is similar to the mean percentage of CpG (3.7%) in the coding sequence of a large number of genes involved in human diseases and localised on autosomes (Cooper and Krawczak 1990). Missense mutations are the only substitutions in the LDLR gene occurring at the CpG dinucleotide for 4.8% (46/954) of all the single nucleotide variations. Interestingly, in the LDLR gene, the percentage of substitution occurring at the CpG (4.8%) is significantly lower than the mean observed for disease-causing mutations in other genes (37%) (Cooper and Krawczak 1990). There is no explanation, to date, for this observation.
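Counting CpG dinucleotides in a coding sequence is straightforward. The sketch below assumes that the percentage is taken over dinucleotide positions, which is one possible reading of the figure above (123 CpG over the 2580 nucleotides of the 860 codons gives about 4.8%). The test sequence is a made-up fragment, not LDLR sequence.

```python
def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")

def cpg_fraction(seq: str) -> float:
    """Fraction of dinucleotide positions occupied by CpG (assumed definition)."""
    positions = len(seq) - 1
    return count_cpg(seq) / positions if positions > 0 else 0.0

toy = "ATGCGTACGGCGTTA"  # hypothetical fragment for illustration only
print(count_cpg(toy), round(cpg_fraction(toy), 3))
```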
In the LDL receptor protein, the most numerous amino acids are aspartate (8.7%), serine (8.1%), leucine (7.7%), cysteine (7.3%), glycine (7.2%) and valine (6.7%). The least represented amino acids are methionine (1.3%), tyrosine (2.0%), histidine (2.2%), tryptophan (2.3%) and phenylalanine (3.0%). This distribution of amino acids is consistent with the one reported for human proteins in general, with the exception of cysteine, which is less abundant in human proteins in general (3%) (Lewin 1990). The LDL receptor is known to be a cysteine-rich protein in which disulphide bonds between pairs of cysteines are essential for ensuring the correct folding of 10 major modules necessary for protein activity (Russell et al. 1989, Kurniawan et al. 2001).
The number of mutations affecting an amino acid is not always related to its frequency in the protein. Cysteine, tryptophan and aspartate are more frequently affected than other residues, indicating that they are essential to protein activity. Substitutions affect 57 (90%) of the 63 cysteines of the LDL receptor, 43 (57%) of the 75 aspartates and 12 (60%) of the 20 tryptophans. Cysteines are involved in the folding of the ligand binding and EGF-like domains. Aspartates are also highly conserved residues of the repeated modules of the LDL binding domain. Their negative charges are involved in bonds with positively charged residues of the apo B and apo E ligands. Apart from its hydrophobicity, tryptophan does not have a structural or functional role as evident as that of a cysteine or a charged residue. However, along with methionine, tryptophan is the only amino acid encoded by a single codon, which probably explains the “more mutable” trait observed here.
A certain proportion (~25%) of the disease-causing substitutions (missense and nonsense mutations) have been shown to alter functional splicing signals within exons, such as exonic splicing enhancers (ESE), to create an alternative splice site within an exon that is used preferentially, or to induce the loss of the consensus exonic splice site (Cartegni et al. 2002, Sterne-Weiler et al. 2011). Within the LDLR gene, 28.4% of the reported missense mutations are predicted to alter functional splicing signals. The missense mutation c.2140G>C (p.Glu714Gln), which was predicted to be benign by four prediction tools for substitutions (Polyphen*, SIFT*, Pmut* and SNPs3D*), was predicted to cause the loss of the intron 14 donor splice site by both the NetGene2* and NNSPLICE* prediction tools for splice site mutations (Marduel et al. 2010). It is clear, however, that mRNA analyses are necessary to support these predictions, as performed for a small number of exonic substitutions. The conservative amino acid substitution c.2389G>T (p.V776L), which would be unlikely to affect LDL receptor function, involves the last nucleotide of exon 16 and causes exon 16 skipping (Bourbon et al. 2009). These missense mutations would therefore be likely to exert their major pathological effects on splicing rather than through an alteration in the amino acid sequence of the LDL receptor. This is reinforced by the observation of several silent substitutions associated with the clinical phenotype of familial hypercholesterolemia. The silent mutation p.Leu605Leu (c.1813C>T) was predicted to create a new donor splice site AGGT at position 1813 in exon 12. The use of this new donor site would lead to the substitution of leucine 605 by a threonine, the deletion of 11 amino acids (from alanine 606 to aspartate 616), a frameshift and the appearance of a premature termination 49 codons further on (Marduel et al. 2010). The variant c.621C>T (p.Gly207Gly) was found to be associated with altered splicing. The nucleotide change leading to p.Gly207Gly resulted in the generation of a new 3'-splice donor site in exon 4 of the LDL receptor gene. Splicing at this alternate splice site leads to an in-frame 75-base pair deletion in a stable mRNA of exon 4 and nonsense-mediated mRNA decay (Defesche et al. 2008). The silent mutation p.Arg406Arg, which also introduces a new splice site, causes a deletion of 31 bp in the LDLR mRNA sequence and introduces a premature termination 4 codons further on (Bourbon et al. 2007).
Tools for in silico prediction of protein function.
3.2. Frameshift mutations
Among the 1404 small DNA variations of the LDLR gene, a total of 305 (21.7%) are small deletions or insertions, including 261 (85.6%) independent mutations leading to a frameshift and 44 (14.4%) in-frame deletions or insertions. This proportion of in-frame small deletions or insertions is consistent with observations made for other disease-causing genes (Cooper, Antonarakis and Krawczak 1995). The frameshift mutations are due to either a small deletion (176/261; 12.5% of all small DNA variations) or an insertion/duplication (85/261; 6.0% of all small DNA variations) of a few nucleotides (from 1 to 49 for deletions, from 1 to 23 for insertions). The sequence context analysis provides evidence that a repeated motif flanking the frameshift event could be involved in the aetiology of the mutation in 48.0% of the deletional events and in 29.2% of the insertional events.
More than half of the frameshift mutations involve a single nucleotide: 58.5% (103/176) of deletions and 56.5% (48/85) of insertions. In half of the deletion cases and half of the insertion cases, the single nucleotide deletion/insertion occurs within a run of 2 to 7 identical bases. Runs of identical bases are known to cause deletions/insertions through the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005).
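To illustrate the sequence context discussed above, runs of identical bases can be located with a simple scan. This is a minimal sketch on a made-up fragment. It is not the sequence context analysis used for the database, and the function name is hypothetical.

```python
from itertools import groupby

def homopolymer_runs(seq: str, min_len: int = 2):
    """Return (start, base, length) for each run of at least min_len identical bases."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs

toy = "ACGGGTTACCCCCA"  # hypothetical fragment, not LDLR sequence
print(homopolymer_runs(toy))
```

A single nucleotide deletion or insertion falling inside one of the reported intervals would be compatible with slipped mispairing.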
Deletions involving larger sequences (from 2 to 49 bp) can be divided into three different types: (1) One of the repeated flanking sequences is included in the deletion, which is also explained by the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005); (2) The repeated sequences flanking the deletion are not included in the frameshift mutation, which is explained by homologous recombination between palindromic or symmetric repeated sequences (Cooper 1995); (3) Parts of the flanking repeated sequences are included in the deletion. To date, no molecular mechanism has been identified to explain such deletional events.
Insertions involving larger sequences (from 2 to 23 bp) can be explained by the same mechanisms as described for deletions, and can be divided into two different types: (1) The inserted sequence is a duplication; (2) The inserted sequence is new within the LDLR gene sequence. This latter observation raises the hypothesis that insertions very probably do not occur at random, but tend instead to create repeated sequences that were not present in the original gene sequence. A consensus sequence, GTAAGT, was frequently identified flanking small deletions or insertions (Ball et al. 2005). In the LDLR gene sequence, this consensus is present at the 3’ end of exon 4 at position c.681-687. Among the 96 insertions (in frame and frameshift) in the LDLR gene, 11 (11.5%) are at this position, pointing to a discrete hot spot for insertions, as observed in Figure 2 and in accordance with previous reports (Kotze et al. 1996).
3.3. Nonsense mutations
Nonsense mutations represent 10.4% (146/1404) of the small DNA variations in the LDLR gene, and 13.6% (146/1076) of the FH-causing substitutions.
Among the 860 codons of the LDLR gene sequence, 253 potential stop codons (codons that can be turned into a stop codon with only one substitution) were identified (29.4%) and were not equally distributed throughout the whole gene. In exons 2 to 8, more than 33% of the protein codons are potential stop codons, while less than 21% of the protein codons are potential stop codons in exons 9, 10, 13, 15 and 16. Among these 253 potential stop codons, 93 of them (36.8%) are affected by a mutational event.
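The definition of a potential stop codon given above translates directly into code. The sketch below assumes the standard genetic code (stop codons TAA, TAG and TGA) and uses a made-up coding sequence. The function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"

def is_potential_stop(codon: str) -> bool:
    """True if a single substitution turns this sense codon into a stop codon."""
    if codon in STOP_CODONS:
        return False
    for i, original in enumerate(codon):
        for base in BASES:
            if base != original and codon[:i] + base + codon[i + 1:] in STOP_CODONS:
                return True
    return False

def count_potential_stops(cds: str) -> int:
    """Count potential stop codons in an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    codons = (cds[i:i + 3].upper() for i in range(0, usable, 3))
    return sum(is_potential_stop(c) for c in codons)

all_sense = ["".join(c) for c in product(BASES, repeat=3) if "".join(c) not in STOP_CODONS]
potential = sorted(c for c in all_sense if is_potential_stop(c))
print(len(potential), "of", len(all_sense), "sense codons are potential stop codons")
```

Applied to the LDLR coding sequence, `count_potential_stops` would be expected to reproduce the per-exon counts discussed here, provided the same reference sequence is used.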
The number of mutations affecting potential stop codons is not always related to their frequency in each exon. Potential stop codons are more frequently affected by mutation in exons 3, 9, 10 and 14, with 57.1%, 50.0%, 46.2% and 53.3% respectively of potential stop codons in each exon carrying a mutational event. Conversely, in exons 1, 12, 13 and 17, 16.7%, 18.2%, 20.0% and 26.7% respectively of the potential stop codons are affected by a mutational event.
3.4. Splice site mutations
Among the 1404 small DNA variations of the LDLR gene, a total of 132 (9.4%) are splice site mutations and, among the 1076 single nucleotide FH-causing substitutions, 122 (11.3%) are intronic. From the analysis of a large number of genes, a mean proportion of 15% for splice site mutations among disease-causing DNA substitutions was evaluated (Krawczak et al. 2007). The expected frequency of splice site substitutions within the LDLR gene is 9% (Cooper and Krawczak 1990). The number of FH-causing splice site substitutions observed in this wide review of the literature (9.5%) is thus consistent with the expected value for the LDLR gene.
Among the 132 splice site mutations of the LDLR gene, 14 (10.6%) are mid-intronic mutations situated more than 10 bp from the intron/exon junctions. About half of the intronic mutational events in the LDLR gene (55.3%, 73/132) affect the two canonical, highly conserved dinucleotides AG and GT of the acceptor and donor splice sites, respectively. In accordance with the analysis of a large number of disease-causing mutations in different genes (Krawczak et al. 1992), intronic mutations within the LDLR gene affect a donor splice site (65.1%, 86/132) more frequently than an acceptor splice site (36.4%, 48/132).
4. Comparative analysis of mutations in the LDLR gene
To facilitate the mutational analysis of the LDLR gene and promote the analysis of the relationship between genotype and phenotype, in 1997 we created a software package along with a computerised database: UMD-LDLR. For each mutation, information is provided at several levels: at the gene level (exon and codon number, wild type and mutant codon, mutational event, mutation name), at the mRNA level (size, processing), at the protein level (wild type and mutant amino acid, affected domain, activity, mutation class), and at the personal level (ethnic background, age, sex, body mass index and familial history of coronary heart disease). The software package contains routines for the analysis of the LDLR database that were developed with the 4th Dimension® (4D) package from ACI. The use of the 4D database management system gives access to optimised multi-criteria search and sorting tools to select records from any field. Moreover, 13 routines were specifically developed (Varret et al. 1997, 1998, Villèger et al. 2002, Béroud et al. 2005,
The aim of this study was to analyse these four mutation groups at the molecular, biological and clinical level.
4.1. Analysis of LDLR mutations at the molecular level
4.1.1. Frequency of mutational events
DNA substitutions are of two types: transitions are interchanges of two-ring purines (A>G and G>A) or of one-ring pyrimidines (C>T and T>C) and, therefore, involve bases of similar shape; transversions are interchanges of purine and pyrimidine bases, which involve the exchange of one-ring and two-ring structures. Of the 12 possible single nucleotide substitutions, 4 are transitions and 8 are transversions, so there are twice as many possible transversions as transitions. However, among human disease-causing substitutions, transitions (63%) are observed more frequently than transversions (37%) (Cooper and Krawczak 1990).
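The distinction between the two types of substitution can be expressed as a small classifier. The sketch below, whose function name is hypothetical, also enumerates all possible single nucleotide changes to show the two-to-one ratio of transversions to transitions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(ref: str, alt: str) -> str:
    """Classify a single nucleotide substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[substitution_type(ref, alt)] += 1
print(counts)  # {'transition': 4, 'transversion': 8}
```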
Accordingly, in the LDLR gene, missense mutations due to transitions (55.9%, 459/821) are more frequent than substitutions due to transversions (42.5%, 349/821) (Figure 3). Like exonic mutational events, small DNA variations at the splice site are substitutions (92.4%, 122/132) or small deletions/insertions (9.1%, 12/132). Again, among the intronic substitutions, transitions (59.8%, 73/122) are observed more frequently than transversions (40.1%, 49/122) (Figure 3). Interestingly, in the LDLR gene, the ratio of transversion/transition is different for nonsense mutations. The transversions are the more frequent mutational event leading to a stop codon (52.7%, 77/146) compared to transitions (47.3%, 69/146) (Figure 3).
Because of the constraints imposed by the genetic code, the transitions A>G and T>C and the transversions A>C and G>C cannot give rise to a stop codon. Thus, only two transitional events (G>A and C>T) and 6 transversional events (A>T, C>A, C>G, G>T, T>A and T>G) lead to a stop codon, which means that half of the transitional events and a quarter of the transversional events are not involved in nonsense mutations. These constraints can explain the observed difference in the ratio of transversion/transition between missense and nonsense mutations.
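The constraint imposed by the genetic code can be checked by brute force. The sketch below assumes the standard stop codons (TAA, TAG and TGA) and lists which of the 12 substitution types can, and which cannot, convert a sense codon into a stop codon. The function name is hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"

def stop_creating_substitutions() -> set:
    """Return every 'ref>alt' change that converts some sense codon into a stop codon."""
    found = set()
    for codon in ("".join(c) for c in product(BASES, repeat=3)):
        if codon in STOP_CODONS:
            continue  # stop-to-stop changes are not nonsense mutations
        for i, ref in enumerate(codon):
            for alt in BASES:
                if alt != ref and codon[:i] + alt + codon[i + 1:] in STOP_CODONS:
                    found.add(f"{ref}>{alt}")
    return found

creating = stop_creating_substitutions()
all_changes = {f"{r}>{a}" for r in BASES for a in BASES if r != a}
print("can create a stop:", sorted(creating))
print("cannot create a stop:", sorted(all_changes - creating))
```

Since no stop codon contains a cytosine, any substitution that introduces a C is excluded, which accounts for three of the four changes that cannot create a stop.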
However, the ratio of transversion/transition is consistent with the one observed for human diseases-causing substitutions (Cooper and Krawczak 1990) when the three groups of mutations are taken together (missense, nonsense and splice). Altogether, transitions (55.9%, 601/1076) are observed more frequently than transversions (44.1%, 475/1076).
4.1.2. Distribution of the substitutions in the 18 exons of the LDLR gene
The expected number of mutations in each exon is estimated by the ‘Stat exons’ tool of the UMD software according to the size and the composition (mutability of each codon) of each exon (Béroud et al. 2000 and 2005). This analysis enables the detection of a statistically significant difference between observed and expected mutations.
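The statistical procedure used by the ‘Stat exons’ tool is not detailed here. Purely as an illustration of how an excess over an expected proportion might be assessed, the sketch below uses a one-sided exact binomial test. This choice of test is an assumption, and the count of 168 is a hypothetical value (about 20.5% of 821 mutations), not a figure taken from the database.

```python
import math

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability P(X = k) for n trials, 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binom_upper_tail(k: int, n: int, p: float) -> float:
    """One-sided p-value P(X >= k): probability of an excess at least this large."""
    tail = sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, tail)

# Hypothetical exon: 14.9% of mutations expected, 168 of 821 observed
p_value = binom_upper_tail(168, 821, 0.149)
print(f"p = {p_value:.2e}")
```

A deficit would be tested symmetrically with the lower tail. With small counts, as for nonsense mutations, the same relative deviation yields a much larger p-value.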
For exons 1, 5, 8 and 10 to 14, all types of substitutions are distributed as expected. There is a significant excess of all substitutions (missense and nonsense) within exons 3 and 4 (Table 1), indicating discrete mutational hot-spots and underlining the essential role played by the encoded domains in protein function. Exon 3 encodes the second LR motif of the ligand binding domain in the LDL receptor. To date, there are no data revealing a more essential function of this LR motif when compared with the six others. Exon 4 encodes the three central LR motifs (LR3, LR4 and LR5) of the ligand binding domain in the LDL receptor. The LR5 motif has been shown to be the only one of the seven LR motifs able to bind both ligands of the receptor, apo B and apo E, while the six other motifs only bind apo B (Russell et al. 1989). Thus, the mutations affecting this motif are associated with a more severe alteration of lipoprotein catabolism and, therefore, are more likely to be selected by FH definition criteria. There is a significant deficit of all substitutions (missense and nonsense) within exons 15 and 16 (Table 1), indicating discrete mutational cold-spots. Exon 15 encodes the O-linked sugar domain of the LDL receptor, which has been shown to have no significant functional activity (Davis et al. 1986). To date, there is no explanation for the observed deficit of substitutions within exon 16, which encodes the membrane-anchoring domain that is essential to the attachment of the receptor to the cell membrane.
The two types of exonic substitutions (missense and nonsense) are differently distributed in exons 2, 6, 7, 9, 17 and 18 of the LDLR gene (Table 1). Only missense mutations present a significant excess in exons 6 and 9 and a significant deficit in exons 17 and 18 (Table 1), which may reflect a bias in this analysis due to the different number of mutations of each type: because nonsense mutations are less numerous than missense mutations, a significant difference is less likely to be reached for nonsenses than for missenses. Nevertheless, these observations indicate discrete mutational hot-spots within exons 6 and 9 and discrete mutational cold-spots within exons 17 and 18. Exon 6 encodes the last LR motif of the ligand binding domain in the LDL receptor. To date, there are no data revealing a more essential function of this LR motif when compared with the six others. Exon 9 encodes the NH2-terminal part of the EGF-like domain, which is rich in YWTD repeats that are essential for the correct folding of the receptor at the cell surface. To date, there is no explanation for the observed deficit of substitutions within exons 17 and 18, which encode the COOH-terminal part of the membrane-anchoring domain and the cytoplasmic tail, essential for the attachment of the receptor to the cell membrane and for the endocytosis of the protein.
In exon 2, we observed a significant deficit of missenses and a significant excess of nonsenses (Table 1). Exon 2 encodes the first LR motif of the ligand binding domain in the LDL receptor. To date, there is no data revealing a more or less essential function of this LR motif when compared with the six others.
Interestingly, nonsense mutations are the only ones that present a significant excess in exon 7 of the LDLR gene (Table 1). This excess relies upon the high frequency of the c.1048C>T, p.Arg350X mutation, formerly called FH-Fossum. Indeed, this mutation is reported in 9 apparently unrelated patients from different geographic origins: Norway (Solberg et al. 1994), the Netherlands (Lombardi et al. 1995), the U.K. (Day et al. 1997), Poland (Gorski et al. 1998), Germany (Thiart et al. 1998), Canada (Gaudet et al. 1999), Japan (Yu et al. 2002), Denmark (Damgaard et al. 2005) and Spain (Brusgaard et al. 2006). In the absence of haplotypes demonstrating a common ancestor, these mutational events are supposed to be recurrent and to correspond to a mutational hot-spot in the LDLR gene.
|Exon||Expected mutations (%)||Observed missenses (%)||p||Observed nonsenses (%)||p||Observed exonic substitutions (%)||p|
|2||5.0||2.5||< 0.01||11.6||< 0.001||3.9||ns|
|3||4.8||6.4||< 0.05||6.8||< 0.05||6.5||< 0.02|
|4||14.9||20.5||< 0.001||20.5||< 0.001||20.5||< 0.001|
|6||4.9||7.0||< 0.01||5.5||ns||6.8||< 0.01|
|9||6.6||11.2||< 0.001||4.1||ns||10.1||< 0.001|
|15||6.4||1.6||< 0.001||2.1||< 0.05||1.7||< 0.001|
|16||2.8||1.5||< 0.05||0.0||< 0.05||1.3||< 0.01|
|17||6.1||2.3||< 0.001||4.8||ns||2.7||< 0.001|
|18||1.3||0.1||< 0.01||0.0||ns||0.1||< 0.001|
4.2. Analysis of LDLR mutations at the biological level
4.2.1. Functional classes of LDLR gene mutations
Mutations in the LDLR gene have been classified into 5 functional groups based on the characteristics of the mutant protein produced and analysed in patients’ fibroblasts (Hobbs et al. 1992):
Class 1 mutations disrupt the synthesis of the LDL receptor and no precursor is produced (null alleles).
Class 2 mutations block transport to the Golgi apparatus: mutations are reported in class 2A when a complete defect in transport to the cell membrane is observed and in class 2B when receptors are transported at a detectable - but markedly reduced - rate.
Class 3 mutations produce proteins that reach the membrane but fail to bind the LDL.
Class 4 mutations produce a receptor that binds the lipoprotein but which cannot be internalised. The mutations affecting the cytoplasmic domain alone are classed 4A, while those also affecting the membrane-spanning region are classed 4B.
Class 5 mutations block the acid-dependent dissociation of the receptor and the ligand in the endosome, an essential event for receptor recycling.
The link between the functional class type of the mutation and the severity of the disease has been established, and patients carrying a class 1 mutation are more severely affected than those with a mutation from another functional group (Hobbs et al 1992). In the UMD-LDLR database, among the 288 single nucleotide mutations with available data concerning the functional group, 42.0% (121/288) are class 2B, 31.9% (92/288) are class 1, 13.5% (39/288) are class 5, 7.6% (22/288) are class 2A, 3.8% (11/288) are class 4A and 1.0% (3/288) are class 3. Class 1 mutations are mainly nonsense and frameshift mutations (66.3% nonsenses, 30.4% frameshifts and 3.3% missenses) and 62% of them are localised in exons 2 to 6, encoding the ligand binding domain for one half and in exons 7 to 14 encoding the EGF-like domain for the other half (Figure 4). Class 2B mutations are mainly missense mutations (92.6% missenses and 7.4% frameshifts) and 71% of them are localised in exons 2 to 6, encoding the ligand binding domain (Figure 4). Class 5 mutations are mainly missense mutations (95% missenses and 5% splice site mutations) and 95% of them are localised in exons 7 to 14, encoding the EGF-like domain (Figure 4). Class 2A, 3 and 4A mutations are mainly missense mutations (59% missenses, 22% nonsenses and 19% frameshifts) and 67% of them are localised in exons 7 to 14, encoding the EGF-like domain. As expected, the localisation of these different classes of mutations is consistent with the functional definition of each class. The higher prevalence of mutations at the origin of truncated proteins (nonsenses and frameshifts) within the class 1 functional group is consistent with the expected null allele effect of these kinds of mutations. Altogether, these observations are globally in agreement with the admitted dogma according to which mutations leading to a protein of abnormal size (nonsense, frameshift and splice) are at the origin of a more severe phenotype than missense mutations.
4.2.2. LDL receptor activity
In the UMD-LDLR database, the LDL receptor activity measured in patients’ fibroblasts is available for 91 single nucleotide mutations: assays were performed for 24 heterozygote carriers, 22 homozygote carriers and 45 compound heterozygotes. For homozygote carriers of a missense mutation, the mean LDL receptor activity is 8.7%, compared with 2.7% for carriers of a mutation leading to a protein of abnormal size (nonsense, frameshift and splice) (Figure 5). For heterozygote carriers of a missense mutation, the mean LDL receptor activity is 33.2%, compared with 19.8% for carriers of an abnormal-protein mutation. Moreover, a gradient can be drawn for compound heterozygotes, with a mean LDL receptor activity of 13.3%, 7.3% and 3.6% for carriers of two missense mutations, one missense and one abnormal-protein mutation, and two abnormal-protein mutations, respectively (Figure 5). Once again, these observations are globally in agreement with a more severe phenotype for mutations leading to a protein of abnormal size when compared with missense mutations. However, missense mutations in the LDLR gene are associated with a larger spectrum of LDL receptor activity in fibroblasts (from 2% to 67% for heterozygotes and from 2% to 22.5% for homozygotes) when compared with mutations leading to a protein of abnormal size (from 2% to 47% for heterozygotes and from 2% to 11% for homozygotes).
4.3. Analysis of LDLR mutations at the biochemical/clinical level
4.3.1. Plasma lipid levels among LDLR gene mutation carriers
Among the 1061 unique events included in the UMD-LDLR database, lipid values are available for only 307 of them (29%), corresponding to 25 homozygote carriers and 282 heterozygote carriers of different molecular events within the LDLR gene (Table 2). In accordance with the biochemical definition of familial hypercholesterolemia, triglyceride and HDL-cholesterol levels were within the normal range while the total- and LDL-cholesterol levels were elevated. As expected for a co-dominant disease, the total- and LDL-cholesterol levels were higher for homozygote mutation carriers than for heterozygotes. No differences were observed between the four groups of mutations (missenses, frameshifts, splice sites and nonsenses), suggesting a similar effect of missense mutations and of mutations leading to a protein of abnormal size (nonsense, frameshift and splice) on the biochemical expression of the disease. Furthermore, no differences were observed in the distribution of total- and LDL-cholesterol levels among the four groups of mutations (Figure 6).
|Mean (SD)||1.31 (0.51)||7.50 (2.38)||9.50 (2.18)||1.66 (0.94)|
|Mean (SD)||1.21 (0.34)||7.84 (2.05)||9.89 (2.22)||1.39 (0.89)|
|Mean (SD)||1.28 (0.41)||7.17 (2.08)||9.56 (2.20)||1.49 (0.54)|
|Mean (SD)||1.17 (0.40)||7.74 (1.64)||9.43 (1.53)||1.46 (0.73)|
|Mean (SD)||1.04 (0.41)||15.55 (4.96)||17.39 (4.49)||1.42 (0.72)|
|Mean (SD)||0.66 (0.21)||16.01 (1.17)||17.43 (0.93)||1.23 (0.04)|
|Mean (SD)||0.67 (0.16)||15.25 (1.79)||18.06 (4.74)||1.34 (0.17)|
|Mean (SD)||0.87 (0.52)||17.54 (0.37)||19.56 (0.76)||2.00 (1.27)|
4.3.2. Clinical expression of familial hypercholesterolemia among LDLR gene mutation carriers
Of the 1061 unique events reported in the UMD-LDLR database, clinical data is available for only 230 of them (22%) including 25 homozygote carriers and 215 heterozygote carriers of different molecular events within the LDLR gene (Table 3). This clinical data concerns tendinous cholesterol deposits - such as xanthomas - and the diagnosis of premature coronary artery disease (CAD). Tendinous xanthomas are more frequently observed for the carriers of a mutation leading to a protein of abnormal size rather than for the heterozygotes for a missense mutation (Table 3). Once more, this observation is in agreement with the admitted dogma according to which mutations leading to a protein of abnormal size (nonsense, frameshift and splice) are at the origin of a more severe phenotype than are missense mutations. However, no differences were observed for the occurrence of CAD between missenses and those mutations leading to a protein of abnormal size (Table 3). This latter observation suggests a similar effect with regard to missense and mutation leading to a protein of abnormal size (nonsense, frameshift and splice) in the clinical expression of the disease.
|Missenses||Frameshifts, Splice sites, Nonsenses|
|Sex ratio (M/F)||1.06 (83/78)||1.09 (60/55)|
|Age (mean years ± SD)||39.6 ± 17.5||36.8 ± 14.9|
|N||Yes (%)||No (%)||N||Yes (%)||No (%)|
To date, it seems logical that mutations leading to a protein of abnormal size (nonsense, frameshift and splice) are at the origin of a more severe phenotype than missense mutations. The genotype/phenotype correlations performed with the UMD-LDLR database provide molecular, biological and clinical evidence that underlies this dogma. Moreover, missense mutations in the LDLR gene are the source of a wider spectrum in the severity of FH, than are mutations leading to a protein of abnormal size, from an almost normal phenotype to very severe forms of the disease.
Mutations in the LDLR gene are numerous and frequently recurrent but, conversely, rarely sporadic. These observations reveal not only the high mutability of this gene at one time, but also that these mutations were probably selected over time. It can be postulated that a hypercholesterolemic mutation could have given a selective advantage to carriers and may belong to the pool of alleles that constitute the “thrifty genotype” (Neel et al. 1998). The thrifty genotype hypothesis suggests that, in the early years of life, the hypercholesterolemic genotype was thrifty in the sense of being exceptionally efficient in the utilisation of food. It would thereby confer a survival advantage during times of food shortage. However, in contemporary societies, as food is usually available in unlimited amounts, the thrifty genotype no longer provides a survival advantage but instead renders its carriers more susceptible to hypercholesterolemia.