Predisposition HLA haplotypes in T1D (marker IDDM1 of 6p21.3 locus).
Type 1 diabetes (T1D) is considered a complex genetic trait: not only do multiple genetic loci contribute to susceptibility, but environmental factors also play a major role in determining risk. A large body of evidence indicates that inherited genetic factors influence both susceptibility and resistance to the disease. There is significant familial clustering of T1D, with an average prevalence risk of 7% in siblings and 6% in the child of a diabetic parent, compared to 0.4% in the general population. The degree of clustering of disease in families can be estimated from the ratio of the risk for siblings of a patient with the disease to the population prevalence of that disease. If this ratio (λs) is close to 1, there is no evidence for familial clustering. For T1D, the degree of familial clustering λs is about 15 (6%/0.4%), which means that if a family has a child with T1D, then the other siblings are at a 15 times greater risk of developing diabetes than a member of the general population.
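The ratio described above is a simple division, and a minimal sketch makes the arithmetic explicit. The function name is hypothetical, and the inputs are the 6% and 0.4% figures used in the worked ratio in the text.

```python
def sibling_recurrence_risk_ratio(sibling_risk: float,
                                  population_prevalence: float) -> float:
    """Return lambda_s: risk in siblings of a patient divided by
    the prevalence of the disease in the general population."""
    if population_prevalence <= 0:
        raise ValueError("population prevalence must be positive")
    return sibling_risk / population_prevalence


# Figures from the text: 6% risk versus 0.4% population prevalence.
lambda_s = sibling_recurrence_risk_ratio(0.06, 0.004)
print(f"lambda_s is approximately {lambda_s:.0f}")

# A ratio close to 1 would indicate no familial clustering.
no_clustering = sibling_recurrence_risk_ratio(0.004, 0.004)
```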
Genetic susceptibility in family members is clearly dependent on the degree of genetic identity with the proband, and in fact the risk of T1D in families has a non-linear correlation with the number of alleles shared with the proband; the highest risk is observed in monozygotic twins (100% sharing), followed by first and second degree relatives (50% and 25% sharing, respectively). The study of monozygotic twins provides a good opportunity to examine the expression of a human trait in a fixed genetic background: the absolute risk of a monozygotic twin of an affected individual gives a direct estimate of the penetrance of this trait for a given environment. Although an increased concordance rate for T1D has been found in monozygotic twins compared to dizygotic ones, it only reaches 40-50% with long-term follow-up.
The genetics of T1D has a long history of studies evaluating candidate genes for association with disease status in either case-control or family-based studies. Four chromosomal regions have emerged with consistent and significant evidence of association with T1D across multiple studies. These are the major locus, the human leukocyte antigen (HLA) region on chromosome 6p21; the protein tyrosine phosphatase non-receptor 22 (PTPN22) gene on chromosome 1p13; the insulin (INS) gene region on chromosome 11p15; and the high-affinity alpha chain of the IL-2 receptor (IL2RA) gene on chromosome 10p15.
2. Strategies for identifying disease predisposing genes
Until very recently, three basic approaches have been used to identify genetic variants that may contribute to any human phenotype, including autoimmune disorders (Gregersen & Olsson, 2009). These approaches are a) candidate gene association studies, b) linkage analysis in multiplex families, and c) genome-wide association studies/scans (GWAS).
Candidate gene studies have been a mainstay for human genetic studies for several decades, and they will continue to play an important role. However, early candidate gene case-control studies often suffered from insufficient statistical power owing to inadequate sample sizes and to a lack of appreciation for the importance of careful matching of cases and controls. The strong publication bias for initial positive findings has been clearly documented (Ioannidis et al., 2001), and reports of candidate gene associations should be viewed with caution until multiple replications have been carried out. Candidate gene studies are usually done to address a plausible hypothesis, but plausibility should not mitigate the requirement for robust and reproducible statistical evidence.
Genetic linkage analysis depends on the cosegregation of chromosomal regions with a phenotypic trait within families, as is typical for highly penetrant Mendelian disorders. For most common autoimmune diseases, familial aggregation is rather modest, and therefore linkage analysis has quite low statistical power to detect chromosomal regions with shared genetic risk within families. Nevertheless, linkage approaches have occasionally contributed significantly to the identification of new risk genes.
In the last four years, GWAS have dominated efforts in gene mapping for autoimmune diseases, and these studies have led to the majority of the new genetic associations discussed in this review. Like candidate gene studies, the analysis is based on association, but in the case of GWAS, no particular hypothesis is being addressed. Rather, hundreds of thousands of hypotheses are being addressed simultaneously, without regard to biologic plausibility. This is a purely discovery-driven approach to gene identification, free of the limitations imposed by a priori assumptions about which genes and pathways are likely to be involved in the disease under study. Despite early skepticism, GWAS has proved to be a remarkably effective method of gene discovery.
Although the initial sequence of the human genome was an important first step, it is really the International HapMap Project that has provided the basis for a rational approach to GWAS (Frazer et al., 2007). When considering single nucleotide polymorphisms (SNPs), any two unrelated individuals in the population differ by approximately 0.1% across the 3.2 billion base pairs of the genome, or approximately 3 million SNPs. By studying 90 individuals in families from three major racial groups (Caucasian, Asian, and African), the HapMap Project has cataloged the majority of the common SNPs (e.g., SNPs with minor allele frequencies of 5% or greater) in these populations.
An important result of the HapMap Project has been the realization that to define most of the common variations among individuals, it is not necessary to genotype all 3 million SNP differences among them, but only a subset of these, on the order of 300,000 to 500,000 SNPs. This is because SNP alleles are distributed non-randomly among individuals, forming blocks of linkage disequilibrium (LD) that may extend from thousands to many hundreds of thousands of base pairs. This results in a kind of bar code that can be used to define the common genetic variation across the genome of a given individual.
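The non-random distribution of SNP alleles can be quantified with standard pairwise LD statistics. The following is a minimal sketch, not taken from the source, that computes D, D' and r² for two biallelic SNPs from haplotype and allele frequencies; the function name and the example frequencies are illustrative assumptions.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Pairwise LD for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, D', r^2).
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b  # deviation from independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Illustrative frequencies (assumed): allele A always travels with allele B.
_, d_prime_full, r2_full = linkage_disequilibrium(p_ab=0.3, p_a=0.3, p_b=0.3)

# Illustrative frequencies (assumed): the two SNPs are independent.
_, d_prime_none, r2_none = linkage_disequilibrium(p_ab=0.09, p_a=0.3, p_b=0.3)
```

An r² close to 1 means that genotyping one SNP recovers nearly all the information about the other, which is why a subset of SNPs can stand in for the rest of a block.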
This pattern of common variation across genomes has led to the concept of tagging SNPs (Chapman et al., 2003). This involves the use of a single SNP to tag a block of LD formed by many other SNPs, thus allowing for the interrogation of a large section of the genome with a single marker. A large block of LD may encompass several genes. Thus, many SNPs in such a region are likely to provide evidence of association for diseases that are associated with a given marker. Conversely, any association observed with such a SNP may be due to causal variants in any of the genes in this LD block. Additional functional and biological evidence that a particular gene is involved in autoimmunity is necessary to prove that it is the relevant risk gene in the region, or that the marker used to first detect the association is causative. Once a SNP association is observed and confirmed, much work remains to be done to establish which genetic variants in the region are actually responsible (i.e., causative) for the association.
Furthermore, because many GWAS employ 500,000 or more SNPs across the genome, each addressing a separate hypothesis, the statistical significance levels must be adjusted for multiple testing. An overall
A major consideration for GWAS, as well as any case-control association study, is the issue of proper matching of cases and controls. The availability of genome-wide SNP data across many different populations has now permitted the use of so-called ancestry informative markers (AIMs) to match cases and controls more precisely for their ethnic background. This is quite straightforward for major racial groups, such as Asian, Caucasian, and African, but is more challenging within these groups (Seldin & Price, 2008). Polymorphisms in European Caucasian populations nicely illustrate this problem. A certain SNP can display a wide range of allele frequencies in the normal population, for example increasing in frequency from southern to northern Europe. Therefore, if the cases and controls are taken from different European subpopulations, there is considerable risk of false positive (or negative) results. This phenomenon is generally referred to as population stratification. In the experience of many investigators, self-report by study participants is an unreliable indicator of ancestry, but in the context of GWAS, it is possible to correct for unknown population stratification using the entire set of SNP markers.
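A small numerical sketch can show how stratification produces a false positive. All counts below are invented for illustration: within each of two hypothetical subpopulations the allele is unrelated to disease (odds ratio 1), yet pooling them yields an apparent association because allele frequency and case/control sampling both differ between the groups. The Mantel-Haenszel estimator used here assumes the strata are known, which is a simplification of the genome-wide correction mentioned in the text.

```python
from typing import NamedTuple


class AlleleCounts(NamedTuple):
    case_allele: int      # case chromosomes carrying the allele
    case_other: int       # case chromosomes not carrying it
    control_allele: int
    control_other: int


def odds_ratio(t: AlleleCounts) -> float:
    return (t.case_allele * t.control_other) / (t.case_other * t.control_allele)


def pool(tables) -> AlleleCounts:
    """Ignore the strata and add the counts column by column."""
    return AlleleCounts(*(sum(column) for column in zip(*tables)))


def mantel_haenszel_or(tables) -> float:
    """Stratum-adjusted odds ratio (Mantel-Haenszel estimator)."""
    num = sum(t.case_allele * t.control_other / sum(t) for t in tables)
    den = sum(t.case_other * t.control_allele / sum(t) for t in tables)
    return num / den


# Hypothetical data: allele frequency 0.6 in the north, 0.2 in the south,
# with cases oversampled in the north and controls in the south.
north = AlleleCounts(600, 400, 120, 80)
south = AlleleCounts(40, 160, 200, 800)

or_north = odds_ratio(north)
or_south = odds_ratio(south)
or_pooled = odds_ratio(pool([north, south]))
or_adjusted = mantel_haenszel_or([north, south])
print(or_north, or_south, round(or_pooled, 2), or_adjusted)
```

The pooled odds ratio is well above 1 even though neither subpopulation shows any association, while the stratum-adjusted estimate returns to 1.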
Most of the associations with autoimmunity involve the detection of odds ratios between 1 and 2, with many associations on the lower end of this range. The sample sizes required to generate statistical significance in the setting of GWAS (p < 5 × 10⁻⁷) can be very large, depending on the allele frequency in the population and the odds ratios to be detected. For risk ratios on the order of 2 or more, sample sizes of 1000 are generally adequate. However, for risk ratios in the range of 1.2-1.3, even sample sizes of three or four thousand may have low statistical power depending on marker allele frequency (Iles, 2008). A population sample of this magnitude is now considered a minimum for a thorough analysis in the setting of a GWAS, and truly comprehensive genetic studies will require considerably larger sample sizes in the future.
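To give a sense of why genome-wide significance thresholds are so stringent, the sketch below applies a Bonferroni correction, which is one common way to adjust for multiple testing and is used here as an assumption rather than as the method behind the threshold quoted in the text. The family-wise error rate of 0.05 is likewise an assumed conventional value; the 500,000 tests correspond to the SNP count mentioned earlier.

```python
def bonferroni_threshold(family_wise_alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the overall
    false-positive rate at or below family_wise_alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_wise_alpha / n_tests


per_snp_alpha = bonferroni_threshold(0.05, 500_000)
print(f"per-SNP threshold: {per_snp_alpha:.1e}")
```

The result is of the same order of magnitude as the GWAS significance level cited above.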
3. HLA genes
The best evidence for a genetic component in the susceptibility to T1D comes from studies of the HLA genes in both populations and families as well as from animal models. It has been estimated that HLA provides up to 40-50% of the familial clustering of T1D. The contribution of the HLA region is easily detectable in genome-wide linkage analyses, as indicated by a LOD score of 65.8.
The HLA (human leukocyte antigen) region is a cluster of genes located within the major histocompatibility complex (MHC) on chromosome 6p21.3 (Mungall et al., 2003). The MHC is a region of 4 million base pairs, which is about 0.1% of the human genome. It contains over 100 genes, which are characterized by a high degree of allelic polymorphism. The human MHC is divided into three regions: class I, class II, and class III. HLA class I genes are located at the telomeric end of the complex. Class I molecules consist of a single polypeptide chain containing 3 domains, associated with β2-microglobulin. The classical class I molecules are HLA-A, -B and -C. They are expressed on the surface of almost all tissue cells and their function is to present peptide antigens to CD8+ cytotoxic T-cells. HLA class II genes are located at the centromeric end of the MHC and occupy a region of 1 million base pairs. Class II molecules are heterodimeric proteins consisting of a heavy α-chain and a lighter β-chain. They are expressed on antigen-presenting cells (monocytes, macrophages, B lymphocytes, dendritic cells and activated T lymphocytes). The HLA class II molecules are HLA-DR, -DQ and -DP, and their major function is to present peptide antigens to CD4+ T-cells. The MHC class III region contains many genes with varying functions. The most studied genes are tumour necrosis factor and lymphotoxin (TNFA and TNFB).
The state of immune tolerance to self antigens is maintained by a complex network of T and B lymphocytes and their regulatory products. The mechanisms by which this discrimination between ‘self’ and ‘non-self’ is established involve central thymic tolerance and peripheral or post-thymic tolerance (Roitt, Brostoff, Male, 2002). Central thymic tolerance to self antigens (autoantigens) results from deletion of differentiating T-cells that express antigen-specific receptors with high binding affinity for the HLA-peptide complex, where the peptide antigen belongs to intrathymic self peptide antigens (negative selection). Low-affinity self-reactive T-cells, and T-cells with receptors specific for HLA-peptide complexes whose peptide antigens are not represented intrathymically, mature and join the peripheral T-cell pool (positive selection). Post-thymic tolerance to self antigens has five main mechanisms. Self-reactive T-cells in the circulation may ignore self antigens, for example when the antigens are in tissues sequestered from the circulation. Their response to a self antigen may be suppressed if the antigen is present in a privileged site. Self-reactive cells may under certain conditions be deleted or rendered anergic and unable to respond. Finally, a state of tolerance to self antigens can also be maintained by immune regulation.
B-cell deletion takes place in both bone marrow and peripheral lymphoid organs. Differentiating B-cells that express surface immunoglobulin receptors with high binding affinity for self-membrane-bound antigens will be deleted soon after their generation in the bone marrow. A high proportion of short-lived, low-avidity, autoreactive B-cells appear in peripheral lymphoid organs. These cells may be recruited to fight against infection.
In this view of the HLA molecules, the discovery of the function of HLA-DO, which is mainly expressed in B-cells and is involved in antigenic peptide binding to HLA class II molecules in the endocytic pathway, has prompted an interesting hypothesis.
The pathway resulting in the antigen processing and presentation of peptides on HLA class II begins with the biosynthesis of the HLA class II α and β chains in the endoplasmic reticulum, where they form a trimeric complex with a co-expressed protein called the invariant chain. The invariant chain, which fills the peptide-binding groove of the HLA class II molecule, directs the HLA class II molecule first to the Golgi and then to a specialized MHC class II compartment in the endosomal/lysosomal pathway. Here, the invariant chain is proteolytically degraded in a stepwise fashion until only a fragment remains bound to the HLA class II molecule, called the class II-associated invariant chain peptide (CLIP). The exchange of CLIP for the stably binding antigenic peptides in the compartment is catalyzed by a specialized lysosomal chaperone called HLA-DM. The HLA class II peptide complexes are then transported to the cell surface for recognition by CD4+ T lymphocytes.
The HLA-DO molecule, expressed in B-cells, associates with DM, thereby markedly affecting DM function. DO forms a unique, cell type-specific modulator that controls the immune response induced by B-cells (van Ham et al., 2000). First, DO reduces class II-mediated antigen presentation as a whole, leading to a generalized diminished CD4+ response. Second, DO regulates the composition of the class II-mediated peptide repertoire that will determine the specificity of the B cell-induced immune response. Both modulatory actions of DO may increase the threshold for nonspecific CD4+ activation and may prevent autoimmune responses, both of which could be obvious evolutionary selection criteria. In this respect, it is interesting to note that DO-deficient mice have a normally functioning immune system, but show higher levels of serum immunoglobulins. This may indicate that the absence of DO leads to a more generalized immune activation.
An association between HLA and autoimmune diabetes mellitus has been recognized for more than 20 years. The initial associations were found with the HLA class I serologically defined alleles B8 and B15. When the HLA-DR locus was defined, it was found that T1D was associated more closely with the DR4 and DR3 serotypes than with the linked B8 and B15. Later, it was determined that DR4 was a haplotype consisting of a family of distinct HLA-DR and -DQ molecules. This led to the conclusion that the DQB1 locus of HLA-DQ is highly associated with T1D. Recent technologies, which have allowed for genome-wide searches for linkage to T1D in humans, have verified that HLA is a gene locus involved in T1D and, in fact, that HLA is the major locus. The strongest association within the HLA gene complex is with combinations of DQA1 and DQB1.
Many studies have verified that DQB1*0302 is a strong susceptibility gene and that the heterozygous combination of DQA1*0301-DQB1*0302 on the HLA-DR4 haplotype and DQA1*0501-DQB1*0201 on the HLA-DR3 haplotype results in a synergistically increased risk of T1D (Table 1). DQA1*0301-DQB1*0302 is the most prevalent risk-conferring haplotype in Caucasians, detected in 74% of type 1 diabetes patients, followed by DQA1*0501-DQB1*0201, detected in 52% of Caucasian patients (Sanjeevi et al., 1995).
|DQB1*0201 - DQA1*0501 - DRB1*0301|
|DQB1*0302 - DQA1*0301 - DRB1*0401|
Also, there is agreement that the DQB1*0602 allele, linked with DR2, is a strong protective gene (Kockum et al., 1993). The protective effect of DQB1*0602 is dominant to the susceptibility effects of DQB1*0302-DQA1*0301 and DQB1*0201-DQA1*0501 (Pugliese et al., 1995). A single copy of the DQB1*0602 allele is adequate to confer a significant negative association.
HLA-DQ genes are of primary importance, but HLA-DR genes modify the risk conferred by HLA-DQ (Sanjeevi et al., 1996). The risk associated with an HLA genotype is defined by the particular combination of susceptible and protective alleles. The frequencies of the predisposing alleles, DQB1*0302 and DQB1*0201, are usually increased in patients, while the frequency of the protective DQB1*0602 allele is usually decreased.
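The qualitative rules stated above (dominant protection by DQB1*0602, synergistic risk in the DQB1*0302/DQB1*0201 heterozygote) can be written as a small lookup. This is only an illustrative sketch: the function name and the category labels are hypothetical, it considers DQB1 alone, and it ignores the modifying effect of HLA-DR and any quantitative risk estimate.

```python
# Allele sets taken from the text; category labels are hypothetical.
SUSCEPTIBLE_DQB1 = {"DQB1*0302", "DQB1*0201"}
PROTECTIVE_DQB1 = {"DQB1*0602"}


def dqb1_risk_category(allele_1: str, allele_2: str) -> str:
    """Qualitative T1D risk category for a DQB1 genotype."""
    genotype = (allele_1, allele_2)
    # A single copy of the protective allele dominates.
    if any(a in PROTECTIVE_DQB1 for a in genotype):
        return "protected"
    # Heterozygous combination of the two susceptible alleles.
    if set(genotype) == SUSCEPTIBLE_DQB1:
        return "highest"
    if any(a in SUSCEPTIBLE_DQB1 for a in genotype):
        return "increased"
    return "neutral"


print(dqb1_risk_category("DQB1*0302", "DQB1*0201"))  # susceptible heterozygote
print(dqb1_risk_category("DQB1*0302", "DQB1*0602"))  # protective allele present
```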
The categorization of T1D as an autoimmune disease is unambiguous, given the presence of the predisposing HLA class II haplotypes, DRB1*04-DQB1*0302 and DRB1*03, in at least 90% of cases and the presence of autoantibodies and, more recently, autoreactive, anti-islet antigen-specific T-cells in the circulation of prediabetic individuals and of newly diagnosed and established cases (Todd, 2010). Transgenic mouse modeling (Nakayama et al., 2005, Mellanby et al., 2008) has provided direct support that it is the peptide-binding activity of the HLA class II molecules in antigen-presenting cells (APCs) for T lymphocyte peptide recognition that is the main mechanism of action of the DR and DQ molecules in T1D etiology. This is by far the major determinant of disease in the genome (Cucca et al., 2001, Zhang et al., 2008). In humans (and mice), susceptibility and resistance to T1D have been mapped to particular polymorphic peptide-binding pockets of the DQ molecule, pocket 9, and of DR, pockets 1 and 4 (and their mouse orthologs, IA and IE, respectively) (Cucca et al., 2001, Suri et al., 2005). It remains a goal to identify molecules that modulate the function of these pockets and could be therapeutic.
Class II molecules in APCs bind peptides from the currently identified autoantigens, preproinsulin (PPI), insulinoma-associated antigen 2 (IA-2), glutamic acid decarboxylase (GAD), and zinc transporter (ZnT8), and present these to CD4+ T-cell antigen receptors (TCRs) in the thymus and in the periphery, for example, in pancreatic lymph nodes and within the islets themselves (von Herrath, 2009). CD4+ T-cells provide help to CD8+ cytotoxic T-cells, which are widely accepted as the most important killers of human islet beta cells in T1D autoimmunity (Skowera et al., 2008, Willcox et al., 2009).
Recent fine mapping of the extended MHC region, 8 Mb of chromosome 6p21, which was made possible by the development of high-throughput, dense, and genome-wide SNP genotyping (GWAS), has confirmed that the major susceptibility, and resistance, to T1D does indeed map to the HLA class II region of the MHC (Nejentsev et al., 2007, Howson et al., 2009).
This long-awaited comprehensive genetic mapping also showed that T1D susceptibility is modified, however, by lesser but still important effects of the specific allotypes of the equally polymorphic HLA class I molecules, HLA-B and HLA-A (Nejentsev et al., 2007, Howson et al., 2009). These molecules present peptides to cytotoxic CD8+ T-cells and are expressed strongly in the pancreatic insulin-producing beta cells (von Herrath, 2009). These recent HLA class I associations with T1D and additional histological evidence (Willcox et al., 2009) help secure a central role for CD8+ T-cells and their helpers, CD4+ T-cells and APCs, in T1D etiology.
4. Non-HLA genes
The central role of the APC-peptide-T-cell interaction in T1D has been emphasized above. However, the etiopathogenesis of the disease depends not only on the environment, but also on the alleles of multiple genes across the genome (Wicker et al., 2005), whose products are involved in the following immunological reactions. The specific MHC/peptide-TCR interaction, although necessary, is not sufficient to fully activate the T-cell. A second signal is required; otherwise the T-cell will become unresponsive. This second signal, also referred to as co-stimulation, is of crucial importance. The most potent co-stimulatory molecules known are B7s, which are Ig superfamily molecules, including B7-1 (CD80) and B7-2 (CD86). B7s exist as homodimers on the cell surface. These proteins are constitutively expressed on dendritic cells, but can be upregulated on monocytes, B-cells and probably other APCs. They are the ligands for other Ig superfamily molecules, CD28 and its homologue CTLA-4 (CD152), which is expressed after T-cell activation. CD28 is the main co-stimulatory ligand expressed on naïve T-cells. CD28 stimulation has been shown to prolong and augment the production of IL-2 and other cytokines, and is probably important in preventing the induction of tolerance. CTLA-4, the alternative ligand for B7, is an inhibitory receptor limiting T-cell activation, resulting in less IL-2 production. Thus CD28, constitutively expressed, initially interacts with B7, leading to T-cell activation, but once this has peaked, the upregulation of CTLA-4 with its higher affinity limits the degree of activation, as available B7 will interact with CTLA-4.
Apart from the cell-surface interactions, cytokines, acting locally, are also involved in T-cell activation. IL-2 is responsible for promoting cell division in a resting T-cell. Triggering of the TCR along with co-receptors results in IL-2 synthesis by the T-cell itself. This binds to low-affinity IL-2 receptors on the T-cell surface that consist of two chains, beta and gamma. TCR triggering also results in expression of the alpha chain of the IL-2 receptor, which, in conjunction with the other two chains, results in a high-affinity receptor. The alpha chain together with the beta chain binds IL-2, while the gamma chain signals to the cell. There is a transient production of IL-2 for 1-2 days. Other cytokines may also contribute to T-cell proliferation, and their relative representation in the microenvironment determines the development of certain T-cell subtypes. It is assumed that impairment of the balance between Th1 and Treg cells participates in the etiopathogenesis of autoimmune diabetes. The transient expression of the high-affinity IL-2 receptor for about 1 week after stimulation of the T-cell, together with the induction of CTLA-4, helps limit T-cell division. (In the absence of positive signals, the T-cells will start to die by programmed cell death, apoptosis.)
Based upon current analyses of completed genome screens, these non-HLA region genes may individually contribute relatively smaller (but significant) increments in genetic risk (Invernizzi & Gershwin, 2009). For example, the PTPN22, INS, and IL2RA loci, which are the only other generally accepted genetic contributors to T1D risk, have relative risks of 3.9, 3.5 and 2.5, respectively. Other locus-specific effects range from 1.1 to 1.9.
Currently, 52 non-HLA regions have been identified in T1D. For 15 of them, the most likely causal gene in the locus has been defined, as supported by current functional evidence (Todd, 2010). A summary of candidate causal genes is given in Table 2, where they are listed in order of decreasing relative risk. Further, 423 genes have been identified in the immediate high-association intervals and 790 genes in the > 1 Mb regions. In addition, there are 167 and 433 non-protein-coding sequences in these two categories of regions, and these could well have altered functions owing to genome polymorphism.
Perhaps the most replicated and broadly relevant of these associations is with the intracellular protein tyrosine phosphatase non-receptor 22 (PTPN22). The gene for PTPN22 is located on chromosome 1p13.2. The initial association of PTPN22 with T1D was reported by Bottini et al., 2004, who took a candidate gene approach and focused on a nonsynonymous amino acid polymorphism (R620W, substitution of tryptophan for arginine) that was judged likely to have functional correlates. This polymorphism corresponds to a single nucleotide substitution of thymine for cytosine at position 1858 of the DNA (C1858T). In an independent effort, Begovich et al., 2004, selected PTPN22 as part of a limited genome-wide screen of likely functional variants in several thousand candidate genes, informed in part by previous linkage results. This led to the association of PTPN22 with rheumatoid arthritis (RA). Both associations have now been widely replicated, and the PTPN22 associations with these and several other autoimmune diseases are among the most robust in the literature. For RA and T1D the PTPN22 620W allele confers a nearly two-fold risk for disease, with odds ratios in the range of 3-4 for homozygous individuals. Thus, in terms of strength of association, PTPN22 is second in importance only to the HLA for these two diseases.
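The relationship between the per-allele and homozygous effects quoted above can be checked with a short calculation. The sketch assumes a multiplicative (log-additive) model, in which each copy of the risk allele multiplies the odds by the same factor; both this model and the per-allele value of 1.9, chosen only to represent "nearly two-fold", are illustrative assumptions rather than figures from the cited studies.

```python
def genotype_odds_ratios(per_allele_or: float) -> dict:
    """Odds ratios by number of risk alleles (0, 1, 2) relative to
    non-carriers, under an assumed multiplicative model."""
    return {copies: per_allele_or ** copies for copies in (0, 1, 2)}


# 1.9 is an assumed stand-in for a "nearly two-fold" per-allele risk.
ors = genotype_odds_ratios(1.9)
print({copies: round(value, 2) for copies, value in ors.items()})
```

Under these assumptions the homozygous odds ratio falls within the 3-4 range given in the text.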
| Name | Original designation | Chromosome localization | RR | Function in immune system |
|---|---|---|---|---|
| PTPN22 | | 1p13.2 | 3.9 | TCR and BCR signaling |
| IL2RA | IDDM10 | 10p15.1 | 2.5 | T-cell and Treg homeostasis |
| SH2B3 | | 12q24.12 | 1.7 | Intracellular adaptor protein |
| PTPN2 | | 18p11.21 | 1.7 | Negative regulation of T-cells |
| IL10 | | 1q32.1 | 1.5 | Inhibition of Th1-cells |
| CTLA-4 | IDDM12 | 2q33.2 | 1.5 | T-cell costimulatory inhibition |
| TLR7/TLR8 | | Xp22.2 | 1.4 | Receptors for viral RNAs |
| IFIH1 | | 2q24.2 | 1.4 | Receptor for viral RNAs |
| IL2 | | 4q27 | 1.3 | T-cell and Treg homeostasis |
| IL7RA | | 5p13.2 | 1.2 | Memory T-cell homeostasis |
The patterns of association between the PTPN22 620W allele and autoimmunity are instructive on many levels. First, PTPN22 was among the first and most convincing demonstrations that common susceptibility genes underlie diverse autoimmune phenotypes. In addition to T1D and RA, PTPN22 is associated with Graves’ disease (Velaga et al., 2004), Hashimoto thyroiditis (Criswell et al., 2005), myasthenia gravis (Vandiedonck et al., 2006), systemic sclerosis (Dieude et al., 2008), generalized vitiligo (LaBerge et al., 2008), Addison’s disease (Skinningsrud et al., 2008), and alopecia areata (Betz et al., 2008). Associations with juvenile idiopathic arthritis (JIA) and SLE have generally been weaker than for RA and T1D. Strikingly, there is no evidence of association with multiple sclerosis, and the 620W allele actually appears to be protective for Crohn’s disease (Barrett et al., 2008). These contrasting patterns of association are likely to reflect fundamental similarities and differences in the mechanisms underlying the pathogenesis of these disorders. In general, it appears that an important feature of the PTPN22-associated diseases is that they all have a prominent component of humoral autoimmunity.
Knockout animals for Lyp (also known as PEP, the mouse ortholog of PTPN22) exhibit enhanced T-cell activation in combination with an increased production of antibodies (Hasegawa et al., 2004). This is consistent with the ability of PTPN22 to dephosphorylate Lck at the activating phosphotyrosine 394, leading to persistent phosphorylation and Lck activation in knockout animals. Lck is the tyrosine kinase of the Src family associated with CD4 and CD8. Yet somewhat surprisingly, the consequence of the 620W risk allele in humans is apparently a lower degree of T cell activation - an increased threshold for T-cell receptor signaling (Vang et al., 2005, Rieck et al., 2007). One clear biochemical consequence of the 620W polymorphism is to reduce the binding of PTPN22 with the intracellular kinase Csk (Bottini et al., 2004, Begovich et al., 2004). Indeed, amino acid position 620 of PTPN22 is located within one of several SH3 binding sites in the PTPN22 molecule. An important role of Csk is to inhibit Lck activity by phosphorylation of amino acid 505 of the Lck molecule (Vang et al., 2008). Whether this particular activity is affected by the 620W polymorphism in PTPN22 is unclear. Bottini and coworkers have proposed a model for interactions among Lck, PTPN22, and Csk that may explain the elevation of thresholds for TCR signaling, with the overall implication that reduced, rather than elevated, T-cell triggering may be part of the phenotypic predisposition to autoimmunity. A similar tendency to increased thresholds for receptor triggering has also been reported in B-cells (Rieck et al., 2007).
A second intracellular phosphatase, protein tyrosine phosphatase non-receptor 2 (PTPN2), encoded on chromosome 18p11.21, has also been associated with human autoimmunity; convincing associations have been reported for T1D (Todd et al., 2007) and Crohn’s disease (Wellcome Trust Case Control Consortium, 2007, Barrett et al., 2008), with odds ratios in the range of 1.3. PTPN2 is ubiquitously expressed and is clearly involved in immune function. PTPN2-knockout animals exhibit a fatal inflammatory wasting syndrome (Pao et al., 2007), with accompanying abnormalities in multiple cell types. PTPN2 appears to have a negative regulatory role on IL-2R signaling in T-cells, consistent with the fact that Janus family kinases 1 and 3 (Jak1 and Jak3) are among its substrates (Simoncic et al., 2002).
The insulin (INS) gene is located on chromosome 11p15.5. The immune-mediated process leading to development of T1D is highly specific to pancreatic beta cells. The insulin gene, therefore, is a plausible candidate susceptibility locus since preproinsulin has emerged as the most important autoantigen in childhood-onset T1D (Skowera et al., 2008).
Mutations of INS cause a rare form of diabetes that is similar to MODY (Maturity Onset Diabetes of the Young). Other variations of the insulin gene (variable number tandem repeats and SNPs) play a role in susceptibility to T1D and T2D. Polymorphisms in the insulin minisatellite, or variable number of tandem repeats (INS VNTR), are associated with the risk of diabetes and influence thymic insulin messenger RNA expression (Bell et al., 1984, Bennett et al., 1995, 1996). The INS VNTR is located 596 bp upstream of the insulin gene translation initiation site and is composed of a variable number of tandem repeats of 14-15 bp in length based on the consensus sequence 5'-ACAGGGGTGTGGGG-3' (Bell et al., 1982). There are three main types of INS VNTR defined by their size: class I (26-63 repeats), class II (approximately 80 repeats) and class III (140-200 repeats). Each of them can be further divided based on the number of repeats and sequence. In the white European population the minisatellite displays a bimodal allele size distribution, with class I and class III alleles at frequencies of 70% and 30%, respectively. Class II alleles are rare in the white European population. Allelic variation in the size of the insulin VNTR correlates with the expression of insulin in the pancreas and thymus and with placental expression of the insulin-like growth factor 2 gene (IGF2), which is downstream of the insulin gene (Moore et al., 2001).
Homozygosity for class I alleles is generally associated with high risk for diabetes, whereas class III alleles confer dominant protection. Class III alleles are associated with higher expression of insulin messenger RNA within the thymus, and thymic insulin transcription activity correlates inversely with susceptibility to diabetes in humans (Pugliese et al., 1997, Vafiadis et al., 1997). These findings support the hypothesis that genetically determined differences in the expression of self antigens in the thymus could influence susceptibility to autoimmunity. A high concentration of thymic insulin might lead to negative selection (deletion) of autoreactive T-cells bearing a TCR that is directed against self antigens and thus to the development of tolerance.
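The size classes and the genotype effects described above can be summarized in a short sketch. The repeat ranges for class I and class III come from the text; the window used for class II (70 to 90 repeats) is an assumption made only to give "approximately 80 repeats" a concrete form, and the function names and return labels are hypothetical.

```python
from typing import Optional, Tuple


def vntr_class(repeats: int,
               class_ii_window: Tuple[int, int] = (70, 90)) -> Optional[str]:
    """Assign an INS VNTR allele to a size class by repeat number.

    class_ii_window is an assumed range around ~80 repeats.
    Returns None for sizes outside the ranges covered here.
    """
    if 26 <= repeats <= 63:
        return "I"
    if class_ii_window[0] <= repeats <= class_ii_window[1]:
        return "II"
    if 140 <= repeats <= 200:
        return "III"
    return None


def ins_vntr_risk(repeats_a: int, repeats_b: int) -> str:
    """Qualitative risk for a genotype given the two allele sizes."""
    classes = {vntr_class(repeats_a), vntr_class(repeats_b)}
    if "III" in classes:          # class III protection is dominant
        return "protected"
    if classes == {"I"}:          # class I homozygote
        return "high risk"
    return "not specified"


print(ins_vntr_risk(40, 45))   # two class I alleles
print(ins_vntr_risk(40, 150))  # one class III allele
```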
The gene encoding the high-affinity alpha chain of the three-chain IL-2 receptor (IL2RA, CD25) on chromosome 10p15.1 is also strongly associated with T1D, with evidence for at least two distinct causal variants (Dendrou et al., 2009). Correlations, first with soluble CD25 in serum and plasma (Lowe et al., 2007), and then with mRNA and with CD25 surface expression (Dendrou et al., 2009), indicate unequivocally that IL2RA is a causal gene in this chromosome region. T1D-predisposing alleles of SNPs in intron 1 and the 5’ region of the gene lower transcription, which results in lower amounts of CD25 on the surface of memory CD4+ T-cells and also in lower numbers of CD25+ naive CD4+ T-cells. One of the notable results to come out of this large-scale clinical study was that the expression of CD25 on memory T-cells was highly repeatable within the same person across several months and, thus, strongly heritable, indicating that this is an immunophenotype that is hard-wired into the human genome. The 1% of the British population that has the most protective IL2RA genotype, and the most CD25 on these memory cells, enjoys considerable protection from T1D. Memory CD4+ T-cells with more CD25 on their surface secrete more IL-2 on stimulation; hence, it is possible that these T-cells are a source of IL-2, which could promote Treg cell function. The results in T1D for the IL-2 and IL2R genes (there is evidence for association with T1D of a common synonymous SNP in exon 1 of IL2 and of a SNP in IL2RB) raise a question: with such fundamental molecules being altered in the immune system, both of which are essential for immune tolerance, why are several immune diseases not associated with these variants? The answer may be simple: the balance between the functions of T effector and Treg cells appears to be important in human T1D (Lawson et al., 2008, Long et al., 2009, 2010).
Moreover, the C-C chemokine receptor type 5 (CCR5) gene on chromosome 3p21.31, which has a naturally occurring, albeit rare (1% in populations), functionally disruptive variant, delta32 (a 32 base pair deletion), is associated with protection from T1D, RA, and celiac disease (Smyth et al., 2008), indicating the importance of chemokines and T-cell trafficking in T1D.
An exciting example of the rapid gain in knowledge is the gene on chromosome 12q24.12 encoding an intracellular adaptor protein with PH and SH2 domains (SH2B3). This gene was first associated with T1D (Todd et al., 2007); the most associated marker was a nonsynonymous SNP that substitutes tryptophan for arginine (R262W). The 262W allele is the nonancestral allele and alters the sequence of a predicted functional pleckstrin homology domain, suggesting that this could be a causal SNP and gene. Subsequently, the same variant was associated with celiac disease (Hunt et al., 2008) and, most recently, with platelet count and cardiovascular disease (Soranzo et al., 2009). The latter study also focused on the R262W polymorphism and showed that the 262W allele lies on a very long, extended conserved haplotype, for which the authors provided evidence of very recent selection in European populations, presumably to help provide resistance to one or more pathogens. SH2B3 encodes Lnk, an important negative regulator of cell-signalling events from a number of receptors, including the TCR and MPL, the latter of which is the receptor for thrombopoietin on platelets. In general, Lnk negatively regulates lymphopoiesis and early hematopoiesis. It functions in responses controlled by cell adhesion and in crosstalk between integrin- and cytokine-mediated signalling (Takaki, 2008). Lnk deficiency results in enhanced production of B-cells, and in expansion as well as enhanced function of hematopoietic stem cells. These facts support the hypothesis that SH2B3 is the disease-causal gene in this chromosomal region.
Another candidate gene for T1D and other autoimmune diseases is located on chromosome 2q33.2 and encodes the T-cell costimulatory receptor, cytotoxic T lymphocyte antigen 4 (CTLA-4, CD152). In addition to T1D (Ueda et al., 2003), many human autoimmune diseases are associated with SNPs in the CTLA-4 gene, including Graves’ disease (Ueda et al., 2003), RA (Plenge et al., 2005), SLE (Barreto et al., 2004), and celiac disease (Smyth et al., 2008). Mutations or polymorphisms leading to altered activity of CTLA-4 are believed to play an important role in the risk of developing autoimmunity (Maier & Hafler, 2009). The effect of the CTLA-4 gene seems to be independent of the HLA alleles and of the insulin VNTR risk genotype (van der Auwera et al., 1997). There is a microsatellite marker in the 3’ untranslated region (UTR) of the CTLA-4 sequence, and several point polymorphisms have been detected in CTLA-4. A large-scale fine mapping and sequencing study pinpointed the causal variant for T1D and Graves’ disease to a SNP called CT60, which is located in the 3’ UTR of CTLA-4 near the polyA site. Such large-scale fine mapping studies are warranted for all other autoimmune diseases that have previously been associated with variants at CTLA-4. Interestingly, some evidence also suggests that CTLA-4 polymorphisms may influence gene expression. CTLA-4 functions as a potent negative T-cell regulator. Deficiency of CTLA-4 in mice leads to a lymphoproliferative disorder, causing death within 4 weeks of birth. More recently, its protein product has been shown to have an essential and specific role in Treg cell function in mice (Wing et al., 2008). One outstanding question is the role and function of the isoform of CTLA-4 that lacks the transmembrane domain (encoded by exon 3), which is referred to as soluble CTLA-4 and is presumed to be secreted.
The genetic analyses and the correlation of CTLA-4 gene haplotypes with messenger RNA levels suggest that it is a reduction in the amount of soluble CTLA-4 in T-cells, including Treg cells, that is responsible for the increased susceptibility to T1D (Atabani et al., 2005). Ueda et al. (2003) suggested that the disease-susceptible genotype at the CT60 SNP (the G allele) influences the relative splicing efficiency and production of soluble CTLA-4 (sCTLA-4) versus full-length CTLA-4 (flCTLA-4) mRNA. In fact, individuals homozygous for G at CT60 (the disease-susceptible genotype) had half the level of sCTLA-4, when normalized to flCTLA-4, of individuals homozygous for A. Indeed, sequences in the 3’ UTR of CTLA-4 have been shown to regulate CTLA-4 mRNA steady-state levels, mRNA stability and in vitro translation efficiency (Malquori et al., 2008). Another study has shown that the relative responsiveness of naive (CD4+ CD45RAhigh) versus memory (CD4+ CD45RAlow) T-cell subsets to TCR signaling was altered in healthy donors with the susceptible G allele compared to healthy donors with the protective A allele at CT60 (Maier et al., 2007). The CT60 SNP, in combination with autoantibody measurement, has also allowed the identification of a subgroup of T1D subjects; the presence of autoantibodies against thyroid peroxidase (TPOAb) not only increases the risk of T1D, but also results in an earlier age of onset of disease. Individuals with the G allele or GG genotype exhibit an almost twofold increased risk of having both T1D and TPOAb (Howson et al., 2007). It is of interest to note that the CTLA-4 association at CT60 with RA is also enhanced in subjects with RA who are seropositive for anti-citrulline antibodies (Plenge et al., 2005).
CTLA-4 is also important in preventing autoimmunity in a major mouse model of T1D, the non-obese diabetic (NOD) mouse. In addition to the full-length and soluble isoforms, a ligand-independent form of CTLA-4 (liCTLA-4) was discovered in the mouse (Wicker et al., 2004). The mRNA for liCTLA-4 lacks exon 2 and therefore encodes a molecule lacking the CD80/CD86 ligand binding domain. In the NOD mouse, it is the liCTLA-4 isoform that is differentially regulated by susceptible and protective CTLA-4 alleles. This genotype-dependent expression of the liCTLA-4 isoform is most likely due to a synonymous SNP in an exon splicing silencer motif located in exon 2. The liCTLA-4 isoform was shown to strongly inhibit T-cell responses by binding and dephosphorylating the TCRζ chain, and it emerged as a more potent negative T-cell regulator than flCTLA-4 in terms of proliferation and cytokine secretion (Vijayakrishnan et al., 2004). Because reduced expression of the sCTLA-4 and liCTLA-4 isoforms is associated with increased disease susceptibility in humans and the NOD mouse model, respectively, and because liCTLA-4 has been shown to be a potent inhibitor of T-cell activation and/or expansion, human sCTLA-4 may have a similar role. These data suggest that autoimmune-susceptible alleles may directly predispose to loss or reduced levels of self-tolerance.
Concerning the participation of viruses in disease etiopathogenesis, encouraging candidates for T1D are the Toll-like receptor 7 - Toll-like receptor 8 (TLR7-TLR8) region of chromosome Xp22.2, which encodes intracellular receptors for viral RNAs and for DNA and RNA from apoptotic cells, and the interferon induced with helicase C domain 1 (IFIH1) gene on chromosome 2q24.2, which also encodes an intracellular receptor known to recognise viral RNA and mediate the innate immune response. IFIH1 is also termed melanoma differentiation-associated protein 5 (MDA5). IFIH1 is considered causal in T1D on the basis of the protective associations of four rare variants, whose derived alleles are predicted to reduce gene expression or function (Nejentsev et al., 2009). Most intriguingly, the viruses detected by IFIH1 include members of the picornavirus family, to which the enteroviruses belong. Enteroviruses, which include coxsackie virus, are among the viruses most commonly reported to be associated with T1D (Dotta et al., 2007). Binding of viral RNA to IFIH1 triggers the production of the type 1 interferons (alpha and beta), which could, in a highly plausible scenario, enhance anti-beta cell CD8+ cytotoxic T lymphocyte activity in islets via HLA class I upregulation on beta cells and via direct effects on CD4+ and CD8+ T-cells (Devendra & Eisenbarth, 2004, Li et al., 2008, von Herrath, 2009). The genetic results suggest that a variety of virus infections could, via increased type 1 interferon levels, enhance susceptibility to autoimmune beta-cell destruction and T1D, provided that susceptibility alleles at other loci are present. Seasonal differences in viral infections, combined with other seasonal effects such as reduced vitamin D levels in more northern countries during the winter months (Hyppönen & Power, 2007, Svoren et al., 2009), could help explain the well-established seasonality of T1D diagnosis itself.
The gene encoding the GLI-similar 3 (GLIS3) protein on chromosome 9p24.2 is also associated with T1D. GLIS3 protein belongs to a subfamily of the Krüppel-like zinc finger transcription factors. GLIS3 plays a key role in pancreatic development, particularly in the generation of beta-cells and in the regulation of insulin gene expression. It could be an autoantigen in T1D for which polymorphism in or near the gene alters its expression in beta-cells, as found for insulin. Mutations in GLIS3 have been implicated in a syndrome characterized by neonatal diabetes and congenital hypothyroidism (Senée et al., 2006) and in some patients accompanied by polycystic kidney disease, glaucoma and liver fibrosis. In addition, the GLIS3 gene has been associated with fasting glucose levels and type 2 diabetes (T2D) susceptibility (Dupuis et al., 2010). Therefore, this would be the first convincing example of a gene predisposing to both T1D and T2D. Otherwise, T1D and T2D are genetically and, therefore, etiologically distinct (Rafiq et al., 2008, Raj et al., 2009).
The gene on chromosome 7p12.2 encoding the IKAROS family zinc finger 1 (IKZF1) protein, which is an essential regulator of lymphopoiesis and immune homeostasis, has been implicated in the development of childhood acute lymphoblastic leukemia. The major IKZF1 genotype conferring susceptibility to this leukemia has been shown to protect against T1D (Swalford et al., 2011). This finding strengthens the link between autoimmunity and lymphoid cancers.
The gene encoding the T-cell activation RhoGTPase activating protein (TAGAP) on chromosome 6q25.3 is associated not only with T1D, but also with RA, celiac disease and Crohn’s disease. The TAGAP minor allele confers protection against RA, as previously reported for T1D, whereas in celiac disease and Crohn’s disease the minor allele is associated with risk (Eyre et al., 2010). TAGAP is transiently expressed in activated T-cells, suggesting that it may have a role in immune regulation.
Exploring the genetic overlap between related diseases may reveal key common autoimmunity-inflammatory pathways and may show how further combinations of more disease-specific variation at HLA and non-HLA genes, in interaction with epigenetic and environmental factors, determine the final clinical outcomes (Table 3).
It is interesting to note that the shared genetic predisposition between T1D and celiac disease (Smyth et al., 2008) supports further evaluation of the hypothesis that gut microflora dysbalance and gluten consumption might be environmental factors in T1D, altering the function of the gut immune system and its relationship with the pancreatic immune system (Turley et al., 2005, Vaarala et al., 2008, Wen et al., 2008). Conversely, genes classified as autoimmunity genes because they are associated with T1D also contribute to celiac disease. However, some causal alleles have effects in opposite directions, and there are some distinct differences in genetic susceptibility between the two diseases. There is also a significant overlap between T1D and RA that is much greater than that between celiac disease and RA (Eyre et al., 2010). These data suggest that a common etiology may exist between T1D and celiac disease, but a common pathogenesis may exist between T1D and RA.
Shared causal variants among autoimmune diseases could suggest therapeutic targets applicable to more than one disease. However, the susceptibility loci unique to a particular disease are also of interest; such differences may reflect genuine specificity between the diseases and may help determine the particular autoimmune phenotype, which may also have clinical application.
|Locus||Other autoimmune diseases|
|HLA||almost all autoimmune diseases|
|PTPN22||RA, Graves’ disease, Hashimoto thyroiditis, myasthenia gravis, systemic sclerosis, generalized vitiligo, Addison’s disease, alopecia areata; weak association with JIA and SLE; opposite effect (protection) in Crohn’s disease; no association with celiac disease and multiple sclerosis|
|IL2RA||multiple sclerosis, but no association with celiac disease|
|CCR5||RA, celiac disease|
|PTPN2||celiac disease, Crohn’s disease|
|CTLA4||Graves’ disease, RA, SLE, celiac disease, Crohn’s disease|
|TAGAP||RA, opposite effect (predisposition) in celiac disease, Crohn’s disease|
|IL18RAP||opposite effect (predisposition) in celiac disease, Crohn’s disease|
Another relevant area is the role of epigenetics (Wang et al., 2009, MacFarlane et al., 2009, Javierre et al., 2010), which provides a molecular bridge between genes and environment (Waterland et al., 2006, MacFarlane et al., 2009). Stochastic early epigenetic imprinting that can alter gene expression, as well as environmentally induced epigenetic changes (Waterland et al., 2006, MacFarlane et al., 2009, Tobi et al., 2009), including the aging process itself (Rakyan et al., 2010), could help account for the discordance of monozygotic twins for disease (Kaminsky et al., 2009). Recently, the first parentally imprinted susceptibility region has been reported, involving the DLK1-MEG3 locus on chromosome 14q32, in which expression of the maternal haplotype of these genes is suppressed by epigenetic mechanisms such that the risk of T1D at this locus is transmitted from fathers only (Wallace et al., 2010). Therefore, going forward, it will be necessary to combine studies correlating disease-associated SNP alleles and haplotypes with gene expression and splicing, and with measurement of their methylation status (Todd, 2010). Several T1D genes, namely INS, IL2, and IL10, are already known to have differentially methylated regions that affect their expression.
This survey was funded by the Research design of the Ministry of Education and Youth of the Czech Republic, identification code MSM 0021620814: Prevention, diagnostics and therapy of diabetes mellitus, metabolic and endocrine damage of organism.