Number of functional human immunoglobulin gene segments in the heavy and light chain locus.
The immunoglobulin repertoire is far too diverse to be encoded by static genes; doing so would exceed the coding capacity of the human genome, which comprises only about 20,000–25,000 genes. Instead, the repertoire is generated by somatic recombination of germ line gene segments, the only controlled alteration of genomic DNA after meiosis. It takes place in B lymphocyte (B cell) precursors in the bone marrow of mammals. The germ line sequence of undeveloped B cells is organized in gene segments and comprises V (variable), D (diversity), and J (joining) gene segments, which build the variable domain of the heavy chain, and only V and J gene segments, which build the variable domain of the light chain. The rearrangement of the variable region follows a strict order. This chapter summarizes the processes that generate antibody diversity: allelic, combinatorial, and junctional diversity, pairing of IgH and IgL, and receptor editing, which together produce the primary antibody repertoire (before antigen stimulation). When a B cell encounters a foreign antigen, affinity maturation and class switching are induced, which further enlarges the antibody repertoire. In humans, the resulting secondary immunoglobulin repertoire comprises at least 10^11 specificities for different antigens.
- antibody diversity
- somatic recombination
- somatic hypermutation
- class-switch recombination
- allelic exclusion
- B-cell receptor editing
- pairing of VH and VL
- germinal center
The immune system is a complex system comprising different organs and many specialized cell types, which develop, mature, and recognize pathogens at various sites in the body. It has two major approaches to recognize and attack pathogens: innate immunity acts first and is followed by the delayed adaptive immune response, which is based on specific antigen recognition receptors. The innate immune system is nonspecific and relies on general pathogen recognition mechanisms, in which pathogen-associated molecular patterns (PAMPs) are recognized by cell surface or intracellular pattern recognition receptors (PRRs), such as toll-like receptors (TLRs), NOD-like receptors (NLRs), and RIG-I-like receptors (RLRs). Cell types of the innate immune system are monocytes/macrophages, dendritic cells, mast cells, natural killer cells, granulocytes, B1 cells, and innate lymphoid cells (ILCs). Although it lacks specificity, the innate immune system reacts immediately to invading pathogens and activates the adaptive immune system by presenting foreign antigen peptides.
The adaptive immune system needs to be activated and primed by the antigen and therefore responds with a delay after the initial pathogen attack. It is based mainly on two cell types, the B cell and the T cell. Both express specific receptors for pathogen recognition on their cell surface, called the B-cell receptor (BCR) and the T-cell receptor (TCR). Many different B- and T-cell clones exist in parallel in the body, and each has a receptor with a different antigen specificity. It is remarkable that despite the relatively small genome of approximately 20,000–25,000 human genes [2, 3], the human body can produce an antibody repertoire that recognizes almost every possible antigenic structure. This cannot be achieved by encoding each antigen receptor specificity directly in the genome sequence.
The huge B-cell diversity is generated by a complex multistep process, which starts in the bone marrow and ends in the peripheral lymphoid tissues, such as lymph nodes, spleen, or mucosal lymphoid tissue. In the maturation of a functional BCR or TCR, the antigen receptor genes are rearranged from many different possible gene segments to form a full receptor. At each step, the receptor is tested for functionality and is excluded when it shows reactivity against self-antigens, which prevents autoimmunity and makes the immune system self-tolerant.
B-cell maturation occurs in the bone marrow before the B cells migrate to peripheral lymphoid tissues. In contrast to B cells, T-cell progenitors migrate to the thymus to differentiate and mature. After their maturation, B and T cells meet again in the lymph nodes. In the germinal centers of the lymph node, antigens are presented to the B cells by antigen-presenting cells, particularly follicular dendritic cells (FDCs). In response to a foreign pathogen, the B cells with the highest antigen affinity are selected from a pool of different BCR clones. This process is organized as a repetitive cycle between the dark and light zones of a germinal center and is known as the cyclic reentry model (Figure 4).
An essential part of the cycle is the affinity maturation of the BCR. It begins with tightly controlled somatic hypermutation (SHM), which particularly targets the variable regions of the light and heavy chains of the antigen receptor and is active only in the dark zone of the germinal center. This process creates BCRs with higher affinity, whereas cells whose mutations produce very low-affinity or nonfunctional receptors are eliminated. Finally, high-affinity B cells differentiate either into plasma cells, which secrete antibodies with the same specificity as the BCR, or into memory cells, which confer lifelong immunity.
This chapter discusses in detail the different steps and processes that contribute to the high diversity of B cells. Many of these steps are similar in the generation of T-cell receptor diversity, which is not covered in this chapter.
2. The primary antibody repertoire
2.1. Combinatorial diversity of immunoglobulins
Before immature naive B cells encounter a foreign antigen, their genomic sequence is rearranged by a well-controlled process called somatic DNA recombination. This process is unique to lymphocytes and, apart from meiosis, is the only controlled DNA recombination in the body. Somatic DNA recombination takes place before B cells leave the bone marrow for the secondary lymphatic organs. The sum of all B lymphocytes in an individual, producing antibodies with different specificities and affinities, is designated the antibody repertoire. In humans, the antibody repertoire consists of at least 10^11 specificities. The number varies and is limited by the total number of B cells and by the antigens an individual has encountered. The immunoglobulin loci contain the gene segments needed to build all immunoglobulin variable domains of the heavy and light chains. The immunoglobulin loci are located on different chromosomes (Chr): the heavy chain locus on Chr14, the kappa light chain locus on Chr2, and the lambda light chain locus on Chr22. In contrast to the light chain loci, the heavy chain locus has several constant regions, each representing a different immunoglobulin isotype, e.g., IgM, IgD, IgG1, IgG2a, IgG2b, IgG3, IgE, and IgA in mice. Each type of gene segment exists in several germ line versions. For example, the variable gene cluster of the heavy chain locus comprises 38–46 genes, a number that varies between individuals.
Besides the different germ line segments, there is a relatively large number of pseudogenes, some of which can undergo recombination and lead to a nonfunctional variable region. Table 1 gives an overview of the number of gene segments in each gene locus (slightly modified from IMGT).
Immunoglobulin (Ig) gene segments
|Gene locus|Ig chain|Chromosomal location|Locus size (kb)|Variable (V)|Diversity (D)|Joining (J)|Constant (C)|
|---|---|---|---|---|---|---|---|
|IGK|κ Light chain|2p11.2|1820|34–38|0|5|1|
|IGL|λ Light chain|22q11.2|1050|29–33|0|4–5|4–5|
The light chain loci have only variable (V) and joining (J) gene segments, whereas the heavy chain locus additionally has diversity (D) gene segments, which lie between the V and J gene segments. One gene segment of each type is randomly selected and joined by the RAG1/RAG2 recombinase to form the variable region, as shown for the variable region of the λ light chain in Figure 2C. The recombination steps of the V region follow a strict order. In the light chain, a V segment is joined directly to a J segment. Afterward, the constant (C) domain is joined to the variable region through RNA splicing of the primary transcript. The construction of the heavy chain V region begins with the recombination of a D and a J gene segment; then a V gene segment is joined to the DJ segment. Finally, the C domain is joined through RNA splicing of the primary transcript. Figure 1 gives an overview of the steps of V(D)J recombination in the construction of the V region of the immunoglobulin heavy chain.
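As a rough illustration of combinatorial diversity, the V and J segment counts listed for the light chain loci in Table 1 can be multiplied to estimate how many V-J pairings are possible. The following is only back-of-the-envelope arithmetic: the dictionary layout and the function name `vj_combinations` are illustrative assumptions, and pseudogenes, junctional diversity, and unequal segment usage are ignored.

```python
# Segment counts (min, max) for the light chain loci, taken from Table 1.
LIGHT_CHAIN_SEGMENTS = {
    "IGK": {"V": (34, 38), "J": (5, 5)},
    "IGL": {"V": (29, 33), "J": (4, 5)},
}


def vj_combinations(locus):
    """Return the (min, max) number of possible V-J pairings for one locus."""
    v_min, v_max = LIGHT_CHAIN_SEGMENTS[locus]["V"]
    j_min, j_max = LIGHT_CHAIN_SEGMENTS[locus]["J"]
    return v_min * j_min, v_max * j_max


kappa_range = vj_combinations("IGK")   # (170, 190)
lambda_range = vj_combinations("IGL")  # (116, 165)
```

Even under these simplifications, a few dozen segments yield only a few hundred light chain combinations, which is why the additional mechanisms described in the following sections are needed to reach the size of the observed repertoire.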
The figure illustrates the somatic recombination of the antibody heavy chain in developing B cells in the bone marrow. First, one D and one J segment are randomly chosen and joined. In the following step, one of the variable gene segments is joined to form the V-D-J variable region. This process is catalyzed by the recombination activating gene 1/recombination activating gene 2 (RAG1/RAG2) recombinase. In immature B cells in the bone marrow, the variable region is transcribed together with the constant mu (Cμ) and constant delta (Cδ) genes; alternative splicing produces two different mRNAs, which are translated into either IgM or IgD.
The ordered course of the recombination is guided by recombination signal sequences (RSSs). An RSS is always directly adjacent to the coding region of a gene segment (Figure 2A). The nucleotide structure of the RSS is well defined and conserved (Figure 2B): a conserved heptamer of seven nucleotides is linked by a non-conserved linker (spacer) sequence to a conserved nonamer of nine nucleotides [6, 7, 8]. The linker sequence is either 12 or 23 nucleotides long, and an RSS with a 12 bp linker can recombine only with an RSS with a 23 bp linker, which is called the 12/23 rule. The 12/23 rule ensures that only corresponding gene segments recombine. For instance, the V gene segments of the lambda light chain are always flanked downstream by a 23 bp RSS, and the J gene segments of the lambda light chain are always flanked upstream of the coding sequence by a 12 bp RSS. For the kappa light chain it is the other way around, with the 12 bp RSS downstream of the V gene segment and the 23 bp RSS upstream of the coding sequence of the J gene segment. In the heavy chain locus, the D gene segments are flanked on both sides by a 12 bp linker RSS, whereas the V gene segments are flanked downstream and the J gene segments upstream of the coding sequence by a 23 bp linker RSS. This permits recombination only in the desired V-D-J order; during recombination, the sequence between the chosen gene segments is excised and discarded. Figure 2 shows the position and structure of the recombination signal sequences (RSSs) at the V, J, and D gene segments and the RSS-guided, RAG-dependent V-J rearrangement of the variable domain of the λ chain.
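The 12/23 rule can be written down as a small lookup, which makes it easy to see why a heavy chain V segment cannot be joined directly to a J segment. This is a minimal sketch based only on the linker lengths given in the text; the table `RSS_LINKER` and the function `can_recombine` are hypothetical names chosen for illustration.

```python
# Linker (spacer) length in bp of the RSS flanking each segment type,
# as described in the text. Heavy chain D segments carry a 12 bp RSS
# on both sides, so a single value is enough here.
RSS_LINKER = {
    ("IGL", "V"): 23, ("IGL", "J"): 12,
    ("IGK", "V"): 12, ("IGK", "J"): 23,
    ("IGH", "V"): 23, ("IGH", "D"): 12, ("IGH", "J"): 23,
}


def can_recombine(locus, segment_a, segment_b):
    """12/23 rule: a 12 bp linker RSS pairs only with a 23 bp linker RSS."""
    linkers = {RSS_LINKER[(locus, segment_a)], RSS_LINKER[(locus, segment_b)]}
    return linkers == {12, 23}


allowed = can_recombine("IGH", "D", "J")    # True: 12 bp with 23 bp
forbidden = can_recombine("IGH", "V", "J")  # False: 23 bp with 23 bp
```

In this model the heavy chain allows D-J and V-D joining but not V-J joining, whereas both light chain loci allow direct V-J joining.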
2.2. Junctional diversity of immunoglobulins
During V(D)J recombination, the diversity of immunoglobulins is further increased by the incorporation of additional nucleotides at the junctions between the V, D, and J gene segments of the heavy chain and between the V and J gene segments of the light chain. This process frequently affects the CDR3 (complementarity-determining region 3), which has a huge influence on antigen binding [9, 10], because the CDR3 spans the V-D-J junction in the heavy chain and the V-J junction in the light chain. The CDR1 and CDR2 loops are not affected by junctional diversity because they lie within the V gene segment of the heavy and light chain.
When two gene segments are brought together by their recombination signal sequences (RSSs) and the RAG1/RAG2 complex, the RAG complex excises the intervening DNA and produces short hairpins at the ends of both immunoglobulin gene segments (Figure 3). The Artemis/DNA-dependent protein kinase (DNA-PK) complex is then recruited and opens the hairpin at each end by cutting the DNA strand at a random position [11, 12, 13]. This can produce palindromic DNA sequences at the site of the gene segment joint; these nucleotides are called P nucleotides because of their palindromic nature. Next, terminal deoxynucleotidyl transferase (TdT) adds further nucleotides to the single-stranded P nucleotide stretch. These nucleotides are added randomly without any DNA template; hence they are called N nucleotides (non-templated). After the addition of a few N nucleotides, some bases pair between the two single-stranded DNA stretches, and the mismatched nucleotides are removed by an exonuclease; Artemis might be involved in this process. The remaining gaps are filled by a DNA polymerase, and finally both DNA strands are joined by the DNA ligase IV/X-ray repair cross-complementing protein 4 (XRCC4) complex.
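The origin of P and N nucleotides can be made concrete with a toy model of one coding end. The sketch below assumes that opening the hairpin `offset` bases away from its tip turns the last `offset` bases of the coding end into a palindromic extension, and that TdT-like addition appends uniformly random bases; both are simplifications, and all function names are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def open_hairpin(coding_end, offset):
    """Append the P nucleotides created by an off-center hairpin cut."""
    if offset == 0:
        return coding_end  # cut exactly at the tip: no P nucleotides
    return coding_end + reverse_complement(coding_end[-offset:])


def add_n_nucleotides(seq, count, rng):
    """Append `count` random, non-templated N nucleotides."""
    return seq + "".join(rng.choice("ACGT") for _ in range(count))


with_p = open_hairpin("GGTCGA", 2)  # "GGTCGATC", ends in the palindrome GATC
with_n = add_n_nucleotides(with_p, 3, random.Random(0))
```

The example shows why P nucleotides are palindromic: the added bases are the reverse complement of the bases immediately preceding them, whereas the N nucleotides carry no relation to any template.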
N nucleotides are not equally distributed between the light and heavy chains. The light chain contains remarkably fewer N nucleotides than the heavy chain. The reason for this difference is the expression pattern of terminal deoxynucleotidyl transferase, which is much higher when the heavy chain is rearranged and already lower when the light chain is subsequently rearranged. The incorporation of additional nucleotides is not only beneficial. It can, of course, change the affinity of the antibody dramatically, but insertions that do not preserve the 3 bp codon structure produce a frameshift in the coding sequence (non-productive rearrangements, see Figure 3).
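Whether a junction preserves the codon structure depends only on the net number of nucleotides gained or lost. The sketch below checks this condition and, under the purely illustrative assumption that every combination of added and removed nucleotides is equally likely, estimates what fraction of junctions stays in frame. Stop codons are ignored, and the function names are hypothetical.

```python
def is_in_frame(added, removed):
    """True if the net length change at the junction is a multiple of three."""
    return (added - removed) % 3 == 0


def in_frame_fraction(max_added, max_removed):
    """Fraction of all (added, removed) combinations that keep the frame."""
    outcomes = [
        is_in_frame(added, removed)
        for added in range(max_added + 1)
        for removed in range(max_removed + 1)
    ]
    return sum(outcomes) / len(outcomes)


fraction = in_frame_fraction(5, 5)  # about one third
```

Under this assumption roughly two out of three junctions shift the reading frame, which illustrates why non-productive rearrangements are a frequent outcome of junctional diversification.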
2.3. Antibody diversity is further expanded by allelic exclusion, B-cell receptor editing, and pairing of VH and VL
In most cases, only one functional allele of an immunoglobulin gene is expressed. The other allele may be transcribed in parallel, but usually only one of them can be assembled into a functional B-cell receptor (BCR). Allelic exclusion means that only clonally identical BCRs are expressed on the B-cell surface and not two different versions from two different alleles. Diploid organisms, such as mammals, carry two copies of each gene, one on each homologous chromosome. For the immunoglobulin gene loci, the product of only one allele is expressed on the B-cell surface. When V(D)J rearrangement of the first allele does not produce a functional BCR, the second allele is activated and tested. When this also fails, the B cell dies by apoptosis; this process is called clonal deletion. The choice between two different immunoglobulin alleles further increases antibody diversity.
The exact mechanism of allelic exclusion is not yet completely understood, but some important steps are known. During pre-B-cell development, when the heavy and light chain rearrangements take place in the bone marrow, only one allele is chosen for recombination, whereas the other is silenced. When a functional heavy chain is produced, RAG1/RAG2 recombinase expression is decreased, and RAG1/RAG2 is targeted for degradation. Furthermore, the access of the RAG1/RAG2 recombinase to the heavy chain loci is reduced. Later, when the light chain is rearranged, access to the heavy chain loci remains blocked, and no further rearrangement or change of allele activity can occur.
Although some essential steps are known, the precise mechanism remains unclear and is still debated.
When the production of a functional B-cell receptor fails, the other immunoglobulin allele is tested, or the BCR genes undergo additional rounds of V(D)J recombination until a functional receptor is produced or no further V, D, and J gene segments are available for recombination. Usually, V(D)J recombination ends when a functional BCR is produced. When a functional BCR reacts against antigens of the own body (self-reactivity), a specialized mechanism attempts to rescue the B cell by editing the self-reactive receptor. This mechanism is called receptor editing and is one of the key checkpoints and rescue mechanisms that ensure self-tolerance and allow the cell to escape clonal deletion.
The idea that self-reactive B cells can be rescued by editing the BCR through continued recombination of the antibody genes was investigated by several groups between the late 1980s and early 1990s [17, 18, 19, 20]. In one experiment, the MHC class I molecule H-2Kb and an anti-H-2Kb antibody were expressed ectopically in transgenic mice. The authors found that anti-H-2Kb B cells were absent from the periphery but were still present in the bone marrow, where high levels of RAG1/RAG2 recombinase indicated ongoing attempts to edit the BCR. About 25% of functional antibodies are produced by receptor editing.
However, there are reports that about 50% of B cells are initially self-reactive, and it has been suggested that receptor editing is the main mechanism conferring self-tolerance, besides clonal deletion of self-reactive B cells in the bone marrow and anergy of self-reactive B cells in the periphery. Anergy and deletion inactivate or remove self-reactive clones. Receptor editing is based on secondary Vκ → Jκ light chain rearrangements or, more rarely, on altering the heavy chain variable region by replacing the VH gene segment in an established VHDJH rearrangement.
In conclusion, the modification of the V region by receptor editing extends antibody diversity and rescues some B cells from apoptosis, especially self-reactive ones.
The pairing of heavy and light chains is considered to contribute less to antibody diversity than the processes of somatic recombination and junctional diversity described above. Nevertheless, the combination of different variable regions of the light (VL) and heavy (VH) chains further expands the antibody repertoire. Earlier studies suggested that VH and VL combine completely by chance and observed no preference in V gene pairing [23, 24]. More recent publications, however, revealed some preferred VH and VL gene pairings in human and mouse antibodies by searching a newer and larger antibody data set (the KabatMan dataset) that was not available to the earlier studies. The results showed that pairing preferences do exist, but only for a small proportion of germ line immunoglobulin gene sequences.
3. The secondary antibody repertoire
3.1. Somatic hypermutation
After assembly of the V regions of the heavy and light chains and cell surface expression of a functional BCR, naive B cells migrate to the secondary lymphatic organs, for example, the lymph nodes. In the germinal center of the lymph node, the primary antibody repertoire is further diversified by mutations that activation-induced cytidine deaminase (AID) introduces into the V domains of the heavy and light chains [27, 28, 29]. This enzyme is expressed and active only in activated mature B cells of the germinal center and is the key enzyme of somatic hypermutation (SHM). The germinal center of the lymph node is anatomically divided into two parts, the light zone and the dark zone. Somatic hypermutation mediated by AID takes place in the dark zone. Cells that produce a nonfunctional B-cell receptor (BCR) upon mutation die by apoptosis, whereas B cells with a functional BCR migrate into the light zone. In the light zone, positively selected B cells whose BCR still has a low affinity are stimulated to survive, proliferate, and reenter the dark zone for another round of affinity maturation. After several rounds of affinity maturation, B cells can leave the germinal center and differentiate into antibody-producing plasma cells or memory B cells. B cells with very low affinity are deprived of survival signals and die by apoptosis before they can reenter the dark zone.
During this migration, B cells change their expression pattern depending on their location in the light or dark zone of the germinal center. The C-X-C chemokine receptor type 4 (CXCR4) is one of the classical markers whose expression level changes when the cells migrate to the other zone. CXCR4 expression of B cells is strong in the dark zone and reduced in the light zone. CXCL12, a ligand of CXCR4, is expressed on the cell surface of reticular cells in the dark zone. CXCL12/CXCR4 signaling is regarded as a homing signal that retains B cells in the dark zone as long as CXCR4 expression is high [31, 32]. CXCR4 deficiency in germinal center B cells restricts the B cells to the light zone, but the deletion alone is not sufficient for the functional transition of dark zone B cells (centroblasts) to light zone B cells (centrocytes).
The process of affinity maturation is also known as the cyclic reentry model (Figure 4). It starts with the introduction of mutations into the V region, initiated by AID. The induced mutation rate is about one nucleotide per 10,000 nucleotides per cell division. This is much higher than the normal mutation rate of about 10^-10 mutations per cell cycle. Even a change in one or a few amino acids in the CDRs or framework regions of the V region can dramatically change antigen affinity and specificity. Mutations can have detrimental effects and produce B-cell receptors with lower affinity, especially when they affect the complementarity-determining regions (CDRs) and lead to antibodies that can no longer bind the antigen. At this stage, negative selection of B cells occurs. When such detrimental changes prevent a B cell from presenting a functional receptor on its surface, or when the receptor has lost its antigen affinity, cell death by apoptosis is initiated. Subsequently, the apoptotic B cells are cleared by phagocytosis by tingible body macrophages (TBMs).
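The two mutation rates quoted above can be compared with a few lines of arithmetic. The sketch assumes that the background rate is read as 10^-10 per nucleotide per cell cycle and uses a V-region length of 350 nucleotides, which is an illustrative assumption and not a value from this chapter; the constant and function names are hypothetical.

```python
SHM_RATE = 1 / 10_000    # mutations per nucleotide per division (from the text)
BACKGROUND_RATE = 1e-10  # assumed reading of the background rate in the text
V_REGION_LENGTH = 350    # assumed V-region length in nucleotides (illustrative)


def fold_increase(shm_rate, background_rate):
    """How many times higher the hypermutation rate is than the background."""
    return shm_rate / background_rate


def expected_mutations(rate, length, divisions):
    """Expected number of mutations in `length` nucleotides after `divisions`."""
    return rate * length * divisions


increase = fold_increase(SHM_RATE, BACKGROUND_RATE)                   # about 1e6
per_ten_divisions = expected_mutations(SHM_RATE, V_REGION_LENGTH, 10)  # about 0.35
```

With these assumptions, hypermutation is about a million times faster than the background rate, yet a single V region still acquires well under one mutation per division on average, which is consistent with the need for repeated rounds of mutation and selection.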
On the other hand, B-cell clones with a BCR of high affinity for the antigen receive growth signals, for example, from follicular T helper cells, and are expanded. This principle is called positive selection. Selected B cells that have undergone affinity maturation show more mutations in the regions critical for antigen binding, namely, the CDRs. A mutation in a CDR that produces an amino acid change very likely alters the antigen affinity.
B cells with sufficient affinity for the antigen presented by follicular dendritic cells (FDCs) in the light zone can capture it, process it, and present the antigen peptides via major histocompatibility complex class II (MHC II) to T cells. The B-cell clones then receive survival and mitogenic signals through T-cell receptor (TCR) recognition, CD40-CD40L interaction, and cytokines secreted by the T cells (Figure 4). As a consequence, B-cell receptors and CD40 cluster together and thereby promote positive selection signaling. Follicular dendritic cells present foreign antigens on their dendritic surface in the form of iccosomes (immune complex-coated bodies) [33, 34, 35]. Iccosomes are antigen/antibody/complement complexes bound to Fc and complement receptors on FDCs. When B cells recognize antigens presented on iccosomes, they can take them up and process them for MHC II-mediated presentation to T cells. The efficiency and amount of iccosome uptake can also influence the fate of the B cell. B-cell clones with higher affinity for the antigen capture more of the iccosome-presented antigen, which results in more processed antigen peptide being presented on the B-cell surface in complex with MHC II molecules. These clones therefore receive more survival and proliferation signals in the light zone from the recognizing follicular T helper (TFH) cells.
The process of somatic hypermutation (SHM) has not only a cellular dimension; it also has a molecular dimension, which can be characterized by the details of the mechanism of SHM and affinity maturation. The central enzyme in SHM is the activation-induced cytidine deaminase (AID). AID catalyzes the deamination of the DNA nucleotide cytosine to uracil, which is usually only present in RNA molecules.
The expression of AID is tightly restricted to germinal center B cells, which protects other cells from somatic hypermutation. A further safeguard protects the majority of the genomic DNA from mutation [36, 37, 38]: AID acts only on single-stranded DNA and therefore cannot attack genomic DNA, which is predominantly double-stranded. During transcription, the RNA polymerase transiently exposes the genomic DNA as a single strand, which gives AID access for deamination. The immunoglobulin V region genes are actively transcribed in germinal center B cells, so somatic hypermutation can occur there. Besides the immunoglobulin V regions, some other transcribed genes can also be affected by AID, although at a lower frequency. AID not only drives somatic hypermutation by acting on the immunoglobulin V region loci; it can also activate immunoglobulin class switching by acting on cytidine residues in the switch regions.
The deamination of cytidine to uracil by AID is the initiating step of SHM and of class-switch recombination (CSR). Further mutation of the DNA around the initial deamination site is carried out by two different DNA repair pathways [39, 40, 41]. In the first, the DNA mismatch repair pathway, the proteins MSH2 and MSH6 (mutS homolog 2/6) detect the incorrect pairing of uracil (U) with guanosine (G) and recruit DNA nucleases, which remove the uracil and the adjacent nucleotides. The DNA polymerase Polη, which then fills the gap, has no exonuclease activity and is error prone in B cells. It preferentially misincorporates thymidine (T) regardless of the template sequence, so that the mismatch repair pathway leads to a preference for adenosine (A)-thymidine (T) mutations at the originally targeted cytosine and the adjacent nucleotides.
Alternatively, in the base excision repair pathway, uracil DNA glycosylase (UNG) cleaves the uracil nucleobase from the nucleotide and leaves an abasic site in the DNA strand. During the following DNA replication, a random base is inserted into the opposite DNA strand across from the abasic site. This is mediated by an error-prone DNA polymerase that is otherwise used in translesion DNA synthesis across DNA damaged by UV radiation.
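The sequence of events in the base excision route, deamination by AID followed by replacement of the uracil position with an arbitrary base, can be mimicked on a short string. This is a deliberately simplified, single-strand sketch: it treats the final outcome as a uniformly random base at the deaminated position, and the function names are hypothetical.

```python
import random


def deaminate(seq, position):
    """AID step: convert the cytosine at `position` into uracil."""
    if seq[position] != "C":
        raise ValueError("AID deaminates cytosine only")
    return seq[:position] + "U" + seq[position + 1:]


def base_excision_outcome(seq, rng):
    """UNG removes each U; error-prone synthesis leaves a random base there."""
    return "".join(rng.choice("ACGT") if base == "U" else base for base in seq)


deaminated = deaminate("GACT", 2)  # "GAUT"
repaired = base_excision_outcome(deaminated, random.Random(1))
```

Because any of the four bases can end up at the former cytosine position, this route can produce both transitions and transversions, or restore the original cytosine by chance.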
As mentioned before, AID can also initiate class-switch recombination. After UNG has introduced an abasic site in a switch region, apurinic/apyrimidinic endonuclease 1 (APE1) cleaves the DNA strand at the abasic site and produces a single-strand nick. In the switch regions, upstream of the constant region genes, the nicked DNA is cleaved further, which produces a double-strand break (DSB). The double-strand break repair machinery then joins another constant region gene to the V region.
3.2. Class-switch recombination
In naive B cells, which have already rearranged their V region by somatic DNA recombination, two antibody isotypes are co-expressed. The V region is transcribed together with the μ chain (IgM) and the δ chain (IgD) constant regions on the same RNA transcript. Alternative splicing selects either the μ chain or the δ chain and thereby produces two different messenger RNAs (Figure 1). Upon antigen contact and B-cell activation, B cells switch their antibody isotype from IgM/IgD to IgG, IgA, or IgE. This is achieved by a process called class-switch recombination (CSR) or isotype switching. The antibody isotype is changed by exchanging the constant region in the heavy chain locus. CSR replaces only the constant region, so the V region stays the same, but the class switch gives the antibody the ability to interact with different effector molecules through its fragment crystallizable (Fc) region (Figure 5).
Unlike the parallel expression of IgM and IgD, the class switch is a chromosomal DNA rearrangement that leaves the affected B cell with only one antibody isotype. The process is guided by conserved switch (S) regions upstream of the heavy chain constant genes, which code for the respective constant domains. The switch regions are repetitive stretches of DNA located in the introns upstream of the C region genes [28, 42]. CSR is initiated by the enzyme activation-induced cytidine deaminase (AID), which also has an essential role in somatic hypermutation. This produces single-strand DNA breaks (nicks) in two switch regions, and the DNA between the two switch sites is irreversibly excised. The excised DNA always includes the μ and δ chain constant regions. The two DNA ends are joined by the non-homologous end joining (NHEJ) mechanism, which places the variable region next to the constant region of the chosen immunoglobulin isotype. Which isotype is produced is influenced by different cytokines secreted by T cells [43, 44].
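The irreversible deletion that underlies class switching can be pictured as removing the front of an ordered list. In the sketch below, the order of the mouse heavy chain constant genes is an assumption taken from general knowledge rather than from this chapter, co-expression of Cμ and Cδ in naive cells is ignored, and the function names are hypothetical.

```python
# Assumed 5'-to-3' order of the mouse heavy chain constant genes
# (not stated in the text; used here for illustration only).
CONSTANT_GENES = [
    "Cmu", "Cdelta", "Cgamma3", "Cgamma1",
    "Cgamma2b", "Cgamma2a", "Cepsilon", "Calpha",
]


def class_switch(locus, target):
    """Delete every constant gene upstream of `target` (irreversible)."""
    if target not in locus:
        raise ValueError(f"{target} is no longer present in the locus")
    return locus[locus.index(target):]


def expressed_isotype(locus):
    """The constant gene closest to the V region is the one expressed."""
    return locus[0]


switched = class_switch(CONSTANT_GENES, "Cgamma2b")
isotype = expressed_isotype(switched)  # "Cgamma2b"
```

In this model a cell that has switched to Cγ2b can still switch to a gene further downstream, but it can never return to Cμ or Cδ, because those genes have been excised from the chromosome.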
CSR is induced by the enzyme activation-induced cytidine deaminase (AID), which acts on the switch (S) regions of the respective constant region genes. The non-homologous end joining (NHEJ) machinery joins the chosen constant gene segment (here Cγ2b) to the V region. The constant region gene that then lies next to the V region is expressed together with the V(D)J gene sequence.