The immunoglobulin molecule is the key component of the B cell receptor (BCR), which governs the survival, differentiation and function of normal B lymphocytes; accumulating data suggest that, in chronic lymphocytic leukaemia (CLL), it is also involved in the pathogenesis and clinical course of the disease. CLL is a malignancy of mature CD5+ CD19+ CD23+ sIgMlow B lymphocytes and is characterized by an extremely heterogeneous clinical course, which varies from indolent to rapidly progressive. The somatic hypermutational status of immunoglobulin heavy chain variable genes (IGHV) defines two CLL subtypes, mutated (M‐CLL) and unmutated (U‐CLL). U‐CLL patients suffer from more aggressive disease, characterized by shorter time to treatment, progression‐free survival and overall survival in comparison to M‐CLL patients. Since these correlations are not dependent on the clinical stage and since there is no interconversion between subtypes, IGHV mutational status is currently the most reliable prognostic marker in CLL. Several lines of evidence indicate that both M‐CLL and U‐CLL arise from an antigen‐experienced cell of origin. Immunogenetic studies have revealed CLL‐biased usage of immunoglobulin variable region genes, as well as the existence of highly homologous, ‘stereotyped’ BCRs in CLL clones, strongly implying a role for antigenic drive in the development and evolution of the disease.
- chronic lymphocytic leukaemia
- B cell receptor
- immunoglobulin rearrangements
- gene repertoire
- BCR stereotypy
The central role that B lymphocytes play in immunity relies upon their capacity to produce a vast array of different immunoglobulin molecules which can recognize virtually limitless number of foreign and autoantigens. Immunoglobulins (IG) are expressed on the surface of B cells as antigen‐binding component of B cell receptor (BCR), in complex with CD79A/79B heterodimer responsible for signal transduction. During the immune response, IG molecules are secreted as antibodies which exert different effector functions. BCR signalling is crucial for survival, proliferation and differentiation of normal B lymphocytes, but has also been implicated in the pathogenesis of several mature B cell malignancies, including chronic lymphocytic leukaemia.
Chronic lymphocytic leukaemia (CLL) manifests as clonal expansion of mature CD5+ CD19+ CD23+ sIgMlow B lymphocytes which gradually accumulate in blood, bone marrow and secondary lymphoid organs. It is the most frequent type of leukaemia in Western countries, accounting for 30–40% of all adult leukaemia cases, while it is very rare in Asian and African countries. CLL affects predominantly elderly individuals, aged approximately 67–72 years at diagnosis, and men more frequently than women.
CLL is characterized by an extremely heterogeneous clinical presentation, with diverse therapy requirements and overall survival. In some patients, rapid progression and the need for treatment occur soon after diagnosis, while others may live for decades without developing any symptoms. The majority of cases, however, lie between these extremes; the disease can follow an indolent course for years but eventually turn into an aggressive form.
Aetiology of CLL is still elusive. Familial clustering of CLL has been documented, implying a strong genetic basis of the disease. The relative risk of CLL has been estimated to be around eight‐fold higher in first‐degree relatives . Genome‐wide association studies have identified multiple CLL susceptibility loci mapping to genes involved in apoptosis, BCR signalling, immune response and maintenance of chromosome integrity [4, 5].
A growing body of evidence indicates that CLL development and evolution result from the concerted action of intrinsic genetic abnormalities and extrinsic factors from the tissue microenvironment, including antigens. The most common chromosomal aberrations in CLL are deletion 13q14, trisomy 12q, deletion 11q22‐q23 and deletion 17p13, observed in approximately 80% of patients. The genes localized within minimally deleted/gained regions in these aberrations include miR‐15a and miR‐16‐1 (del13q), CDK4, GLI and MDM2 (trisomy 12), ATM (del11q) and TP53 (del17p), which are involved in regulation of apoptosis and DNA repair [8–10]. Recent next‐generation sequencing‐based studies have identified a number of recurrently mutated genes in CLL (e.g. NOTCH1, SF3B1, MYD88, BIRC3, NFKBIE, TP53 and ATM), predominantly belonging to the BCR, toll‐like receptor, Notch1 and NF‐κB signalling pathways [6, 11]. In addition, genetic alterations and aberrant expression of many apoptotic regulators involved in both mitochondrial and death receptor apoptotic pathways have been described in CLL, most notably overexpression of BCL2, detected in the majority of patients [12–14]. However, immunogenetic studies over the past few decades have pointed to antigenic drive on the BCR of the cell of origin as the key player, and possibly an initiating event, in CLL pathogenesis [15, 16].
The diversity of mechanisms involved in pathobiology of CLL cells is likely the basis of the clinical heterogeneity, making the prognostication for individual patients very difficult. Currently, the most important prognostic markers, widely used in routine clinical practice, are clinical stage (Rai and Binet) and cytogenetic aberrations . In an attempt to overcome the clinical variability and improve the prognosis assessment, particularly in early‐stage disease, a number of cellular and molecular prognostic markers have been identified and validated. Among the novel markers that have entered clinical practice (e.g. CD38 and ZAP‐70 expression, TP53 mutations), the most powerful one, in terms of prognosis definition, turned out to be the somatic hypermutational status of rearranged immunoglobulin heavy variable genes .
In this chapter, we will discuss the current concepts of immunoglobulin gene expression in chronic lymphocytic leukaemia, and its relevance for both the pathogenesis and clinical progression of the disease.
2. Immunoglobulin gene rearrangements and the development of B lymphocytes
2.1. Generation of immunoglobulin diversity
Immunoglobulin (IG) molecules are heterotetramers composed of two identical heavy (H) chains and two identical light (L) chains (κ or λ), linked by disulphide bonds. Both heavy and light chains contain an N‐terminal variable (V) region and a C‐terminal constant (C) region (Figure 1a). The juxtaposed variable regions of H and L chains (VH and VL) form the antigen‐binding site, whose structure determines the specificity and the affinity of immunoglobulin molecules for antigens. Constant regions are not involved in antigen recognition. The heavy chain constant region (CH) defines IG isotypes (IgA, IgD, IgE, IgG and IgM) and mediates effector functions of antibodies. In addition, the CH region is responsible for anchoring membrane‐bound IG in the plasma membrane of B cells. The variable region of each IG chain consists of four relatively conserved framework regions (FR1, FR2, FR3 and FR4) and three hypervariable complementarity‐determining regions (CDR1, CDR2 and CDR3). The CDR regions of H and L chains form six loops which create a surface that directly interacts with antigens. The heavy chain CDR3 region (VH CDR3) exhibits the highest variability and is the key determinant of antibody specificity.
IG molecules are encoded by a multitude of tandemly arranged gene segments that constitute IGH (heavy chain) locus, IGK and IGL locus (κ and λ light chains). Human IGH locus, located on chromosome 14q32.33, consists of four types of gene segments: V (variable), D (diversity), J (joining) and C (constant), in 5′–3′ orientation. There are 38–46 functional IGHV gene segments, which can be divided into 6–7 subgroups based on sequence homology, 23 functional IGHD gene segments, 6 functional IGHJ gene segments and 9 functional IGHC gene segments (Figure 2). Light chain loci, on the other hand, lack D segments. Human IGK locus (chromosome 2p11.2) contains a cluster of 34–38 functional IGKV gene segments which belong to 5 subgroups, followed by 5 IGKJ gene segments and a single C gene segment. Human IGL locus (chromosome 22q11.2) is composed of 29–33 functional IGLV gene segments, divided into 10 subgroups, and 4–5 functional IGLJ‐IGLC tandems . Allelic variants of many gene segments exist, particularly in the IGH locus. It should be noted that the actual number of gene segments in all three loci is much higher, due to the presence of pseudogenes and ORFs (open reading frames). In addition, the number of functional gene segments in a locus depends on the haplotype, since some genes can be inserted or deleted, or can be functional or pseudogene, depending on the allele.
The immunoglobulin variable region is generated by somatic recombination between V, D and J gene segments (H chains) and between V and J gene segments (L chains), which occurs during differentiation of B lymphocytes. At the IGH locus, which rearranges before the light chain loci, the first recombination event joins one of the IGHD gene segments to one of the IGHJ gene segments, and the sequence between the rearranged genes is deleted. The resulting IGHD‐IGHJ rearrangement then recombines with one of the IGHV gene segments, leading to the formation of a complete IGHV‐IGHD‐IGHJ rearrangement which will be fused to an IGHC gene (Cμ or Cδ) during RNA splicing and, ultimately, expressed at the cell surface as IgM or IgD. Productive rearrangement of one IGH locus inhibits the rearrangement of the IGH locus on the other chromosome (allelic exclusion), thus ensuring the monospecificity of the B lymphocyte. However, if the rearrangement of one allele is unproductive, the other one will undergo recombination and, if the second rearrangement fails, the cell will die by apoptosis. A similar recombination process occurs between V and J gene segments at the light chain loci. The IGK locus rearranges before IGL; successful recombination at one IGK allele inhibits the rearrangement of the other one (allelic exclusion), as well as the rearrangement of the IGL loci (isotypic exclusion). Alternatively, unproductive rearrangement of one IGK locus leads to recombination of the other allele and, if that is also unsuccessful, the IGL locus will rearrange. Once again, if none of the attempts results in a productive light chain rearrangement, the cell will undergo apoptosis.
Given the number of germline gene segments that can recombine at IG loci, as well as random pairing of heavy and light chains, it is clear that B lymphocytes can produce a vast number of different antibodies (‘combinatorial diversity’). However, the actual number of combinations is lower than the theoretical estimate of ∼1.6 × 10⁶, since not all gene segment recombinations occur with the same frequencies and not all IGH‐IGL pairs are functional. In addition, it has been shown that V(D)J recombination is not a stochastic process, but is determined by genetic factors and regulated during ontogeny.
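As a rough arithmetic check of the combinatorial estimate, the functional gene segment counts quoted above for the IGH, IGK and IGL loci can simply be multiplied. The sketch below is illustrative only: it assumes that every segment combination and every heavy/light pairing is equally possible (which, as noted, is not the case), and the function and variable names are ours rather than taken from any cited source.

```python
# Rough combinatorial diversity from the functional segment counts quoted in the text.
# Assumption: all segment combinations and all heavy/light pairings are allowed.

def combinatorial_diversity(ighv, ighd, ighj, igkv, igkj, iglv, iglj):
    heavy = ighv * ighd * ighj      # IGHV x IGHD x IGHJ combinations
    kappa = igkv * igkj             # IGKV x IGKJ combinations
    lam = iglv * iglj               # IGLV x IGLJ-IGLC combinations
    return heavy * (kappa + lam)    # each heavy chain pairs with a kappa or a lambda chain

# Lower and upper ends of the quoted ranges of functional gene segments
low = combinatorial_diversity(38, 23, 6, 34, 5, 29, 4)
high = combinatorial_diversity(46, 23, 6, 38, 5, 33, 5)
print(f"{low:.2e} to {high:.2e}")
```

With these inputs the product falls between roughly 1.5 × 10⁶ and 2.3 × 10⁶, the same order of magnitude as the theoretical estimate quoted above.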
The diversity of the primary antibody repertoire (the repertoire of naïve B cells) is further increased by ‘junctional diversity’. The process of somatic recombination is catalysed by several enzymes jointly called V(D)J recombinase and, although very precise, their action introduces variability at the junctions of V, (D) and J gene segments. Recombination is enabled by the presence of conserved recombination signal (RS) sequences which flank the 3′ end of V genes, the 5′ end of J genes and both ends of D genes. RS sequences, recognized by recombination activating gene 1 and 2 (RAG1 and RAG2) enzymes, ensure that light chain V genes can rearrange only with J genes, while IGHV genes can rearrange only with IGHD, and IGHD only with IGHJ genes. During this process, trimming of the ends of recombining gene segments by exonucleases occurs, as well as the addition of short palindromic sequences and non‐templated nucleotides (the latter catalysed by terminal deoxynucleotidyl transferase, TdT). The random addition and deletion of nucleotides during IGHV‐IGHD and IGHD‐IGHJ ligation creates two N regions (N1 and N2), and is the source of the extreme variability of VH CDR3, which is positioned at the VDJ junction (Figure 1b). Diversity of VH CDR3 in both length and amino acid sequence results in the production of a much larger IG repertoire than would be generated solely by combining germline gene segments (up to 10¹¹ different IGs).
Diversification of immunoglobulins continues after antigen encounter (secondary antibody repertoire) via somatic hypermutations and class‐switch recombination, generating B lymphocytes with enormously wide range of specificities (see next section).
2.2. B cell differentiation
B cell differentiation is a multi‐step process which can be divided into two phases: antigen‐independent phase, taking place in bone marrow (and fetal liver), followed by antigen‐dependent phase in secondary lymphoid organs.
The first stage of B cell development in bone marrow is early pro‐B cell, defined by the beginning of IGHD‐IGHJ recombinations. Joining of IGHV gene to IGHD‐IGHJ rearrangement occurs in late pro‐B cells and leads to transcription and synthesis of μ heavy chain, which contains IGHV‐IGHD‐IGHJ complex attached to Cμ. The expression of μ heavy chain defines the large pre‐B cell stage. The μ chain is predominantly cytoplasmic, but it can associate with surrogate light chains and, in complex with CD79A/CD79B, is transiently expressed at the cell surface as the pre‐BCR. Subsequently, the cell enters the small pre‐B stage in which rearrangements of light chain loci occur, enabling pairing of previously synthesized μ chain with IGK or IGL and, thus, assembly of IgM. Expression of surface IgM, as a part of BCR, marks the immature B cell. At this stage, self‐reacting clones are being eliminated, or their specificities may be changed via receptor editing and IGHV replacement . Immature B cells migrate to the spleen where they become mature naïve B cells. As a result of alternative splicing of IGH transcripts, which joins IGHV‐IGHD‐IGHJ gene to either Cμ or Cδ, these cells coexpress membrane‐bound IgM and IgD with the same antigen specificity.
Naïve B lymphocytes reside in secondary lymphoid organs (spleen, lymph nodes and mucosal lymphoid tissues) where they encounter various antigens. Engagement of BCR with a specific antigen gives rise to a cascade of signalling events that activate B cell, leading to proliferation of antigen‐specific clone and, ultimately, differentiation into antibody‐secreting plasma cells and memory cells. Based on the requirement for T cell help in activation of B lymphocytes, two types of response to antigen stimulation exist. Bacterial polysaccharides and lipopolysaccharides can directly activate B cells (T cell‐independent response), resulting in rapid IgM production. In contrast, the response to protein antigens is T cell‐dependent and requires the interaction of B cells with CD4+ T cells and antigen‐presenting cells. Upon T cell‐mediated activation, proliferating B cells migrate deep into lymphoid follicle, forming the structure called germinal centre. In a highly specialized microenvironment of germinal centres, B cells start to proliferate at high rate and undergo somatic hypermutations and class‐switch recombination .
The process of somatic hypermutation (SHM), mediated by activation‐induced cytidine deaminase (AID), introduces point mutations into the rearranged immunoglobulin loci at a rate 10⁶ times higher than the spontaneous mutation rate of other genes. The single base substitutions are localized in the variable region of heavy and light chains, while the constant region remains unaffected. They are preferentially targeted to specific hotspot motifs (RGYW and its reverse complement WRCY), with transitions predominating over transversions, and accumulate in both FRs and CDRs. Replacement mutations tend to be clustered in CDRs, since they alter the affinity of IGs to antigens. In FRs, on the other hand, replacement mutations, which could disrupt the basic IG architecture, are counter‐selected, and silent mutations are more frequent. The somatic hypermutation process can also introduce small insertions or deletions, although this is a rare event created by a mechanism different from AID‐mediated SHM.
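The hotspot motifs named above are written in IUPAC degenerate notation (R = A/G, Y = C/T, W = A/T), and WRCY is simply RGYW read on the opposite strand. A minimal sketch of how such motifs could be located in a nucleotide sequence is given below; the helper names and the example sequence are hypothetical and serve only as an illustration.

```python
# IUPAC codes used by the two motifs: R = A/G, Y = C/T, W = A/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "W": "AT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R", "W": "W"}

def reverse_complement(motif):
    """Reverse complement of a (possibly degenerate) motif."""
    return "".join(COMPLEMENT[base] for base in reversed(motif))

def find_hotspots(seq, motif):
    """Return 0-based start positions at which the degenerate motif matches seq."""
    seq = seq.upper()
    k = len(motif)
    return [i for i in range(len(seq) - k + 1)
            if all(seq[i + j] in IUPAC[motif[j]] for j in range(k))]

example = "CCAGCACCTGCTCC"  # made-up sequence, not a real IGHV gene

print(reverse_complement("RGYW"))      # WRCY
print(find_hotspots(example, "RGYW"))  # [2]  (AGCA)
print(find_hotspots(example, "WRCY"))  # [8]  (TGCT)
```

Scanning for both motifs on one strand is equivalent to scanning for RGYW on both strands.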
Accumulation of somatic hypermutations generates clonal progeny of activated B cell with diversified IG rearrangements and, hence, different affinity for antigen. These cells are subsequently subjected to selection by antigen: B cells that efficiently recognize antigen presented by follicular dendritic cells receive survival signals, provided by BCR engagement and T cell co‐stimulation, and continue to proliferate, while B cells that do not bind antigen or bind it with low affinity die by apoptosis. Multiple rounds of proliferation, somatic hypermutation and selection result in affinity maturation, i.e. production of B lymphocytes with increasing specificity and affinity for antigen. Along with affinity maturation, the cells undergo class‐switch recombination (also mediated by AID), which leads to fusion of IGHV‐IGHD‐IGHJ rearrangement to a downstream constant gene segment. This enables production of isotypes other than IgM and IgD, but with the same antigen specificity . Antigen‐selected B cells ultimately exit the germinal centre and finalize their differentiation into high‐affinity antigen‐specific plasma cells and memory cells, with specific effector functions.
Somatic hypermutations and class‐switch recombination further enhance immunoglobulin variability and, in combination with other sources of diversity (combinatorial and junctional diversity), enable formation of up to 10¹² possible antibody specificities. The potential of B cells to create such a huge IG repertoire, however, comes at a high cost since it causes a considerable wastage of cells along the pathway of their differentiation. The mechanisms responsible for variability of immunoglobulin rearrangements can also render them unproductive due to recombination of non‐functional pseudogenes, out‐of‐frame junctions, generation of stop codons at the junctions, as well as introduction of frameshifts and stop codons by SHM. In addition, replacement mutations induced by the SHM process can impair the structure of the immunoglobulin molecule or lower its affinity for antigen. As mentioned above, B cells that fail to generate productive heavy‐ and light chain rearrangements and produce functional antibodies undergo apoptotic cell death.
3. Immunoglobulin gene rearrangements in CLL
3.1. IGHV mutational status
The extreme clinical heterogeneity of chronic lymphocytic leukaemia has inspired an extensive search for molecular and cellular markers with the prognostic and predictive value. Immunoglobulin rearrangements of CLL clones were brought into the spotlight upon the findings that, in around 50% of CLL patients, heavy chain rearrangements carry somatic hypermutations, and that SHM status of rearranged IGHV genes significantly correlates with the clinical course of the disease. Patients with unmutated IGHV‐IGHD‐IGHJ rearrangements are usually in advanced clinical stages, have progressive disease, atypical morphology and require chemotherapy soon after diagnosis. In contrast, patients with mutated IGHV‐IGHD‐IGHJ rearrangements predominantly present with non‐progressive disease, typical morphology, require no or minimal chemotherapy and have significantly longer time to first treatment, progression‐free survival and overall survival [29–33]. These correlations have been confirmed in multiple studies, and today, it is widely accepted that CLL can be divided into two subtypes, mutated (M‐CLL) and unmutated (U‐CLL), with different clinical outcome. The IGHV mutational status turned out to be the strongest independent prognostic marker whose value, inter alia, lies in the fact that it does not change over time and that it can predict the clinical behaviour of CLL at the time of diagnosis as well as at any stage of the disease (i.e. regardless of the tumour burden).
The cut‐off currently used to distinguish M‐CLL from U‐CLL is 98% identity between the rearranged IGHV gene and its germline counterpart (calculated from codon 1 to codon 104); cases with ≥98% identity are considered unmutated, while those with <98% identity are considered mutated [34, 35]. This cut‐off was originally chosen in order to eliminate the possibility of interpreting allelic polymorphisms as somatic mutations. Although in some studies other cut‐off values (97% and 95%) allowed better separation of the two prognostic groups, 2% of somatic mutations is generally accepted as the best discriminator between mutated and unmutated cases [36–38]. However, since this level of mutations is an arbitrary cut‐off, caution is recommended when interpreting the prognostic implications in cases with borderline mutational status. Indeed, it has been demonstrated that the group of patients with borderline mutated rearrangements (97–97.9% identity) comprised cases with both poor and good prognosis [38, 39]. In addition, sequencing of the unrearranged IGH genes in patients with a high percentage of identity (98–99.6%) revealed that the divergence of the rearranged IGHV gene from the closest germline gene, even in this group, is actually due to somatic hypermutation, further underscoring the statistical, rather than biological, rationale for the 98% cut‐off. However, the fact that median survival does not differ between patients with 100% identity and those with 99% or 98% identity, but is significantly shorter than that of patients with <98% identity, justifies the application of the 98% cut‐off in clinical practice. Finally, it should be noted that the absence of correlation between IGHV mutational status and prognosis in a proportion of patients can be attributed, at least in some cases, to other factors that influence the clinical outcome (see below).
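The classification rule described above can be sketched in a few lines. This is a simplified illustration, not a clinical tool: it assumes that the rearranged and germline IGHV sequences have already been aligned over the same region without gaps (in practice, dedicated immunogenetics software performs the alignment and germline assignment), and the function names are hypothetical.

```python
def ighv_identity(rearranged, germline):
    """Percent identity between two pre-aligned, equal-length nucleotide strings."""
    if len(rearranged) != len(germline) or not germline:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    matches = sum(a == b for a, b in zip(rearranged.upper(), germline.upper()))
    return 100.0 * matches / len(germline)

def classify_ighv(identity_pct, cutoff=98.0):
    """Apply the >=98% (unmutated) / <98% (mutated) rule.

    The second return value flags the 97-97.9% range, which the text
    describes as borderline and prognostically heterogeneous.
    """
    status = "unmutated" if identity_pct >= cutoff else "mutated"
    borderline = 97.0 <= identity_pct < cutoff
    return status, borderline

# Toy example: 100-nt made-up germline with three substitutions in the rearrangement
germline = "ACGT" * 25
rearranged = "TTT" + germline[3:]
identity = ighv_identity(rearranged, germline)
print(identity, classify_ighv(identity))
```

Returning the borderline flag separately from the status keeps the conventional two-class result intact while making it easy to mark cases that call for cautious interpretation.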
Besides the borderline cases, clinical prognostication can be challenging in cases carrying double IGHV‐IGHD‐IGHJ rearrangements. In the majority of these cases only one rearrangement is productive, but in rare instances (up to 5% of cases), double productive rearrangements can be detected [41, 42]. Expression of double productive rearrangements may be the result of the lack of allelic exclusion, which has been described in CLL B cells or, alternatively, double (or multiple) productive rearrangements originate from different CLL clones [41, 43]. If both rearrangements are of the same mutational status, prognostic interpretation is straightforward regardless of whether both or just one rearrangement is productive. The cases with productive mutated and unproductive unmutated IGHV‐IGHD‐IGHJ rearrangements are considered mutated, since the productive rearrangement is relevant for the biology of CLL cells. However, if double productive rearrangements are of discordant mutational status or if unmutated rearrangement is productive while the mutated rearrangement is unproductive (implying that the cell has undergone the SHM process), the clinical implications currently cannot be predicted .
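The interpretation rules for double rearrangements outlined above can be summarized as a small decision function. The sketch below encodes only the cases described in the text; the type and function names are hypothetical, and any combination not covered by the text is reported as undetermined.

```python
from typing import NamedTuple

class Rearrangement(NamedTuple):
    productive: bool
    mutated: bool  # True if identity to germline is below the 98% cut-off

def interpret_double(a: Rearrangement, b: Rearrangement) -> str:
    """Assign a mutational status to a case with two IGHV-IGHD-IGHJ rearrangements."""
    # Concordant status: straightforward, regardless of productivity
    if a.mutated == b.mutated:
        return "mutated" if a.mutated else "unmutated"
    productive = [r for r in (a, b) if r.productive]
    # Productive mutated + unproductive unmutated: considered mutated
    if len(productive) == 1 and productive[0].mutated:
        return "mutated"
    # Discordant double productive, or productive unmutated + unproductive mutated
    return "undetermined"

print(interpret_double(Rearrangement(True, True), Rearrangement(False, False)))
print(interpret_double(Rearrangement(True, False), Rearrangement(False, True)))
```

The first call corresponds to a productive mutated plus unproductive unmutated pair, the second to the reverse situation, for which the text states that clinical implications cannot currently be predicted.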
The association of IGHV mutational status with other prognostic markers in CLL has been extensively studied. Besides contributing to a better understanding of the disease biology, this research also aimed at finding a potential surrogate marker that could replace the laborious IGHV mutational analysis in clinical practice. The four most frequent clonal chromosomal aberrations (del13q, del11q, trisomy 12q and del17p) represent strong independent prognostic markers and are differentially distributed between M‐CLL and U‐CLL [7, 45, 46]. The aberrations with adverse prognostic impact (del11q, trisomy 12q and del17p) are associated predominantly with unmutated IGHV‐IGHD‐IGHJ rearrangements, while favourable del13q is more frequent in mutated cases [36, 37, 47–49]. Furthermore, the unmutated CLL subtype is characterized by a high risk of acquiring adverse chromosomal aberrations during the disease course. In contrast to cytogenetic abnormalities, the association of CD38 and ZAP‐70 with IGHV mutational status is less consistent. The expression of CD38 on the surface of >30% of leukemic cells is an independent negative prognostic factor associated with progressive disease, shorter time to first treatment and shorter overall survival, although the level of expression may vary over time [29, 51–53]. In some studies, CD38 positivity was strongly correlated with unmutated IGHV status, while others failed to detect any association, regardless of the cut‐off level used for defining CD38 status [29, 36, 49, 54]. Similarly to CD38, the expression of zeta‐chain‐associated protein kinase 70 (ZAP‐70) is also an independent negative prognostic marker associated with adverse clinical characteristics and poor prognosis [55–59]. Initially, in many studies, ZAP‐70 was found to be expressed predominantly in unmutated CLL and was suggested as a surrogate marker for IGHV mutational status; however, subsequent research revealed a substantial discordance between these two markers [49, 55, 57, 60–63].
The expression of several other genes has been reported to exert a strong prognostic value, qualifying them as potential biomarkers. Among those RNA‐based markers, lipoprotein lipase (LPL) emerged as the most powerful one, whose high expression level correlates with advanced clinical stage, shorter time to first treatment and overall survival, as well as with other adverse prognostic parameters (short lymphocyte doubling time, ZAP‐70 and CD38 positivity, poor‐risk cytogenetics) [64–70]. Moreover, LPL expression turned out to be a potent predictor of IGHV mutational status, as high levels of LPL were found to be strongly associated with unmutated IGHV‐IGHD‐IGHJ rearrangements [55, 60, 64–67, 71].
To conclude, despite certain limitations, IGHV mutational status analysis is currently the gold standard for CLL prognostication and has been introduced into clinical practice in many centres. It is integrated into the most advanced prognostic scoring systems suggested for risk stratification of CLL patients [72–75].
3.2. Immunoglobulin variable region gene repertoire in CLL
The analyses of immunoglobulin heavy chain rearrangements in CLL revealed not only that IGHV, IGHD and IGHJ gene usage in CLL B lymphocytes is distinct from that of normal peripheral blood B cells, but also that the gene repertoires of U‐CLL and M‐CLL clones differ significantly.
The most commonly used IGHV subgroup in CLL rearrangements is IGHV3 (as is the case with normal B cells), followed by IGHV1 and IGHV4. However, the comparison of IGHV subgroup usage between CLL and normal B cells showed that there is a significant over‐representation of IGHV1 subgroup, as well as underrepresentation of IGHV3 subgroup in CLL [33, 76–80]. In addition, the frequencies of IGHV subgroups are different in the two CLL subtypes: IGHV1 genes are present predominantly in the rearrangements of U‐CLL clones, in contrast to IGHV3 and IGHV4 genes that predominate in M‐CLL clones. Moreover, a hierarchy in the SHM level among IGHV subgroups has been documented: IGHV3 and IGHV4 genes show a high mutational load while IGHV1 genes carry very few mutations (IGHV3 > IGHV4 > IGHV1) [30, 33, 80].
A strong bias in usage of individual IGHV genes has also been detected. In most studies, only 6–7 IGHV genes were utilized in more than 50% of CLL IGHV‐IGHD‐IGHJ rearrangements. The most frequently used IGHV genes were IGHV1‐69, IGHV3‐23, IGHV3‐7 and IGHV4‐34, followed by several others (IGHV3‐30, IGHV3‐30.3, IGHV3‐48, IGHV1‐2, IGHV1‐3, IGHV1‐18, IGHV4‐39 and IGHV4‐59), depending on the cohort [30, 33, 79–82]. It should be noted, though, that normal B cell repertoire is not random, and that certain genes (such as IGHV3‐23, IGHV3‐7 and IGHV3‐30.3) are overused . Hence, some of the most common IGHV genes in CLL are represented with frequencies similar to those of normal B cells [33, 76, 79]. However, CLL‐related over‐representation of IGHV1‐69 has been consistently reported, as well as its predominance in unmutated rearrangements. On the other hand, IGHV3‐23, IGHV3‐7, IGHV4‐34 and IGHV3‐48 are the most frequently used genes in mutated rearrangements. The differences in the mutational load, observed for IGHV subgroups, are even more evident when individual genes are considered. For example, IGHV1‐69 gene usually harbours no or just a few somatic mutations, whereas IGHV3‐7, IGHV3‐23 and IGHV4‐34 genes are highly mutated [30, 33, 78–80, 82].
The majority of CLL IGHV‐IGHD‐IGHJ rearrangements contain IGHJ4 and IGHJ6 genes; IGHJ6 gene is predominantly used in unmutated rearrangements, in contrast to IGHJ4, which is over‐represented in mutated rearrangements. Since IGHJ6 is the longest IGHJ gene, this results in significantly longer median VH CDR3 lengths of unmutated vs. mutated rearrangements [30, 33, 80].
Besides the biased usage of IGH subgroups and individual genes in CLL, early studies of CLL immunoglobulin repertoire have also revealed the over‐representation of certain IGHV‐IGHD‐IGHJ combinations. For example, IGHV1‐69 was frequently found in combination with IGHJ6 and IGHD3‐3 or IGHD2‐2, creating VH CDR3 longer than the average, which is not common in rearrangements of normal B cells [33, 83, 84]. In contrast, the majority of IGHV3‐7 genes were found to be combined with IGHJ4 and IGHD3 yielding shorter VH CDR3, while IGHV4‐34 was associated with both IGHJ4 and IGHJ6 genes . These findings pointed to the CLL‐biased VH CDR3 features and laid the foundations of the stereotyped B cell receptor concept (see below).
Geographical and ethnic differences in IGHV gene usage in CLL rearrangements have also been reported [79, 82, 85–89]. For example, the IGHV3‐21 gene has been detected in IGHV‐IGHD‐IGHJ rearrangements of more than 11% of Scandinavian patients, while it was less frequent in the UK (7.9%) and very rare in Mediterranean cohorts (less than 3% of cases) [79, 90–93]. In addition, IGHV1 genes have been shown to be represented with lower frequencies, and IGHV4 genes with higher frequencies, in CLL clones of patients from Asian countries in comparison to patients from Western populations [94–96].
The light chain variable region gene repertoire in CLL has been substantially less studied but, nevertheless, some similarities with the repertoire of heavy chains have been observed. The ratio of expressed κ and λ light chains in CLL B lymphocytes mirrors that of normal B cells (2:1) . As is the case with IGH rearrangements, roughly 50% of IGK/IGL rearrangements belong to the mutated subtype and, in most cases, IGH and IGK/IGL rearrangements are of the same mutational status . A skewed usage of IGKV/IGLV and IGKJ/IGLJ subgroups and individual genes has been reported, but the interpretations of whether their relative frequencies differ from those of normal B cells are discrepant, probably due to different normal control datasets used for comparison. Similar to IGHV, the distribution of individual IGKV and IGLV genes between mutated and unmutated rearrangements is asymmetrical and, for some genes, CLL‐biased. In addition, certain IGKV‐IGKJ and IGLV‐IGLJ combinations are over‐represented and CLL‐related [97–99]. Importantly, non‐stochastic pairing of heavy and light chains has been detected and shown to depend on VH CDR3 motifs . Since preferential pairing of specific IGHV and IGKV or IGLV genes has not been observed in normal B cell repertoire, biased usage of certain VH CDR3/VL CDR3 associations strongly implies that the expression of BCRs with specific antigen‐binding characteristics is favoured in CLL [101, 102].
The usage of particular IGHV genes has been found to correlate with clinical course of CLL. The most striking example is IGHV3‐21 gene, which emerged as an adverse prognostic factor regardless of the IGHV mutational status. IGHV3‐21 is expressed in both CLL subtypes, but predominantly in M‐CLL. However, median overall survival of patients expressing mutated IGHV3‐21 rearrangements was found to be significantly shorter than median survival of non‐IGHV3‐21 mutated patients, and comparable to the survival of unmutated cases [90, 91, 103, 104]. Other IGHV genes also exhibited association with certain clinical characteristics; for example, IGHV3‐23 has been indicated as a marker of worse prognosis within M‐CLL subtype, IGHV3‐72 is over‐represented in highly stable CLL, and IGHV3‐30 has been linked to spontaneous regression [105–107]. The associations of IG repertoire with clinicobiological features of CLL will be further discussed in the next section, in the context of BCR stereotypy.
3.3. BCR stereotypy
The discovery that CLL includes patients with both mutated and unmutated IGHV‐IGHD‐IGHJ rearrangements was the first evidence pointing towards a role of antigens in the pathogenesis of the disease. The presence of somatic hypermutations (SHM) and a higher replacement/silent mutation (R/S) ratio in VH CDRs than in FRs indicate that M‐CLL cells have undergone germinal centre reactions and been selected by T cell‐dependent antigen. Consequently, due to the lack of SHM in their IGH rearrangements, U‐CLL cells were initially thought to originate from naïve B lymphocytes. However, further studies revealed that both U‐CLL and M‐CLL cells express a highly restricted, non‐random immunoglobulin repertoire. The CLL‐biased representation of certain IGHV genes and IGHV‐IGHD‐IGHJ combinations, as well as VH CDR3 characteristics, implies the recognition of a limited set of antigens, suggesting that CLL clones, both mutated and unmutated, derive from activated B cells. In the case of U‐CLL, the cell of origin could have been activated either by T cell‐independent antigens and autoantigens outside germinal centres or by antigens that select against SHM. The high R/S ratio in VH CDR3 of minimally mutated U‐CLL rearrangements (<2% mutations) further argues in favour of an antigen‐driven process, since even a single mutation can significantly enhance the antigen‐binding affinity of the BCR and, hence, be selected for. In keeping with these observations, studies of gene expression profiles and surface phenotypes showed that both M‐CLL and U‐CLL cells exhibit characteristics of antigen‐experienced B lymphocytes [60, 109, 110]. Finally, the most compelling evidence for the involvement of antigen in the development of CLL comes from the discovery of ‘stereotyped’ B cell receptors.
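The R/S ratio mentioned above can be illustrated with a minimal Python sketch. It assumes two already aligned, in‐frame nucleotide sequences (germline and rearranged) and counts each differing codon once; the function name `count_r_s` and the example sequences are invented for illustration and are not taken from the cited studies. In practice the counts would be made separately over CDR and FR codon ranges and then compared.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_r_s(germline, rearranged):
    """Return (replacement, silent) counts for two aligned, in-frame sequences.

    Simplification (assumption): each differing codon is counted once,
    even if it carries more than one nucleotide change.
    """
    if len(germline) != len(rearranged) or len(germline) % 3:
        raise ValueError("sequences must be aligned, equal in length and in frame")
    replacement = silent = 0
    for pos in range(0, len(germline), 3):
        g, r = germline[pos:pos + 3], rearranged[pos:pos + 3]
        if g == r:
            continue  # codon unchanged
        if CODON_TABLE[g] == CODON_TABLE[r]:
            silent += 1  # same amino acid: silent mutation
        else:
            replacement += 1  # different amino acid: replacement mutation
    return replacement, silent


# Invented four-codon example (not a real IGHV sequence): A R D Y -> A R E F
r, s = count_r_s("GCTAGAGATTAC", "GCCAGAGAATTC")
print(r, s)  # 2 1
```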
Following the initial findings on IG gene repertoire and VH CDR3 restrictions, multiple studies observed that a proportion of unrelated CLL patients express highly homologous, almost identical BCRs (stereotyped BCRs) [42, 79, 82, 85, 111–115]. Stereotyped BCRs have been detected in both CLL subtypes, although with higher frequency in U‐CLL. Closely related BCRs have been clustered into stereotyped subsets. As the number of cases investigated in these studies increased, the number of identified stereotyped subsets grew, reaching several hundred. However, the proportion of cases which could be assigned to stereotyped subsets did not exceed ∼30%, regardless of the cohort size. In the largest study conducted to date, which included >7000 CLL patients, 19 subsets accounted for 41% of the stereotyped cases (major subsets) and 12% of the total cohort; other stereotyped subsets accounted for 18% of cases, while the remaining 70% of cases were heterogeneous, i.e. did not belong to any of the stereotypes.
The criteria initially adopted for stereotyped subset definition included the usage of the same IGHV, IGHD and IGHJ gene and IGHD reading frame, as well as VH CDR3 amino acid sequence identity ≥60% [111, 113]. However, it soon became apparent that different IGHV genes (although with substantial sequence similarity) could generate highly homologous VH CDR3s if recombined with the same IGHD and IGHJ genes. In addition, introduction of somatic hypermutations could lead to convergence of VH CDR3 sequences encoded by different IGHV genes [115, 117]. Therefore, a revised set of criteria for clustering of IGH rearrangements into stereotyped subsets has been developed, which included additional parameters: (1) the presence of IGHV genes of the same phylogenetic clan, (2) identical VH CDR3 length and a unique amino acid motif at the exact position within VH CDR3, (3) VH CDR3 amino acid identity >50% and similarity >70%. Conserved amino acid motifs which define a subset can encompass almost the entire VH CDR3 sequence (e.g. subsets #6 and #10) or, alternatively, can involve just a few, or even just one, critical amino acid residue (e.g. subset #2). Furthermore, in some subsets, the conserved motifs are encoded solely by specific IGHD‐IGHJ combinations (e.g. subsets #3, #5 and #8), while in others, conserved amino acids are located in junctional N1 and N2 regions (e.g. subsets #4, #16, #77 and #201). A strong bias in the usage of individual IGHV genes in stereotyped BCRs has been detected, since only a few genes (IGHV1‐69, IGHV1‐2, IGHV1‐3, IGHV3‐21, IGHV4‐34 and IGHV4‐39) are expressed in around 80% of clustered cases, while IGHV3‐7, IGHV3‐23, IGHV3‐30 and IGHV3‐33, though frequent in CLL, are virtually absent from stereotyped subsets. In addition, the majority of subsets exhibit restricted light chain usage with subset‐biased κ and λ CDR3 motifs, thus evidencing the significant role of light chains in antigen‐binding specificities of stereotyped BCRs.
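Part of the revised criteria (identical VH CDR3 length, identity >50% and similarity >70%) can be sketched in Python as follows. This is only an illustration: the amino acid similarity classes, the function names and the example sequences are assumptions introduced here, not the definitions used in the cited studies, and the IGHV clan and shared‐motif criteria are deliberately left out.

```python
# Assumed physicochemical classes used to score "similarity";
# the cited studies may group residues differently.
SIMILARITY_CLASSES = ["AVLI", "DE", "KRH", "FWY", "ST", "NQ", "CM", "G", "P"]
CLASS_OF = {aa: idx for idx, group in enumerate(SIMILARITY_CLASSES) for aa in group}


def identity(a, b):
    """Fraction of positions with identical residues (equal-length sequences)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity(a, b):
    """Fraction of positions whose residues fall in the same assumed class."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(CLASS_OF[x] == CLASS_OF[y] for x, y in zip(a, b)) / len(a)


def may_cluster(a, b):
    """Check only the length, identity (>50%) and similarity (>70%) criteria."""
    if len(a) != len(b):
        return False
    return identity(a, b) > 0.50 and similarity(a, b) > 0.70


# Invented VH CDR3 sequences, for illustration only.
print(may_cluster("ARDANGMDV", "ARDQNGMDV"))   # True: 8/9 identical
print(may_cluster("ARDANGMDV", "VKEANGMEI"))   # False: similar, but only 4/9 identical
```

The second pair shows why both thresholds matter: every position is conservatively substituted, so similarity is complete, yet identity falls below 50% and the pair is rejected.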
Most of the major subsets are characterized by exclusively mutated or unmutated rearrangements, while several of them (e.g. subsets #1, #2 and #99) can be detected among both M‐CLL and U‐CLL clones [115, 117]. The characteristics of the most frequent major stereotyped subsets are shown in Table 1.
| Subset | Mutational status | IGHV | IGHD | IGHD RF | IGHJ | VH CDR3 length | VH CDR3 pattern* | IGKV/IGLV |
|---|---|---|---|---|---|---|---|---|
| #2 | mostly M | V3‐21 | no D | | J6 | 9 | [AVLI].[DE]...M[DE]. | LV3-21 |
| #4 | M | V4‐34 | D5-5, D4-17, D3-10 | 1, 3, 3 | J6 | 20 | [AVLI]RG.......[KRH]RYYYYG.[DE]. | KV2-30 |
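The VH CDR3 patterns in Table 1 can be read as regular expressions, which gives a simple way to test a sequence against a subset motif. The sketch below is a hedged illustration: each "." is taken to stand for any single residue, an interpretation that reproduces the listed VH CDR3 lengths (9 and 20); the dictionary `SUBSET_PATTERNS`, the function `matching_subsets` and the test sequences are invented here. A motif match alone does not amount to subset assignment, which also depends on the gene usage criteria described above.

```python
import re

# Subset -> (VH CDR3 length, motif written as a regular expression).
# Runs of '.' (any residue) are written out in full.
SUBSET_PATTERNS = {
    "#2": (9, r"[AVLI].[DE]...M[DE]."),
    "#4": (20, r"[AVLI]RG.......[KRH]RYYYYG.[DE]."),
}


def matching_subsets(cdr3):
    """Return the subsets whose length and motif both fit the given VH CDR3."""
    return [
        name
        for name, (length, pattern) in SUBSET_PATTERNS.items()
        if len(cdr3) == length and re.fullmatch(pattern, cdr3)
    ]


# Invented amino acid sequences, for illustration only.
print(matching_subsets("ARDANGMDV"))             # ['#2']
print(matching_subsets("ARGYGDTPTIRRYYYYGMDV"))  # ['#4']
print(matching_subsets("ARDANGMKV"))             # []
```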
Extensive research on BCR stereotypy revealed a consistent association of certain stereotyped subsets with clinicobiological features of patients. It is well known that proliferation and survival of CLL cells rely on BCR signalling, along with signalling via other surface receptors which transduce signals from the microenvironment, since they rapidly undergo apoptosis when cultivated in vitro [16, 119]. The differences in aggressiveness of M‐CLL and U‐CLL clones have been attributed, at least in part, to their different BCR signalling capacity; CLL cells with unmutated BCRs have been shown to respond more avidly to sIgM cross‐linking and express higher levels of BCR target genes than M‐CLL cells, which are more anergic [120–123]. However, it has been observed that patients belonging to specific stereotyped subsets follow a different clinical course from patients assigned to other subsets, even if expressing the same IGHV gene and having the same IGHV mutational status [82, 93, 124, 125]. The culprit for these subset‐related clinical distinctions could be the stereotyped BCR itself, since differences in antigen reactivity and signalling capacity of BCRs belonging to certain subsets have been detected. For example, it has been demonstrated that subset #1 and #2 primary B cells were significantly less responsive to antigenic stimulation in vitro than subset #8 cells. Additionally, a subset‐specific distribution of prognostically significant chromosomal aberrations (del13q, del11q, trisomy 12q and del17p), as well as of recurrent mutations in genes frequently mutated in CLL (TP53, BIRC3, MYD88, NOTCH1 and SF3B1), has been reported, further underscoring the differences between stereotyped subsets [127, 128].
As mentioned in the previous section, the usage of the IGHV3‐21 gene has been identified as a factor of poor prognosis independent of IGHV mutational status in several studies. However, subsequent research revealed that this was only true for a proportion of cases, which turned out to belong to subset #2. Subset #2 (IGHV3‐21/IGLV3‐21) is the largest among stereotyped subsets, detected in both U‐CLL and M‐CLL, and associated with del11q, del13q, CD38 expression and SF3B1 mutations [124, 127, 128]. It has been found that IGHV3‐21‐utilizing cases assigned to subset #2, whether mutated or not, follow an aggressive clinical course, while cases carrying IGHV3‐21 in heterogeneous BCRs have a variable clinical course which correlates with IGHV mutational status [79, 85, 129].
Subset #1 (IGHV1/5/7/IGKV1(D)‐39) is the second largest stereotyped subset, mostly unmutated, and also associated with aggressive disease and adverse prognosis. Recent studies revealed a significant enrichment of this subset for TP53 defects (del17p and/or TP53 mutations), trisomy 12q and NOTCH1 mutations [128, 130]. In addition, subset #1 B cells exhibited a higher proliferation rate following in vitro BCR ligation with anti‐IgM antibodies than non‐subset #1 unmutated B cells. Similarly to subset #2, cases assigned to subset #1 have a worse prognosis than unclustered cases using the same IGHV genes [82, 85, 130].
The aforementioned subset #8 (IGHV4‐39/IGKV1(D)‐39) is associated with the highest risk of Richter’s transformation among all CLL cases. In addition to broad polyreactivity and a higher capacity for BCR signalling compared to subsets #1 and #2, the observed association with trisomy 12q and enrichment for NOTCH1 mutations likely contribute to the aggressiveness of subset #8 clones [124, 128].
In contrast to the clinically aggressive subsets #1, #2 and #8, subset #4 (IGHV4‐34/IGKV2‐30), the largest within the M‐CLL subtype, is associated with younger age at diagnosis and a remarkably indolent clinical course in comparison to non‐subset #4 IGHV4‐34 cases, as well as to all other M‐CLL cases [42, 82]. Subset #4 is characterized by CD38 negativity, the lack of recurrent gene mutations and the presence of the favourable deletion 13q14 as the only recurrent chromosomal abnormality [127, 128]. Gene expression profiling and in vitro antigenic stimulation of subset #4 leukaemic cells revealed a diminished response to BCR‐mediated signalling and a resemblance to anergic B cells, which probably underlie the indolent phenotype of subset #4 patients [132, 133].
Given that the mathematical probability of two independent B cells creating identical IG rearrangements is virtually negligible, the existence of stereotyped BCRs is considered to be the strongest evidence for recognition of common antigens leading to selection of the CLL clones. This implies that BCR reactivity and intensity of response to antigenic stimulation, as well as the frequency of exposure to antigens, could determine the behaviour of CLL clones and, hence, the course of the disease. Similar clinical characteristics of cases belonging to the same stereotyped subset corroborate this notion. Therefore, BCR stereotypy could potentially become a reliable prognostic marker for at least a proportion of patients. However, most of the clinical variability in CLL is confined to cases with heterogeneous BCRs, for whom prognostication remains dependent on IGHV mutational status and other molecular markers.
4. Concluding remarks
Although the cellular origin of CLL is still a controversial issue, immunogenetic studies of the BCR gene repertoire have provided unequivocal evidence that the CLL precursor, in both the M‐CLL and U‐CLL subtypes, is an antigen‐experienced B lymphocyte. Studies of antigen reactivity have revealed that U‐CLL cells generally express low‐affinity polyreactive BCRs that recognize microbial antigens and autoantigens present on the surface of apoptotic cells (single‐ and double‐stranded DNA, cytoskeletal proteins, oxidized LDL and lipopolysaccharides) [135–139]. B cell receptors of M‐CLL cells, on the other hand, exhibit more restricted antigen specificities and are mainly oligo‐ and monoreactive. Auto‐reactivity has been demonstrated for several stereotyped subsets. For example, it has been observed that subset #6 (IGHV1‐69/IGHD3‐16/IGHJ3) antibodies bind non‐muscle myosin heavy chain IIA, exposed on apoptotic cells, while subset #1 (IGHV1/5/7/IGKV1(D)‐39) recognizes oxidized LDL, as well as vimentin and calreticulin on stromal cells [137, 140, 141]. Furthermore, analysis of the IGHV‐IGHD‐IGHJ sequence of subset #4 (IGHV4‐34/IGKV2‐30) has indicated similarities with anti‐DNA antibodies, as well as the binding of N‐acetyllactosamine, which is a common epitope present on various autoantigens (I/i blood group antigen, B cell isoform of CD45) and microorganisms. The recognition of bacterial and viral antigens by CLL BCRs is further supported by the association of persistent infections with Epstein‐Barr virus and cytomegalovirus with subset #4, and hepatitis C virus with subset #13 (IGHV4‐59/IGKV3‐20), the latter exhibiting rheumatoid factor activity [143, 144]. The unmutated IGHV1‐69‐utilizing BCRs have been shown to react with hepatitis C, HIV‐1 and intestinal commensal bacteria antigens.
In addition, reactivity against the capsular polysaccharides of Streptococcus pneumoniae has been detected, which is in agreement with the observed association of respiratory tract infections with elevated risk of CLL [137, 146]. Fungal antigens have also been implicated in CLL, following the finding that mutated IGHV3‐7/IGKV2‐24 BCRs recognize β‐(1,6)‐glucan, an antigenic determinant of yeast and filamentous fungi.
Whatever the antigens might be, they clearly play a key role in the natural history of CLL. However, the major unanswered questions concern the moment in the disease development at which BCR‐antigen interaction occurs, and to what extent the nature of this interaction influences the disease progression. Stimulation by auto‐ and/or exo‐antigen may be limited to phases prior to or during malignant transformation, leading to the selection and clonal expansion of a precursor cell with the distinctive BCR, during which it acquires the oncogenic hit and becomes a CLL cell. Yet, it is still unclear whether antigenic stimulation continues after transformation. Several studies have investigated whether CLL cells accumulate somatic hypermutations post‐transformation, and have detected extensive intraclonal diversification in cases assigned to stereotyped subset #4 (but not in subsets #2, #8 and #16 and heterogeneous BCRs), implying an on‐going antigenic triggering in this subset [149, 150]. In addition, gene expression profiling of CLL cells from lymph nodes has revealed up‐regulation of BCR target genes, thus indicating continual antigenic stimulation. The fundamental role of BCRs in CLL is underscored by the success of newly developed therapeutic strategies targeting BCR signalling pathways (BTK, PI3K and SYK inhibitors) [151–154].
The configuration of the BCR expressed on the surface of the CLL clone represents its specific molecular signature, which does not change during the disease course. Hence, it is reasonable to believe that, in addition to IGHV mutational status, information about the clonotypic BCR will in future become important for individual patient prognostication and, ultimately, will contribute to the tailoring of patient‐specific treatment modalities.
This work was supported by Ministry of Education, Science and Technological Development, Republic of Serbia (Grant No. III41004).