Open access peer-reviewed chapter

Proteoforms in Acute Leukemia: Evaluation of Age- and Disease-Specific Proteoform Patterns

By Fieke W. Hoff, Anneke D. van Dijk and Steven M. Kornblau

Submitted: April 13th 2019. Reviewed: October 30th 2019. Published: December 19th 2019.

DOI: 10.5772/intechopen.90329

Abstract

Acute leukemias are a heterogeneous group of malignant diseases of the bone marrow that occur at all ages. Acute lymphoblastic leukemia (ALL) accounts for about 80% of all pediatric leukemia cases, whereas acute myeloid leukemia (AML) is more common in adults than in children. Despite similar patterns in the pathogenesis of acute leukemia in children and adults, clinical outcome in response to therapy differs substantially. Studying proteoforms in acute leukemia in children and adults might identify similarities and differences in crucial signaling pathways that play a key role in the development or progression of the disease. In this chapter we discuss how the study of proteoforms in acute leukemia could contribute to a better understanding of leukemogenesis, help to identify effective targets for targeted treatment approaches in different subgroups of age and disease, and aid the development of reliable biomarkers for prognostic stratification.

Keywords

  • acute myeloid leukemia
  • acute lymphoblastic leukemia
  • proteoforms
  • RPPA
  • pediatrics

1. Introduction

Acute leukemia forms a group of rapidly progressing malignant diseases characterized by a block in differentiation and an uncontrolled clonal proliferation of abnormal hematopoietic progenitor cells in the bone marrow and the peripheral blood [1, 2]. This accumulation of immature cells (“blasts”) interferes with the production of normal blood cells, causing neutropenia, thrombocytopenia and anemia. According to the lineage of origin of the progenitor cells (common lymphoid or common myeloid), acute leukemia can roughly be classified into acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML).

Acute leukemia patients are diagnosed using morphologic, cytochemical and immunophenotypic methods, and are further sub-classified by chromosomal analysis and the presence or absence of somatically acquired gene mutations. While classification allows for prediction of outcome, the outcome risk of a large group of patients is still difficult to define. In addition, treatment options are expanding that treat patients based on their genetic abnormalities (in particular in the adult population), but so far most genetic abnormalities are not yet targetable, and most drugs that enter clinical trials rely on the increased abundance or altered activity of proteins, namely specific proteoforms, instead of the genetic lesion itself.

Proteoforms are defined as the different forms of a protein derived from a single gene, and include all forms arising from genetic variation (e.g. amino acid variation), alternative splicing, and post-translational modifications (PTM). This means that one transcribed gene can lead to a variety of protein structures, and that the biological function of the final proteoform, as well as its cellular localization, binding partners and kinetics, can vary greatly. As this suggests that gene sequences do not accurately predict the expression of a protein or whether the protein is stable or functional, it is not surprising that transcript abundance correlates only weakly with protein abundance, with reported correlation coefficients of roughly 0.17–0.40 [3, 4, 5]. Proteoforms are the basic units of a proteome. We believe that the study of proteoforms is an essential strategy to reveal cell dependencies and their underlying mechanisms, and that this could aid in the process of risk stratification and could identify novel therapeutic targets in highly complex diseases such as acute leukemia. Moreover, as cure rates differ markedly between ALL and AML, and between children and adults, a direct comparison of the leukemic proteoforms between those patients may help to unravel the biological pathogenesis. It may also reveal similarities and dissimilarities suggesting that therapeutics that target proteoforms in one disease could be effective in an otherwise disparate leukemia that shares protein patterns.

2. Acute leukemia

2.1 Acute lymphoblastic leukemia

ALLs are neoplasms composed of immature B (pre-B), T (pre-T) or NK cells, referred to as “lymphoblasts”, of which the majority is pre-B ALL (85–90% in children vs. 75% in adults) [1]. It is the most common cancer in children and accounts for a quarter of all childhood malignancies. Although there are as many adults with ALL as there are children with the disease, the relative frequency in adults is much lower. Worldwide, the overall incidence is approximately 1–2 per 100,000 people, with a peak incidence occurring in childhood and a second peak above the age of 50 years [6].

2.1.1 Cytogenetic abnormalities

Chromosomal aberrations are the hallmark of ALL and are often used to categorize patients. In B-ALL, recurrent chromosomal abnormalities are found in 80% of the patients, including numerical changes and structural changes such as translocations, deletions and inversions. The frequencies of cytogenetic abnormalities differ substantially between children and adults [1, 7, 8, 9, 10]. For instance, the translocation 9;22 [BCR-ABL1] is observed in 2–5% of children compared to 30% of adults, whereas the translocation 12;21 [ETV6-RUNX1] is observed in 25% of pediatric patients versus 3% of the adult population. The hyperdiploid (gain of chromosomes) karyotype is present in 30–40% of children compared to 3% of adults. Finally, translocation 4;11, resulting in the MLL-AF4 fusion gene, is detected in 60–80% of infants (younger than 1 year old), whereas it is seen in only 2% of patients up to 15 years and is rare in adults. Hypodiploidy (loss of chromosomes) occurs in 5–6% of ALL patients, independent of age.

Chromosomal translocations occur less frequently in T-ALL than in B-ALL (approximately 50–60%) and, unlike in B-ALL, their prognostic impact is not well defined and they are not used for risk stratification [10]. They involve both T-cell receptor and non-T-cell receptor loci, or result in aberrant expression of transcription factor oncogenes. There is less association with age [8].

2.1.2 Prognosis

Survival in children with ALL is much better than in adults, with the exception of infant ALL. In 60–80% of the cases, infant ALL is characterized by translocations involving 11q23, affecting the KMT2A gene. Aberrant KMT2A in ALL is associated with a high rate of early treatment failure and a very poor outcome (long-term event-free survival of 28–45%), even when treated with more aggressive chemotherapy regimens [11, 12]. Historically, pediatric T-ALL was considered a high-risk disease. With the introduction of therapy intensification in T-cell ALL, outcomes have become comparable to those in B-cell ALL, resulting in a five-year overall survival (OS) rate of more than 90% [13]. However, within certain high-risk subgroups (e.g. infants or children ≥10 years of age), 25–30% still experience relapse, which has a dismal outcome even with hematopoietic stem cell transplantation. Death resulting from treatment toxicity remains a challenge, with an estimated 10-year cumulative incidence of treatment-related death of 2.9% [14].

Survival for ALL in adults is around 45%, but patients above the age of 60 suffer from inferior outcomes, with only 10–15% long-term survival [10]. This is, at least partially, due to a higher risk of medical comorbidities, the inability to tolerate standard chemotherapy regimens, and age-related unfavorable intrinsic biology such as Philadelphia chromosome positivity, hypodiploidy and complex karyotype. However, as even adolescents and young adults who lack medical comorbidities do significantly worse than their younger counterparts, the contribution of the different underlying biology should not be underestimated [8].

2.2 Acute myeloid leukemia

In general, patients with AML have similar signs and symptoms to patients with ALL, mainly symptoms related to (pan)cytopenia. AML is the most common acute leukemia in adults, whereas it is relatively rare in children (accounting for only 10% of acute leukemia cases) [15]. Overall, AML occurs in 3–5 cases per 100,000 people, and the incidence strongly increases with age.

2.2.1 Cytogenetic and molecular abnormalities

AML is a very heterogeneous disease, and the identification of AML-associated chromosomal translocations and inversions has led to the current 2016 World Health Organization (WHO) classification system [16]. This classification includes eight recurrent genetic abnormalities (e.g. translocation (15;17) [PML-RARA], translocation (8;21) [RUNX1-RUNX1T1], inversion (16) or translocation (16;16) [CBFB-MYH11], translocation (9;11) [MLLT3-KMT2A], and translocation (9;22) [BCR-ABL1]) and their variants. In approximately 50% of patients no cytogenetic abnormalities are present, which is referred to as “normal karyotype” [17]. Additional classification in AML is provided by the detection of one or more recurrent genetic mutations, with NPM1, FLT3, IDH1, IDH2, RUNX1 and CEBPA being the most studied.

Recently, the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) study presented the molecular landscape of nearly 1000 pediatric AML patients who participated in several Children’s Oncology Group (COG) clinical trials [18]. The investigators found that pediatric AML, like adult AML, has one of the lowest mutation rates among the cancers characterized by The Cancer Genome Atlas, suggesting that the number of recognized recurrent mutations in AML alone is not sufficient to explain its heterogeneity. They demonstrated that the landscape of somatic variants in pediatric AML was markedly different from that reported in adults, highlighting the need for, and facilitating the development of, age-tailored targeted therapies for the treatment of pediatric AML [19, 20].

2.2.2 Prognosis

Among adult patients under 60 years of age, AML can be cured in 35–40% of the patients, whereas the survival rate of patients older than 60 is only 5–15%. For older patients who are unable to receive intensive chemotherapy without unacceptable side effects the prognosis is even more dismal, with a median survival of only 5–10 months [2]. Survival rates in the pediatric population have improved greatly, although OS rates of 65–70% are still much lower than those for pediatric ALL [21].

3. Proteome differs from transcriptome

The human genome is the total amount of DNA that each cell in the body contains, including an estimated 30,000–40,000 protein-coding genes. The basic dogma of biology formerly held that DNA is transcribed into messenger RNA, which is then translated into proteins, and that mRNA levels could be used to predict protein abundance. It is becoming increasingly clear that this is overly simplistic, given our expanding knowledge of the effects of epigenetics, environmental influences, mRNA editing, alternative splicing and noncoding RNAs on gene expression. For instance, coding single-nucleotide polymorphisms and mutations can affect the final protein sequence and function, and through endogenous proteolysis and mRNA splicing, different isoforms can be generated from the same set of nucleotides. Additionally, after translation of the RNA transcript, proteins undergo multiple modifications affecting protein function, localization, lifespan and activity. Together this results in up to a million proteoforms.

One of the first studies, published in 1999, compared a limited number of mRNAs and proteins in Saccharomyces cerevisiae and already concluded that the correlation between the two was only 0.36 [4, 22]. Even with the significant improvements in high-throughput genomic and proteomic approaches, this fundamental observation continues to be widely, though not universally, supported, as most studies still report a correlation coefficient between 0.17 and 0.40. For example, Mun et al. recently performed a correlation analysis of mRNA and protein log2-fold changes between gastric cancer tumor samples and adjacent normal tissues using 6803 genes with protein and mRNA abundances available in at least 30% (≥24) of the patients. Of the 6803 genes, only 34.3% showed a significant (FDR < 0.01) positive correlation, with an average correlation coefficient of 0.28 [23]. Zang et al. performed an integrated proteogenomic analysis of human colon and rectal cancer samples. While 89% of the genes showed a positive mRNA-protein correlation, only 32% were significantly correlated, and the average correlation between messenger RNA transcript abundance and protein abundance was only 0.23 [24].
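The per-gene mRNA-protein correlations summarized above can be illustrated with a small calculation. The sketch below is not the pipeline used in the cited studies; it assumes a Spearman rank correlation computed per gene across patients, uses made-up gene names and values, and omits the significance testing and FDR correction that those studies applied.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long lists (NaN if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def summarize(mrna, protein):
    """mrna, protein: dict gene -> per-patient values in the same patient order.

    Returns per-gene rho, the mean rho, and the fraction of genes with rho > 0.
    """
    rho = {g: spearman(mrna[g], protein[g]) for g in mrna if g in protein}
    values = list(rho.values())
    return rho, mean(values), sum(v > 0 for v in values) / len(values)


# Hypothetical toy data: three genes measured in five patients.
mrna = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_C": [2.0, 1.0, 4.0, 3.0, 5.0],
}
protein = {
    "GENE_A": [0.1, 0.4, 0.5, 0.9, 1.2],
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 2.0, 3.0, 4.0, 5.0],
}
rho, mean_rho, frac_positive = summarize(mrna, protein)
```

In this toy example one gene is perfectly concordant, one is perfectly discordant and one is partly concordant, so the mean correlation is low even though most genes correlate positively, which mirrors the pattern the studies above describe.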

4. Age-associated proteoforms in acute leukemia

As mentioned above, the functional variant of a protein, the proteoform, is defined by genetics, mRNA editing, and PTMs. In ALL in particular, where the incidence peaks between 2 and 5 years of age and then gradually increases in older patients, it is suggested that different combinations of genetic factors (resulting in different proteoforms) contribute to leukemogenesis at different ages. To answer the question of why genes are differentially expressed with age, a closer look at the biological processes that influence final proteoform production via pre-translational modifications may help. Here we discuss a few examples of how these differ between younger and older patients with acute leukemia.

4.1 Genetic variants

Emerging genome-wide sequencing techniques have identified disease- and age-specific gene variants in acute leukemia. For example, Perez-Andreu et al. discovered a single nucleotide polymorphism (SNP), a variant of the coding region of the DNA, of GATA3 on 10p14 that was associated with susceptibility to ALL in adolescents and young adults, and that progressively increased with age [25]. Furthermore, genomic variants that occur in both pediatric and adult leukemia sometimes display a different phenotype at the protein level. As shown by Zuurbier et al., loss of PTEN protein due to the production of an unstable, truncated proteoform caused by a frameshift mutation or genomic deletion is frequently seen in T-cell ALL (predominantly in pediatric T-ALL). PTEN is often recognized as a tumor suppressor, but its behavior and relation to outcome are highly context dependent. PTEN abnormalities may impact NOTCH1: in a cohort of pediatric T-ALL patients, those with PTEN mutations (with loss of PTEN protein) who lacked NOTCH1-activating mutations had significantly fewer relapses compared to patients with activated PTEN and NOTCH1 [26]. In contrast, another study showed that PTEN mutations without NOTCH1 abnormalities were associated with poor prognosis in adults [37]. Thus, genomic mutations within the same gene do not always produce the same proteoform with the same function. Mutations can create a proteoform with a completely different function and can convert a protein from a tumor suppressor into a tumor driver [27]. Although genome-wide studies are very meaningful for detecting conditions specific to age and disease, the net effect on the cell largely depends on the final proteoform that is produced (tumor suppressor or tumor driver) and the pathways in which it acts.

4.2 Chromosomal translocations

Chromosomal abnormalities, gene fusions and copy number aberrations are more common in the younger patient population [28]. The ratio of structural variation to mutational burden decreases continuously with age, with the most chromosomal translocations found in infants (<1 year) compared to all other ages. Within this young age group, the most common fusion involves KMT2A (also known as MLL1), present in 38% of the infants [28]. A second age peak is recognized in young to middle-aged adults with AML. Overall, more than 80 fusion partners of KMT2A have been described, and it is the protein partner of KMT2A that determines characteristics specific to age and disease. Interestingly, 50% of infants younger than 1 year with ALL carry the specific MLL-AF4 fusion protein caused by the t(4;11)(q21;q23) translocation [29], whereas in AML the most common MLL rearrangement is MLL-AF9, which arises from a t(9;11)(p22;q23) translocation. In both populations, MLL leukemia confers a poor prognosis, and identification of unique proteoforms in this leukemia subtype may guide treatment stratification by providing targetable leads. For instance, downstream proteomic targets mediated by MLL-AF4 include HOX, EPHA7, MEIS, PBX and GSK-3, and these are already considered or investigated as therapeutic targets in the context of MLL-rearranged leukemia. In addition, RAS, DOT1L, and HSP-90 have also been described as potential targets in MLL leukemia [30]. As those genes and their protein products are particularly involved in transcription regulation, we hypothesize that patients with expression differences in those conserved genes likely also harbor differences in the abundance or proteoforms of their downstream proteins, compared to wild-type patients.

4.3 Non-coding microRNAs

MicroRNAs (miRNA) are small non-coding RNAs that affect the proteome by binding to mRNA and thereby influencing or inhibiting its translation into protein. Aberrant miRNA expression is associated with leukemogenesis [31], and multiple miRNAs are expressed differently with age. A study by Noren Hooten et al. showed downregulation of miRNA expression in the peripheral blood of healthy individuals with advancing age. Cancer is often age-related, and five out of the nine downregulated miRNAs in this study were related to cancer pathogenesis [32]. Another study compared miRNA profiles between pediatric and adult patients with AML and again identified significantly lower miRNA expression in adults compared to children. In addition, the authors found distinct miRNA expression patterns in both t(8;21) and t(15;17) translocated pediatric AML, but not in adults. Also, a nine-fold upregulation of miR-21 was identified in MLL-rearranged pediatric patients compared to others, a finding that was not reflected in the MLL-rearranged adult population [33]. The identification of age-specific miRNA expression in leukemia, together with the fact that miRNA affects the final proteomic state, indicates that further proteomic approaches could likely unravel differences in proteoforms between younger and older patients within leukemic subtypes.

4.4 Post-translational modifications

DNA is wrapped around histones to form a compact chromatin structure. PTMs on histone tails, such as the addition or removal of methyl or acetyl groups on lysine residues, as well as direct DNA methylation, regulate chromatin accessibility and initiate and maintain gene expression patterns that account for specific cell lineage differentiation and development [34]. Packaging of the chromatin structure changes with age and includes a global loss of heterochromatin, resulting in a more open chromatin state in the elderly. Reduction of heterochromatin due to increased histone acetylation during aging is well established [35, 36], but the role of histone methylation is less well characterized. Since the prevalence of AML increases with age, we asked whether histone methylation profiles differ between pediatric and adult AML. We recently applied RPPA-based profiling using antibodies against multiple histone methylation sites, which enabled us to define disease- and age-characteristic patterns of histone modification. In agreement with our hypothesis, a significant decline in histone methylation was seen with age in both ALL and AML cases (manuscript in preparation).

As mentioned, MLL rearrangements are specific to age and disease, and are frequent alterations in leukemia. As MLL fusion proteins modulate the chromatin structure through histone tail modifications, MLL-rearranged leukemia is considered an epigenetic malignancy. In addition, mutations in proteins that modify the histone PTM process (e.g. writers, erasers and readers) are more frequently found in T-ALL than in other childhood malignancies, and distinct DNA methylation patterns have been recognized among different subtypes of ALL. Those patterns correlated with changed transcriptomes. Aberrant DNA methylation is associated with silencing of genes that are involved in lymphoid development, and contributes to leukemogenesis. By combining DNA methylation and transcriptome analysis, transcriptional silencing via promoter hypermethylation was recently identified in pediatric AML [28], and correlated with age, karyotype and outcome.

Since hypomethylating agents have been widely used to treat leukemia, in particular in older patients, we hypothesize that proteomics can help to identify more refined subgroups (maybe even from the younger population) that can be treated with regimens that alter the epigenome. For instance, a specific protein or proteomic signature (either related to the epigenome or not) that is correlated with sensitivity to hypomethylating agents could potentially act as a biomarker in MLL-rearranged leukemia to select patients who can benefit from those agents. In the experimental setting, therapies with hypomethylating agents have already shown re-expression of the hypermethylated genes along with restored chemosensitivity, and in relapsed ALL, increased promoter methylation was found to be related to increased chemoresistance [37]. If it is possible to identify a set of proteins that is specific for relapsed MLL-rearranged AML and/or ALL, rapid tests (e.g. ELISA, IHC or FPPA) could be developed to quickly provide information about protein abundance in relapsed patients and identify those who will benefit from additional treatment. Such tests could also predict, a priori, which newly diagnosed patients are most likely to relapse, so that they can receive additional treatment to prevent relapse.

5. Oncogenic proteoforms leading to leukemia

Mutations in the DNA of hematopoietic stem cells play a pivotal role in leukemogenesis, and within single genes multiple mutations have been identified that result in different forms of the protein. One example is AML with mutations in the transcription factor CCAAT/enhancer binding protein alpha (CEBPA). CEBPA is known to regulate growth arrest and differentiation in hematopoiesis by promoting granulocyte lineage differentiation in common myeloid progenitor cells, and disruption of normal CEBPA expression in myeloid progenitors may lead to a block in granulopoiesis, with erythropoiesis occurring in its place [38]. As CEBPA is a critical regulator of myeloid lineage development, it is not surprising that it is mutated in ~10% of AML patients, whose disease is most frequently classified as myeloblastic AML subtype M1 or M2 according to the French-American-British (FAB) classification. The CEBPA transcript is translated into either a full-length isoform (CEBPA-p42) or a shorter isoform (CEBPA-p30). CEBPA-p30 contains the DNA binding domain but lacks the N-terminal transactivation domain. CEBPA-p30 acts in a dominant-negative manner by reducing transcriptional activity after heterodimerization with full-length CEBPA-p42. About half of CEBPA-mutated AML patients have one allele with an N-terminal mutation and one allele with a C-terminal mutation. The N-terminal mutation results in translational termination of the full-length isoform and increased expression of truncated CEBPA-p30. In contrast, C-terminal mutations in CEBPA-p42 are mostly in-frame basic region leucine zipper (bZIP) variants that inhibit normal CEBPA function by disrupting DNA binding and dimerization [39]. CEBPA-mutated patients might be candidates for inhibition of the oncogenic CEBPA-p30 isoform to restore the disrupted p42/p30 ratio.

6. High-throughput proteomics methodologies

Although proteomics may be the least developed and investigated “-omics” approach, it is likely one of the most informative for understanding cellular behavior, as it can provide useful information about both protein abundance and activity, as regulated by PTMs and by protein-protein and protein-DNA interactions. Currently, the two most commonly used high-throughput approaches to study the proteome in leukemia are mass spectrometry (MS)-based techniques and antibody-based techniques.

6.1 MS-based

MS is a high-throughput technique that uses the formation of ions (charged fragments) from the protein analyte to distinguish between proteoforms. Those ions can be sorted and measured using electrical and/or magnetic fields based on their mass-to-charge ratio (m/z), and the protein is then identified based on the abundance of those m/z fragments [40]. Broadly, proteins can be ionized with two distinct methods: matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI). In MALDI the protein sample is mixed with an energy-absorbing matrix. Irradiation of this matrix causes vaporization of the matrix together with the sample, resulting in the formation of ions [41]. ESI creates ions by applying a high voltage to the dissolved protein lysate, producing an aerosol of small charged droplets. When a protein sample is highly complex, it may require separation prior to MS analysis using 1D or 2D gel electrophoresis, high-pressure liquid chromatography (LC-MS), or gas chromatography (GC-MS) to maximize sensitivity. Because proteoforms are derived from a single gene, they often contain homologous sequence regions, and because of the digestion step, information about the relationship between the amino acid sequence and the PTM is often lost. This significantly complicates the identification of proteoforms. Several overviews have been published that discuss recent technological developments in MS that enable the analysis of distinct proteoforms [42, 43, 44].
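As a rough illustration of how the mass-to-charge ratio separates proteoforms, the sketch below computes the m/z of an intact protein at several charge states and the shift introduced by a single phosphorylation. It assumes positive-mode ionization by protonation, so that m/z = (M + z × 1.007276) / z; the 10,000 Da protein mass is a made-up value, and the function names are hypothetical helpers rather than part of any MS software.

```python
PROTON_MASS = 1.007276     # Da, mass of one proton
PHOSPHO_SHIFT = 79.96633   # Da, monoisotopic mass added by one phosphate (HPO3)


def mz(neutral_mass, charge):
    """m/z of a molecule carrying `charge` protons (positive-mode assumption)."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mz_series(neutral_mass, charges):
    """m/z values of one proteoform across several charge states."""
    return {z: mz(neutral_mass, z) for z in charges}


# Hypothetical 10 kDa proteoform, unmodified and singly phosphorylated.
unmodified_mass = 10000.0
phospho_mass = unmodified_mass + PHOSPHO_SHIFT

unmodified = mz_series(unmodified_mass, [5, 10, 20])
phosphorylated = mz_series(phospho_mass, [5, 10, 20])

# The spacing between the two proteoforms shrinks as the charge increases:
# it equals PHOSPHO_SHIFT / z at charge state z.
spacing = {z: phosphorylated[z] - unmodified[z] for z in unmodified}
```

Because the spacing between two proteoforms scales with 1/z, highly charged intact proteins need higher mass resolution to resolve the same modification, which is one reason proteoform-level analysis is technically demanding.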

6.2 Antibody-based

Another high-throughput approach is the protein microarray (PMA), of which two different types exist: forward phase protein arrays (FPPA) and reverse phase protein arrays (RPPA). Given that antibodies can be raised to specifically recognize sequence variations or PTM, they enable measurement of selected proteoforms. In FPPA, protein antibodies are immobilized on an array in known positions, and samples are then printed on the array. If a particular proteoform is present in the sample, the proteoform binds to the antibody and after exposure to a secondary antibody, the abundance can be measured. Each slide is incubated with a single protein sample, but multiple proteins can be measured simultaneously depending on the number of antibodies printed on the slide.

The “reverse” version of the FPPA is the RPPA methodology. In RPPA, samples are first printed on the array, and subsequently each slide is stained with a single protein antibody, followed by a secondary antibody to amplify the signal. The downsides of RPPA are that all samples must be printed at the same time to avoid artifacts from printing irregularities between batches, and that RPPA can only detect proteins and isoforms for which a strictly validated antibody is available. As there is no separation of the proteins according to molecular weight, it is crucial that antibodies are proven to be highly specific, selective and reproducible. On the other hand, RPPA requires only a small number of cells (approximately 3 × 10^5 cells to test 400 different antibodies), making it highly suitable for retrospective clinical applications. In addition, because it analyzes all samples at once, it allows a direct comparison of protein abundance across samples.

7. Proteomics in acute leukemia

7.1 Disease-specific proteoform landscape of acute myeloid leukemia and acute lymphoid leukemia

Acute leukemia is a heterogeneous group of diseases both in terms of biology and prognosis. Classification into those arising from the myeloid or the lymphoid lineage is based on cytomorphology and cytochemistry, with further differentiation into specific subgroups based on morphology, immunophenotyping, cytogenetics, and molecular genetics of the acute leukemia cells. However, present classification systems are not adequate to differentiate between all subtypes and do not always accurately predict the clinical outcome. Whether changes in the leukemic cells that cause those differences are due to developmental, genetic, or environmental effects, they all are ultimately mediated by changes in protein abundance or modification. Therefore, we hypothesize that systematic comparative or differential proteomics can discover changes in the presence and quantity of individual proteoforms that underlie these cellular changes, and can add to current diagnostics, prognostics and therapeutics.

Assessment of the “diseased” proteome compared to the proteome of “normal/healthy” cells (e.g. CD34+, CD38+CD34+, CD38−CD34+; the optimal normal comparator is discussed elsewhere [45]) can identify proteins that are aberrantly expressed or activated compared to normal, and can identify different forms of the same protein that differ between the diseased cell and the healthy comparator. This enables recognition of the pathways utilized by cells within a certain set of patients or related to a specific clinical feature. In addition, proteins or sets of proteins that are differentially expressed may aid confirmatory diagnosis and early disease detection.

Furthermore, detailed proteomic profiling can help to identify differences between disease subgroups, including between ALL and AML, and also between subgroups within either disease. It may be informative to know how these two diseases are similar as well as how they differ. As ALL and AML are both dominated by immature malignant hematopoietic cells, they can serve as lineage-independent controls for each other. Proteins that display similar expression in ALL and AML, but differ from the “normal” healthy control or from more mature cells, are likely to be related to a block in differentiation, whereas other protein patterns that are similar in both could be related to the hallmarks of uncontrolled proliferation, resistance to cell death, or other shared deregulations.

As an example, Cui et al. performed proteomic analysis using 2D-MS on 61 bone marrow biopsies from patients diagnosed with French-American-British (FAB) M1-M5 AML or ALL [46, 47]. Comparative analysis identified 27 proteins with lineage-specific expression. Among them, myeloperoxidase was already known to be highly expressed in AML compared to ALL, but they also recognized heat shock factor binding protein 1 (HSBP1) as being high in ALL. In addition, they found proteins that were more highly expressed in M2 and M3 AML than in M1, and 23 proteins that were differentially expressed between granulocytic lineage AML (M1, M2, M3) and AML derived from the monocytic lineage (M5). To demonstrate clinical usefulness, Cui et al. also applied proteomic analysis to an AML-M3 bone marrow sample (classified based on morphology by the presence of atypical granules) from a patient who did not respond to the standard differentiation-inducing therapy with all-trans retinoic acid or As2O3. Their analysis showed that this sample exhibited a “protein expression profile” specific to M1, and not to M3, and after the treatment was changed to chemotherapy, the patient achieved complete remission within 3 weeks. Xu et al. performed proteomic profiling of bone marrow samples from patients with different subtypes of acute leukemia (APL, AML, ALL) and healthy volunteers by SELDI-TOF-MS. Based on 109 protein signatures, they constructed a proteomics-based classification model capable of replicating the morphological and differentiation-based classification scheme of the well-established FAB system. Their results suggested that this model could potentially serve as a new diagnostic approach [48].

In our own group, we performed proteomic profiling using RPPA for 265 patients, in which we were able to separate 3 clusters of proteins that tended to track similarly within a FAB class from a subset of 24 differentially abundant proteins and PTMs; myeloid subtypes (M0–M2), the monocytic subtypes (M4–M5), erythroleukemia and megakaryocytic leukemia [49]. Foss et al. studied proteomics from 4 AML patients and 5 ALL patients using LC-MS/MS in blasts, as well as in CD34+ cells from 6 healthy donors and mononuclear cells from 2 healthy donors to correct for mononuclear cell contamination. Blinded unsupervised clustering enabled grouping with each cell type forming a discrete cluster, suggesting that proteomics can indeed, at least in some cases, robustly distinguish known classes of leukemia.

Recently, another study by our group analyzed pediatric AML (n = 95) and pediatric ALL (n = 73) on RPPA using antibodies against 149 different total proteins in addition to 45 antibodies recognizing different PTMs (e.g. phosphorylation, histone modification and cleavage) [50]. We felt that traditional hierarchical clustering was suboptimal, as it weighs all proteins equally in all situations across the dataset and is agnostic to known functional relationships and interactions between proteins. Hence, we developed a novel computational method that accounts for known functional interactions, which we call the “MetaGalaxy” approach [50, 51, 52]. This methodology starts with the allocation of proteins into groups of proteins with a related function, based on existing knowledge or strong association within this dataset (“Protein Functional Group” (PFG), n = 31). For each PFG, a clustering algorithm enabled recognition of an optimal number of protein clusters, each being a subset of cases with similar (correlated) expression of core PFG components.
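The per-PFG step described above (cluster the cases within one functional group and pick an "optimal" number of clusters) can be illustrated with a minimal sketch. The text does not state which clustering algorithm or selection criterion was used, so k-means with a silhouette score is swapped in here purely as an assumption; the data, the function names and the candidate range of k are all hypothetical.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_point_init(rows, k):
    # Deterministic start: first row, then repeatedly the row farthest from all centers.
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(euclid(r, c) for c in centers)))
    return centers

def kmeans(rows, k, n_iter=25):
    # Plain Lloyd iterations; returns one cluster label per case (row).
    centers = farthest_point_init(rows, k)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: euclid(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(rows, labels):
    # Mean silhouette width; singleton clusters score 0 by convention.
    scores = []
    for i, r in enumerate(rows):
        dists = {}
        for j, s in enumerate(rows):
            if i != j:
                dists.setdefault(labels[j], []).append(euclid(r, s))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

def choose_k(rows, candidates=(2, 3, 4)):
    # Assumed criterion: the k with the highest mean silhouette width.
    scores = {k: silhouette(rows, kmeans(rows, k)) for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical PFG: 9 cases (rows) x 2 member proteins (columns), made-up values.
pfg_rows = [[0, 0], [0.1, 0], [0, 0.1],
            [5, 5], [5.1, 5], [5, 5.1],
            [10, 0], [10.1, 0], [10, 0.1]]
best_k, k_scores = choose_k(pfg_rows)
pfg_labels = kmeans(pfg_rows, best_k)
```

Running the same routine once per PFG would yield, for every case, one cluster label per functional group, which is the input the next step of the approach works on.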

To understand how activity in the different PFG relates to one another within pediatric ALL and AML, we next hypothesized that there would be recurrent patterns of interaction between the various PFG clusters, forming a finite set of “protein expression signatures” shared by different subsets of patients. Therefore, patients were clustered based on their protein cluster membership using a binary matrix system. Correlation between protein clusters from various PFG was defined as a “Protein Constellation”. We were able to identify subgroups of patients (signatures) that expressed similar combinations of protein constellations.
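A small sketch may help make the binary matrix concrete. Each patient becomes a row, each (PFG, protein cluster) pair becomes a column, and a cell is 1 when the patient belongs to that cluster. The co-occurrence measure used below to link clusters from different PFG (a Jaccard index with a 0.6 cut-off) is an assumption chosen for illustration, not the correlation measure of the published method, and all patient and PFG names are hypothetical.

```python
from itertools import combinations

# Hypothetical input: the protein cluster each patient was assigned to in each PFG.
membership = {
    "pt1": {"PFG_A": 1, "PFG_B": 1},
    "pt2": {"PFG_A": 1, "PFG_B": 1},
    "pt3": {"PFG_A": 2, "PFG_B": 2},
    "pt4": {"PFG_A": 2, "PFG_B": 2},
    "pt5": {"PFG_A": 1, "PFG_B": 2},
}

def binary_matrix(membership):
    # One column per (PFG, cluster) pair, one row per patient, entries 0 or 1.
    columns = sorted({(pfg, cl) for m in membership.values() for pfg, cl in m.items()})
    patients = sorted(membership)
    matrix = [[1 if membership[p].get(pfg) == cl else 0 for pfg, cl in columns]
              for p in patients]
    return patients, columns, matrix

def jaccard(u, v):
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0

def linked_clusters(columns, matrix, threshold=0.6):
    # Pairs of clusters from different PFG that tend to contain the same patients.
    cols = list(zip(*matrix))
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        if columns[i][0] != columns[j][0] and jaccard(cols[i], cols[j]) >= threshold:
            pairs.append((columns[i], columns[j]))
    return pairs

patients, columns, matrix = binary_matrix(membership)
constellation_pairs = linked_clusters(columns, matrix)
```

In this toy input, cluster 1 of PFG_A travels with cluster 1 of PFG_B, and cluster 2 with cluster 2; patients whose rows share the same linked clusters would then fall into the same signature.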

With this segmented approach a substantial amount of structure was observed across the data set (Figure 1), with an optimal number of 12 constellations and 12 signatures. Notably, signatures were strongly associated with the leukemia lineage. Signatures 1 and 2 were specific to T-ALL (Figure 1, annotated in pink), whereas signatures 3, 4 and 5 were predominantly B-ALL and signatures 7–12 predominantly AML. Only signature 6 was a mixture of both B-ALL and AML patients. This clear distinction could also be discerned in the constellations. Protein constellations 1–4 were all specific to ALL, with constellation 1 (Figure 1, magenta box) found only in B-ALL, constellation 4 found exclusively in T-ALL (Figure 1, yellow box), and constellations 2 and 3 present in both B- and T-ALL. On the other hand, constellations 7 and 8 were strongly associated with AML (Figure 1, blue box) and constellations 5 and 6 were found in both ALL and AML.

Figure 1.

“MetaGalaxy” analysis for pediatric ALL and AML. Annotations show clear separation in protein patterns for T-ALL (magenta; signatures 1 and 2), B-ALL (yellow; signatures 3, 4, and 5), and AML (blue; signatures 7, 8, 9, 10, 11, and 12). Constellation 1 (horizontally, magenta box) is associated with T-ALL, constellation 4 (horizontally, yellow box) with B-ALL and constellations 7 and 8 with AML (horizontally, blue box). This figure was adapted from Hoff et al. Molecular Cancer Research 2018 [50].

We also identified proteins that were universally changed in the same direction in at least 6 of the 8 signatures. Interestingly, GATA1 and STAT1 were universally lower expressed in both pediatric AML and adult AML patients, whereas phosphorylated RB1-pSer807_811, a phosphorylation event that deactivates the RB1 protein, showed universally opposite expression in children and adults, being predominantly unphosphorylated (active) in pediatric patients and highly phosphorylated (inactive) in adults. For pediatric AML and ALL samples, comparable expression was seen for the higher expressed universals CASP7 cleaved at domain 198 and phosphorylated CDKN1B-pSer10, and for the lower expressed JUN-pSer73 and GATA1.
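The "universal" rule stated above (same direction in at least 6 of the 8 signatures) is simple to express in code. The sketch below assumes each protein has already been scored per signature as higher (+1), lower (-1) or unchanged (0) relative to a control; how that call is made is not specified here, and the protein names and values are placeholders rather than real data.

```python
def universal_proteins(direction_by_signature, min_signatures=6):
    """Return proteins changed in the same direction in >= min_signatures signatures."""
    universals = {}
    for protein, directions in direction_by_signature.items():
        n_higher = sum(1 for d in directions if d > 0)
        n_lower = sum(1 for d in directions if d < 0)
        if n_higher >= min_signatures:
            universals[protein] = "higher"
        elif n_lower >= min_signatures:
            universals[protein] = "lower"
    return universals

# Hypothetical direction calls across 8 signatures (placeholder names and values).
calls = {
    "protein_X": [1, 1, 1, 1, 1, 1, 0, -1],     # higher in 6 of 8
    "protein_Y": [-1, -1, -1, -1, -1, -1, -1, 0],  # lower in 7 of 8
    "protein_Z": [1, 1, 1, -1, -1, -1, 0, 0],   # mixed, not universal
}
universals = universal_proteins(calls)
```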

Unpublished data from 205 adult AML and 166 adult ALL patients identified 11 protein signatures, of which 5 were AML dominant (93–100%), 4 were T-ALL dominant (79–100%) and 2 contained a mixture of AML, B-ALL and T-ALL samples (50 and 68% AML). Three of the 12 constellations were predominantly associated with AML, 4 were associated with ALL, 2 were associated with a mixture of ALL and AML cases, and 3 constellations were not strongly associated with any particular signature. This study used a total of 230 antibodies, including antibodies against 169 different proteins along with 52 antibodies targeting phosphorylation sites, 6 targeting Caspase and Parp cleavage forms and 3 targeting histone methylation sites. A third study (manuscript in preparation) of 500 pediatric AML, 68 adult T-ALL and 290 pediatric T-ALL patients again showed similar results, with T-ALL and AML dominant signatures (81.5–100%) and only 1 of the 15 signatures containing both T-ALL and AML (39% T-ALL and 61% AML). Together, these results suggest that proteomics can be used to distinguish ALL from AML, and that although ALL and AML are very different in terms of overall proteomics, they share some “protein expression signatures”, which suggests that there are shared patterns of deregulation within some pathways. However, as all studies used a mixture of both total and PTM-proteins, it would be interesting to assess how expression and classification differ across diseases using a panel with a larger number of PTM, or a panel limited to PTM only, given that PTM often provide information about the activity or biological function of the protein.
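Percentages such as "93–100% AML" describe the lineage composition of each signature. As a rough illustration of that arithmetic, the sketch below tallies the lineage of every patient assigned to a signature and reports the most common lineage with its share. The signature labels and patient counts are invented for the example, and no particular cut-off for calling a signature "dominant" is implied.

```python
from collections import Counter

def lineage_composition(lineages):
    """Percentage of each lineage among the patients of one signature."""
    counts = Counter(lineages)
    total = sum(counts.values())
    return {lineage: 100 * n / total for lineage, n in counts.items()}

def dominant_lineage(lineages):
    """Most common lineage in a signature and its percentage."""
    composition = lineage_composition(lineages)
    top = max(composition, key=composition.get)
    return top, composition[top]

# Hypothetical signature assignments (made-up counts, not study data).
signatures = {
    "signature_A": ["AML"] * 9 + ["B-ALL"],
    "signature_B": ["T-ALL"] * 4 + ["AML"],
}
summary = {name: dominant_lineage(members) for name, members in signatures.items()}
```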

7.2 Global proteomic landscape of pediatric and adult T-ALL

When we assessed the global proteomic landscape in pediatric and adult T-ALL using the “MetaGalaxy” approach, we found 10 signatures based on 11 constellations (manuscript in preparation). Overall, signatures were not associated with age (i.e. pediatric vs. adult), with the exception of one signature. This signature was strongly associated with 2 constellations, which were only present in this particular signature. This suggests that pediatric T-ALL and adult T-ALL are more similar than ALL and AML, but that despite mostly overlapping signatures and constellations, there is an expression pattern specific to children. As this is similar to what we see in the genetics, where most recurrent aberrations are seen in both children and adults but at different frequencies, correlation with genetic features would be interesting.

7.3 Assessing dynamic change upon treatment exposure

Children have a significantly better prognosis than adults, and ALL responds better to treatment than AML. In addition to extracting information about differences in baseline protein abundance between those groups of patients, another consideration is to look at the dynamic response of the cells to stress, such as chemotherapy or apoptotic inducers, to see whether changes in protein abundance patterns can provide a marker of whether a cell is responsive or resistant, and whether this differs between patients. Looking at post-treatment abundance and presence of proteoforms may provide insights into the biological effects of drugs and mechanisms of drug resistance. This can be done either from static expression levels post-treatment at a given time point, or from the dynamic change in expression during treatment (i.e. expression post-treatment minus expression pre-treatment). Particularly in leukemia, where blood can easily be drawn from the patient without performing any additional invasive procedures, expression can be measured at several time points during treatment.
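The dynamic change defined above (expression post-treatment minus expression pre-treatment) can be written down directly. The following minimal sketch assumes abundance values on a common scale for a pre-treatment sample and several on-treatment samples; the protein names, time-point labels and numbers are hypothetical.

```python
def dynamic_change(pre, post):
    """Post-treatment minus pre-treatment abundance for proteins measured in both."""
    return {protein: post[protein] - pre[protein] for protein in pre if protein in post}

# Hypothetical abundances for one patient (arbitrary units, made-up values).
baseline = {"protein_X": 1.5, "protein_Y": 0.5}
on_treatment = {
    "day_1": {"protein_X": 1.0, "protein_Y": 0.5},
    "day_3": {"protein_X": 0.5, "protein_Y": 2.0},
}

# Change relative to baseline at every sampled time point.
changes = {timepoint: dynamic_change(baseline, sample)
           for timepoint, sample in on_treatment.items()}
```

Comparing each time point against the same baseline keeps the changes comparable across time points, while the static alternative would simply use the `on_treatment` values as they are.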

Although this will not provide a priori information about which patients will respond to therapy or which patient needs which chemotherapy, it can give information about the response to treatment during early stages and so aid in deciding whether a more intensive treatment strategy should be pursued, or whether additional combinational treatment would be beneficial. For instance, if it is known that a particular protein pathway is utilized by the cell in order to circumvent cell death, this pathway can, in theory, be targeted. Also, comparing the effect of treatment on protein abundance or activity between ALL and AML, or between children and adults, can provide important information about why some patients respond while others do not.

While this approach is theoretically promising, in reality it is much more complicated. First of all, the time point at which expression is measured is crucial. Assessing the dynamic change too early, in cells that are not yet fatally hit by the chemotherapy or are in the process of dying, would suggest that the chemotherapy does not work or has no effect at the protein level, whereas measuring too late would capture expression in cells that have already died. Moreover, despite the ability of chemotherapy to kill the vast majority of leukemic cells, the rare leukemic stem cell that survives the chemotherapy, and that is responsible for the outgrowth of leukemia cells manifesting as relapse or primary resistant disease, is the cell from which we can potentially gather the most information. Proteomic analysis of these resistant cells might therefore be more informative than analysis of the bulk leukemia population, which averages over all cells. In particular, knowing how those cells respond to chemotherapy (in comparison to others) would be likely to raise new biological questions about why different cells behave differently, why or how cells are able to circumvent chemotherapy, and what can be done to treat those cells. However, without a current means to identify those few cells a priori, isolating (enough of) them remains a real challenge. So, if we want to know what is going on in these cells pre- and post-treatment, a means to identify them is required.

8. Conclusions

Despite significant improvement in treatment regimens, outcomes of both pediatric and adult patients with acute leukemia remain unsatisfactory. When a leukemia patient enters the clinic, cytogenetics and mutation analysis in particular are the methods of choice for risk stratification. After induction therapy, the choice of consolidation therapy is mainly based on the chromosomal alterations and driver mutation(s) present. Emerging research in the field shows that prognosis is largely context-dependent and that acute leukemias are molecularly diverse diseases with similar phenotypes. Many years of exploring the molecular diversity in leukemia have taught us that the combined influences of genetics, epigenetic remodeling, the microenvironment and PTM of leukemic blasts determine their cell fate. Since the net effect of these combined influences is predominantly displayed in the abundance and activity of the proteoforms, as well as in their affected signaling pathways, we argue that characterization of differentially abundant proteoforms and recognition of proteomic patterns within and between (subgroups of) acute leukemia may facilitate and improve risk stratification and could provide therapeutic leads that contribute to treatment personalization. However, while much is known about cytogenetics in AML and ALL, little is known about the proteomics of these cells.

While distinct proteoform patterns within and between different leukemic subtypes are only beginning to be recognized, age-specific proteome characterizations are far more limited. Bone marrow aspiration is a relatively painful procedure, and healthy donors, such as patient relatives or medical students, whose bone marrow could function as a control against AML blasts are scarce in many studies. The control group therefore often does not represent the median age of the patient cohort, and leukemia-specific findings cannot be directly compared to an age-matched group. Many studies focusing on leukemia therefore avoid controls and perform internal disease comparisons. Age-related analysis is then only applicable when a wide age distribution across the cohort is present, but this is often not the case, as most research focuses on either pediatric or adult leukemia instead of both.

More research is needed to identify single proteins and sets of proteins that are associated with disease- and age-specific subgroups. As far as we know, we are the first to analyze protein abundance and PTM between AML and ALL across all ages using antibody-based proteomics. Almost all studies look at either AML or ALL, and if they look at both, they mainly focus on the differences rather than the similarities. However, ALL and AML share the same pathophysiology in terms of the occurrence of a differentiation block that gives rise to uncontrolled clonal proliferation of immature hematopoietic progenitor cells in the bone marrow. Proteoforms that have similar expression in ALL and AML, but different expression compared to the “normal” healthy control or to more mature cells, are likely to be related to a block in differentiation. Other similar protein patterns could be related to the hallmarks of uncontrolled proliferation or resistance to apoptosis. Identification of differences in proteomic profiles between ALL and AML can additionally lead to lineage-specific proteomic signatures, which may help to distinguish (subgroups of) the diseases.

Similar and dissimilar proteomic patterns among acute leukemias should also be analyzed in relation to responses to therapy. A treatment to which one group proved highly sensitive can be tested in other groups with similar proteomic patterns. Cytogenetic and mutational information provides prognostic information, but so far lacks the a priori information needed to predict treatment outcomes. Rational selection of targeted therapies based on the functional activity state of the cell, as determined by the proteome, is more likely than random selection to sensitize patients to certain treatment regimens.

Conflict of interest

The authors declare no conflict of interest.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Fieke W. Hoff, Anneke D. van Dijk and Steven M. Kornblau (December 19th 2019). Proteoforms in Acute Leukemia: Evaluation of Age- and Disease-Specific Proteoform Patterns, Proteoforms - Concept and Applications in Medical Sciences, Xianquan Zhan, IntechOpen, DOI: 10.5772/intechopen.90329. Available from:
