Open access peer-reviewed chapter

Comprehensive Network Analysis of Cancer Stem Cell Signalling through Systematic Integration of Post-Translational Modification Dynamics

By Hiroko Kozuka‐Hata and Masaaki Oyama

Submitted: November 11th 2016; Published: September 13th 2017

DOI: 10.5772/intechopen.69647



Post‐translational modifications, such as phosphorylation, acetylation and ubiquitination, play important roles in cellular signalling. Recent advances in mass spectrometry‐based proteomics technology enable us not only to comprehensively identify expressed proteins but also to unveil their post‐translational modifications with high sensitivity. Within our proteome bioinformatics frameworks, statistical network analyses of large‐scale data on post‐translational modification dynamics were conducted to define the key machinery underlying cancer stem cell properties. Bioinformatical approaches using IPA (Ingenuity Pathway Analysis), NetworKIN and a newly developed platform named PTMapper (post‐translational modification mapper) allowed us to perform network‐wide prediction of upstream interactors/kinases together with related information on diseases and functions, supporting the systematic search for novel drug candidates that regulate aberrant signalling in cancer stem cells. In this chapter, we use patient‐derived glioblastoma stem cells as a representative model of cancer stem cells to introduce several useful platforms for statistical and mathematical network analyses based on large‐scale phosphoproteome data.


  • glioblastoma stem cells
  • signal transduction
  • proteomics
  • post‐translational modification
  • network analysis

1. Introduction

Glioblastoma (GBM) is the most common and aggressive brain tumour in adults. Despite enormous efforts over many years to overcome this tumour, the median survival of GBM patients remains only around 1 year [1]. GBM is characterized by high invasiveness and intratumoral heterogeneity (ITH) [2, 3]. To date, GBM‐ITH is known to contribute to resistance to chemotherapy, radiation and surgical resection. Since functional diversity is the main feature of multilineage differentiation of cancer stem cells (CSCs) [4, 5], glioblastoma stem cells (GSCs) are thought to be major therapeutic targets in GBM. Furthermore, post‐translational modifications (PTMs) in GSCs are reported to tightly regulate the highly tumourigenic potential of GSCs through aberrant signalling [6, 7]. Therefore, it is important to comprehensively elucidate PTM‐based GSC signalling networks in order to develop effective treatments for GBM.

Advanced nanoscale liquid chromatography‐tandem mass spectrometry (nanoLC‐MS/MS) enables us to identify and quantify thousands of proteins in a single experiment [8]. Moreover, by coupling the nanoLC‐MS/MS system with high‐affinity enrichment of PTM‐bearing peptides, we can also acquire in‐depth biological information on PTM dynamics. In this chapter, we introduce high‐resolution shotgun proteomics technology for large‐scale PTM determination in combination with statistical bioinformatics platforms such as IPA [9], NetworKIN [10, 11] and PTMapper [12].

2. System‐wide proteomic analysis of PTM dynamics

PTMs play crucial roles in cell fate control, such as proliferation, differentiation and apoptosis. More than 500 kinds of PTMs in eukaryotes and prokaryotes have been registered in Unimod, a comprehensive database of protein modifications for mass spectrometry [13]. Recent technological advances in mass spectrometry‐based proteomics, in combination with appropriate enrichment techniques for each PTM, enable us to perform comprehensive identification and quantification of PTMs [14]. Here, we introduce biochemical purification methods for highly sensitive detection of the representative PTMs: phosphorylation, acetylation and ubiquitination (Figure 1).

Figure 1.

Strategy for mass spectrometry‐based identification of peptides modified with phosphorylation, acetylation and ubiquitination. For ubiquitinated lysine residues, Gly‐Gly remnants derived from the C‐terminus of ubiquitin are generated as a consequence of tryptic digestion. PTMs: post‐translational modifications, P: phosphorylation, Ac: acetylation, Ub: ubiquitination, TiO2: titanium dioxide.

2.1. Phosphorylation

Protein phosphorylation is recognized as one of the most important and well‐studied PTMs and regulates a variety of biological processes by transmitting diverse external signals [15, 16]. Approximately 280,000 phosphorylation sites have already been registered in PhosphoSitePlus, a knowledgebase containing non‐redundant mammalian PTMs [17]. Titanium dioxide (TiO2), which has very high affinity for phosphorylated peptides, is widely used for large‐scale phosphoproteome analysis [18, 19].

2.2. Acetylation

Lysine acetylation plays a key role in modulating transcriptional regulation through the coordinated function of histone acetyltransferases (HATs) and histone deacetylases (HDACs) [20]. The stabilization of p53, one of the most important transcription factors, is reported to greatly depend on lysine acetylation [21]. Thousands of lysine acetylation sites can be identified using an antibody against acetyl‐lysine in combination with a high‐resolution mass spectrometry system [22, 23].

2.3. Ubiquitination

The ubiquitin system not only transmits protein degradation signals to the proteasome but also regulates multiple cellular functions such as cell‐cycle progression, DNA repair and transcriptional regulation. Dysfunction of this system leads to various pathological conditions [24]. Ubiquitination sites are detected as diglycine (Gly‐Gly) remnants on the modified lysine residues, which are generated by tryptic digestion of ubiquitinated proteins [25, 26].
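In the mass spectrometer, each of the three modifications above appears as a characteristic mass shift on the modified residue. The following is a minimal sketch of that arithmetic, assuming standard monoisotopic masses. The function names `peptide_mass` and `mz` and the example peptide are hypothetical and are not taken from any of the tools cited in this chapter.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of a proton, used for m/z

# Monoisotopic mass shifts (Da) of the three PTMs discussed in this section.
PTM_SHIFT = {
    "phospho": 79.966331,   # HPO3, typically on S/T/Y
    "acetyl": 42.010565,    # C2H2O on lysine
    "glygly": 114.042927,   # Gly-Gly remnant left on lysine by trypsin
}


def peptide_mass(sequence, mods=None):
    """Return the monoisotopic neutral mass of a peptide.

    mods maps 1-based residue positions to keys of PTM_SHIFT.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    for position, ptm in (mods or {}).items():
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the peptide")
        mass += PTM_SHIFT[ptm]
    return mass


def mz(mass, charge):
    """Return the m/z of a peptide ion carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptide with a Gly-Gly remnant on the lysine at position 3.
    plain = peptide_mass("AGKSR")
    modified = peptide_mass("AGKSR", {3: "glygly"})
    print(round(modified - plain, 6), round(mz(modified, 2), 4))
```

Note that the Gly‐Gly remnant has the same elemental composition as an asparagine residue, so the mass shift alone does not localize the modification; the fragment spectrum is needed for that.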

3. Systematic characterization of the phosphoproteome dynamics in GSCs

Quantitative information on phosphoproteome dynamics can provide a systematic description of the key machinery for cellular signalling. In this section, we introduce two examples of global phosphoproteome analyses of GSCs using a SILAC (stable isotope labelling by amino acids in cell culture)‐based quantitative technique [27, 28] (Figure 2). One was carried out using epidermal growth factor (EGF) to elucidate the mechanism for stemness maintenance of GSCs [29], whereas the other was conducted through serum‐induced differentiation of GSCs to unveil the key pathways responsible for disrupting stemness characteristics [30].

Figure 2.

Schematic workflow for quantitative proteome analysis using SILAC, a representative relative quantitation technique based on metabolic labelling of specific amino acids such as arginine. Two populations of GSCs were cultured in media supplemented with ¹²C₆¹⁴N₄‐Arg (light) or ¹³C₆¹⁵N₄‐Arg (heavy), respectively. After one of the two cell populations was stimulated/perturbed, both cell populations were lysed, combined in equal amounts and enzymatically digested for nanoLC‐MS/MS analyses. The intensity of each mass peak is used for relative quantitation of each peptide with high accuracy.
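The relative quantitation described in this workflow can be sketched as follows. The sketch assumes that each peptide yields one light and one heavy peak intensity and that the ratio is reported as perturbed over control. The function names and the fold‐change cut‐offs of 2.0 and 0.5 are illustrative assumptions, not the settings of any particular quantitation software.

```python
import math

# Mass difference (Da) between heavy and light arginine:
# six 13C-for-12C and four 15N-for-14N substitutions.
HEAVY_ARG_SHIFT = 6 * 1.003355 + 4 * 0.997035  # about 10.00827 Da


def heavy_mz_offset(n_arg, charge):
    """Return the m/z spacing between the light and heavy peaks of a peptide
    that contains n_arg arginine residues and carries the given charge."""
    return n_arg * HEAVY_ARG_SHIFT / charge


def log2_ratio(perturbed_intensity, control_intensity):
    """Return the log2 SILAC ratio (perturbed / control) for one peptide."""
    if perturbed_intensity <= 0 or control_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(perturbed_intensity / control_intensity)


def classify(ratio, up=2.0, down=0.5):
    """Label a linear SILAC ratio using illustrative fold-change cut-offs."""
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # A doubly charged peptide with one arginine: peaks about 5.004 m/z apart.
    print(round(heavy_mz_offset(1, 2), 4))
    # Made-up intensities, for illustration only.
    print(log2_ratio(4.0e6, 1.0e6), classify(4.0))
```

Working with log2 ratios keeps up‐ and downregulation symmetric around zero, which is convenient when the values are later passed to network or enrichment tools.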

3.1. Global quantitative phosphoproteome analyses of EGF‐stimulated GSCs

EGF is known to be essential for maintenance and growth of GSCs [31]. The quantitative phosphoproteomic analysis of EGF‐stimulated GSCs was performed to acquire network‐wide information on the molecules related to stemness maintenance. As a result, a total of 6073 phosphopeptides from 2282 phosphorylated proteins were identified, leading to quantitative classification of 516 upregulated and 275 downregulated phosphorylation sites [29].

3.1.1. IPA‐based network analysis

IPA canonical pathway analysis was then performed using SILAC‐based quantitative phosphoproteome data on EGF‐stimulated GSCs [29] (Figure 3). Protein synthesis‐related pathways (EIF2 signalling, mTOR signalling) and cell cycle regulation‐related pathways (cyclins and cell cycle regulation, cell cycle: G1/S checkpoint regulation, cell cycle: G2/M DNA damage checkpoint regulation) were extracted with statistical significance (‐log (p‐value) > 5).
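To make the significance threshold concrete, pathway over‐representation p‐values of this kind are commonly obtained from a right‐tailed Fisher's exact test, which is equivalent to a hypergeometric tail probability. The sketch below assumes that test and a base‐10 logarithm. It is not IPA's implementation, and the function names and example numbers are hypothetical.

```python
from math import comb, log10


def enrichment_p_value(hits, pathway_size, selected, background):
    """Return P(X >= hits) under the hypergeometric distribution.

    hits          regulated proteins that belong to the pathway
    pathway_size  proteins in the background annotated to the pathway
    selected      regulated proteins in total
    background    all proteins considered
    """
    upper = min(pathway_size, selected)
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def minus_log10(p_value):
    """Return -log10(p), the score that is compared with a threshold."""
    return -log10(p_value)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    p = enrichment_p_value(hits=12, pathway_size=40, selected=100, background=5000)
    print(p, minus_log10(p) > 5)
```

Under this reading, a cut‐off of 5 on the −log scale corresponds to p < 10⁻⁵ before any multiple‐testing correction.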

Figure 3.

IPA‐based pathway analysis of the quantitative phosphoproteome data on EGF‐stimulated GSCs. (A) The significant canonical pathways across the entire dataset (‐log (p‐value) > 5). (B) The mTOR signalling pathway is representatively depicted with the predicted information on the biological activities related to this pathway.

3.1.2. Upstream kinase prediction analysis

Protein phosphorylation is known to be controlled by specific kinases depending on consensus sequence motifs of substrates [32]. The motif‐x algorithm [33, 34] is applicable to statistical extraction of significant consensus sequence motifs from the large‐scale phosphoproteome data on EGF‐stimulated GSCs (Figure 4(A) and (B)).
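The basic step behind this kind of motif extraction can be illustrated with a short sketch: cut a fixed‐width sequence window around every phosphorylation site, then ask whether a residue at a given flanking position occurs more often than a background frequency would predict. This is a simplified illustration and not the motif‐x implementation. The window half‐width of six residues, the padding character and all function names are assumptions.

```python
from collections import Counter
from math import comb


def site_window(protein_seq, site_pos, flank=6, pad="_"):
    """Return the residues centred on the 1-based position site_pos,
    padded with `pad` where the window runs past a protein terminus."""
    idx = site_pos - 1
    if not 0 <= idx < len(protein_seq):
        raise ValueError("site position is outside the protein sequence")
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    return left.rjust(flank, pad) + protein_seq[idx] + right.ljust(flank, pad)


def position_counts(windows, offset, flank=6):
    """Count the residues found at a relative offset (for example +1)
    across aligned windows of identical width."""
    return Counter(window[flank + offset] for window in windows)


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Toy sequences, not real proteins; the sites are the serines at 7 and 3.
    windows = [site_window("MKRRASPEELLK", 7), site_window("GGSPTRK", 3)]
    counts = position_counts(windows, +1)
    # Chance of seeing proline at +1 at least this often if its background
    # frequency were 5% (an arbitrary illustrative value).
    print(counts["P"], binomial_tail(counts["P"], len(windows), 0.05))
```

A motif finder would repeat this test over all positions and residues and then refine the motif on the matching subset of windows; the sketch shows a single test only.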

Figure 4.

Phosphorylation site‐oriented network analysis of the quantitative phosphoproteome data on EGF‐stimulated GSCs. The consensus sequence motifs surrounding the quantitatively regulated phosphorylation sites regarding (A) downregulation and (B) upregulation can be described as a result of the motif‐x analyses. (C) The numerical distribution of the putative kinases predicted by NetworKIN. The colour of cells reflects the number of the predicted kinases for each consensus sequence as described in (A) and (B).

NetworKIN [10, 11] is designed to predict upstream kinases based on the sequence motifs around the functionally regulated phosphorylation sites through construction of the related protein‐protein interaction (PPI) networks using STRING [35]. The NetworKIN algorithm enables further interpretation of the results obtained from the motif‐x analyses (Figure 4 (C)).

3.2. Global quantitative phosphoproteome analyses of serum‐induced GSCs

CSCs are regarded as one of the most clinically important cell populations in causing tumour heterogeneity, which is responsible for resistance to chemotherapy [36]. As recent studies have demonstrated that non‐CSCs can also readily acquire CSC‐like characteristics [37], it is important to clarify the detailed mechanisms underlying CSC differentiation and to understand the principle of their heterogeneity. Serum‐induced phosphoproteome dynamics in GSCs were measured to systematically elucidate the regulatory nodes for stemness alteration over the entire signalling networks [30]. Among the 2876 phosphorylation sites identified on 1584 proteins, 732 phosphorylation sites on 419 proteins were found to be regulated through serum‐induced differentiation. The integrative network analyses of the quantitative phosphoproteome data using various bioinformatical tools including IPA and NetworKIN indicated that transforming growth factor‐β receptor type‐2 (TGFBR2) might be one of the crucial upstream regulators concerning GSC alteration (Figure 5).

Figure 5.

Upstream kinase/regulator analyses based on the regulated phosphoproteome data on serum‐induced GSCs. (A) Heatmap of the over‐representation p‐values calculated for each predicted kinase using PhosphoSiteAnalyzer, a bioinformatical platform for the NetworKIN prediction results from the phosphoproteome data [38]. The subset ‘serum (−)’ indicates SILAC ratio > 2.0, whereas ‘serum (+)’ shows SILAC ratio < 0.5. TGFBR2 and ACVR2A/B‐specific phosphorylation sites were predicted to be significantly enriched in the ‘serum (−)’ subset (adjusted p‐value < 0.05). (B) Upstream regulator analysis by IPA. The top 10 upstream regulators relevant to the regulated phosphoproteome are shown with the corresponding score (−log [p‐value]). (C) IPA‐based description of TGF‐β1 and the target molecules in the phosphoproteome data. Dashed lines represent indirect interactions caused by TGF‐β1, adapted from Ref. [30].
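The caption above refers to adjusted p‐values without naming the correction method. As an illustration only, the following sketch applies the Benjamini‐Hochberg procedure, a widely used false discovery rate adjustment; whether this is the method used in Ref. [30] or in PhosphoSiteAnalyzer is an assumption. The kinase labels and p‐values in the example are placeholders, not results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-value increases.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Placeholder over-representation p-values for three hypothetical kinases.
    raw = {"kinase_A": 0.01, "kinase_B": 0.04, "kinase_C": 0.03}
    adjusted = benjamini_hochberg(list(raw.values()))
    for name, value in zip(raw, adjusted):
        print(name, round(value, 3), value < 0.05)
```

Applying the correction per subset, as in the 'serum (−)' and 'serum (+)' columns of a heatmap, keeps the number of tests equal to the number of kinases evaluated in that subset.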

4. Development of advanced bioinformatical platforms for complicated kinase‐substrate interaction networks

Although a shotgun proteomics strategy based on an advanced nanoLC‐MS/MS system can provide large‐scale information on various kinds of PTMs, only a few PTM‐based network analysis tools are available compared with those for conventional protein‐protein interaction (PPI) networks. Recently, CEASAR (connecting enzymes and substrates at amino acid resolution) [39] and PhosphoPath [40] were developed to visualize kinase‐substrate interactions in a phosphorylation site‐oriented manner. CEASAR was designed to provide a high‐resolution map of kinase‐phosphorylation networks based on functional protein microarrays and bioinformatics analysis. PhosphoPath, on the other hand, was developed as a Cytoscape app [41] to visualize both quantitative proteome and phosphoproteome data using PPI information extracted from BioGRID [42] and PhosphoSitePlus [17]. We have also recently developed a Cytoscape‐based bioinformatical platform named 'post‐translational modification mapper (PTMapper)' to visualize kinase‐substrate interactions regarding multiple phosphorylation sites on signalling molecules (Figure 6) [12]. The kinase‐phosphorylation site interaction dataset for this platform was generated by integrating PhosphoSitePlus [17], Phospho.ELM [43], PhosphoNetworks [44] and UniProtKB [45], and phosphorylation site‐oriented PPI networks were then constructed using Pathway Commons [46]. We applied this platform to extract crucial kinase‐substrate interactions from the quantitative phosphoproteome data on EGF‐stimulated GSCs [29]. As a result, p70S6K and Lyn were extracted as significant key regulators (Figure 7).

Figure 6.

Construction of phosphorylation‐oriented PPI networks via PTMapper. (A) Workflow for the visualization of kinase‐phosphorylation site relationships in PPI networks via PTMapper. Phosphorylation sites are connected with the parental protein nodes in PPI networks and the upstream kinases are then added to the phosphorylation sites. (B) Phosphorylation site-oriented networks constructed from the phosphoproteome data on EGF-stimulated glioblastoma stem cells. The solid arrows represent functionally directed protein‐protein interactions or kinase‐substrate interactions, whereas the dotted lines show the linkages of proteins and their phosphorylation sites, adapted from Ref. [12].
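The workflow in panel (A) can be pictured as a small graph‐building routine: start from directed PPI edges, attach each observed phosphorylation site to its parent protein, and then attach known upstream kinases to those sites. The sketch below follows that description only. It is not PTMapper's code, and the function name, edge labels and example nodes are hypothetical.

```python
from collections import defaultdict


def build_site_network(ppi_edges, observed_sites, kinase_site_pairs):
    """Return a dict mapping each node to a set of (target, edge_type).

    ppi_edges          iterable of (source_protein, target_protein)
    observed_sites     iterable of (protein, site), e.g. ("SubstrateB", "S10")
    kinase_site_pairs  iterable of (kinase, protein, site) taken from a
                       kinase-substrate resource
    """
    network = defaultdict(set)
    observed = set(observed_sites)

    # Step 1: directed protein-protein interactions.
    for source, target in ppi_edges:
        network[source].add((target, "ppi"))

    # Step 2: link every observed site to its parent protein node.
    for protein, site in observed:
        network[protein].add((f"{protein}:{site}", "has_site"))

    # Step 3: add upstream kinases, but only for sites present in the data.
    for kinase, protein, site in kinase_site_pairs:
        if (protein, site) in observed:
            network[kinase].add((f"{protein}:{site}", "phosphorylates"))

    return dict(network)


if __name__ == "__main__":
    # Placeholder nodes, for illustration only.
    net = build_site_network(
        ppi_edges=[("ReceptorA", "SubstrateB")],
        observed_sites=[("SubstrateB", "S10")],
        kinase_site_pairs=[("KinaseK", "SubstrateB", "S10"),
                           ("KinaseK", "SubstrateB", "T99")],
    )
    for node, edges in sorted(net.items()):
        print(node, sorted(edges))
```

Keeping sites as separate nodes, rather than as attributes of the protein node, is what allows a protein with several phosphorylation sites to be connected to a different kinase at each site.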

Figure 7.

Comparison of the sub‐networks extracted from EGF‐dependent phosphorylation dynamics of glioblastoma stem cells. (A) Schematic procedure for the evaluation of PTMapper‐based network construction. (B) The most significantly regulated sub‐networks extracted from the conventional protein interaction network. (C) The phosphorylation site-oriented network generated via PTMapper. The nodes surrounded by the border with the upper-right numbers indicate the common molecules in the two types of the sub-networks. The solid arrows represent functionally directed protein-protein interactions or kinase-substrate interactions, whereas the dotted lines show the linkages of proteins and their phosphorylation sites. The dashed circles indicate p70S6K and Lyn, adapted from Ref. [12].

5. Perspectives and conclusions

The bioinformatical description of GSC signalling dynamics based on the global quantitative phosphoproteome data led to network‐wide extraction of critical molecules and their related pathways for defining stemness characteristics. Further integrative description of multiple PTM dynamics in GSCs will deepen our understanding of the nature of their cell signalling complexity at the network level. We believe that shotgun proteomics‐based quantitative analyses of cancer stem cell signalling networks in combination with various statistical and mathematical platforms will pave the way to establish new directions towards systematic evaluation of drug targets in a cell‐type specific manner.


We thank Dr. Yuta Narushima for his technical support. We are also thankful to all the members of Medical Proteomics Laboratory, The Institute of Medical Science, The University of Tokyo. This work was supported by grants‐in‐aid for scientific research on innovative areas (integrative understanding of biological signalling networks based on mathematical science) and grant‐in‐aid for scientific research (C).

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Hiroko Kozuka‐Hata and Masaaki Oyama (September 13th 2017). Comprehensive Network Analysis of Cancer Stem Cell Signalling through Systematic Integration of Post-Translational Modification Dynamics, Applications of RNA-Seq and Omics Strategies - From Microorganisms to Human Health, Fabio A. Marchi, Priscila D.R. Cirillo and Elvis C. Mateo, IntechOpen, DOI: 10.5772/intechopen.69647. Available from:
