Glioblastoma (GBM; WHO grade IV astrocytoma) is the most common and most malignant primary brain cancer in adults. The current standard treatment is maximal safe resection of the tumor followed by irradiation and adjuvant temozolomide chemotherapy. Despite advances in treatment modalities, median survival remains very poor. Recent molecular and genetic profiling studies using various genomic technologies have identified several markers and unique signatures as prognostic and predictive factors in GBM. However, none of them has been translated into the clinic. Additional, more potent markers are therefore required, and proteomics appears particularly promising because transcript levels do not always correlate with protein levels, owing to differences in translation efficiency, protein stability and miRNA regulation. Conventional methods such as 2-dimensional gel electrophoresis have been in use for several decades. However, the ability to identify proteins by mass spectrometry has created renewed interest in proteomics. Technological advances in proteomics, such as gel-free high-throughput quantitative methods, are anticipated to revolutionize biomarker discovery for classification, prognosis and monitoring of treatment response, to reveal novel targets for better treatment, and to improve our understanding of glioma biology.
2. Why is proteomics essential?
The foundation for our understanding that cancer is a genetic disease arising from changes in genes or gene activity was laid by the field of tumor virology (Javier and Butel, 2008). Throughout the last century, tumor virology produced groundbreaking findings that helped us to understand the causes of cancer. In the new era of genomics, techniques such as DNA microarray technology, which can detect changes in the expression of several thousand genes simultaneously, further accelerated our understanding of cancer. This helped in developing molecular markers and gene signatures, leading to improved and more accurate diagnosis, tumor grading and alternative therapeutic methods including targeted therapies. However, the detection of genes or their expression does not by itself reflect the dynamics of the various processes in the cell.
Proteomics is the study of all the proteins expressed by a given cell, tissue or organism at a given time and under specific conditions. Since proteins are the functional units of the cell, the development of cancer is directly influenced by protein synthesis, protein levels and the interaction of proteins with other molecules. Because it is now understood that cancer development and progression are largely due to aberrant signaling pathways in which the entire network of proteins plays a major role, proteomics becomes vital to further advancing our understanding of various cellular processes. Genomics-based studies alone are not sufficient for a complete understanding of cancer. For example, the mRNA level does not always reflect the protein content of the cell. While the formation of mRNA by transcription is the first and an important step in gene expression, the protein level is also controlled by other steps such as translation and the stability of the mRNA and the protein. Multiple proteins can be formed from a single gene through alternative splicing. It is predicted that a human gene can, on average, encode three or more proteins (Wilkins et al., 1996). Proteins are also subject to post-translational modifications, which play a major role in regulating their functions by altering their localization, interactions and turnover. It is estimated that proteins can undergo as many as 200 different kinds of modification (Krishna and Wold, 1993). Protein function is also regulated by localization and by interaction with other proteins. Using a proteomic approach, it is possible to identify changes in protein modifications, subcellular localization and protein-protein interactions for many proteins expressed in a cell. Proteomics also helps in the annotation of the genome, since predicting genes from genomic data using bioinformatics algorithms is not always accurate.
An integration of genomic data with proteomic studies is needed to achieve complete annotation of the genome.
3. Proteomic methods
3.1. Detection methods
Conventionally, protein identification was carried out by Edman degradation, a chemical method for sequencing amino acids from the N-terminal end of a peptide (Edman, 1949). The limitations of this method are that it is laborious and time consuming and requires a relatively large amount of protein sample. Mass spectrometers, traditionally used for measuring the mass of small molecules, became an important and indispensable tool in protein identification after the introduction of the Matrix-Assisted Laser Desorption/Ionization (MALDI) methodology (Karas and Hillenkamp, 1988).
3.1.1. Mass spectrometer
A mass spectrometer comprises three principal components: an ionization source, an analyzer and a detector. The ionization source can be either MALDI or Electrospray Ionization (ESI). In MALDI, the analyte is co-crystallized with a matrix, which absorbs the laser energy and transfers it to the analyte, thus helping its ionization. The ions generated in MALDI are predominantly singly charged. In ESI, the analytes in solution are introduced through a fine capillary needle into a high-voltage chamber, where the solvent evaporates, leaving behind the analytes as multiply charged ions. The generated ions are then separated by the analyzer according to their mass-to-charge (m/z) ratio.
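The difference between singly charged MALDI ions and multiply charged ESI ions can be illustrated with a short calculation. The following is a minimal sketch, assuming positive-mode ionization by protonation; the function names and the example mass are hypothetical and are not taken from any instrument software.

```python
# Minimal sketch: m/z of protonated ions at different charge states.
# Assumption: ions are formed by adding z protons to a neutral molecule.

PROTON_MASS = 1.007276  # Da


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of a molecule of `neutral_mass` (Da) carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_series(neutral_mass: float, max_charge: int) -> dict:
    """Return {charge: m/z} for charge states 1..max_charge."""
    return {z: mz(neutral_mass, z) for z in range(1, max_charge + 1)}


if __name__ == "__main__":
    # A hypothetical 1000 Da peptide: one peak in MALDI (z = 1),
    # several peaks at lower m/z in ESI (z = 1, 2, 3).
    for z, value in charge_series(1000.0, 3).items():
        print(f"z = {z}: m/z = {value:.4f}")
```

The same neutral molecule therefore appears at a lower m/z for each additional charge, which is why ESI spectra of large molecules fall within the m/z range of common analyzers.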
The different types of analyzer include Time Of Flight (TOF), quadrupole and Fourier Transform Ion Cyclotron Resonance (FT-ICR). The TOF analyzer is commonly used in MALDI-based mass spectrometry. In a TOF analyzer, an electric field accelerates the ions, which are then allowed to drift through a vacuum tube towards the detector. Since all the ions carry the same charge, they acquire the same kinetic energy. Thus, the ions are separated solely on the basis of their masses, and the ion with the least mass reaches the detector first. In a quadrupole mass analyzer, a radio frequency (RF) field is created between four parallel metal rods through which the ions pass. The paths of the ions are selectively stabilized or destabilized by superimposing oscillating electric fields, such that only ions within a certain range of m/z are allowed to reach the detector at any given time. Thus, the quadrupole mass analyzer acts as a mass-selective filter. In FT-ICR, the ions move in a fixed magnetic field, a motion known as ion cyclotron resonance, in which the cycling frequency of an ion is determined by its mass-to-charge ratio. These ions are excited by an RF signal, which produces an image current measurable by the detector.
The final component of the mass spectrometer is a detector, typically an electron multiplier. When the ions hit the surface of the detector, the current signal produced is recorded and represented as a mass spectrum.
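The TOF principle follows from equating the kinetic energy gained during acceleration, zeV, with mv²/2, which gives a flight time t = L·sqrt(m / (2zeV)) over a drift length L. The sketch below works through this relation; the acceleration voltage and drift length in the usage example are illustrative assumptions, not the specifications of any particular instrument.

```python
# Minimal sketch of the linear TOF relation t = L * sqrt(m / (2 z e V)).
# Assumption: an idealized linear analyzer with no initial velocity spread.
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def flight_time(mass_da: float, charge: int, accel_voltage: float,
                drift_length: float) -> float:
    """Return the drift time in seconds of an ion in a field-free tube."""
    if mass_da <= 0 or charge < 1 or accel_voltage <= 0 or drift_length <= 0:
        raise ValueError("all arguments must be positive")
    kinetic_energy = charge * ELEMENTARY_CHARGE * accel_voltage  # J
    velocity = math.sqrt(2.0 * kinetic_energy / (mass_da * DALTON))  # m/s
    return drift_length / velocity


if __name__ == "__main__":
    # Hypothetical settings: 20 kV acceleration, 1 m drift tube, singly charged ions.
    for mass in (1000.0, 2000.0, 4000.0):
        t = flight_time(mass, charge=1, accel_voltage=20_000.0, drift_length=1.0)
        print(f"{mass:7.1f} Da -> {t * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of the mass, an ion four times heavier takes twice as long to reach the detector.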
3.1.2. Protein identification using peptide mass fingerprinting (PMF) and tandem MS
In proteomics, protein identification is most often achieved by MALDI-TOF MS analysis. Typically, the protein is digested with trypsin and the resulting peptide mixture is analyzed by MALDI, which generates a spectrum with peaks representing the masses of singly charged peptides. The list of peptide masses is searched against tryptic peptides generated in silico from a database of protein sequences (Pappin et al., 1993) or translated nucleic acid sequences (James et al., 1994) to confirm the identity of the protein. This process is called peptide mass fingerprinting. However, if the sample contains more than one protein, the resulting spectra may be complex and are handled less successfully by existing algorithms and software. Spectral peaks corresponding to peptides with known possible modifications can be matched to database peptide sequences if the possibility of the modification is specified by the user. Unanticipated modifications, however, can lead to incorrect identification of the protein. In such cases, tandem MS (MS/MS) provides the most powerful tool to determine protein identity and modification unambiguously.
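The database search behind peptide mass fingerprinting can be sketched in a few lines. The example below is a simplified illustration, not the algorithm of any published search engine: it cleaves after every lysine and arginine (ignoring missed cleavages, modifications and the reduced cleavage before proline), computes singly protonated monoisotopic masses, and ranks candidate proteins by the number of observed peaks they explain. The function names, the toy database and the 0.2 Da tolerance are assumptions made for illustration.

```python
# Minimal sketch of peptide mass fingerprinting (PMF) scoring.
# Assumptions: complete tryptic cleavage after K/R, no modifications,
# monoisotopic masses, singly protonated peptides ([M+H]+).

MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, added once per peptide
PROTON = 1.007276   # Da, charge carrier


def tryptic_digest(sequence: str) -> list:
    """Split a protein sequence after every K or R (simplified trypsin rule)."""
    peptides, current = [], []
    for residue in sequence:
        current.append(residue)
        if residue in "KR":
            peptides.append("".join(current))
            current = []
    if current:
        peptides.append("".join(current))
    return peptides


def singly_charged_mass(peptide: str) -> float:
    """Return the monoisotopic [M+H]+ mass of a peptide."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed: list, sequence: str, tol: float) -> int:
    """Count observed peaks that lie within `tol` Da of a theoretical peptide."""
    theoretical = [singly_charged_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(peak - m) <= tol for m in theoretical) for peak in observed)


def rank_candidates(observed: list, database: dict, tol: float = 0.2) -> list:
    """Return (protein name, matched peak count) pairs, best match first."""
    scores = [(name, count_matches(observed, seq, tol))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_database = {"protA": "AGKCDRE", "protB": "WWKYYR"}  # hypothetical sequences
    peaks = [singly_charged_mass("AGK"), singly_charged_mass("CDR")]
    print(rank_candidates(peaks, toy_database))
```

Real search engines additionally weight matches statistically and allow for missed cleavages and user-specified modifications, which is where the limitations described above arise.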
3.2. Separation methods
3.2.1. Gel based separation methods
2-dimensional gel electrophoresis (2DE) was probably the first proteomic technique to be introduced and is still widely used for large-scale separation and identification of proteins. It was first described and demonstrated by O’Farrell (O’Farrell, 1975). In 1977, Anderson and coworkers applied 2DE to the separation of plasma proteins (Anderson and Anderson, 1977). Since then, 2DE has been used extensively to profile the proteomes of many organisms, organelles, cell lines and biological fluids. The technique separates proteins on the basis of two independent properties. In the first step, isoelectric focusing (IEF), the proteins are separated according to their isoelectric point. In the second step, the proteins are separated according to their molecular weight by sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). The protein spots are visualized by Coomassie blue or silver staining, and digital images of the gels can be analyzed with software to obtain quantitative information on the protein spots. The protein spot of interest is excised, destained and subjected to tryptic digestion. The tryptic peptides are analyzed by MS to acquire the Peptide Mass Fingerprint (PMF). The peaks in a PMF correspond to peptides derived from the protein spot excised from the 2D gel. The peptide mass values are searched against the appropriate database to confirm the identity of the protein spot. However, conventional 2DE-based separation and visualization techniques often lack reproducibility in proteome separation and the sensitivity to detect low-abundance proteins.
To improve conventional 2DE in this respect, Differential gel electrophoresis (DIGE) was introduced (Unlu et al., 1997). In DIGE, two different samples are first labeled with fluorescent dyes (Cy3 and Cy5), then mixed together and subjected to 2DE on a single gel. The gel is subsequently scanned at different wavelengths to generate the images corresponding to the different samples used in the experiment. This provides two advantages: first, fluorescent labeling is very sensitive and can therefore detect even low-abundance proteins; second, the control (Cy3) and test (Cy5) samples are separated on a single gel, thereby eliminating gel-to-gel variation and artifacts. Therefore, any variation seen between the two differentially labeled samples is mostly due to actual biological differences. After the fluorescent images have been generated for software-based analysis, the gel is counterstained with Coomassie or silver stain and the spot of interest is subjected to MS to confirm its identity. Most often, however, the spots seen by fluorescence are not visualized by the conventional staining methods. Therefore, the protein spots of interest are preferably taken from a preparative gel run independently. When a larger number of control and test samples are to be analyzed by DIGE, a common pool consisting of equal amounts of all samples is generally made and labeled with Cy2 dye. This Cy2-labeled pool is run along with the control (Cy3) and test (Cy5) samples in every gel as an internal standard, which helps in normalizing variation between gels.
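The role of the Cy2 internal standard can be illustrated with a small calculation. The sketch below is a simplified illustration, not the procedure of any commercial DIGE software: for each gel, the spot volume of the control (Cy3) and test (Cy5) channel is divided by the Cy2 volume of the same spot, and the log2 ratios are then averaged across gels. The dictionary keys and the spot volumes in the usage example are hypothetical.

```python
# Minimal sketch of DIGE normalization against a pooled Cy2 internal standard.
# Assumption: each gel provides background-corrected spot volumes for the
# same matched spot in the Cy2 (pool), Cy3 (control) and Cy5 (test) channels.
import math
from statistics import mean


def standardized_log_abundance(sample_volume: float, standard_volume: float) -> float:
    """Return log2(sample / internal standard) for one spot on one gel."""
    if sample_volume <= 0 or standard_volume <= 0:
        raise ValueError("spot volumes must be positive")
    return math.log2(sample_volume / standard_volume)


def log2_fold_change(gels: list) -> float:
    """Return mean log2(test/standard) minus mean log2(control/standard)."""
    control = [standardized_log_abundance(g["cy3"], g["cy2"]) for g in gels]
    test = [standardized_log_abundance(g["cy5"], g["cy2"]) for g in gels]
    return mean(test) - mean(control)


if __name__ == "__main__":
    # Two hypothetical gels: the second gel is twice as bright overall,
    # but the Cy2 standard removes this gel-to-gel difference.
    gels = [
        {"cy2": 100.0, "cy3": 50.0, "cy5": 100.0},
        {"cy2": 200.0, "cy3": 100.0, "cy5": 200.0},
    ]
    print(f"log2 fold change (test vs control): {log2_fold_change(gels):.2f}")
```

Because every gel carries the same pooled standard, a spot that is uniformly brighter on one gel yields the same standardized abundance as on any other gel.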
3.2.2. Non-gel based separation methods
3.2.2.1. Label-free methods
Despite the advancements in 2DE, it is not effective for identifying proteins of low abundance, high hydrophobicity, or extreme pI or molecular weight. To overcome these problems, in-solution separation by Liquid Chromatography (LC) was coupled with ESI-MS-based identification of proteins. In LC-based proteomics, a complex mixture of proteins is first digested into peptides by proteases, separated by one or more dimensions of LC and subjected to tandem MS analysis. Protein identification is achieved on the basis of one or more identified peptide sequences. The separation and analysis of tryptic peptides rather than intact proteins is termed the bottom-up approach. To achieve efficient separation of peptides, multi-dimensional chromatography is used, and this technique is popularly called Multidimensional Protein Identification Technology (MudPIT) (Washburn et al., 2001). MudPIT is the most common two-dimensional LC separation and combines Strong Cation Exchange (SCX) chromatography with reverse phase (RP) chromatography. The fractions separated by RP are either injected directly into an online ESI-MS or spotted onto a MALDI target plate for analysis by MS/MS.
3.2.2.2. Labeling methods
The combination of protein labeling approaches with LC-based separation coupled to ESI-MS has greatly facilitated the large-scale identification and quantification of proteins. Such a combined approach is called quantitative proteomics. The different labeling techniques include Stable Isotope Labeling of Amino acids in Cell culture (SILAC), Isotope Coded Affinity Tag (ICAT), Isobaric Tag for Relative and Absolute Quantification of peptides (iTRAQ) and proteolytic 18O labeling. These techniques differentially label the proteins of two (or more) conditions. This is achieved by metabolic incorporation of isotopically distinct amino acids in live cells (SILAC), by chemical labeling of proteins with isotopically distinct tags bound to –SH groups (ICAT), or by isobaric amine-specific tags bound to amino groups (iTRAQ) (Figure 1).
SILAC involves growing cells under two different conditions (control vs. test): one in medium containing a ‘light’ amino acid and the other in medium containing a ‘heavy’ amino acid (Ong et al., 2002). The heavy amino acid contains 2H instead of H, 13C instead of 12C, or 15N instead of 14N. Incorporation of a heavy amino acid into a peptide results in a known mass shift compared with the peptide containing the light amino acid. For example, an amino acid containing six carbons, such as arginine, when labeled with the 13C isotope differentiates a given peptide derived from the two conditions by a mass difference of 6 Da. The protein samples from the control and test conditions are mixed in equal proportion and digested with trypsin, which selectively cleaves at the carboxyl side of arginine or lysine. The resulting peptide mixture is analyzed by LC-MS to generate an MS spectrum in which each peptide appears as a pair with the expected mass difference. The relative abundance of a peptide is obtained by comparing the intensities of the two peptides within a pair (Figure 1A).
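The pairing of light and heavy peptides can be sketched as follows. This is an illustrative simplification and not the algorithm of any particular quantification package: it assumes a single labeled arginine carrying six 13C atoms per peptide, a known charge state, and a peak list already reduced to monoisotopic peaks. The function names, tolerance and peak values are hypothetical.

```python
# Minimal sketch of SILAC pair detection and heavy/light quantification.
# Assumptions: one 13C6-labeled residue per peptide, known charge state,
# and a centroided peak list of (m/z, intensity) tuples.

C13_MINUS_C12 = 1.003355  # Da, mass difference between 13C and 12C


def heavy_shift(n_labeled_carbons: int) -> float:
    """Return the mass shift (Da) caused by replacing 12C with 13C."""
    return n_labeled_carbons * C13_MINUS_C12


def mz_spacing(n_labeled_carbons: int, charge: int) -> float:
    """Return the m/z distance between the light and heavy peak of a pair."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return heavy_shift(n_labeled_carbons) / charge


def find_pairs(peaks: list, charge: int, n_labeled_carbons: int = 6,
               tol: float = 0.01) -> list:
    """Return (light m/z, heavy m/z, heavy/light intensity ratio) tuples."""
    spacing = mz_spacing(n_labeled_carbons, charge)
    pairs = []
    for light_mz, light_int in peaks:
        for heavy_mz, heavy_int in peaks:
            if light_int > 0 and abs((heavy_mz - light_mz) - spacing) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs


if __name__ == "__main__":
    # Hypothetical doubly charged peptide pair plus one unrelated peak.
    spectrum = [(500.0, 100.0), (503.010065, 250.0), (600.0, 40.0)]
    for light, heavy, ratio in find_pairs(spectrum, charge=2):
        print(f"light {light:.3f} / heavy {heavy:.3f}: ratio {ratio:.2f}")
```

Note that the nominal 6 Da shift corresponds to about 6.02 Da in exact mass, and that the spacing observed in the spectrum is this shift divided by the charge of the peptide ion.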
In 1999, Aebersold and co-workers (Gygi et al., 1999) introduced the ICAT method for relative quantitation of protein abundance. The ICAT reagent contains three moieties: a biotin tag to isolate the ICAT-labeled peptides, a linker carrying either eight hydrogen or eight deuterium atoms, giving an isotopic mass difference of 8 Da, and a cysteine-reactive (thiol-specific) iodoacetamide group. In this technique, the complex protein mixtures from two different conditions (control vs. test) are labeled with the light and heavy ICAT reagents, such that one condition carries hydrogen and the other deuterium, giving rise to the 8 Da mass difference. The labeled protein samples from the two conditions are mixed in equal amounts and subjected to proteolytic digestion, and the resulting peptide mixture is affinity purified on an avidin column to enrich the labeled peptides. The peptides are analyzed by LC-MS to obtain the relative abundance of a given protein (Figure 1B).
In iTRAQ, the peptides derived from different protein samples are labeled with iTRAQ tags. The tag is covalently linked to the N-terminal amine group and the side-chain amine groups of peptides. The tag has a mass of 145 Da and consists of three moieties: an N-hydroxysuccinimide (NHS) ester for reaction with primary amine groups on peptides, a balance moiety (carbonyl group) and a reporter moiety. The reporter moiety has a mass of 114, 115, 116 or 117 Da, and hence four different conditions (4-plex) can be analyzed and quantified relative to one another simultaneously (Ross et al., 2004). The peptides from the different conditions are labeled with iTRAQ tags that differ in their reporter ions, pooled, fractionated by nano-LC and analyzed by tandem mass spectrometry. Because the iTRAQ tags are isobaric, each peptide appears as a single peak in the MS scan. When the iTRAQ-labeled peptides are analyzed by MS/MS, each tagged peptide dissociates into peptide fragment ions that reveal its sequence, a neutral balance moiety and the reporter ions. Protein identification is achieved by searching the peptide sequence against the database(s). The relative abundance of the protein is estimated from the relative peak intensities of the liberated reporter ions (Figure 1C).
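Reporter-ion quantification can be sketched as follows. The example is a simplified illustration under stated assumptions: it uses the nominal reporter masses given above with a generous matching tolerance, sums the intensity found in each reporter channel of one MS/MS spectrum, and expresses each channel relative to a chosen reference channel. Function names, the tolerance and the peak list are hypothetical, and isotope impurity correction is omitted.

```python
# Minimal sketch of 4-plex iTRAQ reporter-ion quantification.
# Assumptions: nominal reporter masses, no isotope impurity correction,
# and an MS/MS peak list of (m/z, intensity) tuples for one peptide.

TAG_MASS = 145                         # Da, nominal mass of the complete tag
REPORTER_MASSES = (114, 115, 116, 117)  # Da, nominal reporter masses


def balance_mass(reporter_mass: int) -> int:
    """Return the balance mass that keeps the whole tag isobaric."""
    if reporter_mass not in REPORTER_MASSES:
        raise ValueError("unknown reporter mass")
    return TAG_MASS - reporter_mass


def reporter_intensities(msms_peaks: list, tol: float = 0.2) -> dict:
    """Sum the intensity of peaks within `tol` of each nominal reporter mass."""
    totals = {reporter: 0.0 for reporter in REPORTER_MASSES}
    for mz, intensity in msms_peaks:
        for reporter in REPORTER_MASSES:
            if abs(mz - reporter) <= tol:
                totals[reporter] += intensity
    return totals


def relative_abundance(totals: dict, reference: int = 114) -> dict:
    """Express every reporter channel relative to the reference channel."""
    reference_intensity = totals[reference]
    if reference_intensity <= 0:
        raise ValueError("reference channel has no signal")
    return {reporter: value / reference_intensity
            for reporter, value in totals.items()}


if __name__ == "__main__":
    # Hypothetical MS/MS spectrum: four reporter ions and one fragment ion.
    spectrum = [(114.1, 1000.0), (115.1, 2000.0), (116.1, 500.0),
                (117.1, 1000.0), (300.2, 9999.0)]
    print(relative_abundance(reporter_intensities(spectrum)))
```

The balance masses of 31 to 28 Da compensate for the reporter masses of 114 to 117 Da, which is why differently labeled copies of a peptide coincide in the MS scan and separate only after fragmentation.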
3.3. Other proteomic methods
Here we briefly describe other techniques available for proteomic analysis. More detailed information can be found elsewhere (Angenendt et al., 2006; Diamandis, 2003; Haab, 2005; Kim et al., 2009; Petricoin et al., 2002). A relatively new proteomic technique suitable for high-throughput profiling is Surface Enhanced Laser Desorption/Ionization-Time-Of-Flight (SELDI-TOF), which combines affinity chromatography on a modified surface (surface-enhanced) with MALDI-TOF. Protein chips with different chromatographic surfaces, either chemically modified (hydrophobic, hydrophilic, cationic, anionic or metal ion) or biochemically modified (antibody-antigen, receptor-ligand or DNA-protein), are used to fractionate proteins from complex protein mixtures. In the next step, an energy-absorbing matrix is layered over the protein chip and a spectrum is obtained following laser desorption/ionization. While SELDI is capable of producing protein spectra that can discriminate tumor samples from normal controls, it does not provide the identity of the protein markers involved. The SELDI-based proteomic approach involves a pattern discovery phase, in which a discriminatory peak pattern is identified, and a pattern matching phase, in which the identified pattern is validated.
Antibody and protein arrays are powerful chip-based proteomic techniques that do not utilize mass spectrometry. An antibody array uses antibodies covalently immobilized onto a solid surface, such as a glass slide, to capture labeled antigens. In contrast, a protein array contains proteins of interest arrayed on a flat surface. The main use of protein arrays is to detect the autoantibodies seen in conditions such as autoimmune disorders and cancer. Since the main hurdle in the development of protein arrays is the expression and purification of proteins, alternative techniques such as chemical synthesis, cell-free DNA expression and cell-free in situ expression of PCR products are being developed (Angenendt et al., 2006). Bead arrays are a modification of antibody arrays in which antibodies are immobilized on spherical particles containing integrated fluorescent reporter dyes. Since the reporter dye encodes the identity of the capture agent attached to the bead, a multiplex assay format is produced by mixing beads with different reporter dyes and capture agents. Because of their higher sample throughput, bead arrays are particularly suitable for screening large numbers of samples.
4. Application of proteomics in glioma
Histological methods relying on the microscopic description of glioma have helped not only in classification but also in making therapeutic decisions (Brat et al., 2008). However, subjectivity and inter-observer variation in histopathology may compromise the choice of therapeutic modality (Brat et al., 2008). In the last two decades, genomics, in the form of various high-throughput techniques, has helped enormously, not only in increasing our understanding of glioma biology but also in identifying several genes and gene signatures for diagnosis, grading and therapy of glioma. These findings have not been translated into the clinic, and hence patient outcome remains poor. Given the requirement for better validated, robust markers, proteomics appears promising in taking the lead. We summarize below important discoveries in glioma research made using various proteomic methods.
4.1. Glioma tumor tissue and cell line based studies
Using 2DE, Iwadate et al. analyzed 85 tissue samples (52 GBM, 13 AA, 10 DA and 10 normal brain samples) and identified 57 protein spots that could be used to differentiate tumors from normal brain tissue (Iwadate et al., 2004). Of the differentially regulated protein spots, the identity of 5 spots was determined by MALDI-TOF, and the proteins were categorized into various biological functions: signal transduction-related proteins, molecular chaperones, transcription and translation regulators, cell cycle-mediating proteins, extracellular matrix-related proteins and cell adhesion molecules. Higher expression of four proteins (VREB1, GRP78, RhoA and Rac1) and lower expression of enolase in grade IV tumors compared with lower-grade tumors were also confirmed by immunohistochemistry. In another attempt to identify proteins differentially expressed between astrocytoma grades, 10 grade II and 10 grade IV samples were analyzed by 2DE followed by mass spectrometry, which resulted in the identification of 15 differentially expressed proteins (Odreman et al., 2005). These findings were subsequently validated by western blotting and immunohistochemical staining. The proteins more highly expressed in glioblastoma multiforme were peroxiredoxin 1 and 6, the transcription factor BTF3 and α-B-crystallin, whereas protein disulfide isomerase A3, the catalytic subunit of the cAMP-dependent protein kinase and glial fibrillary acidic protein were increased in low-grade astrocytomas. Hiratsuka et al. used six non-tumoral samples and five glioma samples (one grade II, two grade III and two grade IV) for 2DE and MALDI-TOF spectrometry and identified 11 upregulated and 4 downregulated proteins in glioma compared with normal samples (Hiratsuka et al., 2003). Northern blotting confirmed the downregulation of SIRT2 in glioma.
Further, overexpression of SIRT2 resulted in perturbation of the microtubule network and reduced colony formation, suggesting that SIRT2 may act as a tumor suppressor in glioma. Ngo et al. used a 2D-DIGE-based proteomic approach to identify differences in protein expression between two glioma cell lines that differ in their chromosome 1p status (Ngo et al., 2007). The rationale for this study is that patients with 1p+/- anaplastic oligodendroglioma respond better to chemotherapeutic agents such as procarbazine, 1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea and vincristine. Comparison of the A172 (1p+/-) and U251 (1p+/+) glioma cell lines identified 9 spots as differentially regulated proteins, out of which the identity was found by MALDI-TOF/TOF MS for 18 proteins, including three proteins encoded by the 1p region: α-enolase, stathmin and DJ-1. Further analysis revealed that decreased stathmin, a microtubule-associated protein, was associated with loss of heterozygosity and with increased recurrence-free survival in anaplastic oligodendroglioma patients. Expression of stathmin was also found to correlate inversely with the overall survival of nitrosourea-treated mice carrying xenograft tumors. On the mechanistic front, this group identified increased mitotic arrest upon nitrosourea treatment in cells with lower stathmin expression as a potential reason for the improved outcome of patients with 1p+/- anaplastic oligodendroglioma tumors.
In an effort to identify proteins that could affect the sensitivity of glioma cells to chemotherapy, Puchades et al. used 2DE to analyze the protein profiles of control U87 glioma cells and of U87 cells infected with p53 adenovirus, treated with SN38 (a topoisomerase 1 inhibitor), or both. Of the many proteins modulated by these treatments, they found galectin 1 to be downregulated by p53, an effect further enhanced by SN38 treatment. Further investigation revealed high expression of galectin 1 in glioma cell lines, and downregulation of galectin 1 sensitized glioma cells to chemotherapy, suggesting that galectin 1 could be a potential therapeutic target (Puchades et al., 2007). Another interesting review systematically compared many independent glioma proteomic studies and identified 10 proteins (PHB, Hsp20, serum albumin, epidermal growth factor receptor (EGFR), EA-15, RhoGDI, APOA1, GFAP, HSP70 and PDIA3) as differentially expressed in gliomas (Deighton et al., 2010).
4.2. Serum and CSF based studies
Using a combination of 2DE and MALDI-TOF MS, Kumar et al. analyzed 14 GBM and 6 normal control sera and identified the haptoglobin (Hp) α2 chain as an upregulated serum protein in GBM patients (Kumar et al., 2010). GBM-specific upregulation was confirmed by ELISA-based quantitation of Hp in the serum of 99 GBM patients compared with patients with lower-grade tumors (49 grade III/AA; 26 grade II/DA) and 26 normal individuals. Further validation, using RT-qPCR on an independent set of tumor and normal brain samples and immunohistochemical staining on a subset of these samples, showed that Hp transcript and protein levels, respectively, increased with tumor grade and were highest in GBM. Further investigation revealed that overexpression of Hp in immortalized astrocytes, either by stable integration of Hp cDNA or by exogenous addition of purified Hp, resulted in increased cell migration. Conversely, RNAi-mediated silencing of Hp in glioma cells decreased cell migration. Mouse melanoma and human glioma cells overexpressing Hp showed increased tumor growth and decreased survival of the mice in a xenograft model. SELDI-TOF analysis of serum samples, coupled with an artificial neural network (ANN) algorithm followed by discriminant analysis, identified a fifteen-peak pattern that can distinguish low-grade (I/II) from high-grade (III/IV) tumors with an accuracy of 85.7% (Liu et al., 2005). Petrik et al. compared the SELDI-TOF mass spectra derived from 200 serum samples, comprising 58 control subjects, 36 patients with grade II astrocytoma, 15 with anaplastic astrocytoma and 91 with glioblastoma, and identified a peak of 2.740 kDa that showed decreasing intensity with increasing astrocytoma grade (Petrik et al., 2008). The peak was subsequently identified by tandem MS as the B-chain of α2-Heremans-Schmid glycoprotein (AHSG). This finding was validated by measuring serum AHSG levels by turbidimetry in an independent set of serum samples.
More interestingly, a multivariate Cox proportional hazards model identified the serum AHSG level as an independent predictor of patient survival, with normal levels being associated with prolonged survival. Analysis of cerebrospinal fluid (CSF) samples from 60 patients, comprising 47 with brain tumors and 13 nontumor controls, by two independent proteomic techniques, 2DE and cleavable ICAT, found attractin to be present exclusively in grade III and IV astrocytoma (Khwaja et al., 2006). High-grade-specific upregulation of attractin in CSF was validated by western blotting and immunohistochemistry in an independent set of samples. Analysis of tumor samples demonstrated that the tumors were the source of the high levels seen in CSF. Finally, this study showed that attractin from CSF could induce glioma cell migration, suggesting the importance of attractin in the highly invasive nature of high-grade gliomas. Other studies in which proteomic approaches were used to analyze the serum and CSF of glioma patients have been reviewed recently (Niclou et al., 2010; Somasundaram et al., 2009).
Proteomics is certain to contribute to important discoveries in glioma, as it has already become an integral part of glioma research. Although glioma proteomics research has not yet yielded breakthrough findings close to clinical application, the great advances in proteomic technology, in the form of more powerful and sensitive mass spectrometers, quantitative proteomic methods such as SILAC, ICAT and iTRAQ, and improvements in data analysis through sophisticated databases and bioinformatic software, are likely to hasten biomarker discovery in the years to come. While gel-based separation techniques such as 2DE and 2D-DIGE will continue to be used, label-free and labeled quantitative MS methods will take center stage. Further, serum biomarker discovery is a field in which the proteomic platform is likely to play a major role. However, one major drawback of current proteomic studies is that most of them are low-throughput studies that have used a small number of samples. Their findings therefore require extensive validation in a large number of samples, perhaps using other techniques such as ELISA or tissue microarrays. Prospective validation studies are also needed to confirm the clinical benefit to patients.
This study was supported by a grant from DBT, Government of India. Infrastructural support by funding from ICMR (Center for Advanced studies in Molecular Medicine), DBT, DST (FIST) and UGC to department of MCB, Indian Institute of Science is acknowledged. MBN and DMK gratefully acknowledge senior research fellowship from IISc and CSIR respectively.