Gene Markers Representing Stem Cells and Cancer Cells for Quality Control

Populations of cells have unique characteristics and gene markers representative of each cell type, and these features are useful for identifying cell characteristics. For example, the gene expression profile of cells differs at each stage of development and differentiation. This review focuses on gene expression in stem and cancer cells to investigate the possibility of identifying cancer stem cells by such markers. Cancer stem cells show similarities to normal stem cells in terms of self-renewal and differentiation into multiple lineages. However, cancer stem cells have an indefinite potential for self-renewal that leads to malignant tumorigenesis. The origins of cancer stem cells are not completely clear but accumulation of gene mutations and cell niches are involved in their development. This article describes the gene expression patterns of stem and cancer cells with the aim of determining gene markers for diverse cell types and culture stages for quality control in cellular therapeutics.


Introduction
Populations of cells have unique characteristics and gene markers representative of each cell type, and these features are useful for identifying cell characteristics. For example, the gene expression profile of cells differs at each stage of development and differentiation. This review focuses on gene expression in stem and cancer cells to investigate the possibility of identifying cancer stem cells by such markers. Cancer stem cells show similarities to normal stem cells in terms of self-renewal and differentiation into multiple lineages. However, cancer stem cells have an indefinite potential for self-renewal that leads to malignant tumorigenesis. The origins of cancer stem cells are not completely clear but accumulation of gene mutations and cell niches are involved in their development. This article describes the gene expression patterns of stem and cancer cells with the aim of determining gene markers for diverse cell types and culture stages for quality control in cellular therapeutics.

The microarray quality control (MAQC) projects
Stem cells have varied gene and protein expression profiles, and identifying these profiles is important for quality control in disease treatment because illnesses such as cancer may change cell features. The differentiation capacity of stem cells might be altered upon malignancy, and cancer may arise from so-called cancer stem cells. Several methods are available to detect cell marker expression, such as surface protein marker detection, intracellular protein marker detection, and gene expression detection.
The MAQC project, a collaborative effort conducted as part of the US Food and Drug Administration's Critical Path Initiative for medical product development, is useful for detecting gene markers in cells (MAQC Consortium, 2006, 2010; Fan et al., 2010; Oberthuer et al., 2010; Huan et al., 2010; Luo et al., 2008; Parry et al., 2010; Shi et al., 2010; Miclaus et al., 2010; Hong et al., 2010; Tillinghast, 2010). It began in February 2005 and aims to describe the reliability and evaluate the performance of microarrays on several platforms. MAQC-I mainly focuses on the technical aspects of gene expression analysis, whereas MAQC-II focuses on developing accurate and reproducible multivariate gene expression-based prediction models. Possible uses for gene expression data are vast, including diagnosis, early detection (screening), monitoring of disease progression, risk assessment, prognosis, complex medical product characterisation and prediction of responses to treatment (with regard to safety or efficacy) with a drug or device labelling intent. The prediction performance of MAQC-II models depends on the endpoint; the endpoints include preclinical toxicity, breast cancer, multiple myeloma and neuroblastoma. Some endpoints are highly predictive based on the nature of the data, and other endpoints are difficult to predict regardless of the model development protocol. Clear differences in proficiency exist between data analysis teams, and such differences are correlated with the level of team experience. The internal validation performance from well-implemented, unbiased cross-validation analyses shows a high degree of concordance with the external validation performance in a strictly blinded process, and many models with similar performance can be developed from a given data set (Table 1).
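The comparison between internal cross-validation and external validation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression data are synthetic, the nearest-centroid classifier and all function names are hypothetical choices made for the example, and none of it reproduces the MAQC-II protocol or its data.

```python
import random
from statistics import mean

def make_samples(n, n_genes, shift, rng):
    """Synthetic profiles (assumption): class 1 has its first 5 genes shifted up."""
    samples = []
    for i in range(n):
        label = i % 2
        profile = [rng.gauss(shift if (label and g < 5) else 0.0, 1.0)
                   for g in range(n_genes)]
        samples.append((profile, label))
    return samples

def fit_centroids(train):
    """Nearest-centroid model: the mean expression profile of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [p for p, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, profile):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(profile, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(centroids, data):
    return mean(1.0 if predict(centroids, p) == y else 0.0 for p, y in data)

def cross_validate(data, k, rng):
    """k-fold cross-validation; the model is refit inside every fold,
    so held-out samples never influence the model that scores them."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [data[i] for i in idx if i not in held]
        test = [data[i] for i in fold]
        scores.append(accuracy(fit_centroids(train), test))
    return mean(scores)

rng = random.Random(0)
training_set = make_samples(60, 20, 1.5, rng)
external_set = make_samples(60, 20, 1.5, rng)  # stands in for the blinded set

internal_acc = cross_validate(training_set, 5, rng)
external_acc = accuracy(fit_centroids(training_set), external_set)
print(f"internal cross-validation accuracy: {internal_acc:.2f}")
print(f"external validation accuracy:       {external_acc:.2f}")
```

When the model is refit inside each fold, the internal estimate tends to sit close to the external one. Selecting genes or tuning parameters on the full data set before cross-validation is a common way to break that concordance.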

Table 1. Aims and summary of the MAQC-I and MAQC-II projects

MAQC-I
Aim: To address the concerns about the reliability of microarray techniques.
Summary: The technical performance of microarrays as assessed in the project supports their continued use for gene expression profiling in basic and applied research and may lead to their use as a clinical diagnostic tool as well.

MAQC-II
Aim: To develop and evaluate accurate and reproducible multivariate gene expression-based predictive models.
Summary:
1) Model prediction performance was endpoint dependent.
2) There are clear differences in proficiency between data analysis teams (organisations).
3) The internal validation performance from well-implemented, unbiased cross-validation shows a high degree of concordance with the external validation performance in a strict blinding process.
4) Many models with similar performance can be developed from a given data set.
5) Application of good modelling practices appeared to be more important than the actual choice of a particular algorithm over the others within the same step in the modelling process.

Applying good modelling practice seems to be more important than the actual choice of a particular algorithm over the others within the same step in the modelling process. The order of the analysis process was as follows: design, pilot study or internal validation, and pivotal study or external validation. Observations based on an analysis of the MAQC-II dataset may be applicable to other diseases (MAQC Consortium, 2010).

Gene markers for stem cells

Cell surface marker genes
The expression profile of stem cells changes as the cells differentiate. The expression pattern may change depending on the state of differentiation or the malignancy of the disease. Endothelial cells in glioblastomas have unique gene expression profiles, and the differences between glioblastomas and lower grade gliomas suggest a more complex ontogeny of the glioblastoma endothelium (Wang et al., 2010). Quantitative in situ hybridisation analyses have revealed that the proportion of fluorescence-activated cell-sorted CD105+ cells (CD105 is one of the human endothelial markers) carrying more than 3 copies of the epidermal growth factor receptor (EGFR) amplicon or the centromeric portion of chromosome 7 is similar to the proportion of tumour cells with the same aberrations. CD133 is a cell surface glycoprotein that has been used as a possible cancer stem cell marker. CD133 is also expressed in haematopoietic stem cells.

Genes representing the mesenchymal stem cell culture stage
Mesenchymal stem cells (MSCs) have been used to treat graft-versus-host disease (GVHD) (Weng et al., 2010; Le Blanc et al., 2008), and these studies suggest that an infusion of MSCs may be an effective therapy for patients with steroid-resistant acute GVHD. The necdin homologue (mouse) (NDN), EPH receptor A5 (EPHA5), nephroblastoma overexpressed gene (NOV) and runt-related transcription factor 2 (RUNX2) are possible markers to describe culture status, including growth capacity and differentiation (Tanabe et al., 2008). EPHA5 and NOV are upregulated in the late culture stage of human MSCs, whereas NDN and RUNX2 are downregulated (GEO series, Tanabe et al., 2008, accession GSE7637 and GSE7888).
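As a rough illustration of how such markers might be used in a quality control check, the sketch below compares two expression profiles and counts how many of the four genes move in the direction reported for late culture. The expression values, the pseudocount, the fold-change threshold and the function names are all hypothetical and chosen only for the example; they are not taken from GSE7637 or GSE7888.

```python
from math import log2

# Direction of change reported in the text for late versus early MSC culture.
EXPECTED_DIRECTION = {"EPHA5": "up", "NOV": "up", "NDN": "down", "RUNX2": "down"}

def log2_fold_change(early, late, pseudocount=1.0):
    """log2(late / early); the pseudocount (an assumption) avoids division by zero."""
    return log2((late + pseudocount) / (early + pseudocount))

def late_stage_score(early_profile, late_profile, threshold=1.0):
    """Fraction of marker genes whose change matches the expected direction
    by at least `threshold` log2 units (threshold is an illustrative choice)."""
    hits = 0
    for gene, direction in EXPECTED_DIRECTION.items():
        lfc = log2_fold_change(early_profile[gene], late_profile[gene])
        if direction == "up" and lfc >= threshold:
            hits += 1
        elif direction == "down" and lfc <= -threshold:
            hits += 1
    return hits / len(EXPECTED_DIRECTION)

# Hypothetical signal intensities, invented for this example only.
early = {"EPHA5": 20.0, "NOV": 35.0, "NDN": 400.0, "RUNX2": 300.0}
late = {"EPHA5": 95.0, "NOV": 180.0, "NDN": 90.0, "RUNX2": 310.0}

score = late_stage_score(early, late)
print(f"fraction of markers consistent with late culture stage: {score:.2f}")
```

In this invented example three of the four markers change in the expected direction, so the score is 0.75. A real check would need replicate measurements and a validated threshold.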
NOV expression appears to be associated with disease state in prostate cancer, based on human prostate cancer gene expression data (Best et al., 2005). NOV is upregulated in androgen-independent primary human prostate cancer compared with untreated human prostate cancer (GEO series, Best, 2005, accession GSE2443). NOV might therefore be a candidate marker for identifying the cancer state.
Human MSCs have been reported to promote growth of osteosarcomas, a common primary malignant bone tumour (Bian et al., 2010). In addition, interleukin-6 plays an important role as a growth factor. The combination of gene expression and factors from outside the cells may play important roles in reprogramming cells.

Genes for generation of induced pluripotent stem (iPS) cells
Recently, it has been reported that OCT4 is sufficient to induce alterations in the human keratinocyte differentiation pathway (Racila et al., 2011). Transfection of OCT4, using a plasmid, into human skin keratinocytes resulted in expression of endogenous embryonic genes and reduced genomic methylation. These OCT4-transfected cells could become neuronal and mesenchymal cell types. The cells have been shown to have characteristics of cultured smooth muscle or myofibroblast cells from a mesenchymal stem cell lineage. It is probable that partial reprogramming using several genes can induce transitions in cell phenotypes and features; hence, complete reprogramming of somatic cells into iPS cells would not always be required for the application of these cells in clinical therapy.
The characterisation of human iPS cells, with respect to pluripotency and the ability for terminal differentiation, has been performed with 16 iPS cell lines (Boulting et al., 2011). This study revealed that all iPS cell lines examined, whether reprogrammed with OCT4, SOX2 and KLF4 or with OCT4, SOX2, KLF4 and c-MYC, could give rise to functional motor neurons after differentiation, although there was some variation in the expression of early pluripotency markers and the transgenes. iPS cell lines have been shown to express pluripotency markers, such as NANOG, OCT4, SSEA3, SSEA4, TRA-1-60 and TRA-1-81.

Involvement of genome structure in reprogramming to iPS cells
Copy number variation has been reported to be involved in the reprogramming to pluripotency (Hussein et al., 2011). The comparison of copy number variations of different passages of human iPS cells with their fibroblast cell origins and with human embryonic stem (ES) cells revealed high copy number variation levels in early-passage human iPS cells.
The number of copy number variations in human iPS cell lines decreases as the passage number increases. This decrease during culture passages could be due to DNA repair mechanisms or mosaicism followed by selection. The authors proposed that de novo generated copy number variations create mosaicism that is followed by selection of less damaged cells during culturing, because DNA repair is not considered a sufficient explanation for the rapid decrease in copy number variation.

Involvement of epigenetic modification and methylation in iPS cells
iPS cells are known to show reprogramming variability, such as aberrant reprogramming of DNA methylation (Lister et al., 2010). From whole-genome, single-base-resolution DNA methylomic analyses of iPS cells and ES cells, the authors obtained new evidence showing that iPS cells are methylated during reprogramming, and that the methylome of iPS cells generally resembles that of ES cells. A detailed interpretation of the data nevertheless indicated many differences in DNA methylation between ES cells and iPS cells. For example, many regions that were differentially methylated in either an iPS cell line or an ES cell line were found in several iPS cell lines.

Epithelial-mesenchymal transition (EMT) and microRNAs (miRNAs)
Epithelial-mesenchymal transition (EMT) has been shown to be associated with a stem cell phenotype (Mani et al., 2008; Battula et al., 2010; Polyak & Weinberg, 2009). The tumour suppressor p53 has been suggested to regulate EMT and EMT-associated stem cell properties through transcriptional activation of miRNA (Chang et al., 2011). EMT and the reverse process, the mesenchymal-epithelial transition, are believed to be key elements in the regulation of embryogenesis. It has also been suggested that EMT activation is related to cancer progression and metastasis.
Recently, EMT has been shown to play a role in the acquisition of stem cell properties in normal and neoplastic cell populations. miRNAs are small non-coding RNA molecules that suppress gene expression by interacting with the 3'-untranslated regions (3' UTRs) of target mRNAs. miRNAs are known to be related to EMT and cancer. Chang et al. (2011) showed that p53 activates miR-200c, a miRNA that is down-regulated in normal stem cell and neoplastic stem cell populations, and thereby suppresses the EMT phenotype and the stem cell properties represented in CD24−CD44+ cell populations. The expression of mesenchymal stem cell markers, such as N-cadherin and ZEB1, has been shown to be suppressed by p53. The mRNA levels of KLF4 and BMI1, which are known as stemness-associated genes and RNA targets of miR-200c and miR-183, have been shown to be regulated by p53. It has also been reported that the p53R175H mutant up-regulates Twist1 expression and promotes EMT in immortalised prostate cells (Kogan-Sakin et al., 2011). Inactivated or mutated p53 may result in the up-regulation of cell cycle progression genes, such as Twist1, which is a regulator of metastasis and EMT.

Regulated genes in renal cancer
NCBI's Gene Expression Omnibus (GEO) database is a useful tool to profile gene expression and search for markers representing cell features (Edgar et al., 2002; Barrett, 2011). Renal tumour samples have been analysed using microarrays (Yusenko et al., 2009). It was observed that loss of chromosomes 2, 10, 13, 17 and 21 discriminates chromophobe renal cell carcinomas from renal oncocytomas. These authors suggested that detection of chromosomal changes can be used for accurate diagnosis in routine histology.
A collaborative genome-wide study of renal cell carcinoma using SNP detection techniques has revealed that loci on 2p21 and 11q13.3 are genomic regions associated with renal cell carcinoma (Purdue et al., 2011). In this study, EPAS1, encoding hypoxia-inducible-factor-2 alpha at 2p21, and SCARB1, the scavenger receptor class B, member 1 at 12q24.31, were identified as feature genes that carry single nucleotide polymorphisms in renal cell carcinoma.

Genes expressed in leukaemia
A model in which human cancers are believed to be generated hierarchically from self-renewing cancer stem cells has been reported. Human acute myeloid leukaemia (AML) is a disease that fits the model, and AML stem cell-targeting therapy has been developed (Majeti, 2011; Jin et al., 2006). CD25, CD32, CD44, CD47, CD96, CD123 and CLL-1 are expressed on the surface of AML stem cells. Of these, CD44 is suggested to be a cancer stem cell marker. The concept of the cancer stem cell is important in explaining cancer development from the viewpoint of stem cells (Clevers, 2011; Wang & Shen, 2011). The cancer stem cells for leukaemia were identified from a study showing that CD34+CD38− fractions of cells derived from acute myeloid leukaemia had the capacity to initiate engraftment in immunodeficient mice (Lapidot et al., 1994; Bonnet & Dick, 1997). It is known that deletions or mutations of IKZF1 (IKAROS), PAX5, EBF1 and CDKN2A/B are involved in BCR-ABL1 lymphoblastic leukaemia (Mullighan et al., 2008; Mullighan et al., 2009).

Gene
The function of human BCR-ABL1 lymphoblastic leukaemia-initiating cells in human lymphoblastic leukaemia has been studied from the point of view of genome diversity (Notta et al., 2011). Functional and genetic analysis of Philadelphia chromosome-positive acute lymphoblastic leukaemia (Ph+ ALL) revealed that the frequencies of genetic alterations in IKZF1 (84%), CDKN2A/B (50%) and PAX5 (50%) were consistent with those reported in previous studies. Complete deletion of IKZF1 was observed in both aggressive and non-aggressive groups, whereas the frequencies of deletion of the CDKN2A/B and PAX5 genes differed between the groups, which may provide markers for malignancy.
On the other hand, CD44 has been identified as a key regulator of leukaemic stem cells in AML (Jin et al., 2006). It was suggested that elimination of leukaemic stem cells, the cells capable of initiating and maintaining the leukaemic clonal hierarchy, is required for a permanent cure of AML. Consistent with this, treatment with a CD44-specific antibody has been reported to result in the elimination of leukaemic stem cells.
A new mechanism was suggested in which tumour vascularisation occurs through endothelial differentiation of glioblastoma stem-like cells (Ricci-Vitiani et al., 2010). The differentiation of cancer stem-like cells may be involved in cancer malignancy, and it may be possible to predict or diagnose the malignant stage of cancer using stem cell markers for quality control.
In a genome-wide association study (GWAS) of four case series comprising 2,251 patients and 6,097 control subjects of European ancestry, LIM domain only 1 (LMO1) at 11p15.4 was found to be associated with neuroblastoma and malignancy (Wang et al., 2011). An integrative genomics analysis showed that common genetic polymorphisms associated with cancer susceptibility can lie in genomic regions prone to somatic alterations, which in turn influence tumour progression; it suggested that mutation in LMO1 may also be a candidate indicator of a malignant phenotype.

Surface markers for cancer stem cells
Several markers have been reported for identification of cancer stem cells (Clevers, 2011). CD19 has been reported as a surface marker for B cell malignancies, and CD20 and ATP-binding cassette transporter B5 (ABCB5) for melanoma. The following molecules have been reported for cancer stem cells in the respective cancer types: CD24 for pancreas/lung cancer, CD34 for haematopoietic malignancies, CD44 for breast/liver/head and neck/pancreas cancer, CD90 for liver cancer, CD133 for brain/colorectal/lung/liver cancer and epithelial cell adhesion molecule (EpCAM)/epithelial-specific antigen (ESA) for colorectal/pancreatic cancer (Ebben et al., 2010).
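The marker list above can be held in a small lookup table and inverted to answer the practical question of which markers have been reported for a given cancer type. The sketch below is only a convenience structure built from the associations listed in the text; the dictionary layout and function name are assumptions, and "pancreatic" is normalised to "pancreas" so that the two spellings share one key.

```python
from collections import defaultdict

# Marker -> cancer types, transcribed from the list in the text.
MARKER_TO_CANCERS = {
    "CD19": ["B cell malignancies"],
    "CD20": ["melanoma"],
    "ABCB5": ["melanoma"],
    "CD24": ["pancreas", "lung"],
    "CD34": ["haematopoietic malignancies"],
    "CD44": ["breast", "liver", "head and neck", "pancreas"],
    "CD90": ["liver"],
    "CD133": ["brain", "colorectal", "lung", "liver"],
    "EpCAM/ESA": ["colorectal", "pancreas"],  # "pancreatic" in the text
}

def invert(marker_to_cancers):
    """Build the reverse index: cancer type -> sorted list of reported markers."""
    cancer_to_markers = defaultdict(list)
    for marker, cancers in marker_to_cancers.items():
        for cancer in cancers:
            cancer_to_markers[cancer].append(marker)
    return {cancer: sorted(markers) for cancer, markers in cancer_to_markers.items()}

CANCER_TO_MARKERS = invert(MARKER_TO_CANCERS)

for cancer in ("liver", "pancreas", "melanoma"):
    print(cancer, "->", ", ".join(CANCER_TO_MARKERS[cancer]))
```

Because several markers, such as CD44 and CD133, appear under many cancer types, a single marker is rarely specific. The reverse index makes that overlap easy to see.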

Cancer stem cell hypothesis
Cancer stem cells have the capacity for self-renewal, a feature they share with normal stem cells. Cancer stem cells are also capable of generating malignant tumours, and this property may differentiate them from normal stem cells. The origin of cancer stem cells has not been fully revealed; however, there is a model in which cancer stem cells arise from normal stem cells or normal cells through the accumulation of gene mutations. The process of cancer stem cell derivation is considered to involve the niche, which is the microenvironment around normal stem cells.
There are two models to explain tumourigenesis. The first is the stochastic model, in which all cells have the capacity for tumourigenesis but the probability of entering the tumourigenic cell cycle is relatively low. The second is the hierarchy model, in which only a small population of cells in a cancer has the capacity for tumourigenesis and generates tumours with high probability; this model leads to the cancer stem cell hypothesis.
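The difference between the two models can be made concrete with a toy calculation of the chance that an injected population initiates a tumour. Everything in the sketch is a hypothetical assumption made for illustration: the per-cell probabilities, the stem cell fraction, a marker that sorts cancer stem cells perfectly, and the function names. The parameters are chosen so that the two models agree for unsorted cells and differ only after sorting.

```python
from math import isclose

def p_tumour(n_cells, p_per_cell):
    """Probability that at least one of n independent cells initiates a tumour."""
    return 1.0 - (1.0 - p_per_cell) ** n_cells

# Hypothetical parameters, chosen only to make the contrast visible.
P_STOCHASTIC = 0.001   # stochastic model: every cell, low probability
CSC_FRACTION = 0.01    # hierarchy model: fraction of cancer stem cells
P_CSC = 0.1            # hierarchy model: per-cell probability for a stem cell

def stochastic_model(n_cells, population):
    """Marker status is irrelevant: every cell has the same low probability."""
    return p_tumour(n_cells, P_STOCHASTIC)

def hierarchy_model(n_cells, population):
    """Only marker-positive cells (assumed to be exactly the stem cells) count."""
    per_cell = {
        "unsorted": CSC_FRACTION * P_CSC,
        "marker_positive": P_CSC,
        "marker_negative": 0.0,
    }[population]
    return p_tumour(n_cells, per_cell)

for population in ("unsorted", "marker_positive", "marker_negative"):
    s = stochastic_model(100, population)
    h = hierarchy_model(100, population)
    print(f"{population:16s} stochastic={s:.3f} hierarchy={h:.3f}")
```

Under these assumptions an unsorted population behaves identically in both models, so only transplantation of marker-sorted fractions can distinguish them.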
It is also notable that cancer stem cells are not necessarily related to the cell of origin in a cancer (Visvader, 2011). Although the cell of origin for a particular tumour may have the capacity to differentiate into a mature cell, cancer stem cells have the ability to maintain tumourigenesis according to the cell-of-origin model.

Conclusion
Recent developments in molecular biology and bioinformatics have revealed features of stem cells and their candidate marker genes. Gene expression profiles change widely and dramatically with cell development, culture conditions and disease status. Each cell type has a different gene expression profile after differentiation, and the expression pattern is known to change with disease status. Even though stemness seems to have a distinct gene expression signature, cell populations show varied gene expression patterns in each cell lineage and even in each subset of cells. Until recently, targeting cancer stem cells in cancer therapy was rare because the proportion of these cells in a cancer was considered very low and retaining the features of cancer stem cells in vitro was difficult. Stem cell-targeted therapy, including cancer treatment, is expected to progress further in the near future, and the role of markers will become much greater. Precise knowledge of cell features and gene expression patterns is important for quality control in cell-targeted therapy.