Open access peer-reviewed chapter

Analysis of 10086 Microarray Gene Expression Data Uncovers Genes that Subclassify Breast Cancer Intrinsic Subtypes

By I-Hsuan Lin and Ming-Ta Hsu

Submitted: April 20th 2016; Reviewed: October 5th 2016; Published: April 5th 2017

DOI: 10.5772/66161


Abstract

Breast cancer is a complex disease comprising molecularly distinct subtypes. Prognosis and treatment differ between subtypes; thus, it is important to distinguish one subtype from another. In this chapter, we use high-throughput microarray datasets to perform breast cancer subtyping of 10086 samples. Aside from the four major subtypes, that is, basal-like, HER2-enriched, luminal A, and luminal B, we defined a normal-like subtype whose gene expression profile is similar to that found in normal and adjacent normal breast samples. We also distinguished a group of luminal B-like samples with better prognosis from the high-risk luminal B breast cancers. We additionally identified 33 surface-protein encoding genes whose expression profiles were associated with survival outcomes. We believe these genes are potential therapeutic targets and diagnostic biomarkers for breast cancer.

Keywords

  • breast cancer
  • intrinsic subtypes
  • gene expression
  • microarray
  • survival analysis

1. Introduction

In many countries, breast cancer remains the most common cancer among women and one of the leading causes of cancer death in women. Many studies have been directed toward understanding the causes and mechanisms of breast cancer and toward improving its diagnosis and treatment. To aid identification and treatment, breast cancer is divided into four major molecular subtypes [luminal A (LumA), luminal B (LumB), HER2-enriched (HER2E), and basal-like (BasalL)] according to hormone receptor status assessed by immunohistochemistry (IHC) [1, 2].

The luminal types are estrogen receptor-positive cancers, and their gene expression patterns resemble those of the luminal epithelial cells that line the breast ducts and glands. They can be treated with endocrine therapy and chemotherapy. Luminal A is a low-grade cancer with the best prognosis, showing high survival rates and low recurrence rates compared to other subtypes [3]. Patients with luminal B cancer tend to have poorer prognosis and lower survival rates than those with luminal A cancer. In HER2-enriched cancer, the HER2 gene is often overexpressed due to gene amplification. This type of breast cancer is high-grade and fast-growing. Before the discovery of anti-HER2 drugs such as trastuzumab and lapatinib [4, 5], treatment for patients with this subtype was limited to chemotherapeutic approaches. The other major subtype is basal-like breast cancer, whose gene expression pattern resembles that of cells in the basal layer of the breast ductal epithelium. Many cases of basal-like breast cancer are also triple-negative breast cancers, which lack estrogen and progesterone receptors and do not have elevated expression of HER2. Basal-like breast cancer is also high-grade and fast-growing. Patients diagnosed with this subtype have poorer prognosis and are treated with a combination of surgery, radiotherapy, and anthracycline/taxane-based chemotherapy [6].

After the launch of microarrays in the early 2000s as an affordable solution for high-throughput quantification of genome-wide gene expression, many research projects began to use this technology to study breast cancer [7–9]. Findings from microarray studies provide useful biological, prognostic, and predictive information for basic science and clinical practice. One application of microarray analysis is the reclassification of breast cancer samples according to the expression patterns of multiple genes [10].

In this chapter, we present our method of analyzing large public breast cancer microarray datasets and discuss our findings concerning breast cancer subtyping using gene expression signatures. By systematically gathering microarray datasets, we collected gene expression results for 10086 normal breast and breast cancer samples from public repositories. We took advantage of the large sample size to explore the similarities and differences among and within breast cancer subtypes. Through clustering of this large dataset, we aimed to update the subtype labels of these samples and redefine the intrinsic subtypes of breast cancer, as well as to identify genes whose expression profiles are not subtype-specific but can subclassify samples within a given subtype and carry prognostic value. By analyzing functional subgroups of human genes through consensus clustering, we identified specific genes that can subdivide breast cancer subtypes and provide useful prognostic information as well as possible genetic clues to breast carcinogenesis.

2. Processing of gene expression microarray datasets

We searched the two largest public repositories, NCBI GEO (https://www.ncbi.nlm.nih.gov/geo) and EBI ArrayExpress (http://www.ebi.ac.uk/arrayexpress), for gene expression microarray datasets relating to normal breast tissues and breast cancers. Different microarray platforms introduce variation in the final interpretation of gene expression levels because of differences in probe design and detection methods. We therefore selected experiments conducted using the Human Genome U133A (HG-U133A) and Human Genome U133 Plus 2.0 (HG-U133 Plus 2.0) arrays, as these were the most widely used platforms we found in the databases. Overall, we identified 41 HG-U133A and 62 HG-U133 Plus 2.0 datasets relating to our topic of interest. Redundant and irrelevant arrays were identified and removed. In total, 4952 HG-U133A and 5134 HG-U133 Plus 2.0 arrays, representing 165 normal breast, 193 adjacent disease-free, 5 proliferative breast lesion, and 9723 breast cancer samples, were selected for downstream analysis. The clinicopathological data associated with the samples were also retrieved when available. Supplementary Table 1 lists the accession numbers of the datasets we collected and used in this study.

Accession No.HG-U133AHG-U133 Plus 2.0
E-MEXP-882024
E-MEXP-368808
E-MTAB-3650536
E-MTAB-566036
E-MTAB-748046
E-MTAB-1006096
E-MTAB-15470208
E-MTAB-2501032
E-TABM-43350
E-TABM-6606
E-TABM-276060
E-TABM-854073
GSE14561590
GSE1561460
GSE20342860
GSE21090346
GSE2603990
GSE34942510
GSE3744047
GSE46112160
GSE49222870
GSE5327580
GSE5462540
GSE5764018
GSE5847920
GSE653232787
GSE6596260
GSE688370
GSE7307010
GSE73901960
GSE7904062
GSE8977022
GSE9195077
GSE957430
GSE107800177
GSE111211980
GSE120931340
GSE122760204
GSE12763030
GSE16391055
GSE164460112
GSE16873110
GSE177052930
GSE17907053
GSE1886402
GSE196150115
GSE2008605
GSE201942650
GSE202711740
GSE20437250
GSE206850326
GSE20711088
GSE21422019
GSE216530254
GSE21947100
GSE22035043
GSE220931020
GSE22513016
GSE22544018
GSE231770116
GSE237200191
GSE23988590
GSE241851000
GSE25011110
GSE250665060
GSE26910011
GSE269712770
GSE28796014
GSE28821010
GSE29431038
GSE31448029
GSE31519670
GSE32072280
GSE367710107
GSE36772960
GSE36773480
GSE37946490
GSE38506013
GSE425680112
GSE43358057
GSE433650111
GSE43502010
GSE452551340
GSE46184740
GSE46222046
GSE46928500
GSE47389047
GSE48390080
GSE50567040
GSE5094805
GSE540020418
GSE55594010
GSE588120107
GSE61304061
GSE6362606
GSE651940162
GSE68892990
GSE70233022

Supplementary Table 1.

Gene expression microarray datasets used.

Because the HG-U133A and HG-U133 Plus 2.0 arrays differ in design and number of probes, the raw data files (.CEL) of the two platforms were imported into the R environment separately. The raw data were normalized with the Robust Multiarray Averaging (RMA) method using the justRMA function from the affy Bioconductor package [11]. The default hgu133a and hgu133plus2 annotations were used to obtain probe set-level expression intensities, and the intensity of a probe set was used to represent the corresponding gene-level expression value. For any gene detected by more than one probe set, the probe set with the highest Jetset score was selected to represent its gene-level expression [12]. The inSilicoMerging package was then used to combine the expression intensities from the two microarray platforms and remove batch effects, yielding log2-normalized intensities [13].
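The probe set selection rule described above (keep, for each gene, the probe set with the highest Jetset score) can be illustrated with a small sketch. The chapter's analysis was done in R; the Python below is only an illustration, and the probe set IDs, gene names, and scores are invented placeholders rather than real Jetset output.

```python
# Hypothetical records: (probe_set_id, gene_symbol, jetset_score).
# All IDs, gene names, and scores are invented for illustration only.
probe_sets = [
    ("ps_001", "GENE_A", 0.41),
    ("ps_002", "GENE_A", 0.78),
    ("ps_003", "GENE_B", 0.55),
    ("ps_004", "GENE_C", 0.12),
    ("ps_005", "GENE_C", 0.09),
]


def pick_best_probe_sets(records):
    """Return {gene: (probe_set_id, score)}, keeping the highest score per gene."""
    best = {}
    for probe_id, gene, score in records:
        if gene not in best or score > best[gene][1]:
            best[gene] = (probe_id, score)
    return best


best = pick_best_probe_sets(probe_sets)
for gene in sorted(best):
    print(gene, best[gene][0], best[gene][1])
```

Each gene ends up represented by exactly one probe set, so the expression matrix has one row per gene before the two platforms are merged.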

3. Identification of differentially expressed genes among subsets of samples

Some of the samples were provided with relevant clinicopathological data. We used this information to perform differential expression analysis with the limma Bioconductor package in R [14]. Specifically, we used disease status (normal vs. cancer), receptor status assessed by IHC, and subtype classification to subset the samples before performing differential expression analysis. The aim was to identify a list of candidate genes from these comparisons for use in breast cancer subtyping. Seven categories of differentially expressed gene sets were defined:

  1. Normal versus cancer: ABCA8, ADH1B, ASPM, AURKA, BUB1B, CCNB1, CCNB2, CDC20, CDK1, CENPA, CEP55, CKS2, COL10A1, CXCL10, CXCL11, CXCL2, CXCL9, DLGAP5, DTL, FABP4, FOSB, GABRP, ID4, KRT14, KRT15, KRT5, MELK, MMP1, NEK2, NUSAP1, OXTR, PBK, PRC1, PTN, RRM2, S100P, SFRP1, SPP1, SYNM, TGFBR3, TOP2A, TPX2, UBE2C, and WIF1.

  2. Basal-like: AGR2, CA12, DHRS2, ELF5, EN1, ESR1, FABP7, FOXA1, GABRP, GATA3, KRT6B, MLPH, NAT1, PIP, PROM1, ROPN1B, SCGB1D2, SCGB2A2, SCNN1A, TFF1, TFF3, TOX3, and VGLL1.

  3. HER2-enriched: CALML5, CEACAM6, CLCA2, CRISP3, ERBB2, ESR1, FGG, GRB7, KMO, KYNU, NPY1R, PGAP3, PNMT, S100A8, S100A9, S100P, SCUBE2, STARD3, and TFAP2B.

  4. Luminal A: ABAT, AGR2, AGTR1, BMPR1B, CA12, CPB1, DACH1, ERBB4, ESR1, FABP7, GATA3, GFRA1, GREB1, IGF1R, MMP1, NAT1, NPY1R, PGR, PROM1, RARRES1, S100A8, SCUBE2, SERPINA3, STC2, TBC1D9, TFF1, and TFF3.

  5. Luminal B: AGR2, ARMT1, CA12, DHRS2, ESR1, FABP7, GABRP, GATA3, KRT6B, NAT1, PROM1, SFRP1, SLPI, TFF1, and TFF3.

  6. Luminal C: COL10A1, CXCL9, ESR1, FABP7, GABRP, GATA3, IFI44L, SCGB2A2, and TFF1.

  7. Apocrine: CALML5, CLCA2, CPB1, CRISP3, ERBB4, ESR1, IGF1R, KYNU, MMP1, NPY1R, S100A8, S100A9, SERPINA3, and TFF1.

Some genes were identified in more than one category; for example, estrogen receptor 1 (ESR1) was found in six of the seven categories. After removing redundant entries, the remaining 100 unique genes were used to perform sample subtyping with consensus clustering.
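The deduplication step can be checked directly from the seven lists given above. The following Python sketch (an illustration only; the chapter's own workflow used R) takes the union of the seven gene sets and counts how many categories each gene appears in. The dictionary keys are simply the category names from the list.

```python
# Gene symbols copied from the seven categories listed in this section.
gene_sets = {
    "Normal versus cancer": (
        "ABCA8 ADH1B ASPM AURKA BUB1B CCNB1 CCNB2 CDC20 CDK1 CENPA CEP55 CKS2 "
        "COL10A1 CXCL10 CXCL11 CXCL2 CXCL9 DLGAP5 DTL FABP4 FOSB GABRP ID4 KRT14 "
        "KRT15 KRT5 MELK MMP1 NEK2 NUSAP1 OXTR PBK PRC1 PTN RRM2 S100P SFRP1 SPP1 "
        "SYNM TGFBR3 TOP2A TPX2 UBE2C WIF1"
    ).split(),
    "Basal-like": (
        "AGR2 CA12 DHRS2 ELF5 EN1 ESR1 FABP7 FOXA1 GABRP GATA3 KRT6B MLPH NAT1 "
        "PIP PROM1 ROPN1B SCGB1D2 SCGB2A2 SCNN1A TFF1 TFF3 TOX3 VGLL1"
    ).split(),
    "HER2-enriched": (
        "CALML5 CEACAM6 CLCA2 CRISP3 ERBB2 ESR1 FGG GRB7 KMO KYNU NPY1R PGAP3 "
        "PNMT S100A8 S100A9 S100P SCUBE2 STARD3 TFAP2B"
    ).split(),
    "Luminal A": (
        "ABAT AGR2 AGTR1 BMPR1B CA12 CPB1 DACH1 ERBB4 ESR1 FABP7 GATA3 GFRA1 "
        "GREB1 IGF1R MMP1 NAT1 NPY1R PGR PROM1 RARRES1 S100A8 SCUBE2 SERPINA3 "
        "STC2 TBC1D9 TFF1 TFF3"
    ).split(),
    "Luminal B": (
        "AGR2 ARMT1 CA12 DHRS2 ESR1 FABP7 GABRP GATA3 KRT6B NAT1 PROM1 SFRP1 "
        "SLPI TFF1 TFF3"
    ).split(),
    "Luminal C": (
        "COL10A1 CXCL9 ESR1 FABP7 GABRP GATA3 IFI44L SCGB2A2 TFF1"
    ).split(),
    "Apocrine": (
        "CALML5 CLCA2 CPB1 CRISP3 ERBB4 ESR1 IGF1R KYNU MMP1 NPY1R S100A8 "
        "S100A9 SERPINA3 TFF1"
    ).split(),
}

# Union of all categories gives the non-redundant gene list.
unique_genes = sorted({gene for genes in gene_sets.values() for gene in genes})

# For each gene, record the categories in which it was reported.
membership = {
    gene: [name for name, genes in gene_sets.items() if gene in genes]
    for gene in unique_genes
}

print(len(unique_genes))            # number of unique genes
print(len(membership["ESR1"]))      # number of categories containing ESR1
```

Run on the lists as printed, the union contains 100 genes and ESR1 appears in six categories, matching the counts stated in the text.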

4. Consensus hierarchical clustering using subtype-specific genes

The ConsensusClusterPlus Bioconductor package was used to perform consensus hierarchical clustering on the 10086 samples using the expression intensities of the 100 genes identified in the previous step [15]. The distance metric was one minus the Pearson correlation coefficient. The parameters were maxK = 6, reps = 1000, pItem = 0.8, and pFeature = 1; that is, clustering was repeated 1000 times, each time on a random subsample of 80% of the samples using all of the genes, for up to six clusters. Figure 1 shows the cumulative distribution functions (CDFs) of the consensus matrix for each number of clusters (k = 2 to k = 6) on the left and the relative change in area under the CDF curve on the right. Both plots were used to determine the appropriate number of clusters.
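To make the resampling idea concrete, the sketch below builds a consensus matrix on toy data: it repeatedly subsamples 80% of the samples, clusters each subsample with average-linkage hierarchical clustering on the one-minus-Pearson distance, and records how often each pair of samples lands in the same cluster. This is a simplified Python stand-in written for illustration, not the ConsensusClusterPlus implementation; the function names, the choice of average linkage, the toy profiles, and the reduced number of repetitions are all assumptions.

```python
import math
import random
from itertools import combinations


def pearson_distance(x, y):
    """One minus the Pearson correlation coefficient of two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)  # assumes neither profile is constant


def average_linkage(items, dist, k):
    """Merge the closest pair of clusters (average linkage) until k remain."""
    clusters = [[i] for i in items]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            avg = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    return clusters


def consensus_matrix(profiles, k, reps=200, p_item=0.8, seed=1):
    """Fraction of resampling runs in which each pair of samples co-clusters."""
    n = len(profiles)
    dist = [[0.0 if i == j else pearson_distance(profiles[i], profiles[j])
             for j in range(n)] for i in range(n)]
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    rng = random.Random(seed)
    size = max(k, round(p_item * n))
    for _ in range(reps):
        subset = rng.sample(range(n), size)
        for i, j in combinations(subset, 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
        for cluster in average_linkage(subset, dist, k):
            for i, j in combinations(cluster, 2):
                together[i][j] += 1
                together[j][i] += 1
    # Pairs never drawn together (including the diagonal) are reported as 0.0.
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Toy data: six samples (rows) by four genes (columns); values are invented.
profiles = [
    [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.5], [1.5, 2.2, 3.1, 4.4],
    [4.0, 3.0, 2.0, 1.0], [5.0, 4.2, 3.0, 2.0], [4.4, 3.0, 2.5, 1.2],
]
consensus = consensus_matrix(profiles, k=2)
print(consensus[0][1], consensus[0][3])
```

Consensus values near 1 or 0 indicate stable assignments; the CDF of these values, computed for each k, is what the left panel of Figure 1 summarizes.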

Figure 1.

Analysis of breast cancer gene expression cluster stability. The optimal partitioning of breast cancers was determined with the (left) consensus CDF and (right) delta area plots for cluster numbers between k = 2 and k = 6. The optimal number of clusters is 6, at which the CDF curve approaches a plateau and the relative change in area under the CDF curve is minimal.

Figure 2.

Consensus clustering of 10086 samples using the expression profile of 100 genes. The color of each cell of the matrix represents the expression intensity of a given gene (row) in a sample (column). The red and blue colors reflect high and low expression levels, respectively, as indicated in the color bar. Samples with similar gene expression profiles are grouped together and distributed into six clusters (colored bars).

We assigned the six clusters names that correspond to conventional breast cancer subtypes. To visualize the classification result, we used the ComplexHeatmap Bioconductor package to produce a heatmap representation of the clustering result [16]. The six clusters are represented with different colors in the heatmap shown in Figure 2: HER2-enriched (HER2E; leftmost), basal-like (BasalL), normal-like (NormL), luminal A (LumA), luminal B (LumB), and mixed luminal (LumMix; rightmost). The clinical features of the six clusters are presented in Table 1. The mixed luminal cluster has the largest number of samples, and the normal-like cluster has the fewest. Patients with basal-like cancer were significantly younger (median age at diagnosis 49; t test P-value < 2.2e−16), and mixed luminal patients were significantly older (median age at diagnosis 56; t test P-value = 4.3e−15). These findings are consistent with previous reports [17–19].

Feature | BasalL | HER2E | LumA | LumB | LumMix | NormL
No. of samples | 1727 | 1330 | 1251 | 1533 | 3735 | 510
Age range | 24–84 | 26–90 | 27–88 | 24–93 | 24–91 | 21–86
Median age | 49 | 55 | 54 | 53 | 56 | 51
ER status by IHC
No. of ER+ | 106 | 157 | 710 | 831 | 2271 | 101
No. of ER− | 1085 | 614 | 83 | 114 | 77 | 73
ER+:ER− | 1:10.24 | 1:3.91 | 1:0.12 | 1:0.14 | 1:0.03 | 1:0.72
Missing ER data | 536 (31.0%) | 559 (42.0%) | 458 (36.6%) | 588 (38.4%) | 1387 (37.1%) | 336 (65.9%)
PR status by IHC
No. of PR+ | 44 | 60 | 383 | 315 | 1061 | 56
No. of PR− | 657 | 436 | 104 | 200 | 219 | 48
PR+:PR− | 1:14.93 | 1:7.27 | 1:0.27 | 1:0.63 | 1:0.21 | 1:0.86
Missing PR data | 1026 (59.4%) | 834 (62.7%) | 764 (61.1%) | 1018 (66.4%) | 2455 (65.7%) | 406 (79.6%)
HER2 status by IHC
No. of HER2+ | 49 | 302 | 35 | 174 | 100 | 14
No. of HER2− | 861 | 222 | 285 | 391 | 1050 | 93
HER2+:HER2− | 1:17.57 | 1:0.74 | 1:8.14 | 1:2.25 | 1:10.50 | 1:6.64
Missing HER2 data | 817 (47.3%) | 806 (60.6%) | 931 (74.4%) | 968 (63.1%) | 2585 (69.2%) | 403 (79.0%)

Table 1.

Clinical features of the six clusters.
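The derived rows of Table 1 follow from the counts by simple arithmetic: the ratio is the number of receptor-negative samples per receptor-positive sample, and the missing count is whatever remains of the cluster after both are subtracted. A short Python sketch, using the BasalL and LumMix ER counts from Table 1 (the function name is only illustrative):

```python
def receptor_summary(n_samples, n_pos, n_neg):
    """Reproduce the derived rows of Table 1 for one cluster and one receptor."""
    missing = n_samples - n_pos - n_neg
    return {
        "neg_per_pos": round(n_neg / n_pos, 2),        # the x in "1:x"
        "missing": missing,
        "missing_pct": round(100 * missing / n_samples, 1),
    }


# ER status, BasalL column of Table 1: 1727 samples, 106 ER+, 1085 ER-.
basal_er = receptor_summary(1727, 106, 1085)
# ER status, LumMix column of Table 1: 3735 samples, 2271 ER+, 77 ER-.
lummix_er = receptor_summary(3735, 2271, 77)

print(basal_er)
print(lummix_er)
```

For BasalL this gives 1:10.24 with 536 samples (31.0%) lacking ER data, and for LumMix 1:0.03 with 1387 samples (37.1%) lacking ER data, in agreement with the table.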

We compared the subtype assignment by ConsensusClusterPlus with molecular subtyping by the PAM50, SSP2006, and AIMS models using the genefu Bioconductor package (see Tables 2–4) [20]. The comparisons showed that the four major breast cancer subtypes were present in our analysis. Concordance between methods was higher for the HER2-enriched and basal-like subtypes than for the other subtypes, whereas the classification of luminal and normal-like samples was less consistent. Based on the heatmap and the structure of the dendrogram shown in Figure 2, the transcriptome profiles of HER2-enriched and basal-like breast cancers were more distinctive than those of the other subtypes, which explains why the clustering results for these two subtypes were more consistent across methods. The ConsensusClusterPlus assignment is most similar to that produced by the PAM50 model, whereas the SSP2006 and AIMS models classified as HER2-enriched many samples that our method assigned to the luminal B subtype. The major difference between the ConsensusClusterPlus and PAM50 assignments is that our method identified a large subgroup within the luminal subtypes, which we define as mixed luminal, whose samples were classified as either luminal A or luminal B by the PAM50 model. We think that the larger number of samples, as well as the selection of different candidate genes, helped our study to distinguish and define three luminal subtypes rather than two. The implication of this distinction is considerable: although the mixed luminal breast cancers have a gene expression profile similar to that of the luminal B subtype, as seen in Figure 2, we show in the next section that the two subgroups differ in their survival outcomes.

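Subtype comparison (rows: ConsensusClusterPlus; columns: PAM50)
ConsensusClusterPlus | BasalL | HER2E | LumA | LumB | NormL
HER2E | 141 | 909 | 58 | 127 | 51
BasalL | 1686 | 7 | 0 | 3 | 25
LumMix | 3 | 22 | 1686 | 2004 | 15
LumB | 6 | 175 | 81 | 1269 | 2
LumA | 3 | 1 | 1187 | 12 | 41
NormL | 7 | 0 | 73 | 0 | 134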

Table 2.

Comparison of molecular subtyping by ConsensusClusterPlus and PAM50.
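One way to quantify the agreement summarized in Table 2 is to compute, for each shared label, the fraction of samples on which the two methods agree. The Python sketch below does this in both directions. The counts are this edition's reading of the flattened Table 2 and should be treated as an assumption, and the two agreement measures are illustrative choices rather than statistics reported by the authors.

```python
# Counts as read from Table 2 (an assumption about how the flattened rows split):
# rows are ConsensusClusterPlus clusters, columns are PAM50 calls.
pam50_labels = ["BasalL", "HER2E", "LumA", "LumB", "NormL"]
table2 = {
    "HER2E":  [141, 909, 58, 127, 51],
    "BasalL": [1686, 7, 0, 3, 25],
    "LumMix": [3, 22, 1686, 2004, 15],
    "LumB":   [6, 175, 81, 1269, 2],
    "LumA":   [3, 1, 1187, 12, 41],
    "NormL":  [7, 0, 73, 0, 134],
}


def row_agreement(table, labels):
    """Fraction of each cluster that PAM50 gives the same label (LumMix has none)."""
    return {
        cluster: counts[labels.index(cluster)] / sum(counts)
        for cluster, counts in table.items()
        if cluster in labels
    }


def column_agreement(table, labels):
    """Fraction of each PAM50 call that falls in the cluster of the same name."""
    result = {}
    for col, label in enumerate(labels):
        col_total = sum(counts[col] for counts in table.values())
        result[label] = table[label][col] / col_total
    return result


total = sum(sum(counts) for counts in table2.values())
by_row = row_agreement(table2, pam50_labels)
by_col = column_agreement(table2, pam50_labels)

print("samples compared:", total)
for label in pam50_labels:
    print(f"{label}: row {by_row[label]:.1%}, column {by_col[label]:.1%}")
```

The column-wise view shows why the mixed luminal cluster matters: many PAM50 luminal A and luminal B calls fall outside the clusters of the same name because they are absorbed by LumMix.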

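Subtype comparison (rows: ConsensusClusterPlus; columns: SSP2006)
ConsensusClusterPlus | BasalL | HER2E | LumA | LumB | NormL
HER2E | 263 | 833 | 53 | 6 | 131
BasalL | 1695 | 2 | 0 | 0 | 24
LumMix | 10 | 109 | 2882 | 625 | 104
LumB | 23 | 541 | 441 | 505 | 23
LumA | 5 | 1 | 1036 | 0 | 202
NormL | 0 | 0 | 40 | 0 | 174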

Table 3.

Comparison of molecular subtyping by ConsensusClusterPlus and SSP2006.

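Subtype comparison (rows: ConsensusClusterPlus; columns: AIMS)
ConsensusClusterPlus | BasalL | HER2E | LumA | LumB | NormL
HER2E | 384 | 789 | 4 | 1 | 108
BasalL | 1699 | 3 | 0 | 0 | 19
LumMix | 9 | 400 | 1511 | 1489 | 321
LumB | 27 | 936 | 30 | 526 | 14
LumA | 5 | 10 | 275 | 5 | 949
NormL | 1 | 0 | 0 | 0 | 213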

Table 4.

Comparison of molecular subtyping by ConsensusClusterPlus and AIMS.

5. Survival analysis of breast cancer subtypes

We used the Kaplan-Meier method to estimate survival curves for overall survival (OS), relapse-free survival (RFS), and distant metastasis-free survival (DMFS). Gene expression values were converted to expression status using a modified R script taken from the Kaplan Meier-plotter website (http://kmplot.com/). Survival probabilities were calculated using the survival package [21]. The log-rank test was used to assess the statistical significance of survival differences. The prognostic significance of our classification for breast cancer survival was analyzed using the Cox proportional hazards regression model. The Kaplan-Meier curves were produced using a modified R script taken from http://biostat.mc.vanderbilt.edu/wiki/Main/TatsukiRcode#kmplot.
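For readers unfamiliar with the Kaplan-Meier (product-limit) estimator, the following Python sketch shows the calculation on a handful of invented follow-up times. It is an illustration of the estimator only: the chapter's curves were produced with the R survival package, and the function names and toy data here are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death or relapse) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


def survival_at(curve, t):
    """Read the step function: S(t) is the last estimate at or before time t."""
    estimate = 1.0
    for time, value in curve:
        if time > t:
            break
        estimate = value
    return estimate


# Toy follow-up data in years (invented): 1 = event observed, 0 = censored.
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for time, value in curve:
    print(time, round(value, 4))
print("S(4) =", round(survival_at(curve, 4), 4))
```

A 5-year survival figure such as those quoted for each subtype corresponds to reading this step function at t = 5 years for the patients in that subtype.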

Figure 3 shows the Kaplan-Meier plots of OS, RFS, and DMFS for the six subtypes that we determined using consensus clustering. For all three survival endpoints, luminal A patients had the highest survival rates (5-year OS = 86.8%, 5-year RFS = 83.8%, 5-year DMFS = 87.4%), whereas HER2-enriched patients had worse outcomes (5-year OS = 67.3%, 5-year RFS = 56.8%, 5-year DMFS = 62.2%). Luminal B breast cancers are widely recognized as high risk [22–24], and our analysis agreed. Like the basal-like and HER2-enriched breast cancers, which had poorer prognosis, the luminal B subtype had a greater relative risk of locoregional and distant breast cancer recurrence.

Figure 3.

Kaplan-Meier plots showing the relation between subtypes determined with ConsensusClusterPlus and clinical outcome in breast cancer patients. Overall survival (OS; left), relapse-free survival (RFS; middle), and distant metastasis-free survival (DMFS; right) for samples in the six subtypes based on the consensus clustering with 100 genes.

6. Consensus hierarchical clustering using function-specific genes and survival analysis

Besides classifying samples according to the expression of genes related to breast cancer subtypes, we also aimed to identify subsets of patients that might harbor specific expression profiles affecting their survival outcome. To do this, we used current knowledge about protein functions and the participation of genes in biological pathways to select functions and pathways that might affect, or be affected by, the development and progression of breast cancer. We used databases such as Ingenuity Pathway Analysis (http://www.ingenuity.com/products/ipa), KEGG (http://www.genome.jp/kegg/), and HGNC (http://www.genenames.org/) to gather genes that participate in, or belong to, the following functional groups: cadherins, zinc fingers, C2 domain-containing, ion channels, solute carriers, integrins, chemokine receptors, chemokine ligands, receptor kinases, immunoglobulins, CD molecules, homeoboxes, interferons, interferon receptors, interleukins, interleukin receptors, intermediate filaments, histones, chromatin-modifying enzymes, ATPases, glycosyltransferases, phosphatases, metallopeptidases, apoptosis, autophagy, unfolded protein response, oxidative stress response, and the epithelial-mesenchymal transition pathway. Consensus clustering was performed as before using ConsensusClusterPlus with the same parameters to determine at most six clusters from each gene set or collection of gene sets. These clusters were then analyzed for their associations with survival.

Using a P-value cutoff of 0.01, we identified two collections of genes that were significantly associated with survival: the CD molecules, and the cytokines and cytokine receptors. Figure 4 shows the Kaplan-Meier plots of OS, RFS, and DMFS for each of the six CD molecule clusters. For both RFS and DMFS, Cluster 2 (lime green) had the best survival outcome; it is made up of mixed luminal, luminal A, HER2-enriched, and normal-like breast cancers, as shown in Table 5. Cluster 3 (dark green), which consists mainly of HER2-enriched and luminal B cancers, and Cluster 4 (magenta), which consists of basal-like cancers, had worse outcomes. We looked for the CD molecules that showed the greatest expression differences between Cluster 2 (best survival) and Clusters 3 and 4 (worse survival) by computing Cohen's d effect size statistic [25]. Of the 317 CD molecules analyzed, the 20 genes that had a large effect size (d > 1) are: ACKR1, BCAM, CD248, CD34, CD36, EPCAM, FUT3, HMMR, IGF1R, IL6ST, JAM2, LAMP3, LEPR, LRP1, PDGFRA, PDGFRB, SLC7A5, TEK, TFRC, and TSPAN7. Figure 5 shows their respective expression distributions in Clusters 2, 3, and 4.
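Cohen's d expresses the difference between two group means in units of their pooled standard deviation. The Python sketch below shows one common form of the statistic on invented log2 expression values; the text does not state which variant was used or how the sign was handled, so the pooled-variance formula and the use of the absolute value for the d > 1 cutoff are assumptions.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (sample variances)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Invented log2 expression values of one gene in two clusters.
best_outcome_cluster = [8.1, 7.9, 8.4, 8.0, 8.6]
worse_outcome_cluster = [6.9, 7.2, 7.0, 7.5, 6.8]

d = cohens_d(best_outcome_cluster, worse_outcome_cluster)
print(round(d, 2))
print("large effect" if abs(d) > 1 else "not large")
```

A positive d here means higher expression in the first group; swapping the two groups flips the sign but not the magnitude.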

Figure 4.

Kaplan-Meier estimates of breast cancer survival of clusters determined using CD molecules. Overall survival (OS; left), relapse-free survival (RFS; middle), and distant metastasis-free survival (DMFS; right) for samples in the six clusters based on the consensus clustering with 317 genes encoding CD molecules.

ComparisonSubtypes
BasalL | HER2E | LumA | LumB | LumMix | NormL
Clustering using expression profiles of CD molecules1 496924816660
21617391348469482
33470426221641
411324931465
55393255752444716
627018347810836

Table 5.

Comparison of sample assignment between subtype-specific genes and CD molecules.

Figure 5.

Box plots of the distribution of gene expression values of 20 CD molecules with large effect size between samples with best and worse outcomes. Cluster 2 (best outcome) and Clusters 3 and 4 (worse outcomes) were chosen to demonstrate the difference in gene expression levels between samples from these three clusters. The box plots of Clusters 2, 3, and 4 are colored in light green, dark green, and magenta, respectively.

The second collection of genes consists of 113 cytokines and cytokine receptors. In Figure 6, the Kaplan-Meier plots show that Cluster 6 (orange) had the worst survival outcome. It consists of basal-like, HER2-enriched, and some luminal cancers (see Table 6). We again used Cohen's d to assess, for each gene in this collection, how strongly expression differs between Cluster 6 and the two clusters with better survival (Clusters 2 and 4). We identified 15 genes that had a large effect size (d > 1): ACKR1, CCL19, CCL20, CCL7, CX3CR1, CXCL1, CXCL12, CXCL14, CXCL8, IL12RB2, IL13RA1, IL1R1, IL1R2, IL6ST, and PITPNM3. Their respective expression distributions in Clusters 2, 4, and 6 are shown in Figure 7.

Figure 6.

Kaplan-Meier estimates of breast cancer survival of clusters determined using chemokine ligands, chemokine receptors, interferons, interferon receptors, interleukins, and interleukin receptors. Overall survival (OS; left), relapse-free survival (RFS; middle), and distant metastasis-free survival (DMFS; right) for samples in the six clusters based on the consensus clustering with 113 genes encoding cytokines and cytokine receptors.

ComparisonSubtypes
BasalL | HER2E | LumA | LumB | LumMix | NormL
Clustering using expression profiles of
cytokines and cytokine receptors
1 681027734115852
23315383062764444
335047716675880921
44301744740910
57502680219470
6522300410612133

Table 6.

Comparison of sample assignment between subtype-specific genes and cytokines and cytokine receptors.

Figure 7.

Box plots of the distribution of gene expression values of 15 cytokines and cytokine receptors with large effect size between samples with better and worst outcomes. Cluster 6 (worst outcome) and Clusters 2 and 4 (better outcomes) were chosen to demonstrate the difference in gene expression levels between samples from these three clusters. The box plots of Clusters 2, 4, and 6 are colored in light green, magenta, and orange, respectively.

7. Conclusion and perspectives

Breast cancer is a complex disease comprising different subtypes that may be characterized by changes in the expression patterns and/or mutations of a few candidate genes. The ability to distinguish breast cancer subtypes using these underlying differences has significant clinical implications, as subtype is one of the variables that affect prognosis and treatment of the disease. Judging by the amount of literature and the number of gene expression datasets available in the public domain, many studies have aimed to classify breast cancer. However, there is a lack of recent meta-analyses that utilize this collection of data generated by various research groups and institutes over the past 15 years. In this chapter, we presented our effort to employ these high-throughput microarray datasets to perform breast cancer subtyping of 10086 samples.

The breast cancer subtypes that we characterized using consensus clustering of 100 genes and 10086 samples confirmed the existence of the four major intrinsic subtypes, that is, basal-like, HER2-enriched, luminal A, and luminal B. We also defined a normal-like subtype that consists of cancer samples with a gene expression profile similar to that found in normal and adjacent normal breast samples. In addition, we distinguished a group of luminal B-like samples with better prognosis (which we term mixed luminal) from the high-risk luminal B breast cancers.

In addition, the clusters obtained by consensus clustering of the expression signatures of CD molecules and of cytokines and cytokine receptors were associated with survival outcomes. We identified 33 genes that showed large differences in expression between the classes with the best and worse survival rates. ACKR1 (atypical chemokine receptor 1, CD234 antigen) and IL6ST (interleukin 6 signal transducer, CD130 antigen) were found in both gene sets. Kaplan-Meier analysis showed that patients with higher expression of either gene had longer survival times. Other genes whose higher expression was associated with better outcomes include CX3CR1 (C-X3-C motif chemokine receptor 1), CXCL12 (C-X-C motif chemokine ligand 12), CXCL14 (C-X-C motif chemokine ligand 14), IGF1R (insulin-like growth factor 1 receptor), IL13RA1 (interleukin 13 receptor subunit alpha 1), JAM2 (junctional adhesion molecule 2), and LEPR (leptin receptor). On the other end of the spectrum are CCL7 (C-C motif chemokine ligand 7), CXCL1 (C-X-C motif chemokine ligand 1), CXCL8 (C-X-C motif chemokine ligand 8), FUT3 (fucosyltransferase 3 (Lewis blood group)), HMMR (hyaluronan mediated motility receptor), and SLC7A5 (solute carrier family 7 member 5), which were overexpressed in patients with lower survival rates. We believe these genes are potential therapeutic targets and diagnostic biomarkers for breast cancer.

Acknowledgments

The authors acknowledge financial support from the Ministry of Science and Technology, Taiwan (MOST 103-2811-B-010-020 and MOST 104-2811-B-010-007). We also thank the National Center for High-performance Computing of National Applied Research Laboratories of Taiwan and the National Research Program for Biopharmaceuticals (MOST 104-2325-B-492-001) for providing the computational biology platform. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

How to cite and reference


I-Hsuan Lin and Ming-Ta Hsu (April 5th 2017). Analysis of 10086 Microarray Gene Expression Data Uncovers Genes that Subclassify Breast Cancer Intrinsic Subtypes, Breast Cancer - From Biology to Medicine, Phuc Van Pham, IntechOpen, DOI: 10.5772/66161. Available from:
