Major electronic nucleotide and protein databases
Developments in both computer hardware and software have made it possible to store, distribute, and analyse data obtained from biological experimentation, which is the very definition of bioinformatics. From this standpoint, bioinformatics can be narrowly defined as a field at the crossroads of biology and computer engineering, responsible for the storage, distribution, and analysis of biological information. More broadly, the term ‘bioinformatics’ refers to the creation and advancement of algorithms, computational and statistical techniques, and theory to solve formal and practical problems posed by or inspired by the management and analysis of biological data. [2, 3]
Since its emergence as an independent discipline in the 1980s, bioinformatics has developed rapidly, keeping pace with the expansion of genome sequence data. Twenty years ago, publishing computationally derived results was a challenge and experimental observation was considered the only way of making progress. After the famous Clinton-Blair handshake marking the completion of the human genome in 2000, however, headlines such as ‘the laboratory rat is giving way to the computer mouse’ appeared. The importance of bioinformatics methods has increased further with the technological improvement of large-scale gene expression analysis using DNA microarrays and proteomics experiments. Wet-lab experiments and bioinformatics analyses go hand in hand in today’s biological and clinical research. Indeed, it is almost inconceivable for a high-impact research publication in biology to contain no element of computing.
Today, the genome, transcriptome and proteome are investigated with large-scale, high-throughput techniques to suggest treatments and predict outcomes. With the availability of high-throughput sequencing in hypothesis-driven science, various sequence-based techniques have emerged, namely expressed sequence tags (ESTs), serial analysis of gene expression (SAGE), massively parallel signature sequencing (MPSS), and the ‘HapMap’ project, which uses individual single nucleotide polymorphisms (SNPs) to link specific genotypes to diseases. Aside from sequencing, microarray technology is another high-throughput technique, and possibly the most promising one. Protein analysis techniques include tissue arrays and proteomics.
Microarrays are microscope slides or chips carrying immobilized probes, usually cDNA (complementary DNA), BAC (bacterial artificial chromosome), or oligonucleotide probes. An array holds a very large number of spots, each containing a huge number of identical DNA molecules. Two important applications of microarray technology are gene expression monitoring and single nucleotide polymorphism (SNP) detection. The technique is widely applicable because only a small amount of RNA is needed to analyse thousands of genes. Despite its increasing use around the world, microarray analysis has limitations when used as the sole method for exploring tumour biology. An obvious weakness is that a microarray represents a single snapshot of the patient. Moreover, many kinds of lesion can disturb gene function, such as large and small deletions, single base substitutions, mutations that affect promoter regions or splice sites, and epigenetic silencing. These factors may influence the result, but they may also go undetected, depending on the exact type of lesion and on its location relative to the region that hybridizes with the probe. Furthermore, differentially expressed genes do not necessarily translate into different protein levels with functional implications, so the expression of a gene does not always correlate with the amount of translated protein. Finally, compared with RT-PCR (reverse transcription polymerase chain reaction), microarray measurements are less sensitive and less accurate, and they cannot resolve small differences in gene expression. Despite its comparative simplicity, microarray technology therefore requires a sound understanding of its limitations, together with careful attention to experimental design and data analysis, if it is to yield meaningful results.
Bioinformatics applications are used to analyse entire gene expression profiles, to approach disease at the genome level, and to pose new hypotheses about mechanisms including, but not limited to, the signalling pathways that govern the formation, maintenance and expansion of tumours. Bioinformatics analyses can also be applied to miRNA, DNA copy-number, SNP, sequence, and methylation data. In the medical sciences, they help to determine which genomic changes give rise to each known inherited disease, i.e., to identify the disease-causing gene, and to find genetic therapies that can reverse the disease phenotype. Various genome browsers and databases have been developed to analyse and process this huge quantity of data (Table 1 and Table 2).
Considering that the discovery of complete protein classes is still in progress, e.g., the kinases of the human genome, the classification of proteins with related structures and functions will retain its significance in the molecular dissection of human health and disease. In the future, bioinformatics is expected to continue its fascinating interplay with genomics in cancer research, that is, in cancer bioinformatics and oncogenomics.
2. Bioinformatics in various cancers
Cancer is one of the leading causes of death worldwide. Now that scientists have sequenced the human genome, it is time to use these genomic data, and the high-throughput technology developed to generate them, to tackle major health problems such as cancer. The molecular mechanisms of cancer are examined more successfully when the interactions and networks of genes and proteins are taken into account. Bioinformatics tools are vital for acquiring a more holistic view of cancer and for analysing the intricate data, and they speed up the research process, including biomarker discovery. Moreover, clinical cancer bioinformatics is critical for achieving systems clinical medicine: it combines clinical measurements and signs with bioinformatics derived from human cancer tissue to improve the understanding of clinical symptoms and signs, disease development and progression, and therapeutic strategy. [26, 27, 28]
Lung cancer is the leading cause of cancer death, yet reliable molecular markers for it are still lacking. Kim et al. used multiple clinical samples and combined bioinformatics analysis of public gene expression data with clinical validation to identify biomarker genes for non-small-cell lung cancer, which shows poor prognosis and frequent recurrence. They meta-analysed SAGE and EST data and chose 20 genes for experimental validation by semiquantitative RT-PCR. They then applied quantitative RT-PCR to seven genes (CBLC, CYP24A1, ALDH3A1, AKR1B10, S100P, PLUNC, and LOC147166) identified as potential diagnostic markers, which yielded two highly probable novel biomarkers (CBLC and CYP24A1).
Liver cancer is the most common cause of cancer-related death after lung cancer. Sawey et al. performed a forward genetic screen using a mouse hepatoblast model and RNAi, guided by human hepatocellular carcinoma amplification data. They found that the amplification conferred selective sensitivity to FGF19 inhibition. Hence, FGF19 is as important a driver gene of the 11q13.3 amplicon as CCND1 in liver cancer, which means that 11q13.3 amplification could be an effective biomarker for identifying patients likely to respond to anti-FGF19 therapy.
In a recent study, an individualized bioinformatics analysis strategy was applied to previously established transcriptome data for clear cell renal cell carcinoma (ccRCC) to identify and reposition eight FDA-approved drugs, selected for a negative correlation and a P-value <0.05, for anticancer therapy. The authors demonstrated that pentamidine is effective against RCC cells in culture and slows tumour growth in an RCC xenograft mouse model, so it might be a new therapeutic agent to combine with current standard-of-care regimens for patients with metastatic RCC.
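The negative-correlation criterion can be illustrated with a minimal Python sketch. It assumes that a repositioning candidate is a drug whose induced expression signature is anti-correlated with the disease signature. The gene names, the values, the Pearson coefficient and the one-sided permutation test are illustrative assumptions, not the procedure of the cited study.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """One-sided permutation P-value for 'correlation this negative or more'."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical log2 fold changes (tumour vs normal) for eight placeholder genes.
disease_signature = {
    "GENE_A": 2.1, "GENE_B": 1.8, "GENE_C": 1.2, "GENE_D": 0.4,
    "GENE_E": -0.3, "GENE_F": -1.1, "GENE_G": -1.7, "GENE_H": -2.4,
}
# Hypothetical log2 fold changes (drug-treated vs untreated cells).
drug_signature = {
    "GENE_A": -1.9, "GENE_B": -1.5, "GENE_C": -1.4, "GENE_D": -0.2,
    "GENE_E": 0.1, "GENE_F": 0.9, "GENE_G": 1.6, "GENE_H": 2.2,
}

genes = sorted(set(disease_signature) & set(drug_signature))
x = [disease_signature[g] for g in genes]
y = [drug_signature[g] for g in genes]

r = pearson(x, y)
p = permutation_p_value(x, y)
# A drug is kept as a candidate only if it reverses the disease signature.
is_candidate = r < 0 and p < 0.05
print(f"r = {r:.2f}, candidate = {is_candidate}")
```

In practice the signatures would cover thousands of genes, and a rank-based statistic is often preferred because it is less sensitive to outliers.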
In leukaemia, diagnosis and subclassification rely mostly on techniques such as cytomorphology, cytogenetics, fluorescence in situ hybridization, multiparameter flow cytometry, and PCR-based methods, which are time-consuming and cost-intensive and require the expertise of central reference laboratories. Microarray analysis therefore represents a promising new diagnostic tool. A key determinant of prognosis in chronic lymphocytic leukaemia (CLL) is the mutational status of the immunoglobulin heavy chain variable region (IGHV) genes. To delineate the mutational status correctly, the sequence from the patient’s leukaemic cells should be compared with its closest germline counterpart. Unfortunately, public web-based databases are commonly used instead of the patient’s own germline DNA sequence from non-leukaemic cells. These reference databases include VBASE, GenBank/IgBLAST and the international ImMunoGeneTics information system, which differ in the software they employ, in the amount of natural IGHV polymorphism they contain, and in the criteria used to map the complementarity determining regions and framework regions. As a result, the interpretation of the IGHV mutational status in CLL may be affected.
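The comparison with a germline counterpart can be sketched in a few lines of Python. The sketch assumes that the two sequences are already aligned to the same length. The toy sequences are invented, and the 98% cut-off is a commonly used convention that is assumed here rather than taken from the text.

```python
def percent_identity(patient, germline):
    """Percent identity between two pre-aligned nucleotide sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped rather than counted as mismatches.
    """
    if len(patient) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for p, g in zip(patient.upper(), germline.upper()):
        if p in "-N" or g in "-N":
            continue
        compared += 1
        if p == g:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100 * matches / compared


def mutational_status(identity, cutoff=98.0):
    """Label a sequence using an assumed identity cut-off (in percent)."""
    return "unmutated" if identity >= cutoff else "mutated"


# Toy germline reference of 100 nucleotides (not a real IGHV gene).
germline_reference = "ACGT" * 25
# Toy patient sequence carrying three substitutions.
patient_bases = list(germline_reference)
for position in (10, 50, 90):
    patient_bases[position] = "T"  # the reference base here is 'G'
patient_sequence = "".join(patient_bases)

identity = percent_identity(patient_sequence, germline_reference)
status = mutational_status(identity)
print(identity, status)
```

Because the result depends on which germline sequence is chosen as the reference, running the same patient sequence against references from different databases may give different identities, which is the source of the discrepancy described above.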
Because many tumours are heterogeneous, identifying good molecular targets is very challenging. For instance, overexpressed or mutated genes may be poor molecular targets if resistant subclones exist. The best target is therefore a ‘red dot’ gene whose mutation occurs early in oncogenesis and dysregulates a key pathway that drives tumour growth in all of the subclones. Examples include mutations in the genes ABL, HER-2, KIT, EGFR and probably BRAF, in chronic myelogenous leukaemia, breast cancer, gastrointestinal stromal tumours, non-small-cell lung cancer and melanoma, respectively. Efficacious therapeutics require the identification of red-dot targets, the development of drugs that inhibit them, and the diagnostic classification of the related pathways.
3. Bioinformatics and breast cancer
Breast cancer occurs in both men and women, although male breast cancer is less common. Although a cure for each stage of breast cancer has not yet been found, identifying the genetic mutations that cause the disease can play an important role. Scientists describe this task as looking for needles in a haystack: after finding the needles, that is, the coding regions, they must still find the disease-related sequences within them. [3, 6] Bioinformatics sets the stage for searching three billion base pairs to detect genetic defects.
Allinen et al. described comprehensive gene expression profiles of each cell type composing normal breast tissue and in situ and invasive breast carcinomas, using SAGE (serial analysis of gene expression) together with cell-type-specific cell surface markers and magnetic beads for rapid sequential isolation. Their results suggest that considerable transcriptional alterations occur in all cell populations (epithelial, myoepithelial, endothelial and stromal cells, myofibroblasts and lymphocytes), whereas genetic changes were detected only in epithelial cells. In another study, based on a systematic Sanger sequencing analysis of 13,023 genes in 11 human breast cancers, individual tumours were found to accumulate an average of approximately 90 point mutations in gene coding regions, but only a tiny number of these were recurrent and affected genes known to be significant in breast cancer, including p53 and PIK3CA. A much larger number of the mutated genes do not necessarily contribute to carcinogenesis. In the genomic landscape of breast cancer, the commonly mutated genes resemble ‘mountains’, while the vast majority of genes form ‘hills’ that are infrequently mutated. The mechanisms involved in the disease need to be elucidated in order to understand the heterogeneity of human cancers and to use personal genomics for tumour diagnosis and new therapeutic strategies.
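The ‘mountains’ and ‘hills’ picture amounts to counting in how many tumours each gene is mutated. The following Python sketch shows that tally on invented data; the tumour identifiers, the placeholder genes GENE_X1 to GENE_X4 and the 50% cut-off are assumptions made only for illustration.

```python
from collections import Counter


def mutation_frequency(tumour_mutations):
    """Fraction of tumours in which each gene carries at least one mutation."""
    counts = Counter()
    for genes in tumour_mutations.values():
        counts.update(set(genes))  # count each gene once per tumour
    n_tumours = len(tumour_mutations)
    return {gene: count / n_tumours for gene, count in counts.items()}


def split_landscape(frequencies, mountain_cutoff=0.5):
    """Separate frequently mutated genes from infrequently mutated ones."""
    mountains = sorted(g for g, f in frequencies.items() if f >= mountain_cutoff)
    hills = sorted(g for g, f in frequencies.items() if f < mountain_cutoff)
    return mountains, hills


# Invented example: four tumours and the genes mutated in each.
tumour_mutations = {
    "tumour_1": ["TP53", "PIK3CA", "GENE_X1"],
    "tumour_2": ["TP53", "GENE_X2"],
    "tumour_3": ["PIK3CA", "TP53", "GENE_X3"],
    "tumour_4": ["PIK3CA", "GENE_X4"],
}

frequencies = mutation_frequency(tumour_mutations)
mountains, hills = split_landscape(frequencies)
print("mountains:", mountains)
print("hills:", hills)
```

A real analysis would also correct for gene length and background mutation rate before calling a gene recurrently mutated, which this tally does not attempt.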
It is widely accepted that early detection of breast cancer has an enormous impact on patient survival. Because the genome-wide expression patterns of tumours mirror their biology, relating gene expression patterns to clinical outcomes sheds light on the biological diversity of tumours. High-throughput, genome-wide, array-based techniques such as array comparative genomic hybridization (aCGH) and transcriptional profiling can be used to discover genes and pathways that are specifically activated or inactivated during tumour progression. A molecular classification of breast cancer with at least five reproducible subtypes (basal-like, ERBB2, normal-like, luminal A, luminal B) was defined through gene expression profiling and microarray analysis. [38, 39, 17] In addition, gene set enrichment analysis (GSEA) showed that a gene set linked to growth factor (GF) signalling was significantly enriched in luminal B tumours. In another study, multiple pathways were identified by mapping gene sets defined in Gene Ontology Biological Process (GOBP) separately for oestrogen receptor positive (ER+) and oestrogen receptor negative (ER-) tumours. Among them, pathways related to apoptosis and cell division were associated with the metastatic capability of ER+ tumours, and pathways related to G-protein coupled receptor signal transduction with that of ER- tumours. Additionally, one study has supported the view that breast cancer is initiated by mutated stem cells or progenitors, also called ‘breast cancer stem cells’ because they are sufficient to sustain oncogenesis and tumour growth. To identify genetic changes in the progression of breast carcinoma, Yao et al. combined aCGH and SAGE for ductal carcinoma in situ (DCIS), invasive breast carcinomas, and lymph node metastases.
They identified 49 minimal commonly amplified regions and reported that the overall frequency of copy number alterations was higher in invasive tumours than in DCIS, with several alterations present only in invasive cancer. In breast cancer, gene amplification occurs recurrently at certain chromosomal locations (e.g., 1q, 8p12, 8q24, 11q13, 12p13, 12q13, 17q21-q23, 20q13), [43, 44] which points to the frequent activation of certain oncogenes during tumour growth. Amplification is a mechanism that constitutively raises gene expression above the range of normal physiological variation, and the significance of oncogene amplification in tumourigenesis was first recognized through expression profiling of tumour cells with oncogene arrays.
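Comparing amplification frequencies between tumour stages can be sketched as follows. The Python example assumes that each sample is summarized as a log2 copy-number ratio per chromosomal region. The ratios and the amplification threshold of 0.8 are invented for illustration and are not taken from the cited studies.

```python
def amplification_frequency(samples, threshold=0.8):
    """Fraction of samples whose log2 ratio reaches the threshold, per region.

    `samples` is a list of dicts mapping a region name to its log2
    copy-number ratio. A region missing from a sample counts as not amplified.
    """
    regions = sorted({region for sample in samples for region in sample})
    return {
        region: sum(sample.get(region, 0.0) >= threshold for sample in samples)
        / len(samples)
        for region in regions
    }


# Invented log2 ratios for two regions in two DCIS and two invasive samples.
dcis_samples = [
    {"8q24": 0.2, "17q21-q23": 1.1},
    {"8q24": 0.1, "17q21-q23": 0.3},
]
invasive_samples = [
    {"8q24": 1.2, "17q21-q23": 1.4},
    {"8q24": 0.9, "17q21-q23": 0.2},
]

dcis_freq = amplification_frequency(dcis_samples)
invasive_freq = amplification_frequency(invasive_samples)

# Regions amplified in invasive tumours but never in DCIS.
invasive_only = sorted(
    region
    for region, freq in invasive_freq.items()
    if freq > 0 and dcis_freq.get(region, 0.0) == 0
)
print(dcis_freq, invasive_freq, invasive_only)
```

Real aCGH pipelines segment the genome before calling gains, so the fixed per-region threshold used here is only a simplification.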
Bioinformatics is also crucial in the realm of pharmacogenomics. A need has arisen for accurate tools that support effective treatment based on the biological characterization of each patient’s tumour. Gene expression profiling of tumours with DNA microarrays is a powerful tool for the pharmacogenomic targeting of treatments. A good example is the Oncotype DX™ assay (Genomic Health), which was developed to identify the subset of node-negative, oestrogen-receptor-positive breast cancer patients who do not require adjuvant chemotherapy. Recent research has demonstrated that microarray analysis with qRT-PCR validation reveals distinct pathways of resistance to bevacizumab (BEV) in xenograft models of human ER+ breast cancer, with Follistatin (FST) and NOTCH as the top signalling pathways associated with resistance in VEGF-driven tumours (P <0.05). According to the gene expression analysis, the level of VEGF expression affects both the response to BEV therapy and the gene pathways involved. With appropriate bioinformatics tools, such findings may clarify drug resistance in individual patients and provide a deeper understanding of treatments and risk factors, opening the door from novel targets and disease-related biomarkers to the right drugs.
Last but not least, the effect of epigenetic changes on breast cancer aetiology is beyond doubt. Although many DNA methylation studies have revealed diverse patterns involving tumour suppressor genes and oncogenes, only a small fraction of them connect epigenome data with the transcriptome. In a recent study by Minning and coworkers, DNA methylation and gene expression profiling were carried out on primary breast tumour tissues and adjacent non-cancerous breast tissues. The results were validated with MS-MLPA or MS-qPCR. The genes that overlapped between the DNA methylation and gene expression datasets were then mapped to the KEGG database to identify the molecular pathways linking them, and supervised hierarchical clustering was used for data analysis. The authors found that most of the overlapping genes belong to the focal adhesion and extracellular matrix-receptor interaction pathways, which play important roles in breast carcinogenesis. The more gene signature data different studies acquire, the better the understanding of the epigenetic regulation of gene expression will become, and the more feasible remedial intervention will be.
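The overlap-then-pathway step can be sketched in Python. The sketch intersects two gene lists and then tests whether a pathway is over-represented among the overlapping genes with a hypergeometric test. This is a simple over-representation analysis chosen for illustration, not necessarily the statistic used in the cited study, and the gene identifiers and the pathway membership are invented placeholders.

```python
from math import comb


def hypergeometric_p(population, pathway_size, sample_size, hits):
    """P(X >= hits) when drawing `sample_size` genes without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, sample_size)
    favourable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample_size)


# Invented background of 20 genes measured on both platforms.
background = {f"G{i}" for i in range(1, 21)}
# Invented lists of differentially methylated and differentially expressed genes.
methylated = {"G1", "G2", "G3", "G4", "G10", "G11", "G12", "G13"}
expressed = {"G1", "G2", "G3", "G4", "G10", "G11", "G17", "G18"}
# Invented pathway membership standing in for a KEGG pathway.
toy_pathway = {"G1", "G2", "G3", "G4", "G5"}

# Genes altered at both the methylation and the expression level.
overlap = methylated & expressed
pathway_hits = overlap & toy_pathway

p_value = hypergeometric_p(
    population=len(background),
    pathway_size=len(toy_pathway & background),
    sample_size=len(overlap),
    hits=len(pathway_hits),
)
print(sorted(overlap), len(pathway_hits), round(p_value, 4))
```

When many pathways are tested in this way, the resulting P-values should be corrected for multiple testing before any pathway is reported as enriched.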
| Category | Database | Maintained by | URL |
| --- | --- | --- | --- |
| Nucleotide sequence | GenBank | US National Center for Biotechnology Information (NCBI) | www.ncbi.nlm.nih.gov/genbank |
| | EMBL | European Bioinformatics Institute | www.ebi.ac.uk/ |
| | DDBJ | National Institute of Genetics, Japan | www.ddbj.nig.ac.jp/ |
| Protein sequence | SWISS-PROT | Swiss Institute of Bioinformatics, Geneva | web.expasy.org/docs/swiss-prot_guideline.html |
| | | European Bioinformatics Institute | www.ebi.ac.uk/swissprot/ |
| | TrEMBL | EBI (translation of coding sequences from the EMBL database that have not yet been deposited in SWISS-PROT) | www.ebi.ac.uk/tremble |
| | UniProt | European Bioinformatics Institute (EMBL-EBI), Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR) | www.uniprot.org |
| | PIR | US National Biomedical Research Foundation (NBRF) | pir.georgetown.edu |
| | | Japan International Protein Information Database (JIPID) | www.ddbj.nig.ac.jp |
| | | Munich Information Center for Protein Sequences (MIPS) | mips.gsf.de |
Advances in bioinformatics and its applications are made possible by multidisciplinary teams pursuing focused research. The sensitivity, specificity and combination of tools, methodologies, and databases should be evaluated comprehensively. Moreover, findings must be confirmed with several molecular techniques before they are translated into clinical practice.
| Maintainer or resource | URL |
| --- | --- |
| Wellcome Trust Sanger Institute / European Bioinformatics Institute (EBI) | www.ensembl.org/ |
| US National Center for Biotechnology Information (NCBI) | www.ncbi.nlm.nih.gov/mapview/ |
| Genome Bioinformatics Group of UC Santa Cruz | http://genome.ucsc.edu/ |
| European Bioinformatics Institute (EBI) | www.ebi.ac.uk/genomes |
| Genomes Online Database | www.genomesonline.org/ |
[1] Ouzounis, C. The Emergence of Bioinformatics: Historical Perspective, Quick Overview and Future Trends. in Bioinformatics in Cancer and Cancer Therapy (ed. Gordon, G. J.) 1-11 (Humana Press, 2009).
[2] Sims, A. H. Bioinformatics and breast cancer: what can high-throughput genomic approaches actually tell us? J Clin Pathol 62, 879-85 (2009).
[3] Maryam Gholizadeh, S. A. P., Reza Pasandideh. Proteomics and Bioinformatics Approaches for Breast Cancer Researches. International Journal of Agriculture and Crop Sciences 5, 1863-1868 (2013).
[4] Ouzounis, C. A. Rise and demise of bioinformatics? Promise and progress. PLoS Comput Biol 8, e1002487 (2012).
[5] The Economist. The race to computerise biology. The Economist (2002).
[6] Daisuke Kihara, Y. D. Y., Troy Hawkins. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools. Cancer Informatics 2, 25-35 (2006).
[7] Adams M. D., K. J. M., Gocayne J. D., Dubnick M., Polymeropoulos M. H., Xiao H., Merril C. R., Wu A., Olde B., Moreno R. F., et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252, 1651-1656 (1991).
[8] Velculescu V. E., Z. L., Vogelstein B., Kinzler K. W. Serial analysis of gene expression. Science 270, 484-487 (1995).
[9] Sydney Brenner, M. J., John Bridgham, George Golda, David H. Lloyd, Davida Johnson, Shujun Luo, Sarah McCurdy, Michael Foy, Mark Ewan, Rithy Roth, Dave George, Sam Eletr, Glenn Albrecht, Eric Vermaas, Steven R. Williams, Keith Moon, Timothy Burcham, Michael Pallas, Robert B. DuBridge, James Kirchner, Karen Fearon, Jen-i Mao, Kevin Corcoran. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nature Biotechnology 18, 630-634 (2000).
[10] Lon R. Cardon, G. R. A. Using haplotype blocks to map human complex trait loci. Trends in Genetics 19, 135-140 (2003).
[11] Andrew G. Clark, R. N., James Signorovitch, Tara C. Matise, Stephen Glanowski, Jeremy Heil, Emily S. Winn-Deen, Arthur L. Holden, Eric Lai. Linkage Disequilibrium and Inference of Ancestral Recombination in 538 Single-Nucleotide Polymorphism Clusters across the Human Genome. American Journal of Human Genetics 73, 285-300 (2003).
[12] Hector, B. The multitumor (sausage) tissue block: novel method for immunohistochemical antibody testing. Laboratory Investigation 55, 244-248 (1986).
[13] Rennstam, K. & Hedenfalk, I. High-throughput genomic technology in research and clinical management of breast cancer. Molecular signatures of progression from benign epithelium to metastatic breast cancer. Breast Cancer Res 8, 213 (2006).
[14] Vidya Vaidya, S. D. A Review of Bioinformatics Application in Breast Cancer Research. Journal of Advanced Bioinformatics Applications and Research 1, 59-68 (2010).
[15] Yang, X., Ai, X. & Cunningham, J. M. Computational prognostic indicators for breast cancer. Cancer Manag Res 6, 301-12 (2014).
[16] Lonning, P. E., Knappskog, S., Staalesen, V., Chrisanthar, R. & Lillehaug, J. R. Breast cancer prognostication and prediction in the postgenomic era. Ann Oncol 18, 1293-306 (2007).
[17] Per Eystein Lønning, R. C., Vidar Staalesen, Stian Knappskog, Johan Lillehaug. Adjuvant treatment: the contribution of expression microarrays. Breast Cancer Research 9, S14 (2007).
[18] Jaluria, P., Konstantopoulos, K., Betenbaugh, M. & Shiloach, J. A perspective on microarrays: current applications, pitfalls, and potential uses. Microb Cell Fact 6, 4 (2007).
[19] Dominick Sinicropi, M. C., Mei-Lan Liu. Gene Expression Profiling Utilizing Microarray Technology and RT-PCR. in BioMEMS and Biomedical Nanotechnology (eds. Ferrari, M., Ozkan, M. & Heller, M.) 23-46 (Springer US, 2007).
[20] Eroles, P., Bosch, A., Perez-Fidalgo, J. A. & Lluch, A. Molecular biology in breast cancer: intrinsic subtypes and signaling pathways. Cancer Treat Rev 38, 698-707 (2012).
[21] Schiavon, G. et al. Heterogeneity of Breast Cancer: Gene Signatures and Beyond. 13-25 (2012).
[22] Manning G, W. D., Martinez R, Hunter T, Sudarsanam S. The Protein Kinase Complement of the Human Genome. Science 298, 1912-1934 (2002).
[23] Christos A. Ouzounis, R. M. R. C., Anton J. Enright, Victor Kunin & José B. Pereira-Leal. Classification schemes for protein structure and function. Nature Reviews Genetics 4, 508-519 (2003).
[24] Strausberg RL, S. A., Old LJ, Riggins GJ. Oncogenomics and the development of new cancer therapies. Nature 429, 469-74 (2004).
[25] National Human Genome Research Institute. The Human Genome Project Completion. http://www.genome.gov/11006943.
[26] Duojiao Wu, C. M. R., Xiangdong Wang. Cancer bioinformatics: A new approach to systems clinical medicine. BMC Bioinformatics 13 (2012).
[27] Wang, X. & Liotta, L. Clinical bioinformatics: a new emerging science. J Clin Bioinforma 1, 1 (2011).
[28] Wang, X. Role of clinical bioinformatics in the development of network-based Biomarkers. J Clin Bioinforma 1, 28 (2011).
[29] Kim, B. et al. Clinical validity of the lung cancer biomarkers identified by bioinformatics analysis of public expression data. Cancer Res 67, 7431-8 (2007).
[30] Sawey, E. T. et al. Identification of a therapeutic strategy targeting amplified FGF19 in liver cancer by Oncogenomic screening. Cancer Cell 19, 347-58 (2011).
[31] Zerbini LF, B. M., de Vasconcellos JF, Paccez JD, Gu X, Kung AL, Libermann TA. Computational repositioning and preclinical validation of pentamidine for renal cell cancer. Molecular Cancer Therapeutics 13, 1929-1941.
[32] Ghia, P. et al. ERIC recommendations on IGHV gene mutational status analysis in chronic lymphocytic leukemia. Leukemia 21, 1-3 (2007).
[33] Davi, F., Rosenquist, R., Ghia, P., Belessi, C. & Stamatopoulos, K. Determination of IGHV gene mutational status in chronic lymphocytic leukemia: bioinformatics advances meet clinical needs. Leukemia 22, 212-4 (2008).
[34] Simon, R. Bioinformatics in cancer therapeutics--hype or hope? Nature Clinical Practice Oncology 2, 223 (2005).
[35] Allinen, M. et al. Molecular characterization of the tumor microenvironment in breast cancer. Cancer Cell 6, 17-32 (2004).
[36] Sjöblom T, J. S., Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268-274 (2006).
[37] Wood L. D., P. D. W., Jones S., Lin J., Sjöblom T., Leary R. J., Shen D., Boca S. M., Barber T., Ptak J., Silliman N., Szabo S., Dezso Z., Ustyanksky V., Nikolskaya T., Nikolsky Y., Karchin R., Wilson P. A., Kaminker J. S., Zhang Z., Croshaw R., Willis J., Dawson D., Shipitsin M., Willson J. K., Sukumar S., Polyak K., Park B. H., Pethiyagoda C. L., Pant P. V., Ballinger D. G., Sparks A. B., Hartigan J., Smith D. R., Suh E., Papadopoulos N., Buckhaults P., Markowitz S. D., Parmigiani G., Kinzler K. W., Velculescu V. E., Vogelstein B. The genomic landscapes of human breast and colorectal cancers. Science 318, 1108-1113 (2007).
[38] Sorlie, T. et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 98, 10869-74 (2001).
[39] Christos Sotiriou, S.-Y. N., Lisa M. McShane, Edward L. Korn, Philip M. Long, Amir Jazaeri, Philippe Martiat, Steve B. Fox, Adrian L. Harris, Edison T. Liu. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A 100, 10393-8 (2003).
[40] Loi, S. et al. Gene expression profiling identifies activated growth factor signaling in poor prognosis (Luminal-B) estrogen receptor positive breast cancer. BMC Med Genomics 2, 37 (2009).
[41] Jack X Yu, A. M. S., Yi Zhang, John WM Martens, Marcel Smid, Jan GM Klijn, Yixin Wang, John A Foekens. Pathway analysis of gene signatures predicting metastasis of node-negative primary breast cancer. BMC Cancer 7, 182 (2007).
[42] Behbod, F. & Rosen, J. M. Will cancer stem cells provide new therapeutic targets? Carcinogenesis 26, 703-11 (2005).
[43] Yao, J. et al. Combined cDNA array comparative genomic hybridization and serial analysis of gene expression analysis of breast tumor progression. Cancer Res 66, 4065-78 (2006).
[44] Frank Courjal, M. C., Joelle Simony-Lafontaine, Genevieve Louason, Paul Speiser, Robert Zeillinger, Carmen Rodriguez, and Charles Theilet. Mapping of DNA Amplifications at 15 Chromosomal Localizations in 1875 Breast Tumors: Definition of Phenotypic Groups. Cancer Research 57, 4360-4367 (1997).
[45] Larissa Savelyeva, M. S. Amplification of oncogenes revisited: from expression profiling to clinical application. Cancer Letters 167, 115-123 (2001).
[46] Soonmyung Paik, M. D., Steven Shak, M. D., Gong Tang, Ph. D., Chungyeul Kim, M. D., Joffre Baker, Ph. D., Maureen Cronin, Ph. D., Frederick L. Baehner, M. D., Michael G. Walker, Ph. D., Drew Watson, Ph. D., Taesung Park, Ph. D., William Hiller, H. T., Edwin R. Fisher, M. D., D. Lawrence Wickerham, M. D., John Bryant, Ph. D., Norman Wolmark, M. D. A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, Node-Negative Breast Cancer. The New England Journal of Medicine 351, 2817-2826 (2004).
[47] Gokmen-Polar, Y. et al. Gene Expression Analysis Reveals Distinct Pathways of Resistance to Bevacizumab in Xenograft Models of Human ER-Positive Breast Cancer. J Cancer 5, 633-45 (2014).
[48] Chin Minning, N. M. M., Norlia Abdullah, Rohaizak Muhammad, Nor Aina Emran, Siti Aishah Md Ali, Roslan Harun, Rahman Jamal. Exploring breast carcinogenesis through integrative genomics and epigenomics analyses. International Journal of Oncology, 1959-1968 (2014).