According to the central dogma of molecular biology, genetic information flows from DNA to RNA to protein: DNA is replicated, transcribed into mRNA, and the mRNA is translated into protein. Gene expression is the part of this process, transcription followed by translation, by which cells produce proteins. Although the DNA is the same in all cell types of an organism, each cell expresses only a subset of its genes at any given time; this ability to modulate the expression of its genome allows the cell to change its functions.
Gene expression profiling is a process in which the genes expressed in a cell are measured at a specific time. This method simultaneously measures the expression levels of thousands of genes, yielding the expression pattern of the cell’s genes. Therefore, through gene expression profiling, we can infer the functions of a cell at a particular time, and an important application of this method is the study of cancer cells.
A cancer cell is a cell of a tissue that has lost the standard mechanisms controlling cell division, resulting in uncontrolled multiplication and the accumulation of transformed somatic cells that carry many genetic alterations and epigenetic modifications. These cells can infiltrate adjacent tissues and give rise to metastases. Metastatic cells impede the normal functioning of vital organs and destroy healthy tissue, resulting in death.
2. Cancer cell biology
The process of carcinogenesis begins with the transformation of a normal cell into a cancer cell as the genes that control the growth and differentiation of the cell are modified. The genes involved in this process are (1) oncogenes, which promote cell growth and differentiation, and (2) tumor suppressor genes, which repress cell division. For a tumor to form, several mutations must accumulate, leading to the generation and overexpression of oncogenes and to the repression of tumor suppressor genes.
The genetic modifications that cause tumor development can occur at any stage of the cell cycle. For example, a whole chromosome can be lost through an error in mitosis, or various mutations can arise in the nucleotide sequence [7, 8].
Most cancers are caused by mutations in the genome of the cell that originate from environmental factors, while about 10% of cancers are due to heredity. The main environmental factors that lead to cancer are tobacco, diet and obesity, infections, radiation, stress, and pollution. Cancer cells can develop in and infiltrate all tissues and vital organs of the body. The most common types of cancer worldwide, across both sexes combined, are lung cancer, breast cancer, colorectal cancer, and prostate cancer, followed by stomach, liver, and esophageal cancer.
3. Gene expression profiling techniques
As mentioned above, gene expression profiling is a useful tool in modern biosciences. The human genome contains genes that produce mRNA that is later translated into protein. It also contains non-protein-coding RNA genes and large regions of noncoding and regulatory sequence. Therefore, measuring mRNAs is crucial for assessing gene expression. Gene expression is essential in determining cell type, developmental stage, and both pathological and healthy functions. Apart from providing data on the subset of genes expressed in different cell types under different conditions, gene expression profiling can also serve as a diagnostic test, since it can record cellular responses to drug treatment. Cancer development, as already stated, depends on gain-of-function mutations in proto-oncogenes that result in dominant oncogenes or in overexpression of those oncogenes, along with the loss or under-expression of tumor suppressor genes, which together lead to uncontrolled cell division. Thus, using gene expression profiling, one can compare normal and cancerous cells to determine the genetic origin of the faulty pathways that characterize cancer and to identify potential targets for its treatment. Apart from treatment, this technique can also help identify new biomarkers and gene signatures. Gene expression profiling can be performed with various assay technologies, of which the most widely used are DNA microarrays, RNA-seq, and qPCR.
The creation of a cDNA library is a vital step in gene expression profiling. An experiment begins with the extraction of total RNA from the biological material of choice, such as a population of cancer cells. The extraction is followed by a protocol that isolates a specific RNA type (e.g., ribo-depletion to remove ribosomal RNAs). The RNA is then converted to cDNA by reverse transcription.
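As a rough illustration of the reverse-transcription step, the sketch below derives the first-strand cDNA sequence from an mRNA sequence by base complementarity. The function name `reverse_transcribe` and the example sequence are hypothetical; the code models only the sequence relationship between template and product, not the laboratory protocol.

```python
# Base pairing between an RNA template and the DNA strand synthesized from it.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an mRNA given 5'->3'."""
    mrna = mrna.upper()
    unknown = set(mrna) - set(RNA_TO_CDNA)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # The cDNA strand is antiparallel to its template, so we complement
    # each base and reverse the order to report it 5'->3'.
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna))


example_mrna = "AUGGCCUAA"  # hypothetical sequence, for illustration only
print(reverse_transcribe(example_mrna))  # TTAGGCCAT
```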
Printing cDNA microarrays on glass slides is a commonly used technique. cDNAs are obtained by amplifying individual clones from a library, and each fragment represents an individual gene of interest. Each fragment is then immobilized on a slide coated with DNA-binding chemicals. These slides can be used in a microarray experiment. In a typical experiment, mRNAs from the two samples to be compared are reverse transcribed and labeled with two different fluorescent markers. The labeled samples are then competitively hybridized to the microarray. Excess labeled probe is removed by washing, and the slide is scanned with a laser scanner. The relative expression level of each gene in each sample is reflected by the hybridization intensity, measured as the amount of fluorescent emission.
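The comparison of the two fluorescent channels described above is commonly summarized as a per-spot log2 intensity ratio. The following is a minimal sketch of that calculation, assuming one intensity value per gene and channel and a single constant background value; the function name, the gene labels, and the intensities are hypothetical, and real analyses typically add channel normalization that is not shown here.

```python
import math


def log2_ratios(sample_a, sample_b, background=0.0):
    """Per-spot log2(sample_a / sample_b) from two fluorescence channels.

    sample_a, sample_b: dicts mapping gene name -> raw spot intensity.
    background: constant subtracted from every spot (a simplifying assumption).
    """
    ratios = {}
    for gene, a_raw in sample_a.items():
        a = a_raw - background
        b = sample_b[gene] - background
        if a <= 0 or b <= 0:
            ratios[gene] = None  # spot not reliably above background
        else:
            ratios[gene] = math.log2(a / b)
    return ratios


# Hypothetical intensities for three spots.
tumor = {"GENE_A": 4000.0, "GENE_B": 500.0, "GENE_C": 1500.0}
normal = {"GENE_A": 1000.0, "GENE_B": 2000.0, "GENE_C": 1500.0}

print(log2_ratios(tumor, normal))
# {'GENE_A': 2.0, 'GENE_B': -2.0, 'GENE_C': 0.0}
```

A positive value indicates stronger signal in the first sample, a negative value stronger signal in the second, and zero equal signal in both.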
The development of high-throughput next-generation sequencing (NGS) has revolutionized gene expression profiling. NGS provides massively parallel short-read DNA sequencing. It is now possible to analyze RNA through the sequencing of cDNA, a method termed RNA sequencing (RNA-seq). The cDNA is sequenced on a high-throughput platform, and the resulting data are stored as FASTQ files, which contain the reads produced by the NGS platform. These reads are aligned to a reference genome. Finally, the expression level of each gene is estimated by counting the number of reads that align to each full-length transcript.
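The two ends of this workflow, reading FASTQ records and counting aligned reads per gene, can be sketched as follows. This is a simplified illustration, not a production pipeline: the alignment step itself is performed by a dedicated aligner and is not shown, so the aligned positions below are assumed inputs. All function names, gene names, coordinates, and reads are hypothetical, and a read is assigned to a gene simply when its start position falls inside the gene interval.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip()
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def count_reads_per_gene(alignments, genes):
    """Count aligned reads whose start position lies within each gene.

    alignments: iterable of (chromosome, start) pairs, assumed to come
                from an external aligner.
    genes: dict mapping gene name -> (chromosome, start, end), end exclusive.
    """
    counts = {name: 0 for name in genes}
    for chrom, pos in alignments:
        for name, (g_chrom, g_start, g_end) in genes.items():
            if chrom == g_chrom and g_start <= pos < g_end:
                counts[name] += 1
    return counts


# Hypothetical two-read FASTQ content.
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nIIII\n"
reads = list(parse_fastq(io.StringIO(fastq_text)))
print(reads)

# Hypothetical gene intervals and aligned read positions.
genes = {"GENE_A": ("chr1", 100, 200), "GENE_B": ("chr1", 500, 800)}
alignments = [("chr1", 120), ("chr1", 150), ("chr1", 510), ("chr2", 120)]
counts = count_reads_per_gene(alignments, genes)
print(counts)  # {'GENE_A': 2, 'GENE_B': 1}
```

Raw counts such as these are usually normalized for sequencing depth and transcript length before samples are compared.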
qPCR is a technique used to quantify gene expression by monitoring polymerase-driven DNA amplification (PCR) in real time. PCR uses a thermostable DNA polymerase to synthesize new strands of DNA. Along with the DNA polymerase and template DNA (in this instance, cDNA), PCR requires primers and nucleotides. The nucleotides act as building blocks, while the primers specify the exact DNA product to be amplified. The reaction proceeds through repeated DNA amplification cycles. Each cycle consists of three steps: denaturing, annealing, and extending. The result is the amplification of the DNA sample. Two standard methods are used in qPCR to detect and quantify the product: fluorescent dyes that intercalate non-specifically into double-stranded DNA, and sequence-specific DNA probes consisting of fluorescently labeled reporters that are complementary to the DNA product and permit detection only after hybridization.
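The link between cycle number and starting template can be illustrated with an idealized model in which the product grows by a constant factor each cycle. The sketch below assumes a fixed amplification efficiency (1.0 meaning perfect doubling) and a fixed detection threshold; the function names and all numerical values are hypothetical and chosen only to show the relationship.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized product amount after a number of PCR cycles.

    efficiency=1.0 means the product doubles every cycle (an assumption;
    real reactions are less efficient and eventually plateau).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def threshold_cycle(initial_copies, threshold_copies, efficiency=1.0, max_cycles=40):
    """First cycle at which the product reaches the detection threshold."""
    for cycle in range(max_cycles + 1):
        if copies_after(initial_copies, cycle, efficiency) >= threshold_copies:
            return cycle
    return None  # threshold not reached within max_cycles


# Hypothetical samples: the second starts with 8 times more template.
low = threshold_cycle(initial_copies=1_000, threshold_copies=1e9)
high = threshold_cycle(initial_copies=8_000, threshold_copies=1e9)
print(low, high)  # 20 17
```

Under perfect doubling, an eightfold higher starting amount crosses the threshold three cycles earlier, which is why the cycle at which fluorescence becomes detectable can be used to compare template abundance between samples.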
4. Gene expression profiling of most common cancer types
Gene expression profiling has contributed to a better understanding of breast cancer biology. Applications of gene profiling in breast cancer include subclassification of the disease, prognosis, prediction of response to therapy, and tailoring of therapy to the host. Breast cancer comprises multiple subtypes defined by intrinsic molecular characteristics. With the use of microarrays, distinctive molecular portraits of breast cancer have been reported. According to those studies, tumors are classified into five subtypes with distinct clinical outcomes: luminal A, luminal B, HER2-overexpressing, basal, and normal-like tumors. Apart from subclassification, gene expression analyses have been used to characterize novel prognostic indicators. Gene expression tests developed for breast cancer prognosis include Oncotype DX, MammaPrint, the PAM50-based risk of recurrence score, the Breast Cancer Index, and EndoPredict. With predictive biomarkers, an oncologist should be able to design an individualized therapy with maximum benefit and minimum harm. Predictive biomarkers are fewer than prognostic ones. Oncotype DX is a genomic assay that can also be used to predict response to therapy. Also known as the 21-gene recurrence score, Oncotype DX measures the expression of 21 genes (16 cancer-related genes and 5 reference genes) and reports them as a single Recurrence Score. The Recurrence Score can then help the oncologist select the best available treatment for the patient. Finally, the next step in breast cancer treatment is to use host biology in prediction as well, since gene variations in the patient can affect the efficacy and toxicity of the treatment.
Lung cancer is molecularly heterogeneous. As in breast cancer, gene expression profiling has been used to identify the lung cancer type, while disease prognosis and prediction of response to therapy seem to be specific to lung cancer subtypes. There are two major histologically distinct types of lung cancer: non-small cell lung cancer (NSCLC) and small-cell lung cancer (SCLC). NSCLC is further divided into three subcategories: adenocarcinoma, squamous cell carcinoma (SqCC), and large-cell carcinoma. NSCLC and SCLC differ in pathophysiology and clinical features, suggesting different molecular mechanisms of carcinogenesis. Genome-wide cDNA microarrays have helped researchers document distinct phenotypic and biological differences in cancer cells. SCLCs are characterized by prevalent bi-allelic inactivation of TP53 and RB1, frequent amplification of SOX2, and recurrent mutations in genes that encode histone modifiers. Eleven genes have been associated with SqCCs, in which the frequency of TP53 mutations is 90%, while 18 genes have been associated with adenocarcinomas, 75% of which harbor genetic alterations that promote the RTK/RAS/RAF pathway. Gene expression profiling has also been used for disease prognosis and prediction of response to therapy in NSCLCs.
Colorectal cancer accounts for about 10% of cancer cases worldwide. According to a meta-analysis, microarray results indicated increased expression in colorectal cancer of the genes of six chemokines (CCL18, CXCL9–11, IL8, and CCL2), of two apoptosis-related genes (UBD and BIRC3), and of LAMC2 and MMP7. Specifically, the expression of CCL18 is an indicator of colorectal cancer, while the expression of CXCL9–11 increases the migratory ability of cancer cells. Moreover, the results of a bioinformatic analysis indicate that the affected genes are associated with chemokines, the cell cycle, and G protein-coupled receptor signaling pathways. According to this research, the main genes involved in the cell cycle that are altered in cancer are the cyclins CCNB1 and CCNA2, cyclin-dependent kinase 1 (CDK1), CENPE, KIF20A, and MAD2L1. Likewise, the chemokine genes affected in cancer are CXCL1, CXCL2, CXCL6, CXCL8, and CXCL12.
Another type of cancer with a high frequency is prostate cancer. This cancer is an adenocarcinoma. Its main symptoms are pain, difficulty in passing urine, hematuria, and erectile dysfunction. Its main risk factors are obesity, a diet rich in meat, family history, and the HPC1, androgen receptor (AR), and vitamin D receptor genes. According to the literature, there are indications that prostate cancer may be linked to SNPs in regions of the c-MYC oncogene that affect chromatin conformation and the expression of the gene. Furthermore, it has been shown that the BRCA2 gene, in addition to its association with breast cancer, is also related to an increased risk of prostate cancer. Similar indications have been reported for altered expression of BRCA1. Moreover, prostate cancer has been associated with mutations in genes that are part of the DNA repair mechanism, such as CHEK2, PALB2, BRIP1, and NBS1, which are likewise related to the risk of breast cancer (CHEK2, PALB2, BRIP1) [33, 34, 35] and to Nijmegen breakage syndrome (NBS1).
All in all, many genes are affected in cancer and can be employed as biomarkers for prognosis and for predicting therapy outcomes in all types of cancer. Some examples have been mentioned above for the four most common types of cancer, but they constitute only a part of all the genes that are affected in a cancer cell. By identifying the genes that are biomarkers of a cancer type and the genes that promote tumor proliferation, the aim is a more targeted and personalized treatment for each patient, one that not only eradicates the tumor but also silences the genes that lead to tumor formation.