Main genes with potential alternative splicing in breast cancer.
In cancer, several alterations drive cell transformation, including DNA imbalances, changes in gene expression and changes in protein diversity. Transcriptional regulation is finely controlled by a large number of molecules, including SR proteins, hnRNPs, RNA, DNA and histone methylation, among others. In cancer, however, this regulation is altered. Little is known about the regulatory mechanisms that cause alternative splicing in health and in human disease. Alternative splicing plays an important role in generating the diversity of transcripts and of the resulting proteins. The aberrant transcript variants expressed in cancer have shown great potential as biomarkers or therapeutic targets. In this manuscript, we present the basics of alternative splicing and a simple method that uses publicly available data to detect alternative transcripts expressed in the three most common human cancers.
- alternative splicing
- breast cancer
- prostate cancer
- gene expression
- molecular markers
The molecular biology of cancer is not completely understood. The human transcriptome is an important candidate source of molecular markers, because RNA is divided into coding and non-coding fractions whose functions, locations and structures are highly variable. However, little is known about the complexity of the transcriptome in cancer. In this chapter, we focus on the landscape of post-transcriptional modifications (RNA splicing), on data mining, and on the identification of alternative splicing in available microarray data. We think that the splicing and alternative splicing machinery is highly modified in cancer, and that these changes could play a very important role in diagnosis and prognosis.
2. RNA splicing
Gene expression is orchestrated through extensive interactions among molecules, including SR proteins, hnRNPs, RNA, DNA and histone methylation, among others. RNA is a fundamental molecule for life. Recent studies have shown that the human transcriptome is divided into coding and non-coding RNA. Interestingly, coding RNA accounts for only 2% of the human transcriptome and the remainder is non-coding RNA, suggesting great versatility in generating protein diversity. The pre-mRNA matures through several events, including the addition of a poly(A) tail at the 3′ end, the addition of a 5′ m7G cap, and RNA splicing; these modifications confer RNA stability and efficient transport to the cytoplasm, among other properties.
RNA splicing involves several steps and relies on specific signals that delimit intronic and exonic sequences (splice sites, SS), together with sequences that influence exon skipping: intronic splicing enhancers (ISE), intronic splicing silencers (ISS), exonic splicing enhancers (ESE) and exonic splicing silencers (ESS). In addition, five small nuclear ribonucleoproteins (snRNPs; U1, U2, U4, U5 and U6) and more than 150 additional co-factors contribute to splicing [2, 3].
Basically, splicing relies on four signals: the branch point, the polypyrimidine tract, the 5′ splice site and the 3′ splice site [4, 5, 6]. Sequential steps then produce topological changes between the RNA and the snRNPs, forming the E complex, the A complex (ATP dependent), the B complex and finally the C complex or spliceosome, which is the catalytic complex. Additionally, the RNA can undergo alternative exon skipping by means of alternative splicing (AS). AS uses the basic splicing machinery, and SR proteins and hnRNPs play an essential role in alternative exon skipping.
Coding RNA is represented by ~25,000 genes; however, more than 300,000 transcripts have been reported [8, 9]. The difference between the number of genes and the number of transcripts is probably due to regulation by alternative splicing (AS). Indeed, more than 80% of coding RNAs are known to be subject to AS, which promotes a great diversity of mRNAs and, consequently, of proteins. For example, in Drosophila melanogaster, the Down syndrome cell adhesion molecule gene (Dscam) could generate more than 38,000 different mRNAs by means of alternative splicing. These findings show the importance of AS for the biology of the cell.
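The Dscam figure can be reproduced with simple arithmetic. The sketch below assumes the commonly cited organization of the gene into four clusters of mutually exclusive exons (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17); these cluster sizes are not given in this chapter and are used here only as an illustration of how combinatorial exon choice multiplies transcript diversity.

```python
from math import prod

# Assumed numbers of mutually exclusive alternatives per exon cluster
# (commonly cited values, not taken from this chapter)
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

def count_isoforms(clusters):
    """Number of possible mRNAs when exactly one alternative is chosen per cluster."""
    return prod(clusters.values())

dscam_isoforms = count_isoforms(DSCAM_CLUSTERS)
print(dscam_isoforms)  # 38016
```

Because the counts multiply rather than add, a single gene with a few variable exon clusters can exceed the total number of genes in the genome.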
On the other hand, long non-coding RNAs (lncRNAs) can also be subject to alternative splicing. However, the diversity of lncRNA transcripts has been poorly studied. It is thought that AS in lncRNAs could be implicated in several regulatory processes, mainly mediated by RNA-RNA, RNA-DNA and RNA-protein interactions. All of these possible interactions probably increase the complexity of the regulatory process.
We will focus on coding RNA, which represents only ~2% of total RNA. It is known that AS events are more frequent in eukaryotic cells than in other organisms [11, 12]. The AS patterns include alternative promoters, exon skipping, intron retention, mutually exclusive exons, exon scrambling, alternative 5′ splice sites, alternative 3′ splice sites and alternative polyadenylation. Interestingly, the protein products of AS can differ in function from the native protein. In several human diseases, AS contributes to diverse cellular processes, including cell proliferation, migration, adhesion and metastasis, among others [9, 13, 14, 15, 16]. The transcripts subject to AS and their products could be used as molecular markers and therapeutic targets, because they are expressed only in the disease or their expression is increased in it [11, 17, 18, 19].
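One of the patterns above, exon skipping, can be made concrete with a toy representation in which each transcript is an ordered list of exon labels. The sketch below is a simplification under stated assumptions: the exon labels and the two transcripts are hypothetical, only internal exons are considered, and the other patterns (alternative splice sites, intron retention, alternative polyadenylation) would need coordinate-level information that is not modeled here.

```python
def skipped_exons(reference, isoform):
    """Return internal exons of the reference transcript that are absent
    from the isoform (cassette exon skipping).

    First and last exons are excluded because their absence points to an
    alternative promoter or polyadenylation site rather than exon skipping.
    """
    present = set(isoform)
    internal = reference[1:-1]
    return [exon for exon in internal if exon not in present]

# Hypothetical five-exon gene and an isoform lacking the third exon
reference_transcript = ["E1", "E2", "E3", "E4", "E5"]
variant_transcript = ["E1", "E2", "E4", "E5"]

skipped = skipped_exons(reference_transcript, variant_transcript)
print(skipped)  # ['E3']
```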
3. Alternative splicing and diseases
AS is regulated by a large number of proteins, non-coding RNAs and DNA elements, and the large, complex network of interactions among them gives the cell a fine capacity for regulation. In addition, post-transcriptional regulation is orchestrated so finely that cells can respond rapidly to a stimulus and adjust their proteome. The cell is also exposed daily to toxic agents and UV radiation, which make it vulnerable to mutations and misregulation. In particular, mutations play an important role in the aberrant AS that causes disease, especially neuromuscular, neurodegenerative and multifactorial diseases such as cancer.
Three sequences are extremely important for RNA processing and maturation: the 5′ and 3′ splice sites, located at the 5′ and 3′ ends of the intron, and the branch point sequence, which is usually located ~40 nucleotides upstream of the 3′ splice site. These sequences contain the specific motifs that the spliceosome recognizes for precise exon joining. Mutations at these sites therefore disrupt correct spliceosome assembly. Approximately 10% of genetic diseases are caused by point mutations that disrupt the interaction between the RNA and the spliceosome [22, 23].
The class of mutation and its location in the genome contribute to different AS variants, such as exon skipping, exon retention and alternative 5′ and 3′ splice sites, among others. The severity of the disease can reflect the expression level of the mutated gene. For example, in spinal muscular atrophy (SMA) the SMN2 gene carries a C → T change in exon 7; this change promotes skipping of the exon (SMN∆7), and the skipped transcript accounts for ~80% of expression, compared with ~20% in healthy individuals. Another case is Duchenne muscular dystrophy (DMD), in which a T → A substitution in exon 31 of the dystrophin gene promotes skipping of that exon. In cystic fibrosis, exclusion of exon 9 of CFTR modifies the severity of the disease. In Peutz-Jeghers syndrome, an alternative transcript of LKB1 is expressed as a consequence of the IVS2 + 1A > G change.
4. Alternative splicing and cancer
In cancer, several alterations are involved in cell transformation. Recent studies have shown that AS plays an important role in cancer development because it changes the transcriptome and, consequently, the proteome, contributing to cell transformation [19, 25]. However, few studies have focused on identifying transcript variants in cancer. Computational studies based on expressed sequence tags have shown that AS was slightly lower in tumors than in normal tissues. The question is: what is the difference between AS in cancer tissues and in normal tissues? The aberrant transcripts expressed in cancer have shown great potential as biomarkers or therapeutic targets. In breast cancer, the CD44 gene can be transcribed into seven alternative transcripts; transcript variants five and seven have been implicated in diverse pathologies, but transcript six is expressed only in metastatic cancer and tumorigenic cell lines. These findings suggest that alternative transcript six of CD44 could play a role in the metastatic process [27, 28]. BRCA1 has been implicated in diverse types of cancer; in breast malignancies the mutation c.591C > T is implicated in skipping of exon 18 in the BRCA1 transcript, and this mutation constitutes an important prognostic factor in familial breast and ovarian cancer. In gastric cancer, the KIT gene has a deletion of ~40 nucleotides; this deletion promotes aberrant AS and the loss of functional protein.
In the healthy cell, several proteins are key to DNA repair and transcript regulation, among other processes. The BCL proteins are very important in programmed cell death. However, the cancer cell upregulates the expression and AS of BCL-xL, promoting expression of the long protein, which is involved in anti-apoptotic processes. In contrast, the short protein BCL-xS is involved in apoptosis. In ovarian cancer, a new alternative transcript of p53 (TP53INP2) was found; its expression is strongly associated with cell migration and invasion and with adverse prognosis. Leukemia is the most frequent malignancy in childhood, and this neoplasia is no exception: AS has been identified in several transcripts, including CCAR1, which promotes the Par-4/THAP1 complex and Notch3 and confers an unfavorable prognosis, as well as hMLH1Delta6. Ikaros is a tumor suppressor gene; its variant IK11 is associated with proliferation and anti-apoptotic processes.
5. Alternatively spliced transcripts/proteins as molecular markers and therapeutic targets
The great challenge in cancer is the identification of molecular markers and therapeutic targets. The proteins and transcripts produced by AS are promising molecules because they open new opportunities in cancer. Aberrant AS is a consequence of malignant transformation: mutations and the modulation of gene expression promote the expression of new molecules that confer advantages on the cancer cell, such as proliferation, migration, invasion and evasion of programmed cell death, among others. In this context, molecules expressed in cancer could be the best molecular markers and treatment targets, because they are expressed only in pathological tissue. Little information is available about AS profiles in cancer; nevertheless, some molecules have been used as molecular markers. CD44 isoforms can be predictive of response to anti-CD44 treatment in many types of cancer. The androgen receptor variant AR-V7 has been used as a predictive marker; patients who express the V7 isoform are resistant to therapy with enzalutamide and abiraterone. The isoforms of SLC39A14 are used for the detection of non-invasive colorectal cancer, and the isoform is specific to the colon and rectum [40, 41]. The new transcript variant of VNN1 is also specific to colon cancer and is used for detection because of its specificity.
The prospect for cancer treatment is based on antibodies specific for isoforms expressed exclusively in the disease. However, other strategies that target RNA could also be used, such as stable antisense RNAs; this approach could be applied to different types of RNA (coding and non-coding), including pre-RNA. RNA interference is another strategy used to eliminate aberrantly expressed transcripts or even splicing variants.
6. Methods for detection of alternative splicing
mRNA splicing is easily visualized using several molecular biology tools; the most widely used is RT-PCR. AS detection by PCR depends on primer design. Usually the primers flank the skipped exon; however, this method cannot detect novel spliced transcripts. In situ hybridization is another method used for AS detection but, like PCR, it cannot detect novel splice sites. In the last 20 years, high-throughput methods for measuring gene expression have been developed. These tools have quickly provided disease-associated gene expression profiles. Nowadays, gene expression microarrays and next-generation sequencing are used to detect novel molecules expressed in diverse diseases, including alternatively spliced mRNAs. Microarray gene expression (MGE) platforms can now measure exon expression; in this context, low expression or suppression of a particular probe set could indicate AS (Figure 1). To date, 25,252 assays have been performed with the Affymetrix Human Exon 1.0 ST array, 39,836 assays with the Affymetrix GeneChip Human Gene 1.0 ST array and 2422 assays with the most recent version, the Affymetrix GeneChip Human Gene 2.0 ST array; the experiments were performed between 8/7/07 and 8/8/16, between 8/12/08 and 12/1/17 and between 8/1/13 and 12/19/17, respectively. Moreover, the microarray data available for data mining provide extraordinary information about expression profiles in human diseases, including cancer. Additionally, microarray analysis can be directed toward exploring AS.
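The idea that a low signal in one probe set may indicate AS can be sketched with a splicing index, a simple exon-array statistic in which each exon-level signal is normalized by the gene-level signal of the same sample and the normalized values are compared between two groups. This is a minimal illustration under stated assumptions, not the procedure implemented in any particular software package: the function name, the example log2 intensities and the cutoff of one log2 unit are all hypothetical.

```python
from statistics import mean

def splicing_index(exon_a, gene_a, exon_b, gene_b):
    """Splicing index for one probe set between groups A and B.

    All inputs are lists of log2 intensities, one value per sample.
    Subtracting the gene-level signal removes whole-gene expression
    changes, so the result reflects exon-specific differences only.
    """
    normalized_a = mean(e - g for e, g in zip(exon_a, gene_a))
    normalized_b = mean(e - g for e, g in zip(exon_b, gene_b))
    return normalized_a - normalized_b

# Hypothetical log2 signals for one exon probe set and its gene
tumor_exon, tumor_gene = [5.0, 5.2, 4.8], [9.0, 9.2, 8.8]
normal_exon, normal_gene = [8.0, 8.1, 7.9], [9.0, 9.1, 8.9]

si = splicing_index(tumor_exon, tumor_gene, normal_exon, normal_gene)
CUTOFF = 1.0  # illustrative threshold in log2 units (assumption)
candidate_as_event = abs(si) >= CUTOFF
print(round(si, 2), candidate_as_event)
```

In this made-up example the gene-level signal is the same in both groups, while the exon signal is about three log2 units lower in the tumor samples, so the probe set would be flagged as a candidate skipped exon. In practice, a statistical test across replicates and correction for multiple testing would be needed before calling an event.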
7. Alternative splicing in the most common cancer types
The most common type of cancer is breast cancer, with more than 255,000 new cases expected in the United States in 2017, followed by lung and prostate cancer, according to the National Cancer Institute. The question is: which transcripts are alternatively spliced between normal and cancerous tissues? The major difficulty has been to determine whether the splicing changes detected in cancer are pathogenic. Here we present different analyses using high-density microarrays to identify AS in three models of cancer. We performed data mining of Affymetrix microarrays. The data were downloaded from ArrayExpress (https://www.ebi.ac.uk/arrayexpress/) or Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/). The microarray analysis was performed using Partek Genomics Suite v7.17 according to previous reports.
8. Breast cancer
We analyzed the data set E-GEOD-81838, available from the ArrayExpress web page, or GEO-GSE81838, available from GEO; the data were published by Lehmann et al. The data set comprised 10 breast tumors and 10 stromal cell samples. Our analysis showed 605 differentially expressed and alternatively spliced transcripts; the top 10 overexpressed and suppressed transcripts are shown in Table 1. We present the most significantly over- and under-expressed transcripts. The DTL transcript has 15 exons, is overexpressed in tumor cells and showed a potential alternative cap site (Figure 2). The FGF7 transcript has four exons; the heat map showed two apparent alternative sites, at the cap and at the polyadenylation site (Figure 3).
9. Lung cancer
The analysis in lung cancer was performed using the data set E-GEOD-30979, available from the ArrayExpress web page, or GEO-GSE30979; the data were published by Leithner et al. The model was hypoxia-based lung cancer. Our analysis revealed 101 differentially expressed transcripts that could also be alternatively spliced; Table 2 shows the top 10 overexpressed and suppressed transcripts identified in this analysis. One of the most significant AS transcripts was LOX, which has six exons. Our results showed a potential alternative cap site (Figure 4). CEACAM6 was suppressed under hypoxic conditions and also apparently showed an alternative cap site (Figure 5).
10. Prostate cancer
Prostate cancer is one of the most common malignancies worldwide. For this chapter we performed data mining using the data set E-GEOD-66852, available from the ArrayExpress web page, or GEO-GSE66852; the data were published by Nouri et al. Our results showed 777 transcripts with significant differential exon expression; the most significantly over- and under-expressed transcripts are shown in Table 3. The CCDC80 transcript was overexpressed and also showed an alternative splice site in exon six (Figure 6). The downregulated transcript was DLGAP5, which showed two potential splice sites, in exons four and eight (Figure 7).
Alternative splicing is an important post-transcriptional mechanism that promotes protein diversity. In cancer, several alterations in AS have been reported. In this chapter, we presented the general features of the alternative splicing process, the implications of AS in human diseases, and the potential use of alternative transcripts expressed in cancer as molecular markers and therapeutic targets. Finally, we presented a simple method for identifying alternative transcripts expressed in three models of cancer using available Affymetrix data sets.
This work was supported by SEP-CONACyT Basic Science grant 243233, Sectorial Fund for Research in Health and Social Security grant 272633 and Fiscal Resources, INP 2017.
The authors declare that they have no competing interests.