Open access peer-reviewed chapter

Alternative RNA Splicing: New Approaches for Molecular Marker Discovery in Cancer

By Vanessa Villegas-Ruíz and Sergio Juárez-Méndez

Submitted: October 13th 2017. Reviewed: January 25th 2018. Published: June 20th 2018.

DOI: 10.5772/intechopen.74415

Abstract

In cancer, several alterations drive cell transformation, including DNA imbalances, changes in gene expression and changes in protein diversity. Transcriptional regulation is finely controlled by a large number of molecules, including SR proteins, hnRNPs, RNA, DNA and histone methylation, among others. In cancer, however, this regulation is altered. Little is known about the regulation of alternative splicing in health and in human disease. Alternative splicing plays an important role in generating the diversity of transcripts and of the resulting proteins. Aberrant transcript variants expressed in cancer have shown great potential as biomarkers or therapeutic targets. In this chapter, we present the basics of alternative splicing and a simple method that uses publicly available data to detect alternative transcripts expressed in the three most common human cancers.

Keywords

  • alternative splicing
  • breast cancer
  • prostate cancer
  • gene expression
  • molecular markers

1. Introduction

The molecular biology of cancer is not completely understood. The human transcriptome is an important source of molecular markers, because RNA is divided into coding and non-coding fractions whose functions, locations and structures are highly variable. However, little is known about the complexity of the transcriptome in cancer. In this chapter, we present the landscape of post-transcriptional modifications (RNA splicing), data mining, and the identification of alternative splicing in available microarray data. We think that the splicing and alternative splicing machinery is highly modified in cancer and that these changes could play a very important role in diagnosis and prognosis.

2. RNA splicing

Gene expression is orchestrated through extensive interactions among molecules, including SR proteins, hnRNPs, RNA, DNA and histone methylation, among others. RNA is a fundamental molecule for life. Recent studies have shown that the human transcriptome is divided into coding and non-coding RNA. Interestingly, coding RNA represents only 2% of the human transcriptome and the remainder is non-coding RNA, suggesting great versatility for generating protein diversity. The pre-RNA matures through several events, including the addition of a poly(A) tail at the 3′ end, the addition of an m7G cap at the 5′ end, and RNA splicing; these modifications confer RNA stability and efficient transport to the cytoplasm, among other properties.

RNA splicing involves several steps and relies on specific signals that delimit intronic and exonic sequences (splice sites, SS), as well as sequences that modulate exon skipping: intronic splicing enhancers (ISE), intronic splicing silencers (ISS), exonic splicing enhancers (ESE) and exonic splicing silencers (ESS) [1]. In addition, five small nuclear ribonucleoproteins (snRNPs; U1, U2, U4, U5 and U6) and more than 150 additional co-factors contribute to splicing [2, 3].

Basically, four signals are required: the branch point, the polypyrimidine tract, the 5′ splice site and the 3′ splice site [4, 5, 6]. Moreover, sequential steps produce topological changes between the RNA and the snRNPs, forming the E complex, the A complex (ATP dependent), the B complex and finally the C complex, which is the catalytic spliceosome [7]. Additionally, the RNA can undergo alternative exon skipping by means of alternative splicing (AS). AS uses the basic splicing machinery, and SR proteins and hnRNPs play an essential role in alternative exon skipping.
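The core signals named above can be illustrated with a minimal sketch. It assumes the canonical GT...AG intron boundaries (written as DNA) and approximates the polypyrimidine tract as a pyrimidine-rich window just upstream of the 3′ splice site. The function name, the toy sequence and all thresholds are hypothetical choices made for illustration, and branch point recognition is deliberately left out; this is not a method described in this chapter.

```python
def find_candidate_introns(seq, min_len=20, tract_len=10, min_pyr=0.7):
    """Return (start, end) pairs of candidate introns in a DNA string.

    A candidate starts with GT (5' splice site), ends with AG (3' splice
    site) and has a pyrimidine-rich window just upstream of the AG.
    Coordinates are 0-based and end-exclusive. All thresholds are
    illustrative assumptions, not values from the chapter.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 1):
        if seq[start:start + 2] != "GT":
            continue
        for end in range(start + min_len, len(seq) + 1):
            if seq[end - 2:end] != "AG":
                continue
            # Window between the GT and the AG, at most tract_len bases long.
            tract = seq[max(start + 2, end - 2 - tract_len):end - 2]
            if not tract:
                continue
            pyr_fraction = sum(base in "CT" for base in tract) / len(tract)
            if pyr_fraction >= min_pyr:
                hits.append((start, end))
    return hits


# Toy pre-mRNA (as DNA): exon, intron with a pyrimidine tract, exon.
TOY = "ATGGCC" + "GTAAGAACTAAC" + "TTCTTTCTTC" + "AG" + "GCCTAA"
print(find_candidate_introns(TOY))  # [(6, 30)]
```

Real splice site prediction relies on position weight matrices or trained models; the sketch only shows how the four signals constrain where an intron can lie.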

Coding RNA is represented by ~25,000 genes; however, more than 300,000 transcripts have been reported [8, 9]. The difference between the number of genes and the number of transcripts is probably due to regulation by alternative splicing (AS). We now know that more than 80% of coding RNAs are subject to AS, which promotes a great diversity of mRNAs and, consequently, of proteins. For example, in Drosophila melanogaster, the Down Syndrome Cell Adhesion Molecule (Dscam) gene can generate more than 38,000 different mRNAs by means of alternative splicing [10]. These findings show the importance of AS for the biology of the cell.
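The Dscam figure follows from simple combinatorics: one exon is selected from each cluster of mutually exclusive exons, so the numbers of choices multiply. The sketch below assumes the cluster sizes commonly reported in the Dscam literature (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17); these values are not stated in this chapter and are included only to make the arithmetic concrete.

```python
from math import prod

# Alternative exons per mutually exclusive cluster in Drosophila Dscam.
# Literature values assumed for illustration; not given in this chapter.
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(clusters):
    """One exon is chosen per cluster, so the choices multiply."""
    return prod(clusters.values())


print(count_isoforms(DSCAM_CLUSTERS))  # 38016
```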

On the other hand, long non-coding RNAs (lncRNAs) can also be subject to alternative splicing. However, the diversity of lncRNA transcripts has been poorly studied. It is thought that AS in lncRNAs could be involved in several regulatory processes, mainly mediated by RNA-RNA, RNA-DNA and RNA-protein interactions. All of these possible interactions probably increase the complexity of the regulatory process.

We will focus on coding RNA, which represents only ~2% of total RNA. It is known that AS events are more frequent in eukaryotic cells than in other organisms [11, 12]. AS patterns include alternative promoters, exon skipping, intron retention, mutually exclusive exons, exon scrambling, alternative 5′ splice sites, alternative 3′ splice sites and alternative polyadenylation [4]. Interestingly, the protein products of AS can differ in function from the native protein. In several human diseases, AS contributes to diverse cellular processes, including cell proliferation, migration, adhesion and metastasis, among others [9, 13, 14, 15, 16]. Transcripts subject to AS and their products could be used as molecular markers and therapeutic targets, because they are expressed only in the disease or their expression is increased in it [11, 17, 18, 19].

3. Alternative splicing and diseases

AS is regulated by a large number of proteins, non-coding RNAs and DNA elements, and the complex network of interactions among them gives the cell a fine capacity for regulation. In addition, post-transcriptional regulation is orchestrated so finely that cells can respond rapidly to a stimulus and adjust their proteome. The cell is also exposed daily to several toxic agents and to UV radiation, which make it vulnerable to mutations and misregulation. In particular, mutations play an important role in the aberrant AS that causes disease, especially neuromuscular and neurodegenerative diseases and multifactorial diseases such as cancer [20].

Three sequences are extremely important for RNA processing and maturation: the 5′ and 3′ splice sites at the ends of the intron, and the branch point sequence, which is usually located ~40 nucleotides upstream of the 3′ splice site. They contain the specific sequences recognized by the spliceosome for precise exon joining [21]. Mutations in those sites disrupt correct spliceosome assembly. Approximately 10% of genetic diseases are caused by point mutations that disrupt the interaction between the RNA and the spliceosome [22, 23].

The class of mutation and its location in the genome contribute to different AS variants, such as exon skipping, exon retention and alternative 5′ and 3′ splice sites, among others. The severity of the disease can be reflected in the expression level of the mutated gene. For example, in spinal muscular atrophy (SMA) the SMN2 gene has a C → T change in exon 7; this change promotes exon skipping (SMN∆7), and the expression of this transcript is proportionally higher (~80%) than in healthy individuals (~20%). Another case is Duchenne muscular dystrophy (DMD), in which a T → A substitution in exon 31 of the dystrophin gene promotes skipping of this exon. In cystic fibrosis, the exclusion of exon 9 in CFTR modifies the severity of the disease. In Peutz-Jeghers syndrome, an alternative transcript of LKB1 is expressed as a consequence of the change IVS2 + 1A > G [24].

4. Alternative splicing and cancer

In cancer, several alterations are involved in cell transformation. Recent studies have shown that AS plays an important role in cancer development because it changes the transcriptome and consequently the proteome, contributing to cell transformation [19, 25]. However, few studies have focused on identifying transcript variants in cancer. Computational studies based on expressed sequence tags have shown that AS is slightly lower in tumors than in normal tissues [26]. The question is: what is the difference between AS in cancer tissues and in normal tissues? Aberrant transcripts expressed in cancer have shown great potential as biomarkers or therapeutic targets. In breast cancer, the CD44 gene can be transcribed into seven alternative transcripts; transcript variants five and seven have been implicated in diverse pathologies, but transcript six is expressed only in metastatic cancer and tumorigenic cell lines. These findings suggest that alternative transcript six of CD44 could play a role in the metastatic process [27, 28]. BRCA1 has been implicated in diverse types of cancer; in breast malignancies, the mutation c.591C > T is implicated in skipping of exon 18 of the BRCA1 transcript, and this mutation constitutes an important prognostic factor in familial breast and ovarian cancer [29]. In gastric cancer, the KIT gene carries a deletion of ~40 nucleotides; this deletion promotes aberrant AS and loss of the functional protein [30].

In the healthy cell, several proteins are key for DNA repair and transcript regulation, among other functions. The BCL protein is very important in programmed cell death. However, the cancer cell up-regulates the expression and AS of BCL-xL, promoting expression of the long protein, which is involved in anti-apoptotic processes. In contrast, the short protein BCL-xS is involved in apoptosis [31]. In ovarian cancer, a new alternative transcript of p53 (TP53INP2) was found; its expression is strongly associated with migration and cell invasion [32] and with adverse prognosis [33]. Leukemia is the most frequent malignancy in childhood, and it is no exception: AS has been identified in several transcripts, including CCAR1, which is linked to the Par-4/THAP1 complex and Notch3 [34] and confers an unfavorable prognosis, as does hMLH1Delta6 [35]. Ikaros is a tumor suppressor gene; its variant IK11 is associated with proliferation and anti-apoptotic processes [36].

5. Alternatively spliced transcripts/proteins as molecular markers and therapeutic targets

The great challenge in cancer is the identification of molecular markers and therapeutic targets. The protein and transcript products of AS are remarkable molecules because they open new opportunities in cancer. Aberrant AS is a consequence of malignant transformation: mutations and modulation of gene expression promote the expression of new molecules that confer advantages on the cancer cell, such as cell proliferation, migration, invasion and evasion of programmed cell death, among others. In this context, molecules expressed only in pathological tissue could be the best molecular markers as well as treatment targets. There is little information about AS profiles in cancer; nevertheless, some molecules have been used as molecular markers. CD44 isoforms can be predictive of the response to anti-CD44 treatment in many types of cancer [37]. The androgen receptor variant AR-V7 has been used as a predictive marker [38]; patients who express the V7 isoform are resistant to therapy with enzalutamide and abiraterone [39]. Isoforms of SLC39A14 are used for the detection of non-invasive colorectal cancer, and the isoform is specific to the colon and rectum [40, 41]. A new transcript variant of VNN1 is also specific to colon cancer and is used for detection because of its specificity [42].

The prospect for cancer treatment is based on antibodies specific for isoforms expressed exclusively in the disease. However, other strategies that target RNA could also be used, such as stable antisense RNAs; this approach can be applied to different types of RNA (coding and non-coding), including pre-RNA. RNA interference is another strategy used to eliminate aberrantly expressed transcripts or even splicing variants [43].

6. Methods for the detection of alternative splicing

Spliced mRNA is easily visualized using several molecular biology tools; the most widely used is RT-PCR. AS detection by PCR depends on primer design. Usually the primers flank the skipped exon; however, this method cannot detect novel splice transcripts. In situ hybridization is another method used for AS detection; like PCR, it cannot detect novel splice sites. Over the last 20 years, high-throughput methods for detecting gene expression have been developed. These tools have rapidly provided disease-associated gene expression profiles. Nowadays, gene expression microarrays and next generation sequencing are used to detect novel molecules expressed in diverse diseases, including alternatively spliced mRNAs. Microarray gene expression (MGE) platforms can now measure exon-level expression; in this context, low expression or suppression of a particular probe set could indicate AS (Figure 1). To date, there are 25,252 assays performed with Affymetrix Human Exon 1.0 ST, 39,836 assays with Affymetrix GeneChip Human Gene 1.0 ST, and 2422 assays with the most recent version, Affymetrix GeneChip Human Gene 2.0 ST; the experiments were performed between 8/7/07 and 8/8/16, 8/12/08 and 12/1/17, and 8/1/13 and 12/19/17, respectively. Moreover, these microarray data are available for data mining and provide extraordinary information about expression profiles in human diseases, including cancer. Microarray analysis can therefore be directed toward exploring AS.

Figure 1.

Representative probe set signal in a microarray and detection of alternative splicing. The top of the figure shows the probe sets, the markers that interrogate exon-level expression. The middle part depicts signal intensity in microarray hybridization. The bottom shows alternative splicing indicated by low signal intensity.
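The idea that a low probe-set signal relative to the rest of the gene may indicate AS can be sketched as a simple splicing index. The example below is an illustration under stated assumptions, not the alt-splice analysis performed in Partek Genomics Suite for this chapter: signals are assumed to be log2 intensities, the gene-level signal of each sample is approximated by the median over its exons, and the function names, toy values and threshold are hypothetical.

```python
from statistics import mean, median


def splicing_index(case, control):
    """Per-exon splicing index from log2 probe-set intensities.

    case / control: lists of samples; each sample is a list of log2
    probe-set signals, one per exon, in the same exon order.
    Each exon is normalized to its sample's gene-level signal (median
    over exons, an assumption), then group means are subtracted.
    """
    def normalized_means(samples):
        rows = [[x - median(sample) for x in sample] for sample in samples]
        return [mean(column) for column in zip(*rows)]

    case_means = normalized_means(case)
    control_means = normalized_means(control)
    return [c - k for c, k in zip(case_means, control_means)]


def flag_exons(index_values, threshold=1.0):
    """Indices of exons whose |splicing index| reaches the threshold."""
    return [i for i, v in enumerate(index_values) if abs(v) >= threshold]


# Toy gene with four exons: over-expressed in tumor, first exon suppressed.
tumor = [[6.0, 10.0, 10.0, 10.0], [6.0, 10.0, 10.0, 10.0]]
stroma = [[8.0, 8.0, 8.0, 8.0], [8.0, 8.0, 8.0, 8.0]]
si = splicing_index(tumor, stroma)
print(si)              # [-4.0, 0.0, 0.0, 0.0]
print(flag_exons(si))  # [0]
```

Normalizing to the gene-level signal separates a change in exon usage from a change in overall gene expression, which is why the first exon is flagged here even though the gene as a whole is over-expressed.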

7. Alternative splicing in the most common cancer types

The most common type of cancer is breast cancer, with more than 255,000 new cases expected in the United States in 2017, followed by lung and prostate cancer, according to the National Cancer Institute. The question is: which transcripts are alternatively spliced between normal and cancerous tissues? The major difficulty has been to determine whether the splicing changes detected in cancer are pathogenic [26]. Below, we present analyses that use high-density microarrays to identify AS in three cancer models. We performed data mining of Affymetrix microarrays. The data were downloaded from ArrayExpress (https://www.ebi.ac.uk/arrayexpress/) [44] or Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) [45]. The microarray analysis was performed using Partek Genomics Suite v7.17 according to previous reports [46].
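The tables in the following sections report signed fold-change values, with negative numbers for suppressed transcripts. A minimal sketch of how such values can be derived and filtered is given below. It assumes the common convention in which a ratio below 1 is reported as its negative reciprocal; the cutoffs, function names and the third example row are hypothetical, since the chapter does not state the thresholds used in Partek Genomics Suite.

```python
def signed_fold_change(mean_log2_case, mean_log2_control):
    """Convert a log2 difference into a signed linear fold change.

    Ratios below 1 are reported as the negative reciprocal, so a
    halving of expression becomes -2 rather than 0.5 (assumed convention).
    """
    ratio = 2.0 ** (mean_log2_case - mean_log2_control)
    return ratio if ratio >= 1.0 else -1.0 / ratio


def top_hits(rows, max_p=0.05, min_abs_fc=2.0):
    """Filter (gene, p_value, fold_change) rows and sort them.

    The cutoffs are illustrative defaults, not values from the chapter.
    Rows are returned from most over-expressed to most suppressed.
    """
    kept = [r for r in rows if r[1] <= max_p and abs(r[2]) >= min_abs_fc]
    return sorted(kept, key=lambda r: r[2], reverse=True)


rows = [
    ("FGF7", 3.49e-19, -4.07332),   # values as listed in Table 1
    ("DTL", 1.45e-46, 4.97616),     # values as listed in Table 1
    ("GENE_X", 0.2, 5.0),           # hypothetical non-significant row
]
print(signed_fold_change(10.0, 8.0))   # 4.0
print(signed_fold_change(8.0, 10.0))   # -4.0
print([gene for gene, _, _ in top_hits(rows)])  # ['DTL', 'FGF7']
```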

8. Breast cancer

We analyzed data set E-GEOD-81838, available from ArrayExpress, or GEO-GSE81838, available from GEO; the data were published by Lehmann et al. [47]. The data set comprises 10 breast tumors and 10 stromal cell samples. Our analysis showed 605 differentially expressed and alternatively spliced transcripts; the top 10 over-expressed and the top 10 suppressed transcripts are shown in Table 1. We describe the most significant over- and under-expressed transcripts. The DTL transcript has 15 exons, is over-expressed in tumor cells and shows a potential alternative cap site (Figure 2). The FGF7 transcript has four exons, and the heat map shows two apparent alternative sites, cap and polyadenylation (Figure 3).

Gene symbol   RefSeq              p-value      Fold-Change
DTL           NM_001286229        1.45E-46     4.97616
ESRP1         NM_001034915        3.48E-77     3.86032
HOOK1         NM_015888           1.36E-71     3.69894
TTK           NM_001166691        3.31E-42     3.6733
GRHL1         NM_198182           4.13E-44     3.5722
ASPM          NM_001206846        7.69E-67     3.53763
DLGAP5        NM_001146015        9.97E-41     3.51451
OCLN          NM_001205254        1.02E-19     3.4842
ELF5          NM_001243080        1.35E-12     3.41892
TDRD5         NM_001199085        1.07E-42     3.33299
INHBA         NM_002192           6.07E-12     −2.99379
TSHZ2         NM_001193421        1.53E-05     −3.01074
FGF10         NM_001142556        1.27E-13     −3.01537
PDZRN4        NM_001164595        4.88E-21     −3.0344
NEXN          NM_001172309        6.90E-53     −3.03644
CXCL12        NM_000609           7.58E-23     −3.06879
IGF1          NM_000618           7.66E-16     −3.46073
COL8A1        NM_001850           6.71E-20     −3.50649
FGF7          NM_002009           3.49E-19     −4.07332
FGF7P2        OTTHUMT00000157659  6.90E-12     −4.13171

Table 1.

Main genes with potential alternative splicing in breast cancer.

Figure 2.

Differential exon expression of the DTL gene. The top of the figure shows the three reported transcript variants. The middle part shows the expression level; the red line indicates tumor samples and the blue line indicates stroma samples. The heat map shows exon-level expression; on the far left, the exon is suppressed, suggesting alternative splicing.

Figure 3.

Differential exon expression of the FGF7 gene. The top of the figure shows the one reported transcript variant; the middle shows the expression level, where the blue line indicates stroma samples and the red line indicates tumor samples. The heat map shows exon-level expression; on the far left and far right, the exons are suppressed, suggesting two new potential alternative transcripts.

9. Lung cancer

The lung cancer analysis was performed using data set E-GEOD-30979, available from ArrayExpress, or GEO-GSE30979; the data were published by Leithner et al. [48]. The model was hypoxia-based lung cancer. Our analysis revealed 101 differentially expressed transcripts that could also be alternatively spliced; Table 2 shows the top 10 over-expressed and the top 10 suppressed transcripts identified in this analysis. One of the most significant AS transcripts was LOX, which has six exons. Our results showed a potential alternative cap site (Figure 4). CEACAM6 was suppressed in the hypoxic condition and also apparently showed an alternative cap site (Figure 5).

Gene symbol   RefSeq              p-value      Fold-Change
MROH9         NM_001163629        8.67E-26     2.95561
LOX           NM_001178102        1.53E-21     2.66476
CLGN          NM_001130675        4.13E-18     2.49403
MME           NM_000902           1.73E-34     2.47655
DDIT3         NM_001195053        9.74E-10     2.46406
NUCB2         NM_005013           5.49E-34     2.41954
FICD          NM_007076           8.59E-07     2.36754
DNAJB9        NM_012328           6.35E-15     2.35193
GBE1          NM_000158           2.04E-60     2.24392
ADM           NM_001124           2.06E-08     2.20975
BPIFA1        NM_001243193        2.45E-05     −2.91878
IGKC          AF113887            9.63E-15     −2.93589
TOP2A         NM_001067           3.08E-75     −3.05223
SFTPB         NM_000542           2.40E-09     −3.0864
HP            NM_001126102        0.00045976   −3.12945
PI15          NM_015886           8.33E-14     −3.13972
IGKV3OR2-268  OTTHUMT00000330418  0.000711524  −3.52251
IGKV2D-30     OTTHUMT00000323285  0.00240377   −3.55542
CEACAM6       NM_002483           6.87E-11     −3.68686
CEACAM5       NM_001291484        2.69E-11     −3.8858

Table 2.

Main genes with potential alternative splicing in lung cancer.

Figure 4.

Differential exon expression of the LOX gene. The top of the figure shows the two reported alternative transcripts; in the middle part, the blue line indicates the hypoxic model and the red line indicates the normoxic model. The heat map shows exon-level expression; on the far right, two probe sets are suppressed, and both markers interrogate one exon. Our results could indicate that the expressed transcript variant is LOX NM_001178102.

Figure 5.

Differential exon expression of the CEACAM6 gene. The top of the figure shows one transcript; in the middle part, the blue line indicates the hypoxic model and the red line indicates the normoxic model. The heat map shows exon-level expression; on the far left, one marker is suppressed, indicating a potentially fractioned exon and, consequently, an alternative cap site.

10. Prostate cancer

Prostate cancer is one of the most common malignancies worldwide. For this chapter, we performed data mining using data set E-GEOD-66852, available from ArrayExpress, or GEO-GSE66852; the data were published by Nouri et al. [49]. Our results showed 777 transcripts with significant differential exon expression; the most significant over- and under-expressed transcripts are shown in Table 3. The over-expressed CCDC80 transcript also showed an alternative splice site in exon six (Figure 6). The down-regulated transcript DLGAP5 showed two potential splice sites, in exons four and eight (Figure 7).

Gene symbol   RefSeq              p-value      Fold-Change
CCDC80        NM_199511           2.98E-52     12.3906
PLA2G2A       NM_000300           1.48E-25     9.88943
PCDH11X       NM_001168360        6.22E-45     8.56495
RIMS1         NM_001168407        2.78E-104    7.77511
SI            NM_001041           1.17E-118    7.63493
IGFBP3        NM_000598           5.86E-35     7.56274
NLGN1         NM_014932           2.60E-22     7.29711
PCDH11X       NM_001168360        3.23E-36     6.91453
LRRN1         NM_020873           4.82E-16     6.45031
EPB41L4A      NM_022140           8.22E-55     6.22688
SHCBP1        NM_024745           1.26E-42     −14.8909
KIF20A        NM_005733           4.31E-69     −15.7441
HMMR          NM_001142556        5.84E-65     −16.2617
FAM111B       NM_001142703        3.52E-18     −17.0768
MELK          NM_001256685        3.51E-64     −17.2272
HIST1H3I      NM_003533           1.58E-10     −19.3335
TOP2A         NM_001067           5.06E-126    −23.1938
PBK           NM_001278945        4.11E-38     −24.005
NCAPG         NM_022346           2.24E-64     −24.1474
DLGAP5        NM_001146015        4.89E-73     −32.3098

Table 3.

Main genes with potential alternative splicing in prostate cancer.

Figure 6.

Differential exon expression of the CCDC80 gene. The top of the figure shows two alternative transcripts; in the middle part, the blue line indicates the parental cell model and the red line indicates the transdifferentiated cell model. The heat map shows exon-level expression; in the middle of the transcript, one marker is suppressed (indicated in blue) in the transdifferentiated model.

Figure 7.

Differential exon expression of the DLGAP5 gene. The top of the figure shows two alternative transcripts. In the middle, the blue line indicates the parental cell model and the red line indicates transdifferentiated cells. The heat map shows exon-level expression; on the right side, two markers are suppressed in the parental model. Our results suggest that two additional, previously unreported transcript variants are expressed in this model.

11. Conclusions

Alternative splicing is an important transcriptional mechanism that promotes protein diversity. In cancer, several alterations in AS have been reported. In this chapter, we presented the general features of the alternative splicing process, the implications of AS in human diseases, and the potential use of alternative transcripts expressed in cancer as molecular markers and therapeutic targets. Finally, we presented a simple method for identifying alternative transcripts expressed in three cancer models using available Affymetrix data sets.

Acknowledgments

This work was supported by SEP-CONACyT Basic Science grant 243233, Sectorial Fund for Research in Health and Social Security grant 272633 and Fiscal Resources, INP 2017.

Competing interests

The authors declare that they have no competing interests.

How to cite and reference

Vanessa Villegas-Ruíz and Sergio Juárez-Méndez (June 20th 2018). Alternative RNA Splicing: New Approaches for Molecular Marker Discovery in Cancer, Bioinformatics in the Era of Post Genomics and Big Data, Ibrokhim Y. Abdurakhmonov, IntechOpen, DOI: 10.5772/intechopen.74415. Available from:
