lncRNAs altered in cancer and associated epigenetic marks.
In recent years, non-coding RNAs (ncRNAs) have been classified into different categories, and their importance in regulating different cellular processes has been unravelled. Long non-coding RNAs (lncRNAs) can interact with DNA, other RNAs and proteins, including epigenetic modifiers. Some lncRNAs are related to genomic imprinting and are associated with chromatin-modifying complexes that can regulate gene transcription. It is well established that cancer cells carry different epigenetic alterations, and some of these modifications are associated with lncRNAs. Studies of cancer-associated lncRNAs have defined their functions in tumorigenesis and their impact on cell proliferation, cellular signalling, angiogenesis and metastasis. A better knowledge of their roles might therefore contribute to a better understanding of the disease. In this chapter, we discuss lncRNA classification and functions, epigenetic marks and how they can guide transcription. We also discuss how these mechanisms interact to guide gene expression, as well as recent findings on the dysregulation of lncRNAs in cancer.
- DNA methylation
- histone modifications
1. An overview
The patterns of gene expression of a cell are altered throughout its lifetime, and these changes occur as a response to different stimuli. For example, during the differentiation stage of an embryonic cell, a group of active genes dictates the cell fate, while after differentiation those genes are silenced since they are no longer needed for that task. Shifts in gene expression may occur through different mechanisms, but the most important alterations occur at the level of the epigenome. The epigenome is dynamic, being constantly altered by chemical modifications and structural changes such as DNA methylation, histone modifications, nucleosome positioning and chromatin remodelling. Those changes make DNA sequences more or less accessible to the transcriptional machinery, altering gene expression in the cell. The regulation of these mechanisms is complex and involves enzymes, other proteins and RNA molecules. In recent years, it has been shown that long non-coding RNAs are also responsible for regulating gene expression, and they can do so at three different levels: pre-transcriptional, transcriptional and post-transcriptional. Besides these regulatory functions, they can also alter gene expression by altering the epigenome. Epigenomic alterations are related to the onset of many diseases and have been reported to be crucial to cancer development. In cancer cells, tumour suppressor genes are silenced and oncogenes are overexpressed, and these alterations can be driven by epigenetic modifications regulated by lncRNAs. In this chapter, we discuss lncRNAs and epigenetic marks, how they interact with each other to regulate gene expression and their role in cancer.
2. Long non-coding RNA
The discovery of ribonucleic acid (RNA) molecules that do not code for proteins has drastically altered our understanding of molecular biology. Until recent years, the central dogma of biology described DNA as the source of information from which an encoded gene was transcribed into an RNA strand, which would then be translated into a protein. However, in the human genome, approximately 93% of the DNA can be transcribed into RNA, but only around 2% of that is protein-coding messenger RNA (mRNA). The remaining transcripts were therefore initially classified as transcriptional noise. With the rapid advance in molecular biology techniques, including large-scale sequencing, it is now known that many thousands of non-coding transcripts are encoded by the genome. These transcripts represent more than 70% of the genome, and they are transcribed into non-coding RNA (ncRNA) molecules. This knowledge opens up a completely new universe, and more than 40 types of non-coding RNAs have already been described. Among the best-known ncRNAs are the transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), microRNAs (miRNAs) and, more recently, the long non-coding RNAs (lncRNAs), which are the focus of this chapter.
2.1 Characteristics of lncRNAs
The lncRNAs, as the name suggests, are long RNA transcripts of more than 200 nucleotides that are not translated into protein. The first long non-coding RNA was described in 1971, in a viroid plant pathogen; however, a regulatory role for a long non-coding RNA was first described only in the early 1990s, when the scientific community discovered transcripts involved in epigenetic mechanisms. One of the first identified lncRNAs was H19 (imprinted maternally expressed transcript), first described in the mouse. Shortly after, X-inactive-specific transcript (XIST) was suggested to be a functional lncRNA, with a structural role in the cell nucleus. lncRNAs show relatively low levels of evolutionary conservation and originate from genes that are usually shorter than protein-coding genes, with fewer exons. However, they share features with protein-coding transcripts, as they are typically transcribed by RNA polymerase II and can be capped, polyadenylated and spliced.
lncRNAs can be transcribed from both mitochondrial and nuclear genomes, in sense and antisense directions. Also, strong evidence suggests that lncRNAs can be cleaved post-transcriptionally and thus act as precursors of smaller RNA molecules such as miRNAs, piRNAs, siRNAs and others.
One of the main characteristics of lncRNAs is their ability to fold into thermodynamically stable secondary or higher-order structures, which are highly conserved. The longer the lncRNA, the higher the probability that it forms such structures. Because lncRNAs can form intramolecular base pairs, they are able to fold into structures such as double helices, hairpins, loops, pseudoknots and more. Due to these complex structures, they are able to bind to more than one molecule at a time, regulating gene expression at different levels through RNA-protein, RNA-DNA and RNA-RNA complexes.
lncRNAs can be found in different cell compartments, and their function is directly related to their location. A substantial proportion of lncRNAs are found exclusively in the nucleus. Nuclear lncRNAs often play a role in modulating gene expression by recruiting transcription factors, by remodelling or modifying the chromatin or by RNA-DNA triplex formation. Other lncRNAs must be transported to the cytoplasm, where they may interfere in post-translational modification, participating in protein localization processes, mRNA translation and stability. In addition, lncRNAs may be transported to distant regions through extracellular vesicles, such as exosomes and microvesicles; however, the mechanisms that regulate the expression of these circulating lncRNAs are still not well understood [7, 8].
2.2 The lncRNA classification
Because lncRNAs are a very diverse class of molecules, there is still debate about the best way to classify them into categories, as a classification can convey information regarding their localization, regulatory function, biological function and so on. The simplest method of lncRNA classification is based on their size [9, 10]: small lncRNA (200–950 nt), medium lncRNA (950–4800 nt) and large lncRNA (>4800 nt). According to this classification, most human lncRNAs (58%) fall into the small-lncRNA group.
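The size-based scheme can be written as a simple lookup. The following Python sketch is illustrative only: the function name classify_lncrna_by_size is hypothetical, and the handling of the boundary values (950 nt counted as small, 4800 nt as medium) is an assumption, since the quoted ranges share their endpoints.

```python
def classify_lncrna_by_size(length_nt: int) -> str:
    """Return the size class of a lncRNA given its length in nucleotides.

    Thresholds follow the ranges quoted in the text. Boundary handling
    (950 -> small, 4800 -> medium) is an assumption of this sketch.
    """
    if length_nt < 200:
        # Transcripts below the 200 nt cut-off are not treated as lncRNAs here.
        raise ValueError("transcripts shorter than 200 nt are not classed as lncRNAs")
    if length_nt <= 950:
        return "small"
    if length_nt <= 4800:
        return "medium"
    return "large"
```

For instance, a hypothetical 2300 nt transcript would fall into the medium class under these assumptions.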
Another classification, from the GENCODE catalogue of human lncRNAs published in 2012, defines five biotypes of lncRNAs (Figure 1):
Antisense: located on the opposite strand from protein-coding genes, containing an intersection with some exons or introns or published evidence of antisense gene regulation
Long intergenic non-coding RNA (lincRNA): transcripts originated from intergenic loci; that is, located between two protein-coding genes
Sense overlapping: transcripts that contain protein-coding genes within their introns, are located on the same strand as those genes and do not overlap with any of their exons
Sense intronic: located within introns of a protein-coding gene and with no intersection with exons
Processed transcripts: locus where all transcripts have no open reading frame (ORF) and do not fit in any of the above biotypes, due to their complex structure
It is important to note that even though this classification is widely used, additional biotypes of lncRNAs are also described in GENCODE, such as macro lncRNAs, pseudogenes, 3 prime overlapping ncRNA and bidirectional promoter lncRNA, among others. Alternatively, lncRNAs can be categorized according to the molecular mechanisms that may be involved in their functions into five archetypes:
Signal archetype: acts as a molecular signal or indicator of transcriptional activity
Decoy archetype: binds and captures other molecules, such as proteins and other regulatory RNAs, inhibiting their function
Guide archetype: binds and recruits ribonucleoprotein complexes to specific targets
Scaffold archetype: plays a structural role as a platform upon which other molecules can bind simultaneously, assembling a complex
Enhancer archetype: controls higher-order chromosomal looping
lncRNAs can also be classified according to the region of the genome on which they act. They can influence a neighbouring gene on the same allele from which they are transcribed (cis) or genes in more distant genomic regions and on other chromosomes (trans):
Cis-lncRNAs: lncRNAs regulating the expression of genes in close genomic proximity. They may be transcribed from promoter regions and may interfere in the transcription activity of neighbouring genes. They may act by recruiting transcription factors, inducing chromatin remodelling or forming DNA-RNA triplex structure.
One of the best-known examples of a cis-acting lncRNA is XIST. In mammals, females have two copies of the X chromosome (XX), while males have only one (XY). This imbalance could result in a variety of problems associated with the expression of genes from the X chromosome. However, the lncRNA X-inactive-specific transcript (XIST) is expressed from the X-inactivation centre (XIC) locus and acts in cis along the whole chromosome from which it is transcribed, resulting in the silencing of that chromosome (Figure 2A).
Trans-lncRNAs: lncRNAs may also function in trans by influencing distant gene loci. In such cases, they may also act as chromatin modification complexes, as well as affect transcription by binding to transcription elongation factors or to RNA polymerases.
Another well-studied lncRNA, HOX transcript antisense RNA (HOTAIR), also recruits the polycomb repressive complex 2 (PRC2) to inactivate gene expression. However, in this case, HOTAIR is transcribed from the HoxC locus on chromosome 12 and represses the HoxD locus on chromosome 2, therefore acting in trans (Figure 2B).
2.3 Gene expression regulation mediated by lncRNAs
Long non-coding RNAs are functionally very diverse and are involved in numerous biological roles, such as imprinting, epigenetic regulation, apoptosis and cell cycle control, transcriptional and translational regulation, splicing, cell development and differentiation and ageing. They have been described at almost every stage of gene regulation: pre-transcriptionally, guiding proteins to specific areas of the genome, acting as decoys that keep proteins away from chromatin or promoting epigenetic alterations through histone modifications or DNA methylation; transcriptionally, modulating the transcriptional process; and post-transcriptionally, through RNA-RNA interactions.
2.3.1 Pre-transcriptional regulation
It is well understood that, in eukaryotic cells, the DNA is packaged into chromatin and that the accessibility of this structure to the transcriptional machinery has a strong influence on gene expression, as transcription factors must have access to the chromatin in order for the encoded gene to be transcribed. lncRNAs can regulate this expression in the nucleus by associating with and recruiting chromatin-remodelling factors. The examples of XIST and HOTAIR mentioned above illustrate this pre-transcriptional regulation, as both XIST and HOTAIR recruit PRC2 to repress the expression of genes through trimethylation of lysine 27 of histone H3.
2.3.2 Transcriptional regulation
The lncRNAs located in the cell nucleus can participate in transcriptional regulation and are divided into two categories, according to their function: promoter-associated lncRNAs (plncRNAs) and enhancer-like lncRNAs (elncRNAs).
The plncRNAs may act as inhibitors or promoters of gene expression. For example, the dihydrofolate reductase (DHFR) gene contains two promoters, with the downstream major promoter being responsible for 99% of the RNA transcription. However, transcription from the upstream minor promoter generates a lncRNA transcript that interacts both with the major promoter and with transcription factor IIB (TFIIB), forming a DNA-RNA triplex structure that prevents TFIIB from binding to the major promoter and thereby inhibits DHFR gene expression.
Another example is the lncRNA Evf-2, which acts in cis as a coactivator of the distal-less homeobox 2 (DLX2) protein, creating a stable complex that activates the transcription of the adjacent locus distal-less homeobox 5/6 (DLX5/6).
2.3.3 Post-transcription regulation
lncRNAs may act in post-transcriptional regulation in different ways. In the nucleus or in the cytoplasm, they can alter mRNA stability, splicing or even distribution between cellular compartments. The zinc finger E-box binding homeobox 2 (ZEB2) gene is transcribed from the sense strand of chromosome 2. Transcribed from the opposite strand, the lncRNA ZEB2 natural antisense transcript (ZEB2NAT) can mask the splice site of an intron in the 5′ UTR of the ZEB2 mRNA by complementary binding. This interaction prevents spliceosome attachment, allowing the expression of the ZEB2 protein.
In the cytoplasm, lncRNAs can also act as miRNA ‘sponges’ when the mRNA and the lncRNA have similar miRNA binding sites. When the lncRNA binds to the miRNA, the miRNA is no longer available to bind the mRNA, which increases the concentration of the mRNA in the cytoplasm. The phosphatase and tensin homolog (PTEN) gene and the lncRNA phosphatase and tensin homolog pseudogene 1 (PTENP1) illustrate this mode of action. The two transcripts share similar nucleotide sequences, and the 3′ UTR of PTENP1 contains matching binding sequences for the miRNA families miR-17, miR-21, miR-214, miR-19 and miR-26. Therefore, the lncRNA acts as bait and sequesters the miRNAs that would otherwise bind to PTEN mRNA.
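The sponge mechanism rests on the mRNA and the lncRNA carrying binding sites for the same miRNAs. The following Python sketch illustrates one simple way to look for such shared sites, under stated assumptions: it only considers canonical seed pairing (the reverse complement of miRNA nucleotides 2–8), which is a common simplification and not a method described in this chapter. The function names are hypothetical, and any sequences supplied to it are the user's own inputs.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the target-site motif complementary to miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # The target pairs antiparallel to the miRNA, hence the reversal.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(transcript: str, mirna: str) -> int:
    """Count (possibly overlapping) seed-match sites in a transcript."""
    site = seed_site(mirna)
    rna = transcript.upper().replace("T", "U")
    return sum(
        1
        for i in range(len(rna) - len(site) + 1)
        if rna[i:i + len(site)] == site
    )


def shared_mirnas(mrna: str, lncrna: str, mirnas: dict) -> list:
    """List the miRNAs with at least one seed site in both transcripts."""
    return [
        name
        for name, seq in mirnas.items()
        if count_sites(mrna, seq) > 0 and count_sites(lncrna, seq) > 0
    ]
```

A miRNA returned by shared_mirnas is only a candidate for competition between the two transcripts; real target prediction also weighs site context, conservation and expression levels.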
3. Epigenetic regulatory functions of lncRNAs
As mentioned above, lncRNAs can interact with epigenetic mechanisms, altering gene expression. But what are those mechanisms? Epigenetics refers to chemical modifications of the chromatin, without alterations in the nucleotide sequence, which are transmitted through mitosis and play a key role in the regulation of gene expression and in genomic stability. Epigenetic mechanisms include DNA methylation, post-translational histone modifications, nucleosome positioning and chromatin accessibility. All these epigenetic marks interact with each other in a dynamic way, altering patterns of gene expression during embryogenesis, throughout life in response to environmental stimuli and in the transition from health to disease.
3.1 Chromatin structure and epigenetic marks
Different states of chromatin organization allow different transcription factors to bind to DNA and regulate gene expression. This interaction between transcription factors and DNA is only possible when the chromatin is in an open state, known as euchromatin. The inactive form of chromatin is called heterochromatin and is characterized by epigenetic marks that make this structure highly condensed. The negatively charged phosphate backbone of DNA is wrapped around an octamer of histone proteins, forming the nucleosome. The histone octamer is made of two copies of each of the histone proteins H2A, H2B, H3 and H4, and these structures are linked by the histone H1 protein. The epigenetic marks are written by enzymes that can add methyl groups to cytosine in the genomic DNA (DNA methyltransferases; DNMTs) and acetyl and methyl groups to amino acid residues of histone proteins (histone acetyltransferases and methyltransferases; HATs and HMTs, respectively). These marks can be interpreted by proteins that bind to methylated DNA, such as methyl-CpG-binding domain (MBD) proteins, and to modified histones, such as proteins containing chromo- and bromodomains. Epigenetic marks can also be erased by enzymes such as histone deacetylases and demethylases (HDACs and HDMs, respectively), as well as by the ten-eleven translocation (TET) family, which oxidizes 5-methylcytosine.
3.1.1 DNA methylation
DNA methylation is the process in which a methyl group is added to the fifth carbon of a cytosine, resulting in 5-methylcytosine (5mC). The methyl group is donated by S-adenosyl methionine (SAM), and this reaction is catalyzed by DNMT enzymes. In mammals, methylation usually occurs in the CpG dinucleotide context, but is not limited to it. There are five DNMTs in mammals: DNMT1, DNMT2, DNMT3A, DNMT3B and DNMT3L. During DNA replication, DNMT1 recognizes 5mC in the hemimethylated DNA and is responsible for re-establishing the methylation patterns in the daughter strand, which makes the epigenetic marks heritable during cell division. For this reason, DNMT1 is called a maintenance DNA methyltransferase. DNMT3A and DNMT3B show no preference for hemimethylated DNA and are able to establish de novo (new) methylation patterns, especially during cell differentiation in embryogenesis. DNMT3L has no catalytic activity but can act as a cofactor of DNMT3A, improving its affinity for DNA and thereby the methylation process. Although DNMT2 has no strong catalytic activity towards DNA, it was recently shown that this enzyme is capable of adding methyl groups to tRNA.
The methylation pattern of the human genome is said to be bimodal: some regions have a low methylation level, such as transcription start sites (TSS) with a high CpG content and imprinting control regions (ICR), whereas the other CpG sites in the genome are kept methylated. CpG islands (CGI) are CpG-rich regions within the DNA sequence, and they can be related to gene expression. For example, it is well established that TSS with highly methylated CGI are associated with long-term silencing. Methylation of promoter regions is associated with transcriptional repression, while non-methylated promoters are associated with active transcription. Methylated DNA acts as a physical barrier to transcription factors and, additionally, recruits MBDs that also act as repression complexes. However, methylation in the gene body is associated with actively transcribed genes; for example, when the methyl-CpG binding protein 2 (MeCP2) identifies methylated exons, it may regulate alternative splicing. Furthermore, transposable elements can also be silenced by DNA methylation, contributing to genome stability.
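To make the notion of a CpG-rich region more concrete, the Python sketch below computes two quantities often used to describe CpG content: the G+C fraction and the observed/expected CpG ratio, defined here as (number of CpG × sequence length) / (number of C × number of G). This ratio is a widely used convention rather than something defined in this chapter, the function name cpg_stats is hypothetical, and no particular island threshold is implied.

```python
def cpg_stats(sequence: str) -> dict:
    """Return the G+C fraction and observed/expected CpG ratio of a DNA sequence."""
    seq = sequence.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap with itself, so str.count finds every CpG dinucleotide.
    cpg_count = seq.count("CG")

    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Guard against division by zero when C or G is absent.
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0

    return {
        "gc_fraction": gc_fraction,
        "cpg_observed_expected": obs_exp,
    }
```

In practice such statistics are computed over sliding windows along a sequence, and a region is called an island only if it exceeds chosen cut-offs over a minimum length; those cut-offs vary between studies and are left to the reader.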
Demethylation of the genome is an important process during the pre-implantation phase and germ cell development. The enzymes responsible for this multistep process are the TET proteins capable of oxidizing 5mC, leading to the loss of DNA methylation. DNMTs and TETs can dynamically regulate transcription repression and activation across the genome. DNA methylation is a stable mark for gene silencing, but is not the only one.
Interactions between DNA methylation marks and lncRNAs may occur in order to control gene expression. As an example, one of the TET proteins, TET2, has been shown to negatively regulate the expression of the lncRNA antisense ncRNA in the INK4 locus (ANRIL) by binding to the lncRNA promoter, regulating not only its expression but also the expression of its downstream genes. In addition, a negative correlation has been observed between TET2 and the methylation of the promoter of the lncRNA maternally expressed gene 3 (MEG3). This is particularly interesting considering that MEG3 promoter methylation has been associated with poor survival in myeloid malignancies and the same trend has been observed in breast, cervical, colon, liver, lung and prostate cancer cell lines [16, 17]. As mentioned previously, the lncRNA XIST is responsible for X-chromosome inactivation. The active X chromosome, however, expresses the lncRNA TSIX, which is antisense to XIST through its 5′ end, interacts with PRC2 and enhances DNA hypermethylation through DNMT3A, resulting in XIST silencing. Conversely, neither PRC2 nor DNMT3A is essential for XIST expression, which implies that more than one pathway acts in TSIX/XIST regulation. Another lncRNA that interferes in DNA methylation is H19, which was found to interact with methyl-CpG-binding domain protein 1 (MBD1), which in turn recruits different enzymes related to gene silencing. H19 was also found to bind S-adenosyl homocysteine hydrolase (SAHH), the enzyme that hydrolyses S-adenosyl homocysteine (SAH), and thereby to block DNA methylation by DNMT3B [18, 19]. It is also known that H19 is part of an imprinted gene network and is expressed only from the maternal allele. Interestingly, the silencing of the paternal allele is due to CpG methylation of the H19 promoter. Genomic imprinting is an epigenetic mechanism that restricts the expression of a gene to only one allele, either maternal or paternal.
The lncRNA Nespas belongs to the Gnas imprinted cluster. Nespas transcription from the paternal allele is inversely correlated with that of Nesp. In this context, the Nesp promoter is found to be methylated, owing to the ability of Nespas to recruit KDM1B (histone demethylase 1B) and thereby demethylate lysine 4 of histone H3.
3.1.2 Histone modifications and chromatin remodellers
Other epigenetic marks are the post-translational modifications of histone proteins. Histone modifications are more plastic and can lead to repression or activation of transcription, depending on the modification and the residue that is modified. The main targets of modification are the amino acids located in the N-terminal tails of the histones that shape the nucleosome. Lysine is the amino acid that can accommodate the most combinations of modifications; however, other histone residues can also be modified, such as arginine, serine, threonine and tyrosine. Some of the most studied modifications are methylation and acetylation, but modifications such as phosphorylation, ubiquitination and ADP-ribosylation also occur. Histone acetylation is a mark related to active gene expression. The acetyl group neutralizes the positive charge of the histones; because the DNA backbone is negatively charged, DNA-histone interactions are weakened, and the resulting looser structure is more accessible to the transcription machinery. An example of acetylation controlling gene expression is the acetylation of histone H3: highly acetylated genomic regions are related to highly expressed genes, while low acetylation is present in silenced genomic regions. While histone acetylation profiles can indicate active or repressed regions of the genome, the effect of histone methylation depends on the residue where it occurs and on its degree of methylation, whether mono-, di- or trimethylated. Just as in DNA methylation, the methyl group added to histone residues is donated by SAM; however, this reaction is catalyzed by HMTs. The most commonly methylated histones are H3 and H4, and some marks have already been related to different states of chromatin accessibility.
For example, trimethylation of lysine 9 in histone H3 (H3K9me3) and trimethylation of lysine 20 in histone H4 (H4K20me3) are associated with inactive regions of the genome, while mono-methylation of lysine 9 in histone H3 (H3K9me1) and mono-methylation of lysine 20 in histone H4 (H4K20me1) are associated with active genomic regions. These marks are erased by different enzymes, including HDACs and HDMs, which remove acetyl and methyl groups from the histones, respectively. Histone modification and its cooperation with DNA methylation make the epigenome very dynamic and plastic during different stages of cell development.
Nucleosome occupancy in the chromatin and the changes that may occur in this structure are important in guiding gene expression. Chromatin remodellers use the energy of adenosine triphosphate (ATP) hydrolysis to alter nucleosome position. These remodellers are multi-protein complexes that can modulate nucleosome occupancy with the help of transcriptional cofactors, pioneer factors and non-coding RNAs. The coordinated mechanism of nucleosome modification and DNA accessibility can occur in at least four ways: (1) nucleosome sliding: supported by the imitation switch (ISWI) and chromodomain-helicase-DNA-binding (CHD) remodeller families; it is characterized by the transfer of the nucleosome to a new position without changing its chromatin region; (2) nucleosome ejection: mediated by the switch/sucrose non-fermentable (SWI/SNF) remodeller family; the nucleosome is removed from its position, and the DNA in that area becomes accessible; (3) nucleosome-selective dimer removal: mediated by the SWI/SNF family; it destabilizes the nucleosome, since it leaves only two tetramers within the nucleosome; (4) histone replacement: canonical histones are replaced by histone variants, for example H2A by H2A.Z, mediated by the inositol-requiring 80 (INO80) family. All these modifications are important in maintaining the dynamic regulation of gene expression. The polycomb repressive complexes (PRC1/PRC2) and the trithorax group/mixed-lineage leukaemia (TrxG/MLL) protein complexes are important players in the control of chromatin structure and are therefore important regulators of gene activity. PRC2 promotes the trimethylation of histone H3 at lysine 27 (H3K27me3), inhibiting gene activity, while TrxG/MLL stimulates the trimethylation of histone H3 at lysine 4 (H3K4me3), triggering gene expression.
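The associations between individual histone methylation marks and chromatin activity described in this section can be collected in a small lookup table. The Python sketch below is only a summary aid: the dictionary holds just the six marks named in the text, the names HISTONE_MARK_STATE and summarize_marks are hypothetical, and real chromatin-state calling combines many marks with quantitative data.

```python
# Marks and their reported association with chromatin activity,
# restricted to the examples given in the text.
HISTONE_MARK_STATE = {
    "H3K9me3": "repressive",
    "H4K20me3": "repressive",
    "H3K27me3": "repressive",  # deposited by PRC2
    "H3K9me1": "active",
    "H4K20me1": "active",
    "H3K4me3": "active",       # deposited by TrxG/MLL
}


def summarize_marks(marks):
    """Group a list of histone marks by their reported association.

    Marks that are not in the lookup table are reported as "unknown"
    rather than being guessed.
    """
    summary = {"active": [], "repressive": [], "unknown": []}
    for mark in marks:
        state = HISTONE_MARK_STATE.get(mark, "unknown")
        summary[state].append(mark)
    return summary
```

A promoter carrying both H3K27me3 and H3K4me3 would appear in both the repressive and the active lists, which is a reminder that the marks are read in combination rather than one at a time.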
Many lncRNAs have been reported to alter gene expression through histone modifications and chromatin remodelling. lncRNAs can interact with chromatin remodellers in order to promote or repress gene expression, depending on the genomic region. For instance, they can interact with multiple regulatory complexes at the same time and bind to different enzymes that can change chromatin marks, such as DNA methylation, histone modifications and nucleosome modifications. The lncRNA metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) can act as a molecular scaffold, and it is related to both gene silencing and gene activation. It can bind to the PRC2 complex, resulting in the methylation of histone H3 at lysine 27 (H3K27), which is a repressive mark. In another context, it can interact with nuclear speckles, structures that are thought to be associated with the splicing and processing of pre-mRNA, coordinating gene transcription and regulating mRNA splicing. The elncRNA HOTTIP (HOXA distal transcript antisense RNA) is transcribed from the 5′ tip of the homeobox A (HOXA) locus. HOTTIP binds the WD repeat-containing protein 5 (WDR5) and recruits the MLL methyltransferase complex, driving histone H3 lysine 4 trimethylation (H3K4me3) and coordinating the transcription of several genes of the HOX cluster. In addition, the lncRNA Foxf1 adjacent non-coding developmental regulatory RNA (FENDRR) recruits silencing complexes such as PRC2 and guides them to regions that will be silenced. However, FENDRR also interacts with the TrxG/MLL complex at a specific set of promoters, suggesting that there is a fine balance between FENDRR/PRC2 and FENDRR/MLL gene regulation [23, 24].
The lncRNAs urothelial cancer associated 1 (UCA1), highly upregulated in liver cancer (HULC) and PVT1 can interact with the histone methyltransferase complex PRC2, promoting trimethylation of lysine 27 on histone H3 and silencing gene expression in gallbladder cancer, colorectal carcinoma and gastric cancer, respectively. Some lncRNAs can also interact with chromatin remodellers of the SWI/SNF family, recruiting this complex to genomic regions to activate transcription; an example is the lncRNA transcription factor 7 (TCF7) in hepatocarcinoma cells. Other lncRNAs, such as nuclear enriched abundant transcript 1 (NEAT1), may also interact with the SWI/SNF complex; however, the specific mechanisms and functions are still unclear. Accumulating evidence over recent years indicates that lncRNAs play a major role in regulating gene expression through epigenetic marks, although these mechanisms still need to be further elucidated. Based on the topics discussed, the main interactions between lncRNAs and the epigenetic machinery are illustrated with one example of each in Figure 3.
3.2 Epigenetic regulation, lncRNAs and cancer
Human cancers are complex diseases involving multiple genetic and epigenetic alterations. Because only part of the malignant phenotype is explained by DNA mutations, recent research has demonstrated the importance of epigenetic alterations in the development of tumours. The switch from silenced genes to actively transcribed genes and vice versa is regulated by complex mechanisms and alterations within the cell machinery. Cancer cells present variations in DNA methylation patterns, such as a global hypomethylation profile and CpG island hypermethylation in promoter regions. Due to these modifications, the cancer cell genome presents chromosomal instability, loss of genomic imprinting and changes in gene expression, both for protein-coding genes and for regulatory non-coding RNAs.
With the development of high-throughput sequencing, a number of studies have provided an ever-expanding survey of genetic aberrations in cancer. These abnormalities also affect lncRNAs, disrupting their functions and consequently leading to deregulation of their targets. Some of the recurring molecular mechanisms that govern how lncRNAs regulate cellular processes were highlighted earlier in this chapter. Most of the lncRNAs that are well characterized to date have a functional role in the regulation of gene expression, typically transcriptional rather than post-transcriptional. With advances in cancer transcriptome profiling and accumulated evidence supporting lncRNA functions, a number of differentially expressed lncRNAs have been associated with several types of cancer, which simultaneously acquire one or more dynamic modifications within their structures. A few lncRNAs that have already been reported to participate in cancer progression are described below.
XIST is one of the best-studied lncRNAs, and as such, it has been investigated in many different human neoplasias. Its expression can be either upregulated or downregulated, so that it acts as an oncogene or as a tumour suppressor in multiple types of cancer. Overexpression of XIST is associated with advanced tumour stage, lymph node or distant metastasis and overall poor prognosis in human cancers. In breast cancer, XIST acts as a tumour suppressor by positively regulating the expression of the non-X-chromosome gene PH domain and leucine-rich repeat protein phosphatase 1 (PHLPP1), which in turn catalyses the dephosphorylation of protein kinase B (AKT). In non-small-cell lung cancer (NSCLC), nasopharyngeal and hepatocellular carcinoma, osteosarcoma and gastric, colorectal, pancreatic and bladder cancer, its expression is upregulated, and it acts as an oncogene, promoting cell proliferation and migration. XIST has also been described as a sponge for miR-186-5p, and its knockdown suppresses proliferation and invasion in NSCLC.
MALAT1 is another well-studied lncRNA. Its high expression has been associated with gastric cancer, melanoma and breast cancer, as well as with tumour and metastatic progression. MALAT1 also acts at key points of the cancer development process, since it regulates the transcription of oncogenic targets and regulates itself by interacting with transcription factors. This lncRNA can mediate the binding of transcription factors to target gene promoters, or it can act as a sponge that sequesters miRNAs, thereby controlling the suppressive effects of miRNAs on oncogenic targets. In addition, epigenetic modifications at the histone level, for instance demethylation of histone H3 at lysine 9 (H3K9) by a demethylase that binds to the MALAT1 promoter, may result in MALAT1 overexpression [27, 28].
HOTAIR is a lncRNA involved in gene silencing through its interaction with two chromatin-modifying complexes, and it plays numerous roles in cancer development. Altered expression of HOTAIR is found in many types of cancer, where it promotes metastasis and tumour invasiveness through epigenetic gene silencing. Cancer stem cells from breast, oral and colon carcinomas express high levels of HOTAIR, which are associated with increased stemness and metastatic potential. High levels of HOTAIR correlating with metastasis and poor prognosis have been found in lung cancer, hepatocellular carcinoma, breast cancer, gastric cancer, colorectal cancer, cervical cancer, ovarian cancer, head and neck carcinoma and oesophageal squamous cell carcinoma. More recently, elevated HOTAIR expression was also identified in adrenocortical carcinoma, where it was demonstrated to induce cell proliferation. In addition, another recent study showed the potential of HOTAIR to promote osteosarcoma development. Evidence supporting a role for HOTAIR in mediating drug resistance has emerged from investigations of different types of cancer. HOTAIR overexpression was found in samples from drug-resistant patients with NSCLC. Similar results demonstrated the potential of HOTAIR to promote resistance to cisplatin and other chemotherapy drugs in hepatocellular carcinoma, breast cancer, gastric cancer, colorectal cancer, cervical cancer and ovarian cancer [36, 37, 38, 39, 40].
The lncRNA H19 has been well studied in cancer. Aberrant expression of H19 is observed in numerous solid tumours, including hepatocellular and bladder cancer. Functional data on H19 point in several directions, and it has been linked to both oncogenic and tumour-suppressive properties. For example, there is evidence for its direct activation by cMYC as well as its downregulation by p53 during prolonged cell proliferation. siRNA knockdown of H19 impaired cell growth and clonogenicity in lung cancer cell lines in vitro and decreased xenograft tumour growth of Hep3B hepatocellular carcinoma cells in vivo [41, 42].
The DNMT1-associated lncRNA DACOR1 was described as repressed in colon cancer. This lncRNA was directly associated with demethylation of CpG sites, guiding DNMT1 methylation patterns at thousands of different CpG sites across the genome.
The long non-coding RNA PTENP1 is a well-known tumour suppressor. Studies have shown that PTENP1 increases PTEN protein levels by competing for a set of miRNAs that target and downregulate PTEN, a role that is independent of protein-coding function. In colon cancer patients, focal copy number loss at the PTENP1 locus was associated with downregulation of PTEN expression. A similar relationship was shown between the oncogene KRAS and its pseudogene KRAS1P in colon cancer [15, 44]. PTENP1 is downregulated or suppressed in several cancers, such as gastric cancer (GC), hepatocellular carcinoma (HCC), renal cell carcinoma, head and neck squamous cell carcinoma (HNSCC), melanoma, endometrial cancer and oral squamous cell carcinoma (OSCC) [45, 46, 47, 48, 49, 50].
Besides the alterations mentioned above, some lncRNAs have also been related to specific epigenetic marks in cancer, although not all of these marks have been fully elucidated in every type of cancer. Table 1 lists some lncRNAs that have been described in different cancer types, together with the epigenetic marks they can regulate.
| lncRNA | Associated epigenetic marks | Described in | Ref. |
|---|---|---|---|
| XIST | Histone modification, chromatin remodelling | Gastric, oesophageal | [21, 51, 52] |
| MALAT1/a/NEA | Histone modification | Breast, lung | [53, 54, 55, 56] |
| HOTAIR | Chromatin remodelling/histone modification | Breast, pancreas | [36, 37, 38, 39, 40] |
| H19 | DNA hypermethylation | Gastric | [41, 42, 57] |
| HULC | Chromatin remodelling | Hepato-, colorectal carcinoma | [58, 59] |
| GCLnc1 | Histone modification | Gastric cancer | |
| DACOR1 | DNA methylation | Colon cancer | |
| FENDRR | Histone modification | Gastric, lung cancers | |
| UCA1 | Chromatin remodelling | Colorectal cancer | [62, 63] |
| TCF7 | Chromatin remodelling | Liver cancer | |
| TP53TG1 | CGI hypermethylation | Colorectal, gastric cancers | |
| ANRIL | Chromatin remodelling | Prostate cancer, leukaemia | |
| MEG3 | Promoter/imprinting control hypermethylation | Brain tumour | |
| GAS5 | Histone modification | Breast cancer | |
| NEAT1 | Chromatin remodelling | Cervical, renal, lung | [69, 70, 71, 72, 73] |
| PVT1 | Histone modification | Renal, gastric | [70, 71, 72, 73, 74, 75] |
Gene expression can be regulated by different mechanisms, and epigenetic alterations have a major effect on this process. We discussed the different ways in which lncRNAs may interact with epigenetic marks to guide gene expression. However, the complete picture of how these interactions work remains unclear, as we are only beginning to understand the connections between lncRNAs and the epigenome. Since accumulating evidence shows that lncRNAs can regulate gene expression in cancer cells, understanding the mechanisms by which these molecules act is essential for comprehending cancer development and progression, and for developing better diagnostic tools and treatments. There is still a long way to go before we can elucidate the rules that guide these interactions and the functional implications of these associations.
This work was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).
Conflict of interest
No potential conflicts of interest were disclosed.