Non-coding RNA in the human genome
The question of which regions of the human genome constitute its functional elements (those expressed as genes or serving as regulatory elements) has long been a central topic in biology. In the 1970s and 1980s, early cloning-based methods revealed the presence of more than 7000 genes in the human genome, and large-scale analyses of expressed sequence tags (ESTs) in the 1990s suggested that the number of human genes ranged from 35,000 to 100,000. The completion of the Human Genome Project narrowed the focus considerably by highlighting the surprisingly small number of protein-coding genes, which is now conventionally cited as fewer than 25,000. While the number of protein-coding genes (20,000–25,000) has maintained broad consensus, recent studies of the human transcriptome have revealed an astounding number of non-coding RNAs (ncRNAs) [4-6]. In fact, the increased sensitivity of genome tiling arrays provides an even more detailed view, revealing that the extent of non-coding sequence transcription is at least four times greater than that of coding sequence, and that the abundance of non-coding transcripts had previously been overlooked. The RNA world hypothesis proposes that early life was based on RNAs, which subsequently devolved the storage of information to more stable DNA, and catalytic functions to more versatile proteins. Consequently, despite its crucial roles in the ancient processes of translation and splicing, RNA is assumed to have been largely relegated to an intermediate between gene and protein, as encapsulated in the central dogma ‘DNA makes RNA makes protein’. However, the finding that most of the genome in complex organisms is transcribed and the discovery of new classes of regulatory ncRNAs challenge this assumption and suggest that RNAs have continued to evolve and expand alongside proteins and DNA.
ncRNAs are defined as RNA transcripts that do not encode a protein. In the past decade, a great diversity of ncRNAs has been observed. Depending on the type of ncRNA, transcription can occur by any of the three RNA polymerases (RNA Pol I, RNA Pol II, or RNA Pol III). General conventions divide ncRNAs into two main categories: small ncRNAs shorter than 200 nt and long ncRNAs longer than 200 nt. Within these two categories, there are also many individual classes of ncRNAs (Table 1), although the degree of biological and experimental support for each class varies substantially and should be evaluated individually. The relevance of ncRNAs in gene regulation has rapidly come to light during the last decade. However, the functional elements in the primary sequence of non-coding genes that determine their role as RNA molecules remain unknown. Protein-coding genes have a defined language with a set of grammatical rules: three nucleotides form a codon that translates into a specific amino acid. Aberrations in the codons of a protein-coding gene can be interpreted in terms of the amino acids they encode. We can recognize a mutation in a codon and determine its contribution to a given disease. In contrast to the genetic code for protein synthesis, ‘the ncRNA alphabet’ (a specific set of RNA sequences or structural motifs important for ncRNA function) remains largely to be elucidated. Nevertheless, it has become increasingly apparent that ncRNAs are of crucial functional importance for normal development, physiology and disease. The functional relevance of ncRNAs is particularly evident for a class of small non-coding RNAs called microRNAs (miRNAs) [11-12]. In human diseases, particularly cancer, it has been shown that epigenetic and genetic defects in miRNAs and their processing machinery are a common hallmark of disease [13-16].
However, miRNAs are just the tip of the iceberg, and other ncRNAs such as small nucleolar RNAs (snoRNAs), PIWI-interacting RNAs (piRNAs), large intergenic non-coding RNAs (lincRNAs) and, overall, the heterogeneous group of long non-coding RNAs (lncRNAs), might also contribute to the development of many different human disorders. Here we discuss the most recent genetic studies on ncRNAs and their related proteins in the context of cancer and we will analyze the new regulatory elements of the noncoding language to interpret their contribution to the pathogenesis of cancer.
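The conventional length cutoff of 200 described above can be written as a simple rule. The following is only a minimal sketch: the function name, the constant name, and the treatment of a transcript of exactly 200 nt (which the convention leaves undefined) are assumptions made for illustration, and lengths are taken in nucleotides. The cutoff is an operational convention rather than a biological boundary; snoRNAs, for instance, are reported to span 60-300 nt.

```python
# Hypothetical helper for illustration; names are not taken from any library.
SMALL_LONG_CUTOFF_NT = 200  # general length convention for small vs. long ncRNAs


def classify_by_length(length_nt: int) -> str:
    """Return 'small ncRNA' or 'long ncRNA' for a transcript length in nucleotides."""
    if length_nt <= 0:
        raise ValueError("transcript length must be a positive number of nucleotides")
    # Assumption: a transcript of exactly 200 nt is placed in the long category,
    # because the convention only speaks of "less than" and "greater than" 200.
    return "small ncRNA" if length_nt < SMALL_LONG_CUTOFF_NT else "long ncRNA"


# Illustrative lengths only (not measurements of specific transcripts).
for length in (22, 27, 150, 2000):
    print(length, classify_by_length(length))
```

Length alone says nothing about function, so a label of this kind is best treated as a first sorting step before class-specific evidence is evaluated.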
In 1993, Victor Ambros and colleagues discovered a gene,
independent transcription units. Animal miRNAs are processed from longer primary transcripts (pri-miRNAs) that can contain multiple miRNAs [34,35]. Few pri-miRNA transcripts have been studied in detail, but in general miRNAs are regulated and transcribed like protein-coding genes by RNA polymerase (Pol) II, with the exception of the rapidly evolving miRNA cluster that is transcribed by RNA polymerase (Pol) III. MiRNA processing occurs in three essential steps (Figure 1). First, the nuclear endoribonuclease Drosha recognizes the miRNA hairpins in the primary transcript and cleaves each hairpin ~11 nt from its base [37-38]. It has been proposed that Drosha may recognize the pri-miRNA through the stem-loop structure and then cleave the stem at a fixed distance from the loop to liberate the pre-miRNA. How is Drosha able to discriminate the pri-miRNA stem-loop structure from other cellular stem-loop RNAs? Both cell culture experiments and in vitro Drosha cleavage assays have shown that proteins associated with Drosha confer specificity to this process. In fact, Drosha has been found to be part of a large, ~650-kDa protein complex known as the Microprocessor, where Drosha interacts with its cofactor DGCR8 (the DiGeorge syndrome critical region gene 8 protein) in humans and interacts with Pasha in
protecting them from the moment they are generated in the nucleus until they are ready for the next cleavage step in the cytoplasm, where GTP is hydrolyzed to guanosine diphosphate (GDP); at that point, the Exp5/Ran-GDP complex releases its cargo. Third, the endoribonuclease Dicer cleaves the pre-miRNA into ~22 nt duplexes and, with the help of cofactors such as TAR RNA binding protein (TRBP) and protein activator of the interferon-induced protein kinase (PACT), preferentially incorporates one of the duplex strands into the RNA-induced silencing complex (RISC) [44-50]. The final product is a miRNA-miRNA duplex that needs to be unwound so that one strand can act as a single-stranded guide in the RISC to recognize its target mRNAs. It was originally proposed that an ATP-dependent helicase (known as unwindase) separates the two small RNA strands, after which the resulting single-stranded guide is loaded into Ago proteins. However, it was later shown that
2. MicroRNAs and cancer
Cancer is a multistep process in which normal cells undergo genetic changes that drive them through a series of pre-malignant states (initiation) into invasive cancer (progression) that can spread throughout the body (metastasis). The dysregulation of genes involved in cell proliferation, differentiation and/or apoptosis is associated with cancer initiation and progression. Genes linked with cancer development are characterized as oncogenes and tumor suppressors. Recently, the definition of oncogenes and tumor suppressors has been expanded from the classical protein-coding genes to include miRNAs [61-62]. MiRNAs have been found to regulate more than 60% of mRNAs and have roles in fundamental processes, such as development, differentiation, cell proliferation, apoptosis, and stress responses. Over the past few years, many miRNAs have been implicated in various human cancers. The first evidence that miRNAs are involved in cancer came from the finding that miR-15 and miR-16 are downregulated or deleted in most patients with chronic lymphocytic leukemia. This discovery projected miRNAs to the center stage of molecular oncology and, in the past few years, a myriad of genome-wide miRNA expression profiling analyses have shown a general dysregulation of miRNA expression in all tumors (Table 2). Surprisingly, miRNA profiles are increasingly preferred to traditional mRNA signatures for a variety of reasons. First, the remarkable stability of miRNAs, due to their short length, has allowed scientists to perform analyses even in samples considered technically challenging, such as formalin-fixed specimens. Second, highly sensitive and refined miRNA detection techniques provide high reliability in the use of miRNAs as diagnostic tools.
Finally, miRNA fingerprints have demonstrated the ability to identify the tissue of origin for cancers that have already spread to multiple metastatic sites, thereby reducing the patient’s psychological burden and overall procedure costs. To date, over 1000 miRNAs have been reported in humans (miRBase: 1527 as of November 2011), and both loss and gain of miRNA function contribute to cancer development through a range of different mechanisms that we will discuss in the following sections.
3. Oncogenic microRNAs
Although studies linking miRNA dysfunctions to human diseases are in their infancy, a great deal of data already exists, establishing an important role for miRNAs in the pathogenesis of cancer. Many miRNAs have been shown to function as oncogenes in the majority of cancers profiled to date (Table 3).
Another important oncogenic miRNA is represented by
Another example of “oncomiR” is represented by
The miR-106b-25 polycistron is composed of the highly conserved miR-106b, miR-93, and miR-25 that accumulate in different types of cancer, including gastric, prostate, and pancreatic neuroendocrine tumors, as well as neuroblastoma and multiple myeloma. Petrocca and collaborators demonstrated that E2F1 regulates miR-106b, miR-93, and miR-25, inducing their accumulation in gastric tumors. Conversely, miR-106b and miR-93 control E2F1 expression, establishing a negative feedback loop that may be important in preventing E2F1 self-activation and apoptosis. On the other hand, miR-106b, miR-93, and miR-25 overexpression causes a decreased response of gastric cancer cells to TGFβ by downregulating p21 and Bim, the two most downstream effectors of TGFβ-dependent cell cycle arrest and apoptosis, respectively.
Another example of a miRNA locus with oncogenic properties is represented by the
3. Tumor suppressor microRNAs
The first evidence that miRNAs are involved in cancer comes from the finding that miR-15 and miR-16 are downregulated or deleted in most patients with chronic lymphocytic leukemia (CLL) (Table 4) . They are transcribed as a cluster (
In mammals, the miR-34 family comprises three processed miRNAs that are encoded by two different genes: miR-34a is encoded by its own transcript, whereas miR-34b and miR-34c share a common primary transcript. The miR-34 family has been shown to form part of the p53 tumor-suppressor network: their expression is directly induced by p53 in response to DNA damage or oncogenic stress [101-102]. He et al. identified different miR-34 targets such as cyclin E2 (CCNE2), CDK4, and MET. Silencing these selected miR-34 targets with small interfering RNAs (siRNAs) led to a substantial cell cycle arrest in G1. Moreover, ectopic miR-34 delivery caused a decrease in levels of phosphorylated retinoblastoma gene product (Rb), consistent with lowered activity of both CDK4 and CCNE2 complexes. BCL2 and MYCN were also identified as miR-34a targets and likely mediators of the tumor suppressor phenotypic effect in neuroblastoma. It has also been reported that p53 activation suppresses the EMT-inducing transcription factor SNAIL via induction of the miR-34a/b/c genes. In fact, suppression of miR-34a/b/c by anti-miRs caused up-regulation of SNAIL, and cells displayed EMT markers and enhanced migration and invasion.
MicroRNA-122 (miR-122) is a liver-specific microRNA that is frequently downregulated in liver cancer. Xu et al. reported that restoration of miR-122 in hepatocellular carcinoma cells could render the cells sensitive to the chemotherapeutic agents adriamycin or vincristine by downregulating the antiapoptotic gene Bcl-w and the cell cycle-related gene cyclin B1. Another group found that over-expression of miR-122 inhibits hepatocellular carcinoma cell growth and promotes apoptosis by affecting the Wnt/β-catenin signalling pathway. Coulouarn et al. showed that miR-122 is specifically repressed in a subset of primary hepatocellular tumors that are characterized by poor prognosis. They further reported that loss of miR-122 resulted in an increase in cell migration and invasion and that restoration of miR-122 reversed this phenotype. The most definitive evidence for the tumor suppressor role of miR-122 in liver cancer came from a recent study of miR-122 knockout (KO) mice. As miR-122 KO mice aged, hepatic inflammation ensued, preceding the progressive onset of fibrosis and, eventually, tumors resembling human liver cancer. These pathologic manifestations were associated with hyperactivity of oncogenic pathways and hepatic infiltration of inflammatory cells that produce pro-tumorigenic cytokines, including IL-6 and TNF.
Metastasis is the result of cancer cells detaching from a primary tumor, adapting to distant tissues and organs, and forming a secondary tumor, and this ability of cancer cells to metastasize is a hallmark of malignant tumors [111-112]. To successfully metastasize, a tumor cell must complete a complex set of processes, including invasion, survival and arrest in the circulatory system, and colonization of foreign organs. Despite great advances in our knowledge of metastasis biology, the molecular mechanisms are still not completely understood. Several miRNAs have been shown to initiate invasion and metastasis by targeting multiple proteins that are major players in these cellular events, and they have therefore been termed metastamiRs (Table 5). It seems that these metastasis-associated miRNAs do not influence the primary tumor in either the development or the initiation steps of tumorigenesis, but they regulate key steps in the metastatic program and processes such as epithelial-mesenchymal transition (EMT), apoptosis, and angiogenesis. Ma et al. reported that miR-10b is highly expressed in metastatic breast cancer cells and positively regulates cell migration and invasion. Overexpression of miR-10b in otherwise non-metastatic breast tumors initiates robust invasion and metastasis. The team led by Joan Massagué found that miR-335, miR-126, and miR-206 are metastasis suppressors in breast cancer. MiR-126 and miR-206 restoration reduced overall tumor growth and proliferation, whereas miR-335 inhibits metastatic cell invasion through targeting of the progenitor cell transcription factor
Another important aspect of metastatic dissemination is the epithelial-to-mesenchymal transition (EMT), which allows neoplastic cells to abandon their primary site and survive in the new tissue. During EMT, an epithelial neoplastic cell loses cell adhesion by repressing E-cadherin expression and thereby increases its motility. Numerous studies have shown that different microRNAs are modulated during EMT, and one of the best-studied examples is the miR-200 family. These miRNAs are commonly lost in aggressive tumors such as lung, prostate, and pancreatic cancer. It has been shown that miR-200 family members directly target ZEB1 and ZEB2, transcriptional repressors of E-cadherin. In fact, in a highly aggressive mouse lung cancer model in which KRAS is constitutively activated and p53 function is perturbed, ectopic miR-200 expression prevented metastasis by repressing ZEB1 and ZEB2 and preventing E-cadherin down-regulation. However, overexpression of the miR-200 family is associated with an increased risk of metastasis in breast cancer, and this overexpression promotes metastatic colonization in mouse models, phenotypes that cannot be explained by E-cadherin expression alone. By using proteomic profiling of the targets of the mesenchymal-to-epithelial transition (MET)-inducing miR-200, the authors discovered that miR-200 globally targets secreted proteins in breast cancer cells. Among the 38 modulated target genes, Sec23a, which is involved in transporting protein cargo from the endoplasmic reticulum to the Golgi, shows a superior association with human metastatic breast cancer as compared to the currently recognized miR-200 targets ZEB1 and the EMT marker E-cadherin. EMT is first acquired at the onset of transmigration and then reversed at the new metastatic site. Korpal et al. have shown that the miR-200 status predicts predisposition of the cancer to successful metastasis.
5. Other non-coding RNAs: Biology and implications in cancer
5.1. snoRNAs: From post-transcriptional modification to cancer
Small nucleolar RNAs (snoRNAs) have, for many years, been considered one of the best-characterized classes of non-coding RNAs (ncRNAs) [120-123]. Despite the common assumption that snoRNAs have only cellular housekeeping functions, in the past few years independent reports have converged in implicating snoRNAs in the control of cell fate and oncogenesis [124-130]. SnoRNAs are small RNAs of 60-300 nt in length that specifically accumulate in the nucleolar compartment of the cell, where they are responsible for the 2′-O-ribose methylation and pseudouridylation of specific ribosomal RNA nucleotides, modifications essential for the efficient and accurate production of the ribosome [120-122]. The snoRNAs carry out their function in the form of small nucleolar ribonucleoproteins (snoRNPs), each of which consists of a box C/D or box H/ACA guide RNA and four associated C/D or H/ACA snoRNP proteins (Figure 2). In both cases, snoRNAs hybridize specifically to the complementary sequence in the rRNAs, and the associated protein complexes then carry out the appropriate modification on the nucleotide that is identified by the snoRNAs. Biogenesis of vertebrate snoRNPs is remarkable and highly variable: snoRNA gene organization ranges from independently transcribed genes, endowed with their own promoter elements, to intronic coding units lacking an independent promoter. In both yeast and animals, processing of intron-encoded snoRNAs is largely splicing-dependent; in contrast, the production of plant snoRNAs from introns seems to rely on a splicing-independent process. Moreover, in both contexts (intergenic or intronic), genes can be either single or part of clusters. In the latter case, the generation of individual snoRNAs involves the enzymatic processing of polycistronic precursor RNAs. Such processing, at least in yeast, appears to involve the same combination of endo- and exoribonucleases required for the maturation of monocistronic pre-snoRNAs [132-134].
The first indication that snoRNAs might have important roles in human disease was provided by genetic studies on Prader–Willi syndrome (PWS), an inherited human disorder characterized by a complex phenotype, including mental retardation, decreased muscle tone and failure to thrive at birth, short stature, hypogonadism, sleep apnea, behavioral problems and hyperphagia (an insatiable appetite) that can lead to severe obesity. The disease is caused by the genomic loss of the imprinted chromosomal 15q11-q13 locus, which is normally active only on the paternal allele. The only characterized and conserved genes within this 121-kb-long genomic interval are the numerous HBII-85 snoRNA gene copies, suggesting that loss of expression of these repeated small C/D RNA genes might play a role in conferring some (or even all) phenotypes of the human disease and PWS-like phenotypes in mice (neonatal lethality, growth retardation and hypotonia). In fact, it has been shown that a site-specific deletion of the entire murine MBII-85 gene cluster led to post-natal growth retardation, with low postnatal lethality (<15%) seen only in some genetic backgrounds, but no obesity. Although all the imprinted C/D RNAs that have been tested accumulate within the nucleolus, none of them appears to act as an RNA guide to modify rRNAs or spliceosomal U-snRNAs; they are therefore called ‘orphan C/D RNAs’. So far, the MBII-52 gene clusters have attracted much attention, given that the neuronal-specific MBII-52 small RNA is predicted to interfere (A-to-I RNA editing and/or alternative RNA splicing) with the post-transcriptional regulation of the pre-mRNA that encodes the 5-HT2C (5-hydroxytryptamine 2C) receptor, which plays a key role in regulating serotonergic signal transduction [137-138]. These observations raised the possibility that snoRNAs could have functions completely independent of their traditional activities and carry out other regulatory roles.
The first insights into the potential roles of snoRNAs in cancer came from a study that identified the C/D box snoRNA U50 and its host gene U50HG at the breakpoint of the t(3;6)(q21;q15) translocation in a diffuse large B cell lymphoma. Moreover, the snoRNA U50 gene has been found to undergo frequent copy number loss and transcriptional downregulation in breast and prostate cancer samples [139,140]. In addition, a 2-bp deletion in the U50 sequence occurred both somatically and in the germline, leading to an increased incidence of homozygosity for the deletion in cancer cells.
SnoRNA42 (SNORA42) is located on chromosome 1q22, a frequently amplified genomic region in lung cancer, and overexpression of SNORA42 is frequently found in non-small cell lung cancer (NSCLC) cells. In addition, increases in SNORA42 copy number correlated closely with its expression level, suggesting that SNORA42 overexpression could be driven by its amplification. Importantly, engineered repression of SNORA42 caused marked repression of lung cancer growth in vitro and in vivo and was associated with increased apoptosis through a p53-dependent pathway. Although they do not exhibit apoptosis, p53-null and p53-mutant cancer cells with reduced levels of SNORA42 also show inhibited proliferation and growth, suggesting that SNORA42 knockdown can inhibit cell proliferation in a p53-dependent or -independent manner. These independent studies on U50 and SNORA42 provide evidence for the functional importance of snoRNAs in cancer, and they show that snoRNAs can promote, as well as suppress, tumour development. In 2002, Wu and coworkers demonstrated that snoRNAs 5S were differentially expressed in different tissues: expression was notably high in normal brain but decreased drastically in meningioma. Recently, genome-wide approaches identified six snoRNAs (SNORD33, SNORD66, SNORD73B, SNORD76, SNORD78, and SNORA42) whose expression differed significantly between NSCLC tumors and paired noncancerous samples. Specifically, all these snoRNAs displayed strong up-regulation in lung tumor specimens, and the majority of them are located in genomic regions that are frequently amplified in lung cancer: SNORD33 is located in chromosome 19q13.3, which contains potential oncogenes in lung cancer, while SNORD66 and SNORD76 are situated in chromosomal regions 3q27.1 and 1q25.1, respectively. 3q27.1 and 1q25.1 are two of the most frequently amplified chromosomal segments in solid tumors, particularly NSCLC.
In addition to the initial evidence that snoRNAs are involved in cancer development, there are some preliminary data showing that the genes that host snoRNAs might also contribute to the aetiology of this disease. A screen for potential tumor-suppressor genes identified the growth arrest-specific transcript 5 (Gas5) gene as almost undetectable in actively growing cells but highly expressed in cells undergoing serum starvation or density arrest [144-145]. Gas5 is a multi-snoRNA host gene that encodes 9 (in mouse) or 10 (in human) snoRNAs and, like all known snoRNA host genes, exhibits characteristics of the class of genes encoding 5′ terminal oligopyrimidine (5′TOP) mRNAs. The first and strongest evidence that GAS5 is related to cancer is the finding that GAS5 transcript levels are significantly reduced in breast cancer samples relative to adjacent unaffected normal breast epithelial tissues and that some, but not all, GAS5 transcripts sensitize mammalian cells to apoptosis inducers. Other studies have also shown that reduced GAS5 expression is associated with poor prognosis in both breast cancer and head and neck squamous cell carcinoma. Of note, GAS5 has also been identified as a novel partner of BCL6 in a patient with diffuse large B-cell lymphoma harboring the t(1;3)(q25;q27) translocation. Another example of a mature spliced transcript that harbors C/D-box snoRNAs and can function independently of the snoRNAs is the transcript Zfas1. This gene intronically hosts three C/D box snoRNAs (Snord12, Snord12b, and Snord12c) and has been identified as one of the most differentially expressed genes during mouse mammary development. siRNA-mediated downregulation of Zfas1 mRNA in a mouse mammary cell line increased proliferation and differentiation without substantially affecting the levels of the snoRNAs hosted within its introns.
The human homologue, ZFAS1 (also known as ZNFX1-AS1), which is predicted to share secondary structural features with mouse Zfas1, is expressed at high levels in the mammary gland and is downregulated in breast cancer. Taken together, these findings indicate that snoRNA host genes might have important functions in regulating cellular homeostasis and, potentially, cancer biology, but more studies are needed to understand their involvement in the molecular basis of disease and to classify them as sources of potential biomarkers and therapeutic targets.
Another important aspect of the association between snoRNAs and tumorigenesis is the involvement of their associated proteins in cancer. Point mutations in the DKC1 gene are the cause of a rare X-linked recessive disease, dyskeratosis congenita (DC) [151-152]. Individuals with DC display features of premature aging, as well as nail dystrophy, mucosal leukoplakia, interstitial fibrosis of the lung, and increased susceptibility to cancer. DKC1 codes for dyskerin, a putative pseudouridine synthase, which carries out two separate functions, both fundamental for proliferating cells. One function is the pseudouridylation of ribosomal RNA (rRNA) molecules as part of the H/ACA ribonucleoprotein complex, and the other is the stabilization of the telomerase RNA component necessary for telomerase activity. Dkc1 mutant mice recapitulate the major features of DC, including an increased susceptibility to tumor formation. Early generations (G1 and G2) of Dkc1 mutant mice showed the full spectrum of DC and presented alterations in rRNA modification, whereas defects in telomere length were not evident until G4 mice, suggesting that deregulated ribosome function is important for the initiation of DC and that impairment of telomerase activity in Dkc1 mutant mice may modify and/or exacerbate the disease in later generations. In this regard, DKC1 was identified as one of only seventy genes that, collectively, constitute a gene expression profile that strongly correlates with the development of aneuploidy and is associated with poor clinical prognosis in a variety of human cancers. Therefore, one hypothesis is that an alteration of physiologic dyskerin function, irrespective of the mechanism, may perturb mitosis and contribute to tumorigenesis, but this idea will require more detailed investigation. Another possibility is related to the strong effect of dyskerin loss on H/ACA accumulation.
Recent findings have in fact shown that some H/ACA box and C/D box snoRNAs can be processed to produce small RNAs, at least some of which can function like miRNAs. Such processing may be of crucial importance, as miRNAs have important roles in the development of many cancers, as previously discussed. Xiao and colleagues recently reported that an H/ACA box snoRNA-derived miRNA, miR-605, has a key role in stress-induced stabilization of the p53 tumour suppressor protein. p53 transcriptionally activates its negative regulator, MDM2, in addition to miR-605. miR-605 counteracts MDM2 through post-transcriptional repression; under conditions of stress, this snoRNA-derived miRNA offsets the MDM2 negative-feedback loop, generating a positive-feedback loop that enables the rapid accumulation of p53. However, whether this regulation of p53 by miR-605 is relevant to cancer biology has not yet been addressed. Like dyskerin, the NHP2 and NOP10 proteins, both components of the H/ACA snoRNPs, are also significantly up-regulated in sporadic cancers, and high levels may be associated with poor clinical prognosis. Moreover, germline NHP2 and NOP10 mutations give rise to autosomal recessive forms of dyskeratosis congenita, and cancer susceptibility is also a feature of these genetic forms of the disease. Since the functions of several snoRNAs have not yet been identified (orphan snoRNAs), it is possible that disruption of snoRNP biogenesis by any mechanism may affect an array of important cellular processes and could potentiate cancer development and/or progression.
5.2. piRNAs: Guardians of the genome
Piwi-interacting RNAs (piRNAs) are germline-specific small silencing RNAs of 24–30 nt in length that suppress transposable element (TE) activity and maintain genome integrity during germline development, a role highly conserved across animal species [155-156]. TEs are genomic parasites that threaten the integrity of the host genome: they are able to move to new sites by insertion or transposition and thereby disrupt genes and alter the genome. In animals, endogenous siRNAs also silence TEs, but the piRNA pathway is at the forefront of defense against transposons in germ cells. piRNAs specifically associate with PIWI proteins, which are germline-specific members of the AGO protein family, namely AGO3, Aubergine (Aub) and Piwi, and form a piRNA-induced silencing complex (piRISC) that guides TE silencing [159-162]. Mutations in any of the three members of the PIWI family lead to transposon derepression in the germline, indicating that they act non-redundantly during TE silencing. Initial screening of piRNA sequences revealed that there are hundreds of thousands, if not millions, of individual piRNA sequences [163-165]. Furthermore, they are characterized by the absence of specific sequence motifs or secondary structures such as those of miRNA precursors. Despite their large diversity, most piRNAs can be mapped to a relatively small number of genomic regions called piRNA clusters. Each cluster extends from several to more than 200 kilobases and contains multiple sequences that generate piRNAs, and some piRNAs map to both genomic strands, suggesting bidirectional transcription [163-165]. Indeed, analysis of piRNA clusters in different
6. The emergence of long non-coding RNAs
Over the last decade, advances in genome-wide analyses of the eukaryotic transcriptome have revealed that most of the human genome is transcribed, generating a large repertoire of long (>200 nt) non-coding RNAs (lncRNA, or lincRNA for long intergenic ncRNA) that map to intronic and intergenic regions [181,181]. Given their unexpected abundance, lncRNAs were initially thought to be spurious transcriptional noise resulting from low RNA polymerase fidelity. However, the restricted expression of many long ncRNAs to particular developmental contexts, their often precise subcellular localization and the binding of transcription factors to non-coding loci suggested that a significant portion of ncRNAs fulfills functional roles beyond transcriptional remodelling [183-187]. The term lncRNA typically refers to a polyadenylated long ncRNA that is transcribed by RNA polymerase II and is associated with epigenetic signatures common to protein-coding genes, such as trimethylation of histone 3 lysine 4 (H3K4me3) at the transcriptional start site (TSS) and trimethylation of histone 3 lysine 36 (H3K36me3) throughout the gene body [188-189]. lncRNAs also commonly exhibit splicing of multiple exons into a mature transcript, and their transcription occurs from an independent gene promoter and is not coupled to the transcription of a nearby or associated parental gene. RNA-Seq studies now suggest that several thousand uncharacterized lncRNAs are present in any given cell type [188-189], and that the human genome may harbor nearly as many lncRNAs as protein-coding genes (perhaps ~15,000 lncRNAs), although only a fraction is expressed in a given cell type. One main characteristic of lncRNAs is their very low sequence conservation, which has fueled the idea that they are not functional. This assertion needs to be considered carefully, taking several points into account.
First, a recent study identified 1,600 lncRNAs that show strong evolutionary conservation and functions ranging from embryonic stem cell pluripotency to cell proliferation. In contrast to protein-coding genes, long ncRNAs can exhibit shorter stretches of conserved sequence that maintain functional domains and structures. Indeed, many long ncRNAs with a known function, such as
long ncRNAs can mediate epigenetic changes by recruiting chromatin remodelling complexes to specific genomic loci, resolving the paradox of how a small repertoire of chromatin remodelling complexes, which have no apparent specificity for particular genomic loci, is able to specify the large array of chromatin modifications [191,192]. A recent study found that 20% of 3300 human long non-coding RNAs are bound by Polycomb Repressive Complex 2 (PRC2). Although the specific molecular mechanisms are not defined, several examples illustrate the silencing potential of lncRNAs (Figure 4). The best-known example is X-chromosome inactivation, which is carried out by a number of lncRNAs including Xist and RepA, which bind the PRC2 complex, and the antagonist of Xist, Tsix. In pre-X-inactivation cells, Tsix competes with RepA for binding to the PRC2 complex; when X-inactivation starts, Tsix is downregulated and PRC2 becomes available to RepA, which can then actively induce the transcription of Xist. The up-regulated Xist in turn preferentially binds PRC2 and spreads across the X chromosome, inducing PRC2-mediated trimethylation of histone H3 lysine 27 (H3K27me3). Another important example is the hundreds of long ncRNAs that are sequentially expressed along the temporal and spatial developmental axes of the human homeobox (Hox) loci, where they define chromatin domains of differential histone methylation and RNA polymerase accessibility. One of these ncRNAs, Hox transcript antisense RNA (HOTAIR), originates from the HOXC locus and silences transcription across 40 kb of the HOXD locus in trans by inducing a repressive chromatin state, which is proposed to occur through HOTAIR-mediated recruitment of the Polycomb chromatin remodelling complex PRC2 (Figure 4). Recently, it has been proposed that HOTAIR can also bind other histone-modifying enzymes such as the demethylase LSD1.
In fact, knockdown of HOTAIR induces a rapid loss of LSD1 or PRC2 at hundreds of gene loci, with a corresponding increase in expression. This model fits other chromatin-modifying complexes, such as Mll, PcG, and G9a methyltransferase, which can be similarly directed by their associated ncRNAs. As a modulator of epigenetic marks, HOTAIR has been shown to have a profound effect on tumorigenesis. In fact, HOTAIR is upregulated in breast carcinoma and colon cancer, and its expression correlates with metastasis and poor prognosis. Enforced expression of HOTAIR consistently changed the pattern of Polycomb protein occupancy from that typical of mammary epithelial cells to that of embryonic fibroblasts. Another important effect of lncRNAs on chromatin modification that highlights their impact on cancer is the relationship between the lncRNA ANRIL and the INK4b/ARF/INK4a locus, which encodes three tumor-suppressor genes that are frequently deleted or silenced in a large cohort of tumors. ANRIL, which is transcribed antisense to the protein-coding genes of the locus, controls the epigenetic status of the locus by interacting with subunits of PRC1 and PRC2. High expression of ANRIL is found in some cancer tissues and is associated with high levels of PRC-mediated trimethylation of histone H3 lysine 27. Inhibition of ANRIL releases the PRC1 and PRC2 complexes from the locus and decreases histone methylation, with a consequent increase in transcription of the protein-coding genes. Many other tumor suppressor genes that are frequently silenced by epigenetic mechanisms in cancer also have antisense partners, which can affect gene expression through a variety of other mechanisms. First, antisense ncRNAs can mask key cis-elements in mRNA by forming RNA duplexes, as in the case of the Zeb2 antisense RNA, which is complementary to the 5′ splice site of an intron of Zeb2 mRNA.
Expression of the ncRNA prevents splicing of this intron, which contains an internal ribosome entry site required for efficient translation; retention of the intron therefore allows efficient translation and expression of the ZEB2 protein (Figure 4). In this context, it has been estimated that a large proportion of lncRNAs are antisense to introns, which suggests that they may regulate splicing or form duplexes with mRNA that feed the RISC machinery to silence gene expression. One major emergent theme is the involvement of lncRNAs in the assembly or activity of transcription factors, whether by acting as a scaffold for the docking of many proteins, by mimicking functional DNA elements or by modulating Pol II itself. The first example is the lncRNA-mediated suppression of CCND1 through the recruitment and integration of the RNA binding protein TLS into a transcriptional programme. DNA damage signals induce the expression of long ncRNAs associated with the cyclin D1 gene promoter, where they act cooperatively to recruit the RNA binding
protein TLS. The modified and promoter-docked TLS inhibits the histone acetyltransferase activities of CREB binding protein and p300, thereby silencing cyclin D1 expression (Figure 4). A different, co-activator activity mediated by lncRNAs is evident in the regulation of Dlx genes, important modulators of neuronal development and patterning. Dlx5-6 expression is regulated by two ultraconserved enhancers, one of which is transcribed into a lncRNA named Evf2. Evf2 forms a stable complex with the homeodomain protein DLX-2, which in turn acts as a transcriptional enhancer of the Dlx5-6 genes (Figure 4). In some cases, lncRNAs can also affect RNA polymerase activity by influencing the promoter choice of the initiation complex. For example, in humans, a ncRNA transcribed from an upstream region of the dihydrofolate reductase (DHFR) locus forms a triplex in the major promoter of DHFR that prevents the binding of the transcriptional co-factor TFIID (Figure 4). This could be a widespread mechanism for controlling promoter usage, as thousands of triplex structures exist in eukaryotic chromosomes. Recently, lncRNAs have also shown their tumorigenic potential by modulating the transcriptional program of p53. A 3 kb lncRNA, lincRNA-p21, which is transcriptionally activated by p53, has been shown to collaborate with p53 to control gene expression in response to DNA damage. Specifically, silencing of lincRNA-p21 derepresses the expression of hundreds of genes that are also derepressed following p53 knockdown. It has also been discovered that lincRNA-p21 interacts with hnRNPK and that this binding is essential for the modulation of p53 activity.
The final category of lncRNAs comprises molecules that drive the formation of compartmentalized nuclear organelles, membraneless subnuclear bodies whose function is relatively unknown. One of these is the cell-cycle regulated nuclear foci named paraspeckles. In addition to protein components, two lncRNAs, NEAT1 and Men epsilon, have been detected as essential parts of paraspeckles. While depletion of NEAT1 or Men epsilon disrupts paraspeckles, their overexpression strongly increases paraspeckle number. A number of other lncRNAs localize to different nuclear regions. Metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) localizes to the splicing speckles; Xist and Kcnq1ot1 both localize to the perinucleolar region during the S phase of the cell cycle; and a class of repeat-associated lncRNAs (e.g. SatIII) is associated with nuclear stress bodies, which are produced on specific pericentromeric heterochromatic domains containing the SatIII gene itself.
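The size conventions used in this chapter (piRNAs of 24–30 nt, long ncRNAs above 200 nt) can be written down as a simple length filter. The following Python sketch is only an illustration: the function name, the constant names and the handling of a transcript of exactly 200 nt (treated here as small) are assumptions made for the example, and length alone cannot establish the class of a transcript, which also depends on biogenesis and protein partners.

```python
# Minimal sketch of the size conventions quoted in the text.
# All names below are hypothetical and chosen for illustration only.

PIRNA_MIN_NT = 24      # lower bound of the piRNA size range (24-30 nt)
PIRNA_MAX_NT = 30      # upper bound of the piRNA size range
LONG_NCRNA_MIN_NT = 200  # long ncRNAs are longer than 200 nt


def size_class(length_nt: int) -> str:
    """Return a coarse size class for a non-coding transcript.

    Assumption: a transcript of exactly 200 nt is treated as small,
    because the text only defines 'less than' and 'greater than' 200.
    """
    if length_nt <= 0:
        raise ValueError("transcript length must be a positive integer")
    if length_nt > LONG_NCRNA_MIN_NT:
        return "long ncRNA"
    if PIRNA_MIN_NT <= length_nt <= PIRNA_MAX_NT:
        # Length is compatible with a piRNA, but does not prove it is one.
        return "small ncRNA (piRNA size range)"
    return "small ncRNA"


if __name__ == "__main__":
    # Example lengths: a 27 nt small RNA, a 22 nt small RNA,
    # and a transcript of about 3 kb such as lincRNA-p21.
    for length in (27, 22, 3000):
        print(length, "nt:", size_class(length))
```

A filter of this kind is at most a first sorting step for sequencing reads or annotated transcripts; the functional classes discussed above are defined experimentally.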
Alterations in microRNAs and other short or long non-coding RNAs (ncRNAs) are involved in the initiation, progression, and metastasis of human cancer. Over the last decade, a growing number of non-coding transcripts have been found to have roles in gene regulation and RNA processing. The best-known small non-coding RNAs are the microRNAs, but the network of long and short non-coding transcripts is complex and is likely to contain as yet unidentified classes of molecules that form transcriptional regulatory networks. The field of small and long non-coding RNAs is rapidly advancing toward in vivo delivery for therapeutic purposes. Advanced molecular therapies aimed at downmodulating or upmodulating the level of a given miRNA in model organisms have been successfully established. RNA-based gene therapy can be used to treat cancer by using RNA or DNA molecules as therapy against the mRNA of genes involved in cancer pathogenesis or by directly targeting the ncRNAs that participate in pathogenesis. The use of miRNAs is still being evaluated preclinically; no clinical or toxicologic studies have been published, but the future is promising. Kota and colleagues reported that systemic administration of this miRNA in a mouse model of HCC using adeno-associated virus (AAV) results in inhibition of cancer cell proliferation, induction of tumor-specific apoptosis, and dramatic protection from disease progression without toxicity (116). Recently, Pineau et al. (117) identified DNA damage-inducible transcript 4 (DDIT4), a modulator of the mTOR pathway, as a bona fide target of miR-221. They introduced into liver cancer cells, by lipofection, LNA-modified oligonucleotides specifically designed for miR-221 (antimiR-221) and miR-222 (antimiR-222) knockdown. Treatment with antagomiRs, but not with a scrambled oligonucleotide, reduced cell growth in liver cancer cell lines that overexpressed miR-221 and miR-222 by 35% and 22%, respectively.
Thus the use of synthetic inhibitors of miR-221 may prove to be a promising approach to liver cancer treatment (117). Despite recent progress in silencing of miRNAs in rodents, the development of effective and safe approaches for sequence-specific antagonism of miRNAs in vivo remains a significant scientific and therapeutic challenge. Recently, Elmen and collaborators (118) showed for the first time that the simple systemic delivery of an unconjugated, PBS-formulated LNA-antimiR effectively antagonizes the liver-expressed miR-122 in nonhuman primates. Administration of LNA-antimiR by intravenous injection into African green monkeys resulted in the formation of stable heteroduplexes between the LNA-antimiR and miR-122, accompanied by depletion of mature miR-122 and dose-dependent lowering of plasma cholesterol. These findings demonstrate the utility of systemically administered LNA-antimiRs in exploring miRNA functions in primates and show the impressive potential of this strategy to overcome a major hurdle for clinical miRNA therapy. In conclusion, the discovery of small RNAs and their functions has revitalized the prospect of controlling expression of specific genes in vivo, with the ultimate hope of building a new class of gene-specific medical therapies. Just how significant are the ncRNAs? They appear to be doing something important and highly sophisticated; there are so many of them, their sequences are so highly conserved, their expression is tissue specific, and they have recognition sites on more than 30% of the entire transcriptome. It seems that ncRNAs were overlooked in the past simply because researchers were specifically looking for RNAs that code for proteins. The data discussed above highlight that the complexity of genomic control exerted by ncRNAs is greater than previously imagined, and that they could represent a totally new order of genomic control. In this scenario, understanding the precise roles of ncRNAs is a key challenge.
The targeting of other ncRNAs, in addition to miRNAs, is still in its infancy, but new important developments are expected in this area. Therefore, small RNAs could become powerful therapeutic tools in the near future.