Examples of miRNAs associated with AML that change expression level
Large-scale analysis of total genome transcripts (the transcriptome) in organisms including human and mouse has revealed that many RNAs are transcribed from genomic regions that encode no proteins; these are referred to as non-coding RNAs (ncRNAs) (1-5). Among such ncRNAs, microRNAs (miRNAs), small RNA molecules 18-28 bases long, have been extensively studied over the past decade, and a gene regulatory system called “RNA silencing” has been revealed. In humans, more than 400 miRNAs are known to regulate at least one-third of protein-encoding genes (6-10). Most miRNAs are generated by processing of long miRNA precursors (pri-miRNAs) (6, 9). Pri-miRNAs are transcribed by RNA polymerase II and, like protein-encoding mRNAs, acquire 5’ cap structures and poly(A) tails. Pri-miRNAs are further processed in the nucleus into pre-miRNAs with a hairpin structure of approximately 70 bases and are then exported to the cytoplasm. Pre-miRNAs are finally processed into mature miRNAs by the enzyme Dicer. It is noteworthy that miRNAs are sometimes encoded in the introns of other genes. A mature miRNA is incorporated into the RNA-induced silencing complex to act on its target mRNA. Broadly speaking, miRNAs can act on mRNAs in two ways. If there is limited homology between an miRNA and a target mRNA, the miRNA suppresses translation of the mRNA. However, if the miRNA has complete or nearly complete homology with a target mRNA, the mRNA is rapidly degraded. In animal cells, the former scenario usually occurs (7, 10-12). Many miRNAs have been reported to be associated with tumors, including AML and glioma; however, it is still unclear how predominant miRNAs are in tumorigenesis.
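The two modes of miRNA action described above can be illustrated with a toy calculation. The following Python sketch is not a target-prediction method: it only counts Watson-Crick pairs between an miRNA and an equal-length target site and applies a cut-off. The function names, the example sequence and the 0.9 threshold are all assumptions made for illustration and are not taken from the cited literature.

```python
# Toy illustration only: names, example sequence and threshold are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    rna = rna.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def paired_fraction(mirna, target_site):
    """Fraction of miRNA bases that form Watson-Crick pairs with the site.

    Both sequences are written 5'->3' and must have equal length; the site
    is reversed so that the two strands are aligned antiparallel.
    """
    mirna = mirna.upper().replace("T", "U")
    site = target_site.upper().replace("T", "U")
    if len(mirna) != len(site):
        raise ValueError("miRNA and target site must have the same length")
    antiparallel = site[::-1]
    pairs = sum(COMPLEMENT.get(m) == s for m, s in zip(mirna, antiparallel))
    return pairs / len(mirna)


def predicted_mode(mirna, target_site, threshold=0.9):
    """Map the paired fraction onto the two outcomes described in the text.

    The threshold is an arbitrary assumption, not an experimental value.
    """
    if paired_fraction(mirna, target_site) >= threshold:
        return "mRNA degradation"
    return "translational repression"


mirna = "UAGCUUAUCAGACUGAUGUUGA"           # arbitrary 22-base example
perfect_site = reverse_complement(mirna)   # fully complementary site
partial_site = "A" * 14 + perfect_site[-8:]  # pairs mainly with the miRNA 5' end

print(predicted_mode(mirna, perfect_site))
print(predicted_mode(mirna, partial_site))
```

Real target recognition depends on more than a simple count of paired bases, so this sketch should be read only as a restatement of the limited-versus-complete homology distinction.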
Relatively large ncRNAs of over several hundred bases, which are longer than pri-miRNAs (usually 200-300 bases), are called long-chain non-coding RNAs (lncRNAs). Despite their somewhat unclear definition and their largely undetermined functions (13), the public databases for lncRNAs, for example lncRNAdb (http://www.lncrnadb.org/) (14) or NONCODE (http://www.noncode.org) (15), contain several hundred mammalian lncRNAs, including more than 100 from humans (16). The RNAs included are heterogeneous; some localize in the nucleus to form certain structures, others interact with chromatin-modifying enzymes such as p300, while others function in the cytoplasm (Fig. 1).
Both miRNAs and lncRNAs are physiologically important in many biological processes, including development and cell differentiation. Their association with disease, especially cancers, is of great interest (5). Association of miRNAs with various tumors, including different types of leukemia (Table 1) and glioma (Table 2), has been demonstrated. They sometimes act as tumor-promoting factors and sometimes as tumor suppressors. Expression of many lncRNAs, including NDM29 (neuroblastoma) (17, 18) and MALAT-1 (lung cancer) (19), is correlated with tumor progression, while MEG3 (pituitary tumor) (20, 21), HOTAIR (breast carcinoma) (22), H19 (Wilms’ tumor) (23), AK023948 (papillary thyroid tumor) (24) and LOC285194 (osteosarcoma) (25) are putative tumor suppressors (Table 3). These lncRNAs seem to control cancer cell growth by regulating other genes (NDM29, HOTAIR, H19) or by adjusting the mRNA splicing mechanism (MALAT-1) (Fig. 1) (14).
|Oncogenic or Increased Expression in AML||Tumor Suppressive or Decreased Expression in AML|
|Name||Genetic Locus||Name||Genetic Locus|
|Oncogenic or Increased Expression in Glioma||Tumor Suppressive or Decreased Expression in Glioma|
|Name||Alias||Mouse Homolog||Genetic Locus||Product Length (bp)||Tumor||Function||Refs|
|Tumor promoting or Increased Expression|
|BC200||BCYRN1||Bc1||2p21||200||Breast cancer||Regulation of protein biosynthesis||(70)|
|HIF1A-AS2||aHIF||NA||14q23.2||2051||Multiple cancers||Decoy of mRNA||(71)|
|HOTAIR||Gm16258||Hotair||12q13.3||2364||Multiple cancers||Epigenetic silencing of HOXD gene through histone H3K27 methylation||(72)|
|HULC||NA||6p24.3||500||Hepatocellular carcinoma||Post-transcriptional regulation||(73)|
|KRASP1||NA||6p12-p11||5178||Prostate cancer||Decoy of miRNA||(75)|
|L1PA16||VL30-1a)||3q26.3||833||Many tumor cell lines||Activation of proto-oncogene||(76)|
|MALAT1||Neat2||Malat1||11q13.1||8708||Multiple cancers||Control of RNA processing||(19, 77)|
|MER11C||HERVK11||VL30-1a)||11p11.1||1060||Many tumor cell lines||Activation of proto-oncogene||(76)|
|SRA1||Sra1||5q31.3||1955||Breast cancer||Activation of nuclear receptors||(81)|
|TERC||Terc||3q26||451||Multiple cancers||Telomere template||(82)|
|UCA1||CUDR||NA||19p13.12||1591||Bladder cancer||Regulation of cell cycle||(83)|
|WT1-AS||WIT1||NA||11p13||1333||Wilms' tumor, AML||Downregulation of WT1, tumor suppressor||(84)|
|XIST||Xist||Xq13.2||19271||Multiple cancers||X inactivation||(56, 85)|
|Tumor Suppressing or Decreased Expression in Tumor|
|AK023948||NA||8q24||2807||Papillary thyroid carcinoma||NA||(24)|
|ANRIL||CDKN2BAS, p15AS||NA||9p21||944||Prostate cancer, breast cancer, melanoma, and other tumors||Regulation of epigenetic transcriptional repression||(58)|
|DLEU2||Dleu2||13q14.3||2768||Chronic lymphocytic leukemia||pri-miRNA for miR15a and miR16||(86)|
|GAS5||Gas5||1q25.1||651||Breast cancer||Decoy of glucocorticoid receptor||(87)|
|H19||H19||11p15.5||2322||Wilms' tumor||Epigenetic regulation through DNA methylation||(88)|
|KCNQ1OT1||LIT1, KvLQT1-AS, KvLQT1OT1||Kcnq1ot1||11p15||91671||Embryonal cancer associated with Beckwith-Wiedemann syndrome||Epigenetic imprinting through H3K27 methylation||(57)|
|MEG3||Gtl2||Meg3||14q32||1595||Glioma, pituitary adenoma and other tumors||Regulation of p53 target proteins||(89)|
|NDM29||29A||NA||11p15.3||131||Neuroblastoma||Induction of neuronal-like properties||(18)|
|p53 mRNA||Tp53||17p13.1||19144||Multiple cancers||RNA protein binding, MDM2||(90)|
|PTENP1||NA||9p21||3932||Prostate cancer||Decoy for PTEN-targeting miRNAs||(75)|
|RMRP||Rmrp||9p21-p12||267||Leukemia and lymphoma||Mitochondrial RNA processing endoribonuclease, hTERT-dependent||(91)|
|TERRA||TelRNAs||telomere repeats||NA||Many cancer cell lines||Interaction with the TRF1||(92)|
|vtRNA2-1||NA||5q31.1||100||AML, papillary thyroid cancer||Regulation of RNA dependent protein kinase (pPKR)||(93)|
2. Genetic abnormality observed in acute myeloid leukemia (AML)
AML, which comprises approximately 25% of hematopoietic malignancies, has heterogeneous clinical features and variable responses to contemporary therapy (26). Genetic alterations are often observed in AML cells and the clinical heterogeneity of the disease is considered to reflect the genetic diversity of these cells (27, 28). It is very important to study the genetic mutations in AML cells to fully understand the cause of the disease. However, the genetic lesion(s) responsible for AML, such as the loss or gain of a certain gene, have not yet been fully elucidated. Indeed, the complex features of AML suggest that the genetic cause of this disease is multifactorial (29). Several protein-encoding genes have been identified that are useful for indicating the prognosis of the disease (30-32). These include RUNX1 (AML1)-RUNX1T1 (ETO) and CBFB-MYH11, which are associated with specific chromosomal rearrangements, t(8;21)(q22;q22) and inv(16)(p13;q12)/t(16;16)(p13;q22), respectively. AML with these cytogenetic features (singly or together) represents about 15% of de novo AML. Patients meeting these diagnostic criteria are classified in the favorable clinical outcome group (standard-risk group). Several other chromosomal abnormalities have been recurrently observed, as described in the WHO classification. AML cases with balanced or unbalanced translocations involving the MLL gene located on chromosome 11 are also well documented and are mostly classified in the intermediate-risk group. Meanwhile, AML patients with a normal karyotype and no cytological abnormality include cases classified in the unfavorable (adverse-risk) or intermediate-risk group. Moreover, a genetic abnormality of the FLT3 gene (internal tandem duplication) is found in many AML subtypes and, in combination with a wild-type NPM gene, contributes to poor prognosis (31).
Recently, Paschka and colleagues revealed that the genes encoding the metabolic enzymes isocitrate dehydrogenase 1 and 2 (IDH1/2) are important for diagnosis and prognosis prediction in AML patients (33). These mutations of IDH1/2 change the activity of the enzymes, reducing α-ketoglutarate levels and elevating 2-hydroxyglutarate levels. This results in changes to chromatin structure and destabilization of certain gene-regulatory proteins, including HIF-1 (34). While cytogenetically normal AML patients with an NPM mutation and a normal FLT3 gene tend to show favorable outcomes, AML patients with the same genetic profile but also with an IDH1/2 mutation showed adverse prognosis with poorer remission. IDH1/2 mutation was also found in several other tumors, including glioma (35). Therefore, a combination of genetic alterations resulting in mutation of specific genes, together with cytogenetically apparent chromosomal changes, is important for AML malignancy.
3. AML and CCDC26
In HL-60 cells derived from AML, a small part of chromosome 8 is excised and amplified as an extrachromosomal element, or double minute chromosome (dmin). Dmin is a cytogenetic abnormality infrequently observed in AML. The dmin of HL-60 cells consists of several repeats of an amplification unit (referred to as an amplicon) of about 2 million base pairs. The amplicon, which is derived from several areas of an approximately 4.6 million base pair region of chromosome 8q24, contains an intact MYC oncogene. Besides MYC, several other genes, including CCDC26 and tribbles homolog 1 (TRIB1), are also encoded on the amplicon (Fig. 2). All are actively transcribed in HL-60 cells. The drug-induced differentiation of HL-60 cells suppressed the expression of all these genes, indicating that they might be related to the cancerous nature of the cells. Some types of cancer cell respond to the anticancer drug hydroxyurea by excluding unstable extrachromosomal elements, and the cells then lose their proliferative nature. In HL-60 cells, the original MYC genetic locus remained intact after dmin was excluded, but was no longer transcribed (36). These observations suggest that gene expression from dmin, with its altered DNA structure, differs from expression from the intact chromosome, and can be interpreted as being due to aberrant gene expression from dmin (including the MYC oncogene). Interestingly, in HL-60 cells, the CCDC26 gene on dmin is rearranged as a result of chromosomal rejoining and is amplified in an incomplete form to produce abnormal transcripts (37).
A common change occurs at the CCDC26 locus in cytologically dmin-positive AML patients. This chromosomal change occurs at a position consistent with the amplified region observed in HL-60 cells (38, 39). Furthermore, destruction of the internal structure of the CCDC26 gene seems to underlie the common mechanism behind the generation of dmin-positive AML cells.
A comprehensive genome-wide study of a group of childhood AML patients revealed that CCDC26 was one of the genes with the highest increase in copy number in AML cells. Radtke and colleagues investigated copy number alteration (CNA) in pediatric AML using a comprehensive single nucleotide polymorphism (SNP) array analysis. They found the most common CNA, in 14% (15 of 111) of pediatric AML patients, to be in chromosome band 8q24 with a low-burden copy number increase (2.83-3.77 copies) (40). These included cases of trisomy 8, which frequently occurs in AML (41). The minimum altered region common to all 15 of these patients was located in a 20-megabase region of 8q24, which contains CCDC26.
Originally, CCDC26 was reported as a gene associated with differentiation and apoptosis of PLB985 cells (an HL-60 subclone) following induction by treatment with retinoic acid (CCDC26 is also known as RAM, retinoic acid modifying). In cells that had become resistant to differentiation and apoptosis after retroviral infection, the viral genome was found to be inserted in the intron of CCDC26. Retinoic acid promotes differentiation and apoptosis not only of many leukemia cells but also of neuroblastoma and glioblastoma cells through transcriptional regulation of many other genes. CCDC26 may have a role with retinoic acid in differentiation and growth arrest of these cells (42).
4. Glioma and CCDC26
Primary brain tumor (PBT) is a disease with an incidence of 12 in 100,000 per year. Glioma accounts for a major part of PBT and comprises cases with different grades of malignancy, namely (I) benign glioma, (II) diffuse astrocytoma, (III) anaplastic astrocytoma and (IV) glioblastoma (43). Although many genetic abnormalities have been reported in gliomas, a single critical lesion responsible for tumorigenesis has not been found. Among these abnormalities, mutations occur in genes for DNA repair enzymes, including PRKDC, XRCC, PARP1, MGMT, ERCC1 and ERCC2, and in genes for epidermal growth factor and the inflammatory cytokine IL-13. Furthermore, over-expression or amplification of the epidermal growth factor receptor gene and deletion of p16INK are correlated with poor survival (43). A genome-wide association study using SNPs revealed the association of several genes with glioma, including the telomerase-regulating gene TERT, RTEL1, the tumor suppressor genes CDKN2A/2B, pleckstrin homology-like domain family B member 1 (a protein with unknown function) and CCDC26 (44). The CCDC26 gene locus was strongly linked with glioma by several SNPs, including rs4295627, rs16904140, rs6470745, rs891835, and rs10464870 (see Fig. 3a). A different SNP in the intergenic region bordering CCDC26, rs987525, was linked to cleft palate (45). Notably, cleft palate is also a risk factor for PBT. CCDC26 is, therefore, a potential common factor in both conditions. CCDC26 is just one of the risk factors for glioma, and other genetic risk factors increase glioma incidence cumulatively. Therefore, there might be a synergistic effect with other genetic risk factors (46). CCDC26 is not necessarily a risk factor for high grade (III-IV) glioma (47). Interestingly, in concordance with the situation for AML, the CCDC26 genotype is associated with IDH1/2 mutation in low grade glioma. Considering the synergy of CCDC26 with IDH1/2, CCDC26 may be linked to a subpopulation of gliomas of relatively lower grade (46, 48).
The Gene Expression Omnibus database (GEO; http://www.ncbi.nlm.nih.gov/geo/) (49) contains data showing altered CCDC26 expression between normal and tumorigenic cells. Expression of CCDC26 is higher in the myeloid leukemia cell lines KG-1, THP-1 and U937 than in normal monocytes (GEO dataset accession ID: GDS2251), and is higher in sporadic basal-like cancer than in normal cells (GDS2250). On the other hand, CCDC26 expression is decreased in hyperplastic enlarged lobular units, considered the earliest precursors of breast cancer, compared with normal units (GDS2739). Increased CCDC26 expression is associated with malignancy progression in some cancerous cells. CCDC26 expression was increased in CD133 positive neurosphere-like glioma cell lines compared with CD133 negative adherent glioma cell lines (GDS2728), and was increased in alveolar macrophages of cigarette smokers compared with macrophages of non-smokers (GDS3496). Increased expression of CCDC26 might mean this gene is tumorigenic or oncogenic. However, the relationship of altered CCDC26 expression to malignancy is still ambiguous.
5. Overview of the CCDC26 genetic locus
As described in the previous section, all SNPs associated with glioma, and a retrovirus insertion site where virus insertion makes AML cells resistant to retinoic acid (42), are located in the intron of CCDC26 (Fig. 3a). Exon 4, which encodes the majority of a hypothetical open reading frame (ORF), is not amplified in pediatric AML or in AML-derived HL-60 cells. The exonic sequence of CCDC26 is not well conserved in other species, including mouse, and the ORF has no homology with known proteins. These data strongly suggest that CCDC26 does not function as a protein-encoding RNA; rather, it functions as an ncRNA. Highly conserved regions in the intron sequence of CCDC26 suggest the existence of another intronic ncRNA. As mentioned above, the CCDC26 locus is rearranged in the genome of HL-60 cells. It is plausible that the ncRNA encoded by this locus is important for the growth of these cells.
A short putative ORF encoding a protein with a length of 109 amino acids is present in the CCDC26 exons; there is no other ORF of more than 50 amino acids. The protein itself, however, has not been observed. Moreover, orthologous proteins are not found in any other organism. For example, a loosely homologous sequence of human exon 4, found in the mouse chromosome 15 region of conserved synteny, with an ORF of 94 amino acids, is actively transcribed in mouse leukemia cells (T. Hirano unpublished observation). However, this ORF is completely different from the human sequence and even contains frame shift alterations (Fig. 3b). This indicates that the putative protein encoded by CCDC26 has no conserved function among species. Although this ORF may be coincidental due to the absence of stop codons, an interesting possibility is that this unique protein has newly emerged during human evolution. mRNA stability is influenced by whether an ORF is encoded, because nonsense-mediated RNA decay, a mechanism associated with quality control of mRNA, rapidly degrades mRNAs that are not useful as templates for protein synthesis. Absence of an ORF in an mRNA promotes degradation by this mechanism; the existence of a CCDC26 protein, however, would prolong the lifetime of CCDC26 mRNA and may maintain the function (if any) of the RNA itself.
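The kind of ORF survey described above (one ORF of 109 amino acids, none other above 50) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by the authors: it scans only the forward strand, treats an ORF as an ATG followed in frame by a stop codon, and uses a hypothetical function name and a toy sequence, since the CCDC26 exon sequence itself is not reproduced here.

```python
# Minimal ORF scan (illustrative assumptions: forward strand only,
# ATG-to-stop definition, three reading frames, hypothetical names).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=50):
    """Return sorted (start, end, n_codons) tuples for ATG..stop ORFs.

    start is the 0-based index of the ATG, end is the index just past the
    stop codon, and n_codons counts the codons from the ATG up to, but not
    including, the stop codon. Only ORFs with at least min_codons codons
    are reported.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # resume searching after the stop codon
    return sorted(orfs)


# Toy sequence: an ATG, four GCT codons and a TAA stop, in the third frame.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(find_orfs(toy, min_codons=5))
```

With the default `min_codons=50`, a call on a spliced exon sequence would list only ORFs at least as long as the 50-amino-acid cut-off mentioned in the text; applying it to the reverse complement as well would cover the opposite strand.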
Because of the considerable length of the CCDC26 intron (330 kbp, versus 1200 bp of exons), it is very difficult to ignore the possibility that there is another transcript(s) within this intron with an important function. Possible transcripts encoded within the CCDC26 exon-intron region are summarized in Fig. 4 and include the mRNA (a), an intronically encoded ncRNA (b), intronic lariat RNA (c-d) and an miRNA independently transcribed or processed from the precursor of the CCDC26 mRNA (e). Indeed, there are several regions in the CCDC26 intron where nucleosomal histones undergo high levels of methylation and acetylation, meaning that these locations may be actively transcribed (Fig. 3a). Furthermore, most of these regions are highly conserved among mammals, suggesting that they encode a function. Also, expressed sequence tags other than known spliced CCDC26 mRNAs have been reported in the intron. There are three miRNAs (miR-3669, miR-3673 and miR-3686) in the intron of CCDC26 that are registered in the miRNA database (miRBase; http://www.mirbase.org/) (50). Although their functions are unknown, they may act as oncogenic or tumor suppressive ncRNAs.
6. Hypothetical function of CCDC26 as a non-coding RNA
Although many ncRNAs are registered in databases, only a few have clearly demonstrated functions and detailed mechanisms of action. CCDC26 might be a new ncRNA that is associated with cancer, including AML. Interestingly, expression of an miRNA, miR-21, is observed in many malignant cells, including AML cells (51). Also, phorbol ester-induced differentiation of HL-60 cells into macrophage-like cells is accompanied by up-regulation of miR-21 (52). There are several reports suggesting that miRNAs act as oncogenic or tumor suppressive miRNAs in AML, as reviewed in (53, 54). Recently, Marcucci and colleagues used 305 different probes to search for miRNA expression in favorable and adverse-risk groups of normal karyotype AML (monocytic leukemia). They then used these data to link expression profiles with the cohort analysis of the patients. They identified a certain pattern of miRNA expression in the adverse-risk group and linked the expression level of eight types of miRNA to AML prognosis (55). It is possible that an unknown miRNA in the CCDC26 locus affects cancer malignancy through the regulation of other genes. However, all miRNAs described so far in the CCDC26 locus (miR-3669, miR-3673 and miR-3686) show no expression in leukemia cells and no conservation among mammals, in contrast to other cancer-associated miRNAs, for example miR-21 and let-7, which are actively transcribed and strongly conserved.
Within the CCDC26 intronic region, there are some long regions (>10 kb) that are actively transcribed in leukemia cells (Fig. 3a). They seem to be too long for pri-miRNAs but could encode lncRNAs. Indeed, active transcription occurs in the CCDC26 region in cells derived from AML (T. Hirano unpublished observation), meaning that these transcripts might function as tumor-promoting or oncogenic lncRNAs. In contrast, if the original function of CCDC26, or of lncRNAs associated with CCDC26, was lost by chromosomal abnormality (for example in dmin of HL-60 cells), then they might function naturally as tumor suppressors. Some lncRNAs, including XIST (56), KCNQ1OT1 (57), ANRIL (58) and AIRN (59), are known to suppress (in cis) the expression of neighboring genes. It is well known that genes located in extrachromosomal elements such as dmin are actively transcribed, but the mechanism behind this phenomenon is not well understood (60, 61). Differences between dmin and an intact chromosome are caused by differences in chromatin structure, which is indicated by differences in DNase I hypersensitivity (36). Similarly to other gene-silencing lncRNAs, an ncRNA encoded by the CCDC26 locus might suppress the expression of other nearby genes. The hypothesis that neighboring genes, including the MYC oncogene, are activated when the normal CCDC26 locus structure is destroyed by a chromosomal abnormality could explain the high transcriptional activity of genes in extrachromosomal elements (Fig. 5). Further evidence is needed to determine whether CCDC26 mRNA and/or the transcripts encoded in its intron are oncogenic or tumor suppressive.
7. Future perspectives
The size of the CCDC26 locus, spanning over 330,000 base pairs, makes it difficult to study. If the ORF of the gene is not functional, then it is unclear which part(s) of the locus are functional. Therefore, to study this gene, it is first necessary to determine all transcripts produced by the CCDC26 locus and then to analyze their function. Comprehensive analysis of the transcriptome of the relevant region using tiling microarray analysis is needed. Although lncRNA orthologs are frequently not found between species, homology analysis of this region between human and mouse could be helpful to identify functional sequences. Once transcripts are identified, we will be able to perform in situ hybridization to determine subcellular localization. Knock-down of transcripts will be useful to investigate their functions. Proteins interacting with the RNA transcripts will be identifiable by pull-down assays and mass spectrometry analysis. Finally, gene targeting should be used to investigate the effects of disruption of the region encoding the transcript. It will be of special interest if transcription of neighboring genes (in particular MYC) is activated or inactivated, as this would suggest a regulatory function of the ncRNA encoded in the CCDC26 locus. If an ortholog of the gene is found in mice, making a knock-out mouse of the ncRNA or a transgenic mouse with forced expression of the ncRNA will help to demonstrate its relationship to disease.
In conclusion, the CCDC26 locus is considered to encode an lncRNA involved in tumorigenesis. CCDC26 itself might be an lncRNA, or its intron might contain a functional miRNA or lncRNA. The study of this gene will bring new knowledge to gene regulation and to cancer treatment strategies targeting lncRNAs. Further in vitro and in vivo study is needed to prove the relationship between transcripts from the locus and diseases such as leukemia and glioma.