A growing body of evidence links base modifications to the expression of genes involved in essential metabolic pathways. Among the extensively studied base modification markers, 5-methylcytosine (5-mC) and its oxidative derivatives (5-hydroxymethylcytosine (5-hmC), 5-formylcytosine (5-fC), and 5-carboxylcytosine (5-caC)) occur dynamically in DNA and RNA and are acknowledged as important epigenetic markers in the regulation of cellular biological processes. The modification of cytosine has been characterized biochemically, molecularly, and phenotypically, including elucidation of its methyltransferase complexes (writers), demethylases (erasers) such as the ten-eleven translocation proteins (TETs), and direct interaction proteins (readers). The levels and landscapes of these epigenetic markers in the epitranscriptomes and epigenomes are precisely and dynamically regulated by the fine-tuned coordination of writers and erasers, in accordance with the stages of growth, development, and reproduction that are naturally programmed over the life span. In the mammalian genome, the TET family consists of three members: TET1, TET2, and TET3. The link between aberrant modifications and diseases such as cancers, neurodegenerative disorders, and heart diseases is now appreciated. This review highlights research advances on the writers and erasers of cytosine modifications in the genome, as well as the dual function of TET1 in tumorigenesis as both a tumor suppressor and a tumor promoter. Future research directions are also addressed.
- 5-methylcytosine (5-mC)
- 5-hydroxymethylcytosine (5-hmC)
- DNA methyltransferases (DNMTs)
- DNA demethylase
- 10-11 translocation protein (TET)
- 5-mC binding protein
- 5-hmC binding protein
1. Introduction
Epigenetics is defined as the study of heritable alterations in gene expression that are caused by cellular memory other than variations in the DNA sequence. Epigenetic memories, including dynamic base modifications (DNA methylation/demethylation), histone modifications, chromatin architecture, and noncoding RNAs, keep biological processes on their programmed tracks. Aberrant alterations can lead to developmental abnormalities and to the initiation of diseases such as neurological disorders and cancers, as reviewed in [2, 3, 4, 5, 6, 7, 8]. A micro-event in base modification can trigger a strong “earthquake” in signaling pathways and a consequent alteration of organism phenotypes, or even disease. The most extensively studied modifications are methylation and demethylation at the 5-position of cytosine (5-C).
DNA base modifications such as 5-mC [9, 10, 11, 12, 13, 14] and 5-hydroxymethylcytosine (5-hmC) [15, 16, 17, 18, 19, 20, 21] are acknowledged as the best characterized epigenetic markers in mammalian brains [20, 22, 23, 24] and ES cells [25, 26, 27], where they regulate chromatin structure and, consequently, gene expression. This review focuses mainly on recent advances in methylation/demethylation modifications of 5-C in mammalian genomes, including the methylation/demethylation machineries, methyltransferase complexes (writers), and demethylase complexes (erasers), as well as the distinct functions of TET1 in the regulation of tumorigenesis.
2. Cytosine modifications
To maintain normal life processes, every base modification must be dynamically and tightly regulated in accordance with the stages of growth, development, and reproduction. This regulation covers generation of the modification by methyltransferase complexes (writers), its removal by demethylases (erasers), and its recognition by preferential binding proteins (readers), which translate the epigenetic markers into biochemical effects.
2.1. Methyltransferases of cytosine methylation
DNA methylation, particularly the most abundant CpG methylation marker 5-mC, is an essential modification of DNA in the mammalian genome. It is typically linked with gene silencing and is involved in gene regulation, development, genome defense, and disease. A family of DNA methyltransferases (DNMTs), comprising DNMT1, DNMT2, DNMT3a, DNMT3b, and DNMT3L, is responsible for adding methyl groups to the carbon-5 position of cytosine. The five members are structurally and functionally distinct. The three methyltransferase enzymes DNMT1, DNMT3a, and DNMT3b serve as writers of the de novo CpG methylation pattern during embryogenesis [28, 29], while DNMT1 maintains the parental DNA methylation pattern on the new daughter strand during DNA replication.
Traditionally, DNMT1 was regarded as the maintenance methyltransferase that copies the methylation marks of hemimethylated DNA to the newly synthesized daughter strand during DNA replication, making the enzyme indispensable for dividing progenitor cells [29, 39, 40]. This view is supported by the findings that DNMT1 has a higher affinity for hemimethylated DNA [41, 42] and that knockout of Dnmt1 in the central nervous system is lethal in mice. While Dnmt1 deletion in all dividing somatic cells is also lethal [43, 44, 45, 46, 47], Dnmt1-deficient mouse embryonic stem cells are viable despite the resulting global loss of DNA methylation. Notably, human embryonic stem cells (ESCs) also displayed global demethylation upon Dnmt1 deletion.
However, DNMT1 and DNMT3a can be functionally coupled where a specific context requires it. For example, in the adult brain, both methyltransferases can carry out cytosine methylation in promoter and gene body regions, leading to transcription repression.
While DNMT1 is believed to function mainly in the maintenance of established DNA methylation patterns in normal living cells, in diseased cells such as cancer cells DNMT1 alone is not sufficient to maintain the programmed gene hypermethylation. In such cells, the collaboration of DNMT1 and DNMT3b is indispensable for the maintenance function.
Sirt1 regulates DNA methylation and the differentiation potential of embryonic stem cells by antagonizing Dnmt3l. DNMT2, a tRNA methyltransferase and the most conserved member of the DNMTs, methylates tRNAs to protect them from ribonuclease digestion. More importantly, DNMT2 is functionally linked to sperm small noncoding RNAs (sncRNAs) and is essential for writing the “paternal epigenetic signature” onto sperm RNA. Mechanistically, the DNMT2-conferred m5C in sncRNAs regulates their secondary structure and biological properties, suggesting that sperm RNA modifications could serve as one of the carriers of paternally imprinted epigenetic memories.
2.2. Demethylation and demethylases
Dynamic DNA methylation/demethylation is tightly regulated throughout the life span. DNA demethylation, the removal of a methyl group, is not simply the reverse of methylation. It involves complex metabolic pathways that are indispensable for the reactivation of genes and are directly involved in the pathogenesis of diseases such as cancers and neurological disorders. DNA demethylation can occur through a passive mechanism, an active mechanism, or a combination of both. In the passive mechanism, methylation is diluted and gradually lost from newly synthesized DNA strands over successive rounds of replication. In contrast, the most important active mechanism is believed to be the oxidation of 5-mC catalyzed by the ten-eleven translocation proteins (TETs) in an α-ketoglutarate (α-KG)- and Fe(II)-dependent manner. Several enzymes are acknowledged to be involved in active demethylation, including activation-induced cytidine deaminase (AID), TETs [51, 52], and thymine DNA glycosylase (TDG) [53, 54, 55].
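The passive mechanism can be illustrated with a toy calculation. The sketch below is not taken from the cited studies. It assumes an idealized population in which each replication round keeps the parental strand's methylation and methylates the new strand with a probability `maintenance_efficiency` (a hypothetical parameter introduced here for illustration). With no maintenance, the methylated fraction simply halves each round.

```python
def passive_dilution(initial_fraction, rounds, maintenance_efficiency=0.0):
    """Return the methylated strand fraction before and after each replication round.

    Toy model (an assumption, not a measured kinetic law): after replication,
    half of the strands are parental and keep their marks, and the other half
    are new strands that are methylated with probability
    `maintenance_efficiency` (0.0 = purely passive loss, 1.0 = perfect copying).
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")

    # Fraction of methylation retained per round under the toy model.
    retained = 0.5 + 0.5 * maintenance_efficiency
    fractions = [initial_fraction]
    for _ in range(rounds):
        fractions.append(fractions[-1] * retained)
    return fractions


if __name__ == "__main__":
    # Fully methylated site, no maintenance activity, three replication rounds.
    print(passive_dilution(1.0, 3))
```

Setting `maintenance_efficiency=1.0` keeps the fraction constant, which corresponds to the faithful maintenance role attributed to DNMT1 in Section 2.1.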
5-hmC is generated by TET-mediated oxidation of 5-mC and, once generated, faces several fates. First, 5-hmC can be converted directly to unmodified cytosine through mechanisms involving the base excision repair pathway. Second, a small percentage (~10%) of 5-hmC is converted stepwise to 5-formylcytosine (5-fC) and then to 5-carboxylcytosine (5-caC) [56, 57]. The 5-fC and 5-caC are finally converted into unmodified cytosine with the help of thymine DNA glycosylase (TDG). Finally, in some tissues such as stem cells and adult neurons, high 5-hmC levels are detected, particularly in transcribed regions adjacent to promoters and enhancers, and correlate positively with gene expression. The low turnover rate of 5-hmC in some tissues suggests that, besides serving as an intermediate of active demethylation, stably accumulated 5-hmC forms a dynamic landscape that acts as a distinct epigenetic marker. This landscape may alter local chromatin structure by recruiting protein components with high affinity for 5-hmC-harboring DNA or by repelling those with low affinity for it [59, 60]. Notably, loss of 5-hmC has become a hallmark of cancer cells [61, 62, 63, 64, 65, 66]. In addition, the TET members are acknowledged as tumor suppressors, because Tet gene mutations or deletions have been identified in some tumor tissues.
In the mammalian genome, the TET family consists of three members: TET1, TET2, and TET3. While all three TET members can function as hydroxylases that convert 5-mC to 5-hmC, and then stepwise 5-hmC to 5-fC and 5-fC to 5-caC, their functions in diverse biological pathways depend on developmental stage and tissue type [25, 68].
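The oxidation cascade described above can be written down as a small lookup table. The sketch below is only an illustrative encoding of the steps named in the text (TET-mediated oxidation, and TDG-assisted restoration of cytosine from 5-fC and 5-caC). The names `OXIDATION_STEPS`, `TDG_SUBSTRATES`, `oxidation_path`, and `routes_to_cytosine` are hypothetical, and the direct repair route from 5-hmC is deliberately left out for brevity.

```python
# Substrate -> (product, enzyme family named in the text for that step).
OXIDATION_STEPS = {
    "5-mC": ("5-hmC", "TET"),
    "5-hmC": ("5-fC", "TET"),
    "5-fC": ("5-caC", "TET"),
}

# Bases that the text describes as converted back to cytosine with the help of TDG.
TDG_SUBSTRATES = {"5-fC", "5-caC"}


def oxidation_path(start="5-mC"):
    """Follow the stepwise oxidation from `start` until no further step exists."""
    path = [start]
    while path[-1] in OXIDATION_STEPS:
        product, _enzyme = OXIDATION_STEPS[path[-1]]
        path.append(product)
    return path


def routes_to_cytosine(start="5-mC"):
    """List each route from `start` to unmodified cytosine ("C") via a TDG substrate."""
    path = oxidation_path(start)
    routes = []
    for index, base in enumerate(path):
        if base in TDG_SUBSTRATES:
            # Oxidize up to this base, then hand over to TDG-assisted removal.
            routes.append(path[: index + 1] + ["C"])
    return routes


if __name__ == "__main__":
    print(" -> ".join(oxidation_path()))
    for route in routes_to_cytosine():
        print(" -> ".join(route))
```

Keeping the enzyme label in each table entry makes it straightforward to extend the sketch, for instance by adding the base excision repair route from 5-hmC as a separate entry.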
2.2.1. TET1 and regulation of its target gene expression
Highly expressed in ESCs, primordial germ cells (PGCs), and the inner cell mass of the blastocyst, TET1 has been shown to be mainly responsible for the initial oxidation of 5-mC to 5-hmC and to establish paradoxically dual epigenetic patterns, transcriptional activation and repression, in accordance with the life processes of growth and development. Alternative splicing gives rise to several TET1 isoforms, including the full-length canonical transcript and short transcripts [69, 70, 71, 72, 73]. TET1 expression is regulated by a complex set of factors, including reprogramming factors such as Oct3/4, Nanog, and Myc [68, 70] in early embryos, ESCs, and PGCs, transcription factors in differentiated cells, and STAT3/STAT5 in acute myeloid leukemia (AML).
The full-length TET1 protein is believed to have multiple functions in the regulation of gene expression. In general, TET1 catalyzes the oxidation of 5-mC to 5-hmC, which serves as an epigenetic marker and as an intermediate of active demethylation, leading to transcription activation. Emerging evidence supports both TET1-conferred activation and TET1-conferred repression of its direct target genes at the transcriptional level [75, 76, 77]. At the molecular level, the interaction between TET1 and SIN3A facilitates transcription activation of their target genes. More importantly, interaction has been detected between TET1/TET2 and the E26 transformation-specific (ETS) family, one of the largest transcription factor families. For example, ETS variant 2 (ETV2), an ETS family transcription factor, interacts with TET1/TET2 to recruit the demethylases to the Robo4 promoter for demethylation-mediated transcription activation during endothelial differentiation. More recently, a methyl-CpG-binding domain (MBD) protein, MBD1, was shown to recruit TET1, but not TET2 or TET3, to heterochromatin through its CXXC domain for oxidation of 5-mC to 5-hmC, whereas the resulting 5-hmC releases MBD1 from its binding sites by affinity-based displacement.
On the other hand, TET1 also confers transcription repression of its target genes. It is accepted that TET1-mediated transcription repression does not require the catalytic activity of TET1 in converting 5-mC to 5-hmC, but rather the interaction between TET1 and protein components of repressor complexes. Several mechanisms for TET1-mediated transcription repression have been proposed. First, TET1 binds a large number of polycomb target genes and interacts with SIN3A, the core component of the SIN3A co-repressor complex, leading to transcription repression of their co-target genes via SIN3A-conferred histone deacetylation [76, 80].
The second mechanism of TET1-conferred transcription repression involves the interaction of TET1 with, and its recruitment of, MBD repression complexes such as MBD3 [78, 81], at least in ES cells. The supporting evidence includes the co-localization of TET1 and MBD3 in ESCs, the higher affinity of MBD3 for 5-hmC than for 5-mC, and the association of MBD3 knockdown with a reduced level of 5-hmC as well as enhanced expression of 5-hmC-modified genes.
Several other mechanisms by which TET1 represses transcription have also been uncovered. TET1 is believed to repress polycomb-targeted regulator genes in accordance with the developmental stage by recruiting polycomb repressive complex 2 (PRC2) to the CpG-rich promoters of these genes. Further study indicated that this PRC2-mediated repression requires the catalytic activity that oxidizes 5-mC to 5-hmC, as evidenced by the co-localization of PRC2 with 5-hmC, while TET1 recruits the EZH2/DNMT-containing PRC complex that targets H3K27 methylation.
During the early stages of epiblast differentiation, repression of TET1 target genes is conferred by the interaction between TET1 and JMJD8 and by enhanced expression of the JMJD8 demethylase transcriptional repressor, and does not require the TET1 oxidation activity. Although TET1, TET2, and TET3 are all expressed in gonadotrope-precursor cells, TET1 expression is dramatically decreased in the differentiated cells, with a corresponding increase in the expression of the luteinizing hormone gene (Lhb). The short isoform of TET1, which lacks the N-terminal CXXC domain, binds the H3K27me2/3-enriched region in the upstream promoter of the Lhb gene, downregulating its expression and leading to differentiation deficiency.
3. Distinct functions of TET1 on tumorigenesis
3.1. TET1 functions as an oncogene in some cancers
Initially, because mutations and deletions are the predominant variations of TET proteins, particularly TET1, in human cancer genomes, it was accepted that TET1 functions as a tumor suppressor [61, 65, 66]. Indeed, TET1 and TET3 carry the predominant mutations in some tumors, including colorectal cancer, melanoma, and cutaneous squamous cell carcinoma [88, 89, 90]. However, emerging evidence also connects TET1 overexpression with tumorigenesis, most likely through activation of cancer-specific oncogenic pathways mediated by TET1-conferred hypomethylation [72, 84] (Figures 1 and 2).
3.1.1. TET1 demethylation-associated activation of members of oncogenic pathways
TET1 overexpression is found in about 40% of patients with triple-negative breast cancer (TNBC), which is among the most hypomethylated cancers observed. It leads to hypomethylation of about 10% of the queried CpG islands (CGIs) and to activation of oncogenic pathways including PI3K, EGFR, and PDGF. Thus, TET1 appears to function as a potential oncogene and could serve as a target for intervention therapy. This phenomenon has been observed not only in TNBC but also in MLL-rearranged leukemia, where TET1 is believed to activate downstream oncogenic pathways through its demethylase activity, serving as an oncogene. Additionally, TET1 was demonstrated to regulate, via DNA hypomethylation, the expression of MUC4, a member of the mucin (MUC) family and an essential factor for carcinogenesis and tumor invasion in lung neoplasms, again functioning as a potential oncogene [86, 87].
TET1 functions as an important oncoprotein in acute myeloid leukemia (AML), as evidenced by its high expression level in AML, indicating that efficient inhibition of TET1 expression could serve as a powerful strategy for AML therapy. Drug screening identified two compounds, NSC-370284 and its structural analogue UC-514321, which repress TET1 transcription by directly targeting STAT3/5, the transcriptional activators of TET1. This suggests that compounds targeting the STAT/TET1 axis have potential for efficient therapy of AML.
Full-length TET1 (TET1FL) has a CXXC domain that binds to unmethylated CpG islands (CGIs), allowing TET1 to protect CGIs from aberrant methylation and limiting its ability to regulate genes outside of CGIs. An isoform of TET1 (TET1ALT) that lacks the CXXC domain but retains the catalytic domain is repressed in ES cells and activated in embryonic and adult tissues, in contrast to TET1FL, which is expressed in ESCs and repressed in adult tissues. Aberrant activation of TET1ALT is detected in breast cancer, uterine and ovarian cancer, and glioblastoma, and is associated with worse overall survival in these cancer types. As for the pathogenesis mediated by TET1ALT, this isoform, which is the predominantly activated TET1 isoform in cancer cells, does not protect CGIs from methylation but likely mediates dynamic site-specific demethylation outside of CGIs.
3.1.2. Hypoxia induced promotion of TET1 expression
Hypoxia-induced enhancement of TET1 expression has been reported to upregulate cancer cell migration, invasion, and proliferation via the HIF1α signaling pathway in JEG3 cells, suggesting an oncogenic function of TET1 under hypoxic conditions.
3.1.3. Overexpressed Tet1 mRNA 3’UTRs sequester miRNAs that also target oncogenic transcripts
In gastric cancer (GC) tissues, the transcript levels of the TETs, but not their protein levels, were significantly elevated compared with adjacent normal tissues, suggesting an essential role of the endogenous TET transcripts themselves in gastric carcinogenesis and prognosis. Further study showed that overexpression of the 5’UTRs, coding sequences (CDSs), and 3’UTRs had different effects, with overexpression of the TET 3’UTRs promoting GC growth and proliferation. Given that miR-26 targets the 3’UTRs of both TET1 and EZH2 mRNAs, overexpressed TET mRNAs competitively sequester miR-26 and thereby release the miR-26-mediated repression of EZH2. The resulting activation of EZH2 expression facilitates gastric carcinogenesis and progression (Figure 2).
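The competition for miR-26 described above can be made concrete with a deliberately simple calculation. The sketch below is an assumption-laden toy model, not the analysis used in the cited study: it supposes a fixed, limiting pool of miR-26 that distributes over binding sites in proportion to their abundance, with equal affinity for every site. The function name `mirna_share_on_target` and the site counts are hypothetical.

```python
def mirna_share_on_target(target_sites, competitor_sites):
    """Fraction of a limiting miRNA pool that ends up bound to the target transcript.

    Toy assumptions (not from the source): every binding site has the same
    affinity, and the miRNA pool is small enough that it distributes over
    sites purely in proportion to their abundance.
    """
    if target_sites < 0 or competitor_sites < 0:
        raise ValueError("site counts must be non-negative")
    total_sites = target_sites + competitor_sites
    if total_sites == 0:
        return 0.0
    return target_sites / total_sites


if __name__ == "__main__":
    # Hypothetical abundances: EZH2 3'UTR sites as the target,
    # TET 3'UTR sites as the competitor that is overexpressed.
    ezh2_sites = 100
    for tet_sites in (0, 100, 300):
        share = mirna_share_on_target(ezh2_sites, tet_sites)
        print(f"TET 3'UTR sites = {tet_sites}: miR-26 share on EZH2 = {share:.2f}")
```

Under these assumptions, raising the competitor abundance lowers the share of miR-26 available to repress the target, which is the qualitative behavior the paragraph attributes to TET 3’UTR overexpression.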
3.2. TET1 serves as a tumor suppressor
The TET members contribute to the pathogenesis of various human cancers in versatile ways, functioning as either tumor suppressors or tumor promoters. Hypermethylation-based transcriptional silencing of TET1 is frequently detected in non-Hodgkin B cell lymphoma (B-NHL), suggesting that TET1 is a tumor suppressor in hematopoietic malignancy. Similarly, TET1 is downregulated upon NF-κB activation in multiple cancers, including basal-like breast cancer (BLBC), melanoma, lung cancer, and thyroid cancer, indicating that the tumor suppressor function of TET1 relies on involvement of the immune system.
3.2.1. TET1 demethylation-mediated activation of tumor suppressor genes
It is acknowledged that 5-hmC depletion, caused by either repression of TET1 expression or aberrant TET1 localization, initiates carcinogenesis. The significantly lower 5-hmC and TET1 expression levels and the subcellular mislocalization of TET1 in gastric cancer tissues demonstrate the crucial role of TET1 as a tumor suppressor (Figure 3).
In the tested epithelial ovarian cancer (EOC) samples, TET1 expression was undetectable, suggesting that TET1 repression induces tumorigenesis. This is consistent with the inhibition of colony formation, cell migration, and invasion by ectopic expression of TET1 in SKOV3 and OVCAR3 cells. The potential mechanism is TET1-conferred demethylation and consequent activation of the expression of SFRP2 and DKK1, two key proteins in the canonical Wnt/β-catenin signaling pathway, which is associated with inhibition of EMT and metastasis.
TET1 has also been identified as a key tumor suppressor in ovarian cancer cell lines, where it demethylates a CpG site within the promoter of Ras association domain family member 5 (RASSF5) to enhance RASSF5 expression, leading to growth inhibition of ovarian cancer cells.
Further evidence shows that EGFR-mediated TET1 repression induces silencing of tumor suppressors in cancers such as lung adenocarcinoma and glioblastoma. Only when oncogenic EGFR expression is inhibited can TET1 bind to the promoters of the tumor suppressors and activate their expression via DNA demethylation. TET1 overexpression inhibits lung and glioblastoma tumor growth, and TET1 loss promotes it, in agreement with the significant decrease in TET1 expression or the cytoplasmic localization of TET1 in the majority of lung cancer samples. Thus, it is plausible to speculate that TET1 may serve as a therapeutic target for oncogenic EGFR-induced lung cancers and glioblastomas. However, Lai et al. could not draw the same conclusion from human NSCLC patient samples. They did not detect EGFR-mediated TET1 silencing, but rather observed significantly elevated TET1 expression in patient samples with EGFR mutations, indicating that EGFR-mediated TET1 silencing remains inconclusive across cellular models, animal models, and human lung cancer patients.
Eicosapentaenoic acid (EPA), one of the major polyunsaturated fatty acids, can enhance the formation of a PPARγ-RXRα-TET1 complex that recruits TET1 to a hypermethylated CpG island on the p21 gene for rapid demethylation and consequent expression of p21Waf1/Cip1, leading to inhibition of cell-cycle progression in hepatocarcinoma cells. This suggests that TET1 requires such a bridging complex to exert its anti-tumor function and points to the potential of EPA for the therapy of solid tumors such as liver cancer.
3.2.2. TET1 silencing and loss of 5-hmC induce initiation of tumors
Loss of 5-hydroxymethylcytosine (5-hmC) caused by TET1 dysfunction can induce tumor initiation and enhance malignancy by promoting cancer cell growth, migration, and invasion in DLD1 colon cancer cells, an effect mediated by EZH2. With loss of TET1, EZH2 repression is released while expression of the H3K27 demethylase UTX-1 is repressed, enhancing histone H3K27 tri-methylation and consequently repressing the target gene E-cadherin (CDH1). Accordingly, even under TET1 deficiency, either overexpression of the H3K27 demethylase UTX-1 or depletion of EZH2 can enhance H3K27 demethylation at the CDH1 promoter, thereby impeding EMT and tumor invasion. Likewise, either EZH2 overexpression or UTX-1 depletion can promote EMT and tumor metastasis in DLD1 cells. These results elucidate the regulatory interplay among TET1, E-cadherin, and EZH2 and indicate the critical mediator role of EZH2 in E-cadherin repression and tumor progression.
3.2.3. miRNA-mediated repression of TET1 expression
Some miRNAs are involved in promoting or repressing cancer progression, and one mechanism is oncogenic miRNA-mediated TET1 repression with consequent loss of 5-hmC. Indeed, miR-21-5p has been confirmed to target Tet1 in colorectal cancer (CRC), serving as a diagnostic and prognostic biomarker in CRC. Similarly, miR-4284 directly targets Tet1 mRNA and downregulates TET1 at both the mRNA and protein levels in human gastric cancer SGC-7901 cells. It thus serves as an oncogenic marker and could provide a potential target for gastric cancer therapy.
3.2.4. Dual function of miRNA by interaction with Tet mRNA 3’UTRs
Some miRNAs are reported to function as both a suppressor and a promoter in certain cancers. One example is miR-29b in breast cancer (BC) cells, which regulates BC cell proliferation, metastasis, and epithelial-mesenchymal transition (EMT). Significantly decreased expression of miR-29b in BC samples and cell lines suggests a role for miR-29b as a BC suppressor. However, miR-29b overexpression promotes cell proliferation, colony formation, migration, and EMT, indicating that miR-29b functions as a BC promoter. In vitro assays identified TET1 as one of the miR-29b targets, and overexpression of miR-29b leads to TET1 downregulation-mediated promotion of proliferation, colony formation, invasion, and EMT in BC cells such as MDA-MB-231 and MCF-7. Further study showed that TET1-mediated suppression of BC is attributable to TET1-conferred disruption of ZEB2 expression through binding to the ZEB2 promoter. While the miR-29b/TET1/ZEB2 pathway helps explain the mechanism of miR-29b- and TET1-mediated BC promotion, the suppression mechanism for TET1 remains elusive in GC.
4. Concluding remarks
Over the past decades, and particularly in recent years, significant progress has been made in the epigenetic study of 5-mC and its derivatives 5-hmC, 5-fC, and 5-caC, advancing our understanding of their generation, dynamic alteration, machinery, distribution, and biological functions, as well as the connection between these modifications and the pathogenesis of diseases such as neurological disorders and cancers. However, a large number of epigenetic events related to the pathogenesis of many diseases, particularly cancers, remain elusive. Although the individual members of the methyltransferase complexes (writers) for cytosine modifications have been characterized, their coordination in carrying out methylation in response to tumorigenesis has not yet been comprehensively investigated. Similarly, functional study of the TET proteins (the erasers of methylation) has so far been limited to the conversion of 5-mC to 5-hmC, the identification of targeting miRNAs, and their roles as tumor suppressors or promoters through several known mechanisms. However, it is logical to speculate that such large protein molecules have many more unidentified functions, and further study of these unknown functions will provide essential information for dissecting cancer pathogenesis. First, only limited information is available on the physical interaction partners of the TETs. Identifying TET-interacting proteins may help us better understand how and where the TETs are recruited to function as demethylases that maintain the dynamic balance of 5-mC/5-hmC and chromatin remodeling. Second, identifying functions of the TETs other than demethylation will be important. Given that 5-hmC serves not so much as an intermediate of demethylation as a dynamic and important epigenetic landscape, it is essential to investigate how the epigenetic information stored in this landscape is transformed into biological effects.
To this end, identifying the readers of the 5-hmC modification, that is, the specific 5-hmC binding proteins, may be a prerequisite. A better understanding of the functions of the methyltransferase complexes for cytosine methylation, of the TETs in 5-mC demethylation together with their interacting protein components and other functions, and of the specific readers of the 5-hmC marker could identify epigenetic components that serve as therapeutic targets for cancers and other diseases such as neurological disorders.
It has been reported that Tet1 alternative splicing forms have distinct functions. However, information on Tet1 alternative splicing is still limited. Further study of alternative splicing may identify additional functions conferred by the different isoforms, which may have potential as therapeutic targets.
Additionally, a chemical biology approach based on the identification of small-molecule compounds that target the 5-mC/5-hmC machineries, or the signaling pathways in which 5-mC/5-hmC are involved, could help explore therapeutic targets for intractable diseases such as cancers and neurological diseases.