Recent Insights into the Mechanisms of De Novo and Maintenance of DNA Methylation in Mammals

DNA methylation is one of the key epigenetic mechanisms essential for transcriptional regulation, silencing of transposable elements, and genome stabilization. Under physiological conditions, DNA methylation is erased and then established genome-wide during gametogenesis and embryogenesis. De novo DNA methylation, catalyzed by the de novo DNA methyltransferases (DNMTs) DNMT3A and DNMT3B, occurs after the erasure and establishes DNA methylation patterns specific to each germ cell type or somatic cell type. Once cell type-specific DNA methylation patterns are established during embryogenesis, a process that can extend into early childhood, the maintenance DNA methyltransferase DNMT1 and its cofactor UHRF1 cooperatively maintain the patterns throughout the individual’s lifetime. Recently, our group found that UHRF1 is also involved in de novo DNA methylation during oogenesis. Moreover, our group has identified two genes, CDCA7 and HELLS, as causative genes of ICF syndrome, which is characterized by hypomethylation of centromeric and pericentromeric repetitive sequences. Because CDCA7 and HELLS form a chromatin remodeling complex, there are evidently certain regions where chromatin remodeling is required to achieve maintenance of DNA methylation. This chapter summarizes the current understanding of the mechanisms of de novo and maintenance DNA methylation under physiological conditions in mammals.


Introduction
Methylation at the C5 position of cytosine (i.e., 5mC) in the CpG context (hereafter called DNA methylation) plays a major role in the transcriptional regulation of gene expression, the silencing of transposable elements (TEs), and the maintenance of genome integrity. The enzymatic activities catalyzing DNA methylation can be classified into two types. One is de novo DNA methylation, which is an activity by which [...] The erasure of DNA methylation in primordial germ cells (PGCs) is probably the result of a defect in maintenance of DNA methylation, caused by the diminished expression of UHRF1 in the cells [4]. After the demethylation, DNMT3A establishes the methylation pattern in oocytes arrested at an early stage of the first meiotic division or in prospermatogonia arrested at the G1 phase [8]; it does so in combination with DNMT3L, which itself does not possess enzymatic activity but is indispensable for the activity of DNMT3A [5][6][7]. Although the major role of UHRF1 is in the maintenance of DNA methylation (Section 2.2), our group has recently found that UHRF1 is involved in 25% of the genome-wide de novo DNA methylation in oocytes [9]. The absence of the UHRF1 protein preferentially decreased DNA methylation levels at transcriptionally inactive regions lacking the histone H3 lysine 36 trimethylation (H3K36me3) mark. Given that only a small percentage decrease in DNA methylation was observed in DNMT1 KO oocytes [10] and that UHRF1 has the potential to interact with de novo DNMTs [11], UHRF1 may cooperate with DNMT3A for the establishment of methylation patterns. Despite the involvement of UHRF1 in de novo DNA methylation in oocytes, our group found that UHRF1 is localized mainly in the cytoplasm of oocytes [9]. Stella (also known as DPPA3 and PGC7) is localized in both the cytoplasm and the nucleus. Cytoplasmic Stella was recently reported to contribute to the cytoplasmic localization of UHRF1 in oocytes, which prevents aberrantly excessive de novo DNA methylation by the UHRF1 protein complex [12]. Nuclear Stella is also reported to inhibit the association of UHRF1 with chromatin, suggesting a possible double-layer mechanism that prevents aberrant de novo DNA methylation by the complex [13].
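The idea that a maintenance defect alone can erase DNA methylation in PGCs can be made concrete with a simple dilution calculation. The sketch below is only an illustration under stated assumptions, not a model taken from the cited studies: it assumes that the nascent strand starts unmethylated after each replication, that maintenance restores a fixed fraction of hemi-methylated CpGs (the hypothetical parameter `maintenance_efficiency`), and that neither de novo methylation nor active demethylation takes place. Under these assumptions the per-strand methylation fraction is multiplied by (1 + efficiency) / 2 at every division.

```python
def expected_methylation(m0: float, divisions: int, maintenance_efficiency: float) -> float:
    """Expected per-strand CpG methylation fraction after a number of cell divisions.

    Assumptions (illustrative only):
      - after replication the parental strand keeps its methylation and the
        nascent strand starts unmethylated;
      - maintenance methylates a nascent CpG opposite a methylated parental CpG
        with probability `maintenance_efficiency`;
      - no de novo methylation and no active demethylation occur.
    """
    if not 0.0 <= m0 <= 1.0:
        raise ValueError("m0 must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")

    # Average of the parental strand (fraction m) and the nascent strand
    # (fraction m * efficiency) gives m * (1 + efficiency) / 2 per division.
    retained_per_division = (1.0 + maintenance_efficiency) / 2.0
    return m0 * retained_per_division ** divisions


if __name__ == "__main__":
    for efficiency in (1.0, 0.5, 0.0):
        levels = [round(expected_methylation(0.8, n, efficiency), 3) for n in range(5)]
        print(f"efficiency={efficiency}: {levels}")
```

With perfect maintenance the starting level is preserved, whereas with no maintenance the level halves at each division, which is the passive dilution expected when UHRF1 expression is diminished.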
During post-implantation embryogenesis and early childhood, not only DNMT3A but also DNMT3B is essential for establishing the characteristic methylation pattern [14]. These enzymes may work together or independently to establish specific DNA methylation patterns in each cell type. However, it remains to be determined when the establishment of the methylation pattern is completed, although the timing probably depends on the cell type. The "developmental origins of health and disease" (DOHaD) is a concept that has emerged over the past three decades, linking the risk of diseases in later childhood and adult life with environmental conditions in early life, including nutrient availability to the mother. Accumulating evidence suggests that the environment can change the epigenetic state of the fetus and infant, including DNA methylation, and that the altered state is maintained throughout the lifetime of the individual [15]. A well-known experiment showed that early-life experience permanently alters behavior and physiology: interactions between rat mothers and their offspring, including the licking and grooming of the pups by their mother in the first week of life, altered the DNA methylation status of the glucocorticoid receptor promoter in the hippocampus of the offspring, resulting in differential stress tolerance among the offspring [16]. This indicates that the establishment of DNA methylation is not complete by the first week after birth, at least in the hippocampal neurons of the rat.

Specification of de novo DNA methylation sites
The mechanisms underlying the specification of the genomic regions targeted by de novo DNMTs have remained largely elusive. In oocytes, a significant positive correlation between transcription and highly methylated regions has been reported [17]. It is known that transcriptionally active regions are marked with H3K36me3 and that the histone methyltransferase SET domain containing 2 (SETD2) is responsible for this histone methylation in oocytes [18]. Since SETD2 is reported to interact with the phosphorylated C-terminal domain of RNA polymerase II (RNA pol II) [19], SETD2 appears to methylate histones at regions actively transcribed by the polymerase. In turn, the PWWP domain of DNMT3A recognizes H3K36me3 [20], and mutations in this domain that disrupt the recognition cause microcephalic dwarfism with aberrant DNA methylation in humans and in a mouse model [21,22]. Oocyte-specific SETD2 KO also causes aberrant DNA methylation [23]. Taken together, it appears that SETD2 methylates H3K36 in step with transcription by RNA pol II, and that DNMT3A recognizes the histone mark and methylates the DNA, resulting in the establishment of DNA methylation patterns specific to oocytes (Figure 2). However, there are exceptions. For example, as described above, UHRF1 is involved in 25% of the genome-wide de novo DNA methylation, mostly at transcriptionally inactive regions lacking the H3K36me3 mark [9]. It is still unknown which factors trigger transcription in oocytes, although transcription from long terminal repeat (LTR)-retrotransposons, whose methylation is erased in PGCs, could be one such trigger [24].
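The targeting logic described above for oocytes can be written down as a small rule set. The following is a deliberately simplified sketch, not an implementation of any published model: the `Region` class, the `uhrf1_dependent` flag, and the function names are hypothetical, and the rules reduce each factor to a yes/no switch. In particular, real SETD2 or PWWP defects cause aberrant methylation patterns rather than a clean loss, which this toy version does not capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A genomic region in an oocyte (hypothetical, for illustration only)."""
    name: str
    transcribed: bool              # actively transcribed by RNA pol II
    uhrf1_dependent: bool = False  # assumed flag for the H3K36me3-independent route


def h3k36me3_marked(region: Region, setd2_present: bool = True) -> bool:
    """SETD2 deposits H3K36me3 at regions transcribed by RNA pol II."""
    return setd2_present and region.transcribed


def de_novo_methylated(region: Region,
                       setd2_present: bool = True,
                       uhrf1_present: bool = True) -> bool:
    """Return True if the toy rules predict de novo methylation of the region."""
    # Route 1: DNMT3A (via its PWWP domain) reads H3K36me3 and methylates the DNA.
    via_h3k36me3 = h3k36me3_marked(region, setd2_present)
    # Route 2: UHRF1-dependent methylation at inactive regions lacking H3K36me3.
    via_uhrf1 = (uhrf1_present
                 and region.uhrf1_dependent
                 and not h3k36me3_marked(region, setd2_present))
    return via_h3k36me3 or via_uhrf1


if __name__ == "__main__":
    regions = [
        Region("transcribed gene body", transcribed=True),
        Region("silent region", transcribed=False),
        Region("silent UHRF1-dependent region", transcribed=False, uhrf1_dependent=True),
    ]
    for r in regions:
        print(f"{r.name}: methylated={de_novo_methylated(r)}")
```

Switching `setd2_present` or `uhrf1_present` to `False` shows which regions each route accounts for in this simplified picture.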
During embryogenesis, transcription factors probably define certain transcribed regions in each cell type, since only four transcription factors (OCT3/4, SOX2, KLF4, and MYC), together known as OSKM or Yamanaka factors, can drive drastic transcriptional change and define epigenetically active regions in differentiated cells, resulting in induced pluripotent stem (iPS) cells [25]. DNMTs can access regions where the transcription factors are absent, so that regions for DNA methylation are specified passively (Figure 3). Noncoding RNAs, such as PIWI-interacting RNAs (piRNAs) and long noncoding RNAs (lncRNAs), can also contribute to the specification of regions for DNA methylation (Figure 3). piRNAs (26-31 nucleotides) are the largest class of small noncoding RNAs expressed in animal cells. They were first discovered in Drosophila as RNAs interacting with the PIWI protein; the human and mouse homologs are HIWI and MIWI, respectively. In most cases, precursor piRNAs are derived from piRNA clusters in the genome composed of mutated TEs. The precursor piRNAs are processed in several steps and matured by the addition of a methyl group at their 3′ ends [26]. The mature piRNAs then interact with Argonaute (AGO) family proteins and cleave the transcripts of TEs, which are undesirably transcribed as a result of the erasure of DNA methylation in PGCs [26]. Although the underlying mechanisms are unknown, piRNAs also silence these TEs by epigenetic modifications, including DNA methylation, especially during spermatogenesis [27]. In addition, lncRNAs can specify regions that acquire de novo DNA methylation. X-inactive specific transcript (XIST) is one of the best-studied lncRNAs. XIST RNA is randomly expressed from one of the two X-chromosomes in mammalian female cells during embryogenesis and coats that X-chromosome in cis to trigger silencing of most genes on it by several layers of epigenetic modifications, including DNA methylation, to achieve dosage compensation [28,29].
The current consensus is that maintenance of DNA methylation operates as follows. After DNA replication, UHRF1 directly recognizes hemi-methylated DNA and mono-ubiquitylates histone H3 at K14, K18, and K23 to recruit DNMT1 to the hemi-methylated sites. Then, DNMT1 recognizes two of the three ubiquitylated histone lysine residues through its replication foci targeting sequence (RFTS) domain and methylates the nascent strand of the hemi-methylated DNA, resulting in the maintenance of the methylation patterns (Figure 4). The deubiquitylation of histones by ubiquitin specific peptidase 7 (USP7) has been reported to be required immediately prior to the methylation of hemi-methylated DNA by DNMT1 [42]. DNA ligase 1 (LIG1), which is critical for joining Okazaki fragments [43], is also involved in this process [31]. Euchromatic histone lysine methyltransferase 2 (EHMT2, also called G9a) and EHMT1 (also called GLP) methylate K126 of LIG1. UHRF1 recognizes the methylated LIG1, and this interaction facilitates the recruitment of UHRF1 to DNA replication sites. Since LIG1 is indispensable for completing lagging strand synthesis, the interaction between UHRF1 and LIG1 may be especially important for maintenance of DNA methylation on that strand (Figure 4).
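The order of events in this pathway can be sketched as a short simulation. This is a toy illustration under several assumptions rather than a faithful biochemical model: each CpG is a boolean, "two of the three" ubiquitylated residues is interpreted as "at least two", and the USP7 and LIG1 steps are omitted. The function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Histone H3 lysines mono-ubiquitylated by UHRF1, as described in the text.
H3_UBIQUITYLATION_SITES = ("K14", "K18", "K23")


def replicate(parental: Sequence[bool]) -> Tuple[List[bool], List[bool]]:
    """Semi-conservative replication: the nascent strand starts unmethylated."""
    return list(parental), [False] * len(parental)


def maintain(parental: Sequence[bool],
             nascent: Sequence[bool],
             uhrf1_present: bool = True,
             dnmt1_present: bool = True,
             ubiquitylated_sites: Sequence[str] = H3_UBIQUITYLATION_SITES) -> List[bool]:
    """Return the nascent strand after one round of maintenance methylation."""
    restored = list(nascent)
    for i, (parent_me, nascent_me) in enumerate(zip(parental, nascent)):
        # Only hemi-methylated CpGs are substrates; unmethylated CpGs are left
        # alone, so no de novo methylation is introduced here.
        if not (parent_me and not nascent_me):
            continue
        # Step 1: UHRF1 recognizes the hemi-methylated site and ubiquitylates H3.
        marks = set()
        if uhrf1_present:
            marks = set(ubiquitylated_sites) & set(H3_UBIQUITYLATION_SITES)
        # Step 2: the DNMT1 RFTS domain reads the ubiquitylated residues
        # (assumed here: at least two) and DNMT1 methylates the nascent strand.
        if dnmt1_present and len(marks) >= 2:
            restored[i] = True
    return restored


if __name__ == "__main__":
    pattern = [True, False, True, True]
    parental, nascent = replicate(pattern)
    print("wild type:", maintain(parental, nascent))
    print("UHRF1 absent:", maintain(parental, nascent, uhrf1_present=False))
```

In this sketch the parental pattern is copied faithfully when both factors are present, whereas removing UHRF1, removing DNMT1, or leaving only one ubiquitylated residue leaves the nascent strand unmethylated.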

Maintenance of DNA methylation by the CDCA7/HELLS chromatin remodeling complex
The cell division cycle-associated 7 (CDCA7)/helicase lymphoid-specific (HELLS) chromatin remodeling complex is also involved in maintenance of DNA methylation. Recently, an international group that included ours identified CDCA7 and HELLS (also known as LSH) as the causative genes of the immunodeficiency, centromeric instability, facial anomalies (ICF) syndrome type-3 and type-4 (hereafter ICF3 and ICF4), respectively [44]. The syndrome is a rare autosomal recessive disorder characterized by reduced immunoglobulin levels in the serum and recurrent infection [45]. Centromeric instability manifests as stretched heterochromatin, chromosome breaks, and multiradial configurations involving the centromeric/pericentromeric regions of chromosomes 1, 9, and 16 in activated lymphocytes [46], and the cytological defects are accompanied by DNA hypomethylation in pericentromeric satellite-2 and -3 repeats of these chromosomes.
Patients with ICF syndrome are classified into two groups [47]. One group comprises ICF syndrome type-1 (ICF1), which shows DNA hypomethylation only at the pericentromeric repeats. The causative gene for this group is DNMT3B [1,48,49]. The second group comprises ICF syndrome type-2, type-3, and type-4 (ICF2, ICF3, and ICF4, respectively), which show DNA hypomethylation at centromeric α-satellite repeats in addition to the pericentromeric repeats. As described above, the causative genes for ICF3 and ICF4 are CDCA7 and HELLS, respectively [44]. The causative gene for ICF2 is zinc finger and BTB domain containing 24 (ZBTB24) [50]. As ZBTB24 is a transcriptional activator of CDCA7 [51,52], and CDCA7 and HELLS constitute a chromatin remodeling complex in which CDCA7 stimulates the nucleosome remodeling activity of HELLS [53], the same pathway seems to be disrupted in ICF2, ICF3, and ICF4.
A recent study revealed that, in addition to centromeric and pericentromeric repeats, DNA methylation levels of other heterochromatic late-replicating regions are affected in ICF2, ICF3, and ICF4 patients, though not in ICF1 patients [54]. As UHRF1 KO and DNMT1 KO cause hypomethylation of the entire genome, including centromeric and pericentromeric repeats [2], the DNMT1/UHRF1 complex is surely essential for maintaining these regions. However, the CDCA7/HELLS complex seems to be required for assisting the DNMT1/UHRF1 complex to methylate hemimethylated DNA, possibly by sliding nucleosomes in a region-specific manner [53]. Supporting this idea, our group detected an interaction between CDCA7 and UHRF1 [55]. Late-replicating regions tend to be heterochromatic regions, where the nucleosome density is high. Therefore, the CDCA7/HELLS chromatin remodeling complex may be required for such regions (Figure 4).
Using human embryonic kidney 293 cells, our group reported that DNMT3B KO caused a slight decrease in DNA methylation of pericentromeric repeats after 4 months of KO by the CRISPR/Cas9 system, while CDCA7 KO and HELLS KO caused drastic decreases in DNA methylation even after only 2 months [55], indicating that the CDCA7/HELLS chromatin remodeling complex is essential for maintaining the DNA methylation of the repeats, whereas the requirement of DNMT3B for the maintenance is limited in differentiated cells. In the CDCA7 KO and HELLS KO cells, DNA methylation levels of centromeric repeats were also decreased, but the level of decrease was much less than that of pericentromeric repeats. This indicates that the CDCA7/HELLS complex is less essential for maintenance of DNA methylation of centromeric repeats. Because the chromatin structure, density of nucleosomes, and histone variants are different between centromeric and pericentromeric regions, these differences may determine the levels of requirement for the chromatin remodeling complex. In addition, it has been reported that nucleosomes and the linker histone H1 are barriers to access of DNMTs to DNA and that HELLS and deficient in DNA methylation 1 (DDM1), a plant homolog of HELLS, are required for the methylation of DNA wrapped around nucleosomes [56,57]. Consistent with these reports, the most abundant proteins co-immunoprecipitated with human CDCA7 were histone H1 and core histones in our group's report [55]. The interaction between the CDCA7/HELLS complex and histone H1 may also be a cue to identify regions where the complex is required for maintenance of DNA methylation (Figure 4).

Maintenance of DNA methylation by the proteins associated with multilocus imprint disorder
Mutations in genes encoding zinc finger protein 57 (ZFP57) and components of the subcortical maternal complex (SCMC), including NLRP2, NLRP5, NLRP7, PADI6, OOEP, and TLE6, are reported to cause multi-locus imprint disorder, which exhibits DNA hypomethylation at multiple imprinting control regions (ICRs) [58][59][60][61]. Since the hypomethylation is observed in both paternally and maternally methylated ICRs, these factors are thought to be involved in maintaining DNA methylation against the genome-wide DNA demethylation that occurs in preimplantation embryos (Figure 1). Mutations in ZFP57 cause transient neonatal diabetes mellitus [61]. ZFP57 is a nuclear protein that recognizes the methylated TGCCGC hexanucleotide found in almost all ICRs and acts together with ZNF445, KRAB-associated protein-1 (KAP1), DNMTs, SET domain bifurcated histone lysine methyltransferase 1 (SETDB1), and heterochromatin protein 1 (HP1) [62,63]. ZFP57 is therefore considered to maintain DNA methylation by binding directly to ICRs together with these proteins. However, the mechanism by which SCMC components, which are localized adjacent to the oocyte membrane, can maintain DNA methylation at ICRs remains elusive [59]. Among the multi-locus imprint disorder cases, only one case carrying a heterozygous mutation (V159M in isoform 1, V172M in isoform 2) in the tandem Tudor domain (TTD) of UHRF1 has been reported [60].

Conclusions
I identified UHRF1 as a novel methyl-CpG binding protein in 2004 by a biotin-avidin pulldown assay using biotin-labeled methylated DNA mixed with nuclear extracts and subsequent mass spectrometric analysis [64,65]. Since then, the understanding of the mechanism by which maintenance of DNA methylation is achieved has expanded and deepened quickly, progress that I would never have imagined at that time. Each time a new finding was reported, namely the involvement of UHRF1 in maintenance of DNA methylation [2], the recognition of hemi-methylated DNA by UHRF1 [32,34,35], and the ubiquitylation of histone H3 by UHRF1 [36], I felt that the mechanism of maintenance of DNA methylation had been resolved. However, the mechanism is more complicated than expected, and more factors could still be involved in assisting the DNMT1/UHRF1 complex, depending on context such as replication timing, replication strand, and higher-order chromatin structure. We still cannot take our eyes off advances in this field.