At the DNA replication step during cell division, not only fundamental information (i.e. nucleotide sequence) but also superficial information (i.e. “epigenetic” modifications) is faithfully reproduced on the newly synthesized DNA sequence. The faithful maintenance of the epigenetic pattern, which determines the gene-expression pattern of the cell, safeguards the maintenance of cell identity.
The term “epigenetics” was first used to describe “the causal interactions between genes and their products, which bring the phenotype into being”, and this definition initially referred to the role of epigenetics in embryonic development, in which cells develop distinct identities despite having the same genetic information. Today, however, epigenetics refers to “the study of heritable changes in gene expression that occur independent of changes in the primary DNA sequence”. This definition is now associated with a wide variety of biological processes, such as genomic imprinting [3,4], inactivation of the X chromosome, embryogenesis, tissue differentiation, and carcinogenesis.
Epigenetic chemical modifications, such as DNA methylation and histone modifications, are known to be faithfully duplicated in each cell cycle, and the chromatin structures are subsequently propagated through DNA replication; however, little is known about how the chromatin structure is maintained during, or reformed after, DNA replication. Furthermore, several lines of recent evidence suggest that the superficial information on the DNA strand is more susceptible to change by environmental stress than the DNA sequence itself. Therefore, for a better understanding of the DNA replication process, it is important for biologists in general, and molecular biologists in particular, to learn about epigenetic mechanisms.
In this chapter, we introduce the current understanding of the DNA methylation mechanism, 5-hydroxymethylcytosine (the sixth base), and histone modifications, together with their significance in congenital and acquired diseases, and we discuss the direction in which this field should proceed in the future.
2. DNA methylation during DNA replication
Not all genes are expressed in every cell of an organism. In any given cell, most genes and genomic regions are programmed to remain repressed, and this defines the identity of the cell. Epigenetic modifications are molecular mechanisms that can preserve the inactive state by regenerating a repressive chromatin structure on the “unnecessary genes and genomic regions” following each round of DNA replication in the cell. DNA methylation is one of the fundamental mechanisms known to be involved in this maintenance process.
Maintenance of the DNA methylation pattern during replication is mediated by DNA nucleotide methyltransferase 1 (DNMT1), which methylates CpG sequences on the newly synthesized strand according to the methylation status of the template strand (Fig. 1). A bridging protein known as UHRF1 (ubiquitin-like, containing PHD and RING finger domains 1), which interacts with both DNMT1 and hemimethylated CpG dinucleotides, is required to maintain the CpG methylation pattern at the replication fork [12,13].
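The template-dependent copying described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which maintenance is perfectly efficient and strand positions are given in template coordinates; the function names `cpg_positions` and `maintenance_methylate` are hypothetical and chosen only for this illustration.

```python
def cpg_positions(seq):
    """Return the 0-based index of the C in every CpG dinucleotide of `seq`."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintenance_methylate(template_seq, template_methylated):
    """Toy model of maintenance methylation after replication (an assumption-laden sketch).

    template_seq: parental strand, written 5'->3'.
    template_methylated: set of indices of methylated cytosines on the parental strand.

    Because CpG is palindromic, the daughter-strand cytosine of a CpG pairs with
    the template G at index i + 1. That cytosine is methylated only when the
    template cytosine at index i is methylated (a hemimethylated site).
    Returns the template-coordinate indices paired with methylated daughter cytosines.
    """
    daughter_methylated = set()
    for i in cpg_positions(template_seq):
        if i in template_methylated:        # hemimethylated CpG after replication
            daughter_methylated.add(i + 1)  # copy the mark to the new strand
    return daughter_methylated


# Three CpG sites (C at indices 1, 5 and 8); only the first and last are methylated.
template = "ACGTTCGACGA"
print(sorted(maintenance_methylate(template, {1, 8})))  # [2, 9]
```

In this sketch the unmethylated CpG at index 5 stays unmethylated on the daughter strand, which is the sense in which the parental pattern, and not merely the presence of CpG sites, is inherited.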
The chromatin structure, modified by DNA methylation, is not stable; it undergoes a wave of disruption and reassembly during DNA replication. These changes in chromatin structure influence the dynamics of DNA replication by regulating the selection of replication origin sites and the timing of their initiation. Interestingly, active gene promoters are often found at active replication origin sites. Thus, the coordination of replication and transcription is an important mechanism for the establishment and inheritance of differential gene expression patterns during cellular differentiation.
DNA methylation status is also involved in determining chromosomal replication timing. Hypomethylation is associated with late replication: late-replicating genomic regions are gradually demethylated over successive cell divisions, whereas DNA methylation of early-replicating regions is maintained during DNA replication. Moreover, DNA replicated in early S phase is repackaged with acetylated histones, whereas regions that replicate late in S phase assemble nucleosomes containing deacetylated histones.
So far, several DNA nucleotide methyltransferases (DNMTs), including DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L, have been found in mammals, all of which contain a methyltransferase catalytic domain. Of these, DNMT1 is the most abundant DNMT in differentiated cells; it has a preference for hemi-methylated DNA and acts as a ‘maintenance methylase’, efficiently methylating the hemi-methylated sites that are generated during DNA replication. Thus, the CpG methylation pattern is maintained in the genome after DNA replication. Until recently, the biochemical and functional properties of DNMT2 remained unknown. However, DNMT2 is now known to act as an RNA methyltransferase, and DNMT2-mediated methylation protects tRNAs against ribonuclease cleavage in Drosophila.
DNMT3A and DNMT3B are expressed at high levels in mouse embryonic stem (ES) cells and at lower levels in differentiated cells. They act as ‘de novo methylases’, which catalyze the transfer of methyl groups to naked DNA, and are responsible for establishing the pattern of methylation during embryonic development. Recent evidence suggests that, besides acting as ‘de novo methylases’, DNMT3A and DNMT3B may also act as a ‘methylation completer’ and ‘methylation error corrector’ - by completing the methylation process and correcting errors, respectively, left by DNMT1 - at least at highly methylated DNA regions, such as imprinted regions and repetitive elements.
Once a site is methylated, it can act as a candidate region where silent chromatin is established. For this purpose, the methylated site first recruits methyl-CpG binding domain (MBD) proteins; the MBD proteins subsequently recruit histone deacetylases and other histone-modifying proteins. In other words, MBDs, which form bridges between the methylation site and other associated proteins, are the key proteins in epigenetic regulation (Fig. 2).
So far, five MBD proteins, each containing a methyl-CpG binding domain, have been reported. Among these MBD proteins, MBD1 is unique because it is capable of repressing transcription from both methylated and unmethylated promoters.
MBD1 associates with chromatin modifiers such as the Suv39h1-HP1 complex and enhances DNA methylation-mediated transcriptional repression. MBD1 also associates with the H3K9 methyltransferase SETDB1. During S phase, the chromatin assembly factor CAF1 recruits the MBD1-SETDB1 complex to chromatin to establish new H3K9 methylation. On the other hand, the removal of DNA methylation disrupts the formation of the MBD1-SETDB1-CAF1 complex, which results in the loss of H3K9 methylation at the formerly methylated site.
MBD2 shares extensive sequence homology with MBD3. MBD2 binds to methylated CpGs and confers DNA methylation-mediated transcriptional silencing through its association with HDAC1 and HDAC2 in the NuRD chromatin remodeling complex. Although Mbd2-null mice develop normally and remain viable and fertile, lack of Mbd2 affects the immune system by inducing ectopic IL-4 expression in undifferentiated helper T cells. Lack of Mbd2 also influences X-chromosome inactivation by inducing ectopic Xist expression on the active X chromosome.
MBD3, like MBD2, is an essential subunit of the NuRD complex. It has been suggested that MBD2 and MBD3 associate with NuRD in a mutually exclusive way, thereby forming two distinct complexes. Although there is great sequence similarity between MBD2 and MBD3, the two proteins do not perform redundant functions during early development. In contrast to Mbd2-null mice, which display a mild phenotype, Mbd3-null embryos die on day 8.5, failing to shut down the expression of undifferentiated cell markers such as Oct4 and Nanog.
MBD4 is a thymine glycosylase, which acts as a DNA repair protein and targets the sites of cytosine deamination. Spontaneous hydrolytic deamination of 5mC leads to 5mCpG-to-TpG transitions, whereas that of non-methylated CpG leads to UpG, and MBD4 is able to excise and repair both ‘mutated’ nucleotides. Consistent with this observation, Mbd4-null mice exhibit a two- to three-fold higher number of 5mCpG-to-TpG transitions, indicating that Mbd4 indeed acts to reduce the 5mCpG-to-TpG mutation rate. More importantly, when crossed with mice carrying a germline mutation in the Apc (adenomatous polyposis coli) gene, Mbd4-null mice show accelerated tumor formation. In fact, mutations in MBD4 have been reported in various human carcinomas.
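The two deamination outcomes described above, and the mismatches they leave opposite guanine, can be sketched as follows. This is an illustrative toy model only: bases are represented as string tokens, the partner strand is listed position by position (strand polarity is ignored), and the names `DEAMINATION_PRODUCT`, `deaminate` and `mbd4_targets` are hypothetical.

```python
# Products of spontaneous hydrolytic deamination, as stated in the text:
# unmethylated cytosine gives uracil, 5-methylcytosine gives thymine.
DEAMINATION_PRODUCT = {"C": "U", "5mC": "T"}


def deaminate(strand, index):
    """Return a copy of `strand` (a list of base tokens) with one base deaminated."""
    damaged = list(strand)
    damaged[index] = DEAMINATION_PRODUCT.get(damaged[index], damaged[index])
    return damaged


def mbd4_targets(strand, partner):
    """Indices where T or U sits opposite G.

    These T:G and U:G mismatches are the lesions the text describes MBD4 as
    excising; `partner` lists the base paired with each position of `strand`.
    """
    return [
        i
        for i, (base, paired) in enumerate(zip(strand, partner))
        if base in ("T", "U") and paired == "G"
    ]


strand = ["A", "5mC", "G", "T", "C", "G"]
partner = ["T", "G", "C", "A", "G", "C"]

print(mbd4_targets(strand, partner))                # []  (the T at index 3 pairs with A)
print(mbd4_targets(deaminate(strand, 1), partner))  # [1] (5mC -> T, leaving T:G)
print(mbd4_targets(deaminate(strand, 4), partner))  # [4] (C -> U, leaving U:G)
```

The point of the sketch is that a deamination product is recognizable only because the original guanine partner is still present; if the mismatch is not repaired before the next round of replication, the change becomes fixed as a transition.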
MeCP2 was the first MBD protein to be cloned. MeCP2 is now known to be a multifunctional nuclear protein involved in transcriptional repression, activation of transcription, nuclear organization, and splicing [32,33]. Besides acting as a transcriptional repressor like other MBDs, MeCP2 also acts as a splice regulator by interacting with YB-1, a component of messenger ribonucleoprotein particles, in brain nuclear extracts. Indeed, microarray splicing analysis of cerebral cortex mRNA isolated from Mecp2-mutant mice showed a number of aberrantly spliced genes. Furthermore, MeCP2 deficiency activates L1 retrotransposition in neurons, which is possibly associated with the genomic diversity of brains. It is therefore interesting that several links exist between MBD-mediated repression, RNA processing and DNA-sequence diversity. It is also intriguing to find a link between epigenetic modification and its suppressive power on genetic diversity since, in addition to MeCP2, a histone modification enzyme (the H3K9 methyltransferase ESET) also contributes to the silencing of retrovirus-like elements in the mammalian genome.
3. 5-hydroxymethylcytosine - the sixth base in mammalian DNA
5-Methylcytosine (5mC) has been recognized as “the fifth base”. However, early work suggested the existence of a sixth base, 5-hydroxymethylcytosine (5hmC) (Fig. 1). 5hmC was first reported in T-even bacteriophages and later in mammalian cells. However, the reported finding, which claimed that this modified base accounted for ~15% of total cytosines in DNA extracted from the brains of adult rats, mice and frogs, could not be reproduced. The topic received little attention for the next 30 years until 2009, when work from two research teams brought it back to life [40,41]. These studies found that 5hmC accounts for 0.6%, 0.2%, and 0.03% of total nucleotides in Purkinje cells, granule cells, and mouse ES cells, respectively [40,41].
The presence of 5hmC in the mammalian genome depends on pre-existing 5mC, because 5hmC is produced from 5mC by TET proteins, which utilize molecular oxygen to add a hydroxyl group to 5mC. TET is named after Ten-Eleven Translocation (translocation between chromosomes 10 and 11) because it was initially identified as a fusion partner of the mixed-lineage leukemia gene (MLL) in acute myeloid leukemia (AML) patients carrying a t(10;11)(q22;q23) translocation [42,43]. The findings that ectopic expression of TET1 in HEK293 cells lacking TET1 led to reduced levels of 5mC and increased levels of 5hmC, and that the levels of 5hmC decreased upon RNAi-mediated depletion of TET1 in ES cells, indicate that TET1 is able to catalyze the conversion of 5mC to 5hmC in cultured cells. It has also been demonstrated that TET1 is capable of acting not only on fully methylated DNA strands but also on hemi-methylated DNA strands. Furthermore, not only TET1 but also the other TET proteins (TET2 and TET3) are capable of converting 5mC to 5hmC.
In terms of gene regulation, the significance of the 5hmC modification is similar to that of non-methylated cytosine. In other words, the 5hmC modification is associated with transcriptional activity, in contrast to the 5mC modification, which is associated with transcriptional repression. It has recently been demonstrated that TET1 binding to the promoter region (presumably with 5hmC modification at this site) induces the expression of Nanog in ES cells, and that downregulation of Nanog via TET1 knockdown is accompanied by DNA methylation of the promoter region. These findings indicate that the TET1-driven 5hmC modification contributes to maintaining the undifferentiated, pluripotent state of ES cells, and they support a working model in which TET1 and DNMTs coordinately regulate Nanog expression.
In ES cells, high levels of TET1 block the access of DNMTs to the Nanog promoter, thereby maintaining Nanog expression. On the other hand, when TET1 is downregulated in ES cells by in vitro differentiation, DNMTs methylate the Nanog promoter, which leads to the downregulation of Nanog expression and loss of ES cell identity (Fig. 1). This hypothesis is supported by a recent finding that the chromosomes containing 5hmC are gradually reduced during the development of preimplantation embryos. However, another study showed that the 5hmC level in the mouse cerebellum increases during development from 0.1% of total nucleotides at postnatal day 7 to 0.4% of total nucleotides in the adult mouse.
As described above, TET1 was initially identified through a rare translocation in a case of leukemia [42,43]. Later studies have demonstrated that deletions and mutations in TET1, TET2 and TET3 are associated with myeloid malignancies. In fact, mutations found in TET2 in myeloid cancers have been shown to impair hydroxylation of 5mC.
While our knowledge of 5hmC is growing rapidly, there is currently no reliable methodology that provides information on 5hmC at single-base-pair resolution. Although a 5hmC antibody is available for chromatin immunoprecipitation, this method provides only coarse information (i.e. it detects the presence of 5hmC, but not that of 5mC, in chromatin). A more sensitive method for 5hmC has been developed using capillary electrophoresis, but it does not offer single-base-pair resolution. Bisulfite sequencing has proven to be a powerful tool for providing information on methylation status at single-base-pair resolution; however, it cannot discriminate between 5mC and 5hmC. When bisulfite-treated DNA is used as a template for PCR analysis, unmodified cytosine is read as thymine, whereas both 5mC and 5hmC are read as cytosine. Therefore, it is important to develop a methodology that can distinguish between 5mC and 5hmC at single-base-pair resolution in order to achieve a complete understanding of the active demethylation mechanism, because TET protein-mediated 5mC oxidation may contribute to dynamic changes in global or locus-specific 5mC levels by promoting active DNA demethylation.
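The limitation of bisulfite sequencing described above can be made concrete with a small sketch. It encodes only the read-out rule stated in the text (unmodified C is read as T; 5mC and 5hmC are both read as C) and ignores incomplete conversion and sequencing error. The names `BISULFITE_READOUT` and `bisulfite_read` are hypothetical.

```python
# How each cytosine state is read after bisulfite treatment and PCR,
# following the behaviour described in the text. Other bases are unchanged.
BISULFITE_READOUT = {"C": "T", "5mC": "C", "5hmC": "C"}


def bisulfite_read(strand):
    """Return the sequence read from a strand given as a list of base tokens."""
    return "".join(BISULFITE_READOUT.get(base, base) for base in strand)


methylated = ["A", "5mC", "G", "T", "C", "G"]
hydroxymethylated = ["A", "5hmC", "G", "T", "C", "G"]
unmodified = ["A", "C", "G", "T", "C", "G"]

print(bisulfite_read(methylated))         # ACGTTG
print(bisulfite_read(hydroxymethylated))  # ACGTTG (indistinguishable from 5mC)
print(bisulfite_read(unmodified))         # ATGTTG
```

Because the first two strands yield the same read, no downstream analysis of bisulfite data alone can recover which of the two modifications was present at that position.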
4. Histone modifications and DNA methylation during replication
DNA methylation and histone modifications not only occur separately but also work hand-in-hand at multiple levels to determine expression status, chromatin organization and cellular identity. They are coordinately maintained through mitotic cell division, allowing the parental DNA methylation and histone modification patterns to be copied to newly replicated chromatin [51,52].
Lande-Diner et al. recently developed a DNMT1-knockout cell line and demonstrated that the unmethylated state caused by the lack of DNMT1 induced acetylation of histones H3 and H4, resulting in transcriptional activation of many genes. This observation clearly indicates that DNA methylation is associated with histone deacetylation. However, this group also demonstrated that at several other genes the unmethylated state caused by the lack of DNMT1 did not induce histone H3 and H4 acetylation, and these genes remained transcriptionally repressed. In addition, late replication in S phase was observed at these loci, suggesting that replication timing may be independent of DNA methylation. Rather, histone acetylation is involved in controlling the replication timing.
DNA methylation is not only correlated with histone ‘acetylation’ but is also associated with histone ‘lysine methylation’. Genome-wide DNA methylation profiles suggest that DNA methylation is associated with the absence of H3K4 methylation and the presence of H3K9 and H3K27 methylation.
In fact, DNA methylation induces histone H3K9 methylation through an MBD, thereby establishing a repressive chromatin state. SETDB1, an H3K9 trimethylation (H3K9me3) methyltransferase, contains a putative MBD domain with two conserved DNA-interacting arginine residues, which are also present in the MBD domains of MBD1 and MeCP2 and are known to make direct contact with the DNA in the structures of MBD1-DNA and MeCP2-DNA complexes [56,57]. This result suggests that SETDB1 acts as an H3K9me3 ‘writer’ in cooperation with a DNA methylation ‘reader’. Likewise, SUV39H1/2, another H3K9me3 ‘writer’, interacts with HP1, the H3K9me3 ‘reader’, to create a repressed state in the genomic regions to which they are recruited. These are the mechanisms for propagating and maintaining repressive chromatin marks on both DNA and histones during DNA replication.
A histone methyltransferase, in turn, can direct DNA methylation to specific genomic targets by recruiting DNMTs to stably silence genes; accordingly, disruption in mice of MLL, a histone lysine methyltransferase gene with specificity for H3K4, induces not only the loss of H3K4 methylation but also de novo DNA methylation at several gene promoters [60,61]. In another study, it was shown that the lack of a histone H3K9 methyltransferase induced demethylation at the imprinting center of the SNRPN locus on the maternal chromosome, whereas the lack of DNMT1 failed to induce demethylation of histone H3K9, indicating that the order of modification at this locus is histone modification followed by DNA methylation. Taken together, histone methylation marks play important roles in predicting the methylation status of the genome.
Whereas DNMT1 is stabilized by a histone demethylase (HDM) to maintain DNA methylation, DNMTs can in turn direct the local histone methylation pattern, recruiting MBDs and HDACs to achieve gene silencing and chromatin condensation [65,66]. Recently, DNMT3L has been shown to act as a sensor for H3K4 methylation: when H3K4 methylation is absent, DNMT3L induces de novo DNA methylation by docking DNMT3A to the nucleosome, which is one of the mechanisms by which methylated regions are newly created during the replication step.
The interplay of these modifications creates an epigenetic landscape that regulates the way the mammalian genome expresses itself in different cell types, developmental stages and disease states. The distinct patterns of these epigenetic modifications present in different cellular states serve as a guardian of cellular identity. Whereas it is well accepted that DNA methylation patterns are replicated in a semi-conservative fashion during cell division via the mechanisms discussed earlier, how histone modification patterns are similarly replicated remains to be elucidated.
5. Abnormalities in epigenetic mechanism and their possible inheritance
Thanks to the identification of molecules that contribute to epigenetic gene regulation, we now know that abnormalities in these molecules cause a number of congenital diseases.
The first group of diseases with abnormal epigenetic mechanisms is the genomic imprinting diseases. Genomic imprinting is a mechanism by which only one of the two parental alleles of a gene is expressed. For example, in normal individuals the paternal allele of the SNRPN gene is expressed, whereas the maternal allele is suppressed by DNA methylation; abnormal suppression of the normally expressed paternal allele causes a congenital obesity disease known as Prader-Willi syndrome. In the case of the UBE3A gene, which is located adjacent to the SNRPN gene, the maternal allele is expressed, whereas the paternal allele is suppressed in neurons; abnormal suppression of the expressed maternal allele causes a congenital epileptic disease known as Angelman syndrome.
X-chromosome inactivation is another epigenetic mechanism, in which only one of the two X chromosomes is active and the other is inactivated in females. Aberrant X-inactivation in females (i.e. both X chromosomes remaining active) is thought to be embryonic lethal, since somatic clones with aberrant X-inactivation are aborted.
Abnormal functioning of the proteins related to epigenetic regulation also causes diseases. For example, mutations in the DNMT3B gene, which lead to hypomethylation at the paracentromeric chromosomal regions, cause the immunodeficiency-centromeric instability-facial anomalies (ICF) syndrome, which is characterized by immunodeficiency, centromere instability, facial abnormalities, and mild mental retardation (Fig. 3A) [71-73]. On the other hand, over-expression of DNMTs is associated with the hypermethylation found in colorectal, breast, and hepatocellular carcinomas (Fig. 3C) [74-76]. Another example is Rett syndrome, which is caused by MECP2 mutations and is characterized by seizures, ataxic gait, language dysfunction and autistic behavior [77,78]. In this disease, MECP2 mutations induce abnormal regulation of a subset of neuronal genes [79,80] (Fig. 3B).
Besides these “DNA methylation diseases” caused by mutations in DNA methylation-related enzymes and proteins, “histone modification diseases” caused by mutations in histone modification-related enzymes have recently been reported. For example, Say-Barber-Biesecker-Young-Simpson syndrome, which is caused by mutations in the histone acetyltransferase gene KAT6B, is a multiple anomaly syndrome characterized by an immobile mask-like face, abnormal narrowing of the palpebral fissures (short eyelid), anomalies of the spine, ribs and pelvis, renal cysts, hydronephrosis, agenesis of the corpus callosum, and severe intellectual disability. Another example is Kleefstra syndrome, which is caused by deletion or mutation of the histone H3K9 methyltransferase gene EHMT1 and is characterized by childhood hypotonia, distinctive facial features, and intellectual disability with severe expressive speech delay.
Recently, it has been shown that short-term environmental stress can also cause an aberrant epigenetic status associated with various diseases. Thus, aberrant epigenetic mechanisms can cause not only congenital diseases but also acquired diseases. For example, short-term mental stress after birth, in which the offspring is separated from the mother, causes DNA hypermethylation in the promoter of the glucocorticoid receptor (GR) gene in the rat brain, resulting in persistent abnormal behavior. Malnutrition in the fetal period is also known to induce DNA hypomethylation in the promoter of the peroxisome proliferator-activated receptor alpha (PPARα) gene, a so-called “thrifty gene”, in the liver, which may be associated with the developmental basis of adult diseases (i.e. obesity and diabetes mellitus) [84,85] (Fig. 3D). This hypomethylation event was later confirmed in human individuals who suffered prenatal malnutrition during a period of famine [86,87].
Several lines of evidence suggest that the acquired DNA methylation changes described above can be transmitted to the next generation. Epigenetic marks allow the transmission of gene activity states from one cell to its daughter cells. Initially, it was assumed that epigenetic marks were completely erased and re-established in each generation. However, recent studies using several model organisms indicate that the erasing process is incomplete at some loci, so that epigenetic changes acquired in one generation are inherited by the next generation.
For example, it has been shown in mice that the mental stress caused by maternal separation changes the DNA methylation status not only in the first generation but also in the next generation, through changes in the sperm of the first generation. Moreover, the observed changes in DNA methylation status altered the expression level of corticotropin-releasing factor receptor 2 (Crfr2) in the brains of the next-generation mice, which could be associated with their abnormal behavior.
6. Concluding remarks
One of the major differences between the DNA sequence and epigenetic modifications is tissue specificity. Epigenetic modifications vary according to tissue type, which allows tissue-specific expression patterns to be generated. However, what determines the epigenetic modification (epigenomic) pattern in each tissue type is not fully understood.
Thus, it is essential to catalog the epigenomic patterns in each human tissue at single-nucleotide resolution [92,93]. In fact, the NIH Roadmap Epigenomics Program under the US National Center for Biotechnology Information (NCBI) and the International Human Epigenome Consortium (IHEC) have initiated large-scale epigenomic mapping studies in order to generate epigenome maps for each human cell type.
Understanding the human epigenome will be fundamental to the study of congenital and acquired diseases, and will also be invaluable for analyzing the linkage between birth defects and environmental factors. However, biological studies to understand the epigenome are in their initial phase. Further studies are necessary to elucidate the molecular mechanisms by which the epigenomic pattern comes to differ in each cell type, by which epigenomic patterns are altered by environmental factors, and by which the inheritance of an altered epigenomic pattern from the previous generation could be avoided. The authors hope that these molecular mechanisms will be discovered by the “next generation” of researchers.
The research described in this article was partially supported by the Ministry of Education, Science, Sports and Culture (MEXT), grants-in-aid (KAKENHI) for Scientific Research (B) (23390272) to TK, grants-in-aid for Exploratory Research (23659519) to TK, grants-in-aid for Young Scientists (B) (23791156) to KM, and grants-in-aid for Scientific Research (C) (23591491) to TH. The authors thank NAI inc. for critical review.