Duplicated GGAA motifs in the 5’-upstream regions of human DNA-repair associated genes
A variety of transcription factor binding sequences, rather than authentic TATA or TATA-like elements, are present in a large number of the 5’-flanking or regulatory regions of human genes. Our previous research showed that the promoter regions of several human DNA repair-associated genes, including PARP, PARG, ATR, and RB1, contain duplicated GGAA-motifs or ETS binding sequences, although they have no obvious TATA-like elements. On the other hand, surveillance of a human genomic DNA database revealed that the 5’-flanking regions of the human genes encoding telomerase and telomere maintenance factors, which are called shelterins, are TATA-less, but most of them carry GC-boxes and/or Sp1-binding sequences. These observations suggest that the expression of the DNA repair and telomere maintenance factor-encoding genes is likely to be regulated by a TATA-independent mechanism.
The molecular mechanisms underlying the effects of caloric restriction (CR) mimetic drugs, including Resveratrol (Rsv), have been well studied. It was suggested that CR mimetic compounds activate NAD+ dependent deacetylase sirtuins, or inhibit cAMP phosphodiesterases, to improve mitochondrial functions. Thus, it is supposed that Rsv affects cellular senescence to extend the lifespan of various organisms. It should be noted that mitochondrial functions cross-talk with telomeres, in which telomere shortening causes chromosomal instability and leads to cellular senescence. We have reported that the CR mimetics 2-deoxy-D-glucose (2DG) and Rsv up-regulate promoter activities of the 5’-flanking regions of genes encoding telomere-maintenance factors, including shelterin complex proteins. Moreover, we observed that telomerase activity in HeLa S3 cells was moderately induced by 2DG and Rsv [7,8]. Additionally, it has been reported that tumor suppressor p53, which is encoded by the TP53 gene, is phosphorylated and then induces ERK1/2 activation in response to Rsv treatment. Interestingly, the TP53 promoter contains a GGAA (TTCC)-duplication adjacent to the transcription start site (Table 1). Taken together, these observations suggest that the anti-aging effect of CR mimetic compounds stems from up-regulation of TP53 expression via duplicated GGAA (TTCC) elements, in accordance with the moderate induction of expression of genes encoding telomere maintenance factors, possibly through GC-box or Sp1-binding elements.
In this review article, we will discuss the contribution of cis-elements, namely duplicated GGAA and GC-boxes, in regulation of DNA-repair- and telomere maintenance-associated gene expression that is thought to control cellular senescence and aging of organisms.
2. Transcription of eukaryotic cells
2.1. General transcription factors and TATA-dependent and independent transcription mechanisms
Transcription, or synthesis of RNAs, is known to be regulated at several steps, including chromosomal modification, transcription initiation, elongation, and termination. Eukaryotic transcription of mRNAs is catalyzed by RNA polymerase II (Pol II), and the molecular mechanisms are well studied. Initiation of transcription is executed by a transcription machinery complex consisting of Pol II and general transcription factors (GTFs), such as TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. Transcription is thought to start from the formation of a pre-initiation complex (PIC), which contains GTFs and Pol II, at the transcription start site (TSS). The most studied eukaryotic promoter regions contain TATA or TATA-like sequences that are recognized by TATA binding protein (TBP). Binding of TBP to the TATA-box results in recruitment of TFIID and TAFs, which then provokes the formation of the PIC, precisely determining the TSS. Although TATA-dependent transcription initiating mechanisms have been extensively characterized by a variety of experiments, 76% of the TSSs in the human genome have no obvious TATA or TATA-like elements. This fact clearly indicates that eukaryotic transcription is initiated by either TATA-dependent or TATA-independent mechanisms.
2.2. TATA-less promoters: genome-wide analyses by ChIP experiments
A recent study of PICs in Saccharomyces by genome-wide ChIP analysis revealed that they are positioned at TATA-boxes or at TATA-like elements in TATA-less promoters. In contrast, analysis of a human DNA sequence database showed that only 2.6% of human promoters contain the TATA-consensus 7-mer TATAAA around their TSSs. Moreover, surveillance of the human genome database revealed that a total of 174 different DNA sequence motifs are found in promoter regions, and that no obvious TATA-like elements are listed among the 50 most common of these motifs. These observations imply that appropriate cooperation between transcription factor (TF) binding sites would determine TSSs and tissue-specific transcription in mammalian cells, just as the TATA-element does. In other words, the TATA-box might be only one of the cis-elements that specify where TSSs should be located in human gene promoter regions. The concept that multiple cis-elements and their combinations determine the location of the TSS and tissue specificity is consistent with the model of transcription driven by an enhanceosome, as described for several gene promoters including the IFNB promoter.
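A figure such as the fraction of promoters that carry TATAAA near the TSS can be illustrated with a short Python sketch. The function names, the (sequence, TSS index) input format, and the 40-nucleotide upstream window are assumptions made for this example and are not the procedure used in the cited analysis. Only the sense strand is examined, and the two sequences in the usage lines are invented toy data.

```python
def has_motif_near_tss(seq, tss, motif="TATAAA", upstream=40, downstream=0):
    """Return True if `motif` lies entirely inside the window
    [tss - upstream, tss + downstream) of `seq` (sense strand only)."""
    start = max(0, tss - upstream)
    end = min(len(seq), tss + downstream)
    return motif in seq[start:end].upper()


def fraction_with_motif(promoters, **window):
    """Fraction of (sequence, tss_index) entries whose window contains the motif."""
    if not promoters:
        return 0.0
    hits = sum(has_motif_near_tss(seq, tss, **window) for seq, tss in promoters)
    return hits / len(promoters)


# Toy input (invented sequences): one promoter with a TATAAA upstream of
# the TSS at index 50, and one GC-rich promoter without it.
toy_promoters = [
    ("GC" * 10 + "TATAAA" + "GC" * 12 + "ATG", 50),
    ("GC" * 30, 50),
]
print(fraction_with_motif(toy_promoters))
```

A real survey would also have to define the TSS coordinates, the window, and the tolerated motif variants, each of which changes the reported percentage.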
3. Promoter regions of the human DNA-repair associated genes
We have been studying the regulatory mechanism of human PARG gene expression, and isolated its promoter region. Deletion and mutagenesis analyses narrowed down the core promoter region, and indicated an important role for duplicated GGAA motifs in the function of the TATA-less PARG promoter. The PARG gene encodes poly(ADP-ribose) glycohydrolase (PARG), which degrades the poly(ADP-ribose) (PAR) synthesized in the reaction catalyzed by poly(ADP-ribose) polymerase (PARP). Interestingly, no obvious TATA-box but a duplicated GGAA-motif is found around the TSS of the human PARP1 gene.
Poly(ADP-ribosyl)ation is thought to be involved in the process of DNA-repair, which is dependent on both poly(ADP-ribose) synthesis and degradation. Given that the PARP1 and PARG genes encode proteins that work cooperatively in the PAR-dependent DNA-repair system, their expression would be expected to respond similarly to the same DNA-damaging signal. Therefore, it is natural that the 5’-upstream regions of the two genes resemble each other, containing a duplicated GGAA (TTCC) element but no TATA-box. We thus speculate that the promoters of other genes associated with the PAR-dependent DNA-repair system might contain a GGAA-duplication instead of a TATA-box.
3.1. Surveillance of 5’-upstream regions of the PARP and PAR-associated protein encoding genes
Initially, we regarded the duplicated GGAA as a sequence associated with the macrophage-like differentiation of HL-60 cells induced by 12-O-tetradecanoylphorbol-13-acetate (TPA). The expression of several genes is up-regulated during the TPA-induced differentiation of HL-60 cells, as shown by DNA-microarray experiments. Interestingly, the RB1 gene, which encodes the tumor suppressor and cell cycle regulator protein Rb1, is included among the late response genes. The Rb1 protein is also suggested to control cell fate by inducing differentiation and inhibiting apoptosis. Thus, we examined the 5’-flanking region of the RB1 gene, and found that a duplication of the GGAA-motif is essential for the promoter activity. We have also reported that duplicated GGAA-motifs are contained in the promoter regions of the human XPB and ATR genes, which are involved in DNA-repair synthesis and DNA-damage response signaling, respectively. These genes are known to be involved in DNA repair.
PARP modifies itself and various target proteins by addition of PAR, using NAD+ as the substrate. This modification is important for the recruitment of base excision repair (BER)-associated factors, including XRCC1. Therefore, expression of the genes encoding PARP target proteins or PAR-associating proteins might be regulated similarly to that of the PARP1 and PARG genes. In this context, it should be emphasized that PAR binds to p53, altering its association with DNA.
PARP1 has been reported to regulate G1 arrest in response to DNA damage via poly(ADP-ribosyl)ation of p53. Furthermore, XRCC1 and ATM (Ataxia telangiectasia mutated) proteins, which play roles in the DNA-damage response signaling system, are also known to interact with PAR. Moreover, cooperation of PARP and DNA-dependent protein kinase (DNA-PK) during DNA strand break repair has also been demonstrated. Not surprisingly, duplicated GGAA-motifs are found in the 5’-upstream regions of the ATM, PRKDC (DNA-PKCS), TP53 and XRCC1 genes encoding the PARP/PAR associating proteins (Table 1). Although degradation of PAR in nuclei is thought to be mainly catalyzed by PARG, it should be noted that ARH3 catalyzes the degradation of PAR in the mitochondrial matrix. As a GGAA duplication is contained in the 5’-flanking region of the ADPRHL2 (ARH3) gene (Table 1), we predict that it functions in response to DNA-damage signals.
3.2. Surveillance of the DNA repair associated gene promoter regions
XRCC1, which is a 70-kDa X-ray cross-complementing group 1 protein, is thought to act as a scaffold protein for BER and DNA single strand break repair (SSBR). Various proteins are involved in the XRCC1-associated DNA-repair processes, including APEX1 (APE1), TDP1, PCNA, RFC, POLB (DNA-pol β), WRN, ERCC6 (CSB), and E2F family proteins. We previously reported that the WRN promoter region contains GGAA duplications, and after analyses of several other DNA-repair related genes found that the APEX1, TDP1, POLB and E2F4 gene promoters also harbor duplicated GGAA-motifs (Table 1).
Additionally, GGAA-duplications around the TSSs of the human ATM and ATR genes were discovered (Table 1). Both ATM and ATR are checkpoint kinases with critical roles in DNA repair via homologous recombination repair (HRR) at the sites of double-strand breaks (DSBs). Cancer and genetic studies highlighted roles for the FANC proteins, Rad51, BRCA1, BRCA2, CHEK1 (CHK1), CHEK2 (CHK2), NBN (NBS1), RecQL4, WRN, XRCC5 (Ku80), and XRCC6 (Ku70) in HRR. Therefore, we examined the sequences of the 5’-upstream regions of these HRR/DSB associated genes and found that duplicated GGAA-motifs are contained in the 5’-flanking regions of the BRCA1, BRCA2, CHEK1, DCLRE1C (Artemis), FANCD2, NBN and XRCC5 (Ku80) genes (Table 1). Although the CHEK2, LIG4 and XRCC4 genes are not listed in Table 1, duplicated GGAA (TTCC) motifs that lie within thirteen nucleotides of each other are located near their TSSs.
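The kind of sequence survey described in this section can be sketched in Python as a scan for pairs of GGAA or TTCC cores that lie close together. This is a minimal illustration and not the procedure used to compile Table 1: the function name, the default gap of thirteen nucleotides, and the choice to measure the gap as the number of nucleotides between the two 4-bp cores are all assumptions, and the test sequence is invented.

```python
import re


def find_duplicated_ggaa(seq, max_gap=13):
    """Return (start1, start2) index pairs of GGAA/TTCC cores whose
    intervening gap is at most `max_gap` nucleotides.

    TTCC is the reverse complement of GGAA, so scanning one strand for
    both words covers motifs in either orientation.
    """
    seq = seq.upper()
    # Zero-width lookahead: record every position where a core starts.
    starts = [m.start() for m in re.finditer(r"(?=GGAA|TTCC)", seq)]
    pairs = []
    for i, first in enumerate(starts):
        for second in starts[i + 1:]:
            gap = second - (first + 4)  # nucleotides between the two cores
            if gap > max_gap:
                break  # starts are sorted, so later cores are farther away
            pairs.append((first, second))
    return pairs


# Invented example: GGAA at index 2 and TTCC at index 8, two nucleotides apart.
print(find_duplicated_ggaa("ACGGAAGTTTCCGA"))
```

Scanning a fixed window around an annotated TSS, rather than a whole sequence, would be the natural next step for a promoter survey.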
3.3. Possible roles of the duplicated GGAA motif in the 5’-upstream regions of DNA-repair genes as a bidirectional initiation element
It has been shown that the human PARG gene is head-head linked with the TIM23 gene, which encodes mitochondrial inner membrane translocase 23 [17,30]. Moreover, we reported that a duplicated GGAA motif is located in the region of the head-head junction of the human IGHMBP2 and MRPL21 promoters. Furthermore, many cancer or DNA repair associated genes are regulated by bidirectional promoters; for example, tandem repeat binding sites for ETS family proteins were identified in the bidirectional promoter regions of the PERLD1/ERBB2 and CIDEC/FANCD2 genes in breast and ovarian cancers. We also identified several head-head oriented genes whose promoter regions contain duplicated GGAA-motifs. Several examples of bidirectional partners of the DNA repair-associated genes that are oriented in a head-head manner are summarized in Table 2. Given that specific TFs are linked to the regulation of bidirectional promoters, the TF-binding elements in these promoter regions may determine whether they function as bidirectional or unidirectional promoters, depending on the prevailing TF expression in the cell. Although it has not yet been shown, the functions of the RNAs transcribed or proteins translated from these bidirectional partners might be associated with the cellular responses required against DNA-damaging agents.
|DNA repair genes (GENE ID)|Partner genes (GENE ID)|
|---|---|
|ADPRHL2 (ARH3) (54936)|TEKT2 (27285)|
|APEX1 (APE1) (328)|OSGEP (55644)|
|ATM (472)|NPAT (4863)|
|BRCA1 (672)|NBR2 (10230)|
|CHEK2 (11200)|HSCB (150274)|
|FANCD2 (2177)|CIDECP (152302)|
|LIG4 (3981)|ABHD13 (84945)|
|PARG (8505)|TIM23B (653252)|
|PCNA (5111)|CDS2 (8760)|
|PRKDC (DNA-PKCS) (5591)|MCM4 (4173)|
|TP53 (7157)|WRAP53 (55135)|
|XRCC6 (Ku70) (2547)|DESI1 (27351)|
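Head-head gene pairs of the kind listed above can in principle be identified from annotated TSS coordinates. The following Python sketch assumes a simple record of gene name, chromosome, strand and TSS position, and treats a minus-strand gene and a plus-strand gene as head-head partners when their TSSs diverge and lie within a distance cutoff. The `Gene` record, the function name and the 1000-bp cutoff are assumptions for illustration, and the coordinates in the usage lines are invented placeholders rather than real annotations.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # TSS coordinate on the chromosome


def head_to_head_pairs(genes, max_distance=1000):
    """Return (minus_gene, plus_gene) name pairs whose TSSs diverge.

    A pair qualifies when both genes are on the same chromosome, the
    minus-strand TSS lies at a lower coordinate than the plus-strand TSS,
    and the two TSSs are at most `max_distance` bp apart.
    """
    minus = [g for g in genes if g.strand == "-"]
    plus = [g for g in genes if g.strand == "+"]
    pairs = []
    for m in minus:
        for p in plus:
            if m.chrom == p.chrom and 0 < p.tss - m.tss <= max_distance:
                pairs.append((m.name, p.name))
    return pairs


# Invented placeholder genes and coordinates.
toy_genes = [
    Gene("geneA", "chr1", "-", 1000),
    Gene("geneB", "chr1", "+", 1300),  # diverges from geneA, 300 bp away
    Gene("geneC", "chr1", "+", 5000),  # too far from geneA
    Gene("geneD", "chr2", "+", 1100),  # different chromosome
    Gene("geneE", "chr1", "-", 9000),  # no plus-strand gene downstream
]
print(head_to_head_pairs(toy_genes))
```

The nested loop is quadratic and is meant only to make the definition explicit; sorting genes by chromosome and coordinate would be preferable for a whole-genome annotation.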
3.4. Multiplicity of GGAA motifs may play a role in the formation of specific chromosomal structures
It is well known that various repetitive sequences provide special features at specific regions of eukaryotic chromosomes. Telomeres are composed of TTAGGG repeats, and they are maintained by specific structures known as T- and D-loops. Another example is the centromeres, in which the (CENP) B box is located; they have specific structures that function to segregate chromosomes accurately. Interestingly, the 17-bp sequence of the (CENP) B box, which is recognized by the CENP-B protein, contains a GGAA motif, and this (CENP) B box appears in every other α-satellite repeat (171-bp sequence) in human chromosomes [37,38]. Thus, repetitive sequences play roles in the formation of specific chromosomal structures, and they are generally referred to as microsatellites.
It is noteworthy that repetitive GGAA motifs, or GGAA-microsatellites, are targets of the oncogenic fusion protein EWS/FLI, whose mRNA is transcribed as a result of the aberrant chromosomal translocation t(11;22)(q24;q12) [39,40]. The GGAA-microsatellites are located in the promoter regions of several genes, including DAX1/NR0B1, FCGRT, CAV1, CACNB2, FEZF1, KIAA1797, and GSTM4 [41-43]. EWS/FLI binds to these promoter regions, activating their transcription. Although the function of GGAA-microsatellites in the formation of specific structures of human chromosomes has not been clearly shown, DNA damage is reported to be introduced non-randomly or heterogeneously, suggesting that sensitivity to oxidative damage is partly dependent on DNA sequences or structures. Oxidative damage to DNA, which might cause microsatellite instability, inhibition of methylation, and telomere shortening, not only generates 8-OH-Gua but also modulates transcription by altering the redox status of cells. Furthermore, given that the telomere repeat sequence TTAGGG changes DNA conformation to form a G-quadruplex structure, the repetitive GGAA motifs might also play a part in maintaining specific structures of chromosomes. Thus, it could be hypothesized that the duplicated GGAA motifs in the 5’-upstream regions of the DNA-repair genes affect chromosomal structures, which might be altered by DNA-damaging agents. Alternatively, the affinities of GGAA-binding TFs for the duplicated GGAA motifs may be altered by oxidative damage. These possibilities remain to be elucidated by further experimental analyses.
4. Promoter regions of the human telomere maintenance factor-encoding genes
Human telomeres are unique structures at chromosomal ends, where telomere binding proteins and telomere maintenance factors associate to control chromosomal integrity, and their shortening is thought to cause chromosomal instability leading to cellular senescence [35,47]. It has been shown that telomeres form a T-loop configuration [47,48], which is protected by shelterin proteins, including TRF1, TRF2, Rap1, TIN2, TPP1 and POT1 [49,50]. Recently, conditional knock down experiments demonstrated that shelterin proteins function as repressors or inhibitors of ATM/ATR signaling, non-homologous end joining (NHEJ), alt-NHEJ, HRR and resection. Given that shelterin proteins have similar functions in protecting telomeres from DNA damage, shelterin genes might be regulated in a similar manner to each other. In addition, their gene expression needs to be regulated by a unique system that is different from those of ATM/ATR signaling, NHEJ, alt-NHEJ, HRR and resection.
4.1. GC-box or Sp1 binding element is a common TF binding motif within the 5’-upstream regions of the telomere maintenance factor-encoding genes
Previously, we isolated 300 to 500-bp 5’-upstream regions of the human TERT, TERC, DKC1, POT1, RAP1, TANK1, TANK2, TIN2, TPP1, TRF1, and TRF2 genes [3,7]. Sequence analyses of the PCR-amplified DNA fragments showed that they have no apparent TATA-box or TATA-like element, except for the TERC gene promoter. Similar to the 5’-upstream region of the human WRN gene, GC-boxes or Sp1-binding elements are found adjacent to the TSSs of the TERT, TERC, DKC1, RAP1, TANK1, TIN2, TPP1, TRF1 and TRF2 genes but not in the POT1 and TANK1 promoter regions. Instead, OCT-binding elements are located in the 5’-flanking regions of both these genes. We have also isolated the 5’-upstream region of the human RTEL1 gene, which encodes a DNA helicase motif-containing protein with telomere D-loop dissociation and telomere G-quadruplex counteracting activity [52,53]. Therefore, the mechanism for maintenance of telomere integrity by RTEL1 would be different from that of the shelterin proteins. It is noteworthy that duplicated GGAA motifs are located near the TSS of the RTEL1 gene (Table 1) and that one of them functions as an essential cis-element for transcription, suggesting that GC-box binding TFs are not the main regulators of RTEL1 gene expression; rather, the contribution of GGAA motif-binding TFs is of greater importance, as is the case for the DNA-repair associated genes ATM, ATR and RB1.
4.2. TATA-independent regulatory mechanisms of DNA-repair associated genes and telomere maintenance factor-encoding genes
Clustering analysis of TF-binding sites in human promoters revealed that a TATA-box is totally absent in promoters containing an ETS binding motif. The sequence most frequently co-localized with ETS binding motifs in human promoters is the Sp1 element, with 28.4% occurrence, followed by the ETS binding motif itself (18.7%). In addition, the co-occurrence of an Sp1 motif with other Sp1 motifs in human promoters was estimated at 61.2%. These lines of evidence suggest that Sp1 family and ETS family proteins synergistically control promoters containing both elements.
However, comparison of common TF-binding motifs in the 5’-flanking regions of the DNA-repair and telomere associated genes suggests that they are individually regulated by GGAA-binding factors and GC-box-binding factors, respectively. In addition, most of these promoters do not have an authentic TATA or TATA-like element. We can speculate that, through the evolution of organisms, duplicated GGAA motifs have become selectively utilized for regulation of the DNA-repair factor-encoding genes, while the GC-box might have developed into a regulator of telomere maintenance factor-encoding genes (Fig. 1). TATA-dependent transcription may have been disadvantageous for the control of DNA damage-inducible genes, which have a distinct role in maintaining the integrity of genomes, including chromosomes and telomeres.
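Co-occurrence percentages such as those quoted in this section can be illustrated with a small Python sketch that, for promoters containing one motif, computes the fraction that also contain a partner motif. The input format (one list of motif names per promoter), the function name, and the rule that a motif co-occurs with itself only when at least two copies are present are assumptions for this example and do not reproduce the cited clustering analysis. The motif lists in the usage lines are invented.

```python
from collections import Counter


def cooccurrence_rate(promoters, anchor, partner):
    """Among promoters containing `anchor`, return the fraction that also
    contain `partner`. If partner == anchor, a second copy is required."""
    needed = 2 if partner == anchor else 1
    eligible = 0
    both = 0
    for motifs in promoters:
        counts = Counter(motifs)
        if counts[anchor] == 0:
            continue  # promoter lacks the anchor motif
        eligible += 1
        if counts[partner] >= needed:
            both += 1
    return both / eligible if eligible else 0.0


# Invented motif annotations, one list per promoter.
toy_annotations = [
    ["ETS", "SP1"],
    ["ETS", "ETS"],
    ["ETS"],
    ["SP1", "SP1"],
    ["SP1"],
    ["ETS", "SP1", "SP1"],
]
print(cooccurrence_rate(toy_annotations, "ETS", "SP1"))
print(cooccurrence_rate(toy_annotations, "ETS", "ETS"))
```

Note that the rate is asymmetric: the fraction of ETS-containing promoters that carry Sp1 sites need not equal the fraction of Sp1-containing promoters that carry ETS sites.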
5. Caloric restriction induced signals that affect transcription of the telomere associated genes
It is well established that loss-of-function mutations in the WRN gene, which encodes a telomere-regulating RecQ helicase, can lead to cancer or premature aging syndrome [54,55]. On the other hand, caloric restriction (CR) can extend the life spans of various organisms, and thus CR mimetic drugs are expected to have an anti-aging effect. We therefore hypothesized that CR or CR mimetic drugs might induce signals acting on transcription of telomere-associated genes. We previously reported that the promoter activities of the human shelterin-encoding genes, relative to that of the PIF1 gene, are up-regulated by 2-deoxy-D-glucose (2DG) or Resveratrol (Rsv) in HeLa S3 cells.
5.1. Effect of CR mimetic drugs on telomere associated protein-encoding gene promoters
2DG and Rsv, which are known as a potent inhibitor of glucose metabolism and an activator of sirtuin-mediated deacetylation, respectively, are referred to as CR mimetic drugs. It has been shown that telomerase activity in HeLa S3 cells was moderately activated by 2DG and by Rsv [7,8]. These observations suggest that CR mimetic drugs have protective effects on telomeres by inducing telomerase activity along with up-regulating expression of the telomere maintenance factor-encoding genes. To date, the human TERT (hTERT) promoter region has been well characterized, with c-Ets, GC-box, E-box and other TF-binding elements located in its 5’-flanking region [57,58]. GC-boxes and Sp1-binding sites are not the only elements common to the human TERT and WRN promoter regions; duplicated GGAA elements are also found adjacent to both TSSs (Table 1).
Interestingly, both a duplicated GGAA-motif and GC-boxes are contained within 500 bp upstream of the TSS of the human SIRT1 gene. It is suggested that human SIRT1 gene expression is regulated by PPARβ/δ through Sp1 binding elements. SIRT1, which belongs to the sirtuin protein family, is proposed to regulate aging and the healthspan of organisms. The biologically important function of SIRT1 is its NAD+ dependent deacetylating activity, which targets various proteins including histones, PGC-1α, FOXO1, p53 and HIF1α. These findings imply that the signals provoked by CR or CR mimetic drugs might induce Sp1 or GC-box binding TFs, thus simultaneously up-regulating expression of TERT, WRN, SIRT1, and the shelterin-encoding genes. Given that CR provokes a stress response in cells due to the lack of nutrients or energy needed to survive, cells need to stop growing while maintaining the integrity of chromosomes and telomeres without replicating their genome. Therefore, agents with the ability to induce telomere maintenance factor-encoding genes might serve as lead compounds for the design of anti-aging drugs.
5.2. Mechanisms that regulate aging or lifespan via mitochondria and metabolic stress
Genetic studies of C. elegans implied that the insulin/IGF-1 signaling pathway regulates the lifespan of animals. Insulin/IGF-1 signaling and glucose metabolism are thought to be associated with several diabetes/obesity controlling factors, including AKT, FOXO, mTOR and AMPK. mTOR is a component of mTORC1 and mTORC2, which play key roles in signal transduction in response to changes in energy balance. Recently, it was reported that mTORC1 in the Paneth cell niche plays a role in the response to calorie intake by modulating cADPR release from cells. AMPK is known to be a sensor of energy stress and DNA damage, which acts by phosphorylating various TFs, such as FOXO, PGC-1α, CREB and HDAC5 [64,66]. Moreover, AMPK regulates SIRT1 activity by modulating NAD+ metabolism.
It has been shown that mitochondrial functions can control lifespan. Furthermore, it was suggested that a cross-talk system between telomeres and mitochondria functions in the regulation of aging. This concept was derived from a Tert knock down experiment indicating that telomere dysfunction causes suppression of PGC-1α in a p53-mediated manner. The tumor suppressor p53 has been suggested to affect the aging of organisms as a pro-aging factor. Moreover, it is noteworthy that p53 regulates mitochondrial functions including respiration and glycolysis [70,71]. Taken together, these lines of evidence strongly suggest that p53-mediated signaling is transferred to telomeres and mitochondria in order to affect cellular senescence. Although a canonical GC-box motif is not found near the TSS, duplicated GGAA-motifs are located in the human TP53 promoter (Table 1). Therefore, transcription of genes that need to respond to energy stress might be classified into two types, namely the duplicated GGAA-motif-controlled and GC-box-controlled systems, which activate p53/DNA repair/mitochondria and telomere maintenance, respectively.
Here we discussed the TF-binding elements in the 5’-upstream regions of DNA-repair factor- and telomere maintenance factor-encoding genes, and proposed that duplicated GGAA motifs, in conjunction with the GC-box/Sp1-regulatory motifs, are common sequences required for their gene regulation (Table 1). Moreover, duplicated GGAA-motifs are frequently found in the bidirectional promoter regions of head-head oriented DNA-repair genes (Table 2). GGAA-containing sequences are known as targets for ETS family proteins, and the GC-box can be recognized by multiple proteins, including the Sp1 family. Therefore, multiple TFs may access and bind to the duplicated GGAA or GC-box when cells are exposed to DNA damage or energy stress (Fig. 1). We hypothesize that these genes are required to respond promptly and accurately when cells encounter stress signals, such as DNA damage or a lack of energy sources. This might in part explain why they have common cis-elements in their gene regulatory regions. However, the detailed molecular mechanism(s) by which expression of these DNA-repair genes and telomere maintenance genes is regulated are yet to be elucidated. Revealing the regulatory mechanisms behind expression of these genes should contribute to the future development of novel drugs for cancer, obesity and diabetes, and of anti-aging treatments.
The authors are grateful to Takahiro Oyama and Midori Konno for discussion and outstanding technical assistance. This work was supported in part by a Research Fellowship from the Research Center for RNA Science, RIST, Tokyo University of Science.