Alkylating agent damage from SN1 or SN2 methylating and ethylating agents.
Direct reversal repair eliminates some DNA and RNA modifications without using excision, resynthesis, and ligation. Therefore, because direct reversal repair does not require breaking of the phosphodiester backbone, it is error-free and preserves genetic information. Direct reversal is primarily utilized in correcting damage caused by DNA alkylating agents including N-methyl-N’-nitro-N-nitrosoguanidine (MNNG), N-methyl-N-nitrosourea (MNU), and methyl methanesulfonate (MMS) that react with DNA to produce various O-alkylated and N-alkylated products. Two major types of proteins conduct direct reversal repair, O6-methylguanine-DNA methyltransferases and ALKBH α-ketoglutarate Fe(II) dioxygenases (FeKGDs) . Although there are numerous methyltransferases in mammalian cells, those enzymes generally catalyze transfer of methyl groups to DNA or transfer of methyl groups to or from proteins [2-4]. In mammalian cells, there is only a single DNA methyltransferase protein, O6-methylguanine-DNA methyltransferase (MGMT or AGT), that removes methyl groups at exocyclic ring oxygens of DNA . The other type of direct reversal repair is performed by ALKBH proteins that are members of a superfamily of FeKGDs [4, 6]. Though the ALKBH family of FeKGDs encompasses nine proteins with conserved active site domains, removal of alkyl damage in DNA has only been established for four family members, ALKBH1 – 3 and FTO . Unlike repair by MGMT, which is inactivated following a single repair reaction, each ALKBH protein can catalyze numerous repair reactions to eliminate N-modifications of cytosine, adenine, thymine and guanine residues . In this review, prior to description of the direct reversal DNA repair enzymes and their functions, we will briefly describe sources of alkylation damage and adducts that are introduced upon exposure of cells to alkylating agents.
1.1. Sources of alkylation damage
Alkylating agents are present in the exogenous environment as well as intracellularly via oxidative metabolism. Alkylation damage from exposure to exogenous sources such as N-methyl-N’-nitro-N-nitrosoguanidine (MNNG), N-methyl-N-nitrosourea (MNU), and methyl methanesulfonate (MMS) (Figure 1B and 1D)  can arise from environmental agents present in various media including air, water, plants, and food . Furthermore, numerous alkylating agents are used in chemotherapy to attack rapidly dividing tumor cells . In addition to exogenous sources of DNA damage, a number of putative endogenous alkylating agents are proposed (Figure 1E). Endogenous agents can also introduce alkylation damage as a consequence of cellular metabolism . Among the possible alkylating agents implicated is the enzyme cofactor S-adenosylmethionine (SAM), which is involved in numerous biochemical processes . Methylating agents can also be formed via enzymatically-catalyzed chemical nitrosation reactions . These nitrosation reactions can activate choline and betaine, as well as other lipoperoxidation products (Figure 1F) [10, 12, 13]. Regardless of whether sources are exogenous or endogenous, reaction of alkylating agents with DNA and RNA generates adducts that can disrupt major cellular processes such as replication and transcription, which can trigger cell cycle checkpoints and initiate apoptosis . Importantly, if DNA alkyl adducts are left unrepaired, replication of damaged DNA can result in formation of mutations that, depending on the site within the genome, can lead to long term effects on cellular function.
1.2. Types of alkylating agents
The two major types of alkylating agents are SN1 (Figure 1A) and SN2 (Figure 1C). In the SN1 reaction mechanism, formation of a charged ionic species is generally the rate-limiting (unimolecular) step (Figure 1A), whereas SN2 reactions follow bimolecular kinetics (Figure 1C). Alkylating agents are generally electrophilic compounds that have an affinity for the nucleophilic centers in organic macromolecules and react in either a mono- or bifunctional manner [13, 15, 16]. Monofunctional agents contain a single reactive group that interacts covalently with a nucleophilic center in DNA and primarily modify ring nitrogens [14, 17]. Common nucleophilic reaction centers for monofunctional agents in DNA include: adenine N1, N3, and N6; guanine N1, N2, N3, N7, and O6; cytosine N3, N4, and O2; and thymine N3, O2, and O4 (Figure 2A), as well as phosphate modifications that form phosphotriesters (Figure 2B). Bifunctional alkylating agents, on the other hand, have two reactive groups that can interact with the DNA and can form cyclized or cross-linked DNA bases in addition to alkylating ring nitrogens (e.g., melphalan and nitrogen mustard (Figure 1B and 1E)) [14, 17].
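The kinetic distinction between the two classes can be written as rate laws. The block below is a generic textbook sketch and is not taken from Figure 1; the symbols R-X (alkylating agent), Nu (a nucleophilic center in DNA), and the rate constants k_1 and k_2 are notation assumed here for illustration.

```latex
% SN1: slow unimolecular ionization, then fast capture by the nucleophile
\mathrm{R{-}X} \xrightarrow{k_1\ (\text{slow})} \mathrm{R^{+}} + \mathrm{X^{-}}, \qquad
\mathrm{R^{+}} + \mathrm{Nu} \xrightarrow{\text{fast}} \mathrm{R{-}Nu^{+}}, \qquad
\text{rate} = k_1\,[\mathrm{R{-}X}]

% SN2: one concerted bimolecular step
\mathrm{Nu} + \mathrm{R{-}X} \xrightarrow{k_2} \mathrm{R{-}Nu^{+}} + \mathrm{X^{-}}, \qquad
\text{rate} = k_2\,[\mathrm{R{-}X}]\,[\mathrm{Nu}]
```

In this simplified picture, the SN1 rate is independent of the concentration of the nucleophile, whereas the SN2 rate depends on both reactants.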
1.3. Distribution of DNA damage manifested by simple alkylating agents
Alkylating agents can cause damage at all exocyclic nitrogens and oxygens in DNA and RNA, as well as at ring nitrogens (Figure 2A). However, the percentage of each base site modified depends on the alkylating agent, the position in DNA or RNA, and whether nucleic acids are single- or double-stranded (Table 1). Interestingly, O-alkylations are more mutagenic and harmful than N-alkylations, which may be more cytotoxic, but not as mutagenic.
Some frequent methylation sites in DNA include 1-methylguanine (1-meG), O6-methylguanine (O6-meG), 7-methylguanine (7-meG), 3-methylguanine (3-meG), 3-methylcytosine (3-meC), and 3-methyladenine (3-meA) (Figure 2A) [7, 10]. Importantly, nitrogens at base pairing positions in double-stranded DNA are less susceptible to alkylation damage than those found in single-stranded regions of DNA; however, methylating agents can react at Watson-Crick hydrogen bonding sites when DNA is single-stranded. As a result, 1-meA and 3-meC adducts are much more frequent in single-stranded than in double-stranded DNA (Table 1).
Base modifications caused by larger ethylating agents (Table 1) begin to show differences from those caused by the corresponding methylating agents. Ethyl methanesulfonate (EMS) reacts with guanine ring nitrogens similarly to methyl methanesulfonate (MMS), but there is a small, yet significant, decrease in the percentage of 1-meA and 3-meC formed by that agent. In contrast, 1-ethyl-1-nitrosourea (ENU) produces significantly less 7-ethylguanine compared to the percentage of 7-meG formed by exposure to 1-methyl-1-nitrosourea (MNU) (Table 1). The decrease in modification at the N7 position of guanine (G) is accompanied by a concomitant increase in the formation of phosphotriesters by ENU, which represent over 50% of the damage assayed.
| Alkylating Agent | MMS (SN2) | MNU (SN1) | EMS (SN2) | ENU (SN1) |
Given the number and importance of alkylating agents, much effort has been expended to study the biological effects of alkylated DNA in cells. Interestingly, the biological consequences of unrepaired alkylation damage vary depending on the site in DNA. For instance, exposure of double-stranded DNA to SN1 or SN2 alkylating agents results in more frequent generation of 7-meG and 3-meA; however, the consequences of unrepaired 7-meG and 3-meA are different. Specifically, 7-meG does not block DNA replication and therefore is not as cytotoxic as 3-meA, whereas 3-meA adducts arrest DNA synthesis and do not show altered coding specificity . In contrast, 7-meG can spontaneously depurinate and indirectly can lead to mutations at the apurinic sites that form. Furthermore, although both 7-meG and 3-meA modifications destabilize the glycosylic linkages, 3-meA repair occurs much more rapidly than 7-meG. Though not formed as frequently as 3-meA and 7-meG adducts, 3-meT and 3-meC lesions block DNA synthesis . Additionally, generation of 1-meA and 3-meC modifications, primarily in single-stranded DNA regions will also halt DNA polymerization . Unique to O6-meG, unrepaired damage is both cytotoxic and mutagenic .
2. Repair of DNA alkylation damage
The diversity of DNA alkylation damage necessitates the involvement of a number of DNA repair systems to eliminate the ensemble of lesions. As this chapter is focused on direct reversal repair mechanisms, we will only mention other major systems implicated in repair of alkylation damage in this section (Figure 3). Major repair pathways include excision repair mechanisms, namely base excision repair (BER) and nucleotide excision repair (NER), which require removal of the damage followed by resynthesis and ligation. Additionally, alkylation damage that persists during replication can lead to double strand breaks, which are repaired by either non-homologous end-joining (NHEJ) or homologous recombination (HR) mechanisms [20-22].
3. O6-methylguanine-DNA methyltransferases direct reversal repair
Of the aforementioned direct reversal proteins, the first O6-methylguanine methyltransferase was isolated from E. coli and named Ada because it was found to regulate the adaptive response to alkylation damage. Currently, O6-methylguanine-DNA methyltransferases have been identified in prokaryotes, archaea, and many eukaryotes. In initial mechanistic studies of the enzymatic activity of this group, incubation of protein extracts from E. coli with DNA containing tritiated methyl groups at the O6 position of guanine (G) resulted in an association of the radiolabel with the methyltransferase proteins [24, 25]. Specifically, the active site cysteine of Ada covalently and irreversibly accepted the methyl group from O6-meG, converting the cysteine to S-methylcysteine using a mechanism conserved in the homologous human protein MGMT. Although O6-meG (Figure 4) is the preferred substrate for MGMT, the protein can also remove longer alkyl chains from DNA, including ethyl- (Figure 4), propyl-, butyl-, benzyl- and 2-chloroethyl groups, as well as the methyl group of O4-meT (Figure 4).
3.1. MGMT protein structure/active site recognition
In human cells, O6-methylguanine-DNA methyltransferase (MGMT), encoded by the MGMT gene, is a monomer with a molecular weight of ~18 kDa [24, 26] that contains numerous conserved structural features found in O6-methylguanine-DNA methyltransferases from different species. For example, a zinc finger structure containing a single Zn(II) bound within a coordination sphere of four amino acids (Cys5, Cys24, His29, and His85) is preserved near the N-terminus (Figure 5) [27, 28]. The C-terminal domain (residues 86-207) of MGMT consists of an α/β fold that bears a helix-turn-helix (HTH) motif. The second or “recognition” helix of the HTH contains a highly conserved RAV[A/G] motif with an ‘arginine finger’ that promotes flipping of the target nucleotide from the base stacking arrangement. Additionally, the C-terminal portion comprises a short, two-stranded, parallel β sheet, four α helices, and a 3₁₀ helix where the active site PCHR cysteine motif is found. Immediately before the active-site methyl group acceptor cysteine (Cys145) are two tight, overlapping turns stabilized by a conserved asparagine hinge, Asn137, as well as the helix-turn-helix motif. Finally, the N-terminal portion of MGMT (residues 1-85) encompasses a conserved α/β roll structure with a three-stranded, anti-parallel β sheet followed by two helices.
3.2. MGMT substrate recognition/repair mechanism
The “suicide” mechanism that MGMT employs to directly remove O-alkyl adducts from DNA (Figure 6) is unique in that a single protein molecule is responsible for eliminating each lesion, which is a high energetic cost for cells . The single use of the enzyme means that cellular MGMT activity begins to be depleted as soon as the enzymatic reaction occurs . This suggests that as the enzyme repairs DNA lesions rapidly, saturation of the repair process occurs after which the initial repair rate slows considerably . Unlike most other DNA repair systems, MGMT acts as a single protein and no other enzymes or cofactors are involved in the process . At low MGMT concentrations and in the absence of any cofactors, the transfer of a methyl group occurs in <2 s at 37°C .
The reaction catalyzed by MGMT is similar to the first half of a ping-pong enzyme kinetics mechanism, in which a group, here the DNA adduct, is transferred from a substrate to a site in the enzyme, MGMT. In the absence of a second substrate, the group remains covalently attached to the protein, inactivating it. The alkyl group removed from DNA is covalently attached to the Cys145 residue in the active site of the human MGMT protein through an SN2 mechanism. The MGMT active site is buried inside the protein; therefore, MGMT must flip the damaged nucleotide out of the DNA helix in order to access it. In fact, a mispaired base in the helical structure is more likely to be detected by MGMT than the same base in a Watson–Crick base pair, which suggests that DNA structural distortion caused by alkyl base damage is an efficient way for the MGMT protein to locate damage sites on DNA. The proposed reaction mechanism between the alkyl group and the active site of MGMT (Figure 6) requires that His146 acts as a water-mediated general base to deprotonate Cys145, generating a cysteine thiolate anion, which attacks the O6-alkyl carbon of guanine, and an imidazolium ion stabilized by Glu172. Residue Tyr114 donates a proton to N3 of O6-meG and abstracts a proton from Lys165, facilitating simultaneous transfer of the methyl group on O6-meG to the thiolate anion of the Cys145 residue [33, 35].
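The stoichiometric, single-turnover nature of MGMT repair described in this section can be illustrated with a minimal sketch. The function name `suicide_repair` and the molecule counts in the usage lines are hypothetical and chosen only for illustration; the sketch assumes every available MGMT molecule finds a lesion and ignores kinetics, resynthesis, and degradation.

```python
def suicide_repair(mgmt_molecules: int, lesions: int) -> dict:
    """Single-turnover repair: each lesion repaired consumes one MGMT molecule."""
    if mgmt_molecules < 0 or lesions < 0:
        raise ValueError("counts must be non-negative")
    # One protein molecule is inactivated per lesion, so the number of
    # repairs is capped by whichever pool is smaller.
    repaired = min(mgmt_molecules, lesions)
    return {
        "repaired": repaired,
        "active_mgmt_left": mgmt_molecules - repaired,
        "inactivated_mgmt": repaired,
        "lesions_left": lesions - repaired,
    }


if __name__ == "__main__":
    # Hypothetical counts: repair capacity exceeds damage.
    print(suicide_repair(mgmt_molecules=1000, lesions=400))
    # Hypothetical counts: damage saturates the MGMT pool.
    print(suicide_repair(mgmt_molecules=1000, lesions=2500))
```

In the second call the MGMT pool is exhausted and lesions persist, which is the saturation behavior noted above for cells exposed to high levels of alkylating agents.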
3.3. MGMT expression/regulation
In human cells, MGMT is encoded on chromosome 10q26  and is a housekeeping gene that is expressed in all tissues, though expression levels vary between cell types . The highest MGMT expression levels in normal tissue are found in the liver, lung, kidney, and colon [32, 37-41]. Interestingly, the liver has higher levels of endogenous nitrosating agents relative to other organs, which could indicate a need for MGMT in that tissue. In contrast, the lowest levels of MGMT are found in the pancreas, hematopoietic cells, and lymphoid tissues [37, 41, 42]. MGMT levels in different cell types are dependent on various factors such as promoter regulatory elements, microRNAs (miRNAs) and possibly post-translational modifications. However, this correlation is not well-understood as evidenced by the fact that MGMT expression is up-regulated in some cancers, but silenced in others [37, 41, 43-45].
Transcription of MGMT is controlled by the 5’ promoter regulatory region of the gene and initiates at a single site within a GC-rich promoter that lacks TATA and CAAT boxes. Expression is additionally mediated by two glucocorticoid response elements (GREs), as well as activator protein-1 (AP-1) binding sequences, within the MGMT promoter (Figure 7) [46, 47]. Protection against alkylating agent treatment by MGMT can be induced in response to the phorbol ester phorbol-12-myristate-13-acetate (TPA), which regulates MGMT expression through Protein Kinase C (PKC) signaling. Thus, control of MGMT expression has implications for the use of chemotherapeutic drugs [48, 49].
In addition to transcriptional control, MGMT protein levels are regulated by microRNAs (miRNAs). miRNAs can lead to RNA degradation via the RNA-induced silencing complex (RISC) or can bind to the mRNA and inhibit translation. MGMT mRNA has a number of associated miRNAs [50-55], some of which control expression. A comparison of MGMT mRNA and MGMT protein levels indicated that the production of protein and transcripts is not directly correlated, which suggests a means of post-transcriptional control of protein synthesis. One miRNA, miR-181d, was linked to favorable glioblastoma patient responses to temozolomide (TMZ). Subsequently, analysis of mRNAs from glioma samples showed two alternative poly(A) signals in the 3’-untranslated region (3’UTR) of MGMT (Figure 8), producing long and short MGMT transcripts with identical full-length coding regions [50, 52]. Other in vitro analyses in cell lines identified three principal miRNAs that altered protein levels associated with the long MGMT transcript: miR-181d, miR-767-3p, and miR-648. All three miRNAs were linked to reduction in transcript levels. Moreover, miR-181d interacts directly with the MGMT transcript. Specifically, the longer 3’UTR provides a site for interaction with miR-181d that leads to degradation of the MGMT transcript and thereby regulates cellular MGMT levels. In contrast, although the miR-648 site is also in the longer transcript, its principal function is to limit translation rather than to promote degradation of the mRNA. Of note, cells with lower MGMT levels caused by association of the miRNAs with the longer transcript are more susceptible to methylating agent treatment than cells that do not have the alternative 3’UTR. Other miRNAs with MGMT target sites that have been associated with response to both chemotherapy and radiotherapy include miR-181b and miR-181c.
Additional miRNAs, including miR-661 and miR-370, had lesser effects on MGMT levels; however, this area is only beginning to be explored as a method of regulating direct reversal repair and is associated with patient responses to alkylating agents.
Epigenetic regulation is another means by which MGMT expression is controlled. Epigenetic changes are heritable changes that are not directly related to primary DNA structure and do not involve mutations to the DNA. Two main mechanisms through which epigenetic regulation occurs are DNA methylation and histone modifications. DNA methylation in mammals occurs at CpG sequences by introduction of a methyl group at the 5 position of cytosine. CpG sequences are underrepresented in mammalian genomes relative to the frequency expected from a random distribution of dinucleotide sequences. Depending on the position of a CpG site, enzymatic methylation can either enhance or reduce gene expression [58-60]. CpG sequences are often concentrated in promoter regions or early in genes in clusters defined as CpG islands. The MGMT promoter region has such a CpG island with multiple CpG dinucleotides in six different SP1 recognition sites (Figure 7), which can be methylated. Methylation of the MGMT promoter CpGs to 5-meC interferes with transcription factor binding, which leads to MGMT silencing.
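The underrepresentation of CpG dinucleotides, and their enrichment in CpG islands, is often quantified with an observed/expected ratio. The sketch below uses one commonly used form of that ratio, (number of CpG × sequence length) / (number of C × number of G); this formula, the function name `cpg_obs_exp`, and the toy sequences are assumptions introduced for illustration and are not taken from this review.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for a DNA sequence.

    Expected CpG count assumes C and G occur independently:
    expected = (count_C * count_G) / length.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both C and G
    return cpg * n / (c * g)


if __name__ == "__main__":
    # Toy sequences (hypothetical): CpG-dense versus CpG-depleted.
    print(cpg_obs_exp("CGCGCGCG"))
    print(cpg_obs_exp("CCCCGGGG"))
```

A ratio well below 1 indicates CpG depletion, as seen across most of a mammalian genome, whereas CpG islands show ratios much closer to the random expectation.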
The other major type of epigenetic control of gene expression is through post-translational modification of histones. Histone deacetylation has a major role in transcriptional regulation by removing acetyl groups from lysine residues in the amino-terminal histone tails, which stabilizes DNA-histone interactions and condenses chromatin such that transcription factor binding sites are blocked and inaccessible [62, 63]. Such histone modifications contribute to the epigenetic regulation of MGMT expression. Acetylated H3 and H4 histones in a ‘hot spot region’ (~-100 to -250 from the transcription start site) and in AP-1 binding sites (-605 to -611 and -798 to -804) of the MGMT promoter were associated with enhanced MGMT expression. MGMT expression is also controlled by another group of proteins, methyl-CpG-binding-domain (MBD) proteins. When MBD protein levels are high, the MGMT promoter is silenced, suggesting that MBD proteins lead to removal of lysine acetylation from histones H3 and H4, which results in more condensed chromatin, inhibits transcription factor access to the MGMT promoter region, and consequently inactivates transcription [61, 66]. Other factors that can contribute to epigenetic regulation include diet and lifestyle choices [56, 64]. Thus, multiple epigenetic factors influence MGMT expression, including, but not limited to, CpG methylation at the promoter. MGMT silencing is discussed further below with respect to its biological significance.
3.4. MGMT localization
MGMT is actively transported to the nucleus ; however, establishing stable MGMT levels in the nucleus is a two-step process. In the first step, MGMT is transported to the nucleus, and then, once in the nucleus MGMT is localized to regions of active RNA polymerase II transcription . Nevertheless, the localization of MGMT changes dramatically upon treatment with DNA alkylating agents. Following MNU exposure, those MGMT foci co-localized with RNA polymerase II transcription sites diffuse, suggesting dispersal of MGMT to damage sites . After elimination of DNA alkylation damage, and ensuing inactivation of MGMT, ubiquitination of Lys125 and Lys178 targets the inactive protein for degradation via the proteasome . Importantly, retention of MGMT is mediated by the basic PKAAR sequence (codons 124-128) that prevents the loss of the protein from the nucleus .
3.5. Post-translational modifications of MGMT
Transfer of the methyl group to generate the thioether S-methylcysteine restores the original guanine base, but in the process irreversibly inactivates the MGMT enzyme. Following the methyl group transfer and inactivation of MGMT, the enzyme is ubiquitinated and subjected to degradation by the proteasome [34, 71]. There are also predictions for general sites of methylation, acetylation, sumoylation, and phosphorylation in MGMT, which include: methylation of Arg128 and Arg135; acetylation of Lys8, Lys125, Lys178, and Lys193; sumoylation of Lys75, Lys205, Lys18, and Lys107; and phosphorylation of Ser36, Ser56, Ser130, Ser182, Ser202, Ser206, Ser208, Thr37, Tyr91, and Tyr115 [69, 71-73]. These modifications may stimulate or attenuate MGMT activity or assist in translocation to damage sites, but the role of such putative modifications is still not clear and is a possible target for future investigations.
3.6. Biological significance of direct repair by MGMT
Though exposure to alkylating agents introduces numerous DNA and RNA alkyl adducts (Figure 2), specific repair of O6-meG and O4-meT adducts by MGMT reduces cytotoxicity and mutagenicity, as shown by the increased cell death and mutation frequency in Mgmt-deficient murine models treated with SN1 or SN2 alkylating agents [39, 74, 75]. Lack of O6-meG or O4-meT adduct repair can result in cytotoxicity due to interference with replication and transcription machinery, which leads to apoptosis [76, 77]. Alternatively, if a modified base can form at least two hydrogen bonds, transcription, replication, and translation of templates can continue. In the absence of repair, O6-meG can readily form two hydrogen bonds to base pair with T. That mispairing can lead to G→A transition mutations [78, 79], which have been observed in Ada- and Ogt-deficient E. coli and in Mgmt-deficient murine systems. Otherwise, in the absence of repair, translesion synthesis (TLS) DNA polymerases can bypass DNA adducts, facilitating progression of replication and transcription past the damaged bases [80, 81]. However, TLS DNA polymerases exhibit reduced fidelity compared to normal replicative polymerases and are more tolerant of distortions in DNA that may result from alternative hydrogen bonding and non-Watson-Crick base pairing with damaged bases. Consequently, in the absence of MGMT, mispairs are readily incorporated opposite unrepaired bases, reducing cytotoxic effects, but increasing mutagenicity.
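The route from an unrepaired O6-meG to a fixed G→A transition takes two rounds of replication, which can be traced with a small sketch. The symbol `X` for O6-meG, the function names, and the four-base toy sequence are hypothetical; the model assumes that T is always inserted opposite O6-meG and ignores strand polarity, repair, and polymerase choice.

```python
# Template base -> base inserted opposite it. "X" stands for O6-meG (assumed
# symbol), which is modeled here as always pairing with T.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G", "X": "T"}


def copy_strand(template: str) -> str:
    """Position-wise copy of a template strand (strand polarity is ignored)."""
    return "".join(PAIRING[base] for base in template)


def strand_after_two_rounds(damaged: str) -> str:
    """Sequence equivalent to the damaged strand after two replication rounds."""
    first_copy = copy_strand(damaged)      # round 1: T inserted opposite O6-meG
    second_copy = copy_strand(first_copy)  # round 2: A inserted opposite that T
    return second_copy


if __name__ == "__main__":
    original = "ACGT"  # toy sequence
    damaged = "ACXT"   # the G has been converted to O6-meG
    print(original, "->", strand_after_two_rounds(damaged))
```

In this toy model the position originally occupied by G reads as A after the second round, so the G:C pair has become an A:T pair, while an undamaged strand is reproduced unchanged.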
Direct reversal repair of O6-meG is a relatively simple repair mechanism, but the biological consequences in the absence of O6-meG repair are of great importance. For instance, MGMT promoter silencing is exhibited in numerous types of cancers including breast, lung, colon, head and neck cancers [83-86]. In gliomas, higher MGMT promoter methylation is linked to increased overall survival in response to alkylating agents [38, 50, 87, 88], but has also been noted in myeloma, colon, pancreatic, breast, and lung cancers, as well as non-Hodgkin lymphoma [38, 86, 89, 90]. Furthermore, the expression of microRNA, miR-181d [50, 52], which targets MGMT expression, is predictive of patient responses to the chemotherapeutic drug TMZ. Presumably, decreases in MGMT, either by promoter silencing or miRNA inhibition, permit cytotoxic O6-meG adducts to remain in DNA and lead to increased cell death that is more specific to dividing tumor cells.
4. ALKBH Fe(II)/α-ketoglutarate-dependent dioxygenases direct reversal repair
The other family of direct reversal DNA repair proteins found in mammalian cells is the ALKBH family. Similar to MGMT, the AlkB protein was initially discovered in E. coli. Though AlkB was originally identified using a screen for methyl methanesulfonate (MMS) sensitive mutants , it took almost 20 years to classify AlkB as part of the FeKGD superfamily  and to demonstrate its ability to reverse 1-meA and 3-meC damage via oxidative demethylation [92, 93]. Two human homologs of AlkB, ALKBH2 and ALKBH3, were subsequently confirmed to be oxidative DNA demethylases [94, 95]. DNA adducts that are typically repaired by ALKBH proteins are 1-meA, 3-meC, 1-meG, 3-meT, 1-etA, 3-etC, as well as etheno adducts, 1,N6-ethenoadenine, and 3,N4-ethenocytosine [6, 91, 93, 96] (Figure 9). Additionally, bacterial AlkB and mammalian ALKBH3 also repair alkyl adducts in RNA .
The ALKBH family consists of nine human ALKBH enzymes, ALKBH1-8 and the fat mass and obesity-associated protein (FTO) [1, 97, 98]. Despite primary structure conservation, only ALKBH1-3 and FTO have demonstrated unambiguous DNA repair activity [6, 97, 98]. The prototypical substrates for ALKBH1-3 are 1-meA and 3-meC adducts, but other modifications can also be substrates for those proteins (Figure 9). Cells that are deficient in ALKBH proteins generally show a higher sensitivity to SN2 type alkylating agents and a higher mutant frequency [75, 99, 100]. Adducts repaired by ALKBH proteins are considered cytotoxic because they prevent hydrogen bonding with a complementary nucleotide and thus arrest DNA and RNA synthesis, blocking replication and transcription, which leads to apoptosis [73, 91, 93, 100-102]. Alternatively, increased mutant frequency could result from unrepaired lesions that undergo mutagenic bypass. In murine models, targeted deletion of Alkbh1 is linked to developmental defects, with Alkbh1 enzymatic activity primarily directed at demethylation of histone H2A [103-105]. However, because this review is focused on DNA repair, our discussion of ALKBH1 will be limited. Similarly, although FTO removes 1-meA and 3-meC damage in vitro, its role is more closely linked to RNA demethylation or demethylation of 6-meA [69, 106-110], which is not usually linked to DNA repair functions, and it will therefore not be discussed further in this review.
4.1. ALKBH protein structure/active site organization
Although sequence homology among human ALKBH proteins is limited to the active site and conserved domains, the secondary structures are conserved. For instance, all ALKBH family proteins have similar catalytic domains, but varying DNA recognition motifs. Conserved domains in the FeKGD superfamily include a jelly-roll topology with a His1-X-Asp/Glu-Xn-His2 motif. Specifically, the jelly-roll is made up of two sheets of antiparallel β-strands that contain His131, Asp133, and Trp69 residues, which bind the iron and 2-oxoglutarate co-substrates [102, 112, 113] (Figure 10). Additionally, the His187 residue of the jelly-roll assists in Fe ligation, whereas Arg204 and Arg210 form salt bridges with the carboxylates of 2-oxoglutarate.
The catalytic core of both ALKBH2 and ALKBH3 is made up of three major components: a double-stranded β-helix, a nucleotide recognition lid, and an N-terminal extension [114, 115]. The Nucleotide Recognition Lid (NRL) is comprised of β-hairpin motifs which create a substrate binding groove that covers the active site until substrate is bound . Despite similar catalytic mechanisms, β-strands and α-helices that create distinct outer walls of the DNA binding groove, which is involved in substrate recognition and specificity, vary between ALKBH proteins . More explicitly, this divergence is present in the two looped structures, or “flips” that lie between the single β-sheet and two α-helices in the N-terminal portion of the catalytic region [112, 113]. In ALKBH2, the first flip is made up of 20 residues that constitute a β-hairpin and a short α-helix that together create a hydrophobic substrate binding groove [113, 116]. ALKBH3, on the other hand, contains a first flip that is a β-hairpin that is made up of only 17 residues, forming a hydrophilic binding groove that has a preference for single stranded DNA or RNA substrates [111, 112]. The second flip in ALKBH2 consists of 24 residues that is composed of three β-sheets, allowing for interaction with both strands of DNA, whereas flip 2 of ALKBH3 is only 12 residues long and is made up of a single β-sheet .
To facilitate repair, ALKBH proteins flip the damaged base into their active site enabling the protein to interact with both strands of the DNA [111, 113]. In the case of ALKBH2, a short loop with a positively charged RKK sequence (Arg241–Lys243) is involved in grasping the complementary strand of the DNA, while a longer, more flexible loop, containing the residues Arg198, Gly204 and Lys205, binds the opposite DNA strand [111, 113, 114]. Additionally, ALKBH2 uses an aromatic finger residue, Phe102, to intercalate within the duplex stack, filling the gap resulting from DNA base flipping . Tyr76 forms hydrogen bonds with the two phosphates 5′ of the methylated base, maintaining the substrate within the active site and residues Asp135 and Glu136 hydrogen bond with the exocyclic amino group via a water molecule [113, 115]. On the contrary, ALKBH3 does not contain the same aromatic finger residue and RKK motif as ALKBH2 and therefore has a greater preference for single-stranded DNA substrates. As a result, “flipping” of the damaged base is accomplished by squeezing the DNA proximal to the damage, causing it to rotate outward [112, 113]. Though the structures of the ALKBH homologs 4 – 8 have not been studied extensively, differences in the organization of the catalytic residues and active sites are predicted to influence the substrate specificities as well as enzymatic activities of these homologs .
4.2. Substrate recognition/repair mechanism
The ALKBH family of proteins repairs DNA methyl adducts via a mechanism known as oxidative demethylation, which results in the direct restoration of the original base coupled with the release of the hydroxylated methyl group as formaldehyde [92-94]. Other modifications can also be removed using similar mechanisms. Unlike MGMT, the repair mechanism utilized by the ALKBH family requires molecular oxygen, Fe(II), and α-ketoglutarate as co-factors to execute removal of alkyl adducts from DNA [93, 94]. The ALKBH repair reaction consists of four steps with various intermediates (Figure 11). The first step involves a reaction between the active site Fe(II) and O2, which produces a superoxo anion (O2•−) bound to Fe(III). The superoxide then attacks the α-keto carbon of α-ketoglutarate, resulting in a bridged peroxo-type intermediate. This intermediate is decarboxylated, releasing succinate and CO2, and undergoes heterolytic cleavage of the O–O bond to form the high-valent ferryl-oxo intermediate. The ferryl-oxo species then hydroxylates the alkyl adduct on the DNA, producing an unstable intermediate that decomposes in water, with release of formaldehyde for methylated bases (and other aldehydes, depending on the substrate), restoring the original undamaged base (Figure 11) [75, 101, 118, 119].
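The net chemistry of the steps described above can be summarized for a 1-meA lesion as two overall equations. This is a simplified bookkeeping sketch that omits the iron-bound intermediates; the label "1-(hydroxymethyl)A" for the unstable hydroxylated intermediate is a name assumed here for illustration.

```latex
% Oxidative hydroxylation of the methyl adduct, coupled to decarboxylation of alpha-ketoglutarate
\text{1-meA} + \alpha\text{-ketoglutarate} + \mathrm{O_2}
  \xrightarrow{\ \text{ALKBH},\ \mathrm{Fe(II)}\ }
  \text{1-(hydroxymethyl)A} + \text{succinate} + \mathrm{CO_2}

% Spontaneous decomposition of the hydroxylated intermediate in water
\text{1-(hydroxymethyl)A} \xrightarrow{\ \mathrm{H_2O}\ } \text{A} + \mathrm{HCHO}
```

One molecule each of O2 and α-ketoglutarate is consumed per methyl group removed, and the enzyme itself is regenerated, in contrast to the single-turnover reaction of MGMT.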
Though ALKBH proteins primarily repair 1-meA and 3-meC (Figure 9A), those proteins also repair etheno and other exocyclic bases, but to a lesser extent than the methylated bases (Figure 9B and 9C, Figure 11C and 11D) [75, 96, 99, 118, 120, 121].
4.3. Gene expression/protein regulation
ALKBH2 and ALKBH3 are located on chromosomes 12q24 and 11p11, respectively, and are considered housekeeping genes. The mRNA and protein levels of ALKBH2 and ALKBH3 vary with tissue type and homolog [122-124]. However, both ALKBH2 and ALKBH3 are highly expressed in the testis and ovary [123, 124]. ALKBH2 and ALKBH3 contain CpG islands in their promoters (Figure 12A and 12C), but the role that those structures play in gene expression remains undefined.
Control of expression for ALKBH2 and ALKBH3 has not been as thoroughly studied as for MGMT. Interestingly, the arrangement of ALKBH2 and the uracil-DNA glycosylase gene (UNG) suggests a possible means of controlling gene expression. ALKBH2 is adjacent to UNG on human chromosome 12, but is transcribed in the opposite direction. The opposite orientations of these two genes could have an influence on their expression (Figure 12B). For ALKBH3, expression could be controlled by a putative ALKBH3 antisense RNA that is also converted into a long non-coding RNA (lncRNA) sequence (Figure 12D). The role that the lncRNA plays in ALKBH3 expression remains to be established. Unlike MGMT, control of ALKBH2 and ALKBH3 expression via miRNAs has not been examined. However, miR-505-5p is reported to target ALKBH2 (www.Exiqon.com), and at least three miRNAs (miR-188-3p, miR-4774-3p, and miR-5580-5p) that could be involved in regulating ALKBH3 expression have been identified. Thus, much about the regulation of ALKBH2 and ALKBH3 expression is yet unresolved.
4.4. Protein localization
ALKBH2 and ALKBH3 not only have different substrate preferences, but also exhibit different subcellular localization patterns, suggesting distinct biological functions. ALKBH2 is strictly nuclear and is found mainly at replication foci during S-phase [122-124]. Additionally, ALKBH2 co-localizes with PCNA [123, 124], indicating a possible role in DNA repair close to the replication fork. In contrast, ALKBH3 is found in the nucleus and in the cytoplasm [111, 122]. Association of ALKBH3 with the activating signal cointegrator complex 3 (ASCC3) helicase enzyme is consistent with nuclear localization, whereas the role of ALKBH3 in mRNA repair is consistent with its localization in the cytoplasm.
4.5. Post-translational modifications of ALKBH proteins
Post-translational modifications of residues within ALKBH2 and ALKBH3 have been examined using site-specific mutagenesis methods, as well as mass spectrometry, but the effects of these modifications are unknown. ALKBH2 residues Lys34 and Lys104 can be acetylated; Tyr91, Thr93, and Thr252 can be phosphorylated; and Lys104 can be ubiquitinated. Though the effects of these modifications on ALKBH2 activity have not been established, it is important to note that residue Lys104 falls within the variable region of the N-terminus, which provides for protein specificity. Similarly, post-translational modification of ALKBH3 includes several phosphorylated residues such as Thr126, Thr212, and Thr214, as well as Tyr127 and Tyr143, some of which have been shown to correlate with acute myelogenous, chronic myelogenous, and T-cell leukemia. Moreover, phosphorylation of active site residues Thr212 and Thr214 has been observed in liver cancer tissue samples, while phosphorylated Tyr residues have been reported in lung and non-small cell lung cancer cell lines. Despite some intriguing results as possible biomarkers for tumors, the functions of these post-translational modifications have not been identified.
4.6. Biological significance of direct repair by ALKBH proteins
Similar to MGMT, the presence of ALKBH2 and/or ALKBH3 reduces cell cytotoxicity and mutagenicity, as shown in Alkbh-deficient murine models treated with the SN2 alkylating agent methyl methanesulfonate (MMS) [75, 100, 127]. Though the major damage sites repaired by ALKBH proteins are only susceptible to modification when DNA is single-stranded, formation of 1-meA and 3-meC at DNA base-pairing positions prevents proper base insertion, which can halt DNA synthesis and cause replication fork collapse. As a result, persistence of 1-meA and 3-meC adducts increases cytotoxicity by triggering programmed cell death [75, 100, 127]. In contrast to O6-meG, following alkylation of the N1 of a purine or the N3 of a pyrimidine, only a single hydrogen bond can be readily formed. Therefore, the increased mutant frequency exhibited in Alkbh-deficient murine models is likely due to adduct bypass by TLS DNA polymerases. In E. coli, evaluation of 1-meA, 3-meC, 1-meG, or 3-meT mutagenicity revealed that all adducts were highly mutagenic, with the exception of 1-meA, which was only slightly mutagenic. Interestingly, of the adducts repaired by ALKBH proteins, 3-meC is formed at the highest frequency in response to MMS treatment (Table 1), and the mutations identified following MMS treatment in Alkbh2- or Alkbh3-deficient primary MEFs were C:G → A:T and C:G → T:A, suggesting that 3-meC is highly mutagenic in the absence of repair.
Similar to O6-meG repair by MGMT, direct repair by ALKBH proteins has a biological significance that is not well understood. Varying expression levels of ALKBH2 and ALKBH3 also contribute to the progression or suppression of different types of cancers. Down-regulation of ALKBH2 increases sensitivity of H1299 lung cancer cells to the drug cisplatin, improving overall survival. However, down-regulation of ALKBH2 has been observed to promote the growth of gastric cancer cells. Mutations in ALKBH2 and ALKBH3 have also been associated with their enhanced expression levels in glioma cells and pediatric brain tumors [131, 132]. ALKBH2 also mediates resistance to the alkylating agent therapeutic temozolomide (TMZ) in glioblastoma cells. ALKBH3 silencing induced senescence and increased sensitivity to alkylating agent therapies in prostate cancer cells [126, 130]. Therefore, further investigation of ALKBH2 and ALKBH3 in different types of cancer is important to define the specific roles of individual ALKBH family proteins.
5. Models of direct reversal repair and implications as therapeutic targets
Repair of DNA damage is critical for cell survival and maintenance of genome integrity. Not surprisingly, cells depend on direct repair mechanisms to remove damage that could otherwise be cytotoxic or mutagenic. Understanding the roles that direct reversal repair proteins play in genome stability also enables exploration and exploitation of these proteins as therapeutic targets. Therefore, use of currently established animal models, as well as generation of additional models, is integral to the development of diagnostic and therapeutic approaches.
5.1. Current mammalian models defective in DNA direct reversal repair genes
The effects and efficiency of repair pathways are best studied by observing the effects on cell cytotoxicity, replication, transcription, and mutation in the absence of the repair proteins. To evaluate the impact of the absence of direct repair proteins, animal models with targeted deletions have been developed for MGMT, as well as ALKBH proteins. Interestingly, murine knock-out (KO) models for Mgmt and Alkbh2 or Alkbh3 do not exhibit a detectable phenotype in the absence of alkylating agent treatment [75, 85, 100, 127, 129, 134-137]. As anticipated from in vitro cell culture studies, Mgmt-deficient mice treated with a number of alkylating agents, including N-methyl-N’-nitro-N-nitrosoguanidine (MNNG), N-methyl-N-nitrosourea (MNU), 1,3-bis(2-chloroethyl)-1-nitrosourea (BCNU), 1-(4-amino-2-methyl-5-pyrimidinyl)methyl-3-(2-chloroethyl)-3-nitrosourea (ACNU), streptozocin, TMZ, and dacarbazine, exhibited lethality at lower drug concentrations than wild type mice [10, 40, 87, 138], consistent with increased toxicity due to the absence of Mgmt. Moreover, Mgmt KO mice that were treated with MNU lost hematopoietic stem cells [135, 139] and were prone to the development of thymic lymphomas as well as lung adenomas [103, 120, 139]. However, the sensitivity to methylating agents is lost when the mismatch repair system is not functional [85, 135, 140-143], which has implications for therapeutic treatments. Importantly, mice heterozygous for Mgmt do not exhibit decreased survival following alkylating agent treatment. Furthermore, mice exhibiting elevated levels of Mgmt are resistant to alkylating agent-induced tumor formation.
Even though Alkbh2- and Alkbh3-deficient murine models do not show any overt phenotypic changes compared to their wild type counterparts, over time the Alkbh2-deficient mice accumulate high levels of 1-meA in the liver. Similarly, double mutants with targeted deletions in both Alkbh2 and Alkbh3 do not demonstrate an obvious phenotype, and the mice are fertile and live to normal ages [75, 100, 127]. Of note, a mouse model that targeted both the FeKGDs (Alkbh2 and Alkbh3) and the base excision repair pathway (Mpg or Aag) was generated. All of those proteins have roles in repair of alkylation damage. In response to alkylation damage (chemically-induced colitis), that mouse model deficient in all three proteins manifested a synergistic phenotype that resulted in death with even a single treatment. Due to the lack of phenotypic effect and limitations in treating animals, analysis of the effects of Alkbh2 and/or Alkbh3 deficiency has also been conducted using primary and immortalized mouse embryonic fibroblasts (MEFs) [75, 100, 127].
5.2. Generation of new in vitro mammalian models defective in DNA direct reversal repair genes
The capacity to directly target human cells to either abrogate protein function or place specific tags on proteins has been limited. Recently, the development of clustered regularly interspaced short palindromic repeats/CRISPR associated endonuclease 9 (CRISPR/Cas9) technologies has permitted the rapid generation of targeted deletions in human and rodent cells. Implementation of that technology will facilitate the study of DNA repair and mutagenesis in great depth. The basis for targeting genomic DNA to generate deletions is outlined in Figure 13. A guide RNA is designed based on an exon sequence in the genomic DNA. Following transfection of a plasmid expressing the guide RNA and the Cas9 mRNA, the protein-RNA complex induces a double-strand break in the genomic DNA. Repair by non-homologous end-joining leads to a change in the reading frame that inactivates the protein. This powerful technology can also be used to introduce point mutations and to create other cancer-prone models in human cells for study in vitro and also in vivo in animal models. Readers are referred to recent publications that describe the possibilities of using these methods [144-146].
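The guide-design and frameshift logic described above can be illustrated with a minimal Python sketch. It is not the protocol used in the cited studies. It assumes the commonly used Streptococcus pyogenes Cas9, which recognizes a 20-nt protospacer followed by an NGG PAM and cuts 3 bp upstream of the PAM; it scans only the strand it is given; and the names `Guide`, `find_guides`, and `causes_frameshift`, as well as the example sequence, are hypothetical.

```python
import re
from typing import List, NamedTuple

# Assumption: SpCas9-style target = 20-nt protospacer followed by an NGG PAM.
# A lookahead is used so that overlapping candidate sites are all reported.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


class Guide(NamedTuple):
    start: int        # 0-based index of the first protospacer base
    protospacer: str  # 20-nt target sequence
    pam: str          # the adjacent NGG motif
    cut_index: int    # 0-based index of the first base 3' of the blunt cut


def find_guides(exon: str) -> List[Guide]:
    """List candidate guide targets on the given strand of an exon sequence."""
    exon = exon.upper()
    return [
        # The cut falls between protospacer positions 17 and 18 (3 bp 5' of the PAM).
        Guide(m.start(), m.group(1), m.group(2), m.start() + 17)
        for m in SITE.finditer(exon)
    ]


def causes_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Hypothetical 26-nt exon fragment containing a single NGG-adjacent site.
for guide in find_guides("GATTACAGATTACAGATTACAGGTTT"):
    print(guide)
```

In practice, candidate guides would also be screened on the reverse strand and for off-target matches elsewhere in the genome, and only indels that are not multiples of three nucleotides would be expected to disrupt the reading frame.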
5.3. Direct repair proteins as therapeutic targets
DNA repair deficiency is associated with increased cancer risk and tumor formation, but it has also been exploited in therapeutic strategies employing synthetic lethality, which aim to overload cancer cells with damage that results in apoptosis, while normal cells with efficient repair eliminate the damage induced by the chemotherapeutic regimen [149-153]. Commonly, anti-neoplastic therapies utilize alkylating agents as well as ionizing radiation (IR); however, these treatments not only induce cell death in cancer cells, but can also increase the formation of mutations in normal cells, leading to an increased risk of secondary cancers. Synthetic lethality approaches exploit DNA repair defects found in tumor cells, which must rely on alternative repair systems. Inhibiting the alternative repair systems results in increased cell death that is specific to the tumor. Currently, both chemotherapy and radiation are used in combination to target specific DNA repair proteins in cancer cells in order to improve therapeutic efficacy and limit drug resistance [154, 155]. One of the advantages of direct reversal repair proteins as targets is that a single protein is ultimately responsible for elimination of the damage and that no breaks are made in the DNA by the repair mechanism. Targeting direct reversal repair proteins to increase sensitivity could supplement the efficacy of the alkylating agents already used in clinical protocols.
MGMT inhibitors currently exist and include O6-benzylguanine (BG) and O6-(4-bromothenyl) guanine (PaTrin-2 or lomeguatrib). Patient studies indicate that treatment with BG eliminates nearly 99% of MGMT activity in human cells for 24 hours after the removal of the chemotherapeutic drug. Additionally, clinical trials of BG in combination with BCNU have been conducted to evaluate enhancement of alkylating agent chemotherapeutics, and treatment with O6-(4-bromothenyl) guanine has been evaluated in glioma patients.
The roles of ALKBH2 and ALKBH3 in response to methylating agent chemotherapies (particularly for TMZ) remain unclear. Although MGMT is associated with resistance, there are reports of ALKBH2 also enhancing resistance. Inhibitors of AlkB have already been identified, but inhibitors of the human homologs have not been reported [157-159]. The identification of ALKBH inhibitors and their use with current chemotherapeutics could provide new tailored therapies for patients.
In addition to the development of therapies targeting the proteins, RNA interference-mediated gene silencing (RNAi) [160, 161] is a possible alternative approach for specifically targeting MGMT or ALKBH family members that allows depletion of proteins for extended periods of time. Lowering the levels of proteins that protect against DNA damage would render cells more susceptible to chemotherapeutic agents and should theoretically offer better drug efficacy.
siRNAs can be identified using empirical methods to target direct DNA repair protein mRNAs (i.e., for MGMT, ALKBH2, or ALKBH3). At present, reports indicate that higher MGMT levels are associated with poor response to therapy using TMZ, whereas patients with lower MGMT levels have better responses to TMZ therapy [23, 52, 86-90]. Rather than using siRNA constructs defined empirically, naturally occurring miRNAs could be used to reduce DNA repair protein levels and improve therapeutic responses for tumors (e.g., glioblastomas) that respond to methylating agents (e.g., TMZ). A number of miRNAs, including miR-181d, miR-195, and miR-196b, have already been identified that negatively correlate with overall survival in glioma patients [52, 160]. Using miRNAs to target the MGMT transcript could attenuate MGMT levels in tumors, rendering the tumors more susceptible to TMZ treatment. That susceptibility of cells to TMZ/miRNA treatment was demonstrated in vitro [50, 52]. In T98G or A1207 cells, miRNA targeting of the MGMT transcript in combination with TMZ reduced cell viability by up to 2.5-fold compared to cells in which the MGMT levels were not reduced [50, 52]. To date, in the context of TMZ treatment, these miRNAs have been proposed as biomarkers to predict probable patient outcomes [50, 52, 55, 160]. In an evaluation of The Cancer Genome Atlas dataset for glioblastomas, MGMT transcript levels and miR-181d correlated with patient survival. In the future, miRNAs against MGMT mRNA could be introduced to augment therapeutic response (Figure 7). Other microRNAs associated with ALKBH2 and ALKBH3, such as mir-505-5p for ALKBH2 (www.Exiqon.com) and mir-188-3p, mir-4774-3p, and miR-5580-5p for ALKBH3, could also offer new therapeutic avenues, because there are also reports that TMZ response is linked to the presence of ALKBH2 or ALKBH3 [100, 126, 133]. The use of miRNAs has great potential for high specificity with limited side effects.
Targeting transcripts using miRNAs is an exciting area for developing new therapeutic targets and biomarkers for predicting outcomes. Employing miRNAs could have substantial benefits for patients, but much work remains to bring such promising therapies to fruition.
Direct reversal repair is one of the lesser-known mechanisms by which cells repair DNA. Unique to direct reversal repair pathways, repair occurs without breakage of the DNA backbone and the processes are error-free. As a result, direct reversal repair proteins have central roles in the preservation of genomic stability. In mammalian cells, direct reversal repair is principally limited to correcting DNA alkylation damage that can arise from exogenous or endogenous sources. Elimination of alkylation damage by direct reversal is achieved by two major types of proteins: O6-methylguanine-DNA methyltransferases and ALKBH α-ketoglutarate Fe(II) dioxygenases. Although much is known biochemically about direct reversal repair enzymes, epigenetic factors, post-translational modifications, as well as genomic and mitochondrial DNA repair mechanisms require further investigation. Recent data establishing the function of direct reversal repair proteins in model organisms, most prominently in mice, have contributed to the understanding of the biological function of these proteins. Already, the partial understanding of these mechanisms has been translated into clinical use, and in the future it should have an even greater influence on improving therapeutic outcomes.
Alya Ahmad and Stephanie L. Nay contributed equally to this work. The authors would like to acknowledge support from the Beckman Research Institute, the Irell and Manella Graduate School of Biological Sciences, and the National Institutes of Health (R01CA176611 [Termini/O’Connor], P50 CA107399 [Forman/Project 2 Bhatia/O’Connor], the City of Hope Comprehensive Cancer Center Support Grant P30 CA033572 [Rosen], and NIH Training Grant 2T32-AI52080-11 [Nay]) for funding work in this publication. Correspondence should be addressed to email@example.com.