Genomic DNA is constantly associated with the various proteins that are involved in DNA folding and transactions. The association between DNA and proteins is reversible, and when prompted, proteins dissociate from or translocate along the DNA strand, leaving the open nucleotide sequence available for replication, transcription, and repair. This process ensures the faithful expression and propagation of genetic information. However, exposure of cells to DNA-damaging agents can cause proteins to become covalently trapped on DNA, generating DNA–protein cross-links (DPCs) [1, 2]. The formation of DPCs was originally demonstrated in bacterial and mammalian cells that had been heavily irradiated with ultraviolet light [3, 4]. It was subsequently shown that DPCs can be produced by various chemical and physical agents, such as aldehydes, metal ions, and ionizing radiation, and by certain types of anticancer agents [8-10].
DPCs are unique among DNA lesions, since they are extremely bulky and are likely to impose steric hindrances upon proteins involved in DNA transactions, and hamper their function. Despite the potential importance of DPCs as genomic damage, they have received less attention than other DNA lesions. Accordingly, much remains to be learned about how cells alleviate the toxic effects of DPCs and about what happens to cells if DPCs are left unrepaired.
The characteristics of DPCs vary considerably with respect to the size, physicochemical properties, biological function, and cross-linking bonds of the trapped proteins. The currently known DPCs can be subdivided into four groups (types 1–4) according to whether and how they are associated with flanking DNA nicks (Fig. 1) [2, 11]. Type 1 DPCs contain proteins that are covalently attached to an undisrupted DNA strand. They are the most common form of DPC found under physiological conditions and are produced by chemical and physical agents such as aldehydes, chromate, platinum compounds, ionizing radiation, and ultraviolet light. Type 2 DPCs, which were identified very recently in vitro and in vivo, contain poly(ADP-ribose) polymerase-1 (PARP-1) attached to the 3’ end of a DNA single-strand break (SSB) [12, 13]. They are formed as a result of abortive DNA repair. Type 3 DPCs contain topoisomerase (TOPO) I attached to the 3’ end of an SSB via a tyrosinyl–phosphodiester bond. Finally, type 4 DPCs are formed via the attachment of TOPO II to the two 5’ ends of a DNA double-strand break (DSB) via tyrosinyl–phosphodiester bonds. Type 3 and type 4 DPCs are produced by inhibition of the covalent reaction intermediate of TOPO I and TOPO II, respectively, by TOPO inhibitors (TOPO poisons) or by flanking DNA damage.
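The four-way classification above can be restated as a small lookup table. The following Python sketch is illustrative only: the class name `DpcType`, its field names, and the helper `types_with_strand_break` are hypothetical and merely encode the descriptions given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DpcType:
    """One DPC class as described in the text (field names are illustrative)."""
    number: int
    protein: str                  # trapped protein named in the text
    strand_break: Optional[str]   # "SSB", "DSB", or None for an undisrupted strand
    attached_end: Optional[str]   # DNA end carrying the protein, if any


DPC_TYPES = [
    DpcType(1, "various proteins", None, None),
    DpcType(2, "PARP-1", "SSB", "3'"),
    DpcType(3, "TOPO I", "SSB", "3'"),
    DpcType(4, "TOPO II", "DSB", "5'"),
]


def types_with_strand_break(kind: str) -> List[int]:
    """Return the DPC type numbers associated with the given break kind."""
    return [t.number for t in DPC_TYPES if t.strand_break == kind]
```

For example, `types_with_strand_break("SSB")` selects types 2 and 3, the two classes in which the protein is attached to the 3’ end of a single-strand break.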
In this article we review the current knowledge regarding the formation, repair, and biological effects of type 1 and 2 DPCs (Fig. 1). There already exist extensive reviews and research papers on similar topics for the TOPO-inhibitor-induced type 3 and type 4 DPCs [15-17], and so these will not be dealt with herein.
2. Detection and characterization of DPCs
2.1. Overview of DPC detection
Analysis of the induction and removal of DPCs in the genome is indispensable when studying the repair and biological effects of DPCs. DPCs can be detected either directly or indirectly; while DNA purification is not required for the indirect detection method (Section 2.2), it is required for both direct detection (Section 2.3) and immunodetection (Section 2.4) methods. When required, DNA can be purified by conventional cesium chloride density gradient ultracentrifugation [10, 18, 19] or using the DNAzol-based method [20, 21]. Recently, rapid, small-scale DNA purification methods were reported and used for the immunodetection of DPCs [22, 23]. The methods of DPC detection and their principles are summarized below.
2.2. Indirect detection of DPCs
The indirect methods of detecting DPCs include the alkaline elution, nitrocellulose filter-binding, sodium dodecyl sulfate (SDS)/potassium ion (K+) precipitation, and single-cell gel-electrophoresis methods. The alkaline elution method is based on the different elutabilities of DNA without and with cross-linked proteins from a filter under alkaline conditions [24, 25]. In brief, cells are filtered onto a polyvinylchloride filter and lysed with sarkosyl; the DNA that is retained on the filter is eluted at pH 12.1. The adsorption of cross-linked proteins to the filter reduces the elutability of unwound single-stranded DNA, thereby changing its elution kinetics. The nitrocellulose filter-binding method depends upon the different abilities of DNA without and with cross-linked proteins to bind to a nitrocellulose filter [26, 27]. In this method, cells are lysed with sarkosyl and passed through the filter, which retains proteins and DNA with cross-linked proteins, but not free DNA. The amount of DNA that is retained on the filter via cross-linked proteins is then assayed for DPCs. The SDS/K+ precipitation method is based on SDS binding tightly to proteins to form insoluble precipitates with K+ [6, 28]. Cells are lysed with SDS, and SDS-bound proteins and DNA with cross-linked proteins, but not free DNA, are selectively precipitated by KCl. The amount of DNA precipitated due to cross-linked proteins is then assayed for DPCs. The single-cell gel-electrophoresis method (the comet assay) detects retarded DNA migration due to a certain type of DPC [29, 30]. Pretreatment of lysed cells with proteinase K enables the distinction between DNA with and without DPCs in these methods.
While a major advantage of these aforementioned indirect methods is that they enable the detection of DPCs without purifying DNA, there is no linear relationship between the amounts of DNA and cross-linked proteins. This makes it difficult to quantitatively interpret the data derived from these indirect measurements of DPCs.
2.3. Direct detection of DPCs
Two techniques have been developed that allow direct and quantitative analysis of DPCs: the 125I-postlabeling and fluorescein isothiocyanate (FITC)-labeling methods. The 125I-postlabeling method is based on the specific incorporation of 125I into tyrosine residues that are associated with purified DNA. A recently reported FITC-labeling method has been shown to provide a more straightforward analysis. Cross-linked proteins in purified DNA are specifically labeled with FITC and directly assayed for the resulting fluorescence [18, 19, 32]. A key advantage of the FITC-labeling method is that the amount of DPCs is proportional to the fluorescence intensity of the labeling.
2.4. Immunodetection of DPCs
Most DPC-inducing agents are fairly nonspecific and covalently trap various proteins. However, DPCs can be detected directly by Western blotting if the identity of the cross-linked protein is known. DNA methyltransferase (DNMT), which is associated with type 1 DPC [10, 22], PARP-1, which is associated with type 2 DPC [12, 13], and TOPOs I and II, which are associated with type 3 and type 4 DPC, respectively, can be detected by Western blotting when they are covalently trapped in DNA by inhibitors.
2.5. Proteomic analysis of cross-linked proteins
Considerable insight into the biological effects and repair mechanisms of DPCs can be obtained by identifying the cross-linked proteins in DNA. Comprehensive proteomic analyses of the cross-linked proteins induced by ionizing radiation, formaldehyde, mechlorethamine (one of the antitumor nitrogen mustards) [34, 35], and butadiene diepoxide (a carcinogenic metabolite of 1,3-butadiene) [36, 37] have been performed. The identified cross-linked proteins include those participating in transcriptional regulation, translation, RNA processing, DNA damage response, DNA repair, cell cycle, homeostasis, cell signaling, and cell architecture. These proteomic approaches may have potential applications in the analysis of DNA-damage interactomes.
3. Formation of DPCs
DPCs are produced by various chemical and physical agents or during DNA transactions. Here we summarize the formation of DPCs by selected agents including aldehydes, bifunctional alkylating agents, and ionizing radiation. We also refer to DPC formation by abortive DNA metabolism and repair.
3.1. Formation of DPCs by DNA-damaging agents
3.1.1. Aldehydes
Aldehydes are well-known inducers of type 1 DPCs. Humans are exposed to various aldehydes through anthropogenic and food sources. Aldehydes are also generated by lipid peroxidation and metabolism in cells. Aldehydes that have escaped from the detoxification systems of cells react with DNA, proteins, and other biomolecules and hamper cellular functions. The reactions between aldehydes and DNA result in the formation of DPCs [1, 5] and base adducts. Other DNA lesions such as DNA intrastrand cross-links, DNA interstrand cross-links (ICLs), SSBs, and DSBs may be formed concurrently to varying extents.
In light of the reactivity of aldehydes, the side chains of lysine, cysteine, and histidine residues in proteins react with aldehydes to form adducts (Fig. 2). The aromatic amines of DNA bases are weak nucleophiles and are less reactive to aldehydes. The resulting protein adducts react further with the amino group of DNA bases to form various types of cross-linking bond that have different chemical stabilities. The cytotoxicities of formaldehyde, chloroacetaldehyde, acrolein, crotonaldehyde, trans-2-pentenal, and glutaraldehyde, and their in vivo DPC-inducing efficiencies have been analyzed using human MRC5-SV cells and the FITC-labeling method (Table 1). The results show that chloroacetaldehyde, acrolein, and glutaraldehyde are more potent DPC inducers than are crotonaldehyde, trans-2-pentenal, and formaldehyde, and that the DPC-inducing efficiency of aldehydes is correlated with their cytotoxicity. The in vitro DPC-inducing efficiencies of these aldehydes (except for chloroacetaldehyde) have also been assessed using plasmid DNA and histone (Table 1). Comparison of these in vivo and in vitro data indicates that glutaraldehyde and acrolein are potent DPC-inducers both in vivo and in vitro, whereas crotonaldehyde and trans-2-pentenal are poor DPC-inducers in vivo and in vitro. Interestingly, formaldehyde is a highly potent DPC-inducer in vitro, but a poor DPC-inducer in vivo, indicating that formaldehyde is effectively detoxified in cells. The DPCs induced by the aforementioned aldehydes are eliminated from the genome with a half-life of 4.8–8.4 h in vivo, while they are reversed spontaneously with a half-life of 8.0–20.2 h in vitro (Table 1). The shorter half-life in vivo may be at least partially attributable to acceleration of DPC reversal by nucleophiles present in cells. There is a positive correlation between the in vivo and in vitro half-lives of DPCs.
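To give a feel for what the quoted half-lives imply, the following Python sketch converts a half-life into the fraction of DPCs remaining after a given time. It assumes simple first-order (single-exponential) loss, which is an assumption made here for illustration and is not stated in the text; the function name `fraction_remaining` and the 24 h time point are likewise hypothetical.

```python
def fraction_remaining(t_hours: float, half_life_hours: float) -> float:
    """Fraction of DPCs left after t_hours, ASSUMING first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (t_hours / half_life_hours)


# Half-life ranges quoted in the text (hours).
IN_VIVO = (4.8, 8.4)
IN_VITRO = (8.0, 20.2)

# Compare the two ranges at an arbitrary, illustrative time point of 24 h.
for label, (shortest, longest) in (("in vivo", IN_VIVO), ("in vitro", IN_VITRO)):
    low = fraction_remaining(24.0, shortest)
    high = fraction_remaining(24.0, longest)
    print(f"{label}: {low:.1%} to {high:.1%} of DPCs remain after 24 h")
```

Under this assumption, a DPC population with the shortest in vivo half-life (4.8 h) passes through five half-lives in 24 h, whereas one with the longest in vitro half-life (20.2 h) passes through only slightly more than one.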
Finally, it is also worth noting that the importance of DNA damage induced by endogenous aldehydes has again been acknowledged in recent studies. Studies involving mouse models have revealed that DNA damage induced by endogenous aldehydes is associated with the symptoms of Fanconi anemia (FA), which is a complex heterogeneous disorder of genomic instability, bone marrow failure, and cancer predisposition [44, 45]. In a study involving chicken DT40 cells, it was found that the catabolism of formaldehyde is essential for cells deficient in the FA DNA-repair pathway. The identity of the aldehyde-induced DNA lesion that is responsible for FA remains elusive.
|Aldehyde|Cytotoxicity (μM)a|DPC-inducing efficiencyb (in vivo)|DPC-inducing efficiencyb (in vitroc)|Half-life of DPC (h) (in vivo)|Half-life of DPC (h) (in vitrod)|
3.1.2. Bifunctional alkylating agents
Bifunctional alkylating agents have been generating considerable interest as both health hazards and anticancer drugs. Their biological activity relies on their capacity to cross-link biomolecules, resulting in inactivation of their function. Bifunctional alkylating agents first react with DNA to form monoadducts. The remaining reactive site of these reagents can further react with either DNA to form a DNA–DNA cross-link or a protein to form a DPC.
The simple bis-electrophiles, such as 1,2-dibromoethane, butadiene diepoxide, and epibromohydrin, are used in industry and are considered to be hazardous to health. In general, many of them are cytotoxic and mutagenic, and induce monoadducts, DNA–DNA cross-links, and DPCs. Interestingly, the cytotoxic and mutagenic effects of 1,2-dibromoethane, butadiene diepoxide, and epibromohydrin in Escherichia coli and Chinese hamster cells are significantly increased by the ectopic overexpression of human O6-alkylguanine-DNA alkyltransferase (hAGT), the primary function of which is to maintain genomic integrity by directly reversing alkylation DNA damage [48, 49]. These agents cross-link hAGT and DNA to form type 1 DPCs together with other DNA damage [50, 51]. An initial reaction occurs between the hAGT and one side of the reagents to produce a reactive intermediate at Cys145 at the active site of hAGT. The resulting intermediate subsequently attacks the N7 of guanine in DNA, yielding covalent hAGT–DNA cross-links [50, 51]. It has been proposed that the hAGT–DNA cross-links and/or apurinic/apyrimidinic (AP) sites arising from the depurination of hAGT–DNA cross-links are involved in the cytotoxicity and mutagenicity observed in the presence of hAGT.
Glyceraldehyde 3-phosphate dehydrogenase and histones are cross-linked to DNA by butadiene diepoxide in vitro [52, 53]. However, in contrast to hAGT, the ectopic expression of these proteins in E. coli and concomitant treatment with butadiene diepoxide does not affect cell survival and mutations [52, 53]. As mentioned in Section 2.5, various butadiene diepoxide-induced DPCs were identified in human cells by proteomic analysis [36, 37].
The nitrogen mustards containing N-(2-chloroethyl) groups are typical bifunctional alkylating agents and are frequently used for cancer therapy. The N-(2-chloroethyl) group cyclizes spontaneously and forms an aziridinium ion, which alkylates DNA and proteins and forms a “half mustard”. The resulting half mustard undergoes a similar cycle of reactions to form a DNA–DNA cross-link and a DPC. Accumulated evidence indicates that the chemotherapeutic potential of nitrogen mustards and other cross-linking anticancer drugs such as mitomycin C is mainly attributable to their ability to form ICLs [54, 55]. However, it has not been clarified whether DPCs induced by nitrogen mustards and other cross-linking anticancer drugs potentiate the therapeutic efficacy of the drugs in conjunction with ICLs.
Mechlorethamine is a member of the nitrogen mustards. The formation of DPCs together with ICLs upon the treatment of cells and nuclei with mechlorethamine and other cross-linking agents (nitrosoureas) was initially demonstrated using the alkaline elution method. Recent proteomic analyses have revealed the formation of DPCs in mechlorethamine-treated nuclear extracts and cells, demonstrating the involvement of functionally different proteins in DPCs [34, 35].
Mitomycin C belongs to another class of bifunctional alkylating agents and is also used for cancer therapy. FK973, FK317, and FR900482 are substituted dihydrobenzoxazine derivatives and undergo reductive activation to form reactive mitosene structures similar to that of mitomycin C. Alkaline elution analyses have shown that, like mitomycin C, FK973 forms concentration- and time-dependent ICLs and DPCs in cells, but not SSBs. In addition, chromatin immunoprecipitation analyses have revealed that FR900482 and FK317 cross-link minor-groove binding proteins such as HMGA1, HMGB1, and HMGB2, but not major-groove binding proteins such as NF-κB or Elf-1, to the promoter regions of the IL-2 and IL-2Rα genes to form DPCs in vivo [58, 59].
3.1.3. Ionizing radiation
Ionizing radiation causes damage to DNA via both direct and indirect mechanisms. In the direct mechanism, the radiation energy is deposited directly in DNA and produces DNA cation radicals, which are unstable and undergo decomposition. In the indirect mechanism, the radiation energy is deposited in water (i.e., the bulk medium of cells) and produces reactive oxygen species such as hydroxyl radicals, which in turn attack and damage DNA. Various types of radiation-induced DNA lesion have been identified: base damage, SSBs, DSBs, and DPCs. The most critical damage underlying the cell-killing effects of ionizing radiation is attributed to DSBs. The efficiency of DSB formation by ionizing radiation is decreased under hypoxic conditions relative to normoxic conditions, whereas that of DPC formation is increased under hypoxic conditions [1, 7]. Although the contribution of DPCs to the lethal events in irradiated cells remains to be clarified, the aforementioned opposing effects of oxygen on DSB and DPC formation point to the potential importance of DPCs for hypoxic cells, and in particular those present in tumors.
The induction of DPCs and their removal from the genome following irradiation of normoxic and hypoxic mouse tumors with carbon-ion beams were recently analyzed using the FITC-labeling method. The yield of DPCs was 4-fold greater in hypoxic tumors than in normoxic tumors. Simultaneously, the yield of DSBs in hypoxic tumors was decreased to 1/2.4 relative to that in normoxic tumors. Interestingly, the carbon-ion beams produced two types of DPC that differed according to their rate of removal from the genome. The half-life of the rapidly removed component of DPCs was less than a few hours in vivo, whereas that of the slowly removed component was estimated to be longer than a few days. The rapidly and slowly removed components accounted for 40% and 60% of the total DPCs, respectively, indicating that DPCs remain in the genome much longer than do DSBs, the half-life of which is around several hours in vivo. It would be interesting to know whether similar results are observed upon irradiation with X-rays, which are characterized by lower linear energy transfer than that for carbon-ion beams.
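The two-component removal described above can be pictured as a sum of two exponential terms. The Python sketch below is illustrative only: the 40%/60% split comes from the text, but the text gives only bounds for the two half-lives, so the values of 2 h and 96 h used here are assumed placeholders, as are the function and parameter names and the single-exponential form of each component.

```python
def biphasic_fraction_remaining(
    t_hours: float,
    fast_fraction: float = 0.40,     # share of rapidly removed DPCs (from the text)
    fast_half_life_h: float = 2.0,   # ASSUMED; the text says "less than a few hours"
    slow_half_life_h: float = 96.0,  # ASSUMED; the text says "longer than a few days"
) -> float:
    """Fraction of radiation-induced DPCs remaining after t_hours."""
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must lie between 0 and 1")
    slow_fraction = 1.0 - fast_fraction
    fast = fast_fraction * 0.5 ** (t_hours / fast_half_life_h)
    slow = slow_fraction * 0.5 ** (t_hours / slow_half_life_h)
    return fast + slow


# Illustrative time course: the fast component vanishes within a day,
# after which the curve is dominated by the slowly removed component.
for t in (0, 6, 24, 96):
    print(f"{t:>3} h: {biphasic_fraction_remaining(t):.2f}")
```

With any choice of half-lives inside the stated bounds, the remaining fraction levels off near the slow component's 60% share once the fast component is gone, which is the sense in which these DPCs persist much longer than DSBs.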
It is possible that the rapidly removed DPCs are chemically unstable and are reversed spontaneously by hydrolysis, as is the case for aldehyde-induced DPCs (see Section 3.1.1 and Fig. 2). Alternatively, they may be peptide-containing DPCs, which are relatively small and are efficiently removed from DNA by nucleotide excision repair (NER) as described in Section 4.4.2. Slowly removed DPCs are virtually irreversible DPCs and are resistant to excision repair as evidenced by their long half-life in vivo. These DPCs will contain a stable covalent bond between the DNA and the protein molecules. DPCs containing a stable thymine–tyrosine cross-link bond have been identified in cells irradiated with γ-rays. Furthermore, DPCs containing large proteins are not excised from DNA by NER (see Section 4.4.2). The mechanisms underlying the formation of DPCs through the direct and indirect actions of ionizing radiation and the effect of oxygen on the formation of DPCs have been discussed in a recent review.
3.2. Formation of DPCs by abortive DNA metabolism and repair
3.2.1. DPC formation by inhibition of DNA-metabolizing and repair enzymes
Some classes of enzymes form transient covalent complexes with their substrates during catalysis. Those involved in DNA metabolism and repair are no exception to this, and considerable numbers of enzymes have been found that form a transient covalent complex with DNA as a reaction intermediate.
The methylation of cytosine in 5’-CG-3’ sequences is an important carrier of epigenetic information in higher organisms; this methylation is performed by DNMTs including DNMT1 (maintenance methyltransferase), DNMT3a, and DNMT3b (both de novo methyltransferases). The methylation reaction proceeds via a covalent intermediate between the DNMTs and the target cytosine. The DNMT inhibitor 5-aza-2’-deoxycytidine (azadC; known clinically as decitabine) is metabolized in cells, incorporated into DNA, and partly substituted for cytosine. When DNMTs attempt to methylate DNA, 5-azacytosine (the base moiety of azadC) covalently traps their reaction intermediates and aborts subsequent reactions, leaving type 1 DPCs. AzadC and its analogs are used as anticancer agents; their anticancer activity is at least partly attributable to the toxic effects of the resulting type 1 DPCs, although the hypomethylation of DNA due to the passive (covalent trapping) and active (proteasome-mediated degradation) depletion of free DNMT1 may also affect cell viability via the altered gene expression [63-65].
Bifunctional DNA glycosylases and DNA repair proteins such as PARP-1 and Ku have an associated AP lyase activity and react with AP sites in DNA, forming covalent Schiff base intermediates [66-68]. The structure of these covalent Schiff base intermediates is similar to that of type 2 DPCs (Fig. 1). DNA polymerases that have an associated 5’-terminal 2-deoxyribose-5-phosphate (dRP) lyase activity also form covalent Schiff base intermediates [69, 70]. These intermediates mimic type 2 DPCs (Fig. 1), but the protein is tethered to the 5’ end of an SSB via dRP. With the exception of PARP-1, the covalent Schiff base intermediates of the aforementioned glycosylases, repair proteins, and polymerases cannot be isolated, but they can be stabilized by NaBH4-reduction and isolated as DPCs. Interestingly, the formation of stable DPCs containing PARP-1 (type 2) in vitro and in vivo has recently been demonstrated [12, 13]. Their levels were increased by a PARP-1 inhibitor (4-amino-1,8-naphthalimide) or by the knockout of DNA polymerase β or X-ray repair cross-complementing protein (XRCC)1. The biological and clinical significance of these findings in conjunction with abortive DNA repair remains to be elucidated.
Tyrosyl-DNA phosphodiesterase (Tdp1) is involved in the repair of TOPO I–DNA covalent complexes (see below), and its catalytic cycle involves a covalent reaction intermediate in which a histidine residue is connected to a DNA 3’-phosphate through a phosphoamide linkage. In the strand-breakage reaction catalyzed by TOPO I and TOPO II, a nucleophilic attack of a catalytic tyrosyl residue of the TOPO upon a DNA phosphodiester bond results in transient covalent attachment of the tyrosine to the DNA phosphate either at the 3’ end (TOPO I) or the 5’ end (TOPO II) of the broken DNA (Fig. 1) [15, 16]. TOPO inhibitors (poisons) such as camptothecin (a TOPO I inhibitor) and etoposide (a TOPO II inhibitor) freeze the covalent reaction intermediate and abort the subsequent rejoining of DNA ends, leaving TOPO cleaved complexes that contain strand breaks and protein covalently bound to DNA (type 3 and type 4 DPCs; Fig. 1) [15, 16]. Many chemotherapeutic drugs targeting the covalent TOPO reaction intermediates have been developed since type 3 and type 4 DPCs are complex DNA lesions containing DPC(s) and strand break(s) and would be effective at killing tumor cells.
3.2.2. Suicidal cross-linking DNA damage
Several DNA lesions have been shown to act as suicidal substrates by stably cross-linking base excision repair (BER) enzymes, although the cross-linking reactions have only been demonstrated in vitro. The exceptional case with PARP-1 is described in Section 3.2.1.
2-Deoxyribonolactone (dL) is an oxidized form of an AP site and is produced by many DNA-damaging agents. dL in DNA undergoes β-elimination to form α,β-unsaturated lactone at the 3’-terminus. Alternatively, dL can be incised by an AP endonuclease (e.g., APE1) to form a dL phosphate at the 5’-terminus. The lactone in these dL analogs can react with nucleophiles such as lysine to form a stable amide bond. It has been shown that of eight bifunctional DNA glycosylases tested, Nth/Endo III cross-links to dL in DNA, while Fpg and hNEIL1 cross-link to the β-elimination product of dL (Fig. 3A) [72, 73]. The dL phosphate at the 5’-terminus generated by the incision of APE1 also cross-linked to DNA polymerase (Pol) β. Other forms of oxidized 2-deoxyribose (dioxobutane and the C4-oxidized AP site) at the 5’-terminus of DNA produce transient covalent complexes with Pol β and Pol λ, but the complex containing the oxidized sugar and enzyme is subsequently released from DNA, resulting in inactivated free polymerases.
Oxanine (Oxa) is produced by nitrosative damage of guanine and has a reactive lactone-like structure. It was shown that of seven DNA glycosylases tested, Fpg, Nei/Endo VIII, and hOGG1 (bifunctional glycosylases) and AlkA (monofunctional glycosylase) cross-link to Oxa in DNA to form type 1 DPCs (Fig. 3B). The glycosylases trapped by Oxa are notably different from those trapped by dL (Nth), suggesting distinct interactions with DNA damage in the active site. Histones also react with Oxa to produce DPCs, but at much lower rates, indicating the importance of specific interactions with proteins in DPC formation. With dL and Oxa, it seems that the catalytic amino acids in the active site (lysine or proline) of the enzymes attack the carbonyl carbon of dL or Oxa, resulting in stable amide bond formation (Fig. 3A, B).
5-Hydroxy-5-methylhydantoin is produced by oxidation of thymine, and the carbanucleoside of 5-hydroxy-5-methylhydantoin (cHyd) has been shown to covalently trap Fpg, Nei/Endo VIII, and hNEIL1, serving as a suicidal substrate (Fig. 3C). The crystal structure of the cHyd-DNA and Fpg covalent complex directly revealed the cross-linking between the N-terminal proline and the C5 of cHyd.
The cross-linking efficiencies of dL, Oxa, and cHyd for glycosylases or polymerases are relatively low. Thus, the biological significance of these lesions as suicidal substrates for repair enzymes remains to be assessed in vivo. However, these lesions can be used in mechanistic studies of repair enzymes in vitro, and also applied for solving the crystal structure of complexes involving DNA and repair enzymes.
4. Repair of DPCs
4.1. DPC repair in bacterial cells
The genes involved in the repair of DPCs have been elucidated by analyzing the sensitivity of a panel of repair-deficient E. coli mutants to formaldehyde and 5-azacytidine (azarC; the ribonucleoside form of azadC), which induce type 1 DPCs (these are simply referred to as DPCs in Section 4) [18, 79]. These studies have revealed that two mechanisms underlie the repair of DPCs. The first mechanism involves RecBCD-dependent homologous recombination (HR) and subsequent PriA-dependent replication restart (RR), and the second mechanism involves NER. The sensitivity of mutants (Fig. 4) indicates that the first mechanism is the major mechanism and is effective for DPCs induced by both formaldehyde and azarC, whereas the second mechanism is effective only for those induced by formaldehyde. This finding also suggests differences in the nature of the DPCs induced by the two reagents. Formaldehyde is a nonspecific DPC inducer and covalently traps various proteins of potentially different sizes. For example, the nucleoid-associated proteins of E. coli are putative candidates for the cross-linked proteins and include both small and large proteins (i.e., ranging from 9 to 33 kDa). Conversely, azarC is incorporated into DNA after metabolic transformation and specifically cross-links a 53 kDa DNMT protein. It was therefore assumed that the DPCs containing large proteins (large DPCs), which are commonly produced by formaldehyde and azarC, are repaired by HR plus RR, while DPCs containing small proteins (small DPCs), which are produced only by formaldehyde, are repaired by NER. Consistent with this, the removal of formaldehyde-induced small DPCs from the E. coli genome was found to be dependent on UvrA protein, which is a component of the UvrABC nuclease involved in bacterial NER.
In vitro studies involving model substrates have provided further insight into DPC repair by NER. The UvrABC nuclease makes damage-specific incisions for DPCs containing short peptides in vitro, but it exhibits poor activity for those containing a T4 endonuclease V protein (16 kDa) [82, 83]. A more systematic study has shown that the dual incision activity of UvrABC is increased for proteins up to 1.6–2.1 kDa, and then decreases for larger proteins, with the activity being negligible for proteins of 25–44 kDa. That study also revealed that large DPCs sterically inhibit the loading of UvrB onto the DPC site in the damage-recognition step, abrogating the subsequent recruitment of UvrC, which executes the dual DNA incision. Interestingly, the uvrC mutant of E. coli exhibited a uniquely weak sensitivity to formaldehyde, but the cho (a UvrC homolog) mutant exhibited a moderate sensitivity (Fig. 4), suggesting that Cho has an in vivo role as an alternative nuclease in the NER of DPCs.
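The size dependence of UvrABC incision described above can be summarized as a coarse three-band rule. The Python sketch below is a rough reading of the in vitro figures quoted in the text, not a quantitative model: the function name `uvrabc_incision_band`, the band labels, and the use of 2.1 kDa and 25 kDa as sharp boundaries are assumptions made for illustration.

```python
def uvrabc_incision_band(protein_kda: float) -> str:
    """Coarse band for UvrABC dual-incision activity on a DPC of given size.

    Boundaries are illustrative: the text reports activity rising up to
    1.6-2.1 kDa, falling for larger proteins, and negligible at 25-44 kDa.
    """
    if protein_kda <= 0:
        raise ValueError("protein size must be positive")
    if protein_kda <= 2.1:
        return "efficient"    # short peptides
    if protein_kda < 25.0:
        return "reduced"      # e.g. T4 endonuclease V (16 kDa): poor activity
    return "negligible"       # large DPCs, expected to rely on HR plus RR


# Sizes mentioned in the text: a short peptide, T4 endonuclease V,
# and the 53 kDa DNMT trapped by azarC.
for size in (1.8, 16.0, 53.0):
    print(f"{size:5.1f} kDa -> {uvrabc_incision_band(size)}")
```

The sharp cutoffs are a simplification; the text describes a gradual decline in activity between the two boundaries rather than a step.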
Since large DPCs are not removed from DNA by NER, they stall the replication fork. As mentioned above, the genetic data show the essential role played by HR plus RR in the reactivation of a stalled replication fork by DPCs. The HR of E. coli has two subpathways: RecBCD-dependent HR, which is involved in the repair of DSBs, and RecFOR-dependent HR, which is involved in the repair of daughter strand gaps. It has been demonstrated that RecBCD-dependent HR, but not RecFOR-dependent HR, is pivotal to the reactivation of the replication fork stalled by DPCs. Furthermore, other HR components including RuvABC Holliday junction resolvase and the RecG Holliday junction translocase/helicase were required for the HR of DPCs (Fig. 4). The survival of E. coli after treatment with cisplatin and mitomycin C, which produce ICLs as well as DPCs, requires RecBCD, but it is also moderately dependent upon RecFOR, indicating the distinct requirement of RecFOR for the repair of DNA damage induced by formaldehyde/azarC and cisplatin/mitomycin C. It seems that DNA Pol I (polA) is required for HR, but translesion synthesis (TLS) polymerases including Pol II (polB), Pol IV (dinB), and Pol V (umuCD) are dispensable for the damage tolerance pathway associated with DPCs (Fig. 4). It is worth noting that an array of 34 lacO repressor sites bound by lac repressors impedes fork progression and inhibits cell growth of recA, recB, and recG mutants but not of recF and ruvABC mutants. Thus, tolerance to repressors bound to an array of repressor sequences and tolerance to DPCs (Fig. 4) require overlapping recombination genes, with the exception of ruvABC.
In E. coli, PriA, PriB, and PriC proteins play a vital role in RR via two PriA-dependent mechanisms: the PriA–PriB and PriA–PriC pathways [87, 88]. The RR proteins recognize forked DNA structures such as arrested replication forks and D-loops, and load the replicative helicase DnaB for RR [89, 90]. According to the sensitivity to formaldehyde and azarC, the PriA–PriB pathway contributes more to RR than does the PriA–PriC pathway (Fig. 4). Another Rep–PriC restart pathway appears to be dispensable in RR following the HR of DPCs.
The precise molecular mechanism underlying reactivation of the DPC-induced stalling of the replication fork by HR remains to be established. Inactivation of some replication proteins (DnaB, Rep helicases, DNA Pol III) by mutations results in fork breakage and DSBs in a recBC background. However, treatment of the recB mutant (and wild type) with formaldehyde did not result in fork breakage, as evidenced by no accumulation of DSBs. This suggests that arrest of the replisome by DPCs does not lead to fork breakage and that the DSB ends processed by the RecBCD helicase/exonuclease are generated by other mechanisms. DSB ends may be formed by the re-replication of incomplete nascent strands or by fork reversal mediated by the RecG helicase. Further study is required to elucidate the underlying molecular mechanism.
4.2. DPC repair in yeast
Yeasts such as Saccharomyces cerevisiae are among the simplest eukaryotic organisms, but many essential cellular processes are conserved between yeast and higher organisms. To identify genes that mitigate the cytotoxic effects of DPCs, the S. cerevisiae haploid non-essential gene deletion library (ca. 5000 genes) was screened for increased sensitivity to formaldehyde. This screening revealed 44 deletion strains that are sensitive to chronic low-dose exposure to formaldehyde (1.0–1.5 mM for 48 h). The identified genes were those involved in the cell cycle and DNA repair (20 genes), metabolism (6 genes), transcription (7 genes), and others (11 genes). The functions of the identified DNA repair genes were HR [RAD50, RAD51, RAD52, RAD54, RAD55, XRS2(NBS1), and MRE11], NER [RAD1(XPF), RAD4(XPC), and RAD14(XPA)], post-replication repair (PRR)/TLS [RAD5(SHPRH) and MMS2], and the maintenance of replisome stability [SGS1(RECQ) and TOP3]. Note that genes within parentheses are mammalian counterparts.
Reexamination of the sensitivity of individual strains to chronic low-dose exposure to formaldehyde has indicated that cell survival is mainly conferred by proteins of the HR pathway and those related to that process [SGS1(RECQ) and TOP3]. The low-to-moderate sensitivity of the NER mutants suggests a less critical contribution of NER to survival following chronic exposure. Although the PRR/TLS-deficient strain RAD5(SHPRH) was sensitive to chronic formaldehyde exposure, other deletion mutants involved in this pathway did not exhibit sensitivity (REV1, REV3, REV7, UBC13, MMS2, RAD6, RAD18, and RAD30), suggesting that canonical PRR/TLS is dispensable for cell survival. It is also noteworthy that the deletions of genes involved in ICL repair (REV3, EXO1, and PSO2) did not render the cells sensitive. Thus, it seems that the requirement for repair genes in S. cerevisiae resembles that in E. coli [18, 79] with respect to mitigation of the cytotoxic effects of formaldehyde-induced DPCs.
Interestingly, the requirement for repair genes to mitigate the cytotoxic effect of formaldehyde following acute high-dose exposure (60 mM, 15 min) differed significantly from that following chronic low-dose exposure to formaldehyde (1.0–1.5 mM for 48 h). The NER-deletion strains (RAD1 and RAD4) exhibited the highest sensitivity, whereas the HR-deletion strains [RAD50, RAD52, MRE11, XRS2(NBS1)] and the related strains [SGS1(RECQ) and TOP3] exhibited only moderate sensitivity, suggesting that the relative contributions of DNA repair pathways to protection against formaldehyde-induced DPCs vary with the exposure conditions. Acute high-dose treatment may have shifted the cells into a state analogous to the G1/G0 phase of the cell cycle, in which no HR takes place. However, the molecular mechanism underlying the change in the major repair pathway upon acute high-dose exposure remains to be elucidated. Although it is unclear whether this is relevant to the aforementioned observations in yeast, it has been shown that the concentration and the regimen of formaldehyde treatment affect the formation of DPCs and gene expression in human cells, and that the genes with altered expression are involved in detoxification but not DPC repair.
It has been demonstrated very recently that the metalloprotease Wss1 in S. cerevisiae is crucial for cell survival after exposure to formaldehyde and camptothecin, which induce type 1 and type 3 DPCs, respectively. The mutants deficient in Wss1 were found to accumulate DPCs. In vitro analysis has shown that Wss1 protease cleaves TOPO I-DPCs directly in a DNA-dependent manner. In addition, the results with formaldehyde-treated cells suggest that the proteolytic degradation of DPCs by Wss1 enables the TLS of DPC-containing DNA and suppresses gross chromosomal rearrangements that can otherwise occur through the HR of intact DPCs. The authors suggest that proteolysis by Wss1 enables repair of DPCs via downstream canonical DNA repair pathways. Proteins homologous to Wss1 are present in bacteria and several eukaryotes such as fungi, plants, Plasmodium, and Trypanosoma brucei, but are absent in animals. In higher eukaryotes, Dvc1/Spartan, which has a domain organization similar to that of Wss1, may have a Wss1-like function [96, 98]. Consistent with the results in yeast, a recent analysis of in vitro DNA replication using Xenopus egg extracts has shown that the proteolytic degradation of DPCs in leading and lagging strands promotes replication through lesion sites.
4.3. DPC repair in chicken DT40 cells
The chicken B lymphocyte cell line DT40 has a high rate of gene targeting and has been used as a model system for reverse genetics studies in higher eukaryotes. The genes involved in the repair of DPCs have been elucidated by assessing the formaldehyde sensitivity of DT40 cells with targeted mutations in various DNA repair genes. In total, 22 mutants involved in different DNA repair pathways were studied: FA (FANCD2), HR (BRCA1, BRCA2, XRCC2, XRCC3, RAD51C, RAD51D, RAD52, and RAD54), PRR/TLS (REV1, REV3, RAD18, and POLQ), NER (XPA), BER (PARP1, POLB, and FEN1), NHEJ (DNA-PKcs, KU70, and LIG4), and DNA damage response (ATM and CHK1). With a few exceptions, the general order of the repair pathways that are critical for cell survival after formaldehyde treatment was FA >> HR > PRR/TLS > NER/BER > NHEJ = DNA damage response = wild type, suggesting that FA, HR, PRR/TLS, and (to a lesser extent) NER are critical for mitigating the cytotoxic effects of formaldehyde. A parallel experiment showed that human FANCC and FANCG knockout cells were sensitive to formaldehyde. Thus, in addition to HR and NER, which are required in E. coli (Section 4.1) and S. cerevisiae (Section 4.2), FA and PRR/TLS pathways are also required in DT40 cells. Cells deficient in the FA pathway are sensitive to ICL-inducing agents. Higher eukaryotes use multiple pathways for ICL repair, and the existence of the specialized FA pathway represents a significant difference from yeasts. Accordingly, one possible interpretation of the data with DT40 cells is that formaldehyde simultaneously induces DPCs and ICLs, and that the two lesions are repaired via partially overlapping pathways.
Interestingly, the DT40 FANCD2 mutant was also sensitive to a simple aldehyde (acetaldehyde), but not to dicarbonyl compounds (glyoxal and methylglyoxal) and α,β-unsaturated aldehydes (acrolein and crotonaldehyde) .
4.4. DPC repair in mammalian cells
4.4.1. Sensitivity to DPC-inducing agents
Chinese hamster ovary (CHO) cells deficient in HR (XRCC3 and RAD51D) and NER (XPD and XPF) were examined for their sensitivity to formaldehyde and azadC. As mentioned in Section 4.1, formaldehyde induces DPCs of various sizes, whereas azadC specifically induces large DPCs containing DNMTs, which are 33–183 kDa in mammalian cells. The HR-deficient CHO mutants were highly sensitive to formaldehyde and azadC. The CHO XPD mutant showed slight sensitivity to formaldehyde but not to azadC; similar results were obtained with human NER mutants (XPA and XPD). These results indicate that cell survival after treatment with formaldehyde and azadC is mainly conferred by the HR pathway. The low-to-negligible sensitivity of the NER mutants (XPA and XPD) suggests a less critical contribution of NER to survival after treatment with both reagents. Interestingly, the CHO XPF mutant was highly sensitive to formaldehyde but not to azadC. The hypersensitivity to formaldehyde of the CHO XPF mutant, but not of other CHO NER mutants, was also confirmed by a recent study. The distinct and unique sensitivity of the CHO XPF mutant to formaldehyde suggests a role of XPF protein outside canonical NER. As mentioned in Section 4.3, formaldehyde can induce both ICLs and DPCs, and the FA pathway mitigates the cytotoxic effects of ICLs. In this regard, it is noteworthy that XPF has been recently added as a member of the family of FA genes, and has been designated FANCQ.
The sensitivity of FA-pathway-deficient mammalian cells to formaldehyde has also been examined. One study involving mouse and CHO cells found that cell survival was dependent on FANCD1/BRCA2, FANCD2, and FANCG, but not on either FANCA or FANCC, which are present in the core FA complex , whereas another study involving human cells found that cell survival was dependent on FANCC (and FANCG) , revealing a different requirement of FA genes. It remains to be seen whether the response of mammalian cells to formaldehyde through the FA pathway is species dependent . With respect to DPCs induced by azadC, human FANCD1/BRCA2 and FANCD2 cells are only weakly sensitive to azadC .
The removal of DPCs induced by formaldehyde and other aldehydes has been analyzed using NER-proficient and NER-deficient (XPA) human cells, which revealed that the rate of removal of DPCs from the genome was similar in the two cell types . Another study found that the rate of removal from the genome of hexavalent chromium [Cr(VI)]-induced DPCs was also similar in NER-proficient and NER-deficient (XPA) human cells . This study further suggested that the Cr(VI) sensitivity of NER-deficient cells was due to a defect in the repair of Cr-DNA adducts, which are the precursors of Cr(VI)-induced DPCs. With the exception of XPF cells that have a defect in ICL repair, the repair of DPC precursors (but not DPCs per se) may also account for the slight sensitivity of NER-deficient mammalian cells to formaldehyde. These in vivo observations with mammalian cells contrast with those observed with E. coli, wherein genomic small DPCs were actively removed by NER in vivo .
4.4.2. Repair capacity of mammalian NER for DPCs
The capacity of mammalian NER to repair DPCs has been studied in vitro using defined DPC substrates. The cell-free extracts (CFEs) from mammalian cells made efficient damage-specific incisions for DPCs containing peptides comprising 4 or 12 amino acids but not for those containing T4 endonuclease V (16 kDa), histone H1 (22 kDa), and HhaI DNMT (37 kDa) [107-109]. A systematic analysis with HeLa CFEs and defined DNA substrates containing DPCs of various sizes demonstrated that the 5’-incision efficiency increased for cross-linked proteins up to 1.6 kDa, and then decreased thereafter (Fig. 5A); the incision activity was negligible for the 11 kDa protein. This was also the case for 3’-incisions. Together these observations indicate that the upper size limit of cross-linked proteins amenable to mammalian NER is around 8 kDa in vitro, which is notably smaller than that for bacterial NER (Fig. 5B). The smaller upper size limit of DPCs for mammalian NER accounts for the less critical contribution of NER to cell survival after treatment with formaldehyde and azadC observed in mammalian cells (see above).
As with conventional bulky lesions, the 5’-incision sites were around the 21st phosphodiester bond 5’ to DPCs, and the 3’-incision site was at the 6th phosphodiester bond 3’ to DPCs, which indicates that the incision sites are independent of the size of cross-linked proteins. The CFEs from NER-deficient cells exhibited no incision activity for DPCs. Despite significant differences in protein components, bacterial and mammalian NER share an activity optimum for a cross-linked protein size of around 1.6 kDa, which is several times larger than the sizes of conventional bulky lesions. It would be interesting to know whether this is simply due to a mechanistic reason or if it has some evolutionary significance.
An alternative model of DPC repair by mammalian NER has been proposed. In this model cross-linked proteins are initially degraded to short peptides by the proteasome, and due to the robust activity of mammalian NER for DPCs containing short peptides in vitro, the resulting DNA-peptide cross-links are removed by NER (Fig. 5A) [11, 83, 108, 109]. Polyubiquitination targets proteins for recognition and degradation by the 26S proteasome . However, it was shown that cross-linked proteins were not polyubiquitinated in vivo after treatment with formaldehyde, and hence were not subjected to proteasomal degradation in cells . Very recently the yeast metalloprotease Wss1 and a putative protease in Xenopus egg extracts have been shown to be involved in the repair of DPCs [96, 154]. It would be interesting to elucidate whether the functional homologs are present in mammalian cells, although no clear orthologs of Wss1 seem to exist in mammalian cells (see Section 4.2 for details).
4.4.3. DPC tolerance by HR
Since NER is virtually unable to repair DPCs in mammalian cells, the replication fork will run into the DPC site and become stalled. Given the high sensitivity of HR-deficient CHO mutants to formaldehyde and azadC, HR is pivotal for reactivation of the DPC-stalled replication fork. Indeed, the formation of RAD51 nuclear foci, which is indicative of HR, was observed following treatment with formaldehyde and azadC. Accumulation of DSBs was observed in HR-deficient CHO cells, but not in HR-proficient CHO cells, after treatment with formaldehyde or azadC, suggesting that HR for DPCs is initiated by fork breakage to generate one-sided DSBs.
When CHO cells are treated with replication inhibitors such as hydroxyurea and aphidicolin, accumulation of DSBs due to fork breakage is observed even in HR-proficient cells [111, 112], suggesting mechanistic differences in the formation of DSBs by DPCs and replication inhibitors. The MUS81-EME1 and MUS81-EME2 structure-specific endonucleases are implicated in fork breakage in mammalian cells [113-115]. However, whether they are also involved in fork breakage at DPCs remains to be elucidated.
The data for DT40 cells suggest that TLS is a crucial component of DPC tolerance , but the sensitivity of mammalian TLS mutants to DPC-inducing agents has not been tested. Mouse embryonic stem cells deficient in two major DNA glycosylases (Nth1 and Ogg1) did not exhibit significant sensitivity to formaldehyde and azadC (Ide et al., unpublished data). Regarding the DNA damage response, it was shown that DPCs activate both ATM (ataxia-telangiectasia mutated) and ATR (ATM and Rad3-related) pathways in the late and early stages of damage response, respectively, in human cells .
Studies of the tolerance/repair of DPCs have been hampered by the facts that DPC-inducing agents such as aldehydes, bifunctional alkylating agents, and platinum anticancer drugs concurrently produce ICLs, which are also potent lethal lesions, and that tolerance/repair mechanisms for DPCs and ICLs are partly overlapping at least with respect to the requirement of HR and some structure-specific endonucleases. In the replication-dependent ICL repair mechanism, the FA core complex recognizes and binds an ICL that stalled the replication fork. Then, the FA core complex monoubiquitinates FANCD2 and FANCI. The ubiquitinated FANCD2-I heterodimer localizes to the ICL and recruits structure-specific endonucleases (XPF-ERCC1, MUS81-EME1, FAN1, SLX4) that incise the DNA on either side of the lesion to create a DSB. The complementary strand containing the unhooked cross-link is replicated by a TLS polymerase, and downstream FA proteins assist in coordination of HR to repair the DSB . ICLs are also repaired by the replication-independent mechanism involving structure-specific endonucleases and TLS polymerases . Similarly, DPCs can be repaired in a replication-independent manner if NER coupled with DPC-specific proteases (functional Wss1 homologs) operates in mammalian cells (see Section 4.2). The replication-independent repair of DPCs will be important for the survival of nonproliferating cells such as neurons since it ensures faithful gene expression and maintains cellular homeostasis.
5. Biological effects of DPCs
Some types of DPC generated by chemical and physical agents are stable and are not spontaneously reversed (Section 3). Furthermore, only small DPCs are actively removed from the genome by NER in bacterial cells (Section 4). Thus, a significant portion of DPCs persist in the genome and can affect various aspects of DNA transactions such as replication, transcription, repair, and recombination. This section focuses on the cytotoxic effects of DPCs through DNA replication and transcription.
5.2. Effects on DNA replication
5.2.1. The replisome
DNA is replicated by the replisome, which comprises the replicative DNA helicase, DNA polymerase, and other factors [118, 119]. The replicative helicases unwind the parental double-stranded DNA into two single strands, and DNA polymerases synthesize leading and lagging strands in continuous and discontinuous modes, respectively, using the separated strands as templates. The mechanism is well conserved from phages and bacteria through to higher organisms [118, 119]. The replisome proceeds through the barrier of DNA-associated proteins such as nucleosomes and site-specific DNA-binding proteins. The replicative helicases disrupt nucleosomes in eukaryotes, probably with the aid of histone modifications and chaperones . They can also unwind DNA that is associated with DNA-binding proteins with variable efficiencies [120, 121], and dislodge proteins from double-stranded DNA . Thus, the replisome has an intrinsic capacity to proceed through the protein barrier as long as it is reversible. However, many DNA-damaging agents generate DPCs and immobilize proteins in DNA.
5.2.2. Host-cell reactivation assays
The effects of DPCs on replication were studied in vivo using host-cell reactivation assays. Several types of DPC containing DNMT, histone fragments, and T4 endonuclease V were introduced into plasmids and the replication of those plasmids was analyzed in E. coli [18, 123, 124]. These studies revealed that DPCs inhibit the replication of plasmids, indicating that the progression of the replisome is impeded by DPCs in vivo. Analysis of the replication intermediates of plasmids containing DNMT-DPCs by two-dimensional gel electrophoresis and electron microscopy suggests that replication can switch from the theta to the rolling-circle mode after a replication fork is stalled by a DNMT-DPC. A study involving plasmids containing histone fragments as DPCs revealed that the efficiency of replication in E. coli varies with the size of the histone fragments and depends on the uvrA gene. With plasmids containing T4 endonuclease V as a DPC, the efficiency of replication is dependent on the uvrD gene.
5.2.3. Effects on DNA polymerases
The effects of DPCs on DNA polymerases have been studied in vitro using defined DPC DNA templates. DPCs constitute absolute blocks to DNA polymerase I Klenow fragments with and without 3’-5’ exonuclease and HIV-1 reverse transcriptase [107, 125, 126]. More recently it was shown that DPCs and DNA–peptide cross-links (a 23-mer peptide) completely block DNA replication by TLS DNA polymerases η, κ, ν, and ι in vitro, whereas smaller DNA–peptide cross-links (a 10-mer peptide) are bypassed . In addition, TLS polymerase ν can bypass small DNA-peptide cross-links placed in the major but not the minor groove of DNA . The replicative polymerases, such as prokaryotic Pol III holoenzyme and eukaryotic Pol δ and Pol ε, have not been tested for DPCs.
5.2.4. Effects on replicative helicases
The impediment of the replisome by DPCs is more closely associated with replicative helicases that unwind DNA at the front of the replication fork than with replicative polymerases. Replicative helicases such as the phage T7 gene 4 protein (T7gp4), simian virus 40 large T antigen (Tag), and E. coli DnaB protein are characterized by their ring-shaped homohexameric structure with a central channel that accommodates DNA. The eukaryotic replicative helicase also assembles into a ring-shaped heterohexamer of minichromosome maintenance (Mcm) proteins 2–7. In addition, a subcomplex comprising Mcm4, Mcm6, and Mcm7 (Mcm467) forms a ring-shaped heterohexamer containing two copies of each subunit and exhibits helicase activity in vitro. Coupled with the hydrolysis of NTP (usually ATP), T7gp4 and DnaB helicases translocate along the lagging template strand with 5’–3’ polarity and disrupt the hydrogen bonds between two strands, whereas Tag and Mcm helicases translocate along the leading template strand with 3’–5’ polarity [129, 130].
The effects of DPCs on the DNA-unwinding reaction of replicative helicases have been elucidated in vitro using defined DPC substrates. DPCs in the translocating strand, but not those in the nontranslocating strand, were found to impede the progression of the T7gp4, Tag, DnaB, and Mcm467 helicases (a conflicting result with Tag has been reported). The impediment varied with the size of the cross-linked proteins, with a threshold size for clearance of 5.0–14.1 kDa (Fig. 6), indicating that the central channel of the dynamically translocating hexameric ring helicases can accommodate only small proteins. Although DPCs constitute strong blocks to DNA polymerases, as mentioned above, the results shown in Fig. 6 highlight an alternative mechanism of replisome blockage that involves the inhibition of replicative helicases that unwind DNA at the front of the replication fork.
In addition, the results obtained for helicases suggest distinct fates of replisomes upon encountering conventional bulky damage and large DPCs. Conventional bulky damage in either the translocating or the nontranslocating strand is cleared by helicases and arrests DNA polymerase (Fig. 7A). This can further lead to functional uncoupling of polymerases and helicases as well as that of leading and lagging polymerases. In eukaryotes, the functional uncoupling of polymerases and helicases activates the checkpoint kinase ATR (ATM and Rad3-related), which directs the DNA damage response. DPCs in the translocating strand block the helicase, immediately halting leading- and lagging-strand synthesis (Fig. 7B). This will preclude functional uncoupling of polymerases and helicases and of leading and lagging polymerases. In contrast, DPCs in the nontranslocating strand do not block the helicase, and act like conventional bulky damage. Accordingly, the mechanism underlying stalled fork-processing and the concurrent events of damage signaling may differ significantly for DPCs in the translocating and nontranslocating strands.
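The strand-dependent outcomes described above can be tabulated compactly. The following Python sketch is only an illustrative encoding of the qualitative model in the text, not a quantitative simulation. The names `ForkOutcome` and `fork_outcome` and the string labels are hypothetical, and `"large_dpc"` is assumed to mean a DPC above the helicase clearance threshold.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ForkOutcome:
    blocked_component: str      # "helicase" or "polymerase"
    uncoupling_possible: bool   # helicase/polymerase uncoupling can occur

LESIONS = {"bulky", "large_dpc"}
STRANDS = {"translocating", "nontranslocating"}

def fork_outcome(lesion: str, strand: str) -> ForkOutcome:
    """Return the qualitative fate of the replisome for a lesion type and
    the strand (relative to the helicase) that carries it."""
    if lesion not in LESIONS:
        raise ValueError(f"unknown lesion: {lesion!r}")
    if strand not in STRANDS:
        raise ValueError(f"unknown strand: {strand!r}")
    # A large DPC in the translocating strand stops the helicase itself,
    # so leading- and lagging-strand synthesis halt together.
    if lesion == "large_dpc" and strand == "translocating":
        return ForkOutcome("helicase", uncoupling_possible=False)
    # Everything else is passed by the helicase and arrests the polymerase,
    # which leaves room for functional uncoupling.
    return ForkOutcome("polymerase", uncoupling_possible=True)

for lesion in sorted(LESIONS):
    for strand in sorted(STRANDS):
        print(lesion, strand, fork_outcome(lesion, strand))
```

Only one of the four combinations blocks the helicase, which is why damage signaling that relies on uncoupling may differ between DPCs in the two strands.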
Stalled DnaB, T7gp4, and Mcm467 helicases exhibit limited stability and dissociate from DNA with a half-life of 15–36 min in vitro . With E. coli, replisomes that are blocked by an array of repressor–operator complexes lose the ability to continue replication with a half-life of 4–6 min in vitro , whereas they retain the ability to resume replication upon removal of the block for several hours in vivo . The dissociation of stalled DnaB from DNA accounts at least partially for the inactivation of the replisome in vitro. The inactivation of the replisome due to loss of DnaB also seems to be consistent with the finding that reactivation of a stalled replication fork requires reloading of DnaB (or replication machinery) via the PriA helicase in E. coli , and that the priA mutant is hypersensitive to DPC-inducing agents .
In yeast, replisomes stalled by tight (but reversible) DNA–protein complexes are stable in vivo, and DNA synthesis continues through the barriers after a transient pause (ca. 30 min) [138, 139]. Thus, Mcm is likely to be retained in the stalled replisome in yeast cells. In contrast, a recent study of in vitro replication of plasmids with Xenopus egg extracts has shown that Mcm7 (a component of the Mcm complex) dissociates from DNA with an approximate half-life of 10 min when progression of the replisome is blocked by an ICL . This finding with Xenopus egg extracts concurs with the observation that Mcm467 stalled by a DPC dissociates from DNA with a half-life of 33 min. It is possible that the replisome can proceed by gradually disrupting reversible protein roadblocks in cells while retaining the helicase in the replisome. Conversely, this does not occur if the replisome is completely arrested by irreversible roadblocks such as DPCs and ICLs.
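The half-lives quoted above can be put on a common footing with a small calculation. The sketch below assumes simple first-order (single-exponential) dissociation, which the cited studies are not stated here to follow. The function name `fraction_bound` and the 30-min time point are illustrative choices.

```python
def fraction_bound(t_min: float, half_life_min: float) -> float:
    """Fraction of stalled helicase still on DNA after t_min minutes,
    assuming first-order dissociation (an assumption, not a source claim)."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    if t_min < 0:
        raise ValueError("t_min must be non-negative")
    return 0.5 ** (t_min / half_life_min)

# Half-lives (min) quoted in the text for stalled helicases.
half_lives = {
    "Mcm7 at an ICL (Xenopus egg extract)": 10,
    "stalled helicases in vitro, lower bound": 15,
    "Mcm467 at a DPC": 33,
    "stalled helicases in vitro, upper bound": 36,
}

for label, t_half in half_lives.items():
    remaining = fraction_bound(30, t_half)
    print(f"{label}: {remaining:.2f} of helicase bound after 30 min")
```

Under this assumption, a pause of about 30 min spans roughly one to three of the in vitro half-lives, so a substantial fraction of helicase would be lost unless it is stabilized in cells.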
5.3. Effects on transcription
5.3.1. RNA polymerase
Viral, prokaryotic, and eukaryotic RNA polymerases (RNAPs) have an ability to transcribe through nucleoproteins and site-specific DNA binding proteins, although the read-through efficiencies vary depending on the roadblocking proteins . ATP-dependent chromatin remodeling complexes, histone chaperones, and covalent histone modifications promote the transcription through nucleosomes . It has also been shown that the trailing RNAP stimulates forward translocation of the stalled leading RNAP through reversibly bound proteins [143, 144], as well as through naturally occurring pausing sites [145, 146].
RNAPs open the downstream DNA duplex at the DNA entry site to generate a transcription bubble, in which the transcribed strand (TS) is delivered deep into the active site and used for nascent RNA synthesis, while the nontranscribed strand (NTS) is relatively exposed to the surface of RNAP [147, 148]. Resolution of the crystal structure of yeast RNAP II revealed that conventional bulky lesions such as a cyclobutane pyrimidine dimer, a cisplatin intrastrand cross-link, and a monofunctional platinum adduct in the TS are delivered to the active site or its proximal position and then arrest transcription [149-151]. Conversely, DNA lesions in the NTS impose much less serious problems for transcription than do those in the TS .
5.3.2. Reporter assays
Luciferase-based reporter assays are widely used as a tool to study gene expression at the transcriptional level. To assess the effects of DPCs on transcription, histone H1 was cross-linked by formaldehyde to a pGL4.50 plasmid harboring the luciferase gene (Fig. 8A). The pGL4.50 containing histone H1-DPCs was transfected into HeLa cells and the bioluminescence resulting from the expressed luciferase was measured (Ide et al., unpublished data). The luciferase activity was found to decrease with increasing amounts of cross-linked histone H1 protein, indicating that transcription of the luciferase gene by RNAPII was inhibited by DPCs in vivo (Fig. 8B).
5.3.3. Effects on T7 RNAP
T7 RNAP is a single subunit RNAP and is structurally unrelated to bacterial and eukaryotic multisubunit RNAPs, but all share many functional characteristics in the initiation and elongation phases of transcription .
The effects of DPCs on transcription have been analyzed in vitro using phage T7 RNAP and defined DNA templates containing DPCs of various sizes (1.6–44 kDa). When DPCs were present in the TS, both abortive and runoff transcripts were produced, indicating stalling of the T7 RNAP by DPCs. There was a trend for the number of copies of runoff transcripts to decrease for larger DPCs. This result indicates that DPCs in the TS pose strong but not absolute blocks to T7 RNAP, allowing limited but significant lesion bypass even for large DPCs. This property contrasts with that of DNA polymerase I Klenow fragment, which was completely arrested even by the smallest DPC (1.6 kDa). It was also found that when DPCs were present in the NTS, no damage-dependent abortive transcripts were produced, although common weak abortive products formed for all templates. The formation of runoff transcripts was retarded only moderately by NTS-DPCs. The number of copies of runoff transcripts was virtually independent of the DPC size, and was 40–60% of that for the control template.
Stalling of leading T7 RNAP by TS-DPCs caused congestion of the trailing T7 RNAPs. Interestingly, sequence analysis of runoff transcripts has shown that stalled leading and trailing T7 RNAPs become highly error prone and generate untargeted mutations in the upstream intact template regions (Fig. 9); 40–75% of runoff transcripts contained mutations in the region . In contrast, no mutations were induced in runoff transcripts when NTS-DPCs were used. This contrasts with the transcriptional mutations induced by conventional DNA lesions, which are delivered to the active site or its proximal position in RNAPs and cause direct misincorporation.
Another interesting observation is that the trailing RNAP stimulates forward translocation of the stalled leading RNAP, promoting the translesion bypass of DPCs. The cooperation of T7 RNAPs enhances transcription through DPCs by a factor of 5.2–17. It has been proposed that bacterial and eukaryotic RNAPs cooperate during elongation so that the trailing RNAP assists in the transcription of the leading RNAP through reversibly bound proteins and pausing sites, by reducing the backtracking of the stalled/paused leading RNAPs [143-146]. Accordingly, a similar cooperative mechanism may operate in transcription through DPCs.
How bacterial and eukaryotic multisubunit RNAPs respond to DPCs in vitro and in vivo remains to be elucidated in future studies.
DPCs are superbulky DNA lesions that affect replication, transcription, and repair via mechanisms that differ from those involving conventional bulky lesions.
The findings from in vitro studies are summarized below. In DNA replication, DPCs, unlike conventional bulky lesions, block the progression of the replicative helicase and constitute helicase blocks when they are located in the translocating strand. Conversely, DPCs in the nontranslocating strand do not block the helicase. They act like conventional bulky damage and are delivered to polymerases (but not into the active site), constituting polymerase blocks. In transcription, DPCs in the TS block the progression of RNAP, but those in the NTS only moderately affect the transcription through DPCs. T7 RNAPs stalled by DPCs are very error prone. Thus, DPCs exert cytotoxic effects through the impairment of DNA replication and transcription. The impairment of DNA replication and transcription by DPCs has been substantiated in vivo by host-cell reactivation and reporter assays using DPC-containing plasmids.
In DNA repair, NER is the major mechanism for the repair of conventional bulky lesions. NER exhibits a robust activity for DNA–peptide cross-links, but a poor to negligible activity for DPCs. The initial recognition of DPCs by NER factors appears to be critical, and is compromised due to the steric hindrance of DPCs. However, the proteolytic degradation of DPCs by proteases may enable NER and TLS to participate in DPC repair. HR plays a principal role in the repair/tolerance of DPCs, but the molecular mechanism by which the DPC-stalled replication fork is reactivated through HR remains to be established.
Studies of the cytotoxic effects and repair of DPCs have been hampered by the facts that DPC-inducing agents concurrently produce other lethal lesions, such as ICLs (aldehydes and bifunctional cross-linking agents) and DSBs (ionizing radiation), and that repair pathways for DPCs, ICLs, and DSBs are partially overlapping. These also pose challenges for studies of the mutagenic and carcinogenic effects of DPCs, which were not addressed in this review. Future research will overcome these limitations and clarify the importance of DPCs in DNA damage together with the underlying molecular mechanism of the repair/tolerance of DPCs.
AGT: O6-alkylguanine-DNA alkyltransferase; AP: apurinic/apyrimidinic; azadC: 5-aza-2’-deoxycytidine; azarC: 5-azacytidine; BER: base excision repair; CHO: Chinese hamster ovary; cHyd: carbanucleoside of 5-hydroxy-5-methylhydantoin; dL: 2-deoxyribonolactone; DNMT: DNA methyltransferase; DPC: DNA-protein cross-link; dRP: 2-deoxyribose-5-phosphate; DSB: DNA double-strand break; FA: Fanconi anemia; FITC: fluorescein isothiocyanate; HR: homologous recombination; ICL: interstrand cross-link; Mcm: minichromosome maintenance; NER: nucleotide excision repair; NTS: nontranscribed strand; Oxa: oxanine; PARP-1: poly(ADP-ribose) polymerase-1; PRR: post-replication repair; RNAP: RNA polymerase; RR: replication restart; SDS: sodium dodecyl sulfate; SSB: DNA single-strand break; T7gp4: phage T7 gene 4 protein; Tag: simian virus 40 large T antigen; Tdp1: tyrosyl-DNA phosphodiesterase; TLS: translesion synthesis; TOPO: topoisomerase; TS: transcribed strand; XRCC: X-ray repair cross-complementing protein.
This work was partly supported by KAKENHI from the Japan Society for the Promotion of Science and the Ministry of Education, Culture, Sports, Science and Technology in Japan (grant numbers: 22131010 and 26550030 to H.I., and 26340023 to T.N.).