Inherited mutations in genes related to DNA repair that increase the risk of cancer
DNA damage appears to be a fundamental problem for life. As we show below, the average human cell receives about 60,000 DNA damages per day due to natural endogenous causes. Most DNA damages are repaired by one or more enzyme systems. However, excessive DNA damages are a major primary cause of cancer. Error-prone replication past DNA damages or inaccurate repair of DNA damages gives rise to mutations and epimutations that, by a process of natural selection, can cause progression to cancer.
DNA damage is a change in the basic structure of DNA that is not itself replicated when the DNA is replicated. A DNA damage can be a chemical addition or disruption to a base of DNA (creating an abnormal nucleotide or nucleotide fragment) or a break in one or both strands of DNA. When DNA carrying a damaged base is replicated, an incorrect base may be inserted opposite the site of the damaged base in the complementary strand, and this can become a mutation in the next round of replication. Also DNA double-strand breaks may be repaired by an inaccurate end-joining process leading to mutations. In addition, a double strand break can cause rearrangements of the chromosome structure (possibly disrupting a gene, or causing a gene to come under abnormal regulatory control), and, if such a change can be passed to successive cell generations, it is also a form of mutation. Mutations, however, can be avoided if accurate DNA repair systems recognize DNA damages as abnormal structures, and repair the damages prior to replication.
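The route from an unrepaired damaged base to a fixed mutation can be traced in a toy model. The sketch below is illustrative only. It assumes a made-up symbol `o` for a damaged guanine (8-OH-deoxyguanine), assumes the polymerase always inserts adenine opposite it, and ignores strand polarity so that strands can be compared position by position. The names `replicate` and `repair` are hypothetical helpers, not part of any library.

```python
# Toy model: how an unrepaired damaged base becomes a fixed mutation.
# Assumptions (ours, for illustration): "o" stands for a damaged guanine,
# the polymerase always mispairs it with adenine, and strand polarity is ignored.

PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}
DAMAGED_G = "o"


def replicate(template: str) -> str:
    """Return the strand synthesized opposite `template`."""
    new_strand = []
    for base in template:
        if base == DAMAGED_G:
            new_strand.append("A")  # incorrect base inserted opposite the damage
        else:
            new_strand.append(PAIRING[base])
    return "".join(new_strand)


def repair(strand: str) -> str:
    """Accurate repair: restore the damaged guanine before replication."""
    return strand.replace(DAMAGED_G, "G")


original = "ATGC"
damaged = "AToC"  # the G at index 2 has been damaged

# Without repair: the second round copies the misinserted A, fixing a G-to-T change.
unrepaired_result = replicate(replicate(damaged))

# With repair before replication: the original sequence is recovered.
repaired_result = replicate(replicate(repair(damaged)))

print(unrepaired_result)  # ATTC
print(repaired_result)    # ATGC
```

After the second round the altered sequence consists only of standard bases, which is why a repair system that looks for abnormal structures can no longer detect it.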
DNA damages occur in both replicating, proliferative cells (e.g. those forming the internal lining of the colon or blood-forming “hematopoietic” cells), and in differentiated, non-dividing cells (e.g. neurons in the brain or myocytes in muscle). Cancers occur primarily in proliferative tissues. If DNA damages in proliferating cells are not accurately repaired due to inadequate expression of a DNA repair gene, the risk of cancer increases. In contrast, when DNA damages occur in non-proliferating cells and are not repaired due to inadequate expression of a DNA repair gene, the damages can accumulate and may cause premature aging.
A mutation is a change in the DNA sequence in which normal base pairs are substituted, added, deleted or rearranged. The DNA containing a mutation still consists of a sequence of standard base pairs, and the altered DNA sequence can be copied when the DNA is replicated. A mutation can prevent a gene from carrying out its function, or it can cause a gene to be translated into a protein that functions abnormally. Mutations can activate oncogenes, inactivate tumor suppressor genes or cause genomic instability in replicating cells, and an assemblage of such mutations, together in the same cell, can lead to cancer. Cancers usually arise as a consequence of mutations conferring a selective advantage that leads to clonal expansion. Colon cancers, for example, have an average of 3 or 4 “driver” mutations (mutations occurring repeatedly in different colon cancers) and about 75 “passenger” mutations (mutations occurring infrequently in colon cancers). Colon cancers also have an average of 17 focal amplifications, 28 recurrent deletions and up to 10 translocations. Since mutations have a normal DNA structure, they cannot be recognized or removed by DNA repair processes in living cells. Removal of a mutation only occurs if it is sufficiently deleterious to cause the death of the cell.
An epigenetic change (epimutation) is a heritable change in gene expression that is not accompanied by a change in DNA sequence. These epigenetic changes can include DNA methylation, constitutive (not facultative or induced) changes in small noncoding RNAs including microRNAs, altered chromatin architecture, histone tail modifications by methylations and acetylations that repress or activate transcription of the DNA wrapped around the histones, and nucleosome re-positioning . These epigenetic changes are very frequent in cancers. For instance, 24 colon cancers were analyzed at more than 3,000 DNA segments within the genome for differentially methylated regions (DMRs). The colon cancers were found to have between 515 and 33,576 DMRs compared to adjacent histologically normal tissues . Most of the differential methylations were increases in methylation, though some were hypomethylations. Increased methylation of the promoter region of a gene generally represses the transcription of that gene. Another epigenetic factor, a microRNA (miRNA), can have several hundred “target genes” . Those target genes are repressed by the miRNA causing the degradation or blocked translation of the messenger RNA produced by those genes. Increased expression of an miRNA can occur due to epigenetic hypomethylation of the promoter region controlling transcription of the miRNA. When 754 miRNAs were evaluated in progression to esophageal adenocarcinoma (EAC), the expression levels of about 130 miRNAs were increased and the expression levels of 16 miRNAs were decreased in tissues with EAC (or in tissues with Barrett’s esophagus, a precursor lesion) compared to histologically normal esophageal tissues adjacent to the EACs .
As described in detail below, inherited germ-line mutations of DNA repair genes give rise to syndromes characterized by increased risk of cancer. Such inherited mutational defects of DNA repair genes can allow excess unrepaired DNA damages to accumulate in somatic cells. Inaccurate translesion synthesis past the unrepaired DNA damages can cause mutations. In addition, error prone DNA repair pathways, such as non-homologous end joining, can also cause mutation. Erroneous or incomplete DNA repair may also cause epigenetic modifications. Thus, deficient DNA repair that leaves behind excess DNA damages can cause increased mutations and epimutations, and these mutations and epimutations can include both the driver mutations and the epigenetic alterations central to progression to cancer.
As will be described below, whole genome sequencing of many different types of cancers shows that thousands to hundreds of thousands of mutations occur in various types of cancers. Mutation frequencies in non-cancerous tissues are substantially lower. Loss-of-function mutations in DNA repair genes are relatively infrequent in sporadic (non-germ-line induced) cancers. However, DNA repair genes frequently express reduced levels of repair proteins in cancers due to epigenetic repression, and this can lead to increased DNA damage, and hence, increased mutation. The epigenetic repression of DNA repair gene expression is also frequent in the field defects that surround and give rise to cancers. Thus, epigenetic reduction of DNA repair appears to be a frequent early step, central to progression to cancer.
2. Inherited mutations in DNA repair genes and cancer syndromes
Hereditary cancer syndromes account for about 5% to 10% of the incidence of cancers. Two reviews list 48 and 55 familial cancer susceptibility syndromes. Mutations in 38 genes related to DNA repair cause hereditary cancer syndromes (Table 1). Since such syndromes are frequently caused by mutations in DNA repair genes, this indicates that sporadic reductions in DNA repair gene expression may also be a frequent and crucial early event in progression to sporadic cancer.
3. Mutations versus epimutations in DNA repair genes during progression to cancer
Upon reviewing the results from sequencing 3,284 tumors and the 294,881 mutations found in these tumors, Vogelstein et al. noted that germ-line mutations that give rise to cancer are infrequent in sporadic tumors. This indicates that if an early step in progression to sporadic cancer (rather than a germ-line syndrome) is reduction in function of a DNA repair gene, the reduction is likely due to an epigenetic alteration in that gene (an epimutation), rather than to a mutation (a change in base-pair sequence).
Two examples are given here. In one case, for 113 sequential colorectal cancers, only four had a missense mutation in the DNA repair gene O-6-methylguanine-DNA methyltransferase (MGMT), while the majority had reduced MGMT expression due to methylation of the MGMT promoter region (an epigenetic alteration). Five reports presented evidence that between 40% and 90% of colorectal cancers have reduced MGMT expression due to methylation of the MGMT promoter region [12-16].
Similarly, out of 119 cases of mismatch repair-deficient colorectal cancers that lacked DNA repair gene PMS2 expression, PMS2 was deficient in 6 due to mutations in the PMS2 gene, while in 103 cases PMS2 expression was deficient because its pairing partner MLH1 was repressed due to promoter methylation (PMS2 protein is unstable in the absence of MLH1). In the other 10 cases, loss of PMS2 expression was likely due to epigenetic over-expression of the miRNA, miR-155, which down-regulates MLH1.
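The balance between mutation and epimutation in these two examples is easier to see as percentages. The short sketch below only re-expresses the counts quoted above. The helper `share` and the dictionary labels are our own naming, and the 10 miR-155 cases are counted as epigenetic on the strength of the "likely" attribution in the text.

```python
def share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`, to one decimal place."""
    return round(100 * part / total, 1)


# MGMT example: 113 colorectal cancers, 4 with a missense mutation.
mgmt_mutation_share = share(4, 113)

# PMS2 example: causes of lost PMS2 expression, as counted in the text.
pms2_causes = {
    "PMS2 mutation": 6,
    "MLH1 promoter methylation": 103,
    "miR-155 over-expression (likely)": 10,
}
pms2_total = sum(pms2_causes.values())
pms2_shares = {cause: share(n, pms2_total) for cause, n in pms2_causes.items()}

# Both non-mutational causes act through repression of MLH1.
pms2_epigenetic_share = share(103 + 10, pms2_total)

print(mgmt_mutation_share)    # 3.5
print(pms2_total)             # 119
print(pms2_shares)
print(pms2_epigenetic_share)  # 95.0
```

On these counts, mutation accounts for roughly 4% to 5% of cases in each example, with the remainder attributed to epigenetic repression.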
4. DNA damages are very frequent
As shown in Table 2, an average of more than 60,000 endogenous DNA damages occur per cell per day in humans. These are largely caused by exposure to reactive oxygen molecules, hydrolytic reactions, and interactions with other reactive metabolites (including lipid peroxidation products, endogenous alkylating agents and reactive carbonyl species).
In addition to the damages shown in Table 2, further DNA damages occur due to environmental assaults. Doll and Peto compared cancer rates of specific organs in humans in the United States to cancer rates in these organs in other countries. They concluded that 75-80% of the cases of cancer in the United States were likely avoidable, and were due to DNA damaging agents found in occupational, medical and “social” exposures (including diet and tobacco).
Colon cancer is an example of a diet-related cancer that appears to be caused by excessive exposure of the colon to DNA damaging agents, mainly bile acids. Bile acids are released into the intestinal tract in response to consumption of fatty foods to aid in their digestion. As reviewed by Bernstein et al., 14 published reports indicate that the secondary bile acids deoxycholic acid and lithocholic acid cause DNA damage. The concentrations of these bile acids in the colon are affected by diet and are doubled in the colonic contents of humans on typical diets in the United States who were experimentally fed a high fat diet. The potential consequences of high fecal bile acid concentrations are illustrated by the following comparison. The concentration of deoxycholic acid (DOC) in the feces of Native Africans in South Africa is 7.30 nmol/g wet weight stool while that of African Americans is 37.51 nmol/g wet weight stool, a 5.14-fold higher concentration of DOC in stools of African Americans than in Native Africans. Native Africans in South Africa have a colon cancer rate of <1:100,000 compared to the incidence rate for male African Americans of 72:100,000, a more than 72-fold difference in rates of colon cancer. In populations migrating from low-incidence to high-incidence countries, cancer rates change rapidly, and within one generation may reach the rate in the high-incidence country. This has been observed, for instance, in the colon cancer incidence of migrants from Japan to Hawaii. These changes in colon cancer rates among migrants are thought to be largely due to changes in diet.
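The two fold-differences quoted in this comparison can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures in the text. Because the Native African rate is reported as "<1 per 100,000", it is treated here as an upper bound of 1, so the resulting cancer ratio is a lower bound. The variable names are ours.

```python
# Fecal deoxycholic acid (DOC), nmol/g wet weight stool, as quoted in the text.
doc_native_africans = 7.30
doc_african_americans = 37.51
doc_fold = doc_african_americans / doc_native_africans

# Colon cancer rates per 100,000. The Native African rate is given as "<1",
# so 1 is used as an upper bound and the ratio below is a lower bound.
rate_native_africans_upper_bound = 1
rate_african_american_males = 72
cancer_fold_lower_bound = rate_african_american_males / rate_native_africans_upper_bound

print(f"DOC concentration: {doc_fold:.2f}-fold higher")            # 5.14-fold
print(f"Colon cancer rate: >{cancer_fold_lower_bound:.0f}-fold higher")  # >72-fold
```

The comparison is correlational: the cancer ratio is far larger than the DOC ratio, so the figures are consistent with, but do not by themselves establish, a dose-response relationship.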
|DNA repair gene(s)||Encoded protein||Repair pathway(s) affected*||Cancers with increased risk|
|breast cancer 1 & 2||BRCA1 BRCA2||HRR of double-strand breaks and daughter strand gaps||Breast, Ovarian|
|ataxia telangiectasia mutated||ATM||Different mutations in ATM reduce HRR, single-strand annealing (SSA), NHEJ or homology directed double-strand break rejoining (HDR)||Leukemia, Lymphoma, Breast [29,30]|
|Nijmegen breakage syndrome||NBS||NHEJ||Lymphoid cancers|
|meiotic recombination 11||MRE11||HRR and NHEJ||Breast|
|Bloom’s Syndrome (helicase)||BLM||HRR||Leukemia, Lymphoma, Colon, Breast, Skin, Lung, Auditory canal, Tongue, Tonsil, Larynx, Uterus|
|Werner Syndrome (helicase)||WRN||HRR, NHEJ, long patch BER||Soft tissue sarcoma, Colorectal, Skin, Thyroid, Pancreatic|
|Rothmund-Thomson syndrome, Baller-Gerold syndrome||RECQ4||Helicase likely active in HRR||Basal cell carcinoma, Squamous cell carcinoma, Intraepidermal carcinoma|
|Fanconi’s anemia gene FANC A,B,C,D1,D2,E,F,G,I,J,L,M,N||FANCA etc.||HRR and TLS||Leukemia, Liver tumors, Solid tumors many areas|
|xeroderma pigmentosum C, E [DNA damage binding protein 2 (DDB2)]||XPC XPE||Global genomic NER repairs damage in both transcribed and untranscribed DNA [42,43]||Skin cancer (melanoma and non-melanoma) [42,43]|
|xeroderma pigmentosum A, B, D, F, G||XPA XPB XPD XPF XPG||Transcription coupled NER repairs the transcribed strands of transcriptionally active genes||Skin cancer (melanoma and non-melanoma)|
|xeroderma pigmentosum V (also called polymerase H)||XPV||Translesion Synthesis (TLS)||Skin cancers (basal cell, squamous cell, melanoma)|
|mutS (E. coli) homolog 2, mutS (E. coli) homolog 6, mutL (E. coli) homolog 1, postmeiotic segregation increased 2 (S. cerevisiae)||MSH2 MSH6 MLH1 PMS2||MMR||Colorectal, Endometrial|
|mutY homolog (E. coli)||MUTYH||BER of A mispaired with 8-OH-dG||Colon|
|ataxia telangiectasia and RAD3 related||ATR||DNA damage response likely affects HRR, but not NHEJ||Oropharyngeal cancer|
|Li Fraumeni syndrome||P53||HRR, BER, NER and DDR for those and NHEJ and MMR||Sarcoma, Breast, Osteo-sarcoma, Brain, Adreno-cortical carcinomas|
|Severe combined immune deficiency (SCID)||Artemis DCLRE1C||NHEJ||B-cell lymphoma|
|CHEK2 (a DDR gene)||CHEK2||Double-strand breaks||Breast, Ovarian|
The likely role of bile acids as causative agents in colon cancer is illustrated by experiments with mice. When mice were fed a diet supplemented with the bile acid deoxycholate (DOC) for 10 months, raising their colonic level of DOC to that of humans on a high fat diet, 45% to 56% of these mice developed colon cancers, while mice fed the standard diet alone, with 1/10 the level of colonic DOC, developed no colon cancers [56,57].
|DNA damages||Reported rate of occurrence|
|Oxidative||86,000 per cell per day in rats; 10,000 per cell per day in humans|
|Depurinations||9,000 per cell per day|
|Depyrimidinations||696 per cell per day|
|Single-strand breaks||55,000 per cell per day|
|Double-strand breaks||~50 per cell cycle in humans|
|O6-methylguanine||3,120 per cell per day|
|Cytosine deamination||192 per cell per day|
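A rough tally of the rates in Table 2 shows where the figure of more than 60,000 endogenous damages per cell per day comes from. The sketch below rests on several assumptions of ours: the human value is used for oxidative damage rather than the rat value, the categories are treated as non-overlapping, and double-strand breaks are left out of the daily sum because their rate is reported per cell cycle rather than per day.

```python
# Per-cell, per-day rates from Table 2 (human oxidative value used).
damages_per_cell_per_day = {
    "oxidative": 10_000,
    "depurinations": 9_000,
    "depyrimidinations": 696,
    "single-strand breaks": 55_000,
    "O6-methylguanine": 3_120,
    "cytosine deamination": 192,
}

# Reported per cell cycle, not per day, so kept separate from the daily sum.
double_strand_breaks_per_cell_cycle = 50

daily_total = sum(damages_per_cell_per_day.values())
ssb_share = 100 * damages_per_cell_per_day["single-strand breaks"] / daily_total

print(daily_total)                 # 78008
print(daily_total > 60_000)        # True
print(f"{ssb_share:.1f}% of the tally is single-strand breaks")
```

On this tally single-strand damages dominate, while double-strand breaks are rarer by about three orders of magnitude.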
5. DNA repair deficiency allows excess DNA damage accumulation
At least 169 enzymes are either directly employed in DNA repair or influence DNA repair processes . Of these, 139 are directly employed in DNA repair processes including base excision repair (BER), nucleotide excision repair (NER), homologous recombinational repair (HRR), non-homologous end joining (NHEJ), mismatch repair (MMR) and direct reversal of lesions (DR). The other 30 enzymes are employed in the DNA damage response (DDR) needed to initiate DNA repair; chromatin structure modification required for repair; reactions needed for the reversible, covalent attachment of ubiquitin and small ubiquitin-like modifier (SUMO) proteins to DDR factors that facilitate DNA repair; or modulation of nucleotide pools.
When the incidence of endogenous and exogenous DNA damages is high, decreases in expression of DNA repair genes or DNA damage response (DDR) genes would be expected to lead to a build-up of DNA damage within a cell. Five examples below indicate that a DNA repair deficiency leads to excess DNA damage accumulation.
BLM deficiency. As reviewed by Manthei and Keck and Croteau et al., Bloom's syndrome helicase (BLM) likely has roles in multiple steps in homologous recombinational repair (HRR) of double-strand breaks (DSBs) in DNA. BLM is able to stimulate nuclease activity in a 5′ end resection at a DSB. This aids in initiation of HRR. This activity may serve to shuttle DSBs away from non-homologous end joining (NHEJ) pathways, which are more error prone. In the second step of HRR, the RAD51 recombinase forms a helical filament on the free 3′ DNA end. A homology search within double-stranded DNA by the RAD51/ssDNA complex produces a D-loop structure as a result of invasion of a ssDNA segment into a homologous sister-chromatid or chromosome. In this step, BLM interacts with RAD51 and is able to migrate and unwind D-loops. BLM is also part of a “dissolvasome” that resolves Holliday junctions during HRR. Humans with a germ-line BLM mutation accumulate chromosomal rearrangements and aneuploidy and have increased susceptibility to several kinds of cancer (Table 1).
MUTYH deficiency. MUTYH protein is a glycosylase that removes an undamaged adenine mispaired with the damaged DNA base 8-OH-deoxyguanine. This removal leaves an apurinic/apyrimidinic (AP) site that initiates a special long-patch base excision repair. This repair depends on accurate translesion synthesis by polymerase lambda (pairing a cytosine opposite the 8-OH-deoxyguanine), creating a cytosine:8-OH-deoxyguanine pair, which then allows other enzymes to recognize and remove the 8-OH-deoxyguanine. If MUTYH (or MYH) expression is decreased in human-origin HeLa cells by short hairpin RNA (shRNA, an RNA with a tight hairpin turn that can silence target gene expression), external application of H2O2 causes increased accumulation of 8-OH-deoxyguanine. This finding shows that deficient expression of DNA repair protein MUTYH allows 8-OH-deoxyguanine accumulation when cells are under oxidative stress. Note that a germ-line MUTYH mutation increases the risk of colon cancer (Table 1).
ATM deficiency. In response to double-strand breaks, both ATM and ATR phosphorylate a multitude of protein substrates, including p53, and the checkpoint kinases, CHEK1 and CHEK2. These phosphorylated substrates promote cell cycle arrest and initiate DNA repair. Arresting the cell cycle allows time for enzymes to repair the DNA before DNA synthesis or chromosome segregation initiates [66,67]. As shown by Flockerzi et al.  and Rübe et al. , when DNA repair is reduced by homozygous loss of function of ATM, a low dose of radiation causes more DNA damage, especially double-strand breaks, to accumulate than when ATM is wild-type. Note that germ-line ATM mutations increase the risk of leukemia, lymphoma and breast cancer (Table 1).
ERCC1 deficiency. Nucleotide excision repair (NER) removes helix-distorting “blocking” lesions located throughout the genome. Such lesions may block movement of DNA polymerase during DNA replication, or a lesion on the transcribed strand may block elongating RNA polymerase movement within an active gene. XPC-RAD23B initiates the repair response by recognizing a damage-induced structural change in DNA and then binding to the strand opposite the lesion rather than to the chemical adduct itself. After a number of steps, the two endonucleases, XPF-ERCC1 and XPG, carry out incisions 5’ and 3’, respectively, to the DNA damage. The presence of genetic polymorphisms of ERCC1 with reduced DNA repair capacity allows more benzo[a]pyrene-DNA adducts to accumulate in cells exposed to benzo[a]pyrene. Thus, bulky helix-distorting lesions accumulate when ERCC1 protein activity is deficient.
DNA polymerase beta deficiency. As reviewed by Sobol, the base excision repair (BER) pathway is used to repair many DNA damages including depurinated and depyrimidinated bases (abasic sites), deaminated cytosine or 5-methylcytosine, and oxidation products such as 8-OH-dG, thymine glycol and lipid peroxidation products. Once the base lesion is removed by one of 11 DNA glycosylases and the abasic site is hydrolysed by APE1 endonuclease, DNA polymerase beta (POLB) is recruited to the lesion and carries out two functions: (1) removal of the sugar-phosphate residue that remains after APE1 cleaves the DNA backbone and (2) addition of the new nucleotide(s) to replace the one(s) removed during repair. A single nucleotide polymorphism (SNP) in POLB (P242R), encoding an enzyme that acts at half the rate of wild-type POLB, occurs in 2.4% of individuals in certain human populations [72,73]. This P242R variant has reduced effectiveness in DNA repair, and cells carrying it accumulate double-strand breaks at a higher rate than cells carrying the wild-type allele. The population of humans examined and carrying the P242R SNP was not large enough to determine whether this germ-line SNP increases the risk of cancer, so POLB is not listed in Table 1. However, while germ-line mutations in POLB are not known to increase cancer risk, the human POLB gene is mutated in 40% of colorectal tumors, though not in the field defects surrounding the tumors. Absence in the field defect but presence in the tumor suggests that a mutation in POLB is a later step in progression to a tumor.
6. DNA damages give rise to mutations and epigenetic alterations
As described below, a substantial proportion of mutations are due to translesion synthesis past otherwise un-repaired single-strand DNA damages, the most frequent endogenous DNA damages in Table 2. However, while only a minority of endogenous DNA damages in the average cell are double-strand breaks, this type of lesion appears to contribute substantially to the mutation rate as well. As indicated by Vilenchik and Knudson , the doubling dose for ionizing radiation (IR) induced double-strand breaks is similar to the doubling dose for mutation and induction of carcinomas by IR. Thus, double-strand breaks likely lead frequently to mutations. In addition, as further described below, some portion of the epigenetic alterations transmissible from one generation to the next (epimutations) appear to have arisen from otherwise temporary alterations needed during steps in DNA repair.
7. Translesion synthesis past a DNA damage
Translesion synthesis (TLS) is a DNA damage tolerance process that allows the DNA replication machinery to replicate past DNA lesions in the template strand. This allows replication to be completed, rather than blocked (which may kill the cell or cause a translocation or other chromosomal aberration).
Humans have four translesion polymerases in the Y family of polymerases [REV1, Pol κ (kappa), Pol η (eta), and Pol ι (iota)] and one in the B family of polymerases [Pol ζ (zeta)]. REV1 inserts cytosine opposite abasic sites in DNA (which may not be the correct base for that site) and has a structural role in regulating Pol ζ. Pol ζ extends replication past distorted DNA pairs, such as mismatched pairs of bases or bases with bulky DNA adducts. Pol η is a DNA polymerase that efficiently replicates DNA templates containing thymine dimers. Pol ι utilizes Hoogsteen base pairing for efficient and correct incorporation of cytosine opposite altered purines, such as 8-oxoguanine, but also tends to incorporate guanine opposite thymine. Pol κ is specialized in performing error-free bypass of bulky minor groove N2-deoxyguanine adducts among other lesions, but is highly error-prone when replicating a normal portion of a template .
The temporary tolerance of DNA damage during replication may allow DNA repair processes to remove the damage later, and avoid immediate genome instability. However, translesion synthesis is less accurate than the replicative polymerases δ (delta) and ε (epsilon) and tends to introduce mutations.
8. Mutation due to translesion synthesis
Deficiency in expression of a DNA repair gene allows excessive DNA damages to accumulate. Some of the excess damages are likely processed by translesion synthesis, causing increased mutation.
As one example, BRCA2 protein is normally active in the accurate homologous recombinational repair (HRR) pathway. Loss of both wild-type DNA repair gene BRCA2 alleles causes rapid spontaneous acquisition of genome-wide somatic mutations in the replicating tissues of mouse embryos. The mutations were measured in LacZ-plasmid transgenic reporter mouse embryos, a system in which large genomic deletions, insertions and translocations can be detected. The mutations found in BRCA2(-/-) mouse embryos are predominantly deletion/rearrangement mutations consistent with mis-repair of DNA double-strand breaks arising during DNA replication . The proportion of deletion/rearrangement mutations (76%) in the presence of BRCA2(-/-) is close to the proportion of deletion/rearrangement mutations (71%) among the much less frequent mutations found in the presence of the wild-type alleles BRCA2(+/+). This finding suggests that the mode of error-prone translesion synthesis past any un-repaired DNA damages is likely the same in BRCA2(-/-) and BRCA2(+/+) cells. In BRCA2(+/+) cells, the accurate HRR pathway would take care of most of the relevant DNA damages, rather than translesion synthesis.
Kunz et al.  summarized a large number of experiments in yeast, in which forward mutations were measured (by sequence analyses of a few selected genes) in cells carrying either wild-type alleles or one of 11 inactivated DNA repair genes. Their results indicated that DNA repair deficient cells accumulate excess DNA damage that could then give rise to mutations after error-prone translesion synthesis. The 11 inactivated DNA repair genes were distributed among mismatch repair, nucleotide excision repair, base excision repair and homologous recombinational repair genes. Deficiencies in DNA repair increased mutation frequencies by factors between 2- and 130-fold, but most often by double digit-fold increases. Overall, the authors concluded that 60% or more of single base pair substitutions and deletions are likely caused by translesion synthesis.
Hegan et al.  studied forward mutation in mice to determine the spontaneous mutation frequency in the presence of wild-type alleles or in the presence of knockout mutations in five individual mismatch repair genes and in pairs of double knockout mismatch repair genes. They used two mutation reporter genes within chromosomally integrated, recoverable phage lambda shuttle vectors to measure mutation frequencies and to determine the types of mutations present. The inactivated mismatch repair genes were in Pms2, Mlh1, Msh2, Msh3 and Msh6. All mice with nullizygous mutations in these mismatch repair genes had significantly increased mutation frequencies compared to wild-type mice with both reporter genes tested. The highest two individual mutation frequencies were found in mice defective for Mlh1 (>72-fold increase) and Msh2 (65-fold increase). The double knockout mice had still higher frequencies of mutation than the single knockout mice. The greatest increase found was with the Msh3-/-/Msh6-/- double knockout mice that had more than a 100-fold increase in mutation frequency with one of the reporter genes compared to wild-type mice. In these mismatch repair deficient mice, the majority of mutations found were generally insertion and deletion mutations.
Stuart et al. examined spontaneous mutation frequencies in a lacI transgene (in a Big Blue mutation assay) in either replicating tissues or in largely non-replicating tissues of mice. If most mutations occur during translesion synthesis, then non-replicating brain tissue, which has little or no synthesis once maturity is reached, would have little or no further mutation accumulation. In mouse brain, after 6 months of age, there was no increase in mutation frequency, even at 25 months of age. In bladders of mice, with replicating tissues, mutation frequency increased with age, almost tripling between 1.5 months and 12 months of age. The authors concluded that the age-related increases in spontaneous mutation frequencies reflect endogenous DNA damages that were subsequently expressed as mutations following DNA replication. This indicates that translesion synthesis is a major source of mutation in the mouse.
9. Mutation due to error prone repair of double-strand breaks
As described by Bindra et al. , non-homologous end-joining (NHEJ) and homologous recombination repair (HRR) comprise the two major pathways by which double-strand breaks (DSBs) are repaired in cells. NHEJ processes and re-ligates the exposed DNA termini of DSBs without the use of significant homology, whereas HRR uses homologous DNA sequences as a template for repair. HRR predominates in S-phase cells, when a sister chromatid is available as a template for repair, and is a high-fidelity process. NHEJ is thought to be active throughout the cell cycle, and it is more error-prone than HRR. NHEJ repair comprises both canonical NHEJ and non-canonical pathways. The former pathway results in minimal processing of the DSB during repair, whereas the latter pathway, with or without the use of sequence microhomology for re-ligation, typically results in larger insertions or deletions. Mutagenic NHEJ repair is a robust process, yielding percentages of mutated sites at the position of a DSB ranging from 20 to 60%.
As pointed out by Vilenchik and Knudson, about 1% of single-strand DNA damages escape repair and are not bypassed, and some of these become converted to double-strand breaks. This may contribute to the impact of double-strand breaks in causing mutations and carcinogenesis.
10. Epigenetic alterations occur due to DNA damage
Experiments have been conducted to determine the molecular steps by which epigenetic alterations arise due to incomplete repair of DNA double-strand breaks. In one experiment O’Hagan et al. used a cell line that was stably transfected with a plasmid containing a consensus I-SceI cut site inserted into a copy of the E-cadherin (E-cad) promoter. This promoter contained a CpG island (where a cytosine nucleotide frequently occurs next to a guanine nucleotide in the linear sequence of bases). The cytosines in these CpG islands are often hypermethylated, causing epigenetic repression of the associated genes. Such hypermethylations occur in multiple human tumor types. The investigators induced a defined double-strand break in the E-cadherin CpG island, which was not currently hypermethylated. After the onset of repair of the break, they observed the expected recruitment to the site of damage of key proteins involved in establishing and maintaining transcriptional repression, to allow repair of the break. These proteins included SIRT1, EZH2, DNMT1, and DNMT3B. Furthermore, silencing histone modifications appeared, including hypoacetyl H4K16, H3K9me2 and me3, and H3K27me3. In most cells selected after the DNA break, DNA repair occurred faithfully with preservation of activity of the promoter, and removal of the silencing factors. However, a small percentage of the plated cells demonstrated induction of heritable silencing. The chromatin around the break site in such a silent clone was enriched for most of the silencing chromatin proteins and histone marks, and the region had increased DNA methylation in the CpG island of the promoter. Their data suggested that repair of a DNA break can occasionally cause heritable silencing of a CpG island-containing promoter by recruitment of proteins involved in silencing, leading to aberrant CpG island DNA methylation. Such CpG island methylation is frequently associated with tight gene silencing in cancer.
In a second experiment showing that epigenetic alterations arise as a consequence of DNA damage, Morano et al. studied a system in which recombination between partial duplications of a chromosomal Green Fluorescent Protein (GFP) gene is initiated by a specific double-strand break (DSB) in one copy. The unique DSB is generated by cleavage with the meganuclease I-SceI, which does not otherwise cleave the eukaryotic genome. The DSB is repeatedly formed and repaired, until the I-SceI site is lost by homologous or nonhomologous repair or depletion of the I-SceI enzyme. Recombination products can be detected by direct analysis of the DNA flanking the DSB or by the appearance of functional GFP (green fluorescent cells). Two cell types were generated after recombination: clones expressing high levels of GFP and clones expressing low levels of GFP, referred to as H and L clones, respectively. Relative to the parental gene, the repaired GFP gene was hypomethylated in H clones and hypermethylated in L clones. The altered methylation pattern was largely restricted to a segment just 3’ to the DSB. Hypermethylation of this tract significantly reduced transcription, although it is 2000 base pairs distant from the strong cytomegalovirus (CMV) promoter that drives GFP expression. The ratio between L (hypermethylated) and H (hypomethylated) clones was 1:2 or 1:4, depending on the insertion site of the GFP reporter. These experiments were performed in mouse embryonic stem (ES) or human cancer (HeLa) cells. HRR-induced methylation was dependent on DNA methyltransferase 1 (DNMT1). These data, taken together, argue for a cause-effect relationship between DNA damage-repair and altered DNA methylation.
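The L:H ratios reported by Morano et al. can be restated as the fraction of recombinant clones that ended up hypermethylated. The sketch below is a simple conversion and assumes, as the text implies, that every recombinant clone is either an L clone or an H clone. The function name is ours.

```python
from fractions import Fraction


def l_clone_fraction(l_parts: int, h_parts: int) -> Fraction:
    """Fraction of recombinant clones that are L (hypermethylated) clones,
    given an L:H ratio of l_parts:h_parts."""
    return Fraction(l_parts, l_parts + h_parts)


# The two L:H ratios reported, depending on the reporter insertion site.
for l_parts, h_parts in [(1, 2), (1, 4)]:
    frac = l_clone_fraction(l_parts, h_parts)
    print(f"L:H = {l_parts}:{h_parts} -> {frac} of clones ({float(frac):.0%})")
```

Under that assumption, between one fifth and one third of repair events in this system left the repaired gene hypermethylated.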
The main function of the proteins in the base excision repair (BER) pathway is to repair DNA single-strand breaks and deamination, oxidation, and alkylation-induced DNA base damage. Li et al.  reviewed recent studies indicating that one or more BER proteins may also participate in epigenetic alterations involving DNA methylation or reactions coupled to histone modification. Franchini et al.  showed that DNA demethylation can be mediated by BER and other DNA repair pathways requiring processive DNA polymerases. Still another form of epigenetic silencing may occur during DNA repair. The enzyme PARP1 [poly(ADP)-ribose polymerase 1] and its product poly(ADP)-ribose (PAR) accumulate at sites of DNA damage as intermediates of a repair process . This, in turn, directs recruitment and activation of the chromatin remodeling protein ALC1 that may cause nucleosome remodeling . Nucleosome remodeling has been found to cause, for instance, epigenetic silencing of DNA repair gene MLH1 . Thus, DNA damages needing repair can cause epigenetic alterations by a number of different mechanisms.
11. Other causes of epigenetic alterations
Heavy metals and other environmental chemicals cause many epigenetic alterations, including DNA methylation, histone modifications and miRNA alterations. DNA damage itself causes programmed changes in non-coding RNAs, and a large number of miRNAs are transcriptionally induced upon DNA damage. However, it is not clear what proportion of these alterations are reversed, and what proportion are retained as epimutations, once the external sources of damage are removed and the DNA damages are repaired.
Mutations in isocitrate dehydrogenase 1 (IDH1) and 2 (IDH2) are frequent in a number of cancers and they can cause epigenetic alterations. As reviewed by Wang et al. , IDH1 and IDH2 mutations represent the most frequently mutated metabolic genes in human cancer, mutated in more than 75% of low grade gliomas and secondary glioblastoma multiforme (GBM), 20% of acute myeloid leukemias (AML), 56% of chondrosarcomas, over 80% of Ollier disease and Maffucci syndrome, and 10% of melanomas. The IDH1 and IDH2 mutations that give rise to epimutations usually occur in the hotspot codons Arg132 of IDH1, or the analogous codon Arg172 of IDH2. These mutations allow accumulation of the metabolic intermediate 2-hydroxyglutarate (2-HG), and 2-HG inhibits the activity of alpha ketoglutarate (α-KG) dependent dioxygenases, including α-KG-dependent histone demethylases and the TET family of 5-methylcytosine hydroxylases. Wang et al.  found that histone H3K79 dimethylation levels were significantly elevated in cholangiocarcinoma samples that harbored IDH1 or IDH2 mutations (80.8%) compared to tumors with wild-type IDH1 and IDH2 (45.0%). In addition, they surveyed over 462,000 CpG sites in CpG islands, CpG shores and intragenic regions, and found that 2,309 genes had significantly increased methylation in the presence of IDH1 or IDH2 mutations. In particular, Sanson et al.  found that methylation of the DNA repair gene MGMT was associated with IDH1 mutation, since 81.3% of IDH1-mutated tumors were MGMT methylated compared with 58.3% methylated in IDH1 non-mutated tumors.
12. Long-term epigenetic repression of DNA repair genes in progression to cancer
A DNA repair gene that is epigenetically silenced or whose expression is reduced would not likely confer any selective advantage upon a stem cell. However, reduced or absent expression of a DNA repair gene would cause increased rates of mutation, and one or more of the mutated genes could cause the cell to have a selective advantage. The defective DNA repair gene could then be carried along as a selectively neutral or only slightly deleterious passenger (hitch-hiker) gene when there is selective expansion of the mutated stem cell. The continued presence of a DNA repair gene that is epigenetically silenced or has reduced expression would continue to generate further mutations and epigenetic alterations.
The spread of a clone of cells with a selective advantage, but carrying along a DNA repair gene with epigenetically reduced expression, would be expected to generate a field defect, from which smaller clones with still further selective advantage would arise. This is consistent with the finding of field defects in colonic resections that contain both a cancer and multiple small polyps, such as the one shown in Figure 1.
The protein expression of three DNA repair genes within a 20 cm colon resection was evaluated at six different locations within the resection (Figure 2). A colon resection, on its inner epithelial surface, has a layer of epithelial crypts (microscopic, test-tube-like indentations about 100 cells deep), with 100 crypts per square millimeter. Each crypt is a clone of about 5,000 cells, all generated by the 10 stem cells at the base of the crypt. One of the DNA repair proteins, KU86, was deficient only infrequently, with the deficiencies occurring in small patches (up to three crypts). These KU86 defects are not likely to be important in progression to colon cancer. However, the two other DNA repair proteins evaluated, ERCC1 and PMS2, were often deficient in patches of tens to hundreds of adjacent crypts at each of the locations evaluated (see Nguyen et al. at minutes 18 to 24 of a 28 minute video of crypts immunostained for ERCC1 or PMS2).
Overall, ERCC1 was deficient in 100% of 49 colon cancers evaluated, and in 40% of the crypts within 10 cm on either side of the cancer. PMS2 was deficient in 88% of the 49 cancers and in 50% of the crypts within 10 cm of the cancer. As reported by Facista et al., the pattern of expression of ERCC1 in the crypts within 10 cm of a colon cancer indicated that when the ERCC1 protein was deficient, this deficiency was due to an epigenetic reduction in expression of the ERCC1 gene. When the PMS2 protein is deficient, it is usually due to the epigenetic repression of its pairing partner, MLH1, and the instability of PMS2 in the absence of MLH1. In the study of Facista et al., ERCC1 and PMS2 were also deficient in all 10 tubulovillous adenomas evaluated (precursors to colonic adenocarcinomas). Thus ERCC1 and PMS2 are deficient at early times (in the field defect), at intermediate times (tubulovillous polyps), and at late times (within cancer) during progression to colon cancer. Another DNA repair protein, XPF, was also deficient in 55% of the cancers. The majority of cancers were simultaneously deficient for ERCC1, PMS2 and XPF.
Deficiencies in multiple DNA repair genes were also observed in gastric cancers. Kitajima et al.  evaluated MGMT, MLH1 and MSH2 and found that synchronous losses of MGMT and MLH1 increase during progression and stage of differentiated-type cancers. In un-differentiated-type gastric cancers, the frequency of MGMT deficiency increased from early to late stages of the cancer, while frequencies of MLH1 and MSH2 deficiencies were between 48% and 74% at both early and late stages. Thus, in un-differentiated-type gastric cancers, MLH1 or MSH2 deficiency, if it is present, is an early step, while MGMT deficiency is often a later step in progression of this cancer.
Farkas et al.  evaluated 160 genes in 12 paired colorectal tumors and adjacent histologically normal mucosal tissues for differential promoter methylation. They found aberrant methylation in 23 genes, including six DNA repair genes. These DNA repair genes (with DNA repair pathways indicated) were NEIL1 (base excision repair), NEIL3 (base excision repair), DCLRE1C (non-homologous end joining), NHEJ1 (non-homologous end joining), GTF2H5 (nucleotide excision repair), and CCNH (nucleotide excision repair).
Jiang et al.  evaluated the mRNA expression of 27 DNA repair genes in 40 astrocytomas compared to normal brain tissues from non-astrocytoma individuals. They found that 13 DNA repair genes, MGMT, NTHL1, OGG1, SMUG1, ERCC1, ERCC2, ERCC3, ERCC4, MLH1, MLH3, RAD50, XRCC4 and XRCC5 were all significantly down-regulated in all three grades (II, III and IV) of astrocytomas. The deficiencies of these 13 genes in lower grade as well as in higher grade astrocytomas indicated that they may be important in early as well as in later stages of astrocytoma. For 8 DNA repair genes, ERCC3, ERCC4, MLH3, MRE11A, NTHL1, RAD50, XRCC4 and XRCC5, decreased expression was significantly associated with a poor prognosis.
Based on the examples above, decreased expression of multiple DNA repair genes is likely to occur in many types of neoplasia. This should result in increased mutation frequency in those neoplastic lesions. Most new mutations are expected to be deleterious to the cells in which they arise, and thus would cause negative selection of those cells. This expectation is consistent with the observations of Hofstad et al.  who showed that when a colonic polyp was identified during a colonoscopy and followed but not removed, between 11% and 46% of those polyps smaller than 5 mm diameter were not detectable in the succeeding one to three years. For polyps between 5 and 9 mm in diameter, between 4 and 24% became undetectable in the succeeding one to three years. Of the remaining 68 polyps that were followed for three years, 35% decreased in diameter, 25% remained the same size and 40% increased in diameter. The data of Hofstad et al.  are also consistent with statistics showing more frequent occurrence of adenomas during colonoscopy and autopsy compared to the frequency of colon cancer, indicating there must be a significant regression rate for adenomas .
When infrequent positively selected mutations arise in a cell, they can provide the cell with a competitive advantage that promotes its preferential clonal proliferation, leading to cancer. The continued presence of epigenetically repressed DNA repair genes, carried along as passengers in the development of cancers, also predicts that cancers will contain heterogeneous genotypes (multiple subclones). For instance, in one primary renal carcinoma with multiple metastases, 101 non-synonymous point mutations and 32 indels (insertions and deletions) were identified. Five mutations were not validated and were excluded from the study. Of the remaining 128 mutations, 40 were “ubiquitous” and present in each region of the tumor sampled. There were 59 “shared” mutations, present in several but not all regions, and 29 “private” mutations, unique to a specific region evaluated. The authors constructed a phylogenetic tree and concluded that the evolution in the tumor and its metastases was branching, and not linear. Similar results were found in a further three tumors evaluated in their study. Every tumor had spatially separated heterogeneous somatic mutations and chromosomal imbalances leading to phenotypic intratumor diversity.
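The three categories used in the renal carcinoma study amount to a simple rule on which sampled regions carry a given mutation. The sketch below is only an illustration of that rule: the region names and mutation labels are hypothetical, and it does not reproduce the study's actual mutation-calling procedure. It also checks the bookkeeping of the counts quoted above (101 + 32 − 5 = 128 = 40 + 59 + 29).

```python
from collections import Counter

def classify_mutation(regions_with_mutation: set, all_regions: set) -> str:
    """Label a mutation by how many sampled tumor regions carry it."""
    present = regions_with_mutation & all_regions
    if present == all_regions:
        return "ubiquitous"   # found in every sampled region
    if len(present) == 1:
        return "private"      # unique to a single region
    if len(present) > 1:
        return "shared"       # several, but not all, regions
    return "absent"

# Hypothetical toy data: region names and mutation labels are invented.
all_regions = {"R1", "R2", "R3", "M1"}
calls = {
    "mut_a": {"R1", "R2", "R3", "M1"},
    "mut_b": {"R1", "R2"},
    "mut_c": {"M1"},
}
labels = {m: classify_mutation(r, all_regions) for m, r in calls.items()}
print(Counter(labels.values()))

# Bookkeeping check of the counts quoted in the text.
validated = 101 + 32 - 5
assert validated == 40 + 59 + 29 == 128
```

In a branching pattern of evolution, ubiquitous mutations correspond to the trunk of the phylogenetic tree, while shared and private mutations mark its branches.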
13. Epigenetic repression of DNA repair genes in field defects in progression to cancer
As described in detail by Rubin, field defects are of central importance in progression to cancer. While the great majority of studies in cancer research have been done on well-defined tumors formed in vivo, evidence indicates that more than 80% of the somatic mutations found in microsatellite instability (MSI) (mutator phenotype) human colorectal tumors occur before the onset of terminal clonal expansion. This evidence included the finding that adenomas were phylogenetically nearly as old as cancers. The origin of field defects was described by Braakhuis et al. as follows. They postulated that a stem cell acquires one (or more) genetic alterations and forms a patch with genetically altered daughter cells. As a result of these and subsequent genetic alterations the stem cell escapes normal growth control, gains a growth advantage, and develops into an expanding clone. The lesion, gradually becoming a field, laterally displaces the normal epithelium. The enhanced proliferative capacity of a genetically altered clonal unit is the driving force of the process. As the lesion becomes still larger, additional genetic hits give rise to various subclones within the field. Different clones diverge at a certain time point with respect to genetic alterations but share a common clonal origin. The presence of a relatively large number of genetically altered stem cells in a field is a “ticking time bomb,” and as a result of the process of clonal divergence and selection, eventually a subclone evolves into invasive cancer.
|Cancer||Gene||Frequency in Cancer||Frequency in Field Defect||Ref.|
|Colorectal||MGMT with MSI*||70%||60%|||
|Head and Neck||MGMT||54%||38%|||
|Head and Neck||MLH1||33%||25%|||
|Head and Neck||MLH1||31%||20%|||
Epigenetic reductions in protein expression of DNA repair genes are frequent in cancers (Table 3). For any particular type of cancer, an epigenetic reduction in expression of a specific DNA repair gene, such as an epigenetic reduction of MGMT in colorectal cancer, may be common. In cases where a specific epigenetic reduction of expression of a DNA repair gene occurs in a cancer, it is also likely to be evident in the field defect surrounding the cancer (Table 3). The lower frequency in the surrounding field defect that is often found (Table 3) likely reflects the process whereby the expanding clone is laterally displacing the normal epithelium. This displacement may be only partial. Thus, areas with the DNA repair deficiency would be present at a lower frequency in the field defect than in the cancer. In the cancer, the cells carrying the DNA repair deficiency are members of a founding clone. Thus, the DNA repair defect, along with other accumulated mutations and epigenetic alterations, would be seen in the cancer at a relatively higher frequency than in the surrounding field defect.
14. Examples of epigenetic repression of DNA repair genes, due to alterations in CpG island methylation, in various cancers
Table 4, below, gives examples of reports of DNA repair genes repressed by CpG island hypermethylation (or with increased expression due to CpG hypomethylation) in 17 different cancers (this is only a partial list). Twenty different DNA repair genes (all listed among the 169 DNA repair and DNA damage response genes previously identified) were often hyper- (or sometimes hypo-) methylated in one or more types of cancer. Such alterations in methylation of promoter regions of DNA repair genes can cause deficient repair of DNA damages. Thus, hyper- (or hypo-) methylations of DNA repair genes are frequently important factors responsible for lack of appropriate repair of DNA damages. Faulty DNA repair leads to increased mutation and epigenetic alteration, central to progression to cancer.
MGMT is one of the DNA repair genes often evaluated for hypermethylation. Of the cancers listed in Table 4, nine were reported to have some frequency of hypermethylation of MGMT. Hypermethylation of MGMT was particularly frequent in bladder cancer (93%), stomach (88%), thyroid (74%), colorectal (40-90%) and brain (50%).
Other DNA repair genes with high frequencies of hypermethylation (in particular cancers) were LIG4 (colorectal 82%), P53 (brain 60-74%), NEIL1 [head and neck 62% and non-small cell lung cancer (NSCLC) 42%], ATM (NSCLC 47%), MLH1 (NSCLC squamous cell carcinoma 48%) and FANCB (head and neck 46%). The DNA repair genes LIG4, P53, NEIL1 and FANCB were frequently not evaluated for hypermethylation in other particular types of cancers, and could be of importance in such cancers as well.
15. DNA repair gene ERCC1 expression likely can be repressed by multiple processes
A number of the DNA repair genes with reduced expression due to CpG island hypermethylation are also epigenetically repressed by other means. Many protein coding genes are repressed by microRNAs. MicroRNAs (miRNAs) are small noncoding endogenously produced RNAs that play key roles in controlling the expression of many cellular proteins. Once they are recruited and incorporated into a ribonucleoprotein complex, they can target specific messenger RNAs (mRNAs) in a miRNA sequence-dependent process and interfere with the translation into proteins of the targeted mRNAs via several mechanisms (see detailed review by Lages et al. ).
Almost one third of miRNAs active in normal mammary cells were found to contain hypermethylated DNA regions in breast cancer cells . This includes, for instance, microRNAs let-7a-3/let-7b.
As indicated by Motoyama et al.,  the let-7a miRNA normally represses the HMGA2 gene, and in normal adult tissues, almost no HMGA2 protein is present. In breast cancers, for instance, the promoter region controlling let-7a-3/let-7b microRNA is frequently repressed by hypermethylation . Reduction or absence of let-7a microRNA allows high expression of the HMGA2 protein. HMGA proteins are characterized by three DNA-binding domains, called AT-hooks, and an acidic carboxy-terminal tail. HMGA proteins are chromatin architectural transcription factors that both positively and negatively regulate the transcription of a variety of genes. They do not display direct transcriptional activation capacity, but regulate gene expression by changing local DNA conformation. Regulation is achieved by binding to AT-rich regions in the DNA and/or direct interaction with several transcription factors .
HMGA2 targets and modifies the chromatin architecture at the ERCC1 gene, reducing its expression. The lack of let-7a miRNA repression of HMGA2 could occur through translocation of HMGA2, disrupting the 3’UTR of HMGA2, which is the target of let-7a miRNA (shown in an artificial construct), and this can lead to an oncogenic transformation. However, the promoter controlling let-7a miRNA also can be strongly regulated by hypermethylation in intact cells. When human lung cells are exposed to cigarette smoke condensate, the promoter region controlling let-7a becomes highly hypermethylated. While only 38% of colorectal cancers have CpG island methylation of the ERCC1 promoter (Table 4), Facista et al. found that 100% of colon cancers have significantly reduced levels of ERCC1 protein expression. In the 49 cancers examined, ERCC1 expression generally varied from 0% to 45% of the level found in neoplasm-free individuals. It is likely that a hypermethylated promoter for let-7a microRNA with consequent hyperexpression of HMGA2, or other epigenetic mechanism(s), reduces protein expression of ERCC1 in colorectal cancers in addition to the 38% of colorectal cancers in which the ERCC1 gene is directly hypermethylated.
16. DNA repair gene BRCA1 expression likely can be repressed by multiple processes
BRCA1 expression is reduced or undetectable in the majority of high-grade, ductal carcinomas. Among 32 breast cancers examined, none had a sporadic mutation in the BRCA1 gene. The frequency of BRCA1 promoter hypermethylation in breast cancer is only 13-16% [129,130] (see Table 4). However, miR-182 targets BRCA1, and the promoter controlling expression of miR-182 is hypomethylated (and so would have increased expression) in cancers, as indicated by Schnekenburger and Diederich. Tang et al. showed that transcription of miR-182 is repressed when the promoter controlling its transcription is methylated, so that miR-182 is clearly an epigenetically regulated miRNA. Moskwa et al. showed that basal-like ER-negative breast cancer cell lines had relatively low levels of BRCA1 protein, and in five of the six ER-negative cell lines there was an inverse correlation of BRCA1 protein and miR-182 expression. Thus epigenetically increased expression of miR-182 appears to be implicated in reducing BRCA1 protein expression in breast cancer.
There is a further potential epigenetic mechanism for repressing BRCA1 in breast cancers. miR-34b is repressed by methylation of its promoter. A target of miR-34b is HMGA1. When miR-34b is repressed, expression of HMGA1 is increased. HMGA1 protein appears to target BRCA1. Baldassarre et al. found an inverse correlation between HMGA1 and BRCA1 mRNA and protein expression in human mammary carcinoma cell lines and tissues. Thus an epigenetically methylated promoter for transcription of miR-34b, with consequent increased HMGA1, may be instrumental in reducing BRCA1 protein expression in breast cancer. It is not clear whether increased miR-182 (see paragraph above) or decreased miR-34b is the more important factor in repressing BRCA1 in breast cancers.
17. DNA repair gene MGMT expression is repressed by multiple processes
In the most common form of brain cancer, glioblastoma, the DNA repair gene MGMT is epigenetically methylated in 29%  to 66%  of tumors, thereby reducing protein expression of MGMT. However, for 28% of glioblastomas, the MGMT protein is deficient but the MGMT promoter is not methylated . Zhang et al.  found, in the glioblastomas without methylated MGMT promoters, that the level of microRNA miR-181d is inversely correlated with protein expression of MGMT and that the direct target of miR-181d is the MGMT mRNA 3’ UTR (the three prime untranslated region of MGMT mRNA). miR-181d normally occurs at very low levels in the brain . It is not clear whether miR-181d is epigenetically up-regulated, when it occurs at increased levels in the brain. Thus it is not clear if this second process of reducing MGMT expression in progression to glioblastoma is an epigenetic one.
|Cancer||Gene||Frequency of hyper- (or hypo-) methylation in cancer||Ref.|
|Head and Neck|
18. DNA repair proteins and miRNAs
A number of investigators have tried to relate alteration in DNA repair gene expression to altered level of miRNA expression. For instance, Wouters et al. , using “in silico” computer programs (Targetscan and Mirbase), listed 74 DNA repair or DNA Damage Response (DDR) genes and, for each of these genes, listed between 1 and 19 “conserved” miRNAs that were predicted to repress the particular genes. They defined “conserved” miRNAs as miRNAs found in at least five mammalian species. For the purposes of this review, in which we are concerned with epigenetic alterations that control DNA repair, about half of the miRNAs they found “in silico” would not be of interest because they were inducible by UV irradiation, and thus may have been largely controlled by a transient transcriptional regulatory change rather than epigenetically.
More recently, focusing on the DNA repair gene MGMT, Kushwaha et al. used five different “in silico” computer programs to predict which of 885 miRNAs would repress the DNA repair protein MGMT. Kushwaha et al. also transfected each of the 885 miRNAs into a glioblastoma cell line that had high baseline expression of MGMT. They found that 103 of the tested miRNAs reduced MGMT expression in vitro by more than 50% without causing high cytotoxicity. However, the correspondence of predicted “in silico” interactions of the miRNAs with experimentally found interactions was rather low, 20% at best, indicating that “in silico” predictions often are not biologically relevant. Of the 103 miRNAs that reduced expression of MGMT, 15 had an inverse correlation with MGMT expression in vivo in promoter-unmethylated glioblastoma tissue specimens. These 15 miRNAs included miR-181d, which Zhang et al. had previously shown to be inversely correlated with MGMT in glioblastomas. Kushwaha et al. then focused further on one of the 15 miRNAs, miR-603, which strongly suppressed MGMT. In 23 glioblastoma cell lines, miR-603 was expressed at levels that varied by about 20-fold. It is not known whether the different expression levels were due to epigenetic control. They then determined that miR-603 suppressed MGMT by direct interaction with the 3’UTR region of MGMT mRNA, using mRNA-biotinylated miRNA complex pull-down reactions with streptavidin-coated magnetic beads. They were able to show further that miR-603 could cooperate with miR-181d to completely silence MGMT expression by jointly binding to nearby locations on the MGMT mRNA 3’UTR. These miRNA controls of expression of a DNA repair enzyme are illustrative of how miRNAs may interact with mRNAs to control their expression. In the experiments discussed in this paragraph, the miRNAs appear to be important in reducing expression of a DNA repair enzyme in progression to glioblastoma, but the extent to which epigenetic mechanisms are employed to control the level of the miRNAs is unclear.
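The successive narrowing in this screen (885 miRNAs tested, 103 that reduced MGMT in vitro, 15 that were also inversely correlated with MGMT in vivo) is easier to weigh as percentages. The short sketch below computes them from the counts quoted in the text; the variable and function names are illustrative only.

```python
# Counts reported in the text for the miRNA screen against MGMT.
tested = 885          # miRNAs transfected into the glioblastoma cell line
reduced_mgmt = 103    # reduced MGMT expression by more than 50% in vitro
inverse_in_vivo = 15  # also inversely correlated with MGMT in tumor specimens

def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

hit_rate = percent(reduced_mgmt, tested)                 # in vitro hits among all tested
confirmed_rate = percent(inverse_in_vivo, reduced_mgmt)  # in vivo support among hits
overall_rate = percent(inverse_in_vivo, tested)          # supported at both stages

print(f"In vitro hits: {hit_rate}% of tested miRNAs")
print(f"In vivo support: {confirmed_rate}% of in vitro hits")
print(f"Both criteria: {overall_rate}% of tested miRNAs")
```

On these counts, roughly one tested miRNA in nine reduced MGMT in vitro, and fewer than one in fifty met both criteria.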
Both Tessitore et al.  and Vincent et al.  listed about 20 miRNAs that are altered in cancers and that also control expression of DNA repair genes. The lists are not entirely overlapping. However, they do not indicate how these miRNAs are deregulated.
Deregulation of miRNA expression in cancers has been found to occur by a number of non-epigenetic mechanisms [120, 172]. One mechanism includes alterations in genomic miRNA copy numbers and location. Some of these are deletions that include the miRNA clusters 15a/16-1 or let-7g/mir-135-1, while others are amplifications or translocations of the mir-17-92 cluster. In some cancers, miRNAs were deregulated because of defects in the biogenesis mechanism (the process of creating miRNAs, which has a number of steps). Some cancers have deregulated miRNAs due to single nucleotide polymorphisms (SNPs) in the genes coding for the miRNAs, or SNPs in the region of the target gene to which the miRNA binds. Some miRNAs that target DNA repair genes are regulated by oncogenes. For instance, ATM is down-regulated by miR-421, but miR-421 is regulated by N-Myc. Thus, not all deregulation of DNA repair genes or DDR genes by miRNAs is due to epigenetic alteration affecting expression of the miRNAs.
19. Examples of epigenetic repression of DNA repair genes, due to alterations in methylation of promoters of miRNAs in various cancers
Table 5 lists nine miRNAs that have three characteristics. (1) Their expression is epigenetically controlled by the methylation level of the promoter region coding for the miRNA, (2) they control expression of DNA repair genes and (3) their level of expression was frequently epigenetically altered in one or more types of cancer. This list is not exhaustive. Many of the 30 miRNAs listed by Tessitore et al.  or Vincent et al.  might also meet these criteria upon further examination. Four of the miRNAs on this list are not noted by Tessitore et al.  or Vincent et al. . Studies of most of these epigenetically controlled miRNAs have not noted the frequencies with which they occur in cancers. This is a very recent area of research, and seems to be less systematic, at this point, than studies of hypermethylation of promoter regions of DNA repair genes.
|DNA repair gene targets||Cancers affected (frequency if measured)||Refs indicating epigenetic control of miRNA||Refs indicating target gene(s) of miRNAs||Refs indicating cancer type affected|
|RAD51, RAD51D||Osteosarcoma, lung, endometrial, stomach|||||||
(uracil DNA glycosylase)
Field defect gastric (27%)
Field defect colon (60%)
Chronic lymphocytic leukemia (18%)
Small-cell lung cancer (67%)
|||||[49, 176, 178, 179]|
|[121, 182]||[18, 183]||[18, 121]|
|Let-7a repression increases HMGA2; HMGA2 alters chromatin architecture of and represses ERCC1||ERCC1||(Colon)|
|Let-7b repression increases HMGA1; HMGA1 targets P53||P53||Prostate|
|||[186, 187]||[186, 187]|
|miR-34b repression increases HMGA1; HMGA1 targets BRCA1||BRCA1||Breast||||[135, 136]|||
20. Whole genome sequencing indicates a high level of mutagenesis in cancers
Almost 3,000 pairs of tumor/normal tissues were analyzed for mutations by whole exome sequencing (WES) (sequencing the protein coding parts of whole genomes) and more than a hundred pairs of tumor/normal tissues were analyzed for mutations by whole genome sequencing (WGS) by Lawrence et al. . Median mutation frequencies for 27 different types of cancer were found to vary by 1,000-fold. When there was a particular median mutation frequency for a type of cancer, the scatter of values (in individual cancers) for that type of cancer, above and below that median value, also varied by as much as 1,000-fold. Some mutation rates, given as numerical values of median numbers of mutations per megabase in a review of the literature by Tuna and Amos , are shown in Table 6, and the values were also converted to mutations per whole diploid genome in the table.
|Parent/child per generation or cancer type||Mutation rate per million bases||Mutation rate per diploid genome|
|Parent/child per generation||0.00000023||70|
|Microsatellite stable (MSS) colon cancer||2.8||16,800|
|Microsatellite instable (MSI) colon cancer (mismatch DNA repair deficient)||47||282,000|
|Small cell lung cancer||7.4||44,400|
|Non-small cell lung cancer (smokers)||10.5||63,000|
|Non-small cell lung cancer (non-smokers)||0.6||3,600|
|Lung adenocarcinoma (smokers)||9.8||58,800|
|Non-UV-induced melanoma of hairless skin of extremities||3-14||18,000-84,000|
|UV-induced melanoma of hair-bearing skin||5-55||30,000-330,000|
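The per-genome column of Table 6 follows from the per-megabase column by a single multiplication. The sketch below reproduces that arithmetic for several rows, assuming a diploid genome of about 6,000 megabases; this factor is inferred from the cancer rows (for example, 2.8 × 6,000 = 16,800) rather than stated in the text, and the constant and function names are illustrative.

```python
# Assumed diploid genome size in megabases (about 6 billion bases).
# This factor is inferred from the cancer rows of Table 6, not stated in the text.
DIPLOID_GENOME_MB = 6_000

# New mutations per parent-to-child generation, per diploid genome (Table 6).
GERMLINE_PER_GENERATION = 70

def mutations_per_diploid_genome(rate_per_mb: float) -> int:
    """Convert a mutation rate per megabase into mutations per diploid genome."""
    return round(rate_per_mb * DIPLOID_GENOME_MB)

# Median rates per megabase as listed in Table 6.
rates_per_mb = {
    "MSS colon cancer": 2.8,
    "MSI colon cancer": 47,
    "Small cell lung cancer": 7.4,
    "NSCLC (smokers)": 10.5,
    "NSCLC (non-smokers)": 0.6,
}

per_genome = {name: mutations_per_diploid_genome(r) for name, r in rates_per_mb.items()}
for name, n in per_genome.items():
    fold = n / GERMLINE_PER_GENERATION
    print(f"{name}: {n:,} per diploid genome (about {fold:,.0f}x the per-generation figure)")
```

Expressing each tumor value as a multiple of the per-generation figure makes the gap between tumor and germline mutation loads explicit.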
The mutation frequency in the whole genome (not just the protein coding regions) between generations for humans (parent to child) is about 30 - 70 new mutations per generation [190-192]. For protein coding regions of the genome in individuals without cancer, Keightley estimated there would be 0.35 mutations per parent to child generation. Whole genome sequencing was also performed in blood cells for a pair of 100-year-old monozygotic (identical) twins. Only 8 somatic differences were found between the twins, though somatic variation occurring in less than 20% of blood cells would be undetected.
As seen in Table 6, tumors have a substantially higher frequency of mutations than the number of new mutations per generation in individuals without cancer. Also notable in Table 6, tumors with more exposure to DNA damage (lung cancers of smokers, and melanomas in individuals with high UV exposure) had higher mutation frequencies than the comparable tumors for patients with less exposure to DNA damage (lung cancers of non-smokers and melanomas of individuals without high UV exposure).
The information from whole exome sequencing and whole genome sequencing showed that different spectrums of mutations occurred in different tissues [188, 195]. Lung cancers shared a spectrum dominated by C->A mutations, presumably consistent with exposure to polycyclic aromatic hydrocarbons in tobacco smoke. Melanomas had a spectrum with frequent C->T mutations caused by misrepair of UV-induced covalent bonds between adjacent pyrimidines. Jia et al.  found 3-5 independent mutational signatures in 9 major types of cancers, indicating a range of 3-5 predominant mutational processes in different cancers. Lawrence et al.  also found about a 2.9-fold difference in mutation frequency across the genome depending on expression level of the genes. Genes with higher expression had a lower mutation frequency, possibly due to the availability of extra transcription-coupled repair. Also the mutation frequency of genes replicated early in a cell replication cycle was 2.9 fold lower than that of genes replicated late in the cycle.
While the type of mutation spectrum depended on the most frequent DNA damages in a given tissue, and there were about 5-fold differences in mutation frequency depending on whether genes were frequently transcribed or in a DNA region replicated at early or late times in a replication cycle, the largest differences in mutation frequency were due to being in a tumor tissue versus a normal tissue (Table 6). These large differences in mutation frequency may frequently be due to whether one or more DNA repair genes are epigenetically reduced in expression in the stem cells giving rise to the development of the cancer.
21. Epigenetically reduced expression of DNA repair genes in DNA repair pathways in cancers
Figure 3 indicates typical DNA damaging agents, some of the lesions they cause and the pathways used to repair these lesions. Many of the genes active in these pathways are indicated by their acronyms. The acronyms listed in red represent genes shown, in Tables 3, 4 and 5, whose expression is frequently reduced due to epigenetic alterations in many types of cancers. The major DNA repair pathways are base excision repair, nucleotide excision repair, homologous recombinational repair, non-homologous end joining, mismatch repair and direct reversal. Each of these repair pathways employs one or more DNA repair enzymes that are frequently epigenetically reduced in expression in one or more types of cancer. This could be a substantial source of the genomic instability that is characteristic of cancers.
Deficiencies in DNA repair due to inherited germ-line mutations in DNA repair genes cause increased risk of cancer. Such DNA repair gene mutations allow excess unrepaired DNA damages to accumulate in somatic cells. Then either inaccurate translesion synthesis past the unrepaired DNA damages or the error-prone DNA damage response of non-homologous end joining can cause mutations. Erroneous or incomplete DNA repair may also cause epimutations. In sporadic cancers, mutations in DNA repair genes are relatively rare. However, at least 25 DNA repair genes are often epigenetically altered and have reduced expression in sporadic cancers and in the field defects that give rise to the cancers. Such epimutations in DNA repair genes also likely lead to a further increase in mutations and epimutations, and these mutations and epimutations can include both the driver mutations and the other epigenetic alterations central to progression to cancer. Whole genome sequencing of many different types of cancers shows that from thousands to hundreds of thousands of mutations occur in various types of cancers. The epimutations in DNA repair genes that occur early in progression to cancer are a likely source of the high level of genomic instability characteristic of cancers. Epigenetic reduction of DNA repair appears to be a frequent early step, central to progression to cancer.