Summary of selected therapeutics targeting dysregulated gene expression regulators in cancer.
Gene expression is tightly regulated by a myriad of cellular mechanisms that allow canonical processes to occur. In cancer, however, some of these mechanisms are dysregulated and aberrant gene expression ensues. The dysregulated mechanisms include changes to transcription factor activity, epigenetic marks (such as DNA methylation, histone modifications and chromatin state), and the stability of mRNA and protein. Disruption of these regulators changes the transcriptional landscape, affecting multiple pathways and eventually leading to continual cell proliferation and tumor formation. Here, we discuss epigenetic factors that affect gene expression and are dysregulated in cancer, and summarize the therapeutic options available to target these factors.
- gene regulation
- chromatin remodelers
- histone modifications
- DNA methylation
- transcriptional regulation
Genetic information in cells is stored as DNA, and is the same in all cells of a single organism. The same code can give rise to different proteins in a tissue-specific manner because the expression of specific genes encoded by the DNA is regulated. Gene expression involves the transcription of DNA to RNA and, in some cases, translation into protein. The gene products, which consist of translated RNAs and non-coding RNAs (RNAs that are themselves the final product and are not translated into protein), have different but very important functions in the cell. Collectively, the combination of genes that are expressed and those that are silenced is crucial in maintaining normal cellular processes, such as determining when the cell should proliferate or divide.
To ensure that gene expression is kept in check, its regulation is multi-tiered and can be altered or halted at every step of the gene expression process. This ensures that even if one of these regulatory events goes awry, other mechanisms are in place in the cell to curb aberrant gene expression. When numerous alterations accumulate across this multi-tiered process, however, aberrant gene expression and abnormal cell function often result, as occurs in cancer.
Cancer is the result of a cell escaping its normal cell cycle and evading apoptosis, which leads to uncontrolled and abnormal proliferation. The transformation of a normal cell into a malignant one results from an increase in the expression of oncogenes, with a concomitant decrease in tumor suppressor expression. Oncogenes are involved in functions that lead to uncontrolled proliferation and growth and to evasion of the canonical apoptotic mechanisms, while tumor suppressors curb these functions. Although cancer cells of different tissue types share the same outcome of uncontrolled growth, the mechanisms involved are varied. Even within the same tissue, malignancies are very heterogeneous, which contributes to the challenge of treating this disease.
In this chapter, we will summarize the multiple layers of gene regulation, focusing on the dysregulated epigenetic changes in cancer involved in gene expression regulation. We then summarize the therapeutic options available which seek to curb these gene regulation changes.
2. Transcriptional regulation
Gene expression is regulated by many factors that govern different stages of this complex process. Regulation begins with chromatin conformation, which defines the chromatin state as either euchromatin (open chromatin), which allows active transcription, or repressive heterochromatin (closed chromatin), as illustrated in Figure 1A. Open chromatin is brought about by several factors, including acetylated histone tails and the inclusion of specific histone variants that destabilize the nucleosome. Chromatin openness is often tested by DNase hypersensitivity assays, which measure the sensitivity of DNA to enzymatic digestion. Portions of the DNA loaded with nucleosomes are protected from DNase digestion, while nucleosome-depleted regions (NDRs) are sensitive.
The idea of topological domains was suggested recently by Dixon et al., who described sections of the genome whose enclosed genes are generally co-regulated. The boundaries of these topological domains interact in 3D space and are marked by the presence of CCCTC-binding factor (CTCF) and cohesin. Mutations have been observed in these topologically associated domains (TADs), and will be described later in this chapter.
At the DNA level, the region of the genome around the transcription start site (TSS) is particularly important in the regulation of gene expression, as that is where the transcription machinery and co-regulators bind. Proteins known as transcription factors are able to recognize motifs (transcription factor binding sites) on the promoters of genes, and recruit RNA Pol II and/or phosphorylate Pol II to initiate transcription. Alternatively, transcription regulators can inhibit the binding or recruitment of the transcription complex. In addition to the TSS, recent studies have identified distal regulatory elements, such as enhancers, that are able to regulate expression as well. Both enhancers and promoters can be marked by different histone modifications, which can be read and affect the expression of the corresponding gene.
Regulatory processes that affect the stability of mRNA and proteins are as important as the factors that govern transcription, but will not be addressed in this chapter. These include microRNAs (miRNAs), which can trigger the degradation of mRNA and thus prevent it from being translated into protein, and endogenous systems that degrade proteins.
Studies have shown that every step of the gene expression regulation can be exploited by cancer cells to prolong survival and contribute to tumorigenesis. Considering the vast nature of this topic, we will be focusing on the multiple epigenetic factors regulating gene expression which are dysregulated in cancer. However, dysregulated transcription factors form a huge topic of interest in cancer research. This includes work on tumor suppressor p53 (reviewed in [2, 3, 4]) and the oncogene MYC (reviewed in [5, 6]), amongst others.
3. Epigenetic regulators of gene expression
Epigenetics is an additional layer of complexity on top of the genetic code, and comprises additional information (in the form of added chemical groups or secondary structures) beyond the four basic nucleotides: adenine, thymine, cytosine and guanine. This allows a gene with the same genetic sequence to be differentially regulated according to cell type or context, through mechanisms such as the binding of transcription factors and transcriptional machinery. These epigenetic changes and modifications allow for greater control of gene expression by transcription factors. Here, we will discuss the myriad of epigenetic features that can contribute to gene expression.
3.1. Chromatin modifications
It is only through the compaction of DNA into nucleosomes that the long length of DNA can be packaged into the nucleus of each cell. A length of 146 bp of DNA is coiled around a histone octamer, consisting of two copies each of H2A, H2B, H3 and H4, and secured in place by H1. Each nucleosome is linked to its neighbor via linker DNA, which varies between 20 and 80 nucleotides in length. The placement and composition of nucleosomes are not at all random. Rather, their composition and position are strategically coordinated to regulate gene expression on several different layers.
Although the terms are used interchangeably in the literature, in this chapter we will address chromatin modifiers and remodelers as two separate groups of enzymes, the former covalently modifying histones and the latter regulating the position and composition of nucleosomes.
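As a rough illustration of the packaging arithmetic above, the following sketch estimates how many nucleosomes fit on a stretch of DNA. The figures (146 bp per core, 20 to 80 bp of linker) come from the text; the function name and the simplifying assumption that every nucleosome is paired with exactly one linker of uniform length are ours, and real chromatin is not this regular.

```python
from typing import Tuple

# Figures taken from the text above.
CORE_BP = 146        # DNA wrapped around one histone octamer
LINKER_MIN_BP = 20   # shortest linker DNA
LINKER_MAX_BP = 80   # longest linker DNA


def nucleosome_count_range(dna_length_bp: int) -> Tuple[int, int]:
    """Return (fewest, most) whole nucleosome repeats on dna_length_bp of DNA.

    Hypothetical helper: assumes each repeat is one core plus one linker,
    with all linkers at the same extreme length.
    """
    if dna_length_bp < 0:
        raise ValueError("DNA length must be non-negative")
    longest_repeat = CORE_BP + LINKER_MAX_BP    # 226 bp per repeat
    shortest_repeat = CORE_BP + LINKER_MIN_BP   # 166 bp per repeat
    fewest = dna_length_bp // longest_repeat
    most = dna_length_bp // shortest_repeat
    return fewest, most


if __name__ == "__main__":
    # A 10 kb region holds between 44 and 60 nucleosome repeats
    # under these assumptions.
    print(nucleosome_count_range(10_000))
```

The spread between the two bounds gives a sense of how much nucleosome spacing alone can change the density of packaging over the same sequence.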
3.1.1. Chromatin modifiers
Chromatin modifiers are a group of enzymes that post-translationally modify histones, resulting in the histone modifications that make up the histone code. Covalent modifications on histones can consist of acetyl, methyl, ubiquitin and phosphoryl groups, amongst others. The specific modifications added to different histones determine their function and role in the cell, as represented in Figure 1C. These modifications and their related downstream effects were investigated by the ENCODE team and summarized in their 2012 paper.
Histone modifications can occur on different regulatory regions of a gene, such as its promoter, its enhancer or even along the gene body. The presence of an active mark on the promoter, for example, recruits other transcriptional machinery factors and allows transcription to occur. These histone modifications are not permanent, and can differ between tissue types or depending on cellular state. The regulation of histone modifications is a balance between epigenetic writers and epigenetic erasers, and the dysregulation of either would result in aberrant histone modifications and thus a change in transcription. The proteins that interpret these histone marks, epigenetic readers, are also crucial; their dysregulation could result in the misinterpretation of epigenetic marks and therefore a change in the transcriptional landscape.
A point to note is that only a subset of these residues can undergo multiple modifications. For example, lysine 27 on histone 3 can be either acetylated (in active enhancers) or tri-methylated (a mark of repressed promoters), each of which contributes to a different transcriptional outcome. It is also hypothesized that the addition of a particular covalent modification sterically inhibits the alternate modification. Additionally, histone marks on enhancers such as H3K4me and H3K27ac are capable of regulating the 3D structure of chromatin. Since histone modification is another layer through which the cell regulates gene expression in a normal setting, it comes as no surprise that chromatin modifiers are known to be dysregulated in cancer, resulting in the aberrant expression of their downstream genes (lysine acetyltransferases (KATs) reviewed in , histone methyltransferases (HMTs) reviewed in , histone deacetylases (HDACs) reviewed in [10, 11, 12], histone demethylases reviewed in [13, 14]). Here, we will focus on epigenetic factors which are dysregulated in cancer, resulting in a transcriptional change.
3.1.1.1. Epigenetic writers
Epigenetic writers are enzymes that have the ability to deposit the moiety onto histone tails, and have to work in balance with epigenetic erasers to ensure the presence of the correct histone modification to govern the required transcriptional program. All epigenetic writers require a catalytic domain which allows the enzymatic reaction of the moiety transfer to occur, and another domain which allows the recognition of the chromatin.
One class of epigenetic writers is the KATs, which acetylate lysine residues on histones. This is perhaps the most crucial modification on histone tails, as it not only marks histones to be read by epigenetic readers but also relaxes the chromatin conformation. Acetyl groups neutralize the positive charge of histones, loosening the conformation of nucleosomes and in turn allowing the transcription initiation complex to bind chromatin, resulting in gene activation. A decrease in acetylated histones, and with it a loss of the permissive chromatin state, is observed in multiple cancers, as depicted in Figure 1C; for example, global levels of H4K16ac are decreased in lymphomas when compared to normal tissue.
TIP60 (HIV-1 Tat interactive protein, 60 kDa) is an acetyltransferase and a member of the MYST (Moz, Ybf2p/Sas3p, Sas2p and TIP60) family, known to acetylate both histones and non-histone proteins. Although it has been shown to have a bivalent role in carcinogenesis (dependent on cancer type), strong evidence supports its role as a tumor suppressor [16, 17]. TIP60 exerts its tumor suppressive effect by acetylating several substrates in the cell, one of which is ataxia-telangiectasia mutated (ATM) at DNA damage sites [18, 19]. Additionally, TIP60 is known to acetylate p53 at lysine 120, which is crucial in mediating the switch between cell-cycle arrest and apoptosis. It was also recently shown that TIP60 is able to repress telomerase transcription by acetylating Sp1, thereby inhibiting its binding to the TERT promoter.
In support of its role as a tumor suppressor, the levels of TIP60 were found to be lower in tumors than in matched normal tissue in multiple cancers, including breast and colon. The downregulation of TIP60 occurs through several mechanisms, including regulation at the mRNA level by miR-22 or proteasomal degradation driven by the human papillomavirus (HPV) oncogene E6 through EDD1, an E3 ubiquitin ligase. TIP60 has been shown to regulate transcription at the integrated HPV promoter via the acetylation of H4, and thereby repress the expression of E6 [24, 25].
GCN5 (general control of amino acid synthesis protein 5-like 2) is another acetyltransferase; it acetylates H3K9 and H3K14, marks of active transcription, and when part of the SAGA (Spt-Ada-GCN5-Acetyltransferase) complex, acetylates H3 and H2B. Its link with cancer is primarily through the oncogene MYC, which recruits the SAGA complex to chromatin, where GCN5 functions to activate MYC target genes. Since MYC is a substrate of GCN5 and its acetylation at K323 increases its stability, both proteins are maintained in a positive feedback loop. GCN5 is also crucial in ALL (acute lymphoblastic leukemia) via the acetylation and stabilization of the oncogenic fusion protein E2A-PBX1, leading to aberrant expression of HOX genes and therefore leukemogenesis.
The dysregulation of methyltransferases has also been implicated in the severity and progression of cancer. In particular, G9a is responsible for the mono- and di-methylation of H3K9, marks characteristic of a transcriptionally repressed gene. G9a has been found to be involved in epigenetically silencing numerous tumor suppressor genes, such as DSC3 (desmocollin 3) and CDH1 (cadherin 1), with the repression of G9a rescuing the expression of these tumor suppressor genes. Recent studies have shown the upregulation of G9a in tumors, leading to the aberrant methylation of H3K9 and thus the silencing of tumor suppressors and growth inhibitory factors [32, 33]. In both lung and breast cancers, G9a exerts these effects by regulating epithelial-mesenchymal transition (EMT) factors such as EpCAM (epithelial cell adhesion molecule) and Snai1 (snail family transcriptional repressor 1) [34, 35]. In AML (acute myeloid leukemia), the depletion of G9a results in late disease onset and a reduction of leukemia stem cell frequency, although there was no observable function in hematopoietic stem cells. This was identified to occur through the regulation of transcription in a HOXA9-dependent manner. In addition, G9a has alternate roles in the cell, acting as both a transcriptional co-repressor and co-activator. As a transcriptional co-repressor, G9a has been found in the same protein complex as JARID1A, the H3K4 demethylase, while it acts as a transcriptional co-activator through stabilization of the mediator complex.
EZH2 is the enzymatic subunit of the polycomb repressive complex 2 (PRC2), which methylates lysine 27 of histone H3, resulting in chromatin compaction and transcriptional silencing [38, 39]. EZH2 overexpression has been observed in a myriad of different cancers including prostate, breast, bladder and endometrial (reviewed in ). Several independent studies have shown that this gain of EZH2 function is able to contribute to cell proliferation and to neoplastic transformation in breast epithelial cells, which is dependent on the methyltransferase domain of EZH2. In addition, mutations have been found in the H3K27me3 demethylase UTX, further contributing to dysregulated tri-methylation of H3K27.
3.1.1.2. Epigenetic readers
The faithful expression and activity of epigenetic readers are also crucial in regulating histone modifications; their dysregulation would lead to histones being modified for an extended amount of time, resulting in a cascade of downstream effects. One group of epigenetic readers is the ING (inhibitor of growth) family, whose members contain a PHD (plant homeodomain) finger at the C terminus with the ability to read methylated lysine 4 of histone 3 [44, 45]. ING readers are present in numerous protein complexes, which allow the interpretation of the histone tails to be acted upon. ING1 and ING2 are able to recruit mSin3-HDAC transcriptional repressors, while ING3, ING4 and ING5 interact with HATs to activate downstream gene expression [46, 47]. ING family members have been implicated in many cellular processes relevant to tumorigenesis, such as cell cycle progression, apoptosis, DNA repair and senescence. Because of the prominent role of INGs in these processes, cancer cells have exploited this mechanism, with loss-of-function mutations in INGs observed in many solid tumors.
Although acetylated histones can exert transcriptional change by themselves through the regulation of chromatin conformation, the acetylation marks can also be read by epigenetic readers, resulting in further gene expression changes. Extensive studies have been carried out investigating the readers of acetylation marks, the bromodomain-containing proteins. Bromodomain and extra-terminal (BET) proteins are a subset of this family, consisting of BRD2, BRD3, BRD4 and BRDT. BRD4 has been shown to recruit the elongation factor P-TEFb [49, 50], thus facilitating transcription by RNA Pol II and resulting in gene activation. In other cases, BRD4 is also known to recruit repressive machinery.
3.1.1.3. Epigenetic erasers
Epigenetic erasers are capable of removing the histone modifications applied by the epigenetic writers, and are crucial in ensuring that histone modifications are removed in a timely manner to prevent aberrant transcription from occurring.
The larger of the two classes of histone demethylases is the family of proteins that contain the Jumonji C (JmjC) domain, of which JARID1B (also known as KDM5B) is a member. It has been identified to remove methylation marks from lysine 4 of histone 3. Its downstream targets comprise tumor suppressor genes, including BRCA1 and Caveolin 1, whose promoters JARID1B demethylates, thereby suppressing their expression [51, 52]. Not surprisingly, JARID1B was found to be overexpressed in late-stage breast and prostate cancer [51, 53]. Similarly, KDM4A and KDM4B have been identified as proto-oncogenes, interacting with ERα to regulate pro-tumorigenic factors such as MYC. KDM4A, in particular, blocks cellular senescence by transcriptionally repressing the tumor suppressor CHD5. Interestingly, KDM4C was discovered to increase the amount of euchromatin in the cell by delocalizing HP1 (a repressive protein), therefore allowing transcription.
HDACs are able to remove acetyl groups from histone tails, and are divided into four classes based on their similarity to their yeast homologs. The most well-studied class of HDACs is the class I subfamily (consisting of HDAC1, HDAC2, HDAC3 and HDAC8), all of whose members have been linked to cancer. The upregulation of HDAC1 has been associated with poor prognosis in several solid tumors such as lung, prostate and liver [57, 58], and has even been identified as an independent prognostic marker in breast tissues. Along similar lines, the transient depletion of HDAC1 and HDAC3 in the cervical cancer cell line HeLa resulted in decreased cell proliferation. Links between the class II HDAC genes and lung cancer have also been drawn, when the expression of HDAC genes from 72 NSCLC (non-small cell lung cancer) patients was measured via real-time PCR. It was found that lower expression of class II HDAC genes was correlated with poorer prognosis, of which HDAC10 was the strongest predictor of patient outcome.
3.1.2. Chromatin remodelers
Chromatin remodelers are enzymes that are able to make structural changes to the nucleosome, either by adding or ejecting a nucleosome, or by moving the nucleosome along the string of DNA (reviewed in ). This acts as one of the first steps of gene expression regulation, allowing the DNA to be exposed to other biological factors to be read and therefore expressed.
There are four chromatin remodeler families that utilize ATP hydrolysis to catalyze these movements along the string of DNA: NuRD/Mi-2/CHD, switch/sucrose non-fermenting (SWI/SNF), inositol requiring 80 (INO80) and imitation switch (ISWI). There is at least one epigenetic reader protein in each of the complexes, which allows the recognition of the nucleosome prior to its ejection or relocation.
Similar to other dysregulated factors in cancer, chromatin remodelers have an important responsibility in regular gene expression, and therefore have been exploited in cancer cells as a mechanism which leads to uncontrolled proliferation. Although there has been extensive research into mutations of members of chromatin remodeling families, limited evidence has linked these mutations to epigenetic alterations and changes in the chromatin architecture.
In the SWI/SNF complex, BRG1 and SNF5 are required for maintaining nucleosome positioning at the −1 and +1 positions around the TSS of repressed genes. Up to 20% of human tumors are known to contain at least one mutation in SWI/SNF , although BRG1 was found to have dual effects in both the promotion and suppression of tumorigenesis [64, 65, 66, 67]. In spite of many studies carried out to characterize the mutations of SWI/SNF components, there are far fewer studies that identified the epigenetic implications of these mutations (reviewed in ). It was found that in the absence of either BRG1 or SNF5, there was a decrease in the distance between nucleosomes on both sides of the TSS, indicating that chromatin condensation is augmented upon SWI/SNF dysregulation . SWI/SNF is also known to interact with other chromatin modifiers, whose interaction is altered when there is a change observed in SWI/SNF. As an example, SWI/SNF complex antagonizes PRC2’s repressive activity by removing it from gene promoters, resulting in open chromatin conformation and therefore increase in gene expression [70, 71].
3.1.3. Histone variants
Histone variants are non-canonical versions of three histone subunits (all but H4), some with as few as one amino acid difference from their canonical counterpart. The high conservation of histone variants across species alludes to important cellular functions separate from those of the canonical histones [72, 73].
The placement of histone variants in nucleosomes at specific portions of the genome regulates nucleosomal stability, dynamics and structure to different degrees, and therefore affects the transcriptional machinery. For example, the presence of H2A.Bbd results in loosened chromatin, which encourages transcription. As with canonical histones, histone variants are also subject to covalent modifications (acetylation, methylation etc.) and mutations, which adds a further layer of complexity to understanding their function and role.
Many of these histone variants have been found to have a role in cancer (reviewed in ), some with oncogenic and others with tumor suppressive abilities, depending on their role in the cell. Some of the strongest correlations between histone variants and transcriptional regulation occur at a macro level, in which histone variants regulate the stability of the nucleosome. Nucleosomes that contain H2A.Z or H3.3 were shown to be less stable, and are known to occupy the normally nucleosome-depleted regulatory regions. Their presence at these regulatory sites inhibits the formation of stable repressive nucleosomes and, owing to their labile nature, they can be displaced easily by transcription regulators, thereby facilitating gene expression.
In addition to histone variants affecting the overall nucleosome structure, in some instances, the readers of histone variants are different from that of canonical histones. The reader of H3.3K36me3 was identified to be the tumor suppressor protein ZMYND11 (zinc finger MYND-type containing 11), which regulates RNA Pol II, hence linking histone variants and transcriptional elongation . Further, an increase in acetylation of H2A.Z was observed in prostate cancer, particularly around the promoters of actively transcribed genes, thus resulting in the aberrant activation of genes .
3.1.4. Chromatin conformation
Regions of the DNA which are known to interact frequently are classified as TADs, which can range up to several million nucleotides in length, and several factors are thought to be associated with the boundaries of these domains, including CTCF and cohesin . Characteristic features of TADs include the lower frequency of interaction of gene domains between TADs while genes within the same TADs are often co-regulated, sharing the same genetic profile (reviewed in ).
Given the role of TADs in regulating gene expression, it should come as no surprise that this cellular process is also exploited in cancer cells. Disrupted TAD boundaries have been found present in cancer cells, allowing ‘enhancer hijacking’ to occur, where enhancers do not act on their canonical targets alone, resulting in aberrant expression of non-canonical genes, illustrated in Figure 1D . It was found that GFI1 and GFI1B, members of the growth factor independent 1 family of proto-oncogenes, were upregulated not by amplification in medulloblastoma, but instead activated by enhancer hijacking, coming under the control of an aberrant active enhancer .
Similarly, viral oncogenes encoded by Epstein–Barr virus (EBV) were found to hijack DNA looping, leading to the association of two key genes, MYC and BCL2L11 (a pro-apoptotic factor), with non-canonical enhancers. Through the transactivator EBNA2, the MYC locus was reconfigured to be regulated by a non-canonical enhancer, resulting in the activation of the oncogene and promoting tumor formation. Concurrently, the EBV repressors EBNA3A and EBNA3C were shown to be capable of recruiting EZH2, thus silencing the upstream regulatory enhancer hub.
3.1.5. Mediator complex
The mediator complex is a large, 26 subunit complex which coordinates the many different elements required for the activation of gene transcription. This includes the cross-talk between RNA Pol II and transcription factors that possess sequence-specific recognition sites, and also distal regulatory regions such as enhancers. The CDK8 module, consisting of MED12, CDK8, Cyclin C and MED13 , has been identified as a key component of the complex, functioning as a molecular switch  and therefore regulating the activity of the mediator complex. Due to its crucial role in regulating transcription, cancer cells have exploited this mechanism to lead to aberrant gene expression.
MED12 is a member of the complex in which frequent mutations at the N terminus have been found in prostate cancer, uterine leiomyosarcomas, breast adenomas and phyllodes tumors. Specifically, the mutations in MED12 disrupt the interaction between MED12 and CDK8, rendering the CDK8 module inactive and thereby decreasing the activity of the mediator complex [88, 89, 90].
3.2. DNA methylation
One form of epigenetic regulation is CpG (5′ cytosine phosphate guanine 3′) methylation, which refers to the addition of a methyl group to the carbon atom at the fifth position of cytosine, exclusively where cytosine directly precedes guanine. These methyl moieties are modified and interpreted by three distinct groups of proteins: DNA methylation writers, readers and editors.
DNA methylation writers consist of proteins from the DNA methyltransferase (DNMT) family, namely DNMT1, DNMT3A and DNMT3B [91, 92]. De novo methylation patterns are added by DNMT3A and DNMT3B in response to stimuli in different contexts, while the primary role of DNMT1 is the maintenance of methyl groups, allowing them to be inherited across cell divisions. The effects of CpG methylation are mediated by reader proteins from three separate families: methyl-CpG-binding domain (MBD) proteins, the SET- and RING finger-associated (SRA) domain family, and the Kaiso family of proteins [93, 94, 95, 96]. These proteins are able to bind methylated CpG sites and recruit other factors to exert regulatory roles in the cell [97, 98]. Finally, DNA methylation editors are able to oxidize the existing methyl group on carbon-5, converting it to 5-hydroxymethylcytosine (5-hmC), which undergoes further chemical modifications before the cytosine resumes its unmethylated state.
Not surprisingly, there have been reports linking all three groups of the above-mentioned proteins with cancer, leading to a global hypomethylation of repetitive elements and CpG-poor regions but hypermethylation at CpG islands. Approximately 15% of CpG sites are situated directly upstream of genes within CpG islands, with the remainder of the genome having relatively sparse CpG sites (reviewed in ). CpG islands are regions of the genome spanning between 300 and 3000 nucleotides which contain a high density of CpG dinucleotides and are present at about 60% of human promoters, while CpG island shores are the regions within 2 kb flanking the CpG islands. CpG islands have been shown to be sites of transcription initiation, as evidenced by several features: TSSs have been found within CpG islands, RNA Pol II co-localizes to the islands, and the active histone mark H3K4me3 is found within the islands [104, 105]. CpG islands are thought to regulate gene expression in two distinct manners. First, it has been shown that the methylated CpG dinucleotide is capable of sterically hindering the binding of transcription factors and co-activators. Cancer cells have exploited this mechanism to silence tumor suppressor genes, with a global hypermethylation of CpG islands observed across multiple cancer types, as depicted in Figure 1B [101, 107, 108, 109]. Second, the MeCP1 proteins, a class of DNA methylation readers, have been shown to recruit HDACs, which deacetylate histones and thereby condense the chromatin, ultimately leading to a decrease in transcription.
MBD proteins have been implicated in multiple cancers (reviewed in ), with their mutation and overexpression resulting in uncontrolled cell proliferation. In prostate cancer, it was discovered that MBD2 overexpression is associated with the aberrant hypermethylation, and therefore suppression, of the GSTP1 tumor suppressor, as is the case with TERT in HPV-positive cells [111, 112, 113]. Recently, there was an unexpected finding that MBD2 was associated with DNMT1 and DNMT3A, and that the loss of MBD2 resulted in global hypomethylation, leading to both downstream gene activation and repression. In particular, the regions of hypomethylation observed at CpG islands and shores were the same regions that were hypermethylated in prostate cancer patients, alluding to the critical role of MBD2 in rewriting the cancer methylome.
The TET (ten-eleven translocation) family of proteins has also been implicated in several different types of cancer, with most studies carried out in hematological malignancies. It was in blood that TET1 was first implicated in cancer, identified as a fusion partner in mixed lineage leukemia (MLL)-rearranged AML [115, 116]. Subsequent studies in blood cancers focused on the role of TET2, discovering numerous mutations that result in a truncated enzyme or one with compromised enzymatic activity. This was reflected in patients, where a global decrease in 5hmC was observed in those with homozygous or heterozygous TET2 mutations, suggesting that mutations in TET2 are haploinsufficient loss-of-function mutations. Aside from hematological malignancies, overall decreased levels of TET2, with a concomitant decrease in 5hmC levels, have been observed in cancers of other origins such as breast, lung, liver, prostate, gastric, melanoma and glioblastoma.
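The directionality in this definition (a cytosine immediately followed by a guanine, read 5′ to 3′) is easy to get wrong, so a minimal sketch of locating candidate CpG sites on one strand may help. The function name is hypothetical, and the code only finds positions where methylation could occur; it says nothing about whether a given site is actually methylated.

```python
from typing import List


def cpg_positions(sequence: str) -> List[int]:
    """Return 0-based indices of each cytosine that directly precedes a guanine.

    The sequence is read 5' to 3' on a single strand. Hypothetical helper
    for illustration; lowercase input is accepted.
    """
    seq = sequence.upper()
    return [
        i
        for i in range(len(seq) - 1)
        if seq[i] == "C" and seq[i + 1] == "G"
    ]


if __name__ == "__main__":
    # "GC" is not a CpG site; only "CG" counts.
    print(cpg_positions("GCGC"))        # [1]
    print(cpg_positions("ACGTCGCCGG"))  # [1, 4, 7]
```

Note that in `GCGC` only the cytosine at index 1 qualifies, since the cytosine at index 3 is followed by nothing and the guanine at index 0 precedes, rather than follows, a cytosine.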
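To make "a high density of CpG dinucleotides" concrete, the sketch below scores a sequence by its GC fraction and its observed/expected CpG ratio. The 300-nucleotide minimum length follows the range given above; the GC threshold of 0.5 and observed/expected threshold of 0.6 are assumptions borrowed from the commonly used Gardiner-Garden and Frommer criteria, not values stated in this chapter, and the function names are hypothetical.

```python
from typing import Tuple


def cpg_stats(sequence: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a sequence.

    Expected CpG count is (number of C * number of G) / length.
    Returns (0.0, 0.0) for an empty sequence.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself
    gc_fraction = (c_count + g_count) / n
    expected = (c_count * g_count) / n
    obs_exp = cpg_count / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(
    sequence: str,
    min_length: int = 300,       # lower bound of the size range in the text
    min_gc: float = 0.5,         # assumed threshold
    min_obs_exp: float = 0.6,    # assumed threshold
) -> bool:
    """Apply the length, GC and observed/expected thresholds to one sequence."""
    if len(sequence) < min_length:
        return False
    gc_fraction, obs_exp = cpg_stats(sequence)
    return gc_fraction >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    print(looks_like_cpg_island("CG" * 200))  # True: 400 nt, all CpG
    print(looks_like_cpg_island("AT" * 200))  # False: no C or G at all
```

Genome-wide island callers typically slide a window along the sequence and merge adjacent passing windows; this sketch scores a single pre-selected region only.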
In addition, IDH1 (isocitrate dehydrogenase 1) and IDH2, genes involved in the tricarboxylic acid (TCA) cycle, were found to be mutated in gliomas and AML, leading to the hypermethylation of the genome. This is attributed to the production of a metabolite which inhibits histone and DNA demethylation [121, 122].
3.3. Chromosomal translocations
Cancer genomes are notorious for being unstable, that is, prone to mutations in the nucleic acid sequence, chromosomal rearrangements, inversions, translocations and deletions. The consequences are widespread and severe, resulting in aberrant expression of genes which are crucial in evading apoptosis and eventually in tumor growth.
Chromosomal translocations are an important aspect of genomic instability, where a section of the genome is inserted into an alternate location. This can involve large sections of the genome spanning millions of base pairs, as in the formation of the Philadelphia chromosome through the swap of sections of chromosomes 9 and 22 (first described in 1960), or a small translocation (<1 kb), as is the case with MLL fusion genes. Interestingly, the largest proportion of chromosomal translocation targets are transcription factors, wherein the fusion gene produced is still active, but in an aberrant manner. These chromosome abnormalities are most often observed in hematopoietic and lymphoid tumors, with fusion genes involving the MLL gene accounting for up to 5–10% of ALL/AML cases and resulting in unfavorable prognoses.
MLL, located at 11q23, gives rise to two classes of fusion genes, in which the chromosomal insertion results in the retention of either the N or the C terminus of MLL; both have been identified to have oncogenic potential. Normal MLL functions as a histone methyltransferase, with its N terminus containing a CxxC domain, allowing it to recognize unmethylated CpG dinucleotides and its corresponding target genes. The C terminus of MLL, on the other hand, contains the features responsible for its histone methyltransferase activity, such as a SET domain, which methylates lysine 4 of histone H3. The more prevalent class of fusion genes comprises the chimeras in which the N terminus of MLL is fused with the C terminus of the fusion partner, termed MLL-r (MLL-rearranged), which are also known to have more oncogenic potential. It has been observed that most MLL-r fusions augment the expression of the canonical downstream targets of MLL, such as the HOX cluster of genes, rather than gaining a new profile of target genes. However, the exact function of a fusion gene is entirely dependent on the fusion partner. The two most common fusion partners of MLL are AF9 and AF4, which are components of the super elongation complex (SEC) and confer on the fusion product the function of a transcriptional activator [128, 129]. In the chimeric gene, the fusion partner acts as an adaptor linking the MLL portion (with its DNA-binding ability) to the rest of the SEC, therefore resulting in aberrant expression of the downstream genes.
In a recent study, it was shown that MLL-AF9 and MLL-AF4 also bind to distal regulatory elements such as enhancers, and are able to deregulate the expression of their target genes through interplay with RUNX1. Further, enhancer regions enriched for MLL-AF9 were found to be CTCF-rich, suggesting a novel role of MLL-AF9 in mediating 3D chromatin conformation.
Although most studies on fusion genes have been published in blood malignancies, recent studies have turned their attention to solid tumors. Fusion genes have also been found to be prevalent in non-blood cancers, with a similar trend of fusion partners being transcription factors, resulting in rampant aberrant gene expression changes (reviewed in ).
Another example of a fusion protein is BRD4-NUT, prominent in NUT midline carcinoma (NMC). The N terminus of BRD4 is fused with the C terminus of NUT, with retention of both bromodomains (from BRD4) and the KAT catalytic domain (from NUT) in the resultant fusion protein. This fusion protein has oncogenic potential through the formation of large domains of active chromatin (about 1 Mb) where BRD4-NUT and histone hyperacetylation are co-localized. In spite of the large size of the activated chromatin, surprisingly only a small subset of genes is upregulated, including MYC and TP63.
4. Therapies targeting epigenetic factors
When considering the different ways in which biological processes are dysregulated in cancer, aberrant activity of epigenetic regulators is considered one of the easiest to treat. This is mainly because epigenetic dysregulation typically occurs only in specific cell types, and the aberrancies are not present in all somatic cells. As such, therapies can be targeted to the affected cancer cells, instead of requiring gene therapy to correct all somatic cells. Furthermore, epigenetic factors are often enzymes whose activity can be targeted and inhibited. Therefore, diseases linked to epigenetic dysregulation often have a more positive prognosis with better treatment possibilities. Most of the epigenetic therapies currently in use are inhibitors, which prevent the enzyme from performing its canonical function, as summarized in Table 1 and Figure 2.
|Targeted mechanism||Canonical function||Tumor type||Therapeutic compound|
|DNMT1, 3A, 3B||DNA methylation, methylating and therefore silencing tumor suppressor genes||Myelodysplastic syndrome (MDS), AML||Inhibitors: 5-Azacytidine/Vidaza (FDA and EMA approved), decitabine (FDA and EMA approved) (reviewed in )|
|LSD1||Mono and dimethylated H3K4 demethylase||Promyelocytic leukemia, AML, small cell lung cancer||TCP, GSK2879552 |
|JARID1||Di and trimethylated H3K4 demethylase||Lung cancer||Compound 6j, prodrug 7j|
|Classes I, II and IV histone deacetylases||Removes acetyl groups from histone tails||Cutaneous or peripheral T cell lymphoma, glioblastoma||Inhibitors: Vorinostat/suberoylanilide hydroxamic acid (SAHA) (FDA approved), panobinostat (FDA approved), belinostat (FDA approved); reviewed in [171, 172]|
|Class I histone deacetylases||Removes acetyl groups from histone tails||Drug-resistant multiple myeloma, T-cell lymphoma||Inhibitor: Romidepsin (FDA approved); reviewed in |
|Histone acetyltransferases- GCN5, p300, PCAF||Acetylates histone tails||Neuroblastoma||Inhibitor: PU139, PU141 |
|EZH2||Methylation of H3K27, and repression of tumor suppressor genes||Acute myeloid leukemia (AML), lymphoma, non-small cell lung cancer||Inhibitors: EPZ-005687, GSK-126, EPZ-6438/Tazemetostat, UNC1999 (reviewed in )|
|DOT1L||Methylation of H3K79, and activation of genes involved in DNA damage response and cell cycle progression||Advanced hematological malignancies (reviewed in ), acute myeloid leukemia (AML), lymphoma||Inhibitors: EPZ-5676, EPZ004777 [163, 164], SYC-522, GSK2816126, CPI-1205|
|G9a||Methylation of H3K9||Non-small cell lung cancer||Inhibitor: UNC0642 , A-366 |
|Bromodomain-containing proteins||Reads acetylated histone tails||Solid tumors, AML, MDS||iBET compounds: I-BET762, I-BET151, RVX-208, RVX-2135 Reviewed in [138, 176]|
The development of 5-azacytidine has been one of the most promising advances in epigenetic therapy thus far, with treatment increasing survival rates in MDS and AML patients when compared to conventional care. 5-azacytidine is a cytosine analogue that incorporates into DNA and RNA and binds irreversibly to all three DNMTs, sequestering the enzymes and preventing them from performing their canonical functions. At low doses, treatment with DNMT inhibitors (DNMTi) results in global hypomethylation (observed in LINE and Alu repetitive elements as surrogate markers of global hypomethylation), while it is cytotoxic at higher doses. 5-azacytidine also cannot be methylated by DNMTs, therefore curbing the CpG hypermethylation seen in cancer cells. However, different tumor types have yielded varied response rates to DNMTi, with solid tumors demonstrating limited sensitivity in comparison to myeloid malignancies. This could be explained in part by the fact that DNMTi function during the S-phase of the cell cycle, and are therefore less efficacious in solid tumors. In tumors where DNMTi were found to be effective, aberrantly silenced tumor suppressor genes were reactivated upon treatment, contributing to the mechanism by which DNMTi can lessen tumor burden. Additionally, treatment with DNMTi was found to increase the presentation of tumor antigens (such as cancer testis antigens (CTA)) and interferon signaling, increasing the visibility of the tumor cells to the host and therefore their recognition and destruction [139, 140]. Expression of endogenous retroviral elements (ERVs) was also observed to increase upon treatment with DNMTi, which led to an increase in cytoplasmic double-stranded RNA, inducing viral mimicry and eventually leading to apoptosis [141, 142]. In contrast, treatment with IDH inhibitors has met with limited success, with only a small subset of IDH-mutant cell lines demonstrating sensitivity to treatment.
Currently, there are no known TET inhibitors which prevent the demethylation of CpG islands.
Similarly, aberrant histone modifications are observed in cancer cells, and drugs have therefore been developed to block the activity of the enzymes that are responsible for maintaining these modifications. The majority of HDAC inhibitors that have been developed can be termed broad reprogrammers, which target entire classes of deacetylases instead of specific enzymes. Class I, II and IV HDAC enzymes share one similarity, namely that they require a zinc ion to perform their enzymatic function, whilst class III HDACs require NAD+ as a cofactor. As a result, it is easier to target these HDACs as two separate entities. There are now four inhibitors which have been approved by the FDA: vorinostat/SAHA (suberoylanilide hydroxamic acid), romidepsin, belinostat and panobinostat. However, research focus has now turned to targeting the readers of these acetylated marks, which are proteins that contain bromodomains. After reading the acetylated histone marks, bromodomain-containing proteins can act as a scaffold to recruit other activating or repressive machinery to act on the acetylated histone tails, regulating downstream gene expression. Inhibitors of the bromodomains of bromodomain and extra-terminal motif proteins (iBETs) have gained exceptional interest as of late. One of the most prominent drugs targeting bromodomain-containing proteins that has been developed is JQ1, named after its founding chemist, Jun Qi [144, 145], and initially characterized in NUT midline carcinoma. JQ1 acts as a competitive inhibitor of BRD2, BRD3, BRD4 and BRDT by reversibly binding to the hydrophobic bromodomain pockets, thereby preventing these proteins from binding to and recognizing acetylated histone tails. Since MYC is a known target of BRD4, the bulk of the anti-tumorigenic effect can be attributed to the decrease in expression of this oncogene. However, the effects of iBET compounds have been shown to be not entirely dependent on MYC.
Across different tumor types, JQ1 has been shown to suppress tumor growth in a myriad of different ways. In glioblastoma, JQ1 has been shown to induce G1 cell-cycle arrest and apoptosis by regulating the expression of key genes such as MYC, hTERT and p21. Similarly, in medulloblastoma, JQ1 was shown to affect cell cycle genes by activating cyclin-dependent kinase inhibitors (CDKi), reducing E2F activity and affecting p53 signaling. However, JQ1 is not able to selectively target either of the two bromodomains on the BET proteins, nor to discriminate between the four BRD proteins, which limits its utility. Although BRD4 is known to regulate the transcription of many cellular genes, treatment with JQ1 only represses a subset of these genes. This raises the question of whether BRD4 regulates transcription in a manner independent of reading acetylated histone tails. This mechanism of action was later elucidated, where BRD4 was found to be located at super-enhancers, therefore regulating transcription in a distinct manner. The abovementioned iBET compounds only competitively inhibit the function of the bromodomain-containing proteins. Thus, recent research has attempted to degrade the targets of iBETs by conjugating an iBET to a ligand that recruits an E3 ubiquitin ligase, in a method known as proteolysis targeting chimera (PROTAC) [150, 151]. iBET compounds have also shown promise in NMC (where the BRD4-NUT fusion protein is formed), wherein treatment with JQ1 significantly reduced tumor formation in vivo with limited cytotoxic effects.
Histone demethylases are another class of epigenetic erasers which can be targeted in the clinic, of which inhibitors against LSD1 have seen the most progress. LSD1 is a member of the lysine demethylase (KDM) 1 family, with the ability to demethylate mono- and dimethylated H3K4, therefore leading to transcriptional repression. Its overexpression is linked to more aggressive breast and esophageal cancers, while its downregulation limits cell proliferation. Combinatorial therapies involving the LSD1 inhibitor tranylcypromine (TCP) and all-trans-retinoic acid have been found to be efficacious in AML mouse models, and function through the accumulation of H3K4 methylation and therefore the activation of previously silenced tumor suppressor genes. Other drugs such as GSK2879552, a derivative of TCP, have been developed and are currently in clinical trials for small cell lung cancer and AML. Similarly, JARID1 is the demethylase of tri- and dimethylated H3K4, and is observed to be aberrantly expressed in several cancers. Compound 6j and prodrug 7j are inhibitors which have been developed to inhibit JARID1 activity, with suppression of growth seen in a lung cancer cell line.
Targeted therapies are another group of drugs, which have higher specificity and target specific epigenetic modifiers. EZH2 is one such target, and has been found to be overexpressed in multiple cancers. The first drug to target EZH2 was 3-deazaneplanocin-A (DZNep), which initiates the degradation of the PRC2 complex to restore expression of silenced genes. However, therapeutics targeting EZH2 have since evolved to target its enzymatic activity instead. EPZ-005687 is a competitive inhibitor which has high specificity for EZH2, and induces apoptosis via the reduction of H3K27 methylation levels in lymphoma cells. Similar effects were observed with EPZ-6438/tazemetostat treatment, with decreased H3K27 methylation and decreased tumor size in non-Hodgkin lymphoma mouse models. Several other small molecule inhibitors such as GSK126 and UNC1999 have been identified to function as EZH2 inhibitors. In mouse xenografts with gain-of-function EZH2 mutations, GSK126 has been shown to be effective in decreasing global levels of H3K27me3 and reactivating genes silenced by the PRC2 complex.
DOT1L is the only known histone methyltransferase of H3K79, and is often misregulated in AML as a result of gene translocations, leading to aberrant expression of hematopoietic stem-cell renewal genes. EPZ004777 has been shown to reduce H3K79 methylation and subsequently downregulate downstream genes, prolonging survival in MLL mouse models [163, 164]. EPZ-5676 and SYC-522 are two drugs which are currently in clinical trials, both of which have been demonstrated to efficiently decrease H3K79 methylation. G9a is able to methylate H3K9, and inhibitors have been developed against it. UNC0638 is one such small molecule inhibitor, which results in genetic changes which phenocopy a transient depletion of G9a. As expected, a concomitant global decrease in H3K9 methylation was observed. However, soon after, UNC0642 was developed with improved pharmacokinetic properties.
As an alternative to directly targeting epigenetic modifiers, research is now expanding into targeting the upstream regulators of these factors, such that the activity or expression of the histone modifiers is regulated, affecting the downstream histone modifications.
As discussed in this chapter, the regulation of gene expression is a highly complex and multi-tiered process, regulated by a multitude of factors, summarized in Figure 3. Cancer cells have evolved over time to exploit these mechanisms and dysregulate many cellular processes in order to evade cell cycle checkpoints and apoptosis, allowing continued proliferation. In particular, oncogenic viruses have also been shown to target some of these processes to dysregulate normal cell function, as in the case of BRD4 and TIP60, both targeted by HPV oncogenes. It can be assumed that oncogenic viruses would have evolved to maximize their carcinogenic potential, and therefore have minimal redundant functions. Therefore, the mere fact that these cellular components are targeted by oncogenic viruses alludes to their high canonical importance in the normal cell.
We have presented the epigenetic regulation of gene expression, one of the main ways in which either the profile of expressed genes is changed, or the existing profile of genes is dysregulated, leading to aberrant upregulation or downregulation. In cancer cells, the dysregulated pathways have to overpower the canonical functions, tipping the balance so that processes occur in their favor, for sustained growth. It is therefore crucial to understand the mechanisms which are dysregulated in cancer cells so that further therapies can be developed to target these aberrancies.
S.J. was supported by grants from National Research Foundation Singapore and the Singapore Ministry of Education under its Research Centers of Excellence initiative to the Cancer Science Institute of Singapore (R-713-006-014-271), National Medical Research Council (NMRC CBRG-NIG BNIG11nov001), Ministry of Education Academic Research Fund (MOE AcRF Tier 1 T1–2012 Oct −04 and T1–2016 Apr −01) and by the RNA Biology Center at CSI Singapore, NUS, from funding by the Singapore Ministry of Education’s Tier 3 grants, grant number MOE2014-T3–1-006. N.Y. was supported by a post-graduate fellowship from NUS Graduate School of Integrative Sciences and Engineering.
|ALL||Acute Lymphoblastic Leukemia|
|AML||Acute Myeloid Leukemia|
|BET||Bromodomain and Extra-Terminal|
|CDKi||Cyclin-Dependent Kinase Inhibitor|
|CTA||Cancer Testis Antigens|
|CpG||Cytosine Phosphate Guanine|
|EpCAM||Epithelial Cell Adhesion Molecule|
|ERV||Endogenous Retroviral Element|
|GCN5||General Control of Amino Acid Synthesis Protein 5-like 2|
|iBET||Inhibitors targeting Bromodomain and Extra-Terminal Motif Protein|
|ING||Inhibitor of Growth|
|INO80||Inositol Requiring 80|
|MBD||Methyl-CpG Binding Domain|
|MYST||Moz, Ybf2p/Sas3p, Sas2p and TIP60|
|NDR||Nucleosome depleted region|
|NMC||NUT midline carcinoma|
|NSCLC||Non-small cell lung cancer|
|PRC2||Polycomb repressive complex 2|
|PROTAC||Proteolysis targeting chimera|
|SAHA||Suberoylanilide hydroxamic acid|
|SEC||Super elongation complex|
|Snai1||Snail family transcriptional repressor 1|
|SRA||SET- and Ring family-associated|
|TAD||Topologically associated domain|
|TIP60||HIV-Tat1 interactive protein 60 kDa|
|TSS||Transcription start site|
|ZMYND11||Zinc finger MYND-type containing 11|