Logic of Epigenetics and Investigation of Potential Gene Regions

In living organisms, all molecular structures are formed and all events are carried out by specific DNA sequences referred to as 'genes'. However, these genes need to be governed and controlled to function properly. In this way, they can participate in biological processes at the optimal time. Genes are actively controlled by other genes and by specific proteins called 'transcription factors'. In addition, there is another mechanism that determines gene expression and that can be transmitted from generation to generation and from cell to cell. This mechanism is referred to as the 'epigenetic code'. The DNA sequence does not undergo any changes during the formation of this code, but the relevant DNA fragment is no longer expressed. While histone modifications control the expression of DNA at the level of chromosome structure, methylation modifications at the gene level are quite effective in controlling expression. The cytosine methylation seen in the mammalian genome often occurs in the nucleotide pairs called CpG dinucleotides. The most common epigenetic modifications are changes in histone proteins and DNA methylation, and the most widely studied and best-established epigenetic mechanism is the latter.


Introduction
Epigenetic changes in living organisms can basically be grouped under two headings. One is protein acetylation, an epigenetic modification at the protein level. The other is DNA methylation, an epigenetic modification at the DNA level. In living species, all macromolecular structures are determined by nucleotide sequences in the genome, but there is a different mechanism, transferable from cell to cell, that has an inherent ability to determine gene expression. It is called the epigenetic code. During the creation of this code, the DNA sequence does not undergo any change. Genetic and epigenetic alterations can result in the activation of oncogenes or the inactivation of tumour suppressor genes. Methylation may occur in any living organism, from bacteria to complex species such as humans. The most common type of methylation is the methylation of gene promoters. This is followed by exon methylation, intron methylation and exon-intron methylation, which may also be observed quite frequently.
Methylation-specific PCR (MSP) and methylation-sensitive restriction fragment length polymorphism (MS-RFLP) are the two most widely utilised methods in DNA methylation studies. DNA sequencing after bisulphite treatment, known as bisulphite sequencing, may also be employed to investigate the region of interest. These three methods are considerably successful. With advances in technology and reduced costs, methylation-specific DNA sequencing has become a frequently used method to investigate the methylated regions identified by means of these methods. Whether a methylated region affects the expression of the gene of interest is another aspect to take into account, as some genes may not yield any product although they are not methylated. In that case, one should consider that the gene in question may be regulated by other mechanisms. With a better understanding of histone and DNA modifications, these now attract attention as therapeutic targets in cancer and various other diseases, and they have started to create new alternatives, especially in cancer treatment. Various computer programs have also been developed for methylation analysis. This section discusses each of these topics separately.

Histone modifications
The most basic unit of the structure called chromatin is the nucleosome. A nucleosome consists of a 146 bp stretch of DNA wrapped around the core histone proteins H2A, H2B, H3 and H4, with the H1 protein binding to the structure as a lock. These constructs also provide the packaging needed for the DNA molecule, which is quite large, to fit in a small area (Figure 1).
Covalent changes to amino acids in the tail regions of the core histone proteins form the epigenetic code. As a result of these changes, the chromosome structure controls the expression of DNA by forming heterochromatin (expressionally inactive regions) or euchromatin (expressionally active regions). Histone modifications can be classified as acetylation, methylation and phosphorylation. The modifications are reversible and are mostly seen at the amino (NH3+) and carboxyl (COO-) ends of the core histone proteins. Each of these changes may alter histone function. For example, one, two or three methyl groups can be added in the methylation of arginine [1]. Histone changes are carried out by various enzymes. These include histone acetyltransferases (HATs), histone deacetylases (HDACs) and histone methyltransferases. The balance between the activities of these enzymes and their associated proteins is important for normal cell function. Disruption of this balance can cause problems ranging from loss of cell function to cancer formation.

DNA modifications-methylations
The transcriptional silencing mechanism underlying DNA methylation is based on the hypermethylation of cytosines in CpG-rich islands in the promoter region of a gene. This mechanism cooperates with histone deacetylation to repress the chromatin structure. GC-rich DNA sequences in the human genome are often found in the promoter region and exon 1 of about 50% of all genes [2]. DNA methylation is the main underlying mechanism that regulates gene expression in mammalian cells, as it is one of the major mechanisms for silencing genes involved in the cell cycle as well as in cell growth and death [3].
The most widely studied and best-established epigenetic mechanism is DNA methylation. It is an enzymatic change in which cytosines are converted to 5-methylcytosine. The cytosine methylation seen in the mammalian genome often occurs at 5′-CpG-3′ dinucleotides, which are also called CpG dinucleotides [4].
Methylation occurs by means of DNA methyltransferase (DNMT) enzymes. The DNMT family consists of four members, namely DNMT1, DNMT2, DNMT3A and DNMT3B. These enzymes are divided into two groups: those that maintain existing methylation and those that add new methyl groups. About 70% of all CpG dinucleotides of the human genome are methylated [5]. The remainder lie mostly in CpG-rich promoter regions of about 200 base pairs or in the first exons of genes. These regions are also called CpG islands and are found in 60% of all genes [6]. CpG methylation is programmed during the early embryonic period and preserved in later periods. CpG methylation is highly important for the normal functions of a given cell, as it affects the regulation of gene expression. For example, DNA methylation plays an important role in gene silencing on the inactive X chromosome as well as in the regulation of age-related or tissue-specific gene expression [7].
Although the structural changes that occur in DNA are usually termed mutations, not every alteration is actually a mutation. A mutation refers to any change at base level, such as purine-to-purine (G-A) or pyrimidine-to-pyrimidine (C-T) changes; single or multiple alterations; insertions, deletions and even single nucleotide polymorphisms (SNPs). Yet, SNPs differ from mutations due to their structure. When methylation is compared with other changes in DNA, the methylation process may be considered another type of mutation, since a chemical change in DNA alters the structure of the base. However, mutations are rare changes compared to methylation, and they may or may not be repaired by DNA repair mechanisms [8]. They can be inherited from any ancestor or parent, and they may also occur as germline changes. SNPs, on the other hand, can be called DNA alterations that are more common in the population and that manifest themselves as susceptibility to disease rather than resulting in a direct disease phenotype. In this respect, methylation is not considered a mutation, despite the fact that it alters the behaviour of cytosine by adding a methyl group to the cytosine of CpG dinucleotides [9,10].
The most suitable sites for this modification are the CpG sequences within the DNA, whose bases pair with their complements more strongly than the bases of A-T pairs do. This is because adenine and thymine are bound to their complementary bases in the opposite strand by two hydrogen bonds, while cytosine and guanine are bound by three. Such binding characteristics are expected to make CpGs more stable than ApTs. This may explain the greater frequency of methylation in CpGs rather than ApTs in the organism.
Methylation usually occurs through the addition of a methyl group to the C base of a CpG sequence, including the CpG sequences within CpG islands. Although such a change may at first appear to be a mutation, it is understood that, unlike mutations, this change is a highly functional mechanism in terms of cellular development and is quite common across living organisms, from bacteria to highly complex multicellular species. In this way, the organism can adapt to environmental changes by changing the activation of the desired genes in response to external influences when necessary, thereby maintaining vitality and survival. During a methylation reaction, 5-methylcytosine is formed when the DNA methyltransferase enzyme adds a methyl group to the fifth carbon of the cytosine in a CpG dinucleotide (Figure 1). Potentially, any CpG dinucleotide or island may undergo methylation. In bacteria, in addition to 5-methylcytosine formation, the N4 position of cytosine and the N6 position of adenine may also be methylated; these modifications are usually not found in multicellular organisms [11].
Genomic imprinting is another example of DNA methylation and is involved in single-allele gene expression. Approximately 80 loci are suppressed in this way. The tissue-specific and condition-specific expression of these genes occurs through the regulation of methylation [12]. At the end of the 1970s, a decrease in methylcytosine content was observed in the genome of tumour cells [13]. This was referred to as hypomethylation of DNA and was demonstrated in benign and malignant tumours [14]. Hypomethylation of DNA may also activate oncogenes. Studies have shown hypomethylation of S100A4, a metastasis-associated gene, in colorectal cancers and of the cyclin-D2 and maspin genes in gastric carcinomas [14,15]. Hypomethylation may cause loss of imprinting (LOI), thereby promoting cell proliferation. One of the best examples of this process is the loss of imprinting in the IGF2/H19 region, which is seen in about 40% of colorectal cancers [1].

Bacterial epigenetic mechanisms
Modulation of chromosome organisation is one of the host defence mechanisms against bacterial attacks in eukaryotes. The host cell can often resist bacteria through these highly specialised and successful defence mechanisms. However, bacteria have also developed mechanisms to counter this system. Some bacteria may contain eukaryote-like proteins and eukaryotic histone translation proteins, which target the chromosomal machinery. In this way, they can activate appropriate enzymes in the host [16].
Several bacteria contain an N4-methylcytosine base whose function has not been fully characterised. There are studies indicating that these N4-methylcytosine modifications affect global gene expression in Helicobacter pylori, an example of a carcinogenic bacterium [17].
Methylation in bacteria differs from that in eukaryotes in that, in addition to the C-5 position of cytosine, it is also seen at the N-4 position of cytosine and at the N-6 position of adenine (N6-methyladenine). DNA methylation occurs in bacteria when DNA methyltransferase enzymes attach a methyl group to the C-5 or N-4 position of cytosine or to the N-6 position of adenine. N6-methyladenine is found only in bacteria and in less complex eukaryotes, and not in vertebrates [11]. Interestingly, bacteria also contain a restriction-modification system, in which a restriction enzyme digests foreign DNA while a DNA methylase marks the cell's own DNA, to provide protection against foreign DNA. Well-known bacterial methylation systems include Dcm, which acts on the C-5 position of cytosine, and Dam, which acts on adenine. Of these, the Dam family is the most well-known protein group. The functional domain of Dam is a DNA MTase with an alpha molecule consisting of a polypeptide of 10 amino acids [18].
Similar to eukaryotes, bacteria also have rRNA methylation. The most important aspect of this methylation is that it provides targets for treating bacterial infections in humans. While promoter methylation is associated with reduced expression, this may not always be the case for exon methylation, and sometimes exon methylation shows no effect on gene expression at all. Some studies of the genetic mechanisms affecting cardiomyocyte differentiation show that intragenic methylation creates cellular memory, particularly in pluripotent cells [11].

Mitochondrial methylation
In eukaryotic cells, the mitochondrion is the most important organelle for cellular energy and, in animal cells, the only organelle apart from the nucleus that contains genomic material. Owing to its unique and small genome, this organelle produces certain proteins and RNAs needed for respiratory reactions and cell growth. Together with the nucleus, it is one of the two genetic systems found in the cell. Mitochondrial DNA (mtDNA) has a circular structure and is located inside the mitochondrial matrix, bound to the inner membrane. The mtDNA consists of 16,569 base pairs in a loop form, containing a heavy chain (H) and a light chain (L). It encodes 2 rRNA molecules, 22 tRNA molecules and 13 genes necessary for oxidative phosphorylation and electron transport (Figure 2). A healthy mitochondrion functions adequately by means of certain proteins of the oxidative phosphorylation machinery. This genome is about 16.5 kb in humans, and 13 proteins and the rRNAs are synthesised from the mitochondrial genome in mammals [19,20]. Therefore, the slightest change in the mitochondrial genome can potentially affect the life of the cell, and thus the organism [21].
As is the case with mutations, methylation is a mechanism that alters the way genes work, and it does so in response to diet, drugs and oxidative stress. The methylation profile of human mtDNA is established from the intrauterine period onwards. With the aid of foetal thyroid hormones, mtDNA copy number and mtDNA methylation are regulated by a thyroid-dependent pathway [23]. In addition, mtDNA is affected by airborne pollutants. Benzene and the elemental carbon present in traffic exhaust may influence the number of mtDNA copies by means of ribosomal RNA methylation [24].
Although these methylation changes in the mitochondrial genome are recognised, the function of methylated mtDNA has not been fully understood. Monique et al., however, have revealed an unexpected pattern: contrary to what would be expected from the methylation of CpGs and GpCs in mtDNA, they demonstrated that methylation of CpG sites had no effect on expression, while methylated GpCs were associated with decreased expression [25].
Furthermore, since the mitochondria in humans are entirely of maternal origin, the lifestyle of the mother may also have effects at the mitochondrial level. The mother's habits, her diet and excessive consumption of fats and sugars can trigger obesity, which may affect epigenetics, including that of mitochondria, in subsequent generations. A study conducted in mice revealed increased methylation leading to alterations in gene expression and suppression, particularly in the respiratory tract of the offspring of mice that were fed high-fat diets [26]. Furthermore, because the structure of the mitochondrion is highly similar to that of bacteria, some genetic factors and structures may also be the same.

Detection of the methylation region
Any DNA region containing a CpG sequence may potentially undergo methylation. For this reason, any gene may be subject to methylation; however, methylation most commonly occurs in the promoter region of genes. That is quite reasonable given that the promoter region is the recognition site for RNA polymerases and is therefore of critical importance for gene expression. Although more rarely, methylation may also be observed in exon 1 and other exons of certain genes. One way of finding out whether a gene or DNA region may undergo methylation is to investigate that region with sequence analysis programs specifically developed for this task. There are paid programs by companies such as Fermentas (Methyl Primer Express® Software v1.0) serving this purpose, as well as free Web-based programs such as 'MethPrimer', developed by Li LC and Dahiya R. (available at http://www.urogene.org/methprimer/) [27]. There is also a Web-based program, 'DiseaseMeth' (available at http://202.97.205.78/diseasemeth/Analyze.html#form3), which lets interested researchers explore which diseases or cancers are associated with the methylation of a given gene. Another Web-based program is SMS (Sequence Manipulation Suite; available at http://www.bioinformatics.org/sms2/) [28]. One of the many useful programs at this address shows the CpG islands in a desired DNA region. A further Web-based program is QUMA (Quantification tool for Methylation Analysis; available at http://quma.cdb.riken.jp/) (Figure 3) [29]. These programs allow methylation region mapping, the design of methylation-specific primers, the determination of bisulphite sequencing primers, the identification of CpG islands and the determination of DNA sequences that are altered or newly formed due to bisulphite modification.
To do this, the initial step is to obtain the FASTA sequence of the region in question. The Ensembl genome browser 92, available at https://www.ensembl.org/, is a very good resource for this step. Ensembl is a database that provides access to DNA sequences, with search tools such as BLAST and BLAT, for comparative genomic studies, evolution studies, sequence variants and transcriptional variants across vertebrate genomes. The relevant DNA sequence obtained from such a database is entered into the methylation primer program, the sites of interest are labelled, and the program is run (Figures 4-8).
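The island-finding step that such programs perform on a FASTA sequence can be illustrated with a minimal sketch. It is not the algorithm of MethPrimer or of any program named above; the function names are hypothetical, and the thresholds (a 200 bp window, GC content of at least 50% and an observed/expected CpG ratio of at least 0.6) are commonly used defaults that are assumed here rather than taken from the text.

```python
def read_fasta(text):
    """Return the sequence of a single-record FASTA string, upper-cased."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return "".join(ln for ln in lines if not ln.startswith(">")).upper()


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")                     # observed CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0      # CpGs expected from base composition
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


def find_cpg_windows(seq, window=200, step=1, min_gc=0.5, min_ratio=0.6):
    """Return 0-based start indices of windows meeting both thresholds.

    The default thresholds are assumptions, not values from the chapter.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = cpg_stats(seq[start:start + window])
        if gc >= min_gc and ratio >= min_ratio:
            hits.append(start)
    return hits
```

Overlapping windows returned by `find_cpg_windows` would normally be merged into a single candidate island before primers are designed; that merging step is left out to keep the sketch short.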

Methylation-specific PCR (MSP)
MSP is an established and the most commonly utilised method to determine the presence or absence of methylation in a gene region of interest as well as the extent of methylation, if any [30,31].
In this method, the DNA is first subjected to bisulphite treatment. This converts all of the unmethylated cytosines in the DNA sequence to uracil, which is read as thymine after PCR amplification, while the methylated cytosines remain unchanged (Figure 5). This results in a motif change between methylated and unmethylated regions. Subsequently, the DNA samples are quantified spectrophotometrically. For this, absorbance is measured at wavelengths of 260 and 280 nm, and the 260 nm reading is multiplied by the dilution factor to determine the amount of DNA in nanograms per microlitre. According to these amounts, MS-PCR is performed, taking care to include equal amounts of DNA from each sample in the PCR. In this type of PCR, the reaction is run in two separate tubes for each sample. The primer pair specific for the methylated sequence is added to one tube and the primer pair for the unmethylated sequence to the other, and PCR is performed for 35-40 cycles. The PCR products are then visualised on an ethidium bromide-stained agarose gel with a gel imaging system.
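The sequence change that the two primer pairs exploit can be simulated in a few lines. The following is a minimal sketch for one strand only; the function name and the way methylated positions are passed in are hypothetical choices made for illustration.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Simulate the sequence read after bisulphite treatment and PCR.

    seq: top-strand DNA sequence (A/C/G/T).
    methylated_positions: 0-based indices of cytosines carrying a methyl group.
    Unmethylated C is reported as T (uracil is amplified as thymine);
    methylated C stays C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical 8-base example with cytosines at indices 1, 4 and 5.
example = "ACGTCCGA"
print(bisulfite_convert(example))            # no methylation
print(bisulfite_convert(example, {1, 5}))    # both CpG cytosines methylated
```

The methylated and unmethylated versions differ only at the cytosines of the CpG sites, which is why primers for the two tubes are placed over those sites.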

Sample reaction: (for the prestin gene promoter region in guinea pigs)
Methylation-specific polymerase chain reaction: DNA purity was assessed spectrophotometrically at wavelengths of 260 and 280 nm, and DNA was quantified from the absorbance at 260 nm (UV) using the formula DNA (μg/mL) = A260 × Dilution Factor × 50 (coefficient). Subsequently, 10 μL of DNA was taken from each sample, and bisulphite modification was carried out with the Millipore CpGenome modification kit according to the manufacturer's instructions. This modification converts the cytosines of unmethylated regions to uracil, which is read as thymine after PCR. For the region thought to be altered in this manner, CpG sequences in exon 1 of the prestin gene were detected using the MethPrimer V1.1 beta program [30]. MSP was then conducted to investigate methylation using PCR primers for exon 1 of the prestin gene, with separate PCR conditions for the methylated and unmethylated regions; the reactions were prepared in a total volume of 50 μL. This investigation allows determining whether the region of interest is methylated and quantifying methylation by measuring the band intensity with any gel analysis system. If desired, results may also be obtained by real-time PCR with SYBR Green, TaqMan or similar fluorescent probes (the probe must be designed according to the bisulphite-converted DNA sequence).
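The quantification formula above is simple enough to script. This minimal sketch applies it and also computes the volume needed to load the same mass of DNA from each sample; the function names and the example readings are hypothetical.

```python
def dna_concentration_ug_per_ml(a260, dilution_factor, coefficient=50.0):
    """DNA (ug/mL) = A260 x dilution factor x 50, as in the formula above.

    The result is numerically the same in ng/uL, since 1 ug/mL equals 1 ng/uL.
    """
    return a260 * dilution_factor * coefficient


def purity_ratio(a260, a280):
    """Ratio of the 260 nm to the 280 nm absorbance reading."""
    return a260 / a280


def volume_for_mass_ul(target_ng, concentration_ng_per_ul):
    """Volume (uL) that contains target_ng of DNA at the given concentration."""
    return target_ng / concentration_ng_per_ul


# Hypothetical reading: A260 = 0.1 on a 1:50 dilution.
conc = dna_concentration_ug_per_ml(0.1, 50)
print(conc, volume_for_mass_ul(500, conc))
```

Computing the loading volume per sample in this way is one route to the equal DNA input that the method calls for.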

Evaluation of agarose gel imaging
The resulting PCR products were run on a 2% agarose gel stained with ethidium bromide, and the gel was evaluated by examination under ultraviolet light (Figures 7-10).

MS-RFLP
Another method used to detect DNA methylation is methylation-sensitive restriction fragment length polymorphism (MS-RFLP), which relies on methylation-dependent digestion. This method employs restriction enzymes obtained from bacteria, which recognise specific CpG-containing sequences. MspI digests its recognition site in the DNA region of interest whether or not the CpG is methylated, whereas HpaII, an isoschizomer of MspI, digests the same site only if it is unmethylated. Concurrent use of these two enzymes therefore provides insight into the methylation status of the region in question.
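The logic of comparing the two digests can be sketched in silico. The sketch assumes the standard properties of this enzyme pair, which the text does not spell out: both recognise CCGG and cut between the two cytosines, MspI cuts regardless of methylation of the internal cytosine, and HpaII cuts only when that cytosine is unmethylated. The function names and the test sequence are hypothetical.

```python
SITE = "CCGG"  # recognition site shared by MspI and HpaII (assumed, see lead-in)


def find_sites(seq):
    """Return the 0-based start index of every CCGG site."""
    seq = seq.upper()
    sites = []
    i = seq.find(SITE)
    while i != -1:
        sites.append(i)
        i = seq.find(SITE, i + 1)
    return sites


def digest(seq, enzyme, methylated_cpg=()):
    """Return fragment lengths after digestion with 'MspI' or 'HpaII'.

    methylated_cpg: 0-based indices of methylated internal cytosines,
    i.e. the C of the CpG inside CCGG.
    """
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError("enzyme must be 'MspI' or 'HpaII'")
    methylated = set(methylated_cpg)
    cuts = []
    for start in find_sites(seq):
        internal_c = start + 1
        blocked = enzyme == "HpaII" and internal_c in methylated
        if not blocked:
            cuts.append(start + 1)  # cut between the two cytosines: C^CGG
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


region = "AAAACCGGTTTT"  # hypothetical 12 bp region with one CCGG site
print(digest(region, "MspI", {5}), digest(region, "HpaII", {5}))
```

A site that is cut by MspI but left intact by HpaII is read as methylated; identical fragment patterns from both enzymes indicate an unmethylated site.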

COBRA (combined bisulphite restriction analysis)
COBRA is a method developed by combining bisulphite modification with RFLP. In this combined method, the DNA is first treated with sodium bisulphite, which introduces methylation-dependent differences into the DNA sequence. As mentioned earlier, methylated cytosines are not affected, while unmethylated cytosines are converted into uracil. PCR is then performed with primers designed specifically for the new DNA sequences obtained by the bisulphite method. Unlike in MS-PCR, the primers used in this PCR step should not contain CpG sequences. This step is followed by restriction digestion, in which the PCR products are treated with two restriction enzymes, TaqI (TCGA) and BstUI (CGCG). These enzymes produce a methylation profile by cutting the DNA fragments according to whether the cytosine residues are methylated or not (Figure 11).
Because both recognition sites contain CpG, they are retained after bisulphite conversion only where the cytosines were methylated; PCR products derived from methylated DNA are therefore digested, whereas the unmethylated counterpart of the same region remains uncut. The fragments formed according to methylation state are separated by polyacrylamide gel electrophoresis, and the methylation rate of the investigated region is calculated as a percentage from the pattern and density of the resulting bands. Band densities should be determined with an image analysis program (e.g., Figure 12). Once these values have been obtained, the percentage of methylation can be calculated by substituting them into the following formula [33,34] (Figure 13).
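The formula itself appears only in Figure 13 and is not reproduced in the text. A commonly used form, assumed here, divides the summed intensity of the digested bands by the total intensity of digested and undigested bands, taking the digested bands to represent the methylated fraction. The sketch below uses a hypothetical function name and made-up band intensities.

```python
def percent_methylation(digested_intensities, undigested_intensity):
    """Percentage of methylation from band densities.

    Assumed form: 100 x digested / (digested + undigested), where the
    digested bands are taken to come from methylated DNA.
    """
    digested = sum(digested_intensities)
    total = digested + undigested_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * digested / total


# Hypothetical densitometry values in arbitrary units.
print(percent_methylation([30.0, 30.0], 40.0))
```

Since incomplete digestion lowers the digested signal, a value computed this way is best read as a lower bound unless a fully digested control is run alongside.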

Advantages
It is simple, cheap and fast. DNA methylation levels can be shown without the need for additional bisulphite sequencing.
Due to its high specificity, it is very successful even with DNA obtained from paraffin blocks.
It is quantitative, whereas methylation-specific PCR is a qualitative method.
In MS-PCR, only locus-specific methylation information can be obtained, whereas in this method, the entire region within the locus is examined.

Disadvantages
This method is limited to the restriction sites that exist in the region being investigated.
In addition, incomplete digestion by the restriction enzymes can give misleading results for the amount of methylation.
Due to cell-type heterogeneity in mixed cell populations, a methylated CG sequence may be transformed into other sequences, such as CA or CT, leading to a change in restriction sites.
Considering all these advantages and disadvantages, COBRA emerges as an effective method for determining the level of methylation.

Conclusion
Many studies since the first discovery of epigenetic changes have shown that epigenetic modifications are quite important in the natural flow of life and that they serve as mechanisms for expressing and inactivating many genes when needed.
Furthermore, as these mechanisms have become better understood, they have been associated with many pathological conditions, from cancer to intellectual disability syndromes such as fragile X, Prader-Willi and Angelman, to chromosomal instability. In this regard, they have emerged as new therapeutic targets in such diseases. Chemical agents such as 5-azacytidine and 5-aza-2′-deoxycytidine, which inhibit methylation by binding to and inhibiting the DNMT enzyme, are now being tested in phase II-III studies [35-38]. Furthermore, the use of oligonucleotides that bind to promoter regions of specific genes and inhibit those genes is seen as an approach that may contribute to cancer treatment [39-41].
In addition, histone acetyltransferase inhibitors, which inhibit the formation of epigenetic modifications at the histone level, have emerged as novel cancer treatment agents. For example, Lunasin, a soy protein, has been found to exhibit anticancer properties in mammals by suppressing the acetylation of histones H3 and H4. Wenyi et al. have shown that the YEATS domain, an acetyl-lysine-binding module, is involved in the development of cancer, and this domain appears to be a target for anticancer therapies. It seems that such approaches will start to give more successful results in the future [42].
In light of the information presented above, one may conclude that methylation is a highly important genomic mechanism for the cell from unicellular organisms to multicellular organisms. This mechanism is seen in bacteria, mitochondria and all eukaryotic cells in proportion to the complexity and development level of the organism. The identification of methylated regions is as important as the methylation process itself, as this may allow identifying several potential novel targets related to subject matters such as the development mechanism of diseases, certain roles in cancer development and bacterial resistance.