RNA polymerase function in eukaryotes.
1. Gene expression
Gene expression is the process by which genetic information is used to synthesize functional products, including proteins and functional RNAs (e.g., tRNA, small nuclear RNA, microRNA, and small interfering RNA). All organisms, including eukaryotes and prokaryotes, as well as viruses, use gene expression to produce the macromolecular machinery for life. By controlling cell structure and function, gene expression plays an important role in cellular differentiation, morphogenesis, adaptability, and diversity. Because control of the timing, location, and level of gene expression can strongly affect gene function in a single cell or a multicellular organism, gene regulation may also drive evolutionary change. Several steps of gene expression can be regulated: transcription, posttranscriptional modification (e.g., RNA splicing, 3′ polyadenylation, and 5′ capping), translation, and posttranslational modification (e.g., protein splicing, folding, and processing).
1.1 Transcription
Genomic DNA is composed of two antiparallel strands, each with a 5′ and a 3′ end, that are reverse complements of each other. With regard to a given gene, the two DNA strands are classified as the “coding strand (sense strand),” which carries the DNA version of the RNA transcript sequence, and the “template strand (antisense strand, noncoding strand),” which serves as the blueprint for synthesizing the RNA strand. Transcription is the first step of gene expression: a DNA sequence is copied into an RNA molecule, such as messenger RNA (mRNA), ribosomal RNA (rRNA), or transfer RNA (tRNA), by complementary base pairing. During transcription, an RNA polymerase reads the template strand to produce a complementary, antiparallel primary RNA transcript. In prokaryotes, a single type of RNA polymerase synthesizes all RNA molecules. In eukaryotes, transcription is carried out mainly by RNA polymerases I, II, and III, together with RNA polymerases IV and V in plants, to make rRNA, mRNA, tRNA, and other RNAs (Table 1).
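The base-pairing rule behind transcription can be illustrated with a minimal Python sketch. The function names and the nine-base template are hypothetical and chosen only for illustration; the sketch assumes the template strand is written 3′ to 5′, so that pairing base by base yields the RNA in the 5′ to 3′ direction.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Pairing between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') specified by a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (sense) strand, written 5'->3', for the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"            # hypothetical template strand, 3'->5'
rna = transcribe(template)
coding = coding_strand(template)

# The transcript matches the coding strand except that U replaces T.
print(rna)
print(coding.replace("T", "U") == rna)
```

The final comparison shows why the coding strand is described as the DNA version of the transcript: the two sequences differ only in the substitution of uracil for thymine.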
|RNA polymerase|Function|
|---|---|
|RNA polymerase I|Synthesis of precursor 45S rRNA (35S in yeast), which matures into the 28S, 18S, and 5.8S rRNAs that form the major RNA components of the ribosome|
|RNA polymerase II|Synthesis of mRNA precursors and of most snRNA and miRNA|
|RNA polymerase III|Synthesis of tRNAs, 5S rRNA, and other small RNAs found in the nucleus and cytosol|
|RNA polymerase IV|Synthesis of siRNA in plants|
|RNA polymerase V|Synthesis of RNAs involved in siRNA-directed heterochromatin formation in plants|
1.2 Posttranscriptional modification
Posttranscriptional modification refers to the biological processes by which primary RNA transcripts are chemically altered after transcription in eukaryotes. It changes the chemical structure of the RNA molecule in three main steps: addition of a 5′ cap, addition of a 3′ polyadenylated (poly-A) tail, and RNA splicing. The primary transcript must be processed because the precursor mRNA usually contains both exons (coding sequences) and introns (noncoding sequences). RNA splicing removes the introns and joins the exons directly, which ensures correct translation of eukaryotic genes. In addition, the 5′ cap and poly-A tail protect the transcript from degradation and facilitate transport of the mRNA to ribosomes.
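The three processing steps can be sketched on a string representation of a transcript. This is a simplified illustration, not a model of the spliceosome: the intron coordinates are supplied by hand, the sequence is hypothetical, and the `m7G-` prefix is only a text label standing in for the 5′ cap.

```python
def process_pre_mrna(pre_mrna: str, introns: list[tuple[int, int]], tail_len: int = 20) -> str:
    """Splice out introns, then mark a 5' cap and append a poly-A tail.

    `introns` holds 0-based, end-exclusive (start, end) spans in the pre-mRNA.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # "m7G-" is a text label for the cap, not a sequence symbol.
    return "m7G-" + spliced + "A" * tail_len


# Hypothetical transcript: exon 1, one intron (starts with GU, ends with AG), exon 2.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "UUUUAA"
pre_mrna = exon_1 + intron + exon_2
intron_span = (len(exon_1), len(exon_1) + len(intron))

mature = process_pre_mrna(pre_mrna, [intron_span], tail_len=5)
print(mature)
```

Passing several spans removes several introns in one call, which mirrors how a multi-exon transcript is reduced to a continuous coding sequence.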
1.3 Translation
Every mRNA is composed of a 5′ untranslated region (5′UTR), a protein-coding region or open reading frame (ORF), and a 3′ untranslated region (3′UTR). The ORF carries the information for protein synthesis, encoded as RNA triplets. Each codon, an RNA triplet in the coding region, is complementary to an anticodon triplet in a tRNA. tRNAs with different anticodons carry different amino acids, so that, with the assistance of ribosomes, amino acids are linked together in the order of the triplets in the coding region.
In prokaryotes, translation occurs in the cytosol, where the ribosomal subunits assemble on the mRNA and aminoacyl-tRNAs deliver the amino acids. Translation proceeds simultaneously with transcription (co-transcriptionally), using an mRNA that is still being synthesized. In eukaryotes, transcription occurs in the nucleus and translation occurs in the cytoplasm, so the two processes are not coupled. Although translation can occur in several regions of the eukaryotic cell, its major sites are the cytosol for soluble cytoplasmic proteins and the endoplasmic reticulum (ER) membrane for secretory and integral membrane proteins.
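The codon-to-amino-acid mapping can be illustrated with a short Python sketch. The codon table below is deliberately partial (the full genetic code has 64 codons), the mRNA is hypothetical, and the anticodon function ignores wobble pairing; all three are simplifying assumptions.

```python
# Partial codon table: only the codons needed for this example.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "UUU": "Phe", "GGA": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")               # bases before this belong to the 5'UTR
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons outside the partial table
        if residue == "Stop":
            break                          # bases after the stop belong to the 3'UTR
        peptide.append(residue)
    return peptide


def anticodon(codon: str) -> str:
    """Return the tRNA anticodon, written 5'->3', that pairs with an mRNA codon."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


mrna = "GGAUGGCCUUUGGAUAAGC"   # hypothetical: 5'UTR "GG", ORF, 3'UTR "GC"
print(translate(mrna))
print(anticodon("AUG"))
```

Because codon and anticodon pair in antiparallel orientation, the anticodon is the reverse complement of the codon when both are written 5′ to 3′.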
1.4 Posttranslational modification
Posttranslational modification refers to the chemical changes that proteins may undergo after translation. The main alterations are the specific cleavage of precursor proteins, the excision of signal peptides, the formation of disulfide bonds, the covalent addition or removal of low-molecular-weight groups, and the addition of metal ions. These give rise to protein modifications such as glycosylation, lipidation, hydroxylation, methylation, mono-ADP-ribosylation, myristoylation, oxidation, palmitoylation, and phosphorylation. Posttranslational modification plays a crucial role in regulating protein folding, targeting to specific subcellular compartments, interaction with ligands or other proteins, and functional state, such as the catalytic activity of enzymes or the signaling function of proteins in signal transduction pathways.
2. Reverse transcription
Reverse transcription is the synthesis of complementary DNA (cDNA) from an RNA template by reverse transcriptase (RTase). RTase uses an RNA template and a short RNA primer complementary to the 3′ end of the RNA to direct synthesis of a cDNA strand in the 5′ to 3′ direction. The first-strand cDNA can be converted to double-stranded DNA using DNA polymerase I, which requires ribonuclease (RNase) H activity to digest the RNA primer. Engineered RTases can improve the yield of full-length cDNA, ensure that the mRNA transcript is copied completely, and enable propagation of a faithful DNA copy of an RNA sequence. A thermostable RTase is particularly helpful when the RNA contains extensive secondary structure, because the reaction can be run at a higher temperature that denatures this structure. Retroviruses (e.g., human immunodeficiency virus, HIV), which have an RNA genome, transcribe their RNA into cDNA using RTase. RNase H digests the RNA strand, and the cDNA strand then serves as the template for a complementary DNA strand, forming a DNA double helix. The resulting double-stranded DNA (dsDNA) can be integrated into the host genome, causing the host cell to produce viral proteins that assemble into new virions. In HIV infection, the host cell subsequently undergoes apoptosis (programmed cell death).
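The two synthesis steps described above can be sketched at the sequence level. The function names and the example RNA are hypothetical; the sketch assumes perfect copying and ignores primers, RNase H digestion, and enzyme kinetics.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna_5_to_3: str) -> str:
    """Return first-strand cDNA (5'->3'): the DNA reverse complement of the RNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3.upper()))


def second_strand(cdna_5_to_3: str) -> str:
    """Return the second DNA strand (5'->3') copied from the first-strand cDNA."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna_5_to_3.upper()))


rna = "AUGCCGUAA"                  # hypothetical RNA template, 5'->3'
cdna = first_strand_cdna(rna)
sense_dna = second_strand(cdna)

print(cdna)
# The second strand carries the RNA sequence in DNA form (T in place of U).
print(sense_dna == rna.replace("U", "T"))
```

The second strand reproduces the original RNA sequence as DNA, which is why the double-stranded product is a faithful DNA copy of the RNA.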
Telomerase (telomere terminal transferase, an RNA-directed DNA polymerase) is an RTase that lengthens the ends of linear chromosomes in some eukaryotes. It contains an RNA template from which it synthesizes a repeating DNA sequence, sometimes described as “junk” DNA. It is active in stem cells, gametes, and most cancer cells, enabling these cells to replicate their genomes indefinitely without losing important protein-coding DNA sequence. Embryonic stem cells express telomerase at high levels, which allows them to divide repeatedly and form the individual; in adults, telomerase is highly expressed in cells that divide regularly, such as male sperm cells, epidermal cells, and activated T and B lymphocytes. However, telomerase is usually absent from, or present at very low levels in, most somatic cells. The repeated DNA sequence (the telomere) can be considered a “cap” for the chromosome. In normal cells, the telomere shortens each time a linear chromosome is duplicated. Activation of telomerase is one of the processes that make cancer cells immortal: it allows each daughter cell to avoid losing DNA from its chromosome ends, so that cells divide without limit, and unbounded cell growth is a characteristic of cancer.
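The contrast between cells with and without telomerase can be made concrete with a toy model. Every number below (initial telomere length, loss per division, critical length) is a hypothetical value chosen for illustration, not a measurement, and the assumption of a constant loss per division is a simplification.

```python
from typing import Optional


def divisions_until_limit(
    telomere_bp: int,
    loss_per_division: int,
    critical_bp: int,
    telomerase_gain: int = 0,
) -> Optional[int]:
    """Count divisions possible before the telomere would drop below `critical_bp`.

    Returns None when telomerase fully offsets the loss, i.e. no limit is reached.
    """
    net_loss = loss_per_division - telomerase_gain
    if net_loss <= 0:
        return None                      # telomere length is maintained indefinitely
    divisions = 0
    while telomere_bp - net_loss >= critical_bp:
        telomere_bp -= net_loss          # each duplication shortens the telomere
        divisions += 1
    return divisions


# Hypothetical somatic cell: no telomerase activity.
print(divisions_until_limit(10_000, 100, 5_000))
# Hypothetical telomerase-positive cell: each division's loss is restored.
print(divisions_until_limit(10_000, 100, 5_000, telomerase_gain=100))
```

In this sketch the somatic cell reaches a finite division limit, whereas the telomerase-positive cell never does, which is the sense in which telomerase activity supports unlimited proliferation.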
3. Epigenetics
Epigenetics is the study of heritable phenotype changes that do not involve alterations in the DNA sequence, that is, a change in phenotype without a change in genotype. The term also refers to functionally relevant changes to the genome that do not involve a change in the nucleotide sequence. At present, DNA methylation, histone modification, and noncoding RNA (ncRNA)-associated gene silencing are the major mechanisms known to initiate and sustain epigenetic changes. Such changes affect gene activity and expression and can produce phenotypic changes that are passed on to offspring.
Epigenetic changes can be influenced by several factors, including age, environment, lifestyle, and disease, and they are traditionally considered to be regular and natural. They may persist through cell divisions during the life of a cell and may also last for several generations, even though the organism’s underlying DNA sequence is unchanged. Cellular differentiation is an example of epigenetic change in eukaryotic biology.
3.1 The evolving landscape of epigenetic research
In 1969, Griffith and Mahler first suggested that DNA methylation might play a crucial role in long-term memory function. DNA methylation has since become one of the most extensively studied and best-characterized epigenetic modifications. The other main mechanisms include chromatin remodeling, histone modifications, and noncoding RNA mechanisms. Recent findings link epigenetic changes to diseases such as cancers, intellectual disability, immune disorders, neuropsychiatric disorders, and pediatric disorders.
3.2 Environment and lifestyle can influence epigenetic change from one generation to the next
Both the environment and individual lifestyle can interact directly with the genome to produce epigenetic changes. These changes may appear at various stages of an individual’s life and may even persist in their offspring. Human epidemiological studies indicate that prenatal and early postnatal environmental factors can influence the adult risk of various chronic diseases and behavioral disorders. Children whose mothers were exposed to famine during early pregnancy have elevated rates of coronary heart disease and obesity compared with children of unexposed mothers. Similarly, adults who were prenatally exposed to famine have been reported to have a significantly higher incidence of schizophrenia. In addition, maternal exposure to air pollution could affect a child’s susceptibility to asthma. On the other hand, maternal intake of vitamin D can modify DNA methylation in ways that affect placental function.
3.3 Environment and lifestyle affect individual epigenetics and health
Epigenetic marks are considered to be dynamic and responsive to lifestyle choices and environmental factors, although they are more stable during adulthood. Epigenetic effects occur both in the womb and over the full course of the human life span, and some epigenetic changes can be reversed. Epigenetic studies have shown that different lifestyle choices and environmental exposures can change DNA marks and play a vital role in determining health outcomes. The environment can strongly influence epigenetic tags and disease susceptibility. Pollution has become a significant topic because air pollution has been found to induce changes in DNA methylation and may increase the risk of neurodegenerative disease. B vitamins may protect against the harmful epigenetic effects of pollution and against its other harmful effects on the body.
3.4 Deciphering the relationship between epigenetics and diseases
Chronic pancreatitis carries a high risk of inflammation-associated progression to pancreatic cancer. The high mortality rate of pancreatic cancer is closely associated with the difficulty of diagnosing it rapidly. Previous studies have demonstrated that cell-free DNA methylation varies in inflammatory diseases and cancer, which opens a new avenue for developing biomarkers for early diagnosis. Hence, early diagnosis of pancreatic cancer is crucial and motivates studies of the relevant epigenetic profiles. Natale et al. reported that exploiting the relationship between abnormally methylated cell-free DNA and pre-neoplastic lesions or chronic pancreatitis may become a novel approach to developing tools for the early diagnosis of pancreatic cancer. Early diagnosis could make it possible to predict prognosis, monitor tumor progression, and develop effective therapeutic strategies, and could provide precision medicine for patients with pancreatic disease.
The early-life environment, including air quality, is known to be critical for fetal programming. Maternal exposure to air pollution during pregnancy may adversely influence newborn outcomes such as birth weight and preterm birth. It is therefore necessary to understand both the early health effects induced by air pollution and their later-life consequences. Saenen et al. provided an overview of air pollution-induced placental molecular changes observed in the ENVIRONAGE birth cohort and assessed the existing evidence. They reported that nitrosative stress and epigenetic alterations in the placenta may result from prenatal exposure to air pollution. It is crucial to determine the clinical consequences of early-life epigenetic changes through follow-up of child and birth cohort studies. Public health policy makers should understand the epigenetic consequences and transgenerational risks in order to propose effective strategies for protecting pregnant women, unborn children, and infants against exposure to adverse lifestyle factors.
4. Phenotypic traits
The genotype, the unique genome that can be revealed by genomic sequencing, is an individual’s complete heritable genetic identity. The term can also refer to a particular gene, cluster of genes, or set of genes carried by an individual. Genes are DNA segments that code for the production of proteins and thereby determine distinct traits of individuals. DNA carries the genetic information that directs cellular functions such as mitosis, meiosis, DNA replication, protein synthesis, and molecular transport. Each gene is located at a specific region of a chromosome and can exist in different forms called alleles. Alleles are transmitted from parents to offspring through sexual reproduction. A diploid organism inherits two alleles for each gene, one from the father and one from the mother. The interactions between alleles determine an organism’s phenotype.
The phenotype is a description of an organism’s actual physical characteristics. It encompasses directly visible characteristics such as height, weight, skin color, eye color, size, and shape, as well as health condition, disease history, and even behavior and temperament. However, not all phenotypes are a direct result of genotype. Most phenotypes are affected both by the genotype and by the environment in which the individual has lived, including everything that has happened to that individual. We often refer to the unique genome that one carries as “nature” and to the environment in which one has lived as “nurture.” The phenotype depends on the genetic makeup of the organism and is influenced by the environment to which the organism is subjected, acting through various epigenetic processes. It includes all of the organism’s characteristics at multiple levels of biological organization, ranging from individual behavior and trait evolution through morphology, physiology, cellular characteristics, biochemical pathways, and gene expression.
A phenotypic trait (also simply called a trait or character state) is a distinct variant of an organism’s characteristic that is expressed in an observable and measurable way. For example, eye color (green, blue, brown, or hazel) is a phenotypic trait with polygenic inheritance. A phenotypic trait may be inherited from the parents or determined by the environment; that is, some traits are determined by the genotype and some by environmental factors. Different genes or alleles arising from mutation can be passed on to successive generations, resulting in different phenotypic traits. Although the environment can affect the phenotype, the heritability of a phenotypic trait is defined as the proportion of the total phenotypic variation of that trait that is explained by genetic variation alone. The phenotypic variation of a trait (P) can be divided into three contributions: latent genetic (G) factors, environmental (E) factors, and gene–environment interactions (G × E).
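This partition can be written out in variance terms. The notation below is the standard quantitative-genetics convention rather than notation taken from the text: V denotes a variance component, H² denotes broad-sense heritability, and the expression assumes no covariance between genotype and environment.

```latex
\[
V_P = V_G + V_E + V_{G \times E},
\qquad
H^2 = \frac{V_G}{V_P}
\]
```

As a purely hypothetical illustration, if V_G = 6, V_E = 3, and V_{G×E} = 1 in arbitrary units, then V_P = 10 and H² = 6/10 = 0.6, meaning that 60% of the phenotypic variation in that trait would be attributed to genetic variation.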
If an organism inherits two identical alleles, it is homozygous and expresses only one phenotypic trait. If an organism inherits two different alleles, it is heterozygous and may express more than one phenotypic trait. Phenotypic traits can be dominant or recessive. In complete dominance, only the dominant trait is observable because its phenotype entirely masks that of the recessive trait. In incomplete dominance, by contrast, the dominant allele does not completely mask the other allele, resulting in a phenotype that is an intermediate blend of the traits from both alleles. In codominance, both alleles are fully expressed, resulting in a phenotype in which both traits appear independently.
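The three inheritance patterns can be compared with a small Python sketch of a single-gene cross. The allele symbols (`A`, `a`) and the trait labels (red, white, pink) are hypothetical placeholders; the sketch assumes one gene with two alleles and equal transmission of each parental allele.

```python
from collections import Counter
from itertools import product

TRAITS = {"A": "red", "a": "white"}   # hypothetical trait labels for the two alleles


def cross(parent_1: str, parent_2: str) -> Counter:
    """Punnett square: count offspring genotypes for two diploid parents."""
    return Counter("".join(sorted(a + b)) for a, b in product(parent_1, parent_2))


def phenotype(genotype: str, mode: str) -> str:
    """Map a two-allele genotype to a phenotype under the given dominance mode."""
    first, second = genotype
    if first == second:
        return TRAITS[first]               # homozygous: a single trait is expressed
    if mode == "complete":
        return TRAITS["A"]                 # dominant allele masks the recessive one
    if mode == "incomplete":
        return "pink"                      # intermediate blend of both traits
    if mode == "codominant":
        return "red and white"             # both traits appear independently
    raise ValueError(f"unknown dominance mode: {mode}")


def phenotype_counts(parent_1: str, parent_2: str, mode: str) -> Counter:
    """Count offspring phenotypes for a cross under one dominance mode."""
    counts = Counter()
    for genotype, n in cross(parent_1, parent_2).items():
        counts[phenotype(genotype, mode)] += n
    return counts


for mode in ("complete", "incomplete", "codominant"):
    print(mode, dict(phenotype_counts("Aa", "Aa", mode)))
```

The genotype counts are the same in all three cases; only the mapping from the heterozygous genotype to a phenotype changes, which is the distinction the paragraph above draws.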
5. Conclusions
In eukaryotes, prokaryotes, and viruses, the function of gene expression is to generate the macromolecules that make up cellular components and carry out the functions of life. The process can be regulated at several steps, including transcription, posttranscriptional modification, translation, and posttranslational modification. Regulation of gene expression modulates both the amount and the timing of functional product synthesis. This control allows cells to manufacture functional products when they need them, which in turn gives cells the flexibility to adapt to a variable environment and to respond to external signals, stimuli, and damage. Cellular structures and functions are also controlled through the regulation of gene expression. Consequently, regulation is crucial for cells to proliferate, differentiate, transport, metabolize, and repair, and it benefits the versatility, development, and adaptability of living organisms. A phenotypic trait is the expression of genes in an observable and measurable way. The phenotype varies with the genetic makeup of the organism and is also influenced by the surroundings to which the organism is subjected throughout its morphogenesis, including various epigenetic processes. The significance, mechanisms, functions, and characteristics of gene expression and phenotypic traits therefore deserve attention. Recent findings and their implications should be discussed in the broadest possible context, and future studies, analyses, and applications of gene expression and phenotypic traits should also be highlighted.
References
Natale F, Vivo M, Falco G, Angrisano T. Deciphering DNA methylation signatures of pancreatic cancer and pancreatitis. Clinical Epigenetics. 2019;11(1):132. DOI: 10.1186/s13148-019-0728-8
Saenen ND, Martens DS, Neven KY, Alfano R, Bové H, Janssen BG, et al. Air pollution-induced placental alterations: An interplay of oxidative stress, epigenetics, and the aging phenotype? Clinical Epigenetics. 2019; 11(1):124. DOI: 10.1186/s13148-019-0688
Martin TC, Bell JT, Spector TD. Twin studies and epigenetics. In: International Encyclopedia of the Social & Behavioral Sciences. 2nd ed. Elsevier, London, UK. 2015. pp. 683-702. DOI: 10.1016/B978-0-08-097086-8.82051-6