Endogenous viral elements (EVEs) are heritable sequences in eukaryotic genomes that originated from viral nucleotide sequences. EVEs are subdivided into two groups according to the presence or absence of long terminal repeats (LTRs). EVEs with LTRs are called endogenous retroviruses (ERVs), and they account for approximately 8% of the human genome. EVEs without LTRs seem to be related to non-reverse-transcribing RNA and DNA viruses, and recent studies have revealed that numerous vertebrate genomes contain these non-LTR EVEs. Such EVEs are proposed to play essential roles in gene expression. EVEs can regulate gene expression as cis-regulatory DNA and RNA elements. EVE-derived non-coding RNAs and/or proteins can also influence cell transcriptomes in trans. To maintain cell integrity, cells epigenetically silence the expression of most EVEs, making these elements generally biochemically inert. These epigenetic alterations around EVE loci can also affect host transcriptomes. Here, we highlight current knowledge of the regulatory activities of ERVs and non-retroviral EVEs, especially the EVEs derived from bornaviruses, which are known as endogenous bornavirus-like elements (EBLs). Better knowledge of this area will improve our understanding of gene regulation and of the co-evolution of viruses and their hosts.
- endogenous viral sequences
- long terminal repeats
Various viruses appear to have left heritable sequences, called endogenous viral elements (EVEs), in eukaryotic genomes. EVEs are distinguished by the presence or absence of long terminal repeats (LTRs). EVEs with LTRs are called endogenous retroviruses (ERVs). The LTRs contain
EVEs use various mechanisms to regulate gene expression. First, genomic EVEs can regulate gene expression as
2. The influence of ERVs on gene expression
The exogenous retroviral genome contains the following genes:
2.1. Gene regulation by ERVs as regulatory DNAs
The LTRs of human ERVs (HERVs) have strong Pol II regulatory sequences [15, 16] and contain abundant transcription factor binding sites that function as promoters for HERV expression. Although a full-length HERV is considered to have two LTRs, up to 85% of HERVs have undergone recombinational deletion, leaving most HERV loci as solo LTRs. Solo LTRs can still serve as promoters in both the sense and antisense orientations and influence gene expression [19, 20]. For example,
2.2. Gene regulation by ERV proteins
The expression products of HERVs can also affect the physiological functioning and development of the host’s tissues. For example, HERV-W (ERVWE1), HERV-FRD, and ERV-3 are three HERVs whose intact
2.3. Gene regulation by HERV-driven lncRNAs
lincRNA-RoR is a large intergenic non-coding RNA driven by HERV-H. lincRNA-RoR modulates reprogramming and is expressed at much higher levels in the embryonic stem cell line H1-hESC and in human induced pluripotent stem cells than in any other tissue or cell line [36, 37]. Knockdown of lincRNA-RoR affects the expression of other stem cell factors such as
2.4. Gene regulation by epigenetic modification of ERVs
In addition to the abovementioned roles, LTRs are important sites for the epigenetic modifications that restrict HERVs in the human genome. DNA methylation, which is carried out by DNA methyltransferases, histone methylation, and histone deacetylation are the major host mechanisms used for gene silencing [40, 41]. Indeed, HERVs are heavily methylated in normal tissues. By contrast, histone deacetylation alone is not sufficient to repress HERV expression. Rather, histone deacetylation in combination with other epigenetic modifications, particularly DNA methylation, is required for sufficient silencing of HERVs. Furthermore, histone demethylation, which is carried out by lysine-specific histone demethylases (KDMs), also silences HERV expression [44, 45]. All these epigenetic alterations to ERV loci can affect the expression of nearby genes. For example, MuERV-L/MERVL, a mouse ERV, is repressed by a KDM1A-mediated epigenetic modification. Some zygotic genome activation (ZGA) genes use an LTR of MERVL as a promoter or contain an MERVL element within 5 kb of their transcriptional start sites. These ERV-linked ZGA genes become de-repressed in KDM1A mutant cells, which coincides with an expanded cell fate potential. Thus, KDM1A recruitment to the MERVL LTRs seems to alter the chromatin structure around the loci, which in turn suppresses the expression of ERV-linked ZGA genes during early mammalian embryonic development.
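The "within 5 kb of the transcriptional start site" criterion above can be illustrated with a short sketch. This is only a minimal illustration, not the method used in the cited studies: the `Interval` and `Gene` types, the function names, and all coordinates are hypothetical, and a real analysis would work from genome annotation files and would also consider strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic element (e.g. an ERV copy); hypothetical, half-open coordinates."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass(frozen=True)
class Gene:
    """A gene reduced to its name and transcriptional start site (TSS)."""
    name: str
    chrom: str
    tss: int


def distance_to_interval(pos: int, iv: Interval) -> int:
    """Distance in bp from a position to the nearest base of the interval (0 if inside)."""
    if pos < iv.start:
        return iv.start - pos
    if pos >= iv.end:
        return pos - (iv.end - 1)
    return 0


def erv_linked_genes(genes, elements, window=5000):
    """Return names of genes whose TSS lies within `window` bp of any element."""
    linked = []
    for gene in genes:
        for element in elements:
            same_chrom = element.chrom == gene.chrom
            if same_chrom and distance_to_interval(gene.tss, element) <= window:
                linked.append(gene.name)
                break  # one nearby element is enough
    return linked


# Made-up coordinates, for illustration only.
elements = [Interval("chr1", 12_000, 12_500)]
genes = [
    Gene("gene_a", "chr1", 10_000),  # 2 kb upstream of the element
    Gene("gene_b", "chr1", 30_000),  # more than 5 kb away
    Gene("gene_c", "chr2", 12_100),  # different chromosome
]
linked = erv_linked_genes(genes, elements)
```

The nested loop is quadratic and is chosen here for readability; for genome-scale annotation an interval index or a dedicated tool would be the more usual choice.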
2.5. Possible links between ERVs and human diseases
Recent studies on ERVs have revealed possible interactions between ERVs and their hosts that may contribute to the development of diseases such as cancer and neurologic diseases. For example, HERV expression is upregulated in various types of cancer [46, 47, 48]. Many HERV LTR regions, such as LTR10 and MER61, have a near-perfect p53 DNA binding site. The tumor suppressor protein p53 is a sequence-specific transcription factor that regulates genes of diverse biological pathways. Thus, ERVs may regulate carcinogenesis via the p53 pathway.
3. The influence of nonretroviral EVEs on gene expression
EBLs are the only nonretroviral RNA virus-derived EVEs found in the human genome. EBLs seem to be generated from bornavirus mRNA in a LINE-1-dependent manner (Figure 1C and D). Thus, they are a unique form of processed pseudogene, derived from the sequences of an exogenous virus rather than from endogenous sequences, and they provide evidence for retrotransposon-mediated RNA-to-DNA information flow from the virus to the host. In the human genome, seven EBLNs (hsEBLN-1 to hsEBLN-7) and one EBLG have been identified to date [4, 5, 6]. All seven hsEBLNs are expressed as RNAs in at least one tissue, suggesting the possibility of a biological function for these EBLs.
3.1. Gene regulation by EBLN RNAs
hsEBLN-1 is one of the most studied EBLs in the human genome. Because no natural selection on hsEBLN-1 and its orthologues has been detected, hsEBLN-1 is thought to function as a DNA element or non-coding RNA, or even to have lost its function (Figure 3). He et al. reported that 1067 and 2004 genes are up- and downregulated, respectively, after knockdown of hsEBLN-1 RNA in human oligodendroglia cells. The top 10 most upregulated genes were
Unlike ERVs, EBLs are thought not to be transposable themselves. Nevertheless, the hsEBLN-1 locus is silenced by several epigenetic blocks, predominantly histone deacetylation and DNA methylation, similar to the case of human immunodeficiency virus (HIV) provirus silencing [9, 56, 57]. This contrasts with the silencing of ERVs, in which, as described above, DNA methylation rather than histone deacetylation plays the major role. Thus, the silencing mechanisms for the hsEBLN-1 locus might be more similar to those of exogenous retroviruses than to those of ERVs. This epigenetic alteration around hsEBLN integration sites may affect the epigenetic status of neighboring loci and, consequently, the expression of nearby genes. Histone deacetylase (HDAC) inhibitor treatment did not affect transcription of the
Several EBLN-derived small RNAs in mouse and rat are annotated as PIWI-interacting RNAs (piRNAs) in the GenBank database. piRNAs are 25–33 nucleotides in length, are found in diverse organisms such as flies, fish, and mammals, and protect germ-line cells from transposons. piRNA clusters in the host genome are transcribed as long single-stranded precursor RNAs, which are further processed into small mature piRNAs. Mature piRNAs guide Argonaute proteins, such as PIWI and MIWI proteins, to complementary target sequences. Argonaute proteins cleave the target RNAs, suppressing their expression. piRNAs are also known to epigenetically silence the target gene loci. All piRNAs derived from EBLNs are antisense relative to the proposed ancient bornaviral nucleoprotein mRNA. These observations offer a possible role for the EBLN-derived piRNA-like RNAs in interfering with bornavirus mRNAs.
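The two properties described above, a length of 25–33 nucleotides and antisense orientation relative to a target mRNA, can be expressed as a simple check. The sketch below is a deliberately simplified illustration: the function names and the toy sequence are hypothetical, and it requires perfect complementarity, whereas real small-RNA analyses rely on read alignment and tolerate mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA sequence (DNA 'T' is read as 'U')."""
    rna = seq.upper().replace("T", "U")
    complement = str.maketrans("ACGU", "UGCA")
    return rna.translate(complement)[::-1]


def is_pirna_like_antisense(small_rna: str, mrna: str,
                            min_len: int = 25, max_len: int = 33) -> bool:
    """True if the small RNA has a piRNA-like length and is perfectly
    antisense to some stretch of the mRNA (an exact-match simplification)."""
    if not (min_len <= len(small_rna) <= max_len):
        return False
    target = mrna.upper().replace("T", "U")
    return reverse_complement(small_rna) in target


# Toy 40-nt "mRNA" made only of A and C, so that no sense-strand fragment
# can accidentally look antisense; not a real bornavirus sequence.
toy_mrna = "ACCACAACCCACAACACCCAACACACCAACCCACACAACC"
site = toy_mrna[5:32]                      # 27-nt stretch of the mRNA
antisense_read = reverse_complement(site)  # perfectly complementary read

antisense_ok = is_pirna_like_antisense(antisense_read, toy_mrna)
sense_ok = is_pirna_like_antisense(site, toy_mrna)
short_ok = is_pirna_like_antisense(antisense_read[:20], toy_mrna)
```

Here only the antisense read of suitable length passes the check; the sense fragment and the 20-nucleotide read do not.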
3.2. Gene regulation by EBLN proteins
Among the human EBLNs, hsEBLN-1 and hsEBLN-2 have maintained long open reading frames with the potential to code for proteins of 366 and 225 amino acids, respectively. Indeed, some studies have reported that hsEBLN-1 proteins were detected in particular cell lines. Moreover, Kobayashi et al. reported that EBLNs encode functional proteins in afrotherians. Therefore, it is still possible that EBLN proteins regulate gene expression
Research on gene regulation by EVEs has provided important knowledge about the evolution of regulatory sequences in the genome [5, 64]. Although integrated viral sequences are usually eliminated from the host genome, some eventually reach fixation and form EVEs. Such EVEs are not merely genetic parasites; rather, they introduce useful genetic novelties to the genome. In this article, we briefly reviewed two types of EVEs: ERVs and the non-LTR EVEs, EBLs. ERVs provide novel regulatory sequences and sites for epigenetic regulation. Transcripts derived from ERVs can also function as lncRNAs or protein-coding mRNAs, which may regulate gene expression. In particular, ERV-related transcripts are often associated with pluripotency. EBLs might also function as regulatory DNA elements such as promoters and enhancers. They are transcribed in at least one tissue, suggesting that EBL transcripts may function as lncRNAs or protein-coding mRNAs. Consistent with this, we have presented evidence for the roles of EBL transcripts as lncRNA molecules in gene expression. In particular, several EBLs are associated with antiviral responses against related viruses. Additionally, both ERVs and EBLs regulate not only host gene expression but also the expression of related viral genes. Further extensive studies on EVEs will augment our understanding of their biological significance in gene expression and their involvement in the co-evolution of viruses and mammals.
The preparation of this article was supported in part by the Japan Society for the Promotion of Science (JSPS) KAKENHI grant numbers JP24115709, JP25115508, JP25860336 (TH), MEXT KAKENHI grant number 15K08496 (TH), and grants from the Takeda Science Foundation, Senri Life Science Foundation, and The Shimizu Foundation for Immunology and Neuroscience grant for 2015 (TH). We thank Sandra Cheesman for editing a draft of this manuscript.
| Abbreviation | Definition |
|---|---|
| EVE | endogenous viral element |
| LTR | long terminal repeat |
| Pol II | RNA polymerase II |
| EBL | endogenous bornavirus-like element |
| LINE-1 | long interspersed nuclear element-1 |
| HERV | human endogenous retrovirus |
| lncRNA | long non-coding RNA |
| MLV | murine leukemia virus |
| KDM | lysine-specific histone demethylase |
| ZGA | zygotic genome activation |
| ADHD | attention-deficit hyperactivity disorder |
| HIV | human immunodeficiency virus |
| siRNA | small interfering RNA |