Open access peer-reviewed chapter

Passenger or Driver: Can Gene Expression Profiling Tell Us Anything about LINE-1 in Cancer?

By Stephen Ohms, Jane E. Dahlstrom and Danny Rangasamy

Submitted: July 18th 2017; Reviewed: December 19th 2017; Published: February 28th 2018

DOI: 10.5772/intechopen.73266

Abstract

LINE-1 retrotransposons are expressed in epithelial cancers but not normal adult tissues. Previously, we demonstrated repression of cell proliferation, migration, and invasion genes in L1-reverse transcriptase-inhibited T47D cells, while genes involved in cell projection, formation of vacuolar membranes, and intercellular junctions were upregulated. Extending this, we examined microarray data from L1-silenced and Efavirenz-treated T47D cells by Weighted Gene Correlation Network Analysis and literature mining. Hub genes in the most significant module comparing L1-silenced and untreated controls included HSP90AB2P, DDX39A, PANK2, MT1M, and LIMK2. HSP90AB2P is related to HSP90, a master regulator of cancer, cancer evolvability, and chemo-resistance. DDX39A is a known cancer driver gene, while PANK2 and MT1M affect multiple pathways. LIMK2 and SYBL1 impact actin cytoskeletal dynamics and the cofilin pathway, cancer cell motility, and the epithelial-to-mesenchymal transition. Also affected were signal transduction, HIF1 pathways, iron/redox metabolism, stress granules, cancer stem cell-related metabolic reprogramming, and the eIF4F translation initiation complex. Hub genes in other modules, including BTRC, MDM2, and FBXW7, stabilize oncoproteins like MYC, p53, and NOTCH1 or reflect CXCL12–CXCR4 signalling. Our findings support mounting evidence that L1 activity is a cause, rather than a consequence, of oncogenesis, with L1 affecting the formation of cancer stem cells.

Keywords

  • LINE-1
  • breast cancer
  • cancer stem cells
  • CSC
  • WGCNA
  • module eigengene
  • stress granule
  • protein kinase R
  • proteomics
  • cancer driver genes
  • cancer evolvability
  • epithelial-mesenchymal transition
  • mesenchymal-epithelial transition
  • EMT
  • MET
  • LINE-1 ORF1 protein interactome
  • MYC Coding Region Instability Determinant (CRD)
  • HSP90
  • ROS
  • iron

1. Introduction

Retrotransposons are mobile genetic elements that replicate through an RNA intermediate, which is copied into genomic DNA by a retrotransposon-encoded reverse transcriptase. Retrotransposons are classified into two subclasses, the long terminal repeat (LTR) elements (human endogenous retroviruses or HERVs) and non-LTR elements (long interspersed elements [LINEs], including LINE-1 (L1) elements, and short interspersed elements [SINEs], including SVA and Alu elements). L1 elements are the most prolific type of retrotransposon and can mediate insertional mutations and other forms of genome reorganization leading to several human disorders and genomic plasticity [1, 2]. There are approximately 7000 full-length L1 copies in the human genome, at least 100 of which are classified as highly active or retrotransposition-competent [3, 4]. An active L1 element is composed of a 5′-untranslated region containing an internal promoter, two open reading frames (ORF1 and ORF2), and a 3′ poly-A tail. ORF1 encodes an RNA-binding protein with nucleic acid chaperone activity, while ORF2 encodes reverse transcriptase (RT) and endonuclease enzymes, required for reverse transcription and integration of the L1 RNA intermediate into new genomic sites [2].

It has long been speculated that somatic L1 insertions might drive tumorigenesis by activating oncogenes or inactivating tumor suppressor genes. This seems to be rare in practice, although the failure to detect frequent L1 retrotransposition in tumors may reflect the fact that sequencing traditionally focuses on exons, whereas L1 insertions may be capable of exerting effects when inserted into introns by creating new promoters, altering transcription, or creating new polyadenylation sites [5, 6, 7].

Although adult tissues do not normally express L1 ORF1 protein (ORF1p) [8, 9], many human neoplasms do express L1 RNA and proteins, including epithelial neoplasms [9, 10, 11], multiple myeloma, and leukemias [12, 13]. This topic has been the subject of numerous reviews, many of which are recent (listed in Table 1), indicating that the role of L1 in cancer is gaining ever-increasing attention.

| Title | Reference |
|---|---|
| Transposable elements in cancer | [9000] |
| The role of somatic L1 retrotransposition in human cancers | [9001] |
| LINE-1 methylation level and prognosis in pancreas cancer: Pyrosequencing technology and literature review | [9002] |
| Methylation levels of LINE-1 as a useful marker for venous invasion in both FFPE and frozen tumor tissues of gastric cancer | [9003] |
| The function of LINE-1-encoded reverse transcriptase in tumorigenesis | [9004] |
| The human long interspersed element-1 retrotransposon: An emerging biomarker of neoplasia | [9005] |
| Links between human LINE-1 retrotransposons and hepatitis virus-related hepatocellular carcinoma | [9006] |
| The connection between LINE-1 retrotransposition and human tumorigenesis | [9007] |
| The reverse transcriptase encoded by LINE-1 retrotransposons in the genesis, progression, and therapy of cancer | [9008] |
| Crossing the LINE toward genomic instability: LINE-1 retrotransposition in cancer | [9009] |
| LINE-1 in cancer: Multifaceted functions and potential clinical implications | [9010] |
| Regulatory roles of LINE-1-encoded reverse transcriptase in cancer onset and progression | [9011] |
| LINE-1 hypomethylation in blood and tissue samples as an epigenetic marker for cancer risk: A systematic review and meta-analysis | [9012] |
| L1 retrotransposons, cancer stem cells and oncogenesis | [9013] |
| Clinical implications of the LINE-1 methylation levels in patients with gastrointestinal cancer | [9014] |
| Long interspersed element-1 (LINE-1): Passenger or driver in human neoplasms? | [9015] |
| The human L1 element: A potential biomarker in cancer prognosis, current status and future directions | [9016] |
| L1 retrotransposon and retinoblastoma: Molecular linkages between epigenetics and cancer | [9017] |
| A role for endogenous reverse transcriptase in tumorigenesis and as a target in differentiating cancer therapy | [9018] |

Table 1.

Reviews of LINE-1 involvement in cancer.

The list above is a subset of the results returned by a PubMed search using the term: ((LINE-1) AND cancer) AND review.

In summary, while a clear correlation has been established between L1 and cancer, it has been unclear whether L1 expression and activity are a cause, rather than a consequence, of oncogenesis. Probably the strongest evidence that L1 drives cancer is the finding that L1 induces hTERT and ensures telomere maintenance in tumor cell lines [14]. L1 knockdown also leads to decreased mRNA and protein expression of cMyc and KLF4, two of the main transcription factors of telomerase, and to changes in mRNA levels of other stem cell-associated proteins like CD44 and hMyb, with correspondingly reduced growth in spheroids. In addition, knockdown of KLF4 or cMyc decreases L1-ORF1 mRNA levels, suggesting specific reciprocal regulation with L1 [14].

Furthermore, L1 activity depends on phosphorylation of L1 ORF1p at sites recognized by the peptidyl-prolyl isomerase Pin1 and is thus integrated with regulatory phosphorylation cascades [15]. This suggests that, like many pathogens, L1 can appropriate a major regulatory cascade of the host, and that competition for kinases by ORF1p could perturb signaling cascades.

Further evidence for an active role for L1 in cancer comes from studies with anti-retroviral drugs that target the reverse transcriptase of L1. Efavirenz is a first-line antiretroviral drug used in the treatment of HIV-1 but also reported to suppress the activity of L1-RT and, remarkably, to promote morphological differentiation in a range of cancer cell lines [16, 17]. In addition to these reports, in another study, we showed that RT expression is widespread in MCF7 and T47D breast cancer cells and decreased markedly after treatment with Efavirenz [11]. Both cell types showed significantly reduced proliferation, accompanied by cell-specific differences in morphology. MCF7 cells displayed elongated microtubule extensions that adhered tightly to their substrate, while T47D cells formed long filopodial projections. These morphological changes were reversible upon stopping RT inhibition, confirming their dependence on RT activity. Microarray gene expression profiling showed that genes involved in proliferation, cell migration, and invasive activity were repressed in RT-inhibited cells. Concomitantly, genes involved in cell projection, formation of vacuolar membranes, and cell-to-cell junctions were upregulated.

Standard microarray or RNA-seq analyses seek to identify differentially expressed genes by analyzing each gene independently. This approach fails to use much of the information that is captured in the transcriptome profiling experiment, namely that the expression of many genes is correlated. Weighted Gene Correlation Network Analysis (WGCNA) exploits this information: it quantifies the correlations between individual pairs of gene expression profiles and also the extent to which any two genes are highly correlated with the same neighbors (called topological overlap). The underlying assumption is that correlated gene profiles and genes that overlap topologically reflect common regulatory mechanisms or biological function.
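To make the idea of topological overlap concrete, the following is a minimal sketch in plain Python, assuming the commonly used unsigned definition TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 − a_ij), where a_ij is the adjacency, l_ij sums the products of adjacencies to shared neighbors, and k_i is the connectivity of gene i. The toy adjacency matrix is hypothetical and is not taken from the data analyzed here.

```python
def topological_overlap(adj):
    """Topological overlap for a symmetric adjacency matrix with entries in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: summed adjacency of gene i to every other gene.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Adjacency routed through neighbours shared by genes i and j.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical toy network: genes 0 and 1 have no direct connection
# but share the same two neighbours (genes 2 and 3).
toy_adj = [
    [0.0, 0.0, 0.9, 0.9],
    [0.0, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.0],
    [0.9, 0.9, 0.0, 0.0],
]
toy_tom = topological_overlap(toy_adj)
print(round(toy_tom[0][1], 3), round(toy_tom[0][2], 3))
```

In this toy case, genes 0 and 1 obtain a higher topological overlap than the directly connected genes 0 and 2, which illustrates why shared neighbors carry information that pairwise correlation alone does not.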

In gene networks, a gene that has many interactions with other genes is called a hub gene and usually plays an essential role in gene regulation and biological processes [18, 19]. Compared to standard gene-wise methods of analysis, WGCNA has the advantage of enabling the identification of these hub genes and, in addition, overcomes the problem of multiplicity of hypothesis testing. This is because the number of modules of co-expressed genes is far less than the number of genes on the microarray and a single consensus gene profile from each module is subjected to statistical testing in preference to individual genes. Another advantage of WGCNA is that hub genes and other interesting genes in a module that are relevant to the phenotype under investigation may not be differentially expressed and would escape notice in a conventional gene-wise analysis.

Motivated by our initial findings described above, we decided to reanalyze the transcriptome data in greater detail using the more powerful WGCNA method [20], combining the data from Efavirenz (Efa)-treated cells [11] with our unpublished microarray data from T47D cells subjected to siRNA-mediated L1 silencing. An additional reason for combining the data was that the reproducibility of the co-expressed gene modules found by WGCNA increases with the number of samples, with 12–15 samples currently regarded as the practical minimum.

2. Methods

The details of the gene expression profiling in Efa-treated T47D cells have been published previously [11]. The siRNA-treated T47D cells were treated and harvested at the same time to minimize batch effects. Briefly, total RNA was isolated from cells and labeled cDNA was hybridized to Roche NimbleGen Human Whole Genome 12-plex arrays. Gene expression levels were calculated with NimbleScan Version 2.4: relative signal intensities for each gene were generated using the Robust Multi-array Average (RMA) algorithm with quantile normalization and summarized by the median polish method. The biological samples comprised four experimental groups: L1 silenced by siRNA (pUTR), controls with scrambled vector (pSM2), Efavirenz-treated (Efa), and dimethyl sulfoxide-treated controls (DMSO). There were three replicate samples in each group. To calculate individual gene-wise p-values and fold changes for the contrasts between L1-silenced or Efavirenz-treated cells and their respective controls, the RMA-normalized calls files were imported into Partek Genomics Suite v6.2 (St. Louis, Missouri, USA), and the log2 gene expression values were analyzed with a one-factor ANOVA design (Treatment, with four levels: “DMSO”, “Efa”, “pSM2”, “pUTR”). Contrasts were calculated for pUTR versus pSM2 and Efa versus DMSO. A total of 4951 probes passed a false discovery rate threshold of 0.001 for the pUTR versus pSM2 contrast and 9946 for the Efa versus DMSO contrast.
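The text does not state which false discovery rate procedure was applied in Partek. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment and applies the 0.001 threshold to a handful of hypothetical p-values; it is not a reconstruction of the Partek computation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical gene-wise contrast p-values (not from the study).
hypothetical_p = [0.00001, 0.0003, 0.03, 0.2, 0.9]
adjusted_p = benjamini_hochberg(hypothetical_p)

FDR_THRESHOLD = 0.001  # threshold quoted in the Methods
passing = [i for i, q in enumerate(adjusted_p) if q <= FDR_THRESHOLD]
print(passing)
```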

For the WGCNA analysis, the RMA-normalized calls files were imported into R (version 3.1.0) [21] as log2 values, and a subset of the 10,000 most variable probesets was selected to remove noise genes (measured by variance of the expression values of each gene across the 12 samples). A weighted gene coexpression network was constructed using the WGCNA package. Plots of scale-free fit using the pickSoftThreshold and softConnectivity functions indicated that a reasonable scale-free fit could be achieved by setting the soft-thresholding power (beta, β) for network construction to 20. The other parameters used for the blockwiseModules function in WGCNA included a minimum module size of 40, and the dendrogram cut height for module detection set to 0.10 to define modules of co-expressed probesets. networkType was set to “signed,” maxBlockSize was set to 10,000, and other parameters were left at their default values.
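The two preprocessing steps described above, ranking probesets by variance and raising correlations to a soft-thresholding power, can be sketched as follows. This is a plain-Python illustration rather than the WGCNA R code used in the study; it assumes the usual signed-network adjacency a_ij = ((1 + cor_ij) / 2)^β, and the probe names and expression values are hypothetical.

```python
import math


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def most_variable(expr, n_keep):
    """Keep the n_keep probes with the largest variance across samples."""
    ranked = sorted(expr, key=lambda probe: variance(expr[probe]), reverse=True)
    return ranked[:n_keep]


def signed_adjacency(expr, probes, beta):
    """Signed adjacency: anticorrelated probes get values near 0, not near 1."""
    return {
        (p, q): ((1.0 + pearson(expr[p], expr[q])) / 2.0) ** beta
        for p in probes
        for q in probes
        if p != q
    }


# Hypothetical log2 expression values for four probes across four samples.
toy_expr = {
    "probe_up": [1.0, 2.0, 3.0, 4.0],
    "probe_up2": [2.0, 4.0, 6.0, 8.0],
    "probe_down": [4.0, 3.0, 2.0, 1.0],
    "probe_flat": [5.0, 5.1, 5.0, 5.1],
}
kept = most_variable(toy_expr, n_keep=3)          # the study kept 10,000 probesets
adj = signed_adjacency(toy_expr, kept, beta=20)   # beta = 20 as in the Methods
```

With a power as high as 20, only strongly positively correlated pairs retain an adjacency close to 1, which is what pushes the network toward a scale-free topology.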

The statistical enrichment of the overlap between the genes in some modules and relevant gene lists identified in literature was calculated using an online program at http://nemates.org/MA/progs/overlap_stats.html which uses the hypergeometric distribution. This program calculates a representation factor, which is the number of overlapping genes between any two gene lists divided by the expected number of overlapping genes drawn from two randomly chosen gene lists of similar size and is a measure of the enrichment of a gene list with genes from another list. A representation factor > 1 indicates more overlap than expected between two independent groups. A genome size of 19,000 genes was used in all overlap calculations.
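The representation factor and an accompanying hypergeometric probability can be sketched in a few lines. The function names are ours, the second list size is hypothetical, and the tail convention (probability of the observed overlap or more) is an assumption; the online program may differ in detail.

```python
from math import comb


def representation_factor(n_overlap, size_a, size_b, genome_size):
    """Observed overlap divided by the overlap expected for two random lists."""
    expected = size_a * size_b / genome_size
    return n_overlap / expected


def hypergeom_upper_tail(n_overlap, size_a, size_b, genome_size):
    """P(X >= n_overlap) when size_b genes are drawn from a genome
    that contains size_a genes belonging to the first list."""
    total = comb(genome_size, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, x) * comb(genome_size - size_a, size_b - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / total


GENOME_SIZE = 19000   # genome size used in the Methods
LIST_A = 96           # e.g. a 96-protein interactome list
LIST_B = 100          # hypothetical size of a second gene list
OVERLAP = 5           # hypothetical number of shared genes

rf = representation_factor(OVERLAP, LIST_A, LIST_B, GENOME_SIZE)
p_value = hypergeom_upper_tail(OVERLAP, LIST_A, LIST_B, GENOME_SIZE)
print(round(rf, 2), p_value)
```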

3. Results and discussion

3.1. Module discovery

Based on a correlation threshold, WGCNA assigns genes to modules (clusters) such that the expression of the genes within a module varies in a similar manner across the experimental conditions. WGCNA labels the modules automatically with a color code according to the number of genes in the module: turquoise denotes the largest module, blue the next, then brown, green, yellow, etc. WGCNA identified 34 modules (excluding a gray module containing unassigned probesets) ranging in size from the darkmagenta module (58 probes) to the turquoise module (1359 probesets). For statistical analysis, each module is represented by a module eigengene, a consensus profile of all the genes in the module, which by default is the first principal component. A one-factor ANOVA was carried out on the module eigengenes in R using the same design (Treatment, with four levels: “DMSO”, “Efa”, “pSM2”, “pUTR”) used for the gene-wise analysis in Partek. After correcting for multiplicity by multiplying all p-values by 34 (a Bonferroni correction for the 34 modules), the most significant module eigengene for the contrast between L1-silenced and scrambled vector controls was that of the darkmagenta module (Figure 1), with a Bonferroni-corrected p-value of 6.55E-11 (uncorrected p-value 1.93E-12) (Table 2). All genes in the darkmagenta module were downregulated, with fold changes ranging from −1.53 to −2.62, in the contrast between L1-silenced and scrambled vector controls (Figure 2). The most significant module eigengene for the contrast between Efavirenz-treated and DMSO controls was that of the black module (Bonferroni-corrected p-value 1.15E-09) (Figure 3). Due to space limitations, the following results and discussion focus mainly on the darkmagenta module, with references to genes in other modules (Figures 4–8) that can be linked in common pathways or processes to those in the darkmagenta module.
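As an illustration of how a module eigengene summarizes a module, the sketch below standardizes each gene's profile and extracts the first principal component across samples by power iteration. It is a simplified stand-in, not the WGCNA implementation; the expression values are hypothetical, and the sign convention (aligning the eigengene with the module's average profile) is an assumption made for readability.

```python
import math


def standardize(profile):
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / (len(profile) - 1))
    return [(v - mean) / sd for v in profile]


def module_eigengene(module_expr, n_iter=200):
    """First principal component (one value per sample) of a genes x samples module."""
    z = [standardize(p) for p in module_expr]
    n = len(z[0])
    # Sample-by-sample cross-product matrix of the standardized profiles.
    c = [[sum(g[a] * g[b] for g in z) for b in range(n)] for a in range(n)]
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary start vector for power iteration
    for _ in range(n_iter):
        w = [sum(c[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Orient the eigengene so it follows the module's average standardized profile.
    avg = [sum(g[a] for g in z) / len(z) for a in range(n)]
    if sum(x * y for x, y in zip(v, avg)) < 0:
        v = [-x for x in v]
    return v


# Hypothetical module: three genes, three control samples then three treated samples,
# all genes lower in the treated samples.
toy_module = [
    [8.0, 8.1, 7.9, 6.0, 6.1, 5.9],
    [10.0, 10.2, 9.9, 8.1, 8.0, 7.9],
    [5.0, 5.1, 5.0, 3.9, 4.0, 4.1],
]
eigengene = module_eigengene(toy_module)
```

Because the eigengene has one value per sample, it can be passed to the same one-factor ANOVA as an individual gene, which is why only one test per module is needed.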

Figure 1.

Scatterplot for the darkmagenta module. In Figures 1–8, genes specifically mentioned in the text are labeled blue; otherwise they are labeled red. The vertical axis (Gene Significance) is the −log10(p-value) for the contrast between pUTR and pSM2 (Figures 1, 2, 4–8) or between Efavirenz and DMSO (Figure 3). The intramodular connectivity for each gene is plotted on the horizontal axis. Genes with higher values of gene significance have smaller p-values in the gene-wise analysis in Partek Genomics Suite. Genes towards the right of the plots have higher intramodular connectivities and are hub genes. Intramodular connectivities were calculated with the WGCNA/intramodularConnectivity function from an adjacency matrix calculated by the WGCNA/adjacency function on the 10,000 most variable probes and with a soft thresholding power = 20. The horizontal line is the false discovery rate (FDR) 0.001 threshold calculated in Partek GS for the gene-wise ANOVA contrast. All genes above this line pass the FDR threshold at the 0.001 level. The plot was created with the WGCNA/verboseScatterplot function.
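The quantity on the horizontal axis can be illustrated with a short sketch: a gene's intramodular connectivity is taken here as the sum of its adjacencies to the other genes in the same module, which is our reading of what the intramodularConnectivity function reports as within-module connectivity. Gene names, module labels, and adjacency values are hypothetical.

```python
def intramodular_connectivity(adj, module_of):
    """Sum each gene's adjacencies to the other genes assigned to the same module."""
    genes = list(module_of)
    return {
        g: sum(adj[g][h] for h in genes if h != g and module_of[h] == module_of[g])
        for g in genes
    }


# Hypothetical adjacency values between four genes.
toy_adj = {
    "geneA": {"geneA": 1.0, "geneB": 0.8, "geneC": 0.7, "geneD": 0.1},
    "geneB": {"geneA": 0.8, "geneB": 1.0, "geneC": 0.2, "geneD": 0.1},
    "geneC": {"geneA": 0.7, "geneB": 0.2, "geneC": 1.0, "geneD": 0.1},
    "geneD": {"geneA": 0.1, "geneB": 0.1, "geneC": 0.1, "geneD": 1.0},
}
toy_modules = {"geneA": "m1", "geneB": "m1", "geneC": "m1", "geneD": "m2"}

k_within = intramodular_connectivity(toy_adj, toy_modules)
# The hub of module m1 is the gene with the largest within-module connectivity.
hub = max((g for g in toy_modules if toy_modules[g] == "m1"), key=k_within.get)
```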

| Module eigengene | pUTR vs. pSM2 | Efa vs. DMSO | pUTR vs. pSM2 (Bonferroni) | Efa vs. DMSO (Bonferroni) |
|---|---|---|---|---|
| MEdarkmagenta | 1.93E-12 | 6.60E-11 | 6.55E-11 | 2.25E-09 |
| MEpurple | 2.42E-11 | 2.79E-07 | 8.24E-10 | 9.47E-06 |
| MEviolet | 6.64E-11 | 0.05308796 | 2.26E-09 | 1.80E+00 |
| MEorange | 8.03E-11 | 0.01438891 | 2.73E-09 | 4.89E-01 |
| MEwhite | 1.38E-10 | 1.16E-09 | 4.68E-09 | 3.95E-08 |
| MEroyalblue | 1.85E-10 | 5.02E-11 | 6.31E-09 | 1.71E-09 |
| MElightyellow | 5.11E-10 | 0.01025057 | 1.74E-08 | 3.49E-01 |
| MEmagenta | 1.09E-09 | 1.21E-08 | 3.69E-08 | 4.10E-07 |
| MEdarkgreen | 1.57E-09 | 6.28E-10 | 5.35E-08 | 2.13E-08 |
| MEdarkgrey | 2.73E-09 | 5.28E-08 | 9.29E-08 | 1.80E-06 |
| MEsteelblue | 7.19E-09 | 1.05E-07 | 2.45E-07 | 3.57E-06 |
| MEdarkolivegreen | 7.37E-09 | 1.00E-07 | 2.51E-07 | 3.41E-06 |
| MEred | 7.67E-09 | 2.17E-07 | 2.61E-07 | 7.37E-06 |
| MElightgreen | 7.98E-09 | 0.000116041 | 2.71E-07 | 3.95E-03 |
| MEcyan | 1.02E-08 | 0.001002342 | 3.47E-07 | 3.41E-02 |
| MEgreenyellow | 2.16E-08 | 0.001305863 | 7.36E-07 | 4.44E-02 |
| MEtan | 4.37E-08 | 1.36E-08 | 1.49E-06 | 4.64E-07 |
| MEyellow | 6.60E-08 | 2.77E-10 | 2.24E-06 | 9.40E-09 |
| MEgrey60 | 6.76E-08 | 6.75E-10 | 2.30E-06 | 2.29E-08 |
| MEmidnightblue | 9.07E-08 | 0.1888604 | 3.08E-06 | 6.42E+00 |
| MEdarkorange | 6.92E-07 | 9.11E-09 | 2.35E-05 | 3.10E-07 |
| MEblack | 1.18E-06 | 3.37E-11 | 4.00E-05 | 1.15E-09 |
| MElightcyan | 2.67E-06 | 2.93E-10 | 9.08E-05 | 9.95E-09 |
| MEgreen | 5.77E-06 | 5.31E-10 | 1.96E-04 | 1.80E-08 |
| MEsaddlebrown | 1.15E-05 | 1.44E-09 | 3.89E-04 | 4.88E-08 |
| MEblue | 1.73E-05 | 3.75E-09 | 5.89E-04 | 1.27E-07 |
| MEskyblue | 4.73E-05 | 9.32E-10 | 1.61E-03 | 3.17E-08 |
| MEturquoise | 0.000172871 | 8.22E-08 | 5.88E-03 | 2.79E-06 |
| MEdarkred | 0.00019623 | 4.79E-11 | 6.67E-03 | 1.63E-09 |
| MEpink | 0.000201452 | 0.182879 | 6.85E-03 | 6.22E+00 |
| MEsalmon | 0.000286858 | 3.75E-11 | 9.75E-03 | 1.27E-09 |
| MEpaleturquoise | 0.000292228 | 8.01E-08 | 9.94E-03 | 2.72E-06 |
| MEdarkturquoise | 0.002355459 | 1.15E-06 | 8.01E-02 | 3.91E-05 |
| MEbrown | 0.03581247 | 1.16E-09 | 1.22E+00 | 3.94E-08 |

Table 2.

Uncorrected and Bonferroni-corrected p-values for ANOVA contrasts for module eigengenes.

pUTR vs. pSM2 is the ANOVA contrast p-value for L1 silenced by siRNA versus scrambled vector controls. Efa vs. DMSO is the ANOVA contrast for Efavirenz-treated versus DMSO controls.
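The Bonferroni columns of Table 2 follow from multiplying each uncorrected p-value by the number of modules tested (34). The short sketch below reproduces three entries of the pUTR vs. pSM2 column. The table reports the raw product, which can exceed 1; capping at 1 is the more common convention and is offered here only as an option. Small differences from the tabulated values arise because the uncorrected p-values are shown rounded.

```python
N_MODULES = 34  # number of module eigengenes tested

# Uncorrected pUTR vs. pSM2 p-values copied from Table 2.
uncorrected = {
    "MEdarkmagenta": 1.93e-12,
    "MEblack": 1.18e-06,
    "MEbrown": 0.03581247,
}


def bonferroni(p, n_tests, cap=False):
    """Multiply by the number of tests; optionally cap the result at 1."""
    corrected = p * n_tests
    return min(corrected, 1.0) if cap else corrected


corrected = {name: bonferroni(p, N_MODULES) for name, p in uncorrected.items()}
capped = {name: bonferroni(p, N_MODULES, cap=True) for name, p in uncorrected.items()}
for name in uncorrected:
    print(name, f"{corrected[name]:.2E}", f"{capped[name]:.2E}")
```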

Figure 2.

Similar plot to Figure 1 but points are labelled with the fold-change for the gene in the comparison between pUTR and pSM2 (L1-silenced versus controls). All genes in the darkmagenta module are downregulated in a range from −1.53 to −2.62 for this comparison.

Figure 3.

Scatterplot for the black module.

Figure 4.

Scatterplot for the cyan module.

Figure 5.

Scatterplot for the pink module. The pink module is enriched in genes from the LINE-1 ORF1 protein interactome (DDX21, NPM1, PABPC4, PTBP1, STAU1, STK38) with a representation factor of 3.2 and p-value <0.011.

Figure 6.

Scatterplot for the tan module.

Figure 7.

Scatterplot for the darkolivegreen module.

Figure 8.

Scatterplot for the orange module.

3.2. LINE-1 silencing affects HSP90, a master regulator of cancer

The most extreme outlier in the darkmagenta module is HSP90AB2P (Figure 1). Despite its classification as a pseudogene, the existence of this protein is supported by mass spectrometry evidence [22]. The parent gene, heat shock protein 90 (HSP90), is a ubiquitously expressed molecular chaperone representing 1–2% of all cellular protein that controls the folding, assembly, intracellular disposition, and proteolytic turnover of approximately 100 proteins, most of which are involved in signal transduction [23]. HSP90 proteins also stabilize and refold denatured proteins under stress, with two major cytosolic forms, an inducible form (HSP90AA1, a hub in the cyan module, Figure 4) and a constitutive form (HSP90AB1). Significantly, both HSP90AA1 and HSP90AB1 have been identified as members of L1 ORF2p complexes in isotopic differentiation of interactions as random or targeted (I-DIRT) affinity proteomics experiments and quantitative MS [24], thus supporting the presence of HSP90AB2P and HSP90AA1 in the darkmagenta and cyan modules, respectively. I-DIRT has the advantage of allowing the discrimination of protein-protein interactions formed in-cell from those occurring post-extraction.

HSP90 is a master regulator of cancer [25]. HSP90 family members are overexpressed in many human cancers, and many HSP90 clients are nodes of oncogenic pathways. Cytosolic HSP90 interacts with HER-2, a member of the ErbB family of receptor tyrosine kinases that play central roles in cellular proliferation, differentiation, cell migration, and cancer progression. The interaction may involve stabilization of the cytoplasmic kinase domain of HER-2, and disruption of this association with HSP90 inhibitors leads to proteasomal degradation of the receptor [26].

Cell surface and secreted forms of HSP90 also exist. An HSP90AA α isoform is secreted and associated with matrix metalloproteinase 2 (MMP-2), incriminating extracellular HSP90 (eHSP90) in cancer metastasis [27]. eHSP90 can also initiate the EMT in prostate cancer cells by modulating EZH2 expression and activity [28]. Surface HSP90 appears to interact with the extracellular domain of HER-2. Disruption of the interaction inhibits cell invasion and is accompanied by altered actin dynamics in human breast cancer cells. In addition, the protein-tyrosine phosphatase PTPN9 negatively regulates ErbB2/HER-2 signaling in breast cancer cells and its presence in the darkmagenta module also supports involvement of HER-2.

Hsp90 also plays an essential role regulating pluripotency factors, including Oct4, Nanog, and Stat3 in mouse embryonic stem cells (ESCs) [29]. Inhibition of Hsp90 with 17-N-Allylamino-17-demethoxygeldanamycin or miRNA leads to ESC differentiation while overexpression of Hsp90β partially rescues the phenotype restoring Oct4 and Nanog levels.

The normal cellular proteome is only marginally thermodynamically stable, and this problem is exacerbated in cancer since most mutations destabilize proteins [30]. As a protein chaperone, HSP90 has a critical role in the protein homeostasis that supports cancer cell evolvability and that facilitates the rapid evolution of drug resistance in cancer [30]. HSP90 is also involved in the maturation of Piwi [31, 32], which enables piRNA-mediated silencing of transposons, including LINE-1, with the co-chaperone Fkbp6 having a critical role in delivering piRNAs to Miwi2 in the mouse [33].

3.3. LINE-1 silencing potentially affects the THOC/TREX nuclear export complex through DDX39A

DDX39A and SRP9 in the central region of the darkmagenta module plot (Figure 1) were among 96 proteins associated with the L1 ORF1p and its ribonucleoprotein identified by co-immunoprecipitation of tagged L1 constructs and mass spectrometry [34]. DDX39A (also known as DDX39 or URH49) is a member of the DEAD box RNA helicase family implicated in processes involving alteration of RNA secondary structure, including translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. There are two closely related paralogs, DDX39A and DDX39B (also known as UAP56 or BAT1), both of which have roles in the THO nuclear export complex. The nuclear THO/TREX complex regulates the export of pluripotency-related transcripts and controls ESC self-renewal and somatic cell reprogramming, including controlling the nuclear export of ESRRB, Nanog, Sox2, and Klf4 transcripts. DDX39A interacts physically and functionally with other export factors in the THO/TREX complex [35, 36] and mediates interactions between the THO complex and the general export receptor Nxf1 that binds mRNAs and transports them through the nuclear pore complex (NPC) [37].

An impact of L1-silencing on the THOC/TREX complex is supported by findings from other studies. First, serine- and arginine-rich splicing factor (SR) proteins, three of which belong to the L1 ORF1p interactome (SRSF1, SRSF6, and SRSF10), interact with NXF1 [38]. Second, CDC5L, another member of the L1 ORF1p interactome, is present in the DDX39B/UAP56 immunoprecipitate [39]. Third, the L1 3′ UTR contains a novel sequence element that binds NXF1, suggesting a role in L1 RNP transport from the nucleus and possibly its reimport into the nucleus for retro-integration in the genome [40].

DDX39 has also been identified as a cancer driver gene in two studies. First, DDX39 was identified as a marker predicting urinary bladder cancer progression by proteome analysis [41]. Second, DDX39 was identified as a key driver gene and anti-cancer drug target by data mining in the “Sanger Genomics of Drug Sensitivity in Cancer dataset from the Cancer Genome Project” [42]. This dataset contains gene expression levels, copy number, and mutation status for 654 cell lines and IC50 values of 138 anti-cancer drugs. The string-db network [43] of the potential driver genes with the 10 largest importance measures among the selected genes for each anti-cancer drug is shown in Figure 9.

Figure 9.

String-db gene network for cancer driver genes identified using data mining by Park et al. [42]. Network connections are based on known and predicted protein-protein interactions. Medium confidence interactions are shown. The network shows the central location of HSP90AA1 and ERBB2. Genes from the darkmagenta module and the LINE-1 ORF1p interactome are also present.

Four of the markers identified by Kato et al. [41] (CCT4, IDH1, NPM1, YBX1) overlap the L1 ORF1p interactome resulting in a statistically significant overlap with a representation factor of 56.5 and p < 5.893E-07. There are also five overlaps between the L1 ORF1p interactome and the cancer driver gene set identified by Park et al. [42] (DDX39A, NPM1, PABPC4, TCP1, YBX1) resulting in a representation factor of 9.9 and p < 1.528E-04. This is, in itself, strong evidence for LINE-1 having an active rather than a passive role in cancer.

Furthermore, three genes in the darkmagenta module either match or are closely related to genes in the Park et al. cancer driver gene signature (DDX39A, EEF1A1, HSP90AA1). If all three were counted as perfect matches, this would give a representation factor of 9.8 and p < 0.004, further supporting the biological plausibility of this module.

3.4. LINE-1 silencing affects genes with fundamental roles in cancer including PANK2, MT1M, and GAPDH

Pantothenate kinase 2 (PANK2), a master regulator of coenzyme A synthesis, and metallothionein 1M (MT1M), a protein mostly associated with cellular metabolism of metal ions, are among the most highly connected hub genes in the darkmagenta module (Figure 1).

PANK2 is the mitochondrial enzyme essential for converting dietary pantothenate into 4′-phosphopantothenic acid, the first regulatory step in the synthesis of coenzyme A (CoA). CoA is an essential cofactor in nearly 100 enzymatic reactions including those involved in the citric acid cycle, amino acid synthesis, and the beta-oxidation of fatty acids.

Mutations in the Drosophila PANK homolog (dPANK) lead to reduced CoA levels, impaired acetylation of histones leading to downstream epigenetic effects, and impaired acetylation and stability of tubulin [44].

PANK deficiency in Drosophila and human neuronal cell cultures leads to abnormalities in F-actin organization and abnormally high levels of phosphorylated cofilin (CFL1) (Figure 9), a conserved actin filament severing protein. The increased levels of phosphorylated cofilin coincide with morphological changes in PANK-deficient Drosophila S2 cells and human neuronal SH-SY5Y cells, with the latter also forming markedly fewer neurites in culture, a process that is strongly dependent on actin remodeling [44]. Cofilin also plays a critical role in breast cancer invasion and metastasis [45], with the cofilin pathway comprising a group of kinases and phosphatases that regulate cofilin and coordinately initiate actin polymerization and cell motility in response to stimuli in the microenvironment of mammary tumors.

Mutations in the human PANK2 gene lead to neurodegeneration with brain iron accumulation and are linked to changes in ferroportin expression, the only known protein to mediate the export of intracellular iron [46]. Downregulation of PANK2 by siRNA in HeLa cells leads to a 12-fold induction of ferroportin mRNA [47]. Ferroportin is strongly downregulated in breast cancer, possibly being required for phenotypic transitions occurring during metastasis [48]. High ferroportin gene expression identifies an extremely favorable cohort of breast cancer patients with a 10-year survival of >90% [49].

Iron-dependent oxidative demethylation mediated by the Jumonji family of enzymes is linked to the epigenetic regulation of cancer [50, 51]. H3K4 methylation is a key determinant of epithelial-mesenchymal plasticity, and loss of H3K4me3 correlates with poor survival in breast cancer [52]. In addition, the ten–eleven translocation (TET) enzymes promote the iron-dependent oxidative demethylation of 5-methylcytosine and regulate the epithelial-mesenchymal transition (EMT) and the reverse mesenchymal-epithelial transition (MET) [53, 54, 55]. Iron may also be directly involved in promoting selective oxidative demethylation of key DNA or histone residues in chromatin to control the epithelial-mesenchymal status in a dynamic manner.

Iron and iron-mediated processes appear to have a central role in the formation of breast cancer stem cells (CSCs) and to be potential therapeutic targets in breast CSCs [48]. Salinomycin and a derivative, Ironomycin, exhibit potent selective activity against breast CSCs in vitro and in vivo, by accumulating and sequestering iron in lysosomes [48]. Preferential iron trafficking also characterizes glioblastoma (GBM) stem-like cells [56]. GBM CSCs have been shown to potently extract iron from the microenvironment more effectively than other tumor cells and preferentially require the transferrin receptor and ferritin, two core iron regulators, to propagate and form tumors in vivo. Transferrin was the top upregulated gene compared with tissue-specific progenitors [56].

The presence of CYB561D1, a putative mitochondrial ferrireductase, in the darkmagenta module close to PANK2 (Figure 1) further supports perturbation of iron-related metabolism by L1-silencing. A paralog, CYB561D2 (101F6), is highly expressed in lung tumor cell lines [57]. Its forced expression in NSCLC tumor cell lines or tumor xenografts significantly reduces cell viability by inducing apoptosis, while lung metastases in nu/nu mice are also greatly reduced following systemic delivery of 101F6-encoding adenoviral vectors [58].

PANK2 also affects NADH levels [59, 60]. Hepatocytes from dKO PANK2 mouse pups cannot maintain NADH levels compared to wild-type hepatocytes [61]. In addition, induced pluripotent stem cell (iPSC)-derived neuronal models of PANK2-associated neurodegeneration reveal mitochondrial dysfunction with activated NADH-related and inhibited FADH-related respiration, leading to increased reactive oxygen species generation and lipid peroxidation [59].

The link between CoA and NADH also supports an important role for PANK2 in the metabolism of breast CSCs. Reactive oxygen species (ROS) and ROS-dependent signaling pathways and transcriptional activities appear to be critical to both normal stem cell self-renewal and differentiation and to CSCs [62]. CSCs possess low levels of ROS but how they control ROS production and scavenging and how ROS-dependent signaling pathways contribute to CSC function remain poorly understood.

In close proximity to PANK2 in Figure 1 , MT1M is a member of the metallothionein (MT) family; metallothioneins are small cysteine-rich proteins involved in metal metabolism and detoxification and redox metabolism. Metallothioneins may form a critical surveillance system protecting cells from damage caused by electrophilic carcinogens [63]. However, several studies suggest that metallothioneins have wider roles, contributing to numerous fundamental carcinogenic processes, including proliferation, survival, metabolism, invasion, and metastasis [64, 65].

Metallothionein expression is also strongly associated with tumor grade in breast, ovarian, uterine, and prostate cancers [66].

Hypoxia-inducible factor-1α (HIF-1α) can co-activate MT gene transcription by interacting with the metal-responsive transcription factor (MTF1) in hypoxic conditions, increasing the biological aggressiveness of cancer cells [67, 68]. Conversely, metallothioneins can increase HIF-1α transcriptional activity by suppressing ROS accumulation or activating the ERK/mTOR pathway [69, 70]. Also, even though MTF1 is not inducible by iron, expression of ferroportin is induced directly via MTF1 [71]. HIF-1α also transcriptionally activates SLUG expression in hypoxic conditions [72, 73], and because upregulation of HIF-1α and metallothionein expression is self-reinforcing, MT1M may also affect SLUG expression. SLUG is a member of the SNAIL superfamily of zinc finger transcription factors involved in the EMT. SLUG expression correlates with reduced cell adhesion, increased cell migration and invasion, and biological aggressiveness in several tumor types including breast cancer [74, 75].

While not a hub gene, GAPDH is the most significantly differentially expressed gene in the darkmagenta module ( Figure 1 ). Overexpression of GAPDH occurs in diverse human cancers. Several cancer-related factors, such as insulin, HIF-1, p53, nitric oxide (NO), and acetylated histones, modulate GAPDH gene expression and affect GAPDH protein function [76]. In addition to its role in glycolysis, in which it catalyzes the oxidation and phosphorylation of glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate in conjunction with NAD+, GAPDH is a key mediator of oxidative stress responses, involving GAPDH nuclear translocation and induction of cell death [77]. GAPDH also inhibits telomerase activity and induces breast cancer cell senescence [77].

3.5. LINE-1 silencing affects genes involved in MET-related metabolic reprogramming

The reprogramming of somatic cells to iPSCs by transgene expression of the transcription factors Oct4, Sox2, Klf4, and Myc triggers a mesenchymal-epithelial transition (MET) [78]. This transformation is promoted by the TET enzymes and blocked by kinase-dependent cytoskeletal reorganization [79]. Two closely associated hub genes in the darkmagenta module ( Figure 1 ), LIM Domain Kinase 2 (LIMK2) and Apolipoprotein C1 (APOC1), have roles in the MET, with the presence of APOC1 also suggesting the involvement of TET1. The TET proteins are DNA hydroxylases that mediate oxidation of methylcytosines and thus regulate hypoxia-sensitive gene expression. Among its many actions, TET1 regulates the hypoxia-induced EMT by acting as a co-activator of genes involved in cholesterol metabolism including APOC1 [54]. Significant changes in APOC1 expression are seen in leukemia cell lines in the NCI60 cancer cell line collection [80, 81], while APOC1 is highly expressed at the protein level and protects pancreatic cancer cells from apoptosis [82]. In addition, APOC1 is highly expressed in late-stage lung cancer [83] and is also one of a small number of genes undergoing late-stage upregulation downstream of KLF4 during the metabolic shift that facilitates reprogramming during the generation of iPSCs in an SeVdp(KOSM)-based system [84].

3.6. LINE-1 silencing targets DICER by acting through miR-103/107 embedded in the PANK2 gene

In addition to their central role in metabolism, the PANK1–3 genes contain the microRNAs, miR-103 and miR-107, in their intronic regions, with PANK1, 2, and 3 corresponding to pri-miR-107, pri-miR-103-2, and pri-miR-103-1, respectively. Expression of miR-103/107 has been shown to parallel that of the PANK genes in a series of cell lines and in normal human tissues [85]. Furthermore, miR-103/107 are predicted bioinformatically to affect multiple mRNA targets in pathways that involve cellular acetyl-CoA and lipid levels and thus to act synergistically with their host genes [86].

Although specific microRNAs can be upregulated in cancer, global miRNA downregulation is a common trait of human malignancies. This can be attributed, at least in part, to miR-103/107, which have been shown to target the 3’-UTR of Dicer leading to its downregulation and, in turn, to global downregulation of microRNA expression [87]. In human breast cancer, high levels of miR-103/107 are associated with metastasis and poor outcomes and this has been attributed to the miR-103/107-Dicer axis controlling epithelial plasticity and induction of the EMT, in part via regulation of miR-200 [87].

3.7. LINE-1 silencing is linked to the mitophagy-driven regulation of stem cell fate through TOMM7

The presence of Translocase Of Outer Mitochondrial Membrane 7 (TOMM7) in the darkmagenta module ( Figure 1 ) is further evidence of L1 having an impact on cancer cell metabolism acting through HIF1α. TOMM7 encodes a member of the TOM pre-protein translocase complex of the outer mitochondrial membrane, the main entry portal for protein precursors from the cytosol into mitochondria.

TOMM7 has a crucial role in mitophagy, the autophagic elimination of damaged mitochondria that has a role in regulating stem cell fate [88]. Mitophagy is regulated by the PTEN-induced putative kinase 1 (PINK1). TOMM7 stabilizes PINK1 on the outer mitochondrial membrane, and accumulation of PINK1 bound to the TOM complex is completely blocked by the loss of TOMM7 from the TOM complex [89]. PINK1 loss-of-function compromises both mitochondrial autophagy and oxidative phosphorylation and reprograms glucose metabolism through HIF1 [90]. Pink1 deficiency also stabilizes HIF1α in cultured mouse embryonic fibroblasts and primary cortical neurons as well as in vivo [90]. This effect, mediated by mitochondrial ROS, leads to upregulation of the HIF1 target, PDK1 (pyruvate dehydrogenase kinase-1), which inhibits pyruvate dehydrogenase (PDH) activity. HIF1α stimulates glycolysis in the absence of Pink1, and the promotion of glucose metabolism by HIF1α stabilization is required for cell proliferation in Pink1−/− mice. Thus, it is possible that loss of Pink1 reprograms glucose metabolism through HIF1α, sustaining increased cell proliferation.

Independent support for the presence of TOMM7 in the darkmagenta module comes from an I-DIRT affinity proteomics study of L1 interactors [91]. TOMM40, another member of the TOM complex, was one of 37 high-confidence L1 ORF2 interactors, as was Translocase Of Inner Mitochondrial Membrane 13 (TIMM13), a member of the TIMM family of proteins that import proteins from the cytoplasm into the mitochondrial inner membrane in conjunction with the TOM complex.

3.8. LINE-1 silencing affects cytoskeletal dynamics and the MET through LIMK2, GSN, SYBL1, BLOC1S1, and RNF165

LIMK2 (darkmagenta module, Figure 1) has a key role in the MET, controlling the depolymerization of filamentous actin by phosphorylating the actin-depolymerizing factor, cofilin. LIMK2 is one of the two kinases that have been shown to phosphorylate cofilin and stabilize actin stress fibers in fibroblasts, thus blocking the MET and preventing iPSC generation from mouse embryonic fibroblasts or human fibroblasts [79]. In the MET, the actin cytoskeleton is reorganized from actin stress fibers to cortical actin, the expression of mesenchymal transcription factors such as Zeb1 and Snai1 is lost, and the cells establish tight and adherens junctions stabilized by Par3/ZO-1 or E-cadherin [92].

Gelsolin (GSN) (lower left in darkmagenta module, Figure 1 ) is another key regulator of actin filament assembly and disassembly. Gelsolin is highly expressed at tumor borders infiltrating into adjacent liver tissues, contributes to lamellipodia formation in migrating cells, and induces tumor invasion by modulating the urokinase-type plasminogen activator cascade [93].

LIMK2 also acts with SYBL1 (darkmagenta module) in the assembly and maturation of invadopodia. Invadopodia are actin-rich protrusions that degrade extracellular matrix and are required for penetration through the basement membrane, stromal invasion, and intravasation. SYBL1 encodes VAMP-7, a transmembrane protein from the soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) family. VAMP-7 localizes to late endosomes and lysosomes and is involved in the fusion of transport vesicles to their target membranes. MT1-MMP is delivered by the IQGAP1-WASH-exocyst complex and fuses to the membrane via VAMP-7, resulting in matrix degradation [94].

Biogenesis of Lysosomal Organelles Complex 1 Subunit 1 (BLOC1S1) (darkmagenta module, Figure 1 ) is a component of the ubiquitous BLOC1 multisubunit protein complex required for the biogenesis of specialized organelles of the endosomal-lysosomal system, including melanosomes and platelet dense granules. Loss of BLOC1 function results in downregulation of the actin-related protein-2/3 complex (Arp2/3), a seven-subunit protein complex that plays a major role in the regulation of the actin cytoskeleton. This complex is present in cellular regions characterized by dynamic actin filament activity, including the leading edges of motile cells in lamellipodia, and also has a role in invadopodia [95]. The Arp2/3 complex is also potently activated by WASH [96].

The presence of RNF165/ARKL2 as a hub gene in the darkmagenta module (Figure 1), in the context of changes in expression of actin-related genes, is consistent with a bone morphogenetic protein (BMP)-driven MET. BMP has a key role in the induction of the MET [97] and RNF165/ARKL2 is an E3 ubiquitin-protein ligase that regulates motor axon elongation downstream of BMP [98]. A close homolog, RNF111/Arkadia, is a key component of TGFβ signaling [99] and amplifies TGFβ and BMP signaling through degradation of the inhibitory Smad7. Aberrant RNF111/Arkadia activity occurs in clear-cell renal-cell carcinoma, colorectal cancer, and non-small cell lung cancer [100, 101, 102, 103]. In contrast, little is known about RNF165 outside the nervous system, although it appears to have a significant role in metastatic prostate carcinoma [104].

3.9. LINE-1 affects stress granule formation through SRP9

The signal recognition particle (SRP) is a cytoplasmic ribonucleoprotein consisting of six polypeptides and a 300-nucleotide (7SL) RNA molecule. SRP9, a key component of the SRP, is a member of the darkmagenta module (Figure 1), while another component, SRP14, is a member of the L1 ORF1p interactome. The SRP9 and SRP14 polypeptides form a heterodimer and bind to the 3′ and 5′ ends of the SRP 7SL RNA. The SRP functions in the co-translational targeting of secretory and membrane proteins to the rough endoplasmic reticulum (RER) by complexing with ribosomes associated with the membrane of the RER via its receptor, SRPR, a hub gene in the pink module (Figure 5).

Remarkably, the Alu family of SINEs is thought to have originated from a 7SL RNA gene early in primate evolution [105] and subsequently amplified by retrotransposition so that over 1 million copies are now present in the human genome [106]. Binding of the SRP 9/14 proteins to the RNA of Alu elements precedes and is likely to be necessary for efficient L1-mediated Alu retrotransposition [107, 108].

In addition, the SRP9/14 heterodimer can bind to cytoplasmic Alu RNA and 40S ribosomal subunits in a pathway involving the formation of stress granules (SGs) [109]. Cellular stress triggers the formation of dense cytosolic aggregations that sequester mRNA, 40S ribosomal subunits, initiation factors, and RNA-binding and signaling proteins to promote cell survival. SRP9/14 localizes to SGs following arsenite or hippuristanol treatment. The localization and function of SRP9/14 in SGs is mediated by direct binding to 40S ribosomal subunits. Binding of SRP9/14 to 40S or Alu RNA is mutually exclusive indicating that the heterodimer alone is bound to 40S in SGs and that Alu RNA may competitively regulate 40S binding. Following resolution of stress, cells actively increase cytoplasmic Alu RNA levels to promote disassembly of SGs by disengaging SRP9/14 from 40S [109].

The involvement of stress granules in tumor initiation in breast cancer cells was discovered by screening for intracellular proteins enhancing the effect of chemotherapeutic agents on TIC-enriched breast cancer cells [110]. This screen identified 44 proteins that interacted with the lead compound, C108, including the stress granule-associated protein, GTPase-activating protein (SH3 domain)-binding protein 2 (G3BP2). G3BP2 was shown to regulate breast tumor initiation through the stabilization of squamous cell carcinoma antigen recognized by T cells 3 (SART3) mRNA, leading to increased expression of the pluripotency transcription factors Oct4 and Nanog. THOC6, an interaction partner of DDX39B in the THO complex and involved in the nuclear export of pluripotency-related transcripts, was also among the 44 interacting partners of C108.

At least two genes in the darkmagenta module (Figure 1) are linked to the C108 protein interactome, thus supporting the involvement of this module in SG formation. PTPN9 is present in the C108 protein interactome, while AK130123 is highly similar to PPP2R2A, whose gene product interacts with those of PPP2R1A and PPP2R1B (both present in the C108 protein interactome).

Another three interaction partners of C108 (IGF2BP1, IGF2BP2, and PABPC1) and SART1, but not SART3, are also present in the L1 ORF1p interactome. The degree of overlap between these two interactomes is statistically significant with a representation factor of 12.6 and p < 0.002. L1 ORF1 protein has, in fact, been shown to localize in stress granules with other RNA-binding proteins, including components of the RISC complex [111].
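The representation factor and p-value quoted for interactome overlaps of this kind can be illustrated with a minimal sketch, assuming the conventional definition (observed overlap divided by the overlap expected by chance) and a hypergeometric upper-tail test. The text does not state the interactome sizes or the background proteome size that were used, so every number below is a hypothetical placeholder and the output is not meant to reproduce the reported value of 12.6.

```python
from math import comb


def representation_factor(overlap, size_a, size_b, universe):
    """Observed overlap divided by the overlap expected by chance."""
    expected = size_a * size_b / universe
    return overlap / expected


def hypergeom_upper_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b proteins are drawn at random from a
    universe that contains size_a proteins belonging to the other list."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical inputs (NOT taken from the chapter): a background of 20,000
# proteins, two interactomes of 50 and 100 proteins, and 5 shared proteins.
UNIVERSE, SIZE_A, SIZE_B, OVERLAP = 20_000, 50, 100, 5

rf = representation_factor(OVERLAP, SIZE_A, SIZE_B, UNIVERSE)
p_value = hypergeom_upper_tail(OVERLAP, SIZE_A, SIZE_B, UNIVERSE)
print(f"representation factor = {rf:.1f}, p = {p_value:.3g}")
```

A representation factor above 1 indicates more overlap than expected by chance; the result depends strongly on the assumed background size, which is why that choice should be reported alongside the statistic.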

3.10. LINE-1 is likely to promote the cancer stem cell phenotype through SART1 and SART3

Although not members of the darkmagenta or any other module, SART1, a member of the L1 ORF1p interactome, and the functionally related SART3/TIP110 implicate the L1 ORF1 protein in promotion of the cancer stem cell phenotype. SART1 (also known as U4/U6.U5 Tri-SnRNP-Associated Protein 1) encodes two proteins, the SART1(800) protein expressed in the nucleus of the majority of proliferating cells and the SART1(259) protein expressed in the cytosol of epithelial cancers. The SART1(259) protein is translated by −1 frameshifting during post-transcriptional regulation. SART1(800) plays an essential role in mRNA splicing by recruiting the tri-snRNP to the pre-spliceosome during spliceosome assembly. In contrast, SART3 associates transiently with U6 and U4/U6 snRNPs during the recycling phase of the spliceosome cycle. As mentioned before, stabilization of SART3 mRNA leads to increased expression of the pluripotency transcription factors, Oct-4 and Nanog [110]. SART3 also regulates OCT4 splicing in hESCs [112].

A recent proteomics study identified 13 SART3/TIP110-interacting cellular proteins, 5 of which are also present in the L1 ORF1p interactome [113]. This degree of overlap is highly significant with a representation factor of 76.1 and a p-value < 3.694E-09. These observations suggest that L1 affects SART3 in some way, thus implicating L1 in SART3-mediated breast cancer initiation.

Like SART3, SART1(800) also has fundamental roles in the formation of cancer stem cells. SART1(800), also known as hypoxia-associated factor (HAF), is overexpressed in a variety of tumor types. HAF is an E3 ubiquitin ligase that binds to and ubiquitinates HIF-1α by an oxygen- and pVHL-independent mechanism, targeting HIF-1α for proteasomal degradation [114]. HAF expression lowers HIF-1α levels and decreases HIF-1 transactivating activity. HAF also binds to HIF-2α but does not lead to its degradation and instead increases HIF-2 transactivating activity. Thus, HAF expression switches the hypoxia response of the cancer cell from HIF-1α- to HIF-2α-dependent transcription of genes such as MMP9 and OCT-3/4. This switch by HAF promotes the cancer stem cell phenotype and invasion, resulting in highly aggressive tumors in vivo [115].

3.11. LINE-1 silencing affects cancer-related signal transduction pathways by downregulating DGKA and GNA15

The presence of diacylglycerol kinase alpha (DGKA) and G protein subunit alpha 15 (GNA15) in the darkmagenta module ( Figure 1 ) implicates LINE-1 silencing in affecting signal transduction pathways. Increasing evidence points to DGKA (DGKα) being a major node in oncogenic signaling [116]. DGKA converts diacylglycerol (DAG) to phosphatidic acid (PA), with both being critical lipid second messengers found in the plasma membrane. DGKA activity terminates DAG signaling and has been linked to activation of NF-κB, HIF-1α, c-Met, ALK, and VEGF [117]. DAG, in turn, binds directly to protein kinase C and D family members, to Ras family members, and to diacylglycerol kinase family members, while PA controls the activity of mTOR, Akt, and Erk.

DGKA plays an important role in the spread and invasion of breast cancer cells [118]. Among the microenvironment signals sustaining cancer cell invasiveness, stromal cell-derived factor-1α (SDF-1α, or CXCL12) plays a major role in several cancers, including breast cancer [119]. SDF-1α is a chemokine secreted by tumor-associated fibroblasts and bone marrow stromal cells, which by activating its CXCR4 receptor (tan module, Figure 6 ), promotes migration and invasion of malignant cells and their homing to target organs [120, 121]. Following SDF-1α stimulation, DGKA is activated and localized at cell protrusions, promoting their elongation and mediating SDF-1α-induced MMP-9 metalloproteinase secretion and matrix invasion. PA generated by DGKA promotes recruitment of atypical PKCs (protein kinase C’s) to cell protrusions or ruffling sites, which play an essential role by promoting Rac-mediated protrusion elongation and localized recruitment of β1 integrin and MMP-9. Moreover, DGKA activity sustains the pro-invasive activity of metastatic p53 mutations, by promoting the recycling of α5β1 integrin to invasive protrusions in tridimensional matrix [122].

GNA15 (also known as G15, Gα15 or GNA16) is a heterotrimeric G protein α subunit selectively expressed in immature hematopoietic and epithelial cells with high renewal potential. GNA15 is notable for its ability to bypass the usual selectivity of receptor-G protein interactions and to non-selectively couple structurally and functionally diverse receptors to phospholipase C [123]. Following activation of GPCRs, rapid desensitization of receptor responsiveness normally prevents uncontrolled signaling and is initiated by phosphorylation of the receptor by GPCR kinases [124, 125] followed by uncoupling of GPCR-G protein interactions mediated by β-arrestin protein family members [126, 127]. Intriguingly, however, GNA15 is not affected by GPCR desensitization. In certain cell lineages, GNA15 amplifies incoming stimuli regardless of β-arrestin-induced desensitization, thus promoting sustained activation of its downstream effectors, including key players in cancer signal transduction such as PKD1, Ras, Raf, PI3K, MEK, PKCs, and STATs [128, 129, 130, 131]. Based on its resistance to desensitization and extraordinarily poor coupling selectivity [128], GNA15 may promote unconventional stimulation based on prolonged auto/paracrine activation of GPCRs. These may include GPCRs known for supporting the immature stages of pancreatic cancer, such as CXCR4 [132, 133], S1PRs [134, 135, 136], Frizzled [137, 138], and Smoothened (SMOH) [139, 140, 141].

GNA15 was recently identified in a three-gene signature highly expressed in a leukemic stem cell-enriched CD34+ cell fraction in normal karyotype acute myeloid leukemia [142]. Ectopic expression of GNA15 is also found in pancreatic carcinoma [141]. In contrast, GNA15 mRNA and protein expression were found to be severely downregulated in a panel of non-small cell lung cancer cell lines and in human lung adenocarcinoma and squamous carcinoma patients [143]. Additionally, GNA15 has been identified as a regulator of non-small cell lung cancer cell proliferation and anchorage-independent cell growth [143].

3.12. Genes involved in protein kinase R stress signaling are enriched in the darkmagenta module

We uploaded the gene list from the darkmagenta module to the MetaCore web server (Clarivate Analytics; https://clarivate.com/) to search for enriched cellular pathways. The most significant pathway identified was that of “Apoptosis and survival_Role of PKR in stress-induced apoptosis” with a raw p-value = 4.925E-6 and an FDR-corrected p-value = 8.126E-4. The darkmagenta module contains 3 of the 53 genes in this pathway. These are NFKBIB, IFNB1, and AK130123 (a probable transcript variant of PPP2R2A). Although not identified as a member of this pathway, IL3 (also present in the darkmagenta module) appears to positively regulate protein synthesis by inducing the inactivation of PKR via a growth factor signaling pathway.
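The relationship between a raw and an FDR-corrected p-value can be sketched as follows, assuming a Benjamini-Hochberg adjustment. The chapter does not state which FDR procedure MetaCore applies or how many pathway maps were tested, so both are assumptions here: the count of 165 tests is a hypothetical value chosen only because it is arithmetically consistent with the two reported p-values, and the other 164 p-values are placeholders.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical test set: the reported raw p-value for the top pathway plus
# 164 placeholder p-values of 0.5 for the remaining (assumed) pathway maps.
raw_pvalues = [4.925e-6] + [0.5] * 164
adjusted = benjamini_hochberg(raw_pvalues)
print(f"raw = {raw_pvalues[0]:.3e}, adjusted = {adjusted[0]:.3e}")
```

Under these assumptions the top-ranked pathway's adjusted value is simply the raw value multiplied by the number of tests, which is why a modest raw p-value can remain significant after correction only when the number of tested pathways is limited.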

Protein kinase R (PKR) (also known as eukaryotic translation initiation factor 2 alpha kinase 2/EIF2AK2) is a serine/threonine protein kinase that is activated by autophosphorylation after binding to dsRNA. By this mechanism, PKR inhibits the replication of a wide range of DNA and RNA viruses by phosphorylating the alpha subunit of eukaryotic initiation factor 2 (EIF2S1/eIF2α), a central node of the cellular response to stress signals. This impairs the recycling of EIF2S1 between successive rounds of initiation leading to inhibition of translation, which eventually results in shutdown of cellular and viral protein synthesis.

Stress-induced phosphorylation of EIF2S1 also induces stress granule assembly by preventing or delaying translational initiation and, additionally, is involved in the restriction of LINE-1 retrotransposition by SAMHD1. The HIV-1 restriction factor SAMHD1 can negatively modulate retrotransposition of LINE-1 by a mechanism that involves sequestration of L1 RNP in stress granules [144]. SAMHD1 promotes the formation of these stress granules by inducing phosphorylation of EIF2S1 and disrupting the interaction between eIF4A and eIF4G [144].

In addition to its role in stress granule formation, PKR phosphorylates p53/TP53, PPP2R5A, DHX9, ILF3, and IRS1, with DHX9 and ILF3 being members of the LINE-1 ORF1p interactome. Either as an adapter protein and/or via its kinase activity, PKR can also regulate the p38 MAP kinase, NFκB, and insulin signaling pathways and transcription factors (JUN, STAT1, STAT3, IRF1, ATF3) involved in the expression of genes encoding pro-inflammatory cytokines and interferons. PKR also has a role in the regulation of the cytoskeleton by binding to Gelsolin (GSN; darkmagenta module, Figure 1), sequestering the protein in an inactive conformation away from actin [145].

The downregulation of NFKBIB in the darkmagenta module suggests activation of NFκB signaling. Hyperactivation of NFκB induces the expression of stemness-associated genes and inflammatory genes in CSCs, but this is likely to be context-dependent, involving Toll-like receptor signaling and saturated fatty acids [146, 147].

3.13. LINE-1 silencing affects the initiation, elongation, and termination steps of protein translation

Dysregulation of three of the four major steps of mRNA translation: initiation, elongation, and termination, has been implicated in the development and progression of cancer. In addition to the role of PKR signaling in initiation mentioned above, several genes in the darkmagenta module can be directly linked to these steps as can several members of the L1 ORF1p interactome.

Elevated protein synthesis arises as a consequence of increased signaling flux channeled to eIF4F, the key regulator of the mRNA-ribosome recruitment phase of translation initiation and a critical nexus for cancer development. The eIF4F complex is a trimeric complex consisting of the eIF4E cap-binding protein, the eIF4G scaffold protein, and the eIF4A helicase and is subject to regulation by major oncogenic pathways, including the PI3K/AKT/mTOR and MAPK cascades [148]. At least three members of the L1 ORF1p interactome (eIF4B, PABPC1, and PABPC4) interact with eIF4A [148]. In addition, our STRING (string-db) [43] analysis of the LINE-1 ORF1p interactome members together with eIF4E found suggestive literature evidence for interactions between eIF4E and other components of the L1 ORF1p interactome (PCBP2, LARP1, SSB, DDX39A, RNMT, and HNRNPA1) (data not shown). Finally, eIF1B, a highly connected gene in the darkmagenta module (Figure 1), is a key player in start codon selection, a critical step in translation initiation that sets the reading frame for decoding [149].

EEF1A1P9 or EEF1AL7 (LOC441032 in the darkmagenta module, Figure 1) is a pseudogene related to eukaryotic translation elongation factor 1A1 (eEF1A1/EEF1-α 1), an isoform of eEF1A. eEF1A is a protein subunit of the eukaryotic translation elongation factor 1 (eEF1) complex, which is composed of eEF1A, valyl-tRNA synthetase, and the eEF1B complex, comprising eEF1G, eEF1B, and eEF1D. Overexpression of EEF1D/eEF-1δ in cadmium-transformed Balb/c-3T3 cells in conjunction with eIF3 is a major mechanism responsible for cell transformation and tumorigenesis induced by cadmium [150]. In addition to its canonical role in translational elongation, eEF1A has a growing list of functions beyond protein synthesis, including protein degradation [151, 152], apoptosis [153, 154], nucleocytoplasmic trafficking [155], heat shock [156], and multiple aspects of cytoskeletal regulation [157]. eEF1A1 may also mediate turnover of the LINE-1 restriction factor, SAMHD1, by targeting it to the proteasome for degradation [158].

While translation termination is generally not considered a major target of tumorigenesis, eukaryotic release factors such as AF447869/GSPT1/eRF3 (darkmagenta module, Figure 1 ) are implicated in gastric cancer [159]. GSPT1/eRF3 is also involved in the regulation of cytoplasmic mRNA decay in association with Poly(A)-binding protein (PABP), two isoforms of which, PABPC1 and PABPC4, are present in the L1 ORF1p interactome. GSPT1 also has a role in nonsense-mediated decay [160].

There are five known GSPT1/eRF3a human alleles, one of which has been correlated with increased cancer risk in several studies and which may act by decreasing the binding affinity of GSPT1 for PABP [161]. Alternatively, GSPT1/eRF3 may be involved in tumorigenesis as a result of its non-translational roles, which involve cell cycle dysregulation, apoptosis, and transcription [159].

3.14. E3 ubiquitin protein ligases that affect oncoprotein stability are hub genes in several modules

Proteins that promote cell proliferation must be expressed in a controlled manner but also efficiently degraded. A major pathway for such targeted protein degradation is the ubiquitin-proteasome system (UPS), and oncoproteins that drive tumor development are often deregulated and stabilized in malignant cells. Several E3 ubiquitin protein ligases targeting oncoproteins are hub genes in other modules, including BTRC (a hub gene in the darkolivegreen module, Figure 7; fold change -1.68, downregulated in L1-silenced cells versus controls) and FBXW11 and FBXW7 (hub genes in the pink module, Figure 5, although neither is differentially expressed in L1-silenced cells versus controls). FBXW10 and BC067077/MDM2, although not hub genes, are present in the darkmagenta module (Figure 1).

A number of proteins driving the development and progression of cancer are direct or indirect targets of the UPS. For example, FBXW7 (FBW7 or F-box and WD repeat domain containing 7 E3 Ub protein ligase) promotes ubiquitination and proteasomal degradation of mTOR [162]. This leads to breast cancer suppression in cooperation with PTEN. BTRC also regulates mTOR activity through the targeted degradation of DEP domain-containing mTOR-interacting protein (DEPTOR), an inhibitor of both mTORC1 and mTORC2 [163]. NOTCH signaling is involved in the short-range communication between neighboring cells, and its activation plays a key role in cancer progression. NOTCH receptors are regulated by multiple E3s, and turnover of the unstable NOTCH intracellular domains is also mediated by FBXW7 [164, 165]. In addition, the RING finger E3 Ubiquitin ligase BC067077/MDM2 (E3 Ub ligase mouse double minute 2), present in the darkmagenta module (BC067077, Figure 1 ), is an oncoprotein in its own right and a negative regulator of p53 protein expression [166].

MYC proteins are regulated by at least five different E3 ubiquitin ligases, including FBXW7 and BTRC [167]. FBXW7 acts as a negative regulator of MYC [168], while BTRC positively regulates MYC protein stability [169]. In addition to control of MYC protein by the UPS, a number of other modulators of MYC activity have prominent positions in key modules. The STK38 kinase (pink module, Figure 5; upregulated 1.69-fold in L1-silenced cells versus controls) regulates MYC protein stability and turnover in a kinase activity-dependent manner. In human B-cell lymphomas, STK38 kinase inactivation prevents apoptosis following B-cell receptor activation, whereas silencing of STK38 decreases MYC levels and promotes apoptosis [170]. STK38 knockdown also suppresses growth of MYC-addicted tumors in vivo [170]. CSNK2A2 (a hub gene in the orange module, Figure 8; fold change -1.76, downregulated in L1-silenced cells versus controls) also phosphorylates and regulates MYC, multiple other transcription factors, and Hsp90 and its co-chaperones, and regulates Wnt signaling by phosphorylating CTNNB1 [171, 172].
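The fold changes quoted in this section (BTRC -1.68, STK38 1.69, CSNK2A2 -1.76) can be put on a common scale with a short sketch. It assumes the signed fold-change convention common in microarray reporting, in which a negative value -x denotes an x-fold decrease relative to controls; the chapter does not define the convention explicitly, and the function names below are hypothetical.

```python
from math import log2


def signed_fc_to_ratio(signed_fc):
    """Convert a signed fold change to a treated/control expression ratio.

    Assumed convention: +x means x-fold up, -x means x-fold down, |x| >= 1.
    """
    if abs(signed_fc) < 1:
        raise ValueError("signed fold change must have magnitude >= 1")
    return signed_fc if signed_fc > 0 else 1.0 / abs(signed_fc)


def signed_fc_to_log2(signed_fc):
    """Convert a signed fold change to a log2 ratio (symmetric about zero)."""
    return log2(signed_fc_to_ratio(signed_fc))


# Values quoted in the text for L1-silenced cells versus controls.
reported = {"BTRC": -1.68, "STK38": 1.69, "CSNK2A2": -1.76}
log2_ratios = {gene: signed_fc_to_log2(fc) for gene, fc in reported.items()}

for gene, value in log2_ratios.items():
    print(f"{gene}: signed FC {reported[gene]:+.2f} -> log2 ratio {value:+.3f}")
```

On the log2 scale, up- and downregulation of similar magnitude appear as values of similar size and opposite sign, which makes changes such as those of STK38 and BTRC directly comparable.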

Other oncoproteins targeted by E3 ubiquitin ligases in the modules described here include p53 and NFKBIB/IκBβ (with NFKBIB/IκBβ being present in the darkmagenta module, Figure 1). The p53 transcription factor is a tightly regulated sensor of cellular stress and its activation can lead to cell cycle arrest, apoptosis, senescence, DNA repair, altered metabolism, or autophagy [173]. Under normal conditions, protein levels of p53 are kept low by proteasomal degradation, promoted in part through continuous targeting by MDM2 [166]. The transcription of MDM2 is also upregulated by p53, creating a feedback loop in which MDM2 targets both p53 and itself for proteasomal degradation [174]. MDM2 also blocks the transactivating activity of p53, preventing transcriptional activation of p53 target genes [175]. In addition, MDM2 can heterodimerize with the homologous RING finger protein MDM4/MDMX (a hub in the tan module, Figure 6). MDM4 binds p53 although it has no intrinsic ubiquitin ligase activity [176]. MDM2 can either mono-ubiquitinate p53, facilitating its transport to the cytoplasm and terminating p53’s nuclear activity, or cooperate with MDM4 or other Ub ligases to poly-ubiquitinate and thereby target p53 for degradation by proteasomes [177].

In unstimulated cells, NFκB proteins are generally kept inactive by binding to proteins known as inhibitors of NFκB (IκBs) [178]. In addition to its actions described above, BTRC triggers ubiquitination of the NFκB inhibitor, IκBA [179], with the closely related NFKBIB/IκBβ being a member of the darkmagenta module ( Figure 1 ). NFκB signaling controls many cellular functions, including cell growth and survival, differentiation, development, immunity, and inflammation [180], and is subject to tight post-translational regulation by protein kinases, deubiquitinating enzymes [181], and ubiquitin ligases.

Phosphorylation of IκBA by IKK targets it for BTRC-mediated ubiquitination and subsequent proteasomal degradation, allowing the NFκB protein, RelA, to translocate to the nucleus and activate gene expression. BTRC also contributes to NFκB pathway activation by promoting the formation of specific NFκB protein complexes in the nucleus through ubiquitination and partial proteolysis of IκBs, such as p105 and p100. FBXW7 also targets p100 for degradation in a GSK3β-dependent manner [201, 202, 203].

3.15. LINE-1 may affect MYC mRNA stability via MYC’s coding region instability determinant

MYC is also subject to regulation at the transcript level. In the mouse, the IGF2BP1 RNA-binding protein stabilizes c-myc RNA by associating with a coding region instability determinant (CRD) located in the last 249 nucleotides of the coding region of c-myc [185]. Four RNA-binding proteins present in the LINE-1 ORF1p interactome (HNRNPU, SYNCRIP, YBX1, and DHX9) associate with IGF2BP1 in an RNA-dependent fashion and are essential to ensure stabilization of MYC mRNA via its CRD [186]. Complex formation at the CRD may limit transfer of MYC mRNA to polysomes and subsequent translation-coupled decay. Furthermore, IGF2BP2 and IGF2BP3, two members of the LINE-1 ORF1p interactome, appear to operate redundantly with IGF2BP1 in regulating MYC mRNA, in addition to having important roles in modulating tumor cell fate [187].

As further evidence of links between the L1 ORF1p and the IGF2BP1 protein interactomes, Weidensdorfer et al. [186] identified 24 proteins associating with IGF2BP1 by immunopurification and mass spectrometry, 14 of which are also present in the L1 ORF1p interactome. This degree of overlap is highly significant, with a representation factor of 115.5 and a p-value < 4.927E-27.
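A representation factor and p-value of this kind are commonly obtained from a hypergeometric overlap test, and the sketch below illustrates that calculation. It is not necessarily the procedure used for the figures above. The overlap (14) and the size of the IGF2BP1 set (24) come from the text; the size of the ORF1p interactome and the size of the background proteome are not stated in this section, so `orf1p_size` and `background` are placeholder assumptions and the output is not expected to reproduce the reported values.

```python
from math import comb


def representation_factor(overlap, size_a, size_b, background):
    """Observed overlap divided by the overlap expected by chance."""
    expected = size_a * size_b / background
    return overlap / expected


def hypergeom_upper_tail(overlap, size_a, size_b, background):
    """P(X >= overlap) when size_a items are drawn without replacement
    from a background that contains size_b marked items."""
    total = comb(background, size_a)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_b, k) * comb(background - size_b, size_a - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    overlap = 14        # shared proteins, from the text
    igf2bp1_size = 24   # IGF2BP1-associated proteins, from the text
    orf1p_size = 100    # ASSUMPTION: placeholder, not given in this section
    background = 20000  # ASSUMPTION: placeholder proteome size

    rf = representation_factor(overlap, igf2bp1_size, orf1p_size, background)
    p = hypergeom_upper_tail(overlap, igf2bp1_size, orf1p_size, background)
    print(f"representation factor = {rf:.1f}, p-value = {p:.3e}")
```

A representation factor above 1 indicates more overlap than expected by chance, and the upper-tail probability gives the chance of seeing at least that many shared proteins under random sampling. Both numbers depend strongly on the chosen background size, so that choice should be reported alongside the result.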

4. Conclusions

The findings from our WGCNA analysis of the L1-silenced transcriptome in T47D breast cancer cells add weight to the growing body of evidence that L1 expression and activity are a cause rather than a consequence of oncogenesis. The observed changes in expression of numerous genes with fundamental roles in cancer, the formation of cancer stem cells, or the phenotypic transitions of the EMT/MET seem too concerted and too closely related by function for L1 to be dismissed as a passenger gene or epiphenomenon. Furthermore, a number of these changes in gene expression are consistent with the changes in cancer cell morphology observed upon pharmacological blockade of L1-RT. Our results also support a central hypothesis of the WGCNA method: that the similar expression profiles of genes in a module reflect common regulatory mechanisms or biological functions.

In addition to our gene expression profiling of L1-silencing, we present evidence from independent studies showing statistically significant overlaps between the L1 ORF1p and ORF2p interactomes and cancer driver genes identified by proteomics and data mining. This alone is strongly suggestive of a driver role for L1 in cancer. We also present evidence from independent proteomics studies consistent with L1 having a role in the stabilization of MYC, an oncoprotein with a key role in the global metabolic reprogramming that occurs in cancer.

In summary, we find evidence of L1 activity mounting a concerted attack on cancer cell gene expression consistent with EMT/MET-related phenotypic transitions. L1 activity is also important in the formation of breast cancer stem cells, the support of cancer cell evolvability and, probably, the development of chemoresistance.

Future directions include a more intensive transcriptomic investigation of the effects of L1 on the formation of cancer stem cells, using a wider range of cancer cell types and larger sample sizes. Another high priority will be further investigation of the effects of L1 on non-coding RNA and integration of those findings with the effects on gene expression seen here. In this context, we have already shown global upregulation of microRNA expression, mainly due to a marked increase in let-7 expression, following L1-silencing by siRNA [188]. This is consistent with the effects of PANK2 downregulation on Dicer described earlier. It is also likely that the effects of L1-silencing by siRNA differ from those induced by pharmacological blockade of L1-RT, and these differences will need to be investigated to establish whether pharmacological blockade of L1-RT is therapeutically viable. Chemotherapy is implicated in the formation of drug-resistant cancer stem cells, and NNRTI drugs like Efavirenz are probably no exception. Finally, thought should be given to targeting the L1 ORF1 protein pharmacologically, as it is likely to have a more important role than L1-RT.

Abbreviations

APOC1: Apolipoprotein C1
BLOC1S1: Biogenesis of lysosomal organelles complex 1 subunit 1
BMP: Bone morphogenetic protein
CML: Chronic myeloid leukemia
CoA: Coenzyme A
CSCs: Cancer stem cells
DAG: Diacylglycerol
DEPTOR: DEP domain containing mTOR-interacting protein
DMSO: Dimethyl sulfoxide
dPANK: Drosophila PANK homolog
eEF1: Eukaryotic translation elongation 1
EIF2S1/eIF2α: Eukaryotic initiation factor 2
EMT: Epithelial-mesenchymal transition
ESCs: Embryonic stem cells
EZH2: Enhancer of zeste 2 polycomb repressive complex 2 subunit
FKBP6: FK506 binding protein 6
G3BP2: GTPase-activating protein (SH3 domain)-binding protein 2
GBM: Glioblastoma
GSN: Gelsolin
HAF: Hypoxia-associated factor
HERV: Human endogenous retrovirus
HIF-1: Hypoxia inducible factor-1
HSP90: Heat shock protein 90
hTERT: Human telomerase reverse transcriptase
I-DIRT: Isotopic Differentiation of Interactions as Random or Targeted
IkBs: Inhibitors of NFkB
iPSC: Induced pluripotent stem cell
IQGAP1-WASH-exocyst complex:
IQGAP1: IQ Motif Containing GTPase Activating Protein 1
WASH: Arp2/3 activating protein localized at surface of endosomes where it induces formation of branched actin networks
Exocyst: Octameric protein complex involved in vesicle trafficking and cell migration
L1-KD: L1 knockdown
L1 ORF1p: L1 ORF1 protein
LIMK2: LIM Domain Kinase 2
L1: LINE-1
L1-RT: LINE-1 reverse transcriptase
MET: Mesenchymal-epithelial transition
Miwi2: Mouse homolog of PIWIL4 (Piwi Like RNA-Mediated Gene Silencing 4)
MMP-2: Matrix metalloproteinase 2
MT: Metallothionein
MT1M: Metallothionein 1 M
MT1-MMP: Membrane type 1 metalloprotease
MTF1: Metal-responsive transcription factor
NO: Nitric oxide
NPC: Nuclear pore complex
PA: Phosphatidic acid
PANK2: Pantothenate kinase 2
PDH: Pyruvate dehydrogenase
Pin1: Peptidyl prolyl isomerase 1
PINK1: PTEN-induced putative kinase 1
piRNA: Piwi-interacting RNA
Piwi: P-element Induced WImpy testis (a subfamily of Argonaute proteins)
PKR: Protein Kinase R
RMA: Robust multi-array average
ROS: Reactive oxygen species
RT: Reverse transcriptase
SART3: Squamous cell carcinoma antigen recognized by T cells 3
SDF-1α, or CXCL12: Stromal cell-derived factor-1α
SINEs: Short interspersed elements
SMOH: Smoothened
SNARE: Soluble N-ethylmaleimide-sensitive factor attachment protein receptor
SRP: Signal recognition particle
TET: Ten–eleven translocation
TIC: Tumor-initiating cell
TOMM7: Translocase of Outer Mitochondrial Membrane 7
UPS: Ubiquitin–proteasome system
WGCNA: Weighted gene correlation network analysis

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Stephen Ohms, Jane E. Dahlstrom and Danny Rangasamy (February 28th 2018). Passenger or Driver: Can Gene Expression Profiling Tell Us Anything about LINE-1 in Cancer?, Gene Expression and Regulation in Mammalian Cells - Transcription Toward the Establishment of Novel Therapeutics, Fumiaki Uchiumi, IntechOpen, DOI: 10.5772/intechopen.73266. Available from:
