To ensure the precise use of genetic information, gene expression is regulated at the level of transcription as well as at multiple post-transcriptional levels, including splicing, transport, localization, mRNA stability, and translation. During evolution, cells developed precise mechanisms to ensure that each transcript is appropriately stored, modified, translated or degraded, depending on the cell’s need for the mRNA or its encoded protein. Steady-state protein levels within a cell correlate poorly with steady-state levels of mRNA, leading scientists to hypothesize that gene expression is regulated at post-transcriptional levels. Work over the past quarter century has identified unifying concepts in post-transcriptional regulation. One unifying concept is that post-transcriptional regulation is mediated by two major molecular components:
Various experimental approaches have been developed to understand the interaction between RBPs and the network of transcripts that they regulate. One of the most widely used techniques involves immunopurification of specific RNA-binding proteins from cellular extracts followed by high-throughput analysis of the co-purified RNA species. The coupling of this technique to powerful bioinformatic analysis methods has led researchers to understand the binding specificity of a wide variety of RBPs. The advent of new technologies such as next-generation sequencing and chemical cross-linking procedures has improved these methodologies and allowed for the fine-scale mapping of RBP binding sites, as well as the refinement of RBP binding motifs. Microarray-based studies that evaluated mRNA decay rates on a global basis have also provided valuable information about the post-transcriptional regulation of a wide variety of transcripts that have important physiological functions.
This chapter focuses on the role of CELF1 (CUGBP and embryonically lethal abnormal vision-type RNA binding protein 3-like factor 1) in the regulation of posttranscriptional gene expression. CELF1 regulates posttranscriptional gene expression by binding to RNA sequences known as GU-rich elements (GREs). Genome-wide measurements of mRNA decay and bioinformatic sequence motif discovery methods were used to identify the GRE as a highly conserved sequence that was enriched in the 3’UTR of mRNA transcripts with short half-lives in primary human T lymphocytes. This sequence resembled previously characterized binding sites for CELF1, and CELF1 was found to bind with high affinity to GRE sequences and mediate mRNA degradation. This chapter reviews how CELF1 and its target transcripts function as an evolutionarily conserved posttranscriptional regulatory network that plays important roles in health and disease.
2. Evolutionary conservation of CELF proteins
The CELF protein family is an evolutionarily conserved family of RNA-binding proteins that play essential roles in post-transcriptional gene regulation. These proteins contain three highly conserved RNA-recognition motifs (RRMs), with the two N-terminal RRMs and the C-terminal RRM being separated by a highly divergent linker domain. The RRMs confer RNA-binding activity, and it is postulated that the divergent linker domain is an important site for functional regulation. Six members of the CELF family have been identified in humans and mice: CELF1 (CUGBP1) and CELF2 (CUGBP2) proteins are expressed ubiquitously and play vital roles in embryogenesis, whereas CELF proteins 3-6 are restricted to adult tissues and found almost exclusively in the nervous system. CELF proteins often serve multiple functions in both the cytoplasm and the nucleus. Human CELF1 and its orthologs in
CELF1 function is conserved across evolution, both at the level of biochemical mechanism and in its role in regulating development. Transcript deadenylation is often the first step in the mRNA degradation process, and CELF1 has been shown to promote transcript deadenylation in diverse species. In
As described below, CELF proteins from diverse species bind to RNA preferentially at GU-rich sequences and thereby regulate post-transcriptional processes such as mRNA splicing, translation, deadenylation and mRNA degradation. The structure and biochemical properties of CELF family members suggest functional redundancy, yet each CELF protein targets specific sub-populations of RNA transcripts and appears to have distinct functions. We are starting to understand the mechanisms by which an individual CELF protein can serve multiple biochemical functions to coordinately regulate gene expression at posttranscriptional levels.
3. Biochemistry of binding by CELF proteins to target mRNA
CELF1 and CELF2 proteins were first isolated and characterized as novel heterogeneous nuclear ribonucleoproteins (hnRNPs). Timchenko et al. demonstrated that these proteins bound to RNA containing the sequence (CUG)8 within the 3’UTR of myotonin protein kinase mRNA
Structural studies have provided valuable insight into the mechanisms underlying the RNA-binding activity of CELF1. CELF proteins all contain two N-terminal and one C-terminal RNA recognition motifs (RRMs), separated by a 160-230 residue divergent domain. The highly conserved RRMs bind to RNA in a sequence-specific manner. Nuclear magnetic resonance (NMR) spectroscopy-based solution studies demonstrated that RRM1 and RRM2 each contribute to binding to a 12-nt target RNA containing two UUGUU motifs. The tandem RRM1/2 domains together show higher affinity for an RNA sequence with two sequential UUGU(U) motifs than either domain alone, indicating binding cooperativity between the two RRMs. Crystallographic studies showed that both RRM2 and RRM1 bind to GRE-RNA, and RRM1 is important for crystal-packing interactions.
In addition to RRM1 and RRM2, RRM3 also has RNA-binding activity. According to NMR analysis, RRM3 specifically recognizes the UGU trinucleotide segment of bound (UG)3 RNA through extensive stacking and hydrogen-bonding interactions within the pocket formed by the beta-sheet and the conserved N-terminal extension. Experiments investigating CELF1 function through a yeast three-hybrid system indicated that deletion/mutation of RRM1 or RRM2 does not abrogate binding to GU-rich RNA, suggesting that RRM3 may recognize GU-repeats more avidly than RRM1 or RRM2. Additionally, it has been reported that RRM3 is able to recognize a poorly defined G/C-rich sequence from the 5’UTR of Cyclin D1 when combined with the divergent domain. The divergent domain also appears to be important for RNA binding, since the presence of the divergent domain within recombinant CELF1/CELF4 chimeric proteins increased RNA-binding affinity, perhaps by conveying important conformational changes necessary for RNA binding. The divergent domain may also facilitate CELF:CELF homotypic interactions, which may influence its activity. For example, CELF:CELF interactions appear to activate RNA deadenylation in
3.1. Regulation of CELF1 function through phosphorylation
CELF1 is a known phosphoprotein with multiple predicted phosphorylation sites, and CELF1 phosphorylation appears to regulate its function as a mediator of alternative splicing, mRNA decay, and translational regulation. One of the pathologic events that occurs in myotonic dystrophy type 1 (DM1) is an increase in the protein abundance of CELF1 and an associated increase in CELF1-mediated alternative splicing activity. This increase in CELF1 protein abundance results from increased CELF1 protein stability secondary to hyperphosphorylation. In DM1, the (CUG)n expansion of the DMPK 3’UTR leads to protein kinase C (PKC) activation through an unknown mechanism. PKC, in turn, hyperphosphorylates CELF1, resulting in increased protein stability and abundance as well as increased splicing activity. Additionally, in transgenic mouse models of DM1, mice treated with specific inhibitors of the PKC pathway showed amelioration of cardiac abnormalities associated with the disease phenotype. Phosphorylation of CELF1 also influences its ability to regulate muscle development. CELF1 phosphorylation by Akt kinase at Ser 28 in normal muscle myoblasts influences its ability to affect the translation of its target transcripts during differentiation. Phosphorylation of CELF1 also directly influences its RNA-binding activity. For example, cyclin D3-Cdk4/6 phosphorylates CELF1 at Ser 302, altering the binding specificity of CELF1 to RNA and translation initiation proteins, such as eIF2α. During the process of T cell activation, phosphorylation of CELF1 alters binding by CELF1 to target transcripts. Shortly following T cell activation, CELF1 becomes phosphorylated, dramatically decreasing its affinity for mRNA and leading to stabilization of CELF1 target transcripts. Overall, these studies show that phosphorylation regulates the many functions of CELF1 in posttranscriptional gene regulation.
3.2. Identification of CELF1 target transcripts
Insight into the biological significance of CELF1 as a coordinate regulator of a post-transcriptional network came from the experimental determination of CELF1 target transcripts. A technique involving RNA immunoprecipitation followed by microarray analysis of associated transcripts (RIP-Chip) has allowed for the unbiased, genome-wide experimental identification of RNA-binding protein target transcripts. This technique involves immunoprecipitating an RNA-binding protein of interest from cell lysates under conditions that preserve RNA:protein interactions. The co-purified RNA found associated with the immunoprecipitated RNA-binding protein is then isolated and interrogated using high-throughput methods such as microarrays. Using this methodology, CELF1 targets have been identified in HeLa cells, resting and activated human T cells, and mouse myoblasts. CELF1 targets, identified in cytoplasmic extracts from HeLa cells using an anti-CELF1 antibody, were analyzed to identify the CELF1 target sequence, which is known as the GRE. The sequence profile of CELF1 target transcripts was analyzed for enriched sequences using a Markov chain Monte Carlo-based Gibbs sampler algorithm (BioProspector) as well as an overrepresentation algorithm, and the previously described GRE sequence, UGUUUGUUUGU, and a GU-repeat sequence, UGUGUGUGUGU, were found to be highly overrepresented in the 3’UTRs of the CELF1 target transcripts. Both sequences were validated as CELF1-binding targets and were shown to function as mRNA decay elements by accelerating the decay of reporter transcripts. While GU-repeat sequences had previously been identified as a CELF1 recognition motif through
Another approach to identify targets of RNA-binding proteins uses a cross-linking step prior to immunoprecipitation (CLIP), followed by high-throughput methods to identify protein binding sequences. Using this method, 315 CELF1 RNA targets were identified in whole cell extracts from mouse hindbrain. These RNA-binding targets for CELF1 were enriched in UG repeat sequences, with 64% of target sequences found in introns and 25% found in 3’UTR sequences. A similar analysis of CELF1 in the C2C12 mouse myoblast cell line extensively characterized RNA-binding sites of CELF1 and found that CELF1 bound predominantly in 3’UTRs and caused mRNA decay. The authors also found significant enrichment of CELF1 binding sites in intronic regions flanking exons, supporting a role for CELF1 in alternative splicing. Overall, these studies suggest that GU-rich sequences serve as genuine binding sites for CELF1 in a manner that has been conserved through evolution. In the next sections, we review the data supporting the model that CELF1 recognizes GU-rich sequences and thereby regulates pre-mRNA splicing, translation, and/or mRNA deadenylation/decay depending on the cellular and environmental context.
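As a rough illustration of what it means for the GRE and GU-repeat sequences to be overrepresented among CELF1 targets, the sketch below scans sequences for exact copies of the two 11-mers quoted above and compares the fraction of sequences that carry a site. It is a minimal sketch only: it is not the BioProspector Gibbs sampler or the overrepresentation algorithm used in the studies described here, it ignores degenerate matches, and the function names and toy sequences are hypothetical.

```python
import re

# The two 11-mers quoted in the text; everything else is illustrative.
GRE = "UGUUUGUUUGU"
GU_REPEAT = "UGUGUGUGUGU"


def find_motif_sites(sequence, motif):
    """Return 0-based start positions of all exact matches, including overlaps."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA alphabets
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(rna)]


def fraction_with_motif(sequences, motifs=(GRE, GU_REPEAT)):
    """Fraction of sequences that contain at least one of the motifs."""
    if not sequences:
        return 0.0
    hits = sum(
        1
        for seq in sequences
        if any(find_motif_sites(seq, motif) for motif in motifs)
    )
    return hits / len(sequences)


# Hypothetical toy 3'UTR fragments (assumed for illustration, not real transcripts).
bound = ["AAUGUUUGUUUGUCC", "CCUGUGUGUGUGUGUAA", "ACGACGACG"]
background = ["ACGACGACGACG", "AAUGUUUGUUUGUCC", "CCCCAAAA", "GGGGCCCC"]

if __name__ == "__main__":
    print("bound fraction:", fraction_with_motif(bound))
    print("background fraction:", fraction_with_motif(background))
```

In a real analysis, the bound and background sets would be the 3’UTRs of immunoprecipitated and non-enriched transcripts, and the difference in fractions would need a statistical test rather than a direct comparison.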
4. CELF1 as a regulator of splicing
Pre-mRNA alternative splicing is a common mechanism for generating transcript and protein diversity. An estimated 90% of human genes produce alternatively spliced transcripts. Alignment of the genomic regions adjacent to mammalian intron-exon splice sites identified TG-rich motifs (TTCTG and TGTT) as conserved
CELF1-mediated regulation of alternative splicing is critical for maintenance of normal muscle structure and function. Much of what we know about the role of CELF1 in alternative splicing comes from studies investigating the role of CELF1 in the pathogenesis of the neuromuscular disease myotonic dystrophy type 1 (DM1). In this disease, aberrant gain of CELF1 function is combined with a corresponding loss of function of the splicing factor MBNL1, resulting in the mis-splicing of a number of crucial genes. Minigene reporter systems that contain alternative splice sites proved to be useful tools for the identification of pre-mRNA targets for CELF1, including genes for cardiac troponin T (TNNT2), insulin receptor (INSR), and chloride channel 1 (CLCN1). Interestingly, these genes were all shown to be mis-regulated in tissues from patients with DM1. Minigene systems have been particularly useful in demonstrating that individual pre-mRNA splicing events are affected by loss or gain of activities of specific regulatory proteins. Studies performed in cultured cells with transiently transfected minigenes have identified a number of alternative gene regions regulated by CELF1 and other family members. However, as in other chimeric systems, the results of minigene overexpression experiments may not necessarily reflect the full-length pre-mRNA splicing patterns observed
5. CELF proteins as regulators of deadenylation, translation, and mRNA decay
CELF1 plays important roles in mRNA stability and translation in diverse species. In eukaryotic organisms, the length of a transcript’s polyA tail influences its translational state, and deadenylation is regulated by GU-rich sequences and CELF1 proteins across evolution. Regulation of translation through deadenylation in
Removal of the polyA tail is the rate-limiting step in the degradation of the majority of mammalian mRNAs. In human cell lines, CELF1 has been shown to associate with the deadenylase enzyme polyA ribonuclease (PARN) and to stimulate polyA tail shortening in a cell-free assay using S100 extracts from human cells. It is not known if CELF1 activates other deadenylases in mammalian cells or how deadenylated transcripts are subsequently degraded. PARN, EDEN-BP and cytoplasmic polyadenylation element-binding proteins (CPEB) are present in
Translation is a critical layer of post-transcriptional control of gene expression that is regulated in response to environmental and developmental changes. CELF proteins have been shown to be involved in the activation of translation of several mRNA species at various stages of development. Additionally, CELF proteins have been shown to function as inhibitors of translation under conditions of stress, where they act as translational silencers in conjunction with other protein binding partners. The involvement of CELF1 in translational regulation is evolutionarily conserved, with several CELF1 homologues having been shown to regulate translation. For example, in the
One well-studied instance of CELF1-mediated translational control involves the translation of alternative isoforms of the transcription factor CCAAT/enhancer-binding protein beta (C/EBPbeta). In a rat model, CELF1 phosphorylation was activated by partial hepatectomy, which promoted the formation of a complex between CELF1 and eIF2α. This subsequently led to selective translation of the liver-enriched inhibitory protein (LIP) isoform of C/EBPbeta. It was later shown that in liver, CELF1 undergoes hyperphosphorylation through a GSK3beta-cyclin D3-cdk4 kinase pathway, and the activity of this pathway seemed to increase with age. Similar to the partial hepatectomy model, the cdk4-mediated hyperphosphorylation of CELF1 was involved in the age-associated induction of the CELF1-eIF2 complex. In the rat aging model, the CELF1-eIF2 complex binds to the 5’UTR of HDAC1 mRNA and increases histone deacetylase 1 protein levels in aging liver. It was further shown that during rat aging, CELF1 phosphorylation promotes its interaction with a GC-rich sequence in the 5’UTR of p21 mRNA, causing p21 translational arrest and senescence in fibroblasts. In myocytes, p21 mRNA is stabilized in discrete cytoplasmic structures called stress granules, which serve as reversible storage sites for mRNA under conditions of stress. Interestingly, only during late senescence did the localization of p21 mRNA in stress granules interfere with its translation. One important component of stress granules is the RNA-binding protein T cell internal antigen 1 (TIA1). Consistent with the recruitment of CELF1 to stress granules, CELF1 has been shown to function as a translational silencer through interaction with the TIA1 protein. Further support for this model comes from experiments using DM1 cells harboring CUG repeat RNA.
The presence of a CUG repeat expansion was found to cause stress and activation of the PKR-phospho-eIF2α–CELF1 pathway, leading to stress granule formation and inhibition of mRNA translation. This disruption of physiologic mRNA translation pathways by cellular stress signals might contribute to the progressive muscle loss in DM1 patients. Taken together, these data suggest that CELF proteins may function as activators or repressors of translation, depending on the context.
5.3. mRNA Decay
Bioinformatic analysis of short-lived transcripts in primary human T cells led to the identification of the conserved GU-rich element (GRE), which is enriched in the 3’UTRs of these transcripts. CELF1 was subsequently identified as a protein that specifically bound to the GRE
In mouse myoblasts, cytoplasmic CELF1 bound hundreds of target transcripts that contained GU-rich sequences, including networks of transcripts that regulated cell cycle, intracellular transport and cell survival. Knockdown of CELF1 in this myoblast cell line led to the stabilization of many endogenous GRE-containing targets, as well as luciferase reporter RNAs. Many CELF1 target transcripts were found to be significantly stabilized in CELF1 knockout myoblasts, suggesting that CELF1 mediates the decay of a network of transcripts during myoblast growth and differentiation. In the DM1 disease model, there is aberrant activation of the protein kinase C pathway as a result of the CTG expansion, and this results in CELF1 phosphorylation. Mouse myoblasts (C2C12 cells) made to express CTG-expanded RNA showed stabilization of tumor necrosis factor alpha (TNF-alpha) mRNA. This result suggested that the over-expression of TNF-alpha observed in DM1 could originate from muscle, and this TNF-alpha overexpression may contribute to the muscle wasting and insulin resistance that are characteristic of this disease. In summary, CELF1 and its GRE-containing target transcripts define posttranscriptional regulatory networks that function to control cellular growth, activation, and differentiation (Figure 3).
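Statements that a transcript is "stabilized" or "short-lived" rest on an estimate of its half-life. The chapter does not specify how half-lives were computed in the studies cited, so the sketch below assumes the commonly used first-order decay model, in which transcript abundance after transcriptional arrest follows A(t) = A0·exp(-k·t) and the half-life is ln(2)/k. The function name and the time-course values are hypothetical.

```python
import math


def estimate_half_life(times, abundances):
    """Estimate a decay constant k and half-life from a decay time course.

    Fits ln(abundance) = ln(A0) - k * t by ordinary least squares,
    assuming first-order decay. Returns (k, half_life) in the time
    units of `times`.
    """
    if len(times) != len(abundances) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    if any(a <= 0 for a in abundances):
        raise ValueError("abundances must be positive to take logarithms")

    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))

    k = -sxy / sxx  # the fitted slope is -k
    if k <= 0:
        raise ValueError("no measurable decay in this time course")
    return k, math.log(2) / k


# Hypothetical time course (minutes after transcriptional arrest), constructed
# so that abundance halves every 60 minutes.
example_times = [0, 30, 60, 120]
example_abundances = [100 * 0.5 ** (t / 60) for t in example_times]

if __name__ == "__main__":
    k, half_life = estimate_half_life(example_times, example_abundances)
    print(f"k = {k:.4f} per min, half-life = {half_life:.1f} min")
```

Under this model, stabilization of a CELF1 target after CELF1 depletion would appear as a smaller fitted k and a correspondingly longer half-life when the two conditions are compared.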
6. The GRE/CELF1 posttranscriptional network in human diseases
The CELF family is an evolutionarily conserved family of RNA-binding proteins that plays an essential role in several aspects of post-transcriptional gene regulation and participates in the control of important developmental processes. Disruption of CELF1/GRE-mediated mRNA regulation may play a role in the pathophysiology of developmental defects or cancer. In
Although the field of CELF1 biology is relatively young, some data support a potential link between dysregulated CELF1-mediated RNA metabolism and a cancerous phenotype. One recent study found CELF1 to be one of the top ten candidates in a transposon-based genetic screen in mice to identify potential drivers of colorectal tumorigenesis. Additionally, CELF1 expression has been shown to be lost through a t(1;11)(q21;q23) translocation in certain forms of pediatric acute leukemia. One way in which disruption of CELF1 may contribute to a malignant phenotype is through dysregulation of C/EBPbeta expression. In HER2-overexpressing breast cancer cells, CELF1 is activated, favoring the production of the C/EBPbeta transcription-inhibitory isoform LIP over that of the active isoform LAP, and this contributed to evasion of TGFbeta and oncogene-induced senescence. Treatment of HER2-transformed metastatic breast cancer cells with the anti-HER2/neu monoclonal antibody trastuzumab reduced CELF1 protein level and its activity, suggesting that the targeting of CELF1 may be a viable adjunct therapy in the treatment of breast cancer. Expression of C/EBPbeta and C/EBPalpha is translationally repressed in BCR/ABL cells (chronic myelogenous leukemia) and can be re-induced by imatinib via a mechanism that appears to depend on the activity of CELF1 and the integrity of the CUG-rich intercistronic region of C/EBPbeta mRNA.
Another potential mechanism of CELF1-mediated tumor promotion comes from our lab’s RIP-Chip experiments investigating CELF1 targets in normal and malignant cells. In primary human T cells, we observed that CELF1 bound to a large number of transcripts involved in cell cycle and apoptosis regulation pathways, and that upon activation and proliferation of these cells, CELF1 bound to a drastically reduced mRNA population. This result suggests that CELF1 inhibition is correlated with a cellular state of proliferation and altered apoptotic response. We also identified hundreds of CELF1 target transcripts in human HeLa cells (a carcinoma cell line), and many of these transcripts were different from those in normal T cells, suggesting again that altered CELF1 RNA-binding specificity may correlate with malignancy.
The CELF1-HDAC1-C/EBPbeta pathway is activated in young rat liver cells and in human tumor liver samples, suggesting that CELF1-HDAC1-C/EBPbeta complexes are involved in the development of liver tumors. The inhibition of the ubiquitin-dependent proteasome system (UPS) via specific drugs (such as Bortezomib) is one type of approach used to combat cancer. Gareau et al. showed that CELF1 is required for p21 mRNA stabilization and localization in stress granules induced upon treatment with Bortezomib. The authors postulated that this may allow cancer cells to survive stress and escape apoptosis. This mechanism may explain why some tumors are refractory to Bortezomib treatment.
Thus, the dysregulation of CELF1 and GREs appears to contribute to a malignant phenotype, perhaps by abrogating the ability of CELF1 to mediate the rapid and timely degradation of GRE-containing growth-regulatory transcripts and by promoting translation of some cell cycle regulators and oncogenes.
In summary, we have learned a wealth of information about CELF1-mRNA complexes and their importance in development, regeneration, aging and disease. CELF1 binds preferentially to GRE-containing transcripts, and affects expression of transcripts encoding other transcription factors and RNA-binding proteins that regulate cell growth, apoptosis, and development/differentiation. Thus, CELF1 may be functioning as a posttranscriptional “regulator of regulators”, whereby CELF1 influences the expression of a network of target transcripts encoding RNA/DNA binding proteins. This, in turn, regulates individual subnetworks of transcripts necessary for development or environmental responses, such as immune activation, requiring transition from a quiescent state to a state of cellular activation and proliferation.
Understanding gene regulatory networks and the integration of transcriptional and posttranscriptional events is the next important task in translational medicine. This will require innovations in computational methods and experimental techniques, as well as new animal models. It is also important to further investigate
This work was supported by NIH grants AI057484 and AI072068 to P.R.B. D.B. was supported by MSTP grant T32 GM008244 from the NIH. I.A.V-S. was funded through a fellowship from the Lymphoma Research Foundation.