Expression levels in B versus plasma cells, see text for references
The regulation of immunoglobulin heavy chain (Igh) alternative RNA processing has served as a model system for revealing the competition between cis- and trans-acting RNA factors influencing splicing and polyadenylation, reviewed in (Peterson 2007) and (Borghesi and Milcarek 2006). This review will explore recent studies on the role of transcription elongation factors in the super elongation complex (SEC), including ELL2 (eleven-nineteen lysine rich leukemia factor), in driving poly(A) site choice. Elongation factors impel high levels of Igh mRNA production and alternative processing at the promoter proximal, secretory-specific poly(A) site (sec) in plasma cells by enhancing RNA polymerase II modifications and downstream events. The sec poly(A) site, essentially hidden in B cells, is found by SEC factors in plasma cells.
Most mRNAs have a poly(A) tail and it has been estimated that up to 20% of the human genes may be arranged with a competition of splicing and polyadenylation sites as seen in the Igh (Tian, Pan et al. 2007; Rigo and Martinson 2008). Numerous genes contain multiple poly(A) sites (Tian, Hu et al. 2005); many subject to regulation (Edwalds-Gilbert, Veraldi et al. 1997) and (Lutz and Moreira 2011). Advancing developmental stage is generally correlated with use of promoter distal poly(A) sites (Ji, Lee et al. 2009; Ji and Tian 2009). Yet the opposite situation applies as B-cells terminally differentiate into plasma cells; elongation factors may hold the key to understanding this apparent paradox.
Transcriptional pausing and subsequent elongation are emerging as important check points for gene regulation. Phosphorylation of the carboxyl-terminal domain of RNAP-II by initiation and elongation factors directs the correct associations to ultimately drive mature mRNA output. The SEC, composed of nine proteins including pTEFb and ELL (see key), kick starts elongation by facilitating the phosphorylation of both the negative acting factors that have stalled the polymerase and polymerase itself. The mPAF is recruited by the action of the pTEFb through phosphorylation of the serine-2 of the RNA polymerase carboxyl-terminal domain heptad repeat; mPAF brings along with it the polyadenylation factors. The factors that polyadenylate mRNA are more strongly associated with the polymerase transcription complex on highly active promoters and/ or through up-regulation of the polyadenylation and elongation factors, especially CstF (cleavage stimulatory factor), pTEFb, and ELL2. Polymerases deficient in the appropriate factors may transit the gene, but if these pauci-polymerases proceed, they lack sufficient concentrations of factors to efficiently process the nascent RNA. Information from the literature on the linkages between elongation and alternative processing as well as TAR:TAT mediated RNA output and pTEFb interactions in HIV infected T-cells inform the search for understanding the mechanisms operative in the Igh locus.
Frequently used abbreviations: CPSF, Cleavage Polyadenylation Specificity Factor with subunits of 160, 100, 73 & 30 kDa; CstF, Cleavage stimulatory factor. Subunits of 77, 64 and 50 kDa; CTD, the carboxyl-terminal domain of eukaryotic RNA polymerase II; ELL2, eleven-nineteen lysine rich leukemia factor, RNA polymerase elongation factor 2, similar but not identical to ELL1; Igh, immunoglobulin heavy chain; mPAF, Human/ mammalian analog of yeast complex, polymerase II associated factors that coordinate setting of histone marks associated with active transcription; Mediator, Large, variable complex of proteins associated with DNA-bound transcription factors aiding RNAP-II binding to transcription start site. Composition may depend on promoter; pTEFb, positive transcription elongation factor b, composed of cyclin T and cdk9; RNAP-II, RNA Polymerase II of eukaryotic cells, composed of multiple subunits; SEC, super elongation complex for transcription elongation; Sec:mb, Ratio of secretory-specific polyadenylated to M1 spliced Igh mRNA; >20:1 in plasma cells; SR proteins, Serine and Arginine rich proteins that tend to enhance splicing (eg. SF2/ASF) of the pre-mRNA exons to which they bind. In contrast, SRp20 seems to suppress exon inclusion; U1A, Protein A (35 kDa) found with the U1 small nuclear RNP (U1 snRNP).
2. Igh and plasma cell gene regulation, an RNA processing problem
The mature B-cell has an exquisitely specific antigen receptor, the membrane form of the Immunoglobulin of the mu or delta type, paired with Ig light chain, expressed on its surface. The V region of the Ig is the result of a series of gene rearrangements generating specificity in the development of that B cell. This B cell can be activated by antigen association directly or be assisted in its differentiation by T-cells to start to secrete the Ig in the terminally differentiated B cell, i.e. the plasma cell. There are seven immunoglobulin heavy chain genes (Igh) in mice. The mouse heavy-chain genes (isotypes) are named mu, delta, gamma 1, gamma 2a, gamma 2b and gamma 3, and alpha; they are clustered together in an array on the chromosome in that order. The name of the heavy chain designates the name of the H2:L2 molecule; for example µ2:L2, i.e. mu heavy chains plus light chains, are designated IgM, while δ2:L2 are called IgD. The heavy chains encode secreted proteins of approximately 55,000 Da, with mu producing the largest H chain. The light chains (kappa or lambda) produce approximately 25,000 Da proteins and the genes undergo no alternative RNA processing. Each Igh gene has the capability to produce both membrane-specific and secreted forms of the protein, as encoded in the alternatively processed heavy chain mRNAs. Plasma cells express only one of the Igh isotypes as a secreted protein.
Blimp-1 (B-lymphocyte inducer of maturation), a transcription repressor, turns off a number of early B-cell transcription factors like pax-5 and bcl-6 while it indirectly activates production of the secreted form of the Igh mRNA and a family of gene products leading to plasma cell terminal differentiation. Among the blimp-inducible genes are ELL2 and EAF2 (Kuo and Calame 2004; Shaffer, Shapiro-Shelef et al. 2004); these are two transcription elongation factors previously isolated in transcription studies. Activation of the blimp-1 gene occurs both in T-dependent and T-independent pathways for B-cell development, as recently reviewed (Martins and Calame 2008). Blimp-1 is low or absent in memory B-cells; meanwhile IRF-4 levels are key to the activated/ memory/ plasma cell transition; the role for each gene, as well as Xbp-1, is still under active investigation, see review (Martins and Calame 2008); one synthesis of all the information is presented in (Lu 2008). It is also notable that ELL2 expression is influenced directly by expression of IRF4, a transcription factor that modulates plasma cell differentiation; IRF4 was found bound to the ELL2 promoter by chromatin immunoprecipitation studies (Shaffer, Emre et al. 2008).
The Igh transcripts of any isotype are subject to alternative RNA processing, summarized in (Borghesi and Milcarek 2006) and (Peterson 2007) and diagrammed in Figure 2. The various Igh genes have different numbers of CH regions so the last secretory exon is labeled CH4 in the mu Igh or CH3 in the gamma genes. When the secretory specific poly(A) site is used, the pre-mRNA is cleaved so that splicing between the site embedded in the last secretory-specific exon and M1 does not occur. This is the preferred but not exclusive mode in plasma cells. The spacing between the weak M1 splicing and sec-specific poly(A) sites in the various Igh genes is highly conserved at ~300 nts. When the splice in CH3 (γ) or CH4 (µ) is completed to M1, the secretory specific poly(A) site is removed from the pre-mRNA and unavailable and the mb poly(A) site is used. This is the predominant pathway in mature and memory B-cells. The sec poly(A) and splice to M1 are therefore mutually exclusive in any one transcript, while within the population of RNAs the sec:mb ratio is ~1:1 in B-cells and in plasma cells sec:mb is >20:1. The tendency to use the first poly(A) site in plasma cell development for the Igh gene reverses the trend seen in many developmentally regulated genes where it is the downstream site not the upstream site favored as development proceeds, perhaps caused by a weakening of mRNA polyadenylation activity (Ji, Lee et al. 2009). ELL2 may be induced to reverse this trend in plasma cells and thus alleviating a transcriptional pause may lead to alternative mRNA processing. We will explore the regulation of the alternative processing of the Igh RNA in the course of this review.
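To make the sec:mb ratios quoted above concrete, the following minimal sketch converts a ratio into the fraction of Igh mRNA that carries the secretory 3’ end. The function name `secretory_fraction` is hypothetical, and the calculation assumes that the sec and mb forms are the only two species counted.

```python
def secretory_fraction(sec_to_mb_ratio: float) -> float:
    """Fraction of Igh mRNA in the secretory form for a given sec:mb ratio.

    Assumes sec and mb are the only two mRNA forms being counted.
    """
    if sec_to_mb_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return sec_to_mb_ratio / (1.0 + sec_to_mb_ratio)


# Ratios quoted in the text: ~1:1 in B-cells, >20:1 in plasma cells.
b_cell = secretory_fraction(1.0)
plasma_cell_lower_bound = secretory_fraction(20.0)

print(f"B-cell: {b_cell:.0%} secretory")
print(f"Plasma cell: at least {plasma_cell_lower_bound:.0%} secretory")
```

Under these assumptions a 1:1 ratio corresponds to half of the transcripts being secretory, while a 20:1 ratio corresponds to roughly 95%, so the shift is from an even split to near-exclusive use of the sec poly(A) site.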
3. Differential expression of factors in a number of pathways seen between B cells and plasma cells
The entire ~11 kb gamma 2a or 2b Igh gene is transcribed ~2kb past the membrane poly(A) site regardless of which poly(A) site is used; this occurs in both B-cells and plasma cells (Flaspohler and Milcarek. 1990; Flaspohler and Milcarek 1992; Flaspohler, Boczkowski et al. 1995). The plasma cell phenotype dominates over that seen in the mature/memory B-cell when somatic cell hybrids were created and influences the nuclear versus cytoplasmic accumulation of many RNAs; Igh chain is the most profoundly affected. Igh protein levels are directly proportional to Igh mRNA levels (Milcarek, Hartman et al. 1996).
Weak poly(A) sites are used more efficiently in plasma cells than in B/ memory cells even in constructs in which there is no splicing between the sites. Polyadenylation therefore tips the balance towards secretory poly(A) site use in plasma cells (Milcarek and Hall 1985; Kobrin, Milcarek et al. 1986; Lassman, Matis et al. 1992; Lassman and Milcarek 1992; Matis, Martincic et al. 1996). Differences between plasma and memory B-cells are found in the polyadenylation machinery. These changes occur in the binding efficiency of basal factors Cleavage Stimulatory Factor CstF-64 and Cleavage Polyadenylation Specificity Factor CPSF-100 (Edwalds-Gilbert and Milcarek 1995; Edwalds-Gilbert and Milcarek 1995). LPS induction of splenic B-cells increases CstF-64 and introduction of CstF-64 into chicken DT40 B-cells results in Ig secretion (Takagaki, Seipelt et al. 1996), see Table 1.
Both the Igh 5’ splice site in the last sec exon and the sec poly(A) sites are weak and in competition (Peterson and Perry 1989); overall splicing is decreased in plasma cells (Bruce, Dingle et al. 2003). A shift in the SR-proteins is seen from the activating type (ASF/SF2) in B-cells to the repression type (SRp20) in plasma cells. The factors hnRNP F and U1A, both of which block the secretory-specific poly(A) site from functioning, are reduced in plasma cells (Veraldi, Arhin et al. 2001; Milcarek, Martincic et al. 2003; Alkan, Martincic et al. 2006) and (Ma, Gunderson et al. 2006) which would allow the secretory site to function better. The major effectors in the alternative processing in B versus plasma cells are summarized in Table 1.
The OCA-B transcription factor is up-regulated in plasma cells (Qin, Reichlin et al. 1998); it binds to oct2 at the Igh promoter. The CCNC complex of cyclin C and cdk8 along with PC4/ sub1 are induced by IRF4 in multiple myeloma, a tumor of the plasma cells (Shaffer, Emre et al. 2008), and could act on the Igh gene to regulate transcription.
We re-examined the micro-array data on gene expression in B versus plasma cells that we published previously (Martincic, Alkan et al. 2009) to concentrate on transcription elongation factors. The data are shown in Table 1. The robust up-regulation of blimp-1 and IRF4 is presented as a control to show that the plasma cells are at the fully differentiated stage, as expected. The ELL2 and PC4 mRNAs are the next most highly induced in plasma cells, increased approximately 6-fold over B cells. Most of the other factors like cdk8, cyclin C and eaf2 are moderately induced in our test plasma cells (<2-fold). But it is of interest that supt5h (a DSIF subunit) and the pTEFb subunits cdk9 and cyclin T are also induced ~2-fold, as is the relA subunit of NF-kB, a transcription factor that binds to the Igh enhancer region.
| Factors, function | 5’ splice favored (B and memory cells) | Sec poly(A) site favored (Plasma cells/ myeloma) |
| --- | --- | --- |
| Igh mRNA form | Mb > sec | Sec >> mb |
| Igh mRNA abundance | 1 | 20-100 greater |
| OCA-B, Igh promoter binding | Low | Higher |
| Blimp-1, IRF4, general transcription factors | Low | 6-300 X |
| relA, NF-kB transcription factor | Low | ~2X |
| Cdk8, cyclin C, PC4, CTD phosphorylation | Low | ~4X in myeloma |
| RNA polymerase loading on Igh | Low | High |
| Ser-2, ser-5 P on CTD RNAP-II Igh promoter | Low | ~8X |
| LPS induced CstF-64, polyadenylation | - | ~6X |
| ELL2, transcription elongation | Low | ~6X |
| pTEFb, Supt5h, elongation | Low | ~2X |
| hnRNP F, polyadenylation inhibitor | ~4X | Low |
| U1A, polyadenylation inhibitor | ~4X | Low |
| Pausing at sec poly(A) site | yes | same |
Taking all this information into account, it is clear that the plasma cell program for expression of the secreted form of the Igh mRNA is indeed complex. Transcription and elongation are a unifying theme for many of the changes seen. There is up-regulation of a number of transcription factors including the NF-kB subunit relA. There are changes to the phosphorylation of the CTD of RNAP-II and the enzymes that do this. Transcription elongation factors like ELL2 are up-regulated. Each of these aspects of gene expression will be discussed in turn, below.
3.1. Do factors bound at the Igh promoter or enhancer influence alternative Igh RNA processing?
The original studies of the Igh gene showed only a small enhancement of transcription activity in plasma cells over that seen in B-cells (Kelly and Perry 1986). Those studies used the incorporation of radioactive nucleotides and filter hybridization. Our studies using chromatin immunoprecipitation showed only a 2-fold change in the amount of RNA polymerase II on the Igh TATA region (see Figure 2). Meanwhile, the modifications of polymerases differ between the cell types and may contribute significantly to processing changes; this will be discussed in a subsequent section. But alternative processing of the Igh gene appears to obey the same rules whether the gene is intact (Kobrin, Milcarek et al. 1986) or the region from the CH1 exon to the 3’ end is placed as a cassette linked to an SV40 promoter (Peterson and Perry 1986). Therefore most studies have ignored the contribution of the Igh promoter to regulation. However, recent studies linking elongation and pausing to alternative processing of RNA implicate the promoter; the changes we saw in polymerase modifications also suggest a re-examination of factors that might act at the Igh promoter.
Neither blimp-1 nor IRF-4 has any mapped interactions with the Igh promoter. But TAF105 is a lymphocyte specific “general transcription factor” found in a small fraction of TFIID complexes (Dikstein, Zhou et al. 1996; Wolstein, Silkov et al. 2000). TAF105 is dispensable for B-cell differentiation (Freiman, Albright et al. 2002); its role in immunoglobulin heavy chain (Igh) expression is unknown. Meanwhile, OCA-B (OBF-1/Bob1) is a transcription factor that binds indirectly to the octamer sequence in the Igh promoter through oct2, interacts with TAF105 (Wolstein, Silkov et al. 2000) and is up-regulated in plasma cells (Qin, Reichlin et al. 1998). OCA-B deficient mice show strain-specific, partial blocks at multiple stages of B-cell maturation and a complete disruption of germinal center formation in all strains. IgM secretion (an early event) may be normal while IgG+ cell expansion (a later event) is disrupted but not isotype switching (Teitell 2003). Gene array studies show that OCA-B may act on genes for cell expansion, like cyclin D3 (Kim, Siegel et al. 2003), not Igh secretion per se, although its role in alternative processing has not been extensively studied. Oct2 was originally believed to work on Igh expression but studies with knock-out mice reveal that its major role is on regulation of the IL-5 receptor (Emslie, D'Costa et al. 2008). But these factors may direct RNAP-II modifications.
The subunits cdk8 and cyclin C (CCNC) are involved in modification of the RNA polymerase and might thus serve a regulatory role in alternative Igh mRNA processing. These two are up-regulated in multiple myeloma (Shaffer, Emre et al. 2008), a tumor of the plasma cell, along with PC4 (aka sub1 in yeast). Both cdk8 and cyclin C have been shown to be associated with the mediator complex. Mediator is a co-activator of transcription; the composition of this large complex may vary with the promoter in response to different DNA-bound activators, acting as a link between them and the transcription start-site (Sato, Tomomori-Sato et al. 2004). The order of addition of mediator vs RNAP-II may vary for different genes (Lewis and Reinberg 2003). Transcriptional activators like p53 and VP16 target different mediator subunits (Taatjes, Marr et al. 2004) and other general transcription factors like TFIIB and TFIIA; meanwhile herpes virus 1 ICP4 targets TFIID, another general transcription factor bound at the promoter (Grondin and DeLuca 2000). CPSF, a subunit of the polyadenylation complex, was found associated with TFIID at the promoter and subsequently transferred to the elongating polymerase (Dantonel, Murthy et al. 1997). Transcriptional activators like GAL4-VP16 enhance the polyadenylation of mRNA precursors through interaction with elongation factors (Nagaike, Logan et al. 2011). It will be interesting to determine if CCNC functions in alternative Igh RNA processing. The role of CTD phosphorylation is discussed below.
The nuclear factor kappa B (NF-kB) was first discovered in B-cells but subsequently found in most other cells. It has been shown to regulate a number of important processes including the cell cycle (Hinz, Krappmann et al. 1999). The relA and p50 subunits are released from an inhibitor and allowed to enter the nucleus on activation of NF-kB; expression is constitutive in mature B-cells (Fields, Seufzer et al. 2000). There are binding sites for NF-kB throughout the Igh gene (Horowitz, Zalazowski et al. 1999). We see an increase in relA in plasma cells. What role NF-kB plays in the differential expression of the Igh mRNAs is not known but may provide insights into the role of activation of promoters and increased polyadenylation.
The nuclear transcription factor C/EBP, also called NF-IL6, regulates a variety of genes involved in diverse functions such as acute phase response (Poli 1998), immune function (Screpanti, Romani et al. 1995; Tanaka, Akira et al. 1995; Poli 1998), inflammation (Lekstrom-Himes and Xanthopoulos 1999), and hematopoiesis (Calkhoven, Muller et al. 2000). Binding sites for factor C/EBP are found in the Igh enhancer and a variety of genes important in lymphocytes and multiple myeloma (Pal, Janz et al. 2009). Deletion of the C/EBP gene in mice results in impaired generation of B lymphocytes (Chen, Liu et al. 1997). Therefore it is a candidate for up-regulating Igh transcription leading ultimately to Ig secretion.
The composition of transcription complexes assembled on different core promoters has been shown to affect splice site selection during pre-mRNA splicing (Cramer, Caceres et al. 1999; Zhao, Hyman et al. 1999). The promoter sequence motif in the Simian Virus 40 (SV40) early core promoter influences alternative splicing presumably by influencing both the composition of the transcription complex that assembles and the processivity of transcription elongation (Gendra, Colgan et al. 2007). Thus the common theme between the Igh and SV40 promoters, both of which were used to drive first poly(A) site use in the Igh locus, is that they are strong promoters in plasma cells. The Igh gene may get that way by the up-regulation of trans acting factors like cdk8 and cyclin C or something else while the SV40 promoter is inherently strong and something may be lacking or inhibitory in B cells which does not allow it to function optimally. There is still much to learn about this.
The presence or absence of the TATA element in the core promoter can change the way the NF-kB gene itself engages elongation and pausing factors (Amir-Zilberstein, Ainbinder et al. 2007). Thus there is a tight link between the promoter, initiation, elongation and polyadenylation in NF-kB. Interestingly, different Igh V regions vary with respect to the presence or absence of the TATA box (Johnston, Wood et al. 2006) so this linkage of TATA to elongation seen in NF-kB may not be applicable to the Igh family.
3.2. RNA polymerase II phosphorylation is altered on the Igh gene in plasma cells
Productive metazoan gene expression involves recruitment of RNAP-II to the promoter region, subsequent initiation of transcription, followed by, in many cases, a pause which is then released to allow elongation and RNA processing. The resulting mature mRNA is spliced and polyadenylated as a part of the transcription elongation “machine”. At least 50 proteins are involved in DNA recognition and assembly of a transcription complex on a promoter. RNAP-II, general transcription factors, gene-specific DNA binding proteins, the mediator complexes and nucleosome-modifying factors are involved. Modifications occur in the chromatin, the RNA polymerase II itself, elongation and processing factors; this is a concerted, multifaceted transformation, reviewed in (Selth, Sigurdsson et al. 2010) and (Hargreaves, Horng et al. 2009). Many excellent reviews have been published that deal with the transition from initiation to elongation, for example (Nechaev and Adelman 2011) and the linkage of elongation to RNA processing, DNA repair, Ig gene hypermutation, nuclear export and sister chromatid cohesion (Akhtar, Heidemann et al. 2009; Perales and Bentley 2009).
The RNA polymerase large subunit contains a carboxyl-terminal domain (CTD) which can be phosphorylated in multiple positions of the 52 repeats of the heptad consensus: Tyrosine-Serine-Proline-Threonine-Serine-Proline-Serine, reviewed in (Muñoz, de la Mata et al. 2010). Phosphorylations of ser-2, ser-5 and ser-7 of the CTD are the major modifications but cis-trans isomerization (Xu and Manley 2007) as well as serine and threonine glycosylation can occur. The multilayered phosphorylation of the CTD is brought about by a series of enzymatic activities during initiation and elongation. Near the promoter, TFIIH (Kin28/cdk7) phosphorylates ser-5 and ser-7 of the heptad consensus (Akhtar, Heidemann et al. 2009; Glover-Cutter, Larochelle et al. 2009). Associated with the mediator complex are cyclin C and cdk8, which also phosphorylate primarily ser-5 although some activity on ser-2 has been noted. In a plasma cell tumor, cdk8, cyclin C and PC4, the mammalian homolog of the yeast sub1, are up-regulated (Shaffer, Emre et al. 2008); this is highly suggestive of a role for them in Igh mRNA production. Yeast sub1 has been shown to interact with all the CTD kinases and it may have multiple actions throughout the transcription cycle (Garcia, Rosonina et al. 2010). Surprisingly, our experiments did not show a role for PC4/ sub1 in directing alternative Igh mRNA expression (Martincic, Alkan et al. 2009) and below.
The primary phosphorylation of ser-2 occurs through the action of positive transcription elongation factor b, aka pTEFb (composed of Cyclin T1 or 2 & cdk9). The pivotal role of pTEFb in elongation has been described in numerous studies, reviewed in (Pirngruber, Schhebet et al. 2009), and discussed below. Whole genome studies of CTD phosphorylation in S. cerevisiae reveal a complex, gene-specific pattern of the three ser-phosphorylations, with the ser-7 marks being dynamic, i.e. placed anew most likely by bur1, the yeast cyclin dependent kinase (Tietjen, Zhang et al. 2010). Thus the myriad combinations of these three modifications on 52 repeats, alone or together, allow for tremendous multiplicity in the association of factors with the CTD.
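The scale of this multiplicity can be illustrated with a short back-of-the-envelope count. The sketch below is a deliberate simplification: it assumes every one of the 52 repeats matches the consensus heptad, treats each serine as simply phosphorylated or not, and ignores isomerization, glycosylation and any modification of tyrosine or threonine. The variable names are illustrative only.

```python
# Consensus heptad of the RNAP-II CTD: Tyr-Ser-Pro-Thr-Ser-Pro-Ser
HEPTAD = "YSPTSPS"
N_REPEATS = 52  # number of repeats quoted in the text

# 1-based positions of serine within the heptad (ser-2, ser-5, ser-7)
SER_POSITIONS = [i + 1 for i, aa in enumerate(HEPTAD) if aa == "S"]

# Assumption: each serine is either phosphorylated or not (two states).
states_per_repeat = 2 ** len(SER_POSITIONS)

# Assumption: repeats are modified independently of one another.
total_patterns = states_per_repeat ** N_REPEATS

print("Serine positions:", SER_POSITIONS)
print("Phosphorylation states per repeat:", states_per_repeat)
print(f"Possible patterns across the CTD: {total_patterns:.2e}")
```

Even under these restrictive assumptions there are eight states per repeat and on the order of 10^46 to 10^47 patterns across the whole domain, far more than any cell could sample, which is why the pattern at a given gene position is more informative than the total count.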
These phosphorylations of the CTD allow for the associations of histone modifying enzymes primarily to histone H3; activating modifications include acetylation at K9 and K14 and methylations at K4, K36 and K79 to link the CTD to processing factors and possibly help unwind the DNA from the histone core.
|RNAP-II CTD phosphorylation||Ser-5, ser-7||Ser-5 > ser-2||Ser-2 >> ser-5||?|
yeast set1, set7/9
|Activating Histone H3 methylations||Lysine 4||Lysine 36||Lysine 79|
With this in mind, we investigated the patterns of polymerase loading and serine phosphorylation of CTD on the Igh gene in two cell lines representing either B-cells or plasma cells, which are noteworthy in that they carry the identical IgG2a heavy chain gene (Shell, Martincic et al. 2007). Using chromatin immunoprecipitation and real time PCR across the Igh locus (probes illustrated in Figure 2) we found that there is a large increase in both ser-5 and ser-2 phosphorylation of CTD near the 5’ end in plasma cells concomitant with high Ig heavy chain secretory-specific mRNA production. Factors for polyadenylation (CPSF-160, CstF-50 and -64), transcription elongation (ELL2), and co-transcription activation (PC4) can be found in much greater abundance near the 5’ start site in the plasma cells than in B-cells; see Figure 3 for a summary of those data. We concluded that increased phosphorylation of RNA polymerase II at the start of transcription and the increased association of polyadenylation and elongation/ co-transcription factors are important in influencing alternative mRNA processing of the Ig heavy chain gene. Knockdown of ELL2 with siRNA reduced the binding of ELL2 and CstF-64 to the Igh TATA region (Figure 3), implying a sequential relationship among those factors.
The drug 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB) can be used to inhibit the action of pTEFb on its targets (Yamaguchi, Wada et al. 1998). We showed that treatment of plasma cells with DRB inhibits ser-2 phosphorylation and causes the unmodified RNAP-II on the Ig heavy chain in plasma cells to stall near the 5’-end of the Ig gene (Shell, Martincic et al. 2007) and Figure 3. DRB also causes the decreased association of ELL2, PC4 and the polyadenylation factor CstF-64, suggesting that their binding to RNAP-II requires the ser-5 and ser-2 modifications. The increased CTD phosphorylation, polyadenylation factor, and elongation factor association at the 5’ end in plasma cells vs B-cells are consistent with an RNAP-II on the Ig heavy chain gene in plasma cells that is primed for recognition of the proximal (secretory-specific) poly(A) site. Such polymerases would be expected to produce mRNA, leading to high abundance. The nascent RNAs could be efficiently processed, most likely at the promoter proximal poly(A) site, thereby producing the secreted form of the Igh mRNA and the protein.
In contrast to plasma cells, in the B-cells the polymerase on the Ig heavy chain gene is much less heavily modified with either type of ser-phosphorylation, and carries less of the polyadenylation factors, ELL2 or PC4. These polymerases in B-cells seem more likely to allow the weak donor site for the alternative CH3 to M1 splice to be set up and are not as competent for polyadenylation at the weak sec poly(A) site. Of course the presence of inhibitory factors also plays a role (see below). It appears that polymerases in B-cells lacking the ser-2 and ser-5 modifications never acquire them in bulk even downstream (Shell, Martincic et al. 2007). Thus they may be dubbed “pauci-polymerases” in so far as they lack the modifications and the factors necessary to efficiently polyadenylate the RNA. This also results in decreased mRNA yield per polymerase pass. This is consistent with recent observations that a strong promoter directs strong association of polyadenylation factors with the polymerase (Nagaike, Logan et al. 2011).
The co-transcriptional processing of nascent RNA can be an inefficient activity even when RNAP-II is transiting the gene. This is evident in the c-H-ras gene, where a mutation making a splice stronger significantly up-regulates mature mRNA production and leads to oncogenesis without changing transcription rate (Cohen and Levinson 1988). In the prothrombin gene, increased efficiency of mRNA 3’ end formation, and thus protein production, is brought about by mutation near the poly(A) site with no change in transcription rate. The mutation at the poly(A) site causes increased risk for blood clots in people carrying the mutation (Gehring, Frede et al. 2001). So weak cis acting signals allow the polymerase to avoid co-transcriptional processing; this along with pauci-polymerases can also result in low production of mature mRNA.
Changes in the level of rate-limiting, trans acting factors, like CstF-64 following lipopolysaccharide stimulation in B-cells (Takagaki, Seipelt et al. 1996) or macrophages (Shell, Hesse et al. 2005) or in the cell cycle (Martincic, Campbell et al. 1998), can also increase mRNA polyadenylation (Getz, Elder et al. 1976). Changes in the level of ELL2, a transcription elongation factor, increase use of the first poly(A) site in the Igh gene (Martincic, Alkan et al. 2009), an important step for plasma cell development. Therefore, the throughput of initiated transcripts to polyadenylated, mature mRNA is limited by a.) cis acting elements in the nascent RNA, b.) the availability of trans acting polyadenylation factors and c.) the elongation factors and their requisite associations with the polymerase at pause sites.
3.3. Do modifications of the histones regulate alternative processing of Igh RNA?
The appearance of mono-, di-, and tri-methylated forms of histone H3, modified at Lysine 4, Lysine 36 or Lysine 79 (aka H3K4me2/3, H3K36me3 and H3K79me1,2,3) are signals of gene activation (Steger, Lefterova et al. 2008). The K4 mark is associated with the mixed lineage leukemia gene MLL in mammalian cells and yeast SET1 and COMPASS at a number of gene loci (Wang, Lin et al. 2009). MLL interacts with Menin to form a histone methyltransferase for H3K4 (Yokoyama, Lin et al. 2004) which results in the expression of the developmentally important Hoxa7 and -9 genes, hallmarks of MLL induced leukemia (Ayton and Cleary 2003).
The ser-5 modification of the CTD of RNAP-II directs histone modifications especially at H3K4, see Table 2. The H3K36me3 mark has been associated with recruitment of splicing factors to exons (Spies, Nielsen et al. 2009) and it can modulate alternative splicing by recruiting Polypyrimidine Tract Binding (PTB) protein to sub-optimal exons (Luco, Pan et al. 2010). The yeast SET2 enzyme binds to a peptide of CTD phosphorylated at ser-2 and ser-5 with two heptapeptide repeats and three flanking NH2-terminal residues, whereas a single CTD repeat is insufficient for binding (Vojnic, Simon et al. 2006). The mammalian homolog of SET2 is NSD1, which has been shown to link H3K36 methylation and leukemogenesis (Wang, Cai et al. 2007). H3K79 methylations are brought about by Dot1L (Steger, Lefterova et al. 2008); the linkage of Dot1L action to CTD modifications has thus far not been made. Conversion of H3K79 monomethylation into di- and tri-methylation is correlated with the transition from low- to high-level gene transcription. The multi-subunit Dot1 complex (DotCom) includes MLL partners: ENL, AF9/MLLT3, AF17/MLLT6, and AF10/MLLT10 (Mohan, Herz et al. 2010). In another study ENL was shown to associate with Dot1L to methylate K79 (Mueller, Bach et al. 2007); ENL may thus link the super elongation complex (SEC) with H3K79 modifications, see Table 2.
We saw no changes in H3 K9 or K14 acetylation on the Igh by chromatin IP in previous studies, although unusually high acetylation was seen near the Igh enhancer region, consistent with its role in gene activation (Shell, Martincic et al. 2007). However, terminal B cell differentiation in vitro was induced by the inhibition of histone deacetylases (Lee, Bottaro et al. 2003) and knocking out histone deacetylase-2 in chicken B cells had an influence on both transcription and alternative processing (Takami, Kikuchi et al. 1999). Studies of other modifications of the histones in the Igh locus are clearly warranted and on-going to understand how these might control alternative processing.
3.4. Polyadenylation of Igh RNA is altered in plasma cells
Most mature eukaryotic mRNAs contain a homopolymer (20-250 nts) of adenosines, the 3’ poly(A) tail, which controls mRNA degradation, mRNA export, and translation efficiency in somatic cells (Sachs and Wahle 1993) and in oocyte maturation (Sheets, Wu et al. 1995). When poly(A) tails were first discovered, hybridization studies revealed some exceptions to the rule: there are unique poly(A)-minus mRNAs (Milcarek, Price et al. 1974). A recent look at mRNA expression by deep sequencing has revealed both poly(A)-plus and poly(A)-minus mRNAs, but the poly(A)-minus mRNAs are in the minority (Yang, Duff et al. 2011).
Despite these exceptions, polyadenylation is a common means for 3’ end formation; many genes have more than one poly(A) site (Lutz and Moreira 2011). The sequence requirements for poly(A) tail addition are an AAUAAA or a closely related poly(A) signal, followed by the site where cleavage and poly(A) addition will occur and, from 30-50 nts downstream, a GU- or U-rich downstream sequence, illustrated in Figure 4. Minimal protein factors required for the coupled pre-mRNA cleavage and polyadenylation include: cleavage stimulatory factor trimer (CstF 77, 64 & 50); poly(A) polymerase (PAP); cleavage factors (CF) Im and IIm; and the cleavage and polyadenylation specificity factor complex (CPSF 160, 100, 73 & 30 kDa). CPSF recognizes the AAUAAA signal and CstF recognizes the downstream element. Nuclear poly(A) binding protein II, involved in transport and cytoplasmic stability, is bound to the newly formed poly(A) tail. CPSF-73 is the presumptive endonuclease for cleavage of the pre-mRNA (Mandel, Kaneko et al. 2006). These factors interact to form a large complex in vitro (Moore, Skolnik-David et al. 1988; Stefano and Adams 1988; Veraldi, Edwalds-Gilbert et al. 2000) and in vivo (Shi, Di Giammartino et al. 2009). CPSF interacts with TFIID (transcription factor II D) and then with RNA polymerase II at the promoter, and remains associated with RNAP-II in HeLa cells (Dantonel, Murthy et al. 1997; Mc Cracken, Fong et al. 1997) and on the Igh gene, as we have shown (Shell, Martincic et al. 2007).
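The sequence requirements described above can be sketched as a simple scan of an RNA sequence. This is only an illustration, not a validated poly(A) site predictor: the function name, the inclusion of AUUAAA as the one "closely related" signal, and the window size and GU-richness threshold are all assumptions chosen for the example.

```python
# Hypothetical sketch: flag candidate poly(A) sites as a hexamer signal
# followed, within ~50 nts, by a GU/U-rich stretch. All thresholds are
# illustrative assumptions, not values taken from the text.
SIGNALS = ("AAUAAA", "AUUAAA")  # canonical signal plus one common variant (assumed set)


def find_candidate_polya_sites(rna, max_downstream=50, window=8, min_gu=7):
    """Return (signal_start, signal, dse_start) for each candidate site."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(rna) - 5):
        hexamer = rna[i:i + 6]
        if hexamer not in SIGNALS:
            continue
        region_start = i + 6
        region_end = min(len(rna), region_start + max_downstream)
        # Slide a window through the downstream region looking for G/U richness.
        for j in range(region_start, region_end - window + 1):
            gu_count = sum(base in "GU" for base in rna[j:j + window])
            if gu_count >= min_gu:
                hits.append((i, hexamer, j))
                break
    return hits


if __name__ == "__main__":
    toy = "CCC" + "AAUAAA" + "C" * 20 + "GUGUUGUU" + "CCC"
    print(find_candidate_polya_sites(toy))
```

A hexamer with no GU/U-rich stretch in the downstream window is not reported, which mirrors the idea that the signal alone is not sufficient for efficient 3’ end formation.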
The CstF trimer is required for the cleavage reaction (MacDonald, Wilusz et al. 1994), interacts with CPSF via its 77 kDa subunit, and binds the GU-rich region downstream of the poly(A) site via its 64 kDa subunit (Wilusz and Shenk 1990; Takagaki, MacDonald et al. 1992; Takagaki and Manley 1997). The 50 kDa subunit interacts with BRCA1/BARD1 to modulate polyadenylation during DNA repair (Kleiman and Manley 2001), and with RNAP-II through CTD (Mc Cracken, Fong et al. 1997). The 77 kDa subunit of CstF contains the putative nuclear localization signal for transporting the 50:77:64 trimer into the nucleus (Takagaki and Manley 1994).
3.5. Alterations in trans-acting factors for polyadenylation in B and plasma cells
Nascent RNA cleavage in polyadenylation can be blocked by U1A (Phillips, Pachikara et al. 2004; Ma, Gunderson et al. 2006) or hnRNP F (Veraldi, Arhin et al. 2001); the levels of both of these are high in B-cells and lower in plasma cells (summarized in Table 1; Ma, Gunderson et al. 2006), consistent with a loss of their inhibitory function on the poly(A) sites in plasma cells. While hnRNP F might be expected to block any poly(A) site based on its sequence preference for downstream regions (Alkan, Martincic et al. 2006), we found by micro-array analysis of hnRNP F transfected cells that only some genes were affected; most notable among these were secretory Igh and ELL2, a transcription elongation factor (Martincic, Alkan et al. 2009). The finding that there are U1A binding sites up-stream of the Igh mu secretory poly(A) site indicates a selectivity consistent with the observed increased use of the site when U1A levels fall (Phillips and Virtanen 1997). The data clearly show that a change in the levels of these factors influences the polyadenylation of the Igh secretory site. The smaller number of polyadenylation factors associated with the B cell RNAP-II may not recognize the AAUAAA and downstream regions of the secretory poly(A) site because these two RNPs block access.
Recently, using a knock-down and whole genome array approach, the U1 snRNA itself, not the U1A protein, was shown to protect pre-mRNA from premature cleavage and polyadenylation (Kaida, Berg et al. 2010). This reveals a function for U1 RNA independent of splicing where its interaction with the splice site in the nascent RNA plays a seminal role. In another recent study a fraction of the U1 snRNA, without any of the normal RNP proteins, was found associated with TAF15 aka TAFII-68 (Jobert, Pinzon et al. 2009). This TAF interacts with a distinct population of TFIID so U1 snRNA may play a heretofore unrecognized role in transcription initiation. These unique aspects of U1 snRNA function have not been investigated in B or plasma cells.
The CstF-64 gene (CSTF2) maps to the X chromosome; expression of the protein fluctuates and is rate-limiting for CstF complex formation (Martincic, Campbell et al. 1998). Knocking out CSTF2 in chicken B-cells resulted in changes to viability, Igh transcription, and alternative splicing (Takagaki and Manley 1998). An alternative form of CstF-64 (CSTF2T) was found originally in testes, but it was later found to be expressed in other tissues, including B-cells and plasma cells. The CSTF2T gene maps to an autosome, allowing for its transcription in spermatogenesis when the X chromosome is suppressed (Dass, McMahon et al. 2001). CSTF2T protein is induced by LPS in splenocytes just like CSTF2. Deletion of the CSTF2T gene gives rise to spermatogenetic defects but no immunological defects were associated with its loss (Hockert, Martincic et al. 2011). Thus CSTF2T does not play an indispensable part in Igh expression.
3.6. Splicing of Igh RNA is altered in plasma cells
RNA splicing occurs through a concerted series of events facilitated by the U family of small nuclear ribonucleoprotein particles (snRNPs) 1, 2, 4, 5, 6 and their associated proteins. Serine arginine (SR) proteins, e.g. SF2/ASF and SRp20, often act as enhancers for splicing while in general heterogeneous nuclear ribonucleoproteins (hnRNPs) act as spoilers. The balance between these has been shown to control many genes, reviewed in (House and Lynch 2008). Changes in the levels of the splicing factors in various B cell types (Bruce, Dingle et al. 2003) and during development have been shown to alter splicing patterns (Expert-Bezancon, Sureau et al. 2004). For example, SRp20 plays an important role in alternative splicing of the ED1 exon (de la Mata and Kornblihtt 2006). The finding that the levels of SRp20 increase in plasma cells is intriguing (Table 1). How SRp20 influences the balance between splicing and polyadenylation is not known. It is tempting to speculate that it blocks use of the weak splice site in the secretory exon in plasma cells.
RNA processing events and transcription are tightly linked (Proudfoot, Furger et al. 2002). The recruitment of different factors to control splice site selection occurs on the carboxyl-terminal domain (CTD) of RNAP-II (Mc Cracken, Fong et al. 1997). Elongation rates were shown to modulate alternative splicing with high processivity of the polymerase correlating with exon skipping, reviewed in (Kornblihtt 2006). A key role for pTEFb was found for coupling transcription elongation with alternative splicing (Barboric, Lenasi et al. 2009). But chromatin modifications also seem to be important. For example, H3K36me3 marks are associated with recruitment of splicing factors to exons (Spies, Nielsen et al. 2009). The histone tail–binding protein MORF-related gene 15 (MRG15), a component of the retinoblastoma binding protein 2 (RBP2)/H3-K4 demethylase complex, recruits Polypyrimidine Tract Binding protein to sub-optimal exons (Luco, Pan et al. 2010). The mammalian ortholog of the SWI/SNF (SWItch/Sucrose NonFermentable) yeast nucleosome remodeling complex has been shown to play a role in CD44 alternative splicing (Batsche, Yaniv et al. 2006). In addition, H3K9 hyperacetylation outside the promoter stimulates alternative splicing (Schor, Rascovan et al. 2009).
Several models for coupling alternative splicing to elongation or chromatin have been considered recently. The modulation of elongation rates to splicing has been advocated (de la Mata, Lafaille et al. 2010). Meanwhile nucleosome positioning has been discussed as a regulator (Tilgner, Nikolaou et al. 2009). And chromatin as a scaffold for pre-mRNA splicing regulation has been discussed (Allemand, Batsché et al. 2008). While there is a link between transcription elongation and RNA splicing, the details of these connections are still emerging and chromatin modifications may play a large role as well. Our data clearly indicate a role for transcription elongation factor ELL2 in modulating alternative splicing (see below) but much more needs to be done to understand how this is accomplished.
4. RNAP-II elongation is altered in Igh gene regulated RNA processing
4.1. Elongation regulation is crucial
Photo-bleaching of fluorescent RNA polymerase II in living cells reveals that transcription elongation is much faster than expected but that the polymerases enter a paused state for unexpectedly long times (Darzacq, Shav-Tal et al. 2007). Genome wide studies indicated that a number of developmentally important genes in Drosophila embryos that were not yet being actively transcribed, but were scheduled to be expressed in subsequent stages, have a paused polymerase at the 5’ end (Zeitlinger, Stark et al. 2007).
The dysregulation of elongation and subsequent aberrant gene expression have been strongly associated with the genesis of several cancers. The von Hippel-Lindau tumor suppressor gene (VHL) predisposes individuals to a variety of tumors by inhibiting elongin SIII, a normal component of the elongation machinery (Duan, Pause et al. 1995). The tumor suppressor Cdc73, a component of the mPAF elongation complex, is inactivated by mutation in hereditary and sporadic parathyroid tumors and alters poly(A) site choice (Rozenblatt-Rosen, Nagaike et al. 2009).
In mixed lineage leukemia (MLL), which has provided an important window to the role of elongation and chromatin modification (Shilatifard 1998; Lin, Smith et al. 2010; Mohan, Herz et al. 2010), fusions occur with different elongation factors in different tumors to bring MLL protein in close association with the start of target genes, like the hox genes, and thereby inappropriately activate them (Meyer, Kowarz et al. 2009). The regulation of pausing and elongation is therefore emerging as an important step in gene regulation.
4.2. ELL genes in general transcription elongation
ELL1 (Eleven-nineteen Lysine-rich Leukemia gene 1) was discovered as a translocation partner with the MLL gene in individuals with acute leukemia (Thirman, Levitan et al. 1994). The closely related family member ELL2, isolated by homology with ELL1, can also stimulate RNAP-II elongation in vitro (Shilatifard, Lane et al. 1996), (Shilatifard, Duan et al. 1997), and (Miller, Williams et al. 2000). ELL3 is primarily expressed in testis and tumor cells; it also increases the catalytic rate of transcription elongation in vitro (Miller, Williams et al. 2000).
The mRNAs for ELL1 and 2 are widely expressed but vary from tissue to tissue, suggesting they may serve a regulatory role (Shilatifard, Duan et al. 1997). Human ELL1 protein is 621 amino-acids long while ELL2 is 640; see Figure 5 for ELL2. They are very similar from aa 1-345. This region is responsible for elongation enhancement. The segments of ELL2 from aa 168-180 and from 239 to 250 were shown to interact with mediator in a proteomic screen, which has not yet been verified by functional assays (Sato, Tomomori-Sato et al. 2004). ELL1 and 2 differ from aa ~352-443 and 473-516, amino acids that map primarily to the latter ~75% of exon 8. The 352-516 region differences may explain some of the unique associations of ELL1 versus ELL2. This region is rich in hydrophobic amino acids like proline and leucine but also contains hydrophilic charged residues like lysine and glutamate. No known protein motif has been identified in this domain. The mid-portion of ELL has been determined to be the region that is required for its association with heat shock puffs in Drosophila (Gerber, Shilatifard et al. 2005).
From aa 520 to the carboxyl-end ELL1 and 2 are very similar. The C-terminal domain is conserved in all the members of the ELL-family including Drosophila; it shares homology with occludin, an integral plasma membrane protein located in tight junctions. The functional significance of the homology is unknown but this C-terminal domain is required for viability in Drosophila (Eissenberg, Ma et al. 2002; Gerber, Shilatifard et al. 2005). The C-terminal domain of ELL is sufficient for immortalization of myeloid progenitors most likely through its interactions with p53, although the MLL:ELL fusion is more efficient than ELL alone (Wiederschain, Kawai et al. 2003).
In addition, ELL1 was found to bind to the promoter and to have direct transcriptional activity on the thrombospondin-1 gene in mammalian cells (Zhou, Feng et al. 2009). The DNA binding domain maps to aa 1-45. Studies using zebrafish confirmed that ELL regulates TSP-1 mRNA expression in vivo; the conserved C-domain found in MLL fusions was the region involved in this activation (Zhou, Feng et al. 2009). This is the first report of ELL acting without binding to RNAP-II.
4.3. EAF genes in transcription elongation
EAF1 and EAF2 (ELL associated factors 1 and 2) were discovered via two-hybrid and co-IP studies to be associated with ELL. The association of EAF1 was mapped to amino acids 28-117 and 508-621 of ELL1 while the EAF2 associations mapped to aa 23 to 90 of ELL1 (Luo, Lavau et al. 2001; Simone, Luo et al. 2003), see Figure 5. Both EAF1 and 2 are required for viability in Drosophila (Liu, Hu et al. 2009). EAF2/U19/FESTA is associated with another elongation factor, SII/TCEA/TFIIS. The eaf2-/- mice are viable but form tumors in the spleen and prostate later in life (Xiao, Zhang et al. 2008). Therefore, loss of EAF2 is not lethal, but it is sufficient to produce a back-up at the mature B-cell stage. Investigating the production of secreted Igh in the eaf2-/- mice would be an important problem to address.
4.4. pTEFb in transcription elongation
Cyclin T1 or T2 and cdk9, a cyclin dependent kinase, form the core of pTEFb, which has been implicated in many studies of gene regulation through alleviation of RNAP-II pausing (He, Pezda et al. 2006). The c-myc gene was among the earliest genes studied that displays a strong pause site, which can either be alleviated or cause transcription termination leading to aborted mRNA production (Roberts and Bentley 1992). Recent studies have shown that myc requires pTEFb, the positive transcription elongation factor, to relieve the pause and empower the polymerase to make mature mRNA, or in other words, increase throughput (Rahl, Lin et al. 2010). The heat shock genes were recognized as having a paused polymerase with a requirement for pTEFb to drive elongation (Lis, Mason et al. 2000). The factor pTEFb has been shown to be required for the expression or repression of several immunologically important genes including AIRE (Oven, Brdickova et al. 2007), p53 (Gomes, Bjerke et al. 2006), runx in double positive T cells (Jiang and Peterlin 2008) and primary response genes in macrophages following Toll-Like-Receptor signaling (Hargreaves, Horng et al. 2009). Thus an examination of the pTEFb factor is important for understanding pausing and subsequent elongation.
Inactive pTEFb is associated with the small nuclear 7SK snRNA and HEXIM 1 or 2 plus other proteins. HEXIM is induced by hexamethylene bis-acetamide and has no introns. Its levels vary with developmental stage in a number of studies. HEXIM molecules can associate as homo- or hetero-dimers through their carboxyl-terminal tails (Dulac, Michels et al. 2005). Multiple complexes of HEXIM dimers with their associated pTEFb bind to 7SK with a conformation change in 7SK structure (Krueger, Varzavand et al. 2010); cdk9 is thus inhibited in its enzymatic ability to phosphorylate substrates. The inactive form of pTEFb is released from HEXIM and binds other factors to become activated. The equilibrium between active and inactive forms shifts based on the amount of HEXIM expressed, linking the pTEFb equilibrium to the intracellular transcriptional demands, proliferation, and differentiated states of cells (He, Pezda et al. 2006). When pTEFb is released from the inactive state it can be brought to the DNA:RNA polymerase complex by Brd4 through recognition of acetylated chromatin structures (Vollmuth, Blankenfeldt et al. 2009) or to virus transcription by specialized mechanisms.
The DRB sensitive factor DSIF (composed of Spt4 and Spt5(h) aka p160) and a multi-subunit negative elongation factor (NELF) act together to arrest RNAP-II shortly after initiation. The recruitment of the kinase activity of cdk9 in pTEFb to the polymerase phosphorylates not only the ser-2 of the CTD of RNAP-II, linking to H3K36 modifications (see Table 2), but also the C-terminal region of the Spt5h subunit of DSIF. This releases NELF from the complex and converts DSIF into a positively acting elongation factor. DSIF then cooperates with Tat-SF1 and the mPAF complex to super-charge elongation. The on-going ser-2 phosphorylation of the CTD of RNAP-II is associated with histone H2B mono-ubiquitination of K120. The FACT complex is then recruited to help unwind nucleosomes (Barboric, Lenasi et al. 2009).
Mammalian polymerase associated factor complex (mPAF) functions in transcription after mediator recruits RNAP-II; mPAF helps recruit the histone methylases and possibly RNA processing factors to the RNAP-II based on CTD phosphorylation. PAF binds the polyadenylation factors at the promoter and transfers them to the elongating polymerase. The mPAF subunit parafibromin alters poly(A) site choice by association with CPSF/CstF (Rozenblatt-Rosen, Nagaike et al. 2009; Shi, Di Giammartino et al. 2009) and alterations in parafibromin cause tumors. Thus it is clear that pTEFb plays a pivotal role in elongation, in polyadenylation through mPAF recruitment, and in splicing through the H3K36 modifications.
4.5. The associations of pTEFb and ELL2 to form the super elongation complex
Two recent papers have shown that pTEFb can be found in distinct complexes based on cell type and HIV status. There are numerous shared proteins between these complexes (see Table 3), with potential fusion partners of the MLL gene in each complex. A super elongation complex (SEC) was demonstrated in 293, a human embryonic kidney derived cell line generated by adenovirus DNA transformation; the 293 line shares many properties with neuronal cells (Shaw, Morse et al.). The SEC was isolated (Lin, Smith et al. 2010) using antibody-mediated purification of epitope-tagged MLL-ELL1, MLL-ENL, MLL-AFF1 or MLL-AF9 transfected into the cells, with subsequent analysis of the immune-precipitate by mass spectrometry. The complexes all contained: cdk9, cyclin T1, T2a and b; AFF1, AFF4, AF9, ENL, and an ELL gene product, 1, 2 or 3. If AFF4 was used for the pull-down, EAF1 and all three ELLs were present. If epitope-tagged ELL1 or ELL2 was used for the pull-down, the other was found in the complex, perhaps implying mixed dimers, while if ELL3 was used for the pull-down, ELL1 but not ELL2 was found. With antibodies to tagged ELL1, no EAF was found, while if ELL2 or ELL3 was used, EAF1 and 2 were co-immuno-precipitated in the complex. This variety indicates that there may be several different complexes based on which ELL is expressed in the cells. What emerges is that AFF4 is crucial for the formation of the SEC and its recruitment to HSP70, both at the promoter and into the body of the gene during transcription (Lin, Smith et al. 2010). Those studies did not address the recruitment of the pTEFb in the SEC from the 7SK:HEXIM complexes. This is an area that will be of interest in the future. Our ChIP experiments, outlined in Figure 3,
showed that DRB treatment to inactivate pTEFb and ser-2 phosphorylation of the CTD eliminated the binding of ELL2 and CstF-64 to the TATA region of Igh. In the light of the discovery of the SEC, this indicates that pTEFb and ELL2 and all of the SEC components may be associated in plasma cells as they are in 293 cells.

Table 3.
|Major MLL fusion partners (Meyer, Kowarz et al. 2009)||Super elongation complex components (Lin, Smith et al. 2010)||(Mohan, Herz et al. 2010)||TAR:TAT interacting complex (He, Liu et al. 2010)||in plasma cells (Martincic, Alkan et al. 2009)|
|ELL1||ELL1, 2 or 3||-||ELL2||ELL2|
|EAF1 or 1+2||EAF2||EAF2|
|Cyclin T||Cyclin T||Cyclin T|
|Supt5h/ suppressor of Ty5|
In HIV infection of T cells, the rate limiting step for transcription is RNA polymerase pausing on viral genes near the 5’ end of all viral transcripts in the Long Terminal Repeat (Peterlin and Price 2006). The pause in transcription is relieved by the action of the HIV-1 Tat protein that recruits the host pTEFb to the TAR element in the viral RNA. TAR is a stem-loop structure near the 5’ end of the LTR that resembles the 7SK RNA and may thus serve as a binding site for pTEFb recruitment. Epitope tagged factors cdk9-F and inducible expression of Tat-HA were used for sequential affinity purifications of proteins (He, Liu et al. 2010) and (Sobhian, Laguette et al. 2010). Associated with pTEFb and Tat were AFF4, ENL, AF9, and ELL2, summarized in Table 3. These complexes are similar to the SEC but differ as well. In the absence of Tat, AFF4 can mediate the ELL2 to pTEFb interaction but less efficiently than with Tat present. This is consistent with the SEC studies where AFF4 seems to be the core of the complex. AF17 and AF10 were not found in the Tat mediated complex and ELL1 was missing; AF17 and AF10 may serve the role that Tat does in holding the SEC together. ELL2 may be more likely to associate when Tat is present than when AF17 or AF10 are part of the complex. The over-expression of Tat or AFF4 seems to use a common mechanism to stabilize ELL2, which is mediated by the cdk9 kinase activity of pTEFb and the sequestration of ELL2 in the complex to block proteolysis. Whether ELL2 is a substrate for pTEFb phosphorylation has not yet been determined, but ELL2 shows a mobility shift in the experiments.
Another interesting observation was that the ELL2:pTEFb complexes were associated with a number of cellular promoters in the absence of TAT:TAR (He, Liu et al. 2010). This implies that the complex is able to enhance the regulation of elongation on many promoters. This has implications for the role of ELL2 in regulating plasma cell genes besides Igh; for example, it could enhance the expression of IRF-4 and blimp-1, thus solidifying their expression and the plasma cell phenotype.
5. Role of ELL2 in Igh expression and alternative RNA processing
5.1. ELL2 is induced and binds to the Igh TATA region in plasma cells
The mRNA for ELL2 has been shown to be induced at least 4 to 6-fold in plasma cells when compared to B-cells by micro-array analyses by several labs (Underhill, George et al. 2003) and (Turner, Mack et al. 1994; Lin, Wong et al. 1997). We showed that both a 59 kDa cleavage product of full length ELL2 protein and a shorter, internal methionine (M186) initiated 58 kDa form of ELL2 were increased in plasma cells and after stimulation of splenic B-cells to Igh secretion (Martincic, Alkan et al. 2009). Taken together these data suggest an important role for ELL2 in Igh chain expression.
The mRNA and protein for ELL2 increase when Ig sec mRNA production increases, and ELL2 is decreased when hnRNP F is over-expressed; over-expression of hnRNP F decreases production of Igh sec mRNA (Martincic, Alkan et al. 2009). By chromatin immunoprecipitation, ELL2 is associated with the endogenous Ig mu heavy chain gene in primary mouse splenic B cells and with the gamma Igh in cultured mouse B and plasma cells. An siRNA to ELL2 is able to diminish Ig secretory mRNA production not only of Ig gamma in cultured plasma cells, but also of Ig mu in primary splenic B cells. That same siRNA to ELL2 is able to inhibit the binding of not only ELL2 itself but also of CstF-64, a factor in the poly(A) addition reaction, to the promoter of an Igh gene transfected into J558L plasma cells, which lack their own Ig gene (summarized in Figure 3).
Increased phosphorylation of both ser-5 and ser-2 on the CTD of RNAP-II and loading of polyadenylation factors and ELL2 onto the polymerase at the promoter in plasma cells vs B-cells (see Figure 3) contribute to regulation of production of Igh mRNA (Shell, Martincic et al. 2007). RNAP-II is therefore more competent to deliver the factors to the first poly(A) site encountered, the sec-poly(A) site in plasma cells. Hence, in the absence of the competing factors that would be both negative for the sec poly(A) site and positive for splicing, and with higher local concentrations of CstF and CPSF, the polyadenylation reaction could be favored over splicing by mass-action. How do the polyadenylation factors load better in plasma cells? We hypothesize that the polyadenylation factor loading onto RNAP-II is directed at least in part by ELL2 in its association with pTEFb and subsequent mPAF interactions.
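The mass-action argument above can be made concrete with a toy two-pathway competition. This is a deliberately simplified sketch, not a model fitted to any data: the function name, the rate constants, and the assumption that the polyadenylation rate scales linearly with the local concentration of polyadenylation factors are all hypothetical.

```python
# Toy model (illustrative assumptions only): each nascent transcript is
# either cleaved at the sec poly(A) site or spliced to the membrane exons.
# The polyadenylation rate is assumed proportional to the local factor
# concentration; the splicing rate is held constant.
def fraction_sec(factor_conc, k_polya=1.0, k_splice=1.0):
    """Fraction of transcripts processed at the sec poly(A) site."""
    rate_polya = k_polya * factor_conc
    return rate_polya / (rate_polya + k_splice)


if __name__ == "__main__":
    # Arbitrary relative concentrations, chosen only to show the trend.
    for conc in (0.25, 1.0, 4.0):
        print(f"relative factor concentration {conc}: sec fraction {fraction_sec(conc):.2f}")
```

In this sketch, raising the factor concentration alone shifts the outcome toward the sec poly(A) site without any change in the splicing rate, which is the essence of the mass-action hypothesis.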
5.2. ELL2, but not ELL1 or PC4, influences the sec poly (A) site choice in Igh reporters
We had previously cloned the ~11 kb IgG2b heavy chain gene with an intact Ig heavy chain promoter, introns and enhancer between VDJ and CH1; production of B- or plasma cell specific Igh sec vs membrane forms of mRNA was shown to be dependent on the cell-type with this reporter (Kobrin, Milcarek et al. 1986). The Igh gene was co-transfected into A20 B-cells along with an empty expression vector or full length cDNAs for ELL2, ELL1, CstF-64 or PC4 cloned into that vector. We also assessed the effect of the NH2-terminal and COOH-terminal portions of ELL2 and of a mutant in which M133, M138 and M186 were changed to isoleucine to prevent internal translational initiation. The ratio of sec (polyadenylated at first site) to mb (splicing to M1) heavy chain mRNA species produced in these cells after 48 hours was quantified by QPCR following RT, using primers specific for sec p(A) site use or splicing. The data are summarized in Figure 6, below. Equal efficiency of transfection by the ELL1 & 2, CstF-64 and PC4 plasmids was assessed by RT-QPCR of their mRNAs using primers unique for the transfected products.
Considering the sec:mb mRNA ratio produced by the IgG2b heavy chain reporter in the B-cells as 1, we saw, in the data summarized in Figure 6, an increase in secretory specific mRNA production of approximately 4.5-fold when A20 B cells were transfected with cDNA for wild-type ELL2. This stimulation of secretory mRNA is more efficient with ELL2 than with co-transfection of CstF-64, shown previously to drive first site poly(A) selection in chicken DT-40 cells (Takagaki, Seipelt et al. 1996). A dominant negative mutation of CstF-64 (d/n), which can assemble into the CstF complex but lacks the essential final 282 amino acids at the COOH domain, suppressed sec poly(A) site use, as we predicted. Co-transfection of the Igh reporter and an shRNA plasmid targeted to ELL2 expression (iELL2) reduced production of the Igh secretory form while a nonsense shRNA (nsiRNA) had no effect.
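The normalization described above, where the sec:mb ratio in the empty-vector control is set to 1, can be sketched as follows. This is a minimal illustration assuming the ratio is derived from threshold cycle (Ct) values with the same amplification efficiency for both primer sets; the text does not state the exact quantification method, and the function names and Ct numbers are hypothetical.

```python
# Hypothetical sketch of sec:mb quantification from RT-QPCR Ct values.
# Assumes both primer sets amplify with the same efficiency (2.0 means
# perfect doubling per cycle); a lower Ct means more starting template.
def sec_to_mb_ratio(ct_sec, ct_mb, efficiency=2.0):
    """Relative abundance of sec mRNA over mb mRNA in one sample."""
    return efficiency ** (ct_mb - ct_sec)


def fold_over_control(sample, control, efficiency=2.0):
    """sec:mb ratio of a sample expressed relative to the control (control = 1)."""
    return sec_to_mb_ratio(*sample, efficiency) / sec_to_mb_ratio(*control, efficiency)


if __name__ == "__main__":
    # (ct_sec, ct_mb) pairs; the numbers are invented for illustration.
    empty_vector = (24.0, 24.0)
    with_factor = (22.0, 24.0)
    print(fold_over_control(with_factor, empty_vector))
```

Expressing each transfection relative to the empty-vector control removes differences in overall reporter expression, so only the shift in processing choice remains.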
When cells were transfected with a plasmid carrying engineered ELL2 with Met 133, 138 and 186 to Ile mutations (see Figure 5 for location of the in-frame methionines), the production of Ig secretory mRNA was still stimulated relative to empty vector, about 4-fold. These cells would presumably be making primarily full length ELL2 protein and the cleaved 59 kDa form but not the 58 kDa internally initiated form. Using a clone with a segment of ELL2 corresponding to only the amino-terminal (NH2) or the 58 kDa protein (COOH ELL2), we observed virtually no stimulation in the production of the secretory specific Ig heavy chain mRNA over that with empty vector in B-cells. Taken together the data indicate that the wild type, full length form of ELL2 is the most efficient at stimulating secretory-specific mRNA production from the Igh reporter. We speculate that the increased production of full length ELL2 results in rapid turnover into the 59/58 kDa forms after having had its effect on the Igh poly(A) versus splicing choice. Some of the ELL2 could be stabilized in the complex with pTEFb as it is in the TAT:TAR complex during HIV infection.
Meanwhile, full length ELL1 was unable to stimulate secretory specific mRNA production, a surprising result in light of the similarities of the sequences of the two proteins and their role at enhancing elongation. With PC4, a co-transcriptional factor with multiple activities (Calvo and Manley 2005) and (Garcia, Rosonina et al. 2010), there was also no stimulation of sec-specific mRNA production, even though we showed that PC4 bound to the Igh promoter in plasma cells.
5.3. ELL2 influences splicing of alternative exons with several promoters
To assess the effects of elongation factors on splicing we used transient transfections with the alternative splicing constructs (Kornblihtt, De La Mata et al. 2004) with the ED1 exon driven by the alpha-globin promoter. This reporter has been used extensively to study the role of elongation on splicing. First we used the alpha-globin promoter to drive transcription and saw an effect of ELL2 on splicing (see Table 4). Then we cloned in the J558 Igh promoter and enhancer to replace the alpha-globin promoter in the reporter to control for potential promoter effects. We used RT-QPCR to assess ED1 exon inclusion or skipping in A20 B-cells. As summarized in Table 4, we saw significant increases in skipping with ELL2 (P<0.001) vs the vector control. ELL1 had a 2 to 4-fold effect on skipping (P<0.05). The COOH portion of ELL2 (aka 58 kDa ELL2, amino acids 186-640) was effective at the splicing choice and this may reflect a role for it in the plasma cell. We saw a greater effect with the NH2 terminal end (aa 1-285) than with the whole ELL2 molecule on the alpha-globin-promoter driven splicing reporters but not on the Ig promoter constructs. This reveals differences in the two reactions and could indicate a role for interactions of the promoter, other factors, or the RNAP-II with portions of ELL2.
Table 4. Fold increase over control
|Assay||ELL2 (full length)||NH2 ELL2 (aa 1-285)||COOH ELL2 (aa 186-640)||ELL1|
|Igh sec poly(A) site vs splicing (Igh promoter)||4.5||-||-||-|
|Exon skipping with Igh promoter||3||-||3||3|
|Exon skipping with alpha-globin promoter||3||6||4||2|
|Proximal poly(A) site choice (400 nts between sites)||-||-||-||-|
|Proximal poly(A) site choice 1 kb spacing||3||-||4.5||Not done|
Meanwhile we saw no effects on poly(A) choice with ELL2 or ELL1 in B cells where tandem poly(A) sites were only about 400 nts apart in the reporter (Table 4). In the Igh gene the two poly(A) sites (sec vs mb) are ~3 kb apart. When the sites were moved >1 kb apart in our polyadenylation-only vector, full length ELL2 stimulated proximal poly(A) site use. Interestingly, the fragment of ELL2 from aa 186 to 640 was more effective than the full length molecule. ELL1 was not tested in this assay but should be for completeness. Thus various portions of ELL2 have different activities. It will be important to determine which of the components of the SEC or TAT:TAR complex interact directly with which portions of ELL2. Yeast two-hybrid studies with cloned portions of the ELL2 molecule are in progress to determine this. The region corresponding to the occludin-like and p53 interacting domain shows trans-activation by itself, indicating that it interacts with the basal transcription machinery in yeast.
6. Insights from viral systems
Other model systems for poly(A) site use include virus infection. In adenovirus distal poly(A) sites in the major late transcript are favored later in infection (DeZazzo, Falck-Pedersen et al. 1991); is this because the levels of polyadenylation factors and elongation factors fall? This has not been explored. It would be interesting to examine the levels of the subunits of the SEC both in the differentiated cells and late in viral infection. The prediction would be that they decline leading to pauci-polymerases with little ability to polyadenylate efficiently. Perhaps a similar situation pertains late in development of a particular tissue, a kind of elongation exhaustion.
The ICP4 gene product of Herpes Simplex Virus Type 1 (HSV-1) targets TFIID, a general transcription factor bound at the promoter (Grondin and DeLuca 2000), to hijack transcription towards viral genes. ICP4 forms complexes with TFIID and mediator (Lester and DeLuca 2011). The prediction would be that the presumed association of the polyadenylation factors with TFIID would allow the HSV genes to associate with mPAF and the elongation factors more efficiently. In addition, another HSV gene product, ICP27, mediates the inhibition of cellular splicing early in infection, whereas later it helps to recruit cellular RNA polymerase II to viral replication sites and to facilitate viral RNA export (Sandri-Goldin 2008). Early on, ICP27 specifically mediates the reduction of phosphorylation of cellular SR proteins and thereby changes their subcellular location; this favors viral mRNA production (Sciabica, Dai et al. 2003). Phosphorylation of SR proteins has been shown to influence RNA binding and varies with physiological changes (Ghosh, Adams et al. 2011). Whether any changes occur in the phosphorylation of SR proteins during B cell maturation has not been explored.
An earlier discussion of the HIV-1 TAT:TAR interaction with components of the SEC shows how viruses can help us clearly get at molecular events in elongation. In HIV-1 the long terminal repeat RNA contains not only the TAR stem loop, for TAT association, but also a poly(A) site followed by a major splice donor. This arrangement of sites is repeated at the 3′ end of the virus where the poly(A) site functions. The promoter proximal poly(A) site is occluded by the presence of the major splice donor (Ashe, Griffin et al. 1995) and the U1 RNA binding there at the splice donor plays a role in the occlusion (Ashe, Furger et al. 2000). In addition there is a gene loop structure that is found between the 5′LTR promoter and 3′LTR poly(A) signal. An inhibitor of pTEFb (flavopiridol) blocks 5′ to 3′LTR juxtaposition, indicating that this structure is maintained during transcription. Activation of the 5′LTR poly(A) signal or inactivation of the 3′LTR poly(A) signal abolishes gene loop formation. Thus transcription, elongation factors, and pre-mRNA processing are essential for gene loop formation (Perkins, Lusic et al. 2008). The prediction is that these structures represent a defining feature of regulation for the virus. It has been suggested that the transcription factory for cellular genes stays put while the DNA and RNA thread through. The role of elongation factors in formation of this structure is unknown.
7.1. Model for regulated alternative RNA processing in the Igh gene
Figure 7 presents a model for what we know about Igh alternative mRNA processing. In B-cells, U1A (Milcarek, Martincic et al. 2003), hnRNP F (Veraldi, Arhin et al. 2001), and Serine-Arginine-rich (SR) proteins like ASF/SF2 (Bruce, Dingle et al. 2003) are expressed at relatively higher concentrations than in plasma cells. The CTD of RNAP-II on the Igh gene is relatively under-phosphorylated; the polyadenylation factors CPSF and CstF and ELL2 are not strongly associated with the polymerase (Shell, Martincic et al. 2007). The question mark indicates that we do not know if enhancing SR protein(s) like ASF/SF2 for the 5’ splice to the M1 exon is/are associated with the RNAP-II. When the secretory-specific poly(A) site is transcribed there are few polyadenylation factors on RNAP-II to recognize it (thus it is a pauci-polymerase). In B-cells, splicing between the 5’ splice site in the terminal (CH3 in gamma, CH4 in mu) exon and the M1 exon presumably occurs as the default pathway. This leads to the production primarily but not exclusively of the membrane-specific form of the Igh mRNA and protein at a low level. (Some sec mRNA is made at a low level.)
In plasma cells the RNAP-II is more heavily phosphorylated early on, perhaps by cyclin C and cdk8, perhaps through the action of NF-kB, and RNAP-II is associated with the polyadenylation factors and ELL2 via the associated pTEFb kinase activity. Polyadenylation factors are also associated with the polymerase, perhaps through the action of mPAF. When the RNAP-II reaches the sec poly(A) site the factors act upon the pre-mRNA and trigger production primarily but not exclusively of the sec-specific mRNA. This is an efficient process and mRNA yield per polymerase pass is much higher. The splice to M1 cannot then occur on the cleaved RNA; the CH3 exon splice to M1 is thus “undefined” by either an active or passive mechanism.
7.2. Questions remaining
Since the secretory poly(A) site occurs first in the Igh transcript, how is membrane-specific Igh mRNA ever made? We know that the levels of U1 snRNP splice factor and Serine Arginine (SR) exon enhancing factors are higher in B-cells than in plasma cells. This could facilitate stronger 5’ splice site recognition. In B-cells ELL2 levels are lower and the RNAP-II is quite deficient in polyadenylation factors. U1A and hnRNP F block access to the poly(A) site as well (Phillips, Pachikara et al. 2004; Ma, Gunderson et al. 2006; Veraldi, Arhin et al. 2001). The sec poly(A) site is therefore at its weakest in B-cells and is used less than half the time. As a result, some RNAP-IIs ignore it and proceed the 3 kb downstream to the membrane poly(A) site. The long intron (~3 kb) allows the setting up of the weak 5’ splice site in CH3 with its associated enhancing factors on the pre-mRNA. Meanwhile the RNAP-II may have acquired polyadenylation factors in the IVS between CH3 and M1, ‘on the fly’ as it were, as is seen for the c-myc and GAPDH genes (Glover-Cutter, Kim et al. 2008). Acquisition of polyadenylation factors by the polymerase ‘on the fly’ may be a less efficient process and could account for the overall reduced amount of processed Igh transcript seen in B-cells. This may explain the pause seen in some studies after the secretory poly(A) site in both B cells and plasma cells (Peterson, Bertolino et al. 2002), and listed in Table 1. In the case of B cells perhaps the polyadenylation factors are added to RNAP-II at the pause, although at a sufficiently low level that we were not able to see them in our chromatin IP studies. In plasma cells the polymerase may pause to allow time for the polyadenylation reaction to occur. In vitro, a coupling between splice sites and poly(A) sites has been seen which may slow the polymerase down (Rigo and Martinson 2008), and this may explain the pause.
The poly(A) site at the last membrane-encoding exon is known to be a strong default site but the RNAP-II must be competent for processing the RNA when it reaches it. Some transcripts probably never get processed at all and turn over rapidly in the nucleus.
Is the skipping of the 5’ splice site in CH3 an active or passive process and are SR proteins involved? The increase of SRp20 levels in plasma cells suggests that the splicing inhibitory process of this SR protein may play a significant role. It is not known if it travels with the RNAP-II complex in plasma cells or associates with the nascent RNA. This remains an interesting and open question.
What role does histone modification play in directing polyadenylation? The increase in ser-5 and ser-2 phosphorylation of the RNAP-II CTD near the Igh promoter should be accompanied by increases in H3 K4, K36 and perhaps K79 methylations. This is another open question in the regulation of the alternative processing of Igh RNA.
An additional question is the role of ELL2 vs ELL1 in the splicing vs polyadenylation choice. Both ELL2 and ELL1 direct exon skipping with an alpha-globin or the Igh promoter (Table 4). When we assessed the “COOH” portion of ELL2 (aa 186-640) in the Igh reporter assay, it had minimal effect on first poly(A) site use. However, that protein was lacking the first 186 amino-acids encompassing one region thought to interact with the mediator complex (aa 168-186) as well as the EAF2 interaction region (aa 6-80). Therefore we hypothesize that when ELL2 was enhancing the choice of the sec poly(A) site it might have been acting as a bridge between several factors. This is in keeping with two regions of ELL2 interacting with either different mediator subunits or EAF1. Hence maintaining proper spacing may be key. It will be interesting to determine what proteins uniquely interact with ELL1 versus ELL2. Perhaps they are AF17 and AF10, see Table 3.
A major issue for plasma cell development is whether ELL2 drives the expression of other genes besides Igh. In the TAT:TAR studies ELL2 was found on several promoters by chromatin IP. Our preliminary data support a broader distribution of ELL2 on plasma cell genes. If ELL2 acts independently of the Igh promoter it may assist the expression of IRF4 and blimp-1, thus providing a feed-forward loop for its own expression. More IRF4 would lead to more ELL2 which would lead to more IRF4 and so on. Genome-wide mapping of ELL2 binding would help to reveal its targets. A conditional knock-out of ELL2, in which the animals survive, would also help show what it is responsible for activating in specific cell lineages. These are separate but complementary questions.
The Igh locus has been a useful model in an important biological system; we have much more to learn from it. The experiments done thus far on the role of elongation in directing poly(A) site use and choice provide us with worthwhile glimpses of how RNAP-II functions with a myriad of other factors to produce large amounts of mature mRNA. The interactions of these complexes with the specific modifications of the CTD of RNAP-II may be approachable only by in vitro methodologies, as was done for the interactions of SET2 with phosphorylated CTD (Vojnic, Simon et al. 2006). The role of individual factors like ELL2 may have some redundancy with other elongation factors, but the observation that portions of ELL1 and ELL2 differ in primary sequence permits us to consider that each factor may play a unique role in gene expression. The additive effect of all the factors in a single complex (SEC) implies they are all needed. The immune system is particularly suited to the study of conditional knock-outs targeted specifically at B-cells or T-cells. Knocking out ELL2 specifically in the B-cell lineage should provide valuable insights into its role in elongation and Igh mRNA processing. Once we fully understand how polyadenylation is linked to transcription we may then be able to see more easily how the competition between splicing and polyadenylation is resolved.
I extend my sincere thanks to all my trainees who participated in the research presented here and to many colleagues who have honed my thinking on this topic. I apologize to those whose publications I was not able to discuss or cite due to space limitations. This work was supported by grant # MCB-0842725 from the National Science Foundation.