Histone modifications found in different parts of RNA polymerase II genes. Underlined modifications change with the level of transcription. Font size represents the level of the histone modification relative to the others in the same study. The predominant cell type used in each study is given under the reference.
RNA processing is an essential process in eukaryotic cells, creating different RNA species from one and the same gene. RNA processing occurs on nearly all kinds of RNAs, including mRNA that codes for proteins, ribosomal RNA, tRNA, snRNAs, and miRNA. RNA processing usually occurs co-transcriptionally, and many factors are recruited by the RNA polymerase itself. This stimulates RNA processing by enhancing the correct assembly of factors as the RNA is being produced. Some factors, such as splice factors and cleavage factors for rRNA, are also recruited by the growing RNA chain. A further link has been established by the transcription rate itself: a slow RNA polymerase that pauses frequently favours the inclusion of alternative splice sites, for instance.
1.1. RNA processing in RNA polymerase II transcription
RNA processing of the mRNA, comprising 5’ capping (addition of a methyl-guanosine at the 5’ end), splicing (removal of internal introns) and polyadenylation (cleavage and polyadenylation of the 3’ end), is tightly coupled to transcription and takes place mainly co-transcriptionally (for review see Moore and Proudfoot, 2009; Wahl et al., 2009). The factors required for these processing events are recruited to the growing RNA chain during transcriptional elongation by specific sites or structures on the nascent RNA, the transcription machinery and the chromatin environment (for review see Moore and Proudfoot, 2009; Alexander and Beggs, 2010; Luco et al., 2011; Schwartz and Ast, 2010; Carrillo Oesterreich et al., 2011). Some of the factors necessary for RNA processing are also recruited by the RNA polymerase itself. RNA polymerase II has a C-terminal domain (CTD) with a repeated sequence of seven amino acids on the largest subunit (Rpb1). The CTD forms a platform with which many different proteins involved in transcription may associate. The repeated sequence, YSPTSPS, is conserved in different organisms, whereas the length of the CTD differs: 52 repeats in human cells and 26 in yeast. The CTD goes through a series of phosphorylations depending on where in the transcription cycle the polymerase is: unphosphorylated at recruitment to the promoter, serine-5 (Ser-5) phosphorylated at promoter clearance (the first 20-40 nucleotides) and serine-2 (Ser-2) phosphorylated on the elongating RNA polymerase II (for review see Meinhart et al., 2005; Buratowski, 2009). Other modifications exist; Ser-7 phosphorylation and the recently found arginine methylation play roles in snRNA and snoRNA transcription (Egloff et al., 2007; Kim et al., 2009; Sims et al., 2011), and Ser-7 phosphorylation was recently found to be more abundant in intronic sequences (Hyunmin et al., 2010).
The CTD cycle relies on kinases and phosphatases acting at the right moment during transcription (reviewed in Phatnani and Greenleaf, 2006; Buratowski, 2009). First, the CTD is phosphorylated at Ser-5 at the promoter by cdk7/kin28 in the RNA polymerase II auxiliary factor TFIIH, and this promotes promoter clearance. The shift into elongating mode is brought about, at least partly, by the dephosphorylation of Ser-5 by the specific phosphatases Rtr1 and Ssu72 and by the phosphorylation of Ser-2 by cdk9 in the active P-Tefb complex in metazoans (ctk1 performs a similar task in yeast). Fcp1 is the phosphatase connected to dephosphorylation of Ser-2 of the CTD of RNA polymerase II. Recent studies have questioned this static view, and a dynamic turnover of the phosphorylations has now been put forward, in which an elongating RNA polymerase is predominantly Ser-2 phosphorylated but is also Ser-5 phosphorylated to some extent (Buratowski et al., 2009).
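The heptad organisation of the CTD described above can be made concrete with a short sketch. It builds an idealized CTD consisting only of consensus YSPTSPS heptads, using the repeat counts quoted in the text (52 in human cells, 26 in yeast), and lists where the Ser-2, Ser-5 and Ser-7 residues fall. The assumption that every repeat matches the consensus is a simplification made here for illustration, and the function names are hypothetical.

```python
# Consensus heptad of the RNA polymerase II CTD, as given in the text.
HEPTAD = "YSPTSPS"

# Repeat counts quoted in the text. Assumption: every repeat is treated as
# a perfect consensus heptad, which is a simplification.
REPEATS = {"human": 52, "yeast": 26}


def idealized_ctd(organism: str) -> str:
    """Return an idealized CTD made only of consensus heptads."""
    return HEPTAD * REPEATS[organism]


def serine_positions(ctd: str, heptad_index: int) -> list[int]:
    """1-based positions of the serine at heptad position 2, 5 or 7."""
    if HEPTAD[heptad_index - 1] != "S":
        raise ValueError(f"heptad position {heptad_index} is not a serine")
    return [i + 1 for i in range(heptad_index - 1, len(ctd), len(HEPTAD))]


if __name__ == "__main__":
    for organism in REPEATS:
        ctd = idealized_ctd(organism)
        ser5_sites = serine_positions(ctd, 5)
        print(f"{organism}: {len(ctd)} residues, {len(ser5_sites)} Ser-5 sites")
```

Each heptad contributes one potential site for each of the three serine marks, so the number of phosphorylatable positions scales directly with the repeat count.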
1.1.1. The CTD of the RNA polymerase in RNA processing – recruitment versus elongation rate
The concept that the transcriptional elongation rate and the recruitment of RNA processing factors through the CTD of RNA polymerase II are important for RNA processing has been known for some time (McCracken et al., 1997). Several proteins are recruited to transcription sites via association with the CTD, often depending on its phosphorylation state (reviewed in Perales and Bentley, 2009). Capping of the nascent transcript occurs co-transcriptionally, as soon as the growing RNA leaves the exit channel of RNA polymerase II. The enzymes of the capping machinery, at least Ceg1 in yeast, associate with the Ser-5-phosphorylated CTD of the initiating RNA polymerase II (Komarnitsky et al., 2000; Schroeder et al., 2000). The recruitment of the 5’ capping machinery to the RNA polymerase enhances the capping reaction (Moteki et al., 2002), but it also influences transcription. It has been suggested that the capping machinery, or capping of the nascent transcript, stabilises the RNA polymerase, which helps to convert the initiating polymerase into an elongating polymerase (Moore and Proudfoot, 2009; Perales and Bentley, 2009). Termination and polyadenylation are also enhanced by the association of the polyadenylation machinery with the Ser-2-phosphorylated CTD (Komarnitsky et al., 2000; Adamson et al., 2002; Lunde et al., 2010).
The spliceosome is assembled from different snRNPs that are recruited by cis-acting sequences on the RNA, the 5’ splice site and the 3’ splice site flanking introns. The recruitment of splicing factors occurs stepwise, with the U1 snRNP assembling on the 5’ splice site and the U2 snRNP at the 3’ splice site, before the U4/U6.U5 tri-snRNP assembles and the intron is removed (for review see Wahl et al., 2009; Moore and Proudfoot, 2009). The recruitment and assembly of the different snRNPs is enhanced by transcription, and the complexes are further stabilised by the cap-binding complex (Lacadie et al., 2006; Listerman et al., 2006; Görnemann et al., 2005). A close coupling of splicing to the transcriptional process has also been seen in several in vitro studies (Hicks et al., 2006; Das et al., 2006), in which transcription enhances splicing when RNA polymerase II is transcribing, but not when T7 polymerase, which lacks the CTD, is used. It has also been shown that some of these snRNPs and splicing factors associate with RNA polymerase II itself, in particular U1 snRNPs and some SR proteins (Morris et al., 2000; Das et al., 2007) (Figure 1A). The U1 snRNP was recently shown to associate with RNA polymerase II early in the transcription cycle, independently of whether the gene contains introns or not, an interaction that would allow the U1 snRNP to scan the growing RNA for splice sites (Brody et al., 2011). The number of spliceosomes increases on intron-containing genes (Brody et al., 2011), most likely because they are recruited to the pre-mRNA by splice sites. These studies have led to the proposal that spliceosomes are efficiently loaded onto the nascent transcript if recruited by the RNA polymerase early in the transcription process, since the factors then do not have to compete with inhibitory RNA-binding factors.
However, the interactions between the CTD or other subunits of RNA polymerase II and splice factors vary from study to study, which calls the generality of the results into question. It is possible that the interactions only occur on specific genes or in specific cell types (discussed in Carrillo Oesterreich et al., 2011).
It has also been shown that different promoters affect splicing differently, suggesting that different transcription factors at the promoter recruit specific splice factors. In particular, co-regulators of nuclear receptors associate with splice factors, which in turn also act as transcription factors, regulating transcriptional initiation and elongation (Auboeuf et al., 2004; Auboeuf et al., 2007). Splicing also enhances transcription, in particular at splice sites near the promoter (Furger et al., 2002; Damgaard et al., 2008). Promoter-proximal splice sites increase the binding of TFIID, TFIIB and TFIIH, all important general transcription factors for transcriptional initiation, and thereby increase loading and initiation of RNA polymerase II. Furthermore, the splice factor SC35, which is a splicing enhancer, associates with RNA polymerase and with cdk9 in the P-Tefb complex (Lin et al., 2008). Depletion of SC35 results in a transcription block, most likely caused by a lack of P-Tefb at the transcription site, which in turn leads to a severe reduction in the Ser-2 phosphorylation of the CTD that is necessary for the switch to the elongating form of RNA polymerase II. Another example of a splice factor that interacts with P-Tefb is SKIP (prp45 in yeast) (Brès et al., 2005), which in this way influences transcriptional elongation.
The RNA polymerase pauses during transcription, both near the transcription start site and locally inside the gene body. It is well established that RNA polymerase II stalls proximal to the promoter before switching from an initiating mode with a Ser-5-phosphorylated CTD to an elongating mode with a Ser-2-phosphorylated CTD (Brodsky et al., 2005). The switch is achieved by the recruitment of P-Tefb by the Ser-5-phosphorylated CTD, but also by histone modifications, such as H3K4me3 and H2Bub (for review see Brès et al., 2008; Lenasi and Barboric, 2010). Pausing of RNA polymerases also occurs in the gene body (Figure 1B). This local pausing may be achieved by a number of mechanisms: intrinsic features of RNA polymerase II, such as backtracking, in which the polymerase slides backwards; elongation factors; and features of gene architecture, such as DNA sequences and structures formed in the growing RNA (reviewed in Carrillo Oesterreich et al., 2011; Perales and Bentley, 2009). Several genome-wide studies of the distribution of RNA polymerases have observed an accumulation at exons, suggesting a slower rate when the RNA polymerase meets an exon. It has been shown that paused RNA polymerase II is hyperphosphorylated, with both Ser-5 and Ser-2 phosphorylation present on the CTD (Munoz et al., 2008). Since both mutation of the CTD mimicking constant phosphorylation and inhibition of CDK9 induced a slower rate, it was suggested that homogeneous phosphorylation is required and that the slower rate is achieved by altering the phosphorylation/dephosphorylation cycle (Munoz et al., 2008). The slower elongation rate often occurs at the 3’ splice site of introns, concomitant with hyperphosphorylation of the CTD and splice factor recruitment, as was shown in UV-damaged cells (Munoz et al., 2008).
Recent studies have also shown that the splicing event itself induces transcriptional pausing. The RNA polymerase accumulates in the Ser-5 CTD form at 3’ splice sites in the gene body (Alexander et al., 2010). Alexander et al. (2010) used an intron-containing model gene in yeast to map RNA polymerases at high resolution. They suggest a checkpoint control, related to splicing, in which splice factors such as SC35, SKIP or DExD/H-box RNA helicases, or even base-pairing between the snoU12 and the mRNA, are involved. A similar mechanism has also been shown at the terminal exon in yeast, where an accumulation of RNA polymerase II is found (Carrillo Oesterreich et al., 2010). Intronless genes and genes with mutated splice sites do not have stalled RNA polymerases (Alexander et al., 2010; Carrillo Oesterreich et al., 2010).
1.1.2. Alternative splicing – Recruitment versus the transcriptional elongation rate
Splicing is a regulated process, and exons can be included or excluded to give different mRNAs, which generates a diversity of protein products. Alternative splicing is regulated by splice factors, such as specific SR proteins and hnRNP proteins, which bind to specific sequences on the RNA. Both splicing enhancer elements and splicing silencers exist, and these sequences are found in both introns and exons (for review see Wahl et al., 2009). The splice factors can be expressed in a tissue-specific manner or can be activated by signalling pathways. In addition, core splice components are involved in alternative splicing, which has in some cases been attributed to the production of different isoforms with specific functions (Dredge et al., 2005) and in others to autoregulation coupled to degradation by the nonsense-mediated decay pathway (Saltzman et al., 2008). Although a general splice factor is alternatively spliced and autoregulated, not all splicing events are affected. Recently, it was shown that the U1 core factor SmB/B’ is autoregulated, and that its downregulation affects alternative splicing of several RNA processing proteins, resulting in a reduction of snRNPs (Saltzman et al., 2011). Another mechanism that affects splicing outcome is the transcription rate; pausing of the RNA polymerase favours inclusion of exons (Kornblihtt et al., 2004). This was shown by using a slowly transcribing RNA polymerase II, which produced transcripts with more included exons (de la Mata et al., 2003). The slower rate gives time for splice factors to find the splice site, in particular the weak 3’ splice site upstream of alternative exons. Regulation of alternative splicing has therefore been suggested to rely on the elongation rate, with pausing of RNA polymerase II resulting in exon inclusion. It has been hard to find one general mechanism for alternative splicing, which is sometimes an all-or-nothing event, whereas in other cases several alternative splice forms exist in the same cell.
Many different factors are involved in splice site choice, many only being involved in a subset of genes. The complex regulation, both by specific alternative splice regulators, general splice factors and the transcription elongation rate, is a challenge. In addition, transcription occurs in a chromatin environment, with nucleosomes being present along the gene. This provides a further layer of regulation, in addition to splice factors and the transcription rate.
1.2. Chromatin influences RNA processing
Co-transcriptional RNA processing occurs in a chromatin environment. Chromatin constitutes a barrier for transcription, at both initiation and elongation. The structure changes in several ways upon transcription: the nucleosomes are moved and the histones in nucleosomes are modified. Histone acetylation and H3K4me3 are particularly abundant at promoters, creating an open chromatin structure so that TFIIB and the RNA polymerase can bind. These modifications decline towards the 3’ end of the gene, where H3K36me3 appears instead.
1.2.1. Nucleosome distribution at exons
Several studies have suggested that the chromatin architecture influences RNA processing of RNA pol II genes. Based on DNA sequence analysis, Baldi et al. (1996) and Kogan and Trifonov (2005) suggested that exons have nucleosomes positioned at intron-exon junctions. Genome-wide analyses of MNase-digested chromatin from CD4+ T cells subjected to deep Solexa sequencing (presented in Schones et al., 2008) have shown that exons are more prone to harbour positioned nucleosomes than intronic regions (Tilgner et al., 2009; Schwartz et al., 2009; Andersson et al., 2009; Hon et al., 2009; Spies et al., 2009; Nahkuri et al., 2009; Chodavarapu et al., 2010). If the exon is longer than 147 bp (the number of base pairs wrapped around the histone core), the positioned nucleosome sits at the 3’ end in chromatin from C. elegans (Kolasinska-Zwierz et al., 2009). The distribution observed in human cells differs, however: the peak observed is at the 5’ end of exons (Hon et al., 2009). The average exon length in human cells is approximately 150 bp, fitting the length of one nucleosome, but both shorter and longer exons exist. The discrepancy between studies may be attributed to the definition of exon length used. When both shorter and longer exons are analysed, a consensus arises: short exons (50 bp or shorter) do not contain a positioned nucleosome, whereas long exons (more than 300 bp) have positioned nucleosomes at both the 5’ end and the 3’ end (Schwartz et al., 2009; Andersson et al., 2009; Hon et al., 2009). The positioning of nucleosomes is stronger in exons with weak splice sites, but pseudo-exons (with strong splice sites that are not used) are depleted of nucleosomes (Tilgner et al., 2009; Spies et al., 2009). Schwartz et al. (2009) found that the occupancy was related to exon usage, with fewer nucleosomes on low-abundance alternative exons and more on high-abundance constitutive exons. The higher nucleosome occupancy in exons is a conserved feature, found in MNase-seq data from C. elegans, Drosophila, mice and man, and in a number of other eukaryotes, such as fungi and plants.
The position of the nucleosomes is explained by the DNA sequence, with the higher GC content of exons producing a favourable curvature in the DNA (Tilgner et al., 2009; Schwartz et al., 2009; Spies et al., 2009). This has been suggested to be caused by a codon bias that excludes long A-tracts from exons (Cohanim and Haran, 2009). Another explanation is provided by Schwartz et al. (2009), who found nucleosome exclusion sequences at the 3’ splice site, which push the nucleosome into the exon. The fact that transcription does not affect the position of the nucleosomes suggests that the underlying sequence determines the nucleosome pattern and forms a mechanism to mark exons (Andersson et al., 2009; Chodavarapu et al., 2010). However, a study investigating three cell types, the erythroid K562, the monocytic U937 and CD14+ monocytes, showed that the pattern of positioned nucleosomes varies: a higher density over exons of expressed genes is seen in primary CD14+ cells, whereas the cell lines K562 and U937 displayed a lower density (Dhami et al., 2010). These results may reflect differences in cell type, and more cell types must be investigated to resolve the matter.
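The exon-length consensus summarised above can be written as a small rule-based sketch. The thresholds for short exons (50 bp or shorter) and long exons (more than 300 bp) are taken from the text; the treatment of intermediate exons as carrying a single positioned nucleosome is an assumption made here, motivated only by the remark that an average exon of about 150 bp fits one nucleosome. The function name and return labels are hypothetical.

```python
# Thresholds quoted in the text (Schwartz et al., Andersson et al., Hon et al., 2009).
SHORT_EXON_MAX_BP = 50    # exons of this length or shorter: no positioned nucleosome
LONG_EXON_MIN_BP = 300    # exons longer than this: nucleosomes at both ends
NUCLEOSOME_DNA_BP = 147   # DNA wrapped around one histone core


def positioned_nucleosomes(exon_length_bp: int) -> list[str]:
    """Return where positioned nucleosomes are expected for an exon of this length."""
    if exon_length_bp <= 0:
        raise ValueError("exon length must be positive")
    if exon_length_bp <= SHORT_EXON_MAX_BP:
        return []
    if exon_length_bp > LONG_EXON_MIN_BP:
        return ["5' end", "3' end"]
    # Assumption (not stated as a rule in the text): intermediate exons,
    # close to one nucleosome length, carry a single positioned nucleosome.
    return ["within exon"]


if __name__ == "__main__":
    for length in (40, NUCLEOSOME_DNA_BP, 450):
        print(length, "bp:", positioned_nucleosomes(length) or "none")
```

The sketch ignores splice-site strength and exon usage, both of which the studies cited above report as modulating nucleosome occupancy.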
The nucleosome distribution at polyadenylation sites is also set, with a region around the polyadenylation signal being depleted of nucleosomes and a region downstream showing a higher enrichment of nucleosomes (Spies et al., 2009; Carrillo Oesterreich et al., 2010). The depletion of nucleosomes is not linked to the level of transcription, but the downstream nucleosomes depend on the level of usage of the polyadenylation site: highly used sites have more nucleosomes 76-375 bp downstream of the polyadenylation site.
1.2.2. Histone modifications at exons
The nucleosome occupancy is only modestly increased at exons compared to introns (1.5-fold) and this is not affected by transcription (discussed in Schwartz and Ast, 2010), which has led several studies to correlate the nucleosome occupancy data with genome-wide ChIP-seq data for several histone modifications. Histone modifications are involved in the regulation of transcription initiation and elongation, where they both change the nucleosome structure and recruit factors. Combining the nucleosome distribution data with data on histone modifications along genes identified H3K36me3 as a specific modification that accumulates at exons and is more pronounced at exons in the 3’ end of the gene (Kolasinska-Zwierz et al., 2009; Schwartz et al., 2009; Andersson et al., 2009; Hon et al., 2009). Tilgner et al. (2009), however, could not find a clear accumulation of H3K36me3 at exons when the signal was adjusted for the distribution of nucleosomes. The conflicting results, which are mainly based on the same data, could be due to different normalisation criteria and different algorithms used when correlating data obtained with different techniques: MNase-seq with ChIP-seq and ChIP-chip data (discussed in Ringrose, 2010). Nevertheless, most studies identified H3K36me3 as a mark for exons, together with several other modifications, forming a modification code along genes (Table 1). H3K36me3 accumulation at exons in the gene body was also seen in expressed genes when Dhami et al. (2010) investigated the three different cell types. In this study, some histone modifications, H3K9me2/3 and H3K27me2/3, were excluded from exons in expressed genes. These cell types also display a specific modification, H3K27me3, at exons in non-expressed genes. A recent study has further examined the nucleosomal architecture at exon-intron junctions, based on published genome-wide MNase-seq and ChIP-seq surveys (Huff et al., 2010).
A specific histone modification pattern was identified in the 5’ end of the gene, different from that in the 3’ end of the gene. The genes were therefore divided into three parts, the promoter region, the 5’ end of the gene (up until the internal exons) and the 3’ end of the gene, to further map the chromatin landscape. The 5’ intron-rich region had higher levels of H3K79me2, and H2Bub also peaked there. H3K36me3 was higher in the 3’ exon-rich part of the gene, where it peaked near exons and extended downstream.
|Promoter||Exon 5’||Exon 3’||Alternative exons||Silent exons||Excluded from exons||Introns||Pol II accumulation||Ref.|
|H3K36me3||H3K79me2?||Huff et al.|
|Kolasinska-Zwierz et al.|
|Dhami et al. three cell lines|
|H3K4me3 at TSS||H3K4me3||H3K36me3|
|H3K9me3||no||Spies et al.|
|H3K36me3||At exons||Schwartz et al.|
|H4K20me1||Tilgner et al.|
|Hon et al.|
|H3K36me3||H3K27me2/3||Andersson et al.|
1.2.2.1. How are the histone modifications achieved and maintained?
The distribution of the nucleosomal exon-intron pattern is mainly explained by the underlying DNA sequence, but the histone modifications vary along the gene, indicating that active mechanisms also apply to set and maintain the pattern. Histone modifications over exons and introns follow the level of transcription, but not completely. Histone modifications are placed on the histone tails by modifying enzymes recruited to the site of action (reviewed in Gardner et al., 2011; Murr, 2010). The modifications associated with promoters, acetylated histones and H3K4me3, are set by histone acetyl transferases (HATs) and the SET1/MLL/COMPASS methyl transferases, respectively. These modifications are then removed by histone deacetylases (HDACs) and histone demethylases (HDMs). HATs are often co-regulators at the promoter, and MLL is recruited to the promoter by the Ser-5-phosphorylated CTD of RNA polymerase II. However, sometimes histone modifications at the promoter do not depend on transcription; instead, they prime the genes for transcription (Raisner et al., 2005; Kouskouti and Talianidis, 2005; Liber et al., 2010; Min et al., 2011). Other histone marks are more abundant in the gene body (Murr, 2010; Bannister and Kouzarides, 2011). H3K36 tri-methylation is achieved by the HypB/setd2 (Set2 in yeast) methyl transferases (Edmunds et al., 2008), which in turn are recruited to the transcribed gene, most likely by the Ser-2-phosphorylated CTD of the elongating RNA polymerase II (Eissenberg et al., 2007). The occurrence of modifications specifically at exons raises the question of how these patterns are set. By comparing the patterns of genes at different expression levels in the three cell types investigated, Dhami et al. (2010) identified a priming event consisting of H3K27me3 and H3K36me1 at non-expressed genes. These marks were then replaced by H3K36me3 and H3K27me1 during transcription, and the levels reflected the expression level of the gene.
This study also demonstrated a difference in alternatively spliced exons, showing less H3K36me3 at less-included exons. This suggests that some of the modifications are set not only by events during transcription, such as the elongation rate of RNA polymerase II, but also by events in splicing.
Huff et al. (2010) addressed the question of whether splicing per se can affect the nucleosome modification pattern by investigating two genes that are alternatively spliced upon stimulation, YPEL5 and CD45. No effect on the H3K36me3 levels could be seen upon inclusion of the alternative exons, which argues against splicing events setting the marks. Instead, these marks are relatively stable, similar to histone marks at promoters, where marks are present without active transcription. Based on these results, a model in which exons are defined, or primed, by histone marks can be put forward. In this model the basic marks can shift according to a set pattern, from H3K27me3 to H3K36me3, as a result of active transcription, but not of splicing. However, this model is based on only two studies and needs verification. It is also worth noting that exons have a higher degree of DNA methylation, both in plants and in human cells (Chodavarapu et al., 2010). Whether this is a result of a higher DNA methylation of nucleosomal DNA or of an active recruitment of DNA-methylation enzymes to exons remains to be resolved. In addition, the question of how the priming of exons in non-transcribed genes is achieved remains to be investigated.
1.2.2.2. The functional significance of the chromatin pattern
The next question to be addressed is whether the nucleosome distribution and the histone modifications associated with exons and introns play a functional role in splicing. Two ideas have been proposed: the nucleosomes form “speed bumps” for the RNA polymerase, or the histone modifications on nucleosomes form recruitment platforms for splice factors (Figure 2A and 2B). The “speed bump” model is related to the “kinetic model”, and presumes that nucleosomes constitute a barrier for the elongating RNA polymerase II, in particular that nucleosomes at exons make the polymerase pause more. An accumulation of RNA pol II is also seen over exons (Schwartz et al., 2009; Dhami et al., 2010; Chodavarapu et al., 2010), indicating that the higher nucleosome density at exons (Schwartz et al., 2009) or the identity of the specific modifications (Dhami et al., 2010) provides a barrier for the RNA polymerase and makes it pause. The pausing of RNA polymerase II will then allow splice factors to assemble onto the growing mRNA. The “recruitment model” instead proposes that the chromatin architecture, with different histone modifications, is involved in recruiting components of the spliceosome and splice factors, such as SR proteins and hnRNPs. These proteins will then decide the splicing outcome by binding to the nascent RNA and directing the spliceosome to the right place. These two models will be discussed below.
1.2.2.3. RNA polymerase rate in a chromatin environment and alternative splicing
The elongation rate of RNA polymerase II has an effect on splice site choice, as can be seen in alternative splicing. Alternative exons are included when the elongating RNA polymerase II moves more slowly (de la Mata et al., 2003). Inclusion of alternative splice sites can also be seen in cells treated with different inhibitors of transcription, such as 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB), which inhibits P-Tefb from phosphorylating the elongating RNA polymerase II, and camptothecin (CPT), an inhibitor of topoisomerase I (reviewed in Kornblihtt, 2004; Kornblihtt, 2007). The slower rate gives time for splice factors to find not only the strong splice sites, but also the weak splice sites surrounding alternative exons, and to assemble the U2 snRNP before the appearance of a strong splice site (Figure 2A). However, a slow RNA polymerase II may not always lead to inclusion of alternative exons, as specific inhibitors of alternative splicing also have time to be recruited to splicing silencers (Pagani et al., 2003). Whether changes in the elongation rate of the RNA polymerase along a gene are a general mechanism to affect splicing outcome is still unclear. It is not yet established whether the elongation rate of RNA polymerase II changes at alternative exons in all genes, or whether this is a local effect seen in only a subset of genes. It has been shown that the elongation rate of RNA polymerase II is constant during transcription, at approximately 3.8 kb/min, independently of whether intron-rich regions or exon-rich regions are transcribed. This was measured on a few endogenous genes with long introns by qPCR following the release of a DRB block of elongation by RNA polymerase II (Singh and Padgett, 2009). In a separate study using FRAP of GFP-RNA polymerase II on a model gene, Brody et al. (2011) found that the RNA polymerase rate on intronless genes is similar to that on genes harbouring several exons and introns.
On the other hand, unspliced polyadenylated mRNA remains at the gene locus until properly spliced. These studies did not address the question of RNA polymerase processivity on genes with alternatively included exons. Recently, Ip et al. (2011) showed that the elongation rate affects the inclusion of alternative splice sites in a subset of genes. From a mechanistic point of view, the exons affected displayed an accumulation of RNA polymerase II upstream of the exon, in agreement with paused polymerases. These exons were surrounded by weaker splice sites than other alternative exons, and followed by a strong 3’ splice site at the downstream exon. The genes harbouring alternative exons sensitive to a slower elongation rate were mainly involved in RNA processing and apoptosis, and the alternative exon usage often introduced a premature termination codon, marking the transcript for nonsense-mediated decay.
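The kinetic argument above comes down to simple arithmetic, sketched below under stated assumptions. The elongation rate of about 3.8 kb/min is the value quoted in the text (Singh and Padgett, 2009); the spacing between a weak upstream splice site and a competing strong downstream site, and the halved rate used for a slow polymerase, are hypothetical numbers chosen only for illustration. The sketch assumes a constant rate without pausing.

```python
# Elongation rate quoted in the text (Singh and Padgett, 2009), in kb per minute.
MEASURED_RATE_KB_PER_MIN = 3.8


def transcription_time_min(length_kb: float,
                           rate_kb_per_min: float = MEASURED_RATE_KB_PER_MIN) -> float:
    """Minutes needed to transcribe `length_kb` at a constant rate, without pausing."""
    if length_kb < 0:
        raise ValueError("length must be non-negative")
    if rate_kb_per_min <= 0:
        raise ValueError("rate must be positive")
    return length_kb / rate_kb_per_min


def splice_site_window_s(distance_kb: float,
                         rate_kb_per_min: float = MEASURED_RATE_KB_PER_MIN) -> float:
    """Seconds between synthesis of an upstream splice site and a downstream competitor."""
    return 60.0 * transcription_time_min(distance_kb, rate_kb_per_min)


if __name__ == "__main__":
    distance_kb = 1.9                          # hypothetical spacing between the two sites
    slow_rate = MEASURED_RATE_KB_PER_MIN / 2   # hypothetical slow polymerase
    normal = splice_site_window_s(distance_kb)
    slow = splice_site_window_s(distance_kb, slow_rate)
    print(f"window at measured rate: {normal:.0f} s")
    print(f"window at half rate:     {slow:.0f} s")
```

Under these assumptions, halving the elongation rate doubles the time during which only the weak upstream site is available to the splicing machinery, which is the intuition behind the kinetic model.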
The processivity or pausing of the RNA polymerase depends on several factors: the architecture of the gene, DNA sequences, and factors bound to DNA and chromatin. Nucleosomes affect the RNA polymerase rate, but the magnitude of the effect depends on which histone modifications they carry. Acetylated nucleosomes, for example, increase the rate of elongation by increasing the accessibility of DNA, whereas repressive methylation modifications reduce the RNA polymerase processivity. It has been shown that histone modifications alter the splice site choice, for instance that inhibiting histone deacetylases (HDACs), which creates hyperacetylated nucleosomes, results in the exclusion of alternative exons at the 5’ end of the gene (Kornblihtt 2003; Allo 2011). This was attributed to an increase in H3-Ac at the 5’ end of genes. Recently, it was shown that an increase in H4-Ac at alternative splice sites in the gene body increases RNA polymerase II processivity, which in turn favours exclusion of the alternative exons (Hnilicová et al., 2011). The HDAC inhibitor also reduces the association of the SR protein SRp40 with chromatin/RNP, and may also affect splice factor acetylation. It has also been shown that inducing the repressive histone mark H3K9me2 by siRNA slows RNA polymerase II, and that exons in the proximity of the repressive mark are included (Allo et al., 2010; Allo et al., 2011). The kinetic model, when connected to the genome-wide results, predicts that H3K36me3, maybe together with other modifications or factors, decreases the RNA polymerase II processivity and leads to exon inclusion. The higher density of RNA polymerases at exons would favour such a model, but most studies find a lower H3K36me3 level at alternatively used exons than at constitutively used exons. H3K36me3 is found on actively transcribed genes and follows the transcription level, which indicates that other factors must operate in addition to RNA polymerase processivity to determine exon usage.
1.2.2.4. Splicing factor recruitment to chromatin in alternative splicing
The recruitment model suggests that histone modifications constitute a platform for specific proteins to bind (Figure 2B). These proteins are either chromatin-binding proteins that adopt an adapter function to recruit splice factors or splice factors that bind histone modifications directly. The capping machinery, for instance, binds not only to the CTD of the RNA polymerase, but also to the H3K4me3 found close to promoters (Perales and Bentley, 2009). Four examples of chromatin factors that function in splicing and possess the adapter function have been described, and these examples will be discussed below. The first one is a protein that is recruited to chromatin by binding to H3K36me3, MORF-related gene 15 (MRG15). H3K36me3 is the major histone modification observed in transcribed exons, and it is tempting to suggest that its function is to attract specific factors to exons. H3K36me3 is recognised by MRG15, which interacts with the TIP60 HAT complex and HDACs. The yeast orthologue is involved in preventing promiscuous transcriptional initiation inside genes by removing acetyl groups formed around the elongating RNA polymerase II. The MRG15 protein also interacts with the H3K4me3 demethylase RBP2 (retinoblastoma binding protein 2) on certain genes. Luco et al. (2010) connected MRG15 with the PTB protein (polypyrimidine tract binding protein), which regulates the alternative splicing of certain transcripts by binding to a splicing silencer sequence on the RNA. MRG15 interacts with PTB, and this interaction is used to recruit PTB to alternative exons on a subset of genes. However, H3K36me3 is located at all exons, including alternatively spliced exons that are included, so a splice-enhancing protein must exist that couples H3K36me3 to the inclusion of exons. It is also hard to reconcile a model in which H3K36me3 is a marker for alternative exons with the pattern identified in genome-wide surveys, where all exons exhibited the mark.
The chromatin remodelling ATPase CHD1 is recruited to active genes by binding to H3K4me3, and in turn interacts with the ATPase SNF2h and U2 spliceosome components (Sims et al. 2007). The CHD1 protein and H3K4me3 do not change the splicing outcome, but rather the efficiency of the splicing reaction, suggesting that the recruitment of splicing factors to splice sites is enhanced. H3K4me3 appears mainly at the 5’ end of genes and is not abundant in the 3’ exon-rich part (Spies et al. 2009), which suggests that the impact on splicing is restricted to the exons proximal to the promoter. However, H3K4me3 has been identified on isolated exons in the 3’ end of highly transcribed genes (Spies et al. 2009), and the question is whether the effects seen with CHD1 and H3K4me3 are on these exons.
Acetylated histones have been correlated with increased processivity of the RNA polymerase, but genome-wide studies have also shown a peak of H3K9-Ac just prior to exons in active genes (Dharmi et al. 2010). The SAGA complex, with the HAT GCN5, interacts with acetylated histones, and recruits U2 snRNP proteins in yeast (Gunderson and Johnson 2009). Recently it was shown in yeast that deleting HDACs causes mis-regulation of the U2 snRNP along the gene body, suggesting that the dynamic acetylation-deacetylation cycle along the gene regulates spliceosome assembly (Gunderson et al. 2011). Deacetylation in the gene body is thus essential not only to prevent internal initiation of transcription but also to restrict spliceosome assembly to exons.
The fourth example of a chromatin protein that interacts with splice factors is heterochromatin protein 1 (HP1). HP1, which associates with H3K9me3 in repressive chromatin, associates with hnRNPs in Drosophila cells, both in actively transcribed genes and in heterochromatin (Piacentini et al. 2009). It has been suggested that the role of HP1α in transcribed genes is not to affect the splicing reaction, but to take part in the packaging and export of RNPs for a subset of genes. Recently, it was shown that phosphorylated HP1γ interacts with H3K9me3 in alternative exons of the CD44 gene in human cells (Saint-André et al. 2011). Activation of PKC, which phosphorylates HP1γ, results in the inclusion of the variable exons, a process that is reduced when HP1γ is silenced. Simultaneously with the targeting of HP1γ, RNA polymerase II and the splice factor U2AF65 are enriched at the variably spliced region of the gene. It has been proposed that HP1γ acts as an adapter for splicing factors, but the accumulation of RNA polymerase II suggests that it also reduces the processivity of RNA polymerase II. The histone modification pattern over this region contains peaks of H3K9me3, with low levels of H3K36me3. Most genome-wide studies show that H3K9me3 is excluded from exons in transcribed genes, but a genome-wide screen of histone methylations found a peak of H3K9me in the gene body (Barski et al., 2007). The discrepancy between the genome-wide studies and the CD44 gene regarding the presence of H3K9me3 could be explained by the genome-wide surveys missing subsets of genes that exhibit a histone modification pattern different from the most prevalent ones, and by cell type-specific differences.
Chromatin remodelling factors in alternative splicing
Chromatin remodelling complexes have also been implicated in RNA processing. Biochemically purified SWI/SNF complex interacts with several snRNP proteins and splice factors, such as Prp4 (Delaire et al. 2001) and Prp8 (Patrik Asp and Ann-Kristin Östlund Farrants, unpublished data). The mammalian SWI/SNF ATPase proteins, BRM and BRG1, also have a functional role in the splicing of specific genes. The SWI/SNF complexes are mainly involved in transcriptional initiation, remodelling the structure of nucleosomes at the promoter (for review see Hargreaves and Crabtree, 2011). In addition, they have an effect on the splicing patterns of the CD44 gene and the telomerase gene (Batsche et al., 2006; Ito et al. 2008). The mechanism behind the splicing effect has been attributed to an effect on the elongation rate of RNA polymerase II at the region containing the variable exons. The effect of BRM expression is to reduce the processivity of the RNA polymerase by inducing Ser-5 phosphorylation of the CTD, while at the same time BRM interacts with the splicing regulator protein SAM68. SAM68 must be phosphorylated by PKC to bind to the nascent RNA and to BRM, which couples BRM and its effect on splicing to activation by external signals. The ATPase activity, which is required for chromatin remodelling, is not essential for the splicing effect. A number of genes have been identified, all of which rely on BRM for inclusion of alternative exons. However, on polytene chromosomes from the dipteran Chironomus tentans, BRM has been observed not only on chromatin but also on the growing mRNP (Figure 3A) (Tyagi et al., 2009), suggesting that the SWI/SNF complexes can influence splicing on several levels.
The BRM also fractionates with the RNA fraction, as do the human BRG1 and BRM. Similar to BRM in human cells, siRNA silencing of the Drosophila BRM affects the splice pattern of a subset of genes. Closer examination of these genes showed that the splicing outcome could not be reconciled with a slow transcription elongation rate for all of the genes affected. These variations led to the model presented in Figure 4A and 4B, which proposes that SWI/SNF complexes operate on two levels, one on chromatin by affecting the transcription elongation rate and one on RNA, most likely by associating with splice factors. A similar mechanism has been proposed for other chromatin proteins, such as the HATs PD20 and PCAF (Sjölinder et al., 2005; Obrdlik et al., 2008).
It is clear that RNA processing is coordinated with transcription and the chromatin environment. Firstly, several processing factors associate with RNA polymerase II, or with chromatin proteins, suggesting that the processes are interlinked. Secondly, the transcriptional elongation rate affects splicing outcomes; several lines of evidence show that a pausing RNA polymerase II results in inclusion of exons, in particular of alternative exons with weak splice sites. One open question is how the RNA polymerase slows down at exons. One possible way of slowing down the elongating RNA polymerase is to change the phosphorylation state of its CTD by increasing its Ser-5 phosphorylation. The mechanism behind this is not known, however. Is a Ser-5 kinase recruited at internal exons or is a phosphatase inactivated? The Ser-5 kinase TFIIH is found at the promoter, but it has been suggested that CDK9 in P-TEFb functions as a Ser-5 kinase in the gene body (Munoz et al., 2009). The yeast orthologue Ctk1 has both Ser-5 and Ser-2 kinase activity (Jones et al., 2004). It has also been suggested that a change in the phosphorylation/dephosphorylation cycle contributes to the altered elongation rate (Munoz et al., 2009). Or is the phosphorylation secondary to sequences in the DNA, or to backtracking of the RNA polymerase? Moreover, the effect of chromatin, and of the histone modifications specific to exons and introns, on the elongation rate needs to be elucidated.
The elongation rate of RNA polymerase II does not explain all splicing outcomes, and other mechanisms, such as recruitment of splicing enhancers and silencers to the nascent RNA, also contribute. In addition, chromatin remodelling proteins can be recruited to the nascent RNA and influence splice site choice. Recent studies have shown that the chromatin landscape is involved in splicing; exons are denser in nucleosomes than introns, and the histones at exons carry specific histone modifications. In particular, H3K36me3 is abundant in exons, and its level depends on transcription. This mark is not affected by splicing events, suggesting a preset state that marks exons in a particular manner depending on exon usage, resulting in different patterns in different cell types. This raises several important questions. How are exon-intron histone marks set: by recruitment of H3K36me3-specific methyltransferases or of demethylases? How are these modification enzymes recruited? The finding that expressed exons carry H3K36me3, whereas exons in non-expressed genes carry H3K27me3, suggests that the regulation of histone marks is even more complex. The studies presented show that chromatin and histone modifications constitute a further level of regulation of gene expression.
2.1. Chromatin and ribosomal processing
Ribosomal biogenesis employs a dedicated RNA polymerase machinery, with its own processing and assembly factors. A large gene, the 47/45S rRNA gene, is transcribed by RNA polymerase I in the nucleolus, and from this transcript three of the four ribosomal RNAs in ribosomes, 18S, 5.8S and 28S, are produced. The fourth rRNA (5S) is transcribed from a separate, small gene by RNA polymerase III, in the nucleolus in yeast but in the nucleoplasm in metazoans. The nucleolus is also the location of rRNA processing and assembly. The ribosomal genes are present in tandem repeats, around 200 in yeast and 400 in human cells, spread out on five chromosomes (for review see Grummt and McStay, 2008). These gene loci constitute the nucleolar organiser regions (NORs), around which the nucleoli are formed after exit from mitosis when the genes are actively transcribed. RNA polymerase I transcription requires specific auxiliary factors, such as UBF (upstream binding factor) and SL1 (selectivity factor 1), a TBP-containing complex. UBF binds to the rDNA promoter and recruits SL1, which in turn recruits RNA polymerase I. The assembly of the transcription initiation complex is regulated throughout the cell cycle by signalling pathways that control the phosphorylation and acetylation of UBF and SL1. UBF is a high mobility group (HMG) protein, which bends DNA in a similar manner to a nucleosome. UBF is found not only at the promoter but also along actively transcribed genes. The major fraction of the genes is not transcribed, however, and has no UBF bound, or only very low levels. Instead, these genes are organised into heterochromatin.
2.1.1. RNA processing in RNA polymerase I transcription - rRNA processing
The processing of the rRNAs involves cleavage of the transcripts and covalent modifications, such as pseudouridylation, 2’-O ribose methylation and base methylations (Decatur and Fournier, 2003; Henras et al., 2008). The processing of the rRNAs is initiated co-transcriptionally with the assembly of a “terminal knob”, which comprises the growing pre-rRNA with snoRNPs and modifying proteins. The pre-RNA is subsequently cleaved into the separate rRNAs by a number of exonucleases and endonucleases, which are helped by snoRNAs, RNA helicases, GTPases and kinases (Strunk and Karbstein, 2010; Kressler et al., 2010). The 47/45S is assembled with the snoRNPs U3, U8 and U13 (U14, snR30) and processing proteins, such as RNA helicases, co-transcriptionally into the 90S pre-ribosomal particle (also called the “small subunit processome”, SSU). The essential snoRNAs most likely bind to the RNA and give the pre-RNA the right structure for further processing: the snoU3 binds to the ETS and ITS1 surrounding the 18S rRNA module by complementary base pairing, and helicases are then required to dissociate the snoRNA during the maturation process (Bohnsack et al., 2008). SnoU3 binds early to the pre-RNA, together with several proteins important for the cleavage and modification of the 18S rRNA, and this part will later assemble into the 40S ribosome. The distal part of the rRNA, which after cleavage of the 90S SSU will form the LSU, the large subunit processome, assembles later when the 5.8S and 28S appear.
Some proteins that assemble into the SSU are directly bound to the snoU3 RNP; these are the UTPs (U three proteins). Seven UTPs in yeast form a separate subcomplex, t-UTP, which is associated with chromatin, affects transcription and is necessary for processing (Wery et al. 2009). A homologous complex is present in human cells, with six identified proteins (Prieto and McStay, 2007). The function of the t-UTPs in transcription and on chromatin has not been fully worked out. It has been proposed that the t-UTP in yeast is involved in pre-RNA stabilisation rather than in RNA polymerase I transcription (Wery et al. 2009).
2.1.2. The link between transcription and pre-RNA processing
The pre-RNA processome contains proteins that affect both transcription and processing. This has led to the proposal that transcription and processing are coordinated and influence one another. The t-UTP complex would then be the first level of coordination of these events during transcription, since it is recruited early in the process. Schneider et al. (2007) showed that transcriptional defects can affect processing. A mutant RNA polymerase I that exhibited slow initiation and elongation was produced in yeast, and these cells had severe defects in pre-RNA processing and assembly. It was proposed that a slower elongation rate leads to improper recruitment of processing proteins and snoRNA. It has long been unclear how t-UTPs affect transcription, but recently a t-UTP, 1A6/DRIM (t-UTP20), was found to associate with a histone acetyltransferase-like protein, hALP, in human cells (Peng et al., 2010; Kong et al., 2011). The hALP is involved in acetylating UBF, and by recruiting hALP through 1A6/DRIM, the processome promotes transcriptional initiation and, possibly, elongation.
2.1.3. Chromatin and RNA processing
The chromatin structure at the RNA polymerase I genes is complex. The silent copies are tightly packaged with nucleosomes, with features characteristic of heterochromatin, such as the repressive marks H3K9me3 and H4K20me3 and methylated CpGs at the promoter. The repressive state is set by a chromatin remodelling complex, the NoRC (Santoro et al., 2002; Zhao et al. 2009). The active copies are heavily transcribed, and whether canonical nucleosomes are present is debated. Studies in yeast show that no histones or very small nucleosomes are present on genes that have RNA polymerase I (Merz et al., 2008; Jones et al., 2007). Similarly, it has been proposed that active rDNA in mammalian cells is devoid of nucleosomes. UBF, which binds to the DNA as a dimer, resembles a nucleosome and may be the major chromatin protein in active rDNA (Sanij and Hannan, 2009). However, histone chaperones in the FACT complex, which function as elongation factors in RNA polymerase II transcription, also influence RNA polymerase I transcription (Birch et al., 2009). Whether the chromatin structure constitutes a barrier for RNA polymerase I is unclear. A number of chromatin remodelling proteins have been associated with active genes, such as CSB (Cockayne syndrome protein B) (Bradsher et al., 2002; Yuan et al., 2007; Lebedev et al., 2008) and B-WICH (Cavellán et al. 2006; Percipalle et al., 2006; Vintermist et al., 2011). Chromatin remodelling factors have also been isolated with p32 (splicing factor 2–associated protein p32), which has been identified as a regulator of the transformation of the 90S particle into a 40S and a 60S pre-RNA particle. The p32 is also involved in RNA polymerase II splicing, so it is unclear whether the interaction with 13 chromatin remodelling proteins stems from p32 in RNA polymerase I or RNA polymerase II transcription. CSB and B-WICH, which are involved in active transcription of ribosomal genes, affect histone modifications.
CSB recruits the histone methyltransferase G9a, which results in H3K9me2 histones at actively transcribed genes (Yuan et al., 2007). The B-WICH, composed of WSTF (Williams syndrome transcription factor), the ATPase SNF2h and nuclear myosin 1 (NM1), remodels chromatin at the promoter, and allows HATs to associate. The HATs subsequently acetylate histone H3, in particular at H3K9. Both H3K9me2 and H3-Ac lead to increased transcription. Interestingly, the B-WICH complex also contains RNA processing proteins, such as RNA helicase Guα and the Myb-binding protein, and the 45S rRNA (Cavellán et al. 2006). Further analysis has shown that the snoU3 also associates with the complex (Figure 5A) and that the B-WICH subunits can be linked to the 45S rRNA on the gene (Figure 5B). NM1 binds RNA and is important for the export of the 60S subunit (Obrdlik et al., 2010), indicating that it is loaded onto the RNA via a chromatin-remodelling complex at the promoter and along the gene. It then follows the 60S pre-ribosome, when the 90S pre-18S-processome is cleaved off, through assembly and export.
Even though the B-WICH is not regarded as a t-UTP, some aspects resemble such a complex. A model has been proposed in which the complex affects transcription by chromatin remodelling, whereas other components act on processing (Figure 6).
The different steps in ribosomal biogenesis, transcription, processing and export, are tightly linked. The assembly of the processing factors, such as RNA helicases and snoRNPs, which recognise structures in the rRNA and other proteins, is important. An increasing number of “processing” factors are recruited at the promoter, and these affect both processing and transcription. Even chromatin remodelling factors act as processing factors, functioning both to remodel chromatin at the promoter and to provide processing and export factors to the processomes. The B-WICH is one such example: its main function is in chromatin remodelling, but it associates with several RNAs and processing proteins. Another is the hALP t-UTP. The function of the t-UTP is still unknown, so more factors with functions both in chromatin remodelling and RNA processing may exist.
Nuclear processes are tightly coupled and coordinated. Several lines of evidence now show that most RNA processing, of both RNA polymerase I and RNA polymerase II transcripts, is performed co-transcriptionally (Staley and Woolford, 2009). In both cases, an RNA polymerase produces an RNP particle in which the RNA is matured. The machineries are different, but many principles of action resemble one another. Transcription and RNA processing influence one another and are conducted in the vicinity of one another, and some proteins bind to the polymerase and the growing RNA at the same time, influencing both transcription and processing. Many of these interactions, such as interactions with proteins recruited to RNA polymerase II, are dynamic and need to change for the next step to proceed. The organisation of the chromatin structure at the transcribed genes has now emerged as a further component in the network, both in RNA polymerase I and RNA polymerase II transcription. In RNA polymerase I transcription, it is likely that UBF, together with histone proteins, takes on the role that nucleosomes have in RNA polymerase II transcription. Recent results demonstrate that histone modifications regulate both the elongation rate and the recruitment of processing proteins to the RNA. Furthermore, the RNA can also recruit chromatin proteins, making RNA processing a complex network of protein and RNA interactions regulated by phosphorylation, acetylation, methylation and small GTPases.
This work is supported by the Swedish Cancer foundation, Carl Trygger Foundation and Magnus Bergvall Foundation.