The most important replication proteins, their structural and functional assembly with each other and with DNA, and the whole mechanism that ensures complete and accurate duplication of DNA are highly conserved in eukaryotic evolution. Prior to initiation of DNA replication, a multiprotein assembly called the pre-replication complex (pre-RC) forms at replication origins. Formation of the pre-RC is initiated by the eukaryotic initiator, the Origin Recognition Complex (ORC). ORC recognizes origin DNA and recruits Cdc6, Cdt1, and the MCM complex to origins (review Blow & Dutta, 2005). The selection of particular locations within the eukaryotic chromosome for initiation is poorly understood. The whole process of origin recognition includes a complex interplay between factors affecting ORC assembly on replication origins and structural constraints of bound DNA. Given that replication origins of higher eukaryotes do not have common consensus sequences, specificity of protein-DNA interactions does not play a central role in origin recognition. However, origin transfer studies show that origins have some genetic elements organized in different modules which are essential for origin activity and are functionally interchangeable between origins (review Aladjem, 2007). The still obscure step of origin selection is followed by origin remodeling, which is promoted by the pre-RC. Origin remodeling opens DNA for replication proteins and prepares the origin to be activated and to accommodate the double replisome. The complete sequence of events from origin recognition to activation involves multiple protein-protein and protein-DNA interactions. Identification of all these components and their interactions will ultimately lead to an understanding of the complex mechanism which governs origin selection and ensures accurate initiation in eukaryotic cells.
2. Eukaryotic replication origins
Replication origins are DNA sites at which replication initiates during the S phase. Using different approaches, these sites were identified in simple eukaryotes and in metazoa. In yeast, specific origin sequences were identified by their ability to confer autonomous replication to small circular plasmids. The same assay employed in multicellular systems revealed that virtually any DNA fragment not smaller than 10 kb had origin activity (Krysan, 1993). Real replication origins of metazoa were identified by alternative procedures, but they are not numerous and not yet well understood, due to extreme genome complexity and cell-to-cell heterogeneity in origin selection (Hamlin et al., 2008). Recent development of genome-scale methods for identifying hundreds or thousands of new origins may compensate for this lack of data and thus provide information regarding the recognition features of complex origins (Eaton et al., 2010; MacAlpine et al., 2010).
The best understood replication origins belong to budding yeast. They are composed of AT-rich sequences of about 150 bp, termed ARS (autonomously replicating sequence) elements. ARS elements were discovered because of their capacity to confer high-frequency transformation of plasmids in S. cerevisiae (ARS assay). As shown by systematic mutation analysis, full replicator activity requires multiple DNA elements. These are origin recognition elements A and B, elements that exclude nucleosomes, and DNA unwinding elements (DUE) (review Aladjem et al., 2006). DNA elements A and B1 bind the origin recognition complex (ORC). The A element contains an essential 11-bp ARS consensus sequence (ACS), (A/T)TTTAYRTTT(A/T), which tolerates at most 2 mismatches, and adjacent nonconserved sequences. Point mutations in the ACS reduce or inactivate ARS function, origin activity and ORC binding in vitro (Bell, 2002). The B domain consists of several elements (B1-B4) positioned 3’ to the T-rich strand of the A element. Individual B elements are not essential for replicator function and their arrangement within the domain varies. However, these elements contribute significantly to origin efficiency (Huang and Kowalski, 1996). Asymmetric AT-rich sequences, with clusters of A’s on one strand and T’s on the other, are present in many ARS elements and enriched in nucleosome-free DNA (Yuan et al., 2005). Cooperatively, these and other sequences facilitate opening of replicator chromatin, which is functionally important, as illustrated by the observation that forced positioning of a nucleosome over the A element inactivates origin function (Simpson, 1990).
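The tolerance of the ACS for a limited number of mismatches can be illustrated with a short scan of a DNA sequence. The following is only a minimal sketch: the function names `mismatches` and `find_acs` are hypothetical, only the given strand is scanned, and a match to the consensus by itself does not imply ORC binding or origin activity.

```python
# IUPAC codes used in the consensus: W = A or T, Y = C or T, R = A or G.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT", "R": "AG",
}

# (A/T)TTTAYRTTT(A/T) written with IUPAC codes, 11 bp.
ACS = "WTTTAYRTTTW"


def mismatches(window, consensus=ACS):
    """Count positions where the window disagrees with the consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def find_acs(seq, max_mismatches=2):
    """Return (position, window, mismatch count) for each 11-bp window
    that differs from the consensus at no more than max_mismatches positions."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(ACS) + 1):
        window = seq[i:i + len(ACS)]
        n = mismatches(window)
        if n <= max_mismatches:
            hits.append((i, window, n))
    return hits
```

A call such as `find_acs(sequence)` lists candidate ACS positions on one strand; scanning the reverse complement as well would be needed to cover both orientations.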
The fission yeast Schizosaccharomyces pombe, a very distant relative of budding yeast, has considerably larger (500-1000 bp) and less well understood replicators. Fission yeast replicators also consist of AT-rich sequences, initiate autonomous replication on plasmids, and bind a six-subunit ORC complex. However, fission yeast can initiate replication from any sufficiently extensive stretch of AT-rich DNA without any apparent sequence preference (Cotobal et al., 2010; Dai et al., 2005). Several S. pombe replicators contain two or three required regions consisting of asymmetric AT-rich sequences. In ars2004 there are three such regions that can be replaced by 40-bp fragments of poly(dA/dT) (Okuno et al., 1997), which shows that the distribution of AT elements, rather than their specific sequences, may contribute to origin function.
Different origin mapping procedures have revealed that replication origins of metazoa belong to two categories. Some of them contain unique high-frequency initiation sites, while the others have extensive zones with numerous initiation sites and a diffuse initiation pattern (Aladjem, 2007). Within such zones each site is activated only in a fraction of cell cycles, which means that, once replication has initiated at one site, replication forks simply pass through the others. In the beta-globin genes (HBB), the replication origin contains two adjacent initiation sites which are activated in different cell cycles (Wang et al., 2004). The combined data therefore suggest that replication origins in metazoa generally contain a few or more nonrandom initiation sites which can be activated in different cell cycles. Some of these sites are strong and initiate DNA replication at high frequency, while the others are weak and initiate DNA replication at low frequency. Initiation sites do not, however, seem to contain any common DNA motif, which is consistent with the apparent lack of sequence specificity of metazoan ORCs in vitro. In contrast, deletion and origin transfer studies demonstrate some role for DNA sequence composition in positioning origins. Similarly, an analysis that compares the positions of origins with the positions of evolutionarily conserved regions (CRs) in mammalian genomes suggests that replication origins contain sequence motifs under selective constraints (Cadoret, 2008).
Analysis of the HBB ori provided the first evidence that origins in metazoa contain specific sequences: origin activity was abolished by deletion of the 8-kb origin, which forced the locus to replicate from an unidentified upstream origin (Kitsberg et al., 1993). Deletion experiments were followed by transfer of putative replicators to ectopic chromosomal regions and testing for replication initiation at the new locations. Following this procedure, specific origin sequences were shown to be both necessary and sufficient to direct initiation of replication when transferred to ectopic locations. Replicators that exhibit ectopic origin activity include those near the Drosophila chorion genes (Lu et al., 2001), DHFR ori-β (Altman and Fanning, 2001), the human HBB (Aladjem et al., 2002), lamin B2 (Paixao et al., 2004), c-myc (Liu et al., 2003) and possibly HPRT (Cohen et al., 2004), which exhibits origin activity when replacing its murine orthologue. Combined, origin transfer studies show that at least some replication origins have a modular structure, with each module being essential for origin activity and functionally interchangeable between origins (Paixao et al., 2004). The following structural features of origin modules could be important for origin activation:
Sequences rich in A+T are abundant in eukaryotic origins and could have roles in facilitating DNA unwinding. Asymmetric AT stretches are present in the hamster DHFR origin and in the human LMNB2, HBB and DHFR origins. Such sequences are recognized by proteins such as SpOrc4, which has the relevant AT hooks (Kong & DePamphilis, 2002). Interestingly, even its human homologue displays a similar preference for asymmetric AT stretches (Stefanovic et al., 2003). Other AT-rich elements that could be important for initiation of DNA replication are matrix attachment sites. Matrix attachment regions are required for maintenance of plasmid replication in human cells and could be part of metazoan replication origins. In addition, different AT elements could form unorthodox structures similar to the one detected in the asymmetric AT-rich stretch of the LMNB2 origin (Kusic et al., 2005).
In some promoters CpG islands are correlated with open chromatin structure and it is believed that they could have similar roles in replication origins. Human nascent DNA is 10-fold enriched in CpG islands (Delgado et al., 1998), whereas removal of a CpG island significantly decreases the efficiency of ectopic LMNB2 origin (Paixao et al., 2004).
Unusual DNA structures could form from origin elements that are not AT-rich. Palindromes were found in hamster DHFR and human HBB and LMNB2 origins. Different unorthodox structures were detected in the hamster DHFR origin (Bianchi et al., 1990), including one bent element important for DHFR ectopic activity, which indicated a correlation between origin topology and its function (Altman & Fanning, 2004).
In summary, replication origins in metazoa are determined by a complex, poorly understood set of structural and topological features of DNA in which no single-sequence module has a key role.
3. Origin recognition complex
The Origin Recognition Complex (ORC) was first discovered and purified from budding yeast, based on its ability to specifically bind to replication origins (Bell & Stillman, 1992). Shortly thereafter, the corresponding genes were cloned and orthologues of Orc1-Orc5 were identified in organisms such as Drosophila melanogaster (Gossen et al., 1995), Arabidopsis thaliana (Diaz-Trivino et al., 2005) and Homo sapiens (Dhar & Dutta, 2000), which strongly suggested that these genes could exist in all eukaryotes. ORC6 genes are relatively well conserved between metazoa and fission yeast, but there is insufficient identity to conclude that they are homologous to budding yeast Orc6 (Dhar & Dutta, 2000).
ORC-like proteins are not confined to eukaryotes. Studies of archaeal Orc1/Cdc6 proteins, as well as of bacterial DnaA, have provided important structural information about ORC-DNA interactions. DnaA, like ORC, acts as an initiator of DNA replication, and whereas DnaA and the archaeal Orc1/Cdc6 proteins share little sequence identity, structural studies have shown that they do have a high degree of similarity in some of their functional domains (Mott & Berger, 2007). Moreover, a study of Drosophila ORC structure suggested that DnaA and ORC wrap DNA in a similar manner (Clarey et al., 2008).
ORC function is tightly controlled by ATP binding and hydrolysis. Three of the ORC subunits (Orc1, Orc4 and Orc5) are members of the AAA+ family of ATPases. Recent studies suggest that Orc2 and Orc3 represent more distant relatives of the AAA+ proteins that lack the key conserved elements of the ATP-binding site (Speck et al., 2005). In S. cerevisiae and Drosophila the ATP-binding activity of Orc1 is essential and regulates DNA binding. Although the S. cerevisiae Orc5 ATP-binding motif is not essential, mutations in it cause apparent defects in complex stability (Takahashi et al., 2004). In contrast, mutations of the Orc1, Orc4 or Orc5 ATP-binding motifs inhibit the ability of human ORC to activate replication in ORC-depleted Xenopus egg extracts (Giordano-Coltart et al., 2005). Direct DNA binding studies of human ORC show that addition of ATP stimulates ORC-DNA interaction (Vashee et al., 2003). DNA, in turn, has a significant effect on the ATP binding and hydrolysis functions of ORC. In S. cerevisiae, double-stranded origin DNA stabilizes ATP binding and inhibits ATP hydrolysis, whereas single-stranded DNA of any sequence stimulates ATPase activity (Lee et al., 2000). Similar findings have been made for Drosophila ORC (Chesnokov et al., 2001).
Although formally a member of ORC, Orc6 does not share similar structural features or a common evolutionary origin with Orc1-5 (Duncker et al., 2009). Nevertheless, its association with the other five subunits is required to promote the initiation of DNA replication and it is considered an ORC protein.
It appears that all six ORC subunits remain associated with chromatin throughout the cell cycle in S. cerevisiae, but not in metazoan cells. In human cells ORC was detected on replication origins in G1 and S phases, but not in M phase. Orc1 protein leaves the replication complex when DNA synthesis starts (DePamphilis, 2005). When the cells move into S phase the pre-replicative complex is restructured into the smaller post-replicative form. The transition from the pre- to the post-replicative complex is accompanied by displacement of the ORC subunit (Abdurashidova et al., 2003). Immunofluorescent detection of Orc2-green fluorescent protein (GFP) in Drosophila neuroblasts and live-cell imaging in embryos show no Orc2 on chromosomes in the period from prophase to anaphase (Baldinger & Gossen, 2009). Fluorescence loss in photobleaching analysis in hamster cells suggests a less static interaction of ORC subunits with chromatin and shows a highly dynamic interaction of both Orc1 and Orc4 with chromatin throughout the cell cycle (McNairn et al., 2005).
Assembly of the recombinant human ORC subunits was investigated in vitro (Giordano-Coltart et al., 2005; Siddiqui & Stillman, 2007) and it was demonstrated that human ORC follows an ordered pathway of assembly. First, subunits 2 and 3 bind to each other and then recruit Orc5. The Orc2/3/5 complex recruits Orc4 and then Orc1. Mutations in the ATP-binding sites of Orc4 and Orc5 impair complex assembly, whereas Orc1 does not require ATP binding. It is possible that in living cells additional regulatory mechanisms operate at the level of ORC complex assembly and disassembly, and not only at the level of protein-DNA interaction or pre-RC activation.
4. Mechanisms of replication initiation site selection
Selection of DNA replication origins may be regulated by various factors and may be achieved at different levels. Replication initiates at many sites along linear chromosomes, which ensures complete genome duplication within a single S phase, but the number of activated origins does not match the number of pre-RCs previously assembled on DNA (review Gilbert, 2010). From the large pool of assembled pre-RCs only a subset is chosen for subsequent initiation, while the rest remain dormant. The mechanisms underlying this two-step process, pre-RC assembly followed by selection of pre-RCs for initiation, remain unknown.
In budding yeast, ORC binds the corresponding ARS element in a sequence specific manner. One component of the recognition site is the 11-bp ACS. As shown by analysis of modified DNA substrates, DNA-bound ORC primarily interacts with the A-rich strand of the ACS. It is not yet clear which subunit of ORC determines DNA binding, but protein-DNA cross-linking studies show four out of six ORC subunits (Orc1, Orc2, Orc4 and Orc5) in close association with origin DNA (Lee & Bell, 1997).
In S. pombe replication origins are recognized by ORC via a species-specific AT-hook in the ORC4 subunit (Chuang & Kelly, 1999). SpORC binds to preferred DNA sites containing multiple runs of three A’s or T’s in vitro. In fission yeast structural elements are redundant and could compensate for deletion of one of the many ORC-binding sites.
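The kind of AT tract described here can be located with a simple scan. The sketch below is illustrative only: the helper names `at_fraction` and `at_runs` are hypothetical, and it assumes that a "run" means three or more consecutive A's or three or more consecutive T's on the given strand, which is one possible reading of the binding-site description.

```python
import re


def at_fraction(seq):
    """Fraction of bases that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def at_runs(seq, min_len=3):
    """Return (start, run) for each maximal run of A's or of T's
    that is at least min_len bases long."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]
```

Counting such runs together with the overall AT fraction gives a rough, sequence-only description of a candidate region; it says nothing about whether SpORC actually binds there.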
ARS function appears to be governed primarily by AT content and length (Dai et al., 2005). Whether the replicator length is needed to include the required DNA elements or to provide spacing between them is not clear. DNA appears to wrap around ORC (Gaczynska et al., 2004), suggesting a possible spacing length requirement between ORC-binding sites. Intervening deletion mutations could affect replicator function by either shortening the spacing length between elements or by removing elements.
S. pombe Orc4 can bind to origin DNA even in the absence of other ORC subunits. The AT-hook motif is known to bind to the minor groove of AT-rich DNA, where it can recognize or induce structural changes. The N-terminal domain of S. pombe Orc4 may function to tether the ORC complex to origins of DNA replication, and this interaction is independent of ATP. However, the tethered complex may also make ATP-dependent contacts with additional sites in the origin to nucleate formation of the initiation complex. As demonstrated by recent studies, SpORC binds DNA in at least two steps (Houchens et al., 2008). The first step, possibly mediated by electrostatic interactions between the AT-hook motifs of the SpOrc4 subunit and AT tracts in the replication origin, results in formation of a salt-sensitive SpORC-DNA complex, which is then slowly converted to a salt-stable form.
In metazoan model systems ORC-DNA interactions were first explored in Drosophila. As demonstrated by immunofluorescence studies, the Drosophila DNA element ACE3 (Amplification Control Element 3) alone directed ORC to the region of chorion amplification (Austin et al., 1999). Moreover, chromatin immunoprecipitation studies indicated that ACE3 can target Drosophila ORC not only to sites within the ACE3 element itself but also to sites within adjacent DNA sequences. In contrast, ORC purified from Drosophila embryos or reconstituted from recombinant proteins bound origin DNA in an ATP-dependent manner but with little sequence specificity (Remus et al., 2004), insufficient to target ORC to origins of replication.
Similar to Drosophila ORC, reconstituted, highly purified human ORC exhibits ATP-stimulated DNA binding and a preference for either natural or synthetic AT-rich sequences (Vashee et al., 2003). However, it is generally assumed that origin specification in metazoa involves mechanisms other than simple recognition of DNA sequence by ORC. Thus DmORC exhibits 30-fold higher affinity for negatively supercoiled DNA as compared to relaxed or linear DNA (Remus et al., 2004). Binding of DmORC is accompanied by changes in DNA topology, suggesting that ORC-DNA complexes contain underwound DNA. Purified human ORC induces similar topological changes in origin DNA (Houchens et al., 2008), indicating conservation of this property of ORC during eukaryotic evolution.
Interestingly, human ORC and human Orc4 exhibit similar DNA binding properties, such as preference for negatively supercoiled DNA (Kusic or Tomic unpublished), preference for AT-rich DNA and the ability to distinguish between different AT-rich DNA structures. HsOrc4 protein also exhibits a preference for triple-stranded DNA (Kusic et al., 2010) and the ability to stimulate formation of noncanonical oligonucleotide structures (Stefanovic et al., 2008). Such HsOrc4 properties could play a part in origin selection by directing ORC to DNA sequences able to adopt unorthodox structures.
Pre-RC factors other than ORC may also contribute to origin recognition. In budding yeast, Cdc6 ATPase activity contributes to stable and specific binding of the ORC-Cdc6 complex to the origin (Speck & Stillman, 2007), whereas in fission yeast Cdt1 and Cdc6 proteins facilitate SpORC-DNA interactions (Houchens et al., 2008).
In addition to origin structure and pre-RC components, chromatin structure could significantly affect replication origin selection. As revealed by ChIP-seq for ORC in budding yeast, many consensus sequences are not bound by ORC (Eaton et al., 2010). A genome-wide analysis of nucleosome architecture of replication origins in budding yeast, aligned by their ORC-binding sites, suggested a model in which the underlying DNA sequence at replication origins excludes nucleosomes. This creates a permissive environment for ORC binding, after which ORC positions nucleosomes in a regular array on both sides (Berbenetz et al., 2010). In addition, only a subset of nucleosome-free regions (NFRs) can promote binding of ORC: those with specific flanking sequence features that allow ORC to position nucleosomes with sufficient space for MCM protein loading. Accordingly, by genome-scale mapping of D. melanogaster ORC localization, ORC was found in previously mapped NFRs. The sites of rapid nucleosome turnover were found to align with ORC (MacAlpine et al., 2010). Consistent with in vitro binding data, specific sequence motifs were not identified, but an in silico learning approach revealed a complex code of short sequences that could simultaneously predict ORC binding and NFRs.
Transcription factors may also play a role in localization of ORC. Thus, at the chorion loci of Drosophila follicle cells, transcription factors containing the Myb protein facilitate DNA replication at the ACE3 and ori-β replication origins (Beal et al., 2002). Specific RNA recruits ORC to the Epstein-Barr virus (EBV) replicator, oriP, by linking oriP-bound nuclear antigen-1 (EBNA-1) and ORC (Norseen et al., 2008).
5. Assembly of Pre-RC complexes
Originally identified by in vivo DNase I protection assay (Diffley et al., 1994), the multiprotein assembly formed at all potential origins of replication was termed the pre-RC. Pre-RC formation can only occur during late M and G1 phases of the cell cycle, and only preexisting pre-RCs can be activated in the subsequent S phase. Pre-RC formation requires at least four different entities: the origin recognition complex (ORC), Cdc6, Cdt1, and the MCM complex. Since the MCM complex acts as the eukaryotic replicative helicase, pre-RC formation is equivalent to a helicase loading event (Chong et al., 2000).
In pre-RCs formed in vivo or in vitro, multiple MCM complexes are assembled at each origin. Depending on the organism, the MCM:origin DNA ratio varies between 10:1 and 40:1 (Takahashi et al., 2005). Since each replication fork requires at least one MCM complex, the role of the additional MCMs remains unclear. The MCM complex has no affinity for origin DNA and its association with the origin requires the action of ORC, Cdc6 and Cdt1. Once MCM is assembled, Cdc6 and Cdt1 proteins are no longer required and are released from chromatin (Hua & Newport, 1998). After loading, MCM-DNA association is independent of other components (Bowers et al., 2004), possibly due to the ring-shaped structure of MCM, which could be closed around origin DNA. Since loaded MCM complexes direct initiation in the apparent absence of other pre-RC components, ORC, Cdc6 and Cdt1 could be considered MCM loading factors. It is important to note that ten of the fourteen protein components of the pre-RC belong to the AAA+ family of ATPases (Mcm2-7, Orc1, Orc4, Orc5, and Cdc6). Consequently, pre-RC formation requires ATP and is inhibited by its nonhydrolyzable analogs (Harvey & Newport, 2003).
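The count of ten AAA+ proteins among fourteen pre-RC components can be checked with a short tally. This is merely a bookkeeping sketch: the dictionary name `PRE_RC_COMPONENTS` is hypothetical, and the classification simply follows the list given in the text, so Orc2 and Orc3 are counted as non-AAA+ here even though they have been described as more distant relatives of that family.

```python
# True marks a component listed in the text as an AAA+ family ATPase.
PRE_RC_COMPONENTS = {
    "Orc1": True, "Orc2": False, "Orc3": False,
    "Orc4": True, "Orc5": True, "Orc6": False,
    "Cdc6": True, "Cdt1": False,
    "Mcm2": True, "Mcm3": True, "Mcm4": True,
    "Mcm5": True, "Mcm6": True, "Mcm7": True,
}

aaa_plus = [name for name, is_aaa in PRE_RC_COMPONENTS.items() if is_aaa]

total = len(PRE_RC_COMPONENTS)   # 6 ORC subunits + Cdc6 + Cdt1 + 6 MCM subunits
n_aaa = len(aaa_plus)            # 3 ORC subunits + Cdc6 + 6 MCM subunits
print(f"{n_aaa} of {total} pre-RC components are AAA+ ATPases")
```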
A similarity of ORC subunits and Cdc6 to sliding clamp loaders and the ring shaped structure of MCM have led to the proposal of a model for MCM origin loading (Speck & Majka, 2009). Loading initiates by association of ORC with origin DNA in the ATP-bound state. ORC-ATP recruits Cdc6, stimulates its association with ATP and subsequent recruitment of Cdt1 and the MCM proteins. According to the model, this leads to the opening of the MCM ring thus exposing a previously hidden DNA-binding site. MCM binding to DNA triggers Cdc6 ATP hydrolysis which leads to two events: the release of Cdc6 and Cdt1, and closing of the MCM ring around the DNA.
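The order of events in this proposed loading model can be written out as a toy sequence of state changes. The sketch below is purely schematic: the `Origin` class and `load_mcm` function are hypothetical names introduced for illustration, the steps are hard-coded from the description above, and no kinetics, stoichiometry or regulation is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """Hypothetical toy model of a single replication origin."""
    bound: set = field(default_factory=set)   # factors currently at the origin
    mcm_ring: str = "absent"                  # "absent", "open" or "closed"
    log: list = field(default_factory=list)   # ordered record of events


def load_mcm(origin):
    """Apply the steps of the proposed MCM loading model in order."""
    # Step 1: ATP-bound ORC associates with origin DNA.
    origin.bound.add("ORC")
    origin.log.append("ORC-ATP binds origin DNA")

    # Step 2: ORC-ATP recruits Cdc6 and stimulates its association with ATP.
    origin.bound.add("Cdc6")
    origin.log.append("ORC-ATP recruits Cdc6, which binds ATP")

    # Step 3: Cdt1 and MCM are recruited; the MCM ring opens.
    origin.bound.update({"Cdt1", "MCM"})
    origin.mcm_ring = "open"
    origin.log.append("Cdt1 and MCM recruited; MCM ring opens")

    # Step 4: MCM binding to DNA triggers ATP hydrolysis by Cdc6.
    origin.log.append("MCM binds DNA; Cdc6 hydrolyzes ATP")

    # Step 5: Cdc6 and Cdt1 are released and the ring closes around DNA.
    origin.bound -= {"Cdc6", "Cdt1"}
    origin.mcm_ring = "closed"
    origin.log.append("Cdc6 and Cdt1 released; MCM ring closes around DNA")
    return origin
```

Calling `load_mcm(Origin())` leaves ORC and MCM at the origin with a closed ring, which mirrors the end state of the model in which Cdc6 and Cdt1 have been released.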
As suggested by the extensive conservation of replication factors, the basic mechanism of DNA replication is evolutionarily conserved. However, regulation of origin firing in higher eukaryotes is much more complex than in lower eukaryotes. Consequently, in order to understand what specifies metazoan origins one must look far beyond simple linear sequences and take into account the combinatorial interactions of the multiple components that make up the initiation machinery and integrate it into the cell cycle regulatory network. The main entity that initiates pre-RC formation, the ORC protein complex, does not have the ability to select origins in metazoa based solely on its own affinity for specific DNA sequences. In this function it could be aided by other pre-RC proteins, DNA topology, and even unorthodox DNA structures. Characteristics and state of chromatin structure in specific regions of the genome, nucleosome positioning, binding of transcription factors and the degree of DNA supercoiling may restrict the area in which initiation could occur. Altogether, these features could play a critical role in initiation of DNA replication by a mechanism that requires many precise small steps leading to a single goal.
This work was supported by grants from the Ministry of Science and Technological Development, Serbia (173008) and the International Centre for Genetic Engineering and Biotechnology, Italy (CRP/YUG08-01).