Asparagine (N-) linked protein glycosylation is a common and essential post-translational modification of proteins in eukaryotes, archaea and some bacteria. It plays crucial roles in protein folding and in regulation of protein function. Although the general principles of N-glycosylation have been long known, the precise details governing whether a particular asparagine residue will be N-glycosylated or not are not well understood. This is of broad general importance in understanding the structure and function of the immense variety of N-glycoproteins in diverse biological systems. This chapter will review the current understanding of the mechanisms that determine how asparagine residues are selected for glycosylation by the enzyme oligosaccharyltransferase.
2. Overview of N-glycosylation in the endoplasmic reticulum
The initial steps in N-glycosylation take place in the lumen of the endoplasmic reticulum (ER). The enzyme oligosaccharyltransferase (OTase) catalyzes the key step in N-glycosylation,
The OTase enzyme is a multiprotein complex in most eukaryotes, and in yeast consists of 8 protein subunits (Ost1p, Ost2p, Ost3/6p, Ost4p, Ost5p, Swp1p, Wbp1p and Stt3p) (Kelleher & Gilmore, 2006). It is now clear that the Stt3p protein houses the catalytic site of OTase, while the accessory protein subunits of multiprotein complex OTases are required for complex stability, enzymatic regulation of OTase activity, substrate recognition and OTase enzyme localization (Mohorko et al., 2011). OTase physically associates with the translocon (Shibatani et al., 2005, Yan & Lennarz, 2005) and the ribosome (Harada et al., 2009), and so has direct access to nascent polypeptides immediately as they enter the ER lumen (Dempski & Imperiali, 2002). Glycosylation of many asparagines is co-translocational, and occurs essentially as soon as they enter the ER lumen and can reach the OTase active site (Whitley et al., 1996). Other sites are also glycosylated post-translocationally, with extended residence of protein in the ER lumen (Ruiz-Canada et al., 2009). However, in all cases the protein substrate of OTase must be unfolded for glycosylation to occur.
2.2. Roles of N-glycans in protein folding
The key role of N-glycans on proteins in the ER is to assist in productive protein folding (Helenius & Aebi, 2004). By virtue of their hydrophilic bulk, N-glycans alter the overall biophysical properties of nascent polypeptides, increasing their solubility and constraining local polypeptide conformation (Wormald & Dwek, 1999). N-glycans can also function as signals for incomplete folding of particular domains of proteins, and so direct these to the ER resident thiol oxidoreductase ERp57 via the lectins calnexin and calreticulin (Oliver et al., 1999). Timed trimming of N-glycans on glycoproteins in the ER lumen is also key for regulating retro-translocation of incorrectly folded glycoproteins to the cytoplasm for degradation (Aebi et al., 2010).
3. The ‘glycosylation sequon’
The key recognition factor for selection of asparagines for glycosylation by OTase is the ‘glycosylation sequon’. This has been historically defined as Asn-Xaa-Ser/Thr (Xaa≠Pro). However, it has also long been clear that this is not an adequate predictor of glycosylation, as approximately one third of Asn in sequons in secreted proteins are not glycosylated. In addition, several examples of glycosylation of Asn residues not in sequons have been reported in recent years.
3.1. Definition of the sequon
The term ‘sequon’ was likely first used by Derek Marshall (Marshall, 1974) to describe the apparent three amino acid local sequence requirement for N-glycosylation. However, it was long recognized that the presence of a sequon was not sufficient for N-glycosylation to occur at a given Asn in portions of polypeptides entering the ER lumen. Nonetheless, the efficiency of glycosylation at a given asparagine is primarily determined by the flanking amino acids, with the primary factor increasing glycosylation being the presence of a threonine or serine at the +2 position. This has such a strong influence on the efficiency of glycosylation that it has been termed the ‘glycosylation sequon’ in recognition of its importance. However, the presence of a glycosylation sequon is neither necessary nor sufficient for an asparagine to be glycosylated.
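The traditional definition can be illustrated with a short scan of a protein sequence. The following is a minimal sketch in Python, assuming one-letter amino acid codes; the function name `find_sequons` and the example sequence are hypothetical and are not taken from any cited work. Because a sequon is neither necessary nor sufficient for glycosylation, the positions returned are candidate sites only.

```python
import re

# Asn-Xaa-Ser/Thr with Xaa != Pro. The lookahead lets overlapping
# sequons (for example the two in "NNTS") each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(seq):
    """Return 1-based positions of Asn residues that lie in a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Hypothetical example sequence, for illustration only.
example = "MKNATLLNPSAANGTQNNTS"
print(find_sequons(example))
```

In this example the Asn at position 8 is followed by Pro and is therefore skipped, while the overlapping sequons at positions 17 and 18 are both reported.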
3.2. The ‘+2’ position: Thr, Ser, Cys, Etc
Whilst both Ser and Thr are accepted as amino acids at the +2 position in glycosylation sequons, they are not equal, as glycosylation of Asn-Xaa-Thr sequons is approximately 40 times more efficient than of Asn-Xaa-Ser sequons (Kasturi et al., 1995, Kasturi et al., 1997). Far and away the majority of glycosylated asparagines are in traditional Asn-Xaa-Ser/Thr (Xaa≠Pro) sequons. However, several very well validated examples have been reported of asparagines
Several reports have been made of glycosylation at asparagines in the sequence Asn-Xaa-Cys. Human CD69 has such an Asn-Xaa-Cys glycosylation site (Vance et al., 1997). Human beta protein C is glycosylated at an Asn with cysteine at the +2 position (Miletich & Broze, 1990). Interestingly, the Cys in beta protein C is involved in a disulfide bond in the mature protein, and the formation of this disulfide competes directly with glycosylation at the preceding Asn. CHO-cell expressed recombinant human epidermal growth factor receptor (EGFR) also has such a glycosylation site (Sato et al., 2000). Heterologous expression of an insect cathepsin B-like counter-defense protein in
Several large-scale discovery projects for identification of N-glycosylation sites have been performed. The largest of these, from mouse, identified over 5000 putatively glycosylated asparagines (Zielinska et al., 2010). While the vast majority of these were in conventional Asn-Xaa-Ser/Thr sequons, a small but significant number of Asn not in such sequons were identified as being glycosylated. Asn-Xaa-Cys sites represented 65/5052, and Asn-Xaa-Val 20/5052. It was also reported that Asn-Gly sites were modified. However, this result must be treated with extreme caution, given that the propensity for non-catalyzed spontaneous deamidation (asparagine-to-aspartate conversion) is especially high at Asn-Gly sequences (Palmisano et al., 2012, Robinson et al., 2004).
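The proportions quoted above can be put on a percentage scale, and the same tally can in principle be made for any list of identified sites. The following Python sketch assumes that each site is supplied as a sequence together with a 1-based Asn position; the function names and the toy input are hypothetical, and the sketch ignores the residue at +1 (including Pro) for simplicity.

```python
from collections import Counter

def plus2_motif(seq, asn_pos):
    """Label a 1-based Asn position by the residue found at +2."""
    i = asn_pos - 1
    if seq[i] != "N":
        raise ValueError(f"residue {asn_pos} is {seq[i]}, not Asn")
    if i + 2 >= len(seq):
        return "no +2 residue"
    plus2 = seq[i + 2]
    return "N-X-S/T" if plus2 in "ST" else f"N-X-{plus2}"

def motif_percentages(sites):
    """sites: iterable of (sequence, asn_pos); returns {motif: percent of sites}."""
    counts = Counter(plus2_motif(seq, pos) for seq, pos in sites)
    total = sum(counts.values())
    return {motif: 100.0 * n / total for motif, n in counts.items()}

# Hypothetical toy input, not data from the cited study.
toy_sites = [("AANGTA", 3), ("AANKSA", 3), ("AANLCA", 3), ("AANAVA", 3)]
print(motif_percentages(toy_sites))

# The proportions quoted in the text, expressed as percentages.
print(round(100 * 65 / 5052, 2))  # Asn-Xaa-Cys sites
print(round(100 * 20 / 5052, 2))  # Asn-Xaa-Val sites
```

On this scale the reported Asn-Xaa-Cys and Asn-Xaa-Val sites correspond to roughly 1.3% and 0.4% of the identified sites, respectively.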
It was proposed that the hydroxyl group of Ser/Thr amino acids at the +2 position was directly involved in catalysis, via the formation of an ‘Asparagine turn’ (Imperiali & Hendrickson, 1995). This proposal was certainly powerful, and could withstand the observation of rare Asn-Xaa-Cys glycosylation sequons with the relatively weak hydrogen bonding capacity of the cysteine sulfhydryl group. However, apparent glycosylation of Asn-Xaa-Val sequons could not be explained by this mechanism. Resolution of the role of the +2 amino acid in determining glycosylation needed to wait until an atomic resolution structure of OTase was available.
3.3. Further afield: The ‘X’ position and beyond
The amino acids immediately proximal to the glycosylated Asn also influence the efficiency of its glycosylation. Experimental manipulation of model proteins has shown that the +1 position of an Asn has a strong effect on its extent of glycosylation, with bulky hydrophobic or acidic amino acids strongly reducing glycosylation occupancy, and small, hydrophilic or basic amino acids giving high levels of modification (Shakin-Eshleman et al., 1996). These results may be misleading, as glycosylation only occurs before protein folding, and so mutations which disrupt or slow local protein folding could make extrapolation of such results difficult. However, roughly this same overall pattern has also been observed in non-experimental comparisons of glycosylated and non-glycosylated Asn (Petrescu et al., 2004). Interplay with the amino acid at the +2 position has also been shown to be important. Studies in a model glycoprotein showed that sites with amino acid substitutions at the +1 position that reduced glycosylation efficiency with Ser at the +2 position were still completely modified if Thr was at the +2 position (Kasturi et al., 1997). The major difficulty in interpreting these results is that the amino acids in the vicinity of a glycosylated Asn residue influence both specific interactions with OTase
In addition to local sequence dependency, the position of an asparagine within its protein sequence also contributes to the extent or probability of glycosylation. For instance, probability and extent of glycosylation increases with increasing distance from the C-terminus of a protein. This has been measured both experimentally using manipulation of model proteins and by
3.4. The extended bacterial glycosylation sequon
The discovery of N-glycosylation systems in bacteria that are homologous to those in eukaryotes promised rapid progress in understanding the molecular basis for their specificity and activity, given their comparative simplicity and ease of manipulation (Szymanski et al., 1999, Wacker et al., 2002). Initially it was observed that the
3.5. Structural insights into the requirement for the glycosylation sequon
The high-resolution 3D crystal structure of the
This structure of the PglB OTase provides clear evidence that the role of the glycosylation sequon is to increase the binding affinity of asparagines to the active site of OTase (Lizak et al., 2011b). Accessory subunits of multiprotein complex OTases in many eukaryotes have been shown to bind substrate polypeptide, perhaps contributing to increasing the binding affinity of specific Asn and leading to the short requirement of specific binding of an Asn-Xaa-Ser/Thr. In contrast, the single protein OTases such as the bacterial PglB may have evolved the requirement for an extended sequon in the absence of such additional binding by accessory OTase subunits.
3.6. The future of the sequon
How to best define the glycosylation ‘sequon’? Many factors influence whether a particular asparagine is glycosylated, including: binding affinity of the region immediately proximal to the Asn to the polypeptide acceptor site of OTase; local folding, such as secondary structural elements, disulfide bond formation or hydrophobic collapse; the regulatory state of OTase, including the concentration and structure of lipid-linked oligosaccharide donor; protein expression rate, both global (rate of protein secretion saturates OTase catalytic ability) and local (position of Asn within the protein sequence); and the effect of glycosylation at an Asn on the total possibility of protein folding. (If glycosylation at a given Asn would not allow correct folding of the protein, such that the portion of nascent polypeptides that were glycosylated there would never correctly fold, then that Asn would appear to never be glycosylated. The converse is also true, that if glycosylation is strictly required at a particular Asn for correct protein folding, then that Asn will appear to always be glycosylated, even if most of the nascent polypeptide is not modified and degraded by the quality control systems of the ER.)
It is the combination of these factors that determines if a particular Asn reaches the threshold for modification by OTase. However, even the definition of this threshold is an analytical artefact, as it is increasingly apparent that most glycosylated Asn are only partially modified, with a portion, ranging from a fraction of a percent to essentially all copies of a protein, actually glycosylated (Hülsmeier et al., 2007, Sumer-Bayraktar et al., 2011). This pattern seems to contrast with the general requirement of many proteins for N-glycosylation for correct and efficient protein folding (Helenius & Aebi, 2004). Two key factors probably explain this conundrum. Many proteins can fold correctly even without glycosylation at many sites, as long as a certain critical level of glycosylation is present, perhaps sufficient for ER-lectin chaperone recruitment to crucial protein domains, or overall biophysical solubility. Additionally, Asn residues are inherently likely to be present at the ends of secondary structural elements. This means that glycosylation at such sites is, in general, not likely to strongly disrupt protein folding.
In the end it appears that the descriptive beauty of the ‘glycosylation sequon’ is actually a dramatic simplification. However, the current state of knowledge is far from being able to quantify the ‘glycosylatability’ of a particular Asn. In place of this developing skill, the ‘sequon’ as it is traditionally defined is still a very accurate predictor of the possibility of glycosylation.
4. Oligosaccharyltransferase defines the sequon
The enzyme oligosaccharyltransferase (OTase) catalyses transfer of oligosaccharide from lipid to nascent polypeptide in the ER. However, while this enzyme shows a high degree of conservation between species with respect to the small scale reaction it catalyses, the immense range of different polypeptide substrates in various biological systems can be efficiently glycosylated because of co-evolution of these substrate proteins and the acceptor specificities of OTase. In turn, this evolutionary history determines whether a particular asparagine residue will be efficiently glycosylated in a given biological system. The OTase defines the ‘sequon’.
4.1. OTase protein subunits
OTase consists of the catalytic protein subunit Stt3p/PglB with varying numbers of additional accessory subunits in different organisms (reviewed in (Kelleher & Gilmore, 2006, Mohorko et al., 2011)). Comparison of the evolutionary tree of eukaryotes with the protein subunit composition of OTase implies that accessory protein subunits have been added sequentially during eukaryotic evolution, starting from an ancestral single protein Stt3p OTase enzyme. The functions of most accessory OTase subunits are not clearly defined, although roles in recognition and regulation of glycan and protein substrate have been proposed.
4.2. Single protein OTases
Some divergent eukaryotes such as
4.2.1. Single protein OTases in Trypanosoma brucei
4.2.2. Single protein OTases in Leishmania major
The single subunit OTase enzymes of the related
4.2.3. Role of OTase catalytic subunit homologues STT3A and STT3B
Even when present in multiprotein complexes, STT3 homologues have different activities. OTase complexes containing either of the homologous mammalian STT3A and STT3B proteins have different kinetic parameters (Kelleher et al., 2003), and are also responsible for either co-translocational or post-translocational N-glycosylation (Ruiz-Canada et al., 2009), thereby glycosylating different protein substrates (Wilson & High, 2007). However, it is not clear whether the presence of STT3A or STT3B in an OTase complex further defines its protein or glycan substrate specificity.
4.3. Role of accessory OTase proteins
In organisms with multiprotein complex OTases, there are several lines of evidence that some of these additional non-catalytic subunits provide different protein substrate specificities and allow regulation of oligosaccharide substrate recognition and enzymatics.
4.3.1. Role of accessory OTase proteins Ost3p and Ost6p
4.3.2. Role of accessory OTase protein ribophorin I / Ost1p
Mammalian ribophorin I (Ost1p in yeast) is required for efficient glycosylation of selected membrane proteins. Ribophorin I physically associates with selected membrane proteins after insertion into the ER membrane (Wilson et al., 2005). This interaction with these selected substrate proteins was also shown to be required for their efficient glycosylation by OTase (Wilson & High, 2007). The interaction between selected membrane proteins and ribophorin I is direct, but the precise mechanisms of the interaction are not clear (Wilson et al., 2008). It is possible that ribophorin I / Ost1p functions in a conceptually similar way to Ost3/6p, transiently tethering substrate protein close to the catalytic site of OTase to allow efficient glycosylation of a defined subset of glycosylation sites or glycoproteins.
4.3.3. Additional known accessory OTase proteins
An integral membrane protein with homology to the integral membrane domain of Ost3p and Ost6p has been identified in mammalian cells. This protein, DC2 or OSTC, is required for glycosylation of specific substrate glycoproteins (Wilson & High, 2007). A further protein, Keratinocyte-associated protein 2 (KCP2), has been shown biochemically to be a subunit of the mammalian OTase (Sanyal & Menon, 2010, Roboti & High, 2012), and to be required for glycosylation of some proteins (Wilson & High, 2007).
4.3.4. Putative accessory OTase protein presenilin 1
A direct link between site-specific glycosylation and Alzheimer’s disease has been made, through the presenilin-1 protein (Lee et al., 2010). N-glycosylation of the vacuolar ATPase subunit V0a1 is mediated by selective binding of the Alzheimer’s disease related protein presenilin-1 (PS1) to unglycosylated V0a1 and OTase. V0a1 glycosylation is required for ER-lysosome trafficking, and so lack of PS1 causes deficiencies in lysosomal acidification and proteolysis during autophagy. It is not clear if PS1 is a truly protein-specific enhancer of glycosylation, or if it interacts with additional substrate glycoproteins to enhance their glycosylation.
4.3.5. How many OTase subunits are there?
Have all OTase subunits been identified? Most known OTase subunit proteins have been identified in the yeast
It is possible that other, less tightly bound or lowly expressed proteins are yet to be identified. It is also possible that sequential addition of accessory proteins to the OTase complex has proceeded divergently in different eukaryotic lineages. This would mean that biochemical analyses, rather than genomic comparisons, would be necessary to identify any additional OTase subunits in, for example, the plant or protozoan OTase. Any such additional subunits would likely have diverse additional roles in regulation of OTase core activity.
5. Analytical approaches to determine site-specific glycosylation occupancy
5.1. Glycosylation site identification and occupancy
A goal of understanding the function of OTase in diverse biological systems is to enable accurate prediction of whether a particular Asn will be efficiently glycosylated. However, such prediction depends on a complete understanding of how OTase interacts with substrate polypeptides in each biological system, and as such is probably a very difficult problem. In addition to the diversity of OTase subunit proteins, OTase activity may also be subject to regulation. In the absence of accurate prediction tools, analytical identification and quantification of glycosylation occupancy is therefore necessary for accurate characterisation of the glycosylation status of a protein. In addition, it is not sufficient to identify that a site is glycosylated, as an Asn can be identified as ‘glycosylated’ in enrichment experiments, but may actually only be modified at a very low occupancy. The physiological relevance of glycosylation at such sites is therefore questionable. The converse of this is also true, as it appears that with sensitive analytical detection some or even most glycosylation sites are not completely occupied (there exists a small but significant proportion of proteins that are not glycosylated at that particular site) (Hülsmeier et al., 2007). Analytical methods should therefore consider the proportion of a particular Asn that is glycosylated, for instance using LC-MS approaches that can compare the abundance of glycosylated and non-glycosylated versions of the same peptide (Schulz & Aebi, 2009). Although these methods are not in general absolutely quantitative, they can provide relative quantification and a first step towards characterization of the site-specific extent of glycosylation.
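The relative quantification described above can be expressed as a simple ratio. The following Python sketch assumes that a single summed ion intensity is available for the glycosylated and for the non-glycosylated form of the same peptide, and that both forms are detected with similar efficiency; that assumption rarely holds exactly, which is why such values are relative rather than absolute. The function name `site_occupancy` and the intensities are hypothetical.

```python
def site_occupancy(glyco_intensity, nonglyco_intensity):
    """Relative occupancy = glycosylated / (glycosylated + non-glycosylated)."""
    if glyco_intensity < 0 or nonglyco_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = glyco_intensity + nonglyco_intensity
    if total == 0:
        raise ValueError("peptide was not detected in either form")
    return glyco_intensity / total

# Hypothetical intensities in arbitrary units, for illustration only.
print(site_occupancy(3.0e6, 1.0e6))
```

A value near 1 would indicate an almost fully occupied site, while a value near 0 would indicate a site that is detectable as ‘glycosylated’ but modified on only a small proportion of protein copies.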
5.2. Western blotting for measuring glycosylation occupancy
Numerous studies have made use of Western blotting with antibodies recognizing a specific protein of interest to gauge glycosylation occupancy. However, Western blotting is inherently limited to analysis of proteins for which specific antisera are available, and is constrained to low-throughput assays. Western blotting can also only identify protein-wide glycosylation occupancy, and cannot distinguish between partial glycosylation at different Asn residues on the same protein. Mass spectrometry can overcome both of these key difficulties, as it is a general analysis tool that can be used for site-specific analysis of protein glycosylation.
5.3. Glycoconjugate enrichment strategies
Detection of glycosylation at a specific site is the first step in its quantitative analysis (Schulz et al., 2012). Enrichment of glycoproteins or glycopeptides is key to the success of high sensitivity detection of glycosylation sites. Various enrichment strategies can be employed depending on the biological system of interest, and the analytes of interest within that system. The physical properties of carbohydrates that distinguish them from protein can be used to enrich glycopeptides and glycoproteins. Typical enrichment strategies based on the physical properties of glycans include hydrophilic interaction chromatography (Mysling et al., 2010, Gilar et al., 2011, Christiansen et al., 2010), phenyl boronic acid (Li et al., 2000, Li et al., 2001) and hydrazide (Zhang & Aebersold, 2006, Zhang et al., 2003) attachment. A key mechanism mediating the functional roles of glycans in many biological systems is recognition of specific glycan structures by proteins, or lectins. The specificity of such lectins for defined glycan structures can be used to enrich particular subsets of glycopeptides or glycoproteins bearing those structures (Drake et al., 2006, Zielinska et al., 2010).
5.4. Mass spectrometry for measuring glycosylation occupancy
To obtain quantitative or semi-quantitative measurement of the extent of glycosylation at a given site, subsequent comparison must be made with the unglycosylated form of the detected peptide. This can be done by comparing the ion intensities of the glycosylated and unglycosylated peptides. The unglycosylated form of the peptide will only be present in one form. However, as glycosylation generally results in a complex mixture of glycan structures at each glycosylation site, measurement of the abundance of the glycosylated form of a given site is not trivial. Some approaches have used detection of entire glycopeptides, although this approach generally requires more specialized and targeted LC-MS technologies (Sumer-Bayraktar et al., 2011). Other approaches have focused on improving quantification of occupancy, and have discarded information on site-specific glycan structure by endoglycosidase treatment (Schulz & Aebi, 2009). For instance, PNGaseF cleaves N-glycans and converts previously glycosylated Asn to Asp, while EndoH leaves a single
5.5. Selected-reaction-monitoring mass spectrometry
Recent years have seen impressive success with targeted mass spectrometry approaches, using selected-reaction-monitoring (Lange et al., 2008, Gallien et al., 2011). N-glycosylation has been used as a useful tag to specifically enrich otherwise low abundant components of biological fluids (Stahl-Zeng et al., 2007). Often this has been performed not out of direct interest in glycosylation per se, but because of the ubiquity of glycosylation, and its proven utility in biomarker discovery. However, some analyses have used this approach to specifically measure glycosylation occupancy, for instance in patients with congenital disorders of glycosylation (Hülsmeier et al., 2007).
5.6. Future analytical directions
Use of tools such as those outlined above, in combination with experimental manipulation of growth conditions, N-glycan biosynthetic pathways, protein translation and translocation, and OTase function or composition, will allow identification of the regulation and roles of site-specific N-glycosylation occupancy at a systems level.
6. Is the ‘glycosylation sequon’ an example of convergent evolution? Insights into glycosylation site evolution
6.1. HMW-ABC glycosylation in non-typeable
A family of cytoplasmic bacterial enzymes has recently been described that catalyse an N-glycosylation reaction remarkably reminiscent of ‘traditional’ N-glycosylation. These enzymes are the HMW-C glycosyltransferase of non-typeable
A key step in NTHi infection is adherence to the host epithelium. Surface exposed adhesin proteins mediate this adherence, with the high molecular weight (HMW) adhesin system being of key importance in many NTHi clinical isolates. HMW-C is a glycosyltransferase associated with this two-partner secretion system adhesin, encoded in the HMW-ABC locus. Two highly homologous loci are present in the ~80% of NTHi clinical isolates that encode this system, HMW1ABC and HMW2ABC respectively (St. Geme et al., 1998, Ecevit et al., 2004). HMW1A encodes an adhesin glycoprotein (Gross et al., 2008, Grass et al., 2003), which is secreted across the inner membrane via the Sec apparatus, and requires the outer membrane protein HMW1B for correct export across the outer membrane (St Geme & Yeo, 2009). HMW1C encodes a family 41 glycosyltransferase that glycosylates HMW1A (Grass et al., 2010, Kawai et al., 2011, Choi et al., 2010). This glycosylation is required for stability, efficient folding and secretion of the HMW1A glycoprotein adhesin (Grass et al., 2003). In turn, the HMW1A adhesin is important for NTHi colonisation and pathogenesis (St Geme et al., 1993, St Geme, 1994).
Similar to several other described bacterial protein glycosyltransferases, HMW1C glycosylates its HMW1A substrate protein in the cytoplasm, before secretion across the inner membrane (Fleckenstein et al., 2006, Charbonneau et al., 2012, Choi et al., 2010, Schwarz et al., 2011a). Most of these other reported bacterial glycosyltransferases are O-glycosyltransferases, transferring nucleotide-activated monosaccharides to the hydroxyl groups of Ser or Thr. In contrast, HMW1C glycosylates Asn residues, with a strong tendency to glycosylate Asn within glycosylation sequons with the sequence Asn-Xaa-Ser/Thr (Gross et al., 2008).
6.2. HMW-C versus OTase: Unrelated enzymes, same sequon?
The HMW-C and OTase systems are not homologous, as traditional N-glycosylation as described above is catalysed by the integral membrane OTase, which transfers an oligosaccharide from a lipid linked carrier to nascent polypeptide in the lumen of the ER (or periplasm). In contrast, the HMW-C cytoplasmic system of some bacteria is catalysed by a soluble glycosyltransferase that transfers a nucleotide-activated monosaccharide to protein in the cytoplasm. However, it is striking that the bacterial cytoplasmic HMW-C enzymes have very similar site recognition to ‘traditional’ OTase enzymes: they efficiently glycosylate Asn in ‘sequons’ with Asn-Xaa-Ser/Thr (Xaa≠Pro), but are capable of glycosylating some selected asparagines lacking S/T at the +2 position (Choi et al., 2010, Grass et al., 2010, Schwarz et al., 2011a, Schwarz & Aebi, 2011). HMW-C enzymes also share the substrate requirement of OTase for unfolded protein, or flexible loops in folded protein (Schwarz et al., 2011a).
A high-resolution 3D crystal structure of an HMW-C enzyme from
This then raises the very curious observation that two non-homologous enzymatic systems for glycosylation of Asn have independently evolved essentially identical substrate recognition motifs. This suggests convergent evolution of enzyme-substrate interactions in these two systems, which would in turn imply that there is some functional benefit in recognition of Ser/Thr at the +2 position relative to an asparagine. It is tempting to speculate that this sequence may have evolved to balance the need for sufficient binding affinity of the polypeptide acceptor with the advantages of a general glycosylation system.
The selection pressure for OTase and HMW-C to require unfolded polypeptide substrate is not completely clear. However, this requirement is likely due to the benefit of glycosylation in increasing both protein folding efficiency and the stability of folded proteins. Addition of glycans to already folded proteins can serve to increase their stability, potentially in a regulated manner (Yuzwa et al., 2012). However, to be of assistance in protein
6.3. Why is the sequon as it is?
Why then should Ser/Thr be part of a preferred glycosylation recognition sequence, and not any other amino acids? Perhaps part of the answer is that these hydroxyl-containing residues are typically surface exposed and are not charged. The hydrophilic nature of Ser and Thr means that they are generally not present internally in folded proteins, but are almost always surface exposed. As addition of a glycan in the hydrophobic core of a protein would be incompatible with correct protein folding, a hydrophilic recognition motif is necessary. Charged residues (His, Arg, Lys, Asp, Glu) would also be potential candidates for such a role, but here the generality of the neutral hydroxyl groups of Ser and Thr is perhaps important. Neutral hydrophilic residues such as Ser and Thr are compatible with almost any position on the surface of a folded protein. In contrast, charge-based attraction and repulsion is an important contributor to protein folding, stability and function. Point mutation to insert one of these charged amino acids on the surface of a protein is likely to disrupt the protein structure. Ser/Thr as an extended recognition sequence therefore likely provides the affinity and ubiquity necessary for evolution of OTase/HMW-C enzymes as general glycosylation enzymes capable of glycosylating multiple Asn residues in many different proteins.
The structural basis for the glycosylation sequon is now apparent. However, it is also clear that recognition and glycosylation of selected asparagine residues is subject to further control and regulation depending on variation within catalytic STT3 enzymes, and on the presence of accessory protein subunits of multiprotein OTase complexes. In order to understand the roles of these accessory proteins, it is however necessary for them to be completely identified. With recent years showing the identification and preliminary characterization of several novel accessory proteins of mammalian OTase, it is probable that additional subunits still remain to be discovered. Biochemical characterization of OTase complexes in other eukaryotes may well also present additional, non-homologous, accessory protein subunits. Further, OTase enzymatic activity is actively regulated, adding to the complexity of potential OTase function. Mass spectrometry-based future analytics for glycosylation analysis will enable phenotypic characterization of the site-specific activity of OTase in these varied biological circumstances. Such analysis will contribute to, and also benefit from, a complete quantitative understanding of the interplay between glycoprotein folding and N-glycosylation. Finally, understanding of the molecular mechanisms of N-glycosylation site selection is beginning to open the possibilities for co-engineering of glycosylation sites and OTases in synthetic biology approaches outside of natural evolutionary constraints, moving N-glycosylation beyond the sequon.
The author acknowledges the support of NHMRC Career Development Fellowship APP1031542 and NHMRC Project Grant 631615.