1. Introduction
Asparagine (N-) linked protein glycosylation is a common and essential post-translational modification of proteins in eukaryotes, archaea and some bacteria. It plays crucial roles in protein folding and in regulation of protein function. Although the general principles of N-glycosylation have long been known, the precise details governing whether a particular asparagine residue will be N-glycosylated are not well understood. This question is of broad importance for understanding the structure and function of the immense variety of N-glycoproteins in diverse biological systems. This chapter will review the current understanding of the mechanisms that determine how asparagine residues are selected for glycosylation by the enzyme oligosaccharyltransferase.
2. Overview of N-glycosylation in the endoplasmic reticulum
The initial steps in N-glycosylation take place in the lumen of the endoplasmic reticulum (ER). The enzyme oligosaccharyltransferase (OTase) catalyzes the key step in N-glycosylation,
2.1. Oligosaccharyltransferase
The OTase enzyme is a multiprotein complex in most eukaryotes, and in yeast consists of 8 protein subunits (Ost1p, Ost2p, Ost3/6p, Ost4p, Ost5p, Swp1p, Wbp1p and Stt3p) (Kelleher & Gilmore, 2006). It is now clear that the Stt3p protein houses the catalytic site of OTase, while the accessory protein subunits of multiprotein complex OTases are required for complex stability, enzymatic regulation of OTase activity, substrate recognition and OTase enzyme localization (Mohorko et al., 2011). OTase physically associates with the translocon (Shibatani et al., 2005, Yan & Lennarz, 2005) and the ribosome (Harada et al., 2009), and so has direct access to nascent polypeptides immediately as they enter the ER lumen (Dempski & Imperiali, 2002). Glycosylation of many asparagines is co-translocational, and occurs essentially as soon as they enter the ER lumen and can reach the OTase active site (Whitley et al., 1996). Other sites are also glycosylated post-translocationally, with extended residence of protein in the ER lumen (Ruiz-Canada et al., 2009). However, in all cases the protein substrate of OTase must be unfolded for glycosylation to occur.
2.2. Roles of N-glycans in protein folding
The key role of N-glycans on proteins in the ER is to assist in productive protein folding (Helenius & Aebi, 2004). By virtue of their hydrophilic bulk, N-glycans alter the overall biophysical properties of nascent polypeptides, increasing their solubility and constraining local polypeptide conformation (Wormald & Dwek, 1999). N-glycans can also function as signals for incomplete folding of particular domains of proteins, and so direct these to the ER resident thiol oxidoreductase ERp57 via the lectins calnexin and calreticulin (Oliver et al., 1999). Timed trimming of N-glycans on glycoproteins in the ER lumen is also key for regulating retro-translocation of incorrectly folded glycoproteins to the cytoplasm for degradation (Aebi et al., 2010).
3. The ‘glycosylation sequon’
The key recognition factor for selection of asparagines for glycosylation by OTase is the ‘glycosylation sequon’. This has been historically defined as Asn-Xaa-Ser/Thr (Xaa≠Pro). However, it has also long been clear that this is not an adequate predictor of glycosylation, as approximately one third of Asn in sequons in secreted proteins are not glycosylated. In addition, several examples of glycosylation of Asn residues not in sequons have been reported in recent years.
3.1. Definition of the sequon
The term ‘sequon’ was likely first used by Derek Marshall (Marshall, 1974) to describe the apparent three amino acid local sequence requirement for N-glycosylation. However, it was long recognized that the presence of a sequon was not sufficient for N-glycosylation to occur at a given Asn in portions of polypeptides entering the ER lumen. Nonetheless, the efficiency of glycosylation at a given asparagine is primarily determined by the flanking amino acids, with the primary factor increasing glycosylation being the presence of a threonine or serine at the +2 position. This has such a strong influence on the efficiency of glycosylation that it has been termed the ‘glycosylation sequon’ in recognition of its importance. However, the presence of a glycosylation sequon is neither necessary nor sufficient for an asparagine to be glycosylated.
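The traditional definition can be written as a short pattern search. The following is a minimal sketch, assuming one-letter amino acid codes; the function name `find_sequons` and the example sequence are hypothetical and chosen only for illustration. As stressed above, a match indicates only that an Asn lies in a sequon, not that it is glycosylated.

```python
import re

# Asn-Xaa-Ser/Thr with Xaa != Pro, in one-letter code.
# A lookahead is used so that overlapping sequons (e.g. "NNTS") are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that lie in a traditional sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence for illustration only (not a real protein).
example = "MKNGTAANPSLLNNTSQ"
print(find_sequons(example))  # [3, 13, 14]; the Asn-Pro-Ser at position 8 is excluded
```

The search deliberately ignores non-canonical sites and every non-sequence factor discussed in this chapter, so it is best read as a first filter for candidate sites.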
3.2. The ‘+2’ position: Thr, Ser, Cys, etc.
Whilst both Ser and Thr are accepted as amino acids at the +2 position in glycosylation sequons, they are not equal, as glycosylation of Asn-Xaa-Thr sequons is approximately 40 times more efficient than that of Asn-Xaa-Ser sequons (Kasturi et al., 1995, Kasturi et al., 1997). By far the majority of glycosylated asparagines are in traditional Asn-Xaa-Ser/Thr (Xaa≠Pro) sequons. However, several very well validated examples have been reported of asparagines
Several reports have been made of glycosylation at asparagines in the sequence Asn-Xaa-Cys. Human CD69 has such an Asn-Xaa-Cys glycosylation site (Vance et al., 1997). Human beta protein C is glycosylated at an Asn with cysteine at the +2 position (Miletich & Broze, 1990). Interestingly, the Cys in beta protein C is involved in a disulfide bond in the mature protein, and the formation of this disulfide competes directly with glycosylation at the preceding Asn. CHO-cell expressed recombinant human epidermal growth factor receptor (EGFR) also has such a glycosylation site (Sato et al., 2000). Heterologous expression of an insect cathepsin B-like counter-defense protein in
Several large-scale discovery projects for identification of N-glycosylation sites have been performed. The largest of these, from mouse, identified over 5000 putatively glycosylated asparagines (Zielinska et al., 2010). While the vast majority of these were in conventional Asn-Xaa-Ser/Thr sequons, a small but significant number of Asn not in such sequons were identified as being glycosylated. Asn-Xaa-Cys sites represented 65/5052, and Asn-Xaa-Val 20/5052. It was also reported that Asn-Gly sites were modified. However, this result must be treated with extreme caution, given that the propensity for non-catalyzed spontaneous deamidation (asparagine-aspartate conversion) is especially high at Asn-Gly sequences (Palmisano et al., 2012, Robinson et al., 2004).
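To put these counts in proportion, the short sketch below converts the figures quoted above into percentages and sorts a candidate Asn into the site classes discussed in this section. The function names, the class labels and the order in which the classes are tested are assumptions made for illustration; they are not taken from the cited study.

```python
TOTAL_SITES = 5052  # total sites, as quoted above from Zielinska et al., 2010
NON_CANONICAL_COUNTS = {"Asn-Xaa-Cys": 65, "Asn-Xaa-Val": 20}


def percent(count, total=TOTAL_SITES):
    """Express a site count as a percentage of all identified sites."""
    return 100.0 * count / total


def classify_site(sequence, asn_index):
    """Classify the Asn at 0-based asn_index by its +1 and +2 residues.

    The labels and their precedence are illustrative assumptions.
    """
    seq = sequence.upper()
    if seq[asn_index] != "N":
        raise ValueError("residue at asn_index is not Asn")
    plus1 = seq[asn_index + 1:asn_index + 2]  # empty string at the C-terminus
    plus2 = seq[asn_index + 2:asn_index + 3]
    if plus1 not in ("", "P") and plus2 in ("S", "T"):
        return "canonical"
    if plus2 == "C":
        return "Asn-Xaa-Cys"
    if plus2 == "V":
        return "Asn-Xaa-Val"
    if plus1 == "G":
        return "Asn-Gly"  # treat with caution: prone to spontaneous deamidation
    return "other"


for motif, count in NON_CANONICAL_COUNTS.items():
    print(f"{motif}: {count}/{TOTAL_SITES} = {percent(count):.2f}%")
# Asn-Xaa-Cys: 65/5052 = 1.29%
# Asn-Xaa-Val: 20/5052 = 0.40%
```

On these figures, the two non-canonical classes together account for 85 of 5052 sites, or roughly 1.7%.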
It was proposed that the hydroxyl group of Ser/Thr amino acids at the +2 position was directly involved in catalysis, via the formation of an ‘Asparagine turn’ (Imperiali & Hendrickson, 1995). This proposal was certainly powerful, and could withstand the observation of rare Asn-Xaa-Cys glycosylation sequons with the relatively weak hydrogen bonding capacity of the cysteine sulfhydryl group. However, apparent glycosylation of Asn-Xaa-Val sequons could not be explained by this mechanism. Resolution of the role of the +2 amino acid in determining glycosylation needed to wait until an atomic resolution structure of OTase was available.
3.3. Further afield: The ‘X’ position and beyond
The amino acids immediately proximal to the glycosylated Asn also influence the efficiency of its glycosylation. Experimental manipulation of model proteins has shown that the +1 position of an Asn has a strong effect on its extent of glycosylation, with bulky hydrophobic or acidic amino acids strongly reducing glycosylation occupancy, and small, hydrophilic or basic amino acids giving high levels of modification (Shakin-Eshleman et al., 1996). These results may be misleading, as glycosylation only occurs before protein folding, and so mutations which disrupt or slow local protein folding could make extrapolation of such results difficult. However, roughly the same overall pattern has also been observed in non-experimental comparisons of glycosylated and non-glycosylated Asn (Petrescu et al., 2004). Interplay with the amino acid at the +2 position has also been shown to be important. Studies in a model glycoprotein showed that sequons with amino acid substitutions at the +1 position that reduced glycosylation efficiency with Ser at the +2 position were still completely modified if Thr was at the +2 position (Kasturi et al., 1997). The major difficulty in interpreting these results is that the amino acids in the vicinity of a glycosylated Asn residue influence both specific interactions with OTase
In addition to local sequence dependency, the position of an asparagine within its protein sequence also contributes to the extent or probability of glycosylation. For instance, the probability and extent of glycosylation increase with increasing distance from the C-terminus of a protein. This has been measured both experimentally using manipulation of model proteins and by
3.4. The extended bacterial glycosylation sequon
The discovery of N-glycosylation systems in bacteria that are homologous to those in eukaryotes promised rapid progress in understanding the molecular basis for their specificity and activity, given their comparative simplicity and ease of manipulation (Szymanski et al., 1999, Wacker et al., 2002). Initially it was observed that the
3.5. Structural insights into the requirement for the glycosylation sequon
The high-resolution 3D crystal structure of the
This structure of the PglB OTase provides clear evidence that the role of the glycosylation sequon is to increase the binding affinity of asparagines to the active site of OTase (Lizak et al., 2011b). Accessory subunits of multiprotein complex OTases in many eukaryotes have been shown to bind substrate polypeptide, perhaps contributing to increasing the binding affinity of specific Asn and leading to the short requirement of specific binding of an Asn-Xaa-Ser/Thr. In contrast, the single protein OTases such as the bacterial PglB may have evolved the requirement for an extended sequon in the absence of such additional binding by accessory OTase subunits.
3.6. The future of the sequon
How to best define the glycosylation ‘sequon’? Many factors influence whether a particular asparagine is glycosylated, including: binding affinity of the region immediately proximal to the Asn to the polypeptide acceptor site of OTase; local folding, such as secondary structural elements, disulfide bond formation or hydrophobic collapse; the regulatory state of OTase, including the concentration and structure of lipid-linked oligosaccharide donor; protein expression rate, both global (rate of protein secretion saturates OTase catalytic ability) and local (position of Asn within the protein sequence); and the effect of glycosylation at an Asn on the total possibility of protein folding. (If glycosylation at a given Asn would not allow correct folding of the protein, such that the portion of nascent polypeptides that were glycosylated there would never correctly fold, then that Asn would appear to never be glycosylated. The converse is also true: if glycosylation is strictly required at a particular Asn for correct protein folding, then that Asn will appear to always be glycosylated, even if most of the nascent polypeptide is not modified and is degraded by the quality control systems of the ER.)
It is the combination of these factors that determines if a particular Asn reaches the threshold for modification by OTase. However, even the definition of this threshold is an analytical artefact, as it is increasingly apparent that most glycosylated Asn are only partially modified, with the glycosylated portion ranging from a fraction of a percent to essentially all copies of a protein (Hülsmeier et al., 2007, Sumer-Bayraktar et al., 2011). This pattern seems to contrast with the general requirement of many proteins for N-glycosylation for correct and efficient protein folding (Helenius & Aebi, 2004). Two key factors probably explain this conundrum. Many proteins can fold correctly even without glycosylation at many sites, as long as a certain critical level of glycosylation is present, perhaps sufficient for ER-lectin chaperone recruitment to crucial protein domains, or overall biophysical solubility. Additionally, Asn residues are inherently likely to be present at the ends of secondary structural elements. This means that glycosylation at such sites is, in general, not likely to strongly disrupt protein folding.
In the end it appears that the descriptive beauty of the ‘glycosylation sequon’ is actually a dramatic simplification. However, the current state of knowledge is far from being able to quantify the ‘glycosylatability’ of a particular Asn. In place of this developing skill, the ‘sequon’ as it is traditionally defined is still a very accurate predictor of the possibility of glycosylation.
4. Oligosaccharyltransferase defines the sequon
The enzyme oligosaccharyltransferase (OTase) catalyses transfer of oligosaccharide from lipid to nascent polypeptide in the ER. However, while this enzyme shows a high degree of conservation between species with respect to the small scale reaction it catalyses, the immense range of different polypeptide substrates in various biological systems can be efficiently glycosylated because of co-evolution of these substrate proteins and the acceptor specificities of OTase. In turn, this evolutionary history determines whether a particular asparagine residue will be efficiently glycosylated in a given biological system. The OTase defines the ‘sequon’.
4.1. OTase protein subunits
OTase consists of the catalytic protein subunit Stt3p/PglB with varying numbers of additional accessory subunits in different organisms (reviewed in (Kelleher & Gilmore, 2006, Mohorko et al., 2011)). Comparison of the evolutionary tree of eukaryotes with the protein subunit composition of OTase implies that accessory protein subunits have been added sequentially during eukaryotic evolution, starting from an ancestral single protein Stt3p OTase enzyme. The functions of most accessory OTase subunits are not clearly defined, although roles in recognition and regulation of glycan and protein substrate have been proposed.
4.2. Single protein OTases
Some divergent eukaryotes such as
4.2.1. Single protein OTases in Trypanosoma brucei
4.2.2. Single protein OTases in Leishmania major
The single subunit OTase enzymes of the related
4.2.3. Role of OTase catalytic subunit homologues STT3A and STT3B
Even when present in multiprotein complexes, STT3 homologues have different activities. OTase complexes containing either of the homologous mammalian STT3A and STT3B proteins have different kinetic parameters (Kelleher et al., 2003), and are also responsible for either co-translocational or post-translocational N-glycosylation (Ruiz-Canada et al., 2009), thereby glycosylating different protein substrates (Wilson & High, 2007). However, it is not clear if protein or glycan substrate specificity is further defined by the presence of STT3A or STT3B in an OTase complex.
4.3. Role of accessory OTase proteins
In organisms with multiprotein complex OTases, there are several lines of evidence that some of these additional non-catalytic subunits provide different protein substrate specificities and allow regulation of oligosaccharide substrate recognition and enzymatics.
4.3.1. Role of accessory OTase proteins Ost3p and Ost6p
The
4.3.2. Role of accessory OTase protein ribophorin I / Ost1p
Mammalian ribophorin I (Ost1p in yeast) is required for efficient glycosylation of selected membrane proteins. Ribophorin I physically associates with selected membrane proteins after insertion into the ER membrane (Wilson et al., 2005). This interaction with these selected substrate proteins was also shown to be required for their efficient glycosylation by OTase (Wilson & High, 2007). The interaction between selected membrane proteins and ribophorin I is direct, but the precise mechanisms of the interaction are not clear (Wilson et al., 2008). It is possible that ribophorin I / Ost1p functions in a conceptually similar way to Ost3/6p, by transiently tethering substrate protein close to the catalytic site of OTase to allow efficient glycosylation of a defined subset of glycosylation sites or glycoproteins.
4.3.3. Additional known accessory OTase proteins
An integral membrane protein with homology to the integral membrane domain of Ost3p and Ost6p has been identified in mammalian cells. This protein, DC2 or OSTC, is required for glycosylation of specific substrate glycoproteins (Wilson & High, 2007). A further protein, Keratinocyte-associated protein 2 (KCP2), has been shown biochemically to be a subunit of the mammalian OTase (Sanyal & Menon, 2010, Roboti & High, 2012), and to be required for glycosylation of some proteins (Wilson & High, 2007).
4.3.4. Putative accessory OTase protein presenilin 1
A direct link between site-specific glycosylation and Alzheimer’s disease has been made, through the presenilin-1 protein (Lee et al., 2010). N-glycosylation of the vacuolar ATPase subunit V0a1 is mediated by selective binding of the Alzheimer’s disease related protein presenilin-1 (PS1) to unglycosylated V0a1 and OTase. V0a1 glycosylation is required for ER-lysosome trafficking, and so lack of PS1 causes deficiencies in lysosomal acidification and proteolysis during autophagy. It is not clear if PS1 is a truly protein-specific enhancer of glycosylation, or if it interacts with additional substrate glycoproteins to enhance their glycosylation.
4.3.5. How many OTase subunits are there?
Have all OTase subunits been identified? Most known OTase subunit proteins have been identified in the yeast
It is possible that other, less tightly bound or lowly expressed proteins are yet to be identified. It is also possible that sequential addition of accessory proteins to the OTase complex has proceeded divergently in different eukaryotic lineages. This would mean that biochemical analyses, rather than genomic comparisons, would be necessary to identify any additional OTase subunits in, for example, plant or protozoan OTase. Any such additional subunits would likely have diverse additional roles in regulation of OTase core activity.
5. Analytical approaches to determine site-specific glycosylation occupancy
5.1. Glycosylation site identification and occupancy
A goal of understanding the function of OTase in diverse biological systems is to enable accurate prediction of whether a particular Asn will be efficiently glycosylated. However, such prediction depends on a complete understanding of how OTase interacts with substrate polypeptides in each biological system, and as such is probably a very difficult problem. In addition to the diversity of OTase subunit proteins, OTase activity may also be subject to regulation. In the absence of accurate prediction tools, analytical identification and quantification of glycosylation occupancy is therefore necessary for accurate characterisation of the glycosylation status of a protein. In addition, it is not sufficient to identify that a site is glycosylated, as an Asn can be identified as ‘glycosylated’ in enrichment experiments, but may actually only be modified at a very low occupancy. The physiological relevance of glycosylation at such sites is therefore questionable. The converse of this is also true, as it appears that with sensitive analytical detection some or even most glycosylation sites are not completely occupied (there exists a small but significant proportion of proteins that are not glycosylated at that particular site) (Hülsmeier et al., 2007). Analytical methods should therefore consider the proportion of a particular Asn that is glycosylated, for instance using LC-MS approaches that can compare the abundance of glycosylated and non-glycosylated versions of the same peptide (Schulz & Aebi, 2009). Although these methods are not in general absolutely quantitative, they can provide relative quantification and a first step towards characterization of the site-specific extent of glycosylation.
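The relative quantification described above amounts to a simple ratio. The following is a minimal sketch, assuming that the summed LC-MS ion intensities of the glycosylated and non-glycosylated versions of the same peptide are already available; the function name and the intensity values are hypothetical. It also assumes that the two forms give comparable detector responses, which will not hold in general, so the result should be read as a relative rather than an absolute measure of occupancy.

```python
def site_occupancy(glycosylated_intensity, unglycosylated_intensity):
    """Estimate relative occupancy at one Asn as glycosylated / (glycosylated + unglycosylated).

    Assumes comparable detector response for both peptide forms (an illustrative
    simplification), so the value is relative rather than absolute.
    """
    if glycosylated_intensity < 0 or unglycosylated_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = glycosylated_intensity + unglycosylated_intensity
    if total == 0:
        raise ValueError("neither peptide form was detected")
    return glycosylated_intensity / total


# Hypothetical intensities for one site measured in two samples.
control = site_occupancy(9.0e6, 1.0e6)
perturbed = site_occupancy(2.0e6, 8.0e6)
print(f"control: {control:.0%}, perturbed: {perturbed:.0%}")
# control: 90%, perturbed: 20%
```

Comparing the same site across samples, as in the usage lines, is where such a ratio is most defensible, because any difference in response between the two peptide forms applies equally to both samples.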
5.2. Western blotting for measuring glycosylation occupancy
Numerous studies have made use of Western blotting with antibodies recognizing a specific protein of interest to gauge glycosylation occupancy. However, Western blotting is inherently limited to analysis of proteins for which specific antisera are available, and is constrained to low-throughput assays. Western blotting can also only identify protein-wide glycosylation occupancy, and cannot distinguish between partial glycosylation at different Asn residues on the same protein. Mass spectrometry can overcome both of these key difficulties, as it is a general analysis tool that can be used for site-specific analysis of protein glycosylation.
5.3. Glycoconjugate enrichment strategies
Detection of glycosylation at a specific site is the first step in its quantitative analysis (Schulz et al., 2012). Enrichment of glycoproteins or glycopeptides is key to the success of high sensitivity detection of glycosylation sites. Various enrichment strategies can be employed depending on the biological system of interest, and the analytes of interest within that system. The physical properties of carbohydrates that distinguish them from protein can be used to enrich glycopeptides and glycoproteins. Typical enrichment strategies based on the physical properties of glycans include hydrophilic interaction chromatography (Mysling et al., 2010, Gilar et al., 2011, Christiansen et al., 2010), phenyl boronic acid (Li et al., 2000, Li et al., 2001) and hydrazide (Zhang & Aebersold, 2006, Zhang et al., 2003) attachment. A key mechanism mediating the functional roles of glycans in many biological systems is recognition of specific glycan structures by proteins, or lectins. The specificity of such lectins for defined glycan structures can be used to enrich particular subsets of glycopeptides or glycoproteins bearing those structures (Drake et al., 2006, Zielinska et al., 2010).
5.4. Mass spectrometry for measuring glycosylation occupancy
To obtain quantitative or semi-quantitative measurement of the extent of glycosylation at a site, subsequent comparison must be made with the unglycosylated form of the detected peptide. This can be done by comparing the ion intensities of the glycosylated and unglycosylated peptides. The unglycosylated form of the peptide will only be present in one form. However, as glycosylation generally results in a complex mixture of glycan structures at each glycosylation site, measurement of the abundance of the glycosylated form of a given site is not trivial. Some approaches have used detection of entire glycopeptides, although this approach generally requires more specialized and targeted LC-MS technologies (Sumer-Bayraktar et al., 2011). Other approaches have focused on improving quantification of occupancy, and have discarded information on site-specific glycan structure by endoglycosidase treatment (Schulz & Aebi, 2009). For instance, PNGaseF cleaves N-glycans and converts previously glycosylated Asn to Asp, while EndoH leaves a single
5.5. Selected-reaction-monitoring mass spectrometry
Recent years have seen impressive success with targeted mass spectrometry approaches, using selected-reaction-monitoring (Lange et al., 2008, Gallien et al., 2011). N-glycosylation has been used as a useful tag to specifically enrich otherwise low abundant components of biological fluids (Stahl-Zeng et al., 2007). Often this has been performed not out of direct interest in glycosylation per se, but because of the ubiquity of glycosylation, and its proven utility in biomarker discovery. However, some analyses have used this approach to specifically measure glycosylation occupancy, for instance in patients with congenital disorders of glycosylation (Hülsmeier et al., 2007).
5.6. Future analytical directions
Use of tools such as those outlined above, in combination with experimental manipulation of growth conditions, N-glycan biosynthetic pathways, protein translation and translocation, and OTase function or composition, will allow identification of the regulation and roles of site-specific N-glycosylation occupancy at a systems level.
6. Is the ‘glycosylation sequon’ an example of convergent evolution? Insights into glycosylation site evolution
6.1. HMW-ABC glycosylation in non-typeable Haemophilus influenzae
A family of cytoplasmic bacterial enzymes have been recently described that catalyse an N-glycosylation reaction remarkably reminiscent of ‘traditional’ N-glycosylation. These enzymes are the HMW-C glycosyltransferase of non-typeable
A key step in NTHi infection is adherence to the host epithelium. Surface exposed adhesin proteins mediate this adherence, with the high molecular weight (HMW) adhesin system being of key importance in many NTHi clinical isolates. HMW-C is a glycosyltransferase associated with this two-partner secretion system adhesin, encoded in the HMW-ABC locus. Two highly homologous loci are present in the ~80% of NTHi clinical isolates that encode this system, HMW1ABC and HMW2ABC respectively (St. Geme et al., 1998, Ecevit et al., 2004). HMW1A encodes an adhesin glycoprotein (Gross et al., 2008, Grass et al., 2003), which is secreted across the inner membrane via the Sec apparatus, and requires the outer membrane protein HMW1B for correct export across the outer membrane (St Geme & Yeo, 2009). HMW1C encodes a family 41 glycosyltransferase that glycosylates HMW1A (Grass et al., 2010, Kawai et al., 2011, Choi et al., 2010). This glycosylation is required for stability, efficient folding and secretion of the HMW1A glycoprotein adhesin (Grass et al., 2003). In turn, the HMW1A adhesin is important for NTHi colonisation and pathogenesis (St Geme et al., 1993, St Geme, 1994).
Similar to several other described bacterial protein glycosyltransferases, HMW1C glycosylates its HMW1A substrate protein in the cytoplasm, before secretion across the inner membrane (Fleckenstein et al., 2006, Charbonneau et al., 2012, Choi et al., 2010, Schwarz et al., 2011a). Most of these other reported bacterial glycosyltransferases are O-glycosyltransferases, transferring nucleotide-activated monosaccharides to the hydroxyl groups of Ser or Thr. In contrast, HMW1C glycosylates Asn residues, with a strong tendency to glycosylate Asn within glycosylation sequons with the sequence Asn-Xaa-Ser/Thr (Gross et al., 2008).
6.2. HMW-C versus OTase: Unrelated enzymes, same sequon?
The HMW-C and OTase systems are not homologous, as traditional N-glycosylation as described above is catalysed by the integral membrane OTase, which transfers an oligosaccharide from a lipid linked carrier to nascent polypeptide in the lumen of the ER (or periplasm). In contrast, the HMW-C cytoplasmic system of some bacteria is catalysed by a soluble glycosyltransferase that transfers a nucleotide-activated monosaccharide to protein in the cytoplasm. However, it is striking that the bacterial cytoplasmic HMW-C enzymes have very similar site recognition to ‘traditional’ OTase enzymes: they efficiently glycosylate Asn in ‘sequons’ with Asn-Xaa-Ser/Thr (Xaa≠Pro), but are capable of glycosylating some selected asparagines lacking S/T at the +2 position (Choi et al., 2010, Grass et al., 2010, Schwarz et al., 2011a, Schwarz & Aebi, 2011). HMW-C enzymes also share the substrate requirement of OTase for unfolded protein, or flexible loops in folded protein (Schwarz et al., 2011a).
A high-resolution 3D crystal structure of an HMW-C enzyme from
This then raises the very curious observation that two non-homologous enzymatic systems for glycosylation of Asn have independently evolved essentially identical substrate recognition motifs. This suggests convergent evolution of enzyme-substrate interactions in these two systems, which would in turn imply that there is some functional benefit for site recognition of Ser/Thr at +2 amino acid residues of an Asparagine. It is tempting to speculate that this sequence may have evolved to balance the need for sufficient binding affinity of the polypeptide acceptor with the advantages of a general glycosylation system.
The selection pressure for OTase and HMW-C to require unfolded polypeptide substrate is not completely clear. However, this requirement is likely due to the benefit of glycosylation in increasing both protein folding efficiency and the stability of folded proteins. Addition of glycans to already folded proteins can serve to increase their stability, potentially in a regulated manner (Yuzwa et al., 2012). However, to be of assistance in protein
6.3. Why is the sequon as it is?
Why then should Ser/Thr be part of a preferred glycosylation recognition sequence, and not any other amino acids? Perhaps part of the answer is that these hydroxyl-containing residues are typically surface exposed and are not charged. The hydrophilic nature of Ser and Thr means that they are generally not present internally in folded proteins, but are almost always surface exposed. As addition of a glycan in the hydrophobic core of a protein would be incompatible with correct protein folding, a hydrophilic recognition motif is necessary. Charged residues (His, Arg, Lys, Asp, Glu) would also be potential candidates for such a role, but here the generality of the neutral hydroxyl groups of Ser and Thr is perhaps important. Neutral hydrophilic residues such as Ser and Thr are compatible with almost any position on the surface of a folded protein. In contrast, charge-based attraction and repulsion is an important contributor to protein folding, stability and function. Point mutation to insert one of these charged amino acids on the surface of a protein is likely to disrupt the protein structure. Ser/Thr as an extended recognition sequence therefore likely provides the affinity and ubiquity necessary for evolution of OTase/HMW-C enzymes as general glycosylation enzymes capable of glycosylating multiple Asn residues in many different proteins.
7. Conclusion
The structural basis for the glycosylation sequon is now apparent. However, it is also clear that recognition and glycosylation of selected asparagine residues is subject to further control and regulation depending on variation within catalytic STT3 enzymes, and on the presence of accessory protein subunits of multiprotein OTase complexes. In order to understand the roles of these accessory proteins, it is however necessary for them to be completely identified. With recent years showing the identification and preliminary characterization of several novel accessory proteins of mammalian OTase, it is probable that additional subunits still remain to be discovered. Biochemical characterization of OTase complexes in other eukaryotes may well also present additional, non-homologous, accessory protein subunits. Further, OTase enzymatic activity is actively regulated, adding to the complexity of potential OTase function. Mass spectrometry-based future analytics for glycosylation analysis will enable phenotypic characterization of the site-specific activity of OTase in these varied biological circumstances. Such analysis will contribute to, and also benefit from, a complete quantitative understanding of the interplay between glycoprotein folding and N-glycosylation. Finally, understanding of the molecular mechanisms of N-glycosylation site selection is beginning to open the possibilities for co-engineering of glycosylation sites and OTases in synthetic biology approaches outside of natural evolutionary constraints, moving N-glycosylation beyond the sequon.
Acknowledgement
The author acknowledges the support of NHMRC Career Development Fellowship APP1031542 and NHMRC Project Grant 631615.
References
1. Aebi M., Bernasconi R., Clerc S., Molinari M. (2010) N-glycan structures: recognition and processing in the ER. 35: 74-82.
2. Bano-Polo M., Baldin F., Tamborero S., Marti-Renom M. A., Mingarro I. (2011) N-glycosylation efficiency is determined by the distance to the C-terminus and the amino acid preceding an Asn-Ser-Thr sequon. 20: 179-186.
3. Burda P., Aebi M. (1999) The dolichol pathway of N-linked glycosylation. 1426: 239-257.
4. Charbonneau M. E., Cote J. P., Haurat M. F., Reiz B., Crepin S., Berthiaume F., Dozois C. M., Feldman M. F., Mourez M. (2012) A structural motif is the recognition site for a new family of bacterial protein O-glycosyltransferases.
5. Chi Y. H., Koo Y. D., Dai S. Y., Ahn J. E., Yun D. J., Lee S. Y., Zhu-Salzman K. (2010) N-glycosylation at non-canonical Asn-X-Cys sequence of an insect recombinant cathepsin B-like counter-defense protein. 156: 40-47.
6. Choi K. J., Grass S., Paek S., St Geme J. W. 3rd, Yeo H. J. (2010) The Actinobacillus pleuropneumoniae HMW1C-like glycosyltransferase mediates N-linked glycosylation of the Haemophilus influenzae HMW1 adhesin. 5: e15888.
7. Christiansen M. N., Kolarich D., Nevalainen H., Packer N. H., Jensen P. H. (2010) Challenges of determining O-glycopeptide heterogeneity: a fungal glucanase model system. 82: 3500-3509.
8. Dempski R. E. J., Imperiali B. (2002) Oligosaccharyl transferase: gatekeeper to the secretory pathway. 6: 844-850.
9. Drake R. R., Schwegler E. E., Malik G., Diaz J., Block T., Mehta A., Semmes O. J. (2006) Lectin Capture Strategies Combined with Mass Spectrometry for the Discovery of Serum Glycoprotein Biomarkers. 5: 1957-1967.
10. Ecevit I. Z., McCrea K. W., Pettigrew M. M., Sen A., Marrs C. F., Gilsdorf J. R. (2004) Prevalence of the hifBC, hmw1A, hmw2A, hmwC, and hia Genes in Haemophilus influenzae Isolates. 42: 3065-3072.
11. Fetrow J. S., Siew N., Di Gennaro J. A., Martinez-Yamout M., Dyson H. J., Skolnick J. (2001) Genomic-scale comparison of sequence- and structure-based methods of function prediction: does structure provide additional insight? 10: 1005-1014.
12. Fleckenstein J. M., Roy K., Fischer J. F., Burkitt M. (2006) Identification of a Two-Partner Secretion Locus of Enterotoxigenic Escherichia coli. 74: 2245-2258.
13. Gallien S., Duriez E., Domon B. (2011) Selected reaction monitoring applied to proteomics. 46: 298-312.
14. Gilar M., Yu Y. Q., Ahn J., Xie H., Han H., Ying W., Qian X. (2011) Characterization of glycoprotein digests with hydrophilic interaction chromatography and mass spectrometry. 417: 80-88.
15.
Grass S. Buscher A. Z. Swords W. E. Apicella M. A. Barenkamp S. J. Ozchlewski N. St J. W. Geme 3 (2003) The Haemophilus influenzae HMW1 adhesin is glycosylated in a process that requires HMW1C and phosphoglucomutase, an enzyme involved in lipooligosaccharide biosynthesis.48 737 751 . - 16.
Grass S. Lichti C. F. Townsend R. R. Gross J. St J. W. Geme 3 (2010) The Haemophilus influenzae HMW1C protein Is a glycosyltransferase that transfers hexose residues to asparagine sites in the HMW1 adhesin 6: e1000919. - 17.
Gross J. Grass S. Davis A. E. Gilmore-Erdmann P. Townsend R. R. St J. W. Geme 3 (2008) The Haemophilus influenzae HMW1 Adhesin Is a Glycoprotein with an Unusual N-Linked Carbohydrate Modification.283 26010 26015 . - 18.
Harada Y. Li H. Li H. Lennarz W. J. 2009 Oligosaccharyltransferase directly binds to ribosome at a location near the translocon-binding site.106 6945 6949 . - 19.
Helenius A. Aebi M. 2004 Roles of N-linked glycans in the endoplasmic reticulum.73 1019 1049 . - 20.
Hülsmeier A. J. Paesold-Burda P. Hennet T. 2007 N-glycosylation site occupancy in serum glycoproteins using multiple reaction monitoring liquid chromatography mass spectrometry.6 2132 2138 . - 21.
Imperiali B. Hendrickson T. L. 1995 Asparagine-linked glycosylation: specificity and function of oligosaccharyl transferase.3 1565 1578 . - 22.
Izquiedro L. Mehlert A. Ferguson M. A. 2012 The lipid-linked oligosaccharide donor specificities of Trypanosoma brucei oligosaccharyltransferases.22 696 703 . - 23.
Izquiedro L. Schulz B. L. Rodrigues J. A. Güther M. L. Procter J. B. Barton G. J. Aebi M. Ferguson M. A. 2009 Distinct donor and acceptor specificities of Trypanosoma brucei oligosaccharyltransferases.28 2650 2661 . - 24.
(2011) Polypeptide binding specificities of Saccharomyces cerevisiae oligosaccharyltransferase accessory proteins Ost3p and Ost6p. Protein Sci 20: 849-855.Jamaluddin M. F. B. Bailey U. M. Tan N. Y. J. Stark A. P. Schulz B. L. - 25.
Karaoglu D. Kelleher D. J. Gilmore R. 1995 Functional characterization of Ost3p. Loss of the 34-kD subunit of the Saccharomyces cerevisiae oligosaccharyltransferase results in biased underglycosylation of acceptor substrates.130 567 577 . - 26.
Karaoglu D. Kelleher D. J. Gilmore R. 2001 Allosteric regulation provides a molecular mechanism for preferential utilization of the fully assembled dolichol-linked oligosaccharide by the yeast oligosaccharyltransferase.40 12193 12206 . - 27.
Kasturi L. Chen H. Shakin-Eshleman S. H. 1997 Regulation of N-linked core glycosylation: use of a site-directed mutagenesis approach to identify Asn-Xaa-Ser/Thr sequons that are poor oligosaccharide acceptors.323 415 419 . - 28.
Kasturi L. Eshleman J. R. Wunner W. H. Shakin-Eshleman S. H. 1995 The Hydroxy Amino Acid in an Asn-X-Ser/Thr Sequon Can Influence N-Linked Core Glycosylation Efficiency and the Level of Expression of a Cell Surface Glycoprotein.270 14756 14761 . - 29.
Kawai F. Grass S. Kim Y. Choi K. J. St J. W. Geme 3 & H. J. Yeo, (2011) Structural insights into the glycosyltransferase activity of the Actinobacillus pleuropneumoniae HMW1C-like protein.286 38546 38557 . - 30.
Kelleher D. J. Gilmore R. 2006 An evolving view of the eukaryotic oligosaccharyltransferase. 16: 47R-62R. - 31.
Kelleher D. J. Karaoglu D. Mandon E. C. Gilmore R. 2003 Oligosaccharyltransferase isoforms that contain different catalytic STT3 subunits have distinct enzymatic properties.12 101 111 . - 32.
Knauer R. Lehle L. 1999 The oligosaccharyltransferase complex from Saccharomyces cerevisiae. Isolation of the OST6 gene, its synthetic interaction with OST3, and analysis of the native complex.274 17249 17256 . - 33.
Kowarik M. Numao S. Feldman M. F. Schulz B. L. Callewaert N. Kiermaier E. Catrein I. Aebi M. 2006a N-linked glycosylation of folded proteins by the bacterial oligosaccharyltransferase.314 1148 1150 . - 34.
Kowarik M. Young N. M. Numao S. Schulz B. L. Hug I. Callewaert N. Mills D. C. Watson D. C. Hernandez M. Kelly J. F. Wacker M. Aebi M. 2006b Definition of the bacterial N-glycosylation site consensus sequence.25 1957 1966 . - 35.
Lange V. Picotti P. Domon B. Aebersold R. 2008 Selected reaction monitoring for quantitative proteomics: a tutorial. 4: 222. - 36.
Lee J. H. Yu W. H. Kumar A. Lee S. Mohan P. S. Peterhoff C. M. Wolfe D. M. Martinez-Vicente M. Massey A. C. Sovak G. Uchiyama Y. Westaway D. Cuervo A. M. Nixon R. A. 2010 Lysosomal Proteolysis and Autophagy Require Presenilin 1 and Are Disrupted by Alzheimer-Related PS1 Mutations.141 1146 1158 . - 37.
Li Y. Larsson E. L. Jungvid H. Galaev I. Y. Mattiasson B. 2000 Affinity chromatography of neoglycoproteins.9 315 323 . - 38.
Li Y. Pfüller U. Larsson E. L. Jungvid H. Galaev I. Y. Mattiasson B. 2001 Separation of mistletoe lectins based on the degree of glycosylation using boronate affinity chromatography.925 115 121 . - 39.
Lizak C. Fan Y. Y. Weber T. C. Aebi M. 2011a N-Linked Glycosylation of Antibody Fragments in Escherichia coli. . - 40.
Lizak C. Gerber S. Numao S. Aebi M. Locher K. P. 2011b X-ray structure of a bacterial oligosaccharyltransferase.474 350 355 . - 41.
Marshall R. D. 1974 The nature and metabolism of the carbohydrate-peptide linkages of glycoproteins.40 17 26 . - 42.
Miletich J. P. Broze G. J. J. 1990 Beta protein C is not glycosylated at asparagine 329. The rate of translation may influence the frequency of usage at asparagine-X-cysteine sites.265 11397 11404 . - 43.
Mohorko E. Glockshuber R. Aebi M. 2011 Oligosaccharyltransferase: the central enzyme of N-linked protein glycosylation.34 869 878 . - 44.
Murphy T. F. 2006 The role of bacteria in airway inflammation in exacerbations of chronic obstructive pulmonary disease.19 225 230 . - 45.
Murphy T. F. Bakaletz L. O. Smeesters P. R. 2009 Microbial interactions in the respiratory tract. 28: S121 126 . - 46.
Mysling S. Palmisano G. Højrup P. Thaysen-Andersen M. 2010 Utilizing ion-pairing hydrophilic interaction chromatography solid phase extraction for efficient glycopeptide enrichment in glycoproteomics.82 5598 5609 . - 47.
Nasab F. P. Schulz B. L. Gamarro F. Parodi A. J. Aebi M. 2008 All in One: Leishmania major STT3 proteins substitute for the whole oligosaccharyltransferase complex in Saccharomyces cerevisiae. Mol Biol Cell19 3758 3768 - 48.
Nita-Lazar A. Wacker M. Schegg B. Amber S. Aebi M. 2005 The N-X-S/T consensus sequence is required but not sufficient for bacterial N-linked protein glycosylation.15 361 367 . - 49.
Oliver J. D. Roderick H. L. Llewellyn D. H. High S. 1999 ERp57 functions as a subunit of specific complexes formed with the ER lectins calreticulin and calnexin.10 2573 2582 . - 50.
Palmisano G. Melo-Braga M. N. Engholm-Keller K. Parker B. L. Larsen M. R. 2012 Chemical deamidation: a common pitfall in large-scale N-linked glycoproteomic mass spectrometry-based analyses.11 1949 1957 . - 51.
Petrescu A. J. Milac A. L. Petrescu S. M. Dwek R. A. Wormald M. R. 2004 Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding.14 103 114 . - 52.
Rao R. S. Buus O. T. Wollenweber B. 2011 Distribution of N-glycosylation sequons in proteins: how apart are they?35 57 61 . - 53.
Robinson N. E. Robinson Z. W. Robinson B. R. Robinson A. L. Robinson J. A. Robinson M. L. Robinson A. B. 2004 Structure-dependent nonenzymatic deamidation of glutaminyl and asparaginyl pentapeptides.63 426 436 . - 54.
Roboti P. High S. 2012 Keratinocyte-associated protein 2 is a bona fide subunit of the mammalian oligosaccharyltransferase.125 220 232 . - 55.
Ruiz-Canada C. Kelleher D. J. Gilmore R. 2009 Cotranslational and Posttranslational N-Glycosylation of Polypeptides by Distinct Mammalian OST Isoforms.136 272 283 . - 56.
Sanyal S. Menon A. K. 2010 Stereoselective transbilayer translocation of mannosyl phosphoryl dolichol by an endoplasmic reticulum flippase . - 57.
Sato C. Kim J. H. Abe Y. Saito K. Yokoyama S. Kohda D. 2000 Characterization of the N-oligosaccharides attached to the atypical Asn-X-Cys sequence of recombinant human epidermal growth factor receptor.127 65 72 . - 58.
Schulz B. L. Aebi M. 2009 Analysis of Glycosylation Site Occupancy Reveals a Role for Ost3p and Ost6p in Site-specific N-Glycosylation Efficiency.8 357 364 . - 59.
Schulz B. L. Stirnimann C. U. Grimshaw J. P. A. Brozzo M. S. Fritsch F. Mohorko E. Capitani G. Glockshuber R. Grütter M. G. Aebi M. 2009 Oxidoreductase activity of oligosaccharyltransferase subunits Ost3p and Ost6p defines site-specific glycosylation efficiency.106 11061 11066 . - 60.
Schulz B. L. White J. C. Punyadeera C. 2012 Saliva proteome research: current status and future outlook. . - 61.
Schwarz F. Aebi M. 2011 Mechanisms and principles of N-linked protein glycosylation.21 576 582 . - 62.
Schwarz F. Fan Y. Y. Schubert M. Aebi M. 2011a Cytoplasmic N-glycosyltransferase of Actinobacillus pleuropneumoniae is an inverting enzyme and recognizes the NX(S/T) consensus sequence.286 35267 35274 . - 63.
Schwarz F. Lizak C. Fan Y. Y. Fleurkens S. Kowarik M. Aebi M. 2011b Relaxed acceptor site specificity of bacterial oligosaccharyltransferase in vivo.21 45 54 . - 64.
Schwarz M. Knauer M. Lehle L. 2005 Yeast oligosaccharyltransferase consists of two functionally distinct sub-complexes, specified by either the Ost3p or Ost6p subunit.579 6564 6568 . - 65.
Shakin-Eshleman S. H. Spitalnik S. L. Kasturi L. 1996 The amino acid at the X position of an Asn-X-Ser sequon is an important determinant of N-linked core-glycosylation efficiency.271 6363 6366 . - 66.
Shibatani T. David L. L. Mc Cormack A. L. Frueh K. Skach W. R. 2005 Proteomic analysis of mammalian oligosaccharyltransferase reveals multiple subcomplexes that contain Sec61, TRAP, and two potential new subunits.44 5982 5992 . - 67.
Spirig U. Bodmer D. Wacker M. Burda P. Aebi M. 2005 The 3.4-kDa Ost4 protein is required for the assembly of two distinct oligosaccharyltransferase complexes in yeast.15 1396 1406 . - 68.
St Geme. J. 3 3rd, (1994) The HMW1 adhesin of nontypeable Haemophilus influenzae recognizes sialylated glycoprotein receptors on cultured human epitheial cells.62 3881 3889 . - 69.
St Geme. J. 3 3rd, S. Falkow & S. J. Barenkamp, (1993) High-molecular-weight proteins of nontypeable Haemophilus influenzae mediate attachment to human epithelial cels.90 2875 2879 . - 70.
St Geme. J. 3 3rd & H.-J. Yeo, (2009) A prototype two-partner secretion pathway: the Haemophilus influenzae HMW1 and HMW2 adhesin systems.17 355 360 . - 71.
St Geme. J. W. Kumar I. I. I. V. V. Cutter D. Barenkamp S. J. 1998 Prevalence and Distribution of the hmw and hia Genes and the HMW and Hia Adhesins among Genetically Diverse Strains of Nontypeable Haemophilus influenzae.66 364 368 . - 72.
Stahl-Zeng J. Lange V. Ossola R. Eckhardt K. Krek W. Aebersold R. Domon B. 2007 High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites.6 1809 1817 . - 73.
Sumer-Bayraktar Z. Kolarich D. Campbell M. P. Ail S. Packer N. H. Thaysen-Andersen M. 2011 N-glycans modulate the function of human corticosteroid-binding globulin. 10: M111.009100. - 74.
Szymanski C. M. Yao R. Ewing C. P. Trust T. J. Guerry P. 1999 Evidence for a system of general protein glycosylation in Campylobacter jejuni.32 1022 1030 . - 75.
Vance B. A. Wu W. Ribaudo R. K. Segal D. M. Kearse K. P. 1997 Multiple dimeric forms of human CD69 result from differential addition of N-glycans to typical (Asn-X-Ser/Thr) and atypical (Asn-X-cys) glycosylation motifs.272 23117 23122 . - 76.
Wacker M. Linton D. Hitchen P. G. Nita-Lazar M. Haslam S. M. North S. J. Panico M. Morris H. R. Dell A. Wren B. W. Aebi M. 2002 N-Linked Glycosylation in Campylobacter jejuni and Its Functional Transfer into E. coli.298 1790 1793 . - 77.
Whitley P. Nilsson I. von G. Heijne 1996 A Nascent Secretory Protein May Traverse the Ribosome/Endoplasmic Reticulum Translocase Complex as an Extended Chain.271 6241 6244 . - 78.
Wilson C. M. High S. 2007 Ribophorin I acts as a substrate-specific facilitator of N-glycosylation.120 648 657 . - 79.
Wilson C. M. Kraft C. Duggan C. Ismail N. Crawshaw S. G. High S. 2005 Ribophorin I associates with a subset of membrane proteins after their integration at the sec61 translocon.280 4195 4206 . - 80.
Wilson C. M. Roebuck Q. High S. 2008 Ribophorin I regulates substrate delivery to the oligosaccharyltransferase core.105 9534 9539 . - 81.
Wormald M. R. Dwek R. A. 1999 Glycoproteins: glycan presentation and protein-fold stability. 7: R155 160 . - 82.
Yan A. Lennarz W. J. 2005 Two oligosaccharyl transferase complexes exist in yeast and associate with two different translocons.15 1407 1415 . - 83.
Yuzwa S. A. Shan X. Macauley M. S. Clark T. Skorobogatko Y. Vosseller K. Vocadlo D. J. 2012 Increasing O-GlcNAc slows neurodegeneration and stabilizes tau against aggregation.8 393 399 . - 84.
Zhang H. Aebersold R. 2006 Isolation of glycoproteins and identification of their N-linked glycosylation sites.328 177 185 . - 85.
Zhang H. Li X. J. Martin D. B. Aebersold R. 2003 Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry.21 660 666 . - 86.
Zielinska D. F. Gnad F. Wisniewski J. R. Mann M. 2010 Precision mapping of an in vivo N-glycoproteome reveals rigid topological and sequence constraints.141 897 907 .