Applications of Bioinformatics and Experimental Methods to Intrinsic Disorder-Based Protein-Protein Interactions

Proteins are essential to organisms and control the biochemical pathways within the cell. Each protein or group of proteins is responsible for functions needed to maintain the living cell, including enzymatic catalysis; transport or storage of chemical compounds and energy; hormonal regulation of many processes; maintenance of tissue and cell structure; antibody-mediated immune response; and signal transduction by receptors and signalling proteins. Protein function has long been related to rigid three-dimensional (3D) structure, in that a protein's biological function was thought to depend on its prior folding into a unique 3D structure. However, as discovered in recent years, not all biologically functional proteins fold spontaneously into globular structures. Some protein regions or entire proteins lack stable secondary and/or tertiary structure in solution yet possess crucial biological functions, and these disordered regions or proteins are key to understanding many biological processes such as transcriptional regulation, signalling and the causes of diseases. In this chapter, we focus on some basic concepts of intrinsically disordered proteins (IDPs), recent progress in structural and functional studies of IDPs, and the bioinformatics and experimental methods used in practice to investigate IDPs.


Introduction
Proteins are essential to organisms and control the biochemical pathways within the cell. Each protein or group of proteins is responsible for functions needed to maintain the living cell, including enzymatic catalysis; transport or storage of chemical compounds and energy; hormonal regulation of many processes; maintenance of tissue and cell structure; antibody-mediated immune response; and signal transduction by receptors and signalling proteins. Protein function has long been related to rigid three-dimensional (3D) structure, in that a protein's biological function was thought to depend on its prior folding into a unique 3D structure. However, as discovered in recent years, not all biologically functional proteins fold spontaneously into globular structures. Some protein regions or entire proteins lack stable secondary and/or tertiary structure in solution yet possess crucial biological functions, and these disordered regions or proteins are key to understanding many biological processes such as transcriptional regulation, signalling and the causes of diseases. In this chapter, we focus on some basic concepts of intrinsically disordered proteins (IDPs), recent progress in structural and functional studies of IDPs, and the bioinformatics and experimental methods used in practice to investigate IDPs.

Intrinsically disordered proteins and their structural and functional studies

Concept of intrinsically disordered proteins (IDPs)
For over a hundred years, the structure-function relationships of proteins have been one of the central topics of protein science. In 1894, by observing the specificity of the enzymatic hydrolysis of glucosides, Fischer put forward the lock-and-key theory of protein functionality (Fischer, 1894), according to which protein functions are determined by their specific 3D structures. This lock-and-key model for protein binding was further reinforced by many experiments on the dependence of protein function on 3D structure (Mirsky and Pauling, 1936; Wu, 1931) and was widely accepted as underlying almost all subsequent work and thinking (Phillips, 1986). The flood of protein 3D structures determined by modern structural biology through X-ray diffraction and nuclear magnetic resonance (NMR) spectroscopy has also highlighted a specific 3D structure as the necessary prerequisite for protein function (Berman et al., 2000). For proteins with rigid structures, the amino acid sequence determines the protein's unique 3D structure, and the sequence → structure → function paradigm has become paramount.
However, the generality of the sequence-structure-function paradigm was challenged by the observation (Karush, 1950) that the binding site of bovine serum albumin adopts a large number of configurations with similar energy levels. Upon interaction with a substrate, the best-fitting configuration was selected from the structural ensemble of bovine serum albumin, a property called configurational adaptability. Koshland further developed this configurational adaptability into the induced-fit process (Koshland, 1958), indicating that conformational changes of a protein are responsible for its function. At the time induced fit was proposed, it was unclear whether binding induced a new conformation or selected the best-fit alternative from an ensemble of structures in equilibrium. The induced-fit theory was further supported by the fact that protein domains can move upon substrate binding, which was first observed in the binding of glucose to yeast hexokinase and subsequently in numerous protein crystal structures (Bennett and Steitz, 1978; McDonald et al., 1979). One example of the latter comes from crystal structures, with and without substrates, of folylpolyglutamate synthetase (FPGS), an enzyme crucial for retaining folic acids for normal cell growth (Sheng et al., 2000; Sun et al., 1998). ATP-bound FPGS is activated by binding of folate as the second substrate, which triggers a large closing movement of the FPGS domains; this enables the enzyme to adopt a form that binds the third substrate, L-glutamate, and to add a polyglutamate tail to the folate (Sun et al., 2001). Thus, the induced-fit hypothesis covers not only structural accommodations within the binding pocket for binding a diverse set of related but structurally distinct molecules, but also large domain movements upon ligand binding, giving a deeper understanding of the structure-function relationship of proteins (Koshland, 1994). The induced-fit mechanisms of protein binding thus involve protein conformational changes, either small rearrangements of a few interacting groups or larger domain-domain rearrangements between alternative 3D structures in multi-domain proteins.
In contrast to the views described above, that function depends strictly on a prior 3D structure, on structural accommodations within a prior 3D structure, or on domain movements between different structures, it has been discovered that unfolded regions or entire unfolded proteins play crucial roles in protein function (Bychkova et al., 1996; Daughdrill et al., 1997; Riek et al., 1996; Uversky et al., 1996; Uversky et al., 1997). This indicates the diversity of protein folding and function, and raises intriguing questions about the role of protein disorder in biological processes. Disorder in the binding protein and/or its partner prevents the presentation of rigid 3D structures that can be bound by other rigid complementary structures or that shift between distinct rigid states. This rules out both the lock-and-key and the induced-fit mechanisms of binding. The predominant sequence → 3D structure → function paradigm is therefore not sufficient for many unfolded functional proteins, suggesting that a more comprehensive model is needed. In the late 1990s, studies of functional unfolded proteins emerged as a new research field of protein structure-function relationships. Several research groups simultaneously and independently concluded, after systematic research, that naturally disordered proteins, characterized by the lack of a well-defined 3D structure under physiological conditions and existing as highly dynamic ensembles of inter-converting structures, are not just rare exceptions but represent a new and very broad class of proteins with vital biological functions (Dunker et al., 2001; Dunker et al., 2000; Tompa, 2003; Uversky, 2002a; Uversky et al., 2000; Wright and Dyson, 1999). This important conclusion was reached from different starting points using different experimental and theoretical approaches, including bioinformatics, NMR spectroscopy, multi-parametric protein folding and misfolding studies, and protein structural characterization. Since then, a new protein structure-function paradigm has been established to include the novel functions of disordered proteins. The discovery and characterization of functional unfolded proteins is one of the fastest growing areas of protein science, and the literature on these proteins has increased continually, with especially rapid growth during the past decade (Dunker et al., 2007; Uversky, 2010). These functional unfolded proteins have been known by different names, including intrinsically disordered, natively denatured, natively unfolded, intrinsically unstructured, natively disordered and inherently disordered, and are now widely known as intrinsically disordered proteins (IDPs). Proteins that either form crystals without partners or possess ordered globular forms without partners in NMR experiments will be termed here "structured", "natively folded", or just "ordered".

The protein quartet model
Historically, the protein structure-function paradigm emphasized a rigid 3D structure as a necessary prerequisite for protein function. We now know from the accumulated experimental data on intrinsic disorder that a functional protein or protein region can exist as a structural ensemble, at either the secondary or the tertiary level. Both unfolded regions with few elements of secondary structure (random coils) and collapsed tertiary structures with poorly packed side chains (molten globule-like) fall within the range of intrinsic disorder. These ideas were presented as the Protein Trinity Paradigm (Dunker et al., 2001; Dunker and Obradovic, 2001). The Protein Trinity Paradigm relates protein function to the three thermodynamic states of a protein. In other words, intracellular functional proteins or regions of such proteins can exist in any one of three thermodynamic states, namely ordered forms, molten globules, and random coils. A particular function is thus proposed to arise from any one of these states or from a transition between any two of them. According to this view, the native state of a protein is not just the ordered state, but any of the three states. The molten globule was initially discovered as an equilibrium structure in studies of protein denaturation, in which partially unfolded intermediates between the ordered state and the random coil were observed as the major species in urea, guanidine and pH titration studies. In these experiments, the protein converted from an ordered native state into a form with some liquid-like characteristics, with the side chains changing from rigid to non-rigid packing, while its secondary structure remained almost unchanged and its shape remained compact (Ohgushi and Wada, 1983; Ptitsyn and Uversky, 1994). The molten globule has been proposed to be responsible for biological functions. For example, the molten globular state was reported to be involved in the translocation of proteins across membranes (Bychkova et al., 1988) and in the transfer of retinal from its bloodstream carrier to its cell-surface receptor (Bychkova et al., 1992; Bychkova et al., 1998).
By summarizing a large number of experimental results on the conformational behaviour of IDPs, one of us further revealed that extended disordered regions or entire proteins do not possess the uniform structural properties of random coils. They were split into two structurally different subclasses, intrinsic random coil-like and intrinsic pre-molten globule-like conformations (Uversky, 2002a). Proteins in the pre-molten globule state are more compact than random coils and exhibit some residual secondary structure, but they are still substantially less dense than molten globules and ordered proteins. It was also noted that the molten globule and the pre-molten globule (as folding intermediates of globular proteins) might represent different phase states of a protein, as they are separated by a first-order phase transition (Uversky, 1997; Uversky and Ptitsyn, 1994, 1996). These observations introduced the native pre-molten globule state of functional unfolded proteins as a new player in the field of protein function. As ordered, molten globule, pre-molten globule, and random coil conformations possess clearly defined structural differences, the Protein Trinity Paradigm has been extended to the Protein Quartet Model, in which protein functions arise from four specific conformations (native ordered, molten globule, pre-molten globule and random coil) and from transitions between any two of these states (Fig. 1). All four of these structurally defined native states can be characterized using various experimental approaches. For example, pre-molten globules with some residual secondary structure show far-UV CD spectra typical of a disordered polypeptide chain, with a pronounced minimum in the vicinity of 200 nm. See the following sections for more information.
Fig. 1. The protein quartet model

Functional features of IDPs
Given that intrinsic disorder represents an important structural class of proteins, and in order to meet the increasing interest in systematizing the crucial functions of IDPs, a database of disordered proteins (DisProt) has been created (Vucetic et al., 2005). DisProt provides structural and functional information on proteins or regions that lack a rigid 3D structure under putatively native conditions. Each disordered protein included in the database has been verified by X-ray diffraction, NMR, CD spectra or several other biophysical techniques, and is listed with its name, various aliases, accession code, amino acid sequence, location of the disordered region(s) and the methods used for structural (disorder) characterization. Most entries list the biological function(s) of each disordered region or protein, if applicable. To date, there are 643 IDPs and 1375 intrinsically disordered regions (IDRs) listed in DisProt.
Among the rapidly increasing number of publications on IDPs, bioinformatics studies have predicted that about 25-30% of eukaryotic proteins are mostly disordered (Oldfield et al., 2005a); that more than 50% of eukaryotic proteins have long regions of disorder (Dunker et al., 2000; Oldfield et al., 2005a); that more than 70% of signalling proteins and the vast majority of cancer-associated proteins have long disordered regions (Iakoucheva et al., 2002); and that 82-94% of transcription factors from three transcription factor datasets possess extended disordered regions (Liu et al., 2006). IDPs are now widely accepted to exist in all kingdoms of life (Dunker et al., 2000; Ward et al., 2004). Since IDPs and IDRs have remarkable conformational variability and a variety of functions, the terms "unfoldome" and "unfoldomics" have recently been introduced (Cortese et al., 2005; Dunker et al., 2007; Midic et al., 2009). The unfoldome is the large set of functional IDPs and disordered regions within the proteome, while unfoldomics deals with both the identification of the proteins or regions in the unfoldome of a given organism and their functions, structures, interactions and evolution (Uversky et al., 2009).
Literature searches and comprehensive surveys of the functions of experimentally characterized IDPs suggest that IDPs or IDRs fall into broad functional classes including: (i) entropic chains, whose activities stem directly from disorder; (ii) molecular recognition via binding to other proteins or to nucleic acids; (iii) scavengers, which store and/or neutralize small ligands; (iv) molecular assembly, in which they assemble, stabilize and regulate large multi-protein complexes; and (v) sites of various protein modifications (acetylation, hydroxylation, ubiquitination, methylation, phosphorylation) and proteolysis (Dunker et al., 2002a; Dunker et al., 2002b; Dyson and Wright, 2005b; Tompa, 2002; Uversky, 2010). Illustrative biological functions of IDPs collected in the literature include cell division regulation, transcriptional and translational regulation, molecular chaperoning and cell signalling (Dunker et al., 2005; Dunker et al., 2001; Radivojac et al., 2007; Tompa, 2005; Wright and Dyson, 1999). The structural flexibility and plasticity that originate from the lack of a rigid 3D structure probably represent a major functional advantage for IDPs, as reflected in the fact that IDPs or IDRs can interact with a broad range of binding partners including proteins, membranes, nucleic acids, and small molecules (Oldfield et al., 2008; Tompa and Csermely, 2004). Specifically, a majority of the IDPs or IDRs characterized are involved in regulation, cell signalling, and control pathways via interactions with multiple partners using a high-specificity/low-affinity strategy, and such disordered regions often become folded upon binding to their partners, confirming that a prior 3D structure is not required for molecular recognition (Dunker et al., 2005; Dyson and Wright, 2002; Haynes et al., 2006; Uversky et al., 2005). The crucial role of IDPs in cell signalling is further supported by the finding that eukaryotic proteomes, which have developed extensive interaction networks, possess a higher frequency of IDPs than those of bacteria and archaea (Dunker et al., 2000; Ward et al., 2004). The close correlation between IDPs and cell signalling implies that IDPs play a critical role in protein interaction networks (see the following section for details).
Although functional studies of experimentally characterized IDPs have made significant progress in revealing the functional diversity of IDPs, they are bound to provide only an incomplete view because of the limited number of known IDPs. Taking advantage of bioinformatics methods, statistical approaches utilizing a novel data mining tool have been used for a comprehensive study of the functional roles of IDPs or IDRs in the Swiss-Prot database, which contains over 200,000 proteins. In addition, at least one illustrative and experimentally validated example of functional disorder or order was found for the vast majority of functional keywords, supporting the bioinformatics analyses. These studies represent a functional anthology of IDPs or IDRs and provide researchers with a novel theoretical tool that could be used to strengthen the understanding of the functional diversity of IDPs and of protein structure-function relationships (Vucetic et al., 2007; Xie et al., 2007a; Xie et al., 2007b). These studies showed that many protein functions are associated with long disordered regions: 262 of 710 Swiss-Prot functional keywords were found to be strongly positively correlated with long IDRs, whereas 302 were strongly negatively correlated with such regions. The Swiss-Prot functional keywords used in the analyses are associated with various biological processes, cellular components, domains, technical terms, developmental processes, coding sequence diversities, ligands, molecular functions, post-translational modifications, tissues and diseases. When all of the functional keywords were classified into eleven functional categories, disorder-associated keywords were found for all eleven categories, while order-associated keywords were found for only seven of the eleven (Vucetic et al., 2007; Xie et al., 2007b) (Fig. 2). Among them, the coding sequence diversity, developmental process, disease and tissue keywords were exclusively strongly correlated with IDPs or IDRs. The functional diversity provided by disordered regions therefore complements the functions of proteins with ordered structures, implying that IDPs and IDRs are characterized by a wide functional repertoire.

Protein-protein interactions involving IDPs
One of the most important functions of IDPs and IDRs is molecular recognition in regulatory and cell-signalling processes via protein-protein interactions (Dunker and Obradovic, 2001; Wright and Dyson, 1999). Protein-protein interactions are organized into complex networks that are central to many processes in the physiology and function of cells. Studies of protein interaction maps have proposed that most proteins interact with just a few partners and that a small number of proteins interact with many partners. These few proteins, called hubs, are the highly connected nodes of protein-protein interaction networks (Rual et al., 2005; Stelzl et al., 2005). Hubs can interact with multiple partners to connect various biological molecules in the network either simultaneously (party hubs) or at different times and locations (date hubs). It has been suggested that date hubs organize the proteome, connecting biological processes to each other, whereas party hubs act inside functional modules, forming scaffolds for various molecular machines or coordinated processes (Han et al., 2004). With their ability to interact with multiple partners, hubs play a central role in various cellular processes by defining the properties of the protein interaction network (Barabasi and Oltvai, 2004). The high connectivity of hubs should be reflected in protein structures that give hub proteins the ability to carry out highly specific interactions with multiple, structurally diverse partners. Statistical investigations of protein interaction databases and bioinformatics studies have revealed that intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes, and that disordered domains confer on hubs the ability to interact with multiple structurally diverse partners in interaction networks (Dunker et al., 2005; Haynes et al., 2006; Patil and Nakamura, 2006). IDRs provide hubs with the binding promiscuity required to interact with a large number of small molecules, proteins or nucleic acids. Alternatively, structured hubs bind to disordered regions in their many interaction partners (Oldfield et al., 2008).
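The notion of a hub as a highly connected node can be illustrated with a minimal sketch that counts distinct partners per protein in an interaction list. The function names, the example edge list and the `min_partners` cut-off are assumptions made here for illustration only; the cited studies define hubs with their own data sets and thresholds.

```python
from collections import defaultdict

def degrees(interactions):
    """Count distinct interaction partners for every protein.

    `interactions` is an iterable of (protein_a, protein_b) pairs;
    self-interactions are ignored.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    return {protein: len(p) for protein, p in partners.items()}

def hubs(interactions, min_partners=10):
    """Return proteins with at least `min_partners` partners (assumed cut-off)."""
    return sorted(p for p, d in degrees(interactions).items() if d >= min_partners)

# Toy edge list, not real interactome data.
edges = [("p53", "MDM2"), ("p53", "CBP"), ("p53", "S100B"), ("MDM2", "CBP")]
print(hubs(edges, min_partners=3))
```

Distinguishing party hubs from date hubs would additionally require information on when and where each interaction occurs, which this sketch does not model.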
The abilities to interact with multiple partners (binding promiscuity) and to undergo binding-induced folding that accommodates the diverse binding sites of different partners (binding plasticity) make IDPs and IDRs central to signalling and functional regulation in cells (Uversky et al., 2005). The p53 protein, which regulates more than 150 genes and binds to over 100 partners (Zhao et al., 2000), is a typical example showing that intrinsic disorder is critical for function through binding promiscuity and binding plasticity (Oldfield et al., 2008). Such interactions involving p53, known as the one-to-many binding mode, are illustrated in Fig. 3, in which the interactions with ten partners are mediated by protein regions experimentally confirmed as IDRs (Uversky et al., 2009). Fig. 3 also indicates that protein binding sites can be predicted, using the disorder predictor PONDR® VL-XT, as short rigid regions (downward spikes) within predicted long regions of disorder. Indeed, short rigid segments within long disordered regions are screened by bioinformatics methods as potential protein binding sites, known as Molecular Recognition Features (MoRFs).
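The idea of downward spikes within long predicted disorder can be sketched as follows, assuming that per-residue disorder scores between 0 and 1 have already been obtained from a predictor. The function name, the 0.5 threshold, the maximum dip length and the flank size are illustrative assumptions; this is not the published MoRF identification procedure.

```python
def find_dips(scores, threshold=0.5, max_len=30, flank=5):
    """Return (start, end) index pairs (end exclusive) of short runs with
    score < threshold that are flanked on both sides by `flank` residues
    with score >= threshold."""
    dips = []
    i, n = 0, len(scores)
    while i < n:
        if scores[i] < threshold:
            j = i
            while j < n and scores[j] < threshold:
                j += 1                      # extend the low-score run
            left = scores[max(0, i - flank):i]
            right = scores[j:j + flank]
            flanked = (len(left) == flank and len(right) == flank
                       and all(s >= threshold for s in left + right))
            if j - i <= max_len and flanked:
                dips.append((i, j))
            i = j
        else:
            i += 1
    return dips

# Toy profile: a 5-residue dip inside a predicted disordered stretch.
profile = [0.9] * 10 + [0.2] * 5 + [0.9] * 10
print(find_dips(profile))
```

Candidate segments found in this way would still need to be checked against experimental binding data.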
In addition to promoting binding diversity through interactions with numerous partners, molecular recognition involving IDPs provides another important functional advantage over globular proteins with 3D structure for signalling and regulation: disordered regions can bind their partners with high specificity and low affinity. To permit specific recognition, disordered regions usually undergo binding-induced folding during protein interactions (Dyson and Wright, 2002, 2005b), a disorder-to-order transition in which IDPs or IDRs adopt a highly structured conformation upon binding to their biological partners. This binding-induced folding can occur for whole IDPs, or for long or short IDRs. It is known that the large decrease in conformational entropy due to folding of disordered regions in the disorder-to-order transition can uncouple specificity from binding strength (Dunker et al., 2001; Schulz, 1979). With such high specificity and low affinity, the regulatory interaction between an IDP and its partner is both highly specific and easily dispersed, which matters because activating and terminating a signal are equally important (Dunker et al., 2002a). An IDP has been suggested to contain a "conformational preference" for the structure it will take upon binding (Fuxreiter et al., 2004). This preferred conformation can be an α-helix (α-MoRFs) (Cheng et al., 2007), a β-strand (β-MoRFs) or an irregular structure (ι-MoRFs) (Mohan et al., 2006; Vacic et al., 2007a). Fig. 3 shows these different conformational preferences: a single intrinsically disordered region of p53 (residues 374-388 in the C-terminal regulatory domain) forms all three major secondary structure types in the bound state, an α-helix when associating with S100ββ, a β-sheet with sirtuin, and different irregular structures with CBP and cyclin A.
The sets of residues involved in these interactions overlap extensively along the sequence. However, p53 uses different residues for the interactions with the four binding partners, suggesting that the same intrinsically disordered sequence is induced to fold by the different partners in entirely different ways (Oldfield et al., 2008).

Bioinformatics methods for predicting structures of IDPs
Bioinformatics has contributed greatly to the study of IDPs. Driven by the rapidly increasing number of experimentally verified IDPs in the late 1990s, bioinformatics research on IDPs has linked protein sequence analysis to the characterization of intrinsic disorder, and has made it possible to investigate the intrinsic disorder of proteins in large data sets such as interactomes, genomes and the Swiss-Prot database. Bioinformatics analyses of IDPs have provided a conceptual framework for experimental studies of molecular function and protein-protein interactions (Sun et al., 2011). Below we focus on some basic sequence analysis tools for the prediction of IDPs or IDRs.

Amino acids compositional profile of IDPs
The propensity for disorder is encoded in the peculiarities of protein amino acid sequences. By comparing the compositions of disordered protein datasets with each other and with ordered protein datasets, it was found that IDPs are generally enriched in polar and charged residues and depleted of hydrophobic residues except for proline (Dunker et al., 2001; Uversky et al., 2000). The relative fractional difference in composition for each amino acid residue between the studied set and a set of ordered proteins is calculated as (C_x - C_order)/C_order, where C_x is the percentage of a given amino acid in the studied set and C_order is the corresponding percentage in a set of ordered proteins. The compositional profile can be visualized by plotting this relative fractional difference for each of the twenty amino acids (Dunker et al., 2001; Vacic et al., 2007b). In the studied set, negative peaks thus correspond to amino acids that are depleted in comparison with the set of ordered proteins, and positive peaks indicate amino acids that are enriched. Most IDPs are substantially depleted in amino acids W, C, F, I, Y, V, L, H, T and N (order-promoting residues) and enriched in amino acids K, E, P, S, Q, R, D and M (disorder-promoting residues). Amino acids A and G are neutral with regard to order and disorder (Radivojac et al., 2007). Among the order-promoting residues, the hydrophobic (I, L, and V) and aromatic (W, Y and F) amino acid residues normally form the hydrophobic core of an ordered globular protein. The order-promoting cysteine is known to contribute significantly to protein conformational stability via disulfide bond formation or through coordination of different prosthetic groups. On the other hand, the disorder-promoting residues R, Q, S, E, D and K are polar or charged, and their abundance gives an IDP a large net charge at physiological pH. Although disorder-promoting proline is hydrophobic, it is well known as a structure terminator (Romero et al., 2001). These compositional biases of IDPs are characterized as low overall hydrophobicity and high net charge (Uversky et al., 2000), and are widely used as one of the criteria for IDP prediction.
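The relative fractional difference defined above can be computed directly from two sets of sequences. The following is a minimal sketch; the function names and the toy sequences are assumptions made for illustration and do not reproduce the Composition Profiler implementation. Fractions are used instead of percentages, which leaves the ratio unchanged.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequences):
    """Fraction of each of the 20 standard amino acids in a set of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(a for a in seq.upper() if a in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {a: counts[a] / total for a in AMINO_ACIDS}

def compositional_profile(query, ordered_reference):
    """(C_x - C_order) / C_order for every residue present in the reference."""
    c_x = composition(query)
    c_order = composition(ordered_reference)
    return {a: (c_x[a] - c_order[a]) / c_order[a]
            for a in AMINO_ACIDS if c_order[a] > 0}

# Toy sequences only; real profiles need large curated data sets.
profile = compositional_profile(["AEEE"], ["AAEE"])
print(profile)  # negative = depleted, positive = enriched in the query set
```

Residues absent from the reference set are skipped to avoid division by zero; with realistic reference sets all twenty residues are present.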
The DELLA proteins (DELLAs), a plant-specific protein family, function as repressors of gibberellin (GA)-responsive plant growth and are the key regulatory targets in the GA signalling pathway. The N-domains of DELLAs have been experimentally verified to be intrinsically disordered and play an important role in molecular recognition (Sun et al., 2008; Sun et al., 2010). Similar to disordered proteins from the DisProt database (Sickmeier et al., 2007), the N-domains of DELLAs showed an overall lack of order-promoting residues and enrichment in disorder-promoting residues, in particular S, M and D, which is characteristic of IDPs (Fig. 4). A special feature of the N-domains of DELLAs was a depletion of K, Q and R residues, indicating that DELLAs are a special group of IDPs lacking these three disorder-promoting residues. This analysis can also be performed using the web-based Composition Profiler tool (http://www.cprofiler.org/), which automates composition profiling with graphical output (Vacic et al., 2007b). Four different background datasets, including both intrinsically disordered and ordered datasets, are available for comparison with the query sequences.

Low sequence complexity of IDPs
Proteins with long disordered regions show a close relationship with low sequence complexity. An investigation of the Swiss-Prot database for both low sequence complexity and long disordered regions showed that nearly all the identified low-complexity segments are also predicted to be disordered (Dunker et al., 2001). In addition, IDPs exhibit a distribution of sequence complexity that is lower than, but partly overlaps with, that of ordered proteins (Romero et al., 2001). It was further revealed that IDPs or IDRs and low-complexity sequences have a similar compositional bias: more disorder-promoting residues (R, K, E, P, and S) and fewer order-promoting residues (C, W, Y, I, and V). Low sequence complexity is therefore frequently, though not exclusively, accompanied by disorder, since low sequence complexity can sometimes occur in structurally ordered proteins (Romero et al., 2001). Overall, the simultaneous use of sequence complexity analysis and other disorder predictions provides a better view of protein disorder.
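One simple way to quantify sequence complexity is the Shannon entropy of the residue composition within a sliding window. The sketch below uses this generic measure as an assumption for illustration; it is not a specific published low-complexity algorithm, and the window size and entropy threshold are arbitrary choices.

```python
import math
from collections import Counter

def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment."""
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in Counter(segment).values())

def low_complexity_windows(seq, window=12, threshold=2.2):
    """Start indices of windows whose entropy is below `threshold`.

    The maximum possible entropy is log2(20), about 4.32 bits, reached
    only when all 20 residues are equally frequent in a window of 20 or more.
    """
    return [i for i in range(len(seq) - window + 1)
            if shannon_entropy(seq[i:i + window]) < threshold]

# Toy sequence: a homopolymeric serine stretch followed by a mixed segment.
print(low_complexity_windows("SSSSSSACDEFGHIKL", window=4, threshold=1.0))
```

Homopolymeric stretches give an entropy of zero, so windows covering them are reported first as the threshold is raised.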
As an example, plant-specific GRAS proteins, including the DELLA subfamily, play critical roles in plant development and various signalling processes (Sun et al., 2011; Sun et al., 2010). One common feature of GRAS proteins is that all of their N-domains contain homopolymeric stretches of certain amino acid residues such as S, T, P, Q, G, D or A. Most of these amino acids are disorder-promoting residues and are observed in low-complexity segments. Using an iterative algorithm for the complexity analysis of sequence tracts (CAST) (Promponas et al., 2000), the low-complexity segments in all GRAS proteins were found to be located mostly within the N-domains, which have previously been shown both experimentally and theoretically to be intrinsically disordered (Sun et al., 2011; Sun et al., 2010).

Charge-hydrophobicity (CH) and cumulative distribution function (CDF) plots
IDPs, as shown in the compositional profile, are characterized by low overall hydrophobicity and high net charge. The combination of low mean hydrophobicity and high net charge may represent a prerequisite under physiological conditions for lack of folding in some kinds of IDPs. Statistical analysis of both intrinsically ordered and disordered protein datasets resulted in a plot of the net charge of a protein against its mean hydropathy (CH-plot), showing that ordered and disordered proteins tend to occupy two different areas within the charge-hydrophobicity phase space, separated by a linear boundary line (Uversky et al., 2000): <R> = 2.785 <H> - 1.151, where the mean net charge <R> of the protein is calculated as the absolute value of the difference between the numbers of positively charged and negatively charged residues divided by the total number of amino acids, and the mean hydrophobicity <H> is defined as the sum of the normalized hydrophobicity (Kyte and Doolittle approximation with a window size of 5 and normalization on the scale from 0 to 1) of all residues divided by the total number of residues minus 4. Figure 5A represents the original charge-hydrophobicity phase space; an IDP with a given mean net charge will most likely be located above the green boundary line. Further statistics with a wider range of IDPs have shown that the mean net charge and mean hydrophobicity of IDPs can be scattered over the charge-hydrophobicity phase space, and sometimes cross into the area of ordered proteins (Oldfield et al., 2005a). Therefore, a boundary margin was added, which allowed the accuracy of the estimation to reach 95%. As an example of CH-plot analysis, four of the N-domains of eight DELLAs fit into the disordered area, but all of them are located within the boundary margin (Fig. 5B) (Sun et al., 2010). The combination of low hydrophobicity and high net charge as a prerequisite for IDPs can be explained from a physical viewpoint: high net charge leads to charge-charge repulsion, and low hydrophobicity means less driving force for protein compaction.
However, the CH-plot is a linear disorder classifier that takes into account only two parameters of the particular sequence, charge and hydrophobicity, and is predisposed to discriminate proteins with substantial amounts of extended disorder (random coils and pre-molten globules) from proteins with globular conformations (molten globule-like and rigid well-structured proteins) (Oldfield et al., 2005a; Uversky et al., 2000). Another binary disorder classifier, cumulative distribution function (CDF) analysis, discriminates all disordered conformations, including molten globules, from rigid well-folded proteins (Oldfield et al., 2005a; Xue et al., 2009). Therefore, simultaneous CDF-CH plot analysis gives a more accurate prediction for a wider range of sequences. The CDF is a cumulative histogram of disordered residues at various disorder scores that are obtained from the disorder predictor PONDR-VSL2 (Peng et al., 2006) (see next section for details). The cumulative histogram for structured proteins increases quickly in the range of smaller disorder scores and then flattens at larger disorder scores. The cumulative histogram for disordered proteins increases only slightly in the range of lower disorder scores but significantly at higher disorder scores. A boundary line can therefore also be identified in the CDF plot.
The distances to the boundary lines in both the CH and CDF plots for a specific protein are further used to build the CH-CDF plot. In the resulting CH-CDF plot, the coordinates of each spot are calculated as the distance of the corresponding protein in the CH-plot from the boundary (Y-coordinate in the CH-CDF plot) and the average distance of the respective CDF curve from the CDF boundary (X-coordinate in the CH-CDF plot). Positive and negative Y values in the CH-CDF plot correspond to proteins predicted within CH-plot analysis to be intrinsically disordered or ordered, respectively. In contrast, positive and negative X values are attributed to proteins predicted within CDF analysis to be ordered or intrinsically disordered, respectively. Thus, the resultant quadrants of CDF-CH phase space correspond to the following expectations: Q1, proteins predicted to be disordered by both methods; Q2, proteins predicted to be disordered by CDFs but compact by CH-plots (i.e., putative molten globules); Q3, ordered proteins; Q4, proteins predicted to be disordered by CH-plots, but ordered by CDFs. All of the N-domains of the eight DELLAs are located in quadrants Q1 and Q2 (Fig. 5C), indicating that they are intrinsically disordered with different levels of compactness, which is consistent with physical and biological evidence (Sun et al., 2010).
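The CH-plot quantities described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: K and R are counted as positive and D and E as negative, the Kyte-Doolittle values are rescaled linearly from [-4.5, 4.5] to [0, 1], <H> is taken as the mean over all N - 4 five-residue windows, and the "distance" to the boundary is taken as the vertical offset in <R>. The function names are hypothetical, and the CDF distance is passed in as a number because it requires per-residue predictor output.

```python
# Kyte-Doolittle hydropathy scale.
KD = {
    'A': 1.8, 'R': -4.5, 'N': -3.5, 'D': -3.5, 'C': 2.5,
    'Q': -3.5, 'E': -3.5, 'G': -0.4, 'H': -3.2, 'I': 4.5,
    'L': 3.8, 'K': -3.9, 'M': 1.9, 'F': 2.8, 'P': -1.6,
    'S': -0.8, 'T': -0.7, 'W': -0.9, 'Y': -1.3, 'V': 4.2,
}


def mean_hydrophobicity(seq, window=5):
    """<H>: mean of window-averaged hydropathy normalized to 0..1."""
    norm = [(KD[a] + 4.5) / 9.0 for a in seq.upper()]
    n_win = len(norm) - window + 1
    if n_win < 1:
        raise ValueError("sequence shorter than window")
    return sum(sum(norm[i:i + window]) / window for i in range(n_win)) / n_win


def mean_net_charge(seq):
    """<R>: |#positive - #negative| / length (assumes K, R vs D, E)."""
    seq = seq.upper()
    pos = sum(seq.count(a) for a in "KR")
    neg = sum(seq.count(a) for a in "DE")
    return abs(pos - neg) / len(seq)


def ch_distance(seq):
    """Vertical offset from the boundary <R> = 2.785 <H> - 1.151.

    Positive values lie on the disordered side of the CH-plot.
    """
    return mean_net_charge(seq) - (2.785 * mean_hydrophobicity(seq) - 1.151)


def ch_cdf_quadrant(ch_dist, cdf_dist):
    """Quadrant label following the sign conventions given in the text.

    Y (ch_dist) > 0 means disordered by CH; X (cdf_dist) < 0 means
    disordered by CDF.
    """
    ch_disordered = ch_dist > 0
    cdf_disordered = cdf_dist < 0
    if ch_disordered and cdf_disordered:
        return "Q1"  # disordered by both methods
    if cdf_disordered:
        return "Q2"  # disordered by CDF, compact by CH
    if not ch_disordered:
        return "Q3"  # ordered by both methods
    return "Q4"      # disordered by CH, ordered by CDF
```

For instance, a purely acidic toy sequence such as `"EEEEEEEEEE"` gives a positive `ch_distance`, whereas a purely hydrophobic one such as `"IIIIIIIIII"` gives a negative value.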

Prediction of intrinsic disorder and potential binding sites
Disorder prediction has been one of the important approaches in research on IDPs and IDRs. Compared with experimental methods, it is a powerful tool for the study of IDPs, especially with regard to time and cost. Furthermore, it can easily be used to investigate large datasets such as proteomes and interactomes. So far, more than fifty predictors of disorder have been developed to evaluate the intrinsic disorder of a given sequence on a per-residue basis (He et al., 2009). These predictors utilize biased amino acid compositions of IDRs, various datasets derived from experiments and different computing techniques, and many of them are accessible on public servers. A partial list of predictors can be found on the DisProt website (http://www.disprot.org/predictors.php). In this chapter, we focus only on a series of Predictors Of Natural Disordered Regions (PONDRs).
A basic strategy for developing these predictors includes constructing training datasets of both ordered and disordered segments, selecting sequence attributes of order and disorder, and applying a neural network model in training. The predictor PONDR® VL-XT applies three different neural networks, one trained on Variously characterized Long disordered regions for the internal region of the sequence and two trained on X-ray characterized Terminal disordered regions for the N- and C-terminal regions (≥ 5 amino acids) (Romero et al., 2001). PONDR® VL-XT gives outputs from the first to the last residue in a sequence, and furthermore it provides the basis for CDF plot and potential binding site (MoRF) predictions. PONDR® VSL2 combines neural networks for both short (≤ 30 residues) and long disordered regions, with each neural network trained on the dataset of that specific length. This predictor gives relatively higher accuracy of prediction in the PONDR series (Peng et al., 2006). PONDR® VL3 applies ten neural networks and selects the final prediction by simple majority voting. The input features of these predictors are various sequence profiles. This predictor has higher accuracy in predicting longer disordered regions. All of these PONDR predictors have relatively high accuracy (> 80%), and the accuracy has been further improved by the meta-predictor PONDR-FIT, which combines PONDR-VLXT, PONDR-VSL2, PONDR-VL3 and three other predictors and is so far one of the more accurate predictors of disorder (Xue et al., 2010).
Fig. 6. PONDR disorder predictions for AtGID1a, AtGAI and AtRGL2. (A) X-ray crystal structure of the AtGAIn-AtGID1a/GA3 complex (PDB 2ZSH) with ribbon representation of AtGAIn (red), the N-terminal binding pocket of AtGID1a (cyan), the GA receptor domain of AtGID1a (light blue) and GA3 (green van der Waals surface). (B) Disorder prediction for AtGID1a. (C) Disorder predictions for AtGAI (black line, its C-domain is shifted right to align with that of AtRGL2) and AtRGL2 (orange line). Disorder predictions were made with PONDR® VL-XT with a threshold of 0.5 (≥ 0.5 for disorder and < 0.5 for order). Boxes indicate the fragments of AtGAIn and AtGID1a crystallized in the complex, with filled positions indicating regions of defined density in the crystal structure. Ticks indicate residues that interact with the portion of the complex with the corresponding colours. Approximate positions of α-MoRFs predicted for most DELLA proteins are indicated by boxes labelled 'M'. Reproduced from Sun et al. (2010) J. Biol. Chem., 285, 11557-11571.
As an example, PONDR® VL-XT was used to predict disorder for two DELLAs (AtGAI and AtRGL2) together with the GA receptor (AtGID1a) (Fig. 6). The PONDR scores of AtGID1a indicated a folded structure (Fig. 6B). The PONDR scores of AtGAI and AtRGL2, which are similar to each other, indicated that the C-domains of both DELLAs are dominated by ordered structures, with most residues having a PONDR score < 0.5, the threshold for order/disorder (Fig. 6C). In contrast, the N-domains of both DELLAs, AtGAIn and AtRGL2n, are clearly intrinsically disordered except for some short rigid segments corresponding to the DELLA, VHYNP and LK/RXI motifs (Fig. 6C, indicated by arrows).
These disorder predictions of the N-domains AtGAIn and AtRGL2n are consistent with the results from compositional profile (Fig. 4) and CH-CDF plot analysis (Fig. 5).
As discussed in Section 2.4 (protein-protein interactions involving IDPs), such short ordered segments within long disordered regions, detected as downward spikes in disorder predictions made with PONDR® VL-XT (Fig. 6C), are termed Molecular Recognition Features (MoRFs). They are potential binding sites for protein interactions and are responsible for molecular recognition via a disorder-to-order transition upon binding to their interacting partners. By utilizing this specific sequence pattern of IDPs, a unique bioinformatics tool dedicated to the identification of MoRFs, or potential protein-protein interaction sites in IDPs, has been developed. α-MoRFs-I and its updated form α-MoRFs-II, the identifiers of α-helix-forming MoRFs, are focused on short binding regions within long regions of disorder that are likely to form helical structure upon binding (Cheng et al., 2007; Oldfield et al., 2005b). The α-MoRFs predictor defines a heuristic for binding-associated downward spikes and removes false positive predictions. It assigns relatively short segments (20 residues) that gain functionality through a disorder-to-order transition induced upon binding to a partner. The identifiers of MoRFs that form β-sheet or irregular structure are under development.
As an example, α-MoRFs have been predicted at or near the DELLA motif and the LK/RXI motif in the N-domains of DELLAs (Fig. 6C), suggesting that the DELLA motif is a binding site that undergoes a disorder-to-α-helix transition upon binding to GID1 receptors. The formation of this α-helix has been confirmed and is shown as α-helix A and α-helix B in the crystal structure of the AtGID1a-GA3-AtGAIn ternary complex (Fig. 6A) (Murase et al., 2008). Although the VHYNP motif appears as a large downward spike in the PONDR score pattern (Fig. 6C), it was not identified in our α-MoRFs prediction for any of the DELLAs. It is clearly shown in Fig. 6A that the VHYNP motif binds to AtGID1a, forming an irregular VHYNPSD loop. Therefore, ι-MoRFs rather than α-MoRFs are involved in this binding-induced folding. The binding-induced folding of the DELLA and VHYNP motifs has also been supported by other biochemical evidence (Sun et al., 2010). The LK/RXI motif was also identified in our α-MoRFs prediction. It is possibly a binding site of DELLAs for an unknown component in the DELLA signalling pathway, as this motif was not involved in interactions with GID1 receptors.
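The notion of a short ordered dip inside a long disordered region can be made concrete with a small sketch. The code below is not the α-MoRFs predictor and does not reproduce its heuristic; it only applies the 0.5 order/disorder threshold to a list of per-residue scores and reports ordered runs that are short and flanked by disorder on both sides. The function names, the 20-residue length cap (borrowed from the segment length mentioned above), the minimum flank length of 10 residues and the toy scores are all assumptions.

```python
def runs(mask):
    """Collapse a boolean list into (start, end, value) runs, end exclusive."""
    out = []
    start = 0
    for i in range(1, len(mask) + 1):
        if i == len(mask) or mask[i] != mask[start]:
            out.append((start, i, mask[start]))
            start = i
    return out


def candidate_dips(scores, threshold=0.5, max_len=20, min_flank=10):
    """Find short ordered runs flanked by disordered runs on both sides.

    scores: per-residue disorder scores (>= threshold means disordered).
    max_len and min_flank are illustrative assumptions.
    """
    segs = runs([s >= threshold for s in scores])
    dips = []
    # Interior runs only: a dip needs a disordered neighbour on each side.
    for k in range(1, len(segs) - 1):
        start, end, disordered = segs[k]
        left, right = segs[k - 1], segs[k + 1]
        if (not disordered
                and end - start <= max_len
                and left[1] - left[0] >= min_flank
                and right[1] - right[0] >= min_flank):
            dips.append((start, end))
    return dips


# Made-up scores: disorder, a short ordered dip, disorder, a long ordered domain.
toy_scores = [0.9] * 15 + [0.2] * 8 + [0.8] * 15 + [0.1] * 40
print(candidate_dips(toy_scores))
```

With these made-up scores only the eight-residue dip is reported; the long ordered stretch at the end is ignored because it is neither short nor flanked by disorder on both sides.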

Experimental methods for investigation of IDPs
Owing to their specific amino acid sequences, IDPs possess a number of distinctive structural characteristics that can be utilized for their identification. These include, but are not limited to, sensitivity to proteolysis, disorder characteristics in CD and NMR spectroscopy, small-angle X-ray scattering, hydrodynamic measurement, dynamic light scattering, deuterium/hydrogen exchange mass spectrometry, Raman and infrared spectroscopy, and monoclonal antibody based immunoassays (Uversky, 2002a). While every method emphasizes different structural features of IDPs, simultaneous application of several of the techniques mentioned above will provide unambiguous evidence for the presence of varying degrees of unfoldedness in a given protein. Here, we will focus on some of the frequently used techniques.

Hydrodynamic dimensions of IDPs
Hydrodynamic dimension is the most definitive characteristic of the conformational state of a protein. According to Uversky (Uversky, 1993, 2002b), a protein molecule can exist in natively folded (NF), molten globule (MG), pre-molten globule (PMG), native coil (Coil) and completely unfolded ($U_{urea}$) states, and is characterized by the following relationships between its Stokes radius ($R_S$) and theoretical molecular weight ($MW_{Theo}$): natively folded, $\log R_S^{NF} = 0.369 \log(MW_{Theo}) - 0.254$; molten globule, $\log R_S^{MG} = 0.334 \log(MW_{Theo}) - 0.053$; pre-molten globule, $\log R_S^{PMG} = 0.403 \log(MW_{Theo}) - 0.239$; native coil, $\log R_S^{Coil} = 0.493 \log(MW_{Theo}) - 0.551$; completely unfolded random coil in urea, $\log R_S^{U_{urea}} = 0.521 \log(MW_{Theo}) - 0.649$. Therefore, the experimentally determined apparent molecular weight ($MW_{App}$) and Stokes radius ($R_S^D$) can be used to identify IDPs and classify them into one of three subclasses, molten globule (MG)-like, pre-molten globule (PMG)-like or random coil-like, depending on their hydrodynamic characteristics.
Fig. 7. Size-exclusion chromatography of AtRGL2n and the AtGID1a-AtRGL2n complex. Six natively folded globular proteins (1. IgG1, 158 kDa; 2. BSA, 67 kDa; 3. Ovalbumin, 43 kDa; 4. Carbonic Anhydrase, 30 kDa; 5. Myoglobin, 17 kDa; 6. Cytochrome c, 12.3 kDa; corresponding to the numbered peaks of the hatched line) were used as standards to calibrate the Superdex 75-16/60 column. The resultant migration rate ($1000/V_{elution}$) plotted against Stokes radius (left inset) and molecular weight (right inset) was used to determine the Stokes radius ($R_S^D$) and apparent molecular weight ($MW_{App}$) of DELLAs and the complex. As examples, the peaks of AtRGL2n and the AtGID1a-AtRGL2n complex are shown in solid and dotted lines, respectively. Adapted from Sun et al. (2010) J. Biol. Chem., 285, 11557-11571.
As an example, we investigated the hydrodynamic dimensions of the N-domains of DELLAs using size-exclusion chromatography (Fig. 7), in which six natively folded globular proteins were used as standards to calibrate the Superdex 75-16/60 column (see insets). AtRGL2n and its ternary complex AtGID1a-GA3-AtRGL2n were then applied to the gel filtration column using the same buffer and flow rate. The Stokes radius ($R_S^D$) and apparent molecular weight ($MW_{App}$) of AtRGL2n and the ternary complex were calculated from the migration rate ($1000/V_{elution}$). The $MW_{App}$ of free AtRGL2n is approximately two to three times larger than its theoretical molecular weight ($MW_{Theo}$), indicating that it has an extended conformation with low compactness. The value of $R_S^D$ reveals that free AtRGL2n belongs to the PMG-like IDPs. In contrast, the $MW_{App}$ and $R_S^D$ values show that the AtGID1a-GA3-AtRGL2n complex becomes natively folded, supporting the hypothesis that intrinsically disordered AtRGL2n must undergo binding-induced folding during complex formation between AtGID1a/GA3 and AtRGL2n (Sun et al., 2010).
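The empirical relationships listed in this section lend themselves to a small worked example. The sketch below evaluates the expected Stokes radius for each conformational state from a theoretical molecular weight and picks the state whose expected radius is closest, on a log scale, to a measured one. It assumes base-10 logarithms, molecular weights in Da and radii in Å; the units, the nearest-state rule and the function names are assumptions made for illustration and are not stated in the text.

```python
import math

# (slope, intercept) of log10(R_S) = slope * log10(MW_Theo) + intercept,
# copied from the relationships given in the text.
STATE_COEFFS = {
    "NF": (0.369, -0.254),      # natively folded
    "MG": (0.334, -0.053),      # molten globule
    "PMG": (0.403, -0.239),     # pre-molten globule
    "Coil": (0.493, -0.551),    # native coil
    "U_urea": (0.521, -0.649),  # completely unfolded in urea
}


def expected_stokes_radius(mw_theo, state):
    """Expected Stokes radius (assumed Å) for a MW (assumed Da) and state."""
    slope, intercept = STATE_COEFFS[state]
    return 10 ** (slope * math.log10(mw_theo) + intercept)


def closest_state(mw_theo, rs_measured):
    """State whose expected log radius is nearest to the measured one."""
    return min(
        STATE_COEFFS,
        key=lambda s: abs(
            math.log10(rs_measured)
            - math.log10(expected_stokes_radius(mw_theo, s))
        ),
    )


# Expected radii for a 17 kDa chain (the myoglobin standard) in each state.
for state in STATE_COEFFS:
    print(state, round(expected_stokes_radius(17000, state), 1))
```

For a chain of this size the expected radius grows steadily from the natively folded state to the urea-unfolded state, which is why a measured $R_S^D$ can be used to place a protein among these classes.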

CD spectra of IDPs
Far-UV circular dichroism (CD) spectra have been widely used for the identification of IDPs because the measurement is quick and easily accessible. In contrast to α-helix and β-sheet, random coil displays a specific shape of the far-UV CD spectrum, with a large negative ellipticity in the vicinity of 200 nm, low ellipticity at 190 nm and an ellipticity close to zero in the vicinity of 222 nm (Johnson, 1988; Kelly and Price, 1997). This is a very useful graphical criterion for the identification of IDPs. Double wavelength statistics showed the far-UV CD spectral characteristics of IDPs: random coil-like IDPs have averaged ellipticities of -18900 deg·cm²·dmol⁻¹ at 200 nm and -1700 deg·cm²·dmol⁻¹ at 222 nm, while pre-molten globule-like IDPs have averaged ellipticities of -10700 deg·cm²·dmol⁻¹ at 200 nm and -3900 deg·cm²·dmol⁻¹ at 222 nm (Uversky, 2002a).
As an example of using CD spectra to characterize IDPs, the far-UV CD spectra of the N-domains of six DELLAs (Fig. 8A) display a large negative ellipticity at 200 nm and low ellipticity at 190 nm, characteristic of proteins in a largely disordered conformation. The ellipticity values at 222 nm are close to those of pre-molten globule-like IDPs, whereas the ellipticity values at 200 nm are close to those of random coil-like IDPs (Fig. 8A, inset table).
In addition, the solvent 2,2,2-trifluoroethanol (TFE) mimics the hydrophobic environment experienced in protein-protein interactions and has therefore been widely used as a probe for the propensity of IDPs to undergo induced folding upon target binding (Dyson and Wright, 2002). To test the potential binding-induced folding of the N-domains of DELLAs, far-UV CD spectra of AtRGL2n were recorded in the presence of increasing concentrations of TFE (Fig. 8B). AtRGL2n showed increased α-helicity upon the addition of TFE, as indicated by the characteristic peak at 192 nm and double minima at 208 and 222 nm. Most of the disorder-to-α-helix transition takes place in the presence of 30% TFE, at which point the α-helical content reaches 51.5% as estimated from the ellipticity at 222 nm. The TFE results alone reveal the potential of AtRGL2n to form α-helices upon binding to its interacting partners (Sun et al., 2010).
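The double wavelength comparison can be written down as a simple lookup. The sketch below takes the averaged ellipticities quoted above for random coil-like and pre-molten globule-like IDPs as reference points and reports, separately for 200 nm and 222 nm, which reference a measured value lies closer to. Treating the two wavelengths independently with a nearest-value rule is an assumption made for illustration, as are the function name and the example input values, which are invented rather than measured.

```python
# Averaged mean residue ellipticities (deg·cm²·dmol⁻¹) quoted in the text,
# stored as (value at 200 nm, value at 222 nm).
REFERENCE_ELLIPTICITY = {
    "coil-like": (-18900.0, -1700.0),
    "PMG-like": (-10700.0, -3900.0),
}


def nearest_per_wavelength(theta_200, theta_222):
    """Report the closer reference class at each wavelength separately."""
    result = {}
    measured = (("200 nm", theta_200), ("222 nm", theta_222))
    for idx, (label, value) in enumerate(measured):
        result[label] = min(
            REFERENCE_ELLIPTICITY,
            key=lambda name: abs(value - REFERENCE_ELLIPTICITY[name][idx]),
        )
    return result


# Invented input values, used only to show the mixed outcome described above.
print(nearest_per_wavelength(-18000.0, -3800.0))
```

Reporting the two wavelengths separately keeps a mixed outcome visible, such as the one described for the DELLA N-domains, where the 200 nm value resembles random coil-like IDPs and the 222 nm value resembles pre-molten globule-like IDPs.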

NMR spectra of IDPs
As discussed above, CD spectroscopy is a very useful technique for the identification of IDPs. However, this technique detects the overall tendency of the conformational ensemble of an intrinsically disordered protein, and the resultant spectroscopic characteristics reflect local structural propensities averaged over the whole protein molecule. It does not provide conformational information about local residual structures such as those retained in pre-molten globules. In the past decade, NMR spectroscopy has rapidly developed into a key technique for studying the conformational ensembles and dynamics of proteins in unfolded and partially folded states (Dyson and Wright, 2005a). NMR spectroscopy investigates both local and long-range conformational behaviour at atomic resolution on timescales varying over many orders of magnitude, and it has been used together with small-angle X-ray scattering (SAXS) to characterize the conformational ensembles of IDPs (Wells et al., 2008). Residual dipolar couplings (RDCs) measured by NMR have become a powerful tool to describe quantitatively the level of local structure and transient long-range order in IDPs; see the review by Jensen et al. (2009) for details. Here, we just show examples of characteristic NMR spectra of IDPs. Two-dimensional ¹⁵N,¹H-HSQC spectra and ¹H-¹³C planes from a CBCA(CO)NH experiment have been collected for two N-domains of DELLAs, AtRGL2n and AtRGAn (Fig. 9). The narrow ranges of chemical shifts in the HSQC spectra for AtRGL2n and AtRGAn are characteristic of unstructured proteins (Fig. 9A and 9B, respectively). The CBCA(CO)NH planes (Fig. 9C and 9D) correlate the amide proton chemical shift with the Cα and Cβ shifts of the previous residue, in which horizontal rows of peaks at chemical shifts typical of random coil can be seen. Neither of these rows displays a significant spread in ¹³C shift values, suggesting a nearly uniform and therefore disordered environment. While the narrow range of ¹³C shifts seen here is consistent with disorder, the chemical shift dispersion of AtRGAn (Fig. 9B) does appear to be slightly higher than that of AtRGL2n (Fig. 9A), indicating that AtRGAn may have relatively more local residual structure (Sun et al., 2010).

Deuterium / hydrogen exchange mass spectra
Deuterium/hydrogen exchange mass spectrometry (DHXMS) provides insights into protein structure and dynamics on a per amino acid basis. Compared with NMR spectroscopy, the DHXMS technique requires much less time for sample preparation and can be routinely applied to larger proteins. The backbone amide hydrogens of proteins reversibly exchange with deuterium in D2O solvent, and the exchange rate of each amide hydrogen in a protein directly and precisely reports its solvent accessibility, revealing the protein's conformational states on the scale of individual amino acids. Amide hydrogens that are involved in intramolecular hydrogen bonds within secondary structures and/or are permanently buried inside the protein are protected from exchange. Conversely, exchange occurs readily at sites that are solvent-exposed and not involved in the hydrogen bonds of secondary structures. Therefore, the exchange rates determined by mass spectrometry following deuterium/hydrogen exchange allow direct localization of structured or unfolded regions of the protein. DHXMS has been used in structural genomics to identify the disordered regions within a large number of crystallographic targets (Pantazatos et al., 2004), and in exploring protein conformational dynamics and the structural aspects of solution-phase proteins (Konermann et al., 2008).
As an example, Fig. 10 shows the hydrogen/deuterium exchange of AtRGL1n determined by MS. The MBP moiety, which has a well-folded structure, exhibited inaccessibility to solvent for most of the protein except the two terminal regions. In contrast, AtRGL1n showed an instantly high exchange rate for the whole polypeptide chain, implying that free AtRGL1n without interacting partners is totally unfolded (Sheerin et al., 2011).

Monoclonal antibody immunoassays for intrinsic-disorder based protein interactions
Monoclonal antibodies (mAbs) have long been recognized and used as powerful molecular probes to monitor protein folding and conformational changes (Goldberg, 1991). A conformation-specific mAb that recognizes the conformation of an epitope (a small group of sequential or distal amino acids) allows direct insight into the conformation of an antigenic protein at the level of individual amino acids. This method was utilized in the conformational characterization of paired helical filaments (PHF) of tau protein, which is an IDP in its monomeric form and polymerizes through binding-induced folding into insoluble PHF, causing Alzheimer's disease (AD) and related tauopathies. A special mAb, MN423, raised against the PHF core of tau protein recognizes three spatially close amino acid segments which reside on a nearly 90 amino acid-long polypeptide chain including the C-terminus. The disclosure of the spatial proximity of these segments provides constraints for intra-molecular folding of the PHF core, which led to a proposed model for the folding of the tau polypeptide chain in the PHF core (Skrabana et al., 2006).
Conformation-specific mAbs have often been used to monitor conformational changes of targeted proteins. Bax is a pro-apoptotic member of the B-cell lymphoma-2 (Bcl-2) family, whose members are either IDPs or contain IDRs that are critical to their function (Rautureau et al., 2010). Bax undergoes a conformational change triggered by a TNF-related apoptosis-inducing ligand (TRAIL), leading to effective induction of mitochondrial apoptosis (Sundararajan et al., 2001). This critical conformational change can be efficiently detected by using a mAb that recognizes only the three-dimensional epitope of conformationally changed Bax and that was immobilized on a chip for surface plasmon resonance (SPR) detection (Kim et al., 2005). Furthermore, conformation-specific mAbs have an advantage over other techniques in monitoring conformational changes of antigenic proteins involved in a real-time reaction system with multiple interacting components.
This method has been successfully applied to gain insight into the folding of DELLAs upon binding to AtGID1, a GA receptor extracted from Arabidopsis tissue that binds to both the DELLA and VHYNP motifs of DELLAs (Sun et al., 2010). Both biophysical data and biological functions suggested that the Arabidopsis DELLA family may be further divided into two subgroups: AtGAI and AtRGA (RGA-group), and AtRGL1, AtRGL2 and AtRGL3 (RGL-group). This classification was reinforced by the different conformations of unbound Arabidopsis DELLAs with regard to the VHYNP motifs. The mAb AD7, which is conformation-specific to the VHYNP motif, does not recognize unbound AtGAIn and AtRGAn but does recognize AtRGL1n, AtRGL2n and AtRGL3n (Fig. 11), implying that the conformation of the VHYNP motif differs between these two subgroups in their unbound form. Furthermore, ELISA assays of the recombinant N-domains (AtGAIn, AtRGL1n and AtRGL2n), AtGID1/GA3 and mAb AD7 showed that AtGID1 only partially blocks mAb AD7 from binding to both AtRGL1n and AtRGL2n (Fig. 11B, 11C). In contrast, binding of AtGID1/GA3 to AtGAIn enables mAb AD7 to bind to the AtGID1/GA3-AtGAIn complex (Fig. 11A). This indicates that the VHYNPSD loop in unbound AtGAIn undergoes conformational changes induced by binding of AtGID1/GA3, resulting in at least part of the conformational epitope recognized by mAb AD7.

Conclusion
The study of IDPs and IDRs has been one of the fastest growing areas of protein science in the past decade. IDPs or IDRs lack secondary and/or tertiary structures yet possess crucial cellular functions under physiological conditions. Protein interactions involving IDPs or IDRs can trigger binding-induced folding of IDPs or IDRs via a disorder-to-order transition. Such important characteristics enable IDPs or IDRs to play critical roles in molecular recognition. Intrinsic disorder-based protein interactions will be a key factor in elucidating the mechanisms of biological processes and regulation, such as disease, various signal transductions, plant growth and development, and transcriptional regulation.

Fig. 3. The protein interactions involving p53. PONDR scores of intrinsic disorder were predicted by the PONDR® VL-XT predictor. Residues with scores above 0.5 (threshold) are disordered and those below 0.5 are ordered. The N-domain (residues 1-100) and C-domain (residues 290-390) were predicted as disordered regions while the central DNA binding domain was ordered. The ten binding sites in both the N- and C-domains of p53 are at or near the downward spikes in the plot of disorder scores. The complex structures containing various p53 binding regions are displayed around the predicted disorder pattern. In the complexes, the structures of p53 segments bound to their partners are shown in different colours, and the same colours are used for the bars in the plot of disorder scores to indicate the positions of the segments in the sequence of p53. The Protein Data Bank IDs and partner names for the complex structures are as follows: (1tsr DNA), (1q2d tGcn5), (3sak p53 (tet dom)), (1xqh set9), (1h26 cyclinA), (1ma3 sirtuin), (1jsp CBP bromo domain), (1dt7 s100ββ), (2gs0 Tfb1), (1ycr MDM2), and (2b3g rpa70). Reproduced from Uversky et al. (2009) BMC Genomics, 10 (Suppl 1): S7.

Fig. 4. The compositional profile of the N-domains of eight DELLA proteins (black bars) and disordered proteins of the DisProt database (grey bars) in comparison to the ordered globular proteins from the Protein Data Bank. The eight DELLA proteins are from Arabidopsis (AtGAI, AtRGA, AtRGL1, AtRGL2 and AtRGL3), rice (SLR1), barley (SLN1) and wheat (RHT1).

Fig. 5. (A) Mean net charge versus mean hydrophobicity plot for the set of 275 folded (blue squares) and 91 natively unfolded proteins (red circles). (B) Mean net charge versus mean hydrophobicity plot of the N-domains of eight DELLAs. A boundary margin of +0.045 (dotted) extends the disorder estimation accuracy to 95%. (C) Combined CDF-CH plot of the N-domains of eight DELLAs. Modified from Sun et al. (2010) J. Biol. Chem., 285, 11557-11571.