Validation of Protein Biomarkers to Advance the Management of Autoimmune Disorders

Despite the anticipated boom stemming from proteomic investigations, the rate at which novel protein biomarkers are introduced into clinical practice has remained static over the past 20 years. The reality is that approaches to both discover and validate protein biomarkers remain inadequate, and consequently, many areas of medicine, including the broad field of autoimmune disorders, remain deprived of the tools essential for the optimal management of patients. Most importantly, there is a huge backlog of candidate biomarkers that are yet to undergo thorough investigation and validation to assess their clinical utility. A recent assessment of the situation has estimated that although many tens of thousands of publications claim biomarker discoveries, there are roughly only 100 routinely used in clinical practice (Poste, 2011). This chapter reviews the potential applications of protein biomarkers to manage autoimmune diseases with a special focus on the transition from the biomarker discovery through to validation phases using proteomic strategies. We emphasize the importance of careful review of the discovery data, the critical roles of protein isoform verification, and the essential features of targeted and thorough validation. Ultimately, when these factors are appropriately considered and implemented, we are optimistic that autoimmune disorders can be transformed by omics technologies and personalized practice can become a reality.


Introduction
Despite the anticipated boom stemming from proteomic investigations, the rate at which novel protein biomarkers are introduced into clinical practice has remained static over the past 20 years. The reality is that approaches to both discover and validate protein biomarkers remain inadequate, and consequently, many areas of medicine, including the broad field of autoimmune disorders, remain deprived of the tools essential for the optimal management of patients. Most importantly, there is a huge backlog of candidate biomarkers that are yet to undergo thorough investigation and validation to assess their clinical utility. A recent assessment of the situation has estimated that although many tens of thousands of publications claim biomarker discoveries, there are roughly only 100 routinely used in clinical practice (Poste, 2011). This chapter reviews the potential applications of protein biomarkers to manage autoimmune diseases with a special focus on the transition from the biomarker discovery through to validation phases using proteomic strategies. We emphasize the importance of careful review of the discovery data, the critical roles of protein isoform verification, and the essential features of targeted and thorough validation. Ultimately, when these factors are appropriately considered and implemented, we are optimistic that autoimmune disorders can be transformed by omics technologies and personalized practice can become a reality.

Biochemical markers and their potential role in autoimmune disease
Biological markers are widely used in medicine and can provide an objective measure of normal and pathogenic processes or pharmacologic responses to a therapeutic intervention. By the term 'biological markers' (or biomarkers) we mean an objective molecular indicator or surrogate of pathological processes which possess diagnostic, prognostic or predictive

Creatinine Metabolite
Creatinine is used together with liver enzyme determinations (AST and ALT) as an index of drug toxicity and kidney function (Anders et al., 2002).

C-reactive protein (CRP) Protein
A plasma protein routinely measured as a non-specific index of acute inflammation. CRP levels are typically integrated into clinical response scores, such as DAS28, which help to titrate drug dosage (Pepys & Hirschfield, 2003).

S100 proteins Protein
Phagocyte-derived proteins found in a variety of inflammatory diseases, with the ability to discriminate between diseases (Wittkowski et al., 2008).

Anti-nuclear antibody (ANA) Protein-Autoantibody
Used along with other tests, ANA helps with the diagnosis of arthritides. ANA titers and nuclear stain patterns can vary between patients depending on the condition (≈ 95% of systemic lupus erythematosus (SLE) patients are ANA-positive) (Wilk, 2005). Specific subsets of ANAs can be used to distinguish the type of autoimmune disease, e.g., Sjögren's syndrome. ANA positivity increases the risk of eye disease in juvenile idiopathic arthritis patients.

Rheumatoid Factor (RF) Protein-Autoantibody
RF antibodies target the Fc region of IgG and are detected in 60-80% of RA patients, though they are also present in other inflammatory and connective tissue diseases (Chen et al., 1987). The presence of RF early in the course of RA is associated with more active disease (Nakamura, 2000).

Anti-cyclic citrullinated peptide (CCP) Protein-Autoantibody
Anti-CCP antibody positivity predicts the development of RA and may occur long before the onset of symptoms (Nielen, 2004). Anti-CCP is associated with severe erosive disease and can predict disease progression in RA patients (Meyer et al., 2003).

Anti-dsDNA antibodies Protein-Autoantibody
Anti-dsDNA antibodies are highly specific for the diagnosis of SLE, with a specificity of 95% but a low sensitivity of <60% (Kavanaugh et al., 2002).

Nucleic acid-Gene
Human leukocyte antigen genes have been found to be associated with RA. This association is particularly strong for HLA-DRB1 alleles which share a similar amino acid sequence known as the shared epitope (van der Horst-Bruinsma et al., 1999). The presence of these alleles both increases the risk of RA and is associated with more severe disease (Wagner et al., 1997).

Nucleic acid-Gene
The protein tyrosine phosphatase, non-receptor 22 (PTPN22) allele is a major risk factor for several autoimmune diseases. The protein product increases the tyrosine dephosphorylation of the T-cell receptor, resulting in decreased signaling via this pathway (Vang et al., 2005).

TRAF1/C5 Nucleic acid-Gene
TNF-receptor-associated factor-1 is one of many single nucleotide polymorphisms (SNPs) involved in the pathway of tumour necrosis factor alpha. The protein encoded by TRAF1 mediates signal transduction from the family of TNF receptors (Kurreeman et al., 2007). This SNP is associated with increased susceptibility to, and severity of, RA by influencing TRAF1/C5 function.

Nucleic Acid-Transcript
Pharmacogenomic tests which use cellular transcript measurements to predict drug response are yet to be implemented in the clinic.
Interesting data has emerged on the association between anti-TNF antagonist response and genetic variation (Bowes et al., 2009;Potter et al., 2010).
Table 1. Biomarkers routinely used in the diagnosis and treatment of arthritides, and some potential future markers.
iii. The disease management stage: Early commencement of effective therapy is essential if joint damage and other complications are to be avoided. Historically, monitoring response to treatment is a composite of clinical findings and laboratory markers such as erythrocyte sedimentation rate (ESR), C-reactive protein (CRP) and disease activity score (e.g., DAS28). Treatment is modified according to these parameters. However, disease response can take many months, indeed years. Thus, by the time the patient's disease is deemed unresponsive, substantial joint damage can have occurred. The identification of biomarkers that would predict disease response would have an enormous impact on outcome. Treatment may also be discontinued due to poor tolerability. Identifying such patients in advance would improve patient care and reduce stress. Biologic drugs such as anti-TNF agents, used as third-line therapy, have revolutionised the treatment of rheumatic diseases, and systematic reviews have confirmed their efficacy and relative safety (Alonso-Ruiz et al., 2008). However, these drugs are extremely expensive. Months of treatment can be required before the clinician knows whether they are effective. This is both costly and inefficient. The identification of biomarkers that would predict the response of individual patients to these expensive agents would help patients, clinicians and funding agencies alike. Finally, there are concerns associated with the use of such targeted therapy, i.e., the risk of life-threatening infections and, more worryingly, the long-term risk of malignancy (Bongartz et al., 2006). The ability to identify such patients in advance would protect them from such serious adverse reactions.

The proteomics biomarkers development pipeline

Overview
There is no dispute that new biomarkers would advance the diagnosis and management of autoimmune disorders. The ongoing challenge, however, is how to discover candidate markers and how to validate them, i.e., define their performance characteristics when adopted in a routine clinical setting.
A major impetus for increased interest in biomarkers has been the introduction of the omics technologies. In a single study these allow interrogation of hundreds (or thousands) of independent variables, such as genes, mRNA, metabolites or proteins. Given the volume of information generated from such studies, many anticipated that candidate biomarkers would flow quickly from each new investigation. The reality, however, is otherwise.
Comparing the levels of hundreds (or thousands) of data points in several distinct groups, especially when the sample numbers are small, gives rise to many apparent differences, only some of which are related to biology. Chance alone gives rise to many apparent "distinguishing features"; the trick is identifying the biologically relevant differences and ignoring the others. For example, consider a proteomics experiment in which 200 proteins are measured simultaneously in a control and a test sample. At P < 0.05, about 10 proteins (200 × 0.05) are expected to appear different by chance alone, unrelated to the treatment or condition. This consideration is not presented to undermine the value or utility of omics research, but rather to underscore the importance of verifying any observed differences in follow-up studies. In more conventional scientific studies it is typical to examine a single dependent variable, run replicates, and use standard statistical approaches to analyze the outcomes.
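The arithmetic behind the 200-protein example can be checked with a short simulation. The following is a minimal sketch in Python, assuming that every protein is truly unchanged (so each P value is uniformly distributed between 0 and 1) and that the tests are independent; the function names are illustrative and do not come from the studies cited here.

```python
import random


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests that pass the threshold by chance alone."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Probability that at least one of n independent null tests passes."""
    return 1.0 - (1.0 - alpha) ** n_tests


def mean_false_positives(n_tests: int, alpha: float,
                         n_experiments: int, seed: int = 0) -> float:
    """Average count of chance 'hits' over many simulated null experiments.

    Assumption: under the null hypothesis each P value is uniform on [0, 1).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_experiments):
        total += sum(1 for _ in range(n_tests) if rng.random() < alpha)
    return total / n_experiments


if __name__ == "__main__":
    print(expected_false_positives(200, 0.05))    # expectation from the text
    print(prob_at_least_one(200, 0.05))           # chance of any false lead
    print(mean_false_positives(200, 0.05, 2000))  # simulated average
```

Under these assumptions the simulated average sits close to the expected 10 chance findings, and the probability of obtaining at least one spurious "difference" is effectively 1.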
Omics studies are very different. Data sets are typically of high dimensionality, but the sample size is small. There are typically very few, if any, replicates, and any interesting trends are hidden within the combination of the variables. Under these circumstances the probability of finding associations by random chance is high. Although multivariate statistics can help with the analysis of these complex data sets, there is no easy way out. Any single omics study is best considered as an observational investigation that aids in generating novel hypotheses that can direct future studies. Contrary to what was hoped, the omics methods do not provide a fast-track to biomarkers or shortcut the scientific process. They do, however, allow an investigator to operate independent of existing knowledge and to be less dependent on insight, instinct or experience. A single omics study can provide data from which dozens of testable hypotheses can be formulated or, put another way, it can identify dozens of biomarker candidates. Accordingly, the validation of each candidate biomarker is analogous to hypothesis testing, where the investigator sets out to falsify (or disprove) the claim that candidate "x" is a valid biomarker in defined clinical scenario "y".
In the sections that follow some of the other important components of the biomarker development pipeline are discussed and we highlight the primary concerns that are necessary to optimize the success of biomarker development.

The clinical objective and study design considerations
Several types of biomarkers can be developed, but in each instance the process requires a different study design and unique sample sets. Biomarker development in autoimmune disease must incorporate a cross-section of patients representing the full spectrum of the specific disorder, and given its complex and heterogeneous nature, a panel of markers, not a single marker, will likely be necessary to reflect all relevant clinical parameters. Depending on the groups incorporated into the study and the comparisons made, integrated panels of individual biomarkers may be identified that provide valuable screening/diagnostic, predictive or prognostic information.
Screening biomarkers: Biomarkers that can be used to screen and identify disease before the onset of any symptoms are the Holy Grail of autoimmune disease; however, their discovery and validation present substantial challenges. While some of the biomarkers evident in symptomatic disease may be present early on in asymptomatic individuals, it is more likely that these will be of low abundance and masked by more abundant, non-specific biomarkers of inflammatory and secondary processes related to the disease. Therefore, reliable identification of the biochemical events that earmark early-stage, asymptomatic disease requires access to biobanks with adequate numbers of samples, collected from affected individuals well before disease onset, to give statistical power. (In the case of juvenile RA these samples should ideally be collected from birth onwards.) When samples are available from the same individual both before and after disease onset, patients can serve as their own controls and therefore changes characteristic of the disease can be measured against a relatively constant biochemical background. Sample sets for these studies are, however, difficult to come by and require substantial long-term logistical and financial investment. Consequently, a compromised study design incorporating a relevant control group and early-stage (symptomatic) subjects is more commonly adopted to meet this objective.
Diagnostic biomarkers: Typically, a case-control approach is used in this setting, such that a cohort of disease-free controls is compared to a similarly-sized cohort of diseased subjects. Comparisons of this type require careful design and implementation because observed differences may be non-specific and associated with the consequences of end-stage disease, rather than related to RA itself. Although widely used, for this reason alone the case-control study design is frequently problematic. Case and control samples need to be age- and lifestyle-matched, detailed clinical histories must be available on both cohorts, and strict inclusion/exclusion criteria are required. The value of any markers identified by this approach, and especially those that are not specific to autoimmune disorders alone, can only be assessed through large-scale hypothesis-driven studies performed in a defined clinical setting.
Prognostic or predictive biomarkers: For a prognostic or predictive biomarker study, suitable samples must be available both before and after the measured outcome from each subject. Often these studies use samples collected from completed studies that addressed a different clinical question, but are then used to identify and track a molecular signature (biomarker profile) that may have predictive/prognostic value (i.e., they employ retrospective samples). However, in all instances, prospective (purpose-driven) sample collection is preferred. Although logistically difficult, this approach affords greater control over preanalytical variables including storage time, storage conditions and use of additives. With foresight and planning, a randomized controlled trial with longitudinal sample collection can incorporate multiple nested outcome studies relating, for example, to therapeutic response, disease progression or disease recurrence.

Samples and sampling considerations
It is important to state the obvious: a biomarker study can only be as good as the clinical samples and their associated records. If there are errors with annotation, if patient details are inaccurate, or if the samples themselves have not been collected and stored properly, the exercise of biomarker development may be futile (Poste, 2011). Although there are limited numbers of biobanks available at this time, thankfully, more and more investigators, research centers and commercial entities are committing to establishing and maintaining high-quality sample repositories linked to accurate clinical records. With respect to samples, factors such as the time from sample collection to freezer, the complexity and reproducibility of any sample handling steps, length of storage time, storage temperature and freeze-thaw cycles may affect the stability of some analytes. For example, samples sourced from different cohorts at different locations may 'carry forward' preanalytical background signals with discriminating features unrelated to the biology of the disease (Addona et al., 2009; Davis et al., 2010; Ransohoff, 2010). Ideally, samples should be processed immediately, then aliquoted into airtight tubes and frozen in liquid nitrogen or -70°C freezers. Similarly, multiple freeze-thaw cycles can affect the stability of potential biomarkers (Flower et al., 2000; Rai et al., 2005). Protease inhibition may help preserve sample integrity, but this approach is not without its complications. For example, inhibitors that bind irreversibly to sample components can have undesirable downstream consequences.
Fluids or tissues proximal to the sites of pathology can act as biomarker sinks. As a result, these should also be considered, when available, alongside plasma or serum as a means of focusing the search on pathologically-relevant candidates. For example, in the case of arthritis, synovial fluid, cartilage and synovium are potential sources for biomarker discovery. Here protein variants unique to the site and pathology can be measured before they escape into plasma (Gibson et al., 2009).
Studies indicate that plasma is likely a better substrate for proteome analysis than serum due to the obfuscation of results associated with the high proportion (>40%) of clot-related proteins and peptides in serum (Haab et al., 2005; Rai et al., 2005; Tammen, 2005). Less invasive samples also amenable to protein biomarker discovery include urine, saliva and tear fluid. Although putative biomarkers can come from discovery work, candidates can also come from literature searches and genomic or transcriptomic mining. All candidates, however, must undergo subsequent verification and validation (Pepe et al., 2008).

Discovery strategies
Discovery strategies allow many analytes to be measured simultaneously (i.e., multiplexed analysis). The objective is to identify qualitative and/or quantitative differences across distinct clinical phenotypes that are reproducible and can then be adopted in a clinical setting. As discussed previously, however, what is observed in a discovery setting could be an artifact of statistical chance or experimental bias, and any findings must be rigorously validated. Discovery is typically costly, slow (low-throughput) and labor-intensive. Further, because the methods are not optimized for any single analyte, their performance characteristics are compromised (i.e., limited sensitivity, selectivity and precision). The methods are therefore only suitable for surveying; they are not suited to efficient, precise and accurate quantification. When proteins are the targets of the discovery process, two orthogonal strategies are adopted: peptide-centric and protein-centric.
Peptide-centric (bottom-up or shotgun) strategies: This approach begins with proteolytic digestion of proteins to peptides and the 'digest' is then subjected to fractionation (HPLC) and tandem mass spectrometry (Duncan et al., 2010;Aebersold & Mann, 2003;Chait, 2006).
The tandem mass spectra are then converted to peptide sequences and their precursor proteins are "assumed" by computational approaches. Refinements of this approach sometimes incorporate fractionation prior to digestion and/or multiple stages of fractionation post digestion (Washburn et al., 2001; Wolters et al., 2001). The principal assumption of a bottom-up strategy is that the identity of intact proteins can be ascertained from their constituent peptide fragments. As discussed elsewhere, this assumption is frequently invalid (Duncan et al., 2010).
Protein-centric (top-down) strategies: With protein-centric approaches, intact proteins are first separated, typically by 2D gel electrophoresis, then the proteins are isolated and identified by mass spectrometry. Typically, identification involves enzymatic cleavage of each individual protein to peptides and then either: (a) the masses of the peptide products of each pure protein are determined (via single stage mass spectrometry); or (b) the tandem mass spectrum (fragmentation pattern) of one (or more) of the peptides is determined (via tandem mass spectrometry). One or both of these data sets is/are then used to interrogate a database and identify the protein. Relative protein amounts can be determined from the gel by staining. Because a top-down approach retains the protein integrity, modifications and sequence variations can be investigated. As we will illustrate, the discovery findings should be considered a set of leads that require meticulous validation, especially with respect to the utility of the biomarker(s) in a routine clinical setting.

Biostatistical considerations
Because proteomic studies of clinical samples can generate cumbersome data sets, bioinformaticians are frequently involved in study design (e.g., patient selection and study size calculation) and the hunt for significant and reproducible patterns in the data.Their objective is to find reproducible differences which correlate with a defined clinical outcome and that are independent of the influence of experimental bias, over-fitting and statistical chance.
The incorporation of a randomization strategy in sample analysis reduces bias by accounting for the day-to-day variations in the analytical technique. Similarly, it is prudent to calibrate and record the performance characteristics of the instruments used in the analyses. Calibration in proteomic analyses entails, for example in mass spectrometry, initialising the mass accuracy to a standard mixture of purified proteins or peptides of known mass. Routine calibration of sensitive instruments subject to 'drift' in measurement over a period of time should become part of good laboratory practice. Further, in the discovery phase, the objective is to have sufficient sample numbers to provide confidence that the list of protein candidates is worthy of follow-up during the validation phase.
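One way to put a randomization strategy into practice is to randomize the order in which samples are analyzed while keeping each analytical batch (for example, each day's run) balanced across clinical groups. The sketch below is illustrative only: the function name, the batch concept and the sample identifiers are assumptions made for the example, not a protocol taken from the literature cited in this chapter.

```python
import random


def randomized_run_order(sample_groups: dict, batch_size: int, seed: int = 0) -> list:
    """Return a list of batches (lists of sample IDs) in randomized run order.

    sample_groups maps a sample ID to its clinical group label.
    Groups are interleaved so that every batch contains a mix of groups,
    then the order within each batch is shuffled.
    """
    rng = random.Random(seed)

    # Collect sample IDs per group and shuffle within each group.
    by_group = {}
    for sample_id, group in sample_groups.items():
        by_group.setdefault(group, []).append(sample_id)
    queues = list(by_group.values())
    for queue in queues:
        rng.shuffle(queue)

    # Round-robin across groups so that no batch is dominated by one group.
    interleaved = []
    while any(queues):
        for queue in queues:
            if queue:
                interleaved.append(queue.pop())

    # Cut into batches and randomize the run order inside each batch.
    batches = [interleaved[i:i + batch_size]
               for i in range(0, len(interleaved), batch_size)]
    for batch in batches:
        rng.shuffle(batch)
    return batches


if __name__ == "__main__":
    # Hypothetical cohort: six cases and six controls, four samples per run day.
    samples = {f"case_{i}": "case" for i in range(6)}
    samples.update({f"ctrl_{i}": "control" for i in range(6)})
    for day, batch in enumerate(randomized_run_order(samples, 4), start=1):
        print(day, batch)
```

With equal group sizes, each batch in this sketch receives the same number of cases and controls, so day-to-day instrument variation is not confounded with the clinical grouping.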
Typically in this phase of biomarker development, the sample size is small due to the cost and time of analysis, and sometimes because of the difficulty associated with obtaining samples. However, the number of proteins (independent variables) measured in each sample is typically very large. This ratio of samples to variable size is contrary to the traditional application of multivariate statistics and leads to some unique considerations that have been discussed by others (Dowsey et al., 2009; Karp & Lilley, 2007). Conversely, in the validation phase this relationship is inverted, so that patient cohorts are much larger (typically hundreds to thousands) and the number of biomarker candidates carried over from discovery is reduced depending on the strength of their relationship to the clinical outcome or measure being assessed. The costs incurred by the validation phase therefore sit in a multi-million dollar range, far exceeding the costs of discovery. The financial implications alone may account for the relative dearth of publications on this phase, as the main players, large pharmaceutical or diagnostic corporations, having invested large amounts of time and money, likely strive to protect the resultant intellectual property prior to further clinical testing and pre-market approval.
Bias, or any discrimination occurring due to a non-biological signal, can potentially confound discovery. For example, spurious results may arise because of differences in how patient samples are collected, e.g., type of blood collection tube, time taken to freeze the sample, or the order in which the samples are analyzed. Over-fitting can occur when regression analysis tools are used to 'fit' (too) many variables to a limited set of outcomes. The discriminating 'pattern' or 'signature' then becomes an artifact of the patient cohorts. To resolve issues of bias, statistical analyses must consider the biology of the system being analyzed and take into account the assumptions and limitations of the methods (Ransohoff, 2009). Statistical tests capable of gauging the level of false positives across multiple comparisons include the Student's unpaired t-test (for two group comparisons), ANOVA (for three or more group comparisons) and linear regression (for quantitative or correlative studies) (Dowsey et al., 2009). Alternatively, if the data are not normally distributed, the nonparametric Mann-Whitney and Kruskal-Wallis tests should be substituted (Karp & Lilley, 2007). These methods can be used to analyze the features one at a time and then to compile a ranked list of them based on a combination of P values and effect size. As noted earlier, a longitudinal design can minimize the potential for bias relative to a typical case-control study. Nevertheless, false biomarker leads are common and therefore rigorous validation is essential.
The false discovery rate (FDR) can also be calculated (Benjamini et al., 2003; Storey, 2003; Strimmer, 2008). By setting the FDR level, it is possible to limit the proportion of false positives among the proteins declared differentially expressed, i.e., at an FDR of 0.05, we expect only about 5% of the reported differences to be false positives. However, by doing so, the process of discovery may be compromised by overly stringent criteria. Although proteins displaying the most dramatic changes may appear to be useful biomarkers, it is important to attempt to rationalize their changes to the pathology. For example, acute phase proteins are frequently identified in plasma or serum-based studies as 'specific' biomarkers of a wide range of chronic disorders, including arthritis and cancer, but clearly they are not specific to any one disease (Addona et al., 2009).
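To illustrate how controlling the FDR changes the list of candidates, the following sketch applies the Benjamini-Hochberg step-up procedure, one widely used FDR method. The choice of this particular procedure is an assumption made for illustration (the chapter does not prescribe one), and the P values in the usage example are hypothetical.

```python
def benjamini_hochberg(p_values: list, fdr: float = 0.05) -> list:
    """Return a list of booleans: True where the test is declared significant.

    Benjamini-Hochberg step-up rule: sort the m P values, find the largest
    rank k with p_(k) <= (k / m) * fdr, and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank whose P value is under its rank-specific threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * fdr:
            cutoff_rank = rank

    # Everything ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical P values for ten proteins.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
    naive = sum(1 for value in p if value < 0.05)
    controlled = sum(benjamini_hochberg(p, fdr=0.05))
    print(naive, controlled)
```

In this hypothetical example five proteins pass an unadjusted threshold of P < 0.05, but only two survive at an FDR of 0.05, which shows both the protection against false leads and the loss of candidates that stringent criteria can cause.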
Cross-validation procedures can be used to reduce false positives. In this instance one data set is used to build the model (called training) and a second data set generated from an independent patient cohort is used to assess the predictive accuracy of the model (called testing). Another commonly-used validation strategy is known as K-fold cross-validation, where the data are randomly split into K subsets and the analysis is repeated K times. For each analysis, K-1 of the subsets are used to build a predictive model, with the remaining subset held out for a test of predictive accuracy. Although useful initially after discovery, validation based on splitting a single data set is of limited use because confounding factors can introduce systematic biases into both training and test splits.
Given the issues noted above, it is advisable to validate initial 'discoveries' on independent sample sets, perhaps incorporating analysis by orthogonal methods which are more amenable to the requirements of clinical throughput and precision (Dupuy & Simon, 2007).
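The splitting step of K-fold cross-validation can be sketched as follows. This is a minimal illustration that only generates the training and test indices; the model-building step is deliberately left out, and the function name is hypothetical.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> list:
    """Return k (train_indices, test_indices) pairs covering all samples.

    Each sample appears in exactly one test fold and in the training
    set of the other k - 1 folds.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]

    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    for train, test in k_fold_indices(n_samples=23, k=5):
        print(len(train), len(test))
```

Any feature selection should be repeated inside each training split rather than performed once on the full data set; otherwise information from the held-out samples leaks into the model and the estimated accuracy becomes optimistic.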
Re-analysis or meta-analysis using raw data coming from other research groups is another possibility, although data standards, such as the 'minimum information about a proteomics experiment' (MIAPE) (Taylor et al., 2007), often do not extend into the initial design of clinical studies.Consequently, detailed clinical data may not be captured and reported consistently for clinical proteomics experiments, limiting the ability of investigators to independently verify, combine or correlate data from multiple experiments.
For thorough validation, the number of patient samples required should be determined through the use of statistical tools that take into account the imprecision of the analytical method, inter-patient variability and the acceptable threshold of difference that is deemed significant for a given biomarker application (Ye et al., 2009). Patient numbers (biological replicates) and other statistical considerations of power have also been discussed in detail elsewhere (Cairns et al., 2009).
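As a rough illustration of how these three quantities combine, the sketch below uses the textbook normal-approximation formula for a two-sided, two-group comparison of means, n = 2((z_(1-α/2) + z_(power)) σ / δ)² per group, with the analytical and inter-patient variances added together. This is an assumption made for illustration and is not the specific method of Ye et al. (2009); the parameter names are hypothetical, and all quantities are assumed to be on the same measurement scale.

```python
import math
from statistics import NormalDist


def samples_per_group(delta: float, sd_biological: float, sd_analytical: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of patients needed in each of two groups.

    delta          smallest difference in means deemed clinically significant
    sd_biological  inter-patient standard deviation
    sd_analytical  standard deviation (imprecision) of the assay
    Assumes independent, roughly normal measurements and equal group sizes.
    """
    # Independent variance components add.
    total_sd = math.sqrt(sd_biological ** 2 + sd_analytical ** 2)

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_power = NormalDist().inv_cdf(power)

    n = 2.0 * ((z_alpha + z_power) * total_sd / delta) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    print(samples_per_group(delta=1.0, sd_biological=1.0, sd_analytical=0.0))
    print(samples_per_group(delta=1.0, sd_biological=1.0, sd_analytical=0.5))
```

Under these assumptions, detecting a difference equal to one inter-patient standard deviation requires about 16 patients per group with a perfectly precise assay, rising to about 20 when analytical imprecision equals half the inter-patient standard deviation.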

Feature selection and classifier assessment
Several multivariate analysis tools are available for the analysis of large multidimensional data sets and some of these have been arranged into commercial software packages. Visual tools, including principal component analysis, hierarchical cluster analysis and heat maps, which display variance, relatedness and patterns in data (respectively), are also available and are useful preliminary aids in data analysis. These analyses strive to represent variance in a graphical fashion and give, for example, an overall view of protein expression prevalence within outcome groups in the case of heat maps, or the 'relatedness' of expression levels between different proteins in the case of hierarchical trees (Marengo et al., 2006; Marengo et al., 2008). Emphasis, however, should be placed on using supervised or semi-supervised methods such as distribution-free learning (kernel-based or Bayesian analysis) or support vector machines (SVM), which allow for advanced categorization and classification of multidimensional proteomic data with respect to clinical data. This process is known as 'feature selection' and leads ultimately to the creation of a 'classifier' or biomarker-driven algorithm specific to the disease and outcome being measured (Liu et al., 2009; Zhu et al., 2009). Since most protein expression profiles will likely not be correlated to a specific outcome, supervised methods screen out uninformative proteins and select protein combinations to develop a 'classifier'.
A recent application of the SVM principle has been used to guide feature selection of exhaled peptides as potential biomarkers of asthma (Bloemen et al., 2011). Depending on whether proteome studies are focused on biomarkers for (i) diagnosis (class discovery), (ii) prognosis (outcome-related) or (iii) prediction (supervised prediction), various rationales should be employed to generate and assess the reliability of a classifier. Class discovery methods are best suited for grouping proteins into subsets that elucidate pathways with similar expression profiles across patient subgroups. In outcome-related studies, the goal is to identify which proteins have expression levels that correlate with outcomes grouped into discrete classes: for example, in arthritis patients with a good versus a bad prognosis. When prediction of patient outcome is the aim, supervised prediction methods that use a selected proteome profile are used to generate an algorithm based on individual profiles. In supervised class prediction studies, a totally independent cohort should be used for cross-validation purposes when rigorous testing of a predictive model is desired (Dupuy & Simon, 2007).
The statistical significance of a selected proteome 'classifier' gives an incomplete estimate of its predictive ability and potential clinical utility. The number of true and false positives or negatives should be presented, allowing the calculation of sensitivity and specificity. This reveals clinically-relevant information on how the classifier performs in each outcome category. Lists of statistical tools and recommendations for their application have been reported (Dupuy & Simon, 2007; Karp & Lilley, 2007; Marengo et al., 2006). Depending on the clinical question there may, however, be multiple outcome measures that are not amenable to a simple binary classification system. Statistical evidence of prevalence and analytical limits of detection of a specific group of isoforms can then direct the study towards validation of candidate biomarkers in a much larger group of multi-center patient populations.
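The calculation of sensitivity and specificity from the reported counts is straightforward, as the following sketch shows. The function name is illustrative and the counts in the usage example are hypothetical, chosen only to show the arithmetic.

```python
def classifier_performance(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute basic performance measures from a 2 x 2 confusion table.

    tp, fn: diseased subjects classified as positive / negative
    tn, fp: disease-free subjects classified as negative / positive
    A measure is reported as None when its denominator is zero.
    """
    def ratio(numerator: int, denominator: int):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of diseased detected
        "specificity": ratio(tn, tn + fp),  # fraction of controls cleared
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical cohort: 100 diseased subjects and 100 controls.
    print(classifier_performance(tp=57, fp=5, tn=95, fn=43))
```

Note that the predictive values, unlike sensitivity and specificity, depend on the proportion of diseased subjects in the cohort, so values obtained from a balanced case-control set do not transfer directly to a screening population.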

The biomarker development process
Three distinct phases can be delineated within a typical development pipeline: discovery, verification and validation. These can be further subdivided so that there is a reduced number of candidates at each stage, each with an increased probability of utility. In the subsequent sections we aim to clearly segregate these phases in the biomarker 'pipeline' and further expand on the vastly different requirements of each (Figure 1). This process is prefaced by a brief overview of pre-analytical factors which can introduce unwanted bias or variation.

The discovery phase
In the discovery phase, proteomic platforms are unsupervised and are used to highlight qualitative and/or quantitative differences in multiple proteins across distinct clinical phenotypes. The process of discovery is focused on assessing many candidates, while minimizing the probability of false positives and negatives. Discovery by definition requires an analytical approach which does not preempt the identity of the biomarker candidates. Generally speaking, as most discovery methods prioritise the measurement of as many proteins as possible, they have inherently low throughput, are labor intensive and offer a low dynamic range. These characteristics preclude their use in later phases of the biomarker pipeline. It is also important to realize that, as yet, there is no single method available for looking at the complete complexity of the proteome within a given clinical sample. Because we are working with a relatively blunt set of tools in discovery, we need to transition to more precise methods for validation.
A two-step approach to the discovery phase, though widely used, is not well defined in the literature. In the initial pilot exploration of a low number of individuals, the aim is to gain a grasp of the variability of the whole proteome being measured across the cohort, to select a suitable sample type, to optimize the separation and quantification platform and, ultimately, to calculate appropriate patient numbers to power a second (discovery) round with greater statistical confidence.
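The final step of the pilot round, calculating patient numbers, can be illustrated with a minimal sketch. It assumes a two-group comparison of a single protein, a two-sided test and the standard normal-approximation formula n = 2((z_{1-α/2} + z_{1-β})σ/δ)²; the function name and input values are hypothetical, and a real proteomic study would also need to account for multiple testing across many proteins.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-group comparison.

    effect: smallest difference in mean abundance worth detecting.
    sd:     standard deviation estimated from the pilot cohort.
    Uses the normal approximation; assumes equal group sizes and variances.
    """
    if effect <= 0 or sd <= 0:
        raise ValueError("effect and sd must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical pilot estimate: a difference of one standard deviation.
print(samples_per_group(effect=1.0, sd=1.0))  # 16 per group
```

Halving the detectable difference roughly quadruples the required cohort, which is one reason the second discovery round typically needs far more patients than the pilot.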

Verification of protein modifications
Protein modifications are common but are frequently overlooked, especially during the discovery phase. Amongst the most significant modifications are covalent alterations to amino acids (e.g., phosphorylation, nitration or redox changes) and covalent addition of large groups (e.g., glycosylation). These modifications can have dramatic effects on protein function and may play a significant role in a range of arthritides and autoimmune disorders.
Because most biomarker candidate identification strategies rely on peptide surrogate-based mass spectrometry, there is added potential to characterize low-abundance PTM variants. MALDI-TOF is an example of a mode of mass spectrometry that can scrutinize multiple variants of a given protein in a concurrent, swift and relatively sensitive fashion. Several criteria determine accurate structural assignment and the quantification of specific modifications via a peptide-centric approach (Duncan et al., 2010), including spectra search criteria, sequence coverage and database completeness. Accordingly, changing levels of a modified protein may represent a better biomarker than changes in the total expression levels of a given protein. For example, alterations in the levels of naturally occurring glycosylation motifs can serve as a marker of inflammation, lymphocyte tolerance and senescence in arthritis (Garcia et al., 2005), viz. increased branching of sugar moieties on alpha-1 acid glycoprotein can act as a biomarker of inflammation, whereas decreased branching on the T-cell receptor affects the development of Th1/Th2 cells, increasing susceptibility to autoimmunity (Havenaar et al., 1998; Morgan et al., 2004).

Fig. 2. Protein isoform verification
A depiction of possible qualitative and quantitative changes in protein isoforms between health and a disease state. The illustration of an isoform of a given protein associated with a specific adverse outcome demonstrates that it can only be detected by high 'resolution' proteomic strategies which can detect variance in post-translational modifications. Conventional genomic and antibody-based methods will only pick up a change in expression of recognized transcripts or epitopes, giving a high likelihood of missing the significance of the isoform prevalent in a particular disease outcome.
Recent evidence suggests that oxidative modifications to the proteins S100A8 and S100A9 shift their function from macrophage and neutrophil activation in inflammatory arthritis towards a protective role (Lim et al., 2009). In this case, the modification appears to serve as a regulatory switch. Citrullination of arginine side chains has the potential to alter structure, antigenicity and protein function (Wegner et al., 2010). In fact, synthetic peptides modified to mimic possible neo-antigens which trigger an autoimmune response have been used to identify novel diagnostic/prognostic autoantibodies (McLaren et al., 2005; Papini et al., 2009).
Before disease becomes apparent, it is likely that a particular disease pathology 'specific' protein isoform combination has been expressed for some time, impacting normal physiological pathways. These disease 'specific' proteins may also be expressed in a benign or developing state of the disease devoid of clinical symptoms and may contain a sub-pool of surrogate markers of chronic inflammation. An example from the field of autoimmune disease is a study of systemic lupus erythematosus patients in whom autoantibodies were detected prior to clinical symptoms (Eriksson et al., 2011). Susceptibility to several other autoimmune diseases, including diabetes and rheumatoid arthritis, can be predicted by long periods of pre-clinical autoantibody expression (Bastra et al., 2001; Rantapaa-Dahlquist et al., 2003). Another recent study indicates that aberrant galactosylation of IgG precedes disease onset, correlates with disease activity, and is prevalent in autoantibodies in rheumatoid arthritis patients (Ercan et al., 2010). Evidently these preclinical biomarker 'screening' studies are unique in that they rely heavily on concerted, prospective biobanking of samples, have generally focused on more easily retrieved antibodies and may incur long 'wait times' before a specific disorder occurs. They do however offer a fascinating glimpse of what could be occurring at the protein level prior to disease onset, which arguably could offer a window of opportunity to diagnose earlier, manage the pathology before it becomes clinically symptomatic and possibly prevent aberrant processes altogether. Alterations in protein isoforms may therefore also comprise part of the milieu of pathological changes and thereby serve as biomarkers. Studies aimed at full-length characterization of proteins indicate that preliminary discovery stages may not reflect the full extent of protein variants, owing to the small cohort sizes (and low-throughput techniques) typical of this stage. For example, a study of diabetes patients revealed that, within a cohort of 96 individuals, an average of 3 variants of each protein were observed; a further 8 variants were observed across 1000 individuals (Borges et al., 2010). This highlights the importance of accounting for protein micro-heterogeneity across patient populations and of correlating prevalence with specific disease outcome sub-groups (Figure 2). Statistical evidence of prevalence and analytical limits of detection of a specific group of isoforms should then direct the study towards validation of candidates in a much larger group of multi-center patient populations.

Emerging tools for targeted biomarker validation
The biggest challenge in proteomics remains independent validation of changes 'discovered' in observational investigations. Traditionally, validation has been undertaken by antibody-based approaches, including Western blotting, ELISA and immunohistochemistry (IHC). However, despite major efforts to generate proteome-scale panels of suitable antibodies (most notably the impressive Human Protein Atlas initiative [http://www.proteinatlas.org/index.php]), this remains a slow process. It requires antibody generation and characterization to establish specificity and utility in different assay formats.

Multiple reaction monitoring
Antibody-independent strategies are highly desirable. The most popular of these is based on peptide-centric multiple reaction monitoring (MRM). MRM is a technology that has unique potential for reliable quantification of analytes of low abundance in complex mixtures. In an MRM assay, a predefined precursor ion and one of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification. A series of transitions (precursor/fragment ion pairs) in combination with the retention time of the targeted peptide can constitute a definitive assay (Lange et al., 2008). The combination of MRM, chemistry and software to aid with the selection of suitable proteotypic peptides has provided the opportunity to rapidly develop quantitative multiplexed assays of protein expression and post-translational modification that are both highly specific and sensitive (Scheiss et al., 2009). In recent years, significant advances have been made in the measurement of protein expression using MRM on triple quadrupole (QQQ) mass spectrometers (Pan et al., 2009). In this system, one or more peptide ions of unique and known mass are preselected in the first quadrupole (Q1), induced to fragment in the second quadrupole (Q2), and some of the resulting 'product ions' (or fragments) are selected in the third quadrupole (Q3) for transmission to the detector (Figure 3A). MRM supports the simultaneous measurement of multiple proteotypic peptides and synthetic mass variants of them (usually spiked into samples in known amounts). The strategy enables the absolute quantification of multiple proteins (Keshishan et al., 2007; Kuzyk et al., 2009). When MRM is combined with immunoaffinity purification and internal peptide standards, for example SISCAPA, detection is in the subfemtomolar range (Whiteaker et al., 2010).
In a relatively early demonstration of peptide MRM, assays were developed to simultaneously quantify the expression of sixteen cytochrome P450 enzymes, proteins important in determining susceptibility to adverse drug reactions (Jenkins et al., 2006). Previously, a method was described for the MRM assay of C-reactive protein (CRP) as a means of differentiating erosive from non-erosive RA patients (Kuhn et al., 2004). The same research team then applied the same MRM technique to measure elevated levels in synovial fluid of six additional members of the S100 calcium-binding protein family associated with an erosive subtype of RA (Liao et al., 2004).
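The quantification principle behind spiked synthetic mass variants can be sketched as follows. This is an illustrative outline only: it assumes that the endogenous ('light') peptide and the spiked ('heavy') standard respond equally in the instrument, and that peak areas are simply summed across transitions. The class, the function and all m/z and area values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One precursor/fragment ion pair monitored for a target peptide."""
    precursor_mz: float  # selected in Q1 (hypothetical value, light form)
    fragment_mz: float   # selected in Q3 (hypothetical value, light form)
    light_area: float    # integrated peak area, endogenous peptide
    heavy_area: float    # integrated peak area, spiked labelled standard


def quantify(transitions, spiked_fmol):
    """Estimate the endogenous peptide amount from the light/heavy ratio."""
    light = sum(t.light_area for t in transitions)
    heavy = sum(t.heavy_area for t in transitions)
    if heavy <= 0:
        raise ValueError("no signal from the spiked standard")
    # Assumes equal response of the light and heavy forms.
    return light / heavy * spiked_fmol


# Hypothetical peak areas for three transitions of one peptide.
peptide = [
    Transition(523.3, 689.4, light_area=1000.0, heavy_area=2000.0),
    Transition(523.3, 802.5, light_area=2000.0, heavy_area=4000.0),
    Transition(523.3, 931.5, light_area=3000.0, heavy_area=6000.0),
]
print(quantify(peptide, spiked_fmol=50.0))  # 25.0 fmol
```

Because the standard is added in a known amount, the ratio converts relative peak areas into an absolute quantity, which is what allows results to be compared across samples and laboratories.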

Nucleic acid programmable protein arrays
The production of antibodies against self-antigens (autoantibodies) is a characteristic feature of many autoimmune diseases. At a clinical level, tests for specific autoantibodies, such as ANA positivity, are routinely employed to aid the diagnosis and track the progress of these diseases. Traditionally, autoantibodies have been identified with a one-antigen-at-a-time, hypothesis-driven approach using methods such as immunofluorescence and ELISA. Microarrays provide a particularly effective platform for the systematic study of thousands of proteins in parallel because they are sensitive and require low sample volumes (MacBeath & Schreiber, 2000; Zhu et al., 2001). Protein microarrays involve the display of thousands of different proteins with high spatial density on a microscopic surface. Protein microarrays have been applied to autoimmune biomarker studies focused on pre-symptomatic screening and diagnosis, clinical outcome prognosis and therapeutic response prediction (Hueber et al., 2005; Quitana et al., 2004). With particular relevance to the remit of this chapter, conventional printed arrays have been used to study rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, hepatitis and encephalomyelitis (Fattal et al., 2010; Hueber et al., 2009; Li et al., 2005; Somers et al., 2009; Song et al., 2010).
A In protein multiple reaction monitoring (MRM), one or more peptides of unique and known mass (proteotypic peptides) are preselected in the first quadrupole (Q1), induced to fragment in Q2 by collisional excitation with a neutral gas in a pressurized cell and some of the resulting 'product ions' (fragments) are selected for transition to the detector in the third quadrupole (Q3). B1 Nucleic acid programmable protein array (NAPPA) spotted with genes of interest; all proteins are tagged at the C-terminus to ensure only full-length translated proteins can be captured in situ by co-spotted anti-tag antibodies. NAPPA has consistent protein amounts displayed at each spot; most are within two-fold of the average (Ramachandran et al., 2008). Proteins are expressed "just-in-time" for assay, which eliminates concern about protein stability. B2 Image of NAPPA with 768 randomly selected genes probed with a synovial fluid sample from a patient with juvenile arthritis. Antibodies
Nucleic Acid Programmable Protein Array (NAPPA) is an innovative method to produce protein microarrays, in which cDNAs encoding proteins of interest are spotted onto activated surfaces and proteins are produced in situ using mammalian in vitro expression systems (Ramachandran et al., 2004; Ramachandran et al., 2008). The freshly made protein is captured by co-spotted antibodies specific for a 'tag' encoded at the end of the amino acid sequence. This approach circumvents the labor and cost considerations associated with conventional spotting of labile recombinant proteins into arrays. NAPPA technology recently revealed that ankylosing spondylitis patients' autoantibody responses were targeted towards connective, skeletal and muscular tissue, unlike those of RA patients (Wright et al., 2010). In a recent pilot study, a strong correlation was observed between 768 autoantibodies in paired plasma and synovial fluid samples from patients with juvenile arthritis (Figure 3B).
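A comparison of paired samples of this kind can be sketched as a per-patient correlation followed by a summary across patients. The sketch below assumes Pearson correlation and a median summary; the source does not state which correlation measure was used, and the function name and reactivity values are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation between two equal-length reactivity profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a profile with no variation has no correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical array signals per patient: (plasma, synovial fluid).
paired = {
    "patient_1": ([1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 2.9, 4.2]),
    "patient_2": ([0.5, 1.5, 1.0, 3.0], [0.6, 1.4, 1.2, 2.8]),
    "patient_3": ([2.0, 2.5, 4.0, 1.0], [1.8, 2.7, 3.9, 1.3]),
}
per_patient = {pid: pearson(p, s) for pid, (p, s) in paired.items()}
print(f"median correlation: {median(per_patient.values()):.3f}")
```

Summarizing with the median rather than the mean limits the influence of a single poorly matched sample pair.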

Proteomic profiling methods
Intact protein profiling across clinical cohorts gives a glimpse into the degree of variation evident in a single gene product (Borges et al., 2008a). The same approach may be useful in the study of arthritis. Mass spectrometry-based techniques can potentially distinguish these physical and structural variations and allow the relative abundance of one isoform to be determined (Duncan et al., 2010). By contrast, these variants would be overlooked by conventional ELISA methods (Figure 2). A brief description and recent applications of such techniques follow.
MALDI / SELDI Profiling (Immuno-MALDI): The matrix-assisted laser desorption ionisation (MALDI) mode of mass spectrometry allows the 'soft' ionization of intact proteins which are liable to fragment under conventional ionization methods. The type of mass spectrometer most widely used with MALDI is the time-of-flight (TOF) instrument, mainly due to its large mass range (Figure 3C). Purifying a protein from a clinical sample by immunoprecipitation can greatly reduce the complexity of the proteome being analysed. In one approach, purified polyclonal antibodies that capture the target protein isoforms can be immobilized onto sepharose beads packed within a pipette tip or 'fret' (Borges et al., 2008b). Eluted proteins can then be spotted on a MALDI target plate and spectra obtained. For example, some recent MALDI profiling applications have demonstrated the ability to diagnose early RA and hypertension and distinguish active SLE (Dai et al., 2010; Long et al., 2010; Reid et al., 2010). Glycosylation heterogeneity of selected inflammation-associated molecules such as serum amyloid and vitamin D binding protein has been investigated in cancer and diabetic patients (Rehder et al., 2009; Weiss et al., 2011). As a modification of MALDI, surface-enhanced laser desorption ionization (SELDI) methods can be used to target lower molecular weight proteins (<20 kDa) to differentiate arthritides and therapeutic response (de Seny et al., 2008; Miyame et al., 2005). The technology is currently being developed to affinity capture the protein of interest directly to the mass spectrometry target plate (Brauer et al., 2010).

Biomarker research and grant funding
Although proteomics has been full of promise, few validated biomarkers have made their way into the public domain and even fewer influence clinical practice. There is little doubt that validation is a serious bottleneck in the biomarker development process. While there is abundant discussion of approaches to discovery, the tools for validation and their applications have received little attention. It is very often difficult to receive funding from traditional grant programs to validate markers: funding agencies balk at the prospect of funding a 're-measurement' of the same entity in larger independent cohorts. Additionally, the continuum from discovery through to validation is tedious and extends well beyond the time-frame of a typical research grant. In fact, the time from initial discovery to routine use can be up to a decade (Anderson, 2010; Wilson et al., 2007). A recent example illustrates the seven-year journey from discovery to FDA approval for the multivariate diagnostic test OVA1, used to screen ovarian cancer patients (Fung, 2010).
Similarly, when validation fails it is difficult for academic investigators to publish these 'negative' results; when validation succeeds, the emphasis frequently shifts to commercialization rather than publication.

Conclusions
While there is widespread recognition of the value of biomarkers, scientific progress is slow. Over the years, biomarkers have sometimes been the center of excessive "hype", prompting unreasonable expectations. In addition, the use of biomarkers as surrogate endpoints has led to some public failures when they were felt to be falsely reassuring, creating general skepticism amongst scientists and clinicians alike (e.g., Petricoin et al., 2002).
In addition to limited validation, obstacles still hindering biomarker acceptance include:
Resistance to sharing data across independent efforts - Organizations may work on similar research or discover keystone advances yet resist sharing knowledge because they feel that doing so will jeopardize their competitive advantage. However, sharing information could help companies achieve greater overall progress and reduce costs.
Need for new R&D models with greater precision and flexibility - The industry needs an R&D model with greater precision to improve pipelines, leveraging active clinical knowledge to offset the declining success in new drug development. Some research and development leaders are concerned that using an approach that targets treatment for only a subset of patients decreases profits and increases research costs. Others recognize that this direction has already created value beyond costs and are building these capabilities into their business strategy. For example, Herceptin® is considered an effective targeted treatment for breast cancer. Targeted treatments could actually increase both the medical and economic success of a therapeutic.
Insufficient interoperability - Traditional data resides in disparate places that often do not easily connect. Factor in imaging biomarkers constituted by terabytes of data and you have a complex mix of data from which it is difficult to extract new insights (Poste, 2011). The path forward, interoperability, is a design and intent to have systems share information that relies on data standards and, more importantly, semantics. Semantics use common vocabularies and business rules to relate clinical terms reported across different sources to find common meaning.
Tools to support development, medical care, health policy such as the FDA's critical path, and BioPharma investment decisions - The biomarker development and validation process is necessary but costly for one company. Innovation takes place in many organizations and, as such, stakeholders work redundantly on the same effort. Many collaborative forums exist but these usually involve sharing "safe" information that really does not hasten overall progress. Consequently, most existing biomarkers have taken decades to become part of medical practice.
Currently there are few FDA-approved proteomic tests for autoimmune disease. Although there is little doubt that such tests could help the diagnosis and treatment of arthritis, it is a major clinical and financial challenge to develop, validate and market them. Robust validation data, including evidence of sensitivity, specificity and correlation to the existing limited set of clinical or laboratory criteria, are necessary to support clinical utility. Disease activity scores (DAS-CRP and DAS28), for example, combine inflamed joint count and ESR/CRP to document levels of disease activity at a static time point. The measurement of specific proteins that flag a particular patient's status adds objectivity in circumstances where the clinician currently relies on clinical judgment alone. From a clinician's perspective, it is important to address several questions in a timely fashion for a given patient presenting with autoimmune disease. In each instance, the clinician is attempting to minimize underlying disease and adverse outcomes, such as joint damage in arthritis. Key questions that can currently only be partially answered by clinical observation and patient history include: (a) is this true autoimmune-driven arthritis (i.e., diagnosis), (b) how severe or at what stage is the disease process, (c) what is this patient's likely outcome (i.e., prognosis) and (d) which drugs could abrogate that outcome (i.e., prediction)? Decision-making also extends to selection of therapy: (e) what is the patient-specific titer, (f) which disease subgroups will benefit from a specific therapeutic strategy and (g) when should treatment be terminated? This chapter has addressed and discussed three key areas for consideration which, if addressed after initial discovery work, could provide solid evidence of the clinical utility and commercial viability of candidate biomarkers: (i) limiting bias in study design, (ii) thorough protein isoform verification and (iii) modes of orthogonal and targeted validation.

Glossary-the language of biomarker and proteomic research
Bias - in statistics, is systematic favoritism present in the data collection, analysis or reporting of quantitative research.
Biomarker - or biological marker, is a molecular characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.
Classifier - in statistics, is the formula or criteria for identifying a sub-population based on quantitative information on one or more measurements, traits or characteristics.
Development pipeline - represents the process from candidate discovery, through verification, validation and final pre-market approval.
Diagnostic - in the context of medicine, is any test performed or criteria applied to aid in determining and/or identifying a possible disease or disorder.
Discovery - in the context of biomarkers, describes the initial process of observation, identification and quantification of one or more biological molecules which may act as a classifier.
Isoform - describes the biological phenomenon of several different structural forms of the same protein, which may arise by alternate gene splicing and single-nucleotide polymorphisms before messenger RNA translation, and by chemical modifications (e.g., phosphorylation or glycosylation) which occur post-translation.
Multiplex - in the context of protein assays, is a method or platform which permits the simultaneous measurement of multiple analytes (dozens or more) in a single test.
Omics - this suffix, often used in modern biological research, refers to the lofty aim of observing, identifying and quantifying the totality of a particular class of molecules, e.g., genomics, proteomics.
Orthogonal method - describes the ideal of using alternate types of analyses to corroborate the original findings by independent means.
Peptide-centric - or bottom-up, proteomics is a common method used to identify proteins by proteolytic digestion of proteins prior to analysis by mass spectrometry.
Protein-centric - or top-down, proteomics is a method of intact protein identification, e.g., an ion trapping mass spectrometer used to store an isolated protein ion for mass measurement and tandem mass spectrometry analysis.
Power analysis - a calculation of the number of samples required for a study to reach statistically sound conclusions.
Predictive model - in the context of medicine, is created or chosen to try to predict the probability of a clinical outcome by use of one or more classifiers.
Prognostic - is a clinical test which can forecast the likely course or outcome of an illness.
Sensitivity - measures the proportion of true positives which are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition).
Specificity - measures the proportion of true negatives which are correctly identified (e.g., the percentage of healthy people who are correctly identified as not having the condition).
Throughput - refers to the rate of analysis of samples by a particular method, e.g., analysis of a single protein by Western blotting is relatively low throughput compared to ELISA.
Validation - a later stage in the biomarker pipeline, defined as the documented act of demonstrating that putative biomarker classifiers will consistently lead to the expected results, i.e., establishing sensitivity and specificity performance in large populations and beginning to optimize the assay for commercial use.
Verification - an intermediate phase in the biomarker pipeline bridging discovery and validation, which typically reduces the number of candidates, confirms specific protein isoforms within a classifier and begins to assess the sensitivity in expanded populations.

Fig. 1. Biomarker pipeline
The table describes the aim, the likely analytical platform and associated characteristics of each phase in an ideal biomarker discovery pipeline through verification to validation and final pre-market approval. The schema represents the increase in patient sample numbers and decrease in candidate protein numbers as a biomarker study moves from discovery (two-step) through to validation phases. 2DE, 2-dimensional gel electrophoresis; DIGE, difference in-gel electrophoresis; LC-MS, liquid chromatography associated with mass spectrometry; ELISA, enzyme-linked immunosorbent assay; MRM, multiple reaction monitoring mass spectrometry; IVDMIA, in vitro diagnostic multivariate index assay.
in patient samples bind to their antigen targets on the array and are detected by Alexa647-conjugated goat anti-human IgG. B3 Scatterplot of reactivity on NAPPA between paired plasma and synovial fluid samples from arthritis patients. Median correlation is 0.982. C1 Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry, whereby proteins or peptides embedded in a crystallized matrix are ionized by a high-frequency laser beam and accelerated through a flight tube by an electrical field; ions 'fly' and reach the detector plate with respect to their mass:charge ratio. C2 A spectrum is generated which reflects the energy of a given ion vs the mass:charge ratio (m/z). C3 A bird's-eye view representation of the spectra reveals distinguishing peaks (*) from the six samples analysed.

Fig. 3. Targeted identification methods
DSG would like to acknowledge continued support from Arthritis Research UK in the form of a Travelling Fellowship (No. 19250). The UCD Conway Institute and the Proteome Research Centre are funded by the Programme for Research in Third Level Institutions, as administered by the Higher Education Authority of Ireland. SRP acknowledges support for equipment from Science Foundation Ireland. DSG and MR would like to acknowledge the funding of this research by Arthritis Research UK in the form of a Project Grant (No. 18748).