Autoantibody-Based Diagnostic Biomarkers: Technological Approaches to Discovery and Validation

Autoantibodies produced against self-antigens, or ‘autoantigens’, result from a loss of self-tolerance triggered by genetic and/or environmental factors which induce the immune system to attack the host’s own cells, resulting in a condition referred to as autoimmunity. In classic autoimmune diseases, it is well established that the pathology relates directly to the autoantibodies. However, it is increasingly recognised that autoantibodies are also found in many other disease areas, including cancers, cardiovascular and neurodegenerative diseases, as well as infectious diseases such as malaria, although in such diseases it is unclear whether the autoantibodies play a direct role in the pathology or whether they are merely symptomatic of disease. Irrespective of whether they are causative or symptomatic of specific diseases, there is increasing interest globally in exploring the clinical potential of circulating autoantibodies as diagnostic biomarkers. This chapter provides an overview of the diagnostic utility of autoantibody biomarkers in a range of disease areas and discusses their potential utility in disease staging, treatment monitoring and in prediction of immune-related adverse events. It also provides an overview of traditional and contemporary technological approaches to autoantibody biomarker discovery and validation, focusing on protein microarrays that are ideally suited to this important area of research.


Introduction
Autoantibodies are natural antibodies produced against self-antigens, or 'autoantigens', and can induce the immune system to attack host tissues, leading to a condition generically referred to as autoimmunity. Classic autoimmune syndromes include systemic lupus erythematosus, rheumatoid arthritis, rheumatic heart disease, Graves' disease, autoimmune hepatitis, multiple sclerosis, diabetes, and Sjogren's syndrome. In such autoimmune diseases, it is well established that the pathology relates directly to the autoantibodies. However, it is increasingly recognised that autoantibodies are also found in many other diseases, including cancers, cardiovascular diseases and neurodegenerative diseases, as well as infectious diseases such as malaria, although in such diseases it is not yet clear whether the autoantibodies play a direct role in the pathology or whether they are merely symptomatic of disease. Irrespective of whether the autoantibodies are causative or symptomatic of specific diseases, there is increasing interest globally in exploring the clinical potential of circulating autoantibodies as diagnostic biomarkers, and considerable research effort is now being directed to the discovery, quantitation and validation of novel autoantibody-based diagnostic biomarkers in many different disease areas.
Numerous techniques have been utilised over the last few decades to detect the presence of autoantibodies in patient samples, not least since autoantibodies are increasingly thought to represent excellent potential biomarkers for early disease detection. Techniques that have historically been employed for biomarker identification include western blotting, immunohistochemistry and enzyme-linked immunosorbent assays (ELISA), but these are being superseded now by newer technologies that offer higher multiplicity as well as greater sensitivity and specificity. Amongst these newer technologies, protein microarrays are becoming established now as a powerful means to detect protein expression levels and to investigate protein-ligand interactions, as well as to probe protein function [1], since they enable efficient and sensitive, high throughput protein analysis, with large numbers of technically-replicated measurements being made in parallel using miniaturised assay formats and minimal sample volumes. These properties of protein microarrays make them ideally suited to component-resolve and quantify autoantibody profiles in biological samples.

Autoantibody classes
Antibodies are secreted heterodimeric proteins comprising light and heavy chains which are produced in mammals through recombination of V(D)J segments in developing B-lymphocytes. At any one time, there are thought to be of the order of 10^7 to 10^8 different antibody sequences present in human serum. In response to the presence of foreign antigens or pathogens, somatic hypermutation processes drive the affinity maturation of specific antibody sequences, resulting in the production of high affinity, antigen-specific antibodies. Affinity-matured antibodies, or immunoglobulins (Igs), are produced by plasma cells and secreted into the blood stream where they scavenge their cognate antigen for destruction. Antibodies thus play a crucial adaptive role in mammalian defence mechanisms against harmful components that can cause disease. There are five classes of antibodies: IgG, IgM, IgE, IgD and IgA, which differ in their structures and immune functions. IgG is the major antibody class found in blood, has the longest serum half-life of all immunoglobulin isotypes [2] and contributes directly to a neutralising immune response to extracellular pathogens and toxins. IgA is also involved in direct neutralisation of toxins, viruses and bacteria; however, it concentrates particularly in mucosal surfaces. IgM, a pentameric immunoglobulin, is the largest of the antibody classes and is associated with a primary immune response; IgMs are therefore frequently used to diagnose acute exposure to an immunogen or pathogen [2]. IgD and IgE are found in trace amounts in the blood with short half-lives. IgD remains membrane-bound and is involved in regulation of cell activation while IgE is associated with hypersensitivity and allergic reactions [2].
Classical autoantibodies are typically IgMs and include: anti-nuclear antibodies (ANA), which bind to the nuclear membrane, nucleoplasm, nucleoli and nuclear organelles of cells [3]; rheumatoid factor (RF), which binds with relatively low affinity to the Fc region of IgGs and which is found in the serum of rheumatoid arthritis (RA) patients [4]; anti-double-stranded DNA (dsDNA) antibodies, anti-Sm antibodies, antiphospholipid antibodies, anti-Ro, anti-ribonucleoprotein and anti-La antibodies, which are all frequently found in systemic lupus erythematosus (SLE) patients [4]; and anti-Sjogren's syndrome A (SSA) and -B (SSB) antibodies, which are found in many patients with Sjogren's syndrome [4].

Causes of autoantibody production
In a normal immune response to a foreign antigen, professional antigen-presenting cells (including dendritic cells, B-cells and macrophages) engulf and proteolyse the antigen and then present antigen-derived peptides on their cell surface in the form of major histocompatibility complexes; recognition of complexed peptides by a specific receptor on a T-cell then triggers the release of cytokines and chemokines, resulting in activation of that T-cell. Interaction between antigen-specific T- and B-cells subsequently leads to antigen-specific B-cell proliferation [1,2]. A portion of those B-cells serve as memory cells, whilst the remainder act as effector cells that differentiate into antibody-producing plasma cells responsible for the production and release of antigen-specific antibodies [5].
Peripheral tolerance mechanisms usually ensure that self-reactive T- and B-cells (i.e. displaying T- or B-cell receptors for self-antigens) are suppressed. However, in certain circumstances, peripheral tolerance can be broken, resulting in proliferation of autoantigen-specific T- and B-cells. Simplistically, peripheral tolerance can be broken for a number of reasons, for example if the self-antigen is significantly over-expressed in a tissue or if neoantigens are somehow presented to the host immune system. Such neoantigens can include mutated peptide epitopes, aberrantly spliced or aberrantly post-translationally-modified epitopes, or new discontinuous epitopes resulting from misfolding of the antigen. Tolerance defects can also stem from the downregulation of regulatory T-cells (Tregs) [6], whilst chronic inflammatory responses are thought to facilitate the release and exposure of intracellular antigens to the immune system, resulting in autoantibody production in cancer patients [7], as well as increased vasculature permeability, allowing immune cell accumulation at the tumour site [8]. One consequence of loss of peripheral tolerance can be the production of self-antigen-specific autoantibodies.
As mentioned above, autoantigens may result from aberrant post-translational modifications, including proteolysis, hydrolysis, phosphorylation and oxidation [9]. One such example occurs in RA, where patients produce autoantibodies against citrulline-modified proteins, themselves produced by the enzymatic action of peptidylarginine deiminases (PADs), calcium-dependent enzymes that catalyse the post-translational hydrolysis of peptidylarginine to peptidylcitrulline. During inflammation, oxidative stress or apoptosis, PAD converts specific arginine residues on selected proteins into citrulline (a process often referred to as 'citrullination'), thereby producing neoepitopes that are recognised as non-self, dramatically altering immunogenicity and autoantibody production in RA patients [10].
Autoantibodies are also produced in response to the uncontrolled release of autoantigens during cell death processes. Maintenance of tissue homeostasis ordinarily takes place via clearance of apoptotic and altered cells through phagocytosis- or complement-dependent mechanisms, inhibition of inflammation, removal of misfolded proteins, and regulation of autoantibody-producing B cells [11]. However, when clearance mechanisms become compromised, dead cells accumulate and progress to secondary necrosis, releasing autoantigens as well as pro-inflammatory markers and thereby disrupting immune homeostasis [12] (Figure 1).
Autoantibody production thus has a multifactorial aetiology in which environmental and inherited factors interplay in determining the autoantibody profiles of an individual. Environmental factors associated with autoantibody production include drugs, toxins, chemicals from personal care products, and infections. Exposure to such agents can result in modification or mutation of chromosomal DNA sequences, potentially giving rise to altered gene and protein expression (which can drive altered post-translational modifications), as well as to the expression of aberrantly-spliced or mutated forms of proteins, all of which can result in the generation of neoantigens in the exposed tissue and hence in the production of specific autoantibodies. Furthermore, genetic predisposition or family history of autoimmune disorders also contributes to one-third of the risk of having increased autoantibody levels, and various genome-wide association studies have shown that the production of autoantibodies in SLE [13], RA [14] and Multiple Sclerosis [15] is controlled by multiple loci.
Although the self-reactivity of autoantibodies can be harmful to host tissues, recent studies suggest that low-grade self-reactivity also occurs in healthy individuals, implying that certain autoantibodies may play a role in maintaining immune homeostasis [16] and in protecting against pathogenic processes, by activating innate and acquired immunity to maintain or restore health status [16]. Natural autoantibodies are predominantly of the IgM class, which makes sense since IgM is the first antibody to appear when the immune system is triggered in response to external antigenic exposure. By contrast, circulating naive IgMs arise without known immune exposure or vaccination [11] but have also been reported to recognise certain autoantigens in healthy adults as well as in newborn babies [17,18].

Gender bias in autoimmune diseases
The term 'autoimmune disease' refers to a group of over 80 distinct disorders, the symptoms and severity of which vary between individuals [19].
There are marked differences in diseases that predominantly affect males or females, as shown in Figure 2. Generally, females are more susceptible to autoimmune diseases whereas males show increased susceptibility to non-reproductive cancers. As females tend to have a more responsive and robust immune system than their male counterparts, it is not surprising that females respond more aggressively to autoantigens and are more susceptible to autoimmune diseases [20]. Other factors that contribute to the sex bias of autoimmune diseases include X-chromosomal abnormalities, X-chromosomal inactivation, and fetal micro-chimerism [20].

Functional role of autoantibodies in disease
The outcome of aberrant activation of the immune system and inflammatory process is dependent on multiple factors, including the type of affected tissue or organ and the degree of tissue injury sustained [21]. For example, in type 1 diabetes mellitus, the immune system reacts to insulin-producing cells in the pancreas. In other examples, tissues of the small intestines are affected in inflammatory bowel disease, while myelin (a fatty substance that protects nerve fibres in the brain and spinal cord) is destroyed in Multiple Sclerosis. In RA, connective tissues are affected and in SLE, auto-reactivity usually occurs in skin, heart and lung tissues. Sjogren's syndrome occurs when autoantibodies target secretory glands that produce tears and saliva, causing extreme dryness and other complications [22].
In other diseases, however, the functional role of autoantibodies is less clear. For example, in neurodegenerative diseases such as Parkinson's and Alzheimer's Diseases, increased cellular toxicity is caused by the accumulation and aggregation of misfolded proteins, which might also result in the generation of protective autoantibodies in some patients. In Parkinson's Disease (PD), the protein alpha-synuclein misfolds and aggregates to form Lewy bodies; these bodies form in the brain tissues of PD patients and infiltrate the neurons, disrupting signalling processes in the brain. A recent study reported that a defined set of epitopes derived from alpha-synuclein drive cytotoxic T-cell responses in people with PD [23], whilst another recent study reported a decline in anti-alpha-synuclein autoantibodies in PD patients compared to controls, suggesting that in some patients anti-alpha-synuclein autoantibodies might play a protective role [24].
Similarly, in Alzheimer's Disease (AD), the microtubule-associated protein Tau accumulates and aggregates in neurons causing neuronal degeneration. Tau also accumulates and aggregates in Progressive Supranuclear Palsy (PSP), a rare disease often misdiagnosed as Parkinson's disease. In AD, Tau causes misfolding of beta-amyloid, leading to amyloid-β (Aβ) plaque formation and downstream pathology, but in PSP, Tau itself mis-folds and agglomerates. These protein agglomerations subsequently leave the cell, spread throughout the brain and disrupt the communication between neurons [25]. Interestingly, a recent study identified an anti-Aβ plaque autoantibody in certain aged but cognitively firm individuals that was absent in AD patients; this autoantibody was cloned and has been shown to selectively target aggregated Aβ in a mouse model of AD, where it bound parenchymal Aβ and reduced soluble and insoluble Aβ in a dose-dependent manner; in Phase 2 clinical trials, this autoantibody, Aducanumab, reduced brain Aβ in patients with mild AD, again in a dose-dependent manner [26], strongly suggesting that anti-Aβ autoantibodies play a protective role in healthy individuals.
In cancers, chronic inflammation is a well-recognised hallmark and it is known that both cancer and autoimmune diseases can occur in the same individual, although in cancer the immune response is often suppressed and unable to eliminate altered self-cells, while in autoimmune diseases it is hyper-activated against specific autoantigens. The act of manipulating the immune system in different ways, however, suggests a possible link between these two conditions [21] and it seems likely that inflammatory processes drive both autoimmunity and malignancy. However, it remains unclear whether it is the underlying autoimmunity that leads to malignancy ("inflammation-induced cancer") or whether the immune responses directed against tumour antigens lead to autoimmune diseases ("tumour-induced autoimmunity").

Early detection of disease
Autoantibody production is a key indicator of many diseases and has emerged as an important tool in predicting onset of a number of diseases. Autoantibodies are in principle detectable many years before manifestation of disease or symptoms and have been observed in an ever-widening range of disease areas, which makes novel autoantibodies attractive candidate biomarkers for early diagnosis of a broad spectrum of diseases. Known autoantibody biomarkers have been reported to predate symptoms in Sjogren's syndrome, rheumatoid arthritis, Alzheimer's disease and cancers, as discussed below.
Sjogren's syndrome is an autoimmune disease that affects parts of the body which produce secretions such as tears and saliva. The symptoms overlap with other autoimmune conditions and can range from mild to severe, causing nausea, fatigue, joint pain as well as excessive dryness of the eyes and mouth. Autoantibodies attack cells in mucous membranes and moisture-secreting glands of the eyes and mouth, causing dryness, irritation and pain. A study published in 2015 concluded that autoantibodies are present up to 18-20 years before the diagnosis of primary Sjogren's syndrome [27]. A total of five autoantibodies were analysed, namely antinuclear antibodies, rheumatoid factor and autoantibodies against Ro 60/SSA, Ro 52/SSA, and La/SSB, with 81% of the patients who became seropositive after diagnosis having autoantibodies in pre-diagnostic serum samples. More importantly, these autoantibodies were present in the earliest available serum sample of 95% of the patients who expressed autoantibodies before diagnosis and before the onset of first symptoms [27].
RA is a chronic autoimmune disease characterised by inflammation of synovial joints, leading to joint erosion and deterioration. Rheumatoid factor and anti-cyclic citrullinated peptide (anti-CCP) are detected in the blood of 80% and 60-70% of RA-affected individuals, respectively. Anti-CCP autoantibodies were detected in some patient serum samples 12-14 years prior to the development of RA and 34-40% of the RA patients were anti-CCP positive prior to disease onset [28].
The presence of autoantibodies has also been implicated in AD. Aβ-autoantibodies were reported to show promise as an effective blood biomarker for AD and a positive association between Aβ-autoantibody titres and cognitive status has been reported [29]. Glial autoantibody markers to glutamate were detected in the plasma of AD patients and, interestingly, the level of that autoantibody in patients with moderate and severe dementia was 2-fold higher than that in patients with mild dementia [30]. In addition, autoantibodies to ATP synthase were reported to be found frequently in the sera of AD patients but not in age-matched healthy subjects or in patients with Parkinson's disease or atherosclerosis, suggesting anti-ATP synthase autoantibodies could be a specific biomarker for AD [29].
In addition to autoimmune diseases, multiple studies have described autoantibody production prior to cancer diagnosis, including in lung cancer [31], prostate cancer [32] and ovarian cancer [33]. Autoantibody production in cancer is thought to be a product of immunosurveillance, a process in which the body's own systems recognise and eliminate abnormal cells during early tumorigenesis [34], suggesting that detection of disease-associated autoantibodies may be feasible in the asymptomatic stages of cancer and may predate the clinical signs of tumour progression by months or years, thus enabling their use in early diagnosis [35].
By way of example, in healthy mammalian cells, cAMP-dependent protein kinase A (PKA) is an intracellular enzyme, while in most cancers it is secreted into the circulatory system as extracellular PKA (ECPKA). The level of ECPKA was found to be elevated in various stages of a wide range of cancers including bladder, breast, cervical, colon, esophageal, gastric, liver, lung, ovarian, prostate, pancreatic, renal, small bowel, rectal, adenocystic carcinomas, melanoma, sarcoma, thymoma, liposarcoma, and leiomyosarcoma compared with healthy controls [35]. The ECPKA autoantibody is thus a potential serological biomarker for early-stage cancer diagnosis since it presents at high levels before surgical removal of solid tumours and diminishes after tumour removal [36]. An ELISA-based test for anti-ECPKA IgG was developed and the sensitivity and specificity of this biomarker for detecting 20 different cancers were reported to be 90% and 87%, respectively, with the anti-ECPKA autoantibody being detected in 90% of the patient samples but in only 13% of the control samples [35].
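The link between the reported detection rates and the quoted sensitivity and specificity can be made explicit with a short calculation. The sketch below is illustrative only: the cohort sizes of 100 patients and 100 controls are assumptions (the text gives percentages, not counts), and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of patients who test positive
    specificity = tn / (tn + fp)  # fraction of controls who test negative
    return sensitivity, specificity


# Assumed cohort sizes: 100 patients and 100 controls (not from the source).
# 90% of patients positive  -> 90 true positives, 10 false negatives.
# 13% of controls positive  -> 13 false positives, 87 true negatives.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=87, fp=13)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Under these assumptions, detection in 13% of controls corresponds to a specificity of 87%, which is consistent with the figures quoted above.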
Autoantibody-based screening for a variety of other cancers has also been carried out in laboratory environments. For example, Xie et al. developed a test platform by combining the detection of six autoantibodies directed against prostate cancer with PSA levels, increasing the accuracy of detection from 65% using PSA alone to 81% with both methods [37]. A similar outcome was achieved in breast cancer diagnosis using a panel of six autoantigens to detect ductal carcinoma in situ and lung cancer with specificity of 85 and 92%, respectively [38]. It is thus evident that having an increased level of specific circulating autoantibodies may reflect the overall state of the immune response of an individual, whilst the presence of such autoantibodies in otherwise healthy individuals might be an indicator of future autoimmune or other disorders.

Disease staging and treatment monitoring
An accurate pathology diagnosis is of central importance in precision medicine, since it should guide choice of the most effective treatment and management regimens. Reliable biomarkers for monitoring and prediction of disease course, stage and progression will therefore be invaluable, particularly in therapeutic decision-making to treat disease at an early stage. Current research has not only established the presence of autoantibodies in several diseases but has also shown that they have the potential to be used as biomarkers for diagnosing and staging various degrees of pathology. One such example is type 1 insulin-dependent diabetes, a chronic autoimmune disease that impairs the insulin-producing beta cells in the pancreas, preventing the body from producing enough insulin to regulate blood glucose levels. This disease can be characterised into well-defined stages, and the rate of progression to symptomatic disease can be predicted with appreciable accuracy [39]. Stage 1 is defined by the presence of two or more islet autoantibodies and progresses at a variable rate to a second stage of glucose intolerance or dysglycaemia (stage 2), before the disease becomes clinically symptomatic (stage 3).
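The staging scheme described above amounts to a simple decision rule, which can be sketched as follows. This is a deliberately simplified illustration of the three stages as worded in the text, not a clinical algorithm; the function and parameter names are hypothetical, and returning None for individuals with fewer than two islet autoantibodies is an assumption made for the sake of the example.

```python
def t1d_stage(n_islet_autoantibodies, dysglycaemia, symptomatic):
    """Illustrative staging rule following the three stages described in the text.

    Returns 1, 2 or 3, or None if the stage 1 criterion (two or more islet
    autoantibodies) is not met. Simplified sketch only, not for clinical use.
    """
    if n_islet_autoantibodies < 2:
        return None  # assumption: staging scheme not applicable
    if symptomatic:
        return 3     # clinically symptomatic disease
    if dysglycaemia:
        return 2     # glucose intolerance or dysglycaemia
    return 1         # autoantibody-positive, normoglycaemic, asymptomatic


print(t1d_stage(2, dysglycaemia=False, symptomatic=False))
print(t1d_stage(3, dysglycaemia=True, symptomatic=False))
```

The rule checks the most advanced stage first, so an individual who meets several criteria is assigned the highest applicable stage.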
In another example, Cai et al. reported that anti-p53 antibodies develop several years before the clinical diagnosis of certain cancers and suggested that monitoring the change in serum p53 antibodies before and after radiotherapy in patients diagnosed with oesophageal carcinoma would be useful for evaluating the prognosis and response to the treatment. This study showed that the positive rate of p53 antibodies in patients with oesophageal carcinoma was related to histological grade, stage of the disease and lymph node metastasis but not to age, gender, or site of tumour formation. The study also reported a significant difference in the level of serum p53 antibodies before and after radiotherapy treatment, with the positive rate of p53 antibodies in patients who responded to radiotherapy being much lower than in the patients who did not respond to radiotherapy [40]. A separate study by Shimada et al. showed that seropositive oesophageal squamous cell carcinoma patients, whose serum anti-p53 titre did not decrease after surgery, exhibited worse prognosis than patients who showed seroconversion. Thus, a correlative study between the level of tumour autoantibodies and the overall survival outcome of cancer patients (reflected in the change in tumour status or tumour burden related to the therapy) could be extremely informative for evaluating therapeutic interventions [41]. Stage-specific autoantibody biomarker screening is thus in principle useful in predicting onset of disease, thereby providing an opportunity to intervene and delay or prevent the onset of clinical symptoms.

Immune-related adverse events
Immunotherapies have been changing the outlook for many cancer patients in recent years and immune checkpoint inhibitors represent one of several strategies now targeting the immune system for therapeutic benefits. The immune checkpoint proteins cytotoxic T-lymphocyte associated protein 4 (CTLA-4) and programmed cell death-1 (PD-1) play essential roles in central immune tolerance and are prominent targets for cancer vaccines now since inhibition of CTLA-4 and PD-1 can (re)activate the immune system to target cancer cells. Alone or in combination, clinical trials of anti-CTLA-4 and anti-PD-1 antibodies, such as Ipilimumab, Nivolumab and Pembrolizumab, have shown promising results for the treatment of melanoma, non-small cell lung, kidney, prostate, and head and neck cancers, as well as renal cell carcinomas, with reported therapeutic response rates approaching 70% in some cases, although positive immunotherapy outcomes remain cancer- and patient-specific [42].
Ipilimumab was the first anti-CTLA-4 antibody to prolong survival in patients with advanced melanoma [43,44], with long term analysis indicating a 3-year survival of 22% across all patients with sufficient follow-up [45]. Similarly, PD-1 blockade with Nivolumab or Pembrolizumab has improved survival for metastatic melanoma, non-small cell lung cancer (NSCLC) and renal cell carcinoma (RCC) patients [46][47][48][49][50]. In one trial, advanced melanoma patients treated with pembrolizumab showed a response rate of 34% and a survival rate of 74% [51]. Nivolumab has been reported to result in increased response rates, survival and progression-free survival when compared to intravenous docetaxel in NSCLC [52], whilst stage III/IV melanoma patients achieved a partial tumour response, with a median progression free survival of 172 days, with only 18% experiencing grade 2 or 4 adverse events [53].
Combination checkpoint inhibitor treatments, targeting both CTLA-4 and PD-1, have also shown strong promise, with clinical trial data in untreated melanoma patients reporting objective response rates up to 72% (amongst patients with PD-L1-positive tumours) and with median progression-free survival of 11.5 months for ipilimumab plus nivolumab, compared to 2.9 months with ipilimumab alone and 6.9 months with nivolumab alone [54]. However, high grade immune-related adverse events (irAEs) occurred in 55% of those in the combination treatment group [54] and similarly high rates of irAEs have been reported elsewhere for anti-CTLA-4 and anti-PD-1 treatments [55].
Indeed, clinical findings on monoclonal antibody-induced adverse effects in general show that this is a wider phenomenon across different disease areas [56], which potentially compromises the effectiveness of such immunotherapies. Efforts are therefore being channelled towards predicting and monitoring undesirable immunotoxic effects, and a panel of potential antibodies associated with irAEs has been proposed (Table 1). Further exploratory studies involving autoantibody-based immunotoxicity profiling in immunotherapy patients are underway to better characterise the role and diagnostic potential of these circulating autoantibodies in irAEs.

Western blots
Since its introduction in 1979, immunoblotting, or 'western blotting', has become a ubiquitous protein analysis technique in which proteins are separated by electrophoresis according to their molecular weight, then transferred onto a membrane before a primary antibody specific to the protein of interest is used to detect the presence and relative abundance of the target protein. Conventional western blotting allows detection of specific proteins to the level of single isotypes. However, it is associated with poor reproducibility, limited mass resolution, lack of accurate quantitation, low throughput and lengthy time to result, whilst non-specific cross-reactivity of mono- and poly-clonal primary and secondary antibodies on the blots is an everyday observation.
Certain modifications have been proposed to improve quantitation of western blots; for example, Zellner et al. reported a novel and improved quantitative Western blotting method using fluorescently labelled secondary antibodies, which extends the dynamic range of quantification and improves correlation with the protein amount [57]. Modifications based on simultaneous electrophoretic transfer of proteins from multiple strips of polyacrylamide gels to a single membrane sheet have also been reported to increase the data output per single blotting cycle by up to 10-fold [58], whilst resulting in reduced immunoblotting-derived signal errors and improving the overall data accuracy [58]. However, in the context of biomarker discovery, western blotting is typically only used as a validation method rather than as a primary method of identifying biomarkers.

Enzyme-linked immunosorbent assay (ELISA)
ELISA, unlike western blotting, is adaptable to higher throughput of samples as it is typically performed in 96-well microtitre plates whereby plate handling and detection systems can be automated. ELISA can be used to determine the exact amount of a specific protein in a sample, making it more readily quantitative as a technique compared to western blotting. The signals are usually produced by a chromogenic reaction that generates a coloured product, which is quantified by spectrophotometry. There are four types of ELISA (sandwich, direct, indirect and competitive), which essentially differ in whether the antigen or a capture antibody is immobilised onto the surface (Figure 3).
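Quantitation by ELISA generally relies on comparing the absorbance of a sample with that of a dilution series of known standards. The sketch below illustrates the idea with a simple interpolation that is linear in log concentration between neighbouring standards. All standard values are invented for illustration, the function name is hypothetical, and in practice a fitted curve (for example a four-parameter logistic model) is often preferred over piecewise interpolation.

```python
import math

# Hypothetical standard curve: (concentration in ng/mL, absorbance reading).
# These values are invented for illustration only.
STANDARDS = [(0.1, 0.08), (1.0, 0.35), (10.0, 1.10), (100.0, 2.05)]


def concentration_from_absorbance(absorbance, standards=STANDARDS):
    """Estimate concentration by interpolating between bracketing standards.

    Interpolation is linear in log10(concentration) versus absorbance.
    Readings outside the standard curve are rejected rather than extrapolated.
    """
    pts = sorted(standards, key=lambda p: p[1])
    if not pts[0][1] <= absorbance <= pts[-1][1]:
        raise ValueError("absorbance outside the range of the standard curve")
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pts, pts[1:]):
        if a_lo <= absorbance <= a_hi:
            frac = (absorbance - a_lo) / (a_hi - a_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c


# A reading midway between the 1 and 10 ng/mL standards.
print(round(concentration_from_absorbance(0.725), 2))
```

Rejecting out-of-range readings reflects the usual practice of re-assaying a sample at a different dilution rather than extrapolating beyond the standards.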
In the context of autoantibody detection, the direct and indirect ELISA formats are most commonly used, but are better suited to the analysis of a larger number of samples against a small number of antigens in screening, verification and validation applications rather than as a primary discovery platform [59]. Furthermore, standard ELISA often has relatively low sensitivity and detection usually depends on enzymatic amplification of signal at the end of the assay. In addition, ELISA can also give false positives due to cross-reactivity of the detecting antibodies with other proteins in the sample. As sensitivity and specificity are prerequisites of any biomarker discovery platform, traditional ELISA may not be the ideal choice when it comes to identifying biologically relevant and meaningful disease biomarkers.

Mass spectrometry
The use of mass spectrometry for serum biomarker discovery is in theory straightforward since results are obtained in the form of identified and quantified proteins that are then compared between pathological and control groups [60]. Recent advances in mass spectrometry instrumentation have significantly improved the depth, breadth and reproducibility of protein identifications in many biological samples, which in turn has aided the identification of meaningful signatures that have diagnostic potential. However, whilst mass spectrometry is in general a powerful approach for unbiased biomarker identification, there are some limitations, particularly in serum biomarker discovery, due to the complex nature of serum and its wide dynamic range of protein concentrations (spanning 12 orders of magnitude), as well as to the intrinsic mass spectrometry sensitivity (>μg/mL) in detecting analytes which usually range between 50 pg/mL and 10 ng/mL in serum [60]. Furthermore, mass spectrometry-based proteomics remains heavily constrained today in its ability to differentiate and assign function to individual antibody sequences within a large collection of immunoglobulins. This is partly because the affinity-matured antibody sequences are not germ-line encoded (and therefore do not appear in the proteome databases that underpin tandem mass spectrometry-based protein identifications) and partly because both light and heavy chains are required for antigen specificity in an immunoglobulin, yet the pairing between light and heavy chain sequences (as well as the connectivity between the complementarity-determining regions within each light and heavy chain) is lost during proteolytic digestion before mass spectrometry analysis; moreover, antigen specificity cannot yet be predicted de novo from the primary immunoglobulin sequence.
As a result, mass spectrometry is currently not well suited to the challenge of autoantibody biomarker discovery [61]; alternative technological platforms are therefore required to unravel the complexity of the human immunoglobulin repertoire and to detect and quantify novel autoantibody/autoantigen pairs in biological samples.

Serologic proteome analysis
Serologic Proteome Analysis (SERPA) is a classical immunoproteomics approach to autoantigen discovery that provides a robust way of screening antibody reactivity profiles in sera from patients with various diseases. The method, which is essentially an adaptation of western blotting, involves separating proteins from a biological sample (e.g. a tissue homogenate or cell lysate) using 2-dimensional electrophoresis on large format gels and then immunoblotting with patient or control sera. Unique protein spots that are detected following blotting with patient but not control sera are excised from the gels and identified by mass spectrometry. However, the inherently high gel-to-gel variability and relatively low resolving power of individual gels reduce the accuracy of spot picking and impose a limitation due to co-migrating proteins, which is especially problematic for low-abundance protein targets. Several modifications have been suggested to address such limitations, including multi-colour fluorescence-based 2-D gel immunoproteomics approaches [62], but these still do not address the fundamental issues of the limited resolving power of the gels, the modest limit of detection or the low throughput of SERPA. This technology is thus less widely used for autoantigen discovery now and has been largely supplanted by newer technologies that are better able to overcome these limitations.

Serological analysis of recombinant cDNA expression libraries (SEREX)
SEREX is one of the oldest methods for autoantigen discovery and utilises human cDNA expression libraries to profile autoantibody repertoires. The methodology for SEREX initially involves generation of a cDNA library from a cancer tissue or tumour cell line, followed by cloning of that library into a suitable expression vector, clonal separation of the library and expression of the encoded proteins in Escherichia coli cells grown on solid media. Colonies are then transferred to a nitrocellulose or PVDF membrane, lysed and the expressed recombinant proteins blotted with sera from patients and healthy controls. Sero-reactive proteins are then identified by sequencing the cDNA from positive colonies [63], which makes SEREX more sensitive than SERPA, since the latter relies on direct protein identification and is therefore limited by absolute protein abundance. Furthermore, the clonal separation of the members of the cDNA expression library provides greater resolving power than the gel-based SERPA method. However, as with all library-based screening methods, over-sampling is required to ensure that all members of the library are examined in SEREX: thus, if for example the cDNA library contains 10^4 unique clones, then at least 10^5 colonies would need to be screened for complete coverage, so even with the advent of colony picking and arraying robots, SEREX remains a relatively low throughput method.
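The over-sampling argument can be illustrated with a simple probability estimate. The sketch below is not taken from the cited work: it assumes, purely for illustration, that every unique clone is equally represented in the library and that colonies are picked at random, and the function names are hypothetical. Under those assumptions, the chance that a given clone is seen at least once after screening N colonies from a library of L unique clones is 1 - (1 - 1/L)^N.

```python
def fraction_of_library_sampled(unique_clones: int, colonies_screened: int) -> float:
    """Probability that any one clone is picked at least once.

    Assumes equal representation of clones and random picking
    (an idealisation; real cDNA libraries are skewed by transcript abundance).
    """
    return 1.0 - (1.0 - 1.0 / unique_clones) ** colonies_screened


def expected_missed_clones(unique_clones: int, colonies_screened: int) -> float:
    """Expected number of unique clones never examined in the screen."""
    coverage = fraction_of_library_sampled(unique_clones, colonies_screened)
    return unique_clones * (1.0 - coverage)


if __name__ == "__main__":
    library_size = 10**4  # unique clones, as in the example in the text
    for fold in (1, 3, 10):  # fold over-sampling
        n = fold * library_size
        print(
            f"{fold:>2}-fold ({n} colonies): "
            f"coverage = {fraction_of_library_sampled(library_size, n):.5f}, "
            f"expected clones missed = {expected_missed_clones(library_size, n):.2f}"
        )
```

Under this idealised model, screening only as many colonies as there are unique clones leaves roughly a third of the library unexamined, whereas ten-fold over-sampling leaves less than one clone missed on average. Real libraries are skewed towards abundant transcripts, so rare clones are likely to need deeper sampling than this estimate suggests.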
The first cancer-testis antigen, NY-ESO-1, was identified by SEREX by analysing tumour-associated antigens (TAAs) that elicited a high titre IgG antibody response in sera from patients with different types of cancer [64]. SEREX has also been successfully used to identify several TAAs that generate a humoral immune response in cancers such as those of the kidney, lung, breast and colon [63]. However, a fundamental limitation of SEREX is that the method lacks the ability to differentiate or detect post-translational modifications (PTMs) that are likely to play a significant role in autoimmune diseases [62] and cancers [63]. This approach also restricts the types of TAAs identified to those that can be expressed in a prokaryotic system and effectively excludes TAAs that require folding mechanisms unique to eukaryotes to achieve the correct conformational epitope for recognition [63]. SEREX may also miss TAAs that are represented by truncated cDNAs in the library, since the encoded protein may lack specific epitopes or even whole domains. Furthermore, identification of TAAs is inherently limited to those that were expressed by the specific patient tumour or cell line from which the cDNA library was derived, which means that more than one cDNA library may be required to identify a comprehensive set of TAAs for different cancers [63]. In addition, the presence of the crude prokaryotic cell lysate in every spot can give rise to high background binding in SEREX assays. Thus, for many of these reasons, SEREX has largely been superseded now by protein microarray technologies that are based on purified recombinant proteins.

Protein microarrays
Protein microarrays are a versatile, miniaturised platform used to simultaneously characterise the biomolecular interactions of thousands of different proteins that are spotted in defined locations on a solid support; as such, protein microarrays represent a natural technological evolution from ELISA, SERPA and SEREX. Protein microarrays in principle allow the quantitative analysis of binding of a wide variety of analytes (including antibodies, proteins, DNA, RNA, small molecules, lipids, enzymes and peptides) to the arrayed proteins. The three types of protein microarrays that are commonly used are analytical, functional and reverse-phase microarrays. Analytical protein arrays, or antibody arrays, are ideal for quantification of different known proteins in a biological sample, monitoring protein expression levels and protein profiling in what amounts to miniaturised, highly multiplexed ELISA assays. Functional protein microarrays can be sub-divided into those based on recombinant proteins and those based on native proteins and can be used for autoantibody and immune response profiling, biomolecular interaction profiling and identification of enzyme substrates, amongst others [1]. Reverse-phase protein arrays comprise spots of different crude tissue homogenates or cell lysates and are suited to detection of known proteins in multiple tissues/cells based on blotting of the reverse-phase arrays with antigen-specific antibodies. In general, protein microarrays can be applied in diagnostic and therapeutic research, through new biomarker discovery for disease staging and monitoring, potential drug-target evaluation and identification of new drug targets. Of the different protein array types, functional protein arrays appear best suited to autoantigen discovery and autoantibody profiling and are discussed in more detail below.

Recombinant protein production
Different protein production systems can be employed to produce recombinant proteins in sufficient quantities for protein microarray fabrication. The key problem associated with recombinant protein production is identifying the best expression system for a particular protein. To date, there is no universally applicable protein expression system [65]. Each system has its advantages and disadvantages; therefore, the choice of expression system should be based on the properties of the recombinant protein as well as the scale of expression required. Although exploring multiple expression systems in parallel sounds enticing, it requires substantial resources, so factors such as protein solubility, yield, speed and cost need to be taken into consideration. Choosing the right system for protein expression can be particularly important in obtaining biologically active and functional recombinant proteins [1].
Bacteria, notably E. coli, represent the most commonly used expression systems for protein production since they give high protein yields at a relatively low cost, require simple and rapid culture conditions, and are highly scalable. In addition, many parameters can be altered to optimise expression levels of protein. However, inefficient disulfide bond formation, insolubility, aggregation and poor folding of proteins have been reported using this method, as well as a minimal capability for performing post-translational modifications [65].
Expression of proteins in yeast is a common alternative to prokaryotic expression systems as it is a well-defined and economical eukaryotic expression system. Commonly used yeast species include Saccharomyces cerevisiae and Pichia pastoris, although other yeasts have also been reported. Proteins expressed in both species fold efficiently and numerous post-translational modifications can occur; P. pastoris typically gives better protein yields than S. cerevisiae [65]. However, a major disadvantage of the yeast expression systems is that they do not mimic protein glycosylation patterns from mammalian cells, with proteins tending to be hyperglycosylated due to the presence of large mannose glycans. Furthermore, lysis conditions for yeast are typically harsh and induce many endogenous proteases, meaning that the extracted recombinant proteins are often significantly proteolysed.
Baculoviruses belong to a diverse group of large double-stranded DNA viruses that infect many different species of insects as their natural hosts but are highly species-specific and are not known to propagate in any non-invertebrate host. Baculoviral expression systems yield good expression levels, especially for intracellular proteins, and typically produce functionally active, recombinant mammalian proteins that are properly folded and oligomerised and which contain correct disulfide bonds, as well as mammalian-like post-translational modifications, including glycosylation, so are both structurally and functionally similar to their native counterparts [65].
Mammalian expression systems are preferred by some researchers as they produce more 'humanised' proteins, with the most biologically-relevant post-translational modifications and native folding. The most widely used mammalian cells include HeLa cells, human embryonic kidney-derived (HEK293) epithelial cells, Chinese hamster ovary (CHO) cells and African green monkey kidney (COS) cells. However, mammalian protein expression systems require more demanding culture conditions compared to other systems [65] so are significantly more challenging for high throughput expression purposes.

Surface chemistry
The microarray surface chemistry plays a critical role in determining protein microarray quality. Slide surfaces vary: aldehyde- and epoxy-derivatised glass surfaces are used for random attachment through amines, whereas nitrocellulose, hydrogel or metal surfaces are used for attachment of affinity-purified proteins. An ideal surface chemistry should resist nonspecific adsorption, whilst preserving the folded structure of the arrayed proteins [1].
Common challenges associated with slide surface chemistry include high background and incorrect orientation or conformation of proteins, whereby not all functional binding sites are readily available for interaction. Proteins have various hydrophobic domains and charged patches, so tend to adsorb non-specifically to most solid surfaces, resulting in the disruption of protein 3-D structure and eventually complete loss of activity. This indirectly gives rise to the second issue: the loss of protein conformation upon immobilisation. In particular, when the functional domains interact excessively with a solid surface, the orientation of the proteins may be altered or completely lost, resulting in the subsequent disruption of the functional domains and loss of discontinuous epitopes [66]. Partial or complete denaturation of proteins on the arrayed surface is also deleterious for downstream autoantibody binding since it is well known that antibodies tend to bind non-specifically to exposed hydrophobic epitopes, giving rise to false positive signals in autoantibody profiling assays.
Proteins can be immobilised onto a microarray surface via encapsulation, surface adsorption, covalent attachment or affinity binding (Figure 4), which are further described below [67].
Encapsulation of a purified protein on a solid surface involves suspending the protein in a random orientation within a 3D gel pad (e.g. acrylamide or agarose) on an array surface; this approach provides a high capacity for immobilisation and thereby enhances the sensitivity of subsequent assays. A drawback of the technique, however, is that the size of the protein or other ligand applied may restrict diffusion into the gel, resulting in stronger signals at the periphery of the gel pad. This challenge may be surmounted by using different crosslinkers that can improve the porosity of the gel pads [68].
Immobilisation of purified proteins via noncovalent adsorption is a straightforward, reversible method that involves protein attachment onto a solid support through weak, non-specific interactions, including van der Waals, hydrophobic and electrostatic interactions. Commonly used surfaces here include nitrocellulose-coated and amine-terminated glass slides. Although this approach can provide high protein loading onto the surface, the orientation of the immobilised protein cannot be controlled, resulting in variable reaction efficiency, accuracy and reproducibility of the resultant arrays [69]. Furthermore, the underlying surfaces tend to be relatively denaturing towards the arrayed proteins [1].
Covalent attachment takes place by chemically cross-linking proteins to the surface through the nucleophilic residues lysine or cysteine. These residues are cross-linked to surface-bound ligands that are terminated with aldehyde, epoxy, or N-hydroxysuccinimide moieties. Irreversible immobilisation of a wide range of proteins to the carrier surfaces is feasible using covalent attachment, but the non-specific modification of surface residues on the arrayed protein carries the risk of altering the activity and folded structure of those proteins [70].
Affinity capture is a particularly advantageous way to immobilise proteins, since it circumvents many of the limitations of the other approaches described above. Typical affinity capture methods include use of biotinylated, hexa-His-tagged, glutathione S-transferase-tagged or Halo-tagged recombinant proteins [1], with orientation of the immobilised protein being controlled via the tag, thereby helping to preserve the structure and function of the arrayed proteins.
Numerous human protein microarray platforms are available today for autoantibody research, including Immunome arrays (Sengenics, Singapore), Nucleic Acid Programmable Arrays (BioDesign Institute, Arizona), Human Protein Atlas Protein Fragment Arrays (SciLifeLab, Sweden), HuProt arrays (CDI Laboratories, USA) and ProtoArrays (Thermo Fisher, USA). These various human protein microarray platforms have differing protein content and make different use of the various protein expression systems, surface chemistries and immobilisation strategies described above, all of which give rise to differences in technical performance, as has been reviewed recently [1].
By way of example, proteins on the Immunome array are expressed in a baculoviral system as in-frame fusions to a biotin carboxyl carrier protein (BCCP) folding marker that itself becomes biotinylated in vivo or in vitro only when the fusion protein is correctly folded. Immunome's surface chemistry is based on a hydrogel polymer that dramatically reduces non-specific background binding to the array surface whilst providing an aqueous-like environment for the arrayed proteins. The hydrogel matrix is derivatised with a low density of streptavidin molecules that are held away from the underlying array substrate, providing a selective surface for binding of biotinylated proteins (Figure 5). This helps to ensure that each protein immobilised on the array retains its native conformation, correct folding and functionality on the array surface.

Sensitivity and reproducibility
Quantification of autoantibody biomarkers using a protein microarray starts with the production of recombinant proteins, followed by printing of the proteins onto a solid support, probing them with serum or plasma samples and finally detecting interactions using fluorescently labelled secondary antibodies. Protein microarrays have thus often been referred to as a miniaturised version of ELISA. Miniaturisation allows a high overall sensitivity as analyte measurement is conducted while retaining the highest concentration per unit volume attainable for the given sample, with decreased reaction times due to short diffusion distances [71]. Furthermore, fluorescence-based signal detection in protein microarrays offers lower limits of detection (as low as 1 pg/mL; [72]) and greater dynamic range (up to 5 orders of magnitude; [73]) than colourimetric readouts in typical ELISAs. In addition to their greater sensitivity compared to ELISA, protein arrays are also superior in terms of multiplexing, as thousands of proteins can be printed onto glass slides in replicates and analysed simultaneously.
Given the capacity for multiplexing, as well as the high-throughput, low sample consumption, remarkable sensitivity and reproducibility of protein arrays, this platform is rapidly proving now to be very well suited to the challenges of autoantibody biomarker discovery. However, when choosing the optimal platform for discovery research, important factors such as the protein expression system used and the surface chemistry of the platform should be considered carefully to ensure that only biologically-meaningful autoantibody biomarkers that have the potential to be translated into clinical use will be discovered.

Protein microarray-based autoantibody discovery
Microarrays fabricated with proteins derived from tissues or cell lines, or with recombinant proteins, have been used in many studies to identify potential autoantibody biomarkers for cancer, a few examples of which follow. In one study, a crude cell lysate was resolved in 2 dimensions using liquid-based isoelectric focusing followed by reverse-phase liquid chromatography, resulting in 1760 fractions which were then printed on a nitrocellulose surface and used to screen sera from cancer patients vs. healthy controls to identify fractions containing cancer-specific reactive autoantigens. Fractions corresponding to reactive spots were analysed using mass spectrometry to identify cancer-specific autoantigens, which revealed that 9/15 colon cancer patients, but neither of the healthy controls, produced autoantibodies against ubiquitin C-terminal hydrolase isozyme (UCH-L3). Autoantibody production against UCH-L3 was confirmed by Western blot in 19 of 43 (44%) additional colon cancer patients [74].
Antibody microarrays: in order to identify prostate cancer-associated autoantibodies, well-characterised monoclonal antibodies were arrayed onto nanoparticle slides to capture native antigens from prostate cancer cells, which were subsequently incubated with fluorescently labelled IgG from patients with prostate cancer and benign prostate hyperplasia (BPH). The study revealed that prostate cancer patients had higher autoantibody levels against TLN1, TARDBP, LEDGF, CALD1, and PARK7 when compared to patients with BPH. The study concluded that PSA alone produced sensitivity and specificity values of 12.2% and 80%, respectively, whereas the collective panel produced sensitivity and specificity values of 95% and 80%, respectively [75].
Functional protein microarrays: a cancer antigen microarray, comprising 123 full length, folded, recombinant tumour-associated antigens expressed in insect cells was used to identify autoantibodies that differentiate prostate cancer patients from benign prostatic hyperplasia (BPH) and other disease controls. The study identified 41 potential diagnostic/therapeutic antigen biomarkers for prostate cancer and found that autoantibody titres against GAGE1, ROPN1, SPANX1 and PRKCZ were high in prostate cancer patients, whereas autoantibody titres against MAGEB1 and PRKCZ were higher in BPH controls. Of the 41 potential antigens identified, FGFR2, COL6A1 and CALM1 were identified in urine from the same patients by shotgun proteomics [76].
Functional protein microarrays have also been used to identify autoantibodies against autoantigens in a number of other infectious or autoimmune-related diseases, including malaria and Parkinson's disease (PD). In malaria, Plasmodium knowlesi infection results in an autoimmune-like response in some individuals that has been hypothesised to play a protective role against malarial infection. Using the Sengenics Immunome protein array comprising 1636 correctly folded human antigens, 24 antigens with high reactivity to serum autoantibodies were identified, which may serve as potential biomarkers for asymptomatic malaria, mild malaria, or predictive biomarkers for severe malaria [77].
PD is a chronic and progressive neurodegenerative disorder, and a positive association has been reported between Helicobacter pylori (H. pylori) infection and PD motor severity. The Sengenics Immunome protein array was used to screen H. pylori-seropositive PD patients and H. pylori-seronegative PD patients in a study that identified 13 significant autoantibodies, of which 8 were up-regulated and 5 down-regulated in the case group. Identified autoantigens included Nuclear factor I subtype A (NFIA), Platelet-derived growth factor B (PDGFB) and Eukaryotic translation initiation factor 4A3 (eIF4A3) [78].
Other protein microarray platforms, including nucleic acid programmable protein arrays (NAPPA), HuProt arrays, ProtoArrays and Human Protein Atlas Protein Fragment Arrays, have also found utility in autoantibody biomarker discovery applications across a wide variety of disease areas, including a broad spectrum of cancers and autoimmune diseases, as well as several neurological and inflammatory disorders, as recently reviewed elsewhere [1].

Biomarker validation
Biomarkers can be used for a variety of purposes including disease prediction, diagnosis and treatment monitoring. However, while there are thousands of papers reporting discovery of potential biomarkers, very few of these have been validated and approved by the Food and Drug Administration (FDA) for clinical use (Table 2), despite preliminary reports of good sensitivity and specificity. This highlights the reality that biomarker validation is a challenging process with multiple criteria that need to be fulfilled before the markers can be approved for use in clinical settings. There are also multiple stages where attrition can occur in the validation process, including poor study design, variations in sample collection, and the simple failure of the biomarkers in blinded validations, as discussed further below. A key requirement for all biomarker validation is that the biomarker demonstrates a correlation with specific pathophysiological processes or serves as a surrogate endpoint in a clinical trial. Diagnostic precision and accuracy are key technical parameters, since inaccurate or variable results, as well as false positive and false negative results, could lead to misdiagnosis that could bring about unwanted sequelae.
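The relationship between false positive and false negative results and the headline performance figures can be made concrete with a short calculation. The sketch below uses the standard definitions of sensitivity, specificity and predictive values; the class name, function names and patient counts are hypothetical and are not drawn from any study cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome of testing a cohort with a candidate biomarker (hypothetical)."""
    true_pos: int   # diseased, test positive
    false_neg: int  # diseased, test negative
    true_neg: int   # non-diseased, test negative
    false_pos: int  # non-diseased, test positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of diseased individuals correctly flagged."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-diseased individuals correctly cleared."""
    return c.true_neg / (c.true_neg + c.false_pos)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of positive results that are true positives (cohort-dependent)."""
    return c.true_pos / (c.true_pos + c.false_pos)


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of negative results that are true negatives (cohort-dependent)."""
    return c.true_neg / (c.true_neg + c.false_neg)


if __name__ == "__main__":
    # Illustrative counts only: 100 cases and 200 controls.
    cohort = ConfusionCounts(true_pos=90, false_neg=10, true_neg=160, false_pos=40)
    print(f"sensitivity = {sensitivity(cohort):.3f}")
    print(f"specificity = {specificity(cohort):.3f}")
    print(f"PPV         = {positive_predictive_value(cohort):.3f}")
    print(f"NPV         = {negative_predictive_value(cohort):.3f}")
```

The predictive values depend on the ratio of cases to controls in the cohort, so figures from a case-control study need not carry over to a screening population in which the disease is rare.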
Typical biomarker discovery programs are initially set up as case-control studies, with clearly defined and well-separated clinical groups. However, in real world settings, the diagnostic challenge is often not to distinguish diseased from healthy, but to differentiate amongst people with similar clinical symptoms but different underlying disorders. As a first step towards validation therefore, once candidate biomarkers have been identified from an initial discovery study, a scientifically sound and statistically-powered validation cohort needs to be designed to test the diagnostic power of the biomarkers in the context of 'diseased patients' and 'other disease' controls. Power calculations are used to determine the sample size required to identify reproducible, precise and accurate biomarkers that qualify for clinical utilisation and this cohort is then typically sub-divided into a training cohort and a larger blinded validation cohort. Typically, the clinical sensitivity and specificity of a larger set of candidate biomarkers from the discovery research is first assessed in the training cohort and the best performing markers that survive are taken forward for further evaluation in the blinded validation cohort. Statistically-powered validation cohorts often run into hundreds of patients, so obtaining quality serum or plasma samples in sufficient quantities from a disease cohort, as well as from matched healthy and other disease controls, can sometimes be a challenge. Furthermore, biomarker validation is a complex and lengthy process, meaning that the validation assay methods themselves need to be rapid, robust, reproducible, inexpensive and easy to set up and run, potentially in different laboratories.
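As a rough illustration of why validation cohorts run into hundreds of patients, the sketch below estimates how many cases (or controls) are needed to pin down a sensitivity (or specificity) to within a chosen margin. It is a deliberately simplified, precision-based calculation using the normal approximation to the binomial, not a full power analysis, and the function name, target values and margin are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def subjects_needed(expected_proportion: float,
                    half_width: float,
                    confidence: float = 0.95) -> int:
    """Subjects needed so the confidence interval for a proportion
    (e.g. sensitivity among cases, specificity among controls)
    has roughly the requested half-width.

    Uses n = z^2 * p * (1 - p) / d^2 (normal approximation).
    """
    if not 0.0 < expected_proportion < 1.0:
        raise ValueError("expected_proportion must lie strictly between 0 and 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n = z ** 2 * expected_proportion * (1.0 - expected_proportion) / half_width ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical targets: 90% sensitivity and 80% specificity, each +/- 5%.
    cases = subjects_needed(0.90, 0.05)
    controls = subjects_needed(0.80, 0.05)
    print(f"cases needed:    {cases}")
    print(f"controls needed: {controls}")
    print(f"total:           {cases + controls}")
```

With these illustrative targets the estimate already approaches four hundred subjects before any allowance is made for a separate training cohort, sample drop-out or multiple candidate markers.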
Even after considering the aforementioned factors, it often turns out that the candidate biomarker is simply not robust, sensitive or specific enough to penetrate into a clinical setting. Ideal candidates for multiplexed panels would be markers whose qualitative and/or quantitative expression is unique to the disease. However, particularly in the case of cancers, identifying truly disease-specific markers has proved problematic; for example, MAGE-A3 was originally thought to be a 'tumour-specific' marker but was later found to be detectable in healthy tissues as well [71]. It is therefore not surprising that biomarkers with early diagnostic potential initially obtained in studies conducted in laboratory settings often cannot be confirmed in later clinical validation and screening settings, resulting in high attrition rates during biomarker validation.

Conclusion
Autoantibodies have gained considerable attention in the medical diagnostic field as candidate diagnostic and prognostic biomarkers in many different disease areas, since they are in theory detectable many years before clinical symptoms appear. This particular property of autoantibodies makes them attractive tools for early diagnosis of disease. However, identification and validation of autoantibody biomarkers has historically been constrained by the available technological approaches and the high attrition rates during studies on larger cohorts.
To increase the success rate in biomarker discovery and validation, the correct technique as well as the right number of samples and analytes to be used for each phase should be carefully planned and designed as depicted in Figure 6. The current gold standard for biomarker validation remains ELISA, which is regularly utilised for confirmatory studies as it allows a relatively high throughput of samples and is a versatile and robust tool. Thus, protein microarray analysis is often compared against the quantitative data from ELISA assays [79]. However, ELISAs routinely permit only single antigen detection per well and often require relatively large volumes of samples compared to other more miniaturised, high-throughput methods. This leaves substantial scope for protein microarrays to be used in both the discovery and validation of panels of autoantibody biomarkers, since they represent a sensitive, highly reproducible, multiplexed and high throughput experimental platform for autoantibody quantitation; this will undoubtedly be aided by the underlying protein microarray platforms themselves gaining regulatory approval for use as clinical diagnostics.

Author details
Farhanah