Advantages and limitations of protein microarrays in biomedicine.
Despite the tremendous advances in the understanding of molecular mechanisms, the complexity of diseases remains one of the present challenges for the scientific community; novel strategies therefore need to be designed and developed for early diagnosis and treatment. As many cellular alterations are observed at the protein level, high-throughput assays are urgently needed for biomarker discovery. Herein, we describe the advantages and limitations of protein microarrays as a proteomics strategy useful for multiplex and high-throughput protein characterization in clinical samples. Finally, a few examples are discussed, most of them related to disease biomarkers already identified in proximal fluids by protein arrays.
- biomarker discovery
- multiplex detection
- protein microarrays
Despite the tremendous advances in the understanding of the molecular basis of diseases (such as cancer, Alzheimer's disease and genetic diseases), some crucial gaps still make it difficult to understand disease pathogenesis and to develop effective strategies for early diagnosis and treatment.
The current interest in protein microarrays is due largely to the prospect that proteomics assays based on them will allow: deciphering altered protein expression at different levels (tissues, cells, subcellular structures, body fluids, protein complexes, etc.); developing novel biomarkers for diagnosis or early detection of disease; identifying new therapeutic targets; and accelerating drug development through more effective strategies to evaluate therapeutic effect and toxicity.
This interest rests mainly on the fact that proteome alterations in disease may occur in many different ways that are not predictable from genomic analysis; a better understanding of these alterations, together with genomic analysis, is expected to have a substantial impact in biomedicine.
Recently, a large number of protein microarray approaches have become available for disease-related applications, because these technologies are very useful for improving diagnostics and shortening the path to effective therapy. In general, protein microarrays increase sensitivity, reduce sample requirements, increase throughput and identify quantitative and qualitative protein alterations, such as post-translational modifications.
To quantify thousands of proteins in a small number of samples, a typical proteomics discovery experiment employs a non-targeted approach (shotgun proteomics). The comparative analysis typically yields hundreds of proteins that are differentially expressed between healthy and diseased samples. After the discovery phase, the list of potential biomarker proteins is narrowed by studying additional patients or more time points, and/or by using another technique. These potential biomarkers are then verified on a set of 10–50 clinical samples. In the final step, a small number of biomarkers is “validated” on 100–500 clinical samples. Given this conventional biomarker workflow, high-throughput assays are required to bridge the gap between translational research and clinical needs (from bench to bedside) (Figure 1).
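As a rough illustration of the comparative analysis in the discovery phase, the following Python sketch selects proteins by their log2 fold change between the mean intensities of diseased and healthy samples. The protein names, the intensity values and the two-fold cut-off are hypothetical and chosen only for this example; a real study would also apply statistical testing and multiple-testing correction.

```python
import math
from statistics import mean

def candidate_biomarkers(healthy, diseased, min_abs_log2fc=1.0):
    """Return {protein: log2 fold change} for proteins passing the cut-off.

    healthy / diseased: dict mapping protein name -> list of intensities.
    min_abs_log2fc: assumed threshold (1.0 corresponds to a two-fold change).
    """
    selected = {}
    # Only proteins measured in both groups can be compared.
    for protein in healthy.keys() & diseased.keys():
        log2fc = math.log2(mean(diseased[protein]) / mean(healthy[protein]))
        if abs(log2fc) >= min_abs_log2fc:
            selected[protein] = log2fc
    return selected

# Hypothetical intensities (arbitrary units) for three made-up proteins.
healthy = {"P1": [100, 110, 90], "P2": [200, 210, 190], "P3": [50, 55, 45]}
diseased = {"P1": [400, 420, 380], "P2": [205, 195, 200], "P3": [12, 13, 12.5]}

candidates = candidate_biomarkers(healthy, diseased)
for name, lfc in sorted(candidates.items()):
    print(f"{name}: log2FC = {lfc:+.2f}")
```

In this toy data set only the up-regulated and the down-regulated protein pass the cut-off; the unchanged one is discarded before verification.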
Biomarkers, or biological markers, are defined as measurable characteristics that indicate the state of a biological process and distinguish whether it is normal or pathological. They are typically molecules found at low abundance in body fluids that are associated with specific health or disease processes. The methodology used to establish that a biomarker is suitable for clinical application is divided into three steps: discovery, verification and validation, as described above [3, 4].
Proteomics, the scientific discipline that studies proteins, has identified several proteins that can act as biomarkers. Several techniques can detect biomarkers and thus enable early diagnosis of disease. These include next-generation sequencing, mass spectrometry and protein arrays [5, 6]. In this chapter we focus on protein arrays.
Microarray technology refers to the miniaturization of thousands of assays in a single device, allowing the simultaneous and massive analysis of a large number of biomolecules in a single biological sample. Microarrays allow a large number of parameters to be studied in a single test with minimal sample and reagent requirements, which makes them a simple, fast and very sensitive technique [7, 8].
Microarrays offer several innovative strategies for applications such as the identification of biomarkers and of protein-protein interactions, which may in the future support routine clinical analysis.
There is a great variety of arrays, which can be classified according to their content into analytical protein microarrays, functional protein arrays and reverse phase arrays; according to their format into planar and microsphere arrays; and also according to their detection method.
In this chapter we explain all these types of arrays and some of their applications in biomarker discovery as well as in verification and validation.
1.1. Biomarker discovery
A biomarker is defined by the Food and Drug Administration (FDA) as “A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention.”
Three categories can be distinguished depending on the application of the biomarker: diagnostic biomarkers, used for disease detection; prognostic biomarkers, used to predict the course of a particular disease, such as recurrence, progression and survival; and predictive biomarkers, used to predict the response to treatment, which could subsequently be applied in patient assessment.
In recent years, biomarker research and its translation into the clinic have been accelerated by protein microarrays, owing to their extraordinary capacity to identify valuable biomarkers in a short period of time without requiring prior in-depth knowledge of the mechanism of disease progression.
Moreover, protein microarrays are reliable for analyzing targeted and non-targeted biomarkers present in most human proximal body fluids (such as plasma, serum and synovial fluid) over a wide dynamic range. In contrast with other proteomic strategies, protein microarrays avoid sample pre-fractionation. Thus, for example, serum, plasma, urine and tissue extracts, which are complex and non-fractionated proteome mixtures, can be used directly for experimentation. For this reason, among others, protein microarrays offer a powerful technology for functional proteomics analysis in high-throughput (HT) format.
Like DNA arrays, protein microarrays consist of dense printed spots of capture ligands immobilized on a solid support, which are exposed to samples containing the corresponding binding molecules (often called queries); this allows the simultaneous analysis of thousands of capture targets within the same assay. Ekins and collaborators described these binding events, identifying miniaturization as the key parameter. They predicted that a system using a small amount of capture molecule and a small amount of sample could be more sensitive than a system using a hundred times more material. In fact, this is the case when the concentration of capture molecules is below 0.1/K (where K is the affinity constant between ligand and target).
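This prediction can be checked numerically with a simple 1:1 equilibrium binding model. The sketch below is an illustration under assumed values (the affinity constant and analyte concentration are hypothetical), not Ekins' own derivation: it compares the fraction of occupied capture sites for a very small and a very large amount of capture ligand against the depletion-free value K[A]/(1 + K[A]).

```python
import math

def fractional_occupancy(k_affinity, analyte_total, capture_total):
    """Equilibrium fraction of capture sites occupied (1:1 binding).

    Solves K = C / ((A_t - C) * (L_t - C)) for the complex concentration C.
    Concentrations are in mol/L; k_affinity is in L/mol.
    """
    b = analyte_total + capture_total + 1.0 / k_affinity
    complex_conc = (b - math.sqrt(b * b - 4.0 * analyte_total * capture_total)) / 2.0
    return complex_conc / capture_total

K = 1e9           # assumed affinity constant, L/mol
ANALYTE = 1e-10   # assumed analyte concentration, mol/L

# Occupancy if the spot did not deplete the sample at all.
ideal = K * ANALYTE / (1.0 + K * ANALYTE)

microspot = fractional_occupancy(K, ANALYTE, 0.01 / K)    # very little capture ligand
macrospot = fractional_occupancy(K, ANALYTE, 100.0 / K)   # large excess of capture ligand

print(f"ideal {ideal:.4f}, microspot {microspot:.4f}, macrospot {macrospot:.4f}")
```

With these assumed values the small spot reaches nearly the depletion-free occupancy, whereas the large excess of capture ligand depletes the analyte and leaves each site far less occupied, which corresponds to a lower signal density.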
The capture ligand is presented in a confined area of the array, which reduces its diffusion. The specific binding event with its target therefore takes place in these small spots, where the highest signal intensities and an optimal signal-to-noise ratio can be achieved (Figure 2). An immunoassay in array format displays sensitivities in the pM to fM range, making it possible to test low-abundance (pg/mL) analytes in crude proteomes with a small sample volume. In many cases this gives protein microarrays a relevant advantage in clinical applications, because the amount of sample required is minimal.
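To relate the molar sensitivities quoted above to mass concentrations, the short Python sketch below converts mol/L to pg/mL. The 50 kDa molecular weight is an assumed, hypothetical value used only for illustration; the conversion scales linearly with the actual molecular weight of the analyte.

```python
def molar_to_pg_per_ml(conc_molar, mol_weight_da):
    """Convert a molar concentration (mol/L) to a mass concentration (pg/mL).

    mol/L * g/mol = g/L, and 1 g/L = 1e12 pg per 1e3 mL = 1e9 pg/mL.
    """
    return conc_molar * mol_weight_da * 1e9

MW = 50_000  # assumed molecular weight in Da (hypothetical 50 kDa protein)

one_pm = molar_to_pg_per_ml(1e-12, MW)
one_fm = molar_to_pg_per_ml(1e-15, MW)
print(f"1 pM = {one_pm:.3g} pg/mL; 1 fM = {one_fm:.3g} pg/mL")
```

For a protein of this assumed size, picomolar sensitivity corresponds to tens of pg/mL and femtomolar sensitivity to well below 1 pg/mL.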
For that reason, protein array technology is well suited to the multiplex, highly sensitive assays needed to handle and resolve complex proteomes when the available sample is limited.
Recently, several types of protein microarrays have been developed and applied as multiplex, high-throughput assays in a variety of biological characterizations. The principal features of protein microarrays are described below.
2. Description of protein microarrays
In general, protein microarrays are classified according to features such as content, format and detection method, and according to their final application as analytical and/or functional. The main aspects of the different types of protein microarrays are described here.
2.1. Arrays according to the nature of the capture agent
In general, there are three types of arrays according to their content:
2.1.1. Analytical protein microarrays or antibody microarrays
These arrays are based on the antibody-antigen interaction. Antibodies bind their target protein with high specificity, so particular or rare analytes can be detected in highly heterogeneous mixtures. They are usually used to identify biomarkers that predict a biological condition, such as healthy or pathological.
However, many studies have shown that only a fraction of antibodies work properly on arrays. This may be due to loss of antibody activity through degradation or denaturation during storage or printing, or to inappropriate antibody orientation on the array surface.
2.1.2. Functional protein arrays or recombinant protein arrays
These arrays are based on the identification of protein interactions with different molecules (proteins, DNA, lipids, drugs, etc.); for this purpose, recombinant proteins are printed onto the array surface.
Some examples are the protein in situ array (PISA), the DNA array to protein array (DAPA), the nucleic acid programmable protein array (NAPPA), the puromycin capture protein array (PuCA) and the multiplexed nucleic acid programmable protein array (M-NAPPA). They are valuable for the pharmaceutical industry because they reveal protein interactions.
PISA uses PCR-amplified DNA as a template. The DNA that encodes the protein of interest contains a T7 promoter or another strong transcriptional promoter and an in-frame N- or C-terminal tag sequence for protein capture onto the surface. PISA allows cell-free production of protein arrays; that is, proteins are produced using cell extracts directly on the surface of the array. PISA demonstrated that multiple proteins could be produced without using cells for expression followed by lysis and purification.
DAPA is a technique derived from PISA, but it allows the same DNA template slide to be used repeatedly to print up to 20 copies of the same protein array, and the DNA can be reused after prolonged periods of time. The technique starts by spotting the PCR-amplified DNA fragments encoding the tagged proteins on one slide. This slide is sandwiched with a second, Ni-NTA slide on which a tag-capturing agent immobilizes the expressed protein. A permeable membrane soaked with the cell-free lysate, which allows coupled transcription and translation, is placed between the two slides. The expressed proteins then diffuse through the membrane and are captured on the surface of the second slide through the capture ligand. Overall, DAPA requires a long time to express proteins, and protein diffusion is a strong limitation of the technique, in particular for large proteins [7, 9, 11, 12].
NAPPA uses cDNA templates cloned into expression plasmids, which add a transcriptional promoter and an in-frame polypeptide capture tag. This has several advantages: (1) once a clone is produced as a glycerol stock, it becomes an indefinitely renewable resource that can be shared with other laboratories; (2) if the clone is carefully sequence-verified, the resource will have long-term sequence fidelity; (3) the use of plasmids removes some of the length constraints on epitope tags, so that functional protein tags can be used. In many NAPPA applications the proteins are fused to glutathione-S-transferase (GST); nevertheless, other tags such as FLAG, HA, c-myc and HaloTag have been used in specific applications. High-quality supercoiled plasmid DNA is purified from bacterial cultures and printed onto an activated ester surface along with a homo-bifunctional crosslinker, bovine serum albumin (BSA) and anti-GST antibody. BSA increases DNA binding and reduces non-specific interactions, while the anti-GST antibody captures the expressed protein. When a cell-free expression system is added to the array, a coupled transcription/translation reaction takes place and the nascent protein is captured by the anti-tag antibody. Placing the tag at the C-terminal end ensures that only completely translated protein is captured [10, 11, 13].
Puromycin capture protein arrays (PuCA) are cell-free expression protein arrays based on the capture of the expressed peptide/protein by puromycin. First, PCR DNA is transcribed to mRNA, and a single-stranded DNA oligonucleotide modified with biotin at one end and puromycin at the other is then hybridized to the 3′-end of the mRNA. The mRNAs are placed on a slide and immobilized through the binding of biotin to streptavidin previously coated on the slide. Cell extract is then dispensed on the slide so that in situ translation can take place. When the ribosome reaches the hybridized oligonucleotide, it stalls and incorporates the puromycin molecule into the nascent polypeptide chain, thereby attaching the newly synthesized protein to the microarray through the DNA oligonucleotide.
M-NAPPA is based on combining up to five different DNA plasmids in a single spot. This increases the number of proteins displayed per microarray up to fivefold, and it also reduces cost and hands-on time. M-NAPPA would be useful in unbiased HT screening studies, such as protein-protein interactions, protein-DNA interactions and the discovery of drug-binding targets, as well as of (auto)antibody biomarkers for a variety of human diseases.
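The gain in array capacity from multiplexing can be sketched in a few lines of Python. The function name, the gene identifiers and the simple sequential pooling scheme are hypothetical and serve only to illustrate the arithmetic; they are not the published M-NAPPA layout procedure.

```python
def pool_plasmids(genes, max_per_spot=5):
    """Group gene identifiers into spots of at most max_per_spot plasmids."""
    if max_per_spot < 1:
        raise ValueError("max_per_spot must be at least 1")
    return [genes[i:i + max_per_spot] for i in range(0, len(genes), max_per_spot)]

genes = [f"gene_{n:02d}" for n in range(1, 13)]  # hypothetical identifiers
spots = pool_plasmids(genes)
print(f"{len(genes)} genes fit on {len(spots)} spots instead of {len(genes)}")
```

A positive multiplexed spot only indicates that at least one of its pooled proteins reacted, so its members would presumably need to be re-tested individually to identify the responsible protein.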
2.1.3. Reverse phase protein arrays
In these arrays, biological samples such as tissue and cell lysates are printed onto the array surface. Detection is accomplished with primary antibodies that bind specific proteins, followed by a fluorescently labeled secondary antibody. They offer the potential for rapid comparison of the levels of such proteins across a large number of samples on a single array [7, 8].
The main problem is interference from the lysate matrix; validated antibodies should therefore show no cross-reactivity toward other biomolecules in the lysate (Figure 3).
3. Arrays according to the format
3.1. Planar arrays
Planar arrays are based on high-density microspots of ligand deposited onto a solid support and separated by a minimal distance. Several parameters that influence assay performance need to be known, such as spot size and morphology, ligand capacity, background signal, limit of detection and spot reproducibility.
3.2. Microsphere arrays
Microsphere arrays are based on the simultaneous use of different populations of microspheres labeled with several fluorochromes. Each population is coated with a different antibody and then incubated with the biological sample of interest.
4. Arrays according to their detection method
Depending on the detection method, arrays are classified into label-based methods and label-free methods.
4.1. Methods based on labeling
4.1.1. Conventional fluorescent labels
These labels include radioisotopes and conventional fluorochromes such as fluorescein, BODIPY or cyanines. Cyanines are the most widely used in protein arrays because of their minimal interaction with other biomolecules and their high intensity. Two or more fluorochromes can be combined to identify several biomarkers at the same time. This approach has been used for cancer biomarkers.
4.1.2. Flow cytometry sphere arrays
Flow cytometry can be used to detect soluble proteins present in body fluids or in cell lysates using microspheres. The flow cytometer has two lasers for quantitative and qualitative analysis, which reveal the binding of target proteins to the microspheres. Compared with conventional fluorescence techniques, flow cytometry allows rapid and accurate quantitative and qualitative evaluation of a large number of proteins, with low cost and high sensitivity.
4.1.3. Magnetic spheres
These systems couple magnetic spheres to antibodies that allow the detection of proteins. They capture soluble proteins with high reproducibility and sensitivity, together with low background noise, a wide dynamic range and low cost.
4.1.4. Quantum dots as fluorescence labels
Quantum dots (QDs) are nanocrystals formed by a core of fluorescent semiconductor material coated with another semiconductor; they have very stable optical properties and a large separation between their excitation and emission wavelengths. These nanometric particles can act as labels when conjugated to peptides or antibodies for the recognition of cellular components. Compared with organic fluorochromes, they offer high fluorescence, greater photostability, multicolor excitation, a tunable and narrow emission spectrum, and a higher quantum yield. Their disadvantages include high susceptibility to oxidation and photolysis, as well as some risks for human health and the environment. They are used to localize tumor biomarkers.
4.1.5. Metal NPs as a label
Gold nanoparticles are used in protein arrays due to their optical properties, quantum efficiency and compatibility with a wide range of wavelengths.
4.2. Label-free methods
4.2.1. Surface plasmon resonance (SPR)
This technology is based on the generation of surface plasmons, which are oscillations of free electrons that propagate parallel to the metal/medium interface, and it measures changes in the refractive index at the sensor surface. Variations in the intensity of the reflected light and/or in the angle of incidence indicate whether molecules are associating or dissociating, which makes it possible to determine the association and dissociation behavior of interacting biomolecules. In addition, SPR allows the affinity and specificity of these interactions to be evaluated, and biomolecular interactions to be measured in real time with high sensitivity.
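The way association and dissociation translate into an affinity value can be illustrated with a simulated sensorgram. The sketch below assumes a simple 1:1 Langmuir binding model, which is a common simplification and is not specified in the text above; all rate constants, the concentration and the maximum response are hypothetical values.

```python
import math

def spr_association(t_s, conc_molar, k_on, k_off, r_max):
    """Response (RU) at time t_s during analyte injection, 1:1 Langmuir model."""
    k_obs = k_on * conc_molar + k_off                        # observed rate, 1/s
    r_eq = r_max * conc_molar / (conc_molar + k_off / k_on)  # plateau response, RU
    return r_eq * (1.0 - math.exp(-k_obs * t_s))

def spr_dissociation(t_s, r_start, k_off):
    """Response (RU) t_s seconds after the analyte injection ends."""
    return r_start * math.exp(-k_off * t_s)

# Assumed, illustrative parameters (not taken from any experiment).
K_ON = 1e5     # association rate constant, 1/(M*s)
K_OFF = 1e-3   # dissociation rate constant, 1/s
R_MAX = 100.0  # response at surface saturation, RU
CONC = 1e-8    # analyte concentration, mol/L

k_d = K_OFF / K_ON  # equilibrium dissociation constant, mol/L
r_end = spr_association(300.0, CONC, K_ON, K_OFF, R_MAX)
r_later = spr_dissociation(600.0, r_end, K_OFF)
print(f"K_D = {k_d:.1e} M, R(300 s) = {r_end:.1f} RU, after 600 s wash = {r_later:.1f} RU")
```

In practice the procedure runs in the opposite direction: the rate constants are obtained by fitting measured sensorgrams, and the affinity follows from their ratio.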
4.2.2. Microcantilevers
Microcantilevers are thin sheets of silicon coated with a gold surface that serve as nanomechanical systems for biomolecular recognition. Antibodies or proteins are immobilized on these sheets so that, when these molecules interact with their ligands, the resulting deflection of the sheet can be measured with optical or electronic systems.
4.2.3. Atomic force microscopy (AFM)
AFM is a microscopy technique that observes only the topographical variations of the surface of an object. It has been used to evaluate the immobilization of biomolecules with micrometric resolution, and to detect protein interactions on the array surface.
5. Arrays according to the applications
Depending on the final application, protein microarrays are classified into two categories: analytical arrays and functional arrays.
5.1. Analytical arrays
These arrays are normally used to detect and quantify multiple proteins in a single assay. The main application is usually the detection of differentially expressed proteins and their abundance in different samples. Biomarkers are determined with this type of array. Biomarkers are biometric measurements, including molecular signatures, that predict a biological or clinical condition (for example, normal or pathological), often with potential diagnostic or prognostic value [9, 10, 16].
5.2. Functional protein arrays
These arrays are mainly used to characterize and identify the specific function of proteins, as well as their interactions with other molecules (including proteins, peptides, small molecules/drugs, enzymes/substrates and nucleic acids) [12, 17]. Moreover, functional protein arrays also allow the detection and identification of post-translational modifications (PTMs), such as glycosylation, phosphorylation and acetylation, which typically modulate the function, regulation and/or turnover of proteins.
On the research side, functional protein arrays allow the detection of multiple protein interactions with low reagent consumption in a fast and low-cost fashion. On the translational side, the discovery of these interactions will promote the development of new pharmaceutical targets, diagnostics and therapeutics. As a consequence, this technology is of great interest to the pharmaceutical industry (Table 1).
6. Multiplex biomarker detection by protein microarray assays
During the last decade, protein microarrays have been successfully employed in multiplex detection for biomarker discovery; a few of these studies are highlighted here to illustrate the utility of these approaches in the field.
6.1. Oncological disease-specific biomarkers
Haab et al. defined a panel of five serum proteins that were differentially expressed between 33 prostate cancer patients and 20 controls. In addition, this team has also identified 84 proteins with different relative abundance between diagnosed lung cancer patients and healthy controls.
In a similar approach, Wittekind and colleagues reported a set of proteins as biomarker candidates associated with hepatocellular carcinoma. Several further studies have focused on biomarker identification in oncological pathologies. For example, in ovarian cancer, 11 proteins have been identified by Amonkar et al.; Sreekumar et al. identified a panel of proteins as biomarker candidates in colon carcinoma cells; and Díez et al. have identified differentially expressed proteins in B-cell chronic lymphocytic leukemia that are related to therapeutic targets. Previously, Belov and coworkers developed an antibody microarray to immunophenotype 1100 leukemias and lymphomas according to the abundance of a panel of 82 antigens, or clusters of differentiation (CD), characterized on the surface of lymphocytes.
6.2. Vaccine candidates
In 2014, the first study of pathogen-host protein interactions was reported by Cid and coworkers. In this work, they studied post-translational modifications of effector proteins (T3SS proteins) from different mutants of Salmonella typhimurium during in vitro infection of HeLa cells. Lysates representing all infection conditions were printed and probed with several validated antibodies, allowing the different conditions to be compared in terms of protein abundance and post-translational modifications.
In a similar study, Li and coworkers identified 149 Yersinia pestis antigens using serum from EV76 rabbits and tested the immunogenicity of several viral proteins.
In 2009, Thanawastien et al. developed an approach called expressed protein screen for immune activators (EPSIA), which was successfully applied to the identification of novel bacterial immunostimulatory proteins from Vibrio cholerae.
Recently, Manzano et al. reported a set of novel vaccine candidates for O. moubata based on a systematic characterization of >2000 host-pathogen interactions, which were evaluated with self-assembled protein microarrays based on a cDNA library encoding >400 recombinant proteins of O. moubata .
In a more recent study, Montor et al. used NAPPA arrays to evaluate candidate membrane antigens in P. aeruginosa, which could help to track the immune responses of infected patients compared with healthy individuals. In this work, 12 proteins were identified, most of them related to the adaptive immune response in infected patients.
6.3. Auto-immune diseases
In response to many pathological processes, the humoral immune system generates antibodies against self-proteins (“auto-antibodies”). These auto-antibodies arise from antigen over-expression, mutation, altered post-translational modifications, altered degradation or release from damaged tissue, all of which lead to recognition of the antigen by the immune system. Auto-antibodies have several features that make them a suitable source of biomarkers: (1) they can be detected before the appearance of clinical symptoms; (2) they are simple to identify even at low levels once their target antigen is known; (3) they are easy to collect from blood; and (4) they can be present at higher levels and with a longer half-life than their target antigens, which may be present only transiently in blood.
For example, NAPPA arrays were used for serological screening for the first time in 2007 by Anderson et al., who investigated the presence of antibodies against tumor antigens in breast cancer. The tumor suppressor p53 is well characterized in several solid tumors, and the presence of antibodies against p53 is mainly due to mutations in its gene that alter its half-life. Using this approach, the authors showed that p53-specific antibody levels were significantly lower in healthy donors than in breast cancer patients and that the response to the p53 antigen was detected in stage II disease. They also probed p53 with several antibodies recognizing distinct epitopes to confirm that many regions of the protein expressed on NAPPA were accessible to antibodies in serum [10, 30].
In follow-up work, this group performed a broad screen for new auto-antibodies in breast cancer. They produced 4988 candidate antigens to detect auto-antibodies in serum samples from breast cancer patients with stage I–III disease. The screening followed a three-stage design that compared cases and controls and eliminated uninformative antigens at each stage. In the final phase, slightly more than 100 antigens were tested and 28 auto-antibodies were identified that distinguished benign breast disease from invasive cancer under blinded conditions [10, 30].
With a similar workflow, LaBaer et al. developed a pilot NAPPA to assess auto-antibodies present in juvenile idiopathic arthritis (JIA), a disease characterized by chronic joint inflammation in children.
Recently, Fuentes’s lab screened for auto-antibodies in osteoarthritis and rheumatoid arthritis using NAPPA arrays and validated the results with other protein array technologies.
6.4. Functional biomarkers as drug targets
A decade ago, LaBaer and colleagues evaluated the functional properties of proteins expressed on the array by in vitro transcription/translation (IVTT) by testing protein-protein interactions in high-throughput format. In this report, they printed an array expressing 647 unique genes in duplicate and tested several well-characterized interacting pairs, including Jun-Fos and p53-MDM2.
In a more recent study, Manzano et al. applied NAPPA to study protein interactions. In this study, novel interaction partners were identified for P-selectin and phospholipase A2 and further validated.
Recently, a novel functional array designed by Pascal Braun identified novel cell signaling pathways in Arabidopsis thaliana by evaluating protein interactions using NAPPA technology.
In addition, a further study has just been published in which multi-protein complexes were evaluated in NAPPA format. In this study, four novel tuberculosis-related antigens were identified in guinea pigs vaccinated with Bacillus Calmette-Guérin (BCG) and validated by ELISA.
7. Conclusions and further directions
Here, we have briefly reviewed the protein microarray field as a suitable platform for multiplex assays in high-throughput format. The focus was on two main perspectives: (i) key technological aspects and (ii) biological applications.
However, as described above, despite fundamental advances in protein microarrays, characterization of the whole human proteome remains a challenge. Such information would shed light on proteins and genes whose functions are currently unknown.
Overall, protein arrays may provide relevant information about the biological function of gene products. Although some key aspects of protein microarrays still need to be developed and optimized, other proteomics approaches can provide complementary results.
Acknowledgements
We gratefully acknowledge financial support from the Spanish Health Institute Carlos III (ISCIII) through grants FIS PI17/01930 and CB16/12/00400, and from Fundación Solórzano (FS/38-2017). The Proteomics Unit belongs to ProteoRed, PRB3-ISCIII, supported by grant PT17/0019/0023 of the PE I + D + I 2017-2020, funded by ISCIII and FEDER.
Conflict of interest
The authors declare no conflict of interest.