Performing Quantitative Determination of Low-Abundant Proteins by Targeted Mass Spectrometry Liquid Chromatography

Mass spectrometry coupled to nanoliquid chromatography aims to be an alternative to antibody-based determination of low-abundant proteins. High-resolution mass spectrometers, plug-and-play systems and pumps have been developed for this purpose. Important aspects of these approaches are limit of detection, specificity, variability and cost. In this chapter, the most recent literature (from 2008 onwards) has been reviewed and a checklist/workflow for targeted proteomics is presented with special focus on low-abundant proteins in complex matrices. The chapter is intended to serve as a starting point for low-abundant target determination and highlights some of the most central studies in this field.


Introduction
The ∼1,000,000 different proteins (including modifications) determine much of an organism's function. Protein abundance in humans ranges from 0.01 to 10,000 ppm [1], which is a major challenge when working in the low-ppm range.
Obtaining qualitative and quantitative information on proteins is of interest in biological systems, when analysing samples such as cells [2,3], tissue [4], blood [5] or extracellular vesicles [6,7]. However, determining these is not straightforward, although several 'routine-based' methods exist in industry, research laboratories and clinics (see Table 1) [8,9].

Table 1. Comparison of methods used to quantify proteins in biological samples.

Mass Spectrometry
Determination of proteins through targeted tandem mass spectrometry (tMSMS) with nanoliquid chromatography (nanoLC) has gained interest in the last 10 years; the major reason is the increased sensitivity obtained through LC downscaling and more accurate and sensitive mass spectrometers [10-14]. This enhanced sensitivity enables digging deeper into a sample, which may provide the desired data more readily than attempting whole-proteome determination (comprehensive proteomics, demonstrated by Thakur et al. [15] and Pirmoradian et al. [16]).
This chapter provides a short introduction to state-of-the-art targeted nanoliquid chromatography-tandem mass spectrometry (nanoLC-MS/MS), recent examples from the literature and finally a workflow suitable for confident determination and quantification of proteins in complex matrices.

Proteomics by mass spectrometry
Proteomics is the large-scale measurement of proteins [10,17]. Studying the proteome, or a defined set of proteins, is nowadays often standard in cancer research, studies of extracellular vesicles, blood analysis, etc. Proteomics by mass spectrometry can be divided into two major approaches: comprehensive and targeted.
In comprehensive proteomics, the goal is to identify as many proteins as possible based on search algorithms (e.g. Mascot [18], SEQUEST [19], MSAmanda [20], Andromeda [21]). In targeted proteomics, the target protein(s) is known [22]. The latter is the focus of this chapter. In addition to these two approaches, proteomics is often divided according to whether proteins are analysed intact (top-down proteomics) or in pieces (bottom-up proteomics). Bottom-up proteomics is the most widely used, as the smaller pieces of proteins are easier to handle in liquid chromatography and easier to transfer to the mass spectrometer, and data analysis is also easier.
For over 10 years, implementing quantitative proteomics in biomarker studies has been debated and attempted [4,23-27]. Although LC-MS has long been the standard approach to small-molecule analysis, there is still some way to go before proteomics enters the clinic.

Targeted nanoLC-MS/MS, a rapid overview
In most targeted nanoLC-MS/MS approaches, peptides are used because of their easy transfer through electrospray ionization (ESI) and favourable LC traits. In most cases, trypsin and/or LysC is chosen to cleave the proteins, although other enzymes are also used [28]. A set of peptides, each containing a sequence specific to the target protein, is selected; these are called proteotypic or signature peptides. In the selection process, the UniProt database [29], Expasy [30], Skyline [31] and PeptideAtlas [32] can be helpful. After selection, the peptides are normally bought as synthetic labelled standards (e.g. absolute quantification (AQUA) peptides [33]) as ideal internal standards, and subsequently chromatographed and detected on the nanoLC-MS/MS platform available.
The gold standard in nanoLC is usually a 75 μm inner diameter (ID) column packed with 2-3 μm silica particles functionalized with a C18 phase. A solid-phase extraction (SPE) column is often connected on-line to increase loading capacity [34].
The chromatographed peptides elute, ideally at different retention times, and are transferred to the MS through ESI. In recent years, quadrupole-Orbitrap (QOrbitrap) and quadrupole time-of-flight (QTOF) instruments have been introduced in targeted MS/MS, alongside triple quadrupoles (QqQ). The first two options provide higher resolution than traditional QqQ instruments, and thus fewer interferences [35,36]. Common to the instruments is that a parent mass-to-charge ratio (m/z) is selected in the first quadrupole and fragmented prior to the second mass analyser. In QqQ and QTOF instruments, collision-induced dissociation (CID) is the most common, whereas in QOrbitrap instruments, higher-energy collisional dissociation (HCD) is used. The main daughter ions in either dissociation mode are positive b and y ions. For QqQ, multiple-reaction monitoring (MRM) and selected reaction monitoring (SRM) are most common, whereas for the QOrbitrap and QTOF, a mode referred to as parallel reaction monitoring (PRM) is mostly used (Figure 1).
In SRM or MRM mode, both the parent and fragment m/z ratios have to be inserted into the method, whereas for the PRM mode, only the parent m/z ratio is inserted, and all fragment m/z ratios are recorded and can be isolated after data acquisition in the software of choice.
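To make the difference between the two method types concrete, the following is a minimal Python sketch that computes the parent m/z and the singly charged b and y fragment m/z values of a peptide from standard monoisotopic residue masses. The function names (`precursor_mz`, `fragment_mz`, `srm_method`, `prm_method`) are hypothetical, the example peptide is arbitrary, and modifications such as carbamidomethylation of cysteine are not included; in practice these values are normally generated by software such as Skyline.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids, unmodified.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def precursor_mz(peptide, charge):
    """m/z of the intact (parent) peptide ion at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def fragment_mz(peptide):
    """Singly charged b and y ions, keyed by name (e.g. 'b2', 'y5')."""
    ions = {}
    n = len(peptide)
    for i in range(1, n):
        ions[f"b{i}"] = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        ions[f"y{n - i}"] = (
            sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        )
    return ions


def srm_method(peptide, charge, fragments):
    """SRM/MRM: each transition is an explicit (parent, fragment) m/z pair."""
    ions = fragment_mz(peptide)
    q1 = precursor_mz(peptide, charge)
    return [(q1, ions[name]) for name in fragments]


def prm_method(peptide, charge):
    """PRM: only the parent m/z is entered; fragments are chosen afterwards."""
    return [precursor_mz(peptide, charge)]


if __name__ == "__main__":
    print(srm_method("PEPTIDE", 2, ["y3", "y4", "y5"]))
    print(prm_method("PEPTIDE", 2))
```

For an SRM method, the list of fragments has to be decided before acquisition, whereas the PRM entry contains a single value and the fragment list can be revised during data analysis.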

Peptide selection: considerations regarding proteotypic peptides
Peptides used for LC-MS/MS identification usually contain between 6 and 20 amino acids [37]. The proteotypic peptides should ideally not contain amino acids that are prone to modification, either during sample preparation or in the biological system. Hence, peptides containing methionine, tryptophan, tyrosine or cysteine are avoided when possible. If no other proteotypic peptide exists, however, such a peptide may still be usable: the normal rate of phosphorylation, for example, is less than 5%, which may be neglected. Methionine oxidation is one of the most common modifications in bottom-up proteomics, mainly due to sample handling. Hence, quantification based on a methionine-containing peptide is not preferable, except when labelled proteins are used as internal standards, which can correct for this.
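The selection criteria above can be sketched in a few lines of Python. The example below performs an in-silico tryptic digest (cleavage after K or R, but not before P, with no missed cleavages) and keeps peptides of 6-20 residues that lack M, W, Y and C. The function names and the example sequence are hypothetical, and the sketch does not check that a peptide is unique to the target protein; that step requires a proteome database such as UniProt.

```python
# Residues the text recommends avoiding because they are prone to modification.
AVOID = set("MWYC")


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R unless the next residue is P.

    Assumes complete digestion (no missed cleavages).
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def candidate_peptides(sequence, min_len=6, max_len=20):
    """Keep peptides of suitable length without modification-prone residues."""
    return [
        p for p in tryptic_digest(sequence)
        if min_len <= len(p) <= max_len and not (set(p) & AVOID)
    ]


if __name__ == "__main__":
    # Hypothetical sequence, used only for illustration.
    print(candidate_peptides("MAGKLSVVNFGERAAPEPTIDEKWYR"))
```

If fewer than the desired number of candidates survive, the filters can be relaxed one at a time, for example by allowing tyrosine before allowing methionine.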

Quantification of proteins with tMSMS: labelled proteins versus labelled peptides
The major advantage of LC-MS/MS-based proteomics over enzyme-linked immunosorbent assay (ELISA), Western blot (WB) or immunofluorescence (IF)-based proteomics is the quantification quality [38]. Quantification in tMSMS is mainly based on heavy-labelled peptides (normally containing 15N and/or 13C isotopes in arginine and lysine), which are added to the sample as internal standards. In antibody-based methods, by contrast, housekeeping proteins are usually used for standardization. In recent years, the housekeeping protein method has been criticized, as the levels of these proteins may change during experiments [39]. However, there is also a pitfall in using synthetic internal standard peptides, namely variation in the cleavage of proteins to peptides [28,40]. Up to 85% of the variation in bottom-up proteomics arises from enzymatic cleavage [40]. Hence, adding synthetic heavy-labelled proteins as internal standards is a far better approach, but may prove more costly. As shown in Figure 2, labelled proteins can correct for sample preparation as well as data analysis, whereas labelled peptides only correct for sample preparation at the peptide level and not at the protein level, which is often needed for low-abundant target isolation techniques.
Correction of all steps involved in the analysis is important, using internal standards, software and manual inspection of data [26,41-43].
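As an illustration of quantification with a heavy-labelled internal standard, the Python sketch below estimates the endogenous amount from the ratio of summed light (endogenous) to heavy (standard) transition areas, and computes the m/z offset of the heavy precursor. It assumes one fully labelled C-terminal lysine (13C6 15N2, +8.014199 Da) or arginine (13C6 15N4, +10.008269 Da), identical response for the light and heavy forms, and a known spiked amount; the function names and the numbers in the example are hypothetical.

```python
# Mass shifts (Da) of commonly used fully labelled C-terminal residues.
HEAVY_SHIFT = {"K": 8.014199, "R": 10.008269}


def heavy_mz_shift(peptide, charge):
    """m/z offset between the heavy standard and the endogenous peptide.

    Assumes a tryptic peptide with a single labelled C-terminal K or R.
    """
    return HEAVY_SHIFT[peptide[-1]] / charge


def endogenous_amount(light_areas, heavy_areas, spiked_amount):
    """Amount of endogenous peptide, in the same unit as spiked_amount.

    Areas are integrated peak areas of matching transitions.
    """
    ratio = sum(light_areas) / sum(heavy_areas)
    return ratio * spiked_amount


if __name__ == "__main__":
    # Hypothetical areas for three transitions; 50 fmol of standard spiked.
    light = [1.0e5, 6.0e4, 4.0e4]
    heavy = [5.0e5, 3.0e5, 2.0e5]
    print(endogenous_amount(light, heavy, 50.0), "fmol")
    print(heavy_mz_shift("LSVVNFGER", 2), "m/z units")
```

With a labelled peptide, the result describes the peptide after digestion; only a labelled protein added before digestion also accounts for losses and variation at the protein level.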

Reducing sample complexity
Proteomics of low-abundant targets often requires specialized sample clean-up. Removal of high-abundant proteins, direct target isolation and fractionation are among the most common approaches. In blood, a common approach is removal of the most abundant proteins with a multiaffinity removal system (MARS), lowering the dynamic range of the sample. However, with this approach, the target(s) may also be lost due to protein-protein interactions, and quantification may be an issue. Yadav et al. claim that for biomarker discovery, both depleted and non-depleted fractions should be analysed [44]. Recently, oxytocin was shown to bind to blood proteins to a high degree, which severely affects quantification [45]. An alternative approach is direct target isolation, which aims to isolate the protein(s) of interest [46]. This is also quite effective, but time-consuming and rather costly. Additionally, isolation efficiency must be controlled with proper protein internal standards. This has been shown by Edfors et al., who isolated protein targets with polyclonal antibodies in HeLa lysate spiked with recombinant protein internal standards [47]. Fractionating proteins by LC and gel electrophoresis is also common for reducing sample complexity.

Downscaled LC systems: enhanced sensitivity with ESI-MS
In 2002, Shen et al. demonstrated a ∼200-fold increase in sensitivity when downscaling LC columns from 75 to 15 μm ID when connected to ESI-MS [48]. Fifteen years later, proteomics in the 50-75 μm format has become commercially available through the largest instrument manufacturers [34,42], whereas more downscaled systems are used for even higher sensitivity. For peptides, the demonstrated sensitivity is in the attomolar-zeptomolar range. A major drawback of downscaled systems is that they are traditionally low-capacity systems, i.e. sample capacity is lower than on conventional larger-ID systems. Hence, on-line strong cation exchange (SCX) columns [49] or high-capacity solid-phase extraction columns (poly-styrene-octadecene-divinylbenzene, PS-OD-DVB [50]) are often needed to take full advantage of the increased sensitivity of such downscaled systems [51].

Mass spectrometers: selectivity and sensitivity
Selectivity and sensitivity are two of the most important aspects of low-abundant target determination by mass spectrometry. Selectivity in this context is defined as the ability to differentiate between masses, and a mass spectrometer's selectivity is often characterized by its resolution, i.e. the m/z of a peak divided by its full width at half maximum (FWHM), where a high value is better. Sensitivity, the level at which the signal is higher than the noise, is often characterized by a signal-to-noise ratio. In Table 2, the resolution and mass accuracy are reported for the three most common mass analysers used in targeted mass spectrometry today.
The resolution of QTOF and QOrbitrap instruments is up to 20 times that of a typical QqQ instrument. In MS/MS, interferences are common, and Gallien et al. showed that the high resolution of a QOrbitrap instrument is superior in eliminating these compared to QqQ instruments [35]. Additionally, we have earlier shown that at least three transitions are needed (even with a high-resolution QOrbitrap) to eliminate false positives [42].
A sketch of important technological developments in targeted proteomics by nanoLC-MS/MS (Figure 3) highlights the importance of hardware developments (such as ESI and QOrbitrap instruments), sample preparation strategies (such as stable isotope labelling by amino acids in cell culture, SILAC) and software/database developments (e.g. Skyline and UniProt).
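The relation between resolution, peak width and mass accuracy discussed in this section can be illustrated with a short Python sketch. It uses the common definition of resolution as m/z divided by the FWHM of the peak and expresses mass accuracy in ppm. The rule that two ions are separated when they differ by at least one FWHM is a simplifying assumption, and the function names and the resolution value of 2,000 chosen to represent a lower-resolution analyser are hypothetical.

```python
def fwhm_at(mz, resolution):
    """Peak width (FWHM, in m/z units) at a given m/z and resolution."""
    return mz / resolution


def ppm_error(measured_mz, theoretical_mz):
    """Mass accuracy expressed in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def are_resolved(mz_a, mz_b, resolution):
    """Simplified criterion: peaks are separated by at least one FWHM."""
    return abs(mz_a - mz_b) >= fwhm_at((mz_a + mz_b) / 2, resolution)


if __name__ == "__main__":
    target, interference = 600.00, 600.05  # hypothetical fragment ions
    for resolution in (2_000, 30_000):
        print(resolution, are_resolved(target, interference, resolution))
    print(ppm_error(600.003, 600.000), "ppm")
```

In this example, the interfering ion 0.05 m/z units away merges with the target peak at the lower resolution but is separated at 30,000, which is the kind of interference removal attributed to high-resolution instruments above.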

Selected studies in targeted mass spectrometry
In the following subsections, a set of studies performing targeted mass spectrometry of proteins in various biological samples is presented and discussed, with emphasis on high-quality targeted proteomics results, leading to a protocol for targeted mass spectrometry of low-abundant targets.
In 2014, a study by Edfors et al. demonstrated the use of recombinantly manufactured proteins labelled with a SILAC mixture as internal standards [47]. To the lysate from approximately 1 million HeLa cells, 1 pmol of recombinant proteins was added, and the mixture was digested with trypsin. The resulting peptide mixture was then immunoprecipitated with protein antibodies applied to peptides, which is cheaper than ordering specific peptide-recognizing antibodies. With the lowered complexity and increased concentration, 57 of 127 proteins were identified by at least one peptide in data-dependent acquisition (i.e. not tMSMS). Even though this study did not use tMSMS, it evaluates and presents a method for immunoprecipitation at the peptide level with protein antibodies, which enables easier access to targets. It also keeps quantification in mind through the use of protein internal standards. Additionally, the reduction of complexity meant that the LC-MS analysis could be shortened from 3 hours to 15 minutes. The relative standard deviation in the study ranged from 10 to 40%, which for some targets is somewhat higher than the 10-20% required by the Food and Drug Administration and others [52,53].
In contrast to protein internal standards, peptide internal standards have also recently been used for quantitative proteomics in breast cancer cells [54]. A total of 319 protein targets were monitored, and coefficients of variation were presented for 79 of them. For each target, a heavy-labelled proteotypic peptide was added. A pool of breast cancer cells was lysed at one location and distributed to three sites, where sample preparation and analysis were conducted. The authors report a median variation within and between laboratories of <10% for 95% of the monitored targets. The study shows the feasibility of tMSMS analysis of high-abundant targets when extensive pre-fractionation is not needed and protocols are designed carefully.
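The relative standard deviations and coefficients of variation quoted for these studies are computed from replicate measurements as the sample standard deviation divided by the mean. The Python sketch below shows the calculation and a check against an upper limit; the 20% default reflects the upper end of the 10-20% range mentioned above, while the function names and replicate values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (relative standard deviation) in percent."""
    return stdev(values) / mean(values) * 100


def within_limit(values, limit=20.0):
    """True if the replicate variation does not exceed the chosen limit."""
    return cv_percent(values) <= limit


if __name__ == "__main__":
    # Hypothetical replicate quantities (e.g. fmol) for two targets.
    print(cv_percent([9.0, 10.0, 11.0]), within_limit([9.0, 10.0, 11.0]))
    print(cv_percent([6.0, 10.0, 14.0]), within_limit([6.0, 10.0, 14.0]))
```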
The dynamic range of proteins in blood/serum/plasma is far more demanding than that in cells [5]. Hence, searching for low-abundant proteins in this matrix often requires depletion strategies. However, other strategies can also be used to improve detection limits. Recently, a study showed that with internal standard triggered parallel reaction monitoring (IS-PRM), a lower limit of quantification can be reached compared to traditional SRM [55]. Using algorithms and synthetic internal standard AQUA peptides, data acquisition of endogenous peptides was triggered by detection of IS peptides during chromatography (for comprehensive proteomics, this is known as data-independent acquisition (DIA)). For example, at a peptide amount of 50 amol, IS-PRM could use ∼300 transitions, whereas SRM could use ∼50 transitions. The reason could be attributed to the much higher resolution of the QOrbitrap compared to the QqQ instrument, and to more dedicated use of the mass analyser with real-world triggered analysis, enabling high fill times.
Another study from 2011 used accurate inclusion mass screening (AIMS [56]), comprehensive proteomics and targeted proteomics to verify biomarkers in plasma [25]. Using depleted plasma and comprehensive proteomics, a selection of candidates for biomarker analysis was made and transferred to an SRM method with internal standard peptides. Of the 373 targets investigated in SRM, only 164 were identified with >3 transitions per peptide, which is attributed to the targets' abundance. The study, however, makes a very important point regarding tMSMS procedures: they must be made cost-effective compared to ELISA and Western blot (WB).
A highly promising tool, developed a few years back, compares the relative intensities between the ions in the internal standard and the endogenous target (Figure 4 adapted from Ref. [43]).
The check, of course, could be performed manually, but for large datasets automation is desirable. The authors showed that the developed algorithm worked in 90-100% of the cases, and that specificity was above 80%.
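The idea behind such a check can be sketched as follows. This is not the algorithm of Ref. [43]; it is a minimal Python illustration that normalizes the transition areas of the endogenous peptide and of the internal standard to relative intensities and flags the peptide if any transition deviates by more than a tolerance. The tolerance of 0.10 and the function names are hypothetical, and the minimum of three transitions follows the recommendation cited earlier in the chapter [42].

```python
def relative_intensities(areas):
    """Scale transition areas so that they sum to 1."""
    total = sum(areas)
    return [a / total for a in areas]


def ratios_agree(endogenous, standard, tolerance=0.10, min_transitions=3):
    """Compare fragment ion patterns of target and internal standard.

    Both lists hold peak areas of the same transitions in the same order.
    Returns False if too few transitions are available or if any relative
    intensity differs by more than the tolerance (possible interference).
    """
    if len(endogenous) != len(standard):
        raise ValueError("transition lists must have the same length")
    if len(endogenous) < min_transitions:
        return False
    light = relative_intensities(endogenous)
    heavy = relative_intensities(standard)
    return all(abs(l - h) <= tolerance for l, h in zip(light, heavy))


if __name__ == "__main__":
    standard = [1000.0, 500.0, 250.0]
    print(ratios_agree([100.0, 50.0, 25.0], standard))   # same pattern
    print(ratios_agree([100.0, 50.0, 200.0], standard))  # third ion inflated
```

A transition that fails the comparison can be excluded and the check repeated, provided that at least three transitions remain.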
The hunt for low-abundant targets can, as previously mentioned, be accomplished with fractionation. One study aimed to detect prostate-specific antigen (PSA) in serum samples at the pg/mL level by depletion and fractionation [57]. Specifically, serum samples were depleted of high-abundance proteins, digested with trypsin and spiked with internal standard peptide, and the resulting peptides were fractionated by LC at high pH. Approximately 9% of the eluent was introduced directly onto an LC-MS system, and the remaining 91% was fractionated on 96-well plates. This allows determination of which fractions contain the target peptide; these could subsequently be pooled and analysed by LC-SRM. The authors reached correlation coefficients of >0.99, limits of quantification of 50 pg/mL for PSA and a CV of <<10%. As the authors discuss, throughput is lower with fractionation, but for specialized applications fractionation may be necessary.
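Correlation coefficients such as the one reported here are typically obtained from a calibration curve of the light/heavy area ratio against known concentrations. The Python sketch below fits an unweighted least-squares line, reports the correlation coefficient and back-calculates an unknown sample. The calibration points are hypothetical and are not data from Ref. [57], and the function names are likewise assumptions; weighted regression is often preferred in practice when the concentration range is wide.

```python
from math import sqrt


def fit_line(x, y):
    """Unweighted least-squares line; returns slope, intercept and r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def back_calculate(ratio, slope, intercept):
    """Concentration corresponding to a measured light/heavy ratio."""
    return (ratio - intercept) / slope


if __name__ == "__main__":
    conc = [50.0, 100.0, 200.0, 400.0]   # pg/mL, hypothetical calibrators
    ratio = [0.05, 0.10, 0.20, 0.40]     # light/heavy area ratios
    slope, intercept, r = fit_line(conc, ratio)
    print(f"r = {r:.4f}")
    print(back_calculate(0.15, slope, intercept), "pg/mL")
```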
Some of the studies presented above have successfully used peptide internal standards for quality control during analysis and quantification. Alternatively, labelled proteins can be used; one approach that has gained interest in recent years is protein standard absolute quantification (PSAQ™ [58]). Proteins are produced recombinantly, e.g. in bacteria, and labelled metabolically with heavy lysine and arginine. These are subsequently purified based on tags (e.g. His6X, glutathione S-transferase tag, etc.). Full-length or partial proteins can be made with this approach. As shown in Figure 2, these can be added directly after proteins have been extracted from the organism and used for normalization. However, as the authors describe, the necessary protein tag affects the protein, which makes it a less than ideal internal standard, and its suitability must therefore be evaluated. Nonetheless, this approach has recently been used for detection of toxins in food [59] and acute kidney injury biomarkers in urine [60].

Brief summary and possible areas of applications
Targeted mass spectrometry has increased in popularity with easier access to databases, LC-MS equipment, methods and software. Table 3 lists a selection of the cited literature on which this chapter is based, divided into appropriate sections for easier access.
High-sensitivity mass spectrometers and miniaturized liquid chromatography systems operating at 20-200 nL/minute have been introduced, enabling low zeptomolar detection of peptides in various complex matrices. Specificity has increased with high-resolution mass spectrometers having >30,000 resolution. Variation is the main bottleneck for quantitative mass spectrometry entering clinical use, mainly due to the use of non-ideal internal standards. Costs are still high, but developments in easy transfer of methods and easy standard production have reduced the cost/benefit ratio.

Some references
Databases: [29,32]
Quantification: [25,54,55,58,60]
Liquid chromatography: [2,3,13,42,49,51]
Mass spectrometry: [10,11,35,56]
Data treatment: [18-21,30,31]

Determination of proteins in extracellular vesicles has been proposed as a tool for early prognosis of cancer, and as the vesicles themselves are in low abundance, the proteins found within them are also in extremely low abundance [61,62]. Blood is also a very interesting sample matrix for targeted proteomics, as sample acquisition is relatively non-invasive (compared to tissue) and blood contains a vast majority of disease biomarkers, among which the low-abundant targets may provide information for early diagnosis of diseases. With highly sensitive systems, determination of important pathway proteins can be achieved [63], and with future development of robust quantification techniques for proteins, such systems can be applied to tissue, cells and urine as well, and find their way into clinical diagnostic applications together with DNA and metabolite screening.

Four-point workflow for bottom-up-based proteomics of low-abundant targets
Based on the selected studies, a four-point guide to confident low-abundant protein identification is presented (Figure 5). The workflow can be used for either relative or absolute quantification, depending on what is known about the protein internal standard.

Standard and internal standard preparation
Determine the protein targets and use UniProt, PeptideAtlas, SRMAtlas and Skyline to find appropriate proteotypic peptides for each protein (minimum two peptides per protein).
Based on the origin of your sample, prepare a metabolically labelled internal standard, e.g. a SILAC-labelled cell line for tissue/cell studies or a recombinant protein labelled with 15N and/or 13C isotopes. For absolute protein quantification, the target protein concentration in the internal standard must be known.

LC-MS/MS method development
Monitor LC retention time with AIMS platform and data-dependent proteomics with the labelled internal standard if possible and make sure that perform retention time is. For extra low-abundant proteins, acquire recombinant proteins which can be used to prepare stable peptides, or if not available buy recombinant peptides (not necessarily labelled).
For the LC system in question, optimize the chromatography by adjusting gradient slope, gradient time, column choice, etc.
If available, use a high-resolution mass spectrometer with mass resolution >30,000. Perform LC-MS/MS analysis of the peptides in question, optimize parameters to obtain the highest possible signal-to-noise (S/N) ratio and the highest fragment ion intensities, and finally determine ion intensity ratios.

Sample preparation
Acquire the sample(s) in question and add the protein internal standard as early as possible in the process. Choose the appropriate sample preparation strategy depending on the target abundance (MARS depletion, immunoaffinity purification or similar).