Measurement of Gluten in Food Products: Proficiency-Testing Rounds as a Measure of Precision and Applicability

In 2008, Codex Alimentarius endorsed the R5 enzyme-linked immunosorbent assay (ELISA) method as Method Type 1 for gluten measurement in gluten-free foods. The most recognized R5 ELISA test kit is the RIDASCREEN® Gliadin (R7001; manufacturer R-Biopharm). Besides collaborative tests that led to several internationally approved methods for this test kit, proficiency-testing (PT) rounds are regularly performed in Europe by different PT providers. Results from these rounds were analyzed regarding the number of participating labs with acceptable results for the RIDASCREEN® Gliadin. All PT rounds document the excellent consistency and comparability of results. The data show that the RIDASCREEN® Gliadin R5 ELISA is also applicable to cake mix, oat-based foodstuff, infant soya formula, cookies, canned boiled sausage, gravy thickener, pasta, and potato dumpling. These rounds also included the analysis of blank matrices. It was found that more than 95% of all participating laboratories correctly detected these samples as negative. Test kits from other gluten test kit manufacturers were analyzed as well, but due to the low number of participants using these test kits, results were often only analyzed in a qualitative manner, which leaves the comparability of these kits to the RIDASCREEN® Gliadin R5 ELISA open to question. Based on a comparatively small amount of data for other test kits, slightly different results were observed for test kits from other manufacturers using the R5 monoclonal antibody and considerably different results were observed for kits using the G12 monoclonal antibody.


Introduction
In the context of coeliac disease (CD), gluten is the protein fraction from wheat, rye, barley, oats, or their crossbred varieties and derivatives thereof that induces intestinal symptoms in patients and is insoluble in water and 0.5 mol/l NaCl [1]. Gluten proteins can be divided into the alcohol-soluble prolamin fraction and the alcohol-insoluble glutelin fraction, which is only soluble after addition of reducing and disaggregating agents. The prolamin content of gluten is generally taken as 50% [1]. The Codex threshold of 20 mg/kg gluten (including a security factor) was endorsed in parallel and derived from challenge studies in coeliac patients using the commercially marketed Working Group on Prolamin Analysis and Toxicity (WGPAT) or Prolamin Working Group (PWG) gliadin [2]. This threshold was adopted by many national legislations, including the USA and the EU, so that food not exceeding 20 mg/kg gluten can be labeled as gluten free in these countries. Although oats are part of the Codex definition of gluten, this crop is considered safe for the vast majority of persons intolerant to gluten, provided it is not contaminated with other gluten-containing cereals [3]. The Codex explicitly mentions that oats may be allowed at the national level. At the moment, the most precise definition and explanation on oats is given in the US regulation [4].
So far, the only treatment for celiac disease is strict adherence to a gluten-free diet. Specific and sensitive immunochemical methods are therefore needed to ensure quality control and compliance testing for gluten measurement in gluten-free food. The sandwich enzyme-linked immunosorbent assay (ELISA) RIDASCREEN ® Gliadin (R7001) is based on the R5 monoclonal antibody [5] for the detection of intact gluten and was laid down as a Codex Alimentarius type 1 method for the analysis of gluten [1]. It is calibrated to the PWG gliadin and therefore results are traceable to the threshold value of 20 mg/kg gluten determined in challenge studies as mentioned above. Furthermore, it has been adopted as an official or approved method by AOAC International [6], ICC [7], and the AACC International [8]. Raised against rye ω-secalins, the R5 antibody primarily recognizes the epitope QQPFP, which is present in wheat gliadins, rye secalins, and barley hordeins, and is part of many CD-toxic or -immunogenic peptides [9][10][11].
Besides collaborative tests [8,12] that led to AOAC-, ICC-, and AACC-approved methods of this test kit for corn- and rice-based matrices, proficiency-testing (PT) rounds are regularly performed in Europe by three different PT providers. Mostly accredited laboratories participate in these PT rounds to prove their analytical competence. This publication analyzes all PT rounds between 2011 and 2016 with regard to precision and applicability of the official R5 gluten test kit RIDASCREEN ® Gliadin. Other test kits that claim to be comparable with the R5 reference method were analyzed as well, but the number of participants using these kits was often not sufficient for robust quantitative statistics. Therefore, these kits were often only analyzed in a qualitative manner.

Materials and methods
Results from 33 different PT rounds with different food matrices were analyzed regarding the number of participating laboratories with acceptable results for the RIDASCREEN ® Gliadin. These rounds also included the analysis of blank matrices with gluten concentrations below the limit of quantification of the test kit. The following PT providers were analyzed: Food Analysis Performance Assessment Scheme (FAPAS; www.fapas.com), Dienstleistung Lebensmittel Analytik GbR (DLA; www.dla-lvu.de), and Durchführung von Laborvergleichsuntersuchungen GbR (LVU; www.LVUs.de).

RIDASCREEN ® Gliadin (R7001)
RIDASCREEN ® Gliadin is a sandwich enzyme-linked immunosorbent assay (ELISA) for the quantification of gliadin/gluten derived from wheat and related prolamins derived from rye and barley and other gluten-containing varieties in various foodstuffs. The test is based on a microtiter plate coated with the specific monoclonal anti-gliadin R5 antibody. Bound gliadin is finally detected with a peroxidase-labeled specific antibody (R5). A factor of two is used to convert quantitative gliadin results into gluten results.
A pre-ground sample is extracted by the use of a special solvent (Cocktail, patented; Mendez extraction) and can then be analyzed in less than 100 minutes. The standard calibration curve of the ELISA covers a range from 5 to 80 mg/kg gluten (including the dilution factor from sample preparation) and is standardized against the WGPAT gliadin reference standard. The assay is applicable to the detection of gluten with a limit of quantitation (LoQ) of 5 mg/kg gluten and a limit of detection (LoD) of 1 mg/kg gluten. This method was developed to detect traces of gluten in gluten-free food, not for quantifying the gluten content in wheat, rye, or barley flour. It is not suitable for analysis of fragmented gluten, for example, in beer.
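The conversion and reporting limits described above can be illustrated with a minimal sketch. The function names are hypothetical and not part of the test kit; the constants (factor of two, LoQ of 5 mg/kg gluten, upper calibration limit of 80 mg/kg gluten) are taken from the text, and the assumption is that the measured value is a gliadin concentration that already includes the dilution factor from sample preparation.

```python
# Illustrative sketch only; names are hypothetical, constants come from the text.
GLIADIN_TO_GLUTEN = 2.0  # factor of two (prolamin taken as 50% of gluten)
LOQ_GLUTEN = 5.0         # limit of quantitation, mg/kg gluten
CURVE_MAX_GLUTEN = 80.0  # upper end of the calibration range, mg/kg gluten


def gliadin_to_gluten(gliadin_mg_per_kg: float) -> float:
    """Convert a gliadin result (mg/kg) into a gluten result (mg/kg)."""
    return gliadin_mg_per_kg * GLIADIN_TO_GLUTEN


def report_gluten(gliadin_mg_per_kg: float) -> str:
    """Format a result relative to the quantifiable range of the assay."""
    gluten = gliadin_to_gluten(gliadin_mg_per_kg)
    if gluten < LOQ_GLUTEN:
        return "< 5 mg/kg gluten (below LoQ)"
    if gluten > CURVE_MAX_GLUTEN:
        return "> 80 mg/kg gluten (above calibration range)"
    return f"{gluten:.1f} mg/kg gluten"


print(report_gluten(12.5))  # a gliadin value of 12.5 mg/kg is reported as gluten
```

Forgetting the factor of two halves the reported gluten value, which is one of the calculation errors a laboratory may want to rule out when a PT result is unexpectedly low.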

Z-scores
To evaluate the results provided by the participating laboratories, a z-score is calculated for each participant. The basis for the calculation differs slightly between proficiency test providers, as explained below.

Using assigned values (for one method) and target standard deviations
The z-score is calculated as

$$z = \frac{x - x_a}{\sigma_p}$$

where x denotes the result delivered by a participant and x_a the assigned value, derived from the consensus of the results submitted by the participants according to the test kit they used. The standard deviation for proficiency, σ_p, was set at a value that reflects best practice for the analyses in question. In the case of gluten, σ_p was set to a relative standard deviation of 25% using fitness-for-purpose criteria based on expert advice. This approach is used by FAPAS and DLA. Further explanations are given in each PT report from FAPAS or DLA.

Alternatively, the z-score is calculated from the median and a robust standard deviation:

$$z = \frac{x - x_M}{\sigma_{robust}}$$

where x denotes the result delivered by a participant and x_M the median, derived from valid results submitted by all participants. The robust standard deviation, σ_robust, calculated from all participants was used as the target standard deviation. Reported values that were obviously erroneous were not included in the calculation. This approach is used by LVU and is based on the procedure described in ISO 5725-5. Further explanations are given in each PT report from LVU.
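The two z-score conventions described above can be sketched as follows. This is a minimal illustration, not the providers' actual software: the function names are hypothetical, σ_p is assumed to be 25% of the assigned value, and the robust standard deviation is approximated here by the scaled median absolute deviation (MAD) as a simple stand-in for the iterative procedure of ISO 5725-5.

```python
from statistics import median


def z_assigned(x: float, assigned: float, rsd: float = 0.25) -> float:
    """z-score from an assigned value and a relative target SD (FAPAS/DLA style)."""
    sigma_p = rsd * assigned  # assumption: 25% relative to the assigned value
    return (x - assigned) / sigma_p


def robust_sd_mad(values: list[float]) -> float:
    """Scaled MAD; a stand-in for the robust SD of ISO 5725-5, not that algorithm."""
    m = median(values)
    mad = median(abs(v - m) for v in values)
    return 1.4826 * mad  # scaling makes the MAD consistent with the SD for normal data


def z_robust(x: float, values: list[float]) -> float:
    """z-score from the median and a robust SD of all valid results (LVU style)."""
    return (x - median(values)) / robust_sd_mad(values)


# Made-up participant results in mg/kg gluten, for illustration only.
results = [16.0, 18.0, 20.0, 22.0, 24.0]
print(z_assigned(25.0, 20.0))
print(z_robust(24.0, results))
```

With the first convention, the spread of the participants' results does not influence the score; with the second, a round with widely scattered results yields a larger target standard deviation and therefore smaller z-scores for the same absolute deviation.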

Interpretation of z-scores
The z-score characterizes the difference between an individual result and the median or assigned value, normalized by the target standard deviation. Normally, 95% of all results can be found within the range -2 ≤ z ≤ 2. Scores in the range 2 < |z| ≤ 3 are to be expected occasionally, at a rate of about 1 in 20. Whether or not such single scores are of importance can only be decided by considering them in the context of the other scores obtained by that laboratory. Scores where |z| > 3 are to be expected at a rate of about 1 in 300. Given this rarity, such z-scores strongly indicate that the result is not fit for purpose and almost certainly requires investigation. The consideration of a set or sequence of z-scores over time provides more useful information than a single z-score.
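The interpretation bands above can be expressed as a short sketch. The labels "satisfactory", "questionable", and "unsatisfactory" are an assumption borrowed from common proficiency-testing practice rather than terms used in this text, and the tail probabilities assume that z follows a standard normal distribution.

```python
import math


def classify(z: float) -> str:
    """Assign a z-score to one of the three bands described in the text."""
    a = abs(z)
    if a <= 2:
        return "satisfactory"
    if a <= 3:
        return "questionable"
    return "unsatisfactory"


def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


print(classify(-2.5))
print(f"P(|z| > 2) = {two_sided_tail(2):.4f}")
print(f"P(|z| > 3) = {two_sided_tail(3):.4f}")
```

Under an exact normal model the probability of |z| > 2 is about 4.6% and that of |z| > 3 about 0.27%, so the rates of 1 in 20 and 1 in 300 quoted above are rounded rules of thumb.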

FAPAS
Twenty rounds were provided by FAPAS between 2011 and 2016, which consisted of spiked and blank cake mix, infant soya formula, and oat-based foodstuff. The spiking material was gluten powder in all cases. The spiking concentration was not provided by FAPAS, and assigned values were calculated from the results of participants using the test kit RIDASCREEN ® Gliadin (Table 1). The number of participants ranged from 30 to 114.

DLA
Six rounds were provided by DLA between 2012 and 2014 and consisted of spiked and blank cake mix, infant formula, and cookie. The spiking material was wheat flour in all cases. The spiking concentration is provided by DLA as a target value (Table 2) on the basis of assumed gluten contents in wheat flour taken from the literature. The spiking concentrations were between 19 and 34 mg/kg gluten. Mean values were calculated from the results of participants using the test kit RIDASCREEN ® Gliadin (Table 2). The number of participants ranged from 11 to 21. Uncontaminated materials were also provided to the participants.

LVU
Nine extensive rounds were provided by LVU and consisted of spiked, naturally contaminated, and blank matrices. As can be seen in Table 3, a wide variety of matrices was evaluated: flour substitute, mashed potato powder, canned boiled sausage, potato dumplings, cake mix, gravy thickener, pasta, bread mix, bread crumbs, and cornflakes. In the case of spiked matrices, flours from wheat, rye, and barley were used in addition to gluten and wheat proteins. It should be noted that in a few cases oatmeal was also used for spiking. The spiked target concentrations ranged from 15 mg/kg gluten up to 120 mg/kg gluten. The number of participants using the RIDASCREEN ® Gliadin ranged from 14 to 33.

Results and discussion
FAPAS provided three different gluten-containing matrices with gluten concentrations that bracket the threshold of 20 mg/kg gluten (Table 1). Except for two of 20 rounds, the percentage of participants with an absolute z-score equal to or smaller than 2 was 90% or more. The relative target standard deviation of 25% is realistic since relative reproducibility standard deviations calculated from an AACC collaborative test were between 18 and 25% [8]. For statistical reasons, about 5% of all participants are therefore expected to fall outside the z-score range of ±2.
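The percentage of participants within the z-score criterion, as reported per round, can be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the participant results are made up for illustration and are not taken from any PT report, and σ_p is assumed to be 25% of the assigned value.

```python
def share_within(results: list[float], assigned: float,
                 rsd: float = 0.25, limit: float = 2.0) -> float:
    """Percentage of results whose absolute z-score does not exceed `limit`."""
    sigma_p = rsd * assigned
    ok = sum(1 for x in results if abs((x - assigned) / sigma_p) <= limit)
    return 100.0 * ok / len(results)


# Made-up results in mg/kg gluten for a round with an assigned value of 20 mg/kg.
round_results = [18.0, 21.0, 19.0, 25.0, 31.0, 20.0, 22.0, 17.0, 23.0, 9.0]
print(share_within(round_results, 20.0))
```

In this invented round, the results of 31 and 9 mg/kg lie more than two target standard deviations (2 × 5 mg/kg) from the assigned value, so 8 of 10 participants meet the criterion.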
Three rounds were based on oat-based foodstuff, and it is clear that the RIDASCREEN ® Gliadin is suitable not only for gluten-containing oat samples but also for oat samples themselves, showing no cross-reaction. This is an important requirement since oats are a crucial component of gluten-free food. Other test kits, for example the ELISA based on the G12 monoclonal antibody, show a significant cross-reactivity to certain oat varieties, which makes such a system unsuitable for oat-based materials [13]. Another conclusion that can be drawn from Table 1 is that blank soya materials do not produce positive results after Cocktail (patented) extraction. This possible cross-reactivity has been alleged repeatedly over recent years but has never been underpinned with reliable scientific data. The most probable explanation for this (unproven) observation is a contamination of soya with wheat, rye, or barley due to agricultural commingling. If a gluten contamination of a material is assumed, this should be verified by PCR (e.g., SureFood ® ALLERGEN ID Gluten; S3106; R-Biopharm). Considering that FAPAS is the most important PT provider in Europe, we recommend delivering homogeneity data reports to participants on request and including spiked gluten values for each material in the PT report.
DLA provided six rounds between 2012 and 2014 (Table 2) with gluten concentrations slightly higher than 20 mg/kg gluten. The most interesting information from these PT schemes is that target concentrations are provided. The mean recovery ranged from 74% for round 03/2013 up to 149% for round 03/2014. Since the wheat flour used for spiking the matrices is not characterized for its gluten content, the PT provider used data from the literature to estimate the gluten content within the total protein fraction. Therefore, differences between the theoretical and practical value may occur.
For five out of six rounds, the percentage of participants that fulfilled the z-score requirement of equal to or smaller than 2 was between 91% and 100%. The greater variability between participants observed for the spiked cookie material in round 02/2013 may be explained by the fact that a homogeneity test was only performed for soya, which was the second analyte in this PT round, and not for gluten. As for the FAPAS rounds, we strongly recommend publishing the homogeneity data and following international guidelines for homogeneity testing [14]. Additionally, the benefit of having a target concentration would increase significantly if the gluten content of the flour used for spiking were measured and provided.
The PT rounds provided by LVU show an impressive range of different matrices (Table 4) with up to six different matrices in one round. The target gluten concentrations not only bracket the threshold of 20 mg/kg but also include higher values of more than 50 mg/kg. The target values given in Table 4 were calculated using conversion factors from the literature.
For the last gluten round in 2015, an error seems to have occurred during preparation of the PT samples since values three times higher than expected were measured during homogeneity testing. Therefore, no target values are given for this round in Table 4. Regarding the z-score criterion of ±2, 90% or more of the participants fulfilled this criterion for 26 of 33 matrices. For the other seven matrices, it can be speculated that the sample homogeneity was lower than for the other materials. Since even highly problematic matrices, for example canned boiled sausage, were often analyzed with very good results, the performance of the participants is not (primarily) responsible for the seven matrices that show a higher variation. Another indication of lower homogeneity is that each round consists of up to six samples and "outlying" matrices were analyzed in the same run as matrices that came out very well. Wherever a target value is given, a recovery can be calculated using the median derived from all results. The recovery ranges from 67% for a bread mix to 117% for a flour substitute, with a total mean of 94% (not shown). Again, the relative target standard deviation used for FAPAS and DLA calculations is confirmed by the relative robust standard deviations calculated for each matrix that contains gluten. The range of relative deviations is 14-31%. These calculations also included values derived from test kits other than the RIDASCREEN ® Gliadin R5 ELISA. The influence of other test kits is low because the number of participants that do not use the RIDASCREEN ® system is low. Blank samples were analyzed in a qualitative way. Nevertheless, 95% or more of the participants found these samples negative.
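The recovery calculation mentioned above is simple arithmetic and can be sketched as follows. The function name and the participant results are hypothetical and serve only as an illustration; the median is used as the central value here, as described for the LVU rounds, and the `center` parameter allows the mean to be used instead, as for the DLA rounds.

```python
from statistics import mean, median


def recovery_percent(results: list[float], target: float, center=median) -> float:
    """Recovery in percent: central value of all results relative to the target."""
    return 100.0 * center(results) / target


# Made-up results in mg/kg gluten for a matrix spiked to a target of 20 mg/kg.
matrix_results = [17.0, 18.0, 19.0, 20.0, 21.0]
print(recovery_percent(matrix_results, 20.0))               # median-based
print(recovery_percent(matrix_results, 20.0, center=mean))  # mean-based
```

Because target values derived from literature conversion factors are themselves uncertain, a recovery that deviates from 100% does not necessarily indicate a bias of the method.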

Other test kit manufacturers
Due to the restricted number of participants, we will only describe and analyze the FAPAS PT rounds for other test kits in the time from 2014 to 2016. Rounds from DLA or LVU show a negligible number of participants for other test kits than the RIDASCREEN ® Gliadin R5 ELISA. Table 5 shows the results of 13 different FAPAS rounds with spiked and blank cake mix, oat-based foodstuff, and infant soya formula. Results (assigned value) for the R5 reference method are also presented for comparison. The alternative test kit from the Neogen company uses the same monoclonal antibody as the reference. In case of only two or three participating laboratories, FAPAS provides no assigned value; therefore, we decided to estimate proficiency by calculating the mean concentrations and standard deviations. A correlation analysis between both methods is not possible due to the small number of pairs of results. Instead, a difference plot is presented where the absolute difference between both methods is plotted over the R5 reference value (Figure 1).
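The data behind such a difference plot can be prepared as in the sketch below. The function name and the paired values are hypothetical and not taken from Table 5. The difference is computed in mg/kg and kept signed, on the assumption that "absolute difference" refers to the unit (as opposed to a relative difference); `abs()` can be applied if the unsigned magnitude is intended.

```python
def difference_points(pairs: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """For (reference, alternative) pairs, return (reference, alternative - reference)."""
    return [(ref, alt - ref) for ref, alt in pairs]


# Made-up paired round means in mg/kg gluten: (R5 reference, alternative kit).
pairs = [(20.0, 26.0), (40.0, 38.0), (12.0, 15.0)]
for ref, diff in difference_points(pairs):
    print(f"reference {ref:5.1f} mg/kg  difference {diff:+5.1f} mg/kg")
```

Plotting the second value of each point over the first gives the difference plot; points scattered evenly around zero would suggest agreement, whereas a consistent offset near 20 mg/kg would matter for labeling decisions.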
This graphical presentation clearly indicates that there is a difference between both methods, at least for concentrations at the threshold level of 20 mg/kg gluten. More parallel determinations using both methods are necessary to characterize the comparability between both methods. It should be kept in mind that the threshold level of 20 mg/kg gluten is a decision level. In practice, it is therefore possible that a food product is labeled gluten-free (based on a measurement with the alternative R5 method) while an official control laboratory uses the R5 reference method, which may result in a value higher than 20 mg/kg gluten. The producer of this food may then be confronted with a recall situation. All participants that used the alternative R5 method for the analysis of blank matrices obtained correct results.
Since only one to four participants used the G12 kit, FAPAS did not calculate any assigned value for the G12 method because such small numbers of participants are not sufficient for any realistic calculation. Instead, mean concentrations and standard deviations were calculated where possible. Although the amount of data is very limited, the data in Table 6 clearly show that the methods are not comparable. The most "reliable" results can be found in two rounds with four participants. In both cases, the G12 overestimated the gluten content by a factor of two or more compared to the R5 reference method. Even more troublesome is the analysis of blank samples, where the G12 often failed, perhaps due to oat in the sample. This is not a problem for coeliac patients, but it is one for the food industry producing gluten-free products. All G12 results in Table 6 were submitted by participants using the G12 ELISA by Romer Labs.
Since the G12 is promoted at an international level by the manufacturer, running a proper method comparison is strongly recommended to protect celiac patients from any relapse of symptoms. This study should include spiked samples from different matrices, naturally contaminated samples, problematic matrices like spices, and oats since the G12 is reported to cross-react with varieties of this important gluten-free grain source. Following a guideline from clinical laboratory analysis, a minimum of 100 samples should be run in parallel [15].

Recommendations for PT participants
Regular participation in proficiency tests is a prerequisite in Europe for laboratories that are accredited according to ISO 17025. Therefore, it is of great importance to follow up on PT results that are not within the expected z-score range. The following checks and corrective measures are suggested for results outside this range:
(1) Check if the result for a control sample is within its specifications for this run; use samples from older PT rounds if available and compare.
(2) Is the zero calibrator as low as expected? If not or if the results are equivocal, check for contamination of buffers and surfaces using the dip-stick RIDA ® QUICK Gliadin (R7003; R-Biopharm); install a proper cleaning procedure and control system.
(4) Check whether the correct extraction procedure was used, particularly in the case of an unknown sample. It is strongly recommended to always use Cocktail extraction as described in the test kit insert.
(5) Compare the actual result with older PT results for any regularities, for example, permanent overestimation of gluten.
(6) Ask the PT provider for a homogeneity data report if not included in the PT report.
(7) Establish in-house control material: a blank and a gluten-containing sample should be tested at minimum.
(8) Check the course of the calibration graph for any irregularities, for example, bumps.
(9) Check calculation of results, for example, missing factor of two for conversion from gliadin to gluten.
(10) Did a skilled technician perform the extraction and analysis?
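The comparison with older PT results for regularities, such as permanent overestimation, can be supported by a simple check of the sign of recent z-scores. The sketch below is a hypothetical helper, not a procedure prescribed by any PT provider; the run length of four is an arbitrary assumption that each laboratory would set for itself.

```python
def same_sign_run(z_scores: list[float], n: int = 4) -> bool:
    """True if the last n z-scores are all positive or all negative."""
    if len(z_scores) < n:
        return False  # not enough history to judge a trend
    recent = z_scores[-n:]
    return all(z > 0 for z in recent) or all(z < 0 for z in recent)


# Made-up z-score history of one laboratory over consecutive PT rounds.
history = [0.5, 1.2, 0.8, 1.9]
print(same_sign_run(history))
```

A run of z-scores with the same sign can point to a systematic bias even when every single score lies within ±2, which is why a sequence of scores is more informative than an individual value.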

Conclusion
The data show that the RIDASCREEN ® Gliadin R5 ELISA is also applicable to cake mix, oat-based foodstuff, infant soya formula, cookies, canned boiled sausage, gravy thickener, pasta, and potato dumpling. These independent data show once again that the R5 ELISA has no cross-reactivity to soy-based food. All PT rounds document the excellent consistency and comparability of results when using the RIDASCREEN ® Gliadin R5 ELISA. Based on a comparatively small amount of data for other test kits, slightly different results were observed for test kits from other manufacturers using the R5 monoclonal antibody, and considerably different results were observed for kits using the G12 monoclonal antibody.