Sperm concentration is an important parameter affecting fertility. Animal species of agricultural interest are mainly produced by artificial insemination (AI), which contributes greatly to the development of worldwide swine production and makes the contribution of the male to the reproductive efficiency of pig herds all the more crucial (Jounala et al., 1998).
The efficiency of AI (fertility rate and prolificacy) is directly dependent on the quality of semen doses and on the number of spermatozoa used for insemination (Camus et al., 2011). In commercial farms, routine examination of boar semen is performed to predict the male’s fertility. Evaluation of concentration is crucial for adapting the dilution rate and optimizing sperm concentration, which directly affects fertility performance. In the first part of the present chapter we address the basic concepts of a method comparison study and present an example of a method comparison experiment concerning determination of sperm concentration.
Various laboratory techniques are used to evaluate sperm quality, such as sperm concentration, motility, viability, and morphology. However, there is no single semen assay that provides complete information about semen quality (Holt & Medrano, 1997; Johnson et al., 2000; Liu & Baker, 2002). Studies in domestic animals showed that these semen characteristics were often not significantly correlated with fertility, while the most valid assessment of boar semen quality is to obtain viable pregnancies and normal offspring following AI (Tsakmakidis et al., 2010). Since fertilization is a complex process involving a huge number of events, fertility research must not only devise more predictive laboratory tests, but also properly combine different assays to predict male fertilizing ability, as spermatozoa must satisfy many requirements for successful fertilization (Quintero-Moreno et al., 2004). Assessment of the metabolic status of spermatozoa could provide a useful tool for evaluation of semen quality, because sufficient metabolism for energy production is one of the several attributes that a sperm must possess to fertilize an oocyte. In the second part of this chapter the development and diagnostic evaluation of a spectrophotometric application of the resazurin reduction assay will be presented.
The learning objectives of this chapter are to:
Investigate repeatability in continuous data
Perform method agreement
Construct Bland-Altman plots
Explain limits of agreement between two methods
Choose an appropriate regression analysis for interpreting method comparison data
Define the diagnostic parameters: specificity, sensitivity, accuracy, predictive values of a test
Recognize the validity and usefulness of the test
Evaluate the performance of a diagnostic test using ROC (receiver operating characteristic) analysis
Construct and compare ROC curves
Determine optimal cut-off point for a test
Explain the development of a new method in semen evaluation
2. Method agreement for determining sperm concentration
Semen samples, which often contain a variety of cells (immature germ cells, blood cells, epithelial cells, and cellular debris) in addition to spermatozoa, differ markedly from blood samples because of their heterogeneity. There is also no specific standard available for sperm cells of each species. It is therefore important to compare a new, more appropriate or additional method to a conventional one. The counting chamber technique for estimating sperm count appears to be adequate because of its simplicity, low cost and reproducibility. However, photometers are widely used routinely for determining sperm concentration by many AI organisations, for bulls and boars as well as other species (Woelders, 1991). They need to be evaluated before use, because accurate concentration measurement is the first and crucial step of the semen preparation process for production of semen doses (Camus et al., 2011). Correct assessment of sperm concentration is essential to ensure that the number of sperm per insemination dose meets requirements and that the maximal number of doses can be produced per ejaculate.
The increasing use of AI in swine emphasizes the need for the distribution of good quality sperm by the AI centres (Vyt et al., 2004). Boar sperm quality is routinely assessed by measuring concentration, morphology and motility of spermatozoa (Johnson et al., 2000). Determination of sperm concentration is essential in evaluating fertility, whether in vivo or in vitro. However, there is no agreed method for use as a standard. Knuth et al. (1989) showed that introduction of an unevaluated laboratory method, without appropriate quality control, can cause a bias in semen analysis. Moreover, the methodology of semen evaluation is complex, and standardization is difficult (Brazil et al., 2004). For example, the first large-scale, nation-wide proficiency testing program for clinical andrology laboratories in the United States reported that the inter-laboratory coefficient of variation for manual sperm concentration determination was 80%, with a range for a single semen specimen of 3–492 × 10⁶/ml (Keel et al., 2000). The accuracy, reliability and repeatability of different instruments that evaluate sperm concentration of raw semen have already been compared in several previous studies (Christensen et al., 2004; Hansen et al., 2006; Prathalingam et al., 2006; Anzar et al., 2009; Camus et al., 2011). Variation in the results from different laboratories could be due to the lack of standardisation of methods between laboratories (Matson, 1995).
The reason for comparing methods is often that a quicker, more convenient and more economical adaptation has been made to an existing method. Studies comparing a new method with an established method are performed to assess whether the new measurements are comparable with existing ones (Jensen & Kjelgaard-Hansen, 2006).
2.1. Precision of the evaluated methods
It is necessary to establish that a method is repeatable before comparing two measurements for reproducibility (Petrie & Watson, 1999). Repeatability of boar semen concentration assessment depends on instruments and procedures; for example, the coefficients of variation (CV) for the instruments FACS, HEMO, Photo C254, SpermVision, UltiMate and SP-100 were 2.7, 7.1, 10.4, 8.1, 5.4 and 3.1%, respectively (Hansen et al., 2006). Imade et al. (1993) reported similar overall precision (5.9%) for the Makler chamber, whereas the CV for sperm counts in sperm suspensions can be higher, for example 18.6% (Christensen et al., 2005) or even 26.3% (Mahmoud et al., 1997). It is generally accepted that intra-observer CVs are often greater than 10%. Although guidelines for standardizing the procedure have been proposed, considerable intra- and inter-technician and inter-laboratory variability has been reported. In the external quality assessment (EQA) reported by Neuwinger and coworkers (1990), which involved 10 experienced German laboratories in the evaluation of 8 sperm samples, the mean CV was 37.5%. From the data of the external quality control scheme run under the British Fertility Society and reported by Matson (1995), the calculated inter-individual CV for sperm concentration was 64.7% for 24 semen samples collected by technicians from 20 laboratories.
2.2. Method agreement
According to the literature, a very common way of investigating method agreement is to perform a paired t-test or to calculate a correlation coefficient as a measure of agreement. However, neither approach is appropriate here, for the following reasons (Petrie & Watson, 1999). The paired t-test tests the null hypothesis that the mean difference is zero. If the differences between pairs are large, indicating that the methods do not agree, but are evenly scattered around zero, then the result is non-significant. In that case we can only conclude that there is no bias, not that the methods agree. Correlation is a statistical method used to quantify any association between two continuous variables (Ma & Smith, 2003). The correlation coefficient provides a measure of the linear association between the measurements obtained by the two methods, i.e. an indication of how close the observations in the scatter diagram are to a straight line. R measures the strength of a relation between two variables, not the agreement between them (Bland and Altman, 1999). For example, the Pearson correlation coefficient gives no information of value in method comparison studies, because R can be highly significant even when there is an obvious bias between the two methods. Nevertheless, it has been used in many studies in the literature, such as comparisons between different methods to determine sperm concentration (Prathalingam et al., 2006). R was also used to evaluate agreement between repeated assessments by the same laboratory technician in sperm analysis (Christensen et al., 2005). In order to assess agreement, it is necessary to know how close the points are to the line of equality, i.e. the 45º line (Petrie & Watson, 1999). Therefore, the comparison of two methods for measuring sperm concentration in the study of Sokol et al. (2000), which used only the Wilcoxon signed rank test and the F-test, appears to be insufficient.
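The two pitfalls described above can be illustrated with a minimal sketch. The sperm counts below are hypothetical values chosen only to make the effect visible, and the helper names `pearson_r` and `paired_t` are introduced here for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def paired_t(x, y):
    """Paired t statistic: mean difference divided by its standard error."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical sperm counts (10^6/ml) from a reference method
method_a = [100, 150, 200, 250, 300, 350]

# Case 1: a second method with a constant bias of +40.
# The correlation is perfect although the methods clearly disagree.
biased_b = [a + 40 for a in method_a]
r_biased = pearson_r(method_a, biased_b)

# Case 2: a second method whose differences are large (+/-60)
# but evenly scattered around zero. The t statistic is zero,
# so the paired t-test cannot detect the disagreement.
noisy_b = [160, 90, 260, 190, 360, 290]
t_noisy = paired_t(method_a, noisy_b)

print(f"r with constant bias: {r_biased:.3f}")
print(f"paired t with symmetric scatter: {t_noisy:.3f}")
```

In the first case the correlation is perfect despite a bias of 40 units; in the second the t statistic is zero despite individual differences of 60 units. Neither statistic describes how far single paired results lie from the line of equality.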
Scatter plots and absolute and relative bias plots give the best overview of comparisons of data (Twormey, 2004; Twormey, 2005). Absolute bias plots are also called Bland and Altman plots and are commonly used for method comparison (Bland and Altman, 1999). In absolute bias plots, the differences between paired measurements are plotted against the average of the two measurements for each sample. The mean of these differences ($\bar{d}$) is an estimate of the average bias of one method relative to that of the other. If this value is zero, then the two measurements agree on average. However, this does not imply that they agree for each individual measurement.
In order to assess how well the paired measurements agree with each other, limits of agreement have to be determined. The upper and lower limits of agreement are calculated as
$\bar{d} \pm 2 s_{\mathrm{diff}}$
where $\bar{d}$ is the mean of differences for all the samples (average bias) and $s_{\mathrm{diff}}$ is the standard deviation of the differences; $2 s_{\mathrm{diff}}$ is also referred to as the British Standards Institution repeatability (or reproducibility, as appropriate) coefficient and indicates the maximum difference likely to occur between two measurements. This coefficient is the value below which the bias between paired results may be expected to lie (Petrie & Watson, 1999).
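As a minimal sketch of this calculation, the function below returns the average bias and the limits of agreement, taken as the mean difference plus and minus twice the standard deviation of the differences. The function name and the paired counts are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev

def limits_of_agreement(method_1, method_2):
    """Return (mean difference, lower limit, upper limit) for paired data.

    The limits are the mean difference -/+ 2 standard deviations of the
    differences, following the description in the text.
    """
    diffs = [a - b for a, b in zip(method_1, method_2)]
    d_bar = mean(diffs)
    s_diff = stdev(diffs)
    return d_bar, d_bar - 2 * s_diff, d_bar + 2 * s_diff

# Hypothetical paired sperm counts (10^6/ml) from two methods
chamber = [212, 180, 305, 260, 148, 330, 275, 198]
photometer = [205, 188, 298, 270, 150, 322, 281, 190]

bias, lower, upper = limits_of_agreement(chamber, photometer)

# Coordinates of a Bland-Altman plot: average of each pair (x)
# against the difference of each pair (y)
averages = [(a + b) / 2 for a, b in zip(chamber, photometer)]
differences = [a - b for a, b in zip(chamber, photometer)]

print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Plotting `differences` against `averages`, with horizontal lines at `bias`, `lower` and `upper`, gives the absolute bias plot.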
We performed method agreement between two clinical laboratory methods for determining boar sperm concentration using the statistical programme Analyse-it, General + Clinical Laboratory statistics, version 1.71, where linear regression, Deming regression and Passing-Bablok regression can be applied in the evaluation. We chose Deming regression, because it is appropriate for describing the relationship between two variables that are both measured with error. In the case of observed increasing imprecision, i.e. where a proportional bias between methods is detected, the Passing-Bablok regression procedure is more accurate than Deming’s method. When the assumption that the independent variable is determined without error is satisfied, linear regression should be used to describe the agreement between two methods (Jones & Payne, 1997). The intercept is calculated, as in conventional least squares regression, as the mean of y minus the product of the slope and the mean of x. The standard error (SE) of the intercept defines how much the line might vary in the y direction, and the SE of the slope defines how much the line might pivot about the central point through the means of x and y. Thus, the SEs allow calculation of the confidence intervals of the slope and the intercept (Jones & Payne, 1997).
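A minimal sketch of Deming regression is given below. It uses the standard closed-form slope, with `delta` denoting the assumed ratio of the error variance of y to that of x; `delta = 1` is an assumption made here for illustration, not a value taken from the study. It is not the Analyse-it implementation, and the counts are hypothetical.

```python
from math import sqrt
from statistics import mean

def deming_regression(x, y, delta=1.0):
    """Return (slope, intercept) of the Deming regression line.

    delta is the assumed ratio of the error variance of y to the error
    variance of x; delta = 1 corresponds to orthogonal regression.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    diff = syy - delta * sxx
    slope = (diff + sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    # Intercept: mean of y minus slope times mean of x
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical paired sperm counts (10^6/ml)
makler = [120, 180, 240, 310, 365, 420]
photometer = [126, 171, 248, 301, 372, 415]

slope, intercept = deming_regression(makler, photometer)
slope_rev, intercept_rev = deming_regression(photometer, makler)

print(f"photometer = {intercept:.2f} + {slope:.3f} * makler")
print(f"product of the two slopes: {slope * slope_rev:.6f}")
```

With `delta = 1`, the slope obtained after swapping x and y is the reciprocal of the original slope, so both fits describe the same line. This is the property that distinguishes Deming regression from ordinary least squares in a method comparison.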
2.3. Experiment: Agreement between two methods of sperm concentration measurement
In the present study we compared two clinical laboratory methods for determining boar sperm concentration, the Makler chamber and the photometer (Photometer SDM5, Minitüb, Germany) (Mrkun et al., 2007). Prior to method comparison, precision of each method was assessed. Scatter plots with fitted regression line, and absolute and relative bias plots were used to get the best overview of comparative data (Twormey, 2004; Twormey, 2005). Deming regression was applied to describe the relationship between variables both measured with error by proposing that the sum of the squares of the deviations from a line should be minimised in both the x and the y directions at the same time, thus taking account of the analytical imprecision of each method (Jones & Payne, 1997). The purpose of this study was to compare the two methods and to assess method agreement together with the appropriate regression analysis used in the interpretation of the data.
2.3.1. Semen samples
Twenty-three semen samples were obtained from eight 12 to 24 month old boars of various breeds. Each semen sample was collected with gloved hand using a clean semen collecting flask that filters out gel, dust and bristles, while the boar mounted a dummy sow. Semen samples were diluted 1:2 with BTS semen extender (Beltsville Thawing Solution, Truadeco, Netherlands) and delivered to the laboratory.
2.3.2. Counting with the Makler chamber
Immediately before each semen aliquot was analysed, the entire semen specimen was vortexed. To render the spermatozoa immotile and to prepare the semen samples for the Makler chamber (Sefi Medical Instruments, Israel), semen samples were diluted 1:2 with distilled water. Six parallel dilutions of each semen sample were prepared and the average of the measurements on each was used as the representative value.
Following dilution, sperm suspensions were again vortexed and an aliquot of 5 µl was loaded into the Makler chamber. The next step was to assess whether sperm were evenly distributed and whether there were movements in the fluid in the counting chamber. If uneven distribution or fluid movement was observed, the chamber was cleaned and refilled. The fields were chosen according to a prescribed pattern: 10 fields spaced left to right and 10 fields spaced top to bottom. The chosen fields formed a plus sign centred in the middle of the chamber, excluding the areas 2-3 mm from the chamber edges. Only recognizable spermatozoa, including lost heads, were counted, while other cells and lost tails were ignored. The concentration in the original semen sample was calculated from the total number of sperm in the counting area.
2.3.3. Counting with a photometer
Sperm concentration was determined by measuring the sample opacity, as the percentage transmittance of light through a sample, using a photometer (Photometer SDM5, MiniTüb, Germany). Boar ejaculates are normally too opaque, so a small semen sample was diluted with an isotonic solution before measuring. A blank tube was loaded with 3.5 ml 0.9% NaCl and a sample tube with 70 µl semen sample added to 0.9% NaCl. Sperm concentration was determined from a previous calibration of the spectrophotometer, performed by the manufacturer (Photometer SDM5, MiniTüb, Germany). Six measurements were made for each semen dilution.
2.3.4. Precision of the evaluated methods
The precision of each method was determined by making 6 measurements of each of 23 semen samples. Coefficients of variation (CV) were calculated for each method and scatter graphs of CV versus average sperm count for each semen sample were constructed. In our study CVs were calculated to be 6.6 ± 3.5 % and 1.6 ± 0.6 % for Makler chamber and photometer, respectively. Both methods yielded acceptable precision (Christensen et al., 2005), although the precision of the Makler chamber was significantly poorer.
In a diagram of the CV plotted against the average for each sperm concentration, the scatter of the points is random for the photometer (Fig.1). In contrast, for the Makler chamber, the size of CV is related to the size of the sperm concentration, shown by the higher CVs for lower sperm counts (Fig.2).
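The precision calculation can be sketched as follows: a CV is computed for each sample from its replicate measurements, and the CVs are then summarised as mean ± SD. The replicate counts are hypothetical and the function name is introduced here for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(replicates):
    """Return the CV in percent: 100 * standard deviation / mean."""
    return 100.0 * stdev(replicates) / mean(replicates)

# Hypothetical replicate counts (10^6/ml), six measurements per sample
samples = {
    "sample_1": [210, 198, 205, 221, 190, 206],
    "sample_2": [95, 110, 102, 88, 105, 100],
    "sample_3": [330, 342, 325, 338, 351, 329],
}

# One (average count, CV) pair per sample: the points of a CV scatter graph
cv_points = [
    (mean(values), coefficient_of_variation(values))
    for values in samples.values()
]

cvs = [cv for _, cv in cv_points]
print(f"CV = {mean(cvs):.1f} ± {stdev(cvs):.1f} %")
```

Plotting the second element of each pair in `cv_points` against the first shows whether the CV depends on the size of the sperm count.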
2.3.5. Method agreement between Makler chamber and photometer
We were interested in assessing the similarity between sperm counts measured with the Makler chamber and the photometer, so we compared pairs of measurements. For this purpose, we calculated, for each semen sample, the difference between the sperm counts obtained by the two methods.
The mean percentage bias between methods was – 0.6 ± 6.9 %. The scatter of the points lies in the interval –15 to 15 % (Fig. 3), which is in the range of satisfactory between-run reproducibility of the assay. From the absolute bias plot (Fig. 4) it is also evident that the scatter is random, indicating that the size of the difference between sperm counts obtained by two methods is not related to the size of the counts. Thus, no proportional bias has been detected. Average absolute bias was close to zero (–1.092 ± 15.237 M/ml). Sperm counts obtained with Makler chamber and photometer agree; 90% of the differences lie within the limits of agreement (Fig. 4), confirming that the level of agreement between the methods was satisfactory. Therefore, measurements of sperm concentration with photometer and counting chamber techniques are equally appropriate for estimating sperm counts.
Using scatter diagrams with regression lines fitted, we established that the paired measurements, i.e. sperm counts obtained with the Makler chamber and with the photometer, were close to the line of equality. Deming regression was used to describe the relationship between the sperm counts obtained by the two methods, both of which are measured with error. Deming’s method gives only a single regression line, whether x or y is used as the ‘independent variable’ (Fig. 5).
The estimated intercept of the regression line, 4.7069 M/ml, does not differ much from zero. The estimated regression equation indicates that the points are close to the line of equality, i.e. the 45º line, and the SE of the slope (0.0600) indicates that there is almost no pivoting of the line about the central point through the means of x and y.
3. Development and diagnostic evaluation of spectrophotometric application of the resazurin reduction assay to evaluate boar sperm quality
There are several attributes that a sperm must possess to fertilize an oocyte, including motility, normal morphology, sufficient metabolism for energy production, and membrane integrity. Although various analytical techniques have been developed to evaluate sperm quality, including sperm concentration, motility, viability and morphology, there is no single method that provides complete information about semen quality (Holt & Medrano, 1997; Johnson et al., 2000). Due to the complexity of the fertilization process, single tests are not able to predict fertility. Instead, a set of semen tests has to be selected with high relevance for important sperm traits and low redundancy of assay results (Petrunkina et al., 2007). Moreover, particularly in the pig industry, the choice of semen test has to take cost effectiveness into account. Routine testing of fresh boar sperm predominantly aims to identify subfertile boar ejaculates. In a number of countries, liquid preserved boar semen is used after several days of in vitro storage. It is well known that boars differ in their capacity to maintain sperm function during preservation in vitro. These differences can only be partially visualized by standard sperm parameters, such as loss of motility and membrane integrity (Waberski et al., 2011). However, the battery of diagnostic methods used by the industry is as yet restricted (Tejerina et al., 2008). A reliable, simple, cost effective and rapid method of assessing the quality of boar spermatozoa would be of benefit to livestock producers and veterinary practitioners (Dart et al., 1994). Reproductive performance depends on metabolic processes; therefore assessment of the metabolic status of spermatozoa could provide valuable information for predicting sperm fertilizing capacity. The resazurin reduction assay, one of the methods for evaluating the metabolic status of spermatozoa, depends on the ability of metabolically active spermatozoa to reduce the resazurin redox dye to resorufin.
Dehydrogenase activity of spermatozoa is manifested as a visual colour change from blue (resazurin) to pink (resorufin) and further to white (dihydroresorufin) (Glass et al., 1991; Fuse et al., 1993; Reddy Venkata Rami et al., 1997). The resazurin reduction assay using visual detection of colour change is quite subjective and varies between evaluators (Wang et al., 1998). The colour change of resazurin is usually matched with a colour chart consisting of a spectrum of colours from blue to pink, and the match varies between investigators. The possibility of human error has therefore led to the spectrophotometric modification of the resazurin reduction test. It has been mostly used for the evaluation of human semen (Mahmoud et al., 1994; Rahman & Kula, 1997; Zalata et al., 1998; Reddy Venkata Rami & Bordekar, 1999) but, to our knowledge, in veterinary medicine only for evaluating ram (Wang et al., 1998) and boar semen quality (Zrimšek et al., 2004). The visual assay has been used for evaluating stallion (Carter & Ericsson, 1998), bull (Dart et al., 1994), sheep (Cooper et al., 1996; Martin et al., 1999) and boar (Mesta et al., 1995) semen. Spectrophotometric measurement of resazurin reduction provides a quantitative and objective method.
The aim of the present study was to develop and evaluate diagnostically the spectrophotometric application of the resazurin reduction test for evaluating boar sperm quality (Zrimšek et al., 2004; Zrimšek et al., 2006). Following Zalata et al. (1998), who developed a spectrophotometric method of resazurin reduction to evaluate human semen, we extracted the colour developed in the assay with boar semen into butanol and measured the absorbance in the clear upper layer of butanol, eliminating the problem of sample turbidity. The development and optimisation of the test included the following steps:
determination of specific absorbance wavelength, used for analysis on the basis of absorbance spectra of resazurin and resorufin
optimisation of the test procedure
determination of repeatability of the assay
correlations between resazurin reduction assay and various semen parameters; Spearman rank correlation analysis
relationship between resazurin reduction and concentration of motile spermatozoa and sperm index; linear regression analysis
statistical comparison of the results obtained between the groups of satisfactory and unsatisfactory semen; Mann-Whitney U-test
diagnostic evaluation of the assay; ROC analysis
stability of butanol extracts in terms of A610; measuring agreement
In this study, receiver operating characteristic (ROC) analysis was used to determine the optimal cut-off value and diagnostic accuracy of the resazurin reduction assay. A complete picture of test accuracy is presented by the ROC plot, which provides a view of the whole spectrum of sensitivities and specificities as functions of selected cut-off values (Greiner et al., 2000). A global summary statistic of the diagnostic accuracy of the assay was quantified by the area under the ROC curve. Likelihood ratios were used to revise the probability of the semen status in individual samples (Greiner et al., 1995).
3.1. Development of resazurin reduction assay
3.1.1. Semen samples and analysis
Forty-one semen samples from eight 12-24-month-old boars of various breeds were included in the study. Semen was collected with a gloved hand using a clean semen collecting flask that filters out gel, dust and bristles, while the boar mounted a dummy sow. Semen was kept at the temperature at which it was collected and analyzed within 1 h. Sperm concentration and motility characteristics were determined by computer-assisted semen analysis (Hamilton Thorne IVOS 10.2; Hamilton Thorne Research, MA, USA) with a Makler counting chamber (Sefi Medical Instruments, Haifa, Israel). Sperm morphology was examined on Giemsa-stained samples (Hafez, 1993). Sperm index (SI) was calculated by multiplying sperm concentration by the square root of the product of the percentage of motile sperm and the percentage of sperm with normal morphology (Mahmoud et al., 1994). Combining concentration, motility and morphology in the sperm index allows the concentration of active spermatozoa to be determined, and may provide a better means of evaluating semen quality than assessing these characteristics independently.
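The sperm index calculation can be sketched as below. The sketch assumes that the two percentages enter the formula as proportions (percentage divided by 100), so that SI keeps the units of sperm concentration; this scaling is an assumption made for illustration and should be checked against Mahmoud et al. (1994). The input values are hypothetical.

```python
from math import sqrt

def sperm_index(concentration, motility_pct, normal_morphology_pct):
    """Sperm index: concentration * sqrt(motile fraction * normal fraction).

    Assumption: percentages are converted to proportions, so the result
    has the same units as `concentration` (e.g. 10^6 sperm/mL).
    """
    motile_fraction = motility_pct / 100.0
    normal_fraction = normal_morphology_pct / 100.0
    return concentration * sqrt(motile_fraction * normal_fraction)

# Hypothetical ejaculate: 400 x 10^6 sperm/mL, 81% motile, 64% normal
si = sperm_index(400, 81, 64)
print(f"SI = {si:.0f} x 10^6 sperm/mL")
```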
3.1.2. Determination of specific absorbance wavelengths of resazurin and resorufin
Before developing the assay, specific absorbance wavelengths of resazurin and resorufin were determined. Ten µl of 1.8 mM resazurin (Sigma, Steinheim, Germany) in physiological saline was added to 1 ml of a 1:2 dilution of semen sample in BTS and incubated at 37 °C in a water bath. After the semen sample had turned completely pink, the developed dye (resorufin) was extracted from the solution by adding n-butyl alcohol (Sigma, Germany) and fast vortexing. The control sample (blue colour solution) was prepared by adding butanol immediately after the resazurin. After centrifugation, the blue (resazurin) and pink (resorufin) solutions were separated from the clear upper layers of butyl alcohol and were scanned in the range from 400 to 850 nm, using a scanning spectrophotometer (UV/VIS Spectrometer Lambda 12, Perkin Elmer). Resazurin exhibits an absorption peak at 610 nm, while that of resorufin is at 575 nm (Fig. 6). There was minimal overlapping between absorption peaks of resazurin and resorufin at 610 nm; therefore the absorbance at 610 nm was used in further analysis.
3.1.3. Resazurin reduction assay and correlation with semen parameters
The resazurin reduction assay was performed within 1 h after semen collection. Briefly, 30 µL of 1.8 mmol/L resazurin (Sigma, Steinheim, Germany) diluted in physiological saline was added to 3 mL of semen sample diluted 1:2 with Beltsville thawing solution semen extender (Beltsville Thawing Solution, Truadeco, the Netherlands) and incubated at 37 °C in a water bath for 10 min. After incubation, two sub-samples of 1 mL were added to 1.5 mL of butanol (Merck, Germany). After rapid vortexing, samples were centrifuged at 3 000×g for 10 min. Absorbance in the clear upper layer of butanol was measured at 610 nm (UV/VIS Spectrometer Lambda 12; Perkin Elmer Corp., Analytical Instruments, Norwalk, CT, USA). The within-run coefficient of variation, calculated as 7.79±4.06 %, confirmed satisfactory repeatability of the assay. Spearman rank correlation analysis was used to determine the correlation between the resazurin reduction assay and semen parameters such as sperm density, morphology, motile sperm concentration, viable sperm concentration and sperm index. We observed the highest correlations of resazurin reduction with sperm concentration, followed by motile sperm concentration and viable sperm concentration. Inverse correlations indicate that better values of seminal parameters are correlated with a lower level of absorbance, indicating a stronger reducing capacity of the dye (resazurin). There were significant correlations (P<0.001) between the reduction of resazurin and motile sperm concentration (r=0.81) and SI (r=0.82); therefore the resazurin reduction assay was further diagnostically evaluated according to motile sperm concentration and sperm index. Scatter-plots and linear regression equations are shown in Figures 7 and 8.
3.2. Diagnostic evaluation of resazurin reduction assay
Semen samples were divided into satisfactory (SAT) and unsatisfactory (UNSAT) according to various criteria. Criteria considering the concentration of motile sperm included pre-selected minimums of 160, 200 and 240×10⁶ sperm/mL. Criteria considering the concentration of motile, normal sperm (SI) included pre-selected minimums of 140, 180 and 220×10⁶ sperm/mL. There was a significant difference between the absorbance values in groups of satisfactory and unsatisfactory semen samples (P<0.001) based on motile spermatozoa/mL and sperm index. The box plot in Fig. 9 represents the values of A610 in both groups divided according to motile sperm concentration and sperm index.
The performance of diagnostic tests is usually described in terms of sensitivity and specificity (Jones & Payne, 1997). In the present study, receiver operating characteristic (ROC) analysis was used to determine the optimal cut-off value and diagnostic accuracy of the resazurin reduction assay for boar semen. A complete picture of test accuracy is presented by the ROC plot, which plots sensitivity (true positive rate) against one minus specificity (false positive rate) as functions of selected cut-off values (Greiner et al., 2000). A ‘good’ test is one which has a high true positive rate and a low false positive rate and whose curve, therefore, lies close to the top left-hand corner of the ROC plot (Petrie & Watson, 1999). A global summary statistic of the diagnostic accuracy of the assay is given by the area under the ROC curve (AUC). Likelihood ratios (LR) are used to revise the probability of the semen status in individual samples (Greiner et al., 1995). However, a complete ROC analysis, including AUC, provides an index of accuracy by demonstrating the limits of a test’s ability to discriminate between different semen status values (Zwieg et al., 1993).
ROC curves (Analyse-it, General + Clinical Laboratory statistics, version 1.71; Analyse-it Software Ltd., Leeds, UK) were applied to identify optimal test cut-off values. A positive test result (T+) was recorded when spermatozoa in a sample reduced resazurin from blue to pink, resulting in A610 below the cut-off value. A negative test result (T-) was recorded when spermatozoa in a sample did not reduce resazurin from blue to pink, resulting in A610 above the cut-off value. Sensitivity (Se) and specificity (Sp) for each cut-off value were calculated as the proportion of positive test results (T+) for SAT samples and negative test results (T-) for UNSAT samples according to complete 2x2 table (Table 1).
Table 1. Complete 2x2 table
| Test result | Semen sample status: satisfactory (SAT) | Semen sample status: unsatisfactory (UNSAT) | Total |
|---|---|---|---|
| Positive (T+) | True positive (TP) | False positive (FP) | TP + FP |
| Negative (T-) | False negative (FN) | True negative (TN) | FN + TN |
| Total | TP + FN | FP + TN | |
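The diagnostic parameters that follow from the 2x2 table can be sketched as below. The cell counts are hypothetical and are not taken from the study; the function name is introduced here for illustration.

```python
def diagnostic_parameters(tp, fp, fn, tn):
    """Return diagnostic parameters computed from the cells of a 2x2 table."""
    return {
        # Proportion of SAT samples with a positive test result
        "sensitivity": tp / (tp + fn),
        # Proportion of UNSAT samples with a negative test result
        "specificity": tn / (tn + fp),
        # Proportion of positive results that are truly SAT
        "positive_predictive_value": tp / (tp + fp),
        # Proportion of negative results that are truly UNSAT
        "negative_predictive_value": tn / (tn + fn),
        # Proportion of all samples classified correctly
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts: 20 SAT samples and 30 UNSAT samples
params = diagnostic_parameters(tp=18, fp=3, fn=2, tn=27)
for name, value in params.items():
    print(f"{name}: {100 * value:.1f} %")
```

Sensitivity and specificity are properties of the test at a given cut-off, whereas the predictive values also depend on the proportion of satisfactory samples in the population tested.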
The ROC curves plot sensitivity against 1-specificity over the complete range of cut-off points (Greiner et al., 2000; Yuan et al., 2004). Sensitivity and specificity were estimated at 39 cut-off values. The diagonal line in the plot corresponds to a test that is positive or negative just by chance.
All possible combinations of sensitivity and specificity that can be achieved by changing the test’s cut-off value were summarized by a single parameter; that is, AUC (Greiner et al., 2000). The slope of the ROC curve represents the LR for a diagnostic test, and the tangent at a point on the ROC curve corresponds to the LR for a single test value represented by that point (Choi et al., 1998).
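A minimal sketch of how an empirical ROC curve and its AUC can be obtained is shown below. It assumes, as in the assay described here, that a test is positive when the measured value lies at or below the cut-off. The absorbance values are hypothetical, the AUC is computed by the trapezoidal rule, and this is not the Analyse-it implementation.

```python
def roc_points(sat_values, unsat_values):
    """Return (1 - specificity, sensitivity) pairs for every cut-off.

    A test is positive when the value is <= cut-off (low absorbance
    indicates satisfactory semen).
    """
    cutoffs = sorted(set(sat_values) | set(unsat_values))
    points = [(0.0, 0.0)]  # cut-off below all values: no positive results
    for c in cutoffs:
        sensitivity = sum(v <= c for v in sat_values) / len(sat_values)
        false_pos_rate = sum(v <= c for v in unsat_values) / len(unsat_values)
        points.append((false_pos_rate, sensitivity))
    return points

def auc_trapezoid(points):
    """Area under the curve through the given points (trapezoidal rule)."""
    pts = sorted(points)
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(pts, pts[1:])
    )

# Hypothetical A610 values for satisfactory and unsatisfactory samples
sat = [0.10, 0.15, 0.18, 0.25]
unsat = [0.20, 0.30, 0.35, 0.40]

curve = roc_points(sat, unsat)
auc = auc_trapezoid(curve)
print(f"AUC = {auc:.4f}")  # prints "AUC = 0.9375"
```

The AUC equals the proportion of (SAT, UNSAT) pairs in which the SAT sample has the lower absorbance, here 15 of 16 pairs. An AUC of 0.5 corresponds to the diagonal chance line and 1.0 to perfect separation.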
The optimal cut-off values were selected based on the best balance of sensitivity, specificity and Youden index (J) along with larger increases in LR for each criterion value (Weiss et al., 2003-2004).
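The Youden index is J = sensitivity + specificity - 1, and a simple way to select a cut-off is to take the value at which J is largest. The sketch below does only that; it does not reproduce the additional weighting by likelihood ratios described above. The absorbance values are hypothetical.

```python
def youden_optimal_cutoff(sat_values, unsat_values):
    """Return (cut-off, J, sensitivity, specificity) maximising Youden's J.

    A test is positive when the value is <= cut-off. If several cut-offs
    share the maximum J, the lowest one is returned.
    """
    best = None
    for c in sorted(set(sat_values) | set(unsat_values)):
        sensitivity = sum(v <= c for v in sat_values) / len(sat_values)
        specificity = sum(v > c for v in unsat_values) / len(unsat_values)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (c, j, sensitivity, specificity)
    return best

# Hypothetical A610 values for satisfactory and unsatisfactory samples
sat = [0.10, 0.15, 0.18, 0.19, 0.25]
unsat = [0.20, 0.30, 0.35, 0.40]

cutoff, j, se, sp = youden_optimal_cutoff(sat, unsat)
print(f"cut-off = {cutoff}, J = {j:.2f}, Se = {se:.2f}, Sp = {sp:.2f}")
```

In this example the index peaks at a cut-off of 0.19, where four of five satisfactory samples test positive and all unsatisfactory samples test negative.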
The diagnostic potential of resazurin reduction assay according to motile sperm concentration and SI was not different on the basis of AUC. The AUC was the same for criteria of 200×106 motile sperm/mL and 180×106 motile, normal sperm/mL (AUC=0.92; standard error for ROC curve (SE)=0.047 and 0.048, respectively; P<0.0001; Figure 10). On the basis of LR, absorbance lower than or equal to the optimal cut-off point were 11.3 and 7.1 times as likely to be found in satisfactory as in unsatisfactory semen samples according to SI and motile sperm concentration, respectively.
A plot of sensitivity, specificity and Youden index as a function of the cut-off value provides a useful visualisation and is helpful in selecting optimal cut-off values of the assay (Greiner, 2000). The selection of cut-off values of absorbance at 610nm according to different criteria for motile sperm concentration and SI are presented in Figures 11 and 12, respecitvely.
The Youden index peaked at an A610 cut-off of 0.209 for the pre-selected minimum motile sperm concentration of 200×10⁶ sperm/mL (Figure 11B) and for the SI of 180×10⁶ sperm/mL (Figure 12B). The optimal cut-off value at A610 of 0.209 therefore provided the best discrimination power according to both motile sperm concentration and SI. At this point, maximum overall accuracy was achieved in both cases. This cut-off value yielded estimates of sensitivity of 88.2% and 94.1% with corresponding specificities of 87.5% and 91.7% for motile sperm concentration and SI, respectively.
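As a quick cross-check, the likelihood ratios quoted above (7.1 and 11.3) can be recovered from the sensitivities and specificities at the cut-off of 0.209, assuming that LR denotes the positive likelihood ratio, sensitivity / (1 - specificity). The function name `positive_lr` is illustrative only.

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio: sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


# Sensitivity and specificity at A610 = 0.209, as reported in the text
reported = {
    "motile sperm concentration": (0.882, 0.875),
    "SI": (0.941, 0.917),
}

results = {}
for criterion, (se, sp) in reported.items():
    results[criterion] = {
        "lr_positive": round(positive_lr(se, sp), 1),
        "youden_j": round(se + sp - 1, 3),
    }
    print(criterion, results[criterion])
```

The rounded ratios agree with the values of 7.1 (motile sperm concentration) and 11.3 (SI) given for the optimal cut-off point.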
However, in clinical use of the test it is often important to identify satisfactory or unsatisfactory samples with 100% certainty. Therefore, a cut-off value of A610 at 0.342 was selected to enable 100% correct identification of unsatisfactory semen samples; for both criteria, the test is 100% sensitive at A610 of 0.342. A cut-off value at A610 of 0.121 gives 100% specificity for motile sperm concentration and 95.8% specificity for SI. For the pre-selected minimum motile sperm concentration of 160×10⁶ sperm/mL and SI of 140×10⁶ sperm/mL, 100% specificity was obtained at the optimal cut-off value of A610 at 0.254, whereas only moderate levels of sensitivity were observed (80.6% and 73.5%, respectively; Figures 11A and 12A). In contrast, at the highest criteria values 100% sensitivity corresponded to only moderate levels of specificity (Figures 11C and 12C). Semen samples with A610 below 0.121 in the resazurin reduction assay were 100% and 95.8% correctly identified as satisfactory according to the criteria of 200×10⁶ motile sperm/mL and 180×10⁶ motile, normal sperm/mL, respectively. In our quantitative test, the maximum overall accuracy of 92.9% confirmed the high discrimination power for boar semen according to a criterion value of SI at 180×10⁶ sperm/mL.
3.3. Stability of butanol extracts in terms of A610
After developing the assay, we wondered if it was possible to measure the absorbance at a later date, i.e. within a day or even a week of the assay. A satisfactory level of agreement would indicate that the modification was successful, which in turn would greatly enhance the usefulness of the assay as it could then be performed even if a spectrophotometer was not immediately available.
We measured the A610 of the butanol extract of each of 112 samples on the day of the assay (day 0) and after 1 and 7 days of storage at 4 °C. Differences were calculated between A610 at day 0 (A0) and day 1 (A1), and between day 0 (A0) and day 7 (A7).
The limits of agreement were calculated as follows: limits = $\bar{d} \pm 2s_{\mathrm{diff}}$, where $\bar{d}$ is the mean of the differences for all the samples and $s_{\mathrm{diff}}$ is the standard deviation of the differences. The quantity $2s_{\mathrm{diff}}$ is also called the reproducibility coefficient. Differences between absorbances (A1 - A0) were plotted against their average value ((A1 + A0)/2) for each sample. Agreement is considered satisfactory when at least 95% of the absolute differences are smaller than the reproducibility coefficient (Petrie & Watson, 1999).
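The calculation of the limits of agreement can be sketched as follows, taking the limits as the mean difference plus and minus twice the standard deviation of the differences. The function name `agreement_summary` and the paired absorbances are hypothetical and serve only to illustrate the procedure; they are not measurements from this study.

```python
from statistics import mean, stdev


def agreement_summary(a0, a1):
    """Limits of agreement for paired absorbances measured on two occasions."""
    diffs = [later - first for first, later in zip(a0, a1)]
    averages = [(first + later) / 2 for first, later in zip(a0, a1)]  # x-axis of the plot
    d_bar = mean(diffs)        # mean of the differences
    s_diff = stdev(diffs)      # sample standard deviation of the differences
    repro_coeff = 2 * s_diff   # reproducibility coefficient
    lower, upper = d_bar - repro_coeff, d_bar + repro_coeff
    share_within = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return {
        "mean_diff": d_bar,
        "s_diff": s_diff,
        "limits": (lower, upper),
        "share_within_limits": share_within,
        "plot_points": list(zip(averages, diffs)),
    }


# Hypothetical paired A610 values (day 0 and a later day), for illustration only
day0 = [0.20, 0.25, 0.30, 0.35, 0.40]
later_day = [0.21, 0.24, 0.31, 0.35, 0.39]

summary = agreement_summary(day0, later_day)
print(summary["limits"], summary["share_within_limits"])
```

Under the criterion described above, agreement would be judged satisfactory when `share_within_limits` is at least 0.95.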
It is necessary to establish that a method is repeatable before comparing two measurements for reproducibility (Petrie & Watson, 1999). The within-run coefficient of variation, 7.79 ± 4.06 %, confirmed satisfactory repeatability of the method, so the paired A610 measurements could be compared. The differences between the measurements of A610 immediately after centrifugation (day 0) and after 7 days were plotted against the average of these values. Of these differences, 95.54 % lay within the limits of agreement (Fig. 13).
Measurements obtained on the day of the test and after 24 hours also agreed; 99.1 % of the differences lay within the limits of agreement (data not shown). These results lead to the conclusion that the A610 of butanol extracts can be measured within 7 days of performing the test, which confirms the great practical value of the method.
In the plot of the differences between absorbances against their average, the scatter of the points is random (Fig. 13), indicating that the size of the discrepancy between the two absorbances is not related to the size of the absorbance. More than 95 % of the absolute differences were smaller than the reproducibility coefficient in both cases of testing the stability of butanol extracts. This agreement is satisfactory, so the absorbance can be measured either immediately after performing the test or within 7 days of that time. The test is therefore useful even if a spectrophotometer is not available at the location of semen evaluation.
The usefulness of sperm counting is greatly enhanced by the simplicity of determination with a photometer (Photometer SDM5, MiniTüb, Germany) in on-farm AI laboratories. The use of a photometer for determining sperm concentration would therefore also benefit livestock producers in evaluating the quality of boar semen.
The resazurin reduction assay was shown to be a reliable, easy-to-perform test that requires no sophisticated equipment. It was demonstrated that the results of the assay can be used to select semen samples that meet minimum requirements for sperm concentration, motility and normal morphology, which are all combined in the sperm index (SI). Because reproductive performance depends on metabolic processes, the assessment of metabolic rates of spermatozoa could provide even better or more complete information about semen quality than other tests. It allows the concentration of active spermatozoa to be determined, and may provide a better means of evaluating semen quality than assessing the above-mentioned characteristics independently. Expressing the latter in semen evaluation is complex, although fertility results from insemination with evaluated semen could provide a gold standard of fertilizing capacity. Additional research is required to obtain relevant and valid information about replacing or updating the methodology of semen evaluation.
This work was supported by the Slovenian Ministry of Higher Education, Science and Technology, programme group "Endocrine, immune, nervous and enzyme responses in healthy and sick animals" (P4-0053).
Special thanks go to the author's colleagues who contributed to the research work presented in this chapter: Janko Mrkun, DVM, PhD, Marjan Kosec, DVM, PhD, Janez Kunc, DVM, MSc, Maja Zakošek Pipan, DVM.