Factors Affecting Accuracy and Precision in Measuring Material Surfaces

The fractal dimensions of material surfaces are of interest because they can be related to material performance. Such surfaces include the fracture surfaces of broken specimens, surfaces abraded by airborne particles, and surfaces upon which coatings of another material have been applied. Scientists who study the fracture surfaces of failed medical implants stand to benefit greatly from fractal analysis. The origin of failure is often damaged or lost during retrieval of a failed implant, and evaluation of the undamaged portions of the fracture surface by relying on the self-similarity property of fractals may allow one to deduce the conditions that were present at the failure origin at the moment of failure. If the analysis of material surfaces will be used as an engineering tool, then it is important to identify the analysis methods that yield the most precise and accurate estimates of surface dimension. Eleven algorithms for calculating the surface dimension are compared. A method for correcting the bias of dimension estimates is presented. The sources of error involved in atomic force microscopy, optical microscopy, mechanical sectioning, and fabrication of specimen replicas are discussed.


Introduction
The fractal dimensions of material surfaces are of interest because they can be related to material performance. Such surfaces include the fracture surfaces of broken specimens, surfaces abraded by airborne particles, and surfaces upon which coatings of another material have been applied. In the case of fracture surfaces, the surface dimension is related to the fracture resistance of materials from which those specimens were made [1], and it can be a useful failure analysis tool. On abraded surfaces, such as a sandblasted surface, the surface dimension is inversely correlated to the surface roughness [2], and hence it is related to the strength with which an adhesive can be bonded to the surface [3]. In the case of a surface coating, the surface dimension is related to wettability of the surface [4], which may influence the way that the coating interacts with biological systems. There are many methods of profiling surfaces and many algorithms for determining the fractal dimension of a surface profile. If the analysis of fractal surfaces will be used as an engineering tool, then it is important to identify the analysis methods that yield the most precise and accurate measurements and to be aware of the possible sources of error.

Accuracy versus precision
The accuracy of a method of estimating the surface dimension refers to how close the average estimate provided by that method lies to the true dimension. If a perfectly accurate method is used many times to estimate the dimension of a fractal surface, then there will be no difference between the surface dimension and the mean of the many estimates. The method would be unbiased. If the mean of the estimates is greater than the surface dimension, then the method has a positive bias. A systematic error in the other direction would indicate a negative bias. A method with larger bias (less accuracy) would provide a mean estimate that lies farther from the true surface dimension.
The precision of a method of estimating the surface dimension describes how close repeated measurements are to each other. A perfectly precise method would provide the same estimate every time it is used on the same surface, regardless of whether the estimate is accurate or not. Some measures that are used to rank the relative precision of different methods are standard deviation (SD), coefficient of variation (CV), and confidence interval. Most readers are already familiar with SD. The CV is the SD of a group of estimates divided by their mean. CV is considered by many to be a better descriptor of precision than SD when the methods under comparison have different levels of bias. The confidence interval spans a range of values, and researchers can be confident to a certain degree (expressed as a percent chance of being correct) that the true surface dimension lies somewhere between the lower and upper bounds of the interval. In situations where either a large number of replicates (generally 30 or more) are being averaged or where the estimates are assumed to follow a normal distribution (Gaussian distribution), the half-width of the confidence interval can be calculated by dividing the SD of the estimates by the square root of the number of estimates (n) and multiplying by a factor from a statistical table (1.96 for 95% confidence). The lower and upper bounds of the confidence interval are then calculated by subtracting the half-width from the mean and adding the half-width to the mean, respectively [Eq. (1)]. A precise method of surface analysis would exhibit a small SD, small CV, and narrow confidence interval.
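The measures described above can be illustrated with a short calculation. The following is a minimal Python sketch, not the chapter's own procedure: the function name `precision_summary`, the example estimates, and the "true" dimension of 2.15 are all hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not say which SD is meant.

```python
import math
import statistics


def precision_summary(estimates, true_dimension=None, z=1.96):
    """Summarize repeated dimension estimates (hypothetical helper).

    z = 1.96 gives a 95% interval; as noted in the text, this is only
    appropriate for large n or approximately normal estimates.
    """
    n = len(estimates)
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)    # sample SD (n - 1), an assumption
    cv = sd / mean                      # coefficient of variation
    half_width = z * sd / math.sqrt(n)  # half-width of the confidence interval
    summary = {
        "n": n,
        "mean": mean,
        "sd": sd,
        "cv": cv,
        "ci_lower": mean - half_width,
        "ci_upper": mean + half_width,
    }
    if true_dimension is not None:
        # Positive bias: the method overestimates; negative: it underestimates.
        summary["bias"] = mean - true_dimension
    return summary


# Invented estimates for a surface whose true dimension is taken to be 2.15.
estimates = [2.10, 2.12, 2.14, 2.12, 2.12]
result = precision_summary(estimates, true_dimension=2.15)
print(result)
```

In this invented example the estimates cluster tightly (small SD and CV) around a mean that lies below the assumed true dimension, which is the "precise but negatively biased" situation described above.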
When comparing multiple methods of surface analysis, it is possible that the method with the best accuracy will not be the same as the method with the best precision. In this case, one must decide which is more important: accuracy or precision. When a method is accurate but not precise, the size of the confidence interval can be made smaller (improved) by increasing the number of replicates, n [Eq. (1)]. When a method is precise but not accurate, it can be used to test a series of calibration standards (surfaces with known fractal dimension) in order to determine the bias of the method and to determine how that bias changes over a range of surface dimensions. If a statistical model can be fit to describe the relationship between the biased estimates and the calibration dimensions, then the analysis method can be corrected to provide unbiased estimates. Then, it will be both accurate and precise. Therefore, precision is more important than accuracy when calibration standards are available. Consider the case of a biathlete firing a rifle at a target (Figure 1). The holes in the target on the lower left have good accuracy but poor precision. Although the group is centered on the bullseye, none of the shots were through the bullseye, and there is no calibration that can be performed to make the next shot more likely to hit the bullseye. However, the holes in the target on the upper right have poor accuracy and good precision. None of the shots hit the bullseye on the practice target, but the biathlete can adjust the screws on his or her aperture sight to increase the likelihood of hitting the bullseye during the next practice session or during the upcoming competition.
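Because the half-width of the confidence interval scales with the SD divided by the square root of n, the number of replicates needed to reach a target half-width can be estimated by rearranging that relationship. The sketch below assumes the same normal-theory factor (1.96 for 95% confidence); the function name and the example values are hypothetical.

```python
import math


def replicates_needed(sd, target_half_width, z=1.96):
    """Smallest n for which z * sd / sqrt(n) <= target_half_width.

    Assumes the SD is known in advance (for example, from a pilot study)
    and that the normal-theory factor z is appropriate.
    """
    if sd <= 0 or target_half_width <= 0:
        raise ValueError("sd and target_half_width must be positive")
    return math.ceil((z * sd / target_half_width) ** 2)


# Invented example: an SD of 0.02 and a desired half-width of 0.005.
print(replicates_needed(0.02, 0.005))
# Halving the target half-width roughly quadruples the required n.
print(replicates_needed(0.02, 0.0025))
```

The quadratic cost of narrowing the interval is one practical reason to prefer a precise method and correct its bias, rather than averaging many replicates from an imprecise one.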

Methods for profiling surfaces
Several types of methods (atomic force microscopy, optical scanning, and mechanical sectioning) have been used to capture profiles of material surfaces. The most common method is atomic force microscopy (AFM), which can be used in either of two ways: (1) in contact or tapping mode at the microscopic scale, where the topography of the surface is felt with a stylus attached to the end of a tiny cantilever beam, the deflections of which are magnified by observing the movements of a laser reflecting off of the back of the beam; or (2) in scanning tunneling mode at the nanoscale, where the surface topography is 'felt' by recording fluctuations in the rate with which electrons tunnel through space and onto the surface from an electrically charged stylus as it rasters. Both of these mechanisms allow the construction of a virtual model of the surface, so unleveled surfaces can be virtually leveled to some degree after capture, which helps to maintain both accuracy and precision. AFM methods are moderately time consuming, but they offer fairly precise results with noise becoming a significant source of error only at the smallest scales and slow scan rates [5]. However, AFM is limited in the maximum amount of surface roughness (z-range) that it can accommodate.
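Virtual leveling of a captured height map can be sketched as a least-squares plane fit followed by subtraction of the fitted plane. This is only one simple, first-order approach written with NumPy; it is not claimed to be the procedure used by any particular AFM software, and the function name `level_surface` and the pixel spacings `dx` and `dy` are hypothetical.

```python
import numpy as np


def level_surface(z, dx=1.0, dy=1.0):
    """Subtract the best-fit plane z = a*x + b*y + c from a height map.

    z is a 2-D array of heights; dx and dy are the pixel spacings.
    Returns the leveled heights and the fitted coefficients (a, b, c).
    """
    z = np.asarray(z, dtype=float)
    rows, cols = z.shape
    y_idx, x_idx = np.mgrid[0:rows, 0:cols]
    x = x_idx * dx
    y = y_idx * dy
    # Design matrix with one row per pixel: [x, y, 1].
    design = np.column_stack([x.ravel(), y.ravel(), np.ones(z.size)])
    coeffs, *_ = np.linalg.lstsq(design, z.ravel(), rcond=None)
    plane = (design @ coeffs).reshape(z.shape)
    return z - plane, coeffs


# Invented example: a perfectly flat surface scanned with a tilt.
yy, xx = np.mgrid[0:32, 0:32]
tilted = 0.05 * xx + 0.02 * yy + 3.0
leveled, coeffs = level_surface(tilted)
print(coeffs)  # fitted slopes in x and y, and the height offset
```

A plane fit removes only the overall tilt. A scan that is far from level, or a surface that is curved, would need more than this first-order correction, which is consistent with the text's caution that leveling is possible only "to some degree".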
Optical methods involve using a laser scanning confocal microscope or interferometer to build a virtual model of the surface. One advantage of the optical methods is that they have unlimited z-range. Another advantage is that some of the systems are portable and relatively inexpensive. The profiling can also be accomplished quite rapidly; however, there is a low signal-to-noise ratio compared with AFM. Figure 2 shows a confocal microscope scan of a silicon nitride ceramic fracture surface that was captured by the author's research assistant (previously unpublished). This scan yielded estimated surface dimensions of over 2.50 instead of the values produced by relatively noiseless AFM scans for the same material (2.12-2.14) (Figure 3). Applying noise filters in ImageJ eliminated the fractal nature of the data. Multiple attempts were made to improve signal-to-noise ratio by using different materials, different color lasers, two different confocal microscopes, and one optical interferometer, but all of those trials yielded similar results.
Mechanical sectioning requires coating either the specimen or an epoxy cast (replica) of the specimen with a reflective material such as gold, casting additional epoxy on top of the gold film, and cross-sectioning the resulting 'sandwich specimen' (usually by lapping/polishing with abrasive particles) to reveal a zero set that can be magnified and profiled using a metallographic optical microscope [6]. This method has three disadvantages: (1) it is labor intensive; (2) it relies on either the specimen being large enough or the technician being skilled enough to cross-section the surface in a level manner; (3) [...] sectioning to errors in leveling [7]. Della Bona et al. analyzed the fracture surfaces of specimens made from two types of ceramic materials (baria silicate glass-ceramic and zinc selenide). After abrasive polishing of the epoxy replicas, they estimated surface dimensions using the slit island Richardson algorithm. The estimated surface dimensions decreased with increasing error in leveling, falling by 0.10-0.11 for a departure of only seven degrees from level (Figure 4).
Drummond et al. compared confocal microscopy with mechanical sectioning in analyzing the fractal dimensions of the fracture surfaces of three injection-molded dental ceramics [8]. They used the slit island Richardson algorithm to estimate the surface dimensions. Slit islands produced by mechanical sectioning yielded dimensions from 2.15 to 2.25, whereas confocal microscopy of the same specimens yielded dimensions from 2.71 to 2.91. It is possible that error in leveling the surfaces during sectioning was responsible for some of the discrepancy. However, it seems likely that noise in the confocal scans was responsible for much of the discrepancy. Artificial virtual surfaces that have been generated by fractal algorithms can have dimensions as high as 2.91, but real-world surfaces that have been created by fracture are limited by the physics of that process. They cannot have undercuts and have been reported to have dimensions ranging from 2.00 to 2.40, so measurements that yield higher dimensions should be viewed sceptically, and potential sources of the artifact should be identified.
A potential artifact in AFM and mechanical sectioning is the loss of microscopic details during creation of a specimen replica. Replicas are used in mechanical sectioning so that the original specimen need not be destroyed. Replicas are used for AFM because the original specimen is often too large to fit under the microscope. In the case of fracture surfaces, the surface may also be curved due to compression curl [9]. Even in cases when the entire fracture surface can be fit under the microscope, it can be difficult to perform coarse leveling on a large and curved specimen, and the resulting AFM scan will be too far from level for subsequent virtual leveling to be of any use. In these cases, it is necessary to replicate only a small portion of the curved surface and then to place the replica under the microscope. Joshi et al. used AFM to profile yttria-stabilized zirconia fracture surfaces and epoxy replicas of those surfaces [10]. They created a negative copy of each fracture surface using polyvinyl siloxane dental impression material, and then they cast a low viscosity, low shrinkage, slow curing epoxy into the impression to make a positive replica. The mean dimension of the epoxy replicas (2.245 ± 0.003) was not significantly different from that of the original surfaces (2.246 ± 0.007). However, a subsequent pilot study on using multiple generations of replicas has provided preliminary results that suggest the surface dimension continues to decrease with each successive iteration [previously unpublished].
Figure 4. Effect of angle of cross-section on measured fractal dimension of baria silicate and zinc selenide fracture surfaces [7].

Algorithms for calculating dimension
McMurphy et al. compared the accuracy, precision, and sensitivity-to-leveling of six different algorithms on virtual surfaces that were constructed by Brownian interpolation [11]. The surfaces were generated with known fractal dimensions so that the degree of bias could be assessed for each algorithm. Surfaces had dimensions of 2.10-2.40 to mimic ceramic fracture surfaces. The surfaces were analyzed using Minkowski cover (MC), root mean square (RMS) roughness versus area, Kolmogorov box (KB), Hurst exponent (HE), slit island box (SIB), and slit island Richardson (SIR) algorithms. KB was the most accurate with only a slight positive bias for surfaces having low dimension and a slight negative bias for surfaces having high dimension. The other algorithms exhibited a large negative bias (Figure 5). Fortunately, the bias of every algorithm was linearly related to the surface dimension, so all of these algorithms could be corrected to produce accurate (unbiased) estimates [Table 1, Eq. (2)].
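A linear bias of this kind can be removed by fitting a straight line to the estimates obtained on calibration surfaces and then inverting that line. The sketch below illustrates the idea with NumPy; it is not necessarily the form of Eq. (2), which is not reproduced in this section, and the calibration values, slope, and intercept are invented for illustration.

```python
import numpy as np


def fit_bias_model(true_dims, measured_dims):
    """Fit measured = slope * true + intercept on calibration surfaces."""
    slope, intercept = np.polyfit(true_dims, measured_dims, deg=1)
    return slope, intercept


def correct_estimate(measured, slope, intercept):
    """Invert the fitted line to recover a bias-corrected dimension."""
    return (np.asarray(measured, dtype=float) - intercept) / slope


# Hypothetical calibration surfaces with known dimensions of 2.10-2.40.
true_dims = np.array([2.10, 2.20, 2.30, 2.40])
# Invented biased estimates: a negative bias that grows with dimension.
measured_dims = 0.8 * true_dims + 0.35

slope, intercept = fit_bias_model(true_dims, measured_dims)
corrected = correct_estimate(2.15, slope, intercept)
print(slope, intercept, corrected)
```

With real data the calibration points would scatter about the fitted line, so the corrected estimates would be unbiased only on average, and the precision of the algorithm after correction would still need to be assessed.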
MC exhibited the best precision (lowest CV) following bias correction. McMurphy et al. observed a wide range in precision with the least precise algorithms having three times the standard deviation compared to MC (Table 2). The surfaces were also analyzed at varying angles of inclination (3-, 5-, and 7-degree angles). KB exhibited great sensitivity to the angle of inclination, and SIB exhibited moderate sensitivity. The other algorithms were mostly insensitive to angulation on the Brownian interpolation surfaces. However, Brownian interpolation produces self-similar surfaces, and fracture surfaces are generally accepted to be self-affine. Therefore, these algorithms might exhibit a greater degree of sensitivity to angulation when used on fracture surfaces. Figures 6 and 7 show the results of using RMS and SIR to analyze an AFM scan of a silica glass fracture surface (previously unpublished). Both algorithms show some sensitivity to angulation. RMS is not sensitive to rotation in the x-y plane (parallel versus perpendicular to the direction of crack propagation), but SIR is sensitive to rotation. RMS analyses the original surface, but SIR only analyses a zero set of the surface, and the orientation of cross-sectioning plane that produces the zero set has an influence on the result.
http://dx.doi.org/10.5772/intechopen.68189
Figure 5. Discrepancy between measured and actual fractal dimensions of calibration surfaces that were generated using Brownian interpolation [11].
Mitchell & Bonnell investigated the accuracy and precision of a Fourier power spectrum algorithm on AFM (scanning tunneling mode) scans of sputtered gold films that were deposited on sodium silicate glass surfaces [5], and they generated model surfaces having dimensions of 2.4 and 2.7 to judge the degree of bias. The model surfaces were generated using both a Weierstrass-Mandelbrot function and Brownian motion. Only a linear trace along the surface was analyzed, so sensitivity to angulation could not be studied. The Fourier algorithm exhibited a large negative [...]
Williams and Beebe compared the accuracy and precision of four algorithms by analyzing scanning tunneling images of three material surfaces (highly oriented pyrolytic graphite, polished copper, and gold film) [12]. The algorithms were multiple-image variogram, power spectrum, slit island, and single-image variogram. Since surfaces with known dimensions were not analyzed, Williams and Beebe were not able to assess absolute bias. However, they noted that the power spectrum and single-image variogram algorithms consistently estimated much higher dimensions than the multiple-image variogram and slit island algorithms (Table 3). The slit island algorithm was not able to analyze the smoothest surface (graphite), and it exhibited a low precision for the other two surfaces. Multiple-image variography was deemed to be the most reliable algorithm.
Table 3. Range of fractal dimension estimates produced by four algorithms that were used by Williams and Beebe to analyze three material surfaces.
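To make the idea of a variogram-based estimate concrete, the sketch below applies a generic variogram (structure function) calculation to a single line profile. It is not the multiple-image or single-image procedure of Williams and Beebe, whose details are not given here. It assumes a self-affine profile for which the mean squared height difference scales as lag^(2H), and it uses the standard relationship D = 2 - H for a profile (D = 3 - H for the corresponding isotropic surface). The function names and the choice of `max_lag` are hypothetical.

```python
import numpy as np


def variogram_hurst(profile, max_lag=64):
    """Estimate the Hurst exponent H from the slope of a log-log variogram."""
    z = np.asarray(profile, dtype=float)
    if z.size <= max_lag:
        raise ValueError("profile must be longer than max_lag")
    lags = np.arange(1, max_lag + 1)
    # Mean squared height difference at each lag.
    sq_increments = np.array(
        [np.mean((z[lag:] - z[:-lag]) ** 2) for lag in lags]
    )
    slope, _ = np.polyfit(np.log(lags), np.log(sq_increments), deg=1)
    return slope / 2.0  # the variogram scales as lag**(2H)


def profile_dimension(profile, max_lag=64):
    """Fractal dimension of a self-affine profile, D = 2 - H."""
    return 2.0 - variogram_hurst(profile, max_lag)


# Synthetic Brownian profile (theoretical H = 0.5, so D = 1.5 for the trace).
rng = np.random.default_rng(0)
brownian = np.cumsum(rng.standard_normal(2**16))
print(profile_dimension(brownian))
```

On a synthetic profile like this one the true dimension is known, so the difference between the printed estimate and 1.5 gives a rough indication of the bias of the sketch itself, in the same spirit as the calibration surfaces discussed above.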